ChinAI #173: Language Understanding and Generation Evaluation (LUGE) by Baidu
Luge isn't just for the Winter Olympics
Greetings from a world where…
Abbott Elementary is the best show on television (and it might not even be close)
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: An Update on Baidu’s Open-Source NLP projects
Context: Baidu recently posted an update (in Chinese) on its “Qianyan (A Thousand Words)” initiative, an open-source dataset project for natural language processing, which was launched (jointly with the China Computer Federation and the Chinese Information Processing Society of China) back in 2020. The initial ambition was to build more than 100 Chinese NLP datasets as a way to “accelerate the large-scale implementation of NLP.” The effort has since expanded to a series of leaderboards and competitions that help assess progress in a range of NLP tasks. These are collected at the LUGE site: Language Understanding and Generation Evaluation Benchmarks (千言中文开源数据集, the Qianyan Chinese open-source datasets).
Takeaways:
This LUGE/Qianyan update cited a recent speech given by Minlie Huang, a Tsinghua University professor, at a developers summit in December 2021, where he reviewed Qianyan’s progress over the past year.
To construct these open-source datasets, Baidu has worked with 14 other organizations, including Tsinghua, Meituan, and the Chinese Academy of Sciences.
LUGE dataset downloads increased by 134 percent from 2021Q1 to 2021Q3, and submissions to competitions and evaluations, I assume, increased by 649 percent over that same span (see image below).
Most submissions to LUGE’s technical evaluations come from universities and research institutes (57%); industry players account for 21%. Some of these evaluation tasks are jointly run at specific events: for instance, Baidu, Huawei, and Google recently teamed up for a workshop that evaluated progress in simultaneous translation.
To adapt to large language models as a new driving force in NLP development, Qianyan is trying to upgrade in three areas: 1) generality - can the model handle multiple subtasks at the same time? 2) trustworthiness - does the model have sufficient robustness, high interpretability, and factual consistency? 3) cross-modality - can the model understand language fused with voices and videos?
Open questions to consider going forward:
At Qianyan’s launch in 2020, Baidu announced it would build 100 Chinese NLP datasets in three years; it’s 2022 and there are only 36 datasets available. How effective has Baidu’s effort been at providing the infrastructure to scale NLP?
Relatedly, are there ways in which Qianyan/LUGE could serve Baidu’s interest at the expense of the broader Chinese NLP field?
Many have noted that the Chinese NLP field pays less attention to governance and ethics issues. Will the trustworthiness plank of Qianyan’s upgrade bear on these issues?
ChinAI Links (Four to Forward)
Should-read: The Hurdle to Greater U.S.-China Understanding
Yiqin Fu, a PhD student at Stanford, wrote an insightful post about what types of takes dominate in the market for U.S.-China commentary. Includes a very nifty example about the market for translated texts.
Should-watch: Robots at Work in China
I really enjoyed this recent Stanford Digital Economy Lab seminar, where Robert Seamans of NYU’s Stern School of Business presented a paper on robot adoption and productivity in China. They find: “Chinese state-owned enterprises (SOEs) do not exhibit the same productivity boost as private firms when adopting robots. We also find some evidence that: (1) Chinese SOEs don’t appear to hire the appropriate human capital necessary to take advantage of investment in robots and (2) Chinese SOEs don’t appear to make the investments in complementary assets needed to obtain productivity improvements.”
Should-read: China Wants to Put Data to Work as an Economic Resource—But How?
For DigiChina, Qiheng Chen outlines how Chinese policymakers are looking to leverage data as a key factor to increase economic productivity. Chen brings out thorny issues related to incentivizing data sharing without oversharing, as well as balancing trade-offs between data security and technological development.
Should-read: Defense Department’s 2021 China Military Power Report and Defense Innovation
In Lawfare, Lauren Kahn provides an excellent analysis of the DoD’s 2021 China Military Power Report, highlighting the gap between China’s vision of leveraging emerging technologies for military capabilities and the reality of bridging civilian and military development.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a postdoctoral fellow at Stanford's Center for International Security and Cooperation, sponsored by Stanford's Institute for Human-Centered Artificial Intelligence.
Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).
Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99