ChinAI #199: China's Hugging Face?
ModelBest's quest to industrialize large models
Greetings from a world where…
mercury is no longer in retrograde which makes it the ideal time to add a paid ChinAI subscription as a business expense
…As always, the searchable archive of all past issues is here. Please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: A Group of The Three-Body Problem Fans Bet on the Next Decade of the AI Industrial Revolution
Context: Thanks for voting in last issue’s Around the Horn. It was a close race with #8, but the first article on the industrialization of foundation models came out on top (I was secretly cheering for this one)! What happens after large models are trained? Our entry point is ModelBest (面壁智能), a start-up that aims to popularize and industrialize large models. This comes from jiqizhineng, which publishes longform articles about China’s AI landscape (link to original Chinese).
Key Takeaways: Zeng Guoyang, co-founder and CTO of ModelBest, is frustrated with the adoption rate of AI technologies.
Zeng, who gained admission to Tsinghua University due to his academic record and accomplishments in high school (exempted from taking the gaokao exams via the baosong system), thinks that the productivity gains linked to AI have been impressive but not revolutionary, at least when compared to other general-purpose technologies like electricity.
“For every project and every specific scenario, a very skilled AI engineer must be dispatched to do the adaptation, and the cost is too high,” Zeng tells jiqizhineng.
Here’s the cost breakdown: To begin, there’s the algorithm engineer’s labor to design the model and then train it around a specific business application. Training even basic NLP models typically requires thousands of data records (sometimes with additional labels). Per back-of-the-napkin calculations by a Caijing reporter, slightly more complex models require more sample data (~200,000 records). The time and costs devoted to data collection, cleaning, labeling, and enhancement account for more than 50% of the entire development cycle.
ModelBest is betting on the fact that large models — specifically, the approach of general pre-trained language model & fine-tuning on training data tailored to a specific downstream task — have changed the game.
Zeng says, “The data costs have been significantly reduced. In the past, thousands of pieces of data were the threshold. Now, hundreds or even dozens of pieces of business data may achieve the same performance effect.”
Plus, there’s the human capital angle. Per the article: “Even if the team lacks algorithm engineers with an NLP background, there is no need to recruit people to expand to new business scenarios, and the large model can output general NLP capabilities.”
Chinese tech giants like Baidu, Huawei, and Alibaba have all released large language models, so what makes ModelBest different?
Rather than optimizing for models with ever-larger parameter counts, ModelBest is releasing open-source tools that make large models cost-effective enough for small businesses, students, and governments to use.
One specific example: The BMInf toolkit. Previously, to run (not train) a large model on the scale of ~20 billion parameters, you would need a V100 (an Nvidia data-center GPU that costs around 50,000 RMB). The goal with BMInf is to lower the barrier for running these models to a 1060 (an Nvidia graphics card that costs ~1,000 RMB).
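To get a feel for why that gap is so large, here is a rough back-of-envelope sketch of the memory needed just to hold a ~20-billion-parameter model's weights at different numeric precisions. The parameter count, byte sizes, and VRAM figures (a GTX 1060 has ~6 GB; V100s ship with 16–32 GB) are my own illustrative assumptions, not numbers from the article or from ModelBest:

```python
# Back-of-envelope memory math for serving a ~20B-parameter model.
# All figures are rough illustrations, not ModelBest's actual numbers.

def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 2**30 bytes)."""
    return n_params * bytes_per_param / 2**30

N_PARAMS = 20e9  # roughly the scale the article describes

fp32 = weight_memory_gb(N_PARAMS, 4)  # full precision
fp16 = weight_memory_gb(N_PARAMS, 2)  # half precision, a common serving format
int8 = weight_memory_gb(N_PARAMS, 1)  # 8-bit quantization

print(f"fp32: {fp32:.0f} GB, fp16: {fp16:.0f} GB, int8: {int8:.0f} GB")
```

Even at int8, roughly 19 GB of weights exceed a 1060's ~6 GB of VRAM, so a toolkit in this space cannot rely on quantization alone; it also has to schedule weights between CPU and GPU memory during inference, keeping only the layers currently in use on the card.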
Another open-source project (OpenPrompt) won the Best Demo award at a top NLP conference (ACL 2022). For reference, Hugging Face’s renowned transformers library won Best Demo at another top NLP conference (EMNLP 2020).
The point here is not to overhype this particular start-up. It’s only raised an angel round of financing. The Hugging Face comparison makes for good newsletter headlines but is probably premature: adding up all of ModelBest’s open-source projects gives 58,000 stars on GitHub, while Hugging Face’s transformers project alone has 71,000 stars. Rather, the goal is to shed some light on the long road ahead for the diffusion of large language models. For more details on that road, see:
ChinAI Links (Four to Forward)
Should-read: Hugging Face Primer
Jeff Burke had a good Twitter thread on Hugging Face that helped me get some context for this week’s translation and analysis.
Stephen Nellis, Karen Freifeld, and Alexandra Alper report for Reuters:
The Biden administration published a sweeping set of export controls on Friday, including a measure to cut China off from certain semiconductor chips made anywhere in the world with U.S. equipment, vastly expanding its reach in its bid to slow Beijing's technological and military advances.
By Yifan Wei, Yuen Yuen Ang, and Nan Jia, forthcoming in The China Quarterly:
In 2005, the Chinese government deployed a new financial instrument to accelerate technological catch-up: government guidance funds (GGFs). These are funds established by central and local governments partnering with private venture capital to invest in state-selected priority sectors. GGFs promise to significantly broaden capital access for high-tech ventures that normally struggle to secure funding. The aggregate numbers are impressive: by 2021, there were more than 1,800 GGFs, with an estimated target capital size of USD 1.52 trillion. In practice, however, there are notable gaps between policy ambition and outcomes. Our analysis finds that realized capital fell significantly short of targets, particularly in non-coastal regions, and only 26 per cent of GGFs had met their target capital size by 2021. Several factors account for this policy implementation gap: the lack of quality private-sector partners and ventures, leadership turnover, and the inherent difficulties in evaluating the performance of GGFs.
A new policy brief from Brookings (team of Jessica Brandt, Sarah Kreps, Chris Meserole, Pavneet Singh, and Melanie W. Sisson) that lays out reasonable policy recommendations for how the U.S. could compete with China in AI.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).
Any suggestions or feedback? Let me know at email@example.com or on Twitter at @jjding99