Greetings from a world where…
the answer is simple — either you have money, or you have people [答案很简单——要么有钱,要么有人]
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: China’s Big 5 Foundation Model Companies contend for supremacy
Context: My NBA obsession reached its peak during the Miami Heat’s “Big Three” era, featuring LeBron James, Dwyane Wade, and Chris Bosh.
Two years ago, China’s large language model ecosystem was in a “hundred-model” melee phase. Now, a “Big Five” has emerged. This week’s AI Era [新智元] report (link to original Chinese) identifies China’s top five foundation model players (Alibaba, Bytedance, Stepfun [阶跃星辰], Zhipu, and DeepSeek) that have separated themselves from the rest. Let’s break down the scouting report on each.
Alibaba: Open Source King
Alibaba is the Chinese tech giant that has 1) invested the most in AI and 2) gone all in on open-source. On the first point, Alibaba will invest 380 billion RMB in AI R&D as well as cloud and AI-related infrastructure over the next three years — an investment scale that surpasses other Chinese tech giants.
As for the second point, by most metrics, Alibaba’s open-source series Qwen is the world’s top open-source model (we covered this last October in ChinAI #284). The figure below shows that Hugging Face’s top ten open-source models were all based on secondary development of Qwen.
Bytedance: Giant aircraft carrier returning to entrepreneurship
Most people have heard of Bytedance’s super product, Doubao (a ChatGPT-like conversational platform), which has more than 100 million monthly active users. This was the first time I learned about its AI programming tool Trae (a competitor to Cursor), which, interestingly, is powered by either OpenAI’s GPT-4o or Anthropic’s Claude-3.5-Sonnet.
Bytedance has tons of money, but it also has some intellectual firepower. Wu Yonghui, who heads its large model research team, is a former VP of Google DeepMind.
Stepfun [阶跃星辰]: Low-key large model national team (firm)
This Shanghai-based company is the most low-profile of the Big Five (or, the Shane Battier of that Miami Heat team). AI Era labels it as a “national team” company because its core investors include Shanghai State-owned Capital Investment Co. and affiliated funds.
In industry circles, Stepfun has earned the nickname “Involution King [1] in multimodal models.” It has ranked among the top 10 in Chatbot Arena and placed first on the Chinese evaluation platform OpenCompass. Its talent is also impressive. Its chief scientist, Xiangyu Zhang, was one of the four co-authors of the ResNet paper, the most cited paper (across all fields) published in the 21st century.
Zhipu: Full-stack innovation, focusing on intelligent agents
Zhipu, incubated at Tsinghua University, is the first Chinese large model startup to prepare for an IPO, with a potential filing on the Hong Kong or Shanghai stock exchange by the end of the year.
It was an early entrant in AI agent products. From the article: “In terms of intelligent agents, Zhipu proposed the concept of Phone Use and launched Agent products before OpenAI, and released the world's first L3 intelligent agent that integrates deep research and practical operations: AutoGLM Rumination.”
DeepSeek: Research-oriented, good preparation is the key to success
Have you heard of this one? Obviously, DeepSeek shocked the world with its R1 model.
Lost in all the discourse about what this meant for US-China competition was DeepSeek’s research orientation. Here’s how the article puts it: “Engineers are encouraged to improve efficiency from a research perspective without having to face financial pressure to realize profits.”
FULL SCOUTING REPORT: The Top 5 domestic large models contend for supremacy, a decisive battle in AGI
ChinAI Links (Four to Forward)
Should-attend: UC San Diego Book Talk
At least for the foreseeable future, this will be the last stop on my informal book tour: a lecture open to the public organized by the 21st Century China Center at the UC San Diego School of Global Policy and Strategy. It will take place at 5 PM Pacific time on Thursday, May 22. If you’re around San Diego, surf your way over and say hello.
Should-read: Advancing Department of Defense Test and Evaluation for AI and Autonomous Systems
I’m late to this report by Josh Wallin, published back in March as part of the CNAS AI Safety and Stability Project. Glad to see the emphasis on agile and iterative software development processes as a way to improve safety in ML-enabled military applications (I wrote about this in my recent TNSR article). There’s also a fascinating case study of DARPA’s Air Combat Evolution Program, which demonstrates the importance of rapid, iterative cycles of software changes and operator testing.
Should-read: China's province most lacking in universities is frantically building junior colleges (in Chinese)
One of the biggest gaps in discussions about U.S.-China technological competition is the simple realization that education policy is technology policy. This NetEase DataBlog post analyzes Henan’s higher education structure, which is significantly biased toward junior colleges. Why is China’s third-most populous province unable to build new public universities? If there’s interest, I might do a full translation and analysis for a later issue.
Should-read: The Romance of Being Unreadable
Andrea Long Chu, Pulitzer-winning critic for Vulture, reviews Ocean Vuong’s new book The Emperor of Gladness. Chu’s writing is penetrating.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Also! Listen to narrations of the ChinAI Newsletter in podcast format here.
[1] 卷王 (“Involution King”): the winner of the rat race of working harder for limited rewards. The term was popularized after a photo of a young man working on his laptop while riding a bicycle went viral on Chinese social media, earning him the nickname “The King of Involution.” Source: https://www.scmp.com/news/people-culture/article/3282001/are-you-digital-nomad-new-lexicon-reflects-rapid-china-growth-changing-social-realities
Jeff, can you shed more light on how Chinese AI stakeholders view the term AGI? CSET has a new report out, "Wuhan's AI Development: China's Alternative Springboard to Artificial General Intelligence (AGI)."
In my (admittedly limited) interactions, I find that the U.S. use of AGI is confusing to my Chinese Track II counterparts. They often refer to Generalized Artificial Intelligence (通用人工智能?), as opposed to Artificial General Intelligence (other than leaders within DeepSeek, who have used some variation of AGI).
Is this a distinction without a difference, or a distinction with a non-trivial difference? Thanks.
Enjoyed it. Was happy to see Bytedance, as we covered him in Beijing as the Wealthiest in the City.