ChinAI #249: China's idle AI computing centers
Beneath the glitz of new constructions, a neglected risk remains
Greetings from a world where…
making jambalaya with shrimp shells was the move
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: The dilemma of high idling rates in China’s computing center rush
Context: Accompanying the explosion of interest around large models in China this past year, there has also been a surge in the construction of intelligent computing centers (more than 30 Chinese cities are currently building or planning to build such centers). Yet, an AItechtalk [AI科技评论] report identifies a major but neglected issue with this rush: computing centers remaining idle or even shutting down due to irresponsible investments and insufficient demand.
We’re starting 2024 off with a banger. Let’s get into the details.
Key Takeaways: The rapid advance of AI has ushered in a “gold rush” for building intelligent computing centers.
Before the era of large models, intelligent computing centers were mainly used for training and running small-scale deep learning models.
In a post-ChatGPT world, these centers have to upgrade to the 10,000-GPU level (see image below for construction details and progress on ten company-led centers).1 Bigger computing centers provide scale benefits but they also cost a lot more: AItechtalk roughly estimates that operating a computing center at the 10,000-GPU scale requires an annual investment of ~1 billion RMB (~140 million USD).
With this new generation of intelligent computing centers, high idle rates become an even bigger issue because the operating and depreciation costs are so high.
Let’s work through some calculations. Most intelligent computing centers use Nvidia’s DGX A100 system. As the article states, “The cost of operating a DGX A100 system for one year includes depreciation of approximately 425,000 RMB and operating electricity cost of approximately 400,000 RMB, totaling approximately 825,000 RMB.” That’s how we get the 1 billion RMB annual cost estimate above: each DGX A100 system houses eight GPUs, so a 10,000-card computing center would run about 1,250 servers (multiply that by the 825,000 number and you get around 1.03 billion RMB).
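For readers who want to check the arithmetic, here is a minimal sketch of the calculation. The depreciation and electricity figures are the ones quoted from the article. The eight-GPUs-per-server figure is the standard DGX A100 configuration, which the article does not state explicitly. The function and constant names are my own and purely illustrative.

```python
# Annual cost figures quoted in the AItechtalk article (RMB per DGX A100 system)
DEPRECIATION_RMB = 425_000
ELECTRICITY_RMB = 400_000

# Assumption not stated in the article: a DGX A100 system houses eight A100 GPUs
GPUS_PER_DGX_A100 = 8


def annual_cost_rmb(total_gpus: int) -> int:
    """Rough annual operating cost for a center with `total_gpus` cards."""
    servers = total_gpus // GPUS_PER_DGX_A100
    return servers * (DEPRECIATION_RMB + ELECTRICITY_RMB)


servers = 10_000 // GPUS_PER_DGX_A100
cost = annual_cost_rmb(10_000)
print(f"{servers} servers, {cost / 1e9:.2f} billion RMB per year")
# prints "1250 servers, 1.03 billion RMB per year"
```

This is only a back-of-the-envelope estimate: it covers the two cost items the article lists and ignores staffing, cooling, networking, and facility costs.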
Depreciation is a big deal, as hardware malfunctions can occur during long-term operation of a computing center. One detail from the article that really caught my eye: “A cloud computing industry professional told AItechtalk that he learned at a recent meeting that a well-known domestic AI chip may be completely scrapped after 30 days of training.” This matches up with some of my findings (Huawei’s efforts to train its PanGu model) in a GovAI report on China’s large language model landscape.
Another piece of evidence comes from high idle rates across China’s data centers in general. According to 2022 data, 42 percent of servers in data centers had not been placed on racks and put into use.
Whether the idling problem can be resolved depends on the development of effective large-model applications.
According to this article, China has yet to see very impressive “large-model-native” applications. There’s been a remarkable number of large models trained, but very few successful applications launched based on these large models. This means the demand for intelligent computing centers is not necessarily sustainable; in other words, it is coming from training base models and not from “inference” (running applications).
In sum, this article’s insights underline the “implementation gap” for large language models (see also ChinAI #236), which forces us to look beyond just the flashy headlines of models being released and pay closer attention to how they are being used.
FULL TRANSLATION: The "hot" of large model computing power and "cold" thinking on the cost of 1 billion cards
ChinAI Links (Four to Forward)
Must-read: Asian American Officials Cite Unfair Scrutiny and Lost Jobs in China Spy Tensions
A New York Times report, by Edward Wong and Amy Qin, finds:
In the growing espionage shadow war between the United States and China, some American federal employees with ties to Asia, even distant ones, say they are being unfairly scrutinized by U.S. counterintelligence and security officers and blocked from jobs in which they could help bolster American interests.
The closing story cuts personal for me, as I also served in the State Department through a fellowship (the U.S. Foreign Service Internship program back in 2015 and 2016). It makes me wonder if I would have been rejected like Ruiqi, if I had applied today and not 8 years ago:
One China-born American, Ruiqi Zheng, 25, said the State Department told her she would be denied a security clearance even though she had begun a selective fellowship there. After a clearance process lasting almost two years, she was rejected in 2021 because of ties to family members and others abroad, she said.
“Everyone I knew told me that it was too good to be true, that America would never accept foreign-born Chinese Americans like me,” she said. “But I chose to trust the process.”
Should-read: Best Investigative Stories about China, Hong Kong, and Taiwan
One of my favorite reads of the year is Global Investigative Journalism Network’s round-up of the best investigative reporting about China, Hong Kong, and Taiwan. It was compiled by Joey Qi, one of the founding members of The Initium Media.
Should-read: China’s game plan for the AI race
For a/symmetric, a thought-provoking take on the competitive dynamics of the global AI race by Mary Hui, a journalist and analyst covering China’s industrial strategies. Her analysis highlights the length and complexity of the “AI industrial chain.”
Should-check out: Some random content I enjoyed in 2023 that I hope more people try out!
*I mean let’s be real, if you’re not getting your content recommendations from the nerd that follows China’s AI developments, who else would you get them from?
Showing Up (directed by Kelly Reichardt): a funny, very quiet, brilliant movie about two art school students and the process of creating things for the world.
Scavengers Reign: animated sci-fi TV series that builds an incredibly imaginative, captivatingly haunting world.
Joy Oladokun’s Proof of Life album.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Also! Listen to narrations of the ChinAI Newsletter in podcast format here.
Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99
This table has been corrected. In the initial version, the Geely Xingrui intelligent computing center was listed as supporting 8.1 billion FLOPS; thanks to an anonymous reader who dug up additional sources, this figure has been corrected to 810 PFLOPS.
It may very well be that China has an approach other than "just in time" alone.
We know this from the building construction sector. A great number of vast buildings were unoccupied at first, and Western media was "reporting" on this "problem" and using it to predict a coming collapse and a failure in planning. But nothing bad happened. Slowly at first and then ever faster, the buildings were occupied. This desert turned into another oasis. The "reports" about that success were not written in the West.
Now, as everybody knows, there is a chip war going on between the West and China. It is getting harder and nastier by the day. The USA knows no limits here. Wouldn't it be prudent to have some resources in reserve, to prepare for what is to be expected? Having some seemingly "dead capital" standing around is a bit unpleasant. But having no reserves when the next big blow in the chip war is dealt is much more than unpleasant; it can prove to be disastrous.
China knows pretty well that the West is anything but reliable and responsible.
"Best Investigative Stories about China, Hong Kong, and Taiwan," from... Global Investigative Journalism Network.
Kidding, right?
The article you cite begins thus, "The year 2023 was another challenging one for investigative journalism in China. In this year’s World Press Freedom Index released by Reporters Without Borders, China ranked second to last among 180 countries, just ahead of North Korea. The level of information censorship and manipulation by the Chinese authorities has reached a highly sophisticated level".
Both "The World Press Freedom Index" and RSF are funded by the APA, the American Publishers Association which, among other discoveries, found that only 6% of Americans trust their media.
Clearly, 'freedom' in this context means 'freedom to lie'.
Meanwhile, 82% of Chinese trust their government media, the highest on earth.