ChinAI #284: Alibaba Qwen2.5, the world's No. 1 open source model?
Plus, detailed notes on the fifty-year effort to improve coal mine roof safety
Greetings from a world where…
my Hawkeyes have been outscored 270-51 by their last 8 ranked foes
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: Why is (Alibaba’s) Tongyi the most popular open-source large model
***Thanks everyone for voting in last week’s Around the Horn! It was a nail-biter, but option #3 ended up carrying the day. Don’t fret, though: a lot of you voted for option #6, and we’ll most likely have a contributor translate that one for next week.
Context: In August 2023, Alibaba announced that it would open-source its Tongyi Qianwen large models. Since then, Alibaba’s open-source models (the Qwen series) have registered impressive results, even surpassing Meta’s Llama series, the standard-bearer for open-source large models, on a variety of benchmarks (see screenshot below). How did we get here? This AItechtalk article (link to original Chinese), authored by Jin Zhang, gives the breakdown.
Key Takeaways: In the past year, there has been a wave of Chinese open source AI models, with Zhipu, Baichuan, DeepSeek, and Alibaba all launching very strong models.
First, it’s important to establish why open-source large models offer value in a world of strong closed models like OpenAI’s GPT series. The fundamental reason is that they allow individual developers and small and medium-sized firms to fine-tune and deploy their own specialized models at lower costs (instead of training their own large model from scratch). This path was pioneered by global open-source projects such as Llama, Mistral, and Falcon.
The article paraphrases Xiaohu Zhu, managing partner of Jinshajiang Venture Capital: “Domestic open source models are no worse than closed source models, especially Alibaba's Tongyi Qianwen. Many startups use Tongyi open source models to train their own vertical models.”
Based on their analysis of Hugging Face data, AItechtalk found that developers around the world have trained more than 50,000 “derivative” models (see image below) based on the Qwen series, a count second only to the Llama series (~70,000). According to GitHub descriptions, many strong large model companies like ModelBest [面壁智能] deploy models founded on Qwen.
Alibaba claims that its flagship open-source model (Qwen2.5-72B), released in September 2024, is the strongest in the world.
Citing the first figure with all the benchmarks above, Jin relates: “Qwen2.5-72B defeated the Llama3.1-405B-Instruct model in 8 of the 14 key benchmarks, and defeated Mistral's latest open source Large-V2-Instruct model in 11 (benchmarks).”
These results are particularly noteworthy because this contest pits Qwen against a Llama model that is more than five times larger in parameter count (405B vs. 72B). From the article, “However, although Llama3.1-405B is powerful, the model parameters are too large and the hardware requirements for deployment are sky-high. For individual developers and small and medium-sized enterprises with limited budgets, it is out of reach.”
Okay, let’s get really into the weeds. The Qwen2.5 large language model is available in seven different sizes (denoted by the number of parameters): 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. As the article states, “For example, 3B is the golden size for end-side devices such as mobile phones. The industry believes that this 3B-4B size means that the model can be quantized to a volume of 2G, which is very suitable for mobile phones…32B is the "bang-for-buck king" that is most anticipated by developers, as the size that can achieve the best balance between performance and power consumption.”
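For a rough sense of where that "2G" figure comes from, here is a back-of-the-envelope sketch. It rests on assumptions the article does not spell out: that "quantized" means about 4 bits per weight, that 16-bit weights are the unquantized baseline, and that only the weights are counted (no KV cache, activations, or runtime overhead). The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate storage for the weights alone, in gigabytes (10^9 bytes).

    Assumption: every parameter is stored at the same bit width.
    Real quantization formats add some overhead on top of this.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Qwen2.5 sizes listed in the article, in billions of parameters,
# plus Llama3.1-405B for comparison.
MODEL_SIZES_B = [0.5, 1.5, 3, 7, 14, 32, 72, 405]

if __name__ == "__main__":
    for size_b in MODEL_SIZES_B:
        fp16 = weight_memory_gb(size_b * 1e9, 16)  # assumed 16-bit baseline
        int4 = weight_memory_gb(size_b * 1e9, 4)   # assumed 4-bit quantization
        print(f"{size_b:>6}B  16-bit ~ {fp16:7.1f} GB   4-bit ~ {int4:6.1f} GB")
```

Under these assumptions, a 3B-4B model lands at roughly 1.5-2 GB once quantized, which lines up with the article's claim about phones, while the 405B model needs hundreds of gigabytes for its weights alone at 16 bits.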
FULL TRANSLATION: Why is (Alibaba’s) Tongyi the most popular open-source large model
JeffJots/ChinAI Link (One to Open)
Must-read: The Canary - Michael Lewis on Chris Mark of the Department of Labor
I’ve been wanting to take careful notes on this longform piece for a while now, and I finally had time this weekend to properly digest this incredible piece of writing. A must-read for anyone interested in the intersection of technology and politics.
Lewis motivates the piece with an overview of the Sammies awards handed out by the Partnership for Public Service, which aims to recognize civil servants — “the carrots in the third-grade play” who do the work without caring about who gets the credit. Their nominations for the Sammies were so modest that Lewis struggled to pick out someone to profile, at least until he came across this paragraph, with four little words:
Christopher Mark: Led the development of industry-wide standards and practices to prevent roof falls in underground mines, leading to the first year (2016) of no roof fall fatalities in the United States. A former coal miner.
Chris Mark’s father, Robert Mark, was a renowned Princeton professor of civil engineering and architecture. Prof. Mark had discovered that certain structures in 12th century Gothic cathedrals were not just aesthetic features but structural ones that prevented their roofs from collapsing.
Chris took a different path and joined a group to strengthen unions, which led him to a coal mine in West Virginia. This decision alienated his father. Lewis writes: “When his father told a friend of the curious path that Chris had put himself on, the friend had said, ‘You must be so proud of him.’ To which Robert replied, ‘I’d be proud of him if he was your kid.’”
After working in a mine where people smoked underground even though it upped the odds of a methane explosion, Chris enrolled at Penn State to study mining engineering. There, he realized that no one had figured out which formula was best suited to designing the pillars that support the roofs of coal mines. Having experienced a roof collapse before, Chris knew that the stakes were high. As he was thinking through his PhD thesis topic, on December 19th, 1984, a roof collapse inside a Utah coal mine led to the deaths of 27 people.
Lewis writes:
His political interest in workers’ rights was morphing into a technical interest in their safety. Coal mining had long been the most dangerous job in the United States. At the height of the Vietnam War, a coal miner was nearly as likely to be killed on the job as an American soldier in uniform was to die in combat, and far more likely to be injured. (And that didn’t include some massive number of deaths that would one day follow from black lung disease.) Up to that point in the 20th century, half of the coal miners who had died on the job — roughly 50,000 people — had been killed by falling roofs. In his classes at Penn State, Chris saw at least one reason for that: The coal mining industry had learned to see the problem as the cost of doing business.
Solving this coal safety problem was especially difficult because no two mines were alike. This meant that Chris’s challenge was very different from his father’s:
Preventing the roof from collapsing inside a coal mine was less like analyzing the stresses inside a Gothic cathedral than building one from scratch. There was only one way to do it: trial and error. “The science wasn’t there,” said Chris. “It didn’t have a clear mathematical solution or a way to get one.”
As he worked toward his PhD, he joined the U.S. Bureau of Mines, which was created by federal legislation in 1910 after a West Virginia mine explosion killed 362 coal miners.
At the start, much of what Chris did in his new job felt like bricolage. He took data gathered by others and work done by others and repurposed it to his narrow problem. His immediate goal was to create for the pillars inside the tunnels of longwall mines the equivalent of what engineers call a safety rating…By 1994, Chris had figured out how to rate any coal mine roof, on a scale of 1 to 100. He’d created new understanding of the effects on roof strength of various properties of rock masses: the thickness of the sedimentary layers, their sensitivity to moisture, their response to being whacked by a ball-peen hammer, and so on. He’d reduced these to a checklist that any coal mine engineer anywhere in the world could use to evaluate his roof and know just how much support it required. And then he’d traveled to coal fields across the United States to personally deliver to mining engineers the new knowledge, in the form of software he’d written. “Technology transfer has always been central to what I do,” he said. “If you don’t transfer it, you’re just wasting taxpayers’ money.”
Chris didn’t stop there. Next, he noticed that workers were getting injured even in mines where the pillars kept roofs stable, due to small rock pieces falling between the pillars. “He’d been so focused on the bullets that killed that he hadn’t noticed the bullets that usually just wounded.” Roof bolts are designed to solve this issue by using a stronger rock to pin a weaker rock into place. Chris deems them “the single most important technological development in the field of ground control in the entire history of mining.” How did roof bolts shape coal mine safety? Lewis continues:
The standard story — the story accepted by the coal mine industry — was that new technology had led inexorably to greater safety. What had happened was far more interesting — and told you how this little American subculture worked, rather than the way economists who had never seen the inside of a coal mine might imagine that it worked. Roof bolts were indeed more efficient and effective than timber supports in preventing chunks of roof from wounding miners. But they were expensive to install. The coal mine companies had, in effect, figured out how few roof bolts they needed to use to maintain the same level of risk their miners had endured before their invention…
And so, amazingly, for the first 20 years of its use, the main effect of the most important lifesaving technology in the history of coal mining was to increase the efficiency of the mines while preserving existing probabilities of death and injury. Taking advantage, essentially, of people conditioned to a certain level of risk by failing to ameliorate that risk.
In Chris’s account of “The Fifty-Year Effort to Eliminate Roof Fall Fatalities from U.S. Underground Coal Mines,” he attributed half of the progress to better technology and new knowledge and the other half to changes in the culture of coal mining (mostly spurred by federal regulations that empowered mine inspectors to enforce safety rules). When someone else tried to tell this story differently, he somehow found himself finally following in his father’s footsteps:
In 2016 — the first year in recorded history that zero underground coal miners were killed by falling roofs — Chris landed in a public spat. He’d seen an article by an economic historian about the history of roof bolts in the journal of Technology and Culture. The historian wanted to argue that roof bolts had taken 20 years to reduce fatality rates because it had taken 20 years for the coal mining industry to learn to use them. All by itself, the market had solved this worker safety problem! The government’s role, in his telling, was as a kind of gentle helpmate of industry. “It was kind of amazing,” said Chris. “What actually happened was the regulators were finally empowered to regulate. Regulators needed to be able to enforce. He elevated the role of technology. He minimized the role of regulators.”
To set the record straight — and maybe also to start a fight with an academic he was bound to win — Chris wrote a long and debate-ending letter to Technology and Culture. As it happened, he knew the journal well. His father had been its editor.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Also! Listen to narrations of the ChinAI Newsletter in podcast format here.
Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99