Discussion about this post

Yorwba:

"Next, the capex gap reflects a difference in compute usage too. For example, according to MiniMax’s technical report announcing their M1 system, the model was trained for three weeks using 512 H800 chips at a total cost of ~$540,000. Song argues, “This proves a crucial point: competition among large models isn’t necessarily about ‘who is bigger,’ but can also be about ‘who uses computing power more intelligently.’”"

I want to stress that the cost is specifically for the reinforcement learning phase of MiniMax-M1. Your translation makes this clear, but I don't think everyone who reads these blog posts also reads the full translation...

One way to think about reinforcement learning: it takes the raw material of a pretrained base model, which can simulate human-written text at many ability levels (both students asking for homework help and carefully proofread reference solutions), and pulls out the part we're actually interested in (only correct solutions, please!) by rolling the dice many times and strengthening the model whenever it happens to produce good output.
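To make that concrete, here's a minimal toy sketch of that "roll the dice and reinforce" loop in Python. The "model" is just a probability distribution over canned outputs and the verifier is a trivial string check, all invented for illustration; real RL fine-tuning updates network weights with gradients, but the effect is the same, probability mass shifts toward rewarded behavior:

```python
import random

# Toy "base model": a distribution over canned outputs of varying quality.
# (All outputs and probabilities are invented for illustration.)
policy = {
    "wrong answer": 0.5,
    "sloppy but correct": 0.3,
    "clean correct solution": 0.2,
}

def reward(output: str) -> float:
    """Trivial verifier: 1 for correct solutions, 0 otherwise."""
    return 1.0 if "correct" in output else 0.0

def sample(policy: dict) -> str:
    outputs, probs = zip(*policy.items())
    return random.choices(outputs, weights=probs, k=1)[0]

# Roll the dice many times; upweight whatever earned reward, renormalize.
for _ in range(2000):
    out = sample(policy)
    policy[out] *= 1.0 + 0.1 * reward(out)
    total = sum(policy.values())
    policy = {k: v / total for k, v in policy.items()}

print(policy)  # mass has shifted onto the correct-solution outputs
```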

This doesn't have to cost much. For example, Weibo's VibeThinker-1.5B spends $7,800 on reinforcement learning on top of Alibaba's Qwen2.5-Math-1.5B as the base model, roughly 1/69th of MiniMax-M1's RL budget, and gets close to MiniMax-M1 on many benchmarks, so the performance per dollar is much better: https://arxiv.org/abs/2511.06221v1

But ultimately the performance ceiling is determined by the breadth of abilities in the base model (can't reinforcement learn if there's nothing to reinforce), which costs significantly more to pretrain (it's unclear how much in the case of MiniMax-M1, though it might be possible to estimate from the token counts they report).
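For what it's worth, here's what such an estimate could look like, using the common rule of thumb that pretraining takes roughly 6 × parameters × tokens FLOPs. Every number below (active parameter count, token count, H800 throughput, utilization, rental price) is a placeholder assumption, not a figure from the MiniMax report:

```python
# Back-of-envelope pretraining cost from token counts, using the common
# rule of thumb: FLOPs ~= 6 * parameters * tokens.
active_params = 46e9    # assumed active parameters per token (MoE)
tokens = 7.5e12         # assumed pretraining token count
flops = 6 * active_params * tokens

peak_flops_per_gpu = 1e15  # ~1 PFLOP/s dense bf16, roughly H100/H800 class
mfu = 0.4                  # assumed model FLOPs utilization
gpu_hours = flops / (peak_flops_per_gpu * mfu) / 3600

price_per_gpu_hour = 2.0   # assumed rental price in USD
print(f"~{gpu_hours:,.0f} GPU-hours, ~${gpu_hours * price_per_gpu_hour:,.0f}")
# With these placeholders: ~1.4 million GPU-hours, ~$2.9 million,
# i.e. far more than the ~$540,000 RL phase.
```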

Overall, I'm not convinced that Chinese companies have an efficiency advantage or are using their computing power more intelligently. Rather, they're making the best of their smaller R&D budgets by training smaller models (or, as in Weibo's case, building on top of small models released by other companies) and getting worse performance, but not, like, *proportionally* worse. (It should be obvious that on a benchmark where the maximum score is 100%, performance can't be proportional to cost, or you could take a model that scores 50%, spend three times as much, and get 150%, which is impossible.) This opens up a niche of "good enough" models that are cheaper than the current frontier models.
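To illustrate the saturation with a toy curve (the functional form and all constants are invented, not fit to any real data): tripling the spend moves you up the curve but buys nowhere near triple the score, and each point near the ceiling costs more than the last.

```python
import math

def score(cost: float, ceiling: float = 100.0, scale: float = 1e6) -> float:
    """Hypothetical saturating cost-performance curve (invented)."""
    return ceiling * (1 - math.exp(-cost / scale))

def cost_for(target: float, ceiling: float = 100.0, scale: float = 1e6) -> float:
    """Invert the curve: spend needed to reach a target score."""
    return -scale * math.log(1 - target / ceiling)

print(score(0.5e6))   # ~39: a cheap "good enough" model
print(score(1.5e6))   # ~78: 3x the spend buys nowhere near 3x the score
print(cost_for(95.0)) # ~3.0e6
print(cost_for(99.0)) # ~4.6e6: the last few points are the most expensive
```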

I think for a company to qualify as the "next OpenAI," they would have to abandon that niche, seriously extrapolate their cost-performance curves to the point where they expect OpenAI's next model to be, and then pay whatever the price to beat that ends up being. But that would be expensive and risky, so I don't expect many companies to make that bet.

Grace Shao:

Just some thoughts on this. I don't think MiniMax will be China's OpenAI; I think that'll likely come from BAT (just wrote about it).

I heard that they’re meeting with investors in hk this week.. strong intent to ipo and that’s the narrative they’re putting out but 1) no distribution advantage 2) no money to push frontier (they’re saying they can reach 90% anthropic’s capability) 3) no consistency on strategy - multimodal game? Or leading LLM game? Or cost efficiency game?

