Greetings from a world where…
Department Q is good TV
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Checking in on 3 Chinese AI Safety Benchmarks
Have we done that thing where we lead off with a Chinese idiom? Not yet? Well, you have to do it. How about this one: Everything has a beginning but not always an ending [靡不有初,鲜克有终]. It’s a reminder to follow through and finish what you start.
For this week’s issue, I checked in on the latest updates to three key AI safety benchmarks developed by Chinese organizations. How it started:
SuperCLUE-Safety: the SuperCLUE group, a leading Chinese benchmarking platform, published its first safety benchmark in September 2023. In my initial coverage (ChinAI #237), I described the key categories of safety tests and provided the first round of evaluation results.
The China Academy of Information and Communications Technology (CAICT)’s AI Safety Benchmark: this government-affiliated think tank released its first AI safety evaluations in April 2024. For details on the test questions and model developers who participated, see ChinAI #261. This is a collaborative effort with the China Artificial Intelligence Industry Alliance.
OpenCompass Flames: Developed by the Shanghai AI Lab and Fudan University’s natural language processing group, this Chinese-language benchmark evaluates a model’s value alignment. They made their benchmark available in November 2023.
*Note: Two of these benchmark developers, CAICT and the Shanghai AI Lab, have been identified as among the “most promising Chinese AI Safety Institute counterparts.”
Unfortunately, based on my review of developments in these three AI safety benchmarks since their launch, follow-through is lacking. Let’s run through these in order. How it’s going:
SuperCLUE-Safety: I went to the SuperCLUE website. Since the original September 2023 release of the safety benchmark, they have posted updates to various other benchmarks on a weekly basis (sometimes more frequently). There have been zero posts about the safety benchmark. When scanning recent articles about this benchmark, I found one Sohu article that highlighted AndesGPT’s (an OPPO model) performance from September 2024. According to the SuperCLUE-Safety GitHub page, the last time this benchmark was updated — to include a new batch of models to test — was about 1.5 years ago in January 2024.
CAICT’s AI Safety benchmark: this is the clear standout in terms of regular updates. They release new evaluation results every quarter. In the Q4 2024 update, CAICT tested nine multimodal models (including Deepseek-VL-Chat) and seven large language models against various attack methods (e.g., prompt injection attacks). They do not publicly attribute scores to specific companies. One trend that stood out is the shrinking scope of their updates, at least compared to the relatively comprehensive coverage of the initial safety benchmark from April 2024 — which included topics such as “Appeal for power” and “Inclination for anti-humanity.” For instance, the 2025 Q1 benchmark update, published May 8, 2025, exclusively reports results related to hallucination.
OpenCompass Flames: The corresponding GitHub page was last updated on December 11th, 2023. The OpenCompass leaderboard still reports the results from back then.
I don’t want to overstate the case here. Abandoned is probably too harsh. For one, it may be that I missed some things in my scan. For instance, in December 2024 SuperCLUE published a separate AI security benchmark called DSPSafeBench, which is joint work with the Third Research Institute of the Ministry of Public Security. While it is not linked to the safety benchmark, it does include prompts that gauge resilience to jailbreak attacks. Second, in some cases, model developers can run their own evaluations on these benchmarks, without centralized coordination from the host organization. Maybe Chinese firms will end up relying on international AI safety benchmarks, instead of these efforts led by Chinese organizations.
So, at the end of the day, perhaps this post is really just a reminder to myself to follow through: to check in months and years after the initial hullabaloo.
ChinAI Links (Four to Forward)
Must-read: What China’s generative AI registration data can tell us about China’s AI competitiveness
With research assistance from Ruby Qiu, Kendra Schaefer’s Trivium post analyzes the 3,739 generative AI tools that are listed in China’s algorithm registry. Her preliminary analysis is very insightful; more importantly, she helps contextualize why this is just a valuable starting point for other research into China’s AI ecosystem:
“The fact that this data set exists is pretty incredible. Imagine having access to a definitive list of all public-facing generative algorithms operating in the US. But due to China’s rather heavy-handed governance of the online environment, we have this very robust tool we can use to assess the state of China’s AI ecosystem.”
Should-read: Concordia’s AI Safety in China Newsletter
Concordia’s AI safety newsletter continues to be a useful roundup of developments in China. I went back to this April 2024 issue to get background info on CAICT’s AI Safety Benchmark.
Two other Chinese articles I considered translating this week
We surveyed 12 public companies about the truth behind DeepSeek’s all-in-one servers: Some good details about the actual adoption of “all-in-one” servers (manufactured by Lenovo and other Chinese companies) that integrate DeepSeek models. For AItechtalk, Zhiqi “Erica” Zhao’s hit rate is very high. She was also the author of last week’s featured article.
A daughter spent two years creating her own AI father: This Perpetual Light Studio longform article shares Huang Dou’s efforts to construct an AI version of her dad, who passed away from cancer at age 51. One part of this journey: she collected memories of her father, translated them into English, and fed them into Character.ai to create an AI father.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).
Also! Listen to narrations of the ChinAI Newsletter in podcast format here.