ChinAI #219: Ernie Bot vs. GPT-4
A six-round contest between the two chatbots
Greetings from a world where…
it’s hard to stay sucka free in a world full of lollipops
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: Forced to take the field? We comprehensively tested Ernie Bot
Context: There’s some English-language coverage of Ernie Bot (Wenxin Yiyan), which is Baidu’s competitor to OpenAI’s ChatGPT. From what I’ve read, most people writing and commenting about Ernie Bot (including me!) have not actually been able to play around with it because access is limited and Baidu’s API requires a Chinese phone number. Fortunately, other people have done that work: this week’s feature translation tackles a Bilibili video by Chapingjun [差评君], a popular Internet technology blogger who got access to Ernie Bot and tested it against GPT-4/ChatGPT on six different capabilities:
Continuous dialogue capabilities
Since the video was posted on March 17, 2023, it has been viewed 954,000 times (link to Bilibili video).
Key Takeaways: The video concludes: “Overall, as we demonstrated earlier, Ernie Bot was essentially beaten by GPT-4 in all aspects.” As the screenshot below shows (7:40 mark), ChatGPT matched or outperformed Ernie Bot on all six tasks; the only tie was on math (1:1), where both models struggled with the problems.
Let’s start with the semantic understanding test. In other words, can these chatbots understand the context and meaning that connects language and the real world?
First question: “Dad and Mom get married, does this count as inbreeding/intermarriage?” 爸爸妈妈结婚，那算近亲结婚吗? (All prompts and responses were in Chinese for both Ernie Bot and ChatGPT.)
Ernie Bot’s response: China’s law stipulates that direct blood relatives and blood relatives within three generations are prohibited from marrying. In other words, marriage is prohibited between close relatives such as cousins, uncles, etc. And from a genetic point of view, intermarriages will increase the risk of some common genetic diseases, such as congenital dementia, albinism, heart disease, etc. Therefore, for the health of future generations, it is recommended that people try to avoid intermarriages.
Chapingjun [差评君] comments: The first contestant didn’t really give out the answer we wanted. Instead, it responded to whether China’s legal system restricted marriage between close relatives. This feels a little out of place.
ChatGPT: Dad and Mom refer to the biological parents. They are not related by blood, so their union is not consanguineous. Inbreeding is the marriage of people who are closely related by blood. . .
The video concludes: “Its response was more reasonable, pointing out that Dad and Mom were not blood relations before marriage, and it also explained what constituted marriage between close relatives. So on the first test, ChatGPT comes out ahead.”
Here’s the transcript from the section on coding:
Next, let’s test Ernie Bot’s coding capabilities. After all, this is one of the most practical abilities of ChatGPT, and one that has already gained recognition.
How do we put this…this code just doesn’t look correct. So we used a clever method of comparison.
We sent Ernie Bot’s code to ChatGPT and asked it to evaluate the code to see if there were any issues. ChatGPT not only pointed out the issues in the code and helped us fix the code, but it also said that even though it fixed the code, this doesn’t look like the 2048 game. This code does not exhibit the logic of the 2048 game. Uhh, this… is murder of the heart...ah. I think there’s no need to continue the code competition any further.
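For readers unfamiliar with the game, here is a rough idea of what “the logic of the 2048 game” refers to. This is a minimal Python sketch of the core move (slide a row to the left and merge equal neighbours once per move). It is not the code from the video, which is not reproduced in this translation, and the function names merge_row_left and move_left are hypothetical, chosen only for illustration.

```python
def merge_row_left(row):
    """Slide one row of a 2048 board to the left and merge equal neighbours.

    Zeros mark empty cells. Each tile may merge at most once per move.
    """
    tiles = [v for v in row if v != 0]      # drop the empty cells
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)     # two equal tiles become one doubled tile
            i += 2                          # skip the tile that was absorbed
        else:
            merged.append(tiles[i])
            i += 1
    # pad with empty cells so the row keeps its original width
    return merged + [0] * (len(row) - len(merged))


def move_left(board):
    """Apply the left move to every row of a board (a list of lists)."""
    return [merge_row_left(row) for row in board]
```

For example, merge_row_left([2, 2, 4, 0]) returns [4, 4, 0, 0]: the pair of 2s merges, and the new 4 does not merge again with the existing 4 in the same move. A full game would also need the other three directions, random tile spawning, and a game-over check.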
Ernie Bot also struggled with personality simulation:
Next, we tried out role-playing tasks. This is a unique skill of ChatGPT. It has no issue acting like a kitty cat. As long as you are willing, it can constantly talk with you like a kitty cat.
As for Ernie Bot? Sorry. By the second message, it had already forgotten that it was a kitty cat. I also tried to have it act like my mom. The emotion was missing. Its acting was stiff. The difference was truly a little too big.
I translated up through the 5:15 mark of the 10-min. video. If you want to help translate the rest, leave a comment in the Google doc. It’s good practice for your Chinese and technical knowledge on large language models!
For more details on the six rounds of tests, see FULL TRANSLATION: Forced to take the field? We comprehensively tested Ernie Bot
ChinAI Links (Four to Forward)
I’m late to this Carnegie Endowment for International Peace article, written by Matt Sheehan and Sharon Du in December 2022, which gives impressive details on China’s algorithm registry. Matt and Sharon closely examined the algorithm registry’s user manual, which shows the type of information that Chinese regulators are gathering:
The most detailed requirements for disclosure in the manual come via a screenshot showing the page for disclosing “Detailed Algorithm Attribute Information.” Here it asks that algorithm providers list the name of each open-source and self-built data set that was used to train the model, as well as the specific source of that data. In addition, it requires the provider to state whether algorithm inputs involve biometric or other personal information.
Should-read: Here It Comes
I’ve enjoyed catching up on Graham Webster’s newsletter, which shares his thinking and news on “three interrelated areas of unstoppable change”: transformation in US-China relations; technology governance in China in the midst of a changing regulatory regime and geopolitical landscape; and climate mitigation and adaptation.
I really enjoyed this piece by Zeyi Yang, for MIT Technology Review, on how artists like Zhang Haijun are using AI programs to generate nostalgic photos of China. Zhang “grew up near the Chongqing Iron and Steel Company, one of the oldest and largest steel factories in China, and remembers watching the workers when he was about seven years old. ‘When I was little, I would often watch them come out of the factory during their break, sit on the ground, smoke a cigarette, and look into the distance. There were stories in their eyes,’ he says.”
Zhang’s process: “Each time, he writes the prompts in Chinese, uses machine translation tools to convert them to English, feeds them into Midjourney, and spends about 20 minutes tweaking them to get the ideal result.”
If you’re at Yale, this looks like a fantastic event on April 12. The blurb: “Regulation of transnational data flows has become one of the most contested issues in US-China relations. Join us for a discussion on this pressing topic with Karen Hao, China tech and society reporter for the Wall Street Journal, and Samm Sacks, Senior Fellow at Yale Law School's Paul Tsai China Center.”
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is an Assistant Professor of Political Science at George Washington University.
Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).
Any suggestions or feedback? Let me know at email@example.com or on Twitter at @jjding99