ChinAI #149: Around the Horn (edition 4)

Greetings from a world where…

kids do yoga on paddle boards

…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).

Around the Horn (4th edition)

A refresher on how this works (previous edition in ChinAI #117):

  • short preview of 10 articles related to ChinAI — all published this past week and sourced from scans of WeChat accounts and groups

  • reply to the email and/or comment on the Substack post with the number of the article you’re most intrigued by, which will be translated next week!

  • some added weight to votes from subscribers that are supporting ChinAI financially

1) Graduate from a 985 (elite) university and end up rolling cigarettes: Is it that graduate students have lost their ideals? Or…

Summary: Graduation season has sparked intense online discussions about underwhelming employment outcomes for graduates from prestigious schools, like Renmin University where the author of this article works. For example, one headline reads: “Renmin University and Wuhan University graduates enter the factory to roll cigarettes, over 30% of the workers on the assembly line are graduate students.” What’s the deal? This article aims to provide some clarity.

Source: Yongmou Liu (刘永谋) — philosophy professor at Renmin University, where he’s the chair of the Department of Philosophy of Science and Technology

2) Let me tell you about a real digitization process

Summary: Qin Shuo has spent the past couple years investigating what actually occurs in the “digital transformation” process, surveying and visiting dozens of companies. He discusses some of the existing barriers and manifold opportunities in China’s ongoing digitalization.

Source: Qin Shuo’s WeChat Moments (秦朔朋友圈) — A WeChat account started by Qin Shuo, former editor-in-chief of China Business News.

3) The Golden Age of Mobile Internet is Really Over

Summary: This essay ties together various factors to support its case, including saturation of the market/diminishing returns, as well as recent anti-monopoly crackdowns. The point I found most interesting: the author argues that the mobile Internet era absorbed capital away from higher-value-added industries, including industrial innovation.

Source: Shuzili chang (数字力场) — A relatively new WeChat account that I’m starting to follow, since a good number of friends had read their previous articles. This article’s author is Zongming Yu, a former deputy editor of the commentary section in The Beijing News.

4) CAICT and JD Explore Academy release “White Paper on Trustworthy AI”

Summary: The white paper proposes a systematic framework for trustworthy AI, provides some analysis on supporting technologies for trustworthy AI, and also offers a practical guide for the AI industry to address relevant AI ethics and governance issues. JD = JD.com, a large e-commerce company.

Source: China Academy of Information and Communications Technology (中国信通院CAICT) — a think tank under China’s Ministry of Industry and Information Technology

5) Comprehending “China’s proposal” for trustworthy AI: What supporting technologies are needed to construct a trusted system?

Summary: This piece provides additional context on the above white paper. It features interviews with the director and a research scientist at JD Explore Academy, including details about why JD chose to participate in this project and what parts it was responsible for drafting.

Source: AI科技评论(aitechtalk) — focuses on in-depth reports on developments in the AI industry and academia.

6) Central government releases serious/heavyweight documents to create a “pioneer zone” in Pudong (Shanghai), key industries such as semiconductors, AI, and biotech will benefit

Summary: The central government released these guiding opinions on July 15. This piece provides an overview of the key planks and offers some insights into the broader significance. What I found most interesting was the emphasis on reforming and liberalizing Shanghai’s capital market, especially the Sci-tech Innovation Board, the listing home for the vast majority of China’s AI companies who have publicly stated their IPO plans.

Source: 机器之能/jiqizhineng (Synced) — a long-time source for ChinAI translations, often features longform articles about China’s tech industry

7) “Open Source” makes it on CCTV again: "Economics Half Hour" focuses on China's open source ecology

Summary: CCTV-2, the business channel for China Central Television, did a special segment on open source software on July 15. This article summarizes what the TV segment covered, including digestible examples of how things like Huawei’s OpenHarmony system work.

Source: OSChina — leading source of news about open source developments in China

8) Behind the craze to build AI computing power centers: who is spending money pointlessly?

Summary: So many cities have built or are planning to build AI computing centers, but the process has been somewhat chaotic. For example, this article presents two different computing centers with the same performance capabilities but construction costs that differ by about 400 million RMB. A detailed, technical piece.

Source: 量子位 (QbitAI) — news portal that regularly covers AI issues, similar to AIEra and Leiphone

9) 10 major trends of China's AI industry in 2021

Summary: Qbit report following a visit to the World Artificial Intelligence Conference, which took place in Shanghai earlier this month. Qbit analysts summarized the developments into ten main trends.

Source: another one from 量子位 (QbitAI) — news portal that regularly covers AI issues, similar to AIEra and Leiphone

10) CCCF Column — Guojie Li: Several cognitive issues about AI

Summary: In this column, Academician Li gives his thoughts on a wide range of topics: the trajectory of deep learning, prospects for another AI winter, and pathways and timelines to general artificial intelligence.

Source: 中国计算机学会 (China Computer Federation). Guojie Li is the honorary chairman of the CCF.

Thank you for reading and engaging.

*Behind on my reading after taking some time off, so will catch up next issue.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a researcher at the Center for the Governance of AI at Oxford’s Future of Humanity Institute.

Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).

Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99

ChinAI #148: The AI Wolf Refuses to Play the Game

A Misspecified AI System Goes Viral on the Chinese Web

Greetings from a world where…

homeland elegies sometimes read truer than hillbilly elegies

…Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors). As always, the searchable archive of all past issues is here.

Feature Translation: The AI Wolf that refuses to play the game goes viral

Context: Back in a March issue of ChinAI, as one of the recommended Four to Forward links, I flagged an interesting example of a misspecified AI system that had gone viral on the Chinese web. Researchers had set up a wolf vs. sheep game. Instead of trying to eat as many sheep as possible, which was the intent of the game, after many rounds of training, the wolf learned to run into a rock and end its life. This week, we translate the xinzhiyuan (AI Era) article on the topic.

How the game works:

  • Reward for wolves catching a sheep: +10

  • Penalty for running into a rock: -1

  • Time penalty for each second of delay in catching sheep: -0.1

What happened: As one of the game designers, Sdust, describes, the researchers started by training the algorithm (20,000 iterations) to guide the movements of the wolf. However, they discovered that the wolves were getting worse and worse at catching the sheep. In those initial training sessions, the AI wolf had learned that it would lose fewer points by running into the boulder than by chasing the sheep and failing to catch them (due to the time penalties).
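To make that arithmetic concrete, here is a minimal sketch of episode returns under the reward scheme listed above. This is my own illustration, not the researchers’ code, and the episode lengths are assumed purely for illustration.

```python
# Reward scheme from the article: +10 for catching a sheep, -1 for hitting a
# rock, -0.1 per second elapsed. The scenarios below are illustrative only.

CATCH_REWARD = 10.0
ROCK_PENALTY = -1.0
TIME_PENALTY_PER_SEC = -0.1

def episode_return(seconds: float, caught_sheep: bool, hit_rock: bool) -> float:
    """Total reward the wolf collects in one episode."""
    total = TIME_PENALTY_PER_SEC * seconds
    if caught_sheep:
        total += CATCH_REWARD
    if hit_rock:
        total += ROCK_PENALTY
    return total

# Crash into the rock right away vs. chase for 20 seconds and fail vs. succeed:
print(episode_return(1, caught_sheep=False, hit_rock=True))    # -1.1
print(episode_return(20, caught_sheep=False, hit_rock=False))  # -2.0
print(episode_return(20, caught_sheep=True, hit_rock=False))   #  8.0

# Early in training the catch rate is low, so the expected return of chasing
# (mostly -2.0, rarely +8.0) falls below the -1.1 of crashing immediately,
# which is exactly the shortcut the under-trained wolf learned.
```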

Netizen reactions: The article collects many Weibo posts in which people saw their burnout reflected in the AI wolf. As one netizen expressed: “This (example of) AI training tells you why so many young people are no longer willing to work hard anymore.”

Another person wrote:

“The wolves are temporary workers…what they lose every second is their youth and time. The sheep are the never attainable ‘promotions, bonuses, marrying someone fair-skinned, rich, and attractive, reaching the pinnacle of life,’ running into the rock is the workers who choose to be lazy and tangping.”

*tangping (躺平) = Internet buzzword referring to young people’s choice to stop working overtime and rebel against an overly competitive society.

Others questioned how the designers set up the reward mechanisms. In a Bilibili video, Sdust stated that the main issue was that the number of iterations in training was too small. They started seeing improvements in the system once they increased the number of training iterations for the neural network. Eventually, the wolf did learn to catch the sheep.

Lastly, this example sparked a lot of discussion about AI safety. This Zhihu thread, for instance, collects 28 examples of AI systems behaving in unpredictable ways, including the following experiment simulating digital organisms:

As part of a project studying the evolution of (simulated) organisms, computer scientist Charles Ofria wanted to limit the replication rate of a digital organism. So, he programmed the system to pause after each mutation, measure the mutant’s replication rate in an isolated test environment, and delete the mutant if it replicated faster than its parent. However, the organisms evolved to recognize when they were in the test environment and “play dead” (pause replication) so they would not be eliminated and instead be kept in the population where they could continue to replicate outside the test environment. Once he discovered this, Ofria randomized the inputs of the test environment so that it couldn’t be so easily detected, but the organisms evolved a new strategy, to probabilistically perform tasks that would accelerate their replication, thus slipping through the test environment some percentage of the time and continuing to accelerate their replication thereafter.

This comes from Luke Muehlhauser’s blog post about worries of “treacherous turns” by AI systems, where I first saw the Zhihu thread linked.

Over the past couple years, I’ve gotten a lot of questions that can essentially be boiled down to: Do Chinese people talk about technical AI safety issues? Now, there’s an easy response: Yes, and moreover, it’s gone viral.

***Much thanks to Zixian Ma for help with translating some of the tricky technical sections in the FULL TRANSLATION: The AI Wolf that Refuses to Play the Game Goes Viral

ChinAI Links (Four to Forward)

Must-read: Putting the China Initiative on Trial

Current FBI director Christopher Wray — who, by the way, is President “Restore the Soul of America” Biden’s continuing choice to lead the FBI — once said Chinese espionage poses the “greatest long-term threat” to the future of the U.S.

After reading this detailed report by Karen Hao and Eileen Guo for MIT Tech Review, it’s hard to come to any other conclusion than this one: the FBI’s China Initiative poses a greater long-term threat to the future of the U.S. than Chinese espionage does. Hao and Guo write about Anming Hu’s case:

Observers say the details of the case echo those of others brought as part of the China Initiative: a spy probe on an ethnically Chinese researcher is opened with little evidence, and the charges are later changed when no sign of economic espionage can be found.

According to German, the former FBI agent, this is due to the pressure “on FBI agents across the country, every FBI field office, [and] every US Attorney’s office to develop cases to fit the framing, because they have to prove statistical accomplishments.” 

Should-read: A Global Smart-City Competition Highlights China’s Rise in AI

Good coverage by Khari Johnson, for Wired, of an international AI City Challenge, where “Chinese tech giants Alibaba and Baidu swept the AI City Challenge, beating competitors from nearly 40 nations. Chinese companies or universities took first and second place in all five categories. TikTok creator ByteDance took second place in a competition to identify car accidents or stalled vehicles from freeway video feeds.”

Should-read: Facial recognition tech has been widely used across the US government for years, a new report shows

Rachel Metz, for CNN Business, distills key findings from a recent report by the U.S. Government Accountability Office on the use of facial recognition systems in federal agencies. “At least 20 federal agencies used or owned facial-recognition software between January 2015 and March 2020. . . In addition to being used to monitor civil unrest following Floyd's death, the report indicated that three agencies used the technology to track down rioters who participated in the attack on the US Capitol in January.”

Should-read: Good thread on WeChatization

Zichen Wang provides a good overview of how Chinese public discourse is increasingly going “firstly if not exclusively” to the WeChat part of the internet. Relates to what I wrote about in a previous issue (ChinAI #92) re: a simple test to tell if someone is actually informed on China’s __insert topic X here___. Ask them to list at least five WeChat accounts on topic X that they follow regularly.

Thank you for reading and engaging.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a Predoctoral Fellow at Stanford’s Center for International Security and Cooperation, sponsored by Stanford’s Institute for Human-Centered Artificial Intelligence.

Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).

Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99

ChinAI #147: DeepGlint's Prospectus for Listing on China's STAR Board

Plus, the battleground of American history and the DOJ's China Initiative

Greetings from a world where…

collagen rhymes with apologin’

…Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors). As always, the searchable archive of all past issues is here.

Feature Translation: Shanghai STAR Board Accepts DeepGlint’s IPO Prospectus

Context: Last week, on June 22, the Shanghai STAR board accepted DeepGlint’s prospectus for an IPO listing. They follow other AI companies that have applied and been accepted for a listing on the STAR market, including Yitu (covered in ChinAI #135), CloudWalk (covered in ChinAI #94), Intellifusion Technology, Megvii, and Transn.

DeepGlint [格灵深瞳] garnered a lot of early buzz after its launch in 2013. The computer vision startup appeared set for success: Bill Gates once described it as “very cool” in summer 2014; the founder, Zhao Yong, had previously worked on the Google Glass team after completing a PhD at Brown University; plus, it had financial backing from the likes of ZhenFund and Sequoia Capital.

What happened? Why hasn’t DeepGlint joined China’s “Four Little CV Dragons?” And what does that say about the broader landscape of China’s AI industry? Wu Xin, reporter for jiqizhineng, helps us digest (link to original article in Mandarin) DeepGlint’s 400-page prospectus.

Key Takeaways:

  • In 2020 DeepGlint took in 240 million RMB in revenue. Wu states, “Compared to the 4 AI Little Dragons (Sensetime, Megvii, Yitu, Cloudwalk), these revenue figures are not impressive, and DeepGlint is no longer in the first camp.” Wu claims that DeepGlint “cannot escape the ‘curse’ of AI business” — a crunch on cash on hand. Like many of its competitors, DeepGlint reported large losses mostly because it needs to continuously make high investments in R&D in order to ensure competitiveness.

  • Another factor behind DeepGlint’s cash crunch, Wu suggests, is that DeepGlint’s main clients are government agencies and large companies, who usually follow centralized procurement and budgeting systems. Project and procurement planning happens in the first half of the year; project delivery and payment usually doesn’t get finalized until the fourth quarter of the year. This makes accounts receivables grow faster than operating income, especially since there will be a long payment approval cycle for these government and large state-owned enterprise clients.

  • The article generalizes from DeepGlint’s experience to make a broader point: “This also shows that, no matter what is publicized, China’s AI companies currently are not strong enough to break established industry rules. For example, general contracting companies in the security field usually purchase products from different companies according to customer needs. Most of the time, AI companies can only be integrated (as one component of the solution), and only get a small piece of the ‘cake’.”

So, what’s DeepGlint’s play?

  • Diversify sources of revenue. While city management/public security applications still make up the majority of DeepGlint’s revenues (51.5% in 2020), that figure has dropped from about 80% in 2018. Two other application domains — smart finance and commercial retail products — have significantly increased their contributions to revenue (see table below). According to the article, “DeepGlint’s founder and CEO, Zhao Yong, has previously said in a media interview that although the scale of AI applications in these two domains is not large, it will definitely exceed the security industry in the long run.”

  • The past three years have also seen a sharp increase in the share of revenues from DeepGlint’s smart front-end products (mainly behavior analyzers, smart cameras, face recognition devices, etc.): from 18% of total revenues in 2018 to 70% in 2020. Wu concludes, “This makes DeepGlint more like an AI hardware company.”

  • Still, concerns about a top-heavy customer base linger. For instance, Agricultural Bank of China (one of China’s “Big Four” banks) selected DeepGlint as a qualified supplier for its security equipment project back in 2018. The following year, AgBank accounted for about 33% of DeepGlint’s revenue. As a China Money Network analysis points out, this trend is not unique to DeepGlint. For example, in the first half of 2020, Yitu’s top five customers accounted for more than 60% of its sales.

I’ll leave you with this crisp pie graph from the article. It’s a breakdown of China’s computer vision market shares by leading companies. The color scheme makes it hard to read, but it’s trying to display that 8 companies (including Sensetime with the largest share, followed by Megvii, Hikvision, Yitu, Cloudwalk, and three more) occupy more than 50% of the market. What about DeepGlint? It’s in the “other” category.

ChinAI Links (Four to Forward)

Must-read: History as End: 1619, 1776, and the Politics of the Past

One thing that’s been lingering in my mind over the past half-year or so is this State Department memo titled “Elements of the China Challenge,” which represented a last-ditch attempt by the outgoing Trump administration to provide a blueprint for great power competition with China. The memo’s first citation gives you a glimpse of the writers’ sense of their own grandeur: “For another turn to authoritative assumptions and governing ideas to explain the conduct of a great-power rival, see George Kennan, ‘The Long Telegram,’ February 22, 1946.”

In the concluding section, the memo picks 10 tasks for America to “refashion foreign policy” against the China challenge. Somehow, this made it into the top 10 most urgent priorities (p. 49):

“The US must reform American education... Sinister efforts from abroad seek to sow discord in the United States. And America's grade schools, middle schools, high schools, and colleges and universities have to a dismaying degree abandoned well-rounded presentations of America's founding ideas and constitutional traditions in favor of propaganda aimed at vilifying the nation.”

So, here’s where we stand: the main strategic arm of the U.S. Department of State believes that one of the ten most important things we should be doing to compete with China is to stop “vilifying the nation” when teaching America’s founding story in our schools. That’s where this week’s must-read comes in. A prescient essay in Harper’s by Matthew Karp on the 1619 Project, the sincerity of conservatives’ and liberals’ commitments to history, and the debate over America’s “true founding” date:

“…[T]he debate cannot be resolved by an appeal to scholarly rigor alone. The question, as The Atlantic’s Adam Serwer has written, is not only about the facts, but the politics of the metaphor: ‘a fundamental disagreement over the trajectory of American society.’ In a country that is now wealthier than any society in human history but which still groans under the most grotesque inequalities in the developed world—in health care, housing, criminal justice, and every other dimension of social life—the optimistic liberal narrative put forward by Kennedy and Clinton has ceased to inspire. Some commentators have rushed to declare Joe Biden a transformational president on the basis of his large stimulus bill, but Biden’s chastened brand of liberalism remains less notable for what it proposes than what it removes from the horizon: universal guarantees for health care, jobs, college education, and a living wage. Although Biden may still invoke Obama’s ‘arc of the moral universe’ on occasion, the metaphors that brought him to power, and that still define his political project, are not about the glories of progress but the need for repair: ‘We must restore the soul of America.’ In a country so deeply riven by injustice—with violence and oppression coded into its very DNA—what more could be hoped for?”

Should-read: “Ridiculous Case”: Juror Criticizes DOJ for Charging Scientist with Hiding Ties to China

For The Intercept, Mara Hvistendahl interviews Wendy Chandler, one of the jurors on a closely watched case against former Univ. of Tennessee scientist Anming Hu, brought under the DOJ’s China Initiative. As Mara summarizes: “The U.S. government charged Hu with three counts of wire fraud and three counts of making false statements in connection with a NASA grant . . .But court documents and courtroom testimony showed that prosecutors brought fraud charges only after nearly two years of surveilling Hu and failing to find evidence that he was involved in a more serious crime related to spying or technology transfer.” Here’s the juror again:

“It was the most ridiculous case,” she said. About the FBI, she added: “If this is who is protecting America, we’ve got problems.”

In case it wasn’t clear, the first two recommendations in this week’s ChinAI are linked in many ways. The American story is a continual struggle over how exclusively we define the bounds of what it means to be an American. Let me make it more clear: if you’re serious about restoring the soul of America, you should end the China Initiative.

Should-read: Open Sesame x Chaoyang Trap: Taobao Ethics / Cursed eCommerce / Crab Justice

I’ve really been enjoying this newsletter about daily life on the Chinese internet. In this issue Chaoyang Trap presents a special preview of Open Sesame Magazine’s latest issue: “Made by the graphic design studio Lava Beijing, Open Sesame is an irregular print magazine devoted to the weird and wonderful world of Taobao, *the* Chinese online shopping site. The theme of this third issue is ‘Taking Responsibility,’ examining the ethics, blurred or otherwise, of internet shopping and the choices we make when confronting this behemoth of a platform.”

Should-listen: McKinsey Global Institute podcast

I had the chance to talk about U.S.-China technological competition, unsexy AI, some of my favorite past issues of ChinAI, and more on McKinsey Global Institute’s Forward Thinking podcast. Thanks to Michael Chui for a great conversation on AI — we got to dig into the weeds because he’s worked on impressive reports on China’s digital economy and has a background as an AI practitioner. Check out previous episodes of the podcast here.

Thank you for reading and engaging.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a Predoctoral Fellow at Stanford’s Center for International Security and Cooperation, sponsored by Stanford’s Institute for Human-Centered Artificial Intelligence.

Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).

Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99

ChinAI #146: Prof. Dai Jinhua on Information Cocoons

Plus, a dip into knowledge-based video content on Bilibili

Greetings from a world where…

I buy my coffee and I go

Set my sights

On only what I need to know

…Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors). As always, the searchable archive of all past issues is here.

***Answers to last week’s WuDao Turing Test: The top answers were WuDao-generated. Thanks to everyone who guessed!

The Bad Intentions in Personalized Recommendations?

Context: Most readers have heard about China’s incredible boom in short videos (see, e.g., Douyin, the Chinese version of TikTok). Another trend, perhaps less recognized, is that platforms like Douyin and Bilibili are expanding to longer knowledge-based video content. This week’s translation, a Bilibili video (in Mandarin), features an interview with Dai Jinhua, a Peking University professor who teaches women’s studies, cultural studies, and film. She’s one of China’s “most influential academic and public intellectuals.” In this 7 min. video posted last week (now up to 225,000 views), she discusses her online shopping habits, quotes Greta Gerwig, and shares her concerns about big data and recommendation systems.

Full Translation:

Intro has some light-hearted bloopers, including Prof. Dai accidentally addressing another platform (instead of Bilibili). She also says: “I’ve replied too seriously again; my style is not easygoing enough.” Dai also introduces herself as a lifelong lover of film and expresses her excitement at sharing ideas about social and cultural issues with young friends on new media platforms like Bilibili.

Substantive interview content starts here:

QUESTION: Prof. Dai, are you shopping online?

DAI: I'm a midnight shopaholic

QUESTION: Then when you see that the products, movies, music, or books recommended to you are all your favorite types, will you read those recommendations or will you deliberately avoid them?

DAI: I will look at the recommendations but they often backfire. I would suspect that they have bad intentions (laughs at herself).

QUESTION: Do we need to fear big data?

DAI: My feeling is that we don’t have to fear big data. What I’m talking about is that we don’t have to fear big data as a new potential social method.* As a method of fetching social information, it has been meaningful and effective to date. Then this technological advancement is applied to the collection of social information and the determination of social conditions.

I thought this was a very interesting development and step forward, but regarding us as individuals, this big data has to a large extent begun to control social life. Social information controls the channels through which we obtain information and establishes a connection between us and society.

Then when we allow our imaginations to unfold, the word I would probably use is vigilance (警惕). I hope everyone will be alert to all kinds of expressions that appear in the name of big data. I hope everyone will be wary of the emergence of big data as a means of forming societal surveillance. I hope everyone is wary of the limitations imposed by big data on our lives. That is, there’s that fashionable term that everyone uses called "information cocoons."

An info box pops up on screen: Information Cocoons [信息茧房] refers to the phenomenon in which people are habitually guided by their own interests in the information they pay attention to, thus shackling their lives in a “cocoon,” like a silkworm.

In fact, in my understanding, the information cocoon has at least two aspects that will create our problems. In one aspect, the so-called information cocoon is that we set our own limits within its range of choices because we try to pursue a kind of knowledge that we are familiar with. Then, our pursuit of knowledge is not to gain knowledge and development. We pursue knowledge for repetition, verification, and assurances of safety. At this time, we have formed an information cocoon by ourselves because we don’t want to cut into knowledge we don’t know. We don’t want to intersect with online information that makes us unhappy.

This is one aspect. At this time, I say that no matter what big data tells you, we still have to work hard to explore, to obtain knowledge. But on the other hand, what kind of information is delivered based on big data, like personalized customization — these so-called special projects that supposedly serve you. Yet in fact, they must inevitably delineate a boundary. They inevitably mark out a limit. This inevitably makes us willing to become like the Monkey King sleeping peacefully in a small position in the hands of the Buddha.

We sleep in that small position and think we possess the whole world. We think that the world is our comfortable rocking chair.

But the world does not become better because we live safely in the information cocoon, nor because we have constructed this information cocoon by ourselves. Everything in the world is happening. There are many catastrophic things happening. We need to know about and recognize them.

Besides, I have a judgment that may be a bit alarmist, but I want to share it with the young friends here. I think that human civilization has experienced this information revolution and then the whole world has undergone drastic changes, and then globalization via the Internet and Internet of Things has become the reality of our daily lives. When this becomes such a real structural existence that each of us truly feels, our old knowledge is, in fact, already outdated. We have no precedent to cite. We cannot use our old knowledge to explain what is happening around us.

At a time when we should be asking and pursuing questions, and exploring, this type of safe knowledge, this type of omnipotent knowledge, this type of comfortable generation of knowledge is actually a structure of self-hypnosis and self-suggestion, because no matter how much we want our safe life in a limited space, it is difficult for us to make it a reality.

In the end, we have to face a reality that is radically changing, challenging, and cruel. So in this sense I say be vigilant of the big data moniker. Be wary of the boundaries that big data defines for us. Let’s ask questions together. Let’s explore together. Let’s meet challenges together and try to smash obstacles instead of letting obstacles smash us.

QUESTION: Do you have any anxiety about information overload [信息过载]?

DAI: As for information overload, I was anxious, but then I found out that I learn new knowledge relatively fast. For example, I understand the meaning behind the slang words my students use, and I think it’s not difficult to communicate with them in this language. That’s actually the easiest part.

What you are using information overload to describe: the critical part of that is not the information overload, the critical part is the constant rotation. The critical thing about such rapid changes is that everyone seeks to chase fashion. And every one of our questions is answered by searching for answers through search engines. So we don’t feel that there is another process in which we calm down and think about how we find answers through reading. We get the standard answer in a few seconds online, and we think that our question was answered.

So I think this is really related to our topic. The director of Lady Bird said that we must be bored to a certain extent before we can achieve something (translator note: possible reference to the fifth Greta Gerwig quote in this link).

At first, that sentence startled me.

I finally went to read a bit more before I understood that the so-called boredom she’s talking about is not the daily boredoms we’ve all experienced.* It’s the boredom in looking for an answer in our minds for an instant and thinking that we’ve arrived.

We don’t have the kind of process that lets thoughts sweep (掠过)* past our minds. Some questions are gradually formed in our hearts. I think this is the bigger problem. After we get the answer quickly, we think we have solved the problem. Actually the question is much bigger and more complicated than that answer.


*= uncertain about my translation in this area. For those interested in reading the original Chinese, here’s my attempted transcription of the video in this Google doc link.

ChinAI Links (Four to Forward)

Jeff should-read: After the Post–Cold War — The Future of Chinese History

Blurb from Duke University Press: “In After the Post–Cold War eminent Chinese cultural critic Dai Jinhua interrogates history, memory, and the future of China as a global economic power in relation to its socialist past, profoundly shaped by the Cold War. Drawing on Marxism, post-structuralism, psychoanalysis, and feminist theory, Dai examines recent Chinese films that erase the country’s socialist history to show how such erasure resignifies socialism’s past as failure and thus forecloses the imagining of a future beyond that of globalized capitalism. She outlines the tension between China’s embrace of the free market and a regime dependent on a socialist imprimatur. She also offers a genealogy of China’s transformation from a source of revolutionary power into a fountainhead of globalized modernity. This narrative, Dai contends, leaves little hope of moving from the capitalist degradation of the present into a radical future that might offer a more socially just world.”

Must-read: Standards Bearer? A Case Study of China’s Leadership in Autonomous Vehicle Standards

In an analysis for MacroPolo earlier this month, Matt Sheehan evaluates claims about growing Chinese influence in standard-setting organizations (SSOs) by “getting in the weeds of actual SSOs writing specific technical standards.” He drills in on a working group on test scenarios for autonomous vehicles — WG 9 in an ISO committee — which marks “the first time that China convened a WG on auto standards at the ISO.” This case study provides a nuanced take on a number of important issues, including a rejoinder to the more alarmist narratives of Chinese dominance in SSOs and a multidimensional view of the influence of the Chinese government bureaucracy in engaging with international SSOs.

Should-read: Thread filled with survey papers in the machine learning and NLP fields

Should-read: AI Innovation Zones in China

By Sofia Baruzzi for AmCham China: “On February 20, 2021, the Ministry of Industry and Information Technology (MIIT) issued a circular to support the creation of five new AI innovation zones. This will raise the total to eight, as three of such zones are presently set up in Shanghai (Pudong New Area), Shenzhen, and Jinan-Qingdao. The new AI innovation zones include Beijing, Tianjin (Binhai New District), Hangzhou, Guangzhou, and Chengdu. Each of them will be built to pursue a specific purpose, as explained directly by the MIIT’s circular.”

Article contains a nifty table that summarizes the key activities in each of the new zones.

Thank you for reading and engaging.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a Predoctoral Fellow at Stanford’s Center for International Security and Cooperation, sponsored by Stanford’s Institute for Human-Centered Artificial Intelligence.

Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).

Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99

ChinAI #145: Enlightenment via Large Language Models

Writing poetry w/ WuDao 2.0, the World’s Largest Language Model

Greetings from a world where…

we’re still all voting 5x a day for Shohei Ohtani to be in the MLB All-Star game, right?

…Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors). As always, the searchable archive of all past issues is here.

WuDao Turing Test

I’ve had a few readers ask me to write about the Beijing Academy of Artificial Intelligence’s (BAAI) release of WuDao 2.0 (悟道 means to attain enlightenment), the latest Chinese version of GPT-3.

Alex Friedland, in CSET’s policy.ai newsletter, had a good summary of the English-language coverage to date:

Chinese Researchers Announce the Largest “Large Language Model” Yet: A new natural language processing (NLP) model announced last week by the state-funded Beijing Academy of Artificial Intelligence (BAAI) is the largest ever trained. Wu Dao 2.0 has 1.75 trillion parameters — dwarfing GPT-3’s 175 billion parameters and even the 1.6 trillion parameters of Google’s Switch Transformer — and while the relationship between parameters and sophistication is not one-to-one, it is generally a good indicator of a model’s power. In addition to its high parameter count, Wu Dao 2.0 does more than just NLP — it is a multimodal system trained on 4.9 TB of text and images, meaning it can perform image recognition and generation tasks in addition to the text processing and generation tasks of traditional NLP. While BAAI has yet to publish a paper elaborating on the performance of Wu Dao 2.0, a handful of released results showed impressive performance: The model achieved state-of-the-art results on nine common benchmarks, surpassing previous juggernauts such as OpenAI’s GPT-3 and CLIP and Microsoft’s Turing-NLG.

So, what else? Without the published paper or any examples of WuDao 2.0 output, there’s only so much we can learn. Let’s try anyways, using Chinese-language coverage of the release and examples from WuDao 1.0, a much smaller model (2.6 billion parameters) released three months earlier.

How we got here: In March 2021, BAAI released WuDao 1.0, which they deemed China’s first super-large-scale model system. Note: see ChinAI #141 for another Chinese GPT-3-esque model released in May from a Huawei-led team.

In an interview with AI科技评论(aitechtalk) about WuDao 1.0, Tsinghua Professor Jie Tang, who leads the WuDao team, previewed what was coming next: “We will also propose a hundred billion-level (parameter) model this year.” Three months later, enter WuDao 2.0, clocking in at 1.75 trillion parameters.

*From my initial read of things, WuDao 1.0 is to WuDao 2.0 as GPT-2 is to GPT-3. Put simply, WuDao 1.0 introduced most of the new innovations in model training (e.g. FastMoE), and then WuDao 2.0 added many times more parameters and was trained on more data. Recall that GPT-2 was 1.5 billion parameters, which is about the size of WuDao 1.0.

This means learning more about WuDao 1.0 can help us understand its successor better. Here are some key points from the aitechtalk piece linked earlier:

  • The WuDao team emphasize the significance of cross-lingual language model pretraining. Here’s Professor Tang again: “This is very different from GPT-3. We are trying some new methods, such as fusing together pre-trained models of different languages. The fusion method is to use cross-lingual language models to connect the expert models of different languages together, so that the model can be gradually expanded.”

In that same piece, Zhilin Yang, a key member of the WuDao team and co-founder of Recurrent AI, outlined three other key achievements in WuDao 1.0. I’ve linked the corresponding arxiv papers.

  1. A more general-purpose language model (GLM). Applies one model to all NLP tasks rather than using one pre-trained language model for classifying text and another model for generating text.

  2. P-tuning: claims to be a better way to fine-tune GPT-like models for language understanding.

  3. Inverse prompting. The intuition: use the generated text to predict the prompt (see the sketch after this list).
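To make that intuition concrete, here is a minimal, hedged sketch of inverse-prompting-style reranking. It is my own illustration of the idea, not WuDao’s actual implementation; the lm_log_prob scoring function and the prompt template are assumptions, not a real BAAI/WuDao API.

```python
from typing import Callable, List

def inverse_prompt_rerank(
    prompt: str,
    candidates: List[str],
    lm_log_prob: Callable[[str, str], float],
) -> List[str]:
    """Rank candidate generations by how well they 'predict back' the prompt.

    lm_log_prob(context, target) is a placeholder for any language model that
    returns log P(target | context); it is NOT a real WuDao API.
    """
    def score(candidate: str) -> float:
        # Build an inverse prompt: condition on the generated text and ask the
        # model how likely the original prompt is (template is illustrative).
        inverse_context = f'The following text: "{candidate}" describes:'
        return lm_log_prob(inverse_context, prompt)

    return sorted(candidates, key=score, reverse=True)

# Usage idea: sample several candidate continuations with ordinary decoding,
# then keep the one whose content makes the original prompt most probable:
# best = inverse_prompt_rerank(prompt, candidates, lm_log_prob)[0]
```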

So, let’s look at some examples of WuDao 1.0 output from this WuDao Turing Test site, which I found on this Zhihu thread about the release of WuDao 2.0. Basically, it’s a platform that tests whether the average online user can distinguish between human-generated and WuDao-generated text AND images across a range of tasks, including poetry composition, Q&A, making drawings based on captions, etc.

*Remember, we don’t have any examples of WuDao 2.0 output yet, at least to the best of my knowledge, but we can expect it to have better performance than the examples below, just like GPT-3 significantly outperformed GPT-2.

With that qualification in mind, let’s read some enlightened poetry. Can you tell which one was written by a real human poet, and which one was generated by WuDao 1.0?

Here’s my attempt to not completely butcher the translations for both.

Same title and author for both: Reading《尉迟鄂公敬德》* ; Author: Bai Juyi

*I think “鄂公” is a reference to this Tang-era work by another author. Bai Juyi is a renowned Chinese poet of the Tang dynasty.

Top poem translation:

First-class thirty years ago,

To date, the temple remains at the east end.

Since ancient times, it is said of Master Wen,

Why go here to study the distressed and imprisoned.*

*楚囚 originally referred to a figure from the Chu State who was captured during the Spring and Autumn Period. Later, it became a general term that referred to imprisoned and distressed people.

Bottom poem translation:

Lying high and dwelling deep — seeing no one,

fame can be shaken off like dust.

All that’s left is a piece of Qing Shang music,*

under the moon before a breeze accompanying my old body.

*Qing Shang music (清商乐) a genre of music that emerged back in the Three Kingdoms period.

So, which one was written by Bai Juyi, and which one was written by WuDao? Reply or comment your guess, and I’ll share the answer next week.

Now, let’s turn to the Q&A section. Same questions, two answers: one by a human, one by WuDao 1.0:

Here’s the question text: Two people like each other but know nothing can come from it, should they still stay together?

Top answer:

First off ~ I can’t understand it. You’re close but you’re also distant. And then there is guilt, regret, and then ~ tears and disturbance. Ultimately, you realize it, that actually life is just as you see it at first. The ideal is not necessarily to be an outstanding person, and to get happiness proportional to what you give is probably the only happiness. After all, people can't get the perfection that fills their lives, but despite their losses at least there is no place to feel sorry about yourself. Lastly, I’ll attach the text 《Starting with “A Man and Woman Losing Contact”》. Ruth Benedict, the author of this text, is an author I often quote when writing essays. Also, thank you for your question. Thanks to everyone who reads, likes, or responds.

Bottom answer:

I’ve seen this question on occasion. I’m a little bit at loss for words. Emotional people will be like moths, throwing oneself at the fire without asking about the results. Rational people will leave in time to find the right person. I’m the type that’s halfway in between, unable to find the words and only able to cry choked tears. How can I find an answer that I couldn’t express myself for years? Ugh. I don’t think even someone who could solve Goldbach’s conjecture would be able to figure out this answer.

Let me know which one you think is the human answer! If any readers want to practice their Chinese, significantly improve upon my efforts, and play around with WuDao Turing Test site, I’d love to include better examples in following issues.

ChinAI Links (Four to Forward)

Should-reread: China reportage recognized by Pulitzer

Go reread work by teams at BuzzFeed, The New York Times, The Wall Street Journal, and the Associated Press (international reporting and investigative reporting categories). Paul Mozur said it best:

Should-read: Behind the painstaking process of creating Chinese computer fonts

In MIT Tech Review, Stanford professor of Chinese history Tom Mullaney gives us an intricate view into how designers created digital bitmaps of Chinese characters, and all the attendant challenges.

Should-read: Artificial intelligence in China’s revolution in military affairs

For the Journal of Strategic Studies, Elsa Kania examines the People’s Liberation Army’s strategic thinking about AI. She argues, “The PLA’s approach to leveraging emerging technologies is likely to differ from parallel American initiatives because of its distinct strategic culture, organisational characteristics, and operational requirements.” The paper builds on her meticulous analysis of the PLA’s approach to AI, based on military textbooks and writings by researchers at the PLA Academy of Military Science.

Should-read: Attitudes Towards Science, Technology, and Surveillance in 49 Countries

Yiqin Fu has a new blog post that covers public opinion on science, technology, and surveillance across 49 countries. It’s relevant to last week’s ChinAI issue on cross-national differences in enthusiasm and optimism toward AI.

Thank you for reading and engaging.

These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a Predoctoral Fellow at Stanford’s Center for International Security and Cooperation, sponsored by Stanford’s Institute for Human-Centered Artificial Intelligence.

Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).

Any suggestions or feedback? Let me know at chinainewsletter@gmail.com or on Twitter at @jjding99
