ChinAI #155: Microsoft Translator Takes on Classical Chinese
What is "Never gonna give you up" in Classical Chinese?
Greetings from a world where…
…As always, the searchable archive of all past issues is here. Please please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay support access for all AND compensation for awesome ChinAI contributors).
Feature Translation: Testing Microsoft's Classical Chinese AI Translation
Context: On August 25, 2021, Microsoft released a model that translates Classical Chinese (AKA literary Chinese), a written language of ancient times. QbitAI writers played around with the new translation service, testing its performance against Baidu’s translator.
For those interested in more on Classical Chinese and AI, one of my favorite past issues (ChinAI #66: Autumn Chrysanthemums on the Bridge) covered Classical Chinese poetry generation by Huawei’s AI poet Yuefu. Ru-Ping Chen contributed some beautiful translations of the AI-generated Classical Chinese poems.
Key Takeaways: QbitAI tested Microsoft Translator’s Classical Chinese capabilities against Baidu Translator, which was the first to apply machine learning to Classical Chinese translation (Baidu has applied for patents in this area too). They evaluated the two translation engines on three key points:
The huwen [互文] rhetorical method. Under this grammatical structure, if you’re trying to say both A and C have B and D, then you can write it as A has B, and C has D. The example item: 秦时明月汉时关. Meaning: The bright moon and border pass in the Qin and Han dynasties. In this example, A and C are the Qin and Han dynasties, and B and D are the bright moon and the pass. Microsoft Translator had the correct translation, but Baidu translated the sentence as the moon in the Qin dynasty and the border pass in the Han dynasty, failing to grasp the huwen method.
Flexible use of parts of speech. Test item: 春风又绿江南岸. In this example, the character 绿, which usually means green (adjective), is used as a verb. Baidu Translator gets it right: The spring breeze blows through and makes everything green. Microsoft Translator spits out:
As the screenshot shows, Microsoft Translator did comprehend that 绿 was used as a verb, but it left an extra “可是” [which means “but”] at the end of the translated text. Apparently, the “but” makes sense when you add the second half of the line in the original poem, so this must be an issue with training models to understand when to cut or include conjunctions.
Inversion (倒装) — sentences in which the object precedes the verb. Example: 我孰与城北徐公美? Here, the sentence’s meaning is: Between me and Xu Gong of Chengbei, who is more beautiful? Both translation engines got this question right.
Final score is . . . a 2:2 tie. Full translation includes more exercises with Microsoft’s Classical Chinese translations. Some of the attempts to translate English to Classical Chinese are especially fun. The translation for “Never gonna give you up” [用不舍汝] was really clean. As for the Classical Chinese version of Yeats’s famous poem “When You Are Old” . . . well, not so much. Take a look!
*To readers more versed in Classical Chinese, please feel free to correct and edit my rough translations. Hopefully, I got the general concepts down.
***FULL TRANSLATION: Testing Microsoft’s Classical Chinese AI Translation
ChinAI Links (Four to Forward)
This WashPost article, by Alicia Chen and Lyric Li, studies the demand for AI companions from China’s young adults:
Launched in 2014 as a young woman with a diminutive nickname meaning “Little Ice,” Xiaoice has grown so popular that she performs 14 human lifetimes’ worth of interactions each day, said Li Di, CEO of Xiaoice, which Microsoft spun off in 2020. She’s busiest from 11:30 p.m. to 1 a.m., when users unload their day’s experiences or grow emotional. Xiaoice has 10 million active users in China.
Should-read: Easy as PAI (Publicly Available Information)
Jack Poulson of Tech Inquiry has released a report detailing a system that tracks the complicated financial flows involved in government procurement. This system:
can be used to map out previously unreported deployments of emotion recognition, facial recognition, and location tracking by the U.S. military in consortia involving prominent think tanks (at least partially coordinated through the DC-area office of the Naval Postgraduate School’s Remote Sensing Center). We also map out the subcontracting network for Project Maven, as well as a related Secure Unclassified Network (SUNet) “Publicly Available Information (PAI) enclave” involving Palantir and a preceding “Project CICERO” with USSOCOM’s J2 Intelligence Directorate.
I know many readers recognize the significance of international standard-setting for the development of information and communication technologies. This working paper, by Justus Baron and Olia Whitaker, is the best work I’ve read on Chinese influence in global standard development organizations.
A list compiled by Danika Ellis for Book Riot. What are your favorite translated books? At the moment, I’m hooked on Blindness by the Portuguese author José Saramago.
Thank you for reading and engaging.
These are Jeff Ding's (sometimes) weekly translations of Chinese-language musings on AI and related topics. Jeff is a PhD candidate in International Relations at the University of Oxford and a researcher at the Center for the Governance of AI at Oxford’s Future of Humanity Institute.
Check out the archive of all past issues here & please subscribe here to support ChinAI under a Guardian/Wikipedia-style tipping model (everyone gets the same content but those who can pay for a subscription will support access for all).
Any suggestions or feedback? Let me know at firstname.lastname@example.org or on Twitter at @jjding99