Exposing a Chinese AI Censorship Machine: Data Leak Reveals Shocking Details


A protest over poverty in rural China, a report on a dishonest Communist Party member, and a plea for help against corrupt police extorting business owners: these are just some of the 133,000 cases fed into an advanced language model designed to automatically flag any content the Chinese government deems sensitive.

A leaked database accessed by TechCrunch reveals that China has built an AI system that strengthens its existing censorship mechanisms and extends well beyond traditional red lines such as the events at Tiananmen Square.

The system appears to be aimed primarily at censoring Chinese residents online, but it could also serve other purposes, such as reinforcing the already extensive censorship built into Chinese AI models.

Xiao Qiang, a researcher at UC Berkeley who studies Chinese censorship and examined the dataset, told TechCrunch that it is "clear evidence" that the Chinese government wants to use language models to strengthen repression.

The discovery adds to the evidence that authoritarian governments are rapidly adopting cutting-edge AI technologies. In February, for instance, OpenAI disclosed that several Chinese entities had used language models to monitor anti-government content and smear Chinese dissidents.

The Chinese Embassy in Washington, D.C. told TechCrunch that it condemns baseless attacks on China and stressed the country's commitment to the ethical development of AI.

Security researcher NetAskari discovered the dataset in an unsecured Elasticsearch database hosted on a Baidu server. There is no indication that either company was involved.

The creators of the dataset have not been identified, but records show that the data is recent, with the latest entries dated December 2024.

In language resembling how people prompt ChatGPT, the system instructs an unnamed language model to identify content related to sensitive subjects in politics, society, and the military. Such content is classified as "highest priority" and must be identified immediately. Top priorities include hot-button issues in China such as pollution scandals, food safety problems, financial fraud, and labor disputes, which have previously sparked public demonstrations such as the Shifang anti-pollution protests in 2012.
