[Update May 6 17:00: added information on the context within China’s overall internet censorship.]
The thing about censorship is that, when done well, no one really knows what’s being censored. This is why last week’s leaked documents from Baidu, the largest Chinese-language search engine and blogging site, are so titillating. Maybe someone screwed up badly, or maybe someone on the inside had an attack of transparency; whatever the reason, we now have a huge pile of documents detailing Baidu’s censorship policy during the period from November 2008 to March 2009.
The documents, now safely ensconced in a permanent home on Wikileaks, reveal for the first time a detailed inventory of the Chinese government’s priorities for, er, harmonization. There is a blacklist of 798 specific URLs, most of which seem to be recent news articles and discussion forum posts on sites both inside and outside of China. Far more interesting is a long list of sensitive keywords. Included policy documents suggest that the appearance of any of these terms in a blog post triggers a manual review by the staff of Baidu’s censorship team, whose names are listed in another of the leaked documents! While some of these topics have long been outright censored, such as “Tiananmen Square,” others are more general categories to be watched. Taken together, these sensitive terms are a fascinating portrait of China’s institutional paranoia.
Some categories are obvious, such as “Taiwan” and “naked chat.” Other areas are shockingly broad, such as “power” and “tyranny.” Certain media outlets such as Voice of America are considered unacceptable, and “SMS the answer” is forbidden within the “exam information” section. Also, China does not have any ketamine, AIDS, or ethnic conflict, and frowns upon one-night stands. The main document of interest begins,
中办发 国办发 温州 鬼村 段桂清 四川广安 广安事件
中组部前部长直言 动物园 集会 涿州 饲养基地 中石油国家电网倒数 张文中 华闻 王政
假冒 记签 校园改造工程 雍战胜 死刑现场 冯巩 陶虹 高勤荣
And I can’t read that either, so below is an automated translation, via The Dark Visitor, who clearly used something more formidable than Google Translate. Still, machine translation really doesn’t work as well as one might like, or perhaps “electric chicken” makes perfect sense in context.