Rest of World received exclusive access to a platform that tracks patterns and timing of Chinese online censorship.
By JOANNA CHIU
18 SEPTEMBER 2024
GFWeb provides precise tracking of when Chinese authorities block domains, including sites like ChatGPT, Hugging Face, and Perplexity.
Chinese censors appear focused on blocking AI tools for content generation, such as video and image editing applications.
Censorship spikes might be linked to significant events, such as China’s introduction of AI regulations.
A few months after OpenAI launched ChatGPT in November 2022, the service began to take off in China, with citizens using it to satirize pro-government figures and for homework help. Because OpenAI restricted access to China-based users, local developers created mirror sites to facilitate access to the service. But the ChatGPT boom in China was short-lived. The Chinese government blocked ChatGPT’s domain on March 2, 2023, new research has found.
Historically, tracking exactly when Chinese authorities blocked specific domains was difficult because researchers had to choose which individual domains to test. But according to a newly launched platform, GFWeb, which granted Rest of World exclusive first access, the same month that the Chinese government blocked ChatGPT for the first time, authorities also blocked dozens of alternative chatbots and websites that use ChatGPT’s technology. Rest of World also discovered that Hugging Face, the popular machine-learning platform, was blocked in China months before the company reported issues.
GFWeb is now available to the public for free and continuously tests millions of websites from both inside and outside China to identify when exactly they are no longer available to users in China. It detects which sites are blocked by leveraging the Great Firewall’s unique filtering behaviors. The service is primarily funded by the nonprofit Open Technology Fund and received research input from faculty at the University of British Columbia, University of Toronto, University of Chicago, and Stony Brook University.
“This system not only enhances our ability to track the timing and scope of censorship events but also helps identify patterns and shifts in the strategies employed by the Great Firewall,” Nguyen Phong Hoang, the platform’s developer and a University of British Columbia computer scientist, told Rest of World. “I hope GFWeb can empower researchers, policymakers, and the general public to gain deeper insights into the evolving landscape of China censorship.”
It was previously unclear when Hugging Face was first blocked in China. In October 2023, the company reported “regrettable accessibility issues” in the country. GFWeb data suggests that Huggingface.co was in fact blocked on May 7, 2023, months before the company identified the issue.
Data from GFWeb allows observers to spot long-term trends. For instance, it shows that Chinese authorities are particularly concerned with AI tools used for content generation. Besides websites that appear to use ChatGPT’s technology, most of the blocked AI websites offer tools that assist with video and image editing, including services like OpenArt and VoiceDub.
This suggests the Chinese Communist Party is “quite sensitive to content-generation platforms not controlled by the regime. That’s the main threat,” Jeffrey Ding, assistant professor of political science at George Washington University and a leading expert on China’s technological capabilities, told Rest of World.
“Blocking AI sites might not keep developers from using VPNs [virtual private networks] to access those tools but presents friction for the average Chinese person to use AI to generate politically sensitive content, such as a video making fun of CCP leaders, or a couplet about Chinese corruption,” Ding said.
He pointed to a viral AI-manipulated video that circulated on Western social media platforms last year showing Chinese President Xi Jinping saying flattering things about American society in fluent English. It contained numerous inaccuracies, and Xi has never given a full address in English.
AI-manipulated content that can misrepresent Chinese leaders is exactly the kind of content that Beijing does not want to see on its own social media, Ding said.
Rest of World’s analysis using the tool found a correlation between spikes in censorship activity and significant events, such as the passing of new AI regulations. For instance, hundreds of religion-related AI websites, including Biblechat.ai and Church.ai, were blocked this spring — coinciding with a surge in launches of spirituality-related AI applications.
In January 2023, China became one of the first countries in the world to introduce rules governing AI deepfake technology and, in August 2023, to enforce comprehensive generative AI regulations. Those dates also correspond to increased blocking of AI domains.
The data suggests authorities are applying preexisting censorship protocols to new AI technology, according to Jeremy Daum, a senior fellow at Yale Law School’s Paul Tsai China Center.
“The big names with heaviest traffic are going to get blocked. Beyond that, there seems to be a period of discovery, so you see batches of sites getting censored,” he told Rest of World. “The process is always mysterious with some automation at lower levels, but when blocking comes in batches, that’s usually a decision by CAC [Cyberspace Administration of China] officials.”
Popular AI chatbots from Chinese companies that must comply with censorship regulations include Baidu’s Ernie Bot, Alibaba’s Tongyi Qianwen, and ByteDance’s Doubao, as well as a range of chatbots from Chinese startups.
Phong was previously the lead researcher on GFWatch, a large nine-month study that provided insights into China’s domain-blocking behaviors and paved the way for the design of GFWeb. He has worked on smaller-scale studies on internet censorship in Turkmenistan and parts of the Middle East.
Charlie Smith, who uses a pseudonym for safety reasons, is a co-founder of GreatFire.org, which offers tools for analyzing and circumventing Chinese internet censorship. Smith said Phong’s work opens up exciting possibilities for researchers.
“Knowing the exact date of censorship is helpful for a number of reasons. We can, for instance, identify if sites are getting blocked around certain events,” he told Rest of World. “It also helps to show how the authorities block these sites. Do they block mostly on Mondays? Do they work on the weekends? We would be able to identify if the authorities block websites according to a pattern.”
“The GFWeb system should encourage more people to initiate tests themselves, which can greatly expand what we know about China’s internet controls,” Smith said.