Saturday, June 03, 2023

AI hypocrisy: OpenAI, Google and Anthropic won't let their data be used to train other AI models, but they use everyone else's content

Story by insider@insider.com (Alistair Barr) • Yesterday

Samuel Altman, CEO of OpenAI, testifies before the Senate Judiciary Subcommittee on Privacy, Technology, and the Law May 16, 2023 in Washington, DC. 
Win McNamee/Getty Images© Provided by Business Insider

Microsoft-backed OpenAI, Google and Anthropic ban the use of their content to train other AI models.

However, these companies have been using other online content for their own model training. 

Can Big Tech have it both ways? Reddit and others are trying to stop this.

In the new age of generative AI, big tech companies are following a "do as I say, not as I do" strategy when it comes to the use of online content.

Microsoft-backed OpenAI, Google and Google-backed Anthropic have for years been using online content created by companies to train their generative AI models. This was done without asking for specific permission, and it's part of a brewing legal battle that will decide the future of the web and how copyright laws are applied in this new world.

The tech industry will likely argue that its approach is fair use. That has yet to be decided. However, these big tech companies won't let their own content be used to train other AI models. So why should they be allowed to do this to everyone else?

Take a look at the terms of service for Claude, Anthropic's AI assistant:

"You may not access or use the Services in the following ways, and if any of these restrictions are inconsistent with or ambiguous in relation to the Acceptable Use Policy, the Acceptable Use Policy controls: To develop any products or services that compete with our Services, including to develop or train any artificial intelligence or machine learning algorithms or models."

Here's an excerpt from the top of Google's generative AI terms of use:

"You may not use the Services to develop machine learning models or related technology."

And here's the relevant section from OpenAI's terms of use. This is the company behind ChatGPT.

"You may not... use output from the Services to develop models that compete with OpenAI."

These companies are not dumb, but they are hypocritical

These companies are not dumb. They know that quality content is vital for training new AI models. So it makes sense that they won't allow their output to be used this way.

But why would any other website or company let their content be freely used by these giant tech companies to train their models?

Insider asked OpenAI, Google and Anthropic for comment on Friday. At the time of publication, they had not responded.

Reddit and other companies say enough is enough

Other companies are just beginning to realize what's been happening, and they are not happy. Reddit, which has been used for years in AI model training, plans to start charging for access to its data.

"The Reddit corpus of data is really valuable. But we don't need to give all of that value to some of the largest companies in the world for free," said Steve Huffman, CEO of Reddit.

In April, Elon Musk accused Microsoft, the main backer of OpenAI, of illegally using Twitter's data to train AI models. "Lawsuit time," he tweeted.

"There is so much wrong w/ this premise I don't even know where to start," a Microsoft spokesman wrote in an email to Insider when asked for comment.

OpenAI's CEO Sam Altman is trying to be more thoughtful on this issue, by working on new AI models that respect copyright. "We're trying to work on new models where if an AI system is using your content, or if it's using your style, you get paid for that," he said recently, according to Axios.

Publishers, including Insider, which produced this story, have a vested interest here. Some publishers, including News Corp., are already pushing tech companies to pay to use their content for training AI models.

The current way AI models are trained 'breaks' the web

One former Microsoft executive believes something is wrong here. Steven Sinofsky recently said the current way AI models are trained "breaks" the web.

"Crawling used to be allowed in exchange for clicks. But now the crawling simply trains a model and no value is ever delivered to the creator(s) / copyright holders," he tweeted. Insider asked him for comment, but he was traveling on Friday and couldn't respond.


Japan privacy watchdog warns ChatGPT-maker OpenAI on user data
Story by Kantaro Komiya and Sam Nussey • Yesterday
SEXIST NERD BOY Illustration shows ChatGPT BY Thomson Reuters


TOKYO (Reuters) - Japan's privacy watchdog said on Friday it has warned OpenAI, the Microsoft-backed startup behind the ChatGPT chatbot, not to collect sensitive data without people's permission.

OpenAI should minimise the sensitive data it collects for machine learning, the Personal Information Protection Commission said in a statement, adding it may take further action if it has more concerns.

Regulators around the world are scrambling to draw up rules governing the use of generative artificial intelligence (AI), which can create text and images, the impact of which proponents compare to the arrival of the internet.

While Japan has been on the back foot with some recent technology trends, it is seen as having greater incentive to keep pace with advances in AI and robotics to maintain productivity as its population shrinks.

The watchdog noted the need to balance privacy concerns with the potential benefits of generative AI including in accelerating innovation and dealing with problems such as climate change.


Japan is the third-largest source of traffic to OpenAI's website, according to analytics firm Similarweb.

OpenAI CEO Sam Altman in April met Prime Minister Fumio Kishida with an eye to expansion in Japan, ahead of the Group of Seven (G7) leaders summit where Kishida led a discussion on regulating AI.

The EU, a global trendsetter on tech regulation, set up a taskforce on ChatGPT and is working on what could be the first set of rules to govern AI.

In the meantime, the rapid spread of such chatbots has meant regulators have had to rely on existing rules to bridge the gap.

Italian regulator Garante had ChatGPT taken offline before the company agreed to install age verification features and let European users block their information from being used to train the system.

Altman last week said OpenAI had no plans to leave Europe after earlier suggesting the startup might do so if EU regulations were too difficult to comply with.

(Reporting by Kantaro Komiya and Sam Nussey; Editing by Jacqueline Wong, Christopher Cushing and Sharon Singleton)

Factbox-Governments race to regulate AI tools

Story by Reuters • Yesterday

REUTERS NERD BOY SEXISM

(Reuters) - Rapid advances in artificial intelligence (AI) such as Microsoft-backed OpenAI's ChatGPT are complicating governments' efforts to agree laws governing the use of the technology.

Here are the latest steps national and international governing bodies are taking to regulate AI tools:

AUSTRALIA

* Seeking input on regulations

The government is consulting Australia's main science advisory body and is considering next steps, a spokesperson for the industry and science minister said in April.

BRITAIN

* Planning regulations

The Financial Conduct Authority, one of several state regulators that have been tasked with drawing up new guidelines covering AI, is consulting with the Alan Turing Institute and other legal and academic institutions to improve its understanding of the technology, a spokesperson told Reuters.

Britain's competition regulator said on May 4 it would start examining the impact of AI on consumers, businesses and the economy and whether new controls were needed.

Britain said in March it planned to split responsibility for governing AI between its regulators for human rights, health and safety, and competition, rather than creating a new body.

CHINA

* Planning regulations

China's cyberspace regulator in April unveiled draft measures to manage generative AI services, saying it wanted firms to submit security assessments to authorities before they launch offerings to the public.

Beijing will support leading enterprises in building AI models that can challenge ChatGPT, its economy and information technology bureau said in February.

EUROPEAN UNION

* Planning regulations

The U.S. and EU should push the AI industry to adopt a voluntary code of conduct within months to provide safeguards while new laws are developed, EU tech chief Margrethe Vestager said on May 31. Vestager said she believed a draft could be drawn up "within the next weeks", with a final proposal for industry to sign up "very, very soon".

Key EU lawmakers on May 11 agreed on tougher draft rules to rein in generative AI and proposed a ban on facial surveillance. The European Parliament will vote on the draft of the EU's AI Act in June.

EU lawmakers had reached a preliminary deal in April on the draft that could pave the way for the world's first comprehensive laws governing the technology. Copyright protection is central to the bloc's effort to keep AI in check.

The European Data Protection Board, which unites Europe's national privacy watchdogs, set up a task force on ChatGPT in April.

The European Consumer Organisation (BEUC) has joined in the concern about ChatGPT and other AI chatbots, calling on EU consumer protection agencies to investigate the technology and the potential harm to individuals.

FRANCE

* Investigating possible breaches

France's privacy watchdog CNIL said in April it was investigating several complaints about ChatGPT after the chatbot was temporarily banned in Italy over a suspected breach of privacy rules.

France's National Assembly approved in March the use of AI video surveillance during the 2024 Paris Olympics, overlooking warnings from civil rights groups.

G7

* Seeking input on regulations

Group of Seven leaders meeting in Hiroshima, Japan, acknowledged on May 20 the need for governance of AI and immersive technologies and agreed to have ministers discuss the technology as the "Hiroshima AI process" and report results by the end of 2023.

G7 nations should adopt "risk-based" regulation on AI, G7 digital ministers said after a meeting in April in Japan.

IRELAND

* Seeking input on regulations

Generative AI needs to be regulated, but governing bodies must work out how to do so properly before rushing into prohibitions that "really aren't going to stand up", Ireland's data protection chief said in April.

ITALY

* Investigating possible breaches

Italy's data protection authority Garante plans to review other artificial intelligence platforms and hire AI experts, a top official said on May 22.

ChatGPT became available again to users in Italy in April after being temporarily banned over concerns by the national data protection authority in March.

JAPAN

* Investigating possible breaches

Japan's privacy watchdog said on June 2 it has warned OpenAI not to collect sensitive data without people's permission and to minimise the sensitive data it collects, adding it may take further action if it has more concerns.

SPAIN

* Investigating possible breaches

Spain's data protection agency said in April it was launching a preliminary investigation into potential data breaches by ChatGPT. It has also asked the EU's privacy watchdog to evaluate privacy concerns surrounding ChatGPT, the agency told Reuters in April.

U.S.

* Seeking input on regulations

The U.S. Federal Trade Commission's chief said on May 3 the agency was committed to using existing laws to keep in check some of the dangers of AI, such as enhancing the power of dominant firms and "turbocharging" fraud.

Senator Michael Bennet introduced a bill in April that would create a task force to look at U.S. policies on AI, and identify how best to reduce threats to privacy, civil liberties and due process.

The Biden administration had earlier in April said it was seeking public comments on potential accountability measures for AI systems.

President Joe Biden has also told science and technology advisers that AI could help to address disease and climate change, but it was also important to address potential risks to society, national security and the economy.

(Compiled by Amir Orusov and Alessandro Parodi in Gdansk; editing by Jason Neely, Kirsten Donovan and Milla Nissi)
