UK kicks off review into training AI models on copyrighted content



On Dec. 9, OpenAI made its artificial intelligence video generation model Sora publicly available in the U.S. and other countries.

Cfoto | Future Publishing | Getty Images

The U.K. is drawing up measures to regulate the use of copyrighted content by tech companies to train their artificial intelligence models.

The British government on Tuesday kicked off a consultation that aims to give both the creative industries and AI developers more clarity on how AI firms obtain intellectual property and how they then use it for training purposes.

Some artists and publishers are unhappy with the way their content is being scraped freely by companies like OpenAI and Google to train their large language models — AI models trained on huge quantities of data to generate humanlike responses.

Large language models are the foundational technology behind today’s generative AI systems, including the likes of OpenAI’s ChatGPT, Google’s Gemini and Anthropic’s Claude.

Last year, The New York Times brought a lawsuit against Microsoft and OpenAI accusing the companies of infringing its copyright and abusing intellectual property to train large language models.

In response, OpenAI disputed the NYT’s allegations, stating that the use of open web data for training AI models should be considered “fair use” and that it provides an “opt-out” for rights holders “because it’s the right thing to do.”

Separately, image distribution platform Getty Images sued another generative AI firm, Stability AI, in the U.K., accusing it of scraping millions of images from its websites without consent to train its Stable Diffusion AI model. Stability AI has disputed the suit, noting that the training and development of its model took place outside the U.K.

Proposals to be considered

First, the consultation will consider making an exception to copyright law for AI training carried out for commercial purposes, while still allowing rights holders to reserve their rights so they can control the use of their content.

Second, the consultation will put forward proposed measures to help creators license and be remunerated for the use of their content by AI model makers, as well as give AI developers clarity over what material can be used for training their models.

The government said more work needs to be done by both the creative industries and technology firms to ensure any standards and requirements for rights reservation and transparency are effective, accessible and widely adopted.

The government is also considering proposals that would require AI model makers to be more transparent about their model training datasets and how they’re obtained so that rights holders can understand when and how their content has been used to train AI.

That could prove controversial — technology firms aren’t especially forthcoming when it comes to the data that fuels their coveted algorithms or how they train them up, given the commercial sensitivities involved in revealing those secrets to potential competitors.

Previously, under former Prime Minister Rishi Sunak, the government attempted to agree a voluntary AI copyright code of practice.

AI copyright rules: U.K. versus U.S.

In a recent interview with CNBC, the boss of app development software firm Appian said he thinks the U.K. is well placed to be the “global leader on this issue.”

“The U.K. has put a stake in the ground declaring its prioritization of personal intellectual property rights,” Matt Calkins, Appian’s CEO, told CNBC. He cited 2018’s Data Protection Act as an example of how the U.K. is “closely associated with intellectual property rights.”

The U.K. is also not “subject to the same overwhelming lobbying blitz from domestic AI leaders that the U.S. is,” Calkins added — meaning it might not be as prone to bowing down to pressure from tech giants as politicians stateside.

“In the U.S., anybody who writes a law about AI is going to hear from Amazon, Oracle, Microsoft or Google before that bill even reaches the floor,” Calkins said.

“That’s a powerful force stopping anyone from writing sensible legislation or protecting the rights of individuals whose intellectual property is being taken wholesale by these major AI players.”

The issue of potential copyright infringement by AI firms is becoming more prominent as tech firms move toward a more “multimodal” form of AI, meaning systems that can understand and generate content in the form of images and video as well as text.

Last week, OpenAI made its AI video generation model Sora publicly available in the U.S. and “most countries internationally.” The tool allows a user to type out a desired scene and produce a high-definition video clip.


