Reddit sues Perplexity for scraping of posts, expanding user data battle with AI industry

Social media giant Reddit has launched a lawsuit against artificial intelligence company Perplexity, alleging that it illegally scraped user posts to train its AI model, marking the latest data-rights clash between content owners and the AI industry. 

The complaint, filed in New York federal court on Wednesday, also names three other defendants that Reddit says helped Perplexity collect its data: Lithuanian data scraper Oxylabs, “former Russian botnet” AWMProxy, and Texas startup SerpApi.

Reddit alleged that the three smaller entities were able to extract its copyrighted content “by masking their identities, hiding their locations and disguising their web scrapers as regular people.”

Perplexity, which runs an AI-powered search engine, denied the allegations and accused Reddit of “extortion” and opposition to an open internet, while SerpApi told CNBC it “strongly disagrees” with Reddit’s claims and intends to defend itself in court. CNBC was unable to reach Oxylabs and AWMProxy.

The case is one of many filed by content owners accusing AI firms of using copyrighted material without permission to train their large language models. Reddit, in particular, has been on the front lines of that battle, having filed a similar lawsuit, still ongoing, against AI startup Anthropic in June.

In a statement shared with CNBC, Ben Lee, Chief Legal Officer at Reddit, said that AI companies are “locked in an arms race for quality human content” and that the pressure has fueled an “industrial-scale ‘data laundering’ economy.”

Scrapers bypass technological protections to steal data, then sell it to clients hungry for training material. Reddit is a prime target because it’s one of the largest and most dynamic collections of human conversation ever created.

Reddit — which hosts over 100,000 interest-based “subreddit” communities — said in its lawsuit that its user posts had become the most commonly cited source for AI-generated answers on Perplexity. 

Reddit added that after it sent Perplexity a cease-and-desist letter, Perplexity “increased the volume of citations to Reddit forty-fold.”

AI researchers have previously noted that Reddit’s large volume of moderated conversations can help make AI chatbots produce more natural-sounding responses.

In the age of artificial intelligence, Reddit has worked to leverage its massive data pool, permitting access to it only through AI-related licensing agreements. The social media company has signed such agreements with OpenAI and Alphabet’s Google.

Responding to the lawsuit in a post on the Reddit platform, Perplexity argued that it does not train AI models on content but merely summarizes and cites public Reddit discussions. It said that it is therefore “impossible” to sign a license agreement.

“A year ago, after explaining this, Reddit insisted we pay anyway, despite lawfully accessing Reddit data. Bowing to strong arm tactics just isn’t how we do business,” the statement read, going on to describe the suit as a “show of force in Reddit’s training data negotiations with Google and OpenAI.” 

“Perplexity believes this is a sad example of what happens when public data becomes a big part of a public company’s business model,” Perplexity added, noting that data licensing has become an increasingly important source of revenue for Reddit. 

In February, Reddit’s COO Jen Wong told the trade publication Adweek that AI licensing deals with Google and OpenAI made up nearly 10% of Reddit’s revenue. 


