Top Stories This Week

Related Posts

Reddit bars Internet Archive from its website, sparking access concerns

As artificial intelligence models continue to grow and develop, their demand for more and more data also increases rapidly. Now, some companies are making it tougher for AI scrapes to happen, unless companies pay a price.

Reddit has announced that it will be severely limiting the Internet Archive’s Wayback Machine’s access to the communication platform following its accusation that AI companies have been scraping the website for Reddit data. The platform will only be allowing the Internet Archive to save the home page of its website.

‘Until they’re able to defend their site and comply with platform policies … we’re limiting some of their access to Reddit data to protect redditors.’

The limits on the Internet Archive’s access was set to start “ramping up” on Monday, according to the Verge. Reddit did not apparently name any of the AI companies involved in these website data scrapes.

RELATED: Sam Altman loves this TV show. Guess what it says about godlike technology

Photo Illustration by Avishek Das/SOPA Images/LightRocket via Getty Images

“Internet Archive provides a service to the open web, but we’ve been made aware of instances where AI companies violate platform policies, including ours, and scrape data from the Wayback Machine,” Reddit spokesman Tim Rathschmidt told Return.

Some Reddit users pointed out that this move is a far cry from Reddit co-founder Aaron Swartz’s philosophy. Swartz committed suicide in the weeks before he was set to stand trial for allegedly breaking into an MIT closet to download the paid JSTOR archive, which hosts thousands of academic journals. He was committed to making online content free for the public.

“Aaron would be rolling [in his grave] at what this company turned into,” one Reddit user commented.

Rathschmidt emphasized that the change was made in order to protect users: “Until they’re able to defend their site and comply with platform policies (e.g., respecting user privacy, re: deleting removed content), we’re limiting some of their access to Reddit data to protect redditors,” he told Return.

However, it has been speculated that this more aggressive move was financially motivated, given the fact that the platform has struck deals in the past with some AI companies but sued others for not paying its fees. Reddit announced a partnership with OpenAI in May 2024 but sued Anthropic in June of this year for not complying with its demands.

“We have a long-standing relationship with Reddit and continue to have ongoing discussions about this matter,” Mark Graham, director of the Wayback Machine, said in a statement to Return.

Stay informed with diverse insights directly in your inbox. Subscribe to our email updates now to never miss out on the latest perspectives and discussions. No membership, just enlightenment.