Microsoft AI CEO Justifies Scraping Online Content For AI Training, Calls It “Freeware”

Low Boon Shen

AI might be the next big thing in tech, but it also brings out some of the most contentious issues regarding ethics, copyright, and the legality of scraping content on the internet. Recently, Microsoft AI chief Mustafa Suleyman made a statement that sparked controversy: he claimed that content on the open web is “freeware”, suggesting that it’s all free for AI companies to use and train their models with.

Microsoft AI CEO: “It Is Fair Use”

The battle between AI companies seeking more human-created data to feed their ever-hungry models and the creators who are neither compensated nor asked for permission before their content is used for training has been a hot topic of late. Creators argue that their content is copyright-protected, and that if humans are not allowed to reuse it without permission, the same rule should apply to AI models. Suleyman, however, seems to disagree.

In a clip recorded at the Aspen Ideas Festival, the host asked Suleyman about AI training, specifically who should own the intellectual property (IP) in cases such as YouTube video transcriptions being used by OpenAI (which has a partnership with Microsoft) to train its models. It’s worth noting that the content is made by the platform’s creators, not YouTube itself, which goes some way toward answering the question. Here’s his response:

With respect to content that’s already on the open web, the social contract of that content since the 90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” that’s been the understanding. There’s a separate category where a website, a publisher, or a news organization has explicitly said do not crawl or scrape me for any other reason than indexing me so that other people can find this content. That’s a grey area, and I think it’s going to work its way through the courts.

Mustafa Suleyman, Microsoft AI CEO

Suffice it to say, the internet really doesn’t like the idea, least of all the creators themselves, who are among the first in the firing line of AI’s endless appetite for training content. Most of the criticism from users on X (Twitter) has targeted his incorrect understanding of copyright law, which implies that anyone may use copyrighted content unless explicitly told not to. Some have even called for regulators to intervene, at a time when Microsoft is already in hot water over an antitrust investigation into its partnership with OpenAI, as well as the recent Recall controversy.

Image: Microsoft

This statement is certainly not going to ease tensions between AI companies and creators anytime soon, as the hype around AI begins to simmer down and questions begin to be asked about whether these giant corporations can be trusted with users’ data. In the meantime, OpenAI has been securing deals with various publishers and Reddit, while other competitors are already training on content from their own platforms, using a quick change to the terms of service (ToS) to avoid legal issues. This AI arms race shows no sign of slowing down.

Source: Android Authority

Pokdepinion: I’m pretty sure the EU does not like any of this, given its strong data privacy laws at the very least. I personally do not like the aggressive push on AI that Microsoft has been making lately.
