YouTube videos are also being used to train AI, a study says

Ever since the boom in artificial intelligence began, especially the kind that generates images and videos from text instructions, doubts have kept arising about how it all works. One of the most widespread is the indiscriminate use of content published on the Internet to train algorithms.

On the left, an image from the Getty Images catalogue; on the right, an image generated with artificial intelligence using Stable Diffusion.

In fact, Stable Diffusion directly used free images from the Getty Images website that carried the agency’s watermark, which led the AI itself to try to emulate this watermark in some of the images it generated, with quite grotesque results.

Adobe has also had its own controversy, in this case for not making it sufficiently clear in the terms of use of its cloud storage service whether the company would use its users’ content to train Firefly, its own generative AI.

Now the news portal Proof News has added fuel to the fire by claiming that companies such as Apple, Nvidia, Salesforce and Anthropic are using thousands of videos from YouTube (and other platforms) to feed their own algorithms.

Obviously, all of this is happening behind the users’ backs, and despite the fact that YouTube supposedly prohibits using materials from the platform without permission. According to Proof News, 173,536 subtitle files extracted from more than 48,000 channels have been used.

These caption files contain full transcripts from educational and outreach channels such as Khan Academy, MIT, and Harvard University’s own YouTube channel.

The material used also includes stars of the platform such as Marques Brownlee, with more than 19 million subscribers, and PewDiePie, with no fewer than 111 million followers. Some of the videos used also include conspiracy theories and even content about flat-earthism.

For the moment, the data theft uncovered by Proof News does not include the images from the videos, only the textual content, although it is only a matter of time before that happens, given the advance of AI that generates video from text. We are undoubtedly entering uncharted territory.
