Summary:
- Unauthorized Use: An investigation by Proof News revealed that AI companies like Apple, Nvidia, and Anthropic used subtitles from 173,536 YouTube videos, violating YouTube’s rules against unauthorized data harvesting.
- Wide-Ranging Sources: The dataset, named YouTube Subtitles, included educational content from channels like Khan Academy, MIT, and Harvard, as well as videos from The Wall Street Journal, NPR, and BBC. Popular YouTube channels like MrBeast, Marques Brownlee, and PewDiePie were also affected.
- Content Variety: The materials used for training AI included videos promoting conspiracy theories, raising ethical concerns about the type of content AI models are being trained on.