Source: https://www.tomshardware.com/tech-i...ing-court-records-reveal-copyright-violations
Laws are for the poor....Facebook parent-company Meta is currently fighting a class action lawsuit alleging copyright infringement and unfair competition, among others, with regards to how it trained LLaMA. According to an X (formerly Twitter) post by vx-underground, court records reveal that the social media company used pirated torrents to download 81.7TB of data from shadow libraries including Anna’s Archive, Z-Library, and LibGen. It then used this information to train its AI models.
The evidence, in the form of written communication, shows the researchers’ concerns about Meta’s use of pirated materials. One senior AI researcher said way back in October 2022, “I don’t think we should use pirated material. I really need to draw a line here.” While another one said, “Using pirated material should be beyond our ethical threshold,” then they added, “SciHub, ResearchGate, LibGen are basically like PirateBay or something like that, they are distributing content that is protected by copyright and they’re infringing it.”
Then, in January 2023, Mark Zuckerberg himself attended a meeting where he said, “We need to move this stuff forward... we need to find a way to unblock all this.” Some three months later, a Meta employee sent a message to another one saying they were concerned about Meta IP addresses being used “to load through pirate content.” They also added, “torrenting from a corporate laptop doesn’t feel right,” followed by laughing out loud emoji.
Aside from those messages, documents also revealed that the company took steps so that its infrastructure wasn’t used in these downloading and seeding operations so that the activity wouldn’t be traced back to Meta. The court documents say that this constitutes evidence of Meta’s unlawful activity, which seems like it’s taking deliberate steps to circumvent copyright laws.