Anthropic wins key US ruling on AI training in authors' copyright lawsuit
A federal court agreed that using copyrighted works to train AI is fair use, but that pirating them to do so infringes creators' rights.
Link: Blake Brittain at Reuters.
The headline here focuses on the fair use portion of Anthropic's argument:
"Siding with tech companies on a pivotal question for the AI industry, U.S. District Judge William Alsup said Anthropic made 'fair use' of books by writers Andrea Bartz, Charles Graeber and Kirk Wallace Johnson to train its Claude large language model."
But it's also important to note that the way Anthropic obtained those books was not fair use:
"Alsup also said, however, that Anthropic's copying and storage of more than 7 million pirated books in a 'central library' infringed the authors' copyrights and was not fair use. The judge has ordered a trial in December to determine how much Anthropic owes for the infringement."
In other words: in this court's view, training a model on copyrighted works is transformative, because the works are used to create something new rather than to infringe the rights of creators. But how you obtain those works does matter, which is something AI vendors have been hoping to sidestep. If you go out and pirate a whole bunch of books and other media, and then keep that pirated media around, that's a violation, and copyright owners are owed compensation. Under U.S. law, statutory damages for willful infringement can reach $150,000 per work.
So while this might seem like a win for the AI industry, for now it maintains an important boundary that protects rights-holders. A lot will depend on how much the court decides is actually owed; that trial begins in December.