How ‘learnrights’ would compensate creators for AI model training
What you’ll learn: U.S. law hasn’t determined whether using copyrighted content to train AI models is fair use or infringement. Three researchers make the case for “learnright” laws that would give copyright holders the right to license their content for AI training and claim some share in the revenues coming from generative AI systems.
Human content creators are protected by copyright law, in part to ensure that they’re fairly compensated for their work.
But whether these laws allow artificial intelligence models to learn from human-created content is up for debate — both in court and on Capitol Hill. Encyclopedia Britannica’s lawsuit against OpenAI, for example, is one of the latest allegations of misuse of reference materials. Meanwhile, the U.S. Copyright Office has not made a binding determination about whether using copyrighted works to train AI models is fair use.
To deal with these issues, in 2023 MIT Sloan School of Management professor Thomas Malone proposed “learnright” laws that would give copyright holders the exclusive right to license their content to AI companies for model training.
“Copyright law wasn’t designed for a world with generative AI, and without something like learnright laws, the incentives for people to create new content are likely to be greatly reduced,” said Malone, who is also the director of the MIT Center for Collective Intelligence.
In a more recent article, Malone and co-authors Frank Pasquale of Cornell Law School and Andrew Ting of George Washington University Law School outlined the argument for learnrights and described how they could work legally, economically, and practically.
AI and copyright pose thorny legal questions
U.S. copyright law lets content creators charge others a fee to copy their work, and this potential for profit motivates creators to produce valuable content. But copyright law does not prevent humans from learning from copyrighted material and then producing their own content that differs from the original. This is generally treated as a kind of "fair use" and is permissible under the law.
The problem is that copyright law did not anticipate technologies like generative AI models that can learn from massive amounts of content at a scale no human could ever hope to match. That ability allows AI systems to generate new content that isn’t a direct copy of the original material but couldn’t be generated at all without it. Generative AI providers claim that this is fair use. But dozens of copyright owners have sued generative AI providers, alleging that the resulting output is a derivative work that infringes on their original copyright.
Past copyright cases (including ones involving Anthropic and Meta) suggest that fair use is flexible and contextual, depending on factors like whether the new content affects the potential market value for the copyrighted work.
But none of these cases has completely resolved the legal issues, and no generative AI case has reached the Supreme Court. The question of how creators will sustain their livelihoods in a world of AI-driven content remains unanswered.
Licensing content for AI training
Enter learnright. Rather than expanding fair use to include AI learning, learnright is an exclusive right to license copyrighted content for AI training. As Malone and his co-authors write, “there are substantial grounds for predicting that a robust market for learnrights, with reasonable and fair licensing fees, would emerge.”
Then, they argue, generative AI providers could explicitly obtain legal licenses to train their AI models using content created by others, and the creators of that content would be fairly compensated.
Malone, Pasquale, and Ting outline an opt-in program through which creators would register their copyrighted content and AI companies could obtain a license from the creator to use it. AI companies would negotiate compensation either with the creators themselves or with the creators’ literary or artistic agents.
The authors assert that working with agents would benefit both sides: AI companies would negotiate with far fewer entities, and agents would likely have a better sense of market value than individual creators would. Additionally, they argue, an opt-in model is preferable to automatically extending learnright coverage to all copyrighted material, which would be impractical and could confer learnright protections on content of little value.
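To make the mechanics of the proposed opt-in program concrete, here is a minimal sketch of how such a registry might work: creators register works, AI companies request training licenses, and fees accrue to the creator. All of the names, classes, and fee logic below are illustrative assumptions, not part of the authors' proposal.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Work:
    """A registered copyrighted work (hypothetical structure)."""
    work_id: str
    creator: str
    agent: Optional[str] = None  # negotiation may go through an agent

@dataclass
class LearnrightRegistry:
    """Sketch of an opt-in learnright registry (assumed design)."""
    works: dict = field(default_factory=dict)     # work_id -> Work
    licenses: list = field(default_factory=list)  # (work_id, licensee, fee)

    def register(self, work: Work) -> None:
        # Opt-in step: only registered works can be licensed for training.
        self.works[work.work_id] = work

    def grant_license(self, work_id: str, licensee: str, fee: float) -> bool:
        # Grant a training license only if the work was opted in.
        if work_id not in self.works:
            return False  # unregistered work: no learnright license available
        self.licenses.append((work_id, licensee, fee))
        return True

    def royalties(self, creator: str) -> float:
        # Total fees owed to a creator across all licensed works.
        return sum(fee for wid, _, fee in self.licenses
                   if self.works[wid].creator == creator)

registry = LearnrightRegistry()
registry.register(Work("w1", creator="alice", agent="acme-literary"))
granted = registry.grant_license("w1", licensee="example-ai-co", fee=100.0)
denied = registry.grant_license("w2", licensee="example-ai-co", fee=50.0)
```

Here `granted` is `True` and `denied` is `False`, reflecting the opt-in rule: only registered works can generate licensing revenue for their creators.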
Benefits of learnright
Malone and his co-authors presented three arguments that support compensating copyright holders whose work is used to train generative AI.
- If AI models produce high-quality content quickly and cheaply without compensating the original creators of this content, that will decrease creators’ motivation to produce new content and thus reduce the volume of original work available to further improve AI models. “It would be unwise to risk such a decline in incentives for human expression,” the researchers write.
- The researchers find it “troubling” that for-profit AI companies cry foul when others use their intellectual property — as was the case when U.S.-based AI firms accused China’s DeepSeek of stealing from them — given that the same companies use copyrighted content without compensating its creators.
- Properly acknowledging how other works influenced one’s own is the right thing to do and the foundation of a thoughtful creative process, the researchers write. Conversely, uncredited and uncompensated use of others’ work falls short of ethical standards and undermines what IP protection is supposed to mean.
Some argue that giving AI models free, broad access to content will lead to better-performing and more diverse AI systems. The authors contend that this isn’t likely to work in the long term because it would erode the incentives for people to create new content in the future.
“In short, learnright law is a promising way for both content creators and AI providers to benefit in a flourishing information marketplace,” Malone said.
Thomas Malone is the Patrick J. McGovern (1959) Professor of Management at the MIT Sloan School of Management and director of the MIT Center for Collective Intelligence. He teaches classes on organizational design, IT, and leadership; his research focuses on how new organizations can be designed to take advantage of the possibilities provided by information technology.
Frank Pasquale is a professor of law at Cornell Law School and Cornell Tech. He is an expert on the law of AI, algorithms, and machine learning.
Andrew Ting is a professorial lecturer in law at George Washington University Law School and an adjunct professor at Georgetown University, where he teaches startup law. He has published articles addressing AI, intellectual property, financial regulations, risk management, privacy, and corporate law matters.