Is it possible to hold AI accountable for copyright infringement?

On December 27, 2023, The New York Times published an article by Michael M. Grynbaum and Ryan Mac, “The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work”. The authors write that the developers of artificial intelligence misused the newspaper's materials to train chatbots, which now compete with the news publication. Our team is interested in the issue of copyright regulation for AI.

We analyzed articles on this issue published in Russian and foreign media in order to examine cases of copyright infringement by AI. In our research, we studied the following aspects of the problem: who is responsible for neural networks, and whether the fair use of works can harm their authors. Our team also tried to predict what lawsuits against artificial intelligence will lead to.

Fair use of materials

On January 8, 2024, The New York Times website published an article by Cade Metz, “OpenAI Says New York Times Lawsuit Against It Is ‘Without Merit’”. The article contains OpenAI's comments on the lawsuit filed against it. Representatives of the company believe that using copyrighted works to train their technologies is fair use under the law.

However, Cade Metz notes that The Times' lawsuit included examples in which ChatGPT reproduced excerpts from newspaper articles “almost word for word,” which is not fair use. In support, he cites comments by Ian Crosby, a lawyer for The Times at the Susman Godfrey law firm.

AI companies often claim that they may lawfully use content to train their technologies, since the material is publicly available and they do not reproduce it in full. On December 26, 2023, the FindLaw website published an article by Stephen Ellison entitled “Can I sue an AI company for copyright infringement?”. In it, the author states that applying the fair use doctrine to the training of neural networks can harm authors.

Stephen Ellison gives arguments in favor of protecting works used to train AI technologies. Among them are the commercial goals of artificial intelligence companies, the incomparable amount of work that goes into creating the original material, and the effect on the market for the copyrighted work. The author concludes that applying the fair use doctrine to AI is unlawful.

We asked our expert Ilya Didenko, a leading web development engineer at Effective Technologies LLC and a consultant and participant in the fact-checking automation project at N.I. Lobachevsky National Research University, to comment on the use of copyrighted data in AI training. According to him, such data can be used at an intermediate stage before the training itself.

“Basically, when a person distributes data in open sources, they explicitly agree to the service's rules on data dissemination. In this case, consent is not required. If a person has protected their rights to the content before distribution, it is not advisable to use this data for training. But it can be used at an intermediate stage to prepare the data for training, that is, to label it yourself. For example, it is forbidden to use an author's drawings, but if the neural network solves the problem of determining the style of these images, then the image must first be processed, and only after that can it be used for training,” our expert noted.

Who is responsible for the AI materials?

In the article “So sue me: Who should be responsible when AI makes mistakes?”, published on the Monash University website, Robayet Syed says that the issue of AI responsibility should be considered individually in each case.

Robayet Syed argues that liability applies only to legal entities and individuals, and AI is not one of them. He concludes that it is not yet possible to determine who is responsible for the errors of artificial intelligence, but we must be aware of the risks and take measures to reduce them.

To get an opinion on responsibility for AI mistakes, we turned to an expert: Evgenia Evgenievna Chernykh, Vice-Rector for Academic Work at Lobachevsky State University of Nizhny Novgorod, Associate Professor, Ph.D. in Law, and Dean of the Faculty of Law. According to her, there is no need to introduce a civil legal status for AI, since there is always a person behind it, and it is that person who should bear responsibility.

“As for products created by a neural network, humans again come to the fore. They must carry out the analysis and risk assessment. There is one position regarding AI: that it is a source of increased danger. But if we look at the criteria for sources of increased danger, modern elements of AI do not meet them. One of them is causing harm beyond human control. This does not apply to AI. It has no civil liability. Based on this, the question remains unresolved: who would be responsible? I believe that there is no need to introduce civil liability for AI. Every machine has a person behind it. We need to hold that person accountable,” our respected expert noted.

Lawsuits against AI

In November 2022, the first high-profile lawsuit against companies developing artificial intelligence tools was filed. On November 3, 2022, the GitHub Copilot Litigation website published an article by Matthew Butterick titled “We have filed a lawsuit challenging GitHub Copilot, an artificial intelligence product that relies on unprecedented piracy of open source software. Because AI must be fair and ethical for everyone”.

Microsoft, OpenAI and GitHub, which developed GitHub Copilot, a virtual assistant tool for programmers, were accused of using other people's open source software to train the assistant and the OpenAI Codex AI model. That software had been uploaded to GitHub by the users themselves.

According to the statement of claim, the users' software was protected by license. The trial is ongoing.

In January 2023, the American photo agency Getty Images and a group of visual artists filed lawsuits against AI companies, including Stability AI, Midjourney and DeviantArt. On January 17, 2023, an article entitled “Getty Images Statement” was published on the Getty Images website on behalf of the company; it referred to a lawsuit filed against Stability AI.

Getty Images accused Stability AI of illegally copying and processing millions of copyrighted images. The artists, in their class action, complained that their works and billions of other images from the Internet had been used without permission to train the generative neural network Stable Diffusion. The trial in this case is also ongoing.

What will the court proceedings lead to?

In the article “How Lawsuits and Regulation Will Impact AI in 2024”, published on the Built In website on December 20, 2023, Juras Jursenes examines the consequences of litigation involving artificial intelligence.

We invited Alexander Kusakin, a student at the Faculty of Law of Lobachevsky State University of Nizhny Novgorod, to join our team and advise us. He focuses on the legal regulation of AI and has proposed adopting a federal law on copyright for works created by neural networks. We also asked him to predict the outcome of litigation against AI. According to him, it is still difficult to hold artificial intelligence itself or its developers accountable if the neural network produces processed material instead of exact copies.

“We currently do not have a specific provision about what is used to train artificial intelligence. If the texts are not manually entered into an AI program, and it independently finds them, processes them, and presents them as something of its own, then it is a case of free use. In addition, these texts are freely available. The most important thing is that artificial intelligence should not be programmed to simply copy these texts. We protect the rights of copyright holders, not only authors. And we defend the socially beneficial goals that are pursued when artificial intelligence is trained. But if the goal is simply to sell other people's texts, then the owners of neural networks will be held accountable. If you use a paid subscription and distribute materials from there, the question of full compensation will be raised. If the materials were free, then it would be possible to claim that they were used for socially beneficial purposes,” said Alexander Kusakin.

Conclusion

The main argument in defense of foreign artificial intelligence companies is Section 107 of the US Copyright Act, which sets out what may be considered fair use of copyrighted works without the consent of the copyright holder. However, there is an argument that applying the fair use doctrine to copyrighted works can harm the authors of those works. Our expert Ilya Didenko believes that copyrighted data can be used at an intermediate stage before training the neural network itself.

The issue of responsibility for artificial intelligence mistakes, including copyright violations, is also ambiguous: some believe that it is not yet possible to determine the culprit, while others hold that the person behind the AI should bear responsibility.

Perhaps practice will resolve this issue. In the United States, for example, legal proceedings on claims against artificial intelligence development companies have been ongoing for two years. The author of one of the articles we reviewed, Juras Jursenes, believes that the case law in these disputes will mark the beginning of the regulation of artificial intelligence in accordance with copyright law. But it is also worth considering how the neural network obtains and processes materials, and what goals its developers pursue.

Authors: Anna Sadovina, Anastasia Titova, Alena Manina, Daria Nazarova, Alexander Kusakin
