ChatGPT and the Underlying Copyright Malady
As per OpenAI’s charter, its “mission is to ensure that artificial general intelligence (AGI)… benefits all of humanity.” To elucidate, AI systems are generally divided into three categories: weak, strong and superintelligence. Weak or narrow AI, such as Siri or Alexa, is focused on doing a particular task or solving a particular problem. Strong AI or artificial general intelligence is AI which possesses human-like intelligence in terms of invention and creativity, such as learning and developing skills while doing different tasks. Superintelligence, simply put, is a form of intelligence that would outperform the smartest humans in every field.
OpenAI’s Terms of Use (ToU) state that “[a]s between the parties and to the extent permitted by applicable law, you own all Input, and subject to your compliance with these Terms, OpenAI hereby assigns to you all its right, title and interest in and to Output.” Input refers to the question fed in by the user, and the answer generated by the chatbot is the Output. On the basis of the current legal landscape, the part of the ToU quoted above would not be enforceable in the US or India. The reasons for its non-enforceability in India are discussed in this post.
For ease, the copyright implications of ChatGPT can be divided into two groupings: for Input and for Output. Only the possible implications of the Output are discussed here, as I have proceeded on the assumption that the Input is provided by a human being, who would therefore be granted (or not) authorship and ownership depending on whether the criteria for copyrightability are met. The interesting questions arise when it comes to the Output, such as whether it meets the requirements for copyright protection (ignoring for a moment the AI authorship and ownership questions, which have previously been discussed in depth on the blog here, here and here).
Technical constraints with assignment of Output
The assignment of Output to the user seems absurd and would fail at multiple levels, both technical and practical. Let’s evaluate the technical issues from an Indian standpoint. Firstly, the assumption that the work created by ChatGPT is automatically owned by OpenAI is not legally tenable. Can AI itself be the author of the Output generated by it, or would it be the human being providing the Input, or the AI’s developer? If we go by the ToU, in order for OpenAI to be considered the owner of ChatGPT’s Output, we would have to examine Sec. 17 of the Copyright Act, 1957 (the Act). But under Sec. 17, India has a human author requirement, as can be evidenced from the following: i) Sec. 16 of the Act states that “no person” is entitled to copyright, except under the statute; ii) the Application for Registration of Copyright (Form-XIV) requires disclosure of name, nationality and address; and iii) in the case of computer-generated works, Sec. 2(d)(vi) of the Act grants authorship to the person “who causes the work to be created”. A human being who has merely provided a one-line Input cannot be considered to have caused the Output to be created in terms of Sec. 2(d)(vi). Thus, it seems that no one can be the author (or a co-author or owner) of an AI-generated work under Indian law. Interestingly, as per the online records of the Copyright Office, ‘Raghav Artificial Intelligence Painting App’ (AI) continues to be registered as a co-author (along with a human being), despite the issuance of a withdrawal notice more than a year ago (covered here).
Next, assuming the above hurdle is crossed, how will the user prove that it has legitimately been assigned copyright in the Output? The ToU, by not providing certain details required by Section 19 of the Act, such as term, territory and the amount of royalty and any other consideration payable to the author, fail to fulfil the requirements for a valid assignment. Where the term and territory are not specified, Sections 19(5) and 19(6) of the Act deem them to be five years and India respectively. The territory would ideally have to be the world for an artwork or an article published online, and it would also be practically difficult to take the work down after 5 years. It would probably be possible to take down one article, but how will this work in the case of thousands or lakhs of such works? A work can be exploited even in the absence of specification of term and territory (albeit to a limited extent), but an assignment under Indian copyright law cannot be made without stipulating the amount of royalty and consideration payable to the author.
Practical issue with the assignment of Output
Apart from the above technical hurdles, the Output “may not be unique across users” for similar questions, as has been acknowledged in the ToU. Imagine a scenario where one user claims copyright over certain Output and another user then claims copyright over the exact same Output, which they arrived at independently. Now imagine these kinds of claims on a large scale. Unlike trademark law, copyright law has no concept of honest and concurrent use, and it cannot, as no two people can come up with the exact same play or book! Unlike typical copyright infringement suits, where there is usually a party which is clearly in the wrong, the result would be difficult to predict in these circumstances. The scènes à faire and merger doctrines would probably be applicable to the Output generated by ChatGPT, specifically in cases where the Input is a basic question and not something complicated. This would mean that such Output cannot be the subject of copyright protection!
What if the AI is infringing someone’s copyright?
The ToU state that the user is responsible for the Input and Output, “including for ensuring that it does not violate any applicable law” or the ToU. What happens in a situation where the Output is not the subject matter of copyright protection, but has in fact inadvertently infringed someone’s copyright? Upon being asked, ChatGPT itself states that it has been trained on a vast amount of text data. But ChatGPT does not attribute the Output to its original source, and it is unclear if any permission has been obtained in respect of such data. This places the user who claims copyright in the Output in a difficult position, as the use might not even qualify as fair dealing. In infringement suits before the Calcutta HC and the Delhi HC, the respective Defendant websites/OTT platform allowed streaming of sound recordings protected under copyright law and claimed the exception under Section 52(1)(a)(i) of the Act (i.e. private or personal use). The Calcutta HC found the streaming of such sound recordings for a “fee or revenue may be from its sponsors or from third parties” to be commercial exploitation, and the Delhi HC arrived at a similar conclusion.
OpenAI has introduced a paid version called ChatGPT Plus. It is like ChatGPT, but the users get priority access during busy times and faster response times. On its website, OpenAI states that the user’s Input and Output may be used to develop and improve OpenAI’s products and services. OpenAI also seems to have received/been promised billions of dollars in investment. Thus, on the basis of the commercial aspect associated with ChatGPT, it would appear that if OpenAI inadvertently infringes on someone’s copyright, the court could come to the conclusion that OpenAI’s actions and the user’s actions amount to copyright infringement. There is a higher probability of such Output being considered as infringing by the courts, when a user uses ChatGPT’s Output for commercial purposes.
When used in a manner that is commercial or public, the Output might not even qualify as fair use, unless it is sufficiently transformative. Certain Outputs, such as the results when you search for important or significant excerpts from a certain book chapter, may not be considered as fair use. Even if we consider the educational use exception under Section 52(1)(i) of the Act (which was given a wide interpretation in the DU Photocopy case), OpenAI might be held liable, as there is a difference between making and distributing course packs to some students who do not constitute the potential market for the works concerned versus thousands or potentially millions of people who can access important excerpts from such literary works.
In the US, a class action suit has been filed against Microsoft and OpenAI for the creation of GitHub Copilot, claiming that it has been trained on open-source software scraped off the web and violated the attribution requirements set forth by many creators. Getty Images has also filed multiple suits against Stability AI, creator of an AI art generator called Stable Diffusion, in various jurisdictions, for copyright infringement of images from the stock photography company. Pertinently, Getty Images has granted a license to an AI art generator for use of images from its platform. Another lawsuit has also been filed by three artists against AI art generators for scraping their work for training AI tools.
The use of copyrighted material for the purposes of training AI should be in a way that is fair to human creators. The concerns raised in this post and in the abovementioned suits do not have a straightforward solution, but regulation of such use of copyrighted material could be considered after a thorough examination of the issues and their implications.
Right to paternity
OpenAI’s Sharing & Publication Policy permits posting of the user’s prompts and completions (Output) to social media and live streaming, but this is subject to certain restrictions, one of which is to “[i]ndicate that the content is AI-generated in a way no user could reasonably miss or misunderstand.” This seems to be a clickwrap agreement, as in the process of signing up, OpenAI states that by clicking ‘Continue’, the user agrees to its Terms (which redirects to the ToU page and includes “Service Terms, Sharing & Publication Policy, Usage Policies, and other documentation, guidelines, or policies we may provide in writing”). Where a user chooses to sign up through Gmail (and possibly through a Microsoft account as well), the Terms are not displayed to or accepted by the user in such an explicit manner. Although the assignment under the ToU is for all rights, it appears that the right to paternity is somehow retained by the AI itself. From one angle, this is probably a good thing, as it prevents human authors from trying to pass off the Output as their own creation. At the same time, it’s a bit ironic that OpenAI does not give credit or even make known the range of its training data, but wants the AI origin of the Output to be made known. However, retaining the right to paternity does not seem plausible in light of the fact that the AI is not eligible to be an author in India or the US. The real question is whether the ToU would be held valid if challenged in court. How can the ToU be enforced in the absence of AI being granted certain rights under the law? It appears that OpenAI has made two interesting assumptions – first, that AI can be considered to be an author and can somehow retain the right to paternity, and second, that it is the owner of the Output generated by ChatGPT. It would be interesting to watch how this plays out in courts. Another restriction placed by OpenAI is that the user is not allowed to assign the rights granted to it under the ToU.
The law needs to keep up with technology in order to deal with the concerns that technologies such as ChatGPT might raise. The emergence and popularity of ChatGPT raise pertinent IP concerns that need to be addressed at the earliest.