ANI v. OpenAI: Time to Talk About ‘Machine Unlearning’

Discussing the relief sought by ANI in its interim injunction application against OpenAI, Bharathwaj explains what the legal and technical problems might be in implementing some of the measures sought, especially in the context of machine unlearning. Bharathwaj is a third-year LLB student at RGSOIPL, IIT Kharagpur, and loves books and IP. His previous posts can be accessed here. Click for Part 1 and Part 2 of this series.

Image from here

ANI v. OpenAI: Time to Talk About ‘Machine Unlearning’

By Bharathwaj Ramakrishnan

THE MODELS ARE BROKEN, THE MODELS ARE BROKEN!
– Allen Newell

Previously, in a two-part post (see here and here), I discussed the OpenAI copyright litigation, the novel copyright questions before the Delhi HC, and the issues to be kept in mind while following the litigation as it proceeds further. In this post, which can be seen as an addition to the previous posts, I wish to focus on the ex-parte ad-interim injunction sought by ANI against OpenAI. Para 20 of the 19/11/24 order describes the directions sought by ANI against OpenAI.

Analysing the Interim Injunction Sought by ANI

Simply put, ANI seeks to restrain OpenAI from directly or indirectly storing, publishing or reproducing its copyrighted works in any manner, including through the ChatGPT model.

Secondly, it seeks to prevent or disable ChatGPT from accessing ANI’s works, whether published by ANI itself or through its subscribers. The second direction might have been addressed to an extent by OpenAI’s blacklisting of ANI’s website. Yet third-party websites hosting ANI’s copyrighted content, and the GenAI model’s ability to generalise, can temper the impact of the blacklisting to an extent, as we will see later in the post.

Examining these, we see that there are both legal and technical problems involved. To quickly touch on the legal problem, OpenAI has argued that any removal of training data would be contrary to its legal obligations under US law owing to the ongoing litigation in the US. Yet it has been noted that, last year, the New York Times alleged (and OpenAI later acknowledged) that OpenAI had deleted data related to model training which could have served as evidence in that litigation. Even assuming that this legal problem does not exist, there are technical problems with how the interim injunction measures might be implemented.

As reported by the Hindustan Times, one of the amici had also suggested that OpenAI and ANI could, during the trial, answer whether unlearning copyrighted content used in model training is technically and practically feasible. Two things follow from this. First, it is unclear how the directions sought by ANI can be technically implemented. Second, since an amicus has recommended that the question be answered at trial, it is unlikely to be a simple and straightforward issue that can be resolved at the interim stage. Irrespective of the fate of the interim injunction sought, it is useful to discuss the possible limits of machine unlearning.

With all that said, let’s talk about Machine Unlearning

Machine Unlearning Defined

A detailed paper written by a score of academics from law, policy, and machine learning defines machine unlearning as follows:

 “Machine unlearning is now a subarea of machine learning that both develops methods for (1) the targeted removal of the effect of training data from the trained model and (2) the targeted suppression of content in a generative-AI model’s outputs.”

As previously stated, machine unlearning is seen as a technical intervention that can be employed to purge GenAI models of unwanted information and to help a model proprietor comply with laws regarding copyright, privacy and safety.

Unfortunately, the authors also note a gap between what machine unlearning techniques can do and what the law might require. They argue that these gaps or mismatches arise from the motivation for unlearning, the targets sought to be unlearned, and the technical methods available for unlearning.

To summarise, the core insight of the machine unlearning paper is that the available technical methods, be it back-end removal of observed information (the specific data examples the model is trained on) or output suppression, are not fool-proof ways to ensure that a model does not, in this case, engage in copyright infringement. This is due to mismatches or gaps between how a model learns from and processes training data, the technical methods available for machine unlearning, and the policy goals we might want to achieve with those methods.
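To make the two families of interventions concrete, here is a minimal, purely illustrative sketch in Python (all names are hypothetical and drawn neither from the paper nor from OpenAI’s systems): back-end removal filters targeted examples out of the corpus before (re)training, while output suppression bolts a guardrail onto the model’s responses after generation.

```python
# Purely illustrative sketch: hypothetical names, not any vendor's actual pipeline.

def remove_targeted_examples(training_corpus, is_targeted):
    """Back-end removal: drop documents flagged as targeted (e.g. a publisher's
    articles) before the model is retrained or fine-tuned on what remains."""
    return [doc for doc in training_corpus if not is_targeted(doc)]


def suppress_output(generated_text, blocked_phrases):
    """Output suppression: a naive post-hoc guardrail that withholds a response
    if it reproduces any blocked phrase verbatim."""
    lowered = generated_text.lower()
    if any(phrase.lower() in lowered for phrase in blocked_phrases):
        return "[response withheld]"
    return generated_text
```

Even in this toy form, the gap is visible: removal only changes what future training sees, not what an already-trained model has generalised, and the verbatim guardrail only catches exact matches.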

I will attempt to illustrate these gaps with two examples that have been discussed in light of the ongoing litigation. First, in the ANI case, OpenAI stated that it had blacklisted ANI’s website but conceded that ChatGPT might still produce material related to ANI’s content because such material is sourced from public data or third-party sources. Second, in the context of Harry Potter, a Reuters reporter was able to prompt ChatGPT to produce a chapter-wise summary of key events of the first book in the series.

Limitations of ‘Machine Unlearning’

Firstly, the blacklisting of ANI’s website says nothing about whether ANI’s copyrighted content is already part of the training data on which the foundation model was trained. Nor does it address the possibility that the model, relying on that initial training, might respond to certain prompts with outputs it has memorised from ANI’s content, unless there are guardrails on the output side that catch such responses and prevent them from being provided as output, which in itself might not be fool-proof.

Secondly, as pointed out in the paper, “there is meaningful slippage that occurs when employing a removal technique in service of output suppression. It is unclear which set of information should be targeted for removal to prevent the generation of certain outputs at generation-time”. The authors thus go on to note an issue of over-inclusiveness and under-inclusiveness in removing information. The fact that ChatGPT might still produce outputs related to ANI’s content because of third-party sources is an example of the slippage the authors refer to, a case of under-inclusiveness, and a demonstration of why blacklisting ANI’s website is insufficient on its own to prevent outputs related to ANI’s copyrighted content.
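This slippage can be seen even in the kind of naive output filter sketched above (again, a hypothetical illustration, not a description of any deployed system): a close paraphrase of the protected sentence slips through (under-inclusive), while a short, arguably lawful quotation of it is blocked (over-inclusive).

```python
# Hypothetical illustration of slippage in a verbatim output filter.
blocked = ["the minister announced the new policy at a press conference in delhi"]

def naive_filter(text):
    # Withhold the response only if a blocked phrase appears word-for-word.
    return "[response withheld]" if any(p in text.lower() for p in blocked) else text

# Under-inclusive: a close paraphrase of the protected sentence passes untouched.
print(naive_filter("At a Delhi press conference, the minister unveiled the new policy."))

# Over-inclusive: a short, arguably lawful quotation of the same sentence is blocked.
print(naive_filter('As reported, "the minister announced the new policy at a press conference in Delhi."'))
```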

Another relevant issue is that, while framing the issues, the Court noted that the copyrighted content in question is in the form of news. It is thus quite possible in some scenarios that ChatGPT might train on, and produce outputs based on, other news articles on the web which cover a news event similar to the one covered by ANI and which are separate works whose copyright is not owned by ANI. Yet, due to the GenAI model’s ability to generalise,[1] the model might end up producing content that is substantially similar to ANI’s copyrighted content and might constitute infringement. This could be because, as the Supreme Court in RG Anand puts it, “Where two writers write on the same subject similarities are bound to occur because the central idea of both are the same”. Drawing from the same idea or set of ideas can thus lead to similarities in expression, or, as the authors put it while touching on the model’s ability to generalise, a model is not equal to its outputs. This, too, is a case of under-inclusiveness.

The Harry Potter scenario reflects a similar problem. Even assuming that the books are somehow taken out of the training dataset and that, through technical methods, the parameters of the LLM are adjusted to reflect this, one might still get possibly infringing output, since Harry Potter, a very popular series, is discussed in detail in derivative or secondary works like fan fiction or the Harry Potter wiki. If these were to form part of the training data, or if ChatGPT were to refer to such secondary or third-party sources when searching the internet, that could lead to the production of potentially infringing outputs. Something similar happened with a GenAI model called CommonCanvas (see page 18 of this paper), a text-to-image model trained only on images with Creative Commons licences: it nonetheless produced Mickey Mouse images as output because the training images included personal photographs of people with Mickey Mouse in the background. A version of this problem can be seen in both the ANI case and the Harry Potter book summary issue. Finally, it must be noted that machine unlearning methods do not account for long-standing copyright doctrines, such as the idea-expression dichotomy discussed in the previous post. Not all use of ANI’s copyrighted content would be copyright infringement, and any unlearning method that is over-inclusive can also prevent OpenAI or any GenAI company from engaging in the lawful use of copyrighted works.

Thus, machine unlearning is not a sure-shot way of ensuring that a model does not engage in infringing behaviour. On the question of infringing outputs, the authors observe that “there is no notion of similarity that can be used to programmatically and comprehensively determine which works are substantially similar in the interest of copyright law.” No unlearning method, therefore, can capture all potentially substantially similar content.

In addition, there is the problem of how GenAI models are designed internally; purging information from a GenAI model is not like removing data from a database, since the way a model processes information and adjusts its parameters is far more complicated. As the memorization paper puts it, “A large generative-AI model can be like Borges’s Library of Babel: it contains literally incomprehensible immensities, to the point that it is extraordinarily difficult to index or navigate.”

Thus, making a model behave is not an easy task. It’s anybody’s guess how the litigation moves forward from here.

I wish to thank Swaraj and Praharsh for their inputs on this post.


[1] In the paper, the authors identify the targets of unlearning: observed information, latent information and higher-order concepts. Ideally, any successful unlearning method must target all of these. The last two, latent information and higher-order concepts, arise because the model generalises from the training dataset (the observed information) on which it is trained. The authors note four mismatches; for our purposes, the third one, that a “model is not equal to its outputs”, is the relevant one. The idea is that even if we remove all the observed information, i.e. the training data, this would be insufficient because the model can generalise. This gains significance in the context of news reporting, as I explain in the post: it is quite possible that another news channel might report on the same facts, and the model might generalise and produce an output that is substantially similar to ANI’s content.
