Copyright Maximalism by Design? Rethinking DPIIT’s Licensing Centric Approach to AI Training

Critiquing Part – I of the DPIIT committee working paper on the “intersection between Artificial Intelligence and Copyright” for copyright maximalism, Vishno Sudheendra discusses, in this entry for the SpicyIP–jhana Blogpost Writing Competition, the viability of a fair sharing arrangement as a solution to the structural and diffuse impact of GenAI on the creative industries. He strongly argues that the solutions for structural problems posed by GenAI do not lie in copyright law but outside of it. Vishno is also a fourth-year B.A., LL.B (Hons) student at the National Law School of India University, Bangalore, with a keen interest in various aspects of IPR and technology law. Interested readers can also take a look at the previous discussion by Swaraj and Shivam on this working paper.

By Vishno Sudheendra

Introduction

The Committee formed by the DPIIT to examine the intersection between Artificial Intelligence and Copyright has released its Working Paper on issues related to use of copyrighted works for AI training. 

The paper proposes a “hybrid approach” which ensures availability of “lawfully” accessed copyrighted content for AI training as a matter of right. It claims to reduce transaction costs for developers, ensure compensation to copyright holders, and introduce judicial oversight over royalty rates and payments.

In this post, I do not seek to examine their hybrid licensing model, but will take a step back and analyse the paper’s examination of the intersection between AI training and copyright, as the question of whether or not to license (which acknowledges copyright infringement) involves a determination of infringement first.

Questionable Premises Underlying the Hybrid Model

Licensing Without Infringement

Although the paper disclaims any conclusive determination on whether AI training infringes copyright, it goes on to propose a hybrid licensing regime, which implies that AI training does infringe copyright. In fact, the very title “One Nation One License One Payment” rests on the premise that AI training infringes copyright, thus requiring licensing from copyright holders.

The paper notes that AI training happens “using copyrighted materials without authorization from copyright holders”. But is authorization even required, given that AI training involves non-expressive use of copyrighted material? [See for eg: Akshat Agrawal, Shivam Kaushik (Part 1 and Part 2)]. The paper has taken it upon itself “to protect the copyright in the underlying human-created works, without stifling technological advancement”, indicating that it already approaches the issue through a copyright-protectionist lens.

Some important points to note before proceeding: Copyright should not be used to clamp down on legitimate competition from AI-generated works. Moreover, why should the focus be on preserving/promoting only human-generated works in an era where AI increasingly complements creative production? [see Linares-Pellicer et al.]

Rejection of TDM & Opt-Out Models

The paper rejects adopting a Text and Data Mining (TDM) exception model as it believes that such an exception would “undermine copyright and it would leave human creators powerless to seek compensation for use of their works in AI Training”. It is questionable whether a TDM exception would “undermine copyright”, as non-expressive use (which is what happens in AI training) is in any case beyond the scope of copyright law, but let’s park that for now.

The opt-out model is rejected for two major reasons: (i) small creators remain unprotected due to lack of awareness of opt-out and (ii) downstream reuse cannot be prevented once transformed, “hence the control lost over the data is irrecoverable”. The second reason is concerning, not because of what it states but because of its underlying presumptions about downstream reuse and absolute control over data. Section 14 of the Copyright Act sets out all the exclusive rights which a copyright holder has – it does not include an absolute right to exclude anyone from accessing the content. Moreover, copyright law seeks to enable creative production [Akshat Agrawal] – it neither envisions absolute control over works (otherwise term limits, exceptions and compulsory licensing would not exist) nor seeks to curb downstream reuse, as long as that reuse is not direct expressive use.

When Market Disruption Is Mistaken for Copyright Harm

The paper constantly warns against the risk of “decline in human-created works” due to unlicensed AI training. This argument resembles the market dilution theory in Kadrey v. Meta – AI generates works in similar genres/subject-matter, leading to indirect substitution of works and ultimately undermining incentives to create. The paper magnifies this problem to note that such AI training will lead to underproduction of human-generated works.

However, is this a problem which copyright law is meant to solve? Can such issues be fit into copyright law? The answer is no. Copyright is work-specific [Section 14, “copyright means the exclusive right … in respect of a work”]; it does not protect authors/humans as a class, nor does it protect markets for genres/styles/future works, whether AI- or human-generated. Market harm must therefore be tied to substitution of the particular copyrighted work, not indirect competition from new works enabled by technology. Treating market dilution caused by AI-generated content as copyright-cognisable harm penalises legitimate competition via efficient technology. Copyright cannot be used to freeze technological progress to prevent a decline in human-created works, as such harms are beyond the scope of copyright law, akin to damnum sine injuria. [read more on how market dilution fails copyright law here and here]

Moreover, creative destruction – the process by which old economic structures are destroyed and new ones are created via innovation – is the “essential fact about capitalism” (Joseph Schumpeter p.83). The proliferation of AI-generated content diluting markets and allegedly reducing human creative production is part of this creative destruction, which is beneficial in the long run. Attempts to retain existing or outdated market structures (via the hybrid licensing model) will inevitably backfire: inefficient producers continue to function at a high cost to innovators/consumers/taxpayers, the incentive to roll out new products/production methods falls, and the result is stagnation/layoffs/bankruptcies [Michael Cox and Richard Alm]. As Cox and Alm note, attempts to reap the gains of creative destruction without its disruptions often leave societies worse off overall.

Mischaracterisation of Non-Expressive Use

The paper notes arguments stemming from non-expressive use but concludes that it is unclear whether such use rules out infringement, as the statistical relationships are a “product of creative expression” (relying on Kadrey v. Meta). However, this conclusion ignores a crucial distinction between the nature of the expression (creative or not) and the nature of its use (expressive/non-expressive). Any non-expressive use of any expression (creative or not) is beyond the scope of copyright law.

Moreover, the observations in Kadrey on this issue are themselves very questionable. Meta relied on Sega v. Accolade and Sony v. Connectix to argue that LLM-generated competing works constitute legitimate competition, but the Court rejected this by treating LLM outputs as illegitimately benefiting from copyrighted expression. This misreads both precedents, which do not prohibit competition that benefits from intermediate copying, but only copying that exploits protected expression itself [read more here]. Similarly, AI training uses copyrighted texts in a non-expressive manner to extract uncopyrightable linguistic patterns rather than to reproduce expression, making downstream competition legitimate under fair use. By extending protection to such functional elements, the Court in Kadrey wrongly characterised lawful competition as infringement-based market substitution. Thus, reliance on Kadrey to dismiss the non-expressive use argument is questionable.

If Not Copyright, Then What?

Where no copyright infringement is established, market displacement caused by technological competition is not, by itself, a reason for intervention; such disruption is a characteristic feature of creative destruction rather than a cognisable legal harm. However, notwithstanding the absence of copyright-cognisable harm, if policymakers still want to make AI developers pay for transitional market adjustments (like market dilution) [elaborated in the next sub-section], copyright is not the way. Rather, the alternative may be a fair share arrangement [similar to what telecom companies demand from OTTs, read more here, here and here].

From Legal Wrong to Economic Externality: Possible Policy Justifications

These considerations, collectively, help explain why policymakers may perceive a need for intervention even where no legal wrong is established.

  • Transitional market adjustments: AI training generates diffuse, non-excludable benefits (productivity, scale, innovation), while imposing concentrated, short-term losses on certain creative labour markets (market dilution).
  • Asymmetric benefit extraction: Similar to how OTTs are said to ride on telecom infrastructure, AI models ride on cultural and informational infrastructure built over decades by human creators. But crucially, this does not convert culture into a toll road. It merely recognises asymmetric benefit extraction.
  • Limited but-for causality: Large-scale generative AI would not exist in its current form without access to vast corpora of human-created works.

How is it structured?

Such a contribution would be structured not as licensing or permission, but as a time-bound fair share contribution aimed solely at cushioning transitional market adjustment. Functionally, it would resemble a sector-specific levy rather than a licensing fee. While any mandatory contribution risks partial cost pass-through (to ultimate consumers), licensing regimes amplify costs, uncertainty, transaction costs, litigation risk etc., making them far more distortionary than a predictable fair share contribution. Further, such levies do not condition market entry on rights clearance, nor do they privilege large rights holders [read why the Working Paper privileges large rights holders here: Rahul Matthan].

How is it distributed?

Any such fair-share or contribution-based model cannot replicate copyright licensing in disguise. Distribution cannot be work/author-specific, as that would presuppose attribution and causation, assumptions which are incompatible with non-expressive AI training and black-box model development.

Instead, any contribution may be channelled into a sectoral creative transition fund aimed at cushioning transitional market adjustment rather than compensating for use. Distribution should therefore be collective, flowing towards public-interest commissioning, grants for creative labour, or institutional support for small publishers and cultural organisations. Contributions may be calculated based on scale and market presence of AI developers.

How is it different from licensing?

Licensing presupposes infringement and leads to copyright maximalism. At best, the causality and dilution arguments point to a transitional market externality rather than a legal wrong. A fair share regime, by contrast, would acknowledge economic reliance interests without converting them into perpetual rights over learning or innovation.

Why have a cut-off period?

Pre-AI creators could plausibly claim reliance on old market structures. New creators (or newer works of existing creators), however, enter the market with full awareness of AI’s presence and can price, adapt, and innovate accordingly. Without a sunset clause, a fair share mechanism risks degenerating into a permanent cost on learning and a regulatory barrier to innovation while trying to forcefully sustain old pre-AI market structures.

What are the drawbacks of this model?

That said, even a carefully constrained contribution-based model is not without serious drawbacks. Any mechanism that requires AI developers to make payments on account of training risks spillover effects that may indirectly chill speech and innovation, particularly if contribution thresholds or compliance incentives begin to shape training choices, dataset curation, or model design. Moreover, once payment is normalised in the absence of infringement, there is a non-trivial risk of path dependency: transitional measures harden into permanent obligations and political pressure may convert contributions into entitlements. For these reasons, the first-best policy response remains to do nothing and allow markets, creators, and technological practices to adjust without regulatory intervention.  

Conclusion

The DPIIT Working Paper ultimately seeks to “balance” AI and copyright without first establishing whether AI training infringes copyright at all, and in doing so, it builds a licensing-centric policy response on an unproven legal premise. By treating non-expressive AI training, market dilution, and fears of declining human authorship as copyright-cognisable harms, the paper stretches copyright beyond its doctrinal limits and risks converting it into a tool to police legitimate competition and stifle creative destruction while trying to retain outdated market structures. Copyright law protects only against expressive use of particular works; it does not protect creators as a class, genres or markets, or pre-AI economic structures. If policymakers nevertheless wish to address short-term distributive shocks arising from AI-driven competition, the response must lie outside copyright, such as the limited, time-bound fair share arrangement discussed above. Even then, the first-best outcome remains to do nothing and allow markets and creative practices to adapt without expanding copyright into domains it was never meant to govern.

I would like to thank Akshat Agrawal and Swaraj Barooah for introducing me to this topic and providing me with valuable inputs and comments in earlier discussions.
