
As readers of the blog are aware, the DPIIT formed a Committee to examine the intersection of Generative AI and Copyright, and this Committee recently released its report. There have been many pieces by now that have discussed the 125 pages of the paper, so I will not spend much e-ink revisiting all that’s said in there. For anyone who has yet to catch up on the report, some links are inserted in the discussion below. My attempt to essentialise the part worth focusing on results in the following two lines:
At its core, the committee wants to create a mandatory enabling copyright exception for “AI”, covering all types of “lawfully accessed” content available online, to ensure that these private AI companies grow. This is claimed to be a balanced way forward because the AI companies will share a royalty (a percentage of their revenues) with the (perhaps billion-plus) content owners, who will somehow be traced and paid.
To reframe it in a way that highlights why this is worth focusing on:
The erstwhile exclusion rights of copyright owners (many of which had to be fought for, even to be recognised as existing, less than 15 years ago) are now being ‘bulldozed’ to lay the foundation for AI mining projects, which will mine and extract data from these no-longer-copyright-protected materials, for the good of “everyone” (represented by the growth of AI companies). Structurally, this transfers value from those at the bottom towards those at the top. In trying to address a superficial, claimed tension between copyright and AI, it ignores, and inadvertently asks to hasten, a much bigger problem – the centralization of power over knowledge governance and production.
Let’s Break That Down (long post ahead)
Initially, I was under the impression that the document puts me in a ‘you got what you asked for’ position. For years, I have been fundamentally unconvinced by the copyright maximalist agenda that has been ongoing for the last several decades, as it kept enlarging the scope of excluded materials in pursuit of benefits for which there was no evidence, while diminishing and dismissing the significance of the public domain. And suddenly, this document takes an apparent about-turn. It purports to be in favour of productively demolishing the bundle of rights that copyright has hitherto represented. So why should this still be a problem?
As I will explain over the course of this post – the apparent reversal of policy positions does not pivot on careful trade-offs aimed at enhancing the creative corpus created by, and available to, humans. Rather, this is much more akin to a ‘reverse uno’ that favours commodification of creativity for the benefit of the few and at the cost of the many. For those of a nationalistic bent of mind, it should be noted that ‘the few’ here, structurally, are unlikely to be an Indian few either.
As relevant context on copyright policy positions taken in India so far: a mandatory enabling exception for the benefit of a stakeholder is something the country has steadfastly refused to create for humans, whether for students seeking access to educational materials, or for the print-impaired seeking access to materials that the law already entitles them to. With that context, let’s come back to the proposal. Regardless of its intent to balance different considerations, it ends up as an elitist recommendation (benefits flow to the top) that has no basis in jurisprudence, that is logistically and legally impractical, and that has severely problematic political economy connotations. Let’s take that step by step.
1. Absence of Academic Rigour
a) Jurisprudence and Theory:
For many, this is the most boring part. After all, who cares about theory? Courts also regularly dismiss such concerns as merely ‘academic’. India as a nation does not seem to care about theory, or data for that matter. If we’re lucky, policy committees may consult a few stakeholders, and the general public may get 30 days to send comments. To that extent, one could say we were lucky this time. But more on that later.
Coming back to copyright theory and jurisprudence: to stick to the very, very basics – analysing theoretical justifications helps us understand, and when necessary adequately modify, the rationale behind a law or policy, as well as the shortcomings and trade-offs made when it is put in place. If the foundations are ignored, the structure will always be at risk of toppling.
(Feel free to skip and move to the next para.) IP law and policy are not ‘fundamental’ to being human. They are (supposed to be) carefully calibrated trade-offs permitted in society, for a “net gain” to society. Thus, Copyright Law is allowed its temporary and calibrated suspension of Freedom of Speech and Expression rights [A.19(1)(a)] because of the overall benefit this is supposed to bring society, in the form of a richer creative corpus. In information-economic terms: static inefficiency in the creative corpus is tolerated because of the dynamic efficiency gains it incentivises. As the rationale goes, if this is under- or over-calibrated, not only are rights being arbitrarily impinged, but it also leads to a “net loss” to society in the form of fewer creative expressions being birthed. There are other, less commonly accepted theories that provide other rationales for the presence of exclusion rights in a free society; however, this is the most well accepted one. An academically rigorous approach would have dealt with some of these in suitable detail. Nonetheless, even if that were too much to ask, it is rather straightforward to see what is common to all policy justifications: re-calibrating existing policies (copyright, in this case) rests upon accurately acknowledging and understanding the trade-offs that have been made, and that are being proposed to be made.
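For readers who like this spelled out, here is one stylised way to notate that calibration logic – to be clear, this is purely my own illustrative shorthand, not notation from the report or any source it cites:

```latex
% Stylised welfare calibration of copyright; illustrative notation only.
%   c    : strength/scope of copyright protection
%   D(c) : dynamic efficiency gains (new works incentivised by protection)
%   S(c) : static efficiency losses (access to existing works restricted)
\[
  W(c) = D(c) - S(c), \qquad c^{*} = \arg\max_{c} W(c)
\]
% Both under-calibration (c < c*) and over-calibration (c > c*) yield
% W(c) < W(c*) – a net loss to society relative to what was achievable.
```

On this framing, the burden on any re-calibration proposal is to show that it actually moves W(c) upwards – which is precisely the showing that seems to be missing here.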
While using the language of trade-offs, the document misses the forest for the trees. It is not the purpose of copyright policy to incentivise the creation of AI or creative tools – that is the purpose of industrial policy. It is the purpose of copyright policy to enable more creative expressions. By whom? This may be a point for longer discussion (hint – have these discussions!), but for now, in my understanding, human creators and technology-assisted human creators both come under this umbrella. “AI as creator” does not. And ‘AI slop’ should specifically not be encouraged. Yes, “AI slop” needs further definitional work (here’s one great attempt), and perhaps these concepts should be rigorously explored further before laying out policy that governs them, intentionally or unintentionally.
So – how exactly is this helping create more creative work? What are the trade-offs involved? What exactly is this achieving? There does not appear to be any real consideration or examination of this; rather, it is just assumed. As Akshat Agrawal points out elsewhere, “It warns of ‘underproduction’ and ‘declining output’ without pointing to credible evidence (normative or empirical) from any jurisdiction or sector.”
Notably, even the less rigorous yet useful methodology of consulting all direct stakeholders has been bypassed. There were two rounds of invited stakeholder consultations – one with Big Tech, one with publishers. As important as ‘intermediaries’ are, it is a remarkable miss for the committee not to consult creators. And yes, as creators are not centralised units, this would be a much more intensive exercise than speaking with a handful of power-centralised tech and publishing units. Creators (especially small creators, who are the many, even if not the powerful), AI developers outside the big 3-4 tech giants (who are burning billions in cash flow without a clear road to sustainability or profitability), academics (legal, public policy, or tech), and public interest groups – all appear to be missing from the list of those consulted. Perhaps the government should first figure out a robust consultation process that takes these stakeholders seriously. [Fun sidenote: Of the 43 footnotes citing academic publications, 42 refer to authors in the US/UK/EU, and 1 to authors in China. None to any in India. Of course, there are plenty of Indian case laws and commissioned reports referred to – but it does raise an interesting query about independent Indian academic thought in general! Thanks to Pranjali, NLU Delhi, for pointing this out – she compiled this list here.]
b) Technology and Policy:
Let’s approach this the other way and say – “copyright and its actors be damned, let’s figure out the best growth for AI companies”. Funnily, here too, the document would fail. Bharath Reddy and Mihir Mahajan point out a number of convincing reasons why this would more likely slow the pace of innovation. In addition to those, I’d like to add another reason:
The document uses the concepts of Generative AI and AI interchangeably. This is more than a merely semantic difference. The premise of the recommendation is that Generative AI approaches are the future we want to encourage, and that large datasets are needed for this future. What is the basis for this premise? There is a growing question across scholarly, business, and policy circles – if not a growing consensus – that Generative AI models are likely to be only part of the future of AI, given their various limitations. What happens tomorrow when the world moves on to the next AI approach? Has there been sufficient examination of, say, neurosymbolic AI, which requires only 1-5% of the data LLMs do, while showing results promising enough that investors are taking increasing interest too? Several leading AI experts are raising strong questions in this vein. Yann LeCun, Turing Award winner and former chief AI scientist at Meta, recently said “LLMs are a dead-end”. Gary Marcus, Professor Emeritus (Neural Science) at NYU, has been saying this for a while, and in a piece titled “‘Scale is all you need’ – is dead”, which coincidentally came out the same day as the Committee Report (Dec 08, 2025), he rips apart the idea that scaling and large-dataset reliance is the way forward, while also compiling viewpoints from several other leaders in the field who are coming to the same conclusion. This includes a recent MIT study (PDF) that opens with this line: “Despite $30–40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return.”

So, why are we throwing everything behind this approach? I won’t pretend to be an AI expert – let me give a very generous benefit of the doubt. Maybe all of this is wrong, or maybe I’m looking in the completely wrong direction. But the document would be a lot stronger if it first validated its very premise, by referencing evidence that this is indeed the future worth throwing everything behind!
2. Utterly Impractical
The recommendation given by the committee requires a Copyright Royalties Collective for AI Training (CRCAT), which would consist of members who are Collective Management Organisations (CMOs) and Copyright Societies. Prashant Reddy, as readers of the blog would know, has done some of the most impactful research and scrutiny of India’s experience with copyright licensing societies (go ask Grok/Perplexity/ChatGPT/etc. to make a bullet-point list of the issues Prashant has dug up on Copyright Societies over the years, to get a sense of how much work he’s put into this!). And he quite straightforwardly questions whether the report is dead on arrival – pointing out several impracticalities, logistical as well as legal, and stating: “The proposed model is far too complex for a state that struggles to regulate traffic on its roads. Simply put, India lacks the administrative capacity to enforce it effectively.”
Let’s assume, for the sake of impractical convenience, that magically all those barriers too are overcome. How exactly does this play out? A very simplified thought exercise: the DPIIT report itself states that there are likely over a billion different copyright owners. If anything, this number will only continue to grow. (The Internet Archive recently celebrated its 1 trillionth archived page.) After huge, sudden growth, OpenAI (for example) is now expected to make ~$20 billion in annualised revenues (not profits – those aren’t expected for a while still) in 2025. At an unrealistically generous royalty rate of 10% of revenues, split over the estimated 1 billion copyright owners, this brings the received amount to a grand total of $2 per head. And to give an idea of how unrealistically generous this 10% rate is – the majority of content created in human history is, of course, in the public domain (and let’s not even get into the Traditional Knowledge/Cultural Appropriation questions here). The ‘copyrighted’ portion is only a small part of the data that LLMs train on. Thus, the actual royalty rate, whatever it ends up being, will be a much-contested one, and the eventual per-head payout will likely be a tiny fraction of even the above estimate of $2!
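To make that division explicit, here is the back-of-envelope arithmetic as a few lines of Python – every figure below is the illustrative assumption stated above, not verified data:

```python
# Back-of-envelope royalty-per-owner arithmetic, using the post's assumptions.
annual_revenue_usd = 20e9  # OpenAI's ~$20 billion annualised revenue (2025)
royalty_rate = 0.10        # the deliberately over-generous 10%-of-revenue rate
copyright_owners = 1e9     # the DPIIT report's "over a billion" owners estimate

royalty_pool = annual_revenue_usd * royalty_rate       # $2 billion to distribute
per_owner = royalty_pool / copyright_owners            # naive equal split
print(f"Per-owner payout: ${per_owner:.2f} per year")  # -> Per-owner payout: $2.00 per year
```

Halve the royalty rate, or weight the split by actual usage rather than splitting it equally, and most owners’ shares shrink towards rounding error.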
Let’s not forget that even with the impractical assumption above, there are still other ‘costs’ that come up here. Some that I can think of include the expenses of creating a centralised monitoring organisation, tracing the billion-plus owners, determining which of those are Indian, collecting their payment details, etc. And of course, with this comes the need to verify identities before making payments – so, Aadhaar verification? Whatever the method, would it not now involve a registry of online content and verified KYC linkage of content authors? Another whole can of worms that is better left unopened!
3. Ignorant of Political Economy Concerns: Deterioration of the Digital Commons
The one condition imposed upon AI companies is that the copyrighted content they access must be “lawfully accessed”. What does this mean? It means that while copyright protection doesn’t matter, other laws do. Prashant, as well as Bharath and Mihir, in their respective pieces linked above, point out some practical considerations of how this may affect different players differently.
Practically speaking, there are other considerations too. As a content creator online, the natural response is one that is already commonly seen these days: the raising of more and more paywalls. To the extent it makes sense to do so (say, for Indian content, since the copyright exclusion would no longer be legally enforceable), AI companies will nonetheless figure their way around several of these. A few technically proficient regular users will also manage to access content for uses they are legally permitted to make. The vast majority of real human users, though – as many of you will have personally noticed already – will continue to be hit by more and more paywall barriers. This is not merely an inconvenience. Rather, it is a serious concern, as it also means our accessible ‘commons’ content and infrastructure is shrinking.
A healthy public sphere and a robust creative corpus are structurally interdependent. In other words, if you want a more diverse creative corpus, you require a robust, accessible public sphere to learn from, engage with, and draw upon! Creativity does not arise in isolation.
And this finally brings me to what I believe is the core problem with the entire approach of the committee’s report – its devastating effect on the political economy of information and creativity. To repeat a line from my opening – “In trying to address a superficial, claimed tension between copyright and AI, it ignores, and inadvertently asks to hasten, a much bigger problem – the centralization of power over knowledge governance and production.”
Cory Doctorow, in a brilliant new book titled “Enshittification”, points to the common pattern of degradation that online users experience as power concentrates in platforms. As he explains, platforms start by offering high value to users, easy interoperability, low switching costs, etc. This mimics a public commons and encourages users to join en masse. Once users pour in, businesses follow, and tweaks are made to slowly start extracting rents from those businesses. Finally, platforms move on to squeezing both users and businesses. APIs get locked, interoperability is blocked, and thanks to network effects combined with intentional lock-ins and high switching costs, users (and thus businesses) stay on the platform. This in turn ensures competitive services aren’t given a chance to actually grow. Nonetheless, users must continue to play the algorithmic games to stay ‘relevant’ or ‘visible’. In a happy coincidence for the platform companies, this creates more information about the users, which the platforms can then continue to harvest and mine. And of course, even as they extract value from content all over the internet, it is no surprise that once users and businesses are locked in, platforms reassert legal locks (trade secrets, patent protection, licensing restrictions) on their own workings. Note the pattern here – centralised power accumulates at the top, while value lessens at the bottom. Formalists may find Doctorow’s book title offensive (I can’t stress enough, though – I highly recommend that everyone read it!), so let me also point to noted economist Mariana Mazzucato’s work: she provides a useful macro-economic framing for this in her book The Value of Everything, where she distinguishes between ‘value creation’ and ‘value extraction’, and discusses the ease with which ‘winners’ in the market economy continue winning by extracting rather than creating value.
The centralised architecture (the CRCAT, as provided in the report), under a guise of ‘fairness’ by ‘recompensing’ (unwilling) participants, structurally functions as a funnel of value that allows maximum extraction upwards while restricting what real users can do. It treats the commons, currently accessible in the (already delicate and continuously fragmenting) public sphere, as nothing more than extraction infrastructure, with token mediation via a body that is coming into being just to ensure that private AI companies can increase their bottom line – and for what? It is not clear. Meanwhile, even though in theory smaller AI companies can also take advantage of this infrastructure, in practice it is likely, for several reasons, that we’ll see the same patterns we’ve seen all around the world – the big few will cannibalise any potential competition, and we will have just a handful sitting at the top. Will those companies be forced to offer their products back into the commons? Will they be forced to allow low switching costs and high interoperability with their competition? Experience shows us that these have never been serious considerations in other contexts. It is unlikely to be so now.
Thus, sadly, the Committee’s report seems to reveal a further depth to the process that Doctorow and Mazzucato warn of – the conversion of not just a private platform, but now the public sphere itself, which would otherwise be a site for productive value creation, into mere extraction infrastructure that moves value up the chain to the few big players. It simultaneously leaves smaller creators with less control or power over their creations, and leaves smaller AI developers and open source projects without the ability to compete on financial grounds with deep-pocketed Big Tech.
Needless to say, one hopes (or even assumes?) the proposal cannot go forward in its current form. What is the alternative? Perhaps a starting point would be more discussion on what problem exactly we’re solving for in the first place, along with a clear explanation of what exactly we want to achieve. [Please note: The Committee invites comments on the draft within 30 days of Dec 08, 2025.]
A big shoutout and thanks to the SpicyIP group, who have been discussing this paper in the days since the report came out, and in particular to Akshat, Bharathwaj, Kartik, Praharsh and Sonisha for their comments and feedback on this draft!
