On Gandhi, Malamud and the JNU Data Depot

In response to my piece on the ‘JNU data depot’, I received a couple of emails from Carl Malamud alleging factual inaccuracies. I invited him to respond to the errors on SpicyIP, in response to which he started to abuse me. When I told him I was going to publish the abusive email on SpicyIP, he apologized, and the next day he withdrew the apology in another abusive email. He then publicly attacked SpicyIP on Twitter.

I find it necessary to discuss his backlash on this blog because Malamud likes to market himself as a Gandhian in India. His book on Code Swaraj with Sam Pitroda is all about taking inspiration from Gandhi for a modern-day campaign of civil resistance against the state to democratize access to knowledge for all citizens.

Last year, on Gandhi’s birthday, we published an interview with Malamud on SpicyIP, where he unequivocally described himself as a Gandhian. Here’s an interesting quote from Malamud: “I would definitely describe myself as a Gandhian, in the sense of being a student of Gandhiji and trying to learn from his many examples.”

While I was teaching at NALSAR last year, I organised a lecture by Malamud after he expressed an interest in speaking at our university. Once again, he made Gandhi and Gandhian values a part of his lecture.

Whether or not you like Gandhi, you have to admit that he is a very tough act to follow. This was a man who would have chosen death over untruth and who chose the path of non-violence even when he was being physically attacked. There is a reason we call him the Mahatma.

So, imagine my surprise when I received abusive emails from Malamud for critiquing his project on legitimate grounds. But the abusive emails are a minor offence compared to his refusal to confirm whether he had sourced his papers from pirated databases, like Sci-Hub, despite specifically being asked the question by Priyanka Pulla, who wrote about his project in Nature.

I flagged this issue in my earlier post but did not go into much detail. It is, however, an important issue, because a court of law will view a database built from legitimate copies very differently from one built from a source like Sci-Hub, which has been declared to be a pirated database. I assure you the Google Books case would have ended very differently in the United States if Google had uploaded scanned copies of pirated books rather than legitimate copies stored in libraries. A database built from pirated copies will automatically be copyright infringing because of theories of secondary liability, under which a party that has not itself committed the primary act of illegal copying is still liable if it has benefited from that act. The same holds true for the DU Photocopy case: it would have ended very differently if the Rameshwari Photocopy shop had been photocopying pirated textbooks instead of legitimate copies from the DU library.

There is no question of a fair dealing analysis in such cases. For example, if Malamud got the papers for the database from Sci-Hub, which has been declared to be a pirated database by multiple courts in the West, there is a strong case of secondary liability against Malamud and JNU. If the papers come from a legitimate database, a court will have to examine whether contractual conditions have been violated, because most publishers impose contractual conditions preventing the use of their copyrighted material in such a manner. I presume this is the reason that Malamud did not launch a similar data depot in the US despite the Google Books case offering legal cover for such a project.

If Gandhi had launched the data depot at JNU, he would have been brave enough to be honest about the source of these papers. A commitment to the complete truth was an integral part of Gandhi’s values, even if it meant facing harsh consequences like a lawsuit for copyright infringement. Gandhi was ready to choose death over being untrue. It is only in testing conditions that a person’s true commitment to the truth can be tested. Any person can be honest in cases with no consequences.

A second, tangential issue that I want to comment on is Malamud’s brand of activism. His venture at JNU has exposed the public university to significant financial risk because JNU and the Government of India have contracts with multiple publishers for accessing databases. I am quite sure that hosting the JNU data depot violates those contracts. The cost of defence, especially in foreign arbitration, will be significant, and I do not think cash-starved Indian universities should be spending their money on such litigation. If Malamud’s intention was to help Indian scientists, he could have quietly provided the data depot to JNU without calling a journalist to write about it in one of the most widely read science journals in the world. But publicity is unfortunately the oxygen of the non-profit world, without which it is difficult to raise funds from donor organisations.

Prashant Reddy

T. Prashant Reddy graduated from the National Law School of India University, Bangalore, with a B.A.LLB (Hons.) degree in 2008. He later graduated with an LLM degree (Law, Science & Technology) from the Stanford Law School in 2013. Prashant has worked with law firms in Delhi and in academia in India and Singapore. He is also co-author of the book Create, Copy, Disrupt: India's Intellectual Property Dilemmas (OUP).


  1. Stephen Winlow

    As a practitioner of law, I often try to seek endorsement of what I believe or profess from other practitioners. However, when I read SpicyIP, I do so to bring an objective standard to my beliefs. I kindle my intellectual integrity to bow down to better informed and more knowledgeable people who contribute their expert views to SpicyIP.

    Like Prashant Reddy, I feel that the subject article serves only what the author believes. It does not objectively delve into the niceties of the applicable law and the relevant and material facts concerning JNU’s Text and Data Mining Depot.

    Here are questions that we can pose for the sake of assessing the law:

    Can copyright law restrict access to data and allow its author exclusivity in “communicating it to the public” by licensing it or assigning it for a price?

    How did Carl and Andrew source 73 million journal articles that have proprietary and/or confidential value?

    Did the duo lawfully procure it for JNU’s avowed purposes or was it pilfered?

    Can the facade of enabling or facilitating research justify the unfair and unauthorized means adopted to acquire researchable content?

    Will not JNU and its researcher community violate Indian copyright law just as much as anyone committing piracy of proprietary content on the world wide web?

    Since the articles in the JNU data depot are still under copyright protection and included in the database without permission from their copyright owners, would not this be pirated content?

    Will not the JNU depot offend the Copyright Act 1957, since the accumulation of the depot’s content is admittedly unauthorized and without a license?

    Can the JNU Depot’s claim that no one will be allowed to read or download those works from that facility be accepted at face value?

    Isn’t the JNU facility doing something more illegal by permitting the use of computer software to crawl over pirated text and data without authorization?

    How can JNU lawfully prevent researchers from indirectly using its facility for commercial research purposes?

    Can the data’s restricted deployment or end use be guaranteed by the JNU depot?

    Can UK, Europe or US law have any persuasive value in Indian courts, when specific Indian law is available or applicable to electronic data and protects the same?

    Can Indian law look into the nature of exploitation if the source of content or its acquisition itself is unauthorized?

    Will not unlawful access to copyrighted content automatically constitute infringement of copyright?

    Should not researchers be wary of accessing unauthorized data and pressing this into service for their research work?

    Would not unauthorized accumulation and exploitation of proprietary content by the JNU depot, dehors its end use, be violative of the Copyright Act 1957?

    When the source of the accumulated content itself is unauthorized, will not the argument of “fair use” or “fair dealing” fall flat in law?

    Is not the DU decision of Endlaw, J. sub judice before the SC?

    Can the JNU depot, ‘ex facie’ carrying on an illegal activity, take refuge in the claim that it is extended only for research purposes?

    Can access restrictions prevent and regulate eventual end use?

    Will not a researcher taking advantage of such unauthorized content for research become an accessory to the fact of infringement?

    Must not the accumulation of content by the JNU depot first of all be legal and authorized?

    Must there not be a paper trail of procurement of content available with JNU?

    Will not misinformation, plagiarism and piracy become the order of the day if this interpretation by the author is upheld in India?

    Will not a short cut in relation to law always be a wrong cut, since it would be easy and wrong to take this defense against a charge of infringement?

    Are not all rights reserved by a copyright holder, and do these not include all future rights as well?

    Will not this specific TDM technology-based exploitation of proprietary content or data fall squarely within the mischief of infringement, dehors its effect on the information market?

    Since the Depot and its content accumulation themselves have no legal standing under copyright law in India, will not any of its use, access and exploitation be illegal too?

    Can the pursuit of knowledge or its dissemination by any university or researcher be at the cost of violating somebody’s IPR?

  2. DS

    Prashant notes that Malamud has refused to confirm whether “he (Malamud) had sourced his papers from pirated databases, like Sci-Hub, despite specifically being asked the question by Priyanka Pulla who wrote about his project in Nature.”

    However, the Nature article linked in the post makes it clear that they came from Sci-Hub; Malamud only refuses to comment on who gave them to him. It states: “And around the same time that he heard about the Rameshwari judgment, he had come into possession (he won’t say how) of eight hard drives containing millions of journal articles from Sci-Hub, the pirate website that distributes paywalled papers for anyone to read.”

    By Prashant’s own analysis, therefore, the question of whether this is illegal seems to be pretty much settled.

    1. Prashant Reddy Post author

      Actually, it is not that clear. Here is a different extract from the Nature article:

      “Ultimately, he zeroed in on the idea of the JNU text-mining depot instead. (Malamud has also helped to set up another mining facility with 250 terabytes of data at the Indian Institute of Technology Delhi, which isn’t in use yet.) But he is cagey about where the depot’s articles come from. Asked directly whether some of the text-mining depot’s articles come from Sci-Hub, he said he wouldn’t comment, and named only sources that provide free-to-download versions of papers (such as PubMed Central and the ‘Unpaywall’ tool).”

      It says that he is cagey about the source of the articles and would not comment on whether the articles are from Sci-Hub.


