While the Joint Parliamentary Committee (JPC) released its much-awaited report on a new data protection legislation late last year, as reported, the bill may be scrapped in favour of a completely new law. The personal data protection bill had created significant buzz in industry and society. So much so that the JPC received feedback from over 200 stakeholders representing a cross-section of society. After years of consultations from 2018 onwards, the bill transformed from a ‘Personal Data Protection Bill’ to a ‘Data Protection Bill’ to what may now be known as a ‘Digital India Act’. The change in titles suggests a shift in focus – from protection of privacy to regulation of data as an asset. Be that as it may, as reported, the government is still clear that technology should be defined by ‘openness, safety, trust and accountability’ to users.
Transparency in the manner in which personal data is processed is one of the foundations of ensuring openness, trust and accountability. Algorithmic transparency – the bedrock of data handling processes – has been a much debated subject: how much of it is needed, and where its boundaries lie. In India, algorithms are excluded from patent protection and, being mere ideas, do not qualify for copyright protection either. They may, however, still be commercially valuable and are often held as trade secrets. Since they are held as trade secrets, their disclosure to satisfy transparency obligations becomes contentious.
Algorithms are a set of instructions or steps for completing a task. As we may know by now, the internet runs on data-hungry algorithms. There is also a noticeable movement towards adoption of automated technology by the state to increase efficiency. Be it the push for the creation of ‘smart cities’, the use of facial recognition systems for crime mapping, or the judiciary’s enthusiasm towards adoption of artificial intelligence (AI) systems, the automation movement is becoming more pronounced. While algorithms may appear neutral on the surface, they can perpetuate biases and cause significant harm. However, algorithms, especially predictive algorithms, are not easy to understand because of their complexity. They are made up of multiple layers of instructions to solve complex problems, and understanding decisions taken by algorithms could involve unravelling multiple such layers. While there is a growing repository of algorithmic information on the Patent Office website due to the increasing number of patent applications for computer-related inventions, the applications may not always disclose key algorithms.
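To illustrate why layered instructions obscure a decision, consider a hypothetical eligibility check. The features, thresholds and weights below are invented purely for illustration; the point is that a single readable rule is easy to explain to an affected person, while even a few stacked layers make the combined decision boundary hard to trace back to any one instruction.

```python
# Hypothetical illustration only: a single readable rule versus
# a multi-layer decision whose combined logic is harder to explain.

def eligible_single_rule(income):
    """One instruction: easy to explain to an applicant."""
    return income >= 30_000

def eligible_layered(income, age, region_score):
    """Several stacked rules: each layer is simple, but the overall
    decision boundary is no longer obvious from any one of them."""
    risk = 0.0
    # Layer 1: income band
    risk += 0.0 if income >= 30_000 else 0.5
    # Layer 2: age adjustment
    risk += 0.2 if age < 25 else 0.0
    # Layer 3: a weighting standing in for learned model parameters
    risk += region_score * 0.3
    return risk < 0.5

print(eligible_single_rule(35_000))       # True
print(eligible_layered(35_000, 22, 0.9))  # True, but via three interacting layers
```

A real predictive model stacks far more layers than this sketch, which is why ‘unravelling’ its decisions is non-trivial.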
This makes data processing opaque. There have therefore been calls for making the inner workings of algorithms transparent. In line with these demands, the JPC in its report recommended that data fiduciaries – ‘where applicable’ – should make available information regarding ‘the fairness of their algorithms or methods used for processing of personal data’.
Conceptually, this obligation to ensure transparency with respect to algorithms is welcome where the State is the data fiduciary. It obligates the State (which adopts the technology and therefore determines the means and purpose of processing personal data) to make available information about the fairness of the algorithms that it uses. The presumable aim is that the public will be able to examine algorithms that are used to carry out public functions, thereby increasing transparency in the way the State operates. This is in line with the requirements under the Right to Information Act, 2005, which requires the State to act transparently. Performance of regulatory functions and ensuring accountability of officials while discharging their functions are matters of public importance that require transparency. Moreover, if the State uses algorithms that it cannot understand or explain, it is far from ‘smart’. The hope therefore is that this requirement will not only benefit transparency but also enlighten the State itself.
Since the State often procures technology from third-party service providers, transparency is not easy to achieve. Third parties that provide technological solutions to the State may retain ownership over their proprietary algorithms and may also require the State to adhere to broad confidentiality obligations that cover everything around their technology. This ensures that the commercial value of their algorithms is preserved, since they are kept secret and are protected contractually from disclosure. As seen in the past, the Delhi Police refused to share information about its facial recognition technology, citing a trade secrecy exemption under the Right to Information Act, 2005.
Without a legal obligation requiring some level of algorithmic transparency, the State may throw up its hands and claim such exemptions. The absence of such an obligation only dilutes trust, openness, transparency and accountability – antithetical to the functioning of a democratic country. The hope, however, is that this obligation will continue even in new versions of the law vis-à-vis the State (and will hopefully not be diluted by broad governmental exceptions). While we would not want a Coca-Cola type situation again – where Coca-Cola chose to withdraw from India (for years) instead of handing over its secret formula as demanded by the government – a balance can surely be reached.
The State will need to rethink contractual clauses with technology providers. Ideally, if technology is custom made for the State, the State should require the third-party solution provider to assign all intellectual property rights (including algorithms) to the State. Once the State is the owner of the technology, it can make all the required disclosures under law and cannot evade these obligations on the grounds of trade secrecy. While this is a desirable situation, it may not be feasible in all cases. For example, machine learning algorithms having the same underlying architecture could be trained on different data and made ‘custom’. Given that the underlying algorithms may have multiple uses, the company that owns the algorithms may not want to give up ownership.
If the State cannot manage to negotiate an assignment (the technology only being licensed), it must ensure that its confidentiality obligations are narrowed down. A narrow confidentiality clause will force both the State and the technology provider to sit down and spend time on understanding the technology that will be deployed. This will enable identification of different parts of the technology and their correlation with both the State’s transparency obligations and the technology provider’s trade secrecy requirements. Only a narrow set of information related to the technology should be protected as confidential. The remaining aspects of the technology should be free to be disclosed in order to enable the State to meaningfully adhere to its transparency obligations. Algorithm audits should be conducted by the State and made available – the format and periodicity of which should be agreed to and set out in the contract.
Some of these options have been explored in countries like the USA to maintain a balance between government deployed AI technologies and democratic values such as transparency.
Beyond contractual clauses, the provision of law that sets out these obligations requires a lot more clarity than the clause in the present Data Protection Bill, 2021. For instance, does the algorithm itself need to be disclosed to prove fairness? Will a self-certification satisfy this requirement? What is fair and who decides what is fair? Will this apply only to certain kinds of algorithms such as predictive algorithms? Is the disclosure requirement a public disclosure or a confidential disclosure? Without clarity on these aspects, the requirement of the law will remain a feel-good provision with little hope of implementation especially vis-à-vis the State.
A more systemic issue that also requires consideration in order to enable better transparency is the need to strengthen the judicial system itself, so that it can deal swiftly with misappropriation claims. It is usually not the disclosure of confidential information/trade secrets that is the issue; it is misappropriation through improper use that causes concern. In jurisdictions with weak enforcement mechanisms, companies may make business decisions to tighten secrecy rather than promote disclosure.
2 thoughts on “Algorithmic Transparency and the Smart State”
“Explainability” is the concept that a machine learning (ML) model and its output can be explained in human language terms in a way that makes sense to a human being.
Not all ML models are explainable. While some basic ML models, like decision trees, can be called algorithms and explained in “if this then that” terms, newer and more advanced neural-net based ML models are not at all explainable in a human language sense.
Secondly, the model is trained on some data, after which it performs a prediction. The predictive quality depends in no small measure upon the quality and content of the training data rather than the model code itself.
Consequently, “algorithmic transparency” is a futile expectation as it is equivalent to mandating the use of lower-performing explainable models. To make a generalized law, trained models will need to be treated as black boxes where only the inputs and outputs are observable.
Thanks for your comment and for explaining the issues involved with ‘explainability’. There is a tension between adoption of advanced ML technology and public trust and accountability – when it comes to use by the state.
As you suggest, one option could be to have a generalised law that treats trained models as black boxes where only inputs and outputs are observable. However, this may not be optimal in all cases – especially when individual interests are involved and stakes are high.
Some have suggested a ‘balancing test’ where levels of explainability (and relatedly algorithmic transparency) are tailored to the interests / stakes involved.
In certain cases, the state should stay away from using algorithms that can’t be explained to the public. Simpler technology with transparent algorithms should be used instead. However, in other cases, more complex, advanced ML systems may be used.
So a generalised law requiring transparency should be the rule with room for exceptions.
Also, between the service provider (private sector) and the public agency adopting the tech – the public agency should examine all aspects of the private sector algorithms and training data that they rely on.