Friday, April 11, 2014

Aadhar: A Look from First Principles of Technology



My two blog posts [of Mar. 28, 2014 and Apr. 8, 2014] on the legal and constitutional implications of the Aadhar project evoked strong responses - calls, e-mails, comments and the ilk - from some of my close friends, ex-classmates and ex-colleagues. 

Some said that, by focusing on the legality of the scheme, I was guilty of missing the forest for the trees. Others argued that the benefits of Aadhar were extensive enough to ignore such minor technicalities.

“The end justifies the means, doesn’t it?” they asked, not without some validity.

So the scheme was keenly debated each time. But, in each debate / discussion, neither side was convinced about the other side’s views, claims or arguments.  

And then, I watched on YouTube an interview of Sri. Nandan Nilekani with CNN-IBN’s Sri. Rajdeep Sardesai. In that interview, Sri. Nilekani said, “...once anybody looks at the (Aadhar) scheme from first principles, they will come to the conclusion that this was the right way to do it...”

That was an invitation for me to look at the technological first principles behind Aadhar and to attempt to verify and validate Sri. Nilekani’s claim. Well, he had thrown down the gauntlet and how could I back off?

Caveat: While I am not cynical about “unique identification” per se, I am surely critical of the way UIDAI has gone about the Aadhar implementation.

Technology Pieces to the Aadhar Puzzle  

As is well-known, the UIDAI paradigm consists of four core operational components: 
  1. Resident Enrollment: Resident data (demographic and biometric) is collected through an application process by Enrollment agencies. Biometric data captured includes all ten fingerprints, a photograph and both iris scans. The collated resident information is then verified and submitted to the Central Identities Data Repository (CIDR) through designated Registrars. The CIDR runs a de-duplication check to ensure that the resident is not already enrolled.
  2. Resident Data Storage: The Central Identities Data Repository (CIDR) is at the heart of the UIDAI system. The repository contains demographic and biometric data of all enrolled residents. It stores resident records and issues unique identification numbers based on verification and authentication of resident data.  
  3. Resident Data Update: Updates are periodically effected (based on applications for editing / amending demographic data) to reflect changes in resident data.
  4. Identity Authentication: Registrars will create infrastructure for enabling both online and offline authentication of individual identities based on data contained in the central repository, i.e., CIDR.

Resident Enrollment  

Biometric data collection poses major technical challenges. Indeed, the success or otherwise of ‘unique identification’ or ‘validation of identity’ of residents depends entirely on the quality, consistency and reliability of biometric data capture and pattern recognition. Hence, let me first examine issues related to the biometric data collated by UIDAI.

A) Fingerprinting 
UIDAI has selected 10 fingerprints, i.e., three slap fingerprint images (4+4+2), as one of the core biometric modalities for unique identification. The quality of the fingerprint data captured is a key factor in the success of ‘unique identification,’ which is accomplished through pattern recognition using algorithms. Fingerprint patterns are aggregates of ridges (arches, loops and whorls), minutia points, etc.

UIDAI has standardized on scanners based on optical fingerprint imaging. It is well-known that, as a technology, optical fingerprint imaging has several limitations:
  • A scratched or dirty touch surface on the scanner produces poor-quality fingerprint images.
  • Imaging capabilities are affected by the quality of skin on the finger. A dirty or marked finger is difficult to image properly. Most Indians, particularly in rural areas, cannot be expected to have clean fingers.
  • An eroded outer layer of skin (due to aging, hard labor, damaged papillae or otherwise) may wipe out the finger ridges to the point where fingerprints are no longer visible.
  • In cold temperatures (typically below 15°C, which is common in India, especially during winters), fingers lose moisture. This leads to dry fingertips, which produce poor-quality image scans. Even though the scanner registers the print, the fingerprint fails during matching.
  • The BioScan-10 fingerprint scanner from BioEnable Technologies, chosen by UIDAI for enrollments, does not have ‘live finger’ detection (typically done by detecting blood flow in the finger). This means that the equipment can be fooled with fake fingers or images of fingerprints.
Given these practical issues and technological challenges in the quality and reliability of fingerprint image capture, many questions arise.

To begin with, what is the efficacy of fingerprint pattern recognition and the effectiveness of identity authentication in the UIDAI system? How reliably will scanners deployed in hot, dusty environments during the enrollment process produce good images of fingerprints? What is the average quality level of fingerprint scans across all demographic segments?


B) Iris Scanning
Iris patterns are no doubt complex, random and unique. They lend themselves to pattern recognition and hence identity authentication. Iris scanning uses camera technology and subtle infrared illumination to acquire images of the detail-rich, intricate structures of the iris. Besides speed of matching and low probability of false matches, iris scanning offers the advantage of stability of an internal, protected, yet externally visible organ for identity authentication.

Nevertheless, iris pattern recognition too poses some practical and technological difficulties, such as:
  • Iris scanners are sensitive to lighting levels and hence, accuracy and efficacy can be affected by changes in lighting.
  • Iris recognition is susceptible to poor image quality and can be tricked using images generated from the digital codes of stored irises.
  • Alcohol consumption causes recognition degradation, since the pupil dilates / constricts causing deformation of the iris pattern.
  • Cataract surgery too can cause iris texture changes, thus making pattern recognition no longer feasible.  
Thus, it is apparent that iris recognition too entails uncertainties about effectiveness. 

Database Management System Architecture  

It is further unclear as to how the UIDAI data repository is designed and implemented. Is the database distributed in an n-tier architecture? Or, is it centralized in a monolithic database? If it is the former, then what design precautions have been taken to ensure data integrity and consistency? If it is the latter, what mechanisms are in place to ensure data availability?

Further, it appears the impact of network and application architecture on system performance has neither been thoroughly tested nor made public.

Identity Authentication  

Another area of the Aadhar system that lacks clarity is the robustness and reliability of identity verification. For instance, what is the incidence of false negatives and false positives in the system during identity authentication (which is the primary revenue-generator)? Further, it is not clear if any consistency and reliability tests with regard to identity verification were carried out on the Aadhar application to evaluate its effectiveness.  
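To make the terms concrete, here is a minimal sketch of how false negatives (genuine residents rejected) and false positives (impostors accepted) are usually counted at a given match threshold. The scores, the threshold and the function name are purely illustrative assumptions of mine; they are not UIDAI data or UIDAI's matching method.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a match threshold.

    A comparison score >= threshold is treated as a match.
    genuine_scores: scores from a person compared against their own record.
    impostor_scores: scores from a person compared against someone else's record.
    """
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    return (false_rejects / len(genuine_scores),
            false_accepts / len(impostor_scores))


# Hypothetical similarity scores in [0, 1], invented for illustration only.
genuine = [0.91, 0.84, 0.42, 0.77, 0.95]
impostor = [0.10, 0.35, 0.62, 0.05, 0.20]

frr, far = error_rates(genuine, impostor, threshold=0.5)
print(f"False reject rate: {frr:.0%}, false accept rate: {far:.0%}")
```

Raising the threshold trades false accepts for false rejects, which is why both rates need to be reported together, at a stated threshold, for the authentication step.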

The impact of network and application architecture on system performance has perhaps also not been tested.

Aadhar Proof-of-Concept (PoC)

The UIDAI Proof-of-Concept (PoC) was restricted to the resident enrollment process. Further, the Biometric Technology in Aadhar Enrollment report states that the PoC only looked into false positive identification rate (FPIR = 0.057%) and false negative identification rate (FNIR = 0.035%) in the enrollment process. 
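To get a feel for what these percentages mean in absolute numbers, here is a back-of-the-envelope sketch. It assumes, purely for illustration, that the PoC rates would hold unchanged at national scale, which is precisely what remains unproven. The enrollment and duplicate-attempt counts are hypothetical figures of mine, not UIDAI numbers.

```python
# Rates quoted from the PoC report, converted from percent to fractions.
FPIR = 0.057 / 100  # genuine new enrollees wrongly flagged as duplicates
FNIR = 0.035 / 100  # duplicate enrollment attempts that slip through


def expected_errors(new_enrollments, duplicate_attempts, fpir=FPIR, fnir=FNIR):
    """Expected error counts, assuming the rates stay constant at scale."""
    falsely_flagged = new_enrollments * fpir
    missed_duplicates = duplicate_attempts * fnir
    return falsely_flagged, missed_duplicates


# Hypothetical volumes: one billion genuine enrollments and one million
# attempts to enroll a second time.
flagged, missed = expected_errors(1_000_000_000, 1_000_000)
print(f"Genuine residents wrongly flagged: about {flagged:,.0f}")
print(f"Duplicate enrollments missed: about {missed:,.0f}")
```

Even at these seemingly tiny rates, a billion enrollments would mean roughly 570,000 genuine residents flagged as duplicates, each needing manual resolution.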

Besides, the PoC study was carried out in South India during the summer months, when the impact of many aggravating conditions (such as dry fingertips, ambient light, etc.) would be minimal.

Hence, it is unclear as to what percentage of false positives and false negatives would have resulted from a study of the end-to-end "enrollment to de-duplication to repeated identity authentication" process.

Further, the PoC study did not focus on the system’s effectiveness in detecting manipulated biometric submissions (e.g., left hand of one person, right hand of another person and eyes of a third person for scans) for creating fake and fraudulent identities. Were any volume or stress or load tests carried out to determine system robustness and reliability? 

A relevant point to be noted is that the UIDAI Biometrics Standards Committee, in its report titled ‘Biometrics Design Standards for UID Applications’ concluded that two factors raise uncertainty on the extent of accuracy achievable through fingerprints. First, the scaling of database size from fifty million to a billion has not been adequately analyzed. Second, the fingerprint quality, the most important variable for determining accuracy, has not been studied in depth in the Indian context. 
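Why does database size matter so much? A simplified textbook model, sketched below, treats every one-to-one comparison during de-duplication as an independent trial with a small false match rate. That independence is an assumption, and the per-comparison rate used here is a hypothetical value picked only to show the trend; neither comes from UIDAI or from the Committee's report.

```python
import math


def fpir_at_gallery_size(fmr, gallery_size):
    """Chance that a new enrollee falsely matches at least one stored record.

    Simplified model: FPIR = 1 - (1 - fmr) ** gallery_size, assuming every
    comparison is independent. log1p/expm1 keep the arithmetic accurate
    when fmr is tiny.
    """
    return -math.expm1(gallery_size * math.log1p(-fmr))


# Hypothetical per-comparison false match rate, for illustration only.
HYPOTHETICAL_FMR = 1e-11

for n in (50_000_000, 1_000_000_000):
    rate = fpir_at_gallery_size(HYPOTHETICAL_FMR, n)
    print(f"{n:>13,} records: FPIR of about {rate:.4%}")
```

Under these assumptions, a false positive rate of about 0.05% at fifty million records grows to roughly 1% at a billion, a twenty-fold increase. Real systems use multiple modalities and tuning to fight this growth, but the sketch shows why results at one database size cannot simply be carried over to another.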

The report goes on to claim that biometric software needs to be tuned to local data. If the software is not tuned, it can generate additional errors in the range of 2 to 3%. As per the report, an unchecked operational process too can increase the false acceptance rate to over 10%. It is not clear how UIDAI went about addressing these issues in the PoC and beyond. 

Conclusion  

In the overall analysis, the PoC did NOT really look into the end-to-end process of fingerprint data capture, comparison and matching studies over a protracted period of time to truly simulate real-world conditions and to ascertain true error rates of identity authentication. This is contrary to what one would expect for a project with as huge a proposed expenditure outlay as the Aadhar scheme. 

Thus, my evaluation of the UIDAI system from first principles yielded more questions than answers. Indeed, whether the cumulative effect of all system inadequacies, inefficiencies and inherent weaknesses has been factored into risk assessment is unclear.

So then, will Sri. Nandan Nilekani accept that, at a bare minimum, the Aadhar system has been implemented hastily?

Regardless, the questions that will plague people's minds are:
  1. Did due diligence suffer because it was public money being spent at UIDAI?   
  2. Would Sri. Nilekani have rushed through with the execution of any such project at Infosys?
Well, I guess those questions will never be answered satisfactorily. 

I only hope, though, that the UPA's and Sri. Nilekani's haste does not lead to waste on Aadhar!!