IEEE Access (Jan 2021)

An Indexing Algorithm Based on Clustering of Minutia Cylinder Codes for Fast Latent Fingerprint Identification

  • Ismay Perez-Sanchez,
  • Barbara Cervantes,
  • Miguel Angel Medina-Perez,
  • Raul Monroy,
  • Octavio Loyola-Gonzalez,
  • Salvador Garcia,
  • Francisco Herrera

DOI
https://doi.org/10.1109/ACCESS.2021.3088314
Journal volume & issue
Vol. 9
pp. 85488 – 85499

Abstract

Latent fingerprint identification is one of the leading forensic activities for clarifying criminal acts. However, its computational cost hinders rapid decision making when an individual must be identified against large databases. To reduce the time needed to generate the ordered list of candidate fingerprints to be compared, fingerprint indexing algorithms are developed that reduce the search space while minimizing the increase in the error rate compared to identification. In the present research, we propose an algorithm for indexing latent fingerprints based on minutia cylinder codes (MCC). This type of minutia descriptor has a fixed structure, which brings advantages in terms of efficiency. Moreover, in recent studies this descriptor has shown a lower identification error rate, at the local level, than the other descriptors reported in the literature. Our indexing proposal requires an initial step to construct the indices, in which it uses the k-means++ clustering algorithm to create groups of similar minutia cylinder codes corresponding to the impressions of a set of databases. K-means++ allows for a better outcome than other clustering algorithms because of its selection of proper centroids. The buckets associated with each index are populated with the background databases. Then, given a latent fingerprint, the algorithm retrieves the minutia cylinder codes associated with the cluster indices at the lowest distance from each descriptor of this latent fingerprint. Finally, it integrates the votes represented by the fingerprints obtained to select the candidate impressions. We conduct a set of experiments in which our proposal outperforms current rival algorithms across different databases and descriptors. In addition, the primary experiment reduces the search space by four orders of magnitude when the background database contains more than one million impressions.
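
The pipeline described in the abstract (cluster descriptors with k-means++, fill one bucket per cluster from the background database, then let each latent descriptor vote through its nearest clusters) can be illustrated with a minimal sketch. This is not the authors' implementation. Short float tuples stand in for real minutia cylinder codes, squared Euclidean distance stands in for an MCC similarity measure, and all function names and the `n_probe` and `top` parameters are hypothetical choices made for this illustration.

```python
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed stand-in for an MCC distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest chosen centroid."""
    centroids = [rng.choice(points)]
    while len(centroids) < k:
        d2 = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(d2) == 0:  # every point already coincides with a centroid
            centroids.append(rng.choice(points))
        else:
            centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


def nearest(centroids, p):
    """Index of the centroid closest to descriptor p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))


def kmeans(points, k, rng, iters=20):
    """Lloyd iterations started from k-means++ seeds."""
    centroids = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(centroids, p)].append(p)
        centroids = [
            tuple(sum(col) / len(groups[i]) for col in zip(*groups[i]))
            if groups[i] else centroids[i]  # keep an empty cluster's centroid
            for i in range(k)
        ]
    return centroids


def build_index(centroids, background):
    """Populate one bucket per cluster with the ids of background fingerprints."""
    buckets = defaultdict(list)
    for fp_id, descriptors in background.items():
        for d in descriptors:
            buckets[nearest(centroids, d)].append(fp_id)
    return buckets


def query(centroids, buckets, latent_descriptors, n_probe=1, top=5):
    """Each latent descriptor votes for every fingerprint stored in its
    n_probe closest buckets; the most-voted fingerprints are the candidates."""
    votes = Counter()
    for d in latent_descriptors:
        order = sorted(range(len(centroids)), key=lambda i: sq_dist(d, centroids[i]))
        for i in order[:n_probe]:
            votes.update(buckets.get(i, []))
    return [fp_id for fp_id, _ in votes.most_common(top)]


# Toy background database: fingerprint id -> list of 2-D "descriptors".
background = {
    "A": [(0.0, 0.0), (0.1, 0.0)],
    "B": [(5.0, 5.0), (5.1, 5.0)],
    "C": [(10.0, 0.0), (10.0, 0.2)],
}
all_descriptors = [d for ds in background.values() for d in ds]

centroids = kmeans(all_descriptors, k=3, rng=random.Random(0))
buckets = build_index(centroids, background)
candidates = query(centroids, buckets, [(5.05, 4.9), (4.9, 5.2)])
print(candidates)
```

Only the fingerprints found in the probed buckets are passed on to the full matcher, which is how an index of this kind shrinks the search space. In this sketch, raising `n_probe` trades a larger candidate list for a lower chance of missing the true mate.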

Keywords