Emerging Infectious Diseases (Jun 1998)

Accommodating Error Analysis in Comparison and Clustering of Molecular Fingerprints

  • Hugh Salamon,
  • Mark R. Segal,
  • Alfredo Ponce de Leon,
  • Peter M. Small

DOI
https://doi.org/10.3201/eid0402.980203
Journal volume & issue
Vol. 4, no. 2
pp. 159 – 168

Abstract

Molecular epidemiologic studies of infectious diseases rely on pathogen genotype comparisons, which usually yield patterns comprising sets of DNA fragments (DNA fingerprints). We use a highly developed genotyping system, IS6110-based restriction fragment length polymorphism analysis of Mycobacterium tuberculosis, to develop a computational method that automates comparison of large numbers of fingerprints. Because error in fragment length measurements is proportional to fragment length and is positively correlated for fragments within a lane, an align-and-count method that compensates for relative scaling of lanes reliably counts matching fragments between lanes. Results of a two-step method we developed to cluster identical fingerprints agree closely with 5 years of computer-assisted visual matching among 1,335 M. tuberculosis fingerprints. Fully documented and validated methods of automated comparison and clustering will greatly expand the scope of molecular epidemiology.
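
The align-and-count idea can be illustrated with a minimal sketch. This is not the authors' published algorithm; it only assumes what the abstract states, namely that measurement error is proportional to fragment length and shared across a lane, so one lane may be rescaled by a single factor before matching fragments are counted. The function names, the 2% relative tolerance, the ±5% scaling range, and the grid search over scale factors are all hypothetical choices made for illustration.

```python
def count_matches(lane_a, lane_b, tol=0.02):
    """Count one-to-one fragment matches between two lanes.

    Two fragments match when their lengths differ by at most `tol`
    relative to the longer one (error assumed proportional to length).
    """
    a, b = sorted(lane_a), sorted(lane_b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol * max(a[i], b[j]):
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


def align_and_count(lane_a, lane_b, tol=0.02, max_shift=0.05, steps=21):
    """Try a grid of scale factors for lane_b and keep the best count.

    Hypothetical alignment step: a single multiplicative factor per lane
    stands in for the within-lane correlated error.
    Returns (best match count, scale factor that first achieved it).
    """
    best, best_scale = 0, 1.0
    for k in range(steps):
        scale = 1.0 - max_shift + 2.0 * max_shift * k / (steps - 1)
        scaled = [scale * x for x in lane_b]
        m = count_matches(lane_a, scaled, tol)
        if m > best:
            best, best_scale = m, scale
    return best, best_scale


# Illustrative fragment lengths in base pairs (made up, not from the study).
lane_a = [1000, 2000, 4000, 8000]
lane_b = [1040, 2080, 4160, 8320]  # same pattern, lane stretched by 4%
unaligned = count_matches(lane_a, lane_b)
aligned, scale = align_and_count(lane_a, lane_b)
print(unaligned, aligned)
```

In this toy example the two lanes carry the same pattern, but a uniform 4% stretch defeats a direct comparison at a 2% tolerance; allowing one scale factor per lane recovers all four matching fragments.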

Keywords