BMC Bioinformatics (May 2024)

Machine learning based DNA melt curve profiling enables automated novel genotype detection

  • Aaron Boussina,
  • Lennart Langouche,
  • Augustine C. Obirieze,
  • Mridu Sinha,
  • Hannah Mack,
  • William Leineweber,
  • April Aralar,
  • David T. Pride,
  • Todd P. Coleman,
  • Stephanie I. Fraley

DOI
https://doi.org/10.1186/s12859-024-05747-0
Journal volume & issue
Vol. 25, no. 1
pp. 1–13

Abstract

Surveillance for genetic variation of microbial pathogens, both within and among species, plays an important role in informing research, diagnostic, prevention, and treatment activities for disease control. However, large-scale systematic screening for novel genotypes remains challenging, in part due to technological limitations. To address this challenge, we present an advancement in universal microbial high-resolution melting (HRM) analysis that is capable of both known genotype identification and novel genotype detection. Specifically, this novel surveillance functionality is achieved through time-series modeling of sequence-defined HRM curves, which is uniquely enabled by the large-scale melt curve datasets generated using our high-throughput digital HRM platform. Taking the detection of bacterial genotypes as a model application, we demonstrate that our algorithms achieve an overall classification accuracy of over 99.7% and perform novelty detection with a sensitivity of 0.96, a specificity of 0.96, and a Youden index of 0.92. Since HRM-based DNA profiling is an inexpensive and rapid technique, our results add support for the feasibility of its use in surveillance applications.
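
The reported novelty-detection figures are consistent with the standard definition of Youden's index, J = sensitivity + specificity - 1. The short Python sketch below checks that arithmetic. It is illustrative only and is not taken from the authors' code; the function name `youden_index` is a hypothetical helper introduced here.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Return Youden's J statistic: sensitivity + specificity - 1."""
    # Both inputs are proportions, so reject anything outside [0, 1].
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= specificity <= 1.0):
        raise ValueError("sensitivity and specificity must lie in [0, 1]")
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract for novelty detection.
j = youden_index(0.96, 0.96)
print(round(j, 2))  # prints 0.92
```

A value of J near 1 indicates that the detector rarely misses novel genotypes and rarely flags known ones as novel, while J = 0 corresponds to a detector no better than chance.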

Keywords