PLOS Digital Health (May 2023)

Machine learning to predict bacteriologic confirmation of Mycobacterium tuberculosis in infants and very young children.

  • Jonathan P Smith,
  • Kyle Milligan,
  • Kimberly D McCarthy,
  • Walter Mchembere,
  • Elisha Okeyo,
  • Susan K Musau,
  • Albert Okumu,
  • Rinn Song,
  • Eleanor S Click,
  • Kevin P Cain

DOI
https://doi.org/10.1371/journal.pdig.0000249
Journal volume & issue
Vol. 2, no. 5
p. e0000249

Abstract

Diagnosis of tuberculosis (TB) among young children (<5 years) is challenging due to the paucibacillary nature of clinical disease and clinical similarities to other childhood diseases. We used machine learning to develop accurate prediction models of microbial confirmation with simply defined and easily obtainable clinical, demographic, and radiologic factors. We evaluated eleven supervised machine learning models (using stepwise regression, regularized regression, decision tree, and support vector machine approaches) to predict microbial confirmation in young children (<5 years) using samples from invasive (reference-standard) or noninvasive procedures. Models were trained and tested using data from a large prospective cohort of young children with symptoms suggestive of TB in Kenya. Model performance was evaluated using areas under the receiver operating characteristic curve (AUROC) and precision-recall curve (AUPRC), accuracy metrics (i.e., sensitivity, specificity), F-beta scores, Cohen's kappa, and the Matthews correlation coefficient. Among 262 included children, 29 (11%) were microbially confirmed using any sampling technique. Models were accurate at predicting microbial confirmation in samples obtained from invasive procedures (AUROC range: 0.84-0.90) and from noninvasive procedures (AUROC range: 0.83-0.89). History of household contact with a confirmed case of TB, immunological evidence of TB infection, and a chest x-ray consistent with TB disease were consistently influential across models. Our results suggest machine learning can accurately predict microbial confirmation of M. tuberculosis in young children using simply defined features and increase the bacteriologic yield in diagnostic cohorts. These findings may facilitate clinical decision making and guide clinical research into novel biomarkers of TB disease in young children.
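
For readers less familiar with the threshold-based metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, the F-beta score, Cohen's kappa, and the Matthews correlation coefficient can be computed from a 2x2 confusion matrix. It is not the authors' code. The function name `confusion_metrics` and the counts used are hypothetical, chosen only to make the arithmetic easy to follow, and they are not results from the study. The sketch also assumes every denominator is nonzero.

```python
from math import sqrt


def confusion_metrics(tp, fp, fn, tn, beta=1.0):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes all denominators are nonzero.
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # recall: confirmed cases correctly flagged
    specificity = tn / (tn + fp)   # unconfirmed cases correctly cleared
    precision = tp / (tp + fp)     # positive predictive value

    # F-beta weights recall beta times as heavily as precision.
    b2 = beta ** 2
    f_beta = (1 + b2) * precision * sensitivity / (b2 * precision + sensitivity)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)

    # Matthews correlation coefficient: correlation between prediction and truth.
    mcc = (tp * tn - fp * fn) / sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_beta": f_beta,
        "kappa": kappa,
        "mcc": mcc,
    }


# Made-up counts for illustration only (NOT data from the study).
metrics = confusion_metrics(tp=8, fp=2, fn=2, tn=88)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a low-prevalence outcome such as the one described above, kappa and the Matthews correlation coefficient are often reported alongside sensitivity and specificity because they account for class imbalance in a way raw accuracy does not.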