Frontiers in Microbiology (Mar 2022)

Large-Scale Samples Based Rapid Detection of Ciprofloxacin Resistance in Klebsiella pneumoniae Using Machine Learning Methods

  • Chunxuan Wang,
  • Zhuo Wang,
  • Hsin-Yao Wang,
  • Chia-Ru Chung,
  • Jorng-Tzong Horng,
  • Jang-Jih Lu,
  • Tzong-Yi Lee

DOI
https://doi.org/10.3389/fmicb.2022.827451
Journal volume & issue
Vol. 13

Abstract

Klebsiella pneumoniae is one of the most common causes of hospital- and community-acquired pneumonia. Resistance to extensively used quinolone antibiotics, such as ciprofloxacin, has increased in Klebsiella pneumoniae, which raises the risk associated with initial antibiotic selection for Klebsiella pneumoniae treatment. Rapid and precise identification of ciprofloxacin-resistant Klebsiella pneumoniae (CIRKP) is essential for clinical therapy. Matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS) is another approach to detecting antibiotic-resistant bacteria, owing to its shorter inspection time and lower cost compared with other current methods. Machine learning methods are introduced to help discover significant biomarkers from MALDI-TOF MS data and to construct prediction models for rapid antibiotic resistance identification. This study examined 16,997 samples collected from June 2013 to February 2018 as part of a longitudinal investigation conducted by Chang Gung Memorial Hospital (CGMH) at the Linkou branch. We applied traditional statistical approaches to identify significant biomarkers, and then compared the high-importance features in the machine learning models with the statistically selected features. The large-scale data supported the statistical power of the selected biomarkers. In addition, clustering analysis was used to examine suspicious sub-strains and to provide information about their potential influence on antibiotic resistance identification performance. For modeling, to simulate the real challenges of antibiotic resistance prediction, we included basic patient information and the types of specimen carriers in the model construction process and separated the training and testing sets by time. Final performance reached an area under the receiver operating characteristic curve (AUC) of 0.89 for the support vector machine (SVM) and extreme gradient boosting (XGB) models. Logistic regression and random forest models both achieved an AUC of around 0.85. In conclusion, the models provide sensitive forecasts of CIRKP, which may aid in early antibiotic selection against Klebsiella pneumoniae. The suspicious sub-strains could affect model performance. Future work could explore methods to improve both model accuracy and stability.
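
The evaluation design described in the abstract (training and testing sets separated by time, performance reported as AUC) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the record fields, the cutoff date, the toy scores, and the helper names `temporal_split` and `roc_auc` are hypothetical and do not come from the study, whose actual split date and model outputs are not given here.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records by collection date so that no sample collected on or
    after the cutoff can leak into the training set."""
    train = [r for r in records if r["date"] < cutoff]
    test = [r for r in records if r["date"] >= cutoff]
    return train, test


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen resistant sample gets a
    higher score than a randomly chosen susceptible one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: "score" stands in for a model's predicted
# probability of ciprofloxacin resistance; "resistant" is the true label.
records = [
    {"date": date(2013, 6, 1), "resistant": 0, "score": 0.20},
    {"date": date(2015, 1, 15), "resistant": 1, "score": 0.80},
    {"date": date(2016, 9, 3), "resistant": 0, "score": 0.40},
    {"date": date(2017, 5, 20), "resistant": 1, "score": 0.90},
    {"date": date(2017, 8, 2), "resistant": 0, "score": 0.30},
    {"date": date(2017, 11, 11), "resistant": 1, "score": 0.35},
    {"date": date(2018, 2, 1), "resistant": 0, "score": 0.35},
]

cutoff = date(2017, 1, 1)  # hypothetical cutoff, not taken from the paper
train, test = temporal_split(records, cutoff)
test_auc = roc_auc(
    [r["resistant"] for r in test],
    [r["score"] for r in test],
)
print(f"train={len(train)} test={len(test)} AUC={test_auc:.3f}")
```

Splitting by date rather than at random means the model is always scored on samples collected after everything it was trained on, which is closer to how such a predictor would be used in a clinical laboratory. In practice a library routine such as scikit-learn's `roc_auc_score` would likely replace the hand-written AUC function.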

Keywords