A cross-cohort analysis of dental plaque microbiome in early childhood caries
Mohd Wasif Khan, Daryl Lerh Xing Fung, Robert J. Schroth, Prashen Chelikani, Pingzhao Hu
Affiliations
Mohd Wasif Khan
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Children’s Hospital Research Institute of Manitoba, Winnipeg, MB, Canada
Daryl Lerh Xing Fung
Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
Robert J. Schroth
Children’s Hospital Research Institute of Manitoba, Winnipeg, MB, Canada; Department of Preventive Dental Science, University of Manitoba, Winnipeg, MB, Canada; Department of Pediatrics and Child Health, University of Manitoba, Winnipeg, MB, Canada
Prashen Chelikani
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Children’s Hospital Research Institute of Manitoba, Winnipeg, MB, Canada; Manitoba Chemosensory Biology Research Group, Department of Oral Biology, University of Manitoba, Winnipeg, MB, Canada; Corresponding author
Pingzhao Hu
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada; Children’s Hospital Research Institute of Manitoba, Winnipeg, MB, Canada; Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada; Department of Biochemistry, Western University, London, ON, Canada; Corresponding author
Summary: Early childhood caries (ECC) is a multifactorial disease in which the microbiome plays a significant role in caries progression. Understanding microbiome-level changes in ECC is required to develop diagnostic and preventive strategies. In this study, we combined data from small independent cohorts, compared microbiome composition using a unified pipeline, and applied batch correction to limit batch effects. Our meta-analysis identified biomarker species common to the different studies. We identified the best-performing machine learning method for classifying ECC versus caries-free samples and evaluated its performance using a leave-one-dataset-out approach. Our random forest model was found to be generalizable when used in combination with other studies. While our results highlight microbial species potentially involved in ECC and in disease classification, we also discuss limitations that can guide future researchers in designing and using appropriate tools for such analyses.
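The leave-one-dataset-out evaluation named in the summary can be illustrated with a minimal sketch: each cohort is held out in turn, a classifier is trained on the remaining cohorts, and accuracy is measured on the held-out cohort. This is not the authors' pipeline. The function names, the toy two-feature abundance values, and the cohort labels A, B, and C are all hypothetical, and a simple nearest-centroid classifier stands in for the random forest so that the example needs only the Python standard library.

```python
from collections import defaultdict


def leave_one_dataset_out(cohorts):
    """Yield (held_out, train_idx, test_idx) once per distinct cohort label."""
    for held_out in sorted(set(cohorts)):
        train_idx = [i for i, c in enumerate(cohorts) if c != held_out]
        test_idx = [i for i, c in enumerate(cohorts) if c == held_out]
        yield held_out, train_idx, test_idx


def fit_centroids(X, y):
    """Stand-in classifier (NOT a random forest): per-class feature means."""
    grouped = defaultdict(list)
    for row, label in zip(X, y):
        grouped[label].append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict(centroids, row):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])),
    )


def evaluate(X, y, cohorts):
    """Train on all cohorts but one and report accuracy on the held-out cohort."""
    scores = {}
    for held_out, train_idx, test_idx in leave_one_dataset_out(cohorts):
        model = fit_centroids([X[i] for i in train_idx], [y[i] for i in train_idx])
        hits = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores[held_out] = hits / len(test_idx)
    return scores


# Hypothetical toy data: two relative-abundance features per sample,
# labels "ECC" or "CF" (caries-free), and the cohort each sample came from.
X = [[0.9, 0.1], [0.2, 0.8], [0.8, 0.2], [0.1, 0.9], [0.7, 0.3], [0.3, 0.7]]
y = ["ECC", "CF", "ECC", "CF", "ECC", "CF"]
cohorts = ["A", "A", "B", "B", "C", "C"]

scores = evaluate(X, y, cohorts)
print(scores)
```

Because the held-out cohort never contributes to training, each score reflects how well the model transfers to an unseen study, which is the property the summary refers to as generalizability. In practice the stand-in classifier would be replaced by whichever model is being assessed.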