PLoS ONE (Jan 2022)
Development and Validation of Clinical Diagnostic Model for Girls with Central Precocious Puberty: Machine-learning Approaches
Abstract
Background A brief gonadotropin-releasing hormone analogue (GnRHa) stimulation test that relies solely on the luteinizing hormone (LH) level 30 minutes post-stimulation has been proposed for identifying girls with central precocious puberty (CPP). However, it was evaluated using traditional statistical methods. Drawing on advances in computer science, we aimed to develop a machine learning-based diagnostic model that combines baseline CPP-related variables with a brief GnRHa stimulation test for CPP diagnosis.
Methods We recruited girls with suspected precocious puberty who underwent a GnRHa stimulation test at Children Hospital 2, Vietnam, and Cathay General Hospital, Taiwan. Clinical data, bone age measurements, and 30-min post-stimulation blood test results were used to build the predictive model. Candidate models were developed with different machine learning algorithms and evaluated mainly by sensitivity, specificity, the area under the receiver operating characteristic curve (AUC), and F1-score on internal and external validation data, classifying girls as CPP or non-CPP at different time points (0-min, 30-min, 60-min, and 120-min post-stimulation).
Results Among the 614 girls diagnosed with precocious puberty (PP), 524 (85.3%) had CPP. The random forest algorithm yielded the highest F1-score (0.976), specificity (0.893), and positive predictive value (0.987), along with a relatively high AUC (0.972), indicating a high probability of correctly identifying CPP. The performance metrics of the 30-min post-stimulation diagnostic model, including sensitivity and specificity, surpassed those of the 0-min model and were equivalent to those of the 60-min and 120-min post-stimulation models. Hence, our machine learning-based model helps shorten the stimulation test to 30 minutes after GnRHa injection, whereas a complete GnRHa stimulation test generally requires 120 minutes.
Conclusions We developed a diagnostic model, based on clinical features and a single blood sample taken 30 minutes post-stimulation, to identify CPP in girls. By avoiding multiple blood samplings, it can reduce distress for children.
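For readers less familiar with the evaluation metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, positive predictive value, and F1-score follow from a binary confusion matrix (CPP as the positive class). The function name `diagnostic_metrics` and the example counts are hypothetical illustrations; they are not the study's code or data. AUC is omitted because it requires predicted scores rather than counts.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common diagnostic metrics from confusion-matrix counts.

    tp: CPP girls classified as CPP      fn: CPP girls classified as non-CPP
    tn: non-CPP classified as non-CPP    fp: non-CPP classified as CPP
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    # F1 is the harmonic mean of PPV and sensitivity.
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f1": f1,
    }


# Hypothetical counts chosen only for illustration (not from the study).
example = diagnostic_metrics(tp=90, fp=5, tn=45, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Because CPP cases made up 85.3% of the cohort, PPV and F1 are naturally high in such a sample, which is one reason to read them alongside specificity.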