Scientific Reports (Jul 2022)
Training a machine learning classifier to identify ADHD based on real-world clinical data from medical records
Abstract
The diagnostic process of attention deficit hyperactivity disorder (ADHD) is complex and relies on criteria sensitive to subjective biases. This may cause significant delays in appropriate treatment initiation. An automated analysis relying on subjective and objective measures might not only simplify the diagnostic process and reduce the time to diagnosis, but also improve reproducibility. While recent machine learning studies have succeeded at distinguishing ADHD from healthy controls, the clinical process requires differentiating ADHD from other, and often multiple, psychiatric conditions. We trained a linear support vector machine (SVM) classifier to detect participants with ADHD in a population showing a broad spectrum of psychiatric conditions using anonymized data from clinical records (N = 299 participants). We differentiated children and adolescents with ADHD from those not having the condition with an accuracy of 66.1%. SVMs trained on single features differed only slightly from one another, and the standard deviations of their accuracies overlapped. An automated feature selection achieved the best performance using a combination of 19 features. Real-world clinical data from medical records can be used to automatically identify individuals with ADHD among help-seeking individuals using machine learning. The relevant diagnostic information can be reduced using an automated feature selection without loss of performance. A broad combination of symptoms across different domains, rather than specific domains, seems to indicate an ADHD diagnosis.
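The workflow summarized above (a linear SVM, a cross-validated accuracy with its standard deviation, and a reduced feature set) can be illustrated with a minimal sketch. It is not the authors' pipeline. The data are synthetic stand-ins generated by a hypothetical `make_synthetic` helper, the SVM is trained with a simple Pegasos-style subgradient method, and features are ranked by absolute weight. All of these are assumptions, because the abstract does not specify the solver or the selection method. Only the sample size (299) and the subset size (19) are taken from the text. In practice a library implementation such as scikit-learn's `LinearSVC` would be the usual choice.

```python
import random
import statistics


def make_synthetic(n=299, n_features=30, n_informative=5, seed=0):
    """Hypothetical stand-in for clinical features; labels are +1 / -1."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        # Only the first n_informative features carry a class signal.
        row = [rng.gauss(0.5 * label if j < n_informative else 0.0, 1.0)
               for j in range(n_features)]
        X.append(row)
        y.append(label)
    return X, y


def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Pegasos-style stochastic subgradient descent on the
    L2-regularized hinge loss. Returns weights plus a trailing bias."""
    rng = random.Random(seed)
    rows = [r + [1.0] for r in X]  # constant column acts as a bias term
    w = [0.0] * len(rows[0])
    order = list(range(len(rows)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, rows[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # L2 shrinkage
            if margin < 1.0:  # sample violates the margin
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, rows[i])]
    return w


def predict(w, row):
    score = sum(wj * xj for wj, xj in zip(w, row + [1.0]))
    return 1 if score >= 0 else -1


def top_k_features(w, k):
    """Indices of the k largest |weights|, ignoring the bias entry."""
    ranked = sorted(range(len(w) - 1), key=lambda j: abs(w[j]), reverse=True)
    return sorted(ranked[:k])


def cross_validate(X, y, k_features, n_folds=5, seed=0):
    """k-fold CV; feature selection is redone inside each training fold
    so that held-out samples never influence which features are kept."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    accuracies = []
    for f in range(n_folds):
        held_out = set(folds[f])
        train = [i for i in idx if i not in held_out]
        w_full = train_linear_svm([X[i] for i in train], [y[i] for i in train])
        keep = top_k_features(w_full, k_features)
        w = train_linear_svm([[X[i][j] for j in keep] for i in train],
                             [y[i] for i in train])
        hits = sum(predict(w, [X[i][j] for j in keep]) == y[i]
                   for i in folds[f])
        accuracies.append(hits / len(folds[f]))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


X, y = make_synthetic()
mean_all, sd_all = cross_validate(X, y, k_features=30)
mean_sel, sd_sel = cross_validate(X, y, k_features=19)
print(f"all 30 features: {mean_all:.3f} +/- {sd_all:.3f}")
print(f"top 19 features: {mean_sel:.3f} +/- {sd_sel:.3f}")
```

Reporting the mean together with the standard deviation across folds is what makes a statement such as "overlapping standard deviations" checkable: two feature sets are hard to tell apart when their fold-to-fold spreads overlap. The numbers printed here come from synthetic data and say nothing about the clinical results.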