Molecular Autism (Dec 2017)

Sparsifying machine learning models identify stable subsets of predictive features for behavioral detection of autism

  • Sebastien Levy,
  • Marlena Duda,
  • Nick Haber,
  • Dennis P. Wall

DOI
https://doi.org/10.1186/s13229-017-0180-6
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 17

Abstract

Read online

Abstract Background Autism spectrum disorder (ASD) diagnosis can be delayed due in part to the time required for administration of standard exams, such as the Autism Diagnostic Observation Schedule (ADOS). Shorter and potentially mobilized approaches would help to alleviate bottlenecks in the healthcare system. Previous work using machine learning suggested that a subset of the behaviors measured by ADOS can achieve clinically acceptable levels of accuracy. Here we expand on this initial work to build sparse models that have higher potential to generalize to the clinical population. Methods We assembled a collection of score sheets for two ADOS modules, one for children with phrased speech (Module 2; 1319 ASD cases, 70 controls) and the other for children with verbal fluency (Module 3; 2870 ASD cases, 273 controls). We used sparsity/parsimony enforcing regularization techniques in a nested cross validation grid search to select features for 17 unique supervised learning models, encoding missing values as additional indicator features. We augmented our feature sets with gender and age to train minimal and interpretable classifiers capable of robust detection of ASD from non-ASD. Results By applying 17 unique supervised learning methods across 5 classification families tuned for sparse use of features and to be within 1 standard error of the optimal model, we find reduced sets of 10 and 5 features used in a majority of models. We tested the performance of the most interpretable of these sparse models, including Logistic Regression with L2 regularization or Linear SVM with L1 regularization. We obtained an area under the ROC curve of 0.95 for ADOS Module 3 and 0.93 for ADOS Module 2 with less than or equal to 10 features. Conclusions The resulting models provide improved stability over previous machine learning efforts to minimize the time complexity of autism detection due to regularization and a small parameter space. These robustness techniques yield classifiers that are sparse, interpretable and that have potential to generalize to alternative modes of autism screening, diagnosis and monitoring, possibly including analysis of short home videos.

Keywords