Therapeutic Advances in Musculoskeletal Disease (Jul 2022)

Predicting osteoarthritis in adults using statistical data mining and machine learning

  • Carlo M. Bertoncelli,
  • Paola Altamura,
  • Sikha Bagui,
  • Subhash Bagui,
  • Edgar Ramos Vieira,
  • Stefania Costantini,
  • Marco Monticone,
  • Federico Solla,
  • Domenico Bertoncelli

DOI
https://doi.org/10.1177/1759720X221104935
Journal volume & issue
Vol. 14

Abstract

Background: Osteoarthritis (OA) has traditionally been considered a disease of older adults (⩾65 years old), but it may also appear in younger adults. The risk factors for OA in younger adults, however, need further evaluation.

Objectives: To develop a prediction model for identifying risk factors of OA in subjects aged 20–50 years and to compare the performance of different machine learning models.

Methods: We included data from 52,512 participants of the National Health and Nutrition Examination Survey; of those, we analyzed only subjects aged 20–50 years (n = 19,133), with or without OA. The supervised machine learning model ‘Deep PredictMed’, based on logistic regression, a deep neural network (DNN), and a support vector machine, was used to identify demographic and personal characteristics associated with OA. Finally, we compared the performance of the different models.

Results: Female sex (p < 0.001), older age (p < 0.001), smoking (p < 0.001), higher body mass index (p < 0.001), high blood pressure (p < 0.001), race/ethnicity (lowest risk among Mexican Americans, p = 0.01), and physical and mental limitations (p < 0.001) were associated with having OA. The best predictive performance was an area under the receiver operating characteristic curve of 75%.

Conclusion: Sex (female), age (older), smoking (yes), body mass index (higher), blood pressure (high), race/ethnicity, and physical and mental limitations are risk factors for having OA in adults aged 20–50 years. The best predictive performance was achieved using DNN algorithms.
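The abstract compares models by area under the receiver operating characteristic curve (AUC). As a minimal sketch of what that metric measures, and not the authors' code, the Python snippet below computes AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores are hypothetical; the values were chosen only so that the result comes out at 0.75, and they are not study data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = has OA, 0 = no OA; scores are made-up model outputs.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")  # prints "AUC = 0.75"
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so a value of 0.75 means the model ranks a positive case above a negative one in three out of four such pairs.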