BMC Bioinformatics (Feb 2022)

A hierarchical opportunistic screening model for osteoporosis using machine learning applied to clinical data and CT images

  • Liyu Liu,
  • Meng Si,
  • Hecheng Ma,
  • Menglin Cong,
  • Quanzheng Xu,
  • Qinghua Sun,
  • Weiming Wu,
  • Cong Wang,
  • Michael J. Fagan,
  • Luis A. J. Mur,
  • Qing Yang,
  • Bing Ji

DOI
https://doi.org/10.1186/s12859-022-04596-z
Journal volume & issue
Vol. 23, no. 1
pp. 1 – 15

Abstract

Background Osteoporosis is a common metabolic skeletal disease that usually lacks obvious symptoms, so many individuals are not diagnosed until an osteoporotic fracture occurs. Bone mineral density (BMD) measured by dual-energy X-ray absorptiometry (DXA) is the gold standard for osteoporosis detection. However, only a limited percentage of people at risk of osteoporosis undergo a DXA test. It is therefore vital to develop ways of identifying at-risk individuals that do not rely on DXA.
Results We proposed a three-layer hierarchical model that detects osteoporosis via machine learning from clinical data (demographic characteristics and routine laboratory test data) and CT images covering the lumbar vertebral bodies, rather than from DXA data. Data from 2210 individuals over age 40 were collected retrospectively; both clinical data and CT images were available for 246 of them. Irrelevant and redundant features were removed via statistical analysis, leaving 28 features (16 clinical features and 12 texture features) that showed statistically significant differences (p < 0.05) between the osteoporosis and normal groups. Six machine learning algorithms were employed as classifiers to assess the performance of the model: logistic regression (LR), support vector machine with radial-basis function kernel, artificial neural network, random forests, eXtreme Gradient Boosting, and Stacking, which combined the other five classifiers. To diminish the influence of data partitioning, the dataset was randomly split into training and test sets with stratified sampling, and the split was repeated five times. The hierarchical model based on LR showed better performance, with areas under the receiver operating characteristic curve of 0.818, 0.838, and 0.962 for the three layers, respectively, in distinguishing individuals with osteoporosis from those with normal BMD.
Conclusions The proposed model showed great potential in opportunistic screening for osteoporosis without additional expense. It is hoped that this model could serve to detect osteoporosis as early as possible and thereby prevent serious complications of osteoporosis, such as osteoporotic fractures.
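
The evaluation protocol described in the abstract (a stratified random split into training and test sets, repeated five times, with each split scored by the area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic, the 70/30 split ratio is an assumption (the abstract does not state one), logistic regression is fitted here by plain batch gradient descent as a stand-in for whatever solver was actually used, and every function name is hypothetical.

```python
import math
import random


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_split(labels, test_fraction, rng):
    """Split indices so that each class keeps the same share in train and test."""
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Unregularised logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def make_synthetic(seed=0, n_neg=60, n_pos=40):
    """Two Gaussian features per sample; positives are shifted by 1.5 (assumed values)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(n_neg)]
    X += [[rng.gauss(1.5, 1.0), rng.gauss(1.5, 1.0)] for _ in range(n_pos)]
    return X, [0] * n_neg + [1] * n_pos


def run_repeated(X, y, repeats=5, test_fraction=0.3, seed=0):
    """Repeat the stratified split, refit on each training set, and collect test AUCs."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        train_idx, test_idx = stratified_split(y, test_fraction, rng)
        w, b = fit_logistic([X[i] for i in train_idx], [y[i] for i in train_idx])
        # The linear predictor ranks samples exactly as the sigmoid output would.
        scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in test_idx]
        aucs.append(auc([y[i] for i in test_idx], scores))
    return sum(aucs) / len(aucs), aucs


if __name__ == "__main__":
    X, y = make_synthetic()
    mean_auc, aucs = run_repeated(X, y)
    print("per-split AUC:", [round(a, 3) for a in aucs])
    print("mean AUC:", round(mean_auc, 3))
```

Averaging over several stratified splits, rather than reporting a single split, reduces the chance that a reported AUC reflects one fortunate or unfortunate partition of a small cohort.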

Keywords