Computational and Structural Biotechnology Journal (Dec 2024)
Unsupervised machine learning for risk stratification and identification of relevant subgroups of ascending aorta dimensions using cardiac CT and clinical data
Abstract
Precision population health promises prevention and care tailored to specific groups, drawing on robust patient data. Machine learning can automatically identify clinically relevant subgroups of individuals from heterogeneous data sources. This study assessed whether unsupervised machine learning (UML) techniques could uncover clinically significant subgroups among patients with suspected coronary artery disease and characterize the range of aorta dimensions within each subgroup. We applied a random forest-based cluster analysis to 14 variables from 1170 participants (717 men, 453 women). The clustering identified four distinct subgroups with specific clinical characteristics, which allowed us to interpret and assess the range of aorta dimensions in each cluster. Flexible UML algorithms can thus process heterogeneous patient data and provide deeper insight into clinical interpretation and risk assessment.
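
To illustrate what a random forest-based cluster analysis can look like in practice, the following is a minimal sketch and not the authors' pipeline. It assumes the common unsupervised random forest recipe: a forest is trained to separate the real rows from a column-permuted copy, the share of trees in which two participants fall in the same leaf is taken as their proximity, and the resulting dissimilarities are clustered. The abstract does not state which clustering step, library, or settings were used, so the average-linkage hierarchical clustering, the function name `rf_proximity_clusters`, and all parameter defaults below are assumptions chosen for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.ensemble import RandomForestClassifier


def rf_proximity_clusters(X, n_clusters=4, n_trees=200, seed=0):
    """Cluster rows of X using random forest proximities (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)

    # Synthetic contrast data: permute each column independently, which keeps
    # the marginal distributions but breaks the relations between variables.
    X_synth = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(X.shape[1])]
    )
    X_all = np.vstack([X, X_synth])
    y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X_synth))])

    # The forest learns to tell real rows (1) from permuted rows (0).
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Leaf index of every real row in every tree: shape (n_samples, n_trees).
    leaves = forest.apply(X)

    # Proximity = fraction of trees in which two rows share a leaf.
    n = len(X)
    proximity = np.zeros((n, n))
    for t in range(leaves.shape[1]):
        col = leaves[:, t]
        proximity += col[:, None] == col[None, :]
    proximity /= leaves.shape[1]

    # Turn proximities into dissimilarities and cluster them.
    # Assumption: average-linkage hierarchical clustering, cut into at most
    # n_clusters groups; the source does not name the clustering step.
    distance = 1.0 - proximity
    np.fill_diagonal(distance, 0.0)
    tree = linkage(squareform(distance, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Calling `rf_proximity_clusters(X, n_clusters=4)` on a numeric array with one row per participant and one column per variable returns one integer cluster label per row, numbered from 1. Categorical variables would need to be numerically encoded first, and the choice of four clusters here simply mirrors the number of subgroups reported in the abstract.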