Frontiers in Cardiovascular Medicine (Nov 2023)
Feature-based clustering of the left ventricular strain curve for cardiovascular risk stratification in the general population
Abstract
Objective: Identifying individuals with subclinical cardiovascular (CV) disease could improve monitoring and risk stratification. While peak left ventricular (LV) systolic strain has emerged as a strong prognostic factor, few studies have analyzed the whole temporal profiles of the deformation curves during the complete cardiac cycle. Therefore, in this longitudinal study, we applied an unsupervised machine learning approach based on time-series-derived features from the LV strain curve to identify distinct strain phenogroups that might be related to the risk of adverse cardiovascular events in the general population.
Method: We prospectively studied 1,185 community-dwelling individuals (mean age, 53.2 years; 51.3% women), in whom we acquired clinical and echocardiographic data including LV strain traces at baseline and collected adverse events on average 9.1 years later. A Gaussian Mixture Model (GMM) was applied to features derived from LV strain curves, including the slopes during systole, early and late diastole, peak strain, and the duration and height of diastasis. We evaluated the performance of the model using the clinical characteristics of the participants and the incidence of adverse events in the training dataset. To ascertain the validity of the trained model, we used an additional community-based cohort (n = 545) as an external validation cohort.
Results: The most appropriate number of clusters to separate the LV strain curves was four. In clusters 1 and 2, we observed differences in age and heart rate distributions, but they had similarly low prevalence of CV risk factors. Cluster 4 had the worst combination of CV risk factors, and a higher prevalence of LV hypertrophy and diastolic dysfunction than the other clusters. In cluster 3, the reported values were in between those of strain clusters 2 and 4. Adjusting for traditional covariables, we observed that clusters 3 and 4 had a significantly higher risk for CV (28% and 20%, P ≤ 0.038) and cardiac (57% and 43%, P ≤ 0.024) adverse events. Using SHAP values, we observed that the features that incorporate temporal information, such as the slope during systole and early diastole, had a higher impact on the model's decision than peak LV systolic strain.
Conclusion: Employing a GMM on features derived from the raw LV strain curves, we extracted clinically significant phenogroups which could provide additive prognostic information over the peak LV strain.
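To illustrate the kind of time-series-derived features named in the abstract, the following Python sketch computes three of them (peak strain, systolic slope, early diastolic slope) from a single strain trace. It is a minimal sketch under stated assumptions, not the study's pipeline: the function names (`ols_slope`, `strain_features`, `synthetic_strain`), the piecewise-linear synthetic curve, and the rule that early diastole ends once strain has recovered to half of its peak are all hypothetical choices made for illustration. Late diastolic slope and the duration and height of diastasis are omitted.

```python
def ols_slope(t, y):
    """Least-squares slope of y against t (units of y per unit of t)."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    num = sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
    den = sum((a - mean_t) ** 2 for a in t)
    return num / den


def strain_features(t, strain):
    """Return peak strain and two slopes for one cardiac cycle.

    Assumes the trace starts at end-diastole (strain near 0) and that
    strain is negative during contraction, in percent.
    """
    # Peak systolic strain: the most negative sample of the trace.
    i_peak = min(range(len(strain)), key=strain.__getitem__)
    peak = strain[i_peak]

    # Systolic slope: linear fit from the start of the cycle to the peak.
    systolic = ols_slope(t[: i_peak + 1], strain[: i_peak + 1])

    # Early diastolic slope: linear fit from the peak until strain has
    # recovered to half of the peak value (hypothetical cut-off).
    j = i_peak
    while j < len(strain) - 1 and strain[j] < 0.5 * peak:
        j += 1
    early_diastolic = ols_slope(t[i_peak : j + 1], strain[i_peak : j + 1])

    return {
        "peak_strain": peak,
        "systolic_slope": systolic,
        "early_diastolic_slope": early_diastolic,
    }


def synthetic_strain(ti):
    """Piecewise-linear toy strain curve (percent) over a 1 s cycle."""
    if ti <= 0.35:                       # systole: 0 % down to -20 %
        return -20.0 * ti / 0.35
    if ti <= 0.55:                       # early diastole: back up to -8 %
        return -20.0 + 60.0 * (ti - 0.35)
    if ti <= 0.80:                       # diastasis plateau
        return -8.0
    return -8.0 + 40.0 * (ti - 0.80)     # late diastole: back to 0 %


# Sample the toy curve every 10 ms and extract its features.
t = [i / 100 for i in range(101)]
strain = [synthetic_strain(ti) for ti in t]
features = strain_features(t, strain)
print(features)
```

In a clustering workflow, one such feature vector per participant would form a row of the matrix passed to a mixture model. How the features were actually defined and scaled in the study is not stated in the abstract.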
Keywords