BMC Medical Informatics and Decision Making (Nov 2022)
Machine learning in a real-world PFO study: analysis of data from multi-centers in China
Abstract
Purpose: The association between patent foramen ovale (PFO) and cryptogenic stroke has been studied for years. Although device closure overall decreases the risk of recurrent stroke, treatment effects have varied across studies. In this study, we aimed to detect sub-clusters among post-closure PFO patients and to identify potential predictors of adverse outcomes.
Methods: We analyzed patients with embolic stroke of undetermined source and PFO from 7 centers in China. Machine learning and Cox regression analysis were used.
Results: A total of 196 patients were included. The average age was 42.7 (12.37) years and 64.80% (127/196) were female. During a median follow-up of 739 days, 12 adverse events (6.9%) occurred, including 6 recurrent strokes (3.45%), 5 transient ischemic attacks (TIA; 2.87%) and one death (0.6%). Unsupervised hierarchical clustering on principal components identified two main clusters. Compared with cluster 1 (n = 77, 39.29%), patients in cluster 2 (n = 119, 60.71%) were more likely to be male and had higher systolic and diastolic blood pressure, higher body mass index, lower high-density lipoprotein cholesterol and a higher prevalence of atrial septal aneurysm. Using random forest survival (RFS) analysis, the eight top-ranking features were selected and used to construct a prediction model. The RFS model outperformed the traditional Cox regression model (C-index: 0.87 vs. 0.54).
Conclusions: There were 2 main clusters among post-closure PFO patients. Traditional cardiovascular profiles remain the top-ranking predictors of future recurrence of stroke or TIA. However, whether optimizing the management of these factors would provide extra benefit warrants further investigation.
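The C-index used above to compare the RFS and Cox models can be illustrated with a minimal sketch of Harrell's concordance index. This is not the authors' code: the function name `concordance_index`, the toy follow-up times, event flags and risk scores are all hypothetical, and pairs with tied follow-up times are simply skipped for brevity. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs ranked correctly by risk.

    A pair is comparable when the patient with the shorter follow-up time
    actually had an event (was not censored). Tied times are skipped here,
    which is a simplification made for this sketch.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier patient was censored
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher predicted risk failed first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (NOT from the study): follow-up days and event flags.
times = [5, 8, 12, 20, 30]
events = [1, 0, 1, 0, 0]  # 1 = adverse event observed, 0 = censored

# Two hypothetical sets of model risk scores for the same five patients.
risk_model_a = [0.9, 0.4, 0.6, 0.2, 0.1]
risk_model_b = [0.5, 0.5, 0.2, 0.9, 0.1]

c_a = concordance_index(times, events, risk_model_a)
c_b = concordance_index(times, events, risk_model_b)
print(f"model A C-index: {c_a:.2f}")
print(f"model B C-index: {c_b:.2f}")
```

In this toy example, model A ranks every comparable pair correctly while model B does little better than chance, which mirrors the kind of gap a C-index comparison is meant to expose.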
Keywords