Computational and Mathematical Biophysics (Mar 2024)
On building machine learning models for medical dataset with correlated features
Abstract
This work builds machine learning models for a dataset generated by a numerical model of an idealized human artery. The numerical model accounts for varying blood characteristics as blood flows through arteries with variable vascular properties, and it is applied to simulate blood flow in the femoral artery and its continuation. For this purpose, we designed a pipeline model with three components covering the major segments of the femoral artery, namely the common femoral artery (CFA) and the superficial femoral artery (SFA), and its continuation, the popliteal artery (PA). A notable feature of this study is that the features and target variables of each component pipe form the feature set of the next, which results in multicollinearity among the features of the third component pipe. We therefore examined the effect of these correlated features on the target variables using regularized linear regression models, ensemble methods, and boosting algorithms. The study identifies blood velocity in the CFA as the primary influential factor for wall shear stress in both the CFA and the SFA, and blood rheology in the PA as a significant factor for wall shear stress in the PA. Because the study relies on idealized conditions, however, these findings require thorough clinical validation.
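The way chaining component pipes produces correlated features can be illustrated with a small synthetic sketch. It is not the numerical model or the dataset used in this work: the variable names `x1` and `y1`, the linear response, the noise level, and the sample size are all assumptions chosen for illustration. The sketch shows that when a stage's target depends on that stage's feature, and both are passed on as features to the next stage, the downstream feature set is strongly correlated.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 500                 # hypothetical sample size

# Hypothetical stage-1 feature (a stand-in for an inlet quantity such as velocity).
x1 = [rng.uniform(0.0, 1.0) for _ in range(n)]

# Hypothetical stage-1 target: an ASSUMED linear response plus small Gaussian noise.
y1 = [2.0 * x + rng.gauss(0.0, 0.1) for x in x1]

# The next stage reuses both x1 and y1 as features, so its features are correlated.
r = pearson(x1, y1)

# With exactly two predictors, the variance inflation factor is 1 / (1 - r^2).
vif = 1.0 / (1.0 - r * r)

print(f"corr(x1, y1) = {r:.3f}, VIF = {vif:.1f}")
```

A variance inflation factor above roughly 5 to 10 is a common rule of thumb for problematic multicollinearity. In that regime ordinary least-squares coefficients become unstable, which is one reason to turn to regularized regression.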
Keywords