Healthcare Analytics (Nov 2022)
Meta-Health Stack: A new approach for breast cancer prediction
Abstract
Data analytics and machine learning have grown in importance as tools for managing large volumes of healthcare data efficiently. Recent statistics indicate that breast cancer is the most commonly diagnosed cancer worldwide. The datasets available for breast cancer detection record many different tumor features, and filtering them to reach an accurate diagnosis is time-consuming and challenging. Machine learning algorithms can help identify significant relationships between these features and malignant tumors. This research proposes a new ensemble-based framework, named Meta-Health Stack, to predict breast cancer more efficiently. To extract the most relevant features and expose hidden patterns in the tumor data, the framework uses the Extra Trees classifier to integrate the attributes obtained from Variance Inflation Factor, Pearson’s Correlation, and Information Gain. Three approaches, Boosting, Bagging, and Voting, are then combined with equal weights through Stacking. Tested on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, the proposed method achieved a 97% F1-score and 98% precision. We attribute this performance to the more appropriate features selected by the Extra Trees algorithm. Based on these findings, we recommend the framework as an aid for diagnosing breast cancer in its early stages, which could in turn support more successful treatment and recovery. To further evaluate the framework, we also applied it to three other medical datasets. The results indicate that it performs adequately in predicting other illnesses as well.
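The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses scikit-learn, whose `load_breast_cancer` loader provides a copy of the WDBC data. The Variance Inflation Factor, Pearson's Correlation, and Information Gain steps are omitted, so features are selected by Extra Trees importance alone. The base learners, the logistic-regression meta-learner, the "mean" importance threshold, the 80/20 split, and the seed are all illustrative choices that the abstract does not specify.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (
    BaggingClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    StackingClassifier,
    VotingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

SEED = 0  # assumption: arbitrary seed, used only for reproducibility

# scikit-learn's copy of the WDBC data (569 samples, 30 features).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=SEED
)

# Feature selection (simplified): keep the features whose Extra Trees
# importance exceeds the mean importance. The VIF, Pearson, and
# Information Gain inputs mentioned in the abstract are not modeled here.
selector = SelectFromModel(
    ExtraTreesClassifier(n_estimators=200, random_state=SEED),
    threshold="mean",
)


def scaled(model):
    """Standardize features before scale-sensitive models."""
    return make_pipeline(StandardScaler(), model)


# Voting member: soft voting with scikit-learn's default (equal) weights.
# The three voters are assumptions chosen for illustration.
voting = VotingClassifier(
    estimators=[
        ("lr", scaled(LogisticRegression(max_iter=1000))),
        ("knn", scaled(KNeighborsClassifier())),
        ("tree", DecisionTreeClassifier(random_state=SEED)),
    ],
    voting="soft",
)

# Boosting, Bagging, and Voting act as the base learners of the stack.
base_learners = [
    ("boosting", GradientBoostingClassifier(random_state=SEED)),
    (
        "bagging",
        BaggingClassifier(
            DecisionTreeClassifier(), n_estimators=50, random_state=SEED
        ),
    ),
    ("voting", voting),
]

# Stacking: a meta-learner is trained on cross-validated predictions
# of the base learners (assumption: logistic regression, 5 folds).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

model = make_pipeline(selector, stack)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
# Note: in scikit-learn's encoding the positive label (1) is "benign".
f1 = f1_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
print(f"selected features: {n_selected} of {X.shape[1]}")
print(f"F1: {f1:.3f}  precision: {precision:.3f}")
```

Wrapping the selector and the stack in one pipeline keeps feature selection inside the training data, so the held-out scores are not influenced by the test split. The scores printed by this sketch depend on the assumptions above and should not be expected to match the figures reported in the abstract.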