Proceedings on Engineering Sciences (Sep 2024)
REVOLUTIONIZING FEATURE ENGINEERING FOR ROBUST ENSEMBLE MACHINE LEARNING BY HYBRIDIZING MRMR INSIGHT AND CHI2 INDEPENDENCE
Abstract
Real-world datasets pose a formidable challenge in data science, primarily because many of their numerous features are irrelevant or redundant. Effective feature engineering is vital for constructing robust ensemble machine learning (ML) models, since the choice of input features strongly influences overall performance. To this end, this research presents a novel feature engineering framework that hybridizes MRMR insight with the Chi2 independence technique. MRMR favors features that are relevant to the target and non-redundant with one another, while Chi2 tests whether each feature is statistically independent of the target variable. The hybrid framework follows an incremental feature engineering approach, with the goal of improving predictive accuracy, model robustness, and adaptability. Extensive experiments on a water quality dataset show that the hybrid model outperforms MRMR and Chi2 used independently. The proposed HFE-EML yields substantial improvements, with ensemble machine learning model performance reaching approximately 99.10%, along with reduced overfitting and enhanced generalization.
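The abstract does not state how the two criteria are combined, so the following is only a minimal sketch of one plausible arrangement: a Chi2 filter that discards features showing little dependence on the target, followed by a greedy MRMR pass (relevance minus mean redundancy, both measured by mutual information) over the survivors. The function names, the two-stage ordering, the `chi2_keep` and `final_k` parameters, and the toy data are all assumptions made for illustration; they are not taken from the paper, and the sketch handles discrete features only.

```python
from collections import Counter
from math import log


def chi2_stat(x, y):
    """Pearson chi-square statistic between two discrete sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    stat = 0.0
    for a in cx:
        for b in cy:
            expected = cx[a] * cy[b] / n
            observed = cxy.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def mutual_info(x, y):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(x)
    cx, cy, cxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * log(c * n / (cx[a] * cy[b]))
               for (a, b), c in cxy.items())


def hybrid_select(features, target, chi2_keep, final_k):
    """Hypothetical two-stage selector: Chi2 filter, then greedy MRMR."""
    # Stage 1 (assumed): keep the chi2_keep features most dependent on the target.
    ranked = sorted(features,
                    key=lambda name: chi2_stat(features[name], target),
                    reverse=True)
    candidates = ranked[:chi2_keep]

    # Stage 2 (assumed): greedy MRMR using the difference criterion.
    relevance = {f: mutual_info(features[f], target) for f in candidates}
    selected = []

    def score(f):
        if not selected:
            return relevance[f]
        redundancy = sum(mutual_info(features[f], features[s])
                         for s in selected) / len(selected)
        return relevance[f] - redundancy

    while candidates and len(selected) < final_k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data, invented for illustration only (not the water quality dataset).
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 0],  # strongly related to the target
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 0],  # exact duplicate of "a" (redundant)
    "b":      [0, 0, 1, 0, 1, 1, 0, 1],  # weaker, but complementary
    "noise":  [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the target
}

print(hybrid_select(features, target, chi2_keep=3, final_k=2))  # ['a', 'b']
```

On this toy input the Chi2 stage drops `noise`, and the MRMR stage prefers `b` over the duplicate `a_copy`, which is the behavior a hybrid of the two criteria is meant to produce. A real pipeline would also need to discretize continuous measurements before applying either statistic.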
Keywords