IEEE Access (Jan 2022)
Practically Implementation of Information Loss: Sensitivity, Risk by Different Feature Selection Techniques
Abstract
Modern technologies generate bulky data that is large in both dimensionality and sample size. To extract meaningful information from such datasets and to support automated knowledge discovery and pattern recognition, many Data Mining (DM) and Machine Learning (ML) techniques have been developed. In every dataset, features are the key factors for a machine learning task. Current research on classification algorithms concentrates on achieving high accuracy by taking into account the prioritized features, and pays less attention to features with low characteristic values. In this paper, we focus on the features that are usually ignored in the selection phase as low-scale features. Including them may decrease model performance, but in future the most sensitive scenarios will depend on this minor information, which can warn of performance fluctuations in the practical implementation of the model. To verify our concept in practice, we applied rule-based classification algorithms and several feature selection techniques with three search methods using the WEKA data mining tool. The experimental results show that a smaller set of selected features yields high accuracy (above 90% in some cases), while giving less weight to specificity.
Keywords