Journal of Mahani Mathematical Research (Nov 2023)
A multi-objective optimization approach for online streaming feature selection using fuzzy Pareto dominance
Abstract
Feature selection is one of the most important tasks in machine learning. Traditional feature selection methods are inadequate for reducing the dimensionality of online data streams because they assume that the feature space is fixed. Every time a feature is added, the algorithm must be rerun from the beginning, which rules out real-time processing and wastes computation and resources. In many real-world applications, such as weather forecasting, stock markets, clinical research, natural disasters, and vital-sign monitoring, the feature space changes dynamically, and feature streams are added to the data over time. Existing online streaming feature selection (OSFS) methods suffer from problems such as high computational complexity, long processing time, sensitivity to parameters, and failure to account for redundancy between features. In this paper, the process of OSFS is modeled as a multi-objective optimization problem for the first time. When a feature stream arrives, it is evaluated in the multi-objective space using fuzzy Pareto dominance, with three feature selection methods serving as the objectives. Features are ranked according to their degree of dominance over other features in the multi-objective space. We propose an effective method that selects a minimal subset of features in a short time. Experiments were conducted using two classifiers and eight OSFS algorithms on real-world datasets. The results show that the proposed method selects a minimal subset of features in a reasonable time for all datasets.
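The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the dominance formula or the three objectives, so everything specific here is an assumption: the product-ratio form of fuzzy dominance, the averaging of degrees over the other features, the requirement that objective scores lie in (0, 1] with higher meaning better, and the hypothetical names `dominance_degree` and `rank_features`. It is not the paper's implementation.

```python
from math import prod


def dominance_degree(a, b):
    """Degree in [0, 1] to which score vector a dominates score vector b.

    Assumed product-ratio form for maximization:
        prod(min(a_i, b_i)) / prod(b_i)
    It equals 1 when a is at least as good as b on every objective and
    shrinks as a falls behind b on some objectives.
    """
    if len(a) != len(b):
        raise ValueError("score vectors must have the same length")
    if any(x <= 0 for x in a) or any(y <= 0 for y in b):
        raise ValueError("scores are assumed to be strictly positive")
    return prod(min(x, y) for x, y in zip(a, b)) / prod(b)


def rank_features(scores):
    """Rank features by their mean dominance degree over all other features.

    scores: dict mapping a feature name to a tuple of objective scores.
    Returns feature names ordered from most to least dominant.
    """
    if len(scores) < 2:
        return list(scores)
    mean_degree = {}
    for name, a in scores.items():
        others = [b for other, b in scores.items() if other != name]
        mean_degree[name] = sum(dominance_degree(a, b) for b in others) / len(others)
    return sorted(mean_degree, key=mean_degree.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores from three objectives for three incoming features.
    batch = {
        "f1": (0.9, 0.8, 0.7),
        "f2": (0.5, 0.4, 0.6),
        "f3": (0.2, 0.9, 0.3),
    }
    print(rank_features(batch))
```

In this toy batch, `f1` is at least as good as `f2` on every objective, so its dominance degree over `f2` is 1, whereas `f3` beats `f1` on only one objective and is ranked last. A streaming selector would presumably recompute such a ranking as each group of features arrives and keep only the top-ranked ones, but the selection rule itself is not given in the abstract.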
Keywords