IEEE Access (Jan 2024)
Feature Selection With Group-Sparse Stochastic Gates
Abstract
Identifying the features that significantly influence a target outcome is crucial for understanding complex relationships, reducing computational cost, and improving model generalization in high-dimensional data. While powerful for discovering intricate relationships, deep learning-based feature selection methods often overlook inherent group structures in the data, such as gene pathways or categorical variables. Consequently, these methods may fail to select informative features within the relevant groups, potentially selecting less informative features and, ultimately, degrading model performance. To address this challenge, we propose a novel deep learning-based feature selection method that achieves both intra-group and inter-group sparsity. By introducing a penalty term that encourages group sparsity, our method effectively selects informative groups of features, thereby improving model performance. We validate our approach through experiments on synthetic and real-world datasets with predefined group structures. Our method achieved a 2.5% to 8.2% reduction in prediction RMSE and a 0.3% to 1.9% improvement in prediction accuracy compared to existing methods. Furthermore, our approach demonstrated a 50% increase in the selection of biologically relevant features, enhancing model interpretability and alignment with the relevant scientific literature. These results confirm the effectiveness of our method in leveraging group structures to improve both performance and interpretability.
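To make the idea concrete, the sketch below shows one plausible way to combine stochastic gates with a group-sparsity penalty in PyTorch. It follows the Gaussian-relaxed gate construction from the original stochastic-gates literature (Yamada et al., 2020) and adds a hypothetical group-lasso-style term over the gate-activation probabilities; the class name `GroupSTG`, the penalty weighting, and the hyperparameters `sigma` and `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import math
import torch
import torch.nn as nn

class GroupSTG(nn.Module):
    """Gaussian-relaxed stochastic gates with an added group-sparsity penalty.

    A minimal sketch under stated assumptions, not the paper's exact method.
    """

    def __init__(self, groups, sigma=0.5, lam=0.1):
        # groups: list of feature-index lists, e.g. [[0, 1, 2], [3, 4]]
        super().__init__()
        n_features = sum(len(g) for g in groups)
        self.mu = nn.Parameter(0.5 * torch.ones(n_features))  # gate means
        self.groups = groups
        self.sigma = sigma  # noise scale of the Gaussian relaxation
        self.lam = lam      # overall regularization strength

    def forward(self, x):
        # z_d = clip(mu_d + eps_d, 0, 1), with eps_d ~ N(0, sigma^2) in training
        noise = self.sigma * torch.randn_like(self.mu) if self.training else 0.0
        z = torch.clamp(self.mu + noise, 0.0, 1.0)
        return x * z  # gates broadcast over the batch dimension

    def regularizer(self):
        # P(gate open) = Phi(mu_d / sigma), Phi the standard normal CDF;
        # summing these approximates the expected L0 norm (intra-group sparsity)
        p_open = 0.5 * (1.0 + torch.erf(self.mu / (self.sigma * math.sqrt(2.0))))
        intra = p_open.sum()
        # group-lasso-style term on open probabilities (inter-group sparsity)
        inter = sum(math.sqrt(len(g)) * torch.linalg.vector_norm(p_open[g])
                    for g in self.groups)
        return self.lam * (intra + inter)
```

In training, such a gate layer would sit in front of a predictor, with `regularizer()` added to the task loss, e.g. `loss = criterion(model(gates(x)), y) + gates.regularizer()`; after training, features whose gate means exceed a threshold, and hence the groups containing them, are the ones selected.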
Keywords