IEEE Access (Jan 2024)

Feature Selection With Group-Sparse Stochastic Gates

  • Hyeryn Park,
  • Changhee Lee

DOI
https://doi.org/10.1109/ACCESS.2024.3432509
Journal volume & issue
Vol. 12
pp. 102299–102312

Abstract

Identifying the features that significantly influence the target outcome is crucial for understanding complex relationships, reducing computational costs, and improving model generalization in high-dimensional data. While powerful for discovering intricate relationships, deep learning-based feature selection methods often overlook inherent group structures in the data, such as gene pathways or categorical variables. Consequently, these methods may fail to select informative features within the relevant groups, potentially leading to the selection of less informative features and, ultimately, lower model performance. To address this challenge, we propose a novel deep learning-based feature selection method that achieves both intra-group and inter-group sparsity. By introducing a penalty term that encourages group sparsity, our method effectively selects informative groups of features, thereby improving model performance. We validate our approach through experiments on synthetic and real-world datasets with predefined group structures. Our method achieved a 2.5% to 8.2% reduction in prediction RMSE and a 0.3% to 1.9% improvement in prediction accuracy compared to existing methods. Furthermore, our approach demonstrated a 50% increase in the selection of biologically relevant features, enhancing model interpretability and alignment with the relevant scientific literature. These results confirm the effectiveness of our method in leveraging group structures to improve both performance and interpretability.
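
The abstract describes gating each input feature with a stochastic gate and adding a penalty so that uninformative groups of features switch off together. The paper's exact formulation is not reproduced here, so the following is a minimal PyTorch sketch assuming the hard-clipped Gaussian gates common in the stochastic-gates literature and a group-lasso-style penalty on the gates' open probabilities; the class name GroupSparseGates and the hyperparameters sigma, lam_feat, and lam_group are illustrative assumptions, not the authors' API.

import torch
import torch.nn as nn

class GroupSparseGates(nn.Module):
    """Stochastic feature gates with an added group-sparsity penalty (sketch).

    Each feature d gets a gate z_d = clamp(mu_d + eps, 0, 1) with
    eps ~ N(0, sigma^2). The group term below (sqrt(group size) times the
    L2 norm of the gates' open probabilities per group) is one plausible
    group-lasso-style choice, not necessarily the paper's regularizer.
    """

    def __init__(self, groups, sigma=0.5, lam_feat=1.0, lam_group=1.0):
        super().__init__()
        # `groups` is a list of index lists, one per predefined feature group.
        self.groups = [torch.as_tensor(g, dtype=torch.long) for g in groups]
        n_features = sum(len(g) for g in self.groups)
        self.mu = nn.Parameter(0.5 * torch.ones(n_features))
        self.sigma = sigma
        self.lam_feat = lam_feat    # weight on per-feature (intra-group) sparsity
        self.lam_group = lam_group  # weight on group (inter-group) sparsity

    def forward(self, x):
        # Reparameterized sampling during training; deterministic gates at eval.
        eps = self.sigma * torch.randn_like(self.mu) if self.training else 0.0
        z = torch.clamp(self.mu + eps, 0.0, 1.0)  # hard-clipped Gaussian gate
        return x * z

    def regularizer(self):
        # P(z_d > 0) under the clipped Gaussian is Phi(mu_d / sigma).
        p_open = torch.distributions.Normal(0.0, 1.0).cdf(self.mu / self.sigma)
        feat_pen = p_open.sum()  # expected number of open gates
        # Group-lasso-style term: pushes whole groups to close together.
        group_pen = sum(
            (len(g) ** 0.5) * torch.linalg.norm(p_open[g]) for g in self.groups
        )
        return self.lam_feat * feat_pen + self.lam_group * group_pen

# Hypothetical usage: three predefined groups over nine features.
gates = GroupSparseGates(groups=[[0, 1, 2], [3, 4], [5, 6, 7, 8]])
predictor = nn.Linear(9, 1)
x, y = torch.randn(32, 9), torch.randn(32, 1)
loss = nn.functional.mse_loss(predictor(gates(x)), y) + gates.regularizer()

At evaluation time, features whose gates remain at zero are discarded; the group term drives the open probabilities of an uninformative group toward zero jointly, while the per-feature term prunes within the groups that survive.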

Keywords