NeuroImage (Oct 2024)

Local-structure-preservation and redundancy-removal-based feature selection method and its application to the identification of biomarkers for schizophrenia

  • Ying Xing,
  • Godfrey D. Pearlson,
  • Peter Kochunov,
  • Vince D. Calhoun,
  • Yuhui Du

Journal volume & issue
Vol. 299
p. 120839

Abstract

Accurate diagnosis of mental disorders is expected to be achieved through the identification of reliable neuroimaging biomarkers with the help of cutting-edge feature selection techniques. However, existing feature selection methods often fall short in capturing the local structural characteristics among samples and in effectively eliminating redundant features, resulting in inadequate performance in disorder prediction. To address this gap, we propose a novel supervised method named local-structure-preservation and redundancy-removal-based feature selection (LRFS), and then apply it to the identification of meaningful biomarkers for schizophrenia (SZ). The LRFS method leverages graph-based regularization to preserve the original sample similarity relationships during data transformation, thus retaining crucial local structure information. Additionally, it introduces redundancy-removal regularization based on interrelationships among features to exclude similar and redundant features from high-dimensional data. Moreover, the LRFS method incorporates ℓ2,1 sparse regularization, which enables the selection of a sparse and noise-robust feature subset. Experimental evaluations on eight public datasets with diverse properties demonstrate the superior performance of our method over nine popular feature selection methods in identifying discriminative features, with average classification accuracy gains ranging from 1.30 % to 9.11 %. Furthermore, the LRFS method demonstrates superior discriminability on four functional magnetic resonance imaging (fMRI) datasets from 708 healthy controls (HCs) and 537 SZ patients, with an average increase in classification accuracy ranging from 1.89 % to 9.24 % compared to the other nine methods. Notably, our method reveals reproducible and significant changes in SZ patients relative to HCs across the four datasets, predominantly in the thalamus-related functional network connectivity, which exhibit a significant correlation with clinical symptoms. Convergence analysis, parameter sensitivity analysis, and ablation studies further demonstrate the effectiveness and robustness of our method. In short, our proposed feature selection method effectively identifies discriminative and reliable features that hold the potential to be biomarkers, paving the way for the elucidation of brain abnormalities and the advancement of precise diagnosis of mental disorders.
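The three regularizers named in the abstract can be illustrated with generic textbook forms. The sketch below is not the LRFS objective, which the abstract does not spell out: the binary k-nearest-neighbor graph, the correlation-based redundancy proxy, and all function names are assumptions chosen only to show what each kind of term measures on a sample matrix X (samples by features) and a projection matrix W (features by outputs).

```python
import numpy as np


def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W.

    Penalizing it drives whole rows (features) of W toward zero.
    """
    return float(np.sum(np.linalg.norm(W, axis=1)))


def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - S of a symmetric kNN graph.

    Binary edge weights are an assumption made for simplicity.
    """
    n = X.shape[0]
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    S = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(sq_dist[i])
        neighbors = [j for j in order if j != i][:k]
        S[i, neighbors] = 1.0
    S = np.maximum(S, S.T)  # symmetrize the neighbor relation
    return np.diag(S.sum(axis=1)) - S


def local_structure_penalty(X, W, L):
    """tr((XW)^T L (XW)); small when neighboring samples stay close after projection."""
    Z = X @ W
    return float(np.trace(Z.T @ L @ Z))


def redundancy_penalty(X, W):
    """Hypothetical redundancy proxy (not from the paper).

    Weights each pair of features by their absolute correlation and by
    the row norms of W, so selecting two highly correlated features
    together is penalized.
    """
    C = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(C, 0.0)
    scores = np.linalg.norm(W, axis=1)
    return float(scores @ C @ scores)


def rank_features(W):
    """Rank features by the row norms of W, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))
```

In a sparse-regression style selector, terms like these would typically be added with trade-off weights to a supervised fitting loss, and the features would then be ranked by the row norms of the learned W. How LRFS actually combines and optimizes its terms is described in the full paper, not in this abstract.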

Keywords