PLoS Computational Biology (May 2022)

BOSO: A novel feature selection algorithm for linear regression with high-dimensional data.

  • Luis V Valcárcel,
  • Edurne San José-Enériz,
  • Xabier Cendoya,
  • Ángel Rubio,
  • Xabier Agirre,
  • Felipe Prósper,
  • Francisco J Planes

DOI
https://doi.org/10.1371/journal.pcbi.1010180
Journal volume & issue
Vol. 18, no. 5
p. e1010180

Abstract

With the rapid growth of high-dimensional datasets across biomedical domains, there is an urgent need for predictive methods that can handle this complexity. Feature selection is a key machine learning strategy for addressing this challenge. We introduce a novel feature selection algorithm for linear regression called BOSO (Bilevel Optimization Selector Operator). We benchmarked BOSO against key algorithms in the literature and found superior feature selection accuracy in high-dimensional datasets. We also present a proof of concept of BOSO for predicting drug sensitivity in cancer, including a detailed analysis of methotrexate, a well-studied drug targeting cancer metabolism.
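
The abstract does not describe how BOSO works, so the following is only a toy illustration of the general idea of feature selection for linear regression, and it is not the BOSO algorithm. It assumes a two-level reading of the problem: coefficients are fitted by least squares on a training split, and the feature subset is chosen by its error on a separate validation split. The function name `select_features`, the brute-force subset search, and the synthetic data are all hypothetical choices made for this sketch.

```python
import itertools
import numpy as np


def select_features(X_train, y_train, X_val, y_val, max_size):
    """Brute-force subset search (illustrative only, not BOSO).

    Fits ordinary least squares on the training split for every subset of
    at most `max_size` features and keeps the subset with the lowest
    mean squared error on the validation split.
    """
    # Baseline: the empty model, which predicts 0 for every sample.
    best_subset, best_err = (), float(np.mean(y_val ** 2))
    n_features = X_train.shape[1]
    for k in range(1, max_size + 1):
        for subset in itertools.combinations(range(n_features), k):
            cols = list(subset)
            # Inner step: least-squares coefficients on the training split.
            coef, *_ = np.linalg.lstsq(X_train[:, cols], y_train, rcond=None)
            # Outer step: score that fit on the validation split.
            err = float(np.mean((X_val[:, cols] @ coef - y_val) ** 2))
            if err < best_err:
                best_subset, best_err = subset, err
    return best_subset, best_err


# Synthetic data (an assumption for this sketch): only features 1 and 5 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 1] - 2.0 * X[:, 5] + 0.1 * rng.normal(size=200)

subset, err = select_features(X[:100], y[:100], X[100:], y[100:], max_size=3)
print("selected features:", subset, "validation MSE:", err)
```

Exhaustive search is used here only because it is easy to read. Its cost grows combinatorially with the number of features, which is why it would not be practical for the high-dimensional datasets the abstract refers to.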