Frontiers in Genetics (Jul 2023)

Feature selection translates drug response predictors from cell lines to patients

  • Shinsheng Yuan,
  • Yen-Chou Chen,
  • Chi-Hsuan Tsai,
  • Huei-Wen Chen,
  • Grace S. Shieh

DOI
https://doi.org/10.3389/fgene.2023.1217414
Journal volume & issue
Vol. 14

Abstract

Targeted therapies and chemotherapies are prevalent in cancer treatment. Identifying predictive markers to stratify cancer patients who will respond to these therapies remains challenging because patient drug response data are limited. Because large amounts of drug response data have been generated from cell lines, methods that efficiently translate cell-line-trained predictors to human tumors will be useful in clinical practice. Here, we propose versatile feature selection procedures that can be combined with any classifier. For demonstration, we combined the feature selection procedures with a (linear) logit model and a (non-linear) K-nearest neighbor classifier and trained these on cell lines, yielding LogitDA and KNNDA, respectively. We show that LogitDA/KNNDA significantly outperform existing methods, e.g., a logistic model and a deep learning method trained on thousands of genes, in prediction AUC (0.70–1.00 for seven of the ten drugs tested) and are interpretable. This may be because sample sizes are often limited in drug response prediction. We further derive a novel adjustment of the prediction cutoff for LogitDA that yields a prediction accuracy of 0.70–0.93 for seven drugs, including erlotinib and cetuximab, for which pathways relevant to anti-cancer therapies are also uncovered. These results indicate that our methods can efficiently translate cell-line-trained predictors to tumors.
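
The general idea of pairing a feature selection step with a simple classifier and scoring it by AUC can be illustrated with a minimal sketch. This is not the authors' procedure: the abstract does not describe how LogitDA or KNNDA select features, so the standardized mean-difference filter below is a stand-in, and the synthetic data, the gene indices, and all function names are hypothetical. It uses only the Python standard library.

```python
import random
from statistics import mean, pstdev


def make_data(n, n_genes, informative, shift, rng):
    """Synthetic expression matrix (assumption); label 1 = responder."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in informative:
            row[g] += shift * label  # responders are shifted on informative genes
        X.append(row)
        y.append(label)
    return X, y


def select_features(X, y, k):
    """Stand-in filter: keep the k genes with the largest standardized
    mean difference between responders and non-responders."""
    scored = []
    for g in range(len(X[0])):
        pos = [row[g] for row, lab in zip(X, y) if lab == 1]
        neg = [row[g] for row, lab in zip(X, y) if lab == 0]
        spread = pstdev(pos + neg) or 1.0  # guard against zero variance
        scored.append((abs(mean(pos) - mean(neg)) / spread, g))
    return sorted(g for _, g in sorted(scored, reverse=True)[:k])


def knn_score(train_X, train_y, x, genes, k=5):
    """Fraction of responders among the k nearest training samples,
    using squared Euclidean distance on the selected genes only."""
    dists = sorted(
        (sum((row[g] - x[g]) ** 2 for g in genes), lab)
        for row, lab in zip(train_X, train_y)
    )
    return sum(lab for _, lab in dists[:k]) / k


def auc(scores, labels):
    """AUC as the probability that a responder outscores a
    non-responder, counting ties as one half."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
informative = [3, 17, 42]  # hypothetical marker genes
train_X, train_y = make_data(80, 200, informative, 3.0, rng)  # "cell lines"
test_X, test_y = make_data(30, 200, informative, 3.0, rng)    # "tumors"

genes = select_features(train_X, train_y, k=3)
scores = [knn_score(train_X, train_y, x, genes) for x in test_X]
test_auc = auc(scores, test_y)
print("selected genes:", genes)
print("test AUC:", test_auc)
```

Selecting a handful of genes before fitting keeps the number of features small relative to the number of samples, which is the setting the abstract identifies as typical for drug response prediction. In real data the training and test distributions differ (cell lines versus tumors), which this toy example does not model.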

Keywords