International Journal of Applied Mathematics and Computer Science (Mar 2019)

Machine learning techniques combined with dose profiles indicate radiation response biomarkers

  • Papiez Anna,
  • Badie Christophe,
  • Polanska Joanna

DOI
https://doi.org/10.2478/amcs-2019-0013
Journal volume & issue
Vol. 29, no. 1
pp. 169 – 178

Abstract

Read online

The focus of this research is to combine statistical and machine learning tools in application to a high-throughput biological data set on ionizing radiation response. The analyzed data consist of two gene expression sets obtained in studies of radiosensitive and radioresistant breast cancer patients undergoing radiotherapy. The data sets were similar in principle; however, the treatment dose differed. It is shown that introducing mathematical adjustments in data preprocessing, differentiation and trend testing, and classification, coupled with current biological knowledge, allows efficient data analysis and obtaining accurate results. The tools used to customize the analysis workflow were batch effect filtration with empirical Bayes models, identifying gene trends through the Jonckheere–Terpstra test and linear interpolation adjustment according to specific gene profiles for multiple random validation. The application of non-standard techniques enabled successful sample classification at the rate of 93.5% and the identification of potential biomarkers of radiation response in breast cancer, which were confirmed with an independent Monte Carlo feature selection approach and by literature references. This study shows that using customized analysis workflows is a necessary step towards novel discoveries in complex fields such as personalized individual therapy.

Keywords