Journal of Big Data (Jul 2021)

Prediction of chemoresistance trait of cancer cell lines using machine learning algorithms and systems biology analysis

  • Atousa Ataei,
  • Niloufar Seyed Majidi,
  • Javad Zahiri,
  • Mehrdad Rostami,
  • S. Shahriar Arab,
  • Albert A. Rizvanov

DOI
https://doi.org/10.1186/s40537-021-00477-z
Journal volume & issue
Vol. 8, no. 1
pp. 1 – 21

Abstract

Read online

Abstract Most of the current cancer treatment approaches are invasive along with a broad spectrum of side effects. Furthermore, cancer drug resistance known as chemoresistance is a huge obstacle during treatment. This study aims to predict the resistance of several cancer cell-lines to a drug known as Cisplatin. In this papers the NCBI GEO database was used to obtain data and then the harvested data was normalized and its batch effects were corrected by the Combat software. In order to select the appropriate features for machine learning, the feature selection/reduction was performed based on the Fisher Score method. Six different algorithms were then used as machine learning algorithms to detect Cisplatin resistant and sensitive samples in cancer cell lines. Moreover, Differentially Expressed Genes (DEGs) between all the sensitive and resistance samples were harvested. The selected genes were enriched in biological pathways by the enrichr database. Topological analysis was then performed on the constructed networks using Cytoscape software. Finally, the biological description of the output genes from the performed analyses was investigated through literature review. Among the six classifiers which were trained to distinguish between cisplatin resistance samples and the sensitive ones, the KNN and the Naïve Bayes algorithms were proposed as the most convenient machines according to some calculated measures. Furthermore, the results of the systems biology analysis determined several potential chemoresistance genes among which PTGER3, YWHAH, CTNNB1, ANKRD50, EDNRB, ACSL6, IFNG and, CTNNB1 are topologically more important than others. These predictions pave the way for further experimental researches.

Keywords