BMC Medical Genomics (Nov 2021)

Are we there yet? A machine learning architecture to predict organotropic metastases

  • Michael Skaro,
  • Marcus Hill,
  • Yi Zhou,
  • Shannon Quinn,
  • Melissa B. Davis,
  • Andrea Sboner,
  • Mandi Murph,
  • Jonathan Arnold

DOI
https://doi.org/10.1186/s12920-021-01122-7
Journal volume & issue
Vol. 14, no. 1
pp. 1 – 11

Abstract


Background & Aims: Cancer metastasis into distant organs is an evolutionarily selective process. A better understanding of the forces that endow tumor seeds with proliferative plasticity in distant soils is required to develop and adapt better treatments for this lethal stage of the disease. To this end, we aimed first to use transcript expression profiling features to predict the site-specific metastases of primary tumors and second to identify the determinants of tissue-specific progression.

Methods: We used statistical machine learning for transcript feature selection to optimize classification and built tree-based classifiers to predict tissue-specific sites of metastatic progression.

Results: We developed a novel machine learning architecture that analyzes 33 types of RNA transcriptome profiles from The Cancer Genome Atlas (TCGA) database. Our classifier identifies the tumor type, derives synthetic instances of primary tumors metastasizing to distant organs, and classifies the site-specific metastases in 16 types of cancers metastasizing to 12 locations.

Conclusions: We have demonstrated that site-specific metastatic progression is predictable using transcriptomic profiling data from primary tumors and that the overrepresented biological processes in tumors metastasizing to congruent distant loci are highly overlapping. These results indicate that site-specific progression is organotropic and that core features of biological signaling pathways can be identified that may describe proliferative plasticity in distant soils.
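The general pattern described in the Methods (statistical feature selection followed by a tree-based classifier) can be illustrated with a small, self-contained sketch. It is not the authors' pipeline. The ANOVA F-statistic ranking, the Gini-based decision tree, the three site labels, and the randomly generated "expression" values are all assumptions chosen for illustration; the abstract does not specify which selection statistic or tree algorithm was used.

```python
import random
from statistics import mean

def f_statistic(values, labels):
    """One-way ANOVA F-statistic of one feature across classes (assumed ranking criterion)."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand, k, n = mean(values), len(groups), len(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    return between / within if within > 0 else float("inf")

def select_features(X, y, top_k):
    """Return the indices of the top_k features ranked by F-statistic."""
    scores = [f_statistic([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:top_k]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def majority(labels):
    # sorted() makes tie-breaking deterministic
    return max(sorted(set(labels)), key=labels.count)

def build_tree(X, y, features, depth=0, max_depth=3):
    """Greedy binary tree: a leaf is a label, a node is (feature, threshold, left, right)."""
    if depth == max_depth or len(set(y)) == 1:
        return majority(y)
    best = None
    for j in features:
        # Skip the largest value so the right-hand branch is never empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:  # every selected feature is constant in this node
        return majority(y)
    _, j, t, left, right = best
    return (j, t,
            build_tree([X[i] for i in left], [y[i] for i in left], features, depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], features, depth + 1, max_depth))

def predict(tree, row):
    while isinstance(tree, tuple):
        j, t, low, high = tree
        tree = low if row[j] <= t else high
    return tree

# Toy data only: 120 samples, 20 "transcripts", 3 hypothetical metastatic sites.
rng = random.Random(0)
SITES = ["bone", "liver", "lung"]

def make_sample(site_idx):
    row = [rng.gauss(0, 1) for _ in range(20)]
    row[0] += 6 * site_idx          # feature 0 separates all three sites
    row[7] += 6 * (site_idx == 2)   # feature 7 separates "lung" from the rest
    return row

X = [make_sample(i % 3) for i in range(120)]
y = [SITES[i % 3] for i in range(120)]
X_train, y_train, X_test, y_test = X[:90], y[:90], X[90:], y[90:]

selected = select_features(X_train, y_train, top_k=3)
tree = build_tree(X_train, y_train, selected)
accuracy = sum(predict(tree, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
print("selected features:", selected, "held-out accuracy:", accuracy)
```

Selecting features on the training split only, as done here, keeps the held-out accuracy from being inflated by information leaking from the test samples. A real analysis on TCGA-scale data would more plausibly rely on an established library and an ensemble of trees than on a single hand-written tree.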

Keywords