Frontiers in Bioengineering and Biotechnology (Aug 2020)

A Deep Learning Framework to Predict Tumor Tissue-of-Origin Based on Copy Number Alteration

  • Ying Liang,
  • Haifeng Wang,
  • Jialiang Yang,
  • Xiong Li,
  • Chan Dai,
  • Peng Shao,
  • Geng Tian,
  • Bo Wang,
  • Yinglong Wang

DOI
https://doi.org/10.3389/fbioe.2020.00701
Journal volume & issue
Vol. 8

Abstract

Cancer of unknown primary site (CUPS) is a type of metastatic tumor for which the site of tumor origin cannot be determined. Precise diagnosis of the tissue of origin for metastatic CUPS is crucial for developing treatment schemes that improve patient prognosis. Recently, many studies have used various cancer biomarkers to predict the tissue-of-origin (TOO) of CUPS. However, very few of them use copy number alteration (CNA) to trace the TOO. In this paper, a two-step computational framework called CNA_origin is introduced to predict the tissue-of-origin of a tumor from its gene CNA levels. CNA_origin sets up an intelligent deep-learning network composed mainly of an autoencoder and a convolutional neural network (CNN). On real datasets released from public databases, CNA_origin had an overall accuracy of 83.81% on 10-fold cross-validation and 79% on independent datasets for predicting tumor origin, improving accuracy by 7.75% and 9.72%, respectively, compared with the method published in a previous paper. Our results suggest that the autoencoder model can extract key characteristics of CNA and that the CNN classifier model developed in this study can predict the origin of tumors robustly and effectively. CNA_origin was written in Python and can be downloaded from https://github.com/YingLianghnu/CNA_origin.
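The two-step data flow described in the abstract (an autoencoder that compresses gene-level CNA values, followed by a CNN that classifies the compressed representation) can be illustrated with a minimal NumPy forward pass. This is only a sketch under stated assumptions and is not the CNA_origin implementation: the layer sizes, the single dense encoder and decoder layers, the one-dimensional convolution over the latent code, the random untrained weights, and the synthetic input are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration; none are taken from the paper.
N_SAMPLES, N_GENES, N_LATENT, N_CLASSES = 8, 200, 32, 5
N_FILTERS, KERNEL = 4, 3


def relu(x):
    return np.maximum(x, 0.0)


def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Step 1 (assumed form): an autoencoder with one dense layer each way.
# Weights are random and untrained; a real model would fit them by
# minimizing reconstruction error.
W_enc = rng.normal(scale=0.1, size=(N_GENES, N_LATENT))
W_dec = rng.normal(scale=0.1, size=(N_LATENT, N_GENES))


def encode(x):
    return relu(x @ W_enc)


def decode(h):
    return h @ W_dec


# Step 2 (assumed form): a 1D convolution over the latent code,
# followed by a dense softmax layer over tissue classes.
conv_w = rng.normal(scale=0.1, size=(N_FILTERS, KERNEL))
CONV_LEN = N_LATENT - KERNEL + 1
W_out = rng.normal(scale=0.1, size=(N_FILTERS * CONV_LEN, N_CLASSES))


def cnn_predict(h):
    # Build sliding windows of shape (n_samples, CONV_LEN, KERNEL).
    windows = np.stack([h[:, i:i + KERNEL] for i in range(CONV_LEN)], axis=1)
    # Apply every filter to every window: (n_samples, N_FILTERS, CONV_LEN).
    feats = relu(np.einsum("nlk,fk->nfl", windows, conv_w))
    return softmax(feats.reshape(len(h), -1) @ W_out)


# Synthetic stand-in for a samples-by-genes CNA matrix.
x = rng.normal(size=(N_SAMPLES, N_GENES))
latent = encode(x)                       # compressed CNA features
recon = decode(latent)                   # reconstruction used to train step 1
probs = cnn_predict(latent)              # class probabilities per sample
predicted_class = probs.argmax(axis=1)   # index of the predicted tissue
```

In this sketch the decoder matters only while the autoencoder is being trained; at prediction time only the encoder output is passed to the classifier.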

Keywords