IEEE Access (Jan 2021)
A Multi-Learning Training Approach for Distinguishing Low and High Risk Cancer Patients
Abstract
All cancers are caused by changes in the DNA within cells that occur over the course of an individual’s lifetime. These mutations confer extensive genetic and phenotypic variation within individuals, making the identification of appropriate treatments difficult and costly. Moreover, cancer datasets are usually highly sparse because they contain few samples and many input features, which makes it difficult to design accurate predictors that classify patients into risk groups. Here, we report on the Multi Learning Training (MuLT) algorithm, which employs supervised, unsupervised, and self-supervised learning methods to exploit the interplay of clinical and molecular features for distinguishing low- and high-risk cancer patients. Our solution is evaluated on three independent, public cancer datasets, considering three different performance aspects, through 5-fold cross-validation experiments. MuLT outranks other methods, achieving AUCs between 0.65 and 0.77 and mean squared errors smaller than 0.24, while reducing classification complexity. These findings confirm the benefits of combining different learning algorithms and of coupling molecular and clinical data to support clinical decision making in oncology.
Keywords