A Cascade Deep Forest Model for Breast Cancer Subtype Classification Using Multi-Omics Data

Ala’a El-Nabawy; Nahla A. Belal; Nashwa El-Bendary

doi:10.3390/math9131574

Mathematics (Jul 2021)

Affiliations

Ala’a El-Nabawy: Orange Labs., Smart Village 12511, Giza Governorate, Egypt
Nahla A. Belal: College of Computing and Information Technology, Arab Academy for Science, Technology, and Maritime Transport, Smart Village, Giza 12577, Egypt
Nashwa El-Bendary: College of Computing and Information Technology, Arab Academy for Science, Technology, and Maritime Transport, Smart Village, Giza 12577, Egypt

DOI: https://doi.org/10.3390/math9131574
Journal volume & issue: Vol. 9, no. 13, p. 1574

Abstract

Automated diagnosis systems aim to reduce the cost of diagnosis while maintaining the same efficiency. Many methods have been used for breast cancer subtype classification. Some use a single data source, while others integrate several data sources, an approach that lowers computational performance, though not accuracy. Breast cancer data, especially biological data, is known to be imbalanced, and extensive collections of histopathological images, as one form of biological data, are lacking. Recent studies have shown that the cascade Deep Forest ensemble model achieves classification accuracy competitive with alternatives such as general ensemble learning methods and conventional deep neural networks (DNNs), especially for imbalanced training sets, by learning hyper-representations with cascaded ensembles of decision trees. In this work, a cascade Deep Forest is employed to classify breast cancer subtypes, IntClust and Pam50, using multi-omics datasets and different configurations. The results show an accuracy of 83.45% for 5 subtypes and 77.55% for 10 subtypes. The significance of this work lies in showing that gene expression data alone, used with the cascade Deep Forest classifier, achieves accuracy comparable to other techniques at higher computational performance: the recorded time is about 5 s for 10 subtypes and 7 s for 5 subtypes.

Published in Mathematics

ISSN: 2227-7390 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science: Mathematics
Website: http://www.mdpi.com/journal/mathematics
