Recognition of cancer mediating genes using MLP-SDAE model

Sougata Sheet; Ranjan Ghosh; Anupam Ghosh

doi:10.1016/j.sasc.2024.200079

Systems and Soft Computing (Dec 2024)

Affiliations

Sougata Sheet: Department of Computer Science & Engineering, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, 751030, India; A. K. Choudhury School of IT, University of Calcutta, Kolkata, 700106, West Bengal, India; Corresponding author at: Department of Computer Science & Engineering, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, 751030, India
Ranjan Ghosh: Institute of Radio Physics and Electronics, University of Calcutta, Kolkata, 700009, West Bengal, India
Anupam Ghosh: Department of Computer Science & Engineering, Netaji Subhash Engineering College, Garia, 700152, West Bengal, India

DOI: https://doi.org/10.1016/j.sasc.2024.200079
Journal volume & issue: Vol. 6, p. 200079

Abstract

This article introduces MLP-SDAE, a predictive deep learning model that combines a Multilayer Perceptron (MLP) with a Stacked Denoising Auto-encoder (SDAE). The model is trained using the SDAE for feature selection, with backpropagation employed within the MLP structure. Dropout is incorporated to improve the model’s performance and prevent overfitting. The primary objective of the MLP-SDAE model is to identify associations among genes whose expression behavior has changed significantly from the normal to the diseased state. This allows us to predict disease-mediating genes and their altered associations. The methodology involves calculating gene-based correlation coefficients and selecting a subset of genes based on this analysis. We demonstrate the effectiveness of our method on four gene expression datasets related to human leukemia, lung, colon, and breast cancer. As a result, we have identified several potentially important genes, such as CACLA, HBA, IGFBP3, EFGR, TFN, TP53, LI6, and TMTC1, which may play a crucial role in the development of these cancers. Furthermore, we conducted a comprehensive comparative study with other deep learning techniques, including Recurrent Neural Network (RNN), Deep Belief Network (DBN), Deep Boltzmann Machine (DBM), Auto-encoder (AE), and Denoising Auto-encoder (DAE). Our results have been validated through biochemical pathway analysis, t-tests, F-score, Gene Ontology (GO) identification, and the NCBI database. These validations indicate that the proposed MLP-SDAE model outperforms the compared methods.
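
The correlation step mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' published procedure: the scoring rule (mean absolute change in Pearson correlation between the normal and diseased states), the function names `correlation_shift_scores` and `select_top_genes`, and the synthetic data are all hypothetical.

```python
import numpy as np


def correlation_shift_scores(normal, diseased):
    """Score each gene by how much its correlations with other genes change.

    Both inputs have shape (n_genes, n_samples). The score is the mean
    absolute difference between the two Pearson correlation matrices,
    taken over all other genes (an assumed rule, for illustration only).
    """
    corr_normal = np.corrcoef(normal)      # rows are treated as genes
    corr_diseased = np.corrcoef(diseased)
    diff = np.abs(corr_diseased - corr_normal)
    np.fill_diagonal(diff, 0.0)            # ignore self-correlation
    n_genes = diff.shape[0]
    return diff.sum(axis=1) / (n_genes - 1)


def select_top_genes(scores, k):
    """Return the indices of the k genes with the largest scores."""
    return np.argsort(scores)[::-1][:k]


# Synthetic expression data: 6 genes, 500 samples per state.
rng = np.random.default_rng(0)
n_genes, n_samples = 6, 500
normal = rng.normal(size=(n_genes, n_samples))
diseased = rng.normal(size=(n_genes, n_samples))

# Make genes 0 and 1 strongly co-expressed in the diseased state only.
diseased[1] = diseased[0] + 0.05 * rng.normal(size=n_samples)

scores = correlation_shift_scores(normal, diseased)
top = select_top_genes(scores, k=2)
print("scores:", np.round(scores, 3))
print("selected gene indices:", sorted(top.tolist()))
```

In this toy setup, the two genes whose association was altered between states receive the largest scores. In a pipeline like the one the abstract describes, a subset selected this way would then be passed on to the SDAE and MLP stages.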

Published in Systems and Soft Computing

ISSN: 2772-9419 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology; Science: Mathematics: Instruments and machines: Electronic computers. Computer science
Website: https://www.sciencedirect.com/journal/systems-and-soft-computing
