Deep learning regularization techniques applied to genomics data
Harouna Soumare,
Alia Benkahla,
Nabil Gmati
Affiliations
Harouna Soumare
The Laboratory of Mathematical Modelling and Numeric in Engineering Sciences, National Engineering School of Tunis, University of Tunis El Manar, Rue Béchir Salem Belkhiria Campus Universitaire, B.P. 37, 1002, Tunis Belvédère, Tunisia; Laboratory of BioInformatics, BioMathematics, and BioStatistics, Institut Pasteur de Tunis, 13 Place Pasteur, B.P. 74, 1002, Tunis Belvédère, Tunisia. Corresponding author at: The Laboratory of Mathematical Modelling and Numeric in Engineering Sciences, National Engineering School of Tunis.
Alia Benkahla
Laboratory of BioInformatics, BioMathematics, and BioStatistics, Institut Pasteur de Tunis, 13 Place Pasteur, B.P. 74 1002, Tunis, Belvédère, Tunisia
Nabil Gmati
College of Sciences & Basic and Applied Scientific Research Center, Imam Abdulrahman Bin Faisal University, P.O. Box 1982, 31441, Dammam, Saudi Arabia
Deep Learning algorithms have achieved great success in many domains where large-scale datasets are available. However, training these algorithms on high-dimensional data requires adjusting a large number of parameters, which makes overfitting difficult to avoid. Regularization techniques such as L1 and L2 penalties prevent the parameters of the training model from becoming too large. Another commonly used regularization method, Dropout, randomly removes some hidden units during the training phase. In this work, we describe several Deep Learning architectures, explain the optimization process used to train them, and attempt to establish a theoretical relationship between L2 regularization and Dropout. We experimentally compare the effects of these techniques on the learning model using genomics datasets.
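The two regularization mechanisms named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the weight matrix, penalty strength `lam`, and drop probability `p` are arbitrary toy values, and the Dropout shown is the common "inverted" variant that rescales surviving units so their expected value matches test time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hidden-layer weights and a batch of 5 hidden activations.
W = rng.normal(size=(4, 3))
h = rng.normal(size=(5, 4))

# L2 regularization: add lam * ||W||^2 to the loss, penalizing
# large weights and shrinking them during gradient descent.
lam = 0.01
l2_penalty = lam * np.sum(W ** 2)

# Dropout: during training, zero each hidden unit independently
# with probability p; rescale survivors by 1/(1 - p) so the
# expected activation is unchanged ("inverted dropout").
p = 0.5
mask = rng.random(h.shape) >= p   # keep each unit with prob 1 - p
h_dropped = h * mask / (1.0 - p)

print("L2 penalty:", l2_penalty)
print("fraction of units dropped:", 1.0 - mask.mean())
```

At test time no units are dropped and no rescaling is applied; the L2 penalty, in contrast, only affects the loss during training and disappears from the forward pass entirely.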