Artificial Intelligence in the Life Sciences (Dec 2021)
AutoGenome: An AutoML tool for genomic research
Abstract
Deep learning has achieved great success in traditional fields like computer vision (CV), natural language processing (NLP), speech processing, and more. These advancements have greatly inspired researchers in genomics and made deep learning in genomics an exciting and popular topic. The convolutional neural network (CNN) and recurrent neural network (RNN) are frequently used to solve genomic sequencing and prediction problems, and multilayer perceptrons (MLP) and auto-encoders (AE) are frequently used for genomic profiling data like RNA expression data and gene mutation data. Here, we introduce a new neural network architecture, the residual fully-connected neural network (RFCN), and describe its advantage in modeling genomic profiling data. We also incorporate AutoML algorithms and implement AutoGenome, an end-to-end, automated deep learning framework for genomic studies. By utilizing the proposed RFCN architecture, automatic hyper-parameter search, and neural architecture search algorithms, AutoGenome can automatically train high-performance deep learning models for various kinds of genomic profiling data. To help researchers better understand the trained models, AutoGenome can assess the importance of different features and export the most critical features for supervised learning tasks and the representative latent vectors for unsupervised learning tasks. We expect AutoGenome will become a popular tool in genomic studies.
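To make the idea of a residual fully-connected network concrete, the following is a minimal sketch of a single residual block built from dense layers. It is an illustration under stated assumptions, not the AutoGenome implementation: the abstract does not specify the RFCN's layer layout, so the two-layer block, the ReLU activation, the layer width, and all function names (`dense`, `residual_fc_block`, `init_layer`) are hypothetical choices borrowed from the generic residual-network pattern.

```python
import random


def dense(x, weights, bias):
    """Fully-connected layer for one sample: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def residual_fc_block(x, w1, b1, w2, b2):
    """Two dense layers whose output is added back to the input (skip connection).

    Assumption: input and output widths are equal so the addition is well defined.
    """
    h = relu(dense(x, w1, b1))
    h = dense(h, w2, b2)
    return relu([hi + xi for hi, xi in zip(h, x)])


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialisation only)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)
width = 8  # hypothetical feature count; real profiling data has far more features
x = [rng.random() for _ in range(width)]  # toy stand-in for one expression profile
w1, b1 = init_layer(width, width, rng)
w2, b2 = init_layer(width, width, rng)
y = residual_fc_block(x, w1, b1, w2, b2)
```

The skip connection means that a block whose weights are all zero passes a non-negative input through unchanged, which is the property that usually makes deeper stacks of such blocks easier to train than plain stacked dense layers.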