G3: Genes, Genomes, Genetics (Nov 2020)

A Multivariate Poisson Deep Learning Model for Genomic Prediction of Count Data

  • Osval Antonio Montesinos-López,
  • José Cricelio Montesinos-López,
  • Pawan Singh,
  • Nerida Lozano-Ramirez,
  • Alberto Barrón-López,
  • Abelardo Montesinos-López,
  • José Crossa

DOI
https://doi.org/10.1534/g3.120.401631
Journal volume & issue
Vol. 10, no. 11
pp. 4177 – 4190

Abstract

Genomic selection (GS) is a revolutionary paradigm for developing new plants and animals. It is a predictive methodology, since it relies on learning methods to perform its task. Unfortunately, no universal model can be used for all types of predictions; for this reason, specific methodologies are required for each type of output (response variable). Since efficient methodologies for multivariate count data outcomes are lacking, this paper proposes a multivariate Poisson deep neural network (MPDN) model for the genomic prediction of several count outcomes simultaneously. The MPDN model uses the negative log-likelihood of a Poisson distribution as its loss function, the rectified linear unit (ReLU) activation function in the hidden layers to capture nonlinear patterns, and the exponential activation function in the output layer to produce outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and univariate Poisson deep learning models on two experimental data sets of count data. We found that the proposed MPDN model outperformed the univariate Poisson deep neural network models but did not outperform, in terms of prediction, the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back-end and Keras as the front-end, which allows these models to be applied to moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
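
The three ingredients named in the abstract (ReLU hidden layers, an exponential output layer, and a Poisson negative log-likelihood loss) can be illustrated with a minimal sketch. This is not the authors' implementation, which used Keras and TensorFlow; it is a plain-Python forward pass under assumed settings. The single hidden layer, the layer sizes, the random weights, and the names `init_params`, `mpdn_forward`, and `poisson_nll` are all hypothetical and chosen only for illustration.

```python
import math
import random


def dense(x, weights, bias):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xj for w, xj in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v):
    """Rectified linear unit, applied element-wise."""
    return [max(0.0, z) for z in v]


def init_params(n_markers, n_hidden, n_traits, seed=0):
    """Small random weights; the sizes are illustrative assumptions."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": mat(n_hidden, n_markers), "b1": [0.0] * n_hidden,
        "W2": mat(n_traits, n_hidden), "b2": [0.0] * n_traits,
    }


def mpdn_forward(x, p):
    """One ReLU hidden layer, then an exponential output per count trait."""
    hidden = relu(dense(x, p["W1"], p["b1"]))
    eta = dense(hidden, p["W2"], p["b2"])
    # exp() keeps every predicted mean strictly positive, i.e. on the count scale
    return [math.exp(e) for e in eta]


def poisson_nll(y, mu):
    """Negative Poisson log-likelihood summed over traits:
    mu - y*log(mu) + log(y!)."""
    return sum(m - yi * math.log(m) + math.lgamma(yi + 1.0)
               for yi, m in zip(y, mu))


# Hypothetical example: 5 marker covariates, 2 count traits for one line.
params = init_params(n_markers=5, n_hidden=4, n_traits=2)
x = [1.0, 0.0, -1.0, 1.0, 0.0]
y = [3, 0]
mu = mpdn_forward(x, params)
loss = poisson_nll(y, mu)
print(mu, loss)
```

Because the output layer has one unit per trait and the loss sums over traits, all count outcomes share the hidden-layer weights, which is what makes the model multivariate. The `log(y!)` term does not depend on the network weights, so dropping it leaves the gradients unchanged.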

Keywords