PNNGS, a multi-convolutional parallel neural network for genomic selection

Zhengchao Xie; Lin Weng; Jingjing He; Xianzhong Feng; Xiaogang Xu; Yinxing Ma; Panpan Bai; Qihui Kong

doi:10.3389/fpls.2024.1410596

Frontiers in Plant Science (Sep 2024)

Affiliations

Zhengchao Xie: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China
Lin Weng: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China
Jingjing He: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China
Xianzhong Feng: Key Laboratory of Soybean Molecular Design Breeding, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun, China
Xiaogang Xu: School of Computer Science and Technology, Zhejiang Gongshang University, Hangzhou, China
Yinxing Ma: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China
Panpan Bai: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China
Qihui Kong: Research Center for Life Sciences Computing, Zhejiang Laboratory, Hangzhou, China

Journal volume & issue: Vol. 15

Abstract

Genomic selection (GS) can accelerate breeding relative to phenotypic selection, and improving prediction accuracy is the key to promoting its use. To improve the accuracy and stability of GS prediction, we introduce parallel convolution into deep learning for GS and call the resulting model a parallel neural network for genomic selection (PNNGS). In PNNGS, information passes in parallel through convolutions with different kernel sizes, and the convolutions within each branch are linked by residual connections. PNNGS is trained with four different Lp loss functions. Experiments show that the optimal number of parallel paths for rice, sunflower, wheat, and maize is 4, 6, 4, and 3, respectively. Phenotype prediction is performed on 24 cases using ridge-regression best linear unbiased prediction (RRBLUP), random forests (RF), support vector regression (SVR), deep neural network genomic prediction (DNNGP), and PNNGS. The serial DNNGP and the parallel PNNGS outperform the other three algorithms. On average, the prediction accuracy of PNNGS is 0.031 higher than that of DNNGP, indicating that parallelism can improve the GS model. Plants are divided into clusters using principal component analysis (PCA) and K-means clustering. The sample sizes of the clusters vary greatly, indicating that the data are unbalanced. Stratified sampling improves both the prediction stability and the accuracy of PNNGS. When the number of training samples in small clusters is reduced, the prediction accuracy of PNNGS decreases significantly. Increasing the sample size of small clusters is therefore critical to improving the prediction accuracy of GS.
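
The parallel-branch idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the toy kernels, the 0/1/2 marker coding, the concatenation of branch outputs, and the definition of the Lp loss as the mean of |prediction - target|^p are all assumptions made for illustration, and the sketch has no learned weights or activation functions.

```python
from typing import List, Sequence


def conv1d_same(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1D sliding-window product with zero padding; output length equals input length."""
    k = len(kernel)
    if k % 2 == 0:
        raise ValueError("this sketch only supports odd kernel sizes")
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k)) for i in range(len(x))]


def residual_branch(x: Sequence[float], kernels: Sequence[Sequence[float]]) -> List[float]:
    """Apply the branch's convolutions in sequence, adding each convolution's input to its output."""
    h = list(x)
    for kernel in kernels:
        y = conv1d_same(h, kernel)
        h = [a + b for a, b in zip(h, y)]  # residual (skip) connection
    return h


def parallel_forward(x: Sequence[float],
                     branches: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Feed the same input to every branch and concatenate the branch outputs (assumed merge rule)."""
    out: List[float] = []
    for kernels in branches:
        out.extend(residual_branch(x, kernels))
    return out


def lp_loss(pred: Sequence[float], target: Sequence[float], p: float) -> float:
    """Assumed Lp loss: mean of |pred - target| ** p over all samples."""
    return sum(abs(a - b) ** p for a, b in zip(pred, target)) / len(pred)


# Hypothetical genotype vector (markers coded 0/1/2) and two branches
# with kernel sizes 1 and 3; the kernel values are arbitrary.
markers = [0, 1, 2, 1, 0]
branches = [
    [[1.0]],            # branch 1: one convolution, kernel size 1
    [[1.0, 1.0, 1.0]],  # branch 2: one convolution, kernel size 3
]
features = parallel_forward(markers, branches)
```

In a trainable model the concatenated features would typically be passed to a regression head, and a loss such as `lp_loss(predictions, phenotypes, p)` would be minimized; the values of p used in the paper are not stated in the abstract, so none are assumed here.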

Published in Frontiers in Plant Science

ISSN: 1664-462X (Online)
Publisher: Frontiers Media S.A.
Country of publisher: Switzerland
LCC subjects: Agriculture: Plant culture
Website: https://www.frontiersin.org/journals/plant-science
