Transcriptome-based prediction for polygenic traits in rice using different gene subsets

Ryokei Tanaka; Tsubasa Kawai; Taiji Kawakatsu; Nobuhiro Tanaka; Matthew Shenton; Shiori Yabe; Yusaku Uga

doi:10.1186/s12864-024-10803-3

BMC Genomics (Oct 2024)

Affiliations

Ryokei Tanaka: Institute of Crop Sciences, National Agriculture & Food Research Organization
Tsubasa Kawai: Institute of Crop Sciences, National Agriculture & Food Research Organization
Taiji Kawakatsu: Institute of Agrobiological Sciences, National Agriculture & Food Research Organization
Nobuhiro Tanaka: Institute of Crop Sciences, National Agriculture & Food Research Organization
Matthew Shenton: Institute of Crop Sciences, National Agriculture & Food Research Organization
Shiori Yabe: Institute of Crop Sciences, National Agriculture & Food Research Organization
Yusaku Uga: Institute of Crop Sciences, National Agriculture & Food Research Organization

DOI: https://doi.org/10.1186/s12864-024-10803-3
Journal volume & issue: Vol. 25, no. 1, pp. 1–14

Abstract

Background: Transcriptome-based prediction of complex phenotypes is a relatively new statistical method that links genetic variation to phenotypic variation. The selection of large-effect genes based on a priori biological knowledge is beneficial for predicting oligogenic traits; however, such a simple gene selection method is not applicable to polygenic traits because causal genes or large-effect loci are often unknown. Here, we used several gene-level features and tested whether it was possible to select a gene subset that resulted in better predictive ability than using all genes for predicting a polygenic trait.

Results: Using the phenotypic values of shoot and root traits and transcript abundances in leaves and roots of 57 rice accessions, we evaluated the predictive abilities of the transcriptome-based prediction models. Leaf transcripts predicted shoot phenotypes, such as plant height, more accurately than root transcripts, whereas root transcripts predicted root phenotypes, such as crown root length, more accurately than leaf transcripts. Furthermore, we used the following three features to train the prediction model: (1) tissue specificity of the transcripts, (2) ontology annotations, and (3) co-expression modules for selecting gene subsets. Although models trained by a gene subset often resulted in lower predictive abilities than the model trained by all genes, some gene subsets showed improved predictive ability. For example, using genes expressed in roots but not in leaves, the predictive ability for crown root diameter was improved by more than 10% (R² = 0.59 when using all genes; R² = 0.66, using 1,554 root-specifically expressed genes). Similarly, genes annotated as “gibberellic acid sensitivity” showed higher predictive ability than using all genes for root dry weight.

Conclusions: Our results highlight both the possibility and difficulty of selecting an appropriate gene subset to predict polygenic traits from transcript abundance, given the current biological knowledge and information. Further integration of multiple sources of information, as well as improvements in gene characterization, may enable the selection of an optimal gene set for the prediction of polygenic phenotypes.
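
As a rough illustration of the comparison reported in the Results, the sketch below filters a "root-specific" gene subset and computes the relative gain implied by the two reported R² values for crown root diameter. It is not the authors' pipeline: the expression threshold, the gene names, the function names, and the reading of R² as a squared Pearson correlation between observed and predicted phenotypes are all assumptions made for illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of predictive ability)."""
    mo, mp = mean(observed), mean(predicted)
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)


def root_specific_genes(root_expr, leaf_expr, threshold=1.0):
    """Genes expressed in roots but not in leaves.

    `threshold` is a hypothetical expression cutoff, not a value from the paper.
    """
    return sorted(
        gene
        for gene, level in root_expr.items()
        if level >= threshold and leaf_expr.get(gene, 0.0) < threshold
    )


def relative_gain(r2_all, r2_subset):
    """Relative change in R² when a gene subset replaces all genes."""
    return (r2_subset - r2_all) / r2_all


# Hypothetical mean expression levels for three made-up genes.
root_expr = {"geneA": 5.2, "geneB": 3.1, "geneC": 0.2}
leaf_expr = {"geneA": 0.1, "geneB": 4.0, "geneC": 0.3}
print(root_specific_genes(root_expr, leaf_expr))  # ['geneA']

# R² values reported in the abstract for crown root diameter.
print(f"{relative_gain(0.59, 0.66):.1%}")  # 11.9%
```

The relative gain of about 11.9% is consistent with the "more than 10%" improvement stated above; in absolute terms the difference is 0.07 in R².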

Published in BMC Genomics

ISSN: 1471-2164 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Technology: Chemical technology: Biotechnology; Science: Biology (General): Genetics
Website: http://bmcgenomics.biomedcentral.com
