High-quality faba bean reference transcripts generated using PacBio and Illumina RNA-seq data

Na Zhao; Enqiang Zhou; Yamei Miao; Dong Xue; Yongqiang Wang; Kaihua Wang; Chunyan Gu; Mengnan Yao; Yao Zhou; Bo Li; Xuejun Wang; Libin Wei

doi:10.1038/s41597-024-03204-4

Scientific Data (Apr 2024)

Affiliations

Na Zhao: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Enqiang Zhou: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Yamei Miao: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Dong Xue: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Yongqiang Wang: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Kaihua Wang: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Chunyan Gu: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Mengnan Yao: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Yao Zhou: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Bo Li: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Xuejun Wang: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science
Libin Wei: Department of Economic Crops, Jiangsu Yanjiang Institute of Agricultural Science

Journal volume & issue: Vol. 11, no. 1
pp. 1 – 7

Abstract

The genome of faba bean was first published in 2023. To promote future molecular breeding studies, we improved the quality of the faba bean genome using high-density genetic maps and Illumina and PacBio RNA-seq datasets. Two high-density genetic maps were used to order and orient the faba bean scaffolds, resulting in an increase in chromosome length (14.28 Mbp) and a reduction of 45 in the number of scaffolds. In gene model mining and optimisation, the PacBio and Illumina RNA-seq datasets from 37 samples allowed for the identification and correction of 121,606 transcripts and facilitated the prediction of 15,640 alternative splicing events, 2,148 lncRNAs, and 1,752 fusion transcripts, giving a clearer picture of the gene structures in the faba bean genome. Moreover, a total of 38,850 new genes, comprising 56,188 transcripts, were identified relative to the reference genome. Finally, these data were integrated with the reference genome annotation to form a comprehensive faba bean transcriptome of 103,267 transcripts derived from 54,753 unigenes.

Published in Scientific Data

ISSN: 2052-4463 (Online)
Publisher: Nature Portfolio
Country of publisher: United Kingdom
LCC subjects: Science
Website: https://www.nature.com/sdata/
