Scientific Data (Aug 2023)

Genome assembly of two diploid and one auto-tetraploid Cyclocarya paliurus genomes

  • Yinquan Qu,
  • Xulan Shang,
  • Shengzuo Fang,
  • Xingtan Zhang,
  • Xiangxiang Fu

DOI
https://doi.org/10.1038/s41597-023-02402-w
Journal volume & issue
Vol. 10, no. 1
pp. 1 – 10

Abstract

Cyclocarya paliurus, an endemic species in the family Juglandaceae characterized by heterodichogamy, is one of the triterpene-rich medicinal plants of China. To uncover the genetic mechanisms behind these characteristics, we sequenced two diploid (protandry, PA-dip, and protogyny, PG-dip) and one auto-tetraploid (PA-tetra) C. paliurus genomes. Based on 134.9 Gb (~225x), 75.5 Gb (~125x), and 271.8 Gb (~226x) of PacBio subreads, we assembled 586.62 Mb (contig N50 = 1.9 Mb), 583.45 Mb (contig N50 = 1.4 Mb), and 2.38 Gb (contig N50 = 430.9 kb) for the PA-dip, PG-dip, and PA-tetra genomes, respectively. Furthermore, 543.53, 553.87, and 2168.65 Mb of the PA-dip, PG-dip, and PA-tetra assemblies were anchored to 16, 16, and 64 pseudo-chromosomes, respectively, using over 65.4 Gb (~109x), 68 Gb (~113x), and 264 Gb (~220x) of Hi-C sequencing data. Annotation of the PA-dip, PG-dip, and PA-tetra genome assemblies identified 34,699, 35,221, and 34,633 protein-coding or allele-defined genes, respectively, the last corresponding to 90,752 gene models. In addition, 45 accessions from nine locations were re-sequenced, generating reads at more than 10x coverage.
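
The assembled and anchored sizes quoted in the abstract imply an anchoring rate for each genome. The abstract itself reports no such rate, so the short Python sketch below is only an illustrative calculation from the stated figures. The dictionary layout and the function name `anchoring_rate` are hypothetical, and the PA-tetra assembly size is taken as 2380 Mb from the rounded 2.38 Gb value, so its result is approximate.

```python
# Sizes in Mb, copied from the abstract. The PA-tetra assembly size is the
# rounded 2.38 Gb figure converted to Mb (an assumption of this sketch).
ASSEMBLIES = {
    "PA-dip": {"assembled_mb": 586.62, "anchored_mb": 543.53},
    "PG-dip": {"assembled_mb": 583.45, "anchored_mb": 553.87},
    "PA-tetra": {"assembled_mb": 2380.0, "anchored_mb": 2168.65},
}


def anchoring_rate(assembled_mb: float, anchored_mb: float) -> float:
    """Return the percentage of the assembly anchored to pseudo-chromosomes."""
    if assembled_mb <= 0:
        raise ValueError("assembled size must be positive")
    return 100.0 * anchored_mb / assembled_mb


# Percentage anchored for each genome, keyed by the abstract's genome labels.
rates = {name: anchoring_rate(**sizes) for name, sizes in ASSEMBLIES.items()}

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.2f}% of the assembly anchored")
```

Under these assumptions, all three assemblies come out at more than 90% anchored.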