Scientific Data (Aug 2023)
Genome assembly of two diploid and one auto-tetraploid Cyclocarya paliurus genomes
Abstract
Abstract Cyclocarya paliurus, an endemic species in the genus Juglandaceae with the character of heterodichogamy, is one of triterpene-rich medicinal plants in China. To uncover the genetic mechanisms behind the special characteristics, we sequenced the genomes of two diploid (protandry, PA-dip and protogyny, PG-dip) and one auto-tetraploid (PA-tetra) C. paliurus genomes. Based on 134.9 (~225x), 75.5 (~125x) and 271.8 Gb (~226x) subreads of PacBio platform sequencing data, we assembled 586.62 Mb (contig N50 = 1.9 Mb), 583.45 Mb (contig N50 = 1.4 Mb), and 2.38 Gb (contig N50 = 430.9 kb) for PA-dip, PG-dip and PA-tetra genome, respectively. Furthermore, 543.53, 553.87, and 2168.65 Mb in PA-dip, PG-dip, and PA-tetra, were respectively anchored to 16, 16, and 64 pseudo-chromosomes using over 65.4 Gb (~109x), 68 Gb (~113x), and 264 (~220x) Hi-C sequencing data. Annotation of PA-dip, PG-dip, and PA-tetra genome assembly identified 34,699, 35,221, and 34,633 protein-coding genes (90,752 gene models) or allele-defined genes, respectively. In addition, 45 accessions from nine locations were re-sequenced, and more than 10 × coverage reads were generated.