Scientific Data (Jul 2024)
Telomere-to-telomere genome assembly of the goose Anser cygnoides
Abstract
We present a high-quality, telomere-to-telomere (T2T) genome assembly of the Taihu goose. The assembly combines Pacific Biosciences HiFi reads, Oxford Nanopore long reads, Illumina short reads, and chromatin conformation capture (Hi-C) data. It has a total length of 1,197,991,206 bp, with a contig N50 of 33,928,929 bp and a scaffold N50 of 81,007,908 bp. It comprises 73 scaffolds, including 38 autosomes and one pair of Z/W sex chromosomes. Notably, 33 autosomes were assembled without any gaps. Gene annotation identified 34,898 genes with 436,162 RNA transcripts, encompassing 806,158 exons, 743,910 introns, 651,148 coding sequences (CDS), and 135,622 untranslated regions (UTRs). This T2T-level, chromosome-scale goose genome assembly provides a vital foundation for future genetic improvement and for understanding the genetic mechanisms underlying important traits in geese.
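For readers unfamiliar with the contiguity metric reported above, N50 is conventionally defined as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following minimal Python sketch illustrates that definition. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration; they are not the goose assembly data, and this is not necessarily the tool the authors used to compute their statistics.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    # Walk from the longest sequence to the shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches
        # half of the total assembly length is the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths (not real data): total = 300, half = 150.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_lengths)  # 80 + 70 = 150 reaches half, so N50 = 70
```

Applied to contig lengths this gives the contig N50, and applied to scaffold lengths it gives the scaffold N50; the same function serves both because only the input list changes.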