Scientific Data (Jun 2024)

Chromosome-level genome assembly and annotation of a potential model organism Gossypium arboreum ZB-1

  • Rongnan Sun,
  • Yuqing Wu,
  • Xinyu Zhang,
  • Minghua Lv,
  • Dongliang Yu,
  • Yuqiang Sun

DOI
https://doi.org/10.1038/s41597-024-03481-z
Journal volume & issue
Vol. 11, no. 1
pp. 1–12

Abstract

Recent advances in plant regeneration and synthetic polyploid creation have been reported for Gossypium arboreum ZB-1, making it a potential model within the genus Gossypium for investigating gene function and polyploidy. This work presents the sequence and annotation of the ZB-1 genome. The contig-level assembly was constructed from PacBio high-fidelity (HiFi) reads and comprises 81 contigs with an N50 length of 112.12 Mb. Hi-C data were then used to build the chromosome-level assembly, which consists of 13 pseudo-chromosomes and 39 unanchored contigs with a total length of about 1.67 Gb. Repetitive sequences account for about 69.7% of the genome length. Combining ab initio and evidence-based prediction, we identified 48,021 protein-coding genes in the ZB-1 genome. Comparative genomic analysis revealed conserved gene content and arrangement between ZB-1 and G. arboreum SXY1, with a single nucleotide polymorphism (SNP) rate of about 0.54 per 1,000 nucleotides between the two. This study enriches the genomic resources available for further exploration of cotton regeneration and polyploidy mechanisms.
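The abstract reports contiguity as an N50 length. As a reminder of what that statistic means, the following is a minimal sketch of the standard N50 definition: the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length. The function name `n50` and the contig lengths in the usage lines are hypothetical illustrations, not values or code from the ZB-1 study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L of the contig at which the running total,
    accumulated from the longest contig downward, first reaches
    at least half of the summed length of all contigs.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if running * 2 >= total:
            return length


# Hypothetical toy contig lengths (arbitrary units), NOT ZB-1 data.
toy_contigs = [10, 8, 6, 4, 2]
print(n50(toy_contigs))
```

In the toy example the total length is 30, so the half-way point of 15 is first reached when the second-longest contig is added. A high N50 relative to chromosome size, as reported for the ZB-1 contig set, indicates that most of the assembly sits in very long contigs.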