Scientific Data (Jan 2024)

Chromosome-level assembly of Triplophysa yarkandensis genome based on the single molecule real-time sequencing

  • Jiacheng She,
  • Shengao Chen,
  • Xuan Liu,
  • Bin Huo

DOI
https://doi.org/10.1038/s41597-023-02900-x
Journal volume & issue
Vol. 11, no. 1
pp. 1 – 7

Abstract

Triplophysa yarkandensis, a freshwater fish endemic to Xinjiang, China, is currently classified as endangered. This study aimed to obtain a chromosome-level genome of T. yarkandensis using PacBio and Hi-C techniques. PacBio sequencing yielded an assembly of 520.64 Mb with a contig N50 of 1.30 Mb. Hi-C data were then used for chromosome mapping, yielding 25 chromosome sequences. The chromosome mapping rate was 93%, with a scaffold N50 of 19.14 Mb and a BUSCO completeness of 94.1%. The genome of T. yarkandensis contains 25,505 predicted protein-coding genes, with a total of 30,673 predicted proteins. BUSCO completeness for the predicted protein-coding genes was 91.5%. Repetitive sequences account for 27.29% of the genome's total length. Future comparative genomics research will be important for elucidating the molecular mechanisms of saline-alkali adaptation and for conserving biological resources.
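For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of how N50 is conventionally computed from a list of contig or scaffold lengths. It is an illustration only, not the pipeline used in the study; the function name `n50` and the toy lengths are hypothetical and are not data from the assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # crossed the 50% mark
            return length


# Hypothetical toy contig lengths (arbitrary units), not from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150, half of the 300 total
```

Applied to real data, the same calculation would take the contig lengths (for contig N50) or the scaffold lengths after Hi-C anchoring (for scaffold N50) as input.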