Data (Nov 2022)

Reference-Guided Draft Genome Assembly, Annotation and SSR Mining Data of the Peruvian Creole Cattle (<i>Bos taurus</i>)

  • Richard Estrada,
  • Flor-Anita Corredor,
  • Deyanira Figueroa,
  • Wilian Salazar,
  • Carlos Quilcate,
  • Héctor V. Vásquez,
  • Jorge L. Maicelo,
  • Jhony Gonzales,
  • Carlos I. Arbizu

DOI
https://doi.org/10.3390/data7110155
Journal volume & issue
Vol. 7, no. 11
p. 155

Abstract

The Peruvian creole cattle (PCC) is a neglected breed and an essential livestock resource in the Andean region of Peru. Developing a modern breeding program and conservation strategies for the PCC requires a better understanding of the breed's genetics. We sequenced the whole genome of the PCC on the Illumina HiSeq 2500 platform with a paired-end 150 bp strategy, obtaining 320 GB of sequencing data, and assembled it de novo. Reference-guided scaffolding was then used to improve the draft genome. The resulting PCC assembly was 2.81 Gb in size, with a contig N50 of 108 Mb and 92.59% complete BUSCOs. This genome size is similar to that of the reference genomes of <i>Bos taurus</i> and <i>B. indicus</i>. Repetitive DNA accounted for 40.22% of the genome assembly, with retroelements occupying 32.39% of the total genome. A total of 19,803 protein-coding genes were annotated in the PCC genome. SSR mining yielded statistics similar to those reported for other breeds. The PCC genome will contribute to a better understanding of the genetics of this species and its adaptation to the harsh conditions of the Andean ecosystem.
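
For readers unfamiliar with the N50 statistic quoted in the abstract, the following minimal sketch shows how it is conventionally computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are hypothetical and chosen only for illustration. This is not the authors' pipeline, and the values are not taken from the PCC assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (hypothetical helper)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    # Sort from longest to shortest so the largest sequences are counted first.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first sequence that brings the running total to at least
        # half of the assembly length defines the N50.
        if running >= half_total:
            return length


# Toy contig lengths (illustrative only, not PCC data); total length is 300.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

Because N50 depends only on the length distribution, it describes assembly contiguity and not correctness, which is one reason it is usually reported alongside a completeness measure such as BUSCO.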

Keywords