مجله بیوتکنولوژی کشاورزی (Agricultural Biotechnology Journal) (Nov 2023)

Identification of genomic variations of Azeri buffaloes using whole genome sequencing

  • Milad Hosseini,
  • Hossein Moradi Shahrbabak,
  • Mohamad Moradi Shahrbabak

DOI
https://doi.org/10.22103/jab.2023.21310.1474
Journal volume & issue
Vol. 15, no. 4
pp. 227 – 238

Abstract

Objective: Advances in next-generation sequencing technologies have made whole-genome sequencing more efficient and economical than ever, providing the opportunity to discover polymorphisms throughout the genomes of organisms. The Azeri buffalo is one of the most important breeds in Iran; it is distributed from the north to the northwest of the country and is fully adapted to the environmental conditions of this region. The main purpose of this study was to describe and categorize the genomic diversity of Iranian buffaloes. Characterizing the effects of these variations on different genomic regions of Azeri buffaloes also has potential applications in breeding programs.
Materials and Methods: Whole-genome sequencing of five Azeri buffaloes native to Iran was performed on an Illumina sequencing platform. Data quality was assessed with FastQC. BWA-MEM was used to align the reads to the reference genome. Variants were then identified with freebayes, and SnpEff was used to annotate the effects of the variants, reporting their type, location, and number. More than 97.5% of the high-quality reads aligned to the reference genome in all five samples, which indicates the high quality of the short reads. Coverage in the sequenced samples ranged from 4x to 12.8x.
Results: In total, 76,298,858 variants were identified, including 57,921,822 SNPs, 6,162,328 indels, 10,534,042 MNPs, and 1,680,666 MIXED variants. Small insertions and deletions had a minimum length of 1 bp, a maximum length of 28 bp, and an average length of 1.39 bp. The highest numbers of variants were observed in intergenic regions (53,789,879; 62.022%), followed by introns (24,003,682; 27.677%), and the lowest number was detected in exons.
Conclusions: Identifying genome-level variations such as SNPs, small insertions and deletions, and multi-nucleotide polymorphisms in different populations provides a valuable resource for genetic research and can help locate genomic segments responsible for economically important traits. The large number of SNPs also enables genome-wide association studies in animals. In addition, the genomic variations identified in the present study can be used to develop high-density SNP arrays for Iranian breeds for genetic and breeding applications.
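
The per-class variant counts reported in the Results can be cross-checked against the stated total with a few lines of arithmetic. The following is a minimal sketch, not part of the authors' analysis: the counts are copied from the abstract, while the names `variant_counts`, `reported_total`, and `class_shares` are hypothetical and introduced only for illustration.

```python
# Variant counts by class, copied from the Results of the abstract.
# The dictionary and function names are illustrative assumptions.
variant_counts = {
    "SNP": 57_921_822,
    "indel": 6_162_328,
    "MNP": 10_534_042,
    "MIXED": 1_680_666,
}

# Total number of variants as stated in the abstract.
reported_total = 76_298_858


def class_shares(counts):
    """Return each variant class as a percentage of the summed counts."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


computed_total = sum(variant_counts.values())
shares = class_shares(variant_counts)

if __name__ == "__main__":
    print(f"computed total: {computed_total:,}")
    print(f"matches reported total: {computed_total == reported_total}")
    for name, pct in shares.items():
        print(f"{name}: {pct:.2f}% of variants")
```

The four classes sum to the reported total. The region counts for intergenic and intronic variants are deliberately left out of this check, because the abstract does not state the denominator behind their percentages.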

Keywords