GigaByte (Oct 2024)

SMARTER-database: a tool to integrate SNP array datasets for sheep and goat breeds

  • Paolo Cozzi,
  • Arianna Manunza,
  • Johanna Ramirez-Diaz,
  • Valentina Tsartsianidou,
  • Konstantinos Gkagkavouzis,
  • Pablo Peraza,
  • Anna Maria Johansson,
  • Juan José Arranz,
  • Fernando Freire,
  • Szilvia Kusza,
  • Filippo Biscarini,
  • Lucy Peters,
  • Gwenola Tosser-Klopp,
  • Gabriel Ciappesoni,
  • Alexandros Triantafyllidis,
  • Rachel Rupp,
  • Bertrand Servin,
  • Alessandra Stella

DOI
https://doi.org/10.46471/gigabyte.139

Abstract

Underutilized sheep and goat breeds can adapt to challenging environments due to their genetics. Integrating publicly available genomic datasets with new data will facilitate genetic diversity analyses; however, this process is complicated by data discrepancies, such as outdated assembly versions or different data formats. Here, we present the SMARTER-database, a collection of tools and scripts to standardize genomic data and metadata, mainly from SNP chip arrays on global small ruminant populations, with a focus on reproducibility. SMARTER-database harmonizes genotypes for about 12,000 sheep and 6,000 goats to a uniform coding and assembly version. Users can access the genotype data via File Transfer Protocol and interact with the metadata through a web interface or using their custom scripts, enabling efficient filtering and selection of samples. These tools will empower researchers to focus on the crucial aspects of adaptation and contribute to livestock sustainability, leveraging the rich dataset provided by the SMARTER-database.

Availability and implementation

The code is available as open-source software under the MIT license at https://github.com/cnr-ibba/SMARTER-database.
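As an illustration of the kind of custom script the abstract mentions for filtering and selecting samples, the following is a minimal sketch in Python. It does not use the SMARTER-database API or its real metadata schema: the `SampleRecord` fields, the `select_samples` helper, and the records themselves are hypothetical and only show the general idea of narrowing a metadata table down to a list of sample identifiers.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical metadata record; field names are assumptions."""
    sample_id: str
    species: str
    breed: str
    country: str


def select_samples(
    records: Iterable[SampleRecord],
    species: Optional[str] = None,
    country: Optional[str] = None,
) -> List[SampleRecord]:
    """Return the records that match every filter that was supplied."""
    selected = []
    for record in records:
        if species is not None and record.species != species:
            continue
        if country is not None and record.country != country:
            continue
        selected.append(record)
    return selected


# Made-up records used only to exercise the filter.
RECORDS = [
    SampleRecord("EXAMPLE-0001", "Sheep", "Breed A", "Italy"),
    SampleRecord("EXAMPLE-0002", "Goat", "Breed B", "Italy"),
    SampleRecord("EXAMPLE-0003", "Goat", "Breed C", "Greece"),
    SampleRecord("EXAMPLE-0004", "Sheep", "Breed D", "Uruguay"),
]

# Select goats sampled in Italy and keep only their identifiers.
italian_goats = select_samples(RECORDS, species="Goat", country="Italy")
sample_ids = [record.sample_id for record in italian_goats]
print(sample_ids)
```

A list of identifiers like `sample_ids` could then be used to subset a genotype file with whatever tool the analysis already relies on; that step is outside the scope of this sketch.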