BMC Genomics (Mar 2021)

Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome

  • Richard J. Edwards,
  • Matt A. Field,
  • James M. Ferguson,
  • Olga Dudchenko,
  • Jens Keilwagen,
  • Benjamin D. Rosen,
  • Gary S. Johnson,
  • Edward S. Rice,
  • La Deanna Hillier,
  • Jillian M. Hammond,
  • Samuel G. Towarnicki,
  • Arina Omer,
  • Ruqayya Khan,
  • Ksenia Skvortsova,
  • Ozren Bogdanovic,
  • Robert A. Zammit,
  • Erez Lieberman Aiden,
  • Wesley C. Warren,
  • J. William O. Ballard

DOI
https://doi.org/10.1186/s12864-021-07493-6
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 19

Abstract

Background: Basenjis are considered an ancient dog breed of central African origin that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact of the choice of reference genome with regard to reference genome quality and breed relatedness.

Results: Here, we report two high-quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high-quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection.

Conclusions: The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are better suited to a pangenome approach. Collectively, this work highlights the importance of the choice of reference genome in all variation studies.

Keywords