Genome Biology (Sep 2021)

Pandora: nucleotide-resolution bacterial pan-genomics with reference graphs

  • Rachel M. Colquhoun
  • Michael B. Hall
  • Leandro Lima
  • Leah W. Roberts
  • Kerri M. Malone
  • Martin Hunt
  • Brice Letcher
  • Jane Hawkey
  • Sophie George
  • Louise Pankhurst
  • Zamin Iqbal

DOI
https://doi.org/10.1186/s13059-021-02473-1
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 30

Abstract

We present pandora, a novel pan-genome graph structure and algorithms for identifying variants across the full bacterial pan-genome. As much bacterial adaptability hinges on the accessory genome, methods which analyze SNPs in just the core genome have unsatisfactory limitations. Pandora approximates a sequenced genome as a recombinant of references, detects novel variation and pan-genotypes multiple samples. Using a reference graph of 578 Escherichia coli genomes, we compare 20 diverse isolates. Pandora recovers more rare SNPs than single-reference-based tools, is significantly better than picking the closest RefSeq reference, and provides a stable framework for analyzing diverse samples without reference bias.

Keywords