Genome Biology (Jul 2023)

vamos: variable-number tandem repeats annotation using efficient motif sets

  • Jingwen Ren,
  • Bida Gu,
  • Mark J. P. Chaisson

DOI
https://doi.org/10.1186/s13059-023-03010-y
Journal volume & issue
Vol. 24, no. 1
pp. 1 – 18

Abstract

Roughly 3% of the human genome is composed of variable-number tandem repeats (VNTRs): arrays of motifs of at least six bases. These loci are highly polymorphic, yet current approaches that define and merge variants based on alignment breakpoints do not capture their full diversity. Here we present vamos (VNTR Annotation using efficient Motif Sets), a method that instead annotates VNTRs by their repeat composition under different levels of motif diversity. Applied to 74 haplotype-resolved human assemblies, vamos estimates 7.4–16.7 alleles per locus, compared with 4.0–5.5 alleles per locus for breakpoint-based approaches.

Keywords