BMC Bioinformatics (Jul 2024)

Orthanq: transparent and uncertainty-aware haplotype quantification with application in HLA-typing

  • Hamdiye Uzuner,
  • Annette Paschen,
  • Dirk Schadendorf,
  • Johannes Köster

DOI
https://doi.org/10.1186/s12859-024-05832-4
Journal volume & issue
Vol. 25, no. 1
pp. 1 – 18

Abstract

Background: Identification of human leukocyte antigen (HLA) types from DNA-sequenced human samples is important in organ transplantation and cancer immunotherapy, and it remains challenging because of the sequence homology and extreme polymorphism of HLA genes.

Results: We present Orthanq, a novel statistical model and corresponding application for transparent and uncertainty-aware quantification of haplotypes. We use our approach to perform HLA typing while, for the first time, reporting the uncertainty of predictions and transparently observing mutations beyond reported HLA types. Using 99 gold standard samples from the 1000 Genomes, Illumina Platinum Genomes and Genome in a Bottle projects, we show that Orthanq can provide overall superior accuracy and shorter runtimes than state-of-the-art HLA typers.

Conclusions: Orthanq is the first approach that can directly use existing pangenome alignments and type all HLA loci. Moreover, it can be generalized to uses beyond HLA typing, e.g. virus lineage quantification. Orthanq is available at https://orthanq.github.io.

Keywords