Evaluation of imputation performance of multiple reference panels in a Pakistani population
Jiayi Xu,
Dongjing Liu,
Arsalan Hassan,
Giulio Genovese,
Alanna C. Cote,
Brian Fennessy,
Esther Cheng,
Alexander W. Charney,
James A. Knowles,
Muhammad Ayub,
Roseann E. Peterson,
Tim B. Bigdeli,
Laura M. Huckins
Affiliations
Jiayi Xu
Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA; Corresponding author
Dongjing Liu
Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Arsalan Hassan
University of Peshawar, Khyber Pakhtunkhwa, Peshawar 25120, Pakistan; Institute of Omics and Health Research, Lahore, Pakistan
Giulio Genovese
Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Stanley Center, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
Alanna C. Cote
Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Brian Fennessy
Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Esther Cheng
Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Alexander W. Charney
Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
James A. Knowles
The Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ 08854, USA
Muhammad Ayub
University College London, London WC1E 6BT, UK
Roseann E. Peterson
Department of Psychiatry and Behavioral Sciences, Institute for Genomics in Health, State University of New York Downstate Health Sciences University, Brooklyn, NY 11203, USA
Tim B. Bigdeli
Department of Psychiatry and Behavioral Sciences, Institute for Genomics in Health, State University of New York Downstate Health Sciences University, Brooklyn, NY 11203, USA
Laura M. Huckins
Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA; Corresponding author
Summary: Genotype imputation is crucial for genome-wide association studies (GWASs), but reference panels and existing benchmarking studies prioritize European individuals. Consequently, it is unclear which publicly available reference panel should be used for Pakistani individuals, and whether the ancestry composition or the sample size of a panel matters more for imputation accuracy. We compared reference panels for imputing genotype data in 1,814 Pakistani individuals and found that meta-imputation combining TOPMed and the expanded 1000 Genomes (ex1KG) reference gave the best balance of accuracy and coverage. For common variants, ex1KG imputed more accurately than TOPMed despite its 30-fold smaller sample size, supporting efforts to build future reference panels from diverse populations.