Journal of Biostatistics and Epidemiology (Oct 2022)

Non-parametric MCMC Gibbs sampler approach and misclassification assessment of estimating haplotype frequencies among related statistical approaches

  • Gie Ken-Dror

DOI
https://doi.org/10.18502/jbe.v8i3.12304
Journal volume & issue
Vol. 8, no. 3

Abstract

Introduction: Haplotype analysis allows higher-resolution analysis in genetic association studies, and haplotypes are used as reference panels for genotype imputation in genome-wide association studies. Haplotypes are estimated from the genotypes of unrelated individuals, but misclassification in the haplotype reconstruction directly affects the accuracy of the results. Methods: This study proposes a novel Gibbs sampler algorithm to estimate haplotype frequencies and to quantify the influence of misclassification bias in the estimated haplotypes. The performance of the algorithm is evaluated on simulated datasets in which the linkage phase is assumed unknown. The simulations used different minor allele frequencies at each single nucleotide polymorphism (SNP) and different levels of linkage disequilibrium between the SNPs. Results: Compared with previous related statistical approaches, the Gibbs sampler algorithm shows higher accuracy for seven SNPs or fewer, is validated, and handles missing genotypes. Misclassification of estimated haplotypes leads to non-differential bias in exposure and affects haplotype estimates in haplotype analysis. The observed odds ratio underestimates the association between haplotype and phenotype by 36% to 99%. Conclusion: The Gibbs sampler algorithm provides higher accuracy and robust performance, handles missing genotypes, and provides probabilities that express the uncertainty of the haplotype frequencies. The misclassification bias in the estimated haplotypes leads to underestimation of the genetic association by more than forty percent.
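The attenuation described in the Results can be illustrated with a small worked example. The sketch below is not the study's method or data: the 2x2 counts, the sensitivity and specificity values, and the function names are all hypothetical, chosen only to show how misclassifying a haplotype "exposure" at the same rate in cases and controls pulls the observed odds ratio toward 1.

```python
def odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls):
    """Odds ratio of a 2x2 exposure-by-outcome table."""
    return (exp_cases * unexp_controls) / (unexp_cases * exp_controls)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts when exposure status is misclassified.

    sensitivity: probability a truly exposed subject is labelled exposed.
    specificity: probability a truly unexposed subject is labelled unexposed.
    """
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed


# Hypothetical true counts (carriers, non-carriers of a risk haplotype).
true_cases = (300, 700)
true_controls = (150, 850)

# Hypothetical error rates, identical in cases and controls,
# which is what makes the misclassification non-differential.
sensitivity, specificity = 0.8, 0.9

true_or = odds_ratio(*true_cases, *true_controls)

obs_cases = misclassify(*true_cases, sensitivity, specificity)
obs_controls = misclassify(*true_controls, sensitivity, specificity)
observed_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true OR:     {true_or:.2f}")      # 2.43
print(f"observed OR: {observed_or:.2f}")  # 1.74
```

With these assumed inputs the true odds ratio of about 2.43 is observed as about 1.74. The direction of the bias (toward the null) matches the abstract's claim; the size depends entirely on the assumed error rates and haplotype frequencies.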

Keywords