G3: Genes, Genomes, Genetics (Mar 2023)
Avoiding misleading estimates using mtDNA heteroplasmy statistics to study bottleneck size and selection
Abstract
Mitochondrial DNA heteroplasmy samples can shed light on vital developmental and genetic processes shaping mitochondrial DNA populations. The sample mean and sample variance of a set of heteroplasmy observations are typically used both to estimate bottleneck sizes and to perform fits to the theoretical “Kimura” distribution in seeking evidence for mitochondrial DNA selection. However, each of these applications raises problems. Sample statistics do not generally provide optimal fits to the Kimura distribution and so can give misleading results in hypothesis testing, including false positive signals of selection. Using sample variance can give misleading results for bottleneck size estimates, particularly for small samples. These issues can and do lead to false positive results for mitochondrial DNA mechanisms: of the published experimental datasets we re-analyzed that had been reported as departing from the Kimura model, none in fact gives evidence for such a departure. Here we outline a maximum likelihood approach that is simple to implement computationally and addresses all of these issues. We advocate the use of maximum likelihood fits and explicit hypothesis tests, not fits and Kolmogorov–Smirnov tests via summary statistics, for ongoing work with mitochondrial DNA heteroplasmy.
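To illustrate why a variance-based bottleneck estimate can be fragile for small samples, the following is a minimal sketch rather than the authors' method. It assumes the commonly used single-step relation in which the effective bottleneck size is the reciprocal of the normalised heteroplasmy variance, n ≈ 1 / V'(h), with V'(h) = Var(h) / (mean(h) (1 - mean(h))). The function names, the example heteroplasmy values, and the percentile bootstrap used to show the spread are all illustrative assumptions.

```python
import random
import statistics


def normalised_variance(h):
    """Sample variance of heteroplasmy divided by mean * (1 - mean)."""
    m = statistics.mean(h)
    return statistics.variance(h) / (m * (1.0 - m))


def bottleneck_estimate(h):
    """Assumed single-step relation: n is roughly 1 / V'(h)."""
    return 1.0 / normalised_variance(h)


def bootstrap_interval(h, n_boot=2000, seed=1):
    """Percentile bootstrap (2.5%, 97.5%) for the variance-based estimate.

    Resamples with zero variance, or with mean exactly 0 or 1, are skipped
    because the estimate is undefined for them.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(h) for _ in h]
        m = statistics.mean(resample)
        v = statistics.variance(resample)
        if v <= 0.0 or m <= 0.0 or m >= 1.0:
            continue
        estimates.append(m * (1.0 - m) / v)
    estimates.sort()
    k = len(estimates)
    lower = estimates[int(0.025 * k)]
    upper = estimates[min(k - 1, int(0.975 * k))]
    return lower, upper


# Hypothetical heteroplasmy fractions from five samples (illustrative only).
h = [0.1, 0.2, 0.3, 0.4, 0.5]
point = bottleneck_estimate(h)
lower, upper = bootstrap_interval(h)
print(f"point estimate: {point:.2f}, bootstrap interval: ({lower:.2f}, {upper:.2f})")
```

With only a handful of observations the bootstrap interval is typically wide relative to the point estimate, which is the kind of uncertainty a single variance-based number hides.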