BMC Evolutionary Biology (Sep 2017)
Appropriate homoplasy metrics in linked SSRs to predict an underestimation of demographic expansion times
Abstract
Abstract Background Homoplasy affects demographic inference estimates. This effect has been recognized and corrective methods have been developed. However, no studies so far have defined what homoplasy metrics best describe the effects on demographic inference, or have attempted to estimate such metrics in real data. Here we study how homoplasy in chloroplast microsatellites (cpSSR) affects inference of population expansion time. cpSSRs are popular markers for inferring historical demography in plants due to their high mutation rate and limited recombination. Results In cpSSRs, homoplasy is usually quantified as the probability that two markers or haplotypes that are identical by state are not identical by descent (Homoplasy index, P). Here we propose a new measure of multi-locus homoplasy in linked SSR called Distance Homoplasy (DH), which measures the proportion of pairwise differences not observed due to homoplasy, and we compare it to P and its per cpSSR locus average, which we call Mean Size Homoplasy (MSH). We use simulations and analytical derivations to show that, out of the three homoplasy metrics analyzed, MSH and DH are more correlated to changes in the population expansion time and to the underestimation of that demographic parameter using cpSSR. We perform simulations to show that Approximate Bayesian Computation (ABC) can be used to obtain reasonable estimates of MSH and DH. Finally, we use ABC to estimate the expansion time, MSH and DH from a chloroplast SSR dataset in Pinus caribaea. To our knowledge, this is the first time that homoplasy has been estimated in population genetic data. Conclusions We show that MSH and DH should be used to quantify how homoplasy affects estimates of population expansion time. We also demonstrate how ABC provides a methodology to estimate homoplasy in population genetic data.
Keywords