BMC Bioinformatics (Oct 2019)

Legofit: estimating population history from genetic data

  • Alan R. Rogers

DOI
https://doi.org/10.1186/s12859-019-3154-1
Journal volume & issue
Vol. 20, no. 1
pp. 1 – 10

Abstract

Background: Our current understanding of archaic admixture in humans relies on statistical methods with large biases, whose magnitudes depend on the sizes and separation times of ancestral populations. To avoid these biases, it is necessary to estimate these parameters simultaneously with those describing admixture. Genetic estimates of population histories also confront problems of statistical identifiability: different models or different combinations of parameter values may fit the data equally well. To deal with this problem, we need methods of model selection and model averaging, which are lacking from most existing software.

Results: The Legofit software package allows simultaneous estimation of parameters describing admixture, and the sizes and separation times of ancestral populations. It includes facilities for data manipulation, estimation, analysis of residuals, model selection, and model averaging.

Conclusions: Legofit uses genetic data to study the history of a subdivided population. It is unaffected by recent history and can therefore focus on the deep history of population size, subdivision, and admixture. It outperforms several statistical methods that have been widely used to study population history and should be useful in any species for which DNA sequence data is available from several populations.

Keywords