Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
Simon Aeschbacher
Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland; Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
Alexandre Thiéry
Computational and Molecular Population Genetics, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
Disentangling the effects of natural selection on genomic diversity from those of demography is notoriously difficult, yet necessary to reconstruct the history of species accurately. Here, we use high-quality human genomic data to show that purifying selection at linked sites (i.e. background selection, BGS) and GC-biased gene conversion (gBGC) together affect as much as 95% of the variants in our genome. We find that the magnitude and relative importance of BGS and gBGC are largely determined by variation in recombination rate and base composition. Importantly, synonymous sites and non-transcribed regions are also affected, albeit to different degrees, so their use for demographic inference can lead to strong biases. However, by conditioning on genomic regions with recombination rates above 1.5 cM/Mb and on specific mutation types (C↔G, A↔T), we identify a set of SNPs that is mostly unaffected by BGS or gBGC, and that avoids these biases in the reconstruction of human history.
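The SNP selection criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the `Snp` record, its field names, and the assumption that a local recombination rate (in cM/Mb) has already been assigned to each SNP from a genetic map are all hypothetical. Only the 1.5 cM/Mb threshold and the two retained mutation types (C↔G, A↔T) come from the text.

```python
from dataclasses import dataclass

# Mutation types retained in the text: C<->G and A<->T.
# Both leave GC content unchanged, so they are the types expected
# to be unaffected by GC-biased gene conversion.
GBGC_NEUTRAL_PAIRS = {frozenset("CG"), frozenset("AT")}

# Threshold stated in the text (regions with rates above 1.5 cM/Mb).
MIN_RECOMB_RATE_CM_PER_MB = 1.5


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; field names are illustrative assumptions."""
    chrom: str
    pos: int
    ref: str
    alt: str
    recomb_rate: float  # local rate in cM/Mb, assumed precomputed from a genetic map


def is_gbgc_neutral(ref: str, alt: str) -> bool:
    """True if the two alleles form a C<->G or A<->T pair (case-insensitive)."""
    return frozenset((ref.upper(), alt.upper())) in GBGC_NEUTRAL_PAIRS


def keep_snp(snp: Snp, min_rate: float = MIN_RECOMB_RATE_CM_PER_MB) -> bool:
    """Apply both conditions: high local recombination and a retained mutation type."""
    return snp.recomb_rate > min_rate and is_gbgc_neutral(snp.ref, snp.alt)


def filter_snps(snps, min_rate: float = MIN_RECOMB_RATE_CM_PER_MB):
    """Return the SNPs that pass both conditions, preserving input order."""
    return [s for s in snps if keep_snp(s, min_rate)]
```

For example, `filter_snps([Snp("chr1", 100, "C", "G", 2.0), Snp("chr1", 200, "A", "G", 2.0)])` keeps only the first SNP, because the A↔G change is not one of the retained mutation types. Applying the two conditions jointly, rather than one at a time, reflects the text's point that BGS and gBGC must both be avoided.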