BMC Genomics (Aug 2018)
In-solution Y-chromosome capture-enrichment on ancient DNA libraries
Abstract
Background
As most ancient biological samples have low levels of endogenous DNA, it is advantageous to enrich for specific genomic regions prior to sequencing. One approach, in-solution capture-enrichment, retrieves sequences of interest and reduces the fraction of microbial DNA. In this work, we implement a capture-enrichment approach targeting informative regions of the Y chromosome in six human archaeological remains excavated in the Caribbean and dated between 200 and 3000 years BP. We compare the recovery rate of Y-chromosome capture (YCC) alone and of whole-genome capture followed by YCC (WGC + YCC) against non-enriched (pre-capture) libraries.
Results
The six samples show different levels of initial endogenous content, with very low (< 0.05%, 4 samples) or low (0.1–1.54%, 2 samples) percentages of sequenced reads mapping to the human genome. We recover 12–9549 times more targeted unique Y-chromosome sequences after capture, with 0.0–6.2% (WGC + YCC) and 0.0–23.5% (YCC) of the sequence reads on-target, compared to 0.0–0.00003% pre-capture. In samples with endogenous DNA content greater than 0.1%, we found that WGC + YCC yields lower enrichment than YCC alone, owing to the loss of complexity in consecutive capture experiments, whereas in samples with lower endogenous content, the libraries' initial low complexity leads to minor proportions of Y-chromosome reads. Finally, the increased recovery of informative sites enabled us to assign Y-chromosome haplogroups to some of the archaeological remains and to gain insights into their paternal lineages and origins.
Conclusions
To our knowledge, we present the first in-solution capture-enrichment method targeting the human Y chromosome in aDNA sequencing libraries. YCC and WGC + YCC enrichments increase the amount of Y-DNA sequences compared to libraries not enriched for the Y chromosome. Our probe design effectively recovers regions of the Y chromosome bearing phylogenetically informative sites, allowing us to identify paternal lineages with less sequencing than needed for pre-capture libraries. Finally, we recommend considering the endogenous content in the experimental design and avoiding consecutive rounds of capture, as clonality increases considerably with each round.
Keywords