Genome Res. 2022 Nov-Dec;32(11-12):2068-2078.
doi: 10.1101/gr.276728.122. Epub 2022 Dec 14.

Three assays for in-solution enrichment of ancient human DNA at more than a million SNPs

Nadin Rohland et al. Genome Res. 2022 Nov-Dec.

Abstract

The strategy of in-solution enrichment for hundreds of thousands of single-nucleotide polymorphisms (SNPs) has been used to analyze >70% of individuals with genome-scale ancient DNA published to date. This approach makes it economical to study ancient samples with low proportions of human DNA and increases the rate of conversion of sampled remains into interpretable data. So far, nearly all such data have been generated using a set of bait sequences targeting about 1.24 million SNPs (the "1240k reagent"), but synthesis of the reagent has been cost-effective for only a few laboratories. In 2021, two companies, Daicel Arbor Biosciences and Twist Bioscience, made available assays that target the same core set of SNPs along with supplementary content. We test all three assays on a common set of 27 ancient DNA libraries and show that all three are effective at enriching many hundreds of thousands of SNPs. For all assays, one round of enrichment produces data that are as useful as two. In our testing, the "Twist Ancient DNA" assay produces the highest coverages, greatest uniformity on targeted positions, and almost no bias toward enriching one allele more than another relative to shotgun sequencing. We also identify hundreds of thousands of targeted SNPs for which there is minimal allelic bias when comparing 1240k data to either shotgun or Twist data. This facilitates coanalysis of the large data sets that have been generated using 1240k and Twist capture, as well as shotgun sequencing approaches.

Figures

Figure 1.
Characterization of enrichment. (A) Degree of enrichment as a function of distance from 1,150,639 targeted autosomal SNPs (position 0) for the 15 high-coverage libraries at the bottom of Table 1; enrichment at the SNP relative to positions 100 bp away is shown in the legend. (B) Variation in coverage across SNP targets for the same libraries. (C) Proportion of nucleotides that are guanine or cytosine (GC) has a downward bias relative to the unenriched library for Arbor, upward for 1240k, and little bias for Twist Ancient DNA; this analysis uses data from the first 10 libraries in Table 1 with full results from both rounds of capture. (D) All assays preferentially enrich longer molecules, with the least length effect for Twist Ancient DNA (medians in legend, 10 libraries of data). All plots reflect data before removal of duplicated sequences as our goal is to study effectiveness of enrichment on a per-molecule basis.
Figure 2.
Performance of the three assays over a range of sequencing depths. For 10 libraries (five double-stranded [DS] and five single-stranded [SS]) with varying percentages of human sequences before enrichment (0.1%–86.7%), we show the number of unique SNPs covered at different sequencing depths (based on down-sampling). For a typical amount of sequencing of a capture experiment (25 million merged sequences), and after removal of duplicated sequences, the Twist Ancient DNA assay always enriches for more SNPs than the other two assays. For most experiments, more SNPs are retrieved after one round of enrichment than after two. We did not perform the two-enrichment-round Twist Ancient DNA experiment for the two libraries with the highest endogenous content (S1633.E1.L1 and S10871.E1.L6).
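
The down-sampling idea behind Figure 2 can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `unique_snps_at_depths` and the input layout (one tuple of targeted SNP identifiers per deduplicated sequence) are assumptions made for illustration. The sequences are shuffled once and nested prefixes are taken, so the number of covered SNPs can only grow with depth.

```python
import random

def unique_snps_at_depths(read_snp_hits, depths, seed=0):
    """Count distinct targeted SNPs covered at several down-sampled depths.

    read_snp_hits: one entry per sequence, holding the (hypothetical) IDs of
    the targeted SNPs that sequence overlaps; an empty tuple means off-target.
    depths: numbers of sequences to sample.
    Returns {requested_depth: number_of_unique_SNPs_covered}.
    """
    rng = random.Random(seed)
    shuffled = list(read_snp_hits)
    rng.shuffle(shuffled)  # a random order makes every prefix a random subsample

    results = {}
    covered = set()
    taken = 0
    for requested in sorted(depths):
        stop = min(requested, len(shuffled))  # cannot sample more than exists
        for hits in shuffled[taken:stop]:
            covered.update(hits)
        taken = max(taken, stop)
        results[requested] = len(covered)
    return results

reads = [("rs1",), (), ("rs2", "rs3"), ("rs1",)]
curve = unique_snps_at_depths(reads, depths=[0, 2, 4, 10])
```

Plotting `curve` for each assay and library would give a saturation curve of the kind shown in the figure.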
Figure 3.
Population genetic effects of enrichment and an effective filter for reducing bias. (A) Projection of data from 15 libraries in the last rows of Table 1 onto a PCA of modern West Eurasians (gray squares) shows nearly identical positions regardless of data source. (B) We compute symmetry statistics of the form f4(library 1 − reagent 1, library 1 − reagent 2; library 2 − reagent 1, library 2 − reagent 2) and plot Z-scores for all 105 = 15 × 14/2 pairwise comparisons of the libraries (box-and-whisker plots show range, 25th and 75th percentiles, and mean). The statistics involving Arbor Complete are shown in green; remaining comparisons involving 1240k are shown in red; and the Twist–shotgun comparison is in blue. We show results both for all SNP targets (left) and after applying the bias filter retaining a subset of 42% of autosomal SNPs (right). Results for this figure reflect data after removal of duplicated sequences.
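
As a rough illustration of the symmetry test in panel B, the sketch below computes an f4 statistic as the mean over SNPs of (a − b)(c − d), where a, b, c, and d are per-SNP allele frequencies (0 or 1 for pseudo-haploid calls), and derives a Z-score from an unweighted delete-one-block jackknife. The function names, the equal-sized contiguous blocks, and the default block count are assumptions for illustration; the source does not specify its jackknife settings, and standard tools typically weight blocks by SNP count.

```python
from itertools import combinations
from math import sqrt

def f4(a, b, c, d):
    """Mean over SNPs of (a - b) * (c - d), skipping SNPs with missing data (None)."""
    terms = [(w - x) * (y - z)
             for w, x, y, z in zip(a, b, c, d)
             if None not in (w, x, y, z)]
    if not terms:
        raise ValueError("no SNPs with complete data")
    return sum(terms) / len(terms)

def f4_zscore(a, b, c, d, n_blocks=5):
    """Z-score from an unweighted delete-one-block jackknife (simplifying assumption)."""
    n = len(a)
    edges = [round(i * n / n_blocks) for i in range(n_blocks + 1)]
    leave_one_out = []
    for lo, hi in zip(edges, edges[1:]):
        keep = [i for i in range(n) if not lo <= i < hi]  # drop one block
        leave_one_out.append(f4(*[[pop[i] for i in keep] for pop in (a, b, c, d)]))
    mean = sum(leave_one_out) / n_blocks
    se = sqrt((n_blocks - 1) / n_blocks
              * sum((t - mean) ** 2 for t in leave_one_out))
    return f4(a, b, c, d) / se if se > 0 else float("nan")

# 15 libraries give 15 * 14 / 2 unordered pairs, matching the caption.
pairs = list(combinations(range(15), 2))
```

With no assay-specific bias, the statistic is expected to be close to zero, so Z-scores far from zero indicate that two reagents recover systematically different alleles.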
Figure 4.
Variation in reference bias across SNPs. (A) All analyses are based on sequences from loci ascertained as highly likely to be heterozygous, corrected for stochastic error in the estimates using the expectation maximization (EM) algorithm described in Supplemental Text S2. (B) Mean and standard deviation of EM-corrected distributions stratified by sequence length (longer sequences align more reliably so have less bias). Results for this figure reflect data before removal of duplicated sequences.
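
A simplified view of the quantity in Figure 4: at a truly heterozygous site, about half of the sequences should carry the reference allele, so a pooled fraction above 0.5 suggests reference bias. The sketch below computes that raw fraction, overall and stratified by sequence length. It deliberately omits the EM correction described in Supplemental Text S2, and the function names, input layout, and 10 bp bin width are assumptions for illustration.

```python
from collections import defaultdict

def reference_fraction(counts):
    """Pooled fraction of sequences carrying the reference allele.

    counts: (ref_reads, alt_reads) pairs, one per heterozygous SNP.
    """
    counts = list(counts)
    ref = sum(r for r, _ in counts)
    alt = sum(a for _, a in counts)
    if ref + alt == 0:
        raise ValueError("no sequences overlap the supplied SNPs")
    return ref / (ref + alt)

def reference_fraction_by_length(observations, bin_width=10):
    """Reference-allele fraction per sequence-length bin.

    observations: (sequence_length, carries_reference_allele) pairs.
    Returns {bin_start: fraction}, with bins of bin_width bases.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin_start -> [ref, total]
    for length, carries_ref in observations:
        bin_start = (length // bin_width) * bin_width
        tallies[bin_start][0] += int(carries_ref)
        tallies[bin_start][1] += 1
    return {start: ref / total for start, (ref, total) in sorted(tallies.items())}

pooled = reference_fraction([(3, 1), (2, 2)])
by_length = reference_fraction_by_length(
    [(35, True), (38, False), (52, True), (55, True)]
)
```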
