Pulling out the 1%: whole-genome capture for the targeted enrichment of ancient DNA sequencing libraries

Meredith L Carpenter¹, Jason D Buenrostro¹, Cristina Valdiosera², Hannes Schroeder³, Morten E Allentoft³, Martin Sikora¹, Morten Rasmussen³, Simon Gravel⁴, Sonia Guillén⁵, Georgi Nekhrizov⁶, Krasimir Leshtakov⁷, Diana Dimitrova⁶, Nikola Theodossiev⁷, Davide Pettener⁸, Donata Luiselli⁸, Karla Sandoval¹, Andrés Moreno-Estrada¹, Yingrui Li⁹, Jun Wang¹⁰, M Thomas P Gilbert¹¹, Eske Willerslev³, William J Greenleaf¹², Carlos D Bustamante¹³

Affiliations

¹ Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
² Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen 1350, Denmark; Department of Archaeology, Environment, and Community Planning, Faculty of Humanities and Social Sciences, La Trobe University, Melbourne, VIC 3086, Australia.
³ Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen 1350, Denmark.
⁴ Department of Human Genetics and Génome Québec Innovation Centre, McGill University, Montréal, QC H3A 0G1, Canada.
⁵ Centro Mallqui, Calle Ugarte y Moscoso 165, San Isidro, Lima 27, Peru.
⁶ Bulgarian Academy of Sciences, National Institute of Archaeology, Sofia 1000, Bulgaria.
⁷ Department of Archaeology, Sofia University St. Kliment Ohridski, Sofia 1504, Bulgaria.
⁸ Dipartimento di Scienze Biologiche, Geologiche e Ambientali (BiGeA), Università di Bologna, Via Selmi 3, 40126 Bologna, Italy.
⁹ BGI-Shenzhen, Shenzhen 518083, China.
¹⁰ BGI-Shenzhen, Shenzhen 518083, China; King Abdulaziz University, Jeddah 21589, Saudi Arabia; Department of Biology, University of Copenhagen, Copenhagen 2200, Denmark; Macau University of Science and Technology, Taipa, Macau 999078, China.
¹¹ Centre for GeoGenetics, Natural History Museum of Denmark, Copenhagen 1350, Denmark; Ancient DNA Laboratory, Murdoch University, South Street, Perth, WA 6150, Australia.
¹² Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA. Electronic address: wjg@stanford.edu.
¹³ Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA. Electronic address: cdbustam@stanford.edu.

PMID: 24568772
PMCID: PMC3824117
DOI: 10.1016/j.ajhg.2013.10.002

Meredith L Carpenter et al. Am J Hum Genet. 2013 Nov 7;93(5):852-64. doi: 10.1016/j.ajhg.2013.10.002. Epub 2013 Oct 25.

Abstract

Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062-147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217-73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.
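The enrichment figures in the abstract can be illustrated with a little arithmetic. The sketch below assumes that fold enrichment is the ratio of the postcapture to the precapture fraction of reads mapping to the human genome; the abstract does not spell out the exact definition, so treat this as one plausible reading. The function names and the read counts are hypothetical; only the two average SNP yields are taken from the abstract.

```python
def mapped_fraction(mapped_reads: int, total_reads: int) -> float:
    """Fraction of sequenced reads that map to the target genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def fold_enrichment(pre_fraction: float, post_fraction: float) -> float:
    """Ratio of postcapture to precapture mapped fraction (assumed definition)."""
    if pre_fraction <= 0:
        raise ValueError("pre_fraction must be positive")
    return post_fraction / pre_fraction


# Hypothetical read counts, chosen only to illustrate the arithmetic.
pre = mapped_fraction(mapped_reads=12_000, total_reads=1_000_000)    # 1.2%
post = mapped_fraction(mapped_reads=240_000, total_reads=1_000_000)  # 24%
example_fold = fold_enrichment(pre, post)

# Average SNP yields reported in the abstract (1 million reads per library).
snp_gain = 50_723 / 13_280

print(f"example fold enrichment: {example_fold:.1f}")
print(f"average SNP yield gain:  {snp_gain:.2f}")
```

On the reported averages, the postcapture libraries yield a little under four times as many SNPs as the precapture libraries at the same sequencing depth.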

Figures

**Figure 1**
Schematic of the Whole-Genome In-Solution Capture Process. To generate the RNA “bait” library, a human genomic library is created via adapters containing T7 RNA polymerase promoters (green boxes). This library is subjected to in vitro transcription via T7 RNA polymerase and biotin-16-UTP (stars), creating a biotinylated bait library. Meanwhile, the ancient DNA library (aDNA “pond”) is prepared via standard indexed Illumina adapters (purple boxes). These aDNA libraries often contain <1% endogenous DNA, with the remainder being environmental in origin. During hybridization, the bait and pond are combined in the presence of adapter-blocking RNA oligos (blue zigzags), which are complementary to the indexed Illumina adapters and thus prevent nonspecific hybridization between adapters in the aDNA library. After hybridization, the biotinylated bait and bound aDNA are pulled down with streptavidin-coated magnetic beads, and any unbound DNA is washed away. Finally, the DNA is eluted and amplified for sequencing.

**Figure 2**
Results of Increased Sequencing of Samples M4 and NA40. (A) Yield of unique fragments for M4 (Bronze Age hair) precapture (blue) and postcapture (red) libraries with increasing amounts of sequencing. The fold enrichment in number of unique reads with increasing amounts of sequencing is plotted in green, with values on the secondary y axis. (B) Yield of unique fragments for NA40 (Peruvian bone) precapture (blue) and postcapture (red) libraries with increasing amounts of sequencing. The fold enrichment in number of unique reads with increasing amounts of sequencing is plotted in green, with values on the secondary y axis. (C) Venn diagram showing the overlap between the NA40 pre- and postcapture libraries based on sequencing of 12.3 million reads. (D) Coverage plot of the M4 and NA40 libraries based on sequencing of 18.6 million and 12.3 million reads, respectively. Shown is a random 10-megabase segment of chromosome 1. Coverage was calculated in 1 kb windows across the region. (E) Insert size distribution for NA40 pre- and postcapture libraries. (F) Percent GC content of reads for NA40 pre- and postcapture libraries.
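Panel (D) of Figure 2 reports coverage in 1 kb windows. A minimal sketch of such a calculation follows, assuming that "coverage" means mean read depth per window and that reads are given as half-open `(start, end)` intervals on a single chromosome. The function name `windowed_coverage` and the toy reads are hypothetical; the caption does not say which tool or exact definition was used.

```python
def windowed_coverage(reads, region_start, region_end, window=1_000):
    """Mean read depth in consecutive fixed-size windows of a region.

    reads: iterable of (start, end) half-open intervals on one chromosome.
    """
    n_windows = (region_end - region_start + window - 1) // window
    covered_bases = [0] * n_windows
    for start, end in reads:
        # Clip the read to the region of interest.
        pos = max(start, region_start)
        end = min(end, region_end)
        while pos < end:
            idx = (pos - region_start) // window
            window_end = region_start + (idx + 1) * window
            chunk_end = min(end, window_end)
            # Credit the overlapping bases to the window they fall in.
            covered_bases[idx] += chunk_end - pos
            pos = chunk_end
    depths = []
    for i, bases in enumerate(covered_bases):
        w_start = region_start + i * window
        w_len = min(window, region_end - w_start)  # last window may be short
        depths.append(bases / w_len)
    return depths


# Hypothetical reads; the second one straddles a window boundary.
toy_reads = [(100, 160), (950, 1_050), (1_000, 1_080), (2_500, 2_560)]
toy_cov = windowed_coverage(toy_reads, region_start=0, region_end=3_000)
print(toy_cov)
```

Splitting each read at window boundaries keeps the per-window totals exact even when a fragment spans two windows, which matters little for short aDNA inserts but costs nothing.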

**Figure 3**
Principal Component Analysis of Pre- and Postcapture Samples Based on Sequencing One Million Reads Each. Principal component analysis of SNPs overlapping between the 1000 Genomes reference panel and each ancient individual, with Native American individuals also included in (E) and (F). The principal components were calculated with the modern individuals only, and the ancient individual was then projected onto the plot. Shown are (A) V2 (Bulgarian tooth) precapture and (B) postcapture; (C) M4 (Bronze Age hair) precapture and (D) postcapture; and (E) NA40 (Peruvian bone) precapture and (F) postcapture. Population key: ASW, Americans of African ancestry in SW USA; AYM, Aymara from the Peruvian Andes; CEU, Utah residents (CEPH) with Northern and Western European ancestry; CHB, Han Chinese in Beijing, China; CHS, Southern Han Chinese; CLM, Colombians from Medellín, Colombia; FIN, Finnish in Finland; GBR, British in England and Scotland; IBS, Iberian population in Spain; JPT, Japanese in Tokyo, Japan; KAR, Karitiana from the Brazilian Amazon; LWK, Luhya in Webuye, Kenya; MAY, Mayan from Mexico; MXL, Mexican ancestry from Los Angeles, USA; PUR, Puerto Ricans from Puerto Rico; TSI, Toscani in Italy; YRI, Yoruba in Ibadan, Nigeria.
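The projection step described in the Figure 3 caption (components computed from the modern individuals only, ancient individual projected afterwards) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: genotypes are assumed to be coded 0/1/2, missing ancient calls are marked `None`, and power iteration with deflation stands in for a full eigendecomposition to keep the example dependency-free. All function names and the toy genotypes are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract each SNP's mean; return centered rows and the means."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows], means


def top_components(centered, k=2, iterations=500):
    """Leading k principal axes via power iteration with deflation."""
    n_snps = len(centered[0])
    components = []
    for c in range(k):
        # Deterministic start vector that differs for each component.
        vec = [1.0 + ((i + c) % 3) for i in range(n_snps)]
        for _ in range(iterations):
            # Multiply by X^T X without forming the matrix explicitly.
            scores = [sum(a * b for a, b in zip(row, vec)) for row in centered]
            vec = [sum(s * row[j] for s, row in zip(scores, centered))
                   for j in range(n_snps)]
            # Remove directions that were already found (deflation).
            for comp in components:
                dot = sum(a * b for a, b in zip(vec, comp))
                vec = [a - dot * b for a, b in zip(vec, comp)]
            norm = math.sqrt(sum(a * a for a in vec))
            if norm == 0:
                raise ValueError("data has fewer than k independent axes")
            vec = [a / norm for a in vec]
        components.append(vec)
    return components


def project(sample, means, components):
    """Coordinates of one sample on the given principal axes."""
    centered = [v - m for v, m in zip(sample, means)]
    return [sum(a * b for a, b in zip(centered, comp)) for comp in components]


def project_ancient(modern, ancient, k=2):
    """PCs from modern individuals only; the ancient sample is projected after."""
    # Keep only SNPs that were called in the ancient individual.
    keep = [j for j, g in enumerate(ancient) if g is not None]
    modern_sub = [[row[j] for j in keep] for row in modern]
    ancient_sub = [ancient[j] for j in keep]
    centered, means = center_columns(modern_sub)
    comps = top_components(centered, k)
    coords = [project(row, means, comps) for row in modern_sub]
    return coords, project(ancient_sub, means, comps)


# Toy genotypes: rows 0-2 form one modern group, rows 3-5 another.
modern = [
    [2, 2, 0, 0, 1, 2], [2, 1, 0, 0, 1, 0], [2, 2, 0, 1, 0, 1],
    [0, 0, 2, 2, 1, 1], [0, 0, 2, 1, 1, 2], [0, 1, 2, 2, 0, 0],
]
ancient = [2, None, 0, 0, None, 1]  # two SNPs without a call
modern_coords, ancient_coords = project_ancient(modern, ancient)
print(ancient_coords)
```

Because the ancient individual never enters the covariance calculation, its sparse and damage-prone genotypes cannot distort the axes; it is only placed on axes defined by the reference panel. The sign of each axis is arbitrary, so positions should be compared relative to the modern groups rather than by absolute value.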
