Review
Nat Rev Genet. 2019 Jun;20(6):323-340.
doi: 10.1038/s41576-019-0119-1.

Ancient pathogen genomics as an emerging tool for infectious disease research

Maria A Spyrou et al. Nat Rev Genet. 2019 Jun.

Abstract

Over the past decade, a genomics revolution, made possible through the development of high-throughput sequencing, has triggered considerable progress in the study of ancient DNA, enabling complete genomes of past organisms to be reconstructed. A newly established branch of this field, ancient pathogen genomics, affords an in-depth view of microbial evolution by providing a molecular fossil record for a number of human-associated pathogens. Recent accomplishments include the confident identification of causative agents from past pandemics, the discovery of microbial lineages that are now extinct, the extrapolation of past emergence events on a chronological scale and the characterization of long-term evolutionary history of microorganisms that remain relevant to public health today. In this Review, we discuss methodological advancements, persistent challenges and novel revelations gained through the study of ancient pathogen genomes.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Selected cultural time periods and epidemics or pandemics of human history.
This overview provides a timeline of key events in predominantly Eurasian history since the Neolithic period (upper panel, grey squares), which have overlapped temporally and geographically with major historical epidemics or pandemics (lower panel, beige squares). Citations are indicated for studies in which whole-genome or low-coverage genome-wide data from pathogens implicated in those events have been reconstructed by ancient DNA analysis. B19V, human parvovirus B19; bce, before current era; ce, current era; HBV, hepatitis B virus; H. pylori, Helicobacter pylori; SARS, severe acute respiratory syndrome; Y. pestis, Yersinia pestis.
Fig. 2. Methods for the detection and isolation of pathogen DNA from ancient metagenomic specimens.
The diagram provides an overview of techniques used for pathogen DNA detection in ancient remains, distinguishing between laboratory and computational methods. In both cases, processing begins with the extraction of DNA from ancient specimens. As part of the laboratory pipeline, direct screening of extracts can be performed by PCR (quantitative (qPCR) or conventional) against species-specific genes, as done previously. PCR techniques alone, however, can suffer from frequent false-positive results and should therefore always be coupled with further verification methods, such as downstream genome enrichment and/or next-generation sequencing (NGS), to ensure ancient DNA (aDNA) authentication of putatively positive samples. Alternatively, construction of NGS libraries has enabled pathogen screening via fluorescence-based detection on microarrays and via DNA enrichment approaches. The latter has been achieved through single-locus in-solution capture or through simultaneous screening for multiple pathogens using microarray-based enrichment of species-specific loci, and it enables post-NGS aDNA authentication. In addition, data produced by direct (shotgun) sequencing of NGS libraries before enrichment can also be used for pathogen screening using computational tools. After pre-processing, reads can be directly mapped against a target reference genome (in cases for which contextual information is suggestive of a causative organism) or against a multigenome reference composed of closely related species to achieve increased mapping specificity of ancient reads. Alternatively, ancient pathogen DNA can also be detected using metagenomic profiling methods, as presented elsewhere, through taxonomic assignment of shotgun NGS reads. Both approaches allow for subsequent assessment of aDNA authenticity and can be followed by whole pathogen genome retrieval through targeted enrichment or direct sequencing of positive sample libraries.
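The caption names aDNA authentication as a step but does not say how it is assessed. One widely used criterion is an elevated rate of C-to-T substitutions near the 5′ ends of ancient reads, which results from cytosine deamination. The following is a minimal Python sketch under that assumption; the function name `c_to_t_profile`, the input format (ungapped reference and read segments of a mapped read, oriented 5′ to 3′) and the toy sequences are hypothetical and are not taken from the Review.

```python
def c_to_t_profile(pairs, n_positions=5):
    """Return, for each of the first n_positions from the 5' end, the
    fraction of reference C sites that are read as T.

    pairs: iterable of (reference_segment, read_segment) strings that are
           already aligned without gaps (an assumption of this sketch).
    A position with no reference C sites yields None.
    """
    c_sites = [0] * n_positions
    c_to_t = [0] * n_positions
    for ref, read in pairs:
        # Only inspect positions present in both strings.
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                c_sites[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    return [c_to_t[i] / c_sites[i] if c_sites[i] else None
            for i in range(n_positions)]


if __name__ == "__main__":
    # Toy data only; real input would come from mapped NGS reads.
    toy_pairs = [("CACG", "TACG"), ("CCGT", "CTGT"),
                 ("CAAA", "CAAA"), ("CCAA", "TCAA")]
    print(c_to_t_profile(toy_pairs, n_positions=4))
```

A profile that is highest at the first position and decays inward would be consistent with an ancient origin, whereas a flat profile near zero would be more typical of modern contamination. Dedicated tools should be preferred for real data.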
Fig. 3. Methods for whole-genome analysis of clonal and recombining pathogens.
The diagram is an overview of whole-genome analyses applied to date to ancient microbial data sets and distinguishes the methods used for clonal and recombining pathogens; of note, the depicted summary is not meant to represent an exhaustive pipeline of all possible analyses that could be undertaken. Ancient genome reconstruction is usually initiated through reference-based mapping or through de novo assembly of the data, although the latter has only been possible in exceptional cases of ancient DNA (aDNA) preservation. Subsequently, the genomes are assessed for their coverage depth and gene content to evaluate their quality, which is also relevant for the comparative identification of virulence genes over their evolutionary time frames. Here, we show an example of virulence factor presence-or-absence analysis in the form of a heat map, as done previously. In addition, a comparison of the ancient genome or genomes with modern genomes can be carried out for single-nucleotide polymorphism (SNP) identification and for assessment of SNP effects (using SnpEff), which is particularly relevant for variants that seem to be unique to the ancient genome or genomes. Initial evolutionary inference can often be carried out through phylogenetic analysis and by testing for possible evidence of recombination in the analysed data set, for example, by comparing the support of different phylogenetic topologies and by identifying potential recombination regions and homoplasies. If the data support clonal evolution, robust phylogenetic inference (for example, through a maximum-likelihood approach) is followed by assessment of the temporal signal in the data. If the data set shows a sufficient phylogenetic signal, molecular dating analysis and demographic modelling are considered possible, although the size of the data set will determine whether such analyses will be feasible and meaningful.
Alternatively, if recombination is confirmed, genetic relationships between microbial clades or populations can be determined through phylogenetic network analysis or through the use of population genetic methods such as principal component analysis (PCA) and identification of ancestral admixture components. In this case, assessment of the temporal signal and molecular dating analysis should be approached with caution and are likely best performed after exclusion of recombination regions from all genomes in the data set. MRCA, most recent common ancestor; NGS, next-generation sequencing.
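The caption mentions assessing the temporal signal before molecular dating but does not specify a method. One common approach is a root-to-tip regression: the genetic distance of each sample from the tree root is regressed on its sampling date, so that the slope approximates the substitution rate and the x-intercept approximates the date of the root. The following is a minimal Python sketch under that assumption; the function name `root_to_tip_regression` and the input values are hypothetical, and the distances are assumed to have been extracted beforehand from a rooted maximum-likelihood tree.

```python
def root_to_tip_regression(dates, distances):
    """Ordinary least-squares regression of root-to-tip distance on date.

    dates:     sampling dates in calendar years (negative values for bce).
    distances: root-to-tip distances in substitutions per site.
    Returns (slope, intercept, r_squared, root_date); root_date is None
    when the slope is not positive, i.e. no usable temporal signal.
    """
    n = len(dates)
    if n != len(distances) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0:
        raise ValueError("all sampling dates are identical")
    slope = sxy / sxx                      # substitutions per site per year
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 0.0
    root_date = -intercept / slope if slope > 0 else None
    return slope, intercept, r_squared, root_date


if __name__ == "__main__":
    # Toy values only; they lie exactly on a line by construction.
    toy_dates = [-1000, 0, 1000, 2000]
    toy_distances = [4e-5, 6e-5, 8e-5, 1e-4]
    print(root_to_tip_regression(toy_dates, toy_distances))
```

A positive slope with a high coefficient of determination would suggest that the data set carries enough temporal signal for dating. For recombining pathogens, such a regression is likely only meaningful after recombinant regions have been excluded, as the caption cautions.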
Fig. 4. Map of published modern and ancient Yersinia pestis genomes.
Published ancient specimens that have yielded whole Yersinia pestis genomes and genome-wide data are shown as triangles (n = 38), and their different colours indicate time period distinctions. A set of modern Y. pestis genomes (n = 336), from publications released until 2018, is shown as grey circles within their geographical country or region of isolation, and the size of each circle is proportional to the number of strains sequenced from each location (number indicated when more than one genome is shown). The areas highlighted in brown are regions that contain active plague foci as determined by contemporary or historical data. ybp, years before present. Adapted with permission from the ‘Global distribution of natural plague foci as of March 2016’ from https://www.who.int/csr/disease/plague/Plague-map-2016.pdf.
Fig. 5. Yersinia pestis ecology and transmission cycle.
A simplified version of the Yersinia pestis enzootic cycle, during which the bacterium is maintained among wild rodent populations through a flea-dependent transmission mechanism. Under poorly understood circumstances, plague epizootics, which are best explained as animal epidemics, can occur among susceptible rodent populations. During those periods, humans and other mammals are at highest risk of becoming infected with Y. pestis. Plague can manifest in humans in the bubonic, pneumonic and septicaemic forms. Pneumonic plague is the only form that can result in airborne transmission between humans.
Fig. 6. Evolutionary history of Yersinia pestis.
A phylogenetic tree graphic depicting the evolutionary history of Yersinia pestis based on both ancient and modern genomes. Ancient strains that have been previously characterized by phylogenetic analysis are represented with coloured circles among the tree branches as follows: a Middle Neolithic genome is shown in yellow; Late Neolithic and Bronze Age (LNBA) genomes are shown in purple; a Late Bronze Age genome (RT5) encompassing signatures of flea adaptation is shown in blue; a pre-Justinian, 2nd century of the current era (ce), genome is shown in green; first-plague-pandemic genomes are shown in black; second plague pandemic, 14th-century genomes are shown in red; and post-Black Death (up until 18th century ce) genomes are shown in grey. Modern lineages are simplified and shown as branches of equal length in order to enhance the clarity of the graphic. The geographical distribution of modern strains is as follows (using universal country abbreviations): branch 1 (UGA, DRC, KEN, DZA, MDG, CHN, IND, IDN, MNM, USA and PER), branch 2 (RUS, AZE, KAZ, KGZ, UZB, TKM, CHN, IRN and NPL), branch 3 (CHN and MNG), branch 4 (RUS and MNG) and branch 0, including lineages 0.ANT3 (CHN and KGZ), 0.ANT5 (KGZ and KAZ), 0.ANT2 (CHN), 0.ANT1 (CHN), 0.PE5 (MNG), 0.PE4 (TJK, UZB, KGZ, RUS, CHN and MNG), 0.PE2 (GEO, ARM, AZE and RUS) and 0.PE7 (CHN). ybp, years before present.

Comment in

  • A genomic approach to microbiology.
    [No authors listed]. Nat Rev Genet. 2019 Jun;20(6):311. doi: 10.1038/s41576-019-0131-5. PMID: 31101903. No abstract available.

