Review
Chem Rev. 2022 Aug 24;122(16):13401-13446.
doi: 10.1021/acs.chemrev.1c00703. Epub 2022 Jul 15.

Paleoproteomics

Christina Warinner et al. Chem Rev. 2022.

Abstract

Paleoproteomics, the study of ancient proteins, is a rapidly growing field at the intersection of molecular biology, paleontology, archaeology, paleoecology, and history. Paleoproteomics research leverages the longevity and diversity of proteins to explore fundamental questions about the past. While its origins predate the characterization of DNA, it was only with the advent of soft ionization mass spectrometry that the study of ancient proteins became truly feasible. Technological gains over the past 20 years have allowed increasing opportunities to better understand preservation, degradation, and recovery of the rich bioarchive of ancient proteins found in the archaeological and paleontological records. Growing from a handful of studies in the 1990s on individual highly abundant ancient proteins, paleoproteomics today is an expanding field with diverse applications ranging from the taxonomic identification of highly fragmented bones and shells and the phylogenetic resolution of extinct species to the exploration of past cuisines from dental calculus and pottery food crusts and the characterization of past diseases. More broadly, these studies have opened new doors in understanding past human-animal interactions, the reconstruction of past environments and environmental changes, the expansion of the hominin fossil record through large scale screening of nondiagnostic bone fragments, and the phylogenetic resolution of the vertebrate fossil record. Even with these advances, much of the ancient proteomic record remains unexplored. Here we provide an overview of the history of the field, a summary of the major methods and applications currently in use, and a critical evaluation of current challenges. We conclude by looking to the future, for which innovative solutions and emerging technology will play an important role in enabling us to access the still unexplored "dark" proteome, allowing for a fuller understanding of the role ancient proteins can play in the interpretation of the past.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Milestones in ancient protein mass spectrometry. The broadest applications of protein mass spectrometry in archaeology today are ZooMS (zooarchaeology by mass spectrometry), which applies MALDI-TOF MS peptide mass fingerprinting to collagens, keratins, and other high abundance proteins (left), and shotgun proteomics, which uses high-resolution LC–MS/MS to identify diverse, low abundance proteins in complex mixtures (right).
Figure 2
Conceptual stages of protein incorporation and recovery in archaeological samples. Archaeological proteins represent a small fraction of the proteins that were once present during life. The full history of a sample, from incorporation to analysis, must be carefully considered in order to make accurate inferences about the past.
Figure 3
Example MS and MS/MS spectra obtained from archaeological samples. (A) Mass spectrum of sheep (Ovis) type I collagen obtained by MALDI-TOF MS from an archaeological small ruminant bone at the site of Tepe Yahya, Iran (YTC-248, Peabody Museum No. 986-7-60/22498). (B) Tandem mass spectrum of sheep (Ovis) β-lactoglobulin milk protein obtained by nano-HPLC–MS/MS from human dental calculus at the Iron Age pastoralist site of Marinskaya 5, Russia (MKA018). (C) Tandem mass spectrum of sesame seed (Sesamum) 11S globulin protein obtained by nano-HPLC–MS/MS from human dental calculus at the Late Bronze Age city of Megiddo, Israel (MGD011).
Figure 4
Sources of bias in ancient protein identification. Differences in the density of enzymatic cut sites, number of disulfide bonds, and protein length influence protein detectability (A). The representation of proteins across taxa is highly uneven in major databases, such as UniProtKB (B,C). (A) Comparison of four Ovis aries (sheep) proteins of archaeological interest: AMELX, used for determining sex; BLG, a milk protein; KAP4-2, a protein component of wool; COL1α1, a bone protein used for taxonomic identification. Selected properties influencing protein detectability using LC–MS/MS techniques include predicted peptide size following trypsin digestion (cut sites are shown as dashed lines) and location of disulfide linkages (green boxes). Peptide lengths of 6–30 amino acids are most suitable for detection, and those outside this range are marked as less likely to be found due to size (red border). Note that endogenous proteolysis of amelogenin during dental development causes additional cleavages that are not shown. Only half (736 amino acids) of the collagen protein is shown for space reasons. (B) Comparison of the number of characterized species to the number of protein entries in UniProtKB (SwissProt and TrEMBL) for the major taxonomic classes Magnoliopsida (flowering plants), Mammalia (mammals), Aves (birds), and Actinopterygii (ray-finned fishes); inset shows enhanced view of the number of reviewed (SwissProt) protein entries. Numbers of characterized species were obtained from refs (−607). (C) Reviewed and unreviewed protein entries available in UniProtKB for humans and common plants and animals consumed in ancient Mesoamerican diets; inset shows enhanced view of taxa with <1000 protein entries.
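The size effect described in panel (A) can be illustrated with a minimal in silico digestion sketch. It assumes the commonly used simplified trypsin rule (cleavage after K or R unless the next residue is P), which the caption does not spell out, and it ignores missed cleavages and disulfide bonds. The function names and the toy sequence are hypothetical and do not correspond to any real protein; only the 6–30 residue window comes from the caption.

```python
def tryptic_digest(sequence):
    """Cut after K or R unless the next residue is P (simplified rule, an assumption)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Keep any C-terminal remainder that does not end in K or R.
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def detectable(peptides, min_len=6, max_len=30):
    """Keep peptides within the 6-30 residue window noted in the caption."""
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Hypothetical toy sequence, for illustration only.
toy_sequence = "MKAAAAAAKGGRPGGGGK"
all_peptides = tryptic_digest(toy_sequence)
likely_observed = detectable(all_peptides)
print(all_peptides)
print(likely_observed)
```

A protein with few or unevenly spaced cut sites yields peptides that mostly fall outside the window, which is one reason its detectability by LC–MS/MS may be reduced.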
Figure 5
Representative examples of ancient proteomes. Well-preserved ancient proteomes contain distinctive groups of proteins that reflect the protein composition of the original tissue or material, such as human bone (A), human dental calculus (B), artist materials (C), and pottery crusts (D). As such, the composition of an ancient proteome can aid in its authentication. Data were searched against the SwissProt database in Mascot using the parameters described in ref (102). Protein identifications were established at <5.0% protein FDR and <1.0% peptide FDR in Scaffold v.5 (Proteome Software), and proteins with a minimum of 97% protein identification probability and at least two unique peptides were accepted. The top 15 proteins (by number of PSMs) per sample source were visualized as a treemap and labeled by their corresponding gene name; trypsin, keratins, serum albumin, and microbial proteins were excluded from the analysis. *Ovostatin; **riboflavin-binding protein; ***B3-hordein.
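The acceptance and ranking steps described in this caption can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' pipeline: the FDR thresholds are taken to have been applied upstream in Scaffold, the record layout, the `excluded` flag (standing in for trypsin, keratins, serum albumin, and microbial proteins), and all numeric values are hypothetical.

```python
def top_proteins(records, min_probability=0.97, min_unique_peptides=2, top_n=15):
    """Apply the caption's acceptance criteria, then rank by PSM count."""
    accepted = [
        r for r in records
        if r["probability"] >= min_probability          # >= 97% identification probability
        and r["unique_peptides"] >= min_unique_peptides  # at least two unique peptides
        and not r["excluded"]                            # hypothetical exclusion flag
    ]
    return sorted(accepted, key=lambda r: r["psms"], reverse=True)[:top_n]


# Hypothetical records with invented values, for illustration only.
hypothetical_records = [
    {"gene": "COL1A2", "probability": 0.99, "unique_peptides": 9, "psms": 95, "excluded": False},
    {"gene": "COL1A1", "probability": 0.99, "unique_peptides": 12, "psms": 140, "excluded": False},
    {"gene": "ALB", "probability": 0.99, "unique_peptides": 6, "psms": 60, "excluded": True},
    {"gene": "BGN", "probability": 0.90, "unique_peptides": 3, "psms": 10, "excluded": False},
    {"gene": "LUM", "probability": 0.98, "unique_peptides": 1, "psms": 4, "excluded": False},
]

for record in top_proteins(hypothetical_records):
    print(record["gene"], record["psms"])
```

The ranked list is what a treemap would then draw, with tile area proportional to the PSM count.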

