Review
Protein Sci. 2018 Jan;27(1):316-330.
doi: 10.1002/pro.3331. Epub 2017 Nov 11.

RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education

Stephen K Burley et al. Protein Sci. 2018 Jan.

Abstract

The Protein Data Bank (PDB) is one of two archival resources for experimental data central to biomedical research and education worldwide (the other key Primary Data Archive in biology being the International Nucleotide Sequence Database Collaboration). The PDB currently houses >134,000 atomic level biomolecular structures determined by crystallography, NMR spectroscopy, and 3D electron microscopy. It was established in 1971 as the first open-access, digital-data resource in biology, and is managed by the Worldwide Protein Data Bank partnership (wwPDB; wwpdb.org). US PDB operations are conducted by the RCSB Protein Data Bank (RCSB PDB; RCSB.org; Rutgers University and UC San Diego) and funded by NSF, NIH, and DoE. The RCSB PDB serves as the global Archive Keeper for the wwPDB. During calendar 2016, >591 million structure data files were downloaded from the PDB by Data Consumers working in every sovereign nation recognized by the United Nations. During this same period, the RCSB PDB processed >5300 new atomic level biomolecular structures plus experimental data and metadata coming into the archive from Data Depositors working in the Americas and Oceania. In addition, RCSB PDB served >1 million RCSB.org users worldwide with PDB data integrated with ∼40 external data resources providing rich structural views of fundamental biology, biomedicine, and energy sciences, and >600,000 PDB101.rcsb.org educational website users around the globe. RCSB PDB resources are described in detail together with metrics documenting the impact of access to PDB data on basic and applied research, clinical medicine, education, and the economy.

Keywords: 3D electron microscopy; FAIR principles; NMR spectroscopy; PDB; PDBx/mmCIF; Protein Data Bank; RCSB; Research Collaboratory for Structure Bioinformatics; Worldwide Protein Data Bank; biocuration; chemical component dictionary; crystallography; data archive; data deposition; integrative/hybrid methods; macromolecular structure; metadata; open access; validation; wwPDB.

Figures

Figure 1
Week in the life of the RCSB PDB, showing the progression from data deposition at wwPDB regional data centers to preparation and finalization of weekly releases by the RCSB PDB acting as PDB Archive Keeper, followed by Stage I (partial) and Stage II (full) Global PDB Data Release.
Figure 2
OneDep system workflow.
Figure 3
PDB depositions by geography (January 2000‐June 2017) and breakdown of funding source for US depositions.
Figure 4
Data integrated from external resources enables research. Information about publications, sequence annotations, drug interactions, and more is updated regularly to enable Data Consumers to browse the entire PDB archive by external annotations, access annotations for individual structures, and visualize data in 2D and 3D. Examples shown, clockwise from the upper right: Images from PoseView [32] are available on Structure Summary pages; the Gene View tool illustrates correspondences between the human genome and PDB structure; metabolic pathway maps in the Pathway View identify pathway components with PDB structures and homology models [10].
Figure 5
Fraction of published PDB structures cited in subject‐area publications. The impact of individual structures can also be assessed using PDB archive data download statistics. An RCSB PDB study completed in July 2017 documented that each PDB structure has been downloaded an average of ∼30,400 times since 2007. Some PDB structures are extremely “popular.” The top 1% of downloaded structures have each been downloaded an average of ∼105,000 times since 2007. Individual structure download statistics are provided on the wwPDB website (www.wwpdb.org/stats/search).
Figure 6
Category‐normalized citation impact of publications in different subject categories citing Berman et al. (2000) from Clarivate Analytics [37].
Figure 7
Representative PDB structures that exemplify impact on our understanding of Fundamental Biology, Biomedicine, and Energy Research. (a) Nucleosome Core Particle (PDB ID 1aoi [41]); (b) Major Histocompatibility Complex 1 (1hla [43]); (c) Photosystem II (1s5l [47]).
Figure 8
(a) XFEL serial crystallography reveals what happens when adenine binds to a riboswitch [66] and (b) I/HM multi‐scale structural model of the nuclear pore Nup84 complex [67].

References

    1. Protein Data Bank (1971) Crystallography: Protein Data Bank. Nature New Biol 233:223–223.
    2. Cold Spring Harbor Symposia on Quantitative Biology (1972) Cold Spring Laboratory Press.
    3. Wilkinson MD, Dumontier M, Aalbersberg IJJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten J‐W, da Silva Santos LB, Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, Dumon O, Edmunds S, Evelo CT, Finkers R, Gonzalez‐Beltran A, Gray AJG, Groth P, Goble C, Grethe JS, Heringa J, 't Hoen PAC, Hooft R, Kuhn T, Kok R, Kok J, Lusher SJ, Martone ME, Mons A, Packer AL, Persson B, Rocca‐Serra P, Roos M, van Schaik R, Sansone S‐A, Schultes E, Sengstag T, Slater T, Strawn G, Swertz MA, Thompson M, van der Lei J, van Mulligen E, Velterop J, Waagmeester A, Wittenburg P, Wolstencroft K, Zhao J, Mons B (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3:160018.
    4. Sullivan LH (1896) The tall office building artistically considered. Lippincott's Mag 339:403–409.
    5. Watson JD, Crick FH (1953) Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171:737–738.
