Gigascience. 2022 Nov 30:11:giac118.
doi: 10.1093/gigascience/giac118.

3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources

Mihaly Varadi et al. Gigascience.

Abstract

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis to identify what models can best address their specific use cases. To make data from a large array of model providers more easily accessible to the broader scientific community, we established 3D-Beacons, a collaborative initiative to create a federated network with unified data access mechanisms. The 3D-Beacons Network allows researchers to collate coordinate files and metadata for experimentally determined and theoretical protein models from state-of-the-art and specialist model providers and also from the Protein Data Bank.

Keywords: bioinformatics; experimentally determined structures; computationally predicted structures; federated data network; structural biology.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1:
Growth of the UniProt and PDB databases. This figure shows the number of accessions (on a logarithmic scale) over the past decade. In 2011, UniProt held 161 times as many protein sequences as the PDB held entries. By 2021 this ratio had grown about sevenfold, to 1,132 to 1, showing that the gap between known protein sequences and their structures keeps increasing.
Figure 2:
Highlighting the strengths and weaknesses of modeling techniques. Each modeling approach has limitations and specific strengths. For example, AlphaFill complements AlphaFold models by placing obligate ligands in their contexts (A). Other data providers, such as the Protein Ensemble Database, provide conformational ensembles for intrinsically disordered proteins (IDPs), for example, for human alpha-synuclein (B).
Figure 3:
Schematic overview of the 3D-Beacons Network. Data providers standardize their meta-information and make their models available through 3D-Beacons API instances. The 3D-Beacons Registry links these instances to the central 3D-Beacons Hub API, which can be openly accessed by the scientific community and other data services.
Figure 4:
Graphical user interface of 3D-Beacons. While the main focus of the 3D-Beacons Network is to provide programmatic access to experimentally determined and computationally predicted protein structures, we also provide a graphical user interface where researchers can query for specific proteins using UniProt accessions. This interface displays which section of the protein sequence the models cover and provides an interactive 3-dimensional view.
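
The programmatic access described in the captions for Figures 3 and 4 could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' client: the base URL, the `/uniprot/summary/{accession}.json` path, and the response field names are assumptions not given in the text above, and `EXAMPLE_SUMMARY` is invented data used only to show the grouping and coverage logic.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the public 3D-Beacons Hub API (not stated in the text).
HUB_API = "https://www.ebi.ac.uk/pdbe/pdbe-kb/3dbeacons/api"


def summary_url(accession: str) -> str:
    """Build the (assumed) Hub API summary URL for a UniProt accession."""
    return f"{HUB_API}/uniprot/summary/{accession}.json"


def fetch_summary(accession: str) -> dict:
    """Download the model summary for one accession (needs network access)."""
    with urlopen(summary_url(accession), timeout=30) as response:
        return json.load(response)


def models_by_provider(summary: dict) -> dict:
    """Group (start, end) residue ranges by model provider.

    The keys "structures", "summary", "provider", "uniprot_start" and
    "uniprot_end" are an assumed response layout.
    """
    grouped = {}
    for entry in summary.get("structures", []):
        info = entry["summary"]
        span = (info["uniprot_start"], info["uniprot_end"])
        grouped.setdefault(info["provider"], []).append(span)
    return grouped


def covered_fraction(spans, sequence_length: int) -> float:
    """Fraction of residues covered by at least one model.

    Spans are 1-based and inclusive, matching UniProt residue numbering.
    """
    covered = set()
    for start, end in spans:
        covered.update(range(start, end + 1))
    return len(covered) / sequence_length


# Invented example data (hypothetical providers and ranges, not real output).
EXAMPLE_SUMMARY = {
    "uniprot_entry": {"ac": "X00000", "sequence_length": 100},
    "structures": [
        {"summary": {"provider": "ProviderA", "uniprot_start": 1, "uniprot_end": 50}},
        {"summary": {"provider": "ProviderB", "uniprot_start": 41, "uniprot_end": 80}},
    ],
}
```

With the invented data, `models_by_provider(EXAMPLE_SUMMARY)` groups one range under each hypothetical provider, and passing all ranges to `covered_fraction` with a sequence length of 100 gives the share of the sequence that at least one model covers. Against the live service, `fetch_summary` would supply the dictionary instead, provided the assumed endpoint and field names hold.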
