PLoS One. 2012;7(6):e38913.
doi: 10.1371/journal.pone.0038913. Epub 2012 Jun 13.

How many protein-protein interactions types exist in nature?

Leonardo Garma et al. PLoS One. 2012.

Abstract

"Protein quaternary structure universe" refers to the ensemble of all protein-protein complexes across all organisms in nature. The number of quaternary folds thus corresponds to the number of ways proteins physically interact with other proteins. This study addresses two basic questions: whether the number of protein-protein interaction types is limited and, if so, how many different quaternary folds exist in nature. By all-to-all sequence and structure comparisons, we grouped the protein complexes in the Protein Data Bank (PDB) into 3,629 families and 1,761 folds. A statistical model was introduced to obtain the quantitative relation between the numbers of quaternary families and quaternary folds in nature. The total number of possible protein-protein interaction types was estimated at around 4,000, which indicates that the current protein repository contains only 42% of the quaternary folds in nature and that full coverage would need approximately a quarter century of experimental effort. The results have important implications for protein complex structural modeling and for the structural genomics of protein-protein interactions.

Conflict of interest statement

Competing Interests: The corresponding author is an associate editor of PLoS ONE. This does not alter the authors’ adherence to all the PLoS ONE policies on sharing data and materials.

Figures

Figure 1. Graphical representation of all non-redundant protein-protein complex structures in the PDB.
Each node represents a known complex structure, and two nodes are connected by an edge if the rTM-score between the two structures is >0.5. Orphan nodes are shown in black, while nodes connected by at least one edge are shown in yellow. Representative examples from the eight largest clusters are listed together with the protein name.
Figure 2. Histogram of complex structural clusters versus size of the clusters.
The solid curve is the fit from Eq. 12. Inset: the same data drawn on a logarithmic scale.
Figure 3. The estimated number of quaternary folds versus the number of quaternary families in nature.
The solid curve is the fit from Eq. 13, and the dotted line indicates the number of quaternary families following the estimate of Orengo et al.
Figure 4. The number of new complex structure entries deposited per year in the PDB.
Data are presented in terms of unique structures (sequence identity <90%), families (mapped with unique Pfam families), and folds (rTM-score <0.5).
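
The grouping rule stated in the Figure 1 caption (two complexes share an edge when their rTM-score is >0.5) can be illustrated as connected components of a thresholded similarity graph. The following is a minimal sketch under stated assumptions, not the authors' pipeline: it assumes a precomputed symmetric matrix of pairwise scores, the function name `cluster_by_threshold` and the toy scores are hypothetical, and the record above does not say whether the published fold clusters were built by connected components or by another clustering scheme.

```python
from typing import Dict, List


def cluster_by_threshold(scores: List[List[float]], threshold: float = 0.5) -> List[List[int]]:
    """Group items into connected components of the graph whose edges
    join pairs with score > threshold (hypothetical helper)."""
    n = len(scores)
    parent = list(range(n))  # union-find forest, one root per component

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if scores[i][j] > threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two components

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Largest clusters first; ties broken by smallest member index.
    return sorted(groups.values(), key=lambda g: (-len(g), g[0]))


# Toy, made-up scores for four complexes (not data from the paper).
toy_scores = [
    [1.0, 0.7, 0.2, 0.1],
    [0.7, 1.0, 0.6, 0.1],
    [0.2, 0.6, 1.0, 0.3],
    [0.1, 0.1, 0.3, 1.0],
]
clusters = cluster_by_threshold(toy_scores)
orphans = [c[0] for c in clusters if len(c) == 1]  # nodes with no edge
print(clusters, orphans)
```

In this toy input, complexes 0 and 2 fall in the same cluster even though their direct score is below the threshold, because both are linked to complex 1. Complex 3 has no edge and is an orphan, corresponding to the black nodes described for Figure 1.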
