Science. 2019 Jul 12;365(6449):185-189.
doi: 10.1126/science.aaw6718. Epub 2019 Jul 11.

Protein interaction networks revealed by proteome coevolution

Qian Cong et al. Science.

Abstract

Residue-residue coevolution has been observed across a number of protein-protein interfaces, but the extent of residue coevolution between protein families on the whole-proteome scale has not been systematically studied. We investigate coevolution between 5.4 million pairs of proteins in Escherichia coli and between 3.9 million pairs in Mycobacterium tuberculosis. We find strong coevolution for binary complexes involved in metabolism and weaker coevolution for larger complexes playing roles in genetic information processing. We take advantage of this coevolution, in combination with structure modeling, to predict protein-protein interactions (PPIs) with an accuracy that benchmark studies suggest is considerably higher than that of proteome-wide two-hybrid and mass spectrometry screens. We identify hundreds of previously uncharacterized PPIs in E. coli and M. tuberculosis that both add components to known protein complexes and networks and establish the existence of new ones.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. PPI identification by using coevolution.
(A) Distribution of E. coli protein family sizes. Nf90 = N90 / L, where L is the number of aligned positions in the alignment, and N90 is the number of sequences in the alignment, filtered at 90% sequence identity. The black box indicates selected protein pairs. (B) Screening pipeline. (C) Protein pairs were ranked by MI, and lines sweep MI threshold values from high (left) to low (right). The number (P) of pairs above an MI threshold, the number (T) of gold-standard pairs, and their overlap (TP) are used to calculate Precision (TP/P, y-axis) and Recall (TP/T, x-axis). Baseline (blue) represents random ranking of pairs. Improved performance (green) is achieved by using an average product correction (APC). (D) Enhanced recovery of gold-standard pairs by using global statistical methods (DCA and GREMLIN). Green curve includes APC-like procedures to penalize false-positive hubs. (E) Further increase in precision through protein-protein docking calculations. Pairs were ranked by the sum of the probability of contacts made in the best-fitting docked complex. (F) Performance of experimental and coevolution screens on diverse benchmarks. The size of each benchmark is shown in parentheses. Cells are colored by performance: green for the best and red for the worst. Coevolution+, increased coverage by supplementing input to GREMLIN and docking screens with pairs missed in initial stages but identified in previous experimental studies (materials and methods M6.1); F-score, harmonic mean of precision and recall; Pre, precision; Rec, recall; TP, true positives.
Fig. 2. Coevolution in known protein complexes.
The extent of coevolution is higher in complexes with fewer subunits (A) and varies with the function of the complexes (B). (C to E) Obligate and transient interactions revealed by coevolution provide insights into function. Bars connecting coevolving residues are in green if an experimental structure containing the interface has been determined and in red if not. Black arrows indicate inferred movements of proteins. (C) LPS transporter consisting of periplasmic LPS-binding protein LptA (orange), LptE (yellow), and outer-membrane β-barrel LptD (light blue). (D) Acetyl-CoA carboxylase complex consisting of biotin carboxylase (AccC, yellow), biotin carboxyl carrier (AccB, pink), and carboxyltransferase subunits (AccA and AccD, light blue and orange). (E) Self-inhibitory mechanism of DNA polymerase V (umuD2C). The magenta β-strand in umuD (yellow, top) is cleaved upon activation by RecA. The remaining umuD’ dimerizes (bottom) and causes the green β-strand blocking the active site (black spheres below the magenta strand) in umuC (light blue) to move away and release inhibition of polymerase activity.
Fig. 3. Examples of new components of known complexes and newly identified complexes.
(A) Fractions of coevolving complexes that are consistent with previous structural and experimental data. (B) Predicted interactions between nonribosomal proteins and core ribosomal proteins are indicated by bars color-coded as in (A) (full names are in table S15). (C and D) Previously unknown interfaces extending those in crystal structures. (E to H) Interactions supported by large-scale experiments. (I to T) Previously unidentified interactions. (C) Coevolution suggests that both the C- (shown) and N-terminal (not shown, in cocrystal) domains of antitoxin MqsA interact with toxin MqsR, possibly forming a higher-order complex. (D) DNA mismatch repair proteins MutS and MutL (C terminus). (E) Sec translocon accessory protein YajC and membrane protein insertase YidC. (F) Cell division protein FtsX and murein hydrolase activator EnvC. (G) DNA polymerase III subunit delta and ferredoxin YfhL. (H) Protein YciI and riboflavin biosynthesis protein RibD. (I) Thioesterase TesA and protein YbbP. (J) tRNA methyltransferase TrmD and tRNA sulfurtransferase ThiI. (K) 1,2-phenylacetyl-CoA epoxidase, subunits C and D. (L) RNA polymerase sigma factor FliA (green), flagellar biosynthetic protein FlhB (orange), and secretion chaperone FliS (yellow). (M) D-ribose pyranase RbsD and sigma D regulator Rsd. (N) Cell division topological specificity factor MinE and tRNA-modifying protein YgfZ. (O) Phosphate transporter ATPase PstB (green), phosphate transporter accessory protein PhoU (blue), phosphate regulon sensor protein PhoR (yellow), and ribosome hibernation promoting factor Hpf (pink). (P) Transcriptional factor BolA and ribosome modulation factor Rmf. (Q) LPS exporter ATPase LptB and protein YbbN. (R) Membrane protein quality-control factor QmcA and protein YbbJ. (S) Nucleoside triphosphatase RdgB and DNA utilization protein HofN. (T) Macro-domain Ter protein MatP and protein YjjV.
Fig. 4. Examples of coevolving protein networks.
Blue lines connect coevolving protein pairs, and green lines connect proteins interacting in experimentally determined structures. (A) Network of transcription elongation factors. (B) Outer-membrane integrity maintenance network. (C) Linkage between phosphate transport and regulation of transcription and ribosome activity. (D) Chaperones and tRNA modification enzymes are coupled to DNA replication initiation, perhaps decreasing it under stress conditions. (E) Stress response network. (F) Network connecting flagella components and regulators of their synthesis.
Fig. 5. Functional relatedness of predicted interacting partners in M. tuberculosis.
Functional relatedness was assessed by using the M. tuberculosis functional network in the STRING database (high: STRING combined score ≥ 0.4, missing: not in the STRING database). A considerable fraction (40%) of coevolution-based predictions (green) involve partners that are predicted to be functionally related, whereas a much lower fraction (orange, 2.0%) of PPIs identified in a previous experimental screen involves functionally related partners, almost as low (gray, 1.3%) as randomly selected pairs. The 384 predicted PPIs involving partners lacking homologs in E. coli and without STRING annotations (blue bar on left) are likely to be of most interest to M. tuberculosis researchers. Mtb, M. tuberculosis.
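
The quantities defined in the Fig. 1 caption (Nf90 = N90 / L, Precision = TP/P, Recall = TP/T, and the F-score as the harmonic mean of precision and recall) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy pair labels, and the scores below are hypothetical, score ties are not treated specially, and the score is just a stand-in for whichever ranking value (MI, GREMLIN, or docking) is being swept.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A protein pair is stored as an unordered set of two identifiers (assumption).
Pair = FrozenSet[str]


def nf90(n90: int, aligned_length: int) -> float:
    """Nf90 = N90 / L, as defined in the Fig. 1A caption."""
    return n90 / aligned_length


def precision_recall_curve(
    scored_pairs: Iterable[Tuple[Tuple[str, str], float]],
    gold: Set[Pair],
) -> List[Tuple[float, float]]:
    """Sweep the score threshold from high to low.

    After each pair is admitted, P is the number of pairs above the
    threshold, TP is how many of them are gold-standard pairs, and T is
    the total number of gold-standard pairs. Returns (TP/P, TP/T) points.
    """
    if not gold:
        raise ValueError("gold-standard set must not be empty")
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    curve = []
    tp = 0
    for p, (pair, _score) in enumerate(ranked, start=1):
        if frozenset(pair) in gold:
            tp += 1
        curve.append((tp / p, tp / len(gold)))
    return curve


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data with hypothetical protein names and scores.
scored = [(("a", "b"), 0.9), (("c", "d"), 0.8), (("e", "f"), 0.5), (("g", "h"), 0.1)]
gold = {frozenset(("a", "b")), frozenset(("e", "f"))}
curve = precision_recall_curve(scored, gold)
best_f = max(f_score(pre, rec) for pre, rec in curve)
```

Storing each pair as a `frozenset` makes the lookup independent of the order in which the two proteins are listed, which matters when predictions and the gold standard come from different sources.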

Comment in

  • Mapping global protein contacts.
    Vajda S, Emili A. Science. 2019 Jul 12;365(6449):120-121. doi: 10.1126/science.aay1440. PMID: 31296755.
  • PPI discovery using proteome coevolution.
    Singh A. Nat Methods. 2019 Sep;16(9):804. doi: 10.1038/s41592-019-0566-9. PMID: 31471609.
