PLoS Comput Biol. 2020 Aug 27;16(8):e1008145.
doi: 10.1371/journal.pcbi.1008145. eCollection 2020 Aug.

Determining the interaction status and evolutionary fate of duplicated homomeric proteins

Saurav Mallik et al. PLoS Comput Biol. 2020.

Abstract

Oligomeric proteins are central to life. Duplication and divergence of their genes is a key evolutionary driver, in part because duplications can yield very different outcomes. Given a homomeric ancestor, duplication can yield two paralogs that form two distinct homomeric complexes, or a heteromeric complex comprising both paralogs. Alternatively, one paralog remains a homomer while the other acquires a new partner. So far, however, conflicting trends have been reported as to which fate dominates, primarily because different methods and criteria have been used to assign the interaction status of paralogs. Here, we systematically analyzed all Saccharomyces cerevisiae and Escherichia coli oligomeric complexes that include paralogous proteins. We found that the proportions of homo- and heteromeric duplication fates depend strongly on a variety of factors, yet rigorous filtering nonetheless gives a consistent picture. In E. coli, about 50% of the paralogous pairs appear to have retained the ancestral homomeric interaction, whereas in S. cerevisiae only ~10% retained a homomeric state. This difference was also observed when unique complexes were counted instead of paralogous gene pairs. We further show that this difference is accounted for by multiple cases of heteromeric yeast complexes that share common ancestry with homomeric bacterial complexes. Our analysis reconciles the contradicting trends of previous analyses and provides a systematic and rigorous pipeline for delineating the fate of duplicated oligomers in any organism for which protein-protein interaction data are available.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. The potential evolutionary fates of duplicated homomeric proteins and the analysis pipeline for identifying them.
(A) Duplication of a gene encoding a homomeric protein, and the emergence of the first mutation(s), leads to a statistical mixture of homo- and heteromeric complexes (i). Upon further divergence, three outcomes may arise: two distinct homomeric complexes (ii), a heteromeric complex involving both paralogs (iii), or loss of homomeric interaction in one copy, and gain of new interacting partners in the other paralog (iv). (B) Our analysis aimed to identify these four different evolutionary fates. It comprised three steps: (1) The genomes of E. coli and S. cerevisiae were each scanned to identify all possible paralogous protein pairs. These pairs were classified into three categories with increasing confidence of paralog assignment (note that all categories in our analysis are inclusive, i.e., the low-confidence paralogs include the medium-confidence ones, and the medium-confidence paralogs include the high-confidence ones). (2) Interactions of these paralogs were identified and classified as homo- or heteromeric. Macromolecular complexes were collected from the Protein Data Bank (PDB complexes; inter-subunit interactions were obtained from crystal structure data) and the Complex Portal database (CS and C complexes; inter-subunit interactions were predicted from the PPI data). The S. cerevisiae PPI data were extracted from seven databases, and the E. coli data from eight databases. The raw PPI data were filtered using various criteria to exclude potential false positives. (3) Finally, based on the identified interactions, the paralogous pairs were assigned to one of the four potential fates (i-iv, panel A) with either a flexible or a stringent criterion.
Fig 2. The distribution of divergence modes of S. cerevisiae and E. coli paralogous pairs.
The four divergence modes, obligatory-homo, obligatory-hetero, mixed and hetero-others, are described in Fig 1A. (A) The distribution of S. cerevisiae paralogous pairs in PPI data (right panel) and in curated complexes (left panel). Presented are the distributions for different stringencies of analysis, along its 3 steps (Fig 1B). Step-1, paralog assignment, is presented in columns shaded in green, from low-confidence paralogs in pale green to high-confidence paralogs in dark green. Step-2, identifying interactions, is also presented in columns, from white (raw PPI data) to dark grey (filter-3). Step-3, the divergence mode, is presented in rows: the top set of rows represents the flexible criterion (shaded in yellow), and the bottom rows the stringent criterion (dark yellow). The dominant divergence modes, or fates, are highlighted in darker shades of red. (B) The distribution of E. coli paralogous pairs.
Fig 3. The distribution of complexes comprising homo- and heteromeric paralogs in S. cerevisiae and in E. coli.
This analysis was based on the curated complexes databases. The column annotations and color shades are the same as in Fig 2. (A) The numbers of unique S. cerevisiae complexes comprising paralogs assigned to the different homo/hetero divergence modes. Note that the different confidence levels for paralog assignment (LC, MC, HC) show the same trend as in Fig 2B, curated complex panel. (B) The same for E. coli.
Fig 4. Different modes of prokaryotic homomer to eukaryotic heteromer transition.
Gene duplication of an ancestral non-ring-like homomer may produce a heteromeric complex that may (i) or may not (ii) retain the ancestral oligomeric order (i.e., the total number of subunits in the complex). After the first gene duplication and the subsequent emergence of a heteromeric interaction, multiple rounds of duplication may follow in which the descendant paralogs retain the heteromeric interaction (iii). For ring-like complexes, multiple rounds of intra-ring gene duplications result in heteromeric rings that either keep (iv) or change (v) the ancestral oligomeric order. For each mode of transition, an example case is provided.
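
The final step of the pipeline in Fig 1B, assigning a paralogous pair to a divergence mode from its observed interactions, can be illustrated with a minimal sketch. This is not the authors' implementation: the decision rules below are an assumed reading of the four modes named in Fig 2 (mixed, obligatory-homo, obligatory-hetero, hetero-others), the flexible and stringent criteria are not modeled, and the protein identifiers and function names are hypothetical.

```python
from typing import FrozenSet, Set

# An interaction is an unordered set of protein IDs:
# one element for a homomeric contact (A-A), two for a heteromeric one (A-B).
Interaction = FrozenSet[str]


def interaction(a: str, b: str) -> Interaction:
    """Build an unordered interaction; interaction('A', 'A') has a single element."""
    return frozenset((a, b))


def has_other_partner(p: str, paralog: str, interactions: Set[Interaction]) -> bool:
    """True if p interacts with any protein that is neither itself nor its paralog."""
    return any(p in i and len(i) == 2 and paralog not in i for i in interactions)


def divergence_mode(p1: str, p2: str, interactions: Set[Interaction]) -> str:
    """Assign a paralogous pair to a divergence mode (assumed rules, see lead-in)."""
    homo1 = interaction(p1, p1) in interactions
    homo2 = interaction(p2, p2) in interactions
    hetero = interaction(p1, p2) in interactions

    if hetero and (homo1 or homo2):
        return "mixed"              # homo- and heteromeric contacts coexist
    if hetero:
        return "obligatory-hetero"  # paralogs interact only with each other
    if homo1 and homo2:
        return "obligatory-homo"    # two distinct homomers
    # One paralog stays homomeric, the other has partners outside the pair.
    if (homo1 and not homo2 and has_other_partner(p2, p1, interactions)) or (
        homo2 and not homo1 and has_other_partner(p1, p2, interactions)
    ):
        return "hetero-others"
    return "unassigned"             # not enough evidence under these rules


# Hypothetical interaction data for four paralogous pairs.
ppi = {
    interaction("A1", "A1"), interaction("A2", "A2"),
    interaction("B1", "B2"),
    interaction("C1", "C1"), interaction("C1", "C2"),
    interaction("D1", "D1"), interaction("D2", "X"),
}
modes = {pair: divergence_mode(*pair, ppi)
         for pair in [("A1", "A2"), ("B1", "B2"), ("C1", "C2"), ("D1", "D2")]}
```

Storing interactions as unordered sets means the orientation in which a database reports a contact does not matter, which is convenient when records from several PPI sources are merged before filtering.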
