PLoS Comput Biol. 2020 Jun 29;16(6):e1007997.
doi: 10.1371/journal.pcbi.1007997. eCollection 2020 Jun.

Stochastic ordering of complexoform protein assembly by genetic circuits

Mikkel Herholdt Jensen et al. PLoS Comput Biol.

Abstract

Top-down proteomics has enabled the elucidation of heterogeneous protein complexes with different cofactors, post-translational modifications, and protein membership. This heterogeneity is believed to play a previously unknown role in cellular processes. The different molecular forms of a protein complex have come to be called "complex isoform" or "complexoform". Despite the elucidation of the complexoform, it remains unclear how and whether cellular circuits control the distribution of a complexoform. To help address this issue, we first simulate a generic three-protein complexoform to reveal the control of its distribution by the timing of gene transcription, mRNA translation, and protein transport. Overall, we ran 265 computational experiments, each averaged over 1,000 stochastic simulations. Based on the experiments, we show that genes arranged in a single operon, a cascade, or as two operons all give rise to different protein compositions of the complexoform because of timing differences in protein-synthesis order. We also show that changes in the kinetics of expression, protein transport, or protein binding dramatically alter the distribution of the complexoform. Furthermore, both stochastic and transient kinetics control the assembly of the complexoform when the expression and assembly occur concurrently. We test our model against the biological cellulosome system. With biologically relevant rates, we find that the genetic circuitry controls the average final complexoform assembly and the variation in the assembly structure. Our results highlight the importance of both the genetic circuit architecture and kinetics in determining the distribution of a complexoform. Our work has a broad impact on our understanding of non-equilibrium processes in both living and synthetic biological systems.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Illustration of the complexoform assembly model system.
Simulated processes and rates are indicated by arrows and italicized text. In the model, genetic circuitry coding for either protein X or protein Y is initiated by a promoter and transcribed to mRNA. Translational machinery then produces proteins X and Y, which are exported and compete for a limited number of scaffold proteins, each with two docking sites. The simulation also incorporates a rate of loss of mRNA and external protein. Generic simulations are carried out with 10 external scaffold proteins (i.e., 20 docking sites).
Fig 2. Genetic circuit architecture modulates complexoform assembly.
Four different genetic circuits expressing proteins X and Y (parallel, cascade, series uncoupled, and series coupled) are simulated with identical first-order rates of transcription (1 t⁻¹), translation (0.1 [C]⁻¹ t⁻¹), export (1 t⁻¹), and binding (1 [C]⁻¹ t⁻¹) to 10 scaffold proteins each with 2 docking sites. The pie charts indicate the average outcomes from 1,000 stochastic simulations, showing the percentage of XX (black), YY (white), and XY/YX complexoforms (grey). The simulation results demonstrate that the genetic circuit architecture controls the final protein assembly structure.
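As a rough illustration of the kind of stochastic simulation described in the Fig 1 and Fig 2 captions, the following minimal sketch runs a Gillespie-style event loop for the parallel circuit only. It is not the authors' code. The function names, the loss rate `k_loss` (the captions give no value for it), and the treatment of translation as first-order in mRNA copy number are all assumptions made for this example. Waiting times are omitted because only the final scaffold composition is reported.

```python
import random
from collections import Counter


def simulate_parallel_circuit(rng, n_scaffolds=10, k_tx=1.0, k_tl=0.1,
                              k_exp=1.0, k_bind=1.0, k_loss=0.1):
    """One stochastic run; returns a Counter of final scaffold states.

    Rates follow the Fig 2 caption except k_loss, which is a
    hypothetical value chosen for illustration.
    """
    # Copy numbers: mRNA (m), internal protein (p), exported protein (e).
    n = {s + p: 0 for s in "mpe" for p in "XY"}
    # Each scaffold index appears twice: one entry per free docking site.
    free_sites = [i for i in range(n_scaffolds) for _ in range(2)]
    bound = [[] for _ in range(n_scaffolds)]

    while free_sites:
        reactions = []
        for p in "XY":
            reactions += [
                (k_tx, ("transcribe", p)),
                (k_tl * n["m" + p], ("translate", p)),
                (k_exp * n["p" + p], ("export", p)),
                (k_bind * n["e" + p] * len(free_sites), ("bind", p)),
                (k_loss * n["m" + p], ("lose_mrna", p)),
                (k_loss * n["e" + p], ("lose_protein", p)),
            ]
        # Pick the next event with probability proportional to its rate.
        active = [(rate, event) for rate, event in reactions if rate > 0]
        weights = [rate for rate, _ in active]
        _, (kind, p) = rng.choices(active, weights=weights)[0]

        if kind == "transcribe":
            n["m" + p] += 1
        elif kind == "translate":
            n["p" + p] += 1
        elif kind == "export":
            n["p" + p] -= 1
            n["e" + p] += 1
        elif kind == "bind":
            n["e" + p] -= 1
            site = free_sites.pop(rng.randrange(len(free_sites)))
            bound[site].append(p)
        elif kind == "lose_mrna":
            n["m" + p] -= 1
        else:  # lose_protein
            n["e" + p] -= 1

    # Sort each pair so that XY and YX are counted together.
    return Counter("".join(sorted(pair)) for pair in bound)


def average_composition(n_runs=1000, seed=0):
    """Percentage of XX, XY, and YY scaffolds averaged over n_runs."""
    rng = random.Random(seed)
    totals = Counter()
    for _ in range(n_runs):
        totals.update(simulate_parallel_circuit(rng))
    n_total = sum(totals.values())
    return {k: 100.0 * totals[k] / n_total for k in ("XX", "XY", "YY")}


if __name__ == "__main__":
    print(average_composition())
```

The cascade and series circuits would need additional rules for how expression of one gene depends on the other; those rules are not specified in the captions, so they are left out here.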
Fig 3. Computational simulations of complexoform assembly using a generic model system.
Stochastic solutions (XX: black circles; YY: white circles; XY: grey circles) and deterministic solutions (XX: solid black line; YY: dashed line; XY: solid grey line) for parallel (first column), cascade (second column), series uncoupled (third column), and series coupled (fourth column). For the parallel circuit, the deterministic solutions for XX and YY overlap perfectly, so only one line is visible on the graph. For each circuit, variations in translation rate, protein export rate, and binding rate reveal an “equilibrium” regime, in which the stochastic and deterministic solutions are in relatively good agreement, and a “non-equilibrium” regime, in which the stochastic and deterministic solutions disagree. More stringent genetic circuits (e.g., the series coupled circuit) exhibit a lower discrepancy between the stochastic outcome and the deterministic solution.
Fig 4. The discrepancy between the deterministic and stochastic solutions, and stochastic variation in protein assembly structure for four stochastically simulated genetic circuits.
The discrepancy between the stochastic simulations and deterministic solutions (solid line) tends to be greater when rates are high and assembly proceeds quickly, suggesting that the protein assembly in this limit is not accurately represented by the corresponding deterministic solution. However, when assembly is slow, the predicted assembly matches the equilibrium prediction from mixing large amounts of the constituents X and Y, and all circuits yield similar protein assembly products. Circuits with a more controlled sequence of expression (e.g., the series coupled circuit) exhibit an overall better agreement with the deterministic solution. The stochastic variation in assembly (dashed line) also differs between the circuits, suggesting that both the type of genetic circuit and the kinetics of assembly determine the variability of the assembled structure.
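One way to read the "equilibrium prediction from mixing large amounts of X and Y" is as independent filling of the two docking sites on each scaffold. The short derivation below is a sketch under that assumption. The symbol p (the probability that a given site is taken by X) is introduced here for illustration and does not appear in the source.

```latex
\begin{align}
  P(\mathrm{XX}) &= p^2, &
  P(\mathrm{XY\ or\ YX}) &= 2p(1-p), &
  P(\mathrm{YY}) &= (1-p)^2 .
\end{align}
% With equal abundances and equal binding rates, p = 1/2:
\begin{align}
  P(\mathrm{XX}) = \tfrac{1}{4}, \qquad
  P(\mathrm{XY\ or\ YX}) = \tfrac{1}{2}, \qquad
  P(\mathrm{YY}) = \tfrac{1}{4}.
\end{align}
```

Under these assumptions the slow regime would approach a 25/50/25 split regardless of circuit type, which is consistent with the caption's statement that all circuits yield similar products when assembly is slow.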
Fig 5. The genetic circuits generate similar assembly times, regardless of which rate is varied.
Shown is the stochastically simulated average fraction of XX (black), YY (white), and XY complexoform (grey) in the protein assemblies when varying translation rate (circles), protein export rate (squares), and protein binding rates (diamonds). Each circuit exhibits a transition from the fast non-equilibrium regime to the slow equilibrium regime, although the exact transition time is dependent on the type of genetic circuit. A notable exception is the series coupled circuit, indicating that the highly sequential architecture of this circuit modulates the protein assembly structure in both the fast and slow regimes.
Fig 6. Stochastic simulations and deterministic solutions for a cellulosome model system, in which two proteins X and Y expressed in a parallel gene circuit bind to 10 scaffold binding sites to form a complexoform.
The outcome of 1,000 stochastic simulations is indicated as grey bars, while the deterministic solution is indicated by a dashed line. The model parameters are summarized in Table 1, with the translation rate constant in this figure being varied across several orders of magnitude: 3 × 10⁻⁵ s⁻¹ (A), 3 × 10⁻² s⁻¹ (B), and 3 × 10¹ s⁻¹ (C). For each simulation condition, the stochastic variation in cellulosome assembly is calculated as the standard deviation of the distribution of stochastic simulations. The discrepancy between the stochastic simulations and deterministic solution is calculated as the average difference between each simulation and the deterministic solution.
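The two summary statistics defined in this caption can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names are hypothetical, the standard deviation is taken as the population form, and "average difference" is assumed to mean the mean absolute difference, since the caption does not say whether the difference is signed.

```python
from statistics import mean, pstdev


def stochastic_variation(outcomes):
    """Standard deviation of the per-simulation outcomes.

    Assumption: population standard deviation (pstdev); the sample
    form (statistics.stdev) would be an equally plausible reading.
    """
    return pstdev(outcomes)


def discrepancy(outcomes, deterministic):
    """Mean absolute difference between each run and the deterministic value.

    Assumption: the difference is taken as an absolute value.
    """
    return mean(abs(x - deterministic) for x in outcomes)


if __name__ == "__main__":
    # Hypothetical data: number of sites occupied by X in four runs.
    runs = [4, 5, 6, 5]
    print(stochastic_variation(runs))
    print(discrepancy(runs, deterministic=5.0))
```

Here each entry of `outcomes` would be the result of one stochastic run, for example the number of scaffold binding sites occupied by protein X.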
Fig 7. Computational simulations of cellulosome assembly consisting of a scaffold with 10 available binding sites for proteins X and Y for four stochastically simulated genetic circuits.
The model parameters are summarized in Table 1. The vertical axis indicates the number of scaffold binding sites occupied by protein X. The average of 1,000 stochastic simulations is indicated by solid circles for each condition, with error bars indicating the standard deviation from the simulation. Deterministic solution results are indicated by dashed lines. The discrepancy between stochastic and deterministic solutions (calculated as the average difference between each simulation and the deterministic solution) is indicated by solid lines. Grey dots indicate the estimate of physiological rates (Table 1), about which each rate is varied.
