BMC Bioinformatics. 2021 Dec 20;22(1):605.
doi: 10.1186/s12859-021-04510-z.

COSNeti: ComplexOme-Structural Network Interpreter used to study spatial enrichment in metazoan ribosomes

Federico Martinez-Seidel et al. BMC Bioinformatics.

Abstract

Background: Upon environmental stimuli, ribosomes are surmised to undergo compositional rearrangements due to abundance changes among proteins assembled into the complex, leading to modulated structural and functional characteristics. Here, we present the ComplexOme-Structural Network Interpreter (COSNeti), a computational method for testing whether ribosomal proteins (rProteins) that exhibit abundance changes under specific conditions are spatially confined to particular regions within the large ribosomal complex.

Results: COSNeti translates experimentally determined structures into graphs, with nodes representing proteins and edges representing the spatial proximity between them. In its first implementation, COSNeti considers rProteins and ignores rRNA and other objects. Spatial regions are defined using a random walk with restart methodology, followed by a procedure to obtain a minimum set of regions that cover all proteins in the complex. Structural coherence is achieved by applying weights to the edges that reflect the physical proximity between purportedly contacting proteins. The weighting probabilistically guides the random-walk trajectory. Parameter tuning during region selection makes it possible to tailor the method to specific biological questions by yielding regions of different sizes with minimum overlaps. In addition, other graph community detection algorithms may be used in the COSNeti workflow, bearing in mind that they yield non-overlapping regions of different sizes. All tested algorithms result in the same node kernels under equivalent regions. Based on the defined regions, available abundance change information of proteins is mapped onto the graph and subsequently tested for enrichment in any of the defined spatial regions. We applied COSNeti to the cytosolic ribosome structures of Saccharomyces cerevisiae, Oryctolagus cuniculus, and Triticum aestivum using datasets with available quantitative protein abundance change information. We found that in yeast, substoichiometric rProteins depleted from translating polysomes are significantly constrained to a ribosomal region close to the tRNA entry and exit sites.
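As an illustration of how edge weights can guide a random walk with restart, the following minimal Python sketch turns contact counts into transit probabilities and ranks nodes by their visiting probability around a seed protein. It is not the COSNeti implementation: it uses the deterministic power-iteration form of random walk with restart rather than sampled walks with a consensus step, and the function names, the toy protein labels, the contact counts, and the restart value of 0.3 are all assumptions made for this example.

```python
def transition_probabilities(contacts):
    """Turn symmetric contact counts into row-normalized transit probabilities."""
    graph = {}
    for (a, b), count in contacts.items():
        graph.setdefault(a, {})[b] = count
        graph.setdefault(b, {})[a] = count
    return {
        node: {nbr: w / sum(nbrs.values()) for nbr, w in nbrs.items()}
        for node, nbrs in graph.items()
    }


def random_walk_with_restart(probs, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * P^T p + restart * e_seed until it stabilizes."""
    p = {node: 0.0 for node in probs}
    p[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in probs}
        for node, mass in p.items():
            for nbr, t in probs[node].items():
                nxt[nbr] += (1.0 - restart) * mass * t
        nxt[seed] += restart  # the walker jumps back to the seed
        delta = sum(abs(nxt[n] - p[n]) for n in probs)
        p = nxt
        if delta < tol:
            break
    return p


def region_around(probs, seed, size, restart=0.3):
    """Return the `size` nodes with the highest visiting probability."""
    scores = random_walk_with_restart(probs, seed, restart)
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:size])


# Hypothetical toy network: two tightly packed triplets joined by a weak contact.
toy_contacts = {
    ("A", "B"): 30, ("B", "C"): 25, ("A", "C"): 20,
    ("C", "D"): 2,
    ("D", "E"): 28, ("E", "F"): 22, ("D", "F"): 26,
}
toy_probs = transition_probabilities(toy_contacts)
print(sorted(region_around(toy_probs, "A", 3)))
print(sorted(region_around(toy_probs, "E", 3)))
```

Because heavily contacting pairs receive larger transit probabilities, the walker started in one triplet rarely crosses the weak C-D contact, so each seed recovers its own triplet as a region.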

Conclusions: COSNeti offers a computational method to partition multi-protein complexes into structural regions and a statistical approach to test for spatial enrichment of any given subset of proteins. COSNeti is applicable to any multi-protein complex given appropriate structural and abundance-change data. COSNeti is publicly available as a GitHub repository (https://github.com/MSeidelFed/COSNet_i) and can be installed using the Python package installer pip.

Keywords: Omics integration; Ribosomal protein substoichiometry; Ribosome structure; Specialized ribosomes; Structural systems biology.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
COSNeti step-by-step detailed workflow. See also Additional file 1, in which the two detailed examples from this manuscript were optimized and developed. COSNeti is divided into five mandatory steps plus accessory functions that allow users to perform quality-control checks or produce alternative outputs along the way. The input is an mmCIF file. Step 1 extracts all protein entities from the input file as PDB files using split_cif_by_entity.py. Additionally, Step 1’ checks the percentage of each modelled protein sequence that is covered relative to its FASTA sequence using check_cif_completeness.py. Step 2 prepares the PDB files by building a list of combined names for each protein pair using combination.py. In parallel, Step 2’ offers the opportunity to reindex the residue column inside the PDB files, in case disruptions in the structures would otherwise lead to gaps, using reindex_pdb.py or its batch counterpart batch_reindex_pdb.py. Step 3 takes the list of PDB combinations and calculates distance matrices for each file pair using calculate_distance.py or its batch counterpart batch_calc_dist.py. Step 4 uses the distance matrices to build a list of contacts and a graph using contacts_from_dist.py. Finally, Step 5.1 integrates the Omics abundances into the graph analyses through intcryomics.py. Alternatively, if no binary Omics file is available, users may rely on Step 5.2, intcryomics_sigassign.py, to manually select the protein entities that feature significant changes. Step 5.1’ returns customized graph files that can be used to highlight specific regions in the networks using pimp_my_network.py, while Step 5.2’ allows users to investigate structural coherence in the regions selected by the existing graph community detection algorithm Infomap using Region_selection_infomap.py. The mmCIF icon was taken from IUCr.
Fig. 2
COSNeti workflow emphasizing the novelties within our consensus random walk sampling procedure. Illustration of the structural sampling and testing methodology used in COSNeti to test whether proteome heterogeneity is spatially confined in multi-protein complexes. The upper panel depicts the workflow, divided into three parts: proximity network building from structural data, consensus random walk sampling based on the input network, and statistical testing of the defined regions. The lower panel shows the novelties within the consensus random walk sampling procedure in depth.
Fig. 3
Ribosomal protein networks at different distance thresholds (dt) between amino acid residues in contact. The region obtained for the polypeptide exit tunnel (PET) is outlined in black as a measure of the structural and biological accuracy of the obtained networks. The networks were built using the COSNeti workflow (https://github.com/MSeidelFed/COSNet_i) with default values from PDBx/mmCIF entries 6SNT, 6GZ5, and 4V7E, corresponding to Saccharomyces cerevisiae (bottom panel), Oryctolagus cuniculus (middle panel) and Triticum aestivum (upper panel) ribosome structures. The networks were analyzed as undirected graphs in Cytoscape [34]. A larger node size indicates a larger degree. The thickness of an edge represents the transit probability between nodes, calculated from the number of contacts between each protein pair. The network layout is edge-weighted spring-embedded to simulate a real structurally connected network with forces acting upon it. The 60S subunit nodes have been highlighted in light blue/black and the 40S subunit nodes in light yellow; nodes that belong to the PET region of the 60S LSU (i.e., the region containing the eL39 rProtein family) have been highlighted in black. Note that as dt increases, outlier proteins enter the defined regions, whereas at lower dt the rProteins are not fully interconnected and many nodes are missing. Species icons were exported from BioRender (https://biorender.com/) under a paid license. The network interactions and weights have been compiled in Additional file 2.
Fig. 4
Overlap between Obtained PET Regions of Yeast and Rabbit at Varying Iteration Number for Consensus. The intcryomics.py function was run using a default walking length of 1/4 of the network nodes (i.e., a walking length of 20 nodes) and 4, 9, 15, 21, or 50 iterations. The resulting regions that contained the PET signature rProteins, i.e., eL39 and eL37, were concatenated for a single run and intersected with the resulting PET regions from other runs. The results of the intersection were visualized as Venn diagrams with the VennDiagram [74] package in R software [75].
Fig. 5
Optimized yeast and rabbit ribosomal protein networks. (a) Saccharomyces cerevisiae network and (b) Oryctolagus cuniculus network, built at a contact threshold (dt) of 12 Å between amino acid residues. The network layout is edge-weighted spring-embedded. The edge weights correspond to the number of contacts between two rProteins and are therefore proportional to the transit probability, which is the main influence during the COSNeti random walk. A larger node size corresponds to a larger node degree. Nodes belonging to the 60S large subunit (LSU) have been colored blue and nodes belonging to the 40S small subunit (SSU) have been colored yellow. Note that there are three conserved/sampled interface pathways between rProteins from the two subunits (Table 3). The network representations were created in Cytoscape [34].
Fig. 6
Spatial confinement during ribosome specialization: a test case of the COSNeti workflow. Optimized conditions were used to test whether the distribution of substoichiometric proteins is significantly constrained to specific ribosomal regions in yeast and mammalian systems. The weighted graph used to select regions was optimized as detailed in Fig. 5. The code commands used to produce our results are outlined in Code Chunk 1. The mammalian and yeast systems were tested, and only the yeast depleted substoichiometric rProteins were significantly localized in an SSU region (colored in yellow shades) after Bonferroni correction of the Fisher exact test p-values (i.e., Region 2: P = 0.00004, Padj = 0.0005). The mRNA has been colored red to outline its location relative to the region enriched in depleted proteins. Ribosomal structures are rotated in 90° steps about the y-axis in order to visualize the boundaries of the significantly changed region.
Fig. 7
Violin plots of rProtein sequence percentage coverage in interpreted Cryo-EM densities of cytosolic ribosomes. Structures are derived from PDBx/mmCIF entries 6SNT, 6GZ5 and 4V7E, corresponding to Saccharomyces cerevisiae (bottom), Oryctolagus cuniculus (middle) and Triticum aestivum (upper) ribosome structures. The percentage of coverage per rProtein was calculated using check_cif_completeness.py from the COSNeti methodology.
Fig. 8
Histograms summarizing the node degree statistics of the optimized rabbit and yeast ribosomal protein networks. Networks were analyzed in Cytoscape [34]. Distributions of node degrees were plotted in Pareto-scaled histograms featuring the number of nodes on the left y-axis and the proportion of nodes on the right y-axis. Note that both networks show a heavy-tailed distribution that peaks at a degree range of 2.5–4.5, which characterizes more than 30% of the nodes. As in other figures and supplemental tables, the rabbit network (6GZ5) is identified by red font and the yeast network (6SNT) is identified by black font.
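The Fig. 6 caption reports Fisher exact test p-values with a Bonferroni correction across regions. The Python sketch below shows one way such a region enrichment test can be computed from first principles with the standard library. It is a minimal illustration, not the COSNeti code: the one-sided (enrichment) alternative, the function names, and the toy counts are assumptions, and the numbers are not taken from the paper.

```python
from math import comb


def fisher_enrichment_p(hits_in_region, region_size, total_hits, total_proteins):
    """One-sided Fisher exact p-value: P(X >= hits_in_region).

    X is the number of changed proteins that fall in a region of
    `region_size` proteins when `total_hits` of `total_proteins`
    proteins are changed (hypergeometric null).
    """
    upper = min(region_size, total_hits)
    denom = comb(total_proteins, region_size)
    tail = sum(
        comb(total_hits, k) * comb(total_proteins - total_hits, region_size - k)
        for k in range(hits_in_region, upper + 1)
    )
    return tail / denom


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical example: 80 proteins, 10 of them changed, three candidate regions
# given as (region size, changed proteins inside the region).
regions = {"region_1": (12, 7), "region_2": (15, 2), "region_3": (20, 1)}
raw = [fisher_enrichment_p(hits, size, 10, 80) for size, hits in regions.values()]
for name, p, p_adj in zip(regions, raw, bonferroni(raw)):
    print(f"{name}: P = {p:.3g}, Padj = {p_adj:.3g}")
```

The Bonferroni step is deliberately conservative: with many regions, only strong enrichments remain significant after adjustment.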
