PLoS Biol. 2011 Sep;9(9):e1001144.
doi: 10.1371/journal.pbio.1001144. Epub 2011 Sep 6.

Genetic variation shapes protein networks mainly through non-transcriptional mechanisms

Eric J Foss et al. PLoS Biol. 2011 Sep.

Abstract

Networks of co-regulated transcripts in genetically diverse populations have been studied extensively, but little is known about the degree to which these networks cause similar co-variation at the protein level. We quantified 354 proteins in a genetically diverse population of yeast segregants, which allowed for the first time construction of a coherent protein co-variation matrix. We identified tightly co-regulated groups of 36 and 93 proteins that were made up predominantly of genes involved in ribosome biogenesis and amino acid metabolism, respectively. Even though the ribosomal genes were tightly co-regulated at both the protein and transcript levels, genetic regulation of proteins was entirely distinct from that of transcripts, and almost no genes in this network showed a significant correlation between protein and transcript levels. This result calls into question the widely held belief that in yeast, as opposed to higher eukaryotes, ribosomal protein levels are regulated primarily by regulating transcript levels. Furthermore, although genetic regulation of the amino acid network was more similar for proteins and transcripts, regression analysis demonstrated that even here, proteins vary predominantly as a result of non-transcriptional variation. We also found that cis regulation, which is common in the transcriptome, is rare at the level of the proteome. We conclude that most inter-individual variation in levels of these particular high abundance proteins in this genetically diverse population is not caused by variation of their underlying transcripts.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Three possibilities for regulating protein levels.
Protein levels can be regulated through control of transcript levels, translation, and protein stability. The weight of the arrows here indicates the size of the effect. In this example in which protein levels are controlled primarily non-transcriptionally, there will nonetheless be a correlation between protein and transcript levels.
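The last point of this caption, that a protein controlled mostly non-transcriptionally can still correlate with its transcript, can be illustrated with a toy linear model. The sketch below assumes protein = a * transcript + b * other, where "other" stands in for translation and stability effects and both inputs are independent with unit variance. The weights a and b and all function names are illustrative assumptions, not quantities from the paper.

```python
import math
import random


def expected_correlation(a, b):
    """Correlation of protein with transcript under the assumed toy model
    protein = a * transcript + b * other (independent, unit-variance inputs)."""
    return a / math.sqrt(a * a + b * b)


def transcript_variance_share(a, b):
    """Fraction of protein variance contributed by the transcript term."""
    return a * a / (a * a + b * b)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_correlation(a, b, n=20000, seed=0):
    """Simulate the toy model and return the sample correlation."""
    rng = random.Random(seed)
    transcript = [rng.gauss(0, 1) for _ in range(n)]
    other = [rng.gauss(0, 1) for _ in range(n)]
    protein = [a * t + b * o for t, o in zip(transcript, other)]
    return pearson(transcript, protein)
```

With a = 1 and b = 2, the transcript term accounts for only 20% of the protein variance, yet the expected protein-transcript correlation is 1/√5, roughly 0.45.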
Figure 2. Networks of protein-protein and transcript-transcript co-regulation.
Communities are defined on the basis of k-cliques, complete (and in our case undirected) subgraphs of size k, and are comprised of the union of all k-cliques that can be reached from each other through a series of adjacent k-cliques (where adjacency means sharing k−1 nodes). The most stringently defined (highest k value) protein community was a 35-clique community notably enriched for genes involved in amino acid metabolism. Lowering the k threshold to 19 simply expanded this community whereas at k = 18, there appeared five closely related communities (which we will henceforth refer to as a single community) that were enriched for ribosomal proteins. An analogous approach with transcripts yielded communities of 67 and 127 genes involved in amino acid metabolism and ribosome biogenesis, respectively. No gene was in both protein communities and just two genes were in both transcript communities. Communities of co-regulated proteins are on the left in blue and communities of co-regulated transcripts are on the right in green. In both data sets, the ribosomal community is above the amino acid community. Connections are plotted only for the 160 most highly connected genes within each data set, regardless of those genes' membership in any community. The two genes that are present in both transcript communities, VAS1 and YHR020W, are arbitrarily plotted in the ribosomal community and connections involving these genes are not plotted.
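The community definition in this caption (the union of all k-cliques reachable through adjacent k-cliques, where adjacent cliques share k−1 nodes) can be sketched in a few lines. This is a brute-force illustration for small graphs only, not the authors' implementation. The edge list in the usage note is hypothetical, and how edges were derived from the co-variation data is not stated here.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Return communities as sets of nodes: unions of k-cliques linked
    through chains of k-cliques that share k-1 nodes."""
    # Build an undirected adjacency map.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate all k-cliques by brute force (small graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over cliques: merge cliques that share k-1 nodes.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each community is the union of the nodes of its merged cliques.
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return list(groups.values())
```

For a hypothetical graph with edges A-B, A-C, B-C, B-D, C-D, D-E, E-F, E-G and F-G, calling the function with k = 3 merges triangles ABC and BCD (they share the edge B-C) into one community, while triangle EFG forms a second community because the single edge D-E belongs to no triangle.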
Figure 3. Genetic regulation of protein and transcript levels.
(A) Correlation coefficients between protein and transcript levels are plotted in ascending order. The dashed lines at 0.24 and 0.31 indicate the average cutoffs for significance at p<0.05 and p<0.01, respectively, based on 1,000 permutations done separately for each gene. A very small number of genes show significant negative correlations. These may reflect protein-destabilizing polymorphisms for which the cell tries to compensate by increasing transcription, though we note that none of these loci are on the same chromosome as the regulated gene and thus any destabilization would have to act in trans. (B) Distribution of genetic regulators of proteins (top) and transcripts (bottom). The genome was divided into 20 kb bins arranged from the beginning of chromosome 1 on the left to the end of chromosome 16 on the right. The number of linkages in each bin is plotted for proteins on the top and transcripts on the bottom. Dashed vertical lines indicate the borders between chromosomes. The two insets show the locations of genes regulated by the hotspot on chromosome 3 (upper left inset) and those regulated by the hotspot on chromosome 13 (lower right inset), with proteins on the top and transcripts on the bottom. (The hotspot on chromosome 3 is likely caused by a deletion of LEU2 combined with a tightly linked polymorphism in ILV6, and the hotspot on chromosome 13 is likely caused by a polymorphism in BUL2.) The horizontal line in the insets represents the genomic location, going from the beginning of chromosome 1 on the left to the end of chromosome 16 on the right. (C) Locations of genetic regulators specifically for those genes in the protein ribosomal network are plotted as in (B). (D) Locations of genetic regulators specifically for those genes in the protein amino acid network are plotted as in (B).
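The per-gene permutation cutoffs mentioned in panel (A) can be approximated as follows. This is a minimal sketch, assuming a one-sided test on the Pearson correlation and a shuffle of transcript values relative to protein values. The caption does not specify these details, so the function names, the one-sided choice, and the quantile rule are assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_cutoff(protein, transcript, alpha=0.05, n_perm=1000, seed=0):
    """Estimate the correlation cutoff for one gene at significance level
    alpha by shuffling transcript values across segregants (assumed scheme)."""
    rng = random.Random(seed)
    shuffled = list(transcript)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any protein-transcript pairing
        null.append(pearson(protein, shuffled))
    null.sort()
    # Take the value exceeded by roughly alpha * n_perm null correlations.
    n_extreme = max(1, round(alpha * n_perm))
    return null[n_perm - n_extreme]
```

Running `permutation_cutoff` once per gene and averaging the results would give average cutoffs comparable in spirit to the dashed lines in panel (A); the exact values depend on the number of segregants and on the data.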
Figure 4. Regression analysis to test for causality.
Panels A through C show an example of how the coordinates for a single point (representing the linkage between a locus at base pair 662,627 on chromosome 12 and the level of protein ACS2) in panel D were obtained. (A) The levels of protein ACS2 are higher in segregants that inherited SNP 12_662,627 from the RM (vineyard) parent than in segregants that inherited this SNP from the BY (laboratory) parent. The p value for linkage is 6.03×10⁻¹⁰, and the negative log of the p value is 9.22. (B) Each segregant is plotted according to levels of the ACS2 transcript on the x-axis and the ACS2 protein on the y-axis, and a regression line was calculated (slope = 0.62, y intercept = 0.0058). The regression line was used to calculate the residuals for protein levels after protein levels are regressed on transcript levels. The original values for two points (indicated by arrows) are shown as dashed lines and the corresponding residuals are shown as solid lines. (C) Same as panel A, except that rather than plotting the original protein levels (dashed lines in B), the residual protein levels after proteins have been regressed on transcripts (solid lines in B) are plotted. Unlike the original ACS2 protein levels (plotted in A), regressed protein levels do not differ between segregants that inherited the BY and RM alleles of the SNP 12_662,627 marker (p value of 0.178, the negative log of which is 0.75). (D) Each protein linkage with a p value less than 0.05 is plotted according to the negative log of the p value for linkage on the x-axis and the negative log of the p value for linkage after protein levels have been regressed on transcript levels on the y-axis. The horizontal line at 1.3 indicates the cutoff for significant linkage after normalization for transcript levels, and the three proteins that lose linkage after regression are indicated by arrows. Points close to the diagonal line, like those in the gray oval, are essentially unaffected by normalization for transcript levels. Two points were omitted to help with scale. The single point at coordinates (9.22, 0.75) whose calculation is described in panels A through C is noted.
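The procedure in panels A through D can be sketched as follows: regress protein on transcript, take the residuals, and compare the negative log p values before and after. This is a minimal illustration with hypothetical function names. It covers the least-squares fit and the −log10 arithmetic only; the linkage test that produces the p values is not specified in this caption and is not implemented here.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def residuals(transcript, protein):
    """Protein levels left over after regressing protein on transcript."""
    slope, intercept = fit_line(transcript, protein)
    return [p - (slope * t + intercept) for t, p in zip(transcript, protein)]


def neg_log10(p_value):
    """Negative base-10 log of a p value, as used on both axes of panel D."""
    return -math.log10(p_value)


def loses_linkage(p_before, p_after, alpha=0.05):
    """True if a linkage is significant before regression but not after."""
    return p_before < alpha and p_after >= alpha
```

Applied to the numbers in the caption, `neg_log10(6.03e-10)` is about 9.22 and `neg_log10(0.178)` is about 0.75, which gives the point (9.22, 0.75) in panel D. The cutoff line at 1.3 corresponds to `neg_log10(0.05)`, so `loses_linkage(6.03e-10, 0.178)` is true for the ACS2 example.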
Figure 5. cis-linkage for the proteome versus the transcriptome.
Each transcript linkage (p<0.05) is plotted on the left according to the location of the regulatory locus on the x-axis and the regulated gene on the y-axis, and the same is done for proteins on the right. Vertical strips indicate hot spots and points falling on the diagonal (gray line) indicate cis-linkage, i.e. the location of the genetic regulator is the same as that of the regulated gene.
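The notion of cis-linkage used here (the regulatory locus sits at the same location as the regulated gene, so the point falls on the diagonal) can be expressed as a simple predicate. The sketch below is an assumption-laden illustration: the tuple layout, the function names, and in particular the 20 kb tolerance window are hypothetical, since the caption does not state how close a locus must be to count as cis.

```python
def is_cis(locus, gene, window_bp=20_000):
    """Return True if the regulatory locus and the regulated gene lie on the
    same chromosome within window_bp base pairs (window is an assumption).

    locus and gene are (chromosome, position_bp) tuples."""
    same_chromosome = locus[0] == gene[0]
    return same_chromosome and abs(locus[1] - gene[1]) <= window_bp


def cis_fraction(linkages, window_bp=20_000):
    """Fraction of (locus, gene) linkages classified as cis."""
    if not linkages:
        return 0.0
    n_cis = sum(1 for locus, gene in linkages if is_cis(locus, gene, window_bp))
    return n_cis / len(linkages)
```

Computing `cis_fraction` separately for transcript linkages and protein linkages would give a numerical counterpart to the visual comparison of the two diagonals in this figure.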
Figure 6. Similarity between genetic regulation of amino acid transcripts and ribosomal proteins.
Loci that influence transcript and protein levels (p<0.05) are plotted as in Figure 3B–D, but regulators of proteins in the protein ribosomal community are plotted above the horizontal line while regulators of transcripts in the transcript amino acid community are plotted below the line.
