Mol Biol Evol. 2022 May 3;39(5):msac087.
doi: 10.1093/molbev/msac087.

Low Complexity Regions in Mammalian Proteins are Associated with Low Protein Abundance and High Transcript Abundance

Zachery W Dickson et al. Mol Biol Evol. 2022.

Abstract

Low Complexity Regions (LCRs) are present in a surprisingly large number of eukaryotic proteins. These highly repetitive and compositionally biased sequences are often structurally disordered, bind promiscuously, and evolve rapidly. Although LCRs are frequently studied in terms of their evolutionary dynamics, little is known about how they affect the expression of the proteins that contain them. It would be expected that rapidly evolving LCRs are unlikely to be tolerated in strongly conserved, highly abundant proteins, leading to lower overall abundance of proteins that contain LCRs. To test this hypothesis and examine the associations of protein abundance and transcript abundance with the presence of LCRs, we have integrated high-throughput data from across mammals. We have found that LCRs are indeed associated with reduced protein abundance, but are also associated with elevated transcript abundance. These associations are qualitatively consistent across 12 human tissues and nine mammalian species. The differential impacts of LCRs on abundance at the protein and transcript levels are not explained by differences in either protein degradation rates or the inefficiency of translation for LCR-containing proteins. We suggest that rapidly evolving LCRs are a source of selective pressure on the regulatory mechanisms that maintain steady-state protein abundance levels.

Keywords: low complexity, protein abundance, transcript abundance, protein regulation.

Figures

Fig. 1.
Under permutation, all abundance quartiles are significantly different based on LCR status. The distribution of shifts in abundance quartiles after one million permutations is shown; the lower plots are insets of the upper plots. The observed shift for each quartile (dashed lines) can be compared to the matching distribution of location shifts under permutation. LCR+ proteins have lower protein abundance (PAb) at all three quartiles (P<0.023) but higher transcript abundance (TAb) at all three quartiles (P<2×10⁻⁶).
Fig. 2.
Observed shifts in LCR status remain after controlling for several technical explanations, and known biological conditions. Bars represent empirical P values, calculated as the proportion of 100,000 permutations of GTEx and PaxDB human data with quartile shifts at least as large as the observed shift. Bars above the horizontal axis have observed shifts where LCR+ proteins are greater than the null expectation, while bars below the horizontal axis represent observed shifts where LCR+ proteins are below the null expectation. Dotted horizontal lines indicate a significance threshold of 0.05. All results are qualitatively similar to the baseline analysis.
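The empirical P values described in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function name `quartile_shift_pvalues`, the two-sided comparison on absolute shifts, and the use of label shuffling as the permutation scheme are all assumptions made for illustration, and the default number of permutations is far smaller than the 100,000 used in the figure.

```python
import random
import statistics


def quartiles(values):
    """Return the three quartile cut points (Q1, median, Q3)."""
    return statistics.quantiles(values, n=4, method="inclusive")


def quartile_shift_pvalues(abundance, has_lcr, n_perm=10_000, seed=0):
    """Empirical P values for quartile shifts between LCR+ and LCR- proteins.

    Hypothetical helper: the shift is defined here as (LCR+ quartile) minus
    (LCR- quartile), and a permutation counts as "at least as large" when its
    absolute shift matches or exceeds the observed one (an assumption).
    """
    rng = random.Random(seed)

    def shifts(labels):
        pos = [a for a, flag in zip(abundance, labels) if flag]
        neg = [a for a, flag in zip(abundance, labels) if not flag]
        return [p - n for p, n in zip(quartiles(pos), quartiles(neg))]

    observed = shifts(has_lcr)
    labels = list(has_lcr)
    counts = [0, 0, 0]
    for _ in range(n_perm):
        # Shuffling the labels breaks any link between LCR status and abundance
        # while keeping the group sizes fixed.
        rng.shuffle(labels)
        for i, s in enumerate(shifts(labels)):
            if abs(s) >= abs(observed[i]):
                counts[i] += 1
    # Proportion of permutations with a shift at least as large as observed.
    pvals = [c / n_perm for c in counts]
    return observed, pvals
```

A negative observed shift paired with a small P value would correspond to a bar below the horizontal axis in the figure, and a positive shift to a bar above it.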
Fig. 3.
Differences in abundance between LCR+ and LCR− proteins are consistent across tissues in humans. Bars represent empirical P values, calculated as the proportion of 1×10⁶ permutations of GTEx and PaxDB human data with quartile shifts at least as large as the observed shift. Bars above the horizontal axis have observed shifts where LCR+ proteins are greater than the null expectation, while bars below the horizontal axis represent observed shifts where LCR+ proteins are below the null expectation. Dotted horizontal lines indicate a significance threshold of 0.05. TAb shifts are consistently, significantly positive while the PAb shifts are consistently, significantly negative.
Fig. 4.
Logistic regression shows PAb and TAb are significantly associated with the probability of a protein containing an LCR. Regression is based on GTEx and PaxDB human abundance data as well as Schwanhäusser mouse data, controlling for protein degradation and translation efficiency. Regressors are standardized so that the effect magnitudes may be compared. (A) Estimated regression coefficients (maroon lines) are compared to a standard normal distribution (teal). PAb and LCRs are negatively correlated, while the opposite is true for TAb. (B) Split violin plots showing the distributions of the regressors’ values across LCR+ (maroon) and LCR− (gray) proteins. Yellow bars indicate the median and interquartile range for the distribution in which the bar is embedded.
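As a rough illustration of a logistic regression on standardized regressors, the sketch below fits the probability of LCR presence from two abundance columns. It is an assumption-laden toy, not the authors' analysis: `standardize`, `fit_logistic`, and `toy_data` are hypothetical names, the fit uses plain batch gradient ascent rather than any particular statistics package, the data are synthetic, and the degradation and translation-efficiency covariates are omitted.

```python
import math
import random
import statistics


def standardize(column):
    """Rescale a column to zero mean and unit standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.pstdev(column)
    return [(x - mean) / sd for x in column]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(columns, y, lr=0.5, n_iter=300):
    """Fit P(y=1) = sigmoid(b0 + sum_j b_j * x_j) by batch gradient ascent.

    Returns [b0, b1, ...]; because regressors are standardized, the slope
    magnitudes can be compared with one another.
    """
    X = [standardize(c) for c in columns]
    n = len(y)
    coef = [0.0] * (len(X) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(coef)
        for i in range(n):
            z = coef[0] + sum(coef[j + 1] * X[j][i] for j in range(len(X)))
            err = y[i] - sigmoid(z)  # gradient of the log-likelihood
            grad[0] += err
            for j in range(len(X)):
                grad[j + 1] += err * X[j][i]
        coef = [c + lr * g / n for c, g in zip(coef, grad)]
    return coef


def toy_data(n=300, seed=0):
    """Synthetic data (assumption): LCRs are likelier at low PAb and high TAb."""
    rng = random.Random(seed)
    pab = [rng.gauss(0, 1) for _ in range(n)]
    tab = [rng.gauss(0, 1) for _ in range(n)]
    y = [1 if rng.random() < sigmoid(-p + t) else 0 for p, t in zip(pab, tab)]
    return pab, tab, y
```

Calling `fit_logistic([pab, tab], y)` on the output of `toy_data()` returns an intercept followed by one coefficient per regressor; on real data, the sign of each coefficient would indicate the direction of the association with LCR presence.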
Fig. 5.
TAb is associated with an increased probability of an LCR being present, based on consistently processed RNA-Seq data from nine mammalian species. Regressors are standardized so that the effect magnitudes may be compared. (A) Estimated regression coefficients (contrastingly colored lines) are compared to a standard normal distribution (teal). In all cases, TAb is positively associated with the presence of LCRs. (B) Split violin plots showing the distributions of the regressors’ values across LCR+ (maroon) and LCR− (gray) proteins. Yellow bars indicate the median and interquartile range for the distribution in which the bar is embedded.
