Dissecting the role of low-complexity regions in the evolution of vertebrate proteins
- PMID: 22920595
- PMCID: PMC3523016
- DOI: 10.1186/1471-2148-12-155
Abstract
Background: Low-complexity regions (LCRs) in proteins are tracts that are highly enriched in one or a few amino acids. Given their high abundance and their capacity to expand in relatively short periods of time through replication slippage, they can contribute greatly to increasing protein sequence space and generating novel protein functions. However, little is known about the global impact of LCRs on protein evolution.
Results: We have traced back the evolutionary history of 2,802 LCRs from a large set of homologous protein families from H. sapiens, M. musculus, G. gallus, D. rerio and C. intestinalis. Transcription factors and other regulatory functions are overrepresented among proteins containing LCRs. We have found that the gain of novel LCRs is frequently associated with repeat expansion, whereas the loss of LCRs is more often due to the accumulation of amino acid substitutions than to deletions. This dichotomy results in net protein sequence gain over time. We have detected a significant increase in the rate of accumulation of novel LCRs in the ancestral Amniota and mammalian branches, and a reduction in the chicken branch. Alanine- and/or glycine-rich LCRs are overrepresented in recently emerged LCR sets from all branches, suggesting that their expansion is better tolerated than that of other LCR types. LCRs enriched in positively charged amino acids show the opposite pattern, indicating an important effect of purifying selection in their maintenance.
Conclusion: We have performed the first large-scale study on the evolutionary dynamics of LCRs in protein families. The study has shown that the composition of an LCR is an important determinant of its evolutionary pattern.
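To make the definition of an LCR in the Background concrete, the following is a minimal sketch of one common way to flag compositionally biased tracts: a sliding window scored by Shannon entropy. It is an illustrative heuristic only and is not claimed to be the detection method used in this study. The function names, the window length of 12 residues, and the entropy cutoff of 1.5 bits are all assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_low_complexity(seq: str, window: int = 12, max_entropy: float = 1.5):
    """Return merged half-open (start, end) intervals covered by windows
    whose entropy is at or below max_entropy.

    Both `window` and `max_entropy` are illustrative defaults, not values
    taken from the study.
    """
    regions = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= max_entropy:
            start, end = i, i + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous hit: extend it.
                regions[-1] = (regions[-1][0], max(regions[-1][1], end))
            else:
                regions.append((start, end))
    return regions


if __name__ == "__main__":
    # Hypothetical sequence: a 15-residue alanine tract between two
    # compositionally diverse flanks.
    demo = "MKTLVERHWQ" + "A" * 15 + "DEFGHIKLMN"
    for start, end in find_low_complexity(demo):
        print(start, end, demo[start:end])
```

Under these assumed settings, the flagged interval extends a few residues past the alanine tract on each side, because windows that are mostly alanine still fall below the cutoff. A stricter cutoff or a trimming step would tighten the boundaries.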
Similar articles
- Low-complexity regions in fungi display functional groups and are depleted in positively charged amino acids. NAR Genom Bioinform. 2025 Feb 27;7(1):lqaf014. doi: 10.1093/nargab/lqaf014. eCollection 2025 Mar. PMID: 40041205.
- Low Complexity Regions in Mammalian Proteins are Associated with Low Protein Abundance and High Transcript Abundance. Mol Biol Evol. 2022 May 3;39(5):msac087. doi: 10.1093/molbev/msac087. PMID: 35482425.
- Key Role of Amino Acid Repeat Expansions in the Functional Diversification of Duplicated Transcription Factors. Mol Biol Evol. 2015 Sep;32(9):2263-72. doi: 10.1093/molbev/msv103. Epub 2015 Apr 29. PMID: 25931513.
- Disentangling the complexity of low complexity proteins. Brief Bioinform. 2020 Mar 23;21(2):458-472. doi: 10.1093/bib/bbz007. PMID: 30698641. Review.
- Molecular-evolutionary mechanisms for genomic disorders. Curr Opin Genet Dev. 2002 Jun;12(3):312-9. doi: 10.1016/s0959-437x(02)00304-0. PMID: 12076675. Review.
Cited by
- A unified view of low complexity regions (LCRs) across species. Elife. 2022 Sep 13;11:e77058. doi: 10.7554/eLife.77058. PMID: 36098382.
- Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. Nucleic Acids Res. 2019 Dec 2;47(21):10994-11006. doi: 10.1093/nar/gkz841. PMID: 31584084. Review.
- Ancient lineages of the keratin-associated protein (KRTAP) genes and their co-option in the evolution of the hair follicle. BMC Ecol Evol. 2023 Mar 20;23(1):7. doi: 10.1186/s12862-023-02107-z. PMID: 36941546.
- The low-complexity domain of the FUS RNA binding protein self-assembles via the mutually exclusive use of two distinct cross-β cores. Proc Natl Acad Sci U S A. 2021 Oct 19;118(42):e2114412118. doi: 10.1073/pnas.2114412118. PMID: 34654750.
- Increased substitution rates surrounding low-complexity regions within primate proteins. Genome Biol Evol. 2014 Mar;6(3):655-65. doi: 10.1093/gbe/evu042. PMID: 24572016.