Nucleic Acids Res. 2019 Nov 4;47(19):9998-10009.
doi: 10.1093/nar/gkz730.

Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved


Chrysa Ntountoumi et al. Nucleic Acids Res. 2019.

Abstract

We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs have a significant, persistent and highly conserved presence and role in many diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal ions and polysaccharides. In addition, LCRs have been repeatedly identified in very ancient and usually highly expressed proteins of the translation machinery. Finally, based on the amino acid content enriched in certain categories, we have developed a neural network web server that identifies LCRs and accurately predicts whether they can bind nucleic acids or metal ions, or are involved in chaperone functions. An evaluation of the tool showed that it is highly accurate for eukaryotic proteins as well.
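
As a rough illustration of what identifying a low complexity region can mean in practice, the sketch below flags sequence windows whose Shannon entropy falls under a cutoff. This is a simplified entropy filter, not the authors' pipeline and not their neural network server. The function names, the window length of 12 and the cutoff of 2.2 bits are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (in bits) of the residue composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(sequence, window=12, threshold=2.2):
    """Return start positions of windows with entropy <= threshold.

    `window` and `threshold` are illustrative assumptions, not values
    taken from the paper.
    """
    hits = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        if shannon_entropy(segment) <= threshold:
            hits.append(start)
    return hits
```

A homopolymeric stretch such as twelve glutamines has zero entropy and is flagged, while a window of twelve distinct residues is not. Overlapping hits would still need to be merged into regions, which this sketch leaves out.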


Figures

Figure 1. (A) Frequency and (B) enrichment of amino acids in LCRs. Enrichment was based on the background frequency obtained from the complete set of analyzed proteomes. The order of amino acids in the graphs is based on their biosynthetic energetic cost, as calculated in (44).
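
The caption's two quantities can be sketched as follows: per-residue frequency within a set of sequences, and enrichment as the ratio of the LCR frequency to the background frequency. The caption does not give the exact formula, so the plain ratio used here is an assumption (a log ratio would be an equally plausible reading), and the function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def frequencies(sequences):
    """Relative frequency of each standard amino acid across sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def enrichment(lcr_sequences, background_sequences):
    """Ratio of LCR frequency to background frequency per amino acid.

    Assumption: enrichment is a plain ratio. Residues absent from the
    background yield NaN rather than a division error.
    """
    lcr = frequencies(lcr_sequences)
    bg = frequencies(background_sequences)
    return {
        aa: (lcr[aa] / bg[aa]) if bg[aa] > 0 else float("nan")
        for aa in AMINO_ACIDS
    }
```

Under this reading, a value above 1 means the residue is more common in LCRs than in the proteomes as a whole, and a value below 1 means it is depleted.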


References

    1. Wootton J.C. Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput. Chem. 1994; 18:269–285.
    2. Wootton J.C., Drummond M.H. The Q-linker: a class of interdomain sequences found in bacterial multidomain regulatory proteins. Protein Eng. 1989; 2:535–543.
    3. Huntley M.A., Golding G.B. Simple sequences are rare in the Protein Data Bank. Proteins. 2002; 48:134–140.
    4. Muralidharan V., Oksman A., Iwamoto M., Wandless T.J., Goldberg D.E. Asparagine repeat function in a Plasmodium falciparum protein assessed via a regulatable fluorescent affinity tag. Proc. Natl. Acad. Sci. U.S.A. 2011; 108:4411–4416.
    5. Altschul S.F., Gish W., Miller W., Myers E.W., Lipman D.J. Basic local alignment search tool. J. Mol. Biol. 1990; 215:403–410.
