Comparative Study
Genome Res. 2004 Apr;14(4):549-54.
doi: 10.1101/gr.1925704.

Comparative analysis of amino acid repeats in rodents and humans

M Mar Albà et al. Genome Res. 2004 Apr.

Abstract

Amino acid tandem repeats, also called homopolymeric tracts, are extremely abundant in eukaryotic proteins. To gain insight into the genome-wide evolution of these regions in mammals, we analyzed the repeat content in a large data set of rat-mouse-human orthologs. Our results show that human proteins contain more amino acid repeats than rodent proteins and that trinucleotide repeats are also more abundant in human coding sequences. Using the human species as an outgroup, we were able to address differences in repeat loss and repeat gain in the rat and mouse lineages. In this data set, mouse proteins contain substantially more repeats than rat proteins, which can be at least partly attributed to a higher repeat loss in the rat lineage. The data are consistent with a role for trinucleotide slippage in the generation of novel amino acid repeats. We confirm the previously observed functional bias of proteins with repeats, with overrepresentation of transcription factors and DNA-binding proteins. We show that genes encoding amino acid repeats tend to have an unusually high GC content, and that differences in coding GC content among orthologs are directly related to the presence/absence of repeats. We propose that the different GC content isochore structure in rodents and humans may result in an increased amino acid repeat prevalence in the human lineage.

Figures

Figure 1
Tandem amino acid repeat counts in 7039 rat–mouse–human orthologous proteins. A repeat size cut-off of at least 5 repeat units, or at least 8 repeat units, was used.
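The size cut-offs mentioned in this caption can be illustrated with a minimal sketch. It assumes a repeat is an uninterrupted run of one amino acid in a one-letter protein sequence; the detection procedure actually used in the paper is not described in this record, and the function name and example sequence are hypothetical.

```python
import re

def find_homopolymer_tracts(protein, min_units=5):
    """Return (residue, start, length) for each uninterrupted run of a
    single amino acid that is at least min_units residues long.

    Assumption: 'repeat' means a perfect homopolymeric run; the paper's
    own definition may differ (for example, by tolerating interruptions).
    """
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_units - 1))
    return [(m.group(1), m.start(), len(m.group(0)))
            for m in pattern.finditer(protein.upper())]

# Hypothetical sequence: a run of 6 glutamines and a run of 5 alanines.
example = "MA" + "Q" * 6 + "KL" + "A" * 5 + "P"
tracts_5 = find_homopolymer_tracts(example, min_units=5)
tracts_8 = find_homopolymer_tracts(example, min_units=8)
```

With the cut-off of 5 both runs are reported; with the cut-off of 8 neither is, which is why counts under the stricter cut-off can only be equal or lower.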
Figure 2
Gene ontology (GO) functions overrepresented in human proteins containing different amino acid repeat types (P < 0.05, correcting for multiple tests).
Figure 3
Percentage of coding sequences in different GC content ratio intervals. “all” refers to the totality of orthologous coding regions, and “with rep” refers to coding regions encoding amino acid repeats, after discarding the regions encoding repeats.
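Discarding the repeat-encoding regions before measuring GC content can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: GC content is taken as the plain fraction of G and C bases, repeat positions are given in amino acid coordinates, and the function name and example sequence are hypothetical.

```python
def gc_content_without_repeats(cds, repeat_tracts):
    """GC fraction of a coding sequence after dropping the codons that
    encode repeat tracts.

    cds           -- in-frame coding sequence (string of A/C/G/T)
    repeat_tracts -- iterable of (start, length) in amino acid coordinates
    Returns None if no codons remain after masking.
    """
    masked = set()
    for start, length in repeat_tracts:
        masked.update(range(start, start + length))
    # Keep only complete codons whose index is not inside a repeat tract.
    kept = "".join(cds[3 * i:3 * i + 3]
                   for i in range(len(cds) // 3) if i not in masked).upper()
    if not kept:
        return None
    return (kept.count("G") + kept.count("C")) / len(kept)

# Hypothetical gene: start codon, five CAG codons (polyglutamine), GCC, stop.
cds = "ATG" + "CAG" * 5 + "GCC" + "TAA"
gc_all = gc_content_without_repeats(cds, [])
gc_masked = gc_content_without_repeats(cds, [(1, 5)])
```

Masking matters because GC-rich repeat codons such as CAG would otherwise inflate the GC content of exactly those genes that contain repeats.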
Figure 4
Comparison of coding GC content ratio between orthologous coding sequences encoding amino acid repeats. (A) Rat–mouse comparison, (B) rat–human comparison, (C) mouse–human comparison. In the calculation of the GC content, all regions encoding repeats were eliminated. Two types of data sets were used: (1) only one species ortholog contained repeats (e.g., “rat” in A corresponds to rat–mouse orthologous pairs in which only the rat gene encodes amino acid repeats), and (2) both orthologs encoded at least one repeat in an equivalent position (conserved). In each data set, the fraction of pairs in which the GC content ratio was higher in one of the two species (e.g., mouse > rat) was calculated. The number of sequence pairs ranged from 65 in C (mouse) to 721 in A (conserved).
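The per-data-set fraction described in this caption amounts to a simple count over ortholog pairs. The sketch below assumes each pair is represented by two GC values already computed with repeat regions removed; the function name and the numbers are hypothetical and are not taken from the paper.

```python
def fraction_higher(pairs):
    """Given (gc_a, gc_b) values for orthologous pairs, return the fraction
    of pairs with gc_a > gc_b and the fraction with gc_b > gc_a.

    Assumption: ties count toward neither species, so the two fractions
    need not sum to 1.
    """
    if not pairs:
        raise ValueError("no ortholog pairs supplied")
    a_higher = sum(1 for gc_a, gc_b in pairs if gc_a > gc_b)
    b_higher = sum(1 for gc_a, gc_b in pairs if gc_b > gc_a)
    return a_higher / len(pairs), b_higher / len(pairs)

# Hypothetical (mouse, rat) GC values for four ortholog pairs.
mouse_rat = [(0.58, 0.55), (0.61, 0.60), (0.52, 0.54), (0.57, 0.57)]
mouse_frac, rat_frac = fraction_higher(mouse_rat)
```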
