BMC Evol Biol. 2009 Dec 8;9:285.
doi: 10.1186/1471-2148-9-285.

Codon usage is associated with the evolutionary age of genes in metazoan genomes

Yosef Prat et al. BMC Evol Biol. 2009.

Abstract

Background: Codon usage may vary significantly between different organisms and between genes within the same organism. Several evolutionary processes have been postulated to be the predominant determinants of codon usage: selection, mutation, and genetic drift. However, the relative contribution of each of these factors in different species remains debatable. The availability of complete genomes for tens of multicellular organisms provides an opportunity to inspect the relationship between codon usage and the evolutionary age of genes.

Results: We assign an evolutionary age to a gene based on the relative positions of its identified homologues in a standard phylogenetic tree. This yields a classification of all genes in a genome into several evolutionary age classes. The present study starts from the observation that each age class of genes has a unique codon usage and proceeds to provide a quantitative analysis of the codon usage in these classes. This observation is made for the genomes of Homo sapiens, Mus musculus, and Drosophila melanogaster. It is even more remarkable that the differences between codon usages in different age groups exhibit similar and consistent behavior in various organisms. While we find that GC content and gene length are also associated with the evolutionary age of genes, they can provide only a partial explanation for the observed codon usage.

Conclusion: While factors such as GC content, mutational bias, and selection shape the codon usage in a genome, the evolutionary history of an organism over hundreds of millions of years is an overlooked property that is strongly linked to GC content, protein length, and, even more significantly, to the codon usage of metazoan genomes.

Figures

Figure 1
Phylogenetic tree used to define the relative age groups for the human and mouse genes. The labeled age classes were defined as the major evolutionary branching points with respect to the 27 genomes analyzed and the species of interest (human or mouse). Thus, genes are grouped according to their estimated time of appearance in evolution. For example, human genes in Age group 5 are presumed to have appeared after the split between birds and mammals, since they do not have homologues in the non-mammal species studied. On the other hand, they already existed in the least common ancestor (LCA) of all mammals, as evidenced by their respective homologues in O. anatinus. The species included in the analysis are: C. elegans (worm), D. melanogaster (fruit fly), T. rubripes (fugu), X. tropicalis (xenopus), G. gallus (chicken), O. anatinus (platypus), M. domestica (gray short-tailed opossum), B. taurus (cow), C. familiaris (dog), D. novemcinctus (nine-banded armadillo), E. telfairi (lesser hedgehog tenrec), E. europaeus (western European hedgehog), F. catus (cat), L. africana (elephant), M. lucifugus (bat), S. araneus (common shrew), C. porcellus (guinea pig), M. musculus (mouse), O. princeps (pika), O. cuniculus (rabbit), R. norvegicus (rat), S. tridecemlineatus (squirrel), P. troglodytes (chimpanzee), M. mulatta (macaque), M. murinus (gray mouse lemur), O. garnettii (bushbaby), and H. sapiens (human). For the analysis of the human genome, Age 1 includes only primate-specific genes, while for the analysis of the mouse genome, Age 1 includes only rabbit- and rodent-specific genes. Note that the evolutionary time scale (in millions of years ago, MYA) is approximate.
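The age-group definition in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `assign_age_class` and the dictionary `SPECIES_AGE_CLASS` are hypothetical, and only the chimpanzee (Age 1, primate-specific) and platypus (Age 5) entries follow from the caption; the chicken and fruit fly class numbers are placeholders.

```python
# Hypothetical mapping from a reference species to the age class of the
# branching point that separates it from the species of interest (human).
# Larger numbers mean older branching points. Only the chimpanzee and
# platypus values are taken from the caption; the rest are placeholders.
SPECIES_AGE_CLASS = {
    "P. troglodytes": 1,   # primate: Age 1 for human genes
    "O. anatinus": 5,      # platypus: homologue here implies Age 5
    "G. gallus": 6,        # assumed class number, for illustration
    "D. melanogaster": 9,  # assumed class number, for illustration
}


def assign_age_class(homologue_species, species_age_class, youngest=1):
    """Return the oldest age class among species holding a homologue.

    A gene with no homologue in any listed species is assumed to fall
    in the youngest class.
    """
    classes = [
        species_age_class[s]
        for s in homologue_species
        if s in species_age_class
    ]
    return max(classes, default=youngest)


# A human gene with homologues in chimpanzee and platypus, but in no
# non-mammal species, is placed in Age group 5.
print(assign_age_class({"P. troglodytes", "O. anatinus"}, SPECIES_AGE_CLASS))
```

Taking the maximum reflects the idea that a gene is at least as old as the most distant lineage in which a homologue is found.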
Figure 2
Age-dependent codon usages for representative codons. The age-dependent codon usages for the arginine (top), threonine (middle), and cysteine (bottom) codons for the mouse and human genes are shown. The right column shows the codon usages for these amino acids after a random reshuffling of the age assignments for the human genes. See Figure 1 for the definition of the age groups used.
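A per-amino-acid codon usage of the kind plotted here could be computed as below. The sketch assumes that usage is measured as the fraction of each synonymous codon among all codons encoding the same amino acid within one age group; the paper's exact measure may differ. The names `SYNONYMOUS` and `codon_fractions` are hypothetical, and the codon sets come from the standard genetic code.

```python
from collections import Counter

# Synonymous codons (standard genetic code) for the three amino acids
# shown in the figure.
SYNONYMOUS = {
    "Arg": ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"],
    "Thr": ["ACT", "ACC", "ACA", "ACG"],
    "Cys": ["TGT", "TGC"],
}


def codon_fractions(coding_sequences, amino_acid):
    """Fraction of each synonymous codon for one amino acid.

    coding_sequences: in-frame DNA coding sequences for the genes of a
    single age group. Trailing bases that do not fill a codon are ignored.
    """
    codons = SYNONYMOUS[amino_acid]
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            if codon in codons:
                counts[codon] += 1
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in codons}


# Toy age group containing one short sequence: TGT TGC TGC.
print(codon_fractions(["TGTTGCTGC"], "Cys"))
```

Running the function once per age group, and once more after shuffling the age labels, would give the two kinds of curves the caption contrasts.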
Figure 3
Age-dependent GC content and length of human, mouse, and fly genes. For each age group, the average GC content of the coding regions of the genes, or average protein length, is shown. See Figure 1 and Table 2 for the definition of the age groups used. For each of human, mouse, and fly, the variance between age groups for both GC content and protein length is statistically significant (permutation test, p < 10⁻⁶).
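A permutation test of this kind can be sketched as follows. The caption does not specify the test statistic, so the code assumes it is the variance of the per-age-group means, compared against the same statistic after shuffling the age labels. All function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def gc_content(seq):
    """Fraction of G and C bases in a coding sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def between_group_variance(values, labels):
    """Variance of the group means (assumed test statistic)."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return pvariance([mean(vs) for vs in groups.values()])


def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Estimate how often shuffled age labels give a variance at least
    as large as the observed one."""
    rng = random.Random(seed)
    observed = between_group_variance(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if between_group_variance(values, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Toy data: two age groups with clearly different GC content.
values = [0.40] * 20 + [0.60] * 20
labels = [1] * 20 + [2] * 20
print(permutation_p_value(values, labels))
```

With this estimator the smallest attainable p-value is 1 / (n_permutations + 1), so resolving a threshold as small as the one quoted in the caption would need more than a million permutations.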
Figure 4
Age-dependent codon usage for fixed GC content and length. For each of human, mouse, and fly (left to right), its genes were binned by either their GC content (top) or their length (bottom). For each such bin, the number of codons (out of the 59 analyzed) with statistically significant age-dependent variance is shown. A codon was labeled age-dependent if its variance between age groups differed significantly from that expected at random (false discovery rate (FDR), corrected for multiple hypotheses, < 0.05). Red 'X' marks indicate bins for which the subdivision into age groups left some groups with fewer than 5 genes; the results for such bins should be disregarded, since they are statistically inadequate.
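The per-bin count of age-dependent codons could be obtained as in the sketch below. The caption states only that an FDR correction at 0.05 was applied; the Benjamini-Hochberg procedure used here is an assumption, as are the function names. The input is one p-value per codon for a single bin.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    under the Benjamini-Hochberg procedure (assumed FDR method)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    # Find the largest rank whose p-value is under its threshold.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


def bin_is_adequate(genes_per_age_group, min_genes=5):
    """A bin is kept only if every age group has at least min_genes genes."""
    return all(n >= min_genes for n in genes_per_age_group)


# Toy example with four codons in one bin.
flags = benjamini_hochberg([0.001, 0.04, 0.03, 0.5])
print(sum(flags), "age-dependent codon(s)")
print(bin_is_adequate([12, 7, 4]))
```

Bins failing the adequacy check correspond to those marked with a red 'X' in the figure.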
Figure 5
Codon age-responsiveness for the 59 degenerately coding codons. For each codon, the age-dependent variance was calculated. For each genome, the 59 resulting variance scores were rank-ordered. The 59 codons are sorted by their rank-ordering in the human genome, and the rank-ordering in the mouse genome is compared. The codon rank-orderings of the human and mouse genomes are strongly similar overall (Spearman's rank correlation test: ρ = 0.92, p-value < 10⁻⁶). The codons are colored according to a standard biochemical grouping of the amino acids they encode.
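The rank comparison between two genomes can be illustrated with a small pure-Python Spearman correlation, computed as the Pearson correlation of the ranks with ties given their average rank. The function names and the toy variance scores are hypothetical; the sketch does not reproduce the reported value.

```python
def average_ranks(values):
    """Rank values from 1, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors.

    Assumes both inputs have the same length and are not constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Toy per-codon variance scores for two genomes (made-up numbers).
human_scores = [0.9, 0.5, 0.7, 0.1]
mouse_scores = [0.8, 0.4, 0.6, 0.2]
print(spearman_rho(human_scores, mouse_scores))
```

Because only ranks enter the calculation, the two genomes' variance scores do not need to be on the same scale.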
