BMC Bioinformatics. 2017 Sep 2;18(1):391.
doi: 10.1186/s12859-017-1793-7.

A new and updated resource for codon usage tables

John Athey et al. BMC Bioinformatics.

Abstract

Background: Due to the degeneracy of the genetic code, most amino acids can be encoded by multiple synonymous codons. Synonymous codons naturally occur with different frequencies in different organisms. The choice of codons may affect protein expression, structure, and function. Recombinant gene technologies commonly exploit the first of these effects through a technique termed codon optimization, in which codons are replaced with synonymous ones to increase protein expression. This technique relies on accurate knowledge of codon usage frequencies. Accurately quantifying codon usage bias for different organisms is useful not only for codon optimization, but also for evolutionary and translation studies: phylogenetic relationships among organisms and host-pathogen co-evolution may be explored through codon usage similarities. Furthermore, codon usage has been shown to affect protein structure and function by interfering with translation kinetics and cotranslational protein folding.
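
As a rough illustration of the codon optimization idea described above, the following Python sketch replaces every codon with the most frequent synonymous codon from a usage table. It is a deliberately naive strategy, not the method used by any particular tool, and the table `TOY_USAGE`, its numbers, and the function names are hypothetical placeholders rather than values from HIVE-CUTs.

```python
# Hypothetical toy table: amino acid -> {codon: frequency per 1000 codons}.
# The numbers are illustrative placeholders, not real codon usage data.
TOY_USAGE = {
    "M": {"ATG": 22.0},
    "K": {"AAA": 24.0, "AAG": 32.0},
    "L": {"CTG": 40.0, "CTC": 19.0, "TTA": 7.0},
    "*": {"TAA": 1.0, "TGA": 1.5},
}


def build_lookup(usage):
    """Return (codon -> amino acid, amino acid -> most frequent codon)."""
    codon_to_aa = {}
    best_codon = {}
    for aa, codons in usage.items():
        best_codon[aa] = max(codons, key=codons.get)
        for codon in codons:
            codon_to_aa[codon] = aa
    return codon_to_aa, best_codon


def naive_optimize(cds, usage):
    """Swap each codon for the most frequent synonymous codon in `usage`."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codon_to_aa, best_codon = build_lookup(usage)
    optimized = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon not in codon_to_aa:
            raise ValueError(f"codon {codon!r} is not in the usage table")
        optimized.append(best_codon[codon_to_aa[codon]])
    return "".join(optimized)


if __name__ == "__main__":
    print(naive_optimize("ATGAAATTATAA", TOY_USAGE))
```

Because the result depends entirely on the frequencies supplied, an outdated or inaccurate table changes which codons are treated as preferred, which is why the accuracy of the underlying table matters.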

Results: Despite the obvious need for accurate codon usage tables, currently available resources are either limited in scope, encompassing only organisms from specific domains of life, or greatly outdated. Taking advantage of the exponential growth of GenBank and the creation of NCBI's RefSeq database, we have developed a new database, the High-performance Integrated Virtual Environment-Codon Usage Tables (HIVE-CUTs), to present and analyse codon usage tables for every organism with publicly available sequencing data. Compared to existing databases, this new database is more comprehensive, addresses concerns that limited the accuracy of earlier databases, and provides several new functionalities, such as the ability to view and compare codon usage between individual organisms and across taxonomical clades, through graphical representation or through commonly used indices. In addition, it is being routinely updated to keep up with the continuous flow of new data in GenBank and RefSeq.
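
To make the contents of a codon usage table concrete, the sketch below computes two quantities of the kind the database reports: codon frequencies per 1000 codons, and GC content overall and at each codon position. It is a minimal example, not the HIVE-CUTs pipeline. It assumes the input is a list of in-frame coding sequences, computes GC content over those sequences only, and uses function names chosen purely for illustration.

```python
from collections import Counter


def iter_codons(cds):
    """Yield successive in-frame codons, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    for i in range(0, usable, 3):
        yield cds[i:i + 3]


def codon_usage_per_thousand(cds_list):
    """Return {codon: occurrences per 1000 codons} pooled over all sequences."""
    counts = Counter()
    for cds in cds_list:
        counts.update(iter_codons(cds))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete codons found")
    return {codon: 1000.0 * n / total for codon, n in counts.items()}


def gc_by_position(cds_list):
    """Return (overall GC %, [GC % at codon positions 1, 2 and 3])."""
    gc = [0, 0, 0]
    n_codons = 0
    for cds in cds_list:
        for codon in iter_codons(cds):
            n_codons += 1
            for pos, base in enumerate(codon):
                if base in "GC":
                    gc[pos] += 1
    if n_codons == 0:
        raise ValueError("no complete codons found")
    per_position = [100.0 * g / n_codons for g in gc]
    overall = 100.0 * sum(gc) / (3 * n_codons)
    return overall, per_position


if __name__ == "__main__":
    sequences = ["ATGAAAAAGTAA"]
    print(codon_usage_per_thousand(sequences))
    print(gc_by_position(sequences))
```

Pooling counts across all coding sequences of an organism before normalizing, as done here, weights each codon occurrence equally; averaging per-gene frequencies instead would be a different design choice.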

Conclusion: Given the impact of codon usage bias on recombinant gene technologies, this database will facilitate effective development and review of recombinant drug products and will be instrumental in a wide area of biological research. The database is available at hive.biochemistry.gwu.edu/review/codon.

Keywords: Codon optimization; Codon usage bias; Recombinant protein therapeutics; Translational kinetics.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
HIVE Platform [36]. A client process submits an information request from an HTML form or web application to the HIVE server; the request is queued for execution and computed inside the distributed environment. The front end monitors the status of the request, and once the computation is finished, the data are retrieved and visualizations are prepared and sent to the client’s web page.
Fig. 2
Screenshot of HIVE-CUTs webpage with Homo sapiens results. Results include codon usage frequencies per 1000 codons as a plain text table (top left) and graph (bottom), in the default order specified by NCBI’s standard genetic code definition. The GC frequency in the genome and at each codon position is also presented in a graph (top right). The help panel is included (right)
Fig. 3
Screenshots of HIVE-CUTs webpage with Candida albicans, Saccharomyces cerevisiae, and Aspergillus fumigatus results. (a) Taxonomy tree showing the evolutionary relationship between the species. (b) The GC frequency in the genome and at each position of the codon plotted for all three species for comparison. (c) Codon frequencies per 1000 codons plotted for all three species.
Fig. 4
Differences in codon optimization based on the HIVE-CUT and the Kazusa codon usage tables. The HIVE-CUT and the Kazusa codon usage tables were entered in the codon optimization algorithm ATGme to determine the number of suboptimal codons [39]. The Venn diagram shows how many codons were determined to be sub-optimal in the human coagulation factor IX gene for expression in CHO (Cricetulus griseus) cells. The codon usage tables used appear in Additional file 2
Fig. 5
Rare codon cluster distribution based on the HIVE-CUT and the Kazusa codon usage tables. The %MinMax algorithm [40] was implemented to generate results for the interferon beta-1b gene sequence of Homo sapiens and Gorilla gorilla gorilla. The human and gorilla proteins have similar amino acid sequences and show similar results with the HIVE-CUT; however, highly divergent results were observed with Kazusa CUTs. The codon usage tables for these species used in the calculation of the translation rate appear in Additional file 3
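
The kind of sliding-window calculation behind Fig. 5 can be sketched as follows. This is one reading of the general %MinMax idea, not the implementation in [40] and not the authors' code: each window's actual codon frequencies are compared with the highest, lowest, and average frequencies available among synonymous codons, giving a score between -100 (rarest codons throughout) and +100 (most common codons throughout). The table `TOY_USAGE`, its numbers, the default window of 18 codons, and the choice to return 0.0 when a window has no synonymous alternatives are all assumptions made for illustration.

```python
# Hypothetical toy table: amino acid -> {codon: frequency per 1000 codons}.
# The numbers are illustrative placeholders, not real codon usage data.
TOY_USAGE = {
    "K": {"AAA": 24.0, "AAG": 32.0},
    "L": {"CTG": 40.0, "CTC": 20.0, "TTA": 6.0},
}


def percent_min_max(codons, usage, window=18):
    """Return one %MinMax-style score per sliding window of `window` codons."""
    codon_to_aa = {c: aa for aa, table in usage.items() for c in table}
    scores = []
    for start in range(0, len(codons) - window + 1):
        actual = highest = lowest = average = 0.0
        for codon in codons[start:start + window]:
            table = usage[codon_to_aa[codon]]
            freqs = list(table.values())
            actual += table[codon]
            highest += max(freqs)
            lowest += min(freqs)
            average += sum(freqs) / len(freqs)
        # Window sums are used instead of means; the ratios are identical.
        if actual >= average:
            span = highest - average
            score = 100.0 * (actual - average) / span if span else 0.0
        else:
            span = average - lowest
            score = -100.0 * (average - actual) / span if span else 0.0
        scores.append(score)
    return scores


if __name__ == "__main__":
    print(percent_min_max(["AAG", "CTG", "AAA", "TTA"], TOY_USAGE, window=2))
```

Since every term in the score comes from the usage table, two tables that disagree about which codons are rare will place rare codon clusters differently for the same sequence.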

References

    1. Grantham R, Gautier C, Gouy M, Mercier R, Pave A. Codon catalog usage and the genome hypothesis. Nucleic Acids Res. 1980;8(1):r49–r62. doi: 10.1093/nar/8.1.197-c.
    2. dos Reis M, Savva R, Wernisch L. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 2004;32(17):5036–5044. doi: 10.1093/nar/gkh834.
    3. Sharp PM, Averof M, Lloyd AT, Matassi G, Peden JF. DNA sequence evolution: the sounds of silence. Philos Trans R Soc Lond Ser B Biol Sci. 1995;349(1329):241–247. doi: 10.1098/rstb.1995.0108.
    4. Duret L. Evolution of synonymous codon usage in metazoans. Curr Opin Genet Dev. 2002;12(6):640–649. doi: 10.1016/S0959-437X(02)00353-2.
    5. Plotkin JB, Kudla G. Synonymous but not the same: the causes and consequences of codon bias. Nat Rev Genet. 2011;12(1):32–42. doi: 10.1038/nrg2899.