Nucleic Acids Res. 2011 May;39(10):4220-34.
doi: 10.1093/nar/gkr007. Epub 2011 Jan 25.

Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

Ivaylo P Ivanov et al. Nucleic Acids Res. 2011 May.

Abstract

In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized, both for increased coding capacity and potentially for novel regulatory mechanisms, remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes, including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we were able to obtain independent experimental evidence of the expression of non-AUG-initiated products from previously published literature and ribosome profiling data.
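
As an illustration of the sequence-level ideas in the abstract, the following minimal Python sketch enumerates the codons that differ from AUG by a single nucleotide and scans the region upstream of an annotated start codon, in frame, back to the closest in-frame stop codon. It is not the authors' pipeline, which relies on phylogenetic analysis of orthologous sequences; the function names, the toy mRNA, and the `PREFERRED` set (taken from the six codons named in the abstract) are assumptions made for this example.

```python
# Hypothetical helper names; a sketch only, not the published method.
BASES = "ACGU"
STOPS = {"UAA", "UAG", "UGA"}
# The six near-cognate codons highlighted in the abstract.
PREFERRED = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}


def near_cognate_codons(start="AUG"):
    """Return every codon that differs from `start` at exactly one position."""
    codons = []
    for i, original in enumerate(start):
        for base in BASES:
            if base != original:
                codons.append(start[:i] + base + start[i + 1:])
    return codons


def upstream_extension_start(mrna, cds_start):
    """Walk upstream from the annotated start, codon by codon and in frame,
    stopping at the closest in-frame stop codon or at the 5' end.
    Returns the index of the first codon of the potential extension."""
    pos = cds_start
    while pos - 3 >= 0 and mrna[pos - 3:pos] not in STOPS:
        pos -= 3
    return pos


def candidate_starts(mrna, cds_start, codons=PREFERRED):
    """List (index, codon) pairs for in-frame candidate initiators
    lying inside the potential upstream extension."""
    ext = upstream_extension_start(mrna, cds_start)
    return [
        (i, mrna[i:i + 3])
        for i in range(ext, cds_start, 3)
        if mrna[i:i + 3] in codons
    ]


if __name__ == "__main__":
    toy_mrna = "GGUAACUGGCCAUGAAAUAA"  # invented sequence, annotated AUG at index 11
    print(near_cognate_codons())
    print(candidate_starts(toy_mrna, 11))
```

A scan like this only proposes candidates. Deciding whether an extension is actually translated requires the evolutionary or experimental evidence described in the abstract.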

Figures

Figure 1.
Five known molecular mechanisms responsible for the initiation of translation upstream of the first 5′ in-frame AUG codon. mRNAs are shown as horizontal lines. Dark grey boxes represent annotated CDS regions. Light grey boxes represent extensions of CDSs upstream of annotated AUG codons up to the closest in-frame stop codon. Black boxes denoted as P5EC represent upstream regions where codons in-frame with annotated CDSs evolve under purifying selection. Diagonal stripes are used to denote alternatively spliced exons.
Figure 2.
Pipeline of RefSeq mRNA analysis for the identification of conserved 5′ CDS extensions (P5ECs). White boxes indicate annotated CDSs. Black boxes correspond to 5′ in-frame codon extensions up to the closest in-frame stop codon. Xs correspond to the deleted regions of human–mouse alignments prior to Ka/Ks analysis.
Figure 3.
Histogram of Ka/Ks values for mRNA sequences with known 5′ extensions. White bars represent mRNAs for which alternative transcripts with extended CDSs are known and therefore corresponding extensions are known to be translated in alternative transcripts. Sequences of these extensions are expected to evolve as protein coding sequences and were used as an internal control in this study. Black bars represent the remaining mRNAs for which it is not known whether alternative mRNA isoforms exist. Curves indicate the number of genes (y-axis) with Ka/Ks below a particular value (x-axis).
Figure 4.
Scatter plots of Ka/Ks ratios for the alignments of the sequences corresponding to P5ECs from different mRNAs (y-axis) in relation to the level of protein identity (bottom panels), and the lengths of P5ECs (top panels). The right-hand panels correspond to mRNAs for which transcript variants with 5′-extended CDSs are known. The left panels correspond to the remaining mRNAs.
Figure 5.
Boxplots of non-AUG CDS extension length distributions for previously known cases and those identified in this study.
Figure 6.
Weblogo representation of the region surrounding the known and putative conserved non-AUG initiation sites in humans. Numbering is relative to the first nucleotide of the start codon. (A) Representation for the 42 sequences with newly identified extensions. (B) Representation for the 17 sequences with previously identified and conserved extensions. (C) Representation of all AUG start sites of humans [the frequencies for nucleotide occurrence at each position for the human mRNAs were obtained from the Transterm database (73)].
Figure 7.
Plots showing density of mRNA fragments protected by ribosomes for NM_004494 and NM_001010858. The position of the annotated AUG codon was taken as zero; relative coordinates of stop codons and predicted non-AUG initiators are indicated. Regions corresponding to annotated CDSs are highlighted in dark grey; regions corresponding to non-AUG-initiated extensions are highlighted in light grey. The presence of ribosomal footprints in the region of an extension indicates that the initiation of translation takes place upstream of the annotated CDS.
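
The coordinate convention described in the Figure 7 caption (annotated AUG taken as zero) can be sketched in a few lines of Python. The sketch assumes footprints are summarized as a list of transcript positions; the function names and the toy data are hypothetical and are not taken from the paper or from any ribosome profiling toolkit.

```python
from collections import Counter


def relative_density(footprint_positions, aug_index):
    """Count footprints per position, re-expressed relative to the
    annotated AUG (so the A of the annotated AUG is coordinate 0)."""
    return Counter(p - aug_index for p in footprint_positions)


def count_in_region(density, start, end):
    """Sum footprints whose relative coordinate lies in [start, end)."""
    return sum(n for rel, n in density.items() if start <= rel < end)


if __name__ == "__main__":
    # Invented positions for a transcript whose annotated AUG is at index 11
    # and whose putative non-AUG initiator is at index 5 (relative -6).
    positions = [2, 5, 5, 8, 11, 14]
    density = relative_density(positions, aug_index=11)
    print(count_in_region(density, -6, 0))  # footprints inside the extension
    print(count_in_region(density, 0, 9))   # footprints in the annotated CDS
```

Footprints between the putative initiator and the annotated AUG are the signal the caption refers to. A real analysis would also need to account for read length and mapping offsets, which this sketch ignores.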

