BMC Bioinformatics. 2014 Feb 1;15:36.
doi: 10.1186/1471-2105-15-36.

uPEPperoni: an online tool for upstream open reading frame location and analysis of transcript conservation

Adam Skarshewski et al. BMC Bioinformatics. 2014.

Abstract

Background: Several small open reading frames located within the 5' untranslated regions of mRNAs have recently been shown to be translated. In humans, about 50% of mRNAs contain at least one upstream open reading frame, which represents a large resource of coding potential. We propose that some upstream open reading frames encode peptides that are functional and contribute to proteome complexity in humans and other organisms. We use the term uPEPs to describe peptides encoded by upstream open reading frames.

Results: We have developed an online tool, termed uPEPperoni, to facilitate the identification of putative bioactive peptides. uPEPperoni detects conserved upstream open reading frames in eukaryotic transcripts by comparing query nucleotide sequences against mRNA sequences within the NCBI RefSeq database. The algorithm first locates the main coding sequence and then searches for open reading frames 5' to the main start codon which are subsequently analysed for conservation. uPEPperoni also determines the substitution frequency for both the upstream open reading frames and the main coding sequence. In addition, the uPEPperoni tool produces sequence identity heatmaps which allow rapid visual inspection of conserved regions in paired mRNAs.
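The ORF-location step described above can be illustrated with a minimal sketch. It is not the uPEPperoni implementation: the function name `find_uorfs`, the `min_codons` parameter, and the assumption that the main start codon position (`cds_start`) is already known from annotation are all hypothetical choices made for illustration. The conservation analysis against RefSeq is not shown.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_uorfs(mrna, cds_start, min_codons=2):
    """List ATG-initiated ORFs that begin 5' of the main start codon.

    mrna       : transcript sequence (DNA or RNA alphabet, any case)
    cds_start  : 0-based index of the main ATG (assumed known, e.g. from annotation)
    min_codons : hypothetical filter on the number of sense codons, ATG included

    Returns (start, end, overlaps_cds) tuples; end is exclusive and
    includes the stop codon.
    """
    seq = mrna.upper().replace("U", "T")
    uorfs = []
    for start in range(cds_start):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk downstream codon by codon until the first in-frame stop.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                end = pos + 3
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, end, end > cds_start))
                break
    return uorfs

# Toy transcript: 5' leader with one uORF (ATG AAA TAG), then the main CDS.
toy = "GGATGAAATAGCCATGGCCTAA"
print(find_uorfs(toy, cds_start=13))
```

Upstream ATGs with no in-frame stop codon are skipped in this sketch. An upstream ATG in frame with the main start codon would be reported with `overlaps_cds` set to `True`, although it may be better treated as an N-terminal extension of the main coding sequence.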

Conclusions: uPEPperoni features user-nominated settings, including nucleotide match/mismatch scores, gap penalties, Ka/Ks ratios and output mode. The heatmap output shows levels of identity between any two sequences and provides easy recognition of conserved regions. Furthermore, this web tool allows comparison of evolutionary pressures acting on the upstream open reading frame against other regions of the mRNA. The heatmap web applet can also be used to visualise the degree of conservation in any pair of sequences. uPEPperoni is freely available on an interactive web server at http://upep-scmb.biosci.uq.edu.au.
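One plausible way to derive the values behind an identity heatmap is to score fixed-size windows along a pairwise alignment. The sketch below assumes that approach. The function name `window_identity` and the `window` and `step` parameters are hypothetical, and the abstract does not state how uPEPperoni computes its heatmaps.

```python
def window_identity(aligned_a, aligned_b, window=10, step=5):
    """Fraction of identical, non-gap columns in each window of a pairwise alignment.

    aligned_a, aligned_b : gapped sequences of equal length ('-' marks a gap)
    window, step         : hypothetical window size and stride, in alignment columns
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    last_start = max(len(aligned_a) - window, 0)
    for i in range(0, last_start + 1, step):
        cols = list(zip(aligned_a[i:i + window], aligned_b[i:i + window]))
        # A column counts as a match only if both residues agree and neither is a gap.
        matches = sum(1 for x, y in cols if x == y and x != "-")
        scores.append(matches / len(cols) if cols else 0.0)
    return scores

# Two short aligned fragments; the second window contains a mismatch and a gap.
print(window_identity("ATGAAATAGC", "ATGAAGTA-C", window=5, step=5))
```

Each score can then be mapped to a colour. Scoring the uORF region and the main coding sequence separately gives a rough comparison of how strongly each is conserved.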

Figures

Figure 1
Screenshots of the search, alignment and help pages of uPEPperoni. (A) The conserved uPEP search page showing the user-selectable settings for the RefSeq database, Ka/Ks ratio, reference heatmaps, alignment parameters and heatmap generation. (B) The heatmap alignment page showing the user-selectable settings for visual representation of the main coding sequence (CDS) and uORFs and the search parameters for uORF length, the extent of uORF overlap into the CDS and the region of the transcript to be searched. (C) The help page. uPEPperoni is hosted on an Apache server on a Linux platform and is publicly accessible free of charge at http://upep-scmb.biosci.uq.edu.au. Full documentation of uPEPperoni is also accessible via links on the website. The uORF reference database is automatically rebuilt on the server shortly after each major RefSeq release, and previous uORF reference databases are archived. The RefSeq release version number from which the reference database is derived is shown on the web page.
Figure 2
Example output showing the heatmaps produced by querying the mRNA sequence of the Homo sapiens Hairless (HR) transcript (NM_005144) against Mus musculus Hairless (Hr) (NM_021877). The solid bars above the heatmap indicate the ORFs on the transcript. The output lists the most conserved uPEPs first. The heatmap generated by the query sequence is shown first; in this case human HR aligned with mouse Hr transcript. The reciprocal heatmap generated using the reference sequence is shown below (mouse Hr transcript versus human HR). The inclusion of the Reference Alignment is selectable by the user. The unformatted aligned sequence can be viewed using a hyperlink shown above the heatmap.
Figure 3
Several heatmaps of aligned transcript pairs can be combined to provide a visual snapshot of sequence conservation. (A) Heatmaps for each pairwise analysis of the human transcript encoding protein tyrosine phosphatase type IVA, member 1 (Ptp4a1) (NM_003463) with the orthologous non-human transcript are shown. Black lines above each heatmap mark the position of the conserved uPEP and CDS for that species. Note the conservation of this uPEP even as the phylogenetic distance between the comparison species (on the right) widens. (B) ClustalW alignment of the Ptp4a1 uPEP, translated in silico from the conserved uORF. The numbers below the bar graph represent the conservation of each individual amino acid, where 10 (shown as an asterisk (*)) indicates identity across all species.
