Front Genet. 2018 Apr 5;9:96.
doi: 10.3389/fgene.2018.00096. eCollection 2018.

FSPP: A Tool for Genome-Wide Prediction of smORF-Encoded Peptides and Their Functions

Hui Li et al. Front Genet. 2018.

Abstract

smORFs are small open reading frames of fewer than 100 codons. Recent low-throughput experiments have shown that many smORF-encoded peptides (SEPs) play crucial roles in processes such as the regulation of transcription or translation, transport across membranes, and antimicrobial activity. To identify more functional SEPs, genome-wide prediction tools are needed to guide low-throughput experiments. In this study, we present a functional smORF-encoded peptides predictor (FSPP), which predicts authentic SEPs and their functions in a high-throughput manner. FSPP uses the overlap of the SEPs detected by Ribo-seq and by mass spectrometry as its target set. From expression data at the transcription and translation levels, FSPP builds two co-expression networks. Combining these with co-location relations, FSPP constructs a compound network and then annotates SEPs with the functions of adjacent nodes. Tested on 38 sequenced samples from 5 human cell lines, FSPP successfully predicted 856 out of 960 annotated proteins. FSPP also highlighted 568 functional SEPs from these samples. The roles predicted by FSPP were consistent with known functions. These results suggest that FSPP is a reliable tool for the identification of functional small peptides. FSPP source code is available at https://www.bioinfo.org/FSPP.
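The annotation step described in the abstract (build a co-expression network, then give a SEP the functions of its adjacent nodes) can be illustrated with a minimal Python sketch. This is not FSPP's implementation: the Pearson measure, the 0.9 threshold, the simple vote counting, and every name and value below are assumptions made for illustration, and the sketch omits the significance testing of sub-networks that FSPP performs.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.9):
    """Link each pair of nodes whose profiles correlate at or above threshold.

    The threshold is an assumed value, not one taken from the paper.
    """
    names = sorted(profiles)
    edges = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                edges.add((a, b))
    return edges


def annotate_by_neighbors(target, edges, known_functions):
    """Count the functions of annotated nodes adjacent to the target."""
    votes = Counter()
    for a, b in edges:
        if target in (a, b):
            neighbor = b if a == target else a
            votes.update(known_functions.get(neighbor, ()))
    return votes.most_common()


# Toy expression profiles and annotations, invented for illustration only.
profiles = {
    "sep_1": [1.0, 2.0, 3.0, 4.0],
    "gene_a": [2.0, 4.1, 6.0, 8.2],
    "gene_b": [1.1, 2.1, 2.9, 4.2],
    "gene_c": [4.0, 1.0, 3.5, 0.5],
}
known_functions = {
    "gene_a": ["translation regulation"],
    "gene_b": ["translation regulation", "membrane transport"],
    "gene_c": ["antimicrobial activity"],
}

edges = coexpression_edges(profiles)
print(annotate_by_neighbors("sep_1", edges, known_functions))
```

In this toy data, sep_1 is co-expressed with gene_a and gene_b but not with gene_c, so it inherits only the functions of the first two, ranked by how many neighbors carry each function.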

Keywords: MS; Ribo-seq; SEP; function; smORF.

Figures

FIGURE 1
Pipeline of FSPP. Ribo-seq, RNA-seq, and mass spectrometry data are input into FSPP. The tool integrates RiboTaper, Peppy, and other programs to analyze the three kinds of input and obtain the basic and expression information of translated products. FSPP uses the overlap of the MS and Ribo-seq results as the authentic target SEPs. Co-expression relevance and co-location information are used to build the relation networks. FSPP selects the sub-networks that are significantly related to the target SEPs and annotates the SEPs’ functions with the help of their known neighbors. FSPP, functional smORF-encoded peptides predictor; MS, mass spectrometry; SEPs, smORF-encoded peptides.
FIGURE 2
(A) Statistics of nodes in three networks. network_r: co-expression relations at the RNA level; network_t: co-expression relations at the translation level; network_n: neighbor relations. (B) Numbers of predicted peptides in three test sets from the three networks. (C) Numbers of predicted peptides in the global test data.
FIGURE 3
Statistics of the SEP hub and module methods in three networks. (A) Counts of hub nodes and module nodes in the RNA co-expression network (network_r). (B) Counts of hub nodes and module nodes in the RNA and translation-product co-expression network (network_rt). (C) Counts of hub nodes and module nodes in the co-expression and neighbor-relation network (network_rtn). (D) Box plot of the node distributions of the hub and module methods in the three networks.
FIGURE 4
Detailed sub-networks used to assign functions. (A) Sub-network used to predict HER2_uORF functions. (B) The 5′ leader region of the HER2 sequence as annotated by the Sachs team (Spevak et al., 2006). The sequence detected by FSPP, shown in bold black, encodes a functional peptide of length 6. (C) Sub-network used to predict MKKS_uORF functions. (D) Sub-network used to predict IFRD1_uORF functions.
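Two steps of the pipeline in Figure 1 can be sketched in Python: taking the overlap of Ribo-seq and MS detections as the target SEPs, and merging co-expression and co-location relations into one compound network. This is a hedged illustration, not FSPP's code. In particular, the co-location rule used here (same chromosome, start sites within a fixed window) is an assumption, as are the 10 kb window, the function names, and all identifiers and coordinates, which are invented.

```python
def target_seps(riboseq_hits, ms_hits):
    """Keep only the SEPs detected by both Ribo-seq and mass spectrometry."""
    return set(riboseq_hits) & set(ms_hits)


def colocation_edges(positions, window=10_000):
    """Link nodes on the same chromosome whose starts lie within `window` bp.

    This rule and the window size are assumptions for illustration.
    """
    names = sorted(positions)
    edges = set()
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            chrom_a, start_a = positions[a]
            chrom_b, start_b = positions[b]
            if chrom_a == chrom_b and abs(start_a - start_b) <= window:
                edges.add((a, b))
    return edges


def compound_network(*edge_sets):
    """Union of undirected edge sets, with each edge stored as a sorted pair."""
    merged = set()
    for edges in edge_sets:
        merged.update(tuple(sorted(edge)) for edge in edges)
    return merged


# Toy detections, edges, and coordinates, invented for illustration only.
riboseq_hits = {"sep_1", "sep_2", "sep_3"}
ms_hits = {"sep_2", "sep_3", "sep_4"}
targets = target_seps(riboseq_hits, ms_hits)

rna_edges = {("sep_2", "gene_x"), ("sep_3", "gene_y")}
translation_edges = {("gene_y", "sep_3")}
positions = {
    "sep_2": ("chr17", 1_000),
    "gene_x": ("chr17", 6_000),
    "gene_y": ("chr17", 50_000),
    "sep_3": ("chr20", 2_000),
}
neighbor_edges = colocation_edges(positions)

network = compound_network(rna_edges, translation_edges, neighbor_edges)
print(sorted(targets))
print(sorted(network))
```

Storing each edge as a sorted pair lets the same relation found at the RNA level, at the translation level, and by co-location collapse into a single edge of the merged network.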

