2009 Dec 3;10 Suppl 15(Suppl 15):S2.
doi: 10.1186/1471-2105-10-S15-S2.

A comprehensive assessment of N-terminal signal peptides prediction methods

Khar Heng Choo et al. BMC Bioinformatics.

Abstract

Background: Amino-terminal signal peptides (SPs) are short regions that guide the targeting of secretory proteins to the correct subcellular compartments in the cell. They are cleaved off once the passenger protein reaches its destination. The explosive growth in sequencing technologies has led to the deposition of vast numbers of protein sequences, necessitating rapid functional annotation techniques, with subcellular localization being a key feature. Myriad software prediction tools have been developed to automate the task of assigning the SP cleavage site of these new sequences; here we review the performance and reliability of the commonly used ones.

Results: The available signal peptide data have been manually curated and organized into three datasets representing eukaryotes, Gram-positive bacteria and Gram-negative bacteria. These datasets are used to evaluate thirteen publicly available prediction tools. SignalP (both the HMM and ANN versions) performs consistently and achieves the best overall accuracy in all three benchmarking experiments, ranging from 0.872 to 0.914, although other prediction tools are narrowing the performance gap.

Conclusion: The majority of the tools evaluated in this study encounter no difficulty in discriminating between secretory and non-secretory proteins. The challenge clearly remains in pinpointing the correct SP cleavage site. The composite scoring schemes employed by SignalP may help to explain its accuracy: the prediction task is divided into a number of separate steps, allowing each score to tackle a particular aspect of the prediction.
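
The distinction drawn above, between detecting that a signal peptide is present and locating its exact cleavage site, can be illustrated with a minimal sketch. The metric definitions below are the standard confusion-matrix ones and are an assumption; this abstract does not state how the study defines sensitivity, specificity or accuracy. The `Case` class, the `evaluate` function and the `site_accuracy` name are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Case:
    """One benchmark sequence (hypothetical structure for illustration)."""
    true_site: Optional[int]  # annotated cleavage position; None if non-secretory
    pred_site: Optional[int]  # predicted cleavage position; None if no SP predicted


def evaluate(cases: Sequence[Case]) -> dict:
    """Score SP detection and cleavage-site placement separately.

    Assumes the textbook definitions: sensitivity = TP/(TP+FN),
    specificity = TN/(TN+FP), accuracy = (TP+TN)/N.
    """
    tp = sum(1 for c in cases if c.true_site is not None and c.pred_site is not None)
    fn = sum(1 for c in cases if c.true_site is not None and c.pred_site is None)
    tn = sum(1 for c in cases if c.true_site is None and c.pred_site is None)
    fp = sum(1 for c in cases if c.true_site is None and c.pred_site is not None)
    # A cleavage site counts as correct only if it matches the annotation exactly.
    exact = sum(1 for c in cases if c.true_site is not None and c.pred_site == c.true_site)
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else nan,
        "specificity": tn / (tn + fp) if tn + fp else nan,
        "accuracy": (tp + tn) / len(cases) if cases else nan,
        "site_accuracy": exact / (tp + fn) if tp + fn else nan,
    }
```

Under these assumptions, a tool can score well on sensitivity and specificity while its `site_accuracy` stays lower, because a secretory protein with a misplaced cleavage site still counts as a correct detection.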

Figures

Figure 1
Aggregated accuracy results from all three experiments. Each tool has three bars, one per experiment (gray bar: experiment 1; white bar: experiment 2; black bar: experiment 3).
Figure 2
Results from Experiment 1. The dataset [20] used in this experiment contains eukaryotic (human) sequences only. Light gray bars represent the specificity and black bars the sensitivity of the prediction tools.
Figure 3
Results from Experiment 2. The datasets employed in this experiment are derived from SPdb 5.1 [33] and subjected to manual curation. The datasets are divided into eukaryotes (Euk, top chart), Gram-negative bacteria (Gneg, middle chart) and Gram-positive bacteria (Gpos, bottom chart).
Figure 4
Results from Experiment 3. The datasets employed in this experiment are derived from Swiss-Prot Release 57.0 and subjected to the filtering process described in [33]. However, putative SPs that have a high probability of existence based on the experimental literature are retained. The datasets are further grouped into eukaryotes (Euk, top chart), Gram-negative bacteria (Gneg, middle chart) and Gram-positive bacteria (Gpos, bottom chart).

References

    1. von Heijne G. The signal peptide. J Membr Biol. 1990;115(3):195–201. doi: 10.1007/BF01868635.
    2. Spiess M. Heads or tails--what determines the orientation of proteins in the membrane. FEBS Lett. 1995;369(1):76–79. doi: 10.1016/0014-5793(95)00551-J.
    3. Bairoch A, Boeckmann B, Ferro S, Gasteiger E. Swiss-Prot: juggling between evolution and stability. Brief Bioinform. 2004;5(1):39–55. doi: 10.1093/bib/5.1.39.
    4. Kulikova T, Akhtar R, Aldebert P, Althorpe N, Andersson M, Baldwin A, Bates K, Bhattacharyya S, Bower L, Browne P. EMBL Nucleotide Sequence Database in 2006. Nucleic Acids Res. 2007. pp. D16–20.
    5. Reynolds SM, Kall L, Riffle ME, Bilmes JA, Noble WS. Transmembrane topology and signal peptide prediction using dynamic bayesian networks. PLoS Comput Biol. 2008;4(11):e1000213. doi: 10.1371/journal.pcbi.1000213.