Nucleic Acids Res. 2001 Aug 15;29(16):E82.
doi: 10.1093/nar/29.16.e82.

Chloroplast transit peptide prediction: a peek inside the black box

A I Schein et al. Nucleic Acids Res. 2001.

Abstract

Previous work in predicting protein localization to the chloroplast organelle in plants led to the development of an artificial neural network-based approach capable of remarkable accuracy in its prediction (ChloroP). A common criticism against such neural network models is that it is difficult to interpret the criteria that are used in making predictions. We address this concern with several new prediction methods that base predictions explicitly on the abundance of different amino acid types in the N-terminal region of the protein. Our successful prediction accuracy suggests that ChloroP uses little positional information in its decision-making; an unexpected result given the elaborate ChloroP input scheme. By removing positional information, our simpler methods allow us to identify those amino acids that are useful for successful prediction. The identification of important sequence features, such as amino acid content, is advantageous if one of the goals of localization predictors is to gain an understanding of the biological process of chloroplast localization. Our most accurate predictor combines principal component analysis and logistic regression. Web-based prediction using this method is available online at http://apicoplast.cis.upenn.edu/pclr/.
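
The core idea in the abstract, basing predictions on how abundant each amino acid type is in the N-terminal region rather than on where it occurs, can be illustrated with a minimal sketch of the feature-extraction step. The 50-residue window, the function name, and the example sequence are illustrative assumptions, not values taken from the paper. In a pipeline like the one described, vectors of this kind would then be passed to principal component analysis and logistic regression, which are not shown here.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector lines up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def n_terminal_composition(sequence, window=50):
    """Return the fraction of each amino acid in the first `window` residues.

    `window=50` is an illustrative assumption, not the paper's setting.
    Characters outside the 20 standard letters are ignored.
    """
    region = sequence.upper()[:window]
    counts = Counter(res for res in region if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical N-terminal fragment, made up for illustration only.
example = "MASSTLSSATAVSRPSLLKSGAAFSGLKSSV"
features = n_terminal_composition(example)
serine_fraction = features[AMINO_ACIDS.index("S")]
```

Because the vector records only abundances, shuffling or reversing the residues inside the window leaves it unchanged, which is the sense in which positional information has been removed.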

Figures

Figure 1
ROC curves for ChloroPv1.1 and PCLR show how these methods trade sensitivity for specificity with different classification thresholds. A greater area under a ROC curve indicates superior prediction ability. Classification thresholds are available in Table 2.
Figure 2
Plot of the training set with respect to its first two principal components. These two components were selected for classification during the stepwise regression.
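
The trade-off described in the Figure 1 caption can be made concrete with a small sketch that sweeps a classification threshold over predictor scores and measures the area under the resulting ROC curve. The scores and labels are made up for illustration and are not data from the paper. The function names are hypothetical. Specificity is one minus the false positive rate plotted on the horizontal axis.

```python
def roc_points(scores, labels):
    """Return (false positive rate, sensitivity) pairs, one per threshold.

    A sequence is called positive when its score is >= the threshold.
    Labels are 1 for true positives-class members and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Made-up scores and labels (1 = chloroplast-targeted), for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = area_under_curve(roc_points(scores, labels))
```

Lowering the threshold moves along the curve toward higher sensitivity and lower specificity. A predictor whose curve encloses more area ranks true positives above negatives more often.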
