Discrimination of intracellular and extracellular proteins using amino acid composition and residue-pair frequencies
- PMID: 8145256
- DOI: 10.1006/jmbi.1994.1267
Abstract
Sequences of intracellular and extracellular soluble proteins were analyzed statistically in terms of amino acid composition and residue-pair frequencies. Residue-pair frequencies were calculated for sequential separations from (n, n + 1) to (n, n + 5), and converted into scoring parameters. Then, for each test protein, the single-residue and residue-pair parameters were applied to calculate a total score. According to our definition, a positive score indicates an intracellular protein, whereas a negative score indicates an extracellular one. The parameter set was derived from 894 sequences constituting different protein families in the PIR database, and assessed by applying it to a test set of 379 proteins. The results showed that 88% of intracellular and 84% of extracellular proteins were correctly assigned. The discrimination power was improved by about 8% in comparison with the previous study, which used composition data alone. Segregation of intra/extracellular proteins is also observed by other criteria, such as structural class (intracellular proteins prefer alpha and alpha/beta types, and extracellular proteins prefer beta and alpha + beta types). Segregation by sequence was found to be a more reliable procedure for distinguishing intra/extracellular proteins than methods using structural class. Possible causes for this segregation by sequence are discussed.
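The scoring scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula that converts frequencies into scoring parameters, so the log-odds ratio with a pseudocount used here is an assumption, as are all function names (`count_features`, `derive_parameters`, `score`, `classify`). Only the separations (n, n + 1) to (n, n + 5) and the sign convention (positive means intracellular) come from the abstract.

```python
import math
from collections import Counter

MAX_SEPARATION = 5  # residue pairs (n, n+1) ... (n, n+5), as in the abstract


def count_features(sequences):
    """Count single residues (key (0, a)) and residue pairs (key (k, ab))."""
    counts, totals = Counter(), Counter()
    for seq in sequences:
        for i, a in enumerate(seq):
            counts[(0, a)] += 1
            totals[0] += 1
            for k in range(1, MAX_SEPARATION + 1):
                if i + k < len(seq):
                    counts[(k, a + seq[i + k])] += 1
                    totals[k] += 1
    return counts, totals


def derive_parameters(intra_seqs, extra_seqs, pseudocount=1.0):
    """Assumed parameter form: log ratio of intra to extra frequencies."""
    ci, ti = count_features(intra_seqs)
    ce, te = count_features(extra_seqs)
    params = {}
    for key in set(ci) | set(ce):
        k = key[0]
        f_intra = (ci[key] + pseudocount) / (ti[k] + pseudocount)
        f_extra = (ce[key] + pseudocount) / (te[k] + pseudocount)
        params[key] = math.log(f_intra / f_extra)
    return params


def score(seq, params):
    """Sum single-residue and residue-pair parameters over the sequence."""
    total = 0.0
    for i, a in enumerate(seq):
        total += params.get((0, a), 0.0)  # unseen features contribute 0
        for k in range(1, MAX_SEPARATION + 1):
            if i + k < len(seq):
                total += params.get((k, a + seq[i + k]), 0.0)
    return total


def classify(seq, params):
    """Positive total score: intracellular; otherwise extracellular."""
    return "intracellular" if score(seq, params) > 0 else "extracellular"


# Toy strings invented for illustration only; they are not real proteins.
toy_params = derive_parameters(["AAAAKKKK", "AKAKAKAK"], ["CCCCSSSS", "CSCSCSCS"])
print(classify("AKAK", toy_params), classify("CSCS", toy_params))
```

In practice the two training lists would hold the intracellular and extracellular sequences of a curated training set, with held-out sequences passed to `classify`.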
