Prediction of protein function from sequence properties. Discriminant analysis of a data base
- PMID: 6547351
- DOI: 10.1016/0167-4838(84)90312-1
Abstract
The protein superfamilies in the National Biomedical Research Foundation sequence data base cluster into six groups that can be distinguished on the basis of four variables characterizing amino acid composition and local sequence properties. The variables are average hydrophobicity, net charge, sequence length and periodic variation in hydrophobic residues along the chain. The clusters they distinguish are: globins; chromosomal proteins; contractile system proteins and respiratory proteins other than cytochromes; enzyme inhibitors and toxins; enzymes except hydrolases; and all other proteins. The overall probability of correctly allocating a given protein to one of these functional groups is 0.76, with the allocation reliability being highest for globins (0.97) and for chromosomal proteins (0.93).
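As an illustration of the four variables named in the abstract, the sketch below computes them for a single sequence. The abstract does not say how each variable was defined, so every choice here is an assumption made for illustration: the Kyte-Doolittle hydropathy scale, a net charge that counts only K and R as positive and D and E as negative, and a Fourier-style amplitude at a helical period of 3.6 residues as the measure of periodic variation in hydrophobicity. The function name `sequence_features` is hypothetical.

```python
import cmath
import math

# Kyte-Doolittle hydropathy index (an assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sequence_features(seq, period=3.6):
    """Return the four descriptors for one protein sequence.

    `period` is the assumed repeat length, in residues, at which
    hydrophobic periodicity is measured (3.6 approximates an alpha helix).
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    hydropathy = [KYTE_DOOLITTLE[aa] for aa in seq]
    length = len(hydropathy)
    mean_h = sum(hydropathy) / length

    # Assumed charge model: +1 for K and R, -1 for D and E, histidine ignored.
    net_charge = sum(seq.count(aa) for aa in "KR") - sum(seq.count(aa) for aa in "DE")

    # Amplitude of the mean-centred hydropathy profile at the chosen period,
    # normalised by sequence length.
    omega = 2 * math.pi / period
    amplitude = abs(
        sum((h - mean_h) * cmath.exp(1j * omega * k) for k, h in enumerate(hydropathy))
    ) / length

    return {
        "mean_hydrophobicity": mean_h,
        "net_charge": net_charge,
        "length": length,
        "periodicity": amplitude,
    }
```

Feature vectors of this kind, computed per protein, are the sort of input a discriminant analysis would use to allocate a sequence to one of the six groups; the classification step itself is not sketched here because the abstract gives no discriminant coefficients.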
