Biochemical functional predictions for protein structures of unknown or uncertain function

Caitlyn L Mills¹, Penny J Beuning¹, Mary Jo Ondrechen¹

PMID: 25848497
PMCID: PMC4372640
DOI: 10.1016/j.csbj.2015.02.003

Review

Caitlyn L Mills et al. Comput Struct Biotechnol J. 2015.

Comput Struct Biotechnol J. 2015 Feb 18;13:182-91.

doi: 10.1016/j.csbj.2015.02.003. eCollection 2015.

Affiliation

¹ Department of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, United States.

Abstract

With the exponential growth in the determination of protein sequences and structures via genome sequencing and structural genomics efforts, there is a growing need for reliable computational methods to determine the biochemical function of these proteins. This paper reviews the efforts to address the challenge of annotating the function at the molecular level of uncharacterized proteins. While sequence- and three-dimensional-structure-based methods for protein function prediction have been reviewed previously, the recent trends in local structure-based methods have received less attention. These local structure-based methods are the primary focus of this review. Computational methods have been developed to predict the residues important for catalysis, and the local spatial arrangements of these residues can be used to identify protein function. In addition, the combination of different types of methods can help obtain more information and better predictions of function for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of proteins of unknown function. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations.

Keywords: Computational chemistry; Local structure methods; Protein function prediction; Structural genomics.

Figures

**Fig. 1**
Three histidine residues from histidinol phosphate phosphatase (HPP) (PDB 2yz5) were analyzed by THEMATICS to produce theoretical titration curves (A), which plot the mean net charge of a given residue of a large ensemble of protein molecules as a function of pH, and first derivative plots (B). The titration curves of two non-catalytic residues, H84 and H150, show sigmoidal curve shapes with a small buffer range, while the catalytic H226 displays a curve with an anomalous shape, shallow slope, and larger buffer range. When analyzing the first derivatives of the titration curves, non-catalytic residues display symmetrical, highly peaked plots. However, active site residues such as H226 shown here display broad, asymmetric derivative plots and may have multiple peaks.
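
The contrast described in this caption can be illustrated with a toy calculation. The sketch below is not THEMATICS and uses no data from PDB 2yz5. It assumes an ideal Henderson-Hasselbalch curve for an uncoupled histidine and, purely as a stand-in for an anomalous residue, an equal average of two ideal curves. The pKa values (6.5, 5.0, 8.0), the function names, the 10%-90% definition of buffer range, and the 0.05 peak threshold are all hypothetical choices made for illustration.

```python
import math

LN10 = math.log(10.0)

def hh_charge(ph, pka):
    """Mean net charge of an ideal, uncoupled base: +1 protonated, 0 deprotonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def hh_slope(ph, pka):
    """Analytic first derivative of hh_charge with respect to pH."""
    q = hh_charge(ph, pka)
    return -LN10 * q * (1.0 - q)

def ph_grid(lo=0.0, hi=14.0, step=0.01):
    """Evenly spaced pH values from lo to hi, inclusive."""
    n = int(round((hi - lo) / step))
    return [lo + i * step for i in range(n + 1)]

def summarize(pkas, grid=None):
    """Describe the equal-weight average of ideal curves with the given pKa values.

    A single pKa gives the ideal sigmoid; several pKa values give a toy
    'anomalous' curve (an illustrative assumption, not a physical model).
    """
    grid = grid or ph_grid()
    charge = [sum(hh_charge(ph, pk) for pk in pkas) / len(pkas) for ph in grid]
    steep = [abs(sum(hh_slope(ph, pk) for pk in pkas)) / len(pkas) for ph in grid]
    # Buffer range: pH span over which the mean charge lies between 0.1 and 0.9.
    inside = [ph for ph, q in zip(grid, charge) if 0.1 <= q <= 0.9]
    # Count local maxima of the derivative magnitude above a small threshold.
    peaks = sum(
        1
        for i in range(1, len(steep) - 1)
        if steep[i - 1] < steep[i] > steep[i + 1] and steep[i] > 0.05
    )
    return {
        "buffer_range": inside[-1] - inside[0],
        "peak_height": max(steep),
        "n_peaks": peaks,
    }

ideal = summarize([6.5])        # sigmoidal, narrow buffer range
broad = summarize([5.0, 8.0])   # toy stand-in for an anomalous residue

print("ideal:", ideal)
print("broad:", broad)
```

In this toy model the averaged curve has a wider buffer range, a lower derivative maximum, and more than one derivative peak, which mirrors the qualitative features the caption attributes to the catalytic residue.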

**Fig. 2**
(A) The metal binding pocket of YP_910028.1, containing a PHP (Polymerase and Histidinol Phosphatase) domain (PDB ID 3e0f, shown in dark blue), aligns well with that of DNA Pol III alpha subunit (PDB ID 2hpi, shown in magenta). However, C145 and Y74 of DNA Pol III are mismatched with a histidine and a threonine, respectively, in YP_910028.1. (B) On the other hand, the metal binding pocket of YP_910028.1 (PDB 3e0f) aligns perfectly with the pocket of histidinol phosphate phosphatase (HPP) (PDB 2yz5), shown in green.

**Fig. 3**
Schematic diagram outlining the different methods utilized in ProFunc. HMM: Hidden Markov Model; SSM: Secondary Structure Matching; HTH: Helix–Turn–Helix.

**Fig. 4**
Schematic diagram outlining the SALSA method of annotating protein function.

**Fig. 5**
The metabolites above dock in silico into Tm0936 and are substrates of the enzyme Tm0936. The general structure of these three metabolites is the same with the exception of the moieties shown in the boxes.
