Review
Comput Struct Biotechnol J. 2015 Feb 18;13:182-91.
doi: 10.1016/j.csbj.2015.02.003. eCollection 2015.

Biochemical functional predictions for protein structures of unknown or uncertain function


Caitlyn L Mills et al. Comput Struct Biotechnol J.

Abstract

With the exponential growth in the determination of protein sequences and structures via genome sequencing and structural genomics efforts, there is a growing need for reliable computational methods to determine the biochemical function of these proteins. This paper reviews the efforts to address the challenge of annotating the function at the molecular level of uncharacterized proteins. While sequence- and three-dimensional-structure-based methods for protein function prediction have been reviewed previously, the recent trends in local structure-based methods have received less attention. These local structure-based methods are the primary focus of this review. Computational methods have been developed to predict the residues important for catalysis, and the local spatial arrangements of these residues can be used to identify protein function. In addition, combining different types of methods can yield more information and better predictions of function for proteins of unknown function. Global initiatives, including the Enzyme Function Initiative (EFI), COMputational BRidges to EXperiments (COMBREX), and the Critical Assessment of Function Annotation (CAFA), are evaluating and testing the different approaches to predicting the function of proteins of unknown function. These initiatives and global collaborations will increase the capability and reliability of methods to predict biochemical function computationally and will add substantial value to the current volume of structural genomics data by reducing the number of absent or inaccurate functional annotations.

Keywords: Computational chemistry; Local structure methods; Protein function prediction; Structural genomics.

Figures

Fig. 1
Three histidine residues from histidinol phosphate phosphatase (HPP) (PDB 2yz5) were analyzed by THEMATICS to produce theoretical titration curves (A), which plot the mean net charge of a given residue over a large ensemble of protein molecules as a function of pH, and first derivative plots (B). The titration curves of two non-catalytic residues, H84 and H150, show sigmoidal curve shapes with a small buffer range, while the catalytic H226 displays a curve with an anomalous shape, shallow slope, and larger buffer range. In the first derivatives of the titration curves, non-catalytic residues display symmetrical, highly peaked plots. Active site residues such as H226, however, display broad, asymmetric derivative plots that may have multiple peaks.
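The contrast described in this caption can be illustrated with a toy calculation. The sketch below is not the THEMATICS computation and uses no data from PDB 2yz5. It compares an isolated histidine-like site, which follows the Henderson-Hasselbalch equation, with a site that is electrostatically coupled to a second titratable site through a simple two-site statistical model. The pKa values, the coupling strength `w`, and the 0.1 to 0.9 charge window used to define a "buffer range" are all illustrative assumptions.

```python
import math

def isolated_charge(ph, pka):
    """Mean charge of an isolated site (+1 protonated, 0 neutral)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def coupled_charge(ph, pka_a, pka_b, w):
    """Mean charge of site A when it interacts with a second site B.

    Toy two-site model: w (in pK units) is an assumed penalty applied
    to the state in which both sites are protonated.
    """
    xa = 10.0 ** (pka_a - ph)          # weight of "only A protonated"
    xb = 10.0 ** (pka_b - ph)          # weight of "only B protonated"
    both = xa * xb * 10.0 ** (-w)      # weight of "both protonated"
    z = 1.0 + xa + xb + both           # sum over all four states
    return (xa + both) / z

def first_derivative(xs, ys):
    """Central-difference derivative at the interior grid points."""
    return [(ys[i + 1] - ys[i - 1]) / (xs[i + 1] - xs[i - 1])
            for i in range(1, len(xs) - 1)]

def buffer_range(phs, charges, lo=0.1, hi=0.9):
    """Width of the pH interval over which the site is partially charged."""
    inside = [p for p, q in zip(phs, charges) if lo < q < hi]
    return max(inside) - min(inside)

# pH grid from 0 to 14 in steps of 0.01
PHS = [i * 0.01 for i in range(1401)]

# Illustrative parameters only (not fitted to any protein)
isolated = [isolated_charge(p, 6.5) for p in PHS]
coupled = [coupled_charge(p, 6.5, 6.5, 2.0) for p in PHS]

d_isolated = first_derivative(PHS, isolated)
d_coupled = first_derivative(PHS, coupled)

print("buffer range, isolated site:", round(buffer_range(PHS, isolated), 2))
print("buffer range, coupled site: ", round(buffer_range(PHS, coupled), 2))
print("steepest slope, isolated:", round(min(d_isolated), 3))
print("steepest slope, coupled: ", round(min(d_coupled), 3))
```

In this toy model the coupled site titrates over a wider pH interval and has a shallower maximum slope than the isolated site, which is the qualitative pattern the caption attributes to the catalytic residue.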
Fig. 2
(A) The metal binding pocket of YP_910028.1, which contains a PHP (Polymerase and Histidinol Phosphatase) domain (PDB ID 3e0f, shown in dark blue), aligns well with that of the DNA Pol III alpha subunit (PDB ID 2hpi, shown in magenta). However, C145 and Y74 of DNA Pol III are mismatched with a histidine and a threonine, respectively, in YP_910028.1. (B) On the other hand, the metal binding pocket of YP_910028.1 (PDB 3e0f) aligns perfectly with the pocket of histidinol phosphate phosphatase (HPP) (PDB 2yz5), shown in green.
Fig. 3
Schematic diagram outlining the different methods utilized in ProFunc. HMM: Hidden Markov Model; SSM: Secondary Structure Matching; HTH: Helix–Turn–Helix.
Fig. 4
Schematic diagram outlining the SALSA method of annotating protein function.
Fig. 5
The metabolites shown dock in silico into Tm0936 and are substrates of the enzyme. The general structure of these three metabolites is the same with the exception of the moieties shown in the boxes.
