Review
Trends Biochem Sci. 2014 Aug;39(8):363-71.
doi: 10.1016/j.tibs.2014.05.006. Epub 2014 Jul 2.

Leveraging structure for enzyme function prediction: methods, opportunities, and challenges

Matthew P Jacobson et al. Trends Biochem Sci. 2014 Aug.

Abstract

The rapid growth of the number of protein sequences that can be inferred from sequenced genomes presents challenges for function assignment, because only a small fraction (currently <1%) has been experimentally characterized. Bioinformatics tools are commonly used to predict functions of uncharacterized proteins. Recently, there has been significant progress in using protein structures as an additional source of information to infer aspects of enzyme function, which is the focus of this review. Successful application of these approaches has led to the identification of novel metabolites, enzyme activities, and biochemical pathways. We discuss opportunities to elucidate systematically protein domains of unknown function, orphan enzyme activities, dead-end metabolites, and pathways in secondary metabolism.

Keywords: docking; enzyme function prediction; homology modeling; metabolic pathways; protein structures.

Figures

Figure 1
Structure-based virtual metabolite docking protocol for enzyme activity prediction. When no structure has been experimentally determined for a protein sequence, a model can be built using a variety of comparative modeling methods, but only when the structure of a homologous protein is available that has ~30% or greater sequence identity to the protein of interest. Whether using a structure or a model, it is critical that active site metal ions and cofactors are present, and that catalytic residues are positioned appropriately for catalysis. Virtual metabolite libraries can be constructed and "docked" against the putative active sites of structures or models using computational tools more commonly employed in structure-based drug design (e.g., Glide, DOCK). The docking scoring functions can be used to rank the ligands according to their estimated relative binding affinities. Top-scoring metabolites are typically inspected for plausibility (Is the predicted binding mode compatible with catalysis? Is the metabolite likely to be present in the relevant organism?), and then selected for experimental testing (in vitro enzymology). Protocols similar to that shown here have been used in retrospective and prospective studies [22-25, 27-33, 36, 39].
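The rank-then-inspect step of this protocol can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `DockedLigand` record, the `select_for_testing` helper, the `top_n` cutoff, and the convention that a lower (more negative) score means tighter predicted binding are all hypothetical, and the two boolean fields stand in for the manual plausibility checks named in the caption. No real docking program is called.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the "~30%" guideline from the caption, treated here as a soft cutoff.
MIN_TEMPLATE_IDENTITY = 30.0  # percent sequence identity


@dataclass
class DockedLigand:
    """Hypothetical record for one docked metabolite."""
    name: str
    score: float           # docking score; ASSUMED lower = better predicted binding
    pose_compatible: bool  # manual check: is the binding mode compatible with catalysis?
    in_organism: bool      # manual check: is the metabolite plausible in the organism?


def template_is_usable(identity_percent: float) -> bool:
    """Return True if a homologous template meets the identity guideline."""
    return identity_percent >= MIN_TEMPLATE_IDENTITY


def select_for_testing(ligands: List[DockedLigand], top_n: int = 3) -> List[DockedLigand]:
    """Rank ligands by docking score, keep the top_n, then apply plausibility checks."""
    ranked = sorted(ligands, key=lambda lig: lig.score)
    top = ranked[:top_n]
    return [lig for lig in top if lig.pose_compatible and lig.in_organism]


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    library = [
        DockedLigand("metabolite_A", -9.1, True, True),
        DockedLigand("metabolite_B", -8.0, False, True),
        DockedLigand("metabolite_C", -7.5, True, True),
        DockedLigand("metabolite_D", -3.0, True, True),
    ]
    for lig in select_for_testing(library):
        print(lig.name, lig.score)
```

The identity threshold is written as a guideline rather than a hard rule because the caption itself says "~30%"; in practice the usefulness of a template also depends on how well the active site is conserved.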
Figure 2
Predicted binding poses are in good agreement with subsequently determined experimental structures. Predicted ligand binding mode (cyan) superimposed with the X-ray crystal structure (gold) of: (a) S-adenosylhomocysteine deaminase (PDB: 2PLM); (b) N-succinyl-L-Arg racemase (PDB: 2P8C); (c) D-Ala-D-Ala epimerase (PDB: 3Q4D); and (d) a polyprenyl synthase (PDB: 4FP4). In (b), (c), and (d), the docking predictions were made using homology models based on crystal structures with 35%, 39%, and 29% sequence identity, respectively.
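For readers unfamiliar with how a figure such as "35% sequence identity" is obtained, the following is a minimal Python sketch that computes percent identity from two sequences that have already been aligned. It rests on assumptions: the function name `percent_identity` is hypothetical, gaps are written as `-`, and identity is defined as matches divided by the number of columns in which neither sequence has a gap. Other denominators (for example, the length of the shorter sequence) are also in common use, and the source does not say which definition the authors applied.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Both inputs must be the same length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not real protein data: 4 matches over 5 ungapped columns.
    print(percent_identity("ACD-EF", "ACDGEY"))  # 80.0
```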
Figure 3
Structure-guided discovery of new enzymes in a novel hydroxyproline betaine metabolism pathway. Panel (a) shows the name, TrEMBL annotation, and most similar homolog in the PDB for each protein in the pathway. The automated TrEMBL annotations are incorrect or imprecise for all proteins in the pathway. However, there is rich structural information that can be used for modeling and docking, as shown in the closest PDB homolog column. The pathway is shown in (b). Panels (c)-(e) show the binding site and/or active site of three proteins in the pathway (HpbD, HpbJ, and HpbR, respectively; shown in bold in (a)), along with the docking-predicted binding mode for the ligand trans-4-hydroxy-L-proline betaine (ball-and-stick, green). Both HpbJ and HpbR have a predicted cation-π cage, a motif known for binding quaternary amines. In HpbD, two catalytic residues (Lys163 and Lys265) replace aromatic residues, leaving Trp320 as the key aromatic residue forming a cation-π interaction with the substrate.
Figure 4
The biosynthesis of cholesterol: a paradigmatic isoprenoid pathway. Crystal structures of key enzymes in the pathway have been solved, including farnesyl pyrophosphate synthase (gold; PDB: 1RQI), squalene synthase (light blue; PDB: 3WEG), and oxidosqualene-lanosterol cyclase (magenta; PDB: 1W6K). These crystal structures provide opportunities to predict functions of related enzymes of the isoprenoid synthase superfamily. However, function prediction for the terpenoid synthases (also called terpene cyclases) is extremely challenging because carbocation rearrangements create a huge chemical space of possible products.

References

    1. UniProtKB/Swiss-Prot protein knowledgebase release 2014_01 statistics. [Online]. Available: http://web.expasy.org/docs/relnotes/relstat.html.
    2. UniProtKB/TrEMBL protein database release 2014_01 statistics. [Online]. Available: http://www.ebi.ac.uk/uniprot/TrEMBLstats.
    3. Friedberg I. Automated protein function prediction - the genomic challenge. Briefings in Bioinformatics. 2006;7:225–242.
    4. Schnoes AM, et al. Annotation error in public databases: misannotation of molecular function in enzyme superfamilies. PLoS Comp. Biol. 2009;5:e1000605.
    5. Seffernick JL, et al. Melamine deaminase and atrazine chlorohydrolase: 98 percent identical but functionally different. J. Bacteriol. 2001;183:2405–2410.
