Bioinformatics. 2013 Jul 1;29(13):i135-44.
doi: 10.1093/bioinformatics/btt244.

Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets

Masaaki Kotera et al. Bioinformatics.

Abstract

Motivation: The metabolic pathway is an important biochemical reaction network involving enzymatic reactions among chemical compounds. However, it is assumed that a large number of metabolic pathways remain unknown, and many reactions are still missing even in known pathways. Therefore, the most important challenge in metabolomics is the automated de novo reconstruction of metabolic pathways, which includes the elucidation of previously unknown reactions to bridge the metabolic gaps.

Results: In this article, we develop a novel method to reconstruct metabolic pathways from a large compound set in the reaction-filling framework. We define feature vectors representing the chemical transformation patterns of compound-compound pairs in enzymatic reactions using chemical fingerprints. We apply a sparsity-induced classifier to learn what we refer to as 'enzymatic-reaction likeness', i.e. whether compound pairs are possibly converted to each other by enzymatic reactions. The originality of our method lies in the search for potential reactions among many compounds at a time, in the extraction of reaction-related chemical transformation patterns and in the large-scale applicability owing to the computational efficiency. In the results, we demonstrate the usefulness of our proposed method on the de novo reconstruction of 134 metabolic pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG). Our comprehensively predicted reaction networks of 15 698 compounds enable us to suggest many potential pathways and to increase research productivity in metabolomics.
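The abstract does not spell out how the pair feature vectors are built. The figure captions mention "diff-only" and "diff-common" feature vectors, so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes each compound is a binary fingerprint stored as a set of on-bit indices, that the "diff" part marks bits present in exactly one compound of the pair, and that the "common" part marks bits present in both. The function name `pair_features` and its parameters are hypothetical.

```python
from typing import FrozenSet, List

# Assumption: a fingerprint is the set of substructure bit indices that are "on".
Fingerprint = FrozenSet[int]


def pair_features(fp_a: Fingerprint, fp_b: Fingerprint, n_bits: int,
                  include_common: bool = True) -> List[int]:
    """Build a symmetric feature vector for a compound-compound pair.

    Assumed layout (not confirmed by the source):
      - first n_bits entries: bits set in exactly one compound ("diff")
      - next n_bits entries:  bits set in both compounds ("common"),
        emitted only when include_common is True
    """
    changed = fp_a ^ fp_b  # substructures formed or eliminated
    vec = [1 if i in changed else 0 for i in range(n_bits)]
    if include_common:
        common = fp_a & fp_b  # substructures preserved across the pair
        vec += [1 if i in common else 0 for i in range(n_bits)]
    return vec


# Toy 4-bit fingerprints, invented purely for illustration.
substrate = frozenset({0, 1, 3})
product = frozenset({1, 2, 3})

diff_common = pair_features(substrate, product, n_bits=4)
diff_only = pair_features(substrate, product, n_bits=4, include_common=False)
print(diff_common)  # [1, 0, 1, 0, 0, 1, 0, 1]
print(diff_only)    # [1, 0, 1, 0]
```

Because the vector is symmetric in the two compounds, it matches the abstract's framing of pairs that are "converted to each other". Vectors of this kind would then be passed to a sparse linear classifier to score enzymatic-reaction likeness; that step is not shown here.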

Availability: Software is available on request. Supplementary material is available at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2013/.

Figures

Fig. 1.
Metabolic pathway reconstruction frameworks. Circles c1–c9 and rectangles e1–e10 represent chemical compounds and enzyme proteins, respectively. Left and right panels represent inputs and outputs, respectively. The reference-based framework (A) extracts an organism-specific pathway from a pre-fixed pathway map using orthology information about enzyme genes, whereas the compound-filling framework (B) and the reaction-filling framework (C) are de novo methods that reconstruct a new pathway where reference information is not available
Fig. 2.
Comparison of the number of extracted features among different methods
Fig. 3.
AUC scores for each pathway map with diff-common feature vectors (left panel) and diff-only feature vectors (right panel)
Fig. 4.
Substructure transformation network. Nodes represent the PubChem fingerprint components that contributed to the prediction with diff-only feature vectors, where the size of each node is proportional to its weight in L1SVM. Edges represent the 100 most frequent pairs of substructures in which one is formed and the other is eliminated in a reaction
Fig. 5.
Part of the generated de novo reactions combined with the existing network, where nodes and edges represent compounds and reactions, respectively. Black thin lines represent the reactions existing in KEGG. Gray lines represent 50 new reactions with high scores predicted by diff-common and diff-only feature vectors. The width of the gray edges is proportional to the predictive score. Predicted reactions (A–I) are explained in detail in Figure 6
Fig. 6.
Examples of the predicted pairs taken from Figure 5. The chemical transformation patterns in pairs (A–C) are already known and described in KEGG reactant pairs (note that the reactions themselves are not known; only the transformation patterns are), whereas pairs (D–I) have unknown patterns. (A) A C-C bond accompanied by a secondary alcohol group is cleaved to form an aldehyde group, a reaction typically found in EC sub-subclass 4.1.2 (aldehyde-lyases). (B) A C-S bond in a disulfide bond is cleaved to form an S-mercapto group, as found in EC sub-subclass 4.4.1 (carbon-sulfur lyases). (C) This chemical transformation pattern is found in many reactions in EC 2.4.1 (glycosyltransferases) and EC 3.2.1 (glycosidases). (D) This pattern is not found in known reactions. At first sight, this pair may look like two steps of methylation/demethylation (EC 2.1.1) or an intramolecular transfer of a methyl group (part of EC 5.4). On closer investigation of the isoquinoline alkaloid biosynthesis pathway (map00950, to which these compounds belong), two steps of methylenedioxy ring formation/cleavage (EC 1.14.21 or 1.21.3) look more plausible, because some methylenedioxy ring formation reactions are known to take place in this pathway. In either case, however, methylation and methylenedioxy ring formation occur in the context of biosynthesis, whereas demethylation and methylenedioxy ring cleavage occur in the context of biodegradation. In that sense, this compound–compound pair may be a false positive once the reaction flow at the pathway level is taken into account. (E) This compound–compound pair may look like an intramolecular transfer of a hydroxy group, which is typically found in EC 5.4.4 (hydroxymutases), but the transfer of a hydroxy group from one position to another on an aromatic ring is not found in any known reaction stored in KEGG. This pair may be another false positive, because substitution of a hydroxy group on an aromatic ring is much less likely to occur than addition of a hydroxy group. Some anaerobic bacteria are known to have 4-hydroxybenzoyl-CoA reductase (EC 1.3.7.9), which catalyzes the substitution of a hydroxy group on an aromatic ring. However, we assume that it would be hard to catalyze the intramolecular transfer of a hydroxy group on a substituted aromatic ring. (F) Although there are many varieties of hydroxylases (part of EC 1.14), there is no known pattern that produces a hydroxylamine from an amide group. (G) This reaction would require more than one reaction step, and an important step would be similar to EC 4.1.2 (aldehyde-lyases). (H) There are similar EC 2.3.3 (acyl transferases) reactions in polyketide synthesis. (I) Some EC 1.2.3 enzymes (oxidases) catalyze similar reactions
