Bioinformatics. 2013 Jul 1;29(13):i135-44.

doi: 10.1093/bioinformatics/btt244.

Supervised de novo reconstruction of metabolic pathways from metabolome-scale compound sets

Masaaki Kotera¹, Yasuo Tabei, Yoshihiro Yamanishi, Toshiaki Tokimatsu, Susumu Goto

PMID: 23812977
PMCID: PMC3694648
DOI: 10.1093/bioinformatics/btt244

Masaaki Kotera et al. Bioinformatics. 2013.

Affiliation

¹ Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.

Abstract

Motivation: The metabolic pathway is an important biochemical reaction network involving enzymatic reactions among chemical compounds. However, it is assumed that a large number of metabolic pathways remain unknown, and many reactions are still missing even in known pathways. Therefore, the most important challenge in metabolomics is the automated de novo reconstruction of metabolic pathways, which includes the elucidation of previously unknown reactions to bridge the metabolic gaps.

Results: In this article, we develop a novel method to reconstruct metabolic pathways from a large compound set in the reaction-filling framework. We define feature vectors representing the chemical transformation patterns of compound-compound pairs in enzymatic reactions using chemical fingerprints. We apply a sparsity-induced classifier to learn what we refer to as 'enzymatic-reaction likeness', i.e. whether compound pairs are possibly converted to each other by enzymatic reactions. The originality of our method lies in the search for potential reactions among many compounds at a time, in the extraction of reaction-related chemical transformation patterns and in the large-scale applicability owing to the computational efficiency. In the results, we demonstrate the usefulness of our proposed method on the de novo reconstruction of 134 metabolic pathways in Kyoto Encyclopedia of Genes and Genomes (KEGG). Our comprehensively predicted reaction networks of 15 698 compounds enable us to suggest many potential pathways and to increase research productivity in metabolomics.
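
The feature construction and sparse classifier described in the Results can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes binary fingerprints, takes the "common" features as a bitwise AND and the "diff" features as a bitwise XOR of the two fingerprints, and swaps in an L1-regularized logistic regression trained by proximal gradient descent in place of the sparsity-inducing classifier used in the paper. The function names, hyperparameters and four-bit toy fingerprints are hypothetical.

```python
import math


def pair_features(fp_a, fp_b):
    """Concatenate common (AND) and diff (XOR) bits of two binary fingerprints."""
    common = [a & b for a, b in zip(fp_a, fp_b)]
    diff = [a ^ b for a, b in zip(fp_a, fp_b)]
    return common + diff


def soft_threshold(value, threshold):
    """Shrink value toward zero; this is the proximal operator of the L1 penalty."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def score(weights, x):
    """Linear score, used here as a stand-in for 'enzymatic-reaction likeness'."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_l1_logistic(X, y, lam=0.05, lr=0.5, epochs=200):
    """Proximal-gradient training of an L1-regularized logistic model (assumed substitute)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-score(w, xi)))
            for j in range(d):
                grad[j] += (p - yi) * xi[j] / n
        # Gradient step followed by soft-thresholding, which zeroes weak weights.
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy data: (fingerprint 1, fingerprint 2, label),
# where label 1 marks a pair assumed to be linked by an enzymatic reaction.
pairs = [
    ([1, 1, 0, 0], [1, 0, 1, 0], 1),
    ([1, 1, 0, 0], [1, 1, 0, 1], 0),
    ([0, 1, 0, 0], [0, 0, 1, 0], 1),
    ([0, 1, 0, 0], [0, 1, 0, 1], 0),
]
X = [pair_features(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
weights = train_l1_logistic(X, y)
print([round(w, 3) for w in weights])
```

In this toy setting the nonzero weights indicate which fingerprint bits, shared or changed, drive the prediction; weights for features that never appear in the training pairs stay exactly zero.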

Availability: Software is available on request. Supplementary material is available at http://web.kuicr.kyoto-u.ac.jp/supp/kot/ismb2013/.

Figures

**Fig. 1.**
Metabolic pathway reconstruction frameworks. Circles c1–c9 and rectangles e1–e10 represent chemical compounds and enzyme proteins, respectively. Left and right panels represent inputs and outputs, respectively. The reference-based framework (A) extracts an organism-specific pathway from a pre-fixed pathway map using orthology information about enzyme genes, whereas the compound-filling framework (B) and the reaction-filling framework (C) are *de novo* methods that reconstruct a new pathway when reference information is not available

**Fig. 2.**
Comparison of the number of extracted features among different methods

**Fig. 3.**
AUC scores for each pathway map with diff-common feature vectors (left panel) and diff-only feature vectors (right panel)
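
As a reminder of what the AUC scores in this figure measure, the following is a minimal sketch that computes the AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half. The function name and the example labels and scores are hypothetical; the evaluation protocol used in the paper is not reproduced here.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count a win for each correctly ordered pair and half a win for each tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for the compound pairs of one pathway map.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(auc_score(labels, scores))
```

Computing this score separately for the compound pairs of each pathway map would give one value per map, which is the kind of quantity the figure plots.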

**Fig. 4.**
Substructure transformation network. Nodes represent the PubChem fingerprint components that contributed to the prediction with diff-only feature vectors, where the size of each node is proportional to its weight in L1SVM. Edges represent the 100 most frequent pairs of substructures in which one substructure is formed and the other is eliminated in a reaction
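
The edges of such a network can be illustrated by counting, over a set of substrate-product pairs, how often one fingerprint bit disappears while another appears. The sketch below is an assumption-laden illustration rather than the authors' procedure: it treats each pair as directed (substrate to product), uses hypothetical four-bit fingerprints, and the names `transformation_pairs` and `top_pairs` are invented for this example.

```python
from collections import Counter
from itertools import product


def transformation_pairs(fp_substrate, fp_product):
    """Return (eliminated_bit, formed_bit) index pairs for one compound pair."""
    bits = list(zip(fp_substrate, fp_product))
    eliminated = [i for i, (a, b) in enumerate(bits) if a and not b]
    formed = [i for i, (a, b) in enumerate(bits) if b and not a]
    return list(product(eliminated, formed))


def top_pairs(reactant_pairs, k):
    """Count substructure pairs over all compound pairs and keep the k most frequent."""
    counts = Counter()
    for fp_substrate, fp_product in reactant_pairs:
        counts.update(transformation_pairs(fp_substrate, fp_product))
    return counts.most_common(k)


# Hypothetical substrate/product fingerprints.
reactant_pairs = [
    ([1, 1, 0, 0], [1, 0, 1, 0]),
    ([0, 1, 0, 1], [0, 0, 1, 1]),
    ([1, 0, 0, 0], [0, 0, 0, 1]),
]
print(top_pairs(reactant_pairs, 1))
```

With `k=100` and real fingerprints, the returned pairs would play the role of the edges in the figure.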

**Fig. 5.**
Part of the generated *de novo* reactions combined with the existing network, where nodes and edges represent compounds and reactions, respectively. Black thin lines represent the reactions existing in KEGG. Gray lines represent 50 new reactions with high scores predicted by diff-common and diff-only feature vectors. The width of the gray edges is proportional to the predictive score. Predicted reactions (**A–I**) are explained in detail in Figure 6

**Fig. 6.**
Examples of the predicted pairs taken from Figure 5. The chemical transformation patterns in pairs (**A–C**) are already known and described in KEGG reactant pairs (note that the reactions themselves are not known; only their transformation patterns are), whereas pairs (**D–I**) have unknown patterns. (A) A C-C bond accompanied by a secondary alcohol group is cleaved to form an aldehyde group, a reaction typically found in EC sub-subclass 4.1.2 (aldehyde-lyases). (B) A C-S bond in a disulfide bond is cleaved to form an S-mercapto group, as found in EC sub-subclass 4.4.1 (carbon-sulfur lyases). (C) This chemical transformation pattern is found in many reactions in EC 2.4.1 (glycosyltransferases) and EC 3.2.1 (glycosidases). (D) This pattern is not found in known reactions. At first sight, this pair may look like two steps of methylation/demethylation (EC 2.1.1) or an intramolecular transfer of a methyl group (part of EC 5.4). On closer investigation of the Isoquinoline alkaloid biosynthesis pathway (map00950, to which these compounds belong), two steps of methylenedioxy ring formation/cleavage (EC 1.14.21 or 1.21.3) seem more natural, because some methylenedioxy ring formation reactions are known to take place in this pathway. In either case, however, methylation and methylenedioxy ring formation occur in the context of biosynthesis, whereas demethylation and methylenedioxy ring cleavage occur in the context of biodegradation. In that sense, this compound–compound pair may be a false positive once the reaction flow at the pathway level is taken into account. (E) This compound–compound pair may look like an intramolecular transfer of a hydroxy group, which is typically found in EC 5.4.4 (hydroxymutases), but the transfer of a hydroxy group from one position to another in an aromatic ring is not found in any known reaction stored in KEGG. This pair may be another false positive, because substitution of a hydroxy group in an aromatic ring is much less likely to occur than addition of a hydroxy group. Some anaerobic bacteria are known to have 4-hydroxybenzoyl-CoA reductase (EC 1.3.7.9), which catalyzes the substitution of a hydroxy group in an aromatic ring. However, we assume that an intramolecular transfer of a hydroxy group in a substituted aromatic ring would be hard to catalyze. (F) Although there are many varieties of hydroxylases (part of EC 1.14), there is no known pattern that produces a hydroxylamine from an amide group. (G) This conversion would require more than one reaction step, and an important step would be similar to EC 4.1.2 (aldehyde-lyases). (H) There are similar EC 2.3.3 (acyl transferases) reactions in polyketide synthesis. (I) Some EC 1.2.3 (oxidases) enzymes catalyze similar reactions

Cited by

Roche-Lima A. Implementation and comparison of kernel-based learning methods to predict metabolic networks. Netw Model Anal Health Inform Bioinform. 2016;5(1):26. doi: 10.1007/s13721-016-0134-5. PMID: 27471658
Tabei Y, Yamanishi Y, Kotera M. Simultaneous prediction of enzyme orthologs from chemical transformation patterns for de novo metabolic pathway reconstruction. Bioinformatics. 2016 Jun 15;32(12):i278-i287. doi: 10.1093/bioinformatics/btw260. PMID: 27307627
Kotera M, Tabei Y, Yamanishi Y, Muto A, Moriya Y, Tokimatsu T, Goto S. Metabolome-scale prediction of intermediate compounds in multistep metabolic pathways with a recursive supervised approach. Bioinformatics. 2014 Jun 15;30(12):i165-74. doi: 10.1093/bioinformatics/btu265. PMID: 24931980
Yamanishi Y, Tabei Y, Kotera M. Metabolome-scale de novo pathway reconstruction using regioisomer-sensitive graph alignments. Bioinformatics. 2015 Jun 15;31(12):i161-70. doi: 10.1093/bioinformatics/btv224. PMID: 26072478
Harada S, Akita H, Tsubaki M, Baba Y, Takigawa I, Yamanishi Y, Kashima H. Dual graph convolutional neural network for predicting chemical networks. BMC Bioinformatics. 2020 Apr 23;21(Suppl 3):94. doi: 10.1186/s12859-020-3378-0. PMID: 32321421
