J Cheminform. 2014 Jan 27;6(1):2.

doi: 10.1186/1758-2946-6-2.

Comparative evaluation of open source software for mapping between metabolite identifiers in metabolic network reconstructions: application to Recon 2

Hulda S Haraldsdóttir, Ines Thiele, Ronan M T Fleming¹

PMID: 24468196
PMCID: PMC3917611
DOI: 10.1186/1758-2946-6-2

Hulda S Haraldsdóttir et al. J Cheminform. 2014.

Affiliation

¹ Center for Systems Biology, University of Iceland, Sturlugata 8, IS-101 Reykjavik, Iceland. ronan.mt.fleming@gmail.com.

Abstract

Background: An important step in the reconstruction of a metabolic network is annotation of metabolites. Metabolites are generally annotated with various database- or structure-based identifiers. Metabolite annotations in metabolic reconstructions may be incorrect or incomplete and thus need to be updated prior to their use. Genome-scale metabolic reconstructions generally include hundreds of metabolites. Manually updating annotations is therefore highly laborious. This prompted us to look for open-source software applications that could facilitate automatic updating of annotations by mapping between available metabolite identifiers. We identified three applications developed for the metabolomics and chemical informatics communities as potential solutions. The applications were MetMask, the Chemical Translation System, and UniChem. The first implements a "metabolite masking" strategy for mapping between identifiers, whereas the latter two implement different versions of an InChI-based strategy. Here we evaluated the suitability of these applications for the task of mapping between metabolite identifiers in genome-scale metabolic reconstructions. We applied the best-suited application to updating identifiers in Recon 2, the latest reconstruction of human metabolism.

Results: All three applications enabled partially automatic updating of metabolite identifiers, but significant manual effort was still required to fully update identifiers. We were able to reduce this manual effort by searching for new identifiers using multiple types of information about metabolites. When multiple types of information were combined, the Chemical Translation System enabled us to update over 3,500 metabolite identifiers in Recon 2. All but approximately 200 identifiers were updated automatically.

Conclusions: We found that an InChI-based application such as the Chemical Translation System was better suited to the task of mapping between metabolite identifiers in genome-scale metabolic reconstructions. We identified several features, however, that could be added to such an application in order to tailor it to this task.

Figures

**Figure 1**
**Lactose stereoisomers.** Two epimers of lactose occur in nature, α-lactose and β-lactose. The epimers differ by the configuration of structural groups around a single stereogenic carbon atom (top right). **(a)** In KEGG Compound the synonyms lactose and milk sugar are assigned to a generic stereoisomer, where the configuration around this stereogenic carbon is not specified (C00243). Reactions, enzymes and pathways involving lactose are linked to this entry in KEGG. **(b)** The same synonyms and most lactose-related data are linked to the α-epimer in HMDB (HMDB00186). There is neither an entry for the generic stereoisomer in HMDB, nor an entry for the α-epimer in KEGG Compound.
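The mismatch in Figure 1 (a generic stereoisomer in one database, a specific epimer in another) can be illustrated with InChIKeys. In a standard InChIKey the first 14-character block hashes the molecular skeleton, while the second block covers stereochemistry and isotopic layers, so two entries that differ only in stereochemistry share the first block. The sketch below is a minimal illustration of that comparison and is not taken from any of the evaluated applications. The function names and the three keys are hypothetical placeholders with the right shape, not real InChIKeys for lactose.

```python
def split_inchikey(key: str) -> tuple[str, str, str]:
    """Split an InChIKey into its three hyphen-separated blocks (14, 10, 1 chars)."""
    parts = key.strip().upper().split("-")
    if len(parts) != 3 or [len(p) for p in parts] != [14, 10, 1]:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    return parts[0], parts[1], parts[2]


def compare_inchikeys(a: str, b: str) -> str:
    """Classify two InChIKeys as 'identical', 'same_skeleton', or 'different'.

    'same_skeleton' means the first blocks match but the keys differ elsewhere,
    which is what a generic stereoisomer and one of its epimers would show.
    """
    blocks_a = split_inchikey(a)
    blocks_b = split_inchikey(b)
    if blocks_a == blocks_b:
        return "identical"
    if blocks_a[0] == blocks_b[0]:
        return "same_skeleton"
    return "different"


# Placeholder keys (hypothetical, for illustration only).
GENERIC = "AAAAAAAAAAAAAA-BBBBBBBBSA-N"  # stands in for a generic stereoisomer
EPIMER = "AAAAAAAAAAAAAA-CCCCCCCCSA-N"   # same skeleton, different second block
OTHER = "DDDDDDDDDDDDDD-BBBBBBBBSA-N"    # unrelated compound

print(compare_inchikeys(GENERIC, EPIMER))
print(compare_inchikeys(GENERIC, OTHER))
```

A first-block match only says that two entries may be stereoisomers or isotopic variants of each other; deciding whether the mapping is acceptable for a given reconstruction still requires the kind of annotation described in Figure 4.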

**Figure 2**
**Identifiers output in identifier mapping tests.** Annotations of unique identifiers returned by each application, **(a)** when all mapping tests are included, and **(b)** when only tests involving identifier types covered by UniChem are included. The output identifiers returned in all included tests were pooled and duplicates removed. If the same identifier was returned in more than one test it was only counted once. The annotations are explained in Section *Scoring*.
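The pooling step described in the Figure 2 caption (combine the outputs of all included tests, then count each identifier once) can be sketched as follows. This is a hedged illustration, not the authors' code: the test names, the identifiers, and the data layout are hypothetical, and the four annotation labels are borrowed from the wording of the Figure 4 caption. As a simplification, the sketch assumes an identifier carries the same annotation in every test that returns it.

```python
from collections import Counter

# Hypothetical results: test name -> list of (output identifier, annotation).
test_outputs = {
    "test_1": [("ID_A", "preferred"), ("ID_B", "valid")],
    "test_2": [("ID_A", "preferred"), ("ID_C", "incorrect")],
    "test_3": [("ID_B", "valid"), ("ID_D", "not valid")],
}


def pool_unique(outputs):
    """Pool identifiers from all tests, keeping each identifier only once."""
    pooled = {}
    for results in outputs.values():
        for identifier, annotation in results:
            # setdefault keeps the first annotation seen for an identifier.
            pooled.setdefault(identifier, annotation)
    return pooled


def annotation_counts(outputs):
    """Count unique identifiers per annotation label."""
    return Counter(pool_unique(outputs).values())


counts = annotation_counts(test_outputs)
for label, n in sorted(counts.items()):
    print(label, n)
```

Restricting the input dictionary to a subset of tests before pooling would give the kind of filtered view described for panel (b).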

**Figure 3**
**Recon 2 identifiers.** Identifier statistics for Recon 2 before and after metabolite annotations were updated with CTS. **(a)** Number of unique metabolites with each of the seven types of identifiers. n: names, i: InChIKeys, c: ChEBI ID, h: HMDB ID, k: KEGG CID, p: PubChem CID, l: LipidMAPS ID. **(b)** Number of unique metabolites with one, and up to seven, identifiers each.
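The two tallies described in the Figure 3 caption can be computed from a simple annotation table. The sketch below assumes each metabolite is stored as a mapping from identifier type to value, with missing or empty values meaning "no identifier of that type". The type names, the metabolite keys, and the identifier values are hypothetical placeholders and do not come from Recon 2.

```python
from collections import Counter

# The seven identifier types listed in the caption (key names are assumptions).
ID_TYPES = ["name", "inchikey", "chebi", "hmdb", "kegg", "pubchem", "lipidmaps"]

# Hypothetical annotation table: metabolite -> {identifier type: value}.
metabolites = {
    "met_1": {"name": "x", "kegg": "K1", "hmdb": "H1"},
    "met_2": {"name": "y", "inchikey": None, "chebi": "C2"},
    "met_3": {"name": "z"},
}


def present_types(annotations):
    """Return the identifier types that have a non-empty value."""
    return [t for t in ID_TYPES if annotations.get(t)]


def count_by_type(mets):
    """Panel (a): number of metabolites that have each identifier type."""
    counts = Counter()
    for annotations in mets.values():
        counts.update(present_types(annotations))
    return counts


def count_by_number(mets):
    """Panel (b): number of metabolites that have 1, 2, ... identifier types."""
    return Counter(len(present_types(a)) for a in mets.values())


by_type = count_by_type(metabolites)
by_number = count_by_number(metabolites)
print(dict(by_type))
print(dict(by_number))
```

Running the same two functions on the table before and after an update gives the before/after comparison the figure reports.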

**Figure 4**
**Annotation of output identifiers.** An example demonstrating annotation of output PubChem Compound identifiers (b-e), when the KEGG Compound identifier for D-glucose **(a)** is input to a mapping application. The preferred output identifier is for D-glucose **(b)**, but an identifier for alpha-D-glucose **(c)** is also valid since it is a D-glucose. An identifier for a generic hexose **(d)**, however, is not valid. Finally, an identifier for phospholactic acid **(e)**, which is a completely different compound, is incorrect.

Cited by

Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D.
Preciat Gonzalez GA, El Assal LRP, Noronha A, Thiele I, Haraldsdóttir HS, Fleming RMT. J Cheminform. 2017 Jun 14;9(1):39. doi: 10.1186/s13321-017-0223-1. PMID: 29086112.
Mind the Gap: Mapping Mass Spectral Databases in Genome-Scale Metabolic Networks Reveals Poorly Covered Areas.
Frainay C, Schymanski EL, Neumann S, Merlet B, Salek RM, Jourdan F, Yanes O. Metabolites. 2018 Sep 15;8(3):51. doi: 10.3390/metabo8030051. PMID: 30223552.
How to crack a SMILES: automatic crosschecked chemical structure resolution across multiple services using MoleculeResolver.
Müller S. J Cheminform. 2025 Aug 4;17(1):117. doi: 10.1186/s13321-025-01064-7. PMID: 40760698.
Consistency, Inconsistency, and Ambiguity of Metabolite Names in Biochemical Databases Used for Genome-Scale Metabolic Modelling.
Pham N, van Heck RGA, van Dam JCJ, Schaap PJ, Saccenti E, Suarez-Diez M. Metabolites. 2019 Feb 6;9(2):28. doi: 10.3390/metabo9020028. PMID: 30736318.
Many InChIs and quite some feat.
Warr WA. J Comput Aided Mol Des. 2015 Aug;29(8):681-94. doi: 10.1007/s10822-015-9854-3. Epub 2015 Jun 17. PMID: 26081259.
