PLoS Comput Biol. 2018 May 23;14(5):e1006146.

doi: 10.1371/journal.pcbi.1006146. eCollection 2018 May.

Traceability, reproducibility and wiki-exploration for "à-la-carte" reconstructions of genome-scale metabolic models

Méziane Aite¹, Marie Chevallier¹,², Clémence Frioux¹, Camille Trottier¹,³, Jeanne Got¹, María Paz Cortés⁴,⁵,⁶, Sebastián N Mendoza⁴,⁶, Grégory Carrier⁷, Olivier Dameron¹, Nicolas Guillaudeux¹, Mauricio Latorre⁴,⁶,⁸,⁹, Nicolás Loira⁴,⁶, Gabriel V Markov¹⁰, Alejandro Maass⁴,⁶, Anne Siegel¹

Affiliations

¹ IRISA, Univ Rennes, Inria, CNRS, Rennes, France.
² ECOBIO, Univ Rennes, CNRS, Rennes, France.
³ UMR 6004 ComBi, Université de Nantes, CNRS, Nantes, France.
⁴ Centro de Modelamiento Matemático, Universidad de Chile, Santiago, Chile.
⁵ Facultad de Ingeniería y Ciencias, Universidad Adolfo Ibáñez, Santiago, Chile.
⁶ Centro para la Regulación del Genoma (Fondap 15090007), Universidad de Chile, Santiago, Chile.
⁷ Laboratoire de Physiologie et de Biotechnologie des Algues, IFREMER, Nantes, France.
⁸ Instituto de ciencias de la ingeniería, Universidad de O'Higgins, Rancagua, Chile.
⁹ Instituto de Nutrición y Tecnología de los Alimentos, Universidad de Chile, Santiago, Chile.
¹⁰ UMR 8227, Integrative Biology of Marine Models, Station biologique de Roscoff, Sorbonne Université, CNRS, Roscoff, France.

PMID: 29791443
PMCID: PMC5988327
DOI: 10.1371/journal.pcbi.1006146

Méziane Aite et al. PLoS Comput Biol. 2018.

Abstract

Genome-scale metabolic models have become the tool of choice for the global analysis of microorganism metabolism, and their reconstruction has attained high standards of quality and reliability. Improvements in this area have been accompanied by the development of several major platforms and databases and by an explosion of individual bioinformatics methods. Consequently, many recent models result from "à la carte" pipelines that combine platforms, individual tools and biological expertise to enhance the quality of the reconstruction. Although very useful, introducing heterogeneous tools that hardly interact with each other causes a loss of traceability and reproducibility in the reconstruction process. This is a real obstacle, especially for less studied species, whose metabolic reconstruction can greatly benefit from comparison with good-quality models of related organisms. This work proposes an adaptable workspace, AuReMe, for sustainable reconstruction or improvement of genome-scale metabolic models involving personalized pipelines. At each step, relevant information about the modifications that a method makes to the model is stored. This ensures that the process is reproducible and documented regardless of the combination of tools used. Additionally, the workspace provides a way to browse metabolic models and their metadata through the automatic generation of ad hoc local wikis dedicated to monitoring and facilitating the reconstruction process. AuReMe supports exploration and semantic querying based on RDF databases. We illustrate how this workspace allowed handling, in an integrated way, the metabolic reconstructions of non-model organisms such as an extremophile bacterium or eukaryotic algae. Among relevant applications, the latter reconstruction led to putative evolutionary insights into a metabolic pathway.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. AuReMe workspace.**
*Overview of the AuReMe workspace*. Admissible inputs include standard formats from the genomics and metabolic modeling fields, which can be outputs of major reconstruction platforms. *AuReMe* acts as a workflow controller that administers the reconstruction or modification of the genome-scale metabolic model (GSM) performed by heterogeneous and independent tools. These tools are part of the services of *AuReMe* (reconstruction tools, analyses, manual curation) and can be chained together, either in a pre-set pipeline or in a customized one. In either case, the *PADMet* data manager stores information about the model and its metadata, most importantly the process metadata, which keep track of the modifications performed (at what step a reaction was added, by which tool, etc.). At any time, the reconstruction can be monitored locally via an automatically generated wiki that informs the user about the state of the model. Outputs of *AuReMe* can be self-sufficient or be integrated again into many existing platforms.
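
The process metadata described in this caption (at what step a reaction was added, and by which tool) can be pictured with a small data structure. The following is only an illustrative sketch: the class names `ProvenanceRecord` and `TracedModel`, the tool labels and the reaction identifiers are hypothetical and do not reflect the actual *PADMet* format or API.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProvenanceRecord:
    """One hypothetical process-metadata entry for a reaction."""
    reaction_id: str
    step: int       # position of the tool in the pipeline
    tool: str       # hypothetical tool label
    category: str   # e.g. "annotation", "orthology", "gap-filling", "manual"


@dataclass
class TracedModel:
    """Toy model that logs where every reaction came from."""
    reactions: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def add_reaction(self, reaction_id, step, tool, category):
        # Store the reaction and append a record describing its origin.
        self.reactions.add(reaction_id)
        self.log.append(ProvenanceRecord(reaction_id, step, tool, category))

    def sources_of(self, reaction_id):
        # All records explaining why a reaction is in the model.
        return [r for r in self.log if r.reaction_id == reaction_id]

    def reactions_by_category(self):
        # Group reaction identifiers by reconstruction category.
        grouped = defaultdict(set)
        for record in self.log:
            grouped[record.category].add(record.reaction_id)
        return dict(grouped)


# Placeholder identifiers and tool names, for illustration only.
model = TracedModel()
model.add_reaction("RXN-A", step=1, tool="annotation_tool", category="annotation")
model.add_reaction("RXN-A", step=2, tool="orthology_tool", category="orthology")
model.add_reaction("RXN-B", step=3, tool="gapfilling_tool", category="gap-filling")
```

Because every addition is appended to a log rather than overwriting earlier entries, a reaction supported by several tools keeps all of its origins, which is the kind of information a wiki page per reaction or per tool could later display.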

**Fig 2. Screen captures of several pages of the local wiki and the interactions between them.**
A local wiki-based export of the GSM facilitates user-interface exploration and traceability of the reconstruction procedure. Several screenshots of a wiki are displayed; arrows represent the links between pages. Notably, reactions can be sorted and explored according to reconstruction categories, tools and sources. The navigation panel enables exploring and comparing the contributions of each tool used in the "à la carte" GSM reconstruction pipeline. Pathways can be sorted based on their completion rate.
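
Sorting pathways by completion rate can be sketched as follows. The caption does not define the rate, so this sketch assumes it is the fraction of a pathway's reactions that are present in the model; the function names and the toy pathway data are hypothetical.

```python
def completion_rate(pathway_reactions, model_reactions):
    """Fraction of the pathway's reactions found in the model (assumed definition)."""
    pathway = set(pathway_reactions)
    if not pathway:
        return 0.0
    return len(pathway & set(model_reactions)) / len(pathway)


def sort_pathways_by_completion(pathways, model_reactions):
    """Return (pathway name, rate) pairs, most complete first."""
    rates = {
        name: completion_rate(reactions, model_reactions)
        for name, reactions in pathways.items()
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Toy data with placeholder identifiers.
pathways = {
    "pathway_A": ["R1", "R2", "R3", "R4"],
    "pathway_B": ["R5", "R6"],
}
model_reactions = {"R1", "R2", "R3", "R5", "R6"}
ranking = sort_pathways_by_completion(pathways, model_reactions)
# ranking == [("pathway_B", 1.0), ("pathway_A", 0.75)]
```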

**Fig 3. Examples of customizable GSM reconstruction pipelines.**
*PADMet* allows a user to easily implement flexible and personalized pipelines adapted to the diversity of the species considered and of the available resource data. *PADMet* traces multiple complex reconstruction paradigms. Four customizations of the reconstruction are presented here: the orthology- and gap-filling-based reconstructions of the a) *Enterococcus faecalis* and b) *Sulfobacillus thermosulfidooxidans* str. Cutipay models, and the reconstructions of the c) *Ectocarpus siliculosus* and d) *Tisochrysis lutea* models, using orthology, annotation and gap-filling. All models include manual curation and several analysis steps.

**Fig 4. Interest of heterogeneous methods in pathway completion and filling thanks to tracking of process metadata.**
Completion of the 6-hydroxymethyl-dihydropterin diphosphate biosynthesis I and the tetrahydrofolate biosynthesis pathways in *E. siliculosus* via the combination of annotation (yellow), orthology (green) and gap-filling (blue). The dihydrofolate compound with the dotted line is an instance of the DIHYDROFOLATE-GLU-N class, following the MetaCyc class ontology. The class compound is the original reactant of the DIHYDROFOLATEREDUCT-RXN reaction retrieved by annotation, whereas the previous reaction of the pathway (DIHYDROFOLATESYNTH-RXN) produces the instance dihydrofolate. Hence the gap-filling step, which uses an extended version of MetaCyc, selects an instantiated version of DIHYDROFOLATEREDUCT-RXN that consumes the instance dihydrofolate.

**Fig 5. *Tisochrysis lutea* metabolic model exploration: Origin of reactions according to the reconstruction pipeline.**
(A) *Comparison of the numbers of EC numbers introduced in the network by the annotation pipeline or by the orthology-based analysis*. 898 enzymes were identified via annotation-based information and 790 enzymes through orthology-based data, among which 524 were already identified via annotation information. (B) *Number of T-Iso ortholog enzymes according to their origin in template models*. For each of the 790 T-Iso ortholog enzymes, the figure depicts in which of the four template models an ortholog of the enzyme had been identified. The four templates used to identify ortholog enzymes in *T. lutea* were *A. thaliana*, *C. reinhardtii*, *E. siliculosus* and *Synechocystis* sp. PCC 6803. (C) *T-Iso carnosine biosynthesis*. Reconstruction of the T-Iso carnosine synthesis pathway was performed using three sources of data: (i) T-Iso genome annotations (cyan star); (ii) template metabolic models (stars) of four organisms, *A. thaliana* (blue), *C. reinhardtii* (green), *E. siliculosus* (red) and *Synechocystis* sp. (yellow), with orthology-based information; (iii) complete proteomes of the four organisms (squares) with sequence alignment information (reciprocal best BLAST hits). All reactions of the T-Iso carnosine biosynthesis are common to the four organisms except for three of them: ASPDECARBOX-RXN, HISTIDPHOS-RXN and CARNOSINE-SYNTHASE-RXN. The first seems to belong to an alternative pathway to produce β-alanine, also found in *C. reinhardtii*, *Synechocystis* sp. and *Candidatus* Phaeomarinobacter ectocarpi, a symbiotic bacterium of *E. siliculosus*. HISTIDPHOS-RXN was not found in *E. siliculosus* but was identified in its symbiotic bacterium *Candidatus* Phaeomarinobacter ectocarpi. CARNOSINE-SYNTHASE-RXN was only identified in algae (*C. reinhardtii*, *E. siliculosus* and *T. lutea*).
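
The counts reported for panel (A) can be cross-checked by inclusion-exclusion. The sketch below assumes that the three figures (898, 790 and 524) count distinct enzymes, so that the two sets overlap in exactly 524 of them; the helper name `venn_counts` is hypothetical, and the derived totals are plain arithmetic on the caption's numbers rather than values stated by the authors.

```python
def venn_counts(n_a, n_b, n_both):
    """Split two overlapping set sizes into exclusive parts and their union."""
    if n_both > min(n_a, n_b):
        raise ValueError("overlap cannot exceed either set size")
    return {
        "only_a": n_a - n_both,
        "only_b": n_b - n_both,
        "both": n_both,
        "union": n_a + n_b - n_both,
    }


# a = annotation-based enzymes, b = orthology-based enzymes (numbers from the caption).
counts = venn_counts(898, 790, 524)
# counts == {"only_a": 374, "only_b": 266, "both": 524, "union": 1164}
```

Under that assumption, 374 enzymes would come from annotation alone, 266 from orthology alone, and the two sources together would cover 1164 distinct enzymes.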
