The GEN-ERA toolbox: unified and reproducible workflows for research in microbial genomics
- PMID: 37036103
- PMCID: PMC10084500
- DOI: 10.1093/gigascience/giad022
Abstract
Background: Microbial culture collections play a key role in taxonomy by studying the diversity of their strains and providing well-characterized biological material to the scientific community for fundamental and applied research. These microbial resource centers thus need to implement new standards in species delineation, including whole-genome sequencing and phylogenomics. In this context, the genomic needs of the Belgian Coordinated Collections of Microorganisms were studied, resulting in the GEN-ERA toolbox. The latter is a unified cluster of bioinformatic workflows dedicated to both bacteria and small eukaryotes (e.g., yeasts).
Findings: This public toolbox allows researchers without specific training in bioinformatics to perform robust phylogenomic analyses. It covers all steps from genome download and quality assessment, including estimation of genomic contamination, to tree reconstruction. It also offers workflows for average nucleotide identity comparisons and metabolic modeling.
Technical details: Nextflow workflows are launched by a single command and are available on the GEN-ERA GitHub repository (https://github.com/Lcornet/GENERA). All the workflows are based on Singularity containers to increase reproducibility.
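As an illustration of what a single-command launch of a containerized Nextflow workflow can look like, the following minimal sketch assembles such a command in Python without running it. The workflow file name, container image name, and the `cpu` parameter are hypothetical placeholders, not names taken from the GEN-ERA repository; the actual commands are documented in the repository itself. Only the general Nextflow conventions are assumed: `nextflow run`, the `-with-singularity` option, and `--name value` pipeline parameters.

```python
import shlex


def build_nextflow_command(workflow, container, params):
    """Assemble a `nextflow run` command line as a list of arguments.

    workflow  -- path to a Nextflow script (hypothetical name in this sketch)
    container -- path to a Singularity image (hypothetical name in this sketch)
    params    -- pipeline parameters, passed Nextflow-style as `--name value`
    """
    cmd = ["nextflow", "run", workflow, "-with-singularity", container]
    # Sort parameters so the generated command is deterministic.
    for name, value in sorted(params.items()):
        cmd += [f"--{name}", str(value)]
    return cmd


# Placeholder names only; substitute the script, image, and parameters
# given in the documentation of the workflow you want to run.
command = build_nextflow_command("workflow.nf", "container.sif", {"cpu": 4})
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps each argument intact if it is later handed to `subprocess.run`, and recording the printed command alongside the results is one simple way to document how an analysis was launched.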
Testing: The toolbox was developed for a diversity of microorganisms, including bacteria and fungi. It was further tested on an empirical dataset of 18 (meta)genomes of early branching Cyanobacteria, providing the most up-to-date phylogenomic analysis of the Gloeobacterales order, the first group to diverge in the evolutionary tree of Cyanobacteria.
Conclusion: The GEN-ERA toolbox can be used to perform fully reproducible comparative genomic and metabolic analyses of prokaryotes and small eukaryotes. Although designed for the routine bioinformatics of culture collections, it can also be used by any researcher interested in microbial taxonomy, as exemplified by our case study on Gloeobacterales.
Keywords: Gloeobacterales; Cyanobacteria; Singularity containers; culture collections; genomics; metagenomics; nextflow; phylogenomics; phylogeny; workflow.
© The Author(s) 2023. Published by Oxford University Press GigaScience.
Conflict of interest statement
The authors declare no competing interests.