Mol Biol Evol. 2021 Dec 9;38(12):5825-5829.
doi: 10.1093/molbev/msab293.

eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale

Carlos P Cantalapiedra et al. Mol Biol Evol. 2021.

Abstract

Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.

Keywords: bioinformatics; computational genomics; functional annotation; metagenomics.
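
The four features listed in the abstract can be pictured as options on a single standalone run. The sketch below only assembles a command line and does not execute it. The file name `contigs.fasta`, the output prefix `sample1`, and the helper `build_emapper_command` are hypothetical, and the option names are written as I understand the eggNOG-mapper v2 command-line interface; they are assumptions to be checked against `emapper.py --help` for the installed version.

```python
import shlex


def build_emapper_command(contigs_fasta, output_prefix, cpus=10):
    """Assemble (but do not run) an emapper.py command line.

    All option names are assumptions about the eggNOG-mapper v2 CLI;
    verify them with `emapper.py --help` before use.
    """
    return [
        "emapper.py",
        "-i", contigs_fasta,
        "--itype", "genome",           # input is assembled contigs, not proteins
        "--genepred", "prodigal",      # 1) de novo gene prediction from contigs
        "-m", "diamond",               # search stage against eggNOG v5
        "--report_orthologs",          # 2) pairwise orthology report
        "--pfam_realign", "realign",   # 3) domain annotation by realignment
        "--decorate_gff", "yes",       # 4) GFF decoration
        "-o", output_prefix,
        "--cpu", str(cpus),
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before being run by hand.
    print(shlex.join(build_emapper_command("contigs.fasta", "sample1")))
```

Building the command as a list keeps each option next to the feature it is meant to switch on, which makes it easy to drop a step when, for example, the input is already a set of predicted proteins.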

Figures

Fig. 1
Workflow and new features of eggNOG-mapper v2. (A) The gene prediction stage uses Prodigal to perform protein prediction from assembled contigs. (B) During the search stage, HMMER3, Diamond, or MMseqs2 can be used to align the input proteins to eggNOG v5. (C) During the orthology inference stage, a report of orthologs is generated based on the desired taxonomic scope. (D) Finally, protein annotations and domains are transferred from orthologs to the queries and reported as tabular and GFF files.
Fig. 2
Performance of eggNOG-mapper v2. (A) Average minutes to annotate input proteomes: eggNOG-mapper v2 (blue) against eggNOG-mapper v1 (red). (B) Average minutes to annotate input genomes: eggNOG-mapper v2 (blue) against Prokka (green). (C) Average minutes (log scale) to annotate input proteins: MMseqs2 (-s 2,4,6; black) against Diamond (iterate/sensitive mode; orange). (D) Specificity (Sp), recall (Re), and F1 score of PFAM domain annotation, either by direct transfer from orthologs or after realignment. Full de novo realignment results were used as the reference. (E) Average minutes for PFAM domain annotation, using either the full de novo mode (brown) or the realign-to-ortholog-domains mode (blue). Benchmark setup: tests in (A) and (B) were run on 20 sets of 1–100 random proteomes (A) or genomes (B) from Almeida et al. (2021), using 10 CPUs and 80 GB of RAM. Tests in (C) were run on 35 random sets of 10–10,000,000 proteins from Progenomes v2 (Mende et al. 2020), using 30 CPUs and 240 GB of RAM. Tests in (D) and (E) were run as in (C), but only on sets of 10–100,000 proteins.
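
Panel (D) of Fig. 2 scores domain annotations against a reference set. As a rough illustration of that kind of comparison, the sketch below computes set-based precision, recall, and F1 for one protein's predicted domains. The caption does not say how "specificity" was computed, so precision is used here only because it is the standard ingredient of F1. The function name `domain_scores` and the PFAM accessions in the usage example are made up for illustration.

```python
def domain_scores(predicted, reference):
    """Return (precision, recall, f1) for two collections of domain IDs.

    `reference` plays the role of the full de novo results; `predicted`
    is the annotation being evaluated. Empty inputs score 0.0.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0

    # F1 is the harmonic mean of precision and recall.
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical accessions: two of three predictions match the reference.
    transferred = {"PF00001", "PF00002", "PF00003"}
    de_novo = {"PF00001", "PF00002", "PF00004", "PF00005"}
    print(domain_scores(transferred, de_novo))
```

In this toy case two of the three predicted domains are correct and two of the four reference domains are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7.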

References

    1. Almeida A, Nayfach S, Boland M, Strozzi F, Beracochea M, Shi ZJ, Pollard KS, Sakharova E, Parks DH, Hugenholtz P, et al. 2021. A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat Biotechnol. 39(1):105–114.
    2. Blum M, Chang H-Y, Chuguransky S, Grego T, Kandasaamy S, Mitchell A, Nuka G, Paysan-Lafosse T, Qureshi M, Raj S, et al. 2021. The InterPro protein families and domains database: 20 years on. Nucleic Acids Res. 49(D1):D344–D354.
    3. Gabaldón T, Koonin EV. 2013. Functional and evolutionary implications of gene orthology. Nat Rev Genet. 14(5):360–366.
    4. Gene Ontology Consortium. 2018. The Gene Ontology Resource: 20 years and still GOing Strong. Nucleic Acids Res. 47(D1):D330–D338.
    5. Glover N, Dessimoz C, Ebersberger I, Forslund SK, Gabaldón T, Huerta-Cepas J, Martin M-J, Muffato M, Patricio M, Pereira C, et al. 2019. Advances and applications in the quest for orthologs. Mol Biol Evol. 36(10):2157–2164.