Nat Methods. 2022 Apr;19(4):429-440.
doi: 10.1038/s41592-022-01431-4. Epub 2022 Apr 8.

Critical Assessment of Metagenome Interpretation: the second round of challenges

Fernando Meyer, Adrian Fritz, Zhi-Luo Deng, David Koslicki, Till Robin Lesker, Alexey Gurevich, Gary Robertson, Mohammed Alser, Dmitry Antipov, Francesco Beghini, Denis Bertrand, Jaqueline J Brito, C Titus Brown, Jan Buchmann, Aydin Buluç, Bo Chen, Rayan Chikhi, Philip T L C Clausen, Alexandru Cristian, Piotr Wojciech Dabrowski, Aaron E Darling, Rob Egan, Eleazar Eskin, Evangelos Georganas, Eugene Goltsman, Melissa A Gray, Lars Hestbjerg Hansen, Steven Hofmeyr, Pingqin Huang, Luiz Irber, Huijue Jia, Tue Sparholt Jørgensen, Silas D Kieser, Terje Klemetsen, Axel Kola, Mikhail Kolmogorov, Anton Korobeynikov, Jason Kwan, Nathan LaPierre, Claire Lemaitre, Chenhao Li, Antoine Limasset, Fabio Malcher-Miranda, Serghei Mangul, Vanessa R Marcelino, Camille Marchet, Pierre Marijon, Dmitry Meleshko, Daniel R Mende, Alessio Milanese, Niranjan Nagarajan, Jakob Nissen, Sergey Nurk, Leonid Oliker, Lucas Paoli, Pierre Peterlongo, Vitor C Piro, Jacob S Porter, Simon Rasmussen, Evan R Rees, Knut Reinert, Bernhard Renard, Espen Mikal Robertsen, Gail L Rosen, Hans-Joachim Ruscheweyh, Varuni Sarwal, Nicola Segata, Enrico Seiler, Lizhen Shi, Fengzhu Sun, Shinichi Sunagawa, Søren Johannes Sørensen, Ashleigh Thomas, Chengxuan Tong, Mirko Trajkovski, Julien Tremblay, Gherman Uritskiy, Riccardo Vicedomini, Zhengyang Wang, Ziye Wang, Zhong Wang, Andrew Warren, Nils Peder Willassen, Katherine Yelick, Ronghui You, Georg Zeller, Zhengqiao Zhao, Shanfeng Zhu, Jie Zhu, Ruben Garrido-Oter, Petra Gastmeier, Stephane Hacquard, Susanne Häußler, Ariane Khaledi, Friederike Maechler, Fantin Mesny, Simona Radutoiu, Paul Schulze-Lefert, Nathiana Smit, Till Strowig, Andreas Bremges, Alexander Sczyrba, Alice Carolyn McHardy

Fernando Meyer et al. Nat Methods. 2022 Apr.

Abstract

Evaluating metagenomic software is key for optimizing metagenome interpretation and is the focus of the Initiative for the Critical Assessment of Metagenome Interpretation (CAMI). The CAMI II challenge engaged the community to assess methods on realistic and complex datasets with long- and short-read sequences, created computationally from around 1,700 new and known genomes, as well as 600 new plasmids and viruses. Here we analyze 5,002 results by 76 program versions. Substantial improvements were seen in assembly, some due to long-read data. Related strains remained challenging for assembly and for genome recovery through binning, and assembly quality was a further challenge for binning. Profilers markedly matured, with taxon profilers and binners excelling at higher bacterial ranks but underperforming for viruses and Archaea. Clinical pathogen detection results revealed a need to improve reproducibility. Runtime and memory usage analyses identified efficient programs, including some that were also top performers on other metrics. The results identify challenges and guide researchers in selecting methods for analyses.

Conflict of interest statement

A.E.D. cofounded Longas Technologies Pty Ltd, a company aimed at development of synthetic long-read sequencing technologies, and is employed by Illumina Australia Pty Ltd. A.C. is employed by Google LLC. L.I. is employed by 10X Genomics. E.R.R. conducted an internship at Empress Therapeutics. E. Georganas is employed by Intel Corporation. G.U. is employed by Amazon.com, Inc. The remaining authors declare no competing interests.

Figures

Fig. 1. Metagenome assembler performances on the marine and strain-madness datasets.
a, Radar plots of genome fraction. b, Mismatches per 100 kilobases (kb). c, Misassemblies. d, NGA50. e, Strain recall. f, Strain precision. For methods with multiple evaluated versions, the best ranked version on the marine data is shown (Supplementary Fig. 1 and Supplementary Table 3). Absolute values for metrics are log scaled. Lines indicate different subsets of genomes analyzed, and the value of the GSAs indicates the upper bound for a metric. The metrics are shown for both unique and common strain genomes. g, Genome recovery fraction versus genome sequencing depth (coverage) on the marine dataset. Blue indicates unique genomes (<95% ANI), green common genomes (ANI ≥ 95%) and orange high-copy circular elements. Gray lines indicate the coverage at which the first genome is recovered with ≥90% genome fraction.
Fig. 2. Performance of genome binners on short-read assemblies (GSA and MA, MEGAHIT) of the marine, strain-madness, and plant-associated data.
a, Boxplots of average completeness, purity, ARI, percentage of binned bp and fraction of genomes recovered with moderate or higher quality (>50% completeness, <10% contamination) across methods from each dataset (Methods). Arrows indicate the average. b–g, Boxplots of completeness per genome and purity per bin, and bar charts of ARI, binned bp and moderate or higher quality genomes recovered, by method, for each dataset: marine GSA (b), marine MA (c), strain-madness GSA (d), strain-madness MA (e), plant-associated GSA (f) and plant-associated MA (g). The submission with the highest F1-score per method on a dataset is shown (Supplementary Tables 9–15). Boxes in boxplots indicate the interquartile range of n results, the center line the median and arrows the average. Whiskers extend to 1.5 × interquartile range or to the maximum and minimum if there is no outlier. Outliers are results represented as points outside 1.5 × interquartile range above the upper quartile and below the lower quartile.
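
The quality threshold quoted in the Fig. 2 caption (>50% completeness, <10% contamination) can be illustrated with a minimal sketch. It assumes that contamination is computed as 1 minus purity and that the F1-score is the harmonic mean of average completeness and average purity; neither definition is spelled out in the caption, and the `Bin` class, the function names and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """A predicted genome bin (hypothetical container for illustration)."""
    name: str
    completeness: float  # fraction in [0, 1]
    purity: float        # fraction in [0, 1]


def is_moderate_or_higher(b: Bin,
                          min_completeness: float = 0.5,
                          max_contamination: float = 0.1) -> bool:
    """Apply the caption's thresholds: >50% completeness, <10% contamination.

    Assumption: contamination = 1 - purity.
    """
    contamination = 1.0 - b.purity
    return b.completeness > min_completeness and contamination < max_contamination


def f1_score(avg_completeness: float, avg_purity: float) -> float:
    """Harmonic mean of average completeness and purity (assumed F1 definition)."""
    total = avg_completeness + avg_purity
    if total == 0:
        return 0.0
    return 2 * avg_completeness * avg_purity / total


# Made-up example bins, not taken from the study.
bins = [
    Bin("bin_1", completeness=0.95, purity=0.99),  # passes both thresholds
    Bin("bin_2", completeness=0.60, purity=0.85),  # too contaminated
    Bin("bin_3", completeness=0.40, purity=0.98),  # too incomplete
]
good = [b.name for b in bins if is_moderate_or_higher(b)]
print(good)
```

With these made-up values only `bin_1` satisfies both thresholds.
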
Fig. 3. Taxonomic binning performance across ranks per dataset.
a, Marine. b, Strain-madness. c, Plant-associated. Metrics were computed over unfiltered (solid lines) and 1%-filtered (that is, without the 1% smallest bins in bp, dashed lines) predicted bins of short reads (SR), long reads (LR) and contigs of the GSA. Shaded bands show the standard error across bins.
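
The 1% filtering mentioned in the Fig. 3 caption can be read in more than one way. The sketch below assumes one reading: the smallest predicted bins are dropped, smallest first, as long as the removed bins together account for at most 1% of all binned base pairs. The function name and the example sizes are hypothetical, and the evaluation software used in the study may define the filter differently.

```python
def filter_smallest_bins(bin_sizes_bp: dict[str, int],
                         fraction: float = 0.01) -> dict[str, int]:
    """Drop the smallest bins whose cumulative size stays within `fraction` of total bp.

    Assumption: "1% smallest bins in bp" means the smallest bins summing to
    at most 1% of the total binned base pairs.
    """
    total_bp = sum(bin_sizes_bp.values())
    budget_bp = fraction * total_bp
    kept = dict(bin_sizes_bp)
    removed_bp = 0
    # Walk bins from smallest to largest and remove while the budget allows.
    for name, size in sorted(bin_sizes_bp.items(), key=lambda item: item[1]):
        if removed_bp + size > budget_bp:
            break
        removed_bp += size
        del kept[name]
    return kept


# Made-up bin sizes in base pairs (total: 10,000,000 bp, so the budget is 100,000 bp).
sizes = {"a": 5_000_000, "b": 4_000_000, "c": 950_000, "d": 30_000, "e": 20_000}
filtered = filter_smallest_bins(sizes)
print(sorted(filtered))
```

Here bins `d` and `e` (50,000 bp together) fall within the 1% budget and are removed, while `c` is kept because removing it would exceed the budget.
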
Fig. 4. Taxonomic profiling results for the marine and strain-madness datasets at genus level.
a,b, Marine datasets. c,d, Strain-madness datasets. Results are shown for the overall best ranked submission per software version (Supplementary Tables 33 and 35, and Supplementary Fig. 12). a,c, Purity versus completeness. b,d, Upper bound of L1 norm (2) minus actual L1 norm versus upper bound of weighted UniFrac error (16) minus actual weighted UniFrac error. Symbols indicate the mean over ten marine and 100 strain-madness samples, respectively, and error bars the standard deviation. Metrics were determined using OPAL with default settings.
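
The upper bound of 2 for the L1 norm in the Fig. 4 caption follows when both profiles are relative abundances that each sum to 1 at a given rank: two profiles with no taxa in common differ by 1 + 1 = 2. The sketch below is not OPAL's code; it assumes profiles are plain dictionaries of relative abundances, and the genus names and values are illustrative only.

```python
L1_UPPER_BOUND = 2.0  # reached when two normalized profiles share no taxa


def l1_norm_error(truth: dict[str, float], predicted: dict[str, float]) -> float:
    """Sum of absolute abundance differences over the union of taxa."""
    taxa = set(truth) | set(predicted)
    return sum(abs(truth.get(t, 0.0) - predicted.get(t, 0.0)) for t in taxa)


def plotted_value(truth: dict[str, float], predicted: dict[str, float]) -> float:
    """Quantity on the axis in panels b and d: upper bound minus actual error."""
    return L1_UPPER_BOUND - l1_norm_error(truth, predicted)


# Illustrative genus-level profiles (made-up abundances).
truth = {"Escherichia": 0.5, "Bacillus": 0.5}
partial = {"Escherichia": 0.75, "Bacillus": 0.25}
disjoint = {"Vibrio": 1.0}

print(l1_norm_error(truth, truth))     # identical profiles
print(l1_norm_error(truth, partial))   # same taxa, shifted abundances
print(l1_norm_error(truth, disjoint))  # no shared taxa
```

Because the plotted quantity is the upper bound minus the error, larger values in panels b and d indicate better agreement with the gold standard.
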
Fig. 5. Computational requirements of software from all categories.
a, Runtime. b, Maximum memory usage. Results are reported for the marine and strain-madness read data or GSAs (Supplementary Table 40). The x axes are log scaled and the numbers given are the software version numbers.
