Bioinformatics. 2025 Oct 2;41(10):btaf538.
doi: 10.1093/bioinformatics/btaf538.

Enhancing genome recovery across metagenomic samples using MAGmax

Arangasamy Yazhini et al. Bioinformatics.

Abstract

Summary: The number of metagenome-assembled genomes (MAGs) is rapidly increasing with the growing scale of metagenomic studies, driving fast progress in microbiome research. Sample-wise assembly has become the standard due to its computational efficiency and strain-level resolution. It requires dereplication, that is, the removal of near-identical genomes assembled from different metagenomic samples. We present MAGmax, an efficient dereplication tool that enhances both the quantity and quality of MAGs through a strategy of bin merging and reassembly. Unlike dRep, which selects a single representative bin per genome cluster, MAGmax merges multiple bins within a cluster and reassembles them to increase coverage. MAGmax yields more dereplicated MAGs of higher quality than dRep, runs at 1.6× the speed, and uses about one third of the memory.
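
The contrast between the two dereplication strategies described above can be sketched in a few lines of Python. This is an illustration only, not MAGmax's or dRep's implementation: the `Bin` class, the function names, the sample-tagged contig identifiers, and the quality values are all hypothetical, and the reassembly step that follows merging is not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """Hypothetical stand-in for a genomic bin from one sample."""
    name: str
    contigs: frozenset  # sample-tagged contig IDs (assumed representation)
    quality: float      # precomputed quality score (assumed given)


def pick_representative(cluster):
    """Representative strategy: keep only the best-scoring bin of the cluster."""
    return max(cluster, key=lambda b: b.quality)


def merge_cluster(cluster):
    """Merging strategy: pool the contigs of every bin in the cluster.

    A real pipeline would reassemble the pooled material afterwards;
    that step is outside the scope of this sketch.
    """
    return frozenset().union(*(b.contigs for b in cluster))


# One genome cluster made of near-identical bins from two samples (toy data).
cluster = [
    Bin("s1_bin3", frozenset({"s1:c1", "s1:c2"}), 80.0),
    Bin("s2_bin7", frozenset({"s2:c4", "s2:c9"}), 65.0),
]

representative = pick_representative(cluster)
merged_contigs = merge_cluster(cluster)
```

With the representative strategy only `s1_bin3` survives, whereas the merged set retains material from both samples, which is what makes the additional coverage available to a later reassembly.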

Availability and implementation: The MAGmax open source software, implemented in Rust, is available under the GPLv3 license at https://github.com/soedinglab/MAGmax.


Figures

Figure 1.
(a) Overview of MAGmax. The blue star marks a high-quality bin. (b) Complementary cumulative distribution of the number of dereplicated genomic bins over the bin quality score (completeness − 5 × contamination) in combination with three popular binners. (c) Quality scores of dRep bins (y-axis) and the corresponding MAGmax bins (x-axis) based on genomic cluster membership. Grey circles: the MAGmax and dRep bins are identical; blue circles: they differ. (d) Scatter plot showing quality scores of merged and reassembled bins (x-axis) versus input bins (y-axis). Blue circles represent the highest-quality bin within each genomic cluster, while cyan circles indicate the other input bins. Vertical lines connect the input bins that were used for merging and reassembly. (e) Runtime (in minutes) and peak memory usage (in GB).
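
As a reading aid for panel (b), the following sketch computes a bin quality score and the complementary cumulative count plotted there. It assumes the widely used convention of completeness minus five times contamination, both in percent; the function names and the example values are hypothetical and are not taken from the paper.

```python
def quality_score(completeness, contamination):
    """Bin quality score, assuming completeness - 5 * contamination (percent)."""
    return completeness - 5.0 * contamination


def bins_at_or_above(scores, thresholds):
    """Complementary cumulative count: number of bins scoring >= each threshold."""
    return [sum(1 for s in scores if s >= t) for t in thresholds]


# Toy (completeness, contamination) pairs in percent, not real results.
bins = [(95.0, 1.0), (80.0, 4.0), (60.0, 2.0)]
scores = [quality_score(c, x) for c, x in bins]
counts = bins_at_or_above(scores, [50, 70, 90])
```

A tool whose curve lies higher at a given threshold recovers more bins of at least that quality.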
