Bioinformatics. 2022 Dec 13;38(24):5430-5433.
doi: 10.1093/bioinformatics/btac694.

MAGScoT: a fast, lightweight and accurate bin-refinement tool

Malte Christoph Rühlemann et al. Bioinformatics. 2022.

Abstract

Motivation: Recovery of metagenome-assembled genomes (MAGs) from shotgun metagenomic data is an important task for the comprehensive analysis of microbial communities from variable sources. Single binning tools differ in their ability to leverage specific aspects of MAG reconstruction, while ensemble bin-refinement tools are often time-consuming, and their computational demand increases with community complexity. We introduce MAGScoT, a fast, lightweight and accurate implementation for the reconstruction of highest-quality MAGs from the output of multiple genome-binning tools.

Results: MAGScoT outperforms popular bin-refinement solutions in terms of quality and quantity of MAGs as well as computation time and resource consumption.

Availability and implementation: MAGScoT is available via GitHub (https://github.com/ikmb/MAGScoT) and as an easy-to-use Docker container (https://hub.docker.com/repository/docker/ikmb/magscot).

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1.
Summary of binning performance in (A) the simulated marine dataset from the CAMI2 challenge and (B) the collection of 50 gut metagenomes from the HMP2, based on CheckM output. The left panels show mean runtime per sample on 8 CPUs and 80 GB of RAM. In the center panels, completeness and contamination are combined into a score (= completeness – 0.5 × contamination), which was used to rank the bins. GSA2k in the simulated marine dataset represents the perfect-case scenario that can be obtained from the gold standard assembly filtered by a minimum contig length of 2000 bp. The right panels summarize the median completeness and contamination by binning or refinement software. Boxplots depict the distribution of bin completeness and contamination for each software output. The dashed horizontal lines in the center panels show proxies for high-quality (score > 0.9) and good (score > 0.7) MAGs. The dashed lines in the right panels show the completeness and contamination thresholds for high-quality MAGs at 90% and 10%, respectively.
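The ranking score described in the caption can be illustrated with a short sketch. This is not MAGScoT's own code: the names `BinQuality`, `bin_score`, `quality_label` and `rank_bins` are hypothetical, the example bins are invented for illustration, and it is an assumption here that completeness and contamination are given in percent (as CheckM reports them) and rescaled to fractions so that the 0.9 and 0.7 proxies apply.

```python
from typing import List, NamedTuple


class BinQuality(NamedTuple):
    """Hypothetical record of per-bin quality estimates, in percent."""
    name: str
    completeness: float   # percent, 0-100
    contamination: float  # percent, 0-100 (can exceed 100 in practice)


def bin_score(completeness: float, contamination: float) -> float:
    """Score = completeness - 0.5 x contamination, rescaled to a fraction."""
    return (completeness - 0.5 * contamination) / 100.0


def quality_label(score: float) -> str:
    """Apply the proxy thresholds from the figure caption (strictly greater)."""
    if score > 0.9:
        return "high quality"
    if score > 0.7:
        return "good"
    return "below threshold"


def rank_bins(bins: List[BinQuality]) -> List[BinQuality]:
    """Order bins from highest to lowest score."""
    return sorted(
        bins,
        key=lambda b: bin_score(b.completeness, b.contamination),
        reverse=True,
    )


if __name__ == "__main__":
    # Invented example values, not results from the paper.
    example = [
        BinQuality("bin_a", 80.0, 10.0),
        BinQuality("bin_b", 95.0, 2.0),
        BinQuality("bin_c", 60.0, 10.0),
    ]
    for b in rank_bins(example):
        s = bin_score(b.completeness, b.contamination)
        print(b.name, round(s, 3), quality_label(s))
```

Because contamination is weighted by 0.5, a bin can give up one percentage point of completeness for every two points of contamination it avoids without changing its rank.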
