Bioinformatics. 2016 Feb 1;32(3):354-61.
doi: 10.1093/bioinformatics/btv584. Epub 2015 Oct 9.

SUPER-FOCUS: a tool for agile functional analysis of shotgun metagenomic data

Genivaldo Gueiros Z Silva et al. Bioinformatics.

Abstract

Summary: Analyzing the functional profile of a microbial community from unannotated shotgun sequencing reads is one of the important goals in metagenomics. Functional profiling has valuable applications in biological research because it identifies the abundances of the functional genes of the organisms present in the original sample, answering the question of what they can do. Currently available tools do not scale well with increasing data volumes, which matters because both the number and lengths of the reads produced by sequencing platforms keep increasing. Here, we introduce SUPER-FOCUS (SUbsystems Profile by databasE Reduction using FOCUS), an agile homology-based approach that uses a reduced reference database to report the subsystems present in metagenomic datasets and profile their abundances. SUPER-FOCUS was tested with over 70 real metagenomes; the results show that it accurately predicts the subsystems present in the profiled microbial communities and is up to 1000 times faster than other tools.
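To illustrate what profiling subsystem abundances can look like in practice, the following is a minimal Python sketch, not the SUPER-FOCUS implementation. It assumes each read has already been assigned a function by some homology search, and that a lookup table maps each function to a three-level subsystem classification. The function names, the subsystem labels, the `FUNCTION_TO_SUBSYSTEM` table, and the `profile_subsystems` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical mapping from a function to its subsystem hierarchy
# (level 1, level 2, level 3). All labels here are illustrative only.
FUNCTION_TO_SUBSYSTEM = {
    "func_a": ("Carbohydrates", "Central metabolism", "Glycolysis"),
    "func_b": ("Carbohydrates", "Fermentation", "Lactate fermentation"),
    "func_c": ("Virulence", "Resistance", "Beta-lactamase"),
}


def profile_subsystems(read_hits, level=1):
    """Return the relative abundance (%) of subsystems at a given level.

    read_hits maps a read ID to the function of its best homology hit.
    Reads whose function has no subsystem mapping are ignored.
    """
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")
    counts = Counter()
    for function in read_hits.values():
        hierarchy = FUNCTION_TO_SUBSYSTEM.get(function)
        if hierarchy is not None:
            counts[hierarchy[level - 1]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: 100.0 * n / total for name, n in counts.items()}


# Assumed toy input: four reads, one of which hits an unmapped function.
hits = {"read1": "func_a", "read2": "func_b", "read3": "func_c", "read4": "unknown"}
profile = profile_subsystems(hits, level=1)
for name, percent in sorted(profile.items()):
    print(f"{name}\t{percent:.1f}%")
```

In this toy input, two of the three mapped reads fall under the same level 1 label, so that label accounts for two thirds of the profile. How the actual tool weights hits or handles reads that match several functions is not stated in the abstract and is not modeled here.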

Availability and implementation: SUPER-FOCUS was implemented in Python, and its source code and the tool website are freely available at https://edwards.sdsu.edu/SUPERFOCUS.

Contact: redwards@mail.sdsu.edu

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Workflow of the SUPER-FOCUS program
Fig. 2. Representation of a subsystem structure (Levels 1–3 classifications and Function)
Fig. 3. Percent classification sensitivity (A) and precision (B) of level 1 subsystems and speed of RAPSearch2 and SUPER-FOCUS using different databases and parameter modes. This analysis was based on a comparison of 50 HMP metagenomes, where blastx assignments using DB_100 were considered to be the true answer
Fig. 4. Percentage of level 3 subsystems present in all the testing set metagenomes predicted by SUPER-FOCUS
Fig. 5. Confusion matrix displaying the percentage of correct assignments in each level 1 subsystem for the 50 HMP metagenomes. (a) Shows the RAPSearch2 assignments in the sensitive mode to DB_100. (b) Shows the SUPER-FOCUS assignments in the sensitive mode to DB_100
Fig. 6. Classification sensitivity using level 1 classifications and speed comparison of 50 HMP metagenomes using RAPSearch2 and SUPER-FOCUS with different databases and modes, but removing Eukaryota and viral assignments. blastx assignments using DB_100 were considered to be the true answer
Fig. 7. Percent classification sensitivity (a) and precision (b) using level 1 classifications and speed comparison of three viromes using RAPSearch2 and SUPER-FOCUS with different databases and modes. blastx assignments using DB_100 were considered to be the true answer
Fig. 8. Run time comparison for the three marine viromes using SUPER-FOCUS, RTMg, MEGAN and MG-RAST
Fig. 9. Run time comparison for the one big data metagenome using SUPER-FOCUS, RTMg, MEGAN and MG-RAST
Fig. 10. Comparison of the level 1 subsystems profile of one big data metagenome using SUPER-FOCUS, RTMg, MEGAN, MG-RAST and blastx, whose assignments are considered to be the true answer
Fig. 11. Box plots displaying the percent sensitivity (A and C) and precision (B and D) of RAPSearch2 (A and B) and blastx (C and D) annotation of the 20 coral metagenomes. RAPSearch2 was tested in the fast and sensitive modes
Fig. 12. Hierarchical clustering of the taxonomic (A) and functional (B) annotations of 20 coral metagenomes. Genus level taxonomic annotation was performed using FOCUS. Functional annotation of level 3 subsystems was performed using SUPER-FOCUS using blastx and DB_98
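
Several of the captions above report classification sensitivity and precision against blastx assignments taken as the true answer. The exact definitions used in the paper are not given on this page, so the Python sketch below assumes a common per-read convention: sensitivity is the share of reads with a reference label that received the same label, and precision is the share of predicted reads whose label matches the reference. The `sensitivity_precision` helper and the toy labels are hypothetical.

```python
def sensitivity_precision(predicted, truth):
    """Return (sensitivity %, precision %) for per-read label assignments.

    predicted and truth map a read ID to a level 1 subsystem label.
    Assumed definitions (not taken from the paper):
      sensitivity = correct / number of reads with a reference label
      precision   = correct / number of reads with a predicted label
    """
    correct = sum(
        1 for read_id, label in predicted.items() if truth.get(read_id) == label
    )
    sensitivity = 100.0 * correct / len(truth) if truth else 0.0
    precision = 100.0 * correct / len(predicted) if predicted else 0.0
    return sensitivity, precision


# Toy data: the reference labels four reads, the faster tool labels three.
truth = {"r1": "A", "r2": "A", "r3": "B", "r4": "B"}
predicted = {"r1": "A", "r2": "B", "r3": "B"}
sens, prec = sensitivity_precision(predicted, truth)
print(f"sensitivity={sens:.1f}% precision={prec:.1f}%")
```

Here two of the three predictions agree with the reference, so precision is two thirds, while sensitivity is one half because only two of the four reference reads were recovered.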
