BMC Bioinformatics. 2008 Sep 19;9:386.
doi: 10.1186/1471-2105-9-386.

The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes

F Meyer et al. BMC Bioinformatics.

Abstract

Background: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics has shifted from generating sequences to analyzing them. High-throughput, low-cost next-generation sequencing has made metagenomics accessible to a wide range of researchers.

Results: A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing them against both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats.

Conclusion: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis: the availability of high-performance computing for annotating the data. The service is available at http://metagenomics.nmpdr.org.

Figures

Figure 1
After a dataset is uploaded (a), the system computes initial quality control (b) and allows the user to set the parameters for phylogenetic analysis (c). The system then displays the results (d) and allows the user to alter the parameters (e). Data shown in this example are from the dataset CF11.2 (ID: 4440026.3), which is publicly available on the MG-RAST server.
Figure 2
Overview of the workflow implemented in the metagenomics RAST pipeline. Three distinct stages of processing are executed, each adding data to a single directory, and ultimately enabling web-based browsing of results.
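
The staged layout described in this caption can be illustrated with a minimal sketch. It is not the MG-RAST implementation: the caption states only that three stages each add data to a single directory, so the stage names, file names, and file contents below are hypothetical placeholders.

```python
import tempfile
from pathlib import Path


# Hypothetical stages: the caption does not name them. Each stage only
# adds a file to the shared job directory and may read earlier outputs.
def preprocess(job_dir):
    (job_dir / "1_preprocessed.txt").write_text("quality-checked sequences\n")


def similarity_search(job_dir):
    upstream = (job_dir / "1_preprocessed.txt").read_text()
    (job_dir / "2_matches.txt").write_text("matches for: " + upstream)


def summarize(job_dir):
    upstream = (job_dir / "2_matches.txt").read_text()
    (job_dir / "3_summary.txt").write_text("summary of: " + upstream)


STAGES = [preprocess, similarity_search, summarize]


def run_pipeline(job_dir, stages=STAGES):
    """Run every stage in order against one directory; return its file names."""
    job_dir.mkdir(parents=True, exist_ok=True)
    for stage in stages:
        stage(job_dir)
    return sorted(p.name for p in job_dir.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(run_pipeline(Path(tmp) / "job_0001"))
```

Keeping all outputs in one directory means a browsing front end only needs the path of that directory to find every result for a job.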
Figure 3
We emphasize data accessibility: (a) sequence analysis results (e.g. BLAST matches) and all sequences in a metagenome are visible and can be downloaded. In addition, the server provides an overview (b) of the sequence analysis results per fragment in a metagenome (c).
Figure 4
Comparing the phylogenetic composition of four metagenomes. Initially (a) the user selects a subset of metagenomes or genomes (here we selected 2 Soudan mine samples and 2 marine samples). The next step (b) allows the user to select the basis for the comparison (protein-based, using only SEED subsystems or all SEED proteins, or RNA-based, using RDP or Greengenes) and the parameters for the matches. The parameters include e-value, minimal alignment length, p-value, and percent identity. Finally, the result (c) is displayed in tabular format, in which heatmap-style color coding is used to highlight differences. The resulting table can be downloaded as a spreadsheet.
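
The comparison described in this caption can be sketched as follows, under stated assumptions. The code filters matches by e-value, minimal alignment length, and percent identity, then tabulates the fraction of passing matches per taxon in each sample. The `Match` record, the default thresholds, and the toy data are all invented for illustration; the p-value parameter and the color coding are omitted. This is not the server's own code.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """Hypothetical record for one database match of one sequence fragment."""
    taxon: str
    evalue: float
    align_len: int
    pct_identity: float


def passes(m, max_evalue=1e-5, min_align_len=50, min_pct_identity=0.0):
    # Threshold defaults are assumptions, not values taken from the paper.
    return (m.evalue <= max_evalue
            and m.align_len >= min_align_len
            and m.pct_identity >= min_pct_identity)


def composition(matches, **params):
    """Fraction of passing matches assigned to each taxon."""
    counts = Counter(m.taxon for m in matches if passes(m, **params))
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()} if total else {}


def comparison_table(samples, **params):
    """Rows are taxa, columns are samples, cells are fractions (0.0 if absent)."""
    comps = {name: composition(ms, **params) for name, ms in samples.items()}
    taxa = sorted({t for comp in comps.values() for t in comp})
    return {t: {name: comps[name].get(t, 0.0) for name in samples} for t in taxa}


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    samples = {
        "mine": [Match("Proteobacteria", 1e-10, 80, 90.0),
                 Match("Firmicutes", 1e-2, 80, 90.0),     # fails e-value
                 Match("Firmicutes", 1e-8, 60, 85.0)],
        "marine": [Match("Cyanobacteria", 1e-20, 100, 95.0),
                   Match("Proteobacteria", 1e-6, 30, 99.0)],  # too short
    }
    for taxon, row in comparison_table(samples).items():
        print(taxon, row)
```

Passing stricter thresholds, for example `comparison_table(samples, min_pct_identity=90.0)`, changes which matches count, which mirrors how altering the match parameters changes the displayed table.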

