Proteomes. 2018 Jan 31;6(1):7.
doi: 10.3390/proteomes6010007.

Disseminating Metaproteomic Informatics Capabilities and Knowledge Using the Galaxy-P Framework

Clemens Blank et al. Proteomes.

Abstract

The impact of microbial communities, also known as the microbiome, on human health and the environment is receiving increased attention. Studying translated gene products (proteins) and comparing metaproteomic profiles may elucidate how microbiomes respond to specific environmental stimuli and interact with host organisms. Characterizing proteins expressed by a complex microbiome and interpreting their functional signature requires sophisticated informatics tools and workflows tailored to metaproteomics. Additionally, there is a need to disseminate these informatics resources to researchers undertaking metaproteomic studies, who could use them to make new and important discoveries in microbiome research. The Galaxy for proteomics platform (Galaxy-P) offers an open source, web-based bioinformatics platform for disseminating metaproteomics software and workflows. Within this platform, we have developed easily accessible and documented metaproteomic software tools and workflows aimed at training researchers in their operation and disseminating the tools for more widespread use. The modular workflows encompass the core requirements of metaproteomic informatics: (a) database generation; (b) peptide spectral matching; (c) taxonomic analysis; and (d) functional analysis. Much of the software available via the Galaxy-P platform was selected, packaged and deployed through an online metaproteomics "Contribution Fest" undertaken by a unique consortium of expert software developers and users from the metaproteomics research community, who have co-authored this manuscript. These resources are documented on GitHub and freely available through the Galaxy Toolshed, as well as through a publicly accessible metaproteomics gateway Galaxy instance. These documented workflows are well suited for the training of novice metaproteomics researchers, through online resources such as the Galaxy Training Network, as well as hands-on training workshops.
Here, we describe the metaproteomics tools available within these Galaxy-based resources, as well as the process by which they were selected and implemented in our community-based work. We hope this description will increase access to and utilization of metaproteomics tools, as well as offer a framework for continued community-based development and dissemination of cutting-edge metaproteomics software.

Keywords: Galaxy platform; bioinformatics; community development; functional microbiome; mass spectrometry; metaproteomics; software workflow development.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Generalized metaproteomics schema. Identification of metaproteome peptides is a complex workflow consisting of metaproteome sequence database generation (in FAST-ALL (FASTA) format) and peak processing of tandem mass spectrometry (MS/MS) data (in Mascot Generic Format (MGF) or mzML format). These two output files are used to match observed MS/MS spectra to predicted peptide sequences, which generates a list of bacterial peptide-spectral matches (PSMs). The bacterial PSMs can then be parsed out and subjected to functional analysis and taxonomic analysis for biological insight.
Figure 2
Galaxy interface and metaproteomics gateway. The Galaxy interface includes a tool menu, which consists of the list of available customized software within the instance in use. The central main viewing pane offers an area to view parameters for tools, edit workflows, and to visualize the results. The history menu maintains a real-time record of inputs and intermediate or final outputs from active software operations as the data is processed.
Figure 3
Sixgill tool within Galaxy. The Sixgill tool within Galaxy shows the build module, which takes a FASTQ file generated by shotgun sequencing as input and produces a Tab-Separated Values (TSV) format file as output. The filtering parameters, such as the minimum length of the gene sequence and the quality score, determine the quality and features of the output.
Figure 4
Edit view of Galaxy workflow for metaproteomics analysis. Representation of the software tools used in a Galaxy metaproteomics workflow to identify bacterial peptides from the metaproteomic dataset. The first part of the workflow includes database generation, followed by peak processing. The outputs from these sections are used for the database search to generate a list of bacterial peptide-spectral matches (PSMs). The bacterial PSMs are then parsed out and subjected to Unipept analysis using the Pept2Pro algorithm to generate outputs for functional analysis. Gene ontology categories such as biological processes, cellular localization and molecular function are generated. Additionally, the bacterial PSMs are subjected to Unipept analysis using the lowest common ancestor algorithm to generate outputs for taxonomic analysis.
Figure 5
Taxonomy analysis using Unipept. Bacterial PSMs were subjected to Unipept analysis against the UniProt database using the lowest common ancestor algorithm to generate outputs for taxonomic analysis. These outputs include a Unipept Viewer, an interactive visualization plugin that can be used to visualize the taxonomic distribution of the ocean metaproteomic dataset. Unipept also generates a Comma-Separated Values (CSV) format file that lists the peptide assignments to taxa. That file can then be parsed to generate a tabular output (lower right).
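The lowest common ancestor idea named in this caption can be illustrated with a minimal Python sketch. It is not Unipept's implementation, which handles details such as missing ranks that are ignored here. The function name and the example lineages are assumptions made for illustration only; each lineage is taken to be a root-first list of taxon names for one protein that contains the peptide.

```python
from typing import Optional, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the deepest taxon shared by all lineages, or None if there is none.

    Each lineage is assumed to be ordered from the root downwards.
    """
    if not lineages:
        return None
    lca = None
    # Walk down the ranks in parallel and stop at the first disagreement.
    for taxa_at_rank in zip(*lineages):
        if all(taxon == taxa_at_rank[0] for taxon in taxa_at_rank):
            lca = taxa_at_rank[0]
        else:
            break
    return lca


# Hypothetical lineages for two proteins that share one peptide.
example_lineages = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria"],
]
example_lca = lowest_common_ancestor(example_lineages)
```

With these two made-up lineages the peptide would be assigned to Proteobacteria, the deepest level at which both lineages still agree.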
Figure 6
Functional analysis using Unipept and GO (Gene Ontology) terms. Bacterial PSMs were subjected to Unipept analysis using the Pept2Pro algorithm to generate outputs for functional analysis. Using the PSM report, gene ontology mapping files and Unipept outputs, the query tabular file generates tabular outputs for gene ontology categories. The generated tabular outputs for molecular function (A), cellular localization (B) and biological processes (C) also list the number of peptides and PSMs associated with each gene ontology category.
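The per-category counts described in this caption can be sketched in Python as follows. This is an illustration under stated assumptions, not the Galaxy query tabular tool itself: the function name, the input shapes and the example PSMs and peptide-to-GO mapping are all invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def summarize_go(
    psms: Iterable[Tuple[int, str]],
    peptide_to_go: Dict[str, Set[str]],
) -> Dict[str, Tuple[int, int]]:
    """Map each GO term to (number of distinct peptides, number of PSMs).

    psms holds (psm_id, peptide_sequence) pairs; peptide_to_go maps a
    peptide sequence to the GO terms assigned to it.
    """
    peptides_per_term = defaultdict(set)
    psms_per_term = defaultdict(int)
    for _psm_id, peptide in psms:
        for term in peptide_to_go.get(peptide, ()):
            peptides_per_term[term].add(peptide)
            psms_per_term[term] += 1
    return {
        term: (len(peptides_per_term[term]), psms_per_term[term])
        for term in peptides_per_term
    }


# Invented example data: three PSMs covering two peptides.
example_psms = [(1, "AAK"), (2, "AAK"), (3, "LLR")]
example_mapping = {
    "AAK": {"GO:0005524"},
    "LLR": {"GO:0005524", "GO:0006412"},
}
go_summary = summarize_go(example_psms, example_mapping)
```

Peptides and PSMs are counted separately because one peptide can be matched by several spectra, so the two columns in such a table can differ.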
