BMC Bioinformatics. 2017 Oct 2;18(1):434.
doi: 10.1186/s12859-017-1849-8.

MetaComp: comprehensive analysis software for comparative meta-omics including comparative metagenomics

Peng Zhai et al. BMC Bioinformatics. 2017.

Abstract

Background: During the past decade, the development of high-throughput nucleic acid sequencing and mass spectrometry techniques has enabled the characterization of microbial communities through metagenomics, metatranscriptomics, metaproteomics and metabolomics data. To reveal the diversity of microbial communities and the interactions between living conditions and microbes, comparative analysis that integrates all four types of data is needed. Comparative meta-omics, especially comparative metagenomics, has become a routine process for highlighting significant differences in taxon composition and functional gene abundance among microbiota samples. Meanwhile, biologists are increasingly concerned with the correlations between meta-omics features and environmental factors, which may help decipher the adaptation strategy of a microbial community.

Results: We developed MetaComp, a graphical comprehensive analysis software that provides a series of statistical analysis approaches with visualized results for comparing metagenomics and other meta-omics data. The software can read files generated by a variety of upstream programs. After data loading, it offers multivariate statistics; hypothesis testing for two-sample, multi-sample and two-group sample comparisons; and a novel function, regression analysis of environmental factors. Here, regression analysis treats meta-omic features as independent variables and environmental factors as dependent variables. Moreover, MetaComp can automatically choose an appropriate two-group sample test based on the traits of the input abundance profiles. We further evaluate the performance of its choice and demonstrate applications to metagenomics, metaproteomics and metabolomics samples.
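The regression setup described above (a meta-omic feature as the independent variable, an environmental factor as the dependent variable) can be sketched as a single-feature ordinary least squares fit. This is a minimal illustration, not MetaComp's implementation: the function name `fit_line` and the sample values are hypothetical, and the abstract does not state which regression model MetaComp uses.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x: abundance of one meta-omic feature per sample (independent variable)
    y: one environmental factor per sample (dependent variable)
    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # Coefficient of determination; a constant y is fitted exactly.
    r_squared = 1.0 if syy == 0 else (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up values for illustration only (not data from the paper).
abundance = [0.10, 0.20, 0.30, 0.40, 0.50]  # hypothetical feature abundance
factor = [1.2, 2.2, 3.2, 4.2, 5.2]          # hypothetical environmental factor
slope, intercept, r_squared = fit_line(abundance, factor)
```

Fitting each feature separately against one environmental factor keeps the sketch short; a tool working with many features would more likely fit them jointly or correct for multiple testing.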

Conclusion: MetaComp, an integrative software applicable to all meta-omics data, offers a novel way to distill the influence of the living environment on a microbial community by regression analysis. Moreover, since the automatically chosen two-group sample test was verified to perform well, MetaComp is friendly to users without adequate statistical training. These improvements aim to overcome the new challenges that the big data era poses for all meta-omics data. MetaComp is available at: http://cqb.pku.edu.cn/ZhuLab/MetaComp/ and https://github.com/pzhaipku/MetaComp/.
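To make the idea of automatically choosing a two-group sample test concrete, here is a minimal sketch. The abstract does not say which traits of the abundance profiles MetaComp inspects or which tests it chooses between, so the rule below (a sample-size threshold `min_n`) and the two candidate statistics (Welch's t and Mann-Whitney U) are assumptions made purely for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        raise ValueError("both groups are constant; t is undefined")
    return (mean(a) - mean(b)) / se


def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for group a; tied pairs count one half."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)


def choose_test(a, b, min_n=30):
    """Hypothetical selection rule (NOT MetaComp's documented rule):
    use the parametric test only when both groups have at least min_n samples.
    """
    return "welch_t" if min(len(a), len(b)) >= min_n else "mann_whitney_u"


# Made-up abundances of one feature in two groups of samples.
group_a = [1.0, 2.0, 3.0]
group_b = [4.0, 5.0, 6.0]
chosen = choose_test(group_a, group_b)
t_stat = welch_t(group_a, group_b)
u_stat = mann_whitney_u(group_b, group_a)
```

With three samples per group the hypothetical rule falls back to the rank-based statistic, which does not rely on a normality assumption.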

Keywords: Comparative meta-omics; Comparative metagenomics; Graphical user interface; Statistical analysis; Visualization.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
The graphical user interface of MetaComp. (a) Drop-down menu File for data input. (b) Drop-down menu Analysis for selecting analysis methods.
Fig. 2
The workflow of MetaComp. The input data of MetaComp include meta-omics data (for all analyses) and environmental factors (only for regression analysis). The analysis procedure in MetaComp consists of three independent parts: multivariate statistics (PCA and cluster analysis), statistical hypothesis tests (two-sample test, multi-sample test and two-group sample test) and regression analysis of environmental factors. The outputs are provided as Excel spreadsheets (k-means clustering results, statistical significance for each feature and regression analysis results) and visualized in diagrams (PCA map, hierarchical clustering dendrogram, bar plot, MDS map, heat-map).
Fig. 3
The workflow of preparation for all four types of meta-omics data. Metagenomics, metatranscriptomics, metaproteomics and metabolomics data are preprocessed through experimental procedures such as molecule extraction, sequencing for nucleotides or MS measuring for peptides and metabolites. Then, bioinformatics procedures such as sequence assembly and functional annotation are introduced. Finally, the results of this workflow are functional gene, taxon and physiological metabolite abundance profiles.
Fig. 4
The visualization examples of MetaComp. a The bar plot of the top ten significantly different features. b The multi-dimensional scaling map of samples. Each point represents an individual sample. c The hierarchical clustering dendrogram of given samples. d The heat-map of given samples.
Fig. 5
Visualizations of the analysis results for metagenomic samples. a Bar plot of the top ten significantly different protein families among the eight given samples. The frequencies of PF00072, PF00144 and PF00872 fluctuate dramatically across the eight samples. b Hierarchical clustering dendrogram of the eight given samples. c Multi-dimensional scaling map of the eight given samples. The three Sargasso Sea samples group together, as do the three whale fall samples; the Minnesota farm soil and AMD samples are separated from the Sargasso Sea and whale fall samples in both the phylogenetic view and multi-dimensional distance. d Heat-map of the eight given samples. The similarity of relative gene abundance among the eight samples supports the grouping described above.
Fig. 6
Diagrams of regression. These diagrams show the relationship between DIP and selected functional genes categorized by COG (COG0379, COG0458, COG0486, COG0849, COG1190 and COG1921). The abundance of these genes is clearly linear with the DIP content.
Fig. 7
ROC curves for all five methods, showing their performance in significant feature detection.
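As a rough illustration of how an ROC curve for significant feature detection can be computed, the sketch below sweeps a threshold over per-feature scores and integrates the curve with the trapezoidal rule. The scores and truth labels are made up, and the assumption that each feature gets a score where higher means more likely significant (for example one minus its p-value) is ours, not a detail taken from the paper.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    scores: higher means the feature is more likely to be called significant
    labels: 1 if the feature is truly different between groups, else 0
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Features with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up scores and truth labels for five features.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 1, 0, 1, 0]
points = roc_points(scores, labels)
area = auc(points)
```

Comparing several methods then amounts to computing one set of points per method on the same labeled features and plotting them together.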
