Review. Nat Protoc. 2025 Aug 6. doi: 10.1038/s41596-025-01239-4. Online ahead of print.

A workflow for statistical analysis and visualization of microbiome omics data using the R microeco package

Chi Liu et al. Nat Protoc.

Abstract

The increasing complexity of experimental designs and the volume of data in the microbiome field, along with the diversification of omics data types, pose substantial challenges to statistical analysis and visualization. Here we present a step-by-step protocol based on the R microeco package (https://github.com/ChiLiubio/microeco) that details the statistical analysis and visualization of microbiome data. The omics data types covered are amplicon sequencing data, metagenomic sequencing data and nontargeted metabolomics data. The analysis of amplicon sequencing data covers data preprocessing and normalization, core taxa, alpha diversity, beta diversity, differential abundance testing and machine learning. Each section considers several data analysis scenarios to demonstrate the breadth of the protocol. We emphasize that each part of the analysis uses data normalized by the method that best analytical practice recommends for that part. Additionally, in the differential abundance testing section, we use parametric community simulation to evaluate the performance of different testing approaches. For metagenomic data, the focus is on how the outputs of bioinformatic analysis are read and preprocessed, which is where usage differs most from amplicon sequencing data. For metabolomics data, we mainly demonstrate differential testing, machine learning and association analysis with microbial abundances. To handle more complex analyses, the protocol combines different types of methods into analysis pipelines. This protocol is more comprehensive and scalable than alternative methods. The provided R code runs in about 6 h on a laptop computer.
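
As a rough illustration of two of the steps named in the abstract (normalization to relative abundance, followed by alpha and beta diversity), the sketch below computes a Shannon index and a Bray-Curtis dissimilarity on a made-up count table. It is written in plain Python rather than R and does not use or reproduce the microeco API. The sample names, counts and function names are all hypothetical, and the protocol itself may normalize and compute these quantities differently.

```python
import math

# Hypothetical toy count table: keys are samples, values are counts per taxon.
counts = {
    "S1": [10, 10, 0, 0],
    "S2": [5, 5, 5, 5],
}

def relative_abundance(row):
    """Scale raw counts so that each sample sums to 1."""
    total = sum(row)
    return [x / total for x in row]

def shannon(row):
    """Shannon index (natural log) of one sample; absent taxa contribute 0."""
    return -sum(p * math.log(p) for p in relative_abundance(row) if p > 0)

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples, on relative abundances."""
    pa, pb = relative_abundance(a), relative_abundance(b)
    return sum(abs(x - y) for x, y in zip(pa, pb)) / (sum(pa) + sum(pb))

# Alpha diversity: one value per sample.
alpha = {sample: shannon(row) for sample, row in counts.items()}

# Beta diversity: one value per pair of samples.
beta = bray_curtis(counts["S1"], counts["S2"])

print(alpha)
print(beta)
```

In this toy table S2 is spread evenly over four taxa and S1 over only two, so S2 has the higher Shannon index, and the two samples share half of their relative abundance.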

Conflict of interest statement

Competing interests: The authors declare no competing interests.
