Front Microbiol. 2023 Mar 30:14:1094800.
doi: 10.3389/fmicb.2023.1094800. eCollection 2023.

Reducing bias in microbiome research: Comparing methods from sample collection to sequencing

Jolanda Kool et al. Front Microbiol. 2023.

Abstract

Background: Microbiota profiles are strongly influenced by many technical aspects that impact the ability of researchers to compare results. To investigate and identify potential biases introduced by technical variations, we compared several approaches throughout the entire workflow of a microbiome study, from sample collection to sequencing, using commercially available mock communities (from bacterial strains as well as from DNA) and multiple human fecal samples, including a large set of positive controls created as a random mix of several participant samples.

Methods: Human fecal material was sampled, and aliquots were used to test two commercially available stabilization solutions (OMNIgene·GUT and Zymo Research) in comparison to samples frozen immediately upon collection. In addition, the methodology for DNA extraction, input of DNA, or the number of PCR cycles were analyzed. Furthermore, to investigate the potential batch effects in DNA extraction, sequencing, and barcoding, we included 139 positive controls.

Results: Samples preserved in both the stabilization buffers limited the overgrowth of Enterobacteriaceae when compared to unpreserved samples stored at room temperature (RT). These stabilized samples stored at RT were different from immediately frozen samples, where the relative abundance of Bacteroidota was higher and Actinobacteriota and Firmicutes were lower. As reported previously, the method used for cell disruption was a major contributor to variation in microbiota composition. In addition, a high number of cycles during PCR led to an increase in contaminants detected in the negative controls. The DNA extraction had a significant impact on the microbial composition, as did the use of different Illumina barcodes during library preparation and sequencing, while no batch effect was observed in replicate runs.

Conclusion: Our study reaffirms the importance of the mechanical cell disruption method and immediate frozen storage as critical aspects in fecal microbiota studies. A comparison of storage conditions revealed that the bias was limited in RT samples preserved in stabilization systems, and these may be a suitable compromise when logistics are challenging due to the size or location of a study. Moreover, to reduce the effect of contaminants in fecal microbiota profiling studies, we suggest the use of ~125 pg input DNA and 25 PCR cycles as optimal parameters during library preparation.

Keywords: 16S rRNA gene sequencing; gut; human studies; microbiome; microbiota; reproducible analysis.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Overview of the conditions tested, samples included and processed in every experiment, and analyses performed.
Figure 2
Overview of the different methods for sample collection and storage. The effect of storage at RT for 3–5 days was tested with and without the stabilization buffer of the Zymo Research (ZYBUF) and OMNIgene·GUT (OMBUF) collection tubes and compared to the standard storage condition (−80). Three aliquots of fecal material from 12 participant samples (PSs) and one negative bacterial control (NCB) were tested: one control sample stored under standard storage conditions (−80-ZYCON) and two additional aliquots stored at room temperature for 3–5 days prior to freezing at −80°C, with and without the Zymo Research stabilization buffer (named RT-ZYBUF and RT-ZYCON, respectively). In addition, fecal samples of 64 participants (PSs) and one negative bacterial control (NCB) were analyzed under two conditions: besides the control sample collected under standard conditions (−80-OMCON), an additional aliquot was collected in the OMNIgene·GUT tube with a stabilization buffer and stored at RT for 3–5 days before freezing at −80°C (RT-OMBUF).
Figure 3
(A) Effect of sample collection and storage on beta-diversity. Bray–Curtis distance in a PCoA ordination showed a limited effect of sample collection, only significant for samples stored in the OMNIgene·GUT tubes. PERMANOVA testing was performed to test the difference between the two studies (R² = 0.026; p-adjusted = 0.001), and the type of stabilization buffer compared to the frozen control: OMNIgene·GUT tubes (R² = 0.020; p = 0.002) and Zymo Research tubes (R² = 0.046; p-adjusted = 1 for RT-ZYBUF, R² = 0.032; p-adjusted = 1 for RT-ZYCON). (B) Boxplots show the relative abundance of the five most abundant phyla. The differences between the storage conditions were tested with the Wilcoxon test.
Figure 4
LEfSe analysis showing the significant distinguishing taxa between the different storage methods based on an LDA score >4.0. Results are shown in cladograms, showing the effect of storage at RT, with or without stabilization buffer (RT-ZYBUF, RT-ZYCON, RT-OMBUF) in green, compared to the control samples (−80-ZYCON, −80-OMCON, RT-ZYCON) in red. (A–D) The cladograms show the taxonomic levels represented by rings, with the phylum level in the outermost ring and the genus level in the innermost ring. Each green or red circle represents a significantly different taxon associated with one of the compared groups.
Figure 5
Overview of the different methods for DNA extraction. Personal samples of five donors were stored under two conditions: one aliquot was directly frozen (−80-ZYCON), and the other aliquot of the sample was stored at RT for 3–5 days in Zymo Research collection tubes (RT-ZYBUF). These fecal samples, together with two positive and two negative controls, were used to test the effect of mechanical or enzymatic cell disruption. Furthermore, we looked into the difference in DNA purification using the Maxwell® RSC Whole Blood DNA Kit and the Maxwell® RSC Fecal Microbiome DNA Kit.
Figure 6
The bacterial yield of samples using mechanical (MD) or enzymatic disruption (ED) and purified with the Maxwell RSC Blood DNA kit (BK) or the Maxwell RSC Fecal Microbiome DNA kit (FK). The DNA concentration was measured using the Quantus Fluorometer and the bacterial DNA using a universal 16S rRNA gene qPCR and represented in ng/μl.
Figure 7
Bray–Curtis distance in a PCoA ordination shows the difference in overall microbial community structure of the different groups (bead-beating using the Blood and Fecal kits, i.e., MD-BK and MD-FK, and similarly for lysis buffer, i.e., ED-BK and ED-FK). PERMANOVA was used to test the effect of the cell disruption method (R² = 0.08585; p-adjusted = 0.001) and the purification kit (R² = 0.00016; p-adjusted = 1).
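For readers unfamiliar with the Bray–Curtis measure named in this caption, the following is a minimal sketch of the standard formula, the sum of absolute differences divided by the sum of all abundances. It is not the authors' pipeline; the function name and the example counts are hypothetical.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance vectors.

    Both vectors must list the same taxa in the same order.
    Returns 0.0 for identical profiles and 1.0 for profiles
    that share no taxa.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(sample_a) + sum(sample_b)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    difference = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    return difference / total


# Hypothetical read counts for three taxa in two extractions of one sample.
mechanical = [10, 20, 70]
enzymatic = [30, 20, 50]
print(bray_curtis(mechanical, enzymatic))
```

A PCoA ordination such as the one in this figure is computed from the matrix of these pairwise values across all samples.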
Figure 8
(A) Boxplots show the relative abundance of the most abundant phyla. The Wilcoxon test was used to calculate the adjusted p-values of the differences between the DNA extraction methods. (B) Spearman's correlation of the mock samples extracted by the different methods, compared to the theoretical composition of the mock community sample (MD-FK rho = 0.933, ZMD rho = 0.9, MD-BK rho = 0.75, ED-FK rho = 0.633, ED-BK rho = 0.517). (C) Barplots of eight bacterial strains included in the Zymo mock sample.
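The rho values in panel (B) compare an observed mock profile with its theoretical composition. As an illustration only, the sketch below computes Spearman's rho as the Pearson correlation of average ranks. It is not the authors' code, and the abundance values are hypothetical rather than the Zymo mock community's actual composition.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one vector is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical relative abundances (%) for four mock strains.
theoretical = [10.0, 20.0, 30.0, 40.0]
observed = [12.0, 18.0, 41.0, 29.0]
print(spearman_rho(theoretical, observed))
```

Because rho depends only on rank order, it rewards recovering the strains in the right order of abundance and is insensitive to how far the measured percentages deviate in absolute terms.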
Figure 9
Overview of the different methods tested during library preparation. The effect of bacterial DNA input (16, 125, and 1,000 pg) and PCR cycles (25, 30, and 35 cycles) was tested using the DNA of three participant samples, three negative controls, and one positive Zymo mock control.
Figure 10
Sequenced reads of three participant samples (PSs), three negative controls DNA (NCD), and one positive control, Zymo mock DNA (ZMD). The effect of different bacterial inputs (A) and PCR cycles (B) during the 16S rRNA gene V4 region PCR on the number of reads sequenced.
Figure 11
(A) Effect of PCR conditions on mock communities during library preparation compared to the theoretical composition using a Spearman's correlation. (B) Barplots of eight bacterial strains included in the Zymo mock sample.
Figure 12
Overview of the conditions tested to measure the variation in large microbiome studies introduced during different DNA extraction rounds, multiple sequencing runs, and the use of barcodes as a unique identifier. DNA of 20 Zymo mock community samples and 23 mixed samples was extracted in different DNA extraction (DE) rounds. The effect of sequencing with a different barcode was tested by repeated sequencing of Zymo mock DNA and Mixed sample DNA 36 times using a different Illumina barcode (UB). The same DNA samples (ZMD and MSD) were sequenced six times using the same Illumina barcode (BC1 and BC2).
Figure 13
Divergence of (A) mock samples and (B) mixed samples when comparing the effect of DNA extraction rounds and the use of different barcodes on the heterogeneity within the set of samples. The Wilcoxon test was used to calculate the adjusted p-values between all tested conditions. (C) Non-metric multidimensional scaling (NMDS) plot of the five donor samples (S01–S05) used to generate the mixed sample used as a positive control in DNA extraction (DE n = 23) and sequencing runs (UB n = 36, BC1 n = 6, and BC2 n = 6) (n = 71). (D) NMDS plot of the 30 mixed samples sequenced: fecal samples extracted in different DNA extraction (DE) rounds, DNA samples amplified with unique barcodes (UBs), and DNA samples amplified with the same barcodes (BC1 and BC2).
