Meta-Analysis
Microbiol Spectr. 2022 Apr 27;10(2):e0009022.
doi: 10.1128/spectrum.00090-22. Epub 2022 Mar 15.

Library Preparation and Sequencing Platform Introduce Bias in Metagenomic-Based Characterizations of Microbiomes

Casper S Poulsen et al. Microbiol Spectr.

Abstract

Metagenomics is increasingly used to describe microbial communities in biological specimens. Ideally, the steps involved in processing the biological specimens should not change the microbiome composition in a way that could lead to false interpretations of the inferred microbial community composition. Common steps in sample preparation include sample collection, storage, DNA isolation, library preparation, and DNA sequencing. Here, we assess the effect of three library preparation kits and two DNA sequencing platforms. Of the library preparation kits, one involved a PCR step (Nextera), and two were PCR free (NEXTflex and KAPA). We sequenced the libraries on Illumina HiSeq and NextSeq platforms. As example microbiomes, we assessed two pig fecal samples and two sewage samples, aliquots of which were either processed immediately or stored at −80°C. All DNA isolations were performed in duplicate, totaling 80 samples, excluding controls. We found that both library preparation and sequencing platform had systematic effects on the inferred microbial community composition. Sequencing platform introduced more variation than either library preparation or freezing of the samples. The results highlight that all sample processing steps need to be considered when comparing studies. Standardization of sample processing is key to generating comparable data within a study, and comparisons of differently generated data, such as in a meta-analysis, should be performed cautiously.

IMPORTANCE Previous research has reported effects of sample storage conditions and DNA isolation procedures on metagenomics-based microbiome composition; however, the effect of library preparation and DNA sequencing in metagenomics has not been thoroughly assessed. Here, we provide evidence that library preparation and sequencing platform introduce systematic biases in the metagenomic-based characterization of microbial communities. These findings suggest that library preparation and sequencing are important parameters to keep consistent when aiming to detect small changes in microbiome community structure. Overall, we recommend that all samples in a microbiome study be processed in the same way to limit unwanted variation that could lead to false conclusions. Furthermore, to obtain more holistic insight from microbiome data generated around the world, more detailed sample metadata, including information about the sample processing procedures, will need to be deposited together with the DNA sequencing data in public repositories.

Keywords: DNA sequencing; library preparation; metadata; metagenomics; microbial communities; microbiome.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIG 1
Study design and comparison between sample groups. (A) Two pig feces samples and two sewage samples were processed directly or after storage at −80°C for 64 h. DNA isolation was performed in duplicate for each sample. Library preparation and sequencing were performed in four different combinations: NEXTflex PCR-free library preparation with sequencing on a HiSeq (NEXTflex HiSeq), KAPA PCR-free library preparation with sequencing on a HiSeq (KAPA HiSeq), NEXTflex PCR-free library preparation with sequencing on a NextSeq (NEXTflex NextSeq), and Nextera library preparation with sequencing on a NextSeq (Nextera NextSeq). The last combination was performed twice (Nextera 1 NextSeq and Nextera 2 NextSeq). The setup resulted in a total of 80 metagenomes plus 5 negative controls (i.e., DNA extraction controls). (B) Boxplots display pairwise Aitchison distances between different groupings of samples. Within each group, the dots representing the distances are colored according to the sample within which the comparison was made. Blue dots represent a distance between two different samples.
FIG 2
Principal-component analysis (PCA) performed separately for each sample matrix. Euclidean distances were calculated after centered log-ratio (CLR) transformation of the count data (Aitchison distances). The variance explained by the first two axes is included in their labels. The same DNA samples processed differently are connected with dotted lines.
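
The distance described in this caption can be illustrated with a short sketch: CLR-transform each sample's counts, then take Euclidean distances between the transformed samples. This is not the authors' code. The function names, the toy count matrix, and the pseudocount of 1 used to avoid log(0) are all assumptions made for illustration; the caption does not say how zero counts were handled.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of a samples x taxa count matrix.

    The pseudocount is an illustrative assumption to avoid log(0).
    """
    logged = np.log(np.asarray(counts, dtype=float) + pseudocount)
    # Subtract each sample's mean log value (log of its geometric mean).
    return logged - logged.mean(axis=1, keepdims=True)


def aitchison_distances(counts, pseudocount=1.0):
    """Pairwise Euclidean distances between CLR-transformed samples."""
    z = clr(counts, pseudocount)
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def pca_scores(counts, n_components=2, pseudocount=1.0):
    """PCA of CLR-transformed data via SVD of the column-centered matrix."""
    z = clr(counts, pseudocount)
    centered = z - z.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / (s ** 2).sum()
    return scores, explained[:n_components]


# Hypothetical toy data: 3 samples (rows) x 4 taxa (columns).
toy_counts = np.array([
    [120, 30, 5, 0],
    [110, 35, 8, 1],
    [20, 90, 40, 10],
])

toy_dist = aitchison_distances(toy_counts)
toy_scores, toy_explained = pca_scores(toy_counts)
```

In this toy matrix the first two samples have similar compositions, so their distance is smaller than either one's distance to the third sample.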
FIG 3
Heatmaps of the 30 most abundant genera, shown separately for pig feces and sewage samples. Complete-linkage clustering was performed to create dendrograms for both genera and samples. Spearman correlation was used to cluster the genera, and Aitchison distances were used to cluster the samples. The genus abundances depicted in the cells are CLR-transformed counts standardized to zero mean and unit variance. Genus names are annotated by cell wall structure based on Gram staining, Gram positive (G+) or Gram negative (G−), or as belonging to Archaea (Ar). (A) Heatmap of all pig feces samples, where the first branching was according to sequencing platform. The third cluster of genera exclusively contained Gram negatives. (B) Heatmap of all sewage samples. The fourth cluster mainly consisted of Gram positives. A few Gram positives were also present in the other clusters. For an explanation of the colors, see panel A.
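
As a rough sketch of the two preprocessing steps named in this caption, the code below standardizes CLR values to zero mean and unit variance and computes Spearman correlations between genera. It is not the authors' pipeline. Standardizing per genus across samples, converting correlation to a distance as 1 − rho, and the toy CLR matrix are assumptions made for illustration. The resulting distance matrix is the kind of input a complete-linkage clustering routine would take.

```python
import numpy as np


def standardize(clr_values):
    """Scale each genus (column) to zero mean and unit variance.

    Standardizing per genus across samples is an assumption here.
    """
    x = np.asarray(clr_values, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)


def average_ranks(column):
    """1-based ranks of a 1-D array, with ties given their average rank."""
    x = np.asarray(column, dtype=float)
    less = (x[None, :] < x[:, None]).sum(axis=1)
    equal = (x[None, :] == x[:, None]).sum(axis=1)
    return less + (equal + 1) / 2.0


def spearman_matrix(values):
    """Spearman correlation between columns: Pearson correlation of ranks.

    A constant column has no rank variance and would yield NaN.
    """
    x = np.asarray(values, dtype=float)
    ranks = np.column_stack([average_ranks(x[:, j]) for j in range(x.shape[1])])
    return np.corrcoef(ranks, rowvar=False)


# Hypothetical CLR values: 4 samples (rows) x 3 genera (columns).
toy_clr = np.array([
    [1.0, 2.0, -3.0],
    [2.0, 3.5, -5.5],
    [3.0, 3.0, -6.0],
    [4.0, 6.0, -10.0],
])

toy_z = standardize(toy_clr)
toy_rho = spearman_matrix(toy_clr)
# Assumed conversion from correlation to a clustering distance.
toy_genus_dist = 1.0 - toy_rho
```

Because Spearman correlation depends only on ranks, it gives the same result on raw or standardized CLR values; standardization affects only the cell colors in the heatmap.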


Publication types