GC bias affects genomic and metagenomic reconstructions, underrepresenting GC-poor organisms
- PMID: 32052832
- PMCID: PMC7016772
- DOI: 10.1093/gigascience/giaa008
Abstract
Background: Metagenomic sequencing is a well-established tool in the modern biosciences. While it promises unparalleled insights into the genetic content of the biological samples studied, conclusions drawn are at risk from biases inherent to the DNA sequencing methods, including inaccurate abundance estimates as a function of genomic guanine-cytosine (GC) contents.
Results: We explored such GC biases across many commonly used platforms in experiments sequencing multiple genomes (with mean GC contents ranging from 28.9% to 62.4%) and metagenomes. GC bias profiles varied among different library preparation protocols and sequencing platforms. We found that our workflows using MiSeq and NextSeq were hindered by major GC biases, with problems becoming increasingly severe outside the 45-65% GC range, leading to falsely low coverage in GC-rich and especially GC-poor sequences: genomic windows with 30% GC content had >10-fold less coverage than windows close to 50% GC content. We also showed that GC content correlates tightly with coverage biases. The PacBio and HiSeq platforms showed GC bias profiles that were similar to each other but distinct from those seen in the MiSeq and NextSeq workflows. The Oxford Nanopore workflow was not afflicted by GC bias.
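The comparison of coverage between genomic windows of different GC content can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (gc_percent, window_gc_and_coverage, relative_coverage_by_gc), the non-overlapping window scheme, the 5-percentage-point bins and the choice of the 50% bin as reference are all assumptions made for illustration. The sketch takes a reference sequence and a per-base depth list, and reports mean coverage per GC bin relative to the reference bin.

```python
from collections import defaultdict


def gc_percent(seq):
    """Return GC content in percent over unambiguous bases, or None if there are none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        return None
    return 100.0 * gc / acgt


def window_gc_and_coverage(sequence, depth, window=1000):
    """Return (gc_percent, mean_depth) for each full, non-overlapping window."""
    if len(sequence) != len(depth):
        raise ValueError("sequence and depth must have the same length")
    results = []
    for start in range(0, len(sequence) - window + 1, window):
        gc = gc_percent(sequence[start:start + window])
        if gc is None:
            continue  # window contains only ambiguous bases
        mean_depth = sum(depth[start:start + window]) / window
        results.append((gc, mean_depth))
    return results


def relative_coverage_by_gc(windows, bin_width=5, reference_bin=50):
    """Average window coverage per GC bin, divided by the reference bin's average."""
    per_bin = defaultdict(list)
    for gc, mean_depth in windows:
        bin_start = int(gc // bin_width) * bin_width  # e.g. 32.4% falls in bin 30
        per_bin[bin_start].append(mean_depth)
    if reference_bin not in per_bin:
        raise ValueError("no windows fall in the reference GC bin")
    reference = sum(per_bin[reference_bin]) / len(per_bin[reference_bin])
    if reference == 0:
        raise ValueError("reference GC bin has zero coverage")
    return {b: (sum(v) / len(v)) / reference for b, v in sorted(per_bin.items())}
```

In this sketch a value of 0.1 for the 30% bin would correspond to the roughly 10-fold coverage deficit described above. In practice the depth list would come from a read-alignment tool, which is outside the scope of this example.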
Conclusions: These findings indicate potential sources of difficulty, arising from GC biases, in genome sequencing that could be pre-emptively addressed with methodological optimizations provided that the GC biases inherent to the relevant workflow are understood. Furthermore, it is recommended that a more critical approach be taken in quantitative abundance estimates in metagenomic studies. In the future, metagenomic studies should take steps to account for the effects of GC bias before drawing conclusions, or they should use a demonstrably unbiased workflow.
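One way to account for GC bias in abundance estimates can be sketched as follows. The abstract does not prescribe a correction method, so this is an assumption for illustration: each genome's observed mean coverage is divided by the relative coverage expected at its GC bin (taken from a calibration profile for the same workflow), and the results are renormalized to sum to 1. The function name corrected_abundances and all input values are hypothetical. Using a single GC bin per genome is a simplification, and a per-window correction would be more faithful.

```python
def corrected_abundances(observed_coverage, genome_gc_bin, relative_coverage):
    """Rescale per-genome coverage by a GC bias factor and renormalize.

    observed_coverage: genome name -> observed mean coverage
    genome_gc_bin:     genome name -> GC bin of that genome (hypothetical binning)
    relative_coverage: GC bin -> coverage relative to an unbiased reference bin
    """
    corrected = {}
    for name, coverage in observed_coverage.items():
        factor = relative_coverage[genome_gc_bin[name]]
        if factor <= 0:
            raise ValueError(f"no usable bias factor for {name}")
        corrected[name] = coverage / factor
    total = sum(corrected.values())
    if total == 0:
        raise ValueError("total corrected coverage is zero")
    return {name: value / total for name, value in corrected.items()}


# Illustrative numbers only, not data from the study.
observed = {"low_gc_genome": 10.0, "mid_gc_genome": 100.0}
gc_bins = {"low_gc_genome": 30, "mid_gc_genome": 50}
profile = {30: 0.1, 50: 1.0}
abundances = corrected_abundances(observed, gc_bins, profile)
```

With these made-up inputs, the GC-poor genome looks ten times less abundant before correction and equally abundant afterwards, which is the kind of distortion the conclusions warn about.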
Keywords: GC bias; Illumina; Oxford Nanopore; PacBio; high-throughput sequencing; metagenomics.
© The Author(s) 2020. Published by Oxford University Press.