Review
Imeta. 2022 Mar 1;1(1):e4.
doi: 10.1002/imt2.4. eCollection 2022 Mar.

Applications of de Bruijn graphs in microbiome research

Keith Dufault-Thompson et al. Imeta.

Abstract

High-throughput sequencing has become an increasingly central component of microbiome research. The development of de Bruijn graph-based methods for assembling high-throughput sequencing data has been an important part of the broader adoption of sequencing in biological studies. Recent advances in the construction and representation of de Bruijn graphs have led to new approaches that use the de Bruijn graph data structure to aid different biological analyses. One application has been alternative approaches to the assembly of sequencing data, such as gene-targeted assembly, where only gene sequences are assembled out of larger metagenomes, and differential assembly, where sequences that are differentially present between two samples are assembled. de Bruijn graphs have also been applied in comparative genomics, where they can represent large sets of genomes or metagenomes and where structural features in the graphs can be used to identify variants, indels, and homologous regions in sequences. These de Bruijn graph-based representations of sequencing data have even begun to be applied to whole sequencing databases for large-scale searches and experiment discovery. de Bruijn graphs have played a central role in how high-throughput sequencing data are handled, and the rapid development of new tools that rely on these data structures suggests that they will continue to play an important role in biology.

Keywords: Omics; de Bruijn graphs; microbiome.

Conflict of interest statement

The authors declare that there are no conflicts of interest.

Figures

Figure 1
Illustration showing different applications of de Bruijn graphs in genome and metagenome assembly. (A) Illustration of de Bruijn graph assembly. First, a de Bruijn graph is constructed from raw reads; then a path through the graph that visits each k‐mer is identified (red arrow over the graph); and lastly a sequence is assembled based on this path. (B) Illustration of the general process for gene‐targeted assembly. First, reference sequences or profiles are used to identify reads that may contain partial gene sequences; next, this information is used to add weights (thicker black arrows) to the graph; and lastly these weighted paths can be used to directly assemble gene sequences. (C) Illustration showing the concept of differential assembly. de Bruijn graphs are generated from multiple metagenomes (red and blue graphs). These de Bruijn graphs can then be combined, revealing portions of the graph that are shared between the two metagenomes (gray nodes and edges) and portions that are unique to one metagenome (red or blue nodes and edges). Sequences that are uniquely present in one sample versus the other can then be assembled.
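The three steps in panel (A) can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular assembler: it assumes error-free reads from a single strand, treats each distinct k-mer as an edge between its (k-1)-mer prefix and suffix (one common formulation), and assumes an Eulerian path exists. The function names and the example reads are hypothetical.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Step 1: nodes are (k-1)-mers; each distinct k-mer adds one edge prefix -> suffix."""
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Step 2: Hierholzer's algorithm, which uses every edge (k-mer) exactly once.

    Assumption: the graph is non-empty and actually has an Eulerian path.
    """
    out_deg = {u: len(vs) for u, vs in graph.items()}
    in_deg = defaultdict(int)
    for vs in graph.values():
        for v in vs:
            in_deg[v] += 1
    nodes = set(out_deg) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if there is one.
    start = next(
        (n for n in sorted(nodes) if out_deg.get(n, 0) - in_deg.get(n, 0) == 1),
        min(nodes),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Step 3: spell the sequence by appending the last base of each node on the path."""
    path = eulerian_path(build_de_bruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


reads = ["ATGGCGT", "GGCGTGC", "GTGCAAT"]  # hypothetical overlapping reads
print(assemble(reads, k=4))  # ATGGCGTGCAAT
```

Practical assemblers additionally have to deal with sequencing errors, reverse complements, uneven coverage, and repeats, none of which this sketch attempts.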
Figure 2
Illustration showing the concept of a colored de Bruijn graph (DBG) and the process of variant identification using the graph.
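As a rough illustration of the colored-graph idea, the sketch below labels every k-mer with the set of samples ("colors") that contain it and then separates shared k-mers from sample-specific ones. It is a simplified stand-in under stated assumptions: it works on k-mer sets rather than an explicit graph, ignores reverse complements and sequencing errors, and the function names, sample names, and sequences are hypothetical. Sample-specific k-mers flanked by shared ones correspond to the diverging branches that variant-calling methods look for; the same split is the basic ingredient of differential assembly.

```python
from collections import defaultdict


def colored_kmers(samples, k):
    """Map each k-mer to the set of sample names (colors) in which it occurs."""
    colors = defaultdict(set)
    for name, seqs in samples.items():
        for seq in seqs:
            for i in range(len(seq) - k + 1):
                colors[seq[i:i + k]].add(name)
    return colors


def split_by_color(colors, a, b):
    """Return (shared, only_a, only_b) k-mer sets for two colors a and b."""
    shared = {km for km, c in colors.items() if a in c and b in c}
    only_a = {km for km, c in colors.items() if a in c and b not in c}
    only_b = {km for km, c in colors.items() if b in c and a not in c}
    return shared, only_a, only_b


# Two hypothetical sequences that differ by a single base (G vs. T at position 6).
samples = {
    "ref": ["ATGGCGTGCAAT"],
    "alt": ["ATGGCTTGCAAT"],
}
colors = colored_kmers(samples, k=4)
shared, only_ref, only_alt = split_by_color(colors, "ref", "alt")

print(sorted(shared))    # k-mers present in both samples
print(sorted(only_ref))  # k-mers covering the reference allele
print(sorted(only_alt))  # k-mers covering the alternate allele
```

With k = 4, a single-base difference produces four k-mers unique to each sample, because every k-mer overlapping the changed position differs between the two sequences.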
