BMC Genomics. 2022 May 11;23(Suppl 4):361.
doi: 10.1186/s12864-022-08577-7.

B-assembler: a circular bacterial genome assembler

Fengyuan Huang et al. BMC Genomics.

Abstract

Background: Accurate de novo assembly of bacterial genomes is fundamental to understanding the evolution and pathogenesis of new bacterial species. The advent and popularity of Third-Generation Sequencing (TGS) enable the assembly of bacterial genomes at an unprecedented speed. However, most current TGS assemblers were designed for human or other species that do not have a circular genome. In addition, the repetitive DNA fragments in many bacterial genomes, combined with the high error rate of long-read sequencing data, make accurate assembly challenging even though these genomes are relatively small. Therefore, there is an urgent need for an optimized method that addresses these issues.

Results: We developed B-assembler, which can assemble bacterial genomes from long reads alone or from a combination of short and long reads. B-assembler takes advantage of the structure-resolving power of long reads and, when available, the accuracy of short reads. It first selects and corrects the ultra-long reads to obtain an initial contig. Then, it collects the reads that overlap the ends of the initial contig. This two-round assembly procedure, together with optimized error correction, enables a high-confidence, circularized genome assembly. Benchmarks on both synthetic and real sequencing data from several bacterial species show that both the long-read-only and hybrid-read modes accurately assemble circular bacterial genomes free of structural errors, with fewer small errors than other assemblers.
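
The read-selection steps described in the Results can be illustrated with a toy sketch. This is not B-assembler's implementation: the function names, the `min_len` cutoff, and the `flank` size are hypothetical, and exact substring matching stands in for the error-tolerant alignment that real long reads would require. The sketch only shows the idea of keeping the longest reads and then finding reads that overlap the ends of a linear contig, including reads that cover the junction where a circular genome would close.

```python
from typing import Dict, List


def select_ultra_long(reads: Dict[str, str], min_len: int) -> Dict[str, str]:
    """Keep reads at least min_len bases long (min_len is an assumed cutoff)."""
    return {name: seq for name, seq in reads.items() if len(seq) >= min_len}


def overlaps_contig_end(read: str, contig: str, flank: int) -> bool:
    """True if the read contains the first or last `flank` bases of the contig."""
    return contig[:flank] in read or contig[-flank:] in read


def spans_junction(read: str, contig: str, flank: int) -> bool:
    """True if the read contains the contig tail immediately followed by its head."""
    return (contig[-flank:] + contig[:flank]) in read


def names_matching(reads: Dict[str, str], contig: str, flank: int, test) -> List[str]:
    """Names of reads for which test(read, contig, flank) holds, in input order."""
    return [name for name, seq in reads.items() if test(seq, contig, flank)]


# Toy data: a 20-base "genome" used as the linear initial contig.
contig = "ACGTTGCAGGCTAAGTCCGA"
reads = {
    "r1": "GTCCGAACGTTG",  # crosses the junction (tail followed by head)
    "r2": "GCAGGCTAAG",    # internal read, touches neither end
    "r3": "ACGTTGCAGG",    # covers the head of the contig only
}

end_reads = names_matching(reads, contig, 4, overlaps_contig_end)
junction_reads = names_matching(reads, contig, 4, spans_junction)
print(end_reads, junction_reads)
```

In this sketch, reads that span the junction are the ones that would support closing the contig into a circle; a real assembler would align reads with mismatches and indels allowed rather than test for exact substrings.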

Conclusions: B-assembler provides a better solution to bacterial genome assembly, which will facilitate downstream bacterial genome analysis.

Keywords: Bacteria genome; De novo assembly; Hybrid-read assembly; Long-read-only assembly.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
The workflow of B-assembler. B-assembler has two modes: long-read-only assembly and hybrid-read assembly.
Fig. 2
Indels and mismatches produced by the benchmarked assemblers on the 14 NCTC PacBio samples. The numbers of indels and mismatches were added per 100 kbp.
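
Reporting errors per 100 kbp makes assemblies of different lengths comparable. A minimal sketch of that normalization follows, assuming the rate is simply the error count scaled by assembly length; the function name and the example numbers are made up for illustration and are not taken from the paper.

```python
def errors_per_100kbp(error_count: int, assembly_length_bp: int) -> float:
    """Scale a raw error count to a rate per 100,000 bases of assembly."""
    if assembly_length_bp <= 0:
        raise ValueError("assembly_length_bp must be positive")
    return error_count * 100_000 / assembly_length_bp


# Hypothetical example: 25 indels in a 5 Mbp assembly.
print(errors_per_100kbp(25, 5_000_000))
```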
