BMC Genomics. 2023 Jul 17;24(1):401.
doi: 10.1186/s12864-023-09500-4.

A high fidelity approach to assembling the complex Borrelia genome

Sabrina Hepner et al. BMC Genomics.

Abstract

Background: Bacteria of the Borrelia burgdorferi sensu lato (s.l.) complex can cause Lyme borreliosis. Different B. burgdorferi s.l. genospecies vary in their host and vector associations and human pathogenicity but the genetic basis for these adaptations is unresolved and requires completed and reliable genomes for comparative analyses. The de novo assembly of a complete Borrelia genome is challenging due to the high levels of complexity, represented by a high number of circular and linear plasmids that are dynamic, showing mosaic structure and sequence homology. Previous work demonstrated that even advanced approaches, such as a combination of short-read and long-read data, might lead to incomplete plasmid reconstruction. Here, using recently developed high-fidelity (HiFi) PacBio sequencing, we explored strategies to obtain gap-free, complete and high quality Borrelia genome assemblies. Optimizing genome assembly, quality control and refinement steps, we critically appraised existing techniques to create a workflow that leads to improved genome reconstruction.

Results: Despite the latest available technologies, stand-alone sequencing and assembly methods are insufficient for the generation of complete and high quality Borrelia genome assemblies. We developed a workflow pipeline for the de novo genome assembly for Borrelia using several types of sequence data and incorporating multiple assemblers to recover the complete genome including both circular and linear plasmid sequences.

Conclusion: Our study demonstrates that, with HiFi data and an ensemble reconstruction pipeline with refinement steps, chromosomal and plasmid sequences can be fully resolved, even for complex genomes such as Borrelia. The presented pipeline may be of interest for the assembly of further complex microbial genomes.

Keywords: Borrelia burgdorferi; De novo assembly; Genome reconstruction pipeline; Genomics; HiFi sequencing; Plasmids.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Schematic overview of the ensemble pipeline for Borrelia genome reconstruction established in this study. Lab preparation steps are indicated in grey. Data based on PacBio sequencing is shown in dark blue, data based on Illumina sequencing is shown in orange. A combination of PacBio and Illumina data is colored purple. QC and refinement steps are shown in yellow and the steps to generate the final consensus are shown in red
Fig. 2
Dot plot examples before (left) and after (right) contig trimming. Wraparound and terminal direct repeats that need to be trimmed are indicated by a black arrow. The remaining part after trimming is indicated by a red box. Dot plot of PBaeII lp54 (contig ctg.s2.000000F of the microbial assembly) untrimmed (A) and trimmed (B). Dot plot of PBaeII lp28-8 (contig ctg.s2.000004F of the microbial assembly) untrimmed (C) and trimmed (D). The region of the vls locus is indicated by a gray filled box. Dot plot of PBaeII cp26 (contig tig00000016 of the HiCanu assembly) untrimmed (E) and trimmed (F). Dot plots were generated using the web-based NCBI-BLASTN [51]
Fig. 3
Dot plot of contig ctg.s2.10 (cp26) of the microbial assembly of PBaeII without terminal direct repeats (left) and containing terminal direct repeats after extension (right). In the left panel (A) is the untrimmed contig that does not show terminal direct repeats, in the right panel (B) is the extended contig which contains the overlapping region (1 bp – 2,000 bp overlaps 27,108 bp – 29,108 bp). Therefore, the plasmid can be considered as circular and complete. Dot plots were generated using the web-based NCBI-BLASTN [51]
Fig. 4
Schematic visualization of the genome elements of PBaeII, PBes and 89B13. Partitioning genes are shown as colored dots (PFam32: red, PFam49: green, PFam50: yellow, PFam57/62: blue). Intact genes are shown as filled dots, pseudogenes are shown as unfilled points with a cross. Intact genes and pseudogenes were defined using the NCBI annotator PGAP [52]
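
The captions of Fig. 2 and Fig. 3 describe contigs whose two ends repeat each other (wraparound or terminal direct repeats), which are trimmed so that a circular plasmid is represented only once. The authors inspected this with BLASTN dot plots; the Python sketch below is not their method. It is a minimal illustration that assumes an exact, error-free match between the start and the end of a contig. The function names `terminal_overlap` and `trim_wraparound`, the `min_len` threshold and the toy sequence are all hypothetical. Real assemblies would more likely need an alignment-based comparison that tolerates mismatches.

```python
def terminal_overlap(seq: str, min_len: int = 20) -> int:
    """Length of the longest contig end that exactly repeats the contig start.

    Returns 0 when no exact overlap of at least `min_len` bases exists.
    Assumption: exact string matching only (no mismatches or indels).
    """
    seq = seq.upper()
    # An overlap longer than half the contig is not considered here.
    for k in range(len(seq) // 2, min_len - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def trim_wraparound(seq: str, min_len: int = 20) -> str:
    """Remove the repeated end so the circular sequence appears only once."""
    k = terminal_overlap(seq, min_len)
    return seq[: len(seq) - k] if k else seq


# Toy example (made-up sequence): a 60 bp "plasmid" whose first 25 bases
# were assembled a second time at the end of the contig.
plasmid = "TTTTT" + "ACGGATCCGATAGCTAGGCATCGACAGTCAGGACCATGACGAGCATCAGGACGCA"
contig = plasmid + plasmid[:25]

print(terminal_overlap(contig))            # length of the repeated end
print(trim_wraparound(contig) == plasmid)  # True if trimming restores the plasmid
```

A contig for which `terminal_overlap` returns 0 is not necessarily linear or incomplete. As in Fig. 3, a contig that initially shows no terminal repeats may still turn out to be circular once it has been extended.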

References

    1. Stanek G, Wormser GP, Gray J, Strle F. Lyme borreliosis. Lancet. 2012;379(9814):461–473. doi: 10.1016/S0140-6736(11)60103-7.
    2. Margos G, Wilske B, Sing A, Hizo-Teufel C, Cao WC, Chu C, et al. Borrelia bavariensis sp. nov. is widely distributed in Europe and Asia. Int J Syst Evol Microbiol. 2013;63(Pt 11):4284–4288. doi: 10.1099/ijs.0.052001-0.
    3. Pritt BS, Petersen JM. Borrelia mayonii: prying open Pandora's box of spirochetes - Authors' reply. Lancet Infect Dis. 2016;16(6):637–638. doi: 10.1016/S1473-3099(16)30071-8.
    4. Margos G, Henningsson AJ, Hepner S, Markowicz M, Sing A, Fingerle V. Borrelia ecology, evolution, and human disease: A mosaic of life. In: Sing A, editor. Zoonoses: Infections Affecting Humans and Animals. Cham: Springer International Publishing; 2022. pp. 1–66.
    5. Kurtenbach K, Hanincova K, Tsao JI, Margos G, Fish D, Ogden NH. Fundamental processes in the evolutionary ecology of Lyme borreliosis. Nat Rev Microbiol. 2006;4(9):660–669. doi: 10.1038/nrmicro1475.