BMC Genomics. 2020 Feb 11;21(1):150.
doi: 10.1186/s12864-020-6535-y.

Comparative genome characterization of the periodontal pathogen Tannerella forsythia

Nikolaus F Zwickl et al. BMC Genomics.

Abstract

Background: Tannerella forsythia is a bacterial pathogen implicated in periodontal disease. Numerous virulence-associated T. forsythia genes have been described; however, it is necessary to expand the knowledge of T. forsythia's genome structure and genetic repertoire to further elucidate its role in pathogenesis. Tannerella sp. BU063, a putative periodontal health-associated sister taxon and the closest known relative of T. forsythia, is available for comparative analyses. In the past, strain confusion involving the T. forsythia reference type strain ATCC 43037 led to discrepancies between results obtained from in silico analyses and wet-lab experimentation.

Results: We generated a substantially improved genome assembly of T. forsythia ATCC 43037 covering 99% of the genome in three sequences. Using annotated genomes of ten Tannerella strains, we established a soft core genome encompassing 2108 genes, based on orthologs present in ≥ 80% of the strains analysed. We used a set of known and hypothetical virulence factors for comparisons in pathogenic strains and the putative periodontal health-associated isolate Tannerella sp. BU063 to identify candidate genes promoting T. forsythia's pathogenesis. Searching for pathogenicity islands, we detected 38 candidate regions in the T. forsythia genome. Only four of these regions corresponded to previously described pathogenicity islands. While the general protein O-glycosylation gene cluster of T. forsythia ATCC 43037 has been described previously, genes required for the initiation of glycan synthesis are yet to be discovered. We found six putative glycosylation loci which were only partially conserved in other bacteria. Lastly, we performed a comparative analysis of translational bias in T. forsythia and Tannerella sp. BU063 and detected highly biased genes.

Conclusions: We provide resources and important information on the genomes of Tannerella strains. Comparative analyses enabled us to assess the suitability of T. forsythia virulence factors as therapeutic targets and to suggest novel putative virulence factors. Further, we report on gene loci that should be addressed in the context of elucidating T. forsythia's protein O-glycosylation pathway. In summary, our work paves the way for further molecular dissection of T. forsythia biology in general and virulence of this species in particular.

Keywords: Codon usage bias; Comparative genomics; Computational analysis; Genome assembly; Glycosylation gene cluster; Pan-genome; Pathogenicity island; Periodontitis; Tannerella; Virulence.
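
The soft-core criterion stated in the Results (orthologs present in at least 80% of the analysed strains) can be illustrated as a filter over a gene presence/absence table. This is a minimal sketch, not the authors' pipeline: the function `soft_core`, the family names, and the strain labels are hypothetical, and the toy data are invented for illustration.

```python
def soft_core(presence, n_strains, min_percent=80):
    """Return gene families present in at least `min_percent` % of strains.

    presence: dict mapping a gene-family ID to the set of strains carrying it.
    Integer arithmetic avoids floating-point edge cases at the threshold.
    """
    return {
        family
        for family, strains in presence.items()
        if len(strains) * 100 >= min_percent * n_strains
    }


# Hypothetical toy data: ten strains, three ortholog families.
strains = [f"strain_{i}" for i in range(1, 11)]
presence = {
    "famA": set(strains),      # present in 10/10 strains
    "famB": set(strains[:8]),  # present in 8/10 strains (exactly at threshold)
    "famC": set(strains[:7]),  # present in 7/10 strains (below threshold)
}
core = soft_core(presence, n_strains=len(strains))
print(sorted(core))  # ['famA', 'famB']
```

With ten strains, the 80% cut-off means a family must be found in at least eight genomes to count towards the soft core.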

Conflict of interest statement

HH is a member of the editorial board of BMC Genomics.

Figures

Fig. 1
Comparison of our assembled scaffolds to a previously published T. forsythia sequence. The sequence KP715369 (black bar in the middle) aligns partially to our scaffold 1 (bottom) and partially to scaffold 2 (top). The sections named A to F represent the scaffolded contigs; gaps between them are indicated by vertical bars. Coverage tracks are shown for two different mapping strategies (allowing zero mismatches versus allowing only uniquely mapping reads); the differences between the two tracks highlight repetitive content found especially at the contig ends. Numbers of linking read pairs between contigs are indicated (based on the uniquely mapping strategy) along with the numbers of unique mapping positions (read 1 / read 2). Only 20 read pairs supported the linkage of contig C to contig E as suggested by the alignment of KP715369. All adjacent contigs as scaffolded by us were supported by more than 5000 pairs for each link.
Fig. 2
Multiple whole genome alignment of eight T. forsythia strains. Each coloured block represents a genomic region that aligned to a region in at least one other genome, plotted in the same colour, to which it was predicted to be homologous based on sequence similarity. Blocks above the centre line indicate forward orientation; blocks below the line indicate reverse orientation relative to strain 92A2. A histogram within each block shows the average similarity of a region to its counterparts in the other genomes. Red vertical lines indicate contig boundaries. Strain ATCC 43037 displayed two translocations compared to strain 92A2 with lengths of approximately 500 kbp (blue and yellow blocks at the right end of 92A2 and in the centre of ATCC) and 30 kbp (pink block at approx. 1.25 Mbp in 92A2 and at approx. 2.7 Mbp in ATCC), respectively. Previously described large-scale inversions in strain KS16 could be confirmed (reverted blocks in the left half of the alignment)
Fig. 3
Phylogenetic tree showing the topology (a) and the distances (b) as computed by MASH applied to the whole-genome assemblies of T. forsythia strains and Tannerella sp. BU063, including Bacteroides vulgatus ATCC 8482 as outgroup.
Fig. 4
Whole genome alignment between the six-frame amino acid translations of both Tannerella sp. BU063 and the scaffolded and ordered ATCC 43037 assembly. Whereas the amino acid alignment reflects similarity with respect to gene content, the order of genes is not preserved.
Fig. 5
Blast Score Ratio (BSR) values plotted as heatmap for 45 suggested virulence genes in ten T. forsythia strains and the genome of putative health-associated Tannerella sp. BU063. Gene sequences were blasted against the complete genomic sequences of each genome. Tannerella sp. BU063 achieved considerable BSR values for several genes that were actually suggested as virulence factors in pathogenic T. forsythia strains. On the other hand, some of the pathogenic strains show reduced similarity to some predicted virulence factors
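
For orientation, a BLAST Score Ratio is conventionally computed as the bit score of a query's best hit in a target genome divided by the bit score of the query aligned against itself, so values near 1 indicate a near-identical sequence and values near 0 indicate no detectable homolog. The caption does not spell out the formula, so the sketch below assumes this conventional definition; the function name and the bit scores are made up for illustration.

```python
def blast_score_ratio(hit_bitscore, self_bitscore):
    """Conventional BSR: best-hit bit score normalised by the self-hit bit score."""
    if self_bitscore <= 0:
        raise ValueError("self_bitscore must be positive")
    return hit_bitscore / self_bitscore


# Hypothetical bit scores for one virulence gene searched against one genome.
bsr = blast_score_ratio(hit_bitscore=450.0, self_bitscore=900.0)
print(bsr)  # 0.5
```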
Fig. 6
Predicted core- and pan-genome sizes for T. forsythia based on ten genome assemblies using a sampling approach that iteratively adds genomes to the analysis. The species’ core genome has a saturated size of 1900 genes, i.e. genes that are found to be conserved throughout the ten analysed strains are likely to be conserved throughout the whole species (left panel). In contrast, novel genes are expected to be found in newly sequenced T. forsythia genomes as indicated by the pan-genome curve that has not yet reached a saturation plateau (right panel)
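
The sampling idea behind such curves can be sketched as follows: genomes are added one at a time in random order, the core genome is the running intersection of gene families, the pan-genome is the running union, and sizes are averaged over many random orders. This is an illustrative sketch only; the caption does not name the tool or the fitting model used, and the function `accumulation_curves` and the toy gene sets are hypothetical.

```python
import random
from statistics import mean


def accumulation_curves(genomes, n_orders=100, seed=0):
    """Mean core and pan-genome sizes after adding 1..N genomes.

    genomes: dict mapping a strain name to its set of gene-family IDs.
    Returns (core_mean, pan_mean), two lists of length N.
    """
    rng = random.Random(seed)
    names = list(genomes)
    core_runs, pan_runs = [], []
    for _ in range(n_orders):
        order = names[:]
        rng.shuffle(order)  # one random order of genome addition
        core = set(genomes[order[0]])
        pan = set(genomes[order[0]])
        core_sizes, pan_sizes = [len(core)], [len(pan)]
        for name in order[1:]:
            core &= genomes[name]  # families shared by all genomes so far
            pan |= genomes[name]   # families seen in any genome so far
            core_sizes.append(len(core))
            pan_sizes.append(len(pan))
        core_runs.append(core_sizes)
        pan_runs.append(pan_sizes)
    core_mean = [mean(column) for column in zip(*core_runs)]
    pan_mean = [mean(column) for column in zip(*pan_runs)]
    return core_mean, pan_mean


# Hypothetical toy data: three genomes sharing two families, one unique each.
toy = {"A": {1, 2, 3}, "B": {1, 2, 4}, "C": {1, 2, 5}}
core_mean, pan_mean = accumulation_curves(toy)
```

A core curve that flattens suggests a saturated core genome, whereas a pan curve that keeps rising suggests that further genomes would still contribute new genes.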
Fig. 7
Analysis of codon usage for ATCC 43037 (left panel) and BU063 (right panel). The continuous curves indicate the NC values to be expected for a given GC3s content in the absence of other factors shaping codon usage. Every dot represents a protein-coding gene; dots not positioned near the curve therefore represent genes that display a considerable codon usage bias. GC3s: G + C content at synonymous positions, NC: effective number of codons used within the sequence of a gene.
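
An expected-NC curve of this kind is commonly drawn with Wright's formula, NC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is the GC3s fraction. The caption does not state which formula was used, so the sketch below assumes that standard form; the function name is hypothetical.

```python
def expected_nc(gc3s):
    """Expected effective number of codons under GC3s-driven usage alone.

    Assumes Wright's standard formula; gc3s is a fraction between 0 and 1.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


print(expected_nc(0.5))  # 60.5
```

Under this assumption, a gene whose observed NC lies well below `expected_nc` at its GC3s value would be one of the dots far from the curve.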
