Front Microbiol. 2022 Sep 20:13:1007056.
doi: 10.3389/fmicb.2022.1007056. eCollection 2022.

High-throughput nanopore sequencing of Treponema pallidum tandem repeat genes arp and tp0470 reveals clade-specific patterns and recapitulates global whole genome phylogeny

Nicole A P Lieberman et al. Front Microbiol.

Abstract

Sequencing of most Treponema pallidum genomes excludes repeat regions in tp0470 and the tp0433 gene, encoding the acidic repeat protein (arp). As a first step to understanding the evolution and function of these genes and the proteins they encode, we developed a protocol to nanopore sequence tp0470 and arp genes from 212 clinical samples collected from ten countries on six continents. Both tp0470 and arp repeat structures recapitulate the whole genome phylogeny, with subclade-specific patterns emerging. The number of tp0470 repeats appears, on average, to be higher in Nichols-like clade strains than in SS14-like clade strains. Consistent with previous studies, we found that 14-repeat arp sequences predominate across both major clades, but the combination and order of repeat types vary among subclades, with many arp sequence variants limited to a single subclade. Although strains that were closely related by whole genome sequencing frequently had the same arp repeat length, this was not always the case. Structural modeling of TP0470 suggested that the eight-residue repeats form an extended α-helix, predicted to be periplasmic. Modeling of ARP revealed a C-terminal sporulation-related repeat (SPOR) domain, predicted to bind denuded peptidoglycan, with repeat regions possibly incorporated into a highly charged β-sheet. Outside of the repeats, all TP0470 and ARP amino acid sequences were identical. Together, our data, along with functional considerations, suggest that both TP0470 and ARP proteins may be involved in T. pallidum cell envelope remodeling and homeostasis, with their highly plastic repeat regions playing as-yet-undetermined roles.

Keywords: AlphaFold; SPOR domain; Treponema pallidum; genomics; nanopore; next generation sequencing (NGS); syphilis; trRosetta.
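
The repeat counts discussed in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline, which the abstract does not describe. The function name `count_tandem_repeats`, the `max_mismatch` parameter, and the toy sequences are all hypothetical and chosen only for illustration. The idea is to slide over a sequence and count how many consecutive fixed-length units resemble their neighbor, allowing a few mismatches because repeat types can differ at variable positions. For real data the unit length would be 60 for arp (the 60 bp repeats in Figure 3) and 24 for tp0470 (eight residues at three bases each).

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def count_tandem_repeats(seq: str, unit_len: int, max_mismatch: int = 0) -> int:
    """Return the number of units in the longest run of tandem repeats.

    A run is a series of back-to-back windows of length unit_len in which
    each window differs from the previous one by at most max_mismatch
    positions. Returns 0 when no two adjacent windows match.
    (Hypothetical helper for illustration only.)
    """
    best = 0
    n = len(seq)
    for start in range(n - unit_len + 1):
        count = 1
        pos = start
        # Extend the run while the next full window resembles the current one.
        while pos + 2 * unit_len <= n and hamming(
            seq[pos:pos + unit_len],
            seq[pos + unit_len:pos + 2 * unit_len],
        ) <= max_mismatch:
            count += 1
            pos += unit_len
        best = max(best, count)
    return best if best > 1 else 0


# Toy sequences (made up, not T. pallidum data): a 6-base unit repeated 3 times.
exact = "TTT" + "ACGTAC" * 3 + "GG"
variant = "ACGTAC" + "ACGAAC" + "ACGTAC"  # middle unit differs at one base

print(count_tandem_repeats(exact, 6))
print(count_tandem_repeats(variant, 6, max_mismatch=1))
```

Comparing each unit only with its immediate neighbor keeps the sketch short. A real analysis would more likely match each unit against a catalog of known repeat types, which is also what is needed to report the combination and order of repeat types per strain.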


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Variation in tp0470 repeat length. (A) Recombination masked whole genome phylogeny (left) with the number of tp0470 repeats for each strain (right). Sequence variant number is included as text to the right of each length bar. All data are also included in a tabular representation in Supporting Information. Number of tp0470 repeats by subclade (B) or country (C).
FIGURE 2
Variation in arp repeat length. (A) Recombination masked whole genome phylogeny (left) with the number of arp repeats for each strain (right). Sequence variant number is included as text to the right of each length bar. All data are also included in a tabular representation in Supplementary Table 2. Number of arp repeats by subclade (B) or country (C). (D) Multiple sequence alignment of the nine variants with 14 arp repeats. Variant positions are highlighted and bases colored red, blue, yellow, or green for A, C, G, or T, respectively. The number of strains with each variant sequence is included in the bar graph to the right of the multiple sequence alignment.
FIGURE 3
arp repeat type usage. (A) Nucleotide sequence of the arp repeat module types. Variable positions are highlighted. (B) Amino acid sequence of the arp repeat module types. Variable positions are highlighted. (C) Recombination masked whole genome phylogeny (left) with the repeat type usage per strain (right). The 60 bp arp repeats are colored by type.
FIGURE 4
Structure predictions of TP0470. (A) trRosetta and (B) AlphaFold predictions of the structure of the 15-repeat TP0470 variant. The N-terminal tetratricopeptide repeat domain is shown in green, repeats are in purple, and the C-terminal region is in gold. (C) APBS electrostatic surface potential (top) and stick model of sidechains (bottom) for a portion of the repeat helix.
FIGURE 5
Structure predictions of ARP14A. (A) trRosetta and (B) AlphaFold predictions of the structure of ARP14A. The C-terminal SPOR domain is shown in magenta, repeats in cyan. (C) APBS electrostatic surface potential for the trRosetta ARP structure from panel (A). Red denotes negative charge (acidic) and blue denotes positive charge (basic).
FIGURE 6
Model showing ARP and TP0470 cellular location and putative interactions. Both ARP and TP0470 are localized to the periplasm. The ARP N terminus may be acylated at cysteine 29. OM, outer membrane; IM, inner membrane; PG, peptidoglycan. Proteins and cellular structures have not been drawn to scale.
