PLoS Genet. 2010 Aug 19;6(8):e1001069.
doi: 10.1371/journal.pgen.1001069.

A global overview of the genetic and functional diversity in the Helicobacter pylori cag pathogenicity island

Patrick Olbermann et al. PLoS Genet. 2010.

Abstract

The Helicobacter pylori cag pathogenicity island (cagPAI) encodes a type IV secretion system. Humans infected with cagPAI-carrying H. pylori are at increased risk for sequelae such as gastric cancer. Housekeeping genes in H. pylori show considerable genetic diversity; but the diversity of virulence factors such as the cagPAI, which transports the bacterial oncogene CagA into host cells, has not been systematically investigated. Here we compared the complete cagPAI sequences for 38 representative isolates from all known H. pylori biogeographic populations. Their gene content and gene order were highly conserved. The phylogeny of most cagPAI genes was similar to that of housekeeping genes, indicating that the cagPAI was probably acquired only once by H. pylori, and its genetic diversity reflects the isolation by distance that has shaped this bacterial species since modern humans migrated out of Africa. Most isolates induced IL-8 release in gastric epithelial cells, indicating that the function of the Cag secretion system has been conserved despite some genetic rearrangements. More than one third of cagPAI genes, in particular those encoding cell-surface exposed proteins, showed signatures of diversifying (Darwinian) selection at more than 5% of codons. Several unknown gene products predicted to be under Darwinian selection are also likely to be secreted proteins (e.g. HP0522, HP0535). One of these, HP0535, is predicted to code for either a new secreted candidate effector protein or a protein which interacts with CagA because it contains two genetic lineages, similar to cagA. Our study provides a resource that can guide future research on the biological roles and host interactions of cagPAI proteins, including several whose function is still unknown.

Conflict of interest statement

MV is an employee of Applied Maths nv and therefore has a competing interest for the Kodon software.

Figures

Figure 1. Distribution of the cag pathogenicity island in a global collection of H. pylori strains from different populations.
(A) Neighbor-joining (NJ) tree of neutral genetic relatedness of H. pylori strains, including information about the presence or absence of the cagPAI. The NJ tree was calculated from concatenated sequences of seven housekeeping genes (length 3406 bp) from 877 isolates of H. pylori plus 9 additional isolates for which either cagPAI sequences or whole genome sequences had been published (indicated by arrows). Each strain was scored for presence (filled triangles) or absence (empty circles) of the cagPAI based on the results of PCR reactions that span the ends of the cagPAI. Population assignments based on Bayesian analyses are indicated by the color coding of symbols that correspond to the labels next to the tree; red symbols indicate all strains whose cagPAI sequences are now available, including the 29 strains that were newly selected for cagPAI sequence analysis. (B) Geographic sources of strains whose cagPAI sequences are now available. Each dot indicates the source of isolation of one of the 38 cagPAI sequences that were analyzed. The dots are color-coded by population or subpopulation as in (A).
Figure 2. Conservation of the cagPAI genetic organization across H. pylori biogeographic populations.
The sequences were aligned in KODON using the cagPAI of strain J99 as a scaffold sequence. Individual isolates are grouped according to biogeographic (sub-)populations. The continuity of the cagPAI was disrupted in isolates PAL3414, V225 and HUI1769, and fragments found in secondary locations are displayed in grey-shaded boxes on separate lines. The two cagPAI sequences from reference strains J99 and 26695 were extracted from whole genomes. Genes essential for a basic function of the cagPAI type IV secretion system (IL-8 induction; [3]) are labeled with an asterisk (*). Activity of the Cag t4ss (IL-8 secretion; + or −) was monitored during experimental infection of AGS cells with H. pylori. Obs., observed IL-8 secretion; exp., IL-8 secretion expected from the cagPAI sequence; red, genes in forward orientation; blue, genes in reverse orientation; light blue, shorter gene version; white, different gene HP521B in this locus; yellow, pseudogenes; black, IS elements; green, cagPAI insertion sites. Diamonds: frameshift mutations leading to pseudogenes. Δ followed by a number from 1 through 10 indicates a deletion (a manifestation of macrodiversity); deletions are numbered consecutively as mentioned in the text and Table 1. a, b, c, d: strains not functionally tested in this study that possess functional cagPAIs according to published references.
Figure 3. Variability of Cag t4ss function in H. pylori strains from different biogeographic populations.
(A) IL-8 induction in human gastric epithelial cells by diverse H. pylori strains from different biogeographic populations. IL-8 secretion induced at 20 h post infection by live H. pylori in gastric epithelial cells (AGS, shown here, and MKN28, data not shown) was determined as a read-out for Cag t4ss activity. The two strains J99 and 26695A, for which entire genome sequences are available, were included as positive controls. CagA EPIYA motifs for each strain are indicated on top of the graph. Exceptions in the genetic integrity of some of the islands and other explanations for an observed loss of functionality are indicated above the single bars. Colored bars designate the population assignments of strains. Coincubation experiments were performed independently at least three times for each strain, with similar results, and one representative experiment, performed in triplicate for each strain, is shown. IL-8 secretion is depicted in relative values, as a multiple of the negative control (mock), which was set to 1. (B,C) Assessing underlying causes of loss of function of cagPAIs in some H. pylori strains. (B) CagA translocation assays performed after infection of AGS cells with the two selected H. pylori strains D3A and M49, both of which displayed loss of cagPAI-related activity in IL-8 release assays. Both strains were unable to translocate CagA into human gastric epithelial cells. Strains SU2, N6, and 26695A wild type (wt) were used as positive controls for CagA translocation. Strains SU2Δcag and 26695AΔcag (isogenic cagPAI deletion mutants of SU2 and 26695A) were included as negative controls. (C) Transcript amounts of single cagPAI genes. 30 strains (4 strains shown here; for complete results see Table S3) were studied using semiquantitative RT-PCR for each gene with known function in the Cag t4ss (refer to Table 2 for gene names). Two strains with loss of t4ss function, CC72C and M49, are shown. TAI196 and 26695A are depicted as positive controls.
TAI196, a strain with a high propensity to induce IL-8, shows relatively high transcript amounts for the majority of genes. Strains CC42C and L72 (not shown), which have pseudogenes and lost the ability to induce IL-8, showed low or undetectable transcript amounts for some genes, including the pseudogenes. M49 displayed low transcript amounts for a number of essential genes of the t4ss located predominantly in the right half of the cagPAI (genes HP0528, and HP0537 to HP0544).
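The relative scale used in panel (A), where IL-8 secretion is expressed as a multiple of the mock-infected control set to 1, can be illustrated with a minimal sketch. The function name `fold_over_mock` and all readings below are hypothetical and are not data from the study; the sketch only assumes that triplicate readings are averaged before being divided by the mock mean.

```python
from statistics import mean


def fold_over_mock(readings, mock_readings):
    """Express mean IL-8 readings as a multiple of the mock-infected control."""
    mock_mean = mean(mock_readings)
    if mock_mean <= 0:
        raise ValueError("mock control mean must be positive")
    return mean(readings) / mock_mean


# Hypothetical triplicate readings in arbitrary units (not data from the study).
mock = [100.0, 110.0, 90.0]
strain_a = [1500.0, 1400.0, 1600.0]

print(fold_over_mock(mock, mock))      # the mock control is 1 by construction
print(fold_over_mock(strain_a, mock))  # induction relative to mock
```

Dividing by the mock mean makes values comparable across strains within one experiment, which is why the negative control appears as 1 on the plotted scale.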
Figure 4. Sliding window map of maximum likelihood analysis of codons to be under diversifying selection for complete cagPAIs and housekeeping genes.
Codons calculated by CODEML (model M3) to have a high probability (p > 95%) of being under diversifying selection in each gene of the cagPAI or in the housekeeping genes of all analyzed strains are highlighted by black symbols.
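The thresholding behind such a map can be sketched as follows, assuming the per-codon posterior probabilities have already been obtained from a site-model analysis. The function names, the window size, and the example probabilities are hypothetical; the caption does not state the window parameters used for the figure, and this is not the authors' pipeline.

```python
def selected_codons(posteriors, threshold=0.95):
    """Return 1-based positions of codons whose posterior exceeds the threshold."""
    return [i for i, p in enumerate(posteriors, start=1) if p > threshold]


def sliding_window_counts(posteriors, window=3, step=1, threshold=0.95):
    """Count codons above the threshold in each window along a gene."""
    flags = [p > threshold for p in posteriors]
    return [
        sum(flags[start:start + window])
        for start in range(0, len(flags) - window + 1, step)
    ]


# Hypothetical per-codon posterior probabilities (not data from the study).
posteriors = [0.10, 0.97, 0.99, 0.50, 0.96, 0.20]

print(selected_codons(posteriors))         # [2, 3, 5]
print(sliding_window_counts(posteriors))   # [2, 2, 2, 1]
```

Counting flagged codons per window, rather than plotting each codon alone, makes clusters of sites under selection easier to see along a long gene.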
Figure 5. Pairwise correlation of genetic distances and phylogeographic diversity between H. pylori housekeeping genes and concatenated cagPAI genes.
(A) Neighbor-joining (NJ) tree analysis of concatenated housekeeping genes for all strains whose complete cagPAIs were analyzed. (B) NJ tree analysis of concatenated cagPAI genes for all strains. (C) Mantel comparison of pairwise genetic distances in housekeeping genes and cagPAI genes.
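A Mantel comparison like the one in panel (C) correlates two pairwise distance matrices and assesses significance by permuting the strain labels of one matrix. The following is a minimal pure-Python sketch of that general technique, not the software or settings used in the study; the function names, the number of permutations, and the toy 4 x 4 matrices are all assumptions chosen for illustration.

```python
import random
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: returns (correlation, permutation p-value)."""
    rng = random.Random(seed)
    n = len(d1)
    flat1 = upper_triangle(d1)
    observed = pearson(flat1, upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        # Permute rows and columns of d2 together, i.e. relabel the strains.
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(flat1, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy symmetric distance matrices for four hypothetical strains.
housekeeping = [[0, 1, 4, 6], [1, 0, 3, 5], [4, 3, 0, 2], [6, 5, 2, 0]]
cagpai = [[2 * v for v in row] for row in housekeeping]

r, p = mantel(housekeeping, cagpai)
print(r, p)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations; shuffling individual distances would understate the p-value.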
Figure 6. Model of the Cag t4ss of H. pylori, highlighting diversifying selection on outer and secreted components of the t4ss apparatus.
Each defined component of the cagPAI-encoded secretion system was shaded in grey according to averaged probability values, indicating the proportion of amino acids likely to be under diversifying selection for each individual protein; the probability values were calculated for each gene by the software CODEML (Table 2). Ten cagPAI genes that do not participate in the structure or are of unknown function are not included in the model. The model of the Cag t4ss is based on previously published studies.
Figure 7. Diversifying selection in the MARK2 kinase binding domain of CagA.
(A) Amino acids (aa) in CagA predicted to be under diversifying selection (PAML, Model 3) were mapped onto the crystal structure of a short peptide within the CagA C-terminal domain (aa 948 to aa 961, MK1 peptide; aa under positive selection colored in pink, aa not predicted to be under positive selection colored in green), in complex with its interaction partner, the human kinase MARK2. Four residues critically involved in this interaction (Leu950, Arg952, Val954, Leu959) are labelled. Several amino acids involved in this interaction (e.g. Arg952 and Lys955, Ref. 44) are predicted to be under diversifying selection and are highly variable in our global strain collection. (B) Amino acid alignment of the MARK2 binding region in the analyzed global strain collection. Black asterisks: amino acids involved in MARK2 binding. Pink dots: amino acids predicted to be under positive selection. Small dots in alignment: residue identical to that of reference strain 26695 (blue line on top). Hyphen: aa missing in the respective strain.

References

    1. Suerbaum S, Michetti P. Helicobacter pylori infection. N Engl J Med. 2002;347:1175–1186.
    2. Censini S, Lange C, Xiang Z, Crabtree JE, Ghiara P, et al. cag, a pathogenicity island of Helicobacter pylori, encodes type I-specific and disease-associated virulence factors. Proc Natl Acad Sci U S A. 1996;93:14648–14653.
    3. Fischer W, Puls J, Buhrdorf R, Gebert B, Odenbreit S, et al. Systematic mutagenesis of the Helicobacter pylori cag pathogenicity island: essential genes for CagA translocation in host cells and induction of interleukin-8. Mol Microbiol. 2001;42:1337–1348.
    4. Wiedemann T, Loell E, Mueller S, Stoeckelhuber M, Stolte M, et al. Helicobacter pylori cag-Pathogenicity island-dependent early immunological response triggers later precancerous gastric changes in Mongolian gerbils. PLoS ONE. 2009;4:e4754. doi: 10.1371/journal.pone.0004754.
    5. Figueiredo C, Machado JC, Pharoah P, Seruca R, Sousa S, et al. Helicobacter pylori and interleukin 1 genotyping: an opportunity to identify high-risk individuals for gastric carcinoma. J Natl Cancer Inst. 2002;94:1680–1687.