Nat Commun. 2023 Dec 11;14(1):8184.
doi: 10.1038/s41467-023-43562-y.

The Helicobacter pylori Genome Project: insights into H. pylori population structure from analysis of a worldwide collection of complete genomes

Kaisa Thorell et al. Nat Commun. 2023.

Abstract

Helicobacter pylori, a dominant member of the gastric microbiota, shares co-evolutionary history with humans. This has led to the development of genetically distinct H. pylori subpopulations associated with the geographic origin of the host and with differential gastric disease risk. Here, we provide insights into H. pylori population structure as part of the Helicobacter pylori Genome Project (HpGP), a multi-disciplinary initiative aimed at elucidating H. pylori pathogenesis and identifying new therapeutic targets. We collected 1011 well-characterized clinical strains from 50 countries and generated high-quality genome sequences. We analysed core genome diversity and population structure of the HpGP dataset and 255 worldwide reference genomes to outline the ancestral contribution to Eurasian, African, and American populations. We found evidence of substantial contribution of population hpNorthAsia and subpopulation hspUral in Northern European H. pylori. The genomes of H. pylori isolated from northern and southern Indigenous Americans differed: bacteria isolated in northern Indigenous communities were more similar to North Asian H. pylori, whereas those from southern communities were more closely related to hpEastAsia. Notably, we also found a highly clonal yet geographically dispersed North American subpopulation, which is negative for the cag pathogenicity island and present in 7% of sequenced US genomes. We expect the HpGP dataset and the corresponding strains to become a major asset for H. pylori genomics.

Conflict of interest statement

J.P.G. has served as a speaker, consultant, and advisory member for or has received research funding from Mayoly, Allergan, Diasorin, Gebro Pharma, and Richen. E.B.-M. has served as a speaker and consultant for Janssen, Chiesi, Kern and Takeda. R.M.F., J.C.M., and C.F. own patent WO/2018/169423 on microbiome markers for gastric cancer, and R.J.R. works for New England Biolabs, a company that sells research reagents, including restriction enzymes and DNA methyltransferases, to the scientific community. The remaining authors declare no competing interests.

Figures

Fig. 1. World map of HpGP strain origins and population assignments.
The area of each pie is proportional to the number of HpGP genomes from each country and colored by the H. pylori population (hp) and subpopulation (hsp) as assigned by fineSTRUCTURE (Supplementary Figs. 1 and 2).
Fig. 2. Distance network analyses of the core genome of the H. pylori strains studied.
Fruchterman–Reingold layout of the pruned distance network between HpGP genomes (circles) and reference genomes (triangles) (see Methods). Colors indicate the H. pylori population (hp) and subpopulation (hsp) as assigned by fineSTRUCTURE (Supplementary Figs. 1 and 2). The length and opacity of each link reflect the genetic distance between the genomes (nodes) it connects: shorter, more opaque links indicate genetically closer strains, and longer, more transparent links indicate more distant ones. The size of each node is proportional to its connectivity (number of links), so larger nodes are connected to more strains than smaller ones.
Fig. 3. Inferred ancestral genomic contributions to the Eurasian HpGP genomes.
Ancestral chromosome painting proportions by donor and Eurasian subpopulation. Boxplots show the median value per group, and the 25th and 75th percentiles (hinges), with whiskers extending from the hinge to the largest value no further than 1.5 × IQR (inter-quartile range) from the hinge. Data points beyond the whiskers are plotted individually. The numbers of genomes in each Eurasian group are: hspSWEuropeLatinAmerica, n = 15; hspSWEurope2, n = 12; hspSWEurope1, n = 129; hspEurasia3, n = 18; hspEurasia2, n = 76; hspEurasia1, n = 103; hspNEurope, n = 95; hpNorthAsia, n = 2; HpGP “hspUral”, n = 10; hpAsia2, n = 27.
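The boxplot convention described in this caption can be illustrated with a minimal sketch. The function name `boxplot_summary` and the toy values are hypothetical and are not taken from the paper. The quartile interpolation (`method="inclusive"`) and the symmetric treatment of the lower whisker are assumptions, since the caption does not state which quantile definition was used.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Summarise values the way the caption describes a boxplot.

    Assumption: quartiles use linear interpolation ("inclusive" method);
    the caption does not specify the quantile definition.
    Requires at least two values.
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that lie within the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "median": median,
        "hinges": (q1, q3),
        "whiskers": (inside[0], inside[-1]),
        # Points beyond the whiskers are the ones plotted individually.
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Toy values only, not data from the study.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(summary)
```

With these toy values the hinges fall at 3 and 7, so the upper whisker stops at 8 and the value 100 is reported as an individual point.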
Fig. 4. In-depth analysis of clonal relationships in the global H. pylori dataset.
a Pairwise core genome MLST (cgMLST) distances of the HpGP dataset. Bins illustrate the distribution of core genome allele sharing between pairs of samples. The x-axis ranges from 0.1 to 0.99, with lower values indicating a higher number of shared alleles. Every pair is included in a single category of comparison (color bar). Only a small fraction of all possible pairs share more than 1% of alleles, most of them involving samples from the same country of origin. Notably, a group of strains from different regions of the US share between 6% and 17% of alleles, corresponding to between 62 and 176 identical genes, suggesting the presence of a deep clone. Other pairs exhibit larger proportions of shared alleles (distances <50%), representing recent transmissions between closely related strains. b Dated ClonalFrameML tree of the final set of strains considered to belong to the US deep clone Hp_Clone_US-1, including five publicly available genomes. Node ages are given in years, based on a previously estimated mutation rate of 1.38 × 10⁻⁵ per site per year. The colored dots represent the geographical origin of each strain.
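As a rough illustration of the pairwise distance in panel a, the sketch below treats a cgMLST distance as the fraction of core loci at which two strains carry different alleles. This is an assumption: the caption does not give the exact formula, and the handling of uncalled loci (skipped here) is also assumed. The function name `cgmlst_distance`, the locus names, and the allele numbers are hypothetical.

```python
def cgmlst_distance(profile_a, profile_b):
    """Fraction of compared loci whose alleles differ between two strains.

    Profiles map locus name -> allele identifier, with None for an uncalled
    locus. Assumption: loci missing or uncalled in either profile are skipped.
    """
    compared = [
        locus
        for locus in profile_a
        if profile_a[locus] is not None and profile_b.get(locus) is not None
    ]
    if not compared:
        raise ValueError("no loci called in both profiles")
    differing = sum(1 for locus in compared if profile_a[locus] != profile_b[locus])
    return differing / len(compared)


# Toy allele profiles, not data from the study.
strain_a = {"locus1": 5, "locus2": 7, "locus3": 2, "locus4": 9}
strain_b = {"locus1": 5, "locus2": 1, "locus3": 3, "locus4": 4}

distance = cgmlst_distance(strain_a, strain_b)
print(f"distance = {distance:.2f}, shared alleles = {1 - distance:.0%}")
```

Under this reading, a pair sharing 6% of its alleles would sit at a distance of 0.94 on the x-axis, and a pair sharing 17% at 0.83.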
Fig. 5. Summary of population classifications.
Summary of the clustering results from the respective analyses in relation to previously reported MLST and whole genome-based H. pylori populations (hp) and subpopulations (hsp). Colors are based on classifications from the fineSTRUCTURE (fs) analyses visualized in Supplementary Fig. 1, on the K = 6 discriminant analysis of principal components, DAPC (Supplementary Fig. 3), and on the network clusters (Fig. 2). The topology of the dendrogram to the left is based on the fineSTRUCTURE hierarchical clustering of Supplementary Fig. 1.

Publication types