PLoS One. 2019 Feb 28;14(2):e0213039.

doi: 10.1371/journal.pone.0213039. eCollection 2019.

GenomeGraphR: A user-friendly open-source web application for foodborne pathogen whole genome sequencing data integration, analysis, and visualization

Moez Sanaa¹, Régis Pouillot¹, Francisco Garcés Vega¹, Errol Strain¹, Jane M Van Doren¹

PMID: 30818354
PMCID: PMC6394949
DOI: 10.1371/journal.pone.0213039

Moez Sanaa et al. PLoS One. 2019.

Affiliation

¹ Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, Maryland, United States of America.

Abstract

Food safety risk assessments and large-scale epidemiological investigations can yield better and new types of information when whole genome sequence (WGS) data are effectively integrated. Today, the WGS collections in the NCBI Pathogen Detection database have grown significantly through improvements in technology, coordination, and collaboration, such as the GenomeTrakr and PulseNet networks. However, high-quality genomic data are seldom coupled with high-quality epidemiological or food chain metadata. We have created a set of tools for cleaning, curation, integration, analysis, and visualization of microbial genome sequencing data. The tools have been tested using Salmonella enterica and Listeria monocytogenes data sets provided by NCBI Pathogen Detection (160,000 sequenced isolates in 2018). GenomeGraphR presents foodborne pathogen WGS data and associated curated metadata in a user-friendly interface that lets a user explore a variety of research questions, such as transmission sources and dynamics, global reach, and persistence of genotypes associated with contamination in the food supply and foodborne illness across time or space. The application is freely available (https://fda-riskmodels.foodrisk.org/genomegraphr/).

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Simplified example of the search strategy.**
A. The complete network includes all nodes and links the nodes that are closer, in SNP distance, than a given, user-specified threshold. B. When some strains are selected (e.g. based on their isolation source), the connected components are limited to those strains and the clinical strains closer than the SNP threshold. This graph shows the strains from the isolation source that are potentially linked to clinical strains. It includes only clinical strains and strains from the selected isolation source. C. To verify whether these links are meaningful, all the additional strains, from any source, that are closer than the SNP threshold to the clinical strains are recalled, forming a sub-network.
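The three panels of Fig 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the GenomeGraphR implementation: the strain names, the `distances` and `source_of` dictionaries, and the three function names are all hypothetical, a link is assumed to exist when the SNP distance is less than or equal to the threshold, and panel B is simplified to keep only links between selected-source strains and clinical strains.

```python
from collections import defaultdict


def build_network(distances, threshold):
    """Panel A: link every pair of strains within the SNP threshold."""
    adjacency = defaultdict(set)
    for (a, b), snp in distances.items():
        adjacency[a]  # register both nodes, even if they end up isolated
        adjacency[b]
        if snp <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def source_to_clinical_links(adjacency, source_of, selected_source):
    """Panel B: links between strains of one isolation source and clinical strains."""
    links = set()
    for strain, neighbours in adjacency.items():
        if source_of[strain] != selected_source:
            continue
        for other in neighbours:
            if source_of[other] == "clinical":
                links.add((strain, other))
    return links


def recall_subnetwork(adjacency, linked_clinical):
    """Panel C: recall strains from any source that are close to the linked clinical strains."""
    nodes = set(linked_clinical)
    for clinical in linked_clinical:
        nodes |= adjacency[clinical]
    return nodes


# Hypothetical pairwise SNP distances and isolation sources.
distances = {
    ("egg1", "clin1"): 5,
    ("egg1", "clin2"): 30,
    ("egg1", "chicken1"): 20,
    ("clin1", "chicken1"): 8,
    ("clin1", "clin2"): 50,
    ("clin2", "chicken1"): 40,
}
source_of = {"egg1": "egg", "chicken1": "chicken",
             "clin1": "clinical", "clin2": "clinical"}

network = build_network(distances, threshold=12)
links = source_to_clinical_links(network, source_of, "egg")
subnetwork = recall_subnetwork(network, {clin for _, clin in links})
print(links)
print(sorted(subnetwork))
```

In this toy data set the egg strain is linked to one clinical strain, but recalling the sub-network also brings in a chicken strain that is just as close to that clinical strain, which is the kind of competing source panel C is meant to expose.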

**Fig 2. Categorization scheme of strain isolation sources (non-clinical strains), with relative numbers of isolates illustrated by the width of the band in this Sankey plot.**
Top: *L. monocytogenes* strains (root: 10,912 isolates). Bottom: *S. enterica* strains (root: 49,525 strains).

**Fig 3**
Left: Number of isolates (scale on the right axis) and number of SNP clusters (scale on the left axis) as a function of time (creation date of the target in the NCBI database), for *L. monocytogenes* (top) and *S. enterica* (bottom). Right: Probability that a newly created clinical strain is genetically matched with a previously isolated non-clinical strain, as a function of time and SNP threshold, for *L. monocytogenes* (top) and *S. enterica* (bottom). Note: the 2013 artifact for *S. enterica* is linked to the massive inclusion of new strains in 2013.
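The caption does not spell out how the match probability is estimated. One plausible reading, sketched below, is the fraction of clinical strains created in a given period that have at least one earlier non-clinical strain within the SNP threshold. All names and data here are hypothetical, and the estimator itself is an assumption rather than the authors' method.

```python
from datetime import date


def has_prior_match(clin_id, clin_date, non_clinical, distances, threshold):
    """True if any non-clinical strain created earlier lies within the SNP threshold."""
    return any(
        created < clin_date
        and distances[frozenset((clin_id, nc_id))] <= threshold
        for nc_id, created in non_clinical.items()
    )


def match_probability(clinical, non_clinical, distances, threshold):
    """Fraction of the given clinical strains that have a prior non-clinical match."""
    if not clinical:
        return float("nan")
    matched = sum(
        has_prior_match(cid, cdate, non_clinical, distances, threshold)
        for cid, cdate in clinical.items()
    )
    return matched / len(clinical)


# Hypothetical creation dates and pairwise SNP distances.
clinical = {"c1": date(2017, 3, 1), "c2": date(2017, 6, 1)}
non_clinical = {"f1": date(2016, 5, 1), "f2": date(2017, 9, 1)}
distances = {
    frozenset(("c1", "f1")): 4,
    frozenset(("c1", "f2")): 2,
    frozenset(("c2", "f1")): 25,
    frozenset(("c2", "f2")): 3,
}

p_12 = match_probability(clinical, non_clinical, distances, threshold=12)
p_30 = match_probability(clinical, non_clinical, distances, threshold=30)
print(p_12, p_30)
```

Here `c2` is within 3 SNPs of `f2`, but `f2` was created later, so it does not count as a prior match at a threshold of 12. Raising the threshold to 30 lets the earlier strain `f1` match instead, which illustrates why the probability depends on both time and threshold.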

**Fig 4. Box plot of the number of connections per strain (degree, k) as a function of the year of creation of the target (left: *L. monocytogenes*; right: *S. enterica*; SNP threshold = 12).**

**Fig 5. Connected components characteristics at SNP threshold equal to 12 (left: *Listeria monocytogenes*, right: *Salmonella*).**
Each point represents a connected component, placed on the x-axis at its number of nodes (n) and on the y-axis at its number of links, both on a log₁₀ scale. The upper line represents n × (n − 1)/2, the maximum number of possible links, and the lower, dashed line represents n − 1, the minimum number of links.
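The two bounding lines in Fig 5 follow from basic graph properties: a connected component with n nodes needs at least n − 1 links to stay connected (a tree) and can have at most n × (n − 1)/2 links (every pair linked). The sketch below finds connected components in a small, hypothetical adjacency map and checks each one against these bounds; the function names and the toy network are illustrative only.

```python
def connected_components(adjacency):
    """Return the connected components of an undirected graph as sets of nodes."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def count_links(component, adjacency):
    """Count undirected links inside a component (each link is seen from both ends)."""
    return sum(len(adjacency[node] & component) for node in component) // 2


def link_bounds(n):
    """Minimum (tree) and maximum (complete graph) number of links for n nodes."""
    return n - 1, n * (n - 1) // 2


# Hypothetical network: a triangle, a chain of three strains, and an isolated strain.
adjacency = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"},
    "d": {"e"}, "e": {"d", "f"}, "f": {"e"},
    "g": set(),
}

summary = sorted(
    (len(cc), count_links(cc, adjacency), *link_bounds(len(cc)))
    for cc in connected_components(adjacency)
)
for n, links, low, high in summary:
    print(f"n={n} links={links} min={low} max={high}")
```

The triangle sits on the upper line (3 links out of a possible 3) and the chain on the lower line (2 links, the minimum for 3 nodes). Components near the upper line are tight clusters in which nearly every pair of strains is within the threshold.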

**Fig 6**
**Left: the isolation source tree**. Hovering the mouse over a node shows the number of strains from this isolation source in the database. Clicking on the node produces the graph on the right. **Right: Clinical strains connected to shell egg strains**. A connection exists when the SNP difference between a clinical strain and a non-clinical strain is less than or equal to 12; linked strains form connected components (CCs). The framed CC was selected to show an example of the in-depth analysis of clinical case sources.

**Fig 7. Examples of connectivity between food category and clinical cases (SNP threshold = 12).**
CCs: connected components, (): number of strains, → connected with SNPs ≤ 12.

**Fig 8. Examples of connectivity between a sub-set of strains (Food/environmental strains isolated in Canada) and clinical cases (SNP threshold = 12).**
CCs: connected components, (): number of strains, → connected with SNPs ≤ 12.

**Fig 10. Example of visualization and analysis of a sub-network.**

**Fig 11. Map illustrating the origin of the strains.**
Note that the dots are placed at random within the limits of the state (United States) or the country; the position of each dot does not represent the actual sampling location. Strains from the US not assigned to a specific state are placed in the blue square. Clinical strains are in black.
