PLoS One. 2025 May 13;20(5):e0322663.
doi: 10.1371/journal.pone.0322663. eCollection 2025.

dTOURS: Dense-region tagging for outbreak detection using ratio statistics

Lukas Wagner et al. PLoS One. 2025.

Abstract

Surveillance for food safety in the United States of America is a collaborative effort among public health agencies, with additional partners worldwide contributing sequence data. For surveilled species, assemblies in GenBank and sequence reads in the Sequence Read Archive are received and rapidly analyzed, and the results are published by an automated pathogen detection pipeline at the National Center for Biotechnology Information. The pipeline detects close isolates, those sharing a recent common ancestor, by finding single nucleotide polymorphisms (SNPs) between the genomes of pairs of isolates. Very few vertically transmitted SNPs are expected between a pair of close isolates, so any genomic region with many SNPs relative to the number of SNPs in the rest of the genome indicates a horizontally transferred region that must be excluded when counting vertically transmitted SNPs. We developed dTOURS, which adapts the ratio statistic for finding outliers to the problem of finding regions of high SNP density in a pair of genomes where isolates typically have fragmented genome assemblies. We present simulations used to choose the dTOURS parameter. We illustrate the correctness of dTOURS using five published outbreaks, one for each of five bacterial species that cause many foodborne outbreaks or lead to a high mortality rate. Comparison with Gubbins shows that, while both Gubbins and dTOURS use the ratio statistic, the implementation in dTOURS is more robust for finding close isolates in outbreak analysis. Comparison with the method used by the Food and Drug Administration shows that their method is simple and fast but not sensitive.
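
The idea of flagging a region whose SNP density is high relative to the rest of the genome can be illustrated with a minimal sketch. This is not the dTOURS ratio statistic, whose definition is not given in the abstract. It is a simple window-versus-remainder density ratio on a single contiguous coordinate axis, and the names `dense_regions` and `count_outside` and the parameters `window`, `min_snps`, and `min_ratio` are hypothetical choices made for illustration. It also ignores the fragmented assemblies that dTOURS is designed to handle.

```python
from bisect import bisect_left


def dense_regions(positions, genome_len, window=1000, min_snps=4, min_ratio=10.0):
    """Return merged (start, end) intervals with unusually high SNP density.

    Illustrative only: a plain density ratio, NOT the dTOURS statistic.
    All parameter defaults are assumptions, not values from the paper.
    """
    pos = sorted(positions)
    n = len(pos)
    flagged = []
    for i, start in enumerate(pos):
        # Count SNPs in the half-open window [start, start + window).
        j = bisect_left(pos, start + window)
        inside = j - i
        if inside < min_snps:
            continue
        outside = n - inside
        dens_in = inside / window
        # Add-one smoothing keeps the ratio finite when no SNPs lie outside.
        dens_out = (outside + 1) / (genome_len - window)
        if dens_in / dens_out >= min_ratio:
            flagged.append((start, pos[j - 1]))
    # Merge overlapping flagged windows into contiguous regions.
    merged = []
    for s, e in flagged:
        if merged and s <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], e))
        else:
            merged.append((s, e))
    return merged


def count_outside(positions, regions):
    """Count SNPs that fall outside every flagged region."""
    return sum(1 for p in positions if not any(s <= p <= e for s, e in regions))


# Synthetic example: three scattered SNPs plus a cluster of 20 SNPs within 1 kb.
snps = [100_000, 900_000, 2_500_000] + [1_200_000 + 50 * k for k in range(20)]
regions = dense_regions(snps, genome_len=3_000_000)
filtered_count = count_outside(snps, regions)
```

In this synthetic input the 20 clustered SNPs are flagged as one region and only the three scattered SNPs remain in the filtered count, mirroring the distinction the abstract draws between unfiltered and filtered SNP counts.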

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Flowchart for dTOURS algorithm.
Input is a set of SNPs on a pair of genomes and output is HDRs.
Fig 2. SNPs between GCA_011671145.1 and GCA_010507495.1 before filtering.
SNPs are color-coded by contig. Coordinates on the X-axis are obtained by concatenating aligned blocks of contigs in the order in which they appear in the subject genome GCA_010507495.1.
Fig 3. Distribution of RS score for some simulated combinations.
With N={10, 25, 50, 75, 100} and G={1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 12.7} million, combinations shown are (A) 10 positions from all counts in G (B) 100 positions from all counts in G (C) all positions in N from 1.5 million (D) all positions in N from 12.7 million.
Fig 4. Clades in Listeria outbreak.
The number of members in each clade and subclade is shown in round brackets. The number of pairs in each subclade where the unfiltered SNP count is at least 100 more than the filtered SNP count in our analysis is shown in square brackets.
Fig 5. Plasmid alignment and SNPs on contig 67 in SRR3476797 assembly with respect to SRR3476805 assembly.
The lengths of contig 67 in the SKESA assembly of SRR3476797 and of plasmid LR890577.1 are 41,027 bp and 92,280 bp, respectively. Each vertical pink bar shows a single SNP.
