dTOURS: Dense-region tagging for outbreak detection using ratio statistics
- PMID: 40359413
- PMCID: PMC12074585
- DOI: 10.1371/journal.pone.0322663
Abstract
Food safety surveillance in the United States of America is a collaborative effort among public health agencies, with additional partners worldwide contributing sequence data. An automated pathogen detection pipeline at the National Center for Biotechnology Information receives assemblies in GenBank and sequence reads in the Sequence Read Archive for surveilled species, analyzes them rapidly, and publishes the results publicly. The pipeline detects close isolates that share a recent common ancestor by finding single nucleotide polymorphisms (SNPs) between the genomes of pairs of isolates. Very few vertically transmitted SNPs are expected between a pair of close isolates; a genomic region with many SNPs relative to the number of SNPs in the rest of the genome indicates a horizontally transferred region, which must be excluded when counting vertically transmitted SNPs. We developed dTOURS, which adapts the ratio statistic for finding outliers to the problem of finding regions of high SNP density in a pair of genomes, where isolates typically have fragmented genome assemblies. We present simulations used to choose the dTOURS parameter. We illustrate the correctness of dTOURS using five published outbreaks, one for each of five bacterial species that cause many foodborne outbreaks or lead to a high mortality rate. A comparison with Gubbins shows that, although both Gubbins and dTOURS use the ratio statistic, the implementation in dTOURS is more robust for finding close isolates in outbreak analysis. A comparison with the method used by the Food and Drug Administration shows that their method is simple and fast but not sensitive.
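The idea of excluding SNP-dense regions before counting vertically transmitted SNPs can be illustrated with a minimal sketch. This is not the dTOURS algorithm or its ratio statistic, whose details the abstract does not give. It assumes a single contiguous sequence, fixed non-overlapping windows, and a simple density ratio with a pseudo-count. The function names, the `window` size, and the `min_ratio` threshold are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def dense_windows(
    snp_positions: List[int],
    genome_length: int,
    window: int = 1000,        # hypothetical window size
    min_ratio: float = 10.0,   # hypothetical threshold, not the dTOURS parameter
) -> List[Tuple[int, int, float]]:
    """Flag fixed windows whose SNP density is at least `min_ratio` times
    the SNP density of the rest of the sequence."""
    total = len(snp_positions)
    flagged = []
    for start in range(0, genome_length, window):
        end = min(start + window, genome_length)
        inside = sum(1 for p in snp_positions if start <= p < end)
        rest_length = genome_length - (end - start)
        if inside == 0 or rest_length <= 0:
            continue
        inside_density = inside / (end - start)
        # A pseudo-count of 1 avoids division by zero when no SNPs lie outside.
        outside_density = (total - inside + 1) / rest_length
        ratio = inside_density / outside_density
        if ratio >= min_ratio:
            flagged.append((start, end, ratio))
    return flagged


def count_remaining_snps(
    snp_positions: List[int], flagged: List[Tuple[int, int, float]]
) -> int:
    """Count SNPs that fall outside every flagged window."""
    return sum(
        1
        for p in snp_positions
        if not any(start <= p < end for start, end, _ in flagged)
    )
```

In this toy setting, a pair of isolates with a cluster of SNPs in one window and a few scattered elsewhere would have the cluster flagged, and only the scattered SNPs would count toward the pairwise distance. Fragmented assemblies, which the abstract identifies as the typical case, would need per-contig handling that this sketch omits.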
Copyright: This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Conflict of interest statement
The authors have declared that no competing interests exist.