mSystems. 2024 Jun 18;9(6):e0141523.
doi: 10.1128/msystems.01415-23. Epub 2024 May 31.

SARS-CoV-2 wastewater variant surveillance: pandemic response leveraging FDA's GenomeTrakr network

Ruth E Timme et al. mSystems. 2024.

Abstract

Wastewater surveillance has emerged as a crucial public health tool for population-level pathogen surveillance. Supported by funding from the American Rescue Plan Act of 2021, the FDA's genomic epidemiology program, GenomeTrakr, was leveraged to sequence SARS-CoV-2 from wastewater sites across the United States. This initiative required the evaluation, optimization, development, and publication of new methods and analytical tools spanning sample collection through variant analyses. Version-controlled protocols for each step of the process were developed and published on protocols.io. A custom data analysis tool and a publicly accessible dashboard were built to facilitate real-time visualization of the collected data, focusing on the relative abundance of SARS-CoV-2 variants and sub-lineages across different samples and sites throughout the project. From September 2021 through June 2023, a total of 3,389 wastewater samples were collected, with 2,517 undergoing sequencing and submission to NCBI under the umbrella BioProject, PRJNA757291. Sequence data were released with explicit quality control (QC) tags on all sequence records, communicating our confidence in the quality of data. Variant analysis revealed wide circulation of Delta in the fall of 2021 and captured the sweep of Omicron and subsequent diversification of this lineage through the end of the sampling period. This project successfully achieved two important goals for the FDA's GenomeTrakr program: first, contributing timely genomic data for the SARS-CoV-2 pandemic response, and second, establishing both capacity and best practices for culture-independent, population-level environmental surveillance for other pathogens of interest to the FDA.

Importance: This paper serves two primary objectives. First, it summarizes the genomic and contextual data collected during a COVID-19 pandemic response project, which used the FDA's laboratory network, traditionally employed for sequencing foodborne pathogens, to sequence SARS-CoV-2 from wastewater samples. Second, it outlines best practices for gathering and organizing population-level next-generation sequencing (NGS) data collected for culture-free surveillance of pathogens sourced from environmental samples.

Keywords: FAIR data; GenomeTrakr; SARS-CoV-2; covid-19; data standards; data structures; pathogen genomic surveillance; wastewater based epidemiology; wastewater surveillance.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Fig 1
NCBI data structure for population-level pathogen surveillance, or environmental pathogen data object model (DOM). This Env pathogen DOM has sample and sequence contextual data required for analyzing wastewater sequence data with a single target pathogen, SARS-CoV-2. The flag on the BioProject represents the automated human-read scrubbing by NCBI for all data submissions linked to this project.
Fig 2
Data sources for the public dashboard summarizing wastewater surveillance for SARS-CoV-2 variants. The compilation of information for the public dashboard involved two distinct NCBI queries and a sequence analysis pipeline. Daily queries were executed to capture new submissions, and the newly obtained summary data were incorporated into the public dashboard guided by information in the static file. Raw data for the dashboard are available for download here: https://github.com/CFSAN-Biostatistics/WW-SC2-variant-estimations.
Fig 3
Sample collection and sequencing workflow. SARS-CoV-2 wastewater surveillance sample analysis process and critical quality control checkpoints recommended for this project.
Fig 4
QC summary metrics for short-read Illumina data. Three panels summarize the quality of population-level SARS-CoV-2 sequence data collected and submitted for this project: (a) average depth of coverage across the SARS-CoV-2 genome (average coverage), (b) percent of the SARS-CoV-2 genome with less than 10× coverage, and (c) percent of raw sequence reads that aligned to the SARS-CoV-2 genome. Quality control determinations made by the submitter (QC bins A, B, C, or F) are also summarized in each panel.
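The three metrics in panels (a) through (c) can be illustrated with a minimal Python sketch. It assumes per-position depths and read counts have already been extracted from an alignment; the function name `qc_summary`, its parameters, and the output keys are hypothetical and are not taken from the project's pipeline. The sketch does not assign QC bins, because the caption states only that those determinations were made by the submitter.

```python
from statistics import mean


def qc_summary(depths, total_reads, aligned_reads, min_depth=10):
    """Return three QC metrics for one sample.

    depths        -- read depth at each reference genome position (assumed input)
    total_reads   -- number of raw reads in the sample
    aligned_reads -- number of reads aligned to the reference genome
    min_depth     -- positions below this depth count as poorly covered
    """
    if not depths or total_reads <= 0:
        raise ValueError("depths must be non-empty and total_reads positive")

    low_positions = sum(1 for d in depths if d < min_depth)
    return {
        # (a) average depth of coverage across the genome
        "average_coverage": mean(depths),
        # (b) percent of genome positions with depth below min_depth
        "pct_genome_below_min_depth": 100.0 * low_positions / len(depths),
        # (c) percent of raw reads that aligned to the reference
        "pct_reads_aligned": 100.0 * aligned_reads / total_reads,
    }


# Toy input for illustration only; these are not project data.
example = qc_summary([0, 5, 10, 25], total_reads=120, aligned_reads=30)
```

In practice the depth list would come from a tool that reports per-position coverage; that step is outside this sketch.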
Fig 5
Relative abundances of variants and sub-lineages over time. Stacked bar chart showing the average variant and sub-lineage proportions for samples collected during each week. For the sake of visibility, only sub-lineages with a relative abundance of ≥5% for at least one week are displayed. All others, regardless of whether the WHO or CDC had designated them as being of interest, were treated as part of their parent lineage until they had sufficient relative abundance to meet the ≥5% threshold.
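The display rule described in this caption can be sketched in Python as follows. This is one possible reading, not the dashboard's actual code: a lineage is shown under its own name if it reaches the threshold in at least one week anywhere in the data set, and is otherwise folded into its nearest displayed ancestor (or into the top of its parent chain). The function name `roll_up`, the `parent` mapping, and the example proportions are illustrative assumptions.

```python
def roll_up(weekly, parent, threshold=0.05):
    """Fold low-abundance sub-lineages into their parent lineages.

    weekly    -- {week: {lineage: average relative abundance}}
    parent    -- {lineage: parent lineage}; lineages absent from the
                 mapping are treated as top-level (assumed structure)
    threshold -- minimum abundance in at least one week to be displayed
    """
    # Lineages that meet the threshold in at least one week.
    shown = {
        lineage
        for proportions in weekly.values()
        for lineage, p in proportions.items()
        if p >= threshold
    }

    def display_name(lineage):
        # Walk up the parent chain until a displayed lineage or a root.
        while lineage not in shown and lineage in parent:
            lineage = parent[lineage]
        return lineage

    rolled = {}
    for week, proportions in weekly.items():
        merged = {}
        for lineage, p in proportions.items():
            key = display_name(lineage)
            merged[key] = merged.get(key, 0.0) + p
        rolled[week] = merged
    return rolled


# Made-up proportions for illustration only; these are not project data.
weekly = {
    "week 1": {"BA.1": 0.90, "BA.1.1": 0.07, "BA.2": 0.03},
    "week 2": {"BA.1": 0.60, "BA.1.1": 0.38, "BA.2": 0.02},
}
parent = {"BA.1.1": "BA.1", "BA.1": "B.1.1.529", "BA.2": "B.1.1.529"}
rolled = roll_up(weekly, parent)
```

Here BA.2 never reaches 5%, so its share is counted under B.1.1.529 in both weeks, while BA.1 and BA.1.1 keep their own bars.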
Fig 6
Turnaround time from sample collection to NCBI data release. Box and whisker plot showing the number of days between sample collection date and NCBI release date for each participating laboratory.
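The quantity plotted in this figure is a simple date difference grouped by laboratory. A minimal Python sketch follows, assuming each record is a (laboratory, collection date, release date) tuple with ISO-formatted dates; the function name `turnaround_days`, the record layout, and the laboratory labels are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def turnaround_days(records):
    """Group days from sample collection to data release by laboratory.

    records -- iterable of (lab, collection_date, release_date) tuples,
               with dates as ISO strings such as "2022-01-03" (assumed layout)
    """
    by_lab = defaultdict(list)
    for lab, collected, released in records:
        delta = date.fromisoformat(released) - date.fromisoformat(collected)
        by_lab[lab].append(delta.days)
    return dict(by_lab)


# Made-up records for illustration only; these are not project data.
records = [
    ("lab_A", "2022-01-03", "2022-01-17"),
    ("lab_A", "2022-01-10", "2022-01-20"),
    ("lab_B", "2022-02-01", "2022-03-03"),
]
days = turnaround_days(records)
medians = {lab: median(values) for lab, values in days.items()}
```

The per-laboratory lists in `days` are the inputs a box and whisker plot would summarize; the median is shown here only as one of the summary statistics such a plot displays.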
