[Preprint]. 2022 Apr 4:2021.12.21.21268143.
doi: 10.1101/2021.12.21.21268143.

Wastewater sequencing uncovers early, cryptic SARS-CoV-2 variant transmission

Smruthi Karthikeyan 1; Joshua I Levy 2; Peter De Hoff 3,4,5; Greg Humphrey 1; Amanda Birmingham 6; Kristen Jepsen 7; Sawyer Farmer 1; Helena M Tubb 1; Tommy Valles 1; Caitlin E Tribelhorn 1; Rebecca Tsai 1; Stefan Aigner 3; Shashank Sathe 3; Niema Moshiri 8; Benjamin Henson 7; Adam M Mark 6; Abbas Hakim 3,4,5; Nathan A Baer 3; Tom Barber 3; Pedro Belda-Ferre 3; Marisol Chacón 3; Willi Cheung 3,4,5; Evelyn S Cresini 3; Emily R Eisner 3; Alma L Lastrella 3; Elijah S Lawrence 3; Clarisse A Marotz 3; Toan T Ngo 3; Tyler Ostrander 3; Ashley Plascencia 3; Rodolfo A Salido 3; Phoebe Seaver 3; Elizabeth W Smoot 3; Daniel McDonald 1; Robert M Neuhard 9,10; Angela L Scioscia 11,4; Alysson M Satterlund 12; Elizabeth H Simmons 13; Dismas B Abelman 10; David Brenner 10; Judith C Bruner 10; Anne Buckley 10; Michael Ellison 10; Jeffrey Gattas 10; Steven L Gonias 14; Matt Hale 10; Faith Hawkins 10; Lydia Ikeda 10; Hemlata Jhaveri 10; Ted Johnson 10; Vince Kellen 10; Brendan Kremer 10; Gary Matthews 10; Ronald W McLawhon 10; Pierre Ouillet 10; Daniel Park 10; Allorah Pradenas 10; Sharon Reed 10; Lindsay Riggs 10; Alison Sanders 10; Bradley Sollenberger 10; Angela Song 9,10; Benjamin White 10; Terri Winbush 10; Christine M Aceves 2; Catelyn Anderson 2; Karthik Gangavarapu 2; Emory Hufbauer 2; Ezra Kurzban 2; Justin Lee 2; Nathaniel L Matteson 2; Edyth Parker 2; Sarah A Perkins 2; Karthik S Ramesh 2; Refugio Robles-Sikisaka 2; Madison A Schwab 2; Emily Spencer 2; Shirlee Wohl 2; Laura Nicholson 2; Ian H McHardy 2; David P Dimmock 15; Charlotte A Hobbs 15; Omid Bakhtar 16; Aaron Harding 16; Art Mendoza 16; Alexandre Bolze 17; David Becker 17; Elizabeth T Cirulli 17; Magnus Isaksson 17; Kelly M Schiabor Barrett 17; Nicole L Washington 17; John D Malone 18; Ashleigh Murphy Schafer 18; Nikos Gurfield 18; Sarah Stous 18; Rebecca Fielding-Miller 19,20; Richard S Garfein 19; Tommi Gaines 20; Cheryl Anderson 19; Natasha K Martin 19; Robert Schooley 19; Brett Austin 16; Duncan R MacCannell 21; Stephen F Kingsmore 15; William Lee 17; Seema Shah 18; Eric McDonald 18; Alexander T Yu 5; Mark Zeller 2; Kathleen M Fisch 6,4; Christopher Longhurst 1,22; Patty Maysent 23; David Pride 24; Pradeep K Khosla 8; Louise C Laurent 3,4,25; Gene W Yeo 3,25,26; Kristian G Andersen 2; Rob Knight 1,8,27

Smruthi Karthikeyan et al. medRxiv.

Update in

  • Wastewater sequencing reveals early cryptic SARS-CoV-2 variant transmission.
    Karthikeyan S, et al. Nature. 2022 Sep;609(7925):101-108. doi: 10.1038/s41586-022-05049-6. Epub 2022 Jul 7. PMID: 35798029. Free PMC article.

Abstract

As SARS-CoV-2 continues to spread and evolve, detecting emerging variants early is critical for public health interventions. Inferring lineage prevalence by clinical testing is infeasible at scale, especially in areas with limited resources, participation, or testing/sequencing capacity, which can also introduce biases. SARS-CoV-2 RNA concentration in wastewater successfully tracks regional infection dynamics and provides less biased abundance estimates than clinical testing. Tracking virus genomic sequences in wastewater would improve community prevalence estimates and detect emerging variants. However, two factors limit wastewater-based genomic surveillance: low-quality sequence data and inability to estimate relative lineage abundance in mixed samples. Here, we resolve these critical issues to perform a high-resolution, 295-day wastewater and clinical sequencing effort, in the controlled environment of a large university campus and the broader context of the surrounding county. We develop and deploy improved virus concentration protocols and deconvolution software that fully resolve multiple virus strains from wastewater. We detect emerging variants of concern up to 14 days earlier in wastewater samples, and identify multiple instances of virus spread not captured by clinical genomic surveillance. Our study provides a scalable solution for wastewater genomic surveillance that allows early detection of SARS-CoV-2 variants and identification of cryptic transmission.

Figures

Extended Data Figure 1: Relationship of daily UCSD campus wastewater sampler positivity and campus clinical positives.
Black line indicates the linear fit to the data, with bootstrap 95% confidence interval shown in gray.
Extended Data Figure 2: Relationship between genome coverage and cycle quantification values.
10x genome coverage (fraction of sites with 10 reads or greater) remains high, even for Cq values of nearly 38. Points indicate median value in each bin, while error bars indicate the median absolute deviation.
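
The two summary statistics named in this caption can be sketched as follows. This is a minimal illustration, assuming per-site read depths are available as a plain list; the function names `coverage_fraction` and `median_abs_deviation` are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import median

def coverage_fraction(depths, min_reads=10):
    """Fraction of genome sites covered by at least `min_reads` reads."""
    if not depths:
        raise ValueError("depths must be non-empty")
    return sum(d >= min_reads for d in depths) / len(depths)

def median_abs_deviation(values):
    """Unscaled median absolute deviation: median of |v - median(values)|."""
    center = median(values)
    return median(abs(v - center) for v in values)

# Toy per-site depths (hypothetical): 5 of 8 sites have 10 or more reads.
toy_depths = [0, 5, 10, 12, 40, 9, 100, 10]
print(coverage_fraction(toy_depths))
```

With `min_reads=10` the first function gives the "10x genome coverage" of a sample; binning samples by Cq and taking the median and median absolute deviation per bin would give the points and error bars described.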
Extended Data Figure 3: Lineage-specific prediction of variant abundance in spike-in validation samples.
A. Schematic of “spike-in” sample design. B-F. Lineage-specific prediction. Proportions of each lineage in the sample are shown as a pie chart marker (Grey = Lineage A, Orange = Alpha, Pink = Beta, Turquoise = Delta, and Purple = Gamma), with error bars indicating the standard deviation from the mean across four replicates.
Extended Data Figure 4: Freyja more accurately estimates virus abundance, with fewer false positives.
A-B. Estimated vs expected fraction of each lineage in the mixture. The Kallisto-based approach from Baaijens et al. shows a wider range of estimates for each known mix fraction, and generally underestimates the fraction. C. False positives with abundance greater than 0.5%.
Extended Data Figure 5: The rise of the Delta variant during Summer 2021.
A. Mean SARS-CoV-2 viral gene copies/L of raw sewage (blue) collected from the Point Loma Wastewater Treatment Plant and caseload (gray) reported by the county during the same period. SARS-CoV-2 concentrations were normalized by PMMoV (pepper mild mottle virus) concentration to adjust for load changes. B. Lineage distribution in UCSD campus wastewater. C. Monthly lineage averages for wastewater collected at Point Loma Wastewater Treatment Plant during the Delta surge (N = 5, 20, 25, 7).
Extended Data Figure 6: Quantification of deconvolution uncertainty in first detection of VOCs.
A-D. Bootstrap distributions of Freyja abundance estimates obtained by resampling read data from each sample corresponding to the first detection of that VOC in San Diego. Two samplers were found to contain Delta on the same day. First detections were also confirmed using a VOC qPCR panel, as shown in Figure 2 and Extended Data Table 3. 95% Confidence intervals for variant prevalence for each first detection event: A. Alpha: (0.232, 0.278), B. Delta: (0.336, 0.397), C. Delta: (0.676, 0.772), D. Omicron: (0.017, 0.021).
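
A percentile bootstrap of the kind described here can be sketched as below. This is a deliberately simplified stand-in: it resamples reads at a single hypothetical site, where each read either carries a variant-defining mutation (1) or does not (0), whereas the caption describes resampling read data and re-estimating abundance with Freyja. The function name, the number of replicates, and the seed are assumptions for illustration.

```python
import random

def bootstrap_ci(reads, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of 0/1 read labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(reads)
    estimates = []
    for _ in range(n_boot):
        # Resample the reads with replacement and recompute the frequency.
        sample = rng.choices(reads, k=n)
        estimates.append(sum(sample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy data (hypothetical): 250 of 1000 reads carry the mutation.
toy_reads = [1] * 250 + [0] * 750
print(bootstrap_ci(toy_reads))
```

The interval narrows as read depth grows, which is consistent with low-prevalence first detections still having tight intervals when sequencing depth is high.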
Extended Data Figure 7: Estimated proportion of Omicron sequences in clinical data.
Omicron prevalence was estimated from S-gene target failure (SGTF) qPCR assays of clinical samples in San Diego between November 27, 2021 and February 7, 2022; SGTF is characteristic of Omicron lineage BA.1 and its descendants. The first detection of Omicron through clinical genomic sequencing in San Diego was on December 8, 2021. The dotted line shows a rolling average with a window size of seven days.
Figure 1: Campus sampling locations and SARS-CoV-2 testing statistics.
A. Geospatial distribution of the 131 actively deployed wastewater autosamplers and the corresponding 360 university buildings on the campus sewer network. Building-specific data have been de-identified in accordance with university reporting policies. B. Campus wastewater and diagnostic testing statistics over the 295-day sampling period (WW = wastewater; positivity is the fraction of WW samplers with a positive qPCR signal). C. Virus diversity in wastewater and clinical samples: boxplots of Shannon entropy (top) and richness (bottom) for each sample type.
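
The two diversity measures in panel C can be computed from a sample's lineage abundances as in the sketch below. The natural-log base and the zero detection threshold for richness are assumptions; the caption does not state either choice.

```python
import math

def richness(abundances, threshold=0.0):
    """Number of lineages with abundance above `threshold`."""
    return sum(1 for a in abundances if a > threshold)

def shannon_entropy(abundances):
    """Shannon entropy H = -sum(p * ln p) over normalized abundances."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    h = 0.0
    for a in abundances:
        if a > 0:  # lineages with zero abundance contribute nothing
            p = a / total
            h -= p * math.log(p)
    return h

# Toy sample (hypothetical): two lineages at equal abundance, one absent.
toy = [0.5, 0.5, 0.0]
print(richness(toy), shannon_entropy(toy))
```

A sample dominated by one lineage has entropy near zero, while an even mixture of many lineages has high entropy, so the two measures together separate "many lineages present" from "many lineages present at comparable levels".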
Figure 2: Sample deconvolution robustly recovers relative virus abundance.
A. Subset of the lineage-defining mutation “barcode” matrix. Each row represents one lineage (out of >1000 lineages included in the UShER global phylogenetic tree), and individual nucleotide mutations are represented as columns. B. Single nucleotide variant frequencies obtained from iVar, used for recovering the relative abundance of each lineage. C. Schematic of the spike-in validation experiment. D. Depth-weighted de-mixing estimates of the virus abundance versus expected/known abundance. Details on lineage-specific predictions are provided in Extended Data Figure 3. E. Comparison of wastewater sample deconvolution with VOC qPCR panel, with lookup table (bottom) showing amino acid mutations corresponding to each variant.
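
The idea behind panels A, B and D, recovering lineage fractions from a barcode matrix and observed mutation frequencies, can be illustrated with a toy solver. The sketch below minimizes a depth-weighted squared error over non-negative fractions that sum to one, using projected gradient descent. It is not Freyja's actual solver: the loss, the optimizer, and the names `demix` and `project_to_simplex` are assumptions chosen to keep the example self-contained, and depths are assumed to be positive.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(u, start=1):
        cumulative += u_j
        t = (cumulative - 1.0) / j
        if u_j - t > 0:  # holds for j = 1..rho, so the last hit wins
            theta = t
    return [max(x - theta, 0.0) for x in v]

def demix(barcodes, freqs, depths, n_iter=2000):
    """Estimate lineage fractions from mutation frequencies.

    barcodes[i][j] is 1 if lineage i carries mutation j, else 0.
    freqs[j] is the observed frequency of mutation j in the sample.
    depths[j] is the read depth at mutation j, used as a weight.
    """
    n_lin, n_mut = len(barcodes), len(freqs)
    max_depth = max(depths)
    w = [d / max_depth for d in depths]
    # A trace bound on the largest eigenvalue gives a safe step size.
    trace = sum(w[j] * barcodes[i][j] ** 2
                for i in range(n_lin) for j in range(n_mut))
    if trace == 0:
        raise ValueError("barcode matrix has no informative mutations")
    step = 1.0 / (2.0 * trace)
    x = [1.0 / n_lin] * n_lin  # start from an even mixture
    for _ in range(n_iter):
        resid = [sum(x[i] * barcodes[i][j] for i in range(n_lin)) - freqs[j]
                 for j in range(n_mut)]
        grad = [2.0 * sum(w[j] * barcodes[i][j] * resid[j]
                          for j in range(n_mut)) for i in range(n_lin)]
        x = project_to_simplex([x[i] - step * grad[i] for i in range(n_lin)])
    return x

# Toy barcodes (hypothetical): 3 lineages x 4 mutations, true mix 70/30/0.
toy_barcodes = [[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]
toy_freqs = [0.7, 1.0, 0.3, 0.0]
toy_depths = [100, 200, 150, 50]
print([round(f, 3) for f in demix(toy_barcodes, toy_freqs, toy_depths)])
```

Weighting by depth means that sites with few reads, whose frequencies are noisier, pull less on the estimate than well-covered sites.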
Figure 3: Freyja recovers early and cryptic transmission of SARS-CoV-2 variants of concern.
A. Timeline and normalized epidemiological curves for VOC detection in both wastewater and clinical sequences from San Diego County for the 3 major VOCs in circulation during the sampling period. Both Alpha and Delta are detected in wastewater before they appear in clinical samples. Markers for clinical detections correspond to the ceiling of the daily detection count divided by 30 (e.g. 1–30 samples = one marker, 31–60 = two markers), while wastewater markers correspond to a single detection. B. Timeline and epidemiological curves for VOC detection in the campus samples. Markers correspond to a single detection event for both clinical and wastewater surveillance. All wastewater detections correspond to an estimated VOC prevalence of at least 10%.
Figure 4: Deconvolution recovers a fine-grained estimate of virus population dynamics.
A. Prevalence of SARS-CoV-2 variants in UCSD clinical surveillance, and B. Variant prevalence in all clinical samples collected in San Diego County. C,D. Variant prevalence in wastewater at UCSD as well as the greater San Diego County (includes wastewater samples collected from Point Loma wastewater treatment plant as well as public schools in the San Diego districts). Further analysis of Point Loma wastewater samples is shown in Extended Data Figure 5. All curves show rolling average, window ±10 days. “Other” contains all lineages not designated as VOCs. Bottom panels show number of sequenced samples per day.
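
The smoothing used for these curves, a rolling average with a window of ±10 days, can be sketched as a centered moving mean. The sketch assumes one value per day with no gaps and shrinks the window at the ends of the series; the caption does not say how edges or missing days were handled, so both are assumptions.

```python
def centered_rolling_mean(values, half_window=10):
    """Mean over [i - half_window, i + half_window], truncated at the edges."""
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed

# Toy daily prevalence series (hypothetical), smoothed with a +/-1 day window.
print(centered_rolling_mean([1, 2, 3, 4, 5], half_window=1))
```

Calling the function with the default `half_window=10` corresponds to the ±10 day window named in the caption.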
Figure 5: Community wastewater enables early Omicron detection and reveals lineage dynamics.
A. Prevalence of SARS-CoV-2 VOCs in wastewater collected from the Point Loma wastewater treatment plant from late September 2021 to early February 2022. B. Estimated VOC concentrations, prevalence estimates scaled by normalized viral load in wastewater. C,D. Lineage-specific estimates of prevalence and concentration. All curves show an adaptive rolling average calculated using a local linear approximation (Savitzky-Golay filter) of virus copies/L, with window size ± 1 sampling date.
Figure 6: Wastewater identifies clinically known and unknown virus transmission.
A-C. Maximum likelihood phylogenetic trees for each of the dominant variants of concern using high-quality samples obtained at UCSD, as well as a representative set of sequences from the entire United States. Wastewater sequences from the same sampler that differ by 1 or fewer SNPs are denoted with a red asterisk. For all sequences, consensus bases were called at sites with >50% nucleotide frequency. Location information is provided for select outbreaks. D. Pairwise comparison of collection date for matching and near-matching wastewater and nasal swab samples obtained at UCSD. Positive values indicate earlier collection in nasal swabs, and negative values indicate earlier detection in wastewater.
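
The consensus rule stated above, calling a base only where it exceeds 50% nucleotide frequency, can be sketched as follows. The pileup representation (one string of observed bases per site) and the use of "N" at sites with no majority base are assumptions for illustration; the caption does not say how such sites were reported.

```python
from collections import Counter

def consensus_base(bases, min_freq=0.5, ambiguous="N"):
    """Return the majority base if its frequency exceeds `min_freq`."""
    if not bases:
        return ambiguous
    base, count = Counter(bases).most_common(1)[0]
    return base if count / len(bases) > min_freq else ambiguous

def consensus_sequence(pileup):
    """Apply the per-site rule to each column of observed bases."""
    return "".join(consensus_base(column) for column in pileup)

# Toy pileup (hypothetical): the third site is split 50/50, so no call.
print(consensus_sequence(["AAAT", "CCCC", "GGTT"]))
```

Because the threshold is strict, a site split evenly between two bases yields no call, which keeps mixed wastewater samples from contributing spurious SNP differences at ambiguous sites.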

