PLoS Pathog. 2025 Jun 9;21(6):e1012850.
doi: 10.1371/journal.ppat.1012850. eCollection 2025 Jun.

Detecting SARS-CoV-2 cryptic lineages using publicly available whole genome wastewater sequencing data

Reinier Suarez et al. PLoS Pathog.

Abstract

Beginning in early 2021, unique and highly divergent lineages of SARS-CoV-2 were sporadically found in wastewater sewersheds using a sequencing strategy focused on amplifying the most rapidly evolving region of SARS-CoV-2, the receptor binding domain (RBD). Because these RBD sequences did not match known circulating strains and their source was not known, we termed them "cryptic lineages". To date, more than 20 cryptic lineages have been identified using the RBD-focused sequencing strategy. Here, we identified and characterized additional cryptic lineages from SARS-CoV-2 wastewater sequences submitted to NCBI's Sequence Read Archive (SRA). Wastewater sequence datasets were screened for individual sequence reads that contained combinations of mutations frequently found in cryptic lineages but not in contemporary circulating lineages. Using this method, we identified 18 cryptic lineages that appeared in multiple (2-81) samples from the same sewershed, including 12 that were not previously reported. Partial consensus sequences were generated for each cryptic lineage by extracting and mapping sequences containing cryptic-specific mutations. Surprisingly, seven of the mutations that appeared convergently in cryptic lineages were reversions to sequences that were highly conserved in SARS-CoV-2-related enteric bat Sarbecoviruses. The apparent reversion to bat Sarbecovirus sequences is consistent with the notion that SARS-CoV-2 adaptation to replicate efficiently in respiratory tissues preceded the COVID-19 pandemic.
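The read-level screen described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `screen_reads`, the `min_hits` threshold, the representation of a read as a set of mutation labels, and the toy data (including the hypothetical `lineage_A`) are all assumptions made for illustration. A read is flagged when it carries a combination of watchlist mutations that no circulating lineage fully explains.

```python
def screen_reads(reads, watchlist, circulating, min_hits=2):
    """Return IDs of reads carrying a cryptic-like combination of mutations.

    reads:       dict of read ID -> set of mutation labels seen on that read
    watchlist:   frozenset of mutations frequently found in cryptic lineages
    circulating: dict of lineage name -> frozenset of its defining mutations
    min_hits:    assumed minimum number of watchlist mutations per read
    """
    flagged = []
    for read_id, muts in reads.items():
        hits = frozenset(muts) & watchlist
        if len(hits) < min_hits:
            continue  # a single mutation is not a combination
        # Skip reads whose combination is fully explained by a circulating lineage.
        if any(hits <= lineage for lineage in circulating.values()):
            continue
        flagged.append(read_id)
    return flagged


# Toy data, invented for illustration only.
watchlist = frozenset({"N460K", "F486P", "P499T"})
circulating = {"lineage_A": frozenset({"N460K", "F486P"})}  # hypothetical lineage
reads = {
    "r1": {"N460K", "P499T"},  # combination absent from lineage_A
    "r2": {"N460K"},           # only one watchlist mutation
    "r3": {"N460K", "F486P"},  # explained by lineage_A
}
flagged = screen_reads(reads, watchlist, circulating)
```

In this toy case only `r1` is flagged. A real screen would also need read alignment and variant calling upstream, which this sketch leaves out.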

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Schematic of workflow.
Samples from sewershed facilities containing cryptic lineages (yellow) were compared against samples from neighboring sewersheds that did not contain cryptic lineages (orange). A) Using the CH-1 cryptic lineage as an example, mutations found in at least two cryptic samples and 50-fold more prevalent in the cryptic samples are tentatively considered cryptic-specific (green). B) The sequence reads containing cryptic-specific mutations (red box) were mapped onto the SARS-CoV-2 genome, with varying coverage across the genome, to create a consensus sequence (middle genome). To be mapped onto the genome, a cryptic-specific sequence must appear in two or more samples.
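The selection rule in panel A (present in at least two cryptic samples, 50-fold more prevalent than in neighboring samples) can be sketched in Python. This is an illustration under stated assumptions, not the authors' code: prevalence is taken here to be the mean fraction of reads carrying a mutation, the `floor` pseudo-prevalence that avoids dividing by zero is invented, and the sample values are toy numbers.

```python
def cryptic_specific(cryptic_samples, control_samples,
                     min_samples=2, fold=50.0, floor=1e-4):
    """Return mutations tentatively considered cryptic-specific.

    Each sample is a dict of mutation label -> fraction of reads carrying it.
    floor is an assumed minimum control prevalence, so that mutations never
    seen in control samples can still be compared.
    """
    mutations = set().union(*cryptic_samples)
    selected = []
    for mut in sorted(mutations):
        # Rule 1: the mutation must appear in at least min_samples cryptic samples.
        n_present = sum(1 for s in cryptic_samples if s.get(mut, 0.0) > 0)
        if n_present < min_samples:
            continue
        # Rule 2: mean prevalence must be at least fold times the control mean.
        cryptic_mean = sum(s.get(mut, 0.0) for s in cryptic_samples) / len(cryptic_samples)
        control_mean = (
            sum(s.get(mut, 0.0) for s in control_samples) / len(control_samples)
            if control_samples else 0.0
        )
        if cryptic_mean >= fold * max(control_mean, floor):
            selected.append(mut)
    return selected


# Toy prevalence values, invented for illustration only.
cryptic = [
    {"N460K": 0.30, "D614G": 0.99, "P499T": 0.20},
    {"N460K": 0.25, "D614G": 0.98},
]
control = [
    {"D614G": 0.99},
    {"D614G": 0.97, "N460K": 0.001},
]
specific = cryptic_specific(cryptic, control)
```

Here `N460K` passes both rules, `D614G` fails the fold test because it is equally common in control samples, and `P499T` fails because it appears in only one cryptic sample.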
Fig 2. The phylogenetic tree generated by Nextclade illustrates the diversity of the cryptic lineages.
The consensus sequences were uploaded to Nextclade and compared against the Wuhan-Hu-1/2019 reference (MN908947). The phylogenetic tree highlights the diversity among the cryptic lineages detected.
Fig 3. Convergent cryptic changes found in ≥3 lineages.
A) Convergent mutations that appeared in at least three cryptic lineages were mapped onto the spike protein based on their location and prevalence across all the cryptic lineages. B) Convergent non-Spike mutations mapped against the SARS-CoV-2 genome. Positions with multiple distinct mutations are represented as stacked, color-coded bars.
Fig 4. Chart of SARS-CoV-2 amino acids that deviate from the consensus Sarbecovirus amino acid sequence.
Consensus amino acids found across seven bat Sarbecoviruses (orange) that are different in SARS-CoV-2 (yellow). Amino acid positions where the observed change differs from both the Sarbecovirus consensus and SARS-CoV-2 are highlighted in blue. The frequency of patient sequences, reported by CoV-Spectrum [21], reverting to the Sarbecovirus consensus by October 2023 or November 2024 is shown. The independent occurrence calculated from Bloom and Neher’s calculator [26] for each mutation over the same time periods is shown, including its fitness score and effect. Mutations that did not appear in Bloom and Neher’s calculator were designated as not determined (ND).
Fig 5. Insertion sequences were mainly derived from duplications.
Insertion sites were mapped onto the SARS-CoV-2 genome to visually represent where the duplicated sequence (red) occurred and where the insertion was detected with respect to the cryptic lineage.
Fig 6. Ohio’s cryptic-specific RBD mutations over time.
Both locations shared highly similar mutation profiles in the RBD, with distinct mutations appearing in both locations around the same time (N460K, F486P, and P499T). Crossed-out cells signify areas of low or no coverage.

References

    1. Bade R, Nadarajan D, Driver EM, Halden RU, Gerber C, Krotulski A. Wastewater-based monitoring of the nitazene analogues: first detection of protonitazene in wastewater. Sci Total Environ. 2024;920:170781.
    2. Barber C, Crank K, Papp K, Innes GK, Schmitz BW, Chavez J. Community-scale wastewater surveillance of Candida auris during an ongoing outbreak in southern Nevada. Environ Sci Tech. 2023;57(4):1755–63.
    3. Corrin T, Rabeenthira P, Young KM, Mathiyalagan G, Baumeister A, Pussegoda K. A scoping review of human pathogens detected in untreated human wastewater and sludge. J Water Health. 2024. doi: jwh2024326
    4. Wurtzer S, Waldman P, Levert M, Cluzel N, Almayrac JL, Charpentier C. SARS-CoV-2 genome quantification in wastewaters at regional and city scale allows precise monitoring of the whole outbreaks dynamics and variants spreading in the population. Sci Total Environ. 2022;810:152213.
    5. Smyth DS, Trujillo M, Gregory DA, Cheung K, Gao A, Graham M, et al. Tracking cryptic SARS-CoV-2 lineages detected in NYC wastewater. Nat Commun. 2022;13(1):635. doi: 10.1038/s41467-022-28246-3