Sci Total Environ. 2023 Jun 10:876:162572.
doi: 10.1016/j.scitotenv.2023.162572. Epub 2023 Mar 4.

Intensity of sample processing methods impacts wastewater SARS-CoV-2 whole genome amplicon sequencing outcomes

Shuchen Feng et al. Sci Total Environ. 2023.

Abstract

Wastewater SARS-CoV-2 surveillance has been deployed since the beginning of the COVID-19 pandemic to monitor the dynamics of virus burden in local communities. Genomic surveillance of SARS-CoV-2 in wastewater, particularly whole genome sequencing for variant tracking and identification, is still challenging due to low target concentration, complex microbial and chemical background, and a lack of robust experimental procedures for nucleic acid recovery. These sample limitations are intrinsic to wastewater and thus unavoidable. Here, we use a statistical approach that couples correlation analyses with a random forest-based machine learning algorithm to evaluate potentially important factors associated with wastewater SARS-CoV-2 whole genome amplicon sequencing outcomes, with a specific focus on the breadth of genome coverage. We collected 182 composite and grab wastewater samples from the Chicago area between November 2020 and October 2021. Samples were processed using a mixture of processing methods reflecting different homogenization intensities (HA + Zymo beads, HA + glass beads, and Nanotrap) and were sequenced using one of two library preparation kits (the Illumina COVIDseq kit and the QIAseq DIRECT kit). Technical factors evaluated using the statistical and machine learning approaches included sample type, certain intrinsic sample features, and processing and sequencing methods. The results suggested that the sample processing method could be a predominant factor affecting sequencing outcomes, while the library preparation kit was considered a minor factor. A synthetic SARS-CoV-2 RNA spike-in experiment was performed to validate the impact of processing methods and suggested that the intensity of the processing method could lead to different RNA fragmentation patterns, which could also explain the observed inconsistency between qPCR quantification and sequencing outcomes. Overall, extra attention should be paid to wastewater sample processing (i.e., concentration and homogenization) to obtain sufficient, good-quality SARS-CoV-2 RNA for downstream sequencing.

Keywords: Amplicon sequencing; Illumina COVIDseq; QIAseq DIRECT; RNA fragmentation; Sample processing methods; Wastewater SARS-CoV-2.
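
The breadth of genome coverage that the abstract focuses on can be illustrated with a minimal sketch. It assumes that breadth is the percentage of reference positions covered by at least a chosen number of reads. The function name `breadth_of_coverage`, the `min_depth` threshold of 10, and the toy depth values are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def breadth_of_coverage(depths: Sequence[int], min_depth: int = 1) -> float:
    """Return the percentage of positions with read depth >= min_depth.

    `min_depth` is an assumed threshold; the abstract does not state which
    depth cutoff the authors used.
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Toy 10-position "genome"; real input would hold one depth per reference base.
toy_depths = [0, 0, 12, 30, 45, 8, 0, 22, 19, 3]
breadth = breadth_of_coverage(toy_depths, min_depth=10)

print(f"breadth of coverage: {breadth:.1f} %")
# Compare against the 80 % reference line drawn in Fig. 1.
print("meets 80 % line:", breadth >= 80.0)
```

In practice the per-position depths would come from a read-alignment depth report rather than a hand-written list.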

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Fig. 1
Comparison of N1 gene copy numbers via RT-qPCR and SARS-CoV-2 genome breadth of coverage via sequencing in composite and grab sewage samples, using the QIAseq DIRECT and Illumina COVIDseq kits. The x-axis represents the N1 concentration in gene copies per liter of sewage sample (cp/L), and the y-axis represents the breadth of coverage observed. The dashed line represents a genome breadth of coverage of 80 %. The dot colors indicate the sample processing methods, the sizes represent the sequencing depth of each sample, and the shapes indicate the library preparation kit used for sequencing. Grab samples with no detection of SARS-CoV-2 are marked as not detected (ND) on the x-axis.
Fig. 2
Statistical assessment of the impact of sampling and processing variables on genome breadth of coverage using all composite samples with available metadata. The “Shadow Min”, “Shadow Mean” and “Shadow Max” values indicate the minimum, mean, and maximum Z score of a shadow feature as determined by Boruta, respectively (blue boxes). Features with an importance metric that exceeds the Shadow Max value are considered important (green boxes), and features with importance below the Shadow Max value are deemed not important (red boxes). In this analysis, only the processing method (i.e., concentration and homogenization) and the library preparation kit are considered important by Boruta.
Fig. 3
RT-qPCR results of Twist positive control, Twist-spiked-in and no-spiked-in samples. The y-axis shows the SARS-CoV-2 concentration in log10-transformed N1 copies per processed sample volume, or per extraction. Processing methods are indicated on the x-axis. The squares indicate positive controls of Twist spike-in, including positive control (Nanotrap or HA no filtration) and positive control filtered (HA filtration with water). Triangles represent samples with or without Twist spike-in. The boxplot whiskers show the 25th, 50th and 75th percentiles of each group. The white diamond shows the mean value of all data points in each group. The width of the violin plots is proportional to the estimated density of the observed N1 values in each group, with the density curves plotted symmetrically to the left and right of the boxplot.
Fig. 4
Sequencing results of Twist positive control, no-spiked-in and Twist-spiked-in samples. The y-axis represents the percentage of the genome covered. The x-axis represents the processing methods. The squares indicate positive controls of Twist, including positive control (Nanotrap or HA no filtration) and positive control filtered (HA filtration with water). Triangles show samples with or without Twist spike-in. The boxplot whiskers represent the 25th, 50th and 75th percentiles of each group. The white diamond shows the mean value of all data points in each group. The width of the violin plots is proportional to the estimated density of the observed breadth of coverage values in each group, with the density curves plotted symmetrically to the left and right of the boxplot.
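
The shadow-feature rule described in the Fig. 2 caption (a feature counts as important only if its importance exceeds the Shadow Max) can be sketched as follows. This is a simplified, single-pass illustration and not the authors' analysis. Boruta itself uses random forest importance Z scores over repeated iterations with a statistical test, whereas this sketch substitutes the absolute Pearson correlation as a stand-in importance measure. The names `pearson` and `shadow_screen` and the synthetic data are hypothetical.

```python
import random
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def shadow_screen(
    features: Dict[str, List[float]], target: Sequence[float], seed: int = 0
) -> Dict[str, bool]:
    """Flag features whose importance exceeds the best shadow feature.

    Importance here is |Pearson r| (an assumed stand-in for the random
    forest importance that Boruta uses).
    """
    rng = random.Random(seed)
    shadow_scores = []
    for values in features.values():
        shuffled = values[:]      # shadow feature: same values,
        rng.shuffle(shuffled)     # but any link to the target is broken
        shadow_scores.append(abs(pearson(shuffled, target)))
    shadow_max = max(shadow_scores)
    return {
        name: abs(pearson(values, target)) > shadow_max
        for name, values in features.items()
    }


# Synthetic data: "signal" is a linear function of the target, "noise" is unrelated.
data_rng = random.Random(42)
coverage = [data_rng.uniform(0.0, 100.0) for _ in range(40)]
features = {
    "signal": [2.0 * c + 1.0 for c in coverage],
    "noise": [data_rng.uniform(0.0, 1.0) for _ in range(40)],
}

decisions = shadow_screen(features, coverage)
print(decisions)
```

Shuffling a column keeps its value distribution but removes any relationship with the outcome, so the best-scoring shadow gives an estimate of how much importance can arise by chance alone.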
