Virus Evol. 2023 Aug 24;9(2):vead050.
doi: 10.1093/ve/vead050. eCollection 2023.

Association between SARS-CoV-2 and metagenomic content of samples from the Huanan Seafood Market

Jesse D Bloom. Virus Evol. 2023.

Abstract

The role of the Huanan Seafood Market in the early severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) outbreak remains unclear. Recently, the Chinese Centers for Disease Control (CDC) released data from deep sequencing of environmental samples collected from the market after it was closed on 1 January 2020. Prior to this release, Crits-Christoph et al. analyzed data from a subset of the samples. Both that study and the Chinese CDC study concurred that the samples contained genetic material from a variety of species, including some, like raccoon dogs, that are susceptible to SARS-CoV-2. However, neither study systematically analyzed the relationship between the amount of genetic material from SARS-CoV-2 and different animal species. Here I implement a fully reproducible computational pipeline that jointly analyzes the number of reads mapping to SARS-CoV-2 and the mitochondrial genomes of chordate species across the full set of samples. I validate the presence of genetic material from numerous species and calculate mammalian mitochondrial compositions similar to those reported by Crits-Christoph et al. However, the SARS-CoV-2 content of the environmental samples is generally very low: only 21 of 176 samples contain more than ten SARS-CoV-2 reads, despite most samples being sequenced to depths exceeding 10^8 total reads. None of the samples with double-digit numbers of SARS-CoV-2 reads have a substantial fraction of their mitochondrial material from any non-human susceptible species. Only one of the fourteen samples with at least a fifth of the chordate mitochondrial material from raccoon dogs contains any SARS-CoV-2 reads, and that sample only has 1 of ~200,000,000 reads mapping to SARS-CoV-2. Instead, SARS-CoV-2 reads are most correlated with reads mapping to various fish, such as catfish and largemouth bass. These results suggest that while metagenomic analysis of the environmental samples is useful for identifying animals or animal products sold at the market, co-mingling of animal and viral genetic material is unlikely to reliably indicate whether any animals were infected by SARS-CoV-2.

Keywords: COVID-19 origins; Wuhan; lab leak; zoonosis.

Conflict of interest statement

J.D.B. is on the scientific advisory boards of Apriori Bio, Aerium Therapeutics, Invivyd, and the Vaccine Company; consults for GSK; and receives royalty payments as an inventor on Fred Hutch licensed patents related to deep mutational scanning of viral proteins.

Figures

Figure 1.
Metagenomic composition of sample Q61. (A) Composition as determined by aligning reads to mitochondrial genomes. From left to right: composition determined in the current study across all chordates, composition determined in the current study across all mammals, composition reported in Crits-Christoph et al. (2023) for just mammals shown in the first figure of their report, and composition reported in Crits-Christoph et al. (2023) across all mammals in the third supplementary table of their report. (B) Composition determined in the current study by aligning assembled contigs to the four indicated genomes; this composition is similar to that reported in the third figure of Crits-Christoph et al. (2023). See https://jbloom.github.io/Huanan_market_samples/mito_composition.html for an interactive version of (A) that allows similar pie charts to be viewed for any sample. See https://jbloom.github.io/Huanan_market_samples/genomic_contig_composition.html for an interactive version of (B).
Figure 2.
The percentage of high-quality reads that align to SARS-CoV-2 for each sample. Points are colored according to the date that the sample was collected. Note that the x-axis uses a symlog scale. See https://jbloom.github.io/Huanan_market_samples/sars2_aligned_vertical.html for an interactive version of this plot, where you can mouse over points for details, including the mitochondrial composition of each sample, and select only samples from specific dates or locations.
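As a rough illustration of why a symlog axis suits these data, the sketch below converts a read count to a percentage and applies one common symlog definition, sign(x) * log(1 + |x|/c). The function names and the constant `c` are assumptions made for this example; the scale parameters used for the actual plot are not stated here. Unlike a plain log scale, this transform maps a sample with zero SARS-CoV-2 reads to a finite position.

```python
import math


def percent_of_reads(n_aligned, n_total):
    """Percentage of reads that align to a target (hypothetical helper)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_aligned / n_total


def symlog(x, c=1.0):
    """One common symlog definition: sign(x) * log(1 + |x| / c).

    The constant c sets where the scale switches from roughly linear
    to roughly logarithmic; its value here is an assumption.
    """
    return math.copysign(math.log1p(abs(x) / c), x)


# Illustrative only: 1 read out of ~200,000,000, as quoted in the abstract.
pct = percent_of_reads(1, 200_000_000)   # 5e-07 percent
print(pct, symlog(pct, c=1e-6))

# A sample with no SARS-CoV-2 reads still gets a finite axis position.
print(symlog(0.0, c=1e-6))
```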
Figure 3.
The correlation between the percentage of all reads mapping to SARS-CoV-2 and the mitochondrial genome of each of the indicated species. Each point represents a different environmental sample, and the text in the upper left of each panel shows the Pearson correlation. The scales are log10, and values of zero (which cannot be plotted on a log scale) are shown as half the minimum non-zero value observed across all samples. See https://jbloom.github.io/Huanan_market_samples/per_species_corr_faceted.html for an interactive version of this plot that enables mouseover of points for sample details, selection of only samples collected on specific dates or containing at least one SARS-CoV-2 read, adjustment of scales from log to linear, and adjustment of mitochondrial percent to be of reads mapping to any mitochondria rather than of all reads. See https://jbloom.github.io/Huanan_market_samples/per_species_corr_single.html for similar plots for individual species. The plots shown here include only samples with at least 200 aligned mitochondrial reads; that option can be adjusted in the interactive plots.
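The zero-handling and correlation described in this caption can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the helper names are hypothetical, the floor is assumed to be computed separately for each variable, and the input values are made up. The caption states that zeros are displayed at half the minimum non-zero value; using the same substitution when computing the correlation is an assumption of this sketch.

```python
import math


def log10_with_zero_floor(values):
    """log10-transform values, replacing zeros with half the smallest
    non-zero value so they can be placed on a log scale."""
    nonzero = [v for v in values if v > 0]
    if not nonzero:
        raise ValueError("need at least one non-zero value")
    floor = min(nonzero) / 2
    return [math.log10(v if v > 0 else floor) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up percentages for four samples; NOT data from the study.
sars2_pct = [0.0, 1e-6, 5e-6, 2e-5]
species_pct = [1e-3, 0.0, 4e-3, 8e-3]

r = pearson(log10_with_zero_floor(sars2_pct),
            log10_with_zero_floor(species_pct))
print(f"Pearson r on log10 scale: {r:.3f}")
```

Because the substituted floor depends on the smallest non-zero value in the data, the resulting correlation can shift when samples are added or filtered, which is one reason to also inspect the linear-scale option mentioned in the caption.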
Figure 4.
Correlations between the SARS-CoV-2 content and the mitochondrial content for all species. The top row shows just samples containing at least one SARS-CoV-2 read and the bottom row shows samples regardless of the SARS-CoV-2 content; the left shows samples from all sampling dates and the right shows just samples from the 12 January 2020 date when most of the wildlife sampling occurred. This plot is designed to mimic the fourth figure of Liu et al. (2022). Some key species are labeled; see the interactive version at https://jbloom.github.io/Huanan_market_samples/overall_corr.html to mouseover all points for details, select different subsets of samples, and calculate the correlations on a linear or log scale. See Figure S4 for a version of this plot that uses correlations based on a Theil–Sen estimator that is more robust to outliers.
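For readers unfamiliar with the Theil–Sen estimator mentioned above, the sketch below shows its core idea: the slope is taken as the median of the slopes between all pairs of points, so a single extreme sample has little influence. The function name and the toy data are assumptions for illustration, and how the study converts this estimator into the correlation values shown in Figure S4 is not described here.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(x, y):
    """Median of pairwise slopes (Theil-Sen slope estimate).

    Pairs with identical x values are skipped because their slope
    is undefined.
    """
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(zip(x, y), 2)
        if x2 != x1
    ]
    if not slopes:
        raise ValueError("need at least two points with distinct x values")
    return median(slopes)


# Toy data: four points on the line y = x plus one large outlier.
x = [0, 1, 2, 3, 4]
y = [0, 1, 2, 3, 100]
print(theil_sen_slope(x, y))  # -> 1.0
```

On the same toy data an ordinary least-squares fit would be pulled strongly toward the outlier, whereas the median of pairwise slopes stays at the slope of the four consistent points.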
