mSphere. 2020 May 6;5(3):e00160-20.
doi: 10.1128/mSphere.00160-20.

An Extensive Meta-Metagenomic Search Identifies SARS-CoV-2-Homologous Sequences in Pangolin Lung Viromes

Lamia Wahba et al. mSphere. 2020.

Abstract

In numerous instances, tracking the biological significance of a nucleic acid sequence can be augmented through the identification of environmental niches in which the sequence of interest is present. Many metagenomic data sets are now available, with deep sequencing of samples from diverse biological niches. While any individual metagenomic data set can be readily queried using web-based tools, meta-searches through all such data sets are less accessible. In this brief communication, we demonstrate such a meta-metagenomic approach, examining close matches to the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in all high-throughput sequencing data sets in the NCBI Sequence Read Archive accessible with the "virome" keyword. In addition to the homology to bat coronaviruses observed in descriptions of the SARS-CoV-2 sequence (F. Wu, S. Zhao, B. Yu, Y. M. Chen, et al., Nature 579:265-269, 2020, https://doi.org/10.1038/s41586-020-2008-3; P. Zhou, X. L. Yang, X. G. Wang, B. Hu, et al., Nature 579:270-273, 2020, https://doi.org/10.1038/s41586-020-2012-7), we note a strong homology to numerous sequence reads in metavirome data sets generated from the lungs of deceased pangolins reported by Liu et al. (P. Liu, W. Chen, and J. P. Chen, Viruses 11:979, 2019, https://doi.org/10.3390/v11110979). While analysis of these reads indicates the presence of a similar viral sequence in pangolin lung, the similarity is not sufficient to either confirm or rule out a role for pangolins as an intermediate host in the recent emergence of SARS-CoV-2. In addition to the implications for SARS-CoV-2 emergence, this study illustrates the utility and limitations of meta-metagenomic search tools in effective and rapid characterization of potentially significant nucleic acid sequences.

IMPORTANCE Meta-metagenomic searches allow for high-speed, low-cost identification of potentially significant biological niches for sequences of interest.
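The idea of scanning many read sets for close matches to one query sequence can be illustrated with a toy exact k-mer screen. This is only a sketch under stated assumptions and is not the authors' pipeline: the function names, the choice of exact k-mer sharing as the match criterion, and the value of k are all hypothetical, and a real search over Sequence Read Archive data would need download, decompression, and far more efficient indexing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    table = str.maketrans("ACGTN", "TGCAN")
    return seq.translate(table)[::-1]


def kmers(seq: str, k: int) -> set:
    """Collect every length-k substring of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def screen_reads(reference: str, reads: list, k: int) -> list:
    """Return the reads that share at least one exact k-mer with the
    reference on either strand. k is an illustrative parameter."""
    index = kmers(reference, k) | kmers(reverse_complement(reference), k)
    hits = []
    for read in reads:
        # Reads shorter than k yield an empty range and are never reported.
        if any(read[i:i + k] in index for i in range(len(read) - k + 1)):
            hits.append(read)
    return hits
```

Running `screen_reads` over each data set in turn and recording which data sets return hits is one simple way to picture a meta-search; reads flagged this way would still need proper alignment before any similarity claim is made.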

Keywords: COVID; SARS-nCoV-2; bioinformatics; coronavirus; metagenomics; pangolin.

Figures

FIG 1
(a) Integrative Genomics Viewer (IGV) snapshot of alignment. Reads from the pangolin lung virome samples (SRA accession no. SRR10168377, SRR10168378, and SRR10168376) were mapped to a SARS-CoV-2 reference sequence (GenBank accession no. MN908947.3). The total numbers of aligned reads from the three samples were 1,107, 313, and 32 reads, respectively. Figure S1 in the supplemental material shows an enlarged view for these alignments within the spike RBD region. (b) Quantification of nucleotide-level similarity between the SARS-CoV-2 genome and pangolin lung metavirome reads aligning to the SARS-CoV-2 genome. Average similarity was calculated in 101-nucleotide windows along the SARS-CoV-2 genome and is shown only for those windows where each nucleotide in the window had coverage of ≥2. Average nucleotide similarity calculated (in 101-nucleotide windows) between the SARS-CoV-2 genome and reference genomes of three relevant bat coronaviruses (bat-SL-CoVZC45 [accession no. MG772933.1], bat-SL-CoVZXC21 [accession no. MG772934.1], and RaTG13 [accession no. MN996532.1]) is also shown. Note that the pangolin metavirome similarity trace is not directly comparable to the bat coronavirus similarity traces, because the former uses read data for calculation, whereas the latter uses reference genomes.
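The windowed similarity described for panel (b) can be sketched as follows. The caption does not say exactly how per-position similarity was scored, so this sketch assumes it is the fraction of covering read bases that match the reference, averaged across the window. It also assumes gapless alignments given as hypothetical (start, sequence) pairs with 0-based starts. The function names are illustrative; the defaults for window size and minimum coverage follow the caption (101 nucleotides, coverage of at least 2).

```python
from typing import List, Optional, Tuple


def per_base_counts(reference: str,
                    aligned_reads: List[Tuple[int, str]]) -> Tuple[List[int], List[int]]:
    """Count, for each reference position, how many read bases cover it
    and how many of those bases match the reference."""
    n = len(reference)
    matches = [0] * n
    coverage = [0] * n
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < n:
                coverage[pos] += 1
                if base == reference[pos]:
                    matches[pos] += 1
    return matches, coverage


def windowed_similarity(reference: str,
                        aligned_reads: List[Tuple[int, str]],
                        window: int = 101,
                        min_cov: int = 2) -> List[Optional[float]]:
    """Average per-position match fraction in each sliding window.
    Windows containing any position with coverage below min_cov give None."""
    matches, coverage = per_base_counts(reference, aligned_reads)
    result: List[Optional[float]] = []
    for left in range(len(reference) - window + 1):
        cov = coverage[left:left + window]
        if min(cov) < min_cov:
            result.append(None)  # not enough read support to report
            continue
        fractions = [m / c for m, c in zip(matches[left:left + window], cov)]
        result.append(sum(fractions) / window)
    return result
```

Returning None for under-covered windows mirrors the caption's rule that similarity is shown only where every nucleotide in the window has coverage of at least 2, which is why a read-based trace has gaps that a genome-to-genome comparison does not.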

References

    1. Bry L, Falk PG, Midtvedt T, Gordon JI. 1996. A model of host-microbial interactions in an open mammalian ecosystem. Science 273:1380–1383. doi:10.1126/science.273.5280.1380.
    2. Bowman JS. 2018. Identification of microbial dark matter in Antarctic environments. Front Microbiol 9:3165. doi:10.3389/fmicb.2018.03165.
    3. Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, Hu Y, Tao ZW, Tian JH, Pei YY, Yuan ML, Zhang YL, Dai FH, Liu Y, Wang QM, Zheng JJ, Xu L, Holmes EC, Zhang YZ. 2020. A new coronavirus associated with human respiratory disease in China. Nature 579:265–269. doi:10.1038/s41586-020-2008-3.
    4. Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, Si HR, Zhu Y, Li B, Huang CL, Chen HD, Chen J, Luo Y, Guo H, Jiang RD, Liu MQ, Chen Y, Shen XR, Wang X, Zheng XS, Zhao K, Chen QJ, Deng F, Liu LL, Yan B, Zhan FX, Wang YY, Xiao GF, Shi ZL. 2020. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579:270–273. doi:10.1038/s41586-020-2012-7.
    5. Edgar RC. 2020. URMAP, an ultra-fast read mapper. bioRxiv doi:10.1101/2020.01.12.903351.

Publication types