Transbound Emerg Dis. 2020 Jul;67(4):1453-1462.
doi: 10.1111/tbed.13588. Epub 2020 May 25.

Supporting pandemic response using genomics and bioinformatics: A case study on the emergent SARS-CoV-2 outbreak

Denis C Bauer et al. Transbound Emerg Dis. 2020 Jul.

Abstract

Pre-clinical responses to fast-moving infectious disease outbreaks heavily depend on choosing the best isolates for animal models that inform diagnostics, vaccines and treatments. Current approaches are driven by practical considerations (e.g. first available virus isolate) rather than a detailed analysis of the characteristics of the virus strain chosen, which can lead to animal models that are not representative of the circulating or emerging clusters. Here, we suggest a combination of epidemiological, experimental and bioinformatic considerations when choosing virus strains for animal model generation. We discuss the currently chosen SARS-CoV-2 strains for international coronavirus disease (COVID-19) models in the context of their phylogeny as well as in a novel alignment-free bioinformatic approach. Unlike phylogenetic trees, which focus on individual shared mutations, this new approach assesses genome-wide co-developing functionalities and hence offers a more fluid view of the 'cloud of variances' that RNA viruses are prone to accumulate. This joint approach concludes that while the current animal models cover the existing viral strains adequately, there is substantial evolutionary activity that is likely not considered by the current models. Based on insights from the non-discrete alignment-free approach and experimental observations, we suggest isolates for future animal models.
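As a rough illustration of what an alignment-free comparison can look like, the sketch below turns each genome into a vector of k-mer frequencies (a "genomic signature") and compares isolates by the distance between those vectors, with no sequence alignment involved. This is a minimal sketch under stated assumptions, not the method used in the paper: the choice of k = 3, the Euclidean distance, the function names `kmer_signature` and `euclidean`, and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product
from math import sqrt


def kmer_signature(seq, k=3):
    """Return the normalised k-mer frequency vector of a nucleotide sequence.

    Assumption: k-mers containing anything other than A, C, G, T are ignored
    (U is mapped to T so RNA and cDNA sequences are comparable).
    """
    seq = seq.upper().replace("U", "T")
    # Fixed ordering of all 4**k possible k-mers, so vectors are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def euclidean(a, b):
    """Euclidean distance between two equal-length signature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy sequences; real genomes are roughly 30,000 nucleotides long.
toy_isolates = {
    "isolate_A": "ATGGCGTACGTTAGCATGGCGTACG",
    "isolate_B": "ATGGCGTACGTTCGCATGGCGTACG",  # one substitution relative to A
    "isolate_C": "TTTTAAAACCCCGGGGTTTTAAAAC",  # very different composition
}

signatures = {name: kmer_signature(s) for name, s in toy_isolates.items()}

dist_ab = euclidean(signatures["isolate_A"], signatures["isolate_B"])
dist_ac = euclidean(signatures["isolate_A"], signatures["isolate_C"])
print(f"A vs B: {dist_ab:.3f}")
print(f"A vs C: {dist_ac:.3f}")
```

In this toy setting the single-substitution isolate stays close to its parent while the compositionally different sequence lies far away. Signature vectors of this kind could then be projected into two dimensions (for example with PCA) to inspect clusters visually.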

Keywords: COVID-19; PHEIC; alignment-free phylogeny; bioinformatics; genomics; viral evolution.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

FIGURE 1
Illustration of coronavirus spread while it accumulates mutations. The dark blue arrows represent the main volume of transmissions, while the nucleic acid symbol illustrates mutations acquired by the different viral strains as they enter humans from a primary/reservoir host (represented by the bat symbol) through an intermediate host (which is yet to be identified for SARS‐CoV‐2). The first human SARS‐CoV‐2 isolate sequenced (with orange and pink mutation) may not have been the original strain that first infected humans (grey). It is possible that a strain sequenced later (green) may be genetically closer to the original strain. In this scenario, the original strain has not been captured through sequencing at all. It also shows that there may be two currently circulating strains (orange‐pink‐purple and orange‐pink‐brown), which in turn might be different from the most virulent one (orange‐pink‐blue). In the absence of clinical data correlated with SARS‐CoV‐2 genome isolates, bioinformatic analysis (represented by the computer symbol) can identify clusters and consensus sequences to investigate the genetic diversity of the emerging SARS‐CoV‐2 strains
FIGURE 2
Phylogenetic tree highlighting isolates of interest, with branch points of the six clusters labelled to indicate mature (orange) and emerging (yellow) disease clusters (the full list of identical sequences for these branch points is in Table S1 and the complete image is in Figure S5)
FIGURE 3
PCA plots showing the genomic signatures of different coronavirus sequences. Each point represents the genomic signature for an isolate. Inset: Comparison of genomic signatures across different strains of coronavirus. Numbers correspond to the number of isolates at each location. Overall, the genomic signatures for isolates of different coronavirus strains were relatively far apart. Main image: Zoomed‐in PCA plot of the cluster of SARS‐CoV‐2 isolates, showing the overall genomic signatures of the different strains
FIGURE 4
Identification of potential viral strains for animal models. Phylogenetic methods (a) show that current animal models (highlighted in green) cover the major clusters (C1‐3) but may not capture the emerging clusters. A K‐mer based analysis (b) is able to suggest alternative strains that cover all emerging clusters (C4‐6). The inset shows the wider region with the main image extent marked by a rectangle
