[Preprint]. 2022 Jun 28:rs.3.rs-1723829.
doi: 10.21203/rs.3.rs-1723829/v1.

Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations

Karthik Gangavarapu et al. Res Sq. 2022.

Update in

  • Outbreak.info genomic reports: scalable and dynamic surveillance of SARS-CoV-2 variants and mutations.
    Gangavarapu K, Latif AA, Mullen JL, Alkuzweny M, Hufbauer E, Tsueng G, Haag E, Zeller M, Aceves CM, Zaiets K, Cano M, Zhou X, Qian Z, Sattler R, Matteson NL, Levy JI, Lee RTC, Freitas L, Maurer-Stroh S; GISAID Core and Curation Team; Suchard MA, Wu C, Su AI, Andersen KG, Hughes LD. Nat Methods. 2023 Apr;20(4):512-522. doi: 10.1038/s41592-023-01769-3. Epub 2023 Feb 23. PMID: 36823332. Free PMC article.

Abstract

The emergence of SARS-CoV-2 variants of concern has prompted the need for near real-time genomic surveillance to inform public health interventions. In response to this need, the global scientific community, through unprecedented effort, has sequenced and shared over 11 million genomes through GISAID, as of May 2022. This extraordinarily high sampling rate provides a unique opportunity to track the evolution of the virus in near real-time. Here, we present outbreak.info, a platform that currently tracks over 40 million combinations of PANGO lineages and individual mutations, across over 7,000 locations, to provide insights for researchers, public health officials, and the general public. We describe the interpretable and opinionated visualizations in the variant- and location-focused reports available in our web application, the pipelines that enable the scalable ingestion of heterogeneous sources of SARS-CoV-2 variant data, and the server infrastructure that enables widespread data dissemination via a high-performance API that can be accessed using an R package. We present a case study that illustrates how outbreak.info can be used for genomic surveillance and as a hypothesis generation tool to understand the ongoing pandemic at varying geographic and temporal scales. With an emphasis on scalability, interactivity, interpretability, and reusability, outbreak.info provides a template to enable genomic surveillance at global and localized scales.

Conflict of interest statement

MAS receives grants from the US National Institutes of Health within the scope of this work, and grants and contracts from the US Food & Drug Administration, the US Department of Veterans Affairs and Janssen Research & Development outside the scope of this work. MAS and KGA have received consulting fees and/or compensated expert testimony on SARS-CoV-2 and the COVID-19 pandemic.

Figures

Figure 1.
outbreak.info enables the exploration of genomic data across three dimensions. a, Growth rate of a lineage is a function of epidemiology and intrinsic biological properties of a lineage. Further, epidemiology varies over time and by geography, while intrinsic biological properties are determined by the mutations present in a given lineage. b, Genomic data is ingested from GISAID, processed using the custom-built data pipeline, Bjorn, and stored on a server which can be accessed via an Application Programming Interface (API). The API is consumed by two clients: a JavaScript-based web client and an R package that provides programmatic access by authenticating against GISAID credentials. c, The web interface contains three tools that allow exploration of genomic data across three different dimensions: lineage/mutation, time, and geography.
Figure 2. Lineage and/or Mutation Tracker.
a, Prevalence of VOCs in the United Kingdom from Sep 2020 to May 2022. b, Search and filter options for Lineage/Variant of Concern tracker. c, Prevalence of S:Y145H + S:A222V mutations across different lineages globally. d, Prevalence of BA.2 in the United Kingdom. e, Mutation map showing the characteristic mutations of AY.4. f, Summary statistics of BA.2 lineage. g, Geographic distribution of the cumulative prevalence of BA.2 lineage globally. h, Cumulative prevalence of BA.2 in each country globally. i, Research articles and datasets related to BA.2.
Figure 3. Location report.
a, Relative prevalence of all lineages over time in South Africa. The total number of sequenced samples collected per day is shown in the bar chart below. b, Relative cumulative prevalence of all lineages over the last 60 days in South Africa. c, Mutation prevalence across the most prevalent lineages in South Africa over the last 60 days. d, Comparison of the prevalence of VOCs grouped by WHO classification: Alpha, Beta, Delta, and Omicron over time in South Africa. e, Daily reported cases in South Africa are shown in the line chart below.
Figure 4.
Prevalence of Variants of Concern (Alpha, Beta, Gamma, Delta, and Omicron lineages) over time (a) worldwide and in (b) South Africa, (c) Brazil, and (d) the United States. Lineages with a prevalence over 3% over the last 60 days in (e) Denmark, (f) the United Kingdom, (g) the United States, and (h) South Africa.
Figure 5.
Software infrastructure of outbreak.info. The infrastructure can be broadly divided into (1) Data ingestion pipelines, (2) Server-side hosting the database and API server, and (3) Client-side applications that use the API from the server.
Figure 6.
Flowchart describing the steps in Bjorn.
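
Several captions above refer to the cumulative prevalence of lineages over the last 60 days and to lineages with a prevalence over 3% in that window. The Python sketch below illustrates one way such a figure could be computed. It assumes that prevalence is simply the fraction of sequenced samples in the window that are assigned to a lineage; the estimator actually used by outbreak.info may differ (for example through smoothing or confidence intervals). The function names, the input layout, and the sample counts are hypothetical and are not taken from the paper, its API, or its R package.

```python
from collections import Counter
from datetime import date, timedelta


def cumulative_prevalence(samples, end, window_days=60):
    """Return {lineage: fraction of samples} for the window ending at `end`.

    `samples` is an iterable of (collection_date, lineage) pairs, one per
    sequenced sample (a hypothetical layout chosen for this sketch).
    The window covers `window_days` days, including `end` itself.
    """
    start = end - timedelta(days=window_days - 1)
    counts = Counter(lineage for day, lineage in samples if start <= day <= end)
    total = sum(counts.values())
    if total == 0:
        return {}  # no sequences in the window, so prevalence is undefined
    return {lineage: n / total for lineage, n in counts.items()}


def lineages_above(prevalence, threshold=0.03):
    """Return the sorted lineages whose prevalence exceeds `threshold`."""
    return sorted(lineage for lineage, p in prevalence.items() if p > threshold)


# Made-up counts, for illustration only.
samples = (
    [(date(2022, 5, 1), "BA.2")] * 60
    + [(date(2022, 4, 15), "BA.1")] * 38
    + [(date(2022, 5, 10), "BA.4")] * 2
    + [(date(2021, 12, 1), "B.1.617.2")] * 50  # falls outside the 60-day window
)

prevalence = cumulative_prevalence(samples, end=date(2022, 5, 31))
print(lineages_above(prevalence))  # ['BA.1', 'BA.2']
```

With these made-up counts, 100 samples fall inside the window, so BA.4 (2 of 100) stays below the 3% threshold and the older B.1.617.2 samples are excluded entirely.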
