Nucleic Acids Res. 2022 Jan 7;50(D1):D765-D770.
doi: 10.1093/nar/gkab889.

The Ensembl COVID-19 resource: ongoing integration of public SARS-CoV-2 data

Nishadi H De Silva et al. Nucleic Acids Res. 2022.

Abstract

The COVID-19 pandemic has seen unprecedented use of SARS-CoV-2 genome sequencing for epidemiological tracking and identification of emerging variants. Understanding the potential impact of these variants on the infectivity of the virus and the efficacy of emerging therapeutics and vaccines has become a cornerstone of the fight against the disease. To support the maximal use of genomic information for SARS-CoV-2 research, we launched the Ensembl COVID-19 browser, making SARS-CoV-2 the first virus to be encompassed within the Ensembl platform. This resource incorporates a new Ensembl gene set, multiple variant sets, and annotation from several relevant resources aligned to the reference SARS-CoV-2 assembly. Since the first release in May 2020, the content has been regularly updated using our new rapid release workflow, and tools such as the Ensembl Variant Effect Predictor have been integrated. The Ensembl COVID-19 browser is freely available at https://covid-19.ensembl.org.

Figures

Figure 1.
A comparison of the Ensembl gene set (displayed in red) and the gene set submitted to INSDC by the Shanghai Public Health Clinical Centre (displayed in blue) for the entire SARS-CoV-2 reference assembly. A notable difference between the two gene sets is the absence of ORF1a and ORF7b in the submitted gene set. Annotation tracks can be configured by clicking on the cog icon displayed in the top left of the figure.
Figure 2.
Alignment coverage across the SARS-CoV-2 reference genome based on a whole genome multiple sequence alignment with 60 other Orthocoronavirinae genomes. The green plot of alignment coverage shows that the central region of the genome is highly shared across the subfamily, while the ends are generally shared only with closely related viruses. The region encoding the spike protein S has been enlarged within the red circle, showing the difference between the low alignment coverage of the upstream S1 subunit (left) and the high coverage of the downstream S2 subunit (right). This demonstrates that our methods were able to reproduce the observations made by other groups, namely that there is little conservation in S1 across the Orthocoronavirinae subfamily compared to S2.
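As a rough illustration of what a per-position alignment coverage track measures, the sketch below computes, for each reference base, the fraction of other aligned sequences that have a base (not a gap) in the same alignment column. This is only one plausible definition, not necessarily the method used to produce Figure 2; the function name `alignment_coverage` and the toy alignment rows are hypothetical.

```python
def alignment_coverage(reference_row, other_rows, gap="-"):
    """Per-reference-base coverage from a multiple sequence alignment.

    reference_row and other_rows are aligned strings of equal length.
    Columns where the reference itself has a gap are skipped, so the
    result has one value per reference base.
    """
    if not other_rows:
        raise ValueError("need at least one non-reference sequence")
    if any(len(row) != len(reference_row) for row in other_rows):
        raise ValueError("all alignment rows must have the same length")

    coverage = []
    for col, ref_char in enumerate(reference_row):
        if ref_char == gap:
            continue  # insertion relative to the reference: no reference base here
        aligned = sum(1 for row in other_rows if row[col] != gap)
        coverage.append(aligned / len(other_rows))
    return coverage


# Toy alignment (hypothetical data, not real coronavirus sequence).
reference = "ATG-CCA"
others = ["ATGTCCA", "--G-CCA", "----C--"]
coverage = alignment_coverage(reference, others)
print(coverage)
```

In a plot of such values along the genome, regions shared across most of the aligned genomes sit near 1.0, while regions shared only with close relatives sit near 0.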
Figure 3.
The browser with several tracks turned on, highlighting a substitution flagged early in UCSC’s community annotation at position 23403 (D614G) in the S spike glycoprotein gene. Due to the prompt nature of community-driven annotation, this was available on our browser as soon as the annotation was published as a pre-print. It is labelled as a common missense mutation in SARS-CoV-2 with a notably high difference in resulting isoelectric point (D→G). Studies have shown this missense mutation in the spike protein is predominantly observed in Europe (26), a pattern that was also observed in the variation data we host when first imported.
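The relationship between genomic position 23403 and the protein-level label D614G can be checked with simple coordinate arithmetic. The sketch below is illustrative only: it assumes the S coding sequence starts at position 21563 (1-based) on the reference assembly and that the affected reference codon is GAT with an A>G change; neither value is stated in the caption above, and the helper names are hypothetical.

```python
# Assumption: 1-based start of the S coding sequence on the reference assembly.
S_CDS_START = 21563

# Minimal codon table covering only the two codons used in this example.
CODON_TO_AA = {"GAT": "D", "GGT": "G"}


def locate_in_cds(genomic_pos, cds_start):
    """Return (codon_number, position_within_codon), both 1-based."""
    offset = genomic_pos - cds_start  # 0-based offset into the CDS
    if offset < 0:
        raise ValueError("position lies upstream of the CDS start")
    return offset // 3 + 1, offset % 3 + 1


def apply_substitution(codon, codon_position, alt_base):
    """Replace the base at the 1-based codon_position with alt_base."""
    i = codon_position - 1
    return codon[:i] + alt_base + codon[i + 1:]


codon_number, codon_position = locate_in_cds(23403, S_CDS_START)

ref_codon = "GAT"  # assumed reference codon at this site
alt_codon = apply_substitution(ref_codon, codon_position, "G")  # assumed A>G

label = f"{CODON_TO_AA[ref_codon]}{codon_number}{CODON_TO_AA[alt_codon]}"
print(codon_number, codon_position, alt_codon, label)
```

Under these assumptions, position 23403 falls on the second base of codon 614, and the single-base change turns an aspartate codon into a glycine codon, matching the D614G label.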

References

    1. Wu A., Peng Y., Huang B., Ding X., Wang X., Niu P., Meng J., Zhu Z., Zhang Z., Wang J. et al. Genome composition and divergence of the novel coronavirus (2019-nCoV) originating in China. Cell Host Microbe. 2020; 27:325–328.
    2. Fernandes J.D., Hinrichs A.S., Clawson H., Gonzalez J.N., Lee B.T., Nassar L.R., Raney B.J., Rosenbloom K.R., Nerli S., Rao A.A. et al. The UCSC SARS-CoV-2 Genome Browser. Nat. Genet. 2020; 52:991–998.
    3. Flynn J.A., Purushotham D., Choudhary M.N.K., Zhuo X., Fan C., Matt G., Li D., Wang T. Exploring the coronavirus pandemic with the WashU Virus Genome Browser. Nat. Genet. 2020; 52:986–991.
    4. Howe K.L., Contreras-Moreira B., De Silva N., Maslen G., Akanni W., Allen J., Alvarez-Jarreta J., Barba M., Bolser D.M., Cambell L. et al. Ensembl Genomes 2020—enabling non-vertebrate genomic research. Nucleic Acids Res. 2020; 48:D689–D695.
    5. Howe K.L., Achuthan P., Allen J., Allen J., Alvarez-Jarreta J., Amode M.R., Armean I.M., Azov A.G., Bennett R., Bhai J. et al. Ensembl 2021. Nucleic Acids Res. 2021; 49:D884–D891.