Nucleic Acids Res. 2008 Jan;36(Database issue):D504-11.
doi: 10.1093/nar/gkm754. Epub 2007 Oct 2.

CoVDB: a comprehensive database for comparative analysis of coronavirus genes and genomes

Yi Huang et al. Nucleic Acids Res. 2008 Jan.

Abstract

The recent SARS epidemic has boosted interest in the discovery of novel human and animal coronaviruses. By July 2007, more than 3000 coronavirus sequence records, including 264 complete genomes, were available in GenBank. The number of coronavirus species with complete genomes available increased from 9 in 2003 to 25 in 2007, of which six, including coronavirus HKU1, bat SARS coronavirus, group 1 bat coronavirus HKU2, and groups 2c and 2d coronaviruses, were sequenced by our laboratory. To overcome the problems we encountered in the existing databases during comparative sequence analysis, we built a comprehensive database, CoVDB (http://covdb.microbiology.hku.hk), of annotated coronavirus genes and genomes. CoVDB provides a convenient platform for rapid and accurate batch sequence retrieval, which is both the cornerstone and the bottleneck of comparative gene or genome analysis. Sequences can be downloaded directly from the website in FASTA format. CoVDB also provides detailed annotation of all coronavirus sequences using a standardized nomenclature system, and overcomes the problems of duplicated and identical sequences in other databases. For complete genomes, a single representative sequence for each species is available for comparative analyses such as phylogenetic studies. With the annotated sequences in CoVDB, more specific BLAST search results can be generated for efficient downstream analysis.

Figures

Figure 1.
Number of coronavirus sequences in GenBank from 1984 to 2006.
Figure 2.
Screenshots of CoVDB complete genome retrieval pages. (a) A specific gene can be retrieved using the pull-down list at the lower left corner. The number in brackets indicates the number of complete genomes for that coronavirus. (b) Example showing genomes of selected species (some group 2a coronaviruses and SARS-CoV-related coronaviruses). By default, only the ‘Type strain’ for each species is shown. The columns NCBIacc and PMID link to GenBank and PubMed, respectively. (c) Example showing the S gene of selected species, obtained by choosing S in the pull-down list. For genes downstream of orf1ab, sequences upstream of the initiation codons can also be retrieved from this result page. This function is particularly useful for the detection of transcription regulatory sequences.
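The upstream-sequence retrieval described in panel (c) can be illustrated with a minimal Python sketch. It is not CoVDB's implementation: the function names, the toy sequence, the 12-base window and the motif string are all assumptions chosen for illustration, and a real analysis would use a downloaded genome, the annotated initiation codon position, and a motif appropriate to the virus under study.

```python
from typing import List


def upstream_region(genome: str, start: int, length: int) -> str:
    """Return up to `length` bases immediately upstream of the 0-based index `start`."""
    begin = max(0, start - length)  # clamp so the window never runs off the 5' end
    return genome[begin:start]


def find_motif(region: str, motif: str) -> List[int]:
    """Return the 0-based offsets of every (possibly overlapping) exact match of `motif`."""
    hits = []
    pos = region.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = region.find(motif, pos + 1)
    return hits


# Toy sequence made up for this example; it is not real coronavirus data.
toy_genome = "GGTTACGAACTTTAAATGGCATAG"
atg_index = toy_genome.find("ATG")  # stand-in for an annotated initiation codon
region = upstream_region(toy_genome, atg_index, 12)
hits = find_motif(region, "ACGAAC")  # hypothetical candidate motif
print(region, hits)
```

Offsets are reported relative to the start of the extracted window, so a hit can be mapped back to a genome coordinate by adding the window's start position.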
Figure 3.
Screenshots of all gene retrieval pages. (a) Gene sequences are grouped vertically according to the coronavirus group and subgroup they belong to, and horizontally by gene name. The number next to each checkbox indicates the number of sequences of that gene in CoVDB. The option ‘Exclude partial CDS’ can be used if only complete genes are required. (b) Example showing the 15 sequences of nsp13 in group 3 coronaviruses. The first column is the CoVDB gene id. In the Uniq column, ‘Uniq’ is shown if there is no other identical sequence in CoVDB. Otherwise, the gene ids of the sequences identical to it are shown.
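The behaviour of the Uniq column in panel (b) can be sketched in a few lines of Python. This is an illustrative reconstruction from the caption only, not CoVDB's code: the function name `uniq_column`, the gene ids, the toy sequences, and the choice to compare sequences case-insensitively are all assumptions.

```python
from collections import defaultdict
from typing import Dict


def uniq_column(sequences: Dict[str, str]) -> Dict[str, str]:
    """Map each gene id to 'Uniq' or to the comma-separated ids of identical sequences."""
    groups = defaultdict(list)
    for gene_id, seq in sequences.items():
        groups[seq.upper()].append(gene_id)  # assumption: identity ignores letter case

    column = {}
    for ids in groups.values():
        for gene_id in ids:
            others = [g for g in ids if g != gene_id]
            column[gene_id] = ",".join(others) if others else "Uniq"
    return column


# Hypothetical gene ids and toy sequences, not taken from CoVDB.
toy = {"g1": "ATGAAA", "g2": "ATGCCC", "g3": "atgaaa"}
result = uniq_column(toy)
print(result)
```

Grouping by the sequence string itself keeps the comparison linear in the number of records, rather than comparing every pair of sequences.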
Figure 4.
Screenshot of the BLAST similarity search page. Five datasets are available to choose from as the database for comparison.
