mBio. 2012 Nov 20;3(6):e00473-12.
doi: 10.1128/mBio.00473-12.

Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans

Sander van Boheemen et al. mBio.

Abstract

A novel human coronavirus (HCoV-EMC/2012) was isolated from a man with acute pneumonia and renal failure in June 2012. This report describes the complete genome sequence, genome organization, and expression strategy of HCoV-EMC/2012 and its relationship to known coronaviruses. The genome comprises 30,119 nucleotides and contains at least 10 predicted open reading frames, 9 of which are predicted to be expressed from a nested set of seven subgenomic mRNAs. Phylogenetic analysis of the replicase gene of coronaviruses with completely sequenced genomes showed that HCoV-EMC/2012 is most closely related to Tylonycteris bat coronavirus HKU4 (BtCoV-HKU4) and Pipistrellus bat coronavirus HKU5 (BtCoV-HKU5), which prototype two species in lineage C of the genus Betacoronavirus. In accordance with the guidelines of the International Committee on Taxonomy of Viruses, and in view of the 75% and 77% amino acid sequence identity in 7 conserved replicase domains with BtCoV-HKU4 and BtCoV-HKU5, respectively, we propose that HCoV-EMC/2012 prototypes a novel species in the genus Betacoronavirus. HCoV-EMC/2012 may be most closely related to a coronavirus detected in Pipistrellus pipistrellus in The Netherlands, but because only a short sequence from the most conserved part of the RNA-dependent RNA polymerase-encoding region of the genome was reported for this bat virus, its genetic distance from HCoV-EMC/2012 remains uncertain. HCoV-EMC/2012 is the sixth coronavirus known to infect humans and the first human virus within betacoronavirus lineage C.
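
The amino acid identity figures quoted above (75% and 77%) are percentages of matching residues between aligned protein sequences. The following is a minimal sketch of that calculation, not the authors' pipeline. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and that gapped columns are excluded from the denominator (other conventions exist). The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length. Gapped columns are skipped entirely.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (made-up peptides, not coronavirus sequences):
# 5 ungapped columns, 4 identical residues.
print(percent_identity("MKT-LV", "MRTALV"))  # 80.0
```

For real data the alignment itself would come from a dedicated alignment tool; only the counting step is shown here.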

Importance: Coronaviruses are capable of infecting humans and many animal species. Most infections caused by human coronaviruses are relatively mild. However, the outbreak of severe acute respiratory syndrome (SARS) caused by SARS-CoV in 2002 to 2003 and the fatal infection of a human by HCoV-EMC/2012 in 2012 show that coronaviruses are able to cause severe, sometimes fatal disease in humans. We have determined the complete genome of HCoV-EMC/2012 using an unbiased virus discovery approach involving next-generation sequencing techniques, which enabled subsequent state-of-the-art bioinformatics, phylogenetics, and taxonomic analyses. By establishing its complete genome sequence, HCoV-EMC/2012 was characterized as a new genotype which is closely related to bat coronaviruses that are distant from SARS-CoV. We expect that this information will be vital to rapid advancement of both clinical and vital research on this emerging pathogen.

Figures

FIG 1
Genome organization and expression of HCoV-EMC/2012. (A) The coding part of the genome and terminal untranslated regions are depicted, respectively, by a gray background and horizontal lines. Rectangles indicate ORFs and their locations in three reading frames. The dashed lines in ORF1a and ORF5 indicate base ambiguities observed during sequencing. Triangles represent sites in the replicase polyproteins pp1a and pp1ab that are predicted to be cleaved by papain-like proteinases (gray) or the 3C-like cysteine proteinase (black). Cleavage products are numbered nsp1 to nsp16, according to the convention established for other coronaviruses (23). The −1 ribosomal frameshift site (RFS) in the ORF1a/ORF1b overlap region is indicated. The locations of the leader transcription-regulatory sequence (TRS) (L) and seven body TRSs (numbered) are highlighted by black dots. All coordinates correspond to the scale shown at the bottom. (B) Sequence comparison of the leader TRS region and seven body TRSs. The fully conserved TRS core sequence AACGAA is highlighted. Nucleotides in the body TRSs are written in uppercase letters if the complementary nucleotide can base pair with the corresponding residue in the leader TRS region (including G-U base pairs). TRS starting coordinates in the HCoV-EMC/2012 genome are shown at the left; for the body TRSs, the numbers of (potential) base pairs with the leader TRS region are shown at the right.
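
The uppercase/lowercase rule described for panel B can be sketched in a few lines of Python. This is an illustrative reading of the legend, not the authors' code. It assumes the leader and body TRS regions are supplied as pre-aligned strings of equal length, treats T as U, and accepts Watson-Crick pairs plus G-U pairs. The function name `annotate_body_trs` and the flanking nucleotides in the example are hypothetical; only the core sequence AACGAA comes from the legend.

```python
from typing import Tuple

# Complement of each body TRS nucleotide, as it would appear in the opposite strand.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Allowed pairs as (complement of body nucleotide, leader nucleotide):
# Watson-Crick pairs plus G-U pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def annotate_body_trs(leader: str, body: str) -> Tuple[str, int]:
    """Return the body TRS with pairing positions in uppercase, plus the pair count.

    Assumption: `leader` and `body` are already aligned position by position.
    """
    leader = leader.upper().replace("T", "U")
    body = body.upper().replace("T", "U")
    if len(leader) != len(body):
        raise ValueError("leader and body regions must have the same length")

    annotated = []
    pair_count = 0
    for leader_nt, body_nt in zip(leader, body):
        if (COMPLEMENT[body_nt], leader_nt) in PAIRS:
            annotated.append(body_nt)          # potential base pair: uppercase
            pair_count += 1
        else:
            annotated.append(body_nt.lower())  # no base pair: lowercase
    return "".join(annotated), pair_count


# Hypothetical flanks around the conserved core AACGAA.
# Position 1: body A -> complement U pairs with leader G (G-U pair).
# Position 8: body U -> complement A cannot pair with leader C.
print(annotate_body_trs("GAACGAAC", "AAACGAAU"))  # ('AAACGAAu', 7)
```

Identical nucleotides always count as a potential pair under this rule; the G-U allowance additionally lets a body A align with a leader G, and a body C with a leader U.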
FIG 2
Phylogenetic trees for HCoV-EMC/2012 and selected other coronaviruses. Unrooted maximum likelihood phylogenies inferred from the nucleotide sequences of full-length ORF1ab (A) or a 332-nt fragment from the RdRp-encoding domain of ORF1b (B) are shown. HCoV-EMC/2012 and 20 viruses representing the recognized species diversity of coronaviruses were included, with bat-derived isolate VM314/2008 also included in the analysis presented in panel B (31). The viruses and corresponding species used are Alphacoronavirus 1 (Alpha-CoV1), Human coronavirus 229E (HCoV-229E), Human coronavirus NL63 (HCoV-NL63), Miniopterus bat coronavirus 1 (BtCoV-1AB), Miniopterus bat coronavirus HKU8 (BtCoV-HKU8), Porcine epidemic diarrhea virus (PED), Rhinolophus bat coronavirus HKU2 (BtCoV-HKU2), Scotophilus bat coronavirus 512 (BtCoV-512), Betacoronavirus 1 (Beta-CoV1), Human coronavirus HKU1 (HCoV-HKU1), Murine coronavirus (MHV), Tylonycteris bat coronavirus HKU4 (BtCoV-HKU4), Pipistrellus bat coronavirus HKU5 (BtCoV-HKU5), Rousettus bat coronavirus HKU9 (BtCoV-HKU9), Severe acute respiratory syndrome-related coronavirus (SARS-CoV), Avian coronavirus (IBV), Beluga whale coronavirus SW1 (BWCoV-SW1), Bulbul coronavirus HKU11 (ACoV-HKU11), Thrush coronavirus HKU12 (ACoV-HKU12), and Munia coronavirus HKU13 (ACoV-HKU13). Bootstrap values above 50 are shown. Arcs and symbols indicate the four coronavirus genera. The scale bar represents the number of nucleotide substitutions per site.
FIG 3
Phylogenetic trees for HCoV-EMC/2012 and selected other coronaviruses. Unrooted maximum likelihood phylogenies based on coronavirus-wide conserved protein domains in replicase pp1ab (A) or on the conserved parts of structural proteins S2, E, M, and N (B) for HCoV-EMC/2012 and 20 viruses representing the recognized species diversity of coronaviruses are shown (see Fig. 2 legend for names and abbreviations). Branch support values are based on the Shimodaira-Hasegawa-like procedure and are in the range of zero to one; only nonoptimal values smaller than one are shown. Arcs and symbols indicate the four coronavirus genera. The scale bars represent average numbers of substitutions per amino acid position.

References

    1. de Groot RJ, et al. 2012. Family Coronaviridae, p. 806–828 In King AMQ, Adams MJ, Carstens EB, Lefkowitz EJ, Virus taxonomy, the 9th report of the international committee on taxonomy of viruses. Academic Press, San Diego, CA.
    2. Perlman S, Netland J. 2009. Coronaviruses post-SARS: update on replication and pathogenesis. Nat. Rev. Microbiol. 7:439–450
    3. Gloza-Rausch F, et al. 2008. Detection and prevalence patterns of group I coronaviruses in bats, northern Germany. Emerg. Infect. Dis. 14:626–631
    4. Lau SK, et al. 2005. Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats. Proc. Natl. Acad. Sci. U. S. A. 102:14040–14045
    5. Li W, et al. 2005. Bats are natural reservoirs of SARS-like coronaviruses. Science 310:676–679
