[Preprint]. 2020 Jun 8:2020.05.17.100685.
doi: 10.1101/2020.05.17.100685.

Architecture and self-assembly of the SARS-CoV-2 nucleocapsid protein

Qiaozhen Ye et al. bioRxiv.

Abstract

The COVID-2019 pandemic is the most severe acute public health threat of the twenty-first century. To properly address this crisis with both robust testing and novel treatments, we require a deep understanding of the life cycle of the causative agent, the SARS-CoV-2 coronavirus. Here, we examine the architecture and self-assembly properties of the SARS-CoV-2 nucleocapsid protein, which packages viral RNA into new virions. We determined a 1.4 Å resolution crystal structure of this protein's N2b domain, revealing a compact, intertwined dimer similar to that of related coronaviruses including SARS-CoV. While the N2b domain forms a dimer in solution, addition of the C-terminal spacer B/N3 domain mediates formation of a homotetramer. Using hydrogen-deuterium exchange mass spectrometry, we find evidence that at least part of this putatively disordered domain is structured, potentially forming an α-helix that self-associates and cooperates with the N2b domain to mediate tetramer formation. Finally, we map the locations of amino acid substitutions in the N protein from over 38,000 SARS-CoV-2 genome sequences. We find that these substitutions are strongly clustered in the protein's N2a linker domain, and that substitutions within the N1b and N2b domains cluster away from their functional RNA binding and dimerization interfaces. Overall, this work reveals the architecture and self-assembly properties of a key protein in the SARS-CoV-2 life cycle, with implications for both drug design and antibody-based testing.

Keywords: COVID-19; Coronavirus; SARS-CoV-2; crystal structure; nucleocapsid.

Figures

Figure 1. Structure of the SARS-CoV-2 Nucleocapsid dimerization domain
(A) Domain structure of the SARS-CoV-2 Nucleocapsid protein, as defined previously, with a plot showing the Jalview alignment conservation score (three-point smoothed; gray) and DISOPRED3 disorder propensity (red) for nine related coronavirus N proteins (SARS-CoV, SARS-CoV-2, MERS-CoV, HCoV-OC43, HCoV-HKU1, HCoV-NL63, HCoV-229E, IBV (Infectious Bronchitis virus), and MHV (Murine Hepatitis virus)). SR: serine/arginine rich domain; SB: spacer B. The boundary between SB and N3 is not well-defined due to low identity between SARS-CoV/SARS-CoV-2 and MHV N proteins. All purified truncations are noted at bottom. (B) Two views of the SARS-CoV-2 N2b dimer, with one monomer colored as a rainbow (N-terminus blue, C-terminus red) and the other colored white. See Figure S1A for comparison with other structures of this domain. (C) Structural overlay of the SARS-CoV-2 N2b dimer (blue) and the equivalent domain of SARS-CoV-N (PDB ID 2GIB).
Figure 2. N protein variability in SARS-CoV-2 patient sequences
(A) Top: Plot showing the number of observed amino acid variants at each position in the N gene in 16,975 SARS-CoV-2 genomes (details in Table S2). The most highly mutated positions are R203 and G204, which are each mutated more than 4,000 times due to a prevalent trinucleotide substitution (Figure S2A–B). Red tick marks indicate the locations of seven premature stop mutations detected (two sequences contained stop codons at residue 256; not graphed). Bottom: Plots showing amino acid variants in the N1b and N2b domains. (B) Surface views of the N protein N1b domain (PDB ID 6VYO; Center for Structural Genomics of Infectious Diseases (CSGID), unpublished). At left, blue indicates RNA-binding residues identified by NMR peak shifts (A50, T57, H59, R92, I94, S105, R107, R149, and Y172). At right, two views colored by the number of variants at each position observed in a set of 38,318 SARS-CoV-2 genomes. The two most frequently mutated residues are shown in stick view and labeled. Only one mutation (A50E, observed in one sequence) overlaps the putative RNA binding surface. (C) Cartoon view of the N protein N2b domain, with one monomer colored gray and the other colored by the number of variants at each position observed in a set of 38,318 SARS-CoV-2 genomes. The four most frequently mutated residues are shown in stick view and labeled.
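The per-position variant counts plotted in this figure can be illustrated with a short tally over sequences already aligned to a reference. The following is a minimal Python sketch under stated assumptions: the function name, the toy reference, and the toy sequences are invented for illustration, and the sketch is not the pipeline used in the study.

```python
from collections import Counter

def count_variants(reference: str, sequences: list[str]) -> Counter:
    """Count, per 1-based position, how many sequences differ from the reference.

    Assumes every sequence is a protein sequence already aligned to the
    reference (same length). Gaps ('-') and ambiguous residues ('X') are
    skipped rather than counted as substitutions.
    """
    counts: Counter = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be pre-aligned to the reference length")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa in "-X":
                continue
            if aa != ref_aa:
                counts[pos] += 1
    return counts

# Toy data, invented for illustration only.
reference = "MSRGN"
sequences = ["MSRGN", "MSKRN", "MSKRN", "MSRGX"]
variants = count_variants(reference, sequences)

for pos in sorted(variants):
    print(f"position {pos} ({reference[pos - 1]}): {variants[pos]} variant(s)")
```

In this toy set, two adjacent positions change together in two sequences, which is the kind of pattern that produces paired peaks in a per-position plot.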
Figure 3. The C-terminus of N mediates tetramer formation
(A) Size exclusion chromatography (Superose 6) coupled to multi-angle light scattering (SEC-MALS) analysis of full-length SARS-CoV-2 N. The measured MW of 190.0 kDa closely matches that of a tetramer (182.5 kDa). See Figure S3B for SDS-PAGE analysis of all purified proteins. (B) Size exclusion chromatography (Superdex 200 Increase; used for panels B–F) coupled to multi-angle light scattering (SEC-MALS) analysis of SARS-CoV-2 N1ab (residues 2–175). The measured MW of 20.8 kDa closely matches that of a monomer (18.9 kDa). dRI: differential refractive index. (C) SEC-MALS analysis of SARS-CoV-2 N1ab2a (residues 2–246). The measured MW of 25.0 kDa is slightly less than that of a monomer (26.2 kDa), reflecting partial proteolysis within the N2a domain (Figure S3B). (D) SEC-MALS analysis of SARS-CoV-2 N2b. The measured MW (31.5 kDa) closely matches that of a homodimer (26.5 kDa). (E) SEC-MALS analysis of SARS-CoV-2 N2b3. The measured MW (75.6 kDa) closely matches that of a homotetramer (77.4 kDa). (F) SEC-MALS analysis of MBP-SARS-CoV-2 N3 (“peak 1” black/dark blue; “peak 2” gray/light blue). The measured MWs of peak 1 (101.9 kDa) and peak 2 (48.9 kDa) closely match those of a homodimer (101.7 kDa) and a monomer (50.9 kDa), respectively. The small peak at 10.5 mL suggests higher-order self-assembly. (G) Schematic summary of size exclusion and SEC-MALS results on N protein constructs. See Figure S3C–D for SEC-MALS analysis of N1b (residues 49–174) and N1b2a (residues 49–246). (H) Proposed self-assembly mechanism of SARS-CoV-2 N. Dimerization is mediated by the N2b domain, and these dimers self-associate through the N3 region to form homotetramers.
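The oligomeric-state assignments in this caption amount to dividing each measured mass by the mass of a single chain and rounding to the nearest whole number. A minimal Python sketch of that arithmetic follows. The measured and expected masses are copied from the caption; the per-chain masses are derived here by dividing each expected oligomer mass by its stated number of chains, and the function and variable names are illustrative assumptions rather than the authors' analysis code.

```python
# Illustrative only: masses (kDa) are copied from the Figure 3 caption.
# Per-chain masses are derived by dividing the expected oligomer mass by
# its stated number of chains; they are not quoted directly in the caption.
CONSTRUCTS = {
    # name: (measured_kda, expected_oligomer_kda, expected_chains)
    "full-length N": (190.0, 182.5, 4),
    "N1ab": (20.8, 18.9, 1),
    "N1ab2a": (25.0, 26.2, 1),
    "N2b": (31.5, 26.5, 2),
    "N2b3": (75.6, 77.4, 4),
    "MBP-N3 peak 1": (101.9, 101.7, 2),
    "MBP-N3 peak 2": (48.9, 50.9, 1),
}

def estimate_chains(measured_kda: float, chain_kda: float) -> tuple[int, float]:
    """Return (nearest whole number of chains, raw mass ratio)."""
    ratio = measured_kda / chain_kda
    return round(ratio), ratio

results = {}
for name, (measured, expected, chains) in CONSTRUCTS.items():
    chain_kda = expected / chains  # mass of one chain, derived as noted above
    results[name] = estimate_chains(measured, chain_kda)

for name, (n_chains, ratio) in results.items():
    print(f"{name}: ratio {ratio:.2f} -> {n_chains} chain(s)")
```

Rounding hides how far a ratio sits from an integer (N2b gives a ratio of about 2.4), so the sketch returns the raw ratio alongside the rounded value.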
Figure 4. HDX-MS analysis of N2b and N2b3
(A) Schematic showing the N2b sequence and structure, plus protein regions detected by HDX-MS. Each peptide is colored by its fractional deuterium uptake during the course of the experiment (blue-white-magenta = 0–100% fractional uptake). (B) Schematic showing the N2b3 sequence and inferred structure (the α-helix spanning residues 400–416 is predicted by PSI-PRED), plus protein regions detected by HDX-MS. Two sets of exchange rates are shown: fractional deuterium uptake in N2b3 (upper box) colored as in panel A, and relative uptake comparing N2b and N2b3 (lower box). (C) Structure of the N2b dimer, with one monomer colored by fractional deuterium uptake (blue-white-magenta = 0–75% fractional uptake). (D) Uptake plots for two peptides within the ordered N2b domain, with uptake in N2b indicated in blue and uptake in N2b3 indicated in green. The peptide covering residues 323–329 (located within a loop) is relatively exposed, while the peptide covering residues 330–336 (within a β-strand) is strongly protected from H-D exchange. (E) Uptake plots for three peptides in the C-terminal region of N2b3, plotted by fractional deuterium uptake. The peptides covering residues 395–402 (yellow) and 403–411 (red) show more protection than the peptide covering residues 404–419, suggesting that this region is partially structured. See Figure S4A for each peptide plotted by absolute deuterium uptake.
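The caption distinguishes fractional from absolute deuterium uptake. One common way to relate the two is to divide a peptide's absolute uptake by the number of backbone amide hydrogens that can exchange. The Python sketch below assumes the widely used convention that the N-terminal residue and prolines contribute no exchangeable amide; the caption does not state which normalization the authors applied, no back-exchange correction is included, and the peptide sequence and uptake value are hypothetical.

```python
def max_exchangeable_amides(peptide: str) -> int:
    """Count backbone amide hydrogens that can exchange in a peptide.

    Assumed convention: the N-terminal residue is excluded, and proline
    has no backbone amide hydrogen.
    """
    return sum(1 for aa in peptide[1:] if aa != "P")

def fractional_uptake(absolute_uptake_da: float, peptide: str) -> float:
    """Convert absolute uptake (Da) into a fraction of the assumed maximum."""
    n_amides = max_exchangeable_amides(peptide)
    if n_amides == 0:
        raise ValueError("peptide has no exchangeable backbone amides")
    return absolute_uptake_da / n_amides

# Hypothetical peptide and uptake value, for illustration only.
peptide = "AKPLDEF"
fraction = fractional_uptake(2.5, peptide)
print(f"{peptide}: {max_exchangeable_amides(peptide)} exchangeable amides, "
      f"fractional uptake {fraction:.2f}")
```

Normalizing in this way lets peptides of different lengths share one color scale, which is why fractional rather than absolute uptake is convenient for the schematics in panels A to C.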
