Science. 2022 Oct 7;378(6615):56-61.
doi: 10.1126/science.add1964. Epub 2022 Sep 15.

Hallucinating symmetric protein assemblies

B I M Wicky et al. Science. 2022.

Abstract

Deep learning generative approaches provide an opportunity to broadly explore protein structure space beyond the sequences and structures of natural proteins. Here, we use deep network hallucination to generate a wide range of symmetric protein homo-oligomers given only a specification of the number of protomers and the protomer length. Crystal structures of seven designs are very similar to the computational models (median root mean square deviation: 0.6 angstroms), as are three cryo-electron microscopy structures of giant 10-nanometer rings with up to 1550 residues and C33 symmetry; all differ considerably from previously solved structures. Our results highlight the rich diversity of new protein structures that can be generated using deep learning and pave the way for the design of increasingly complex components for nanomachines and biomaterials.

Conflict of interest statement

Competing interests: BIMW, LFM, AC, RJR, JD, EK, ST, RDK, and DB are inventors on a provisional patent application submitted by the University of Washington for the design, composition and function of the proteins created in this study.

Figures

Fig. 1. Hallucinating symmetric protein assemblies.
(A) Starting from a choice of cyclic symmetry and protein length, a random sequence is optimized by Markov chain Monte Carlo (MCMC) through the AlphaFold2 (AF2) network until the resulting structure fits the design objective, followed by sequence re-design with ProteinMPNN. (B) The method generates structurally diverse outputs, quantified here by multi-dimensional scaling of protomer pairwise structural similarities between experimentally tested HALs (N = 351) and all de novo cyclic oligomers present in the PDB (N = 162). (C) Generated structures differ from those in the PDB. Median TM-scores to the closest match: 0.67 and 0.57 for the protomers and oligomers, respectively (vertical lines). (D) Generated sequences are unrelated to naturally occurring proteins. Median BLAST E-values from the closest hit in UniRef100: 2.6 and 1.3 for the repeat motifs and protomers, respectively (vertical lines). (E) Success counts of ProteinMPNN-designed HALs at different levels of characterization. (F) Most soluble HALs have SEC retention volumes consistent with their oligomeric state. The gray line shows the fit to calibration standards (open circles), and the shaded area represents the 95% confidence interval of the calibration. (G) The observed molecular weights of HALs from SEC-MALS are close to those computed from the design models. (H) ProteinMPNN-designed HALs are thermostable. Pre-melting and post-melting retention volumes are closely correlated; circles represent designs that remained monodisperse, while triangles indicate polydispersity after heat treatment. In plots E-H, the data are categorized by cyclic symmetry class. The legend is shown in H.
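The optimization loop described in panel A can be illustrated with a minimal Metropolis-style sketch. This is not the authors' implementation: the real objective is evaluated on AF2 predictions, whereas `toy_loss` below is a hypothetical stand-in chosen only so that the example runs without a structure-prediction network. The function names, step count, and temperature are likewise assumptions, and the ProteinMPNN re-design step is not covered.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_loss(seq):
    """Hypothetical stand-in for a structure-based loss.

    Returns the fraction of non-alanine residues, so lower is "better".
    The real method would score an AF2 prediction instead.
    """
    return sum(1 for aa in seq if aa != "A") / len(seq)


def mcmc_optimize(length, loss_fn, steps=2000, temperature=0.005, seed=0):
    """Optimize a random sequence by single-residue Metropolis moves."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current = loss_fn(seq)
    best_seq, best = list(seq), current

    for _ in range(steps):
        # Propose a point mutation at a random position.
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)
        proposed = loss_fn(seq)
        delta = proposed - current

        # Always accept improvements; accept worse moves with
        # probability exp(-delta / temperature).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = proposed
            if current < best:
                best_seq, best = list(seq), current
        else:
            seq[pos] = old  # reject: undo the mutation

    return "".join(best_seq), best


if __name__ == "__main__":
    sequence, loss = mcmc_optimize(30, toy_loss)
    print(sequence, loss)
```

Swapping `toy_loss` for a function that scores a predicted oligomer would give the general shape of the procedure; the acceptance rule and mutation scheme shown here are generic choices, not details taken from the paper.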
Fig. 2. Structures of HALs solved by X-ray crystallography compared to their design models.
(A) HALC2_062 (RMSD: 0.81 Å). (B) HALC2_065 (RMSD: 1.02 Å). (C) HALC2_068 (RMSD: 0.86 Å). (D) HALC3_104 (RMSD: 0.42 Å). (E) HALC3_109 (RMSD: 0.46 Å). (F) HALC4_135 (RMSD: 0.60 Å). (G) HALC4_136 (RMSD: 0.34 Å). For each row, the first panel shows a surface rendering of the oligomer with one protomer highlighted in purple, the second shows the fit of the side-chain rotamers of the design model to the 2mFo-DFc map (in gray), and the last two panels show two different orientations of the structural overlays between the model (gray) and the solved structure (colored by chain).
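As a quick consistency check, the median of the seven per-design RMSD values listed in this caption can be recomputed and compared with the 0.6 Å median quoted in the abstract. The dictionary below simply transcribes the caption; the variable names are illustrative.

```python
from statistics import median

# RMSD (in angstroms) between crystal structure and design model,
# transcribed from the Fig. 2 caption.
rmsd_by_design = {
    "HALC2_062": 0.81,
    "HALC2_065": 1.02,
    "HALC2_068": 0.86,
    "HALC3_104": 0.42,
    "HALC3_109": 0.46,
    "HALC4_135": 0.60,
    "HALC4_136": 0.34,
}

median_rmsd = median(rmsd_by_design.values())
print(median_rmsd)  # prints 0.6
```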
Fig. 3. Cryo-electron and negative stain electron microscopy validation of large HALs.
For each design, the model is shown colored by chain, and the corresponding internal symmetry (X) and oligomerization state (Y) are indicated (CX-Y). The electron density map is shown next to the model alongside characteristic 2D class averages. (A) Negative stain characterization of HALs. Ring diameters are 92 Å, 110 Å, 75 Å, 80 Å, 100 Å, and 107 Å for HALC6_220, HALC24–6_316, HALC20–5_308, HALC25–5_341, HALC18–6_278, and HALC42–7_351, respectively. (B) CryoEM characterization of three large HALs. The ring diameters are 87 Å, 99 Å, and 100 Å for HALC15–5_262, HALC18–6_265, and HALC33–3_343, respectively. Top row, left panels: design model colored by chain. Top row, right panels: superpositions of the cryoEM model (gray) and design model (blue). The computed backbone-atom RMSDs between the designed and experimental structures are 0.81 Å, 1.69 Å, and 2.30 Å, respectively (Fig. S16). Bottom row: 4.38 Å, 6.51 Å, and 6.32 Å cryoEM electron density maps. Scale bars = 10 nm.
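The CX-Y naming convention used in this caption can be unpacked programmatically. The sketch below is an assumption about the name layout inferred only from the names in these captions (`HALC<X>[–<Y>]_<id>`); the `HalName` type and `parse_hal_name` function are hypothetical helpers, not part of any published tool. When a name carries no Y component, the oligomerization state is returned as `None` rather than guessed.

```python
import re
from typing import NamedTuple, Optional


class HalName(NamedTuple):
    internal_symmetry: int          # X in CX-Y
    oligomer_state: Optional[int]   # Y in CX-Y, None if not given in the name
    design_id: str                  # trailing identifier, kept as text


# Accept either an ASCII hyphen or an en dash between X and Y.
_HAL_PATTERN = re.compile(r"^HALC(\d+)(?:[-\u2013](\d+))?_(\d+)$")


def parse_hal_name(name: str) -> HalName:
    """Split a design name such as 'HALC33-3_343' into its components."""
    match = _HAL_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized design name: {name!r}")
    x, y, design_id = match.groups()
    return HalName(int(x), int(y) if y is not None else None, design_id)


if __name__ == "__main__":
    for name in ("HALC6_220", "HALC24\u20136_316", "HALC33\u20133_343"):
        print(name, parse_hal_name(name))
```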
Fig. 4. Hallucinated structures differ significantly from their closest matches in the PDB.
For each structure solved by crystallography (Fig. 2) or cryoEM (Fig. 3B), the closest structural matches to the protomer and to the oligomer are shown on the left and right, respectively. Designs are colored by chain and the closest matching PDB entry is shown in gray. In most cases the closest oligomer has an entirely different structure; this is particularly evident for the larger designs in G-H. TM-scores (protomer | oligomer) are indicated in parentheses, and the PDB IDs are reported in Table S2. (A) HALC2_062 (0.69 | 0.59). (B) HALC2_065 (0.67 | 0.54). (C) HALC2_068 (0.67 | 0.57). (D) HALC3_104 (0.87 | 0.88). (E) HALC3_109 (0.78 | 0.69). (F) HALC4_135 (0.80 | 0.59). (G) HALC4_136 (0.80 | 0.71). (H) HALC15–5_262 (0.65 | 0.46). (I) HALC18–6_265 (0.65 | 0.49). (J) HALC33–3_343 (0.49 | 0.41).
