Brief Bioinform. 2024 Mar 27;25(3):bbae137.
doi: 10.1093/bib/bbae137.

Multilevel superposition for deciphering the conformational variability of protein ensembles

Takashi Amisaki

Abstract

The dynamics and variability of protein conformations are directly linked to their functions. Many comparative studies of X-ray protein structures have been conducted to elucidate the relevant conformational changes, dynamics and heterogeneity. The rapid increase in the number of experimentally determined structures has made comparison an effective tool for investigating protein structures. For example, it is now possible to compare structural ensembles formed by enzyme species, variants or the type of ligands bound to them. In this study, the author developed a multilevel model for estimating two covariance matrices that represent inter- and intra-ensemble variability in the Cartesian coordinate space. Principal component analysis using the two estimated covariance matrices identified the inter-/intra-enzyme variabilities, which seemed to be important for the enzyme functions, with the illustrative examples of cytochrome P450 family 2 enzymes and class A $\beta$-lactamases. In P450, in which each enzyme has its own active site of a distinct size, an active-site motion shared universally between the enzymes was captured as the first principal mode of the intra-enzyme covariance matrix. In this case, the method was useful for understanding the conformational variability after adjusting for the differences between enzyme sizes. The developed method is advantageous in small ensemble-size problems and hence promising for use in comparative studies on experimentally determined structures where ensemble sizes are smaller than those generated, for example, by molecular dynamics simulations.
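The split into inter- and intra-ensemble covariance described in the abstract can be illustrated with a much simpler moment-based calculation. The sketch below is not the author's random effects model or EM algorithm. It assumes the structures have already been superposed and flattened into coordinate vectors, and the function names `between_within_covariances` and `principal_modes` are hypothetical.

```python
import numpy as np


def between_within_covariances(ensembles):
    """Moment-based split of variability into inter- and intra-ensemble parts.

    ensembles: list of arrays, each of shape (n_i, d), where every row is a
    flattened, already-superposed coordinate vector (an assumption of this sketch).
    """
    means = np.array([e.mean(axis=0) for e in ensembles])
    grand_mean = means.mean(axis=0)

    # Inter-ensemble part: covariance of the ensemble means around the grand mean.
    # This crude estimate also contains a share of the intra-ensemble noise.
    centered_means = means - grand_mean
    inter = centered_means.T @ centered_means / (len(ensembles) - 1)

    # Intra-ensemble part: pooled covariance of deviations from each ensemble's own mean.
    residuals = np.vstack([e - e.mean(axis=0) for e in ensembles])
    dof = sum(len(e) for e in ensembles) - len(ensembles)
    intra = residuals.T @ residuals / dof
    return inter, intra


def principal_modes(cov, k=2):
    """Return the k largest eigenvalues and their eigenvectors (as columns)."""
    vals, vecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]
```

Calling `principal_modes` on each of the two matrices gives separate principal modes for differences between ensembles and for motion within them, which is the kind of separation the abstract describes for the P450 active-site motion.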

Keywords: EM algorithm; covariance matrix; principal component analysis; random effects model; structural superposition.

Figures

Figure 1
The MAE of the estimates of standard deviations. The upper four panes are the plots of the MAE of the square root of the diagonal components of formula image for formula image 10 (A), 20 (B), 40 (C) and 80 (D). The lower panes are the respective plots for formula image. The averages of the true values for formula image and formula image were 0.52 and 0.51, respectively. Lower MAE indicates higher accuracy. The accuracy of the TS/OLS estimates was lower than that of the other two methods. The accuracy of the REM formula image was slightly higher than that of the other two at low formula image.
Figure 2
The RSMIP values of the top two eigenvectors as a function of formula image. The upper four panes are the plots for formula image with formula image 10 (A), 20 (B), 40 (C) and 80 (D). The lower panes are the respective plots for formula image. At low formula image, the accuracy of the REM estimates was slightly higher than that of the other methods.
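The caption does not define RSMIP. Assuming it is the conventional root mean square inner product between two sets of eigenvectors, a minimal sketch of the calculation might look as follows. The function name `rmsip` and the column layout of the inputs are assumptions made for illustration.

```python
import numpy as np


def rmsip(U, V):
    """Root mean square inner product between two sets of k eigenvectors.

    U, V: arrays of shape (d, k) whose columns are assumed to be orthonormal.
    The value is 1 when the two sets span the same subspace and 0 when the
    subspaces are orthogonal.
    """
    k = U.shape[1]
    overlaps = U.T @ V  # (k, k) matrix of pairwise dot products
    return float(np.sqrt(np.sum(overlaps ** 2) / k))
```

For the top two eigenvectors mentioned in the caption, `U` would hold the two leading eigenvectors of the true covariance matrix and `V` those of an estimate.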
Figure 3
The plot of openness and coordinates of the CYP2 structures along the PC1 of the covariance matrices: (A) PSE/IWLS formula image, (B) REM formula image and (C) REM formula image. Openness was inferred from descriptions in the literature. ND (not determined) indicates that no informative description was found in the literature. The PC1 of REM formula image seems to correlate better with openness. Note the difference in the scale of the vertical axes.
Figure 4
Open and closed structures of CYP2 generated in accordance with the PC1 of the estimated covariance matrices. The traces of C$\alpha$ are colored by the residue index, from red (C-terminal) to blue (N-terminal). The open structures (A) and (B) were obtained by adding the highest coordinates of the PC1 of SPE formula image and REM formula image, respectively, throughout the structures to the estimated mean structures. The closed structures (C) and (D) were obtained by similarly adding the lowest coordinates of the PC1 of SPE formula image and REM formula image, respectively. Structures (A) and (B) show the larger cleft produced by the upward shift of the F–G region. In addition, in structure (A), the upward shift of the B–C loop is pronounced, enlarging the active site cavity as shown by the elongated rod dist3. The three rods are drawn as defined in Table 1.
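Generating a structure by adding a coordinate along a principal component to the mean structure, as in this caption, can be sketched as below. The function name `displace_along_mode` is hypothetical, and the sketch assumes the eigenvector is stored in atom-major order (x1, y1, z1, x2, ...).

```python
import numpy as np


def displace_along_mode(mean_coords, mode, score):
    """Shift a mean structure along one principal mode.

    mean_coords: (n_atoms, 3) array of mean atom positions.
    mode: unit eigenvector of length 3 * n_atoms, assumed to be flattened in
          atom-major order (x1, y1, z1, x2, ...).
    score: coordinate along the mode, for example the highest or lowest
           PC1 value observed among the structures.
    """
    mean_coords = np.asarray(mean_coords, dtype=float)
    mode = np.asarray(mode, dtype=float)
    return mean_coords + score * mode.reshape(mean_coords.shape)
```

Passing the highest and the lowest observed PC1 coordinates as `score` would give the two extremes along that mode, in the spirit of the open and closed structures shown in the figure.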
Figure 5
The average magnitudes of per-residue deviations of the CYP2 conformations along PC1. (A) SPE formula image, (B) REM formula image and (C) REM formula image. Each deviation is the average over the 162 conformations. The residue numbers of 2C8 are shown. The capital letters and bars at the bottom indicate the major helices and their regions (adapted from Ref. 37).
Figure 6
Projections of the X-ray structures of the $\beta$-lactamases onto the PC1–PC2 plane of the covariance matrices: (A) REM formula image and (B) REM formula image. (A) The structures were loosely classified into three clusters: non-ESBLs, half of the ESBLs, and carbapenemases together with the other half of the ESBLs. The outliers of PenL, 6afmA and 6afnA, are the structures of a non-catalytic-region mutant C69Y [39]. (B) The outliers of SHV-1 and TEM-1 were neighbors of each other’s major clusters. The structures 1pzpA and 1pzoA were reported as core-disrupted structures [43]. All structures of the SHV-1-BLIP complex were located in the vicinity of the TEM-1 cluster. The other three chains (A–C) of 5eph were not included in the analysis because of insufficient coordinates for the C$\alpha$ atoms.
Figure 7
The superimposed mean structures of the $\beta$-lactamases. The cartoon drawing is the mean structure of TEM-1. The other structures are shown in colored traces. The gray, blue and orange lines are the structures of ESBLs, non-ESBLs and carbapenemases, respectively. The location of the formula imageformula image loop was noticeable. The locations of the carbapenemases and half of the ESBLs were shifted to cover the formula imageformula image loop, which faced the formula image-loop.
Figure 8
Displacement of the C$\alpha$ atoms of each structure of the $\beta$-lactamases. The coordinates in the first three PCs of (A) SPE formula image, (B) REM formula image and (C) REM formula image were transformed to the deviations in the Cartesian coordinates (Å) and plotted per residue. The peak occurred at the hinge-formula image11 region (218–223) for the deviation along the PC1 of formula image (C, left). formula image-, formula imageformula image, and formula imageformula image (169–176, 238–243, 266–275, respectively) constituted triple peaks in the plot for the PC2 of formula image (C, middle). The standalone peak in (C, right) was produced mainly by the outliers in BEL-1 (5ephD). The PC1s of formula image and formula image were almost identical, as indicated by their dot product of 0.961.
Figure 9
Superimposed mean group structures of calmodulin. (A) REM and (B) TS/OLS. The traces of the C$\alpha$ atoms are shown. The yellow and blue lines represent the average structures of the NMR and X-ray ensembles, respectively. For example, the uppermost helix in (A) belongs to the average structure of an X-ray ensemble. Calmodulin consists of N- and C-domains and a linker connecting them. In the structures superimposed using the (heteroscedastic) REM method, the N-terminal half exhibited a better fit than the C-terminal half.

References

    1. Boehr DD, Nussinov R, Wright PE. The role of dynamic conformational ensembles in biomolecular recognition. Nat Chem Biol 2009;5:789–96.
    2. Motlagh HN, Wrabl JO, Li J, Hilser VJ. The ensemble nature of allostery. Nature 2014;508:331–9.
    3. Kessel A, Ben-Tal N. Introduction to Proteins: Structure, Function, and Motion. Boca Raton: CRC Press, 2011. 10.1093/bib/bbad242, 10.1093/molbev/msab017.
    4. Wei G, Xi W, Nussinov R, Ma B. Protein ensembles: how does nature harness thermodynamic fluctuations for life? The diverse functional roles of conformational ensembles in the cell. Chem Rev 2016;116:6516–51.
    5. Campitelli P, Modi T, Kumar S, Ozkan SB. The role of conformational dynamics and allostery in modulating protein evolution. Annu Rev Biophys 2020;49:267–88.