J Chem Theory Comput. 2020 Mar 10;16(3):1985-2001.
doi: 10.1021/acs.jctc.9b01010. Epub 2020 Feb 18.

Obtaining Tertiary Protein Structures by the ab Initio Interpretation of Small Angle X-ray Scattering Data

Christopher Prior et al. J Chem Theory Comput. 2020.

Abstract

Small angle X-ray scattering (SAXS) is an important tool for investigating the structure of proteins in solution. We present a novel ab initio method that represents polypeptide chains as discrete curves and derives a meaningful three-dimensional model from only the primary sequence and SAXS data. High resolution structures were used to generate probability density functions for each common secondary structural element found in proteins, which are used to place realistic restraints on the model curve's geometry. This is coupled with a novel explicit hydration shell model in order to derive physically meaningful three-dimensional models by optimizing against experimental SAXS data. The efficacy of this model is verified on an established benchmark protein set, and then it is used to predict the lysozyme structure using only its primary sequence and SAXS data. Finally, the method is used to generate a biologically plausible model of the coiled-coil component of the human synaptonemal complex central element protein.
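To make "optimizing against experimental SAXS data" concrete, the sketch below computes a model scattering curve from a set of point scatterers with the standard Debye formula, I(q) = Σ_i Σ_j f_i f_j sin(q r_ij)/(q r_ij). This is a generic illustration and not necessarily the scattering calculation used in the paper; the function name `debye_intensity` and the constant unit form factors are assumptions made for the example.

```python
import math

def debye_intensity(points, q_values, form_factors=None):
    """Scattering intensity of point scatterers via the Debye formula.

    points       -- list of (x, y, z) coordinates (e.g. a backbone trace)
    q_values     -- momentum transfer values, in inverse units of the coordinates
    form_factors -- per-point scattering amplitudes (assumed constant, default 1)
    """
    n = len(points)
    if form_factors is None:
        form_factors = [1.0] * n  # hypothetical: identical unit scatterers
    intensities = []
    for q in q_values:
        total = 0.0
        for i in range(n):
            total += form_factors[i] ** 2  # self term (r_ii = 0, sinc = 1)
            for j in range(i + 1, n):
                x = q * math.dist(points[i], points[j])
                sinc = 1.0 if x == 0.0 else math.sin(x) / x
                # each unordered pair contributes twice to the double sum
                total += 2.0 * form_factors[i] * form_factors[j] * sinc
        intensities.append(total)
    return intensities

# Two unit scatterers separated by a distance of 2
curve = debye_intensity([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [0.0, 0.5])
```

At q = 0 the intensity reduces to the square of the summed form factors, which gives a quick sanity check on any implementation.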

Conflict of interest statement

The authors declare no competing financial interest.

Figures

Figure 1
Elements of the backbone model. (a) Curve subsections (c_i, c_{i+1}, c_{i+2}, c_{i+3}) (red points) and their midsection points (c_m1, c_m2, c_m3) (blue). The sphere defined by these midsection points is shown; the inverse of its radius is the curvature κ. The first example is more tightly wound and has a smaller sphere, hence a higher κ value. (b) α-Helical section with uniformly similar (κ, τ) values. (c) Flexible (linker) section with varying (κ, τ) values.
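The curvature construction in panel (a) can be sketched as follows. The example assumes that the radius of the sphere equals the circumradius of the circle through the three midsection points; the paper's exact construction may differ, and the helper names (`midpoint`, `circumradius`, `discrete_curvature`) are hypothetical.

```python
import math

def midpoint(p, q):
    """Midpoint of the segment joining two 3D points."""
    return tuple((a + b) / 2.0 for a, b in zip(p, q))

def circumradius(a, b, c):
    """Radius of the circle through three 3D points (inf if collinear)."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    area = 0.5 * math.sqrt(sum(x * x for x in cross))
    if area == 0.0:
        return math.inf
    # R = (product of side lengths) / (4 * triangle area)
    return math.dist(a, b) * math.dist(b, c) * math.dist(c, a) / (4.0 * area)

def discrete_curvature(c0, c1, c2, c3):
    """Curvature of a four-point subsection: 1 / radius through its midpoints.

    Assumption: the sphere in the figure has the circumradius of the three
    midsection points; this is an illustrative reading of the caption.
    """
    m1, m2, m3 = midpoint(c0, c1), midpoint(c1, c2), midpoint(c2, c3)
    r = circumradius(m1, m2, m3)
    return 0.0 if math.isinf(r) else 1.0 / r

def arc_points(radius, step, count=4):
    """Points spaced by a fixed angle on a planar circle (test geometry)."""
    return [(radius * math.cos(k * step), radius * math.sin(k * step), 0.0)
            for k in range(count)]

tight = discrete_curvature(*arc_points(1.0, 0.5))
loose = discrete_curvature(*arc_points(2.0, 0.5))
```

For points sampled on a circle of radius R with angular step θ, the midpoints lie on a circle of radius R cos(θ/2), so the more tightly wound section yields the larger κ, as the caption describes.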
Figure 2
Illustrations of κ–τ spaces used to impose realistic geometry constraints on the polypeptide chain. (a) (κ, τ) pairs obtained from crystal structures, plotted as points with κ on the horizontal axis and τ on the vertical axis. (b) Probability density function (PDF) created from the data in (a), which correspond to linker sections. There are three distinct domains of high probability corresponding to the preferred secondary structural elements.
Figure 3
Visualizations of the hydration layer model. (a) Initial solvent layer, shown as silver spheres, with the core (R_c) and outer (R_o) cylinders surrounding the axis of the section (red curve). (b) Overlapping sections and solvent layers. (c) In blue, the removed solvent molecules of the pair of sections shown in (b).
Figure 4
Comparisons of crystallographic and model solvent positions from the crystal structure of a phosphate-binding protein (PDB 4F1V), determined at an ultrahigh resolution of 0.88 Å. (a) PDB backbone and relevant solvent molecules. (b) Model solvent positions (surrounding the same curve as in (a)) obtained with the experimentally determined hydration shell model parameters.
Figure 5
Fits to scattering data for various molecules using appropriate Cα coordinates as a backbone model {c_i}_{i=1}^{n}. (See section 3 of the Supporting Information for details.) In panels a–d, the scattering data are shown overlaid by the smoothed data used for fitting (blue curve) and the model fit (red curve). Panel e is the averaged scattering function famex obtained by averaging the scattering parameters obtained from fits like those shown in (a)–(d).
Figure 6
Secondary knot fingerprint analysis of the lysozyme structure. (a) Cα trace of lysozyme (PDB 1LYZ, ref 41). The α-helices are shown in red, β-strand structures are in green, and linker sections are in light blue. (b) Random structure generated using the CB algorithm which has the same secondary structural elements as lysozyme. This could be a starting model for the fitting procedure. Panels c and d are secondary fingerprints of two different crystal structures of lysozyme (1LYZ and 1AKI, respectively). The knot types are indicated (Rolfsen classification); white spaces indicate no secondary knots (all knots were of the primary type). (e) Secondary fingerprint for the random structure shown in (b); it differs significantly from (c) and (d) and has a larger range of knots present.
Figure 7
Properties of the (secondary) knot fingerprint statistic based on variations of the lysozyme structure. (a) Secondary knot statistics of various structures K compared to the curve shown in Figure 6a. The two distinct sets are lysozyme coordinates from the PDB and random structures with secondary structure alignment to lysozyme (generated using the CB algorithm). (b) Plots of the mean, maximum, and minimum values of the 50 secondary knot statistics comparing the 1LYZ structure and the same structure subjected to n random changes in its secondary structure. The dotted lines show one standard deviation from the mean. The black line is the average of the PDB structure secondary fingerprint statistics (see (a)), the purple line is the random structure average (crossing the mean at about n = 15), and the yellow line is the average of secondary fingerprint values for models which fit the experimental data (crossing the mean at about n = 3).
Figure 8
Illustration of the fitting process. (a) Initial configuration of the backbone based only on the secondary structure assignment of lysozyme (PDB 1LYZ). Also shown as spheres are the molecules of the hydration layer. (b) Model scattering curve of the initial configuration compared with the BioSAXS data. (c) Final structure (and hydration layer) obtained from the fitting process. (d) Model scattering curve of the final structure, which now fits the BioSAXS data well.
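The comparison between a model scattering curve and experimental data, as in panels (b) and (d), is commonly quantified with a chi-squared measure. The sketch below is a generic version under stated assumptions: the model curve is matched to the data by a single least-squares scale factor, and the result is normalized by the number of data points. It is not the paper's fitting functional, whose definition is not given in this text, and the function names are hypothetical.

```python
def optimal_scale(i_exp, i_model, sigma):
    """Least-squares scale factor c minimizing sum(((e - c*m) / s)**2)."""
    num = sum(e * m / s ** 2 for e, m, s in zip(i_exp, i_model, sigma))
    den = sum(m * m / s ** 2 for m, s in zip(i_model, sigma))
    return num / den

def chi_square_per_point(i_exp, i_model, sigma):
    """Mean squared, error-weighted residual after optimal scaling.

    i_exp   -- experimental intensities
    i_model -- model intensities at the same q values
    sigma   -- experimental uncertainties (all nonzero)
    """
    c = optimal_scale(i_exp, i_model, sigma)
    residuals = [((e - c * m) / s) ** 2
                 for e, m, s in zip(i_exp, i_model, sigma)]
    return sum(residuals) / len(residuals)

# A model that matches the data up to a constant factor fits perfectly
perfect = chi_square_per_point([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
# A model with a shape mismatch leaves a positive residual
imperfect = chi_square_per_point([2.0, 4.0, 7.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

An optimizer would adjust the backbone curve to reduce such a measure; here the scale factor absorbs the arbitrary intensity units of the experiment.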
Figure 9
Sections of the 1LYZ PDB structure and example fits obtained by fitting our model to the scattering data. Panels a and c are subsections of the PDB structure; (a) contains the sheet. Panels b and d are composite visualizations of the predictions.
Figure 10
Comparison of RMSD measures and knot fingerprint statistics for fittings of the model to scattering data for lysozyme and ribonuclease. These results are obtained using the PDB structure as the initial input to the algorithm and are by comparison to that PDB structure. (a) Lysozyme; (b) ribonuclease.
Figure 11
Sections of the 3V03 PDB structure and example fits obtained by fitting our model to the scattering data. Panels a and c are subsections of the PDB. Panels b and d are example predictions.
Figure 12
Ab initio predictions for lysozyme based on sequence data alone. (a) RMSD and knot statistic values for the predictions K_n; these are indicated as blue circles, with the from-PDB data (Figure 10) shown as brown squares for comparison. (b) Secondary structure sections 1–10 (residues 1–63) of the 1LYZ crystal structure. (c) Secondary structure sections 1–10 of the best ab initio fit. (d) Secondary structure sections 11+ (the remainder of the structure, residues 64–129) of the 1LYZ crystal structure. (e) Secondary structure sections 11+ of the best ab initio fit.
Figure 13
Comparisons of high [formula image] and low [formula image] lysozyme predictions. (a) PDB 1LYZ crystal structure. (b) High quality fit ([formula image]). (c) Low quality fit ([formula image]). (d) Comparison of the contact constraint χcon and the knot fingerprint; the blue points (with larger values) are for the ab initio fits and the brown dots are from the PDB fits. (e) Two (green) sections of a sheet from the lysozyme model. A plane and its normal bisecting the strand sections are shown; also shown are two sections of the rest of the molecule which bisect the plane between the two strands. (f) Fingerprint–RMSD comparison plot with screened ab initio predictions.
Figure 14
Schematic drawing of the SYCE1 construct with each box corresponding to one predicted α-helix. The SYCE sequence of approximately 120 amino acids corresponding to helices 1–4 is duplicated and linked by a tether to a repeat of the same sequence comprised of helices 5–8.
Figure 15
Illustrations of the optimization process used to obtain the model predictions for the structural core of human SYCE1. (a) Initial starting configuration of the backbone based only on the sequence data shown in Figure 14. Also shown as spheres are the molecules of the hydration layer. Large black and white spheres indicate the termini. (b) Scattering curve of the initial configuration (blue) overlaid on the scattering data (red). (c) Model prediction for which χf2 + χnl + χcon < 0.008; the termini are next to each other. (d) Final scattering curve compared to experimental data.
Figure 16
Illustrations of the model predictions. (a, b) All model predictions. (c) One of the coiled-coil units of (a) with black tubes representing the contact prediction distances, as seen along the axis of the unit. (d) Tilted helical structure of the coiled-coil unit. (e) Model obtained by minimizing the chi-squared measure χnl + χcon only (i.e., without taking the scattering data into account).
Figure 17
GASBOR (obtained from SASBDB, code SASDAG2) and CB algorithm models for lysozyme superimposed on the 1LYZ crystal structure. (a) Bead cloud (red) of a GASBOR prediction for the same lysozyme scattering data as used above. (b) Lysozyme prediction (blue curve) from Results for which the SUPCOMB NSD is 0.813 and the knot fingerprint statistic is [formula image] (both by comparison to the 1LYZ PDB structure). (c) Lysozyme prediction (green curve) from Results for which the SUPCOMB NSD is 0.8664 and the knot fingerprint statistic is [formula image] (both by comparison to the 1LYZ PDB structure). (d) Secondary structure sections 1–10 of the CB lysozyme model shown in (c).
Figure 18
Comparison of (a) the CORAL-derived model of the SYCE1 core obtained in ref 24 and (b) an example CB-derived model.

References

    1. Petoukhov M. V.; Svergun D. I. Applications of small-angle X-ray scattering to biomacromolecular solutions. Int. J. Biochem. Cell Biol. 2013, 45, 429–437. 10.1016/j.biocel.2012.10.017.
    2. Kikhney A. G.; Svergun D. I. A practical guide to small angle X-ray scattering (SAXS) of flexible and intrinsically disordered proteins. FEBS Lett. 2015, 589, 2570–2577. 10.1016/j.febslet.2015.08.027.
    3. Mina J. G.; Thye J. K.; Alqaisi A. Q.; Bird L. E.; Dods R. H.; Grøftehauge M. K.; Mosely J. A.; Pratt S.; Shams-Eldin H.; Schwarz R. T.; Pohl E.; Denny P. W. Functional and phylogenetic evidence of a bacterial origin for the first enzyme in sphingolipid biosynthesis in a phylum of eukaryotic protozoan parasites. J. Biol. Chem. 2017, 292, 12208–12219. 10.1074/jbc.M117.792374.
    4. Svergun D. I.; Koch M. H.; Timmins P. A.; May R. P. Small Angle X-ray and Neutron Scattering from Solutions of Biological Macromolecules; Oxford University Press: 2013; Vol. 19.
    5. Rambo R. P.; Tainer J. A. Bridging the solution divide: comprehensive structural analyses of dynamic RNA, DNA, and protein assemblies by small-angle X-ray scattering. Curr. Opin. Struct. Biol. 2010, 20, 128–137. 10.1016/j.sbi.2009.12.015.