Protein Sci. 2018 Jan;27(1):41-50.

doi: 10.1002/pro.3249. Epub 2017 Aug 11.

Ensemblator v3: Robust atom-level comparative analyses and classification of protein structure ensembles

Andrew E Brereton¹, P Andrew Karplus¹

PMID: 28762605
PMCID: PMC5734391
DOI: 10.1002/pro.3249

Andrew E Brereton et al. Protein Sci. 2018 Jan.

Affiliation

¹ Department of Biochemistry and Biophysics, Oregon State University, Corvallis, Oregon, 97331.

Abstract

Ensembles of protein structures are increasingly used to represent the conformational variation of a protein as determined by experiment and/or by molecular simulations, as well as uncertainties that may be associated with structure determinations or predictions. Making the best use of such information requires the ability to quantitatively compare entire ensembles. For this reason, we recently introduced the Ensemblator (Clark et al., Protein Sci 2015; 24:1528), a novel approach to compare user-defined groups of models, in residue level detail. Here we describe Ensemblator v3, an open-source program that employs the same basic ensemble comparison strategy but includes major advances that make it more robust, powerful, and user-friendly. Ensemblator v3 carries out multiple sequence alignments to facilitate the generation of ensembles from non-identical input structures, automatically optimizes the key global overlay parameter, optionally performs "ensemble clustering" to classify the models into subgroups, and calculates a novel "discrimination index" that quantifies similarities and differences, at residue or atom level, between each pair of subgroups. The clustering and automatic options mean that no pre-knowledge about an ensemble is required for its analysis. After describing the novel features of Ensemblator v3, we demonstrate its utility using three case studies that illustrate the ease with which complex analyses are accomplished, and the kinds of insights derived from clustering into subgroups and from the detailed information that locates significant differences. The Ensemblator v3 enhances the structural biology toolbox by greatly expanding the kinds of problems to which this ensemble comparison strategy can be applied.

Keywords: NMR ensemble; Rosetta; clustering; ensemble clustering; protein structure comparison; python; structure prediction; superposition; template-based modeling.

Figures

**Figure 1**
Analysis of the solution structure of RNase Sa. (A) Discrimination Index (DI) plots for the pairwise comparisons of the three groups identified by the Ensemblator. The residue‐based global DI (blue) and the local DI (green) are averaged to create the unified DI (red). The median unified DI is also indicated (black line). (B) Wire‐diagram tracing of the backbone path in the region of largest inter‐group difference (residues 44–49): Group 1 (blue; models 1,2,7,8,10,13–15); Group 2 (green; models 3–6,9,11,12); Group 3 (red; models 16–20). (C) Wire‐diagram as in (B), for groups identified by analysis of only residues 38–58: Group 1 (blue; models 3–7,9,12,16–20); Group 2 (red; models 1,2,8,10,11,13–15). The tighter backbone spread results from the more local overlay. (D) φ,ψ values for residues 46 (circles), 47 (squares), and 48 (triangles) representative of the three groups shown in panel (B) (blue, green, red) and the X‐ray structures (purple). The ±30° boxes indicate the areas used in Protein Geometry Database (ref. 11) searches for tripeptides present in structures solved at 1.5‐Å resolution or better that have no more than 25% sequence identity to one another. The tripeptide conformation in all the X‐ray models was found 467 times (0.34% of all tripeptides), while zero occurrences were found for the NMR conformations.

**Figure 2**
Analysis of a mixed‐source ensemble of the FK506 binding protein (FKBP). (A) t‐SNE dimensionality reduction results showing a 2D visualization of the relationships between the models in the N‐dimensional space used to cluster them. Per the key, the shape of each point represents the original label for a given model, and the clusters are differentiated by color (1—blue, 2—green, 3—red). (B) Backbone RMSDs along the chain for the final set of X‐ray (blue), NMR (green), or Rosetta (red) produced models. The bars indicate positions of β‐strands (purple), and α/3–10 helices (orange). (C) Discrimination Index (DI) plots for the Rosetta models vs. the X‐ray models. Residue‐based global (blue), local (green), and unified (black) DI are shown, along with the median unified DI (horizontal black line). Secondary structure indicated as in (B). (D) Wire‐diagram tracing the backbone for the X‐ray (blue), the NMR (green) and the Rosetta (red) models. The N‐ and C‐termini are indicated, as well as the position of residue 67, at the base of an α‐helix. (E) The φ,ψ‐angles for serine 67 in the Rosetta (red) and the X‐ray structures (blue) are shown. As context, the φ,ψ‐values of all serine residues in crystal structures at 1.5 Å resolution or better with ≤25% sequence identity to one another are indicated (black dots).

**Figure 3**
Ensemblator analysis of calmodulin (CaM) crystal structures. (A) Wire‐diagram backbone tracing for the ligand‐bound models (blue), and the ligand‐free models (red), as overlaid by the Ensemblator. (B) Discrimination indices (top panel; global (blue), local (green), unified (black), and median unified (horizontal black line)), and RMSDs from the global (middle panel) and local (bottom panel) comparisons for the entire CaM protein. In the global and local comparisons, the within group variation is shown for the ligand‐bound (green) and ligand‐free (blue) conformations. Also indicated is the inter‐group variation (black) and the closest approach distances (grey). (C) As in (B), except the analysis only included the N‐terminal domain. (D) As in (B), except the analysis only included the C‐terminal domain.
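
The figure captions state that the residue-based global and local DI values are averaged to give the unified DI, and that the median unified DI is drawn as a reference line. A minimal Python sketch of just that averaging step is given here. It is not the Ensemblator's own code: the function name `unified_di` and the input numbers are hypothetical, and how the global and local DI values themselves are calculated is not described on this page.

```python
from statistics import median


def unified_di(global_di, local_di):
    """Average per-residue global and local DI values (hypothetical helper)."""
    if len(global_di) != len(local_di):
        raise ValueError("global and local DI must cover the same residues")
    return [(g + l) / 2.0 for g, l in zip(global_di, local_di)]


# Made-up per-residue values for illustration only (not taken from the paper).
global_di = [0.10, 0.80, 0.40, 0.60]
local_di = [0.30, 0.60, 0.20, 1.00]

unified = unified_di(global_di, local_di)

# The median unified DI corresponds to the horizontal line in the DI plots.
median_unified = median(unified)
```

Residues whose unified DI lies well above the median would be the ones to inspect first when looking for differences between two groups.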

Cited by

Proteomimetic Zinc Finger Domains with Modified Metal-binding β-Turns.
Rao SR, Horne WS. Pept Sci (Hoboken). 2020 Sep;112(5):e24177. doi: 10.1002/pep2.24177. Epub 2020 Jun 7. PMID: 33733039.
Systematic identification of recognition motifs for the hub protein LC8.
Jespersen N, Estelle A, Waugh N, Davey NE, Blikstad C, Ammon YC, Akhmanova A, Ivarsson Y, Hendrix DA, Barbar E. Life Sci Alliance. 2019 Jul 2;2(4):e201900366. doi: 10.26508/lsa.201900366. Print 2019 Aug. PMID: 31266884.
Automated NMR resonance assignments and structure determination using a minimal set of 4D spectra.
Evangelidis T, Nerli S, Nováček J, Brereton AE, Karplus PA, Dotas RR, Venditti V, Sgourakis NG, Tripsianes K. Nat Commun. 2018 Jan 26;9(1):384. doi: 10.1038/s41467-017-02592-z. PMID: 29374165.
Ligand-induced shifts in conformational ensembles that describe transcriptional activation.
Khan SH, Braet SM, Koehler SJ, Elacqua E, Anand GS, Okafor CD. Elife. 2022 Oct 12;11:e80140. doi: 10.7554/eLife.80140. PMID: 36222302.
Heterogeneous-Backbone Proteomimetic Analogues of Lasiocepsin, a Disulfide-Rich Antimicrobial Peptide with a Compact Tertiary Fold.
Cabalteja CC, Lin Q, Harmon TW, Rao SR, Di YP, Horne WS. ACS Chem Biol. 2022 Apr 15;17(4):987-997. doi: 10.1021/acschembio.2c00138. Epub 2022 Mar 15. PMID: 35290019.

Grants and funding

R01 GM083136/GM/NIGMS NIH HHS/United States
