PLoS Comput Biol. 2018 Apr 30;14(4):e1006104.
doi: 10.1371/journal.pcbi.1006104. eCollection 2018 Apr.

Automated evaluation of quaternary structures from protein crystals

Spencer Bliven et al. PLoS Comput Biol. 2018.

Abstract

A correct assessment of the quaternary structure of proteins is a fundamental prerequisite to understanding their function, physico-chemical properties and mode of interaction with other proteins. Currently about 90% of structures in the Protein Data Bank are crystal structures, in which the correct quaternary structure is embedded in the crystal lattice among a number of crystal contacts. Computational methods are required to 1) classify all protein-protein contacts in crystal lattices as biologically relevant or crystal contacts and 2) provide an assessment of how the biologically relevant interfaces combine into a biological assembly. In our previous work we addressed the first problem with our EPPIC (Evolutionary Protein Protein Interface Classifier) method. Here, we present our solution to the second problem with a new method that combines the interface classification results with symmetry and topology considerations. The new algorithm enumerates all possible valid assemblies within the crystal using a graph representation of the lattice and predicts the most probable biological unit based on the pairwise interface scoring. Our method achieves 85% precision (ranging from 76% to 90% for different oligomeric types) on a new dataset of 1,481 biological assemblies with consensus of PDB annotations. Although almost the same precision is achieved by PISA, currently the most popular quaternary structure assignment method, we show that, due to the fundamentally different approach to the problem, the two methods are complementary and could be combined to improve biological assembly assignments. The software for the automatic assessment of protein assemblies (EPPIC version 3) has been made available through a web server at http://www.eppic-web.org.
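The enumeration step described in the abstract can be illustrated with a minimal sketch. This is not the EPPIC implementation: it assumes a small, finite toy graph (real lattices are periodic), treats an "assembly" simply as a connected component obtained by engaging a subset of interface types, and skips the symmetry and topology validity checks. The scoring rule (product of per-interface probabilities) and all names and values here are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def components(nodes, edges):
    """Return the connected components of an undirected graph (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return [frozenset(g) for g in groups.values()]


def enumerate_assemblies(nodes, typed_edges):
    """Yield (engaged_types, sorted component sizes) for every subset of types.

    typed_edges is a list of (node_a, node_b, interface_type) tuples.
    """
    types = sorted({t for _, _, t in typed_edges})
    for k in range(len(types) + 1):
        for engaged in combinations(types, k):
            edges = [(a, b) for a, b, t in typed_edges if t in engaged]
            sizes = sorted(len(c) for c in components(nodes, edges))
            yield engaged, sizes


def assembly_score(engaged, prob):
    """Hypothetical score: engaged interfaces contribute p, the rest 1 - p."""
    score = 1.0
    for t, p in prob.items():
        score *= p if t in engaged else (1.0 - p)
    return score


# Hypothetical toy lattice: four chain copies and two interface types.
nodes = ["A0", "A1", "A2", "A3"]
typed_edges = [("A0", "A1", 1), ("A2", "A3", 1), ("A1", "A2", 2), ("A3", "A0", 2)]
results = dict(enumerate_assemblies(nodes, typed_edges))

# Hypothetical probabilities that each interface type is biologically relevant.
prob = {1: 0.9, 2: 0.2}
best = max(results, key=lambda engaged: assembly_score(engaged, prob))
```

In this toy case, engaging only interface type 1 gives two dimers and receives the highest score, while engaging both types gives a single tetramer. A real lattice graph would additionally need to reject subsets that produce infinite or non-isomorphic assemblies.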

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Visualizations of the biological assembly for GAD1 from Arabidopsis thaliana [PDB:3HBX], as presented by the EPPIC server.
(a) 3D lattice graph of a full unit cell (http://eppic-web.org/ewui/ewui/latticeGraph?id=3hbx&interfaces=*). The nodes are placed at the centroids of each chain, with edges indicating all interfaces. Many edges extend outside the unit cell due to the periodic nature of the lattice. (b) 2D graph of the hexameric biological assembly, formed by engaging three interface types (interfaces 1-3, 4-6 and 8-13). In both diagrams, nodes are labeled with chain ID and symmetry operator and colored by molecular entity. Edges are numbered sequentially by buried surface area and colored by interface type.
Fig 2. EPPIC assembly predictions as a confusion matrix of macromolecular sizes.
Tiles are colored as the fraction of predictions (i.e. row normalized). The method achieves 85% precision on the dataset. PDB1 refers to the first biological assembly annotation provided by the PDB, considered here as the true biological assembly.
Fig 3. Comparison of assembly predictions from EPPIC and PISA on the benchmarking dataset.
On the top right, a pie chart shows the global agreement between EPPIC and PISA. On the bottom left, the confusion matrix shows actual (PDB1 annotations) and predicted macromolecular sizes, with tiles colored as the fraction of each EPPIC (blue) and PISA (red) macromolecular size prediction (i.e. row normalized). On the bottom right, the agreement and precision of the methods are shown for each PISA macromolecular size prediction. On the top left, the total number and recall are shown for each macromolecular size in the dataset.
Fig 4. EPPIC and PISA predictions on the protein assembly dataset as a Venn diagram.
PDB1 refers to the first biological assembly annotation provided by the PDB.
Fig 5. Example of an asymmetric assembly with a heterologous interface.
(a) The crystal lattice of PDB 2VCO as shown by the EPPIC server (http://eppic-web.org/ewui/ewui/latticeGraph?id=2vco&interfaces=1,3). The highlighted tetrameric assembly is the one annotated in the PDB. (b) Schematic 2D representation of a lattice that contains an asymmetric dimer through a heterologous interface but which does not form infinite fibers in the crystal.
Fig 6. Example of a non-isomorphic assembly in the crystal.
(a) The crystal lattice of PDB 1A99, highlighting the C2 dimer wrapping around the unit cell (http://eppic-web.org/ewui/ewui/latticeGraph?id=1a99&interfaces=7). (b) Schematic 2D representation of a lattice that contains a valid C2 assembly, but which is not isomorphic throughout the crystal.
Fig 7. Example of an asymmetric assembly with internal pseudo-symmetry in one of the chains.
(a) The ABC transporter (PDB 4FI3). (b) The BtuF periplasmic domain with internal C2 pseudo-symmetry highlighted, including the 2-fold axis of symmetry. The internal symmetry calculation was performed with CE-Symm [35].
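
The row normalization mentioned in the captions of Figs 2 and 3 can be sketched as follows. This is only an illustration: the counts are hypothetical and are not taken from the paper, rows are assumed to be predicted sizes and columns annotated (PDB1) sizes, and "overall precision" is assumed to mean the fraction of all predictions that match the annotation.

```python
def row_normalize(matrix):
    """Divide every row by its sum so each row holds fractions of predictions."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized


# Hypothetical counts: rows = predicted size, columns = annotated (PDB1) size.
sizes = [1, 2, 4]
counts = [
    [8, 2, 0],  # predicted monomer
    [1, 6, 1],  # predicted dimer
    [0, 1, 3],  # predicted tetramer
]

fractions = row_normalize(counts)

# With row normalization, the diagonal is the precision for each predicted size.
per_size_precision = {s: fractions[i][i] for i, s in enumerate(sizes)}

# Overall precision: correct predictions divided by all predictions.
correct = sum(counts[i][i] for i in range(len(sizes)))
overall_precision = correct / sum(sum(row) for row in counts)
```

Normalizing by column instead would put recall on the diagonal, which is why the captions state the normalization direction explicitly.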

References

    1. Svedberg T. Mass and Size of Protein Molecules; 1929. Available from: http://www.nature.com/doifinder/10.1038/123871a0.
    2. Bernal JD. General introduction structure arrangements of macromolecules. Discussions of the Faraday Society. 1958;25:7. doi: 10.1039/df9582500007
    3. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, et al. The Protein Data Bank. Nucleic Acids Research. 2000;28(1):235–242. doi: 10.1093/nar/28.1.235
    4. Baskaran K, Duarte JM, Biyani N, Bliven S, Capitani G. A PDB-wide, evolution-based assessment of protein-protein interfaces. BMC Structural Biology. 2014;14:1–11. doi: 10.1186/s12900-014-0022-0
    5. Capitani G, Duarte JM, Baskaran K, Bliven S, Somody JC. Understanding the fabric of protein crystals: Computational classification of biological interfaces and crystal contacts. Bioinformatics. 2015;32(4):481–489. doi: 10.1093/bioinformatics/btv622
