Bioinformatics. 2021 Apr 20;37(3):360-366.

doi: 10.1093/bioinformatics/btaa714.

GraphQA: protein model quality assessment using graph convolutional networks

Federico Baldassarre¹, David Menéndez Hurtado²,³, Arne Elofsson²,³, Hossein Azizpour¹

Affiliations

¹ Division of Robotics, Perception and Learning (RPL), Department of Intelligent Systems, School of Electrical Engineering and Computer Science (EECS), KTH - Royal Institute of Technology, 10044 Stockholm, Sweden.
² Science for Life Laboratory, Stockholm University, Box 1031, 17121 Solna, Sweden.
³ Department of Biochemistry and Biophysics, Stockholm University, 10691 Stockholm, Sweden.

PMID: 32780838
PMCID: PMC8058777
DOI: 10.1093/bioinformatics/btaa714

Federico Baldassarre et al. Bioinformatics. 2021.

Abstract

Motivation: Proteins are ubiquitous molecules whose function in biological processes is determined by their 3D structure. Experimental identification of a protein's structure can be time-consuming, prohibitively expensive and not always possible. Alternatively, protein folding can be modeled using computational methods, which, however, are not guaranteed to produce optimal results. GraphQA is a graph-based method for estimating the quality of protein models that possesses favorable properties such as representation learning, explicit modeling of both sequential and 3D structure, geometric invariance and computational efficiency.

Results: GraphQA performs similarly to state-of-the-art methods despite using relatively few input features. In addition, the graph network structure improves on the architecture used in ProQ4 when operating on the same input features. Finally, the individual contributions of GraphQA components are carefully evaluated.

Availability and implementation: A PyTorch implementation, datasets, experiments and a link to an evaluation server are available through this GitHub repository: github.com/baldassarreFe/graphqa.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
Protein QA. GraphQA predicts local and global scores from a protein’s graph using message passing between chemically bonded or spatially close residues. CASP QA algorithms score protein models by comparison with experimentally determined conformations.
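
The message-passing idea in this caption can be illustrated with a deliberately simplified sketch. It is not GraphQA's architecture: the scalar node features, the mean aggregation, and the names `message_passing_step` and `predict_scores` are assumptions made for illustration. In the real model the updates are learned layers.

```python
# Toy message passing over a residue graph.
# Illustrative only: GraphQA uses learned graph-network layers, not plain averaging.

def message_passing_step(node_feats, edges):
    """One round: each residue averages its own feature with its neighbours'.

    node_feats: list of floats, one hypothetical scalar feature per residue.
    edges: undirected (i, j) pairs linking bonded or spatially close residues.
    """
    neighbours = {i: [] for i in range(len(node_feats))}
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    updated = []
    for i, feat in enumerate(node_feats):
        incoming = [node_feats[j] for j in neighbours[i]]
        updated.append((feat + sum(incoming)) / (1 + len(incoming)))
    return updated


def predict_scores(node_feats, edges, num_layers=2):
    """Return (local_scores, global_score) after num_layers rounds."""
    feats = list(node_feats)
    for _ in range(num_layers):
        feats = message_passing_step(feats, edges)
    local_scores = feats                     # one value per residue
    global_score = sum(feats) / len(feats)   # mean readout over the whole graph
    return local_scores, global_score


if __name__ == "__main__":
    # Three residues in a chain: 0 - 1 - 2.
    local, overall = predict_scores([1.0, 0.0, 0.0], [(0, 1), (1, 2)], num_layers=2)
    print(local, overall)
```

After one round, information from residue 0 has reached only residue 1. After two rounds it has also reached residue 2, which is why the number of layers and the graph connectivity interact.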

**Fig. 2.**
Protein representations for learning. Sequential representations for LSTM or 1D-CNN fail to represent spatial proximity of non-consecutive residues. Volumetric representations for 3D-CNN fail instead to capture sequence information and are not rotation invariant. Protein graphs explicitly represent both sequential and spatial structure, and are geometrically invariant by design.
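
A minimal sketch of such a protein graph follows. It assumes one 3D coordinate per residue and a distance cutoff for spatial contacts. The function names, the 8 Å default, and the edge attributes (distance and sequence separation) are illustrative choices, not the paper's exact feature set. Because edges depend only on pairwise distances and sequence indices, rotating the structure leaves the graph unchanged.

```python
import math


def build_protein_graph(coords, cutoff=8.0):
    """Return edges as {(i, j): (distance, sequence_separation)} for i < j.

    coords: list of (x, y, z) positions in angstroms, one per residue.
    cutoff: hypothetical distance threshold for spatial contacts.
    Consecutive residues are always connected (sequential structure);
    other pairs are connected only if they lie within the cutoff.
    """
    edges = {}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dist = math.dist(coords[i], coords[j])
            if j - i == 1 or dist <= cutoff:
                edges[(i, j)] = (dist, j - i)
    return edges


def rotate_z(coords, angle):
    """Rotate all coordinates about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


if __name__ == "__main__":
    # A small hairpin-like toy structure with one distant tail residue.
    coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
              (0.0, 3.8, 0.0), (0.0, 3.8, 20.0)]
    graph = build_protein_graph(coords, cutoff=4.0)
    rotated = build_protein_graph(rotate_z(coords, 0.7), cutoff=4.0)
    print(sorted(graph) == sorted(rotated))
```

In the toy structure, residues 0 and 3 are far apart in sequence but close in space, so they share an edge that a purely sequential representation would miss.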

**Fig. 3.**
Joint plots of LDDT and GDT_TS scores on CASP13. The marginal plots show the distribution of true versus predicted scores.
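
For readers unfamiliar with the global score named here, GDT_TS is commonly defined as the average, over cutoffs of 1, 2, 4 and 8 Å, of the fraction of residues whose position in the model lies within that cutoff of the reference structure. The sketch below assumes the per-residue distances have already been computed under a single fixed superposition. The full GDT procedure searches over superpositions, which is omitted here, and the function name `gdt_ts` is illustrative.

```python
def gdt_ts(distances, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS in [0, 1] from per-residue deviations in angstroms.

    distances: model-to-reference distance for each residue, assumed to come
    from one fixed superposition (a simplification of the full GDT search).
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    fractions = [
        sum(d <= t for d in distances) / len(distances)
        for t in thresholds
    ]
    return sum(fractions) / len(thresholds)


if __name__ == "__main__":
    print(gdt_ts([0.5, 1.5, 3.0, 9.0]))
```

A model whose residues all lie within 1 Å of the reference scores 1.0, and residues deviating by more than 8 Å contribute nothing to any of the four terms.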

**Fig. 4.**
Trade-off between the number of message-passing layers and the connectivity of the protein graph (CASP11).

**Fig. 5.**
Ablation study of node (top) and edge (bottom) features (validation results on CASP11). All node features improve both local and global scoring. DSSP features are marginally more relevant for LDDT. Richer edge features benefit LDDT predictions the most, while bringing little improvement to GDT_TS.

**Fig. 6.**
Gradient magnitude of predicted LDDT score w.r.t. the edges of the input graph (T0773). In the edge matrix, a darker red indicates a higher magnitude. The attributions for residue 20 (left) and 60 (right) reveal the long-range dependencies between residues captured by GraphQA.
