Bioinformatics. 2023 Jan 1;39(1):btad030.

doi: 10.1093/bioinformatics/btad030.

3D-equivariant graph neural networks for protein model quality assessment

Chen Chen¹, Xiao Chen¹, Alex Morehead¹, Tianqi Wu¹, Jianlin Cheng¹

PMID: 36637199
PMCID: PMC10089647
DOI: 10.1093/bioinformatics/btad030

Chen Chen et al. Bioinformatics. 2023.

Affiliation

¹ Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA.

Abstract

Motivation: Quality assessment (QA) of predicted protein tertiary structure models plays an important role in ranking and using them. Deep learning end-to-end protein structure prediction techniques now generate highly confident tertiary structures for most proteins. Because these models have better quality and different properties than the models predicted by traditional tertiary structure prediction methods, it is important to explore corresponding QA strategies to evaluate and select them.

Results: We develop EnQA, a novel graph-based 3D-equivariant neural network method that is equivariant to rotation and translation of 3D objects to estimate the accuracy of protein structural models by leveraging the structural features acquired from the state-of-the-art tertiary structure prediction method, AlphaFold2. We train and test the method on both traditional model datasets (e.g. the datasets of the Critical Assessment of Techniques for Protein Structure Prediction) and a new dataset of high-quality structural models predicted only by AlphaFold2 for the proteins whose experimental structures were released recently. Our approach achieves state-of-the-art performance on protein structural models predicted by both traditional protein structure prediction methods and the latest end-to-end deep learning method, AlphaFold2. It performs even better than the model QA scores provided by AlphaFold2 itself. The results illustrate that the 3D-equivariant graph neural network is a promising approach to the evaluation of protein structural models. Integrating AlphaFold2 features with other complementary sequence and structural features is important for improving protein model QA.
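The equivariance property named in the Results can be stated compactly. The notation below is introduced here for illustration and does not come from the abstract: $X$ holds the residue coordinates, $R$ is any rotation matrix, $t$ is any translation vector, $f$ stands for a coordinate-valued output of the network and $g$ for a scalar output such as a per-residue quality score, which would be expected to stay unchanged under the same transformation.

```latex
% X \in \mathbb{R}^{N \times 3}: coordinates of N residues (assumed notation)
% R \in SO(3): rotation, t \in \mathbb{R}^{3}: translation
\begin{align}
  f\!\left(X R^{\top} + \mathbf{1} t^{\top}\right) &= f(X)\, R^{\top} + \mathbf{1} t^{\top}
    && \text{(equivariant, coordinate-valued output)} \\
  g\!\left(X R^{\top} + \mathbf{1} t^{\top}\right) &= g(X)
    && \text{(invariant, scalar output)}
\end{align}
```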

Availability and implementation: The source code is available at https://github.com/BioinfoMachineLearning/EnQA.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
The illustration of the local spherical coordinate system. Different colors indicate atoms from different residues. Here, *θ* and *φ* are the spherical angles and *r* is the radial distance for the vector between the alpha carbons (Cα) of two residues (blue and red)
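As a rough illustration of the quantities in Fig. 1, the following Python sketch computes *r*, *θ* and *φ* for the vector between two alpha carbons, expressed in a frame attached to the first residue. The caption does not say how the local frame is built, so the construction here (Gram-Schmidt on the backbone N, Cα and C atoms) and the function names `local_frame` and `spherical_coords` are assumptions made for the example, not the authors' definition.

```python
import numpy as np

def local_frame(n, ca, c):
    """Orthonormal axes at a residue from its backbone N, CA and C atoms.

    Assumed convention: x along CA->C, y in the N-CA-C plane, z = x cross y.
    """
    e1 = np.asarray(c, float) - np.asarray(ca, float)
    e1 = e1 / np.linalg.norm(e1)
    u = np.asarray(n, float) - np.asarray(ca, float)
    e2 = u - np.dot(u, e1) * e1          # remove the component along e1
    e2 = e2 / np.linalg.norm(e2)
    e3 = np.cross(e1, e2)
    return np.stack([e1, e2, e3])        # rows are the local axes

def spherical_coords(frame, ca_i, ca_j):
    """Radial distance and spherical angles of CA_i -> CA_j in the local frame.

    Assumes the two atoms do not coincide (r > 0).
    """
    v = frame @ (np.asarray(ca_j, float) - np.asarray(ca_i, float))
    r = np.linalg.norm(v)
    theta = np.arccos(np.clip(v[2] / r, -1.0, 1.0))  # polar angle from local z
    phi = np.arctan2(v[1], v[0])                     # azimuth in local x-y plane
    return r, theta, phi

# Toy coordinates (invented for the example, not taken from a real structure).
n_i, ca_i, c_i = [0, 1, 0], [0, 0, 0], [1, 0, 0]
ca_j = [1, 1, 0]
r, theta, phi = spherical_coords(local_frame(n_i, ca_i, c_i), ca_i, ca_j)

# Rotating every atom by the same rotation leaves (r, theta, phi) unchanged.
rot_z = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = [rot_z @ np.asarray(p, float) for p in (n_i, ca_i, c_i, ca_j)]
r2, theta2, phi2 = spherical_coords(local_frame(*moved[:3]), moved[1], moved[3])
```

Because the frame moves with the residue, these three values do not depend on how the whole model is oriented in space, which is what makes them usable as rotation-independent edge features.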

**Fig. 2.**
The illustration of the overall architecture of EnQA. The 1D/2D features from the input model are first converted into hidden node and edge features for the 3D-equivariant graph module. The spatial coordinates of Cα atoms of the residues are also used as an extra feature. The node and edge network modules update the graph features iteratively. In the end, the final per-residue lDDT score and distance errors of residue pairs are predicted from the updated node/edge features and spatial coordinates by the 3D-equivariant network
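To make the idea of a 3D-equivariant graph update concrete, here is a minimal NumPy sketch of one generic message-passing step in which node features depend only on pairwise distances and coordinates are moved along difference vectors. It is not the EnQA architecture: the layer form, the fully connected graph, the random untrained weights and all names (`mlp`, `egnn_layer`, `w_edge`, `w_coord`, `w_node`) are assumptions chosen only to show the property numerically.

```python
import numpy as np

def mlp(w, x):
    """One-layer stand-in for a learned network (weights are random, untrained)."""
    return np.tanh(x @ w)

def egnn_layer(h, x, w_edge, w_coord, w_node):
    """One generic equivariant update on a fully connected graph.

    h: (n, f) node features, x: (n, 3) coordinates.
    """
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]            # (n, n, 3) difference vectors
    d2 = (diff ** 2).sum(-1, keepdims=True)         # (n, n, 1) squared distances
    hi = np.repeat(h[:, None, :], n, axis=1)        # features of receiving node
    hj = np.repeat(h[None, :, :], n, axis=0)        # features of sending node
    m = mlp(w_edge, np.concatenate([hi, hj, d2], axis=-1))   # edge messages
    mask = 1.0 - np.eye(n)[..., None]               # drop self-edges
    scale = mlp(w_coord, m) * mask                  # (n, n, 1) scalar per edge
    x_new = x + (diff * scale).sum(axis=1) / (n - 1)
    h_new = h + mlp(w_node, np.concatenate([h, (m * mask).sum(axis=1)], axis=-1))
    return h_new, x_new

rng = np.random.default_rng(0)
n, f, k = 6, 4, 8                                   # toy sizes
h = rng.normal(size=(n, f))
x = rng.normal(size=(n, 3))
w_edge = rng.normal(size=(2 * f + 1, k))
w_coord = rng.normal(size=(k, 1))
w_node = rng.normal(size=(f + k, f))

# Random rotation (QR of a random matrix, sign-fixed to det = +1) and translation.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(q) < 0:
    q[:, 0] *= -1
t = rng.normal(size=3)

h1, x1 = egnn_layer(h, x, w_edge, w_coord, w_node)
h2, x2 = egnn_layer(h, x @ q.T + t, w_edge, w_coord, w_node)
```

In this sketch `h1` and `h2` agree up to floating-point error, and `x2` equals `x1` rotated and translated by the same `q` and `t`, so a scalar head reading the node features gives the same answer for any pose of the input.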

**Fig. 3.**
The distribution of lDDT scores of AlphaFold test models. The x-axis denotes the targets ordered by the mean lDDT of their models in increasing order. The red dots indicate the position of the median and the bars indicate the upper and lower ranges of model quality of each target

**Fig. 4.**
The comparison between the predicted and true lDDT scores for AlphaFold2_test models for the two methods (AF2 reported score and EnQA-MSA). The residue-level correlation is computed for all residues at once, which is different from the average of the residue-level correlation in each model (used in Sections 3.1 and 3.2). r, Pearson correlation coefficient; ρ, Spearman correlation coefficient. The lDDT scores predicted by EnQA-MSA have higher correlation with the true lDDT scores than AlphaFold2 self-reported scores
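The distinction drawn in the Fig. 4 caption, between a correlation computed over all residues at once and the average of per-model correlations, can be sketched as follows. The data are invented toy values chosen to exaggerate the difference, the helper names are hypothetical, and the rank-based Spearman helper assumes there are no tied values.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient r."""
    return float(np.corrcoef(a, b)[0, 1])

def spearman(a, b):
    """Spearman rho as Pearson on ranks (assumes no ties)."""
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return pearson(rank_a, rank_b)

def pooled_correlation(pred_by_model, true_by_model, corr=pearson):
    """Correlation over all residues of all models at once."""
    return corr(np.concatenate(pred_by_model), np.concatenate(true_by_model))

def mean_per_model_correlation(pred_by_model, true_by_model, corr=pearson):
    """Average of the residue-level correlation computed inside each model."""
    return float(np.mean([corr(p, t) for p, t in zip(pred_by_model, true_by_model)]))

# Two toy models: within each one the prediction ranks residues backwards,
# but the predictor still separates the poor model from the good one.
true_scores = [np.array([0.2, 0.3, 0.4]), np.array([0.7, 0.8, 0.9])]
pred_scores = [np.array([0.4, 0.3, 0.2]), np.array([0.9, 0.8, 0.7])]

pooled_r = pooled_correlation(pred_scores, true_scores)
per_model_r = mean_per_model_correlation(pred_scores, true_scores)
pooled_rho = pooled_correlation(pred_scores, true_scores, corr=spearman)
```

Here the pooled coefficient is strongly positive while the per-model average is negative, which is why the two summaries should not be compared directly.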

**Fig. 5.**
The distribution of estimation error between the predicted and true lDDT scores on AlphaFold2_test dataset. The difference between AF2_plddt scores and true lDDT scores (green) is significant (P < 0.01), but the difference between pLDDT scores predicted by EnQA-MSA and true lDDT scores (red) is not significant (P = 0.117)

**Fig. 6.**
The comparison of residue-level Pearson’s correlation coefficient when different features are randomly permuted for model QA. The red dots indicate the position of the median
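The permutation analysis summarized in Fig. 6 follows a general recipe: shuffle one feature, re-run the predictor and see how far the correlation with the true score falls. The sketch below shows that recipe on synthetic data. The stand-in predictor, the feature names, the choice to shuffle across residues and the use of the median over repeats are all assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def permutation_scores(predict, features, target, n_repeats=20, seed=0):
    """Median Pearson r after shuffling each feature in turn.

    features: dict mapping a feature name to a 1D array (one value per residue).
    predict:  callable taking such a dict and returning predicted scores.
    """
    rng = np.random.default_rng(seed)
    baseline = float(np.corrcoef(predict(features), target)[0, 1])
    scores = {}
    for name in features:
        values = []
        for _ in range(n_repeats):
            shuffled = dict(features)                      # shallow copy
            shuffled[name] = rng.permutation(features[name])
            values.append(np.corrcoef(predict(shuffled), target)[0, 1])
        scores[name] = float(np.median(values))
    return baseline, scores

# Synthetic data: the target depends on one feature and not on the other.
rng = np.random.default_rng(1)
toy_features = {
    "informative": rng.normal(size=500),
    "noise": rng.normal(size=500),
}
toy_target = toy_features["informative"] + 0.1 * rng.normal(size=500)

def toy_predict(f):
    """Stand-in for a trained QA model (hypothetical)."""
    return f["informative"] + 0.05 * f["noise"]

baseline, scores = permutation_scores(toy_predict, toy_features, toy_target)
```

A feature whose permuted score stays near the baseline contributes little to the prediction, while a large drop marks a feature the predictor relies on.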
