Prediction of interresidue contacts with DeepMetaPSICOV in CASP13

Shaun M Kandathil^{1,2}, Joe G Greener^{1,2}, David T Jones^{1,2}

Affiliations

¹ Department of Computer Science, University College London, London, UK.
² Biomedical Data Science Laboratory, The Francis Crick Institute, London, UK.

PMID: 31298436
PMCID: PMC6899903
DOI: 10.1002/prot.25779

Shaun M Kandathil et al. Proteins. 2019 Dec;87(12):1092-1099. doi: 10.1002/prot.25779. Epub 2019 Jul 27.

Abstract

In this article, we describe our efforts in contact prediction in the CASP13 experiment. We employed a new deep learning-based contact prediction tool, DeepMetaPSICOV (or DMP for short), together with new methods and data sources for alignment generation. DMP evolved from MetaPSICOV and DeepCov and combines the input feature sets used by these methods as input to a deep, fully convolutional residual neural network. We also improved our method for multiple sequence alignment generation and included metagenomic sequences in the search. We discuss successes and failures of our approach and identify areas where further improvements may be possible. DMP is freely available at: https://github.com/psipred/DeepMetaPSICOV.

Keywords: deep learning; machine learning; metagenomics; neural networks; protein contact prediction; protein structure prediction.

Figures

**Figure 1**
Architecture of the DeepMetaPSICOV residual neural network model. On the left, the overall organization of the model is shown, beginning with the inputs and ending in the final sigmoid output layer. The numbers in parentheses represent the dimensionality of the output from each layer in the format (*number of feature channels, width, height*). The network takes in input features for a protein of length L and produces correspondingly sized output. Most of the model consists of 18 residual blocks (denoted ResBlock; only a few are shown), and the structure of each block is shown on the right. The convolutional layers (Conv2D) in a residual block have 5 × 5 filters with a dilation rate d. The values of d for each residual block in the model are given in Supplementary Table S2.
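
The caption states that the network maps L × L inputs to correspondingly sized output using 5 × 5 filters with a dilation rate d. As a rough illustration of why that is possible, the sketch below computes the spatial extent of a dilated filter and the zero padding that keeps the output the same size as the input. The function names and the dilation values tried here are illustrative assumptions; the actual values of d are listed in Supplementary Table S2, which is not reproduced in this record.

```python
def dilated_filter_extent(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one side of a dilated filter."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size: int, dilation: int) -> int:
    """Zero padding per side that preserves the input size at stride 1."""
    return (dilated_filter_extent(kernel_size, dilation) - 1) // 2


def conv_output_length(length: int, kernel_size: int, dilation: int, padding: int) -> int:
    """Output length of a stride-1 convolution along one spatial axis."""
    return length + 2 * padding - dilated_filter_extent(kernel_size, dilation) + 1


if __name__ == "__main__":
    # Dilation values below are hypothetical examples, not the published ones.
    for d in (1, 2, 4):
        pad = same_padding(5, d)
        print(d, dilated_filter_extent(5, d), pad, conv_output_length(100, 5, d, pad))
```

With this padding, a protein of any length L yields an L × L output from every layer, which is what allows a fully convolutional model to handle variable-length inputs.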

**Figure 2**
The data augmentation procedures used during the training of DeepMetaPSICOV. (A) Deletions in loops can be simulated by probabilistically removing rows and columns in the input tensors and contact maps corresponding to residues classified as loops by DSSP. The DSSP assignment for an example protein is shown above its contact map, with blue rectangles representing alpha helices, and line segments representing loops. (B) Input tensors generated using different alignments can be linearly interpolated to produce new training examples, simulating inputs generated from alignments of varying quality. Inputs thus generated for a given protein are mapped to the same contact maps. (C) New examples are generated by flipping the input feature tensors and contact maps by 180°, corresponding to a reversal of the chain direction.
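
The three augmentations in this caption can be sketched with NumPy as follows. This is a minimal illustration, not the authors' training code: it assumes features are stored as an array of shape (channels, L, L), contact maps as (L, L), and loop residues as a boolean mask derived from DSSP. The function names, the deletion probability, and the interpolation weight are all hypothetical.

```python
import numpy as np


def delete_loop_residues(features, contacts, is_loop, drop_prob, rng):
    """(A) Randomly drop rows and columns belonging to loop residues."""
    drop = is_loop & (rng.random(is_loop.shape[0]) < drop_prob)
    keep = np.flatnonzero(~drop)
    return features[:, keep][:, :, keep], contacts[np.ix_(keep, keep)]


def interpolate_inputs(features_a, features_b, weight):
    """(B) Linearly mix inputs built from two alignments of the same protein.

    The target contact map is unchanged, so only the features are returned.
    """
    return weight * features_a + (1.0 - weight) * features_b


def reverse_chain(features, contacts):
    """(C) Rotate both spatial axes by 180 degrees (chain reversal)."""
    return features[:, ::-1, ::-1].copy(), contacts[::-1, ::-1].copy()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.random((3, 6, 6))
    cmap = (rng.random((6, 6)) > 0.5).astype(float)
    loops = np.array([True, False, False, True, True, False])
    f_del, c_del = delete_loop_residues(feats, cmap, loops, 0.5, rng)
    print(f_del.shape, c_del.shape)
```

Applying `reverse_chain` twice returns the original arrays, which is a convenient sanity check when wiring an augmentation like this into a data loader.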

**Figure 3**
(A) Comparison of effective sequence count (M_eff) between alignments generated using only HHblits, or HHblits and jackHMMER. In the latter case, the jackHMMER search makes use of UniRef100 and EBI MGnify metagenomic protein sequences. (B) Plot of top‐L/5 long‐range precision values obtained using the deeper alignments vs those obtained using HHblits only. Using the deeper alignments was beneficial overall, although there are a few domains for which just the HHblits alignment would have provided much higher precision; these are marked.
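
For readers unfamiliar with the metric in panel B, the sketch below shows one way to compute top-L/5 long-range precision from a predicted score matrix and a binary true contact map. It assumes the usual CASP convention that long-range pairs are at least 24 residues apart in sequence; that threshold is not stated in this record, so it is exposed as a parameter. The function name is hypothetical.

```python
import numpy as np


def top_l_over_k_precision(scores, true_contacts, k=5, min_separation=24):
    """Fraction of the top L/k scored long-range pairs that are true contacts.

    scores and true_contacts are (L, L) arrays; only pairs with
    j - i >= min_separation are considered (assumed long-range definition).
    """
    length = scores.shape[0]
    rows, cols = np.triu_indices(length, k=min_separation)
    if rows.size == 0:
        raise ValueError("protein too short for the requested separation")
    n_top = max(1, length // k)
    # Indices of the highest-scoring pairs, best first.
    order = np.argsort(scores[rows, cols])[::-1][:n_top]
    return float(np.mean(true_contacts[rows[order], cols[order]]))
```

A precision difference between two alignments for the same domain, as plotted in panel B, would then simply be the difference between two such values.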

**Figure 4**
Gap fraction per column in the MSA generated for target T1021s3 (3112 raw sequences, M_eff = 979). Official domain boundaries are shaded in light blue and brown, and the precision obtained by DMP on these domains (long‐range, top‐L/5) is shown. The region of the MSA covering the C‐terminal domain D2 consists mostly of gaps and thus has little to no information content. Consequently, the obtained contact precision on this domain is much lower than that obtained for D1.
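
The quantity plotted in this figure is straightforward to compute. The following sketch assumes the MSA is available as a list of equal-length aligned strings and that `-` and `.` denote gaps; both the function name and the gap alphabet are assumptions made for illustration.

```python
import numpy as np


def gap_fraction_per_column(msa, gap_chars="-."):
    """Return the fraction of gap characters in each alignment column."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    columns = np.array([list(seq) for seq in msa])
    return np.isin(columns, list(gap_chars)).mean(axis=0)


if __name__ == "__main__":
    toy_msa = ["AC-D", "A--D", "ACED", "A---"]
    print(gap_fraction_per_column(toy_msa))
```

Columns with a gap fraction close to 1, like those covering domain D2 here, contribute almost no covariation signal regardless of how many sequences the alignment contains.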

**Figure 5**
Impact of incorrect mutual information (MI) calculations on top‐L/5 long‐range contact precision. Values are expressed as percentage point differences, with positive values indicating a gain in precision upon using the correct MI calculation.
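
As background for this figure, the sketch below computes the textbook mutual information between two alignment columns from observed frequencies. It is only meant to show what an MI feature measures; the exact variant used in DMP (for example any pseudocounts, sequence weighting, or corrections) and the nature of the error are not described in this record, so none of that is modeled here. The function name is hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_i, col_j):
    """Plain MI (in nats) between two equally long alignment columns."""
    if len(col_i) != len(col_j) or len(col_i) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    joint = Counter(zip(col_i, col_j))
    marg_i = Counter(col_i)
    marg_j = Counter(col_j)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # Only observed pairs contribute, so p_ab is never zero here.
        mi += p_ab * math.log(p_ab / ((marg_i[a] / n) * (marg_j[b] / n)))
    return mi


if __name__ == "__main__":
    print(column_mutual_information("AACC", "GGTT"))  # perfectly coupled columns
    print(column_mutual_information("AACC", "GTGT"))  # independent columns
```

Perfectly coupled two-state columns give log 2 nats and independent columns give zero, which makes these two cases useful checks when validating an MI implementation.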

Grants and funding

European Research Council (ERC), grant 695558