Bioinformatics. 2018 May 1;34(9):1466-1472.

doi: 10.1093/bioinformatics/btx781.

DNCON2: improved protein contact prediction using two-level deep convolutional neural networks

Badri Adhikari¹, Jie Hou¹, Jianlin Cheng¹,²

Affiliations

¹ Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA.
² Informatics Institute, University of Missouri, Columbia, MO 65211, USA.

PMID: 29228185
PMCID: PMC5925776
DOI: 10.1093/bioinformatics/btx781

Badri Adhikari et al. Bioinformatics. 2018.

Abstract

Motivation: Significant improvements in the prediction of protein residue-residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in recent CASP experiments. Continued development of methods that reliably predict contact maps is essential to further improve ab initio structure prediction.

Results: In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks: the first five predict contacts at 6, 7.5, 8, 8.5 and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets of the CASP10, 11 and 12 experiments, DNCON2 achieves mean precisions of 35, 50 and 53.4%, respectively, when the top L/5 long-range contacts are evaluated. These values are higher than the 30.6% achieved by MetaPSICOV on the CASP10 dataset, the 34% achieved by MetaPSICOV on the CASP11 dataset and the 46.3% achieved by Raptor-X on the CASP12 dataset. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts in training, the two-level approach to prediction, the use of state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length.
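The "top L/5 long-range" precision quoted above can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes that L is the sequence length, that "long-range" means a sequence separation of at least 24 residues (a common CASP convention that the abstract does not state), and that predictions arrive as a dictionary of residue pairs mapped to probabilities. The function name and all parameters are hypothetical.

```python
def top_k_long_range_precision(pred, true_contacts, seq_len, divisor=5, min_sep=24):
    """Precision of the top L/divisor long-range predicted contacts.

    pred          -- dict mapping (i, j) with i < j to a predicted probability
    true_contacts -- set of (i, j) pairs that are contacts in the native structure
    seq_len       -- sequence length L
    min_sep       -- minimum |i - j| for a pair to count as long-range
                     (24 is an assumed convention, not taken from the abstract)
    """
    # Keep only long-range pairs, then rank them by predicted probability.
    long_range = [(p, pair) for pair, p in pred.items() if pair[1] - pair[0] >= min_sep]
    long_range.sort(key=lambda item: item[0], reverse=True)

    k = max(1, seq_len // divisor)
    top = long_range[:k]
    if not top:
        return 0.0

    hits = sum(1 for _, pair in top if pair in true_contacts)
    # Dividing by the number of pairs actually selected is one convention;
    # some evaluations divide by k even when fewer pairs are available.
    return hits / len(top)


# Toy example with a reduced separation cutoff so a 10-residue chain has long-range pairs.
toy_pred = {(0, 5): 0.9, (1, 8): 0.8, (2, 3): 0.99, (0, 9): 0.1}
toy_true = {(0, 5), (0, 9)}
toy_precision = top_k_long_range_precision(toy_pred, toy_true, seq_len=10, min_sep=3)
# The top 2 long-range pairs are (0, 5) and (1, 8); one is a true contact, so precision is 0.5.
```

Setting `divisor=2` gives the top L/2 variant of the same measure.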

Availability and implementation: The web server of DNCON2 is at http://sysbio.rnet.missouri.edu/dncon2/ where training and testing datasets as well as the predictions for CASP10, 11 and 12 free-modeling datasets can also be downloaded. Its source code is available at https://github.com/multicom-toolbox/DNCON2/.

Contact: chengji@missouri.edu.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
(A) Block diagram of DNCON2’s overall architecture. At the first level, the 2D volumes representing a protein’s features are used by five convolutional neural networks to predict preliminary contact probabilities at 6, 7.5, 8, 8.5 and 10 Å thresholds. At the second level, the preliminary 2D predictions and the input volume are used by a convolutional neural network to predict the final contact probability map. (B) The structure of one deep convolutional neural network in DNCON2, consisting of six hidden convolutional layers with 16 5×5 filters each and an output layer consisting of one 5×5 filter that predicts a contact probability map.
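The layer layout in this caption can be sketched in plain Python as a shape and weight-count calculation. This is a rough illustration under stated assumptions rather than the DNCON2 implementation: the number of input feature channels (`n_features = 10`) is a placeholder, one bias per filter is assumed, and zero padding that preserves the L×L map size is assumed because the output is a contact map for a protein of arbitrary length. All function names are hypothetical.

```python
# Level-one distance thresholds in angstroms, as listed in the caption.
THRESHOLDS_ANGSTROM = (6.0, 7.5, 8.0, 8.5, 10.0)


def cnn_layers(in_channels, hidden_layers=6, filters=16, kernel=5):
    """Return (in_channels, out_channels, kernel) for each conv layer of one network."""
    layers = []
    channels = in_channels
    for _ in range(hidden_layers):
        layers.append((channels, filters, kernel))
        channels = filters
    layers.append((channels, 1, kernel))  # output layer: a single 5x5 filter
    return layers


def count_weights(layers):
    """Filter weights plus one bias per filter (the bias term is an assumption)."""
    return sum(i * o * k * k + o for i, o, k in layers)


def same_padding_output_size(length, kernel=5):
    """Map size after one convolution with stride 1 and zero padding of kernel // 2."""
    pad = kernel // 2
    return length + 2 * pad - kernel + 1


n_features = 10  # placeholder; the real number of input channels is not given here
level_one = cnn_layers(n_features)
# The second-level network sees the original features plus the five level-one maps.
level_two = cnn_layers(n_features + len(THRESHOLDS_ANGSTROM))

level_one_weights = count_weights(level_one)
level_two_weights = count_weights(level_two)
```

Because every layer is convolutional and the padding keeps the map size fixed, the same weights apply to an L×L input for any L, which is consistent with the abstract's remark about proteins of arbitrary length.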

**Fig. 2.**
The improvement from including predictions at distance thresholds of 6, 7.5, 8, 8.5 and 10 Å as additional features, measured using the precision of the top L/5 (left) and top L/2 (right) long-range contacts on the validation dataset. Box plots show the precision of the best 30 of 40 models for the level-one model trained using only the original features (pink), the level-two model trained using only the 8 Å prediction as an additional feature (green), and the level-two model trained with all five predictions at multiple thresholds as additional features (blue). (Color version of this figure is available at *Bioinformatics* online.)

**Fig. 3.**
Importance of features, measured by the best of five precisions of the top L/2 long-range contacts on the validation dataset after removing a feature or a set of features. ‘MSA Stats’ features are multiple sequence alignment (MSA) statistics comprising Shannon entropy sum, mean contact potential, normalized mutual information and mutual information; ‘DNCON scores’ are a set of several pre-computed statistical potentials; N and Neff are the number of sequences and the effective number of sequences. (Color version of this figure is available at *Bioinformatics* online.)
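Two of the ‘MSA Stats’ quantities named in this caption, Shannon entropy and mutual information, can be computed from alignment columns as in the sketch below. It shows the textbook definitions only. The exact variants used by DNCON2 (for example, how the "entropy sum" is formed, how gaps are treated, or whether sequences are weighted) are not given here, so the base-2 logarithm, the unweighted counts and the function names are all assumptions.

```python
from collections import Counter
from math import log2


def column_entropy(msa, i):
    """Shannon entropy (in bits) of column i of an alignment given as equal-length strings."""
    n = len(msa)
    counts = Counter(seq[i] for seq in msa)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j, from raw frequencies."""
    n = len(msa)
    p_i = Counter(seq[i] for seq in msa)
    p_j = Counter(seq[j] for seq in msa)
    p_ij = Counter((seq[i], seq[j]) for seq in msa)
    return sum(
        (c / n) * log2((c / n) / ((p_i[a] / n) * (p_j[b] / n)))
        for (a, b), c in p_ij.items()
    )


# Toy alignment: columns 0 and 1 vary together, so they share one bit of information.
coupled_msa = ["AC", "AC", "GT", "GT"]
coupled_mi = mutual_information(coupled_msa, 0, 1)

# Toy alignment: columns 0 and 1 vary independently, so the mutual information is zero.
independent_msa = ["AA", "AC", "GA", "GC"]
independent_mi = mutual_information(independent_msa, 0, 1)
```

A high mutual information between two columns is the kind of coevolution signal that contact predictors use as input evidence.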

Grants and funding

R01 GM093123/GM/NIGMS NIH HHS/United States
