Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Mar 23;60(3):1245-1252.
doi: 10.1021/acs.jcim.0c00043. Epub 2020 Mar 9.

DenseCPD: Improving the Accuracy of Neural-Network-Based Computational Protein Sequence Design with DenseNet

Affiliations

DenseCPD: Improving the Accuracy of Neural-Network-Based Computational Protein Sequence Design with DenseNet

Yifei Qi et al. J Chem Inf Model. .

Abstract

Computational protein design remains a challenging task despite its remarkable success in the past few decades. With the rapid progress of deep-learning techniques and the accumulation of three-dimensional protein structures, the use of deep neural networks to learn the relationship between protein sequences and structures and then automatically design a protein sequence for a given protein backbone structure is becoming increasingly feasible. In this study, we developed a deep neural network named DenseCPD that considers the three-dimensional density distribution of protein backbone atoms and predicts the probability of 20 natural amino acids for each residue in a protein. The accuracy of DenseCPD was 53.24 ± 0.17% in a 5-fold cross-validation on the training set and 55.53% and 50.71% on two independent test sets, which is more than 10% higher than those of previous state-of-the-art methods. Two approaches for using DenseCPD predictions in computational protein design were analyzed. The approach using the cutoff of accumulative probability had a smaller sequence search space compared with the approach that simply uses the top-k predictions and therefore enabled higher sequence identity in redesigning three proteins with Rosetta. The network and the datasets are available on a web server at http://protein.org.cn/densecpd.html. The results of this study may benefit the further development of computational protein design methods.

PubMed Disclaimer

Publication types

LinkOut - more resources