Sci Rep. 2021 Jun 10;11(1):12295.
doi: 10.1038/s41598-021-91827-7.

DNCON2_Inter: predicting interchain contacts for homodimeric and homomultimeric protein complexes using multiple sequence alignments of monomers and deep learning

Farhan Quadir et al. Sci Rep. 2021.

Abstract

Deep learning methods that achieved great success in predicting intrachain residue-residue contacts have been applied to predict interchain contacts between proteins. However, these methods require multiple sequence alignments (MSAs) of a pair of interacting proteins (dimers) as input, which are often difficult to obtain because too few known protein complexes are available to generate MSAs of sufficient depth for a pair of proteins. Recognizing that the MSA of a monomer that forms homomultimers contains the co-evolutionary signals of both intrachain and interchain residue pairs in contact, we applied DNCON2 (a deep learning-based protein intrachain residue-residue contact predictor) to predict both intrachain and interchain contacts for homomultimers from the MSA and other co-evolutionary features of a single monomer. Interchain and intrachain contacts are then discriminated according to the tertiary structure of the monomer. We name this tool DNCON2_Inter. Allowing true-positive predictions within two residue shifts, the best average precision was obtained for the Top-L/10 predictions: 22.9% for homodimers and 17.0% for higher-order homomultimers. In some instances, especially where interchain contact densities are high, DNCON2_Inter predicted interchain contacts with 100% precision. We also developed Con_Complex, a complex structure reconstruction tool that uses predicted contacts to produce the structure of the complex. Using Con_Complex, we show that the predicted contacts can be used to accurately construct the structure of some complexes. Our experiment demonstrates that monomeric MSAs can be used with deep learning to predict interchain contacts of homomeric proteins.
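The discrimination step described in the abstract can be sketched as follows. This is a minimal illustration, not the DNCON2_Inter code: the function name `interchain_candidates`, the 8 Å contact cutoff, and the window interpretation of `relax_removal` are assumptions made for the example.

```python
import numpy as np

def interchain_candidates(pred, intra_dist, cutoff=8.0, relax_removal=0):
    """Zero out predicted contacts that the monomer's own structure explains.

    pred          : (L, L) matrix of predicted contact probabilities
    intra_dist    : (L, L) residue-residue distances inside one monomer (angstroms)
    cutoff        : distance below which a pair is an intrachain contact (assumed value)
    relax_removal : also discard predictions within this many residue shifts
                    of an intrachain contact (assumed interpretation)
    """
    intra = np.asarray(intra_dist) < cutoff
    L = intra.shape[0]
    mask = np.zeros_like(intra)  # boolean mask of pairs to discard
    r = relax_removal
    for i, j in zip(*np.nonzero(intra)):
        # Mark the contact itself plus its (2r+1) x (2r+1) neighbourhood.
        mask[max(0, i - r):min(L, i + r + 1),
             max(0, j - r):min(L, j + r + 1)] = True
    out = np.array(pred, dtype=float)  # copy, so the input is left untouched
    out[mask] = 0.0
    return out
```

Whatever scores remain after this filter are treated as interchain candidates. A larger `relax_removal` discards more predictions near known intrachain contacts, trading coverage for fewer false positives.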

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Diagram describing how the input PDB files were pre-processed and cleaned using the MULTICOM TOOLBOX. If no DSSP was available, the PDB was removed from our list. The chains in each multimer PDB were separated into individual files containing only the ATOM segments (x, y, and z coordinates), and all other information was discarded. Only chain pairs whose FASTA sequences matched 95% or more were kept, and any mismatched residues were removed to ensure homogeneity between chains.
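The chain-pair filter in this caption can be illustrated with a short sketch. It assumes the two chain sequences are already aligned position by position and have equal length; the function name `matching_positions` is hypothetical, and this is not the MULTICOM TOOLBOX code.

```python
def matching_positions(seq_a, seq_b, min_identity=0.95):
    """Return indices where two pre-aligned chains agree, or None if too dissimilar.

    Assumes seq_a and seq_b are equal-length, position-aligned sequences.
    """
    if not seq_a or len(seq_a) != len(seq_b):
        return None
    keep = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a == b]
    if len(keep) / len(seq_a) < min_identity:
        return None  # chain pair rejected: less than 95% identical
    return keep      # mismatched positions are dropped from both chains
```

The returned indices identify the residues to retain in both chains, so the resulting pair is identical in sequence.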
Figure 2
Workflow diagram describing how the pre-processed input PDB and FASTA sequence were used to derive the true intrachain contacts, true interchain contacts, and predicted interchain contacts, and finally to evaluate and visualize the prediction.
Figure 3
Precision heatmap of interchain contact predictions for (a) homodimers and (b) homomultimers for the Top-k predictions, where k = 5, 10, L/10, L/5, L/2, L, and 2L. In all categories, precision increases within each Top-k category as the relax removal and relaxation values increase. Relax removal = 2 and relaxation = 2 give the best precision, 22.9% for homodimers and 17.0% for homomultimers, within the Top-L/10 predictions.
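As a rough illustration of the Top-k precision with relaxation reported here, the sketch below counts a predicted pair as correct if a true interchain contact lies within `relaxation` residue shifts of it. The function name `topk_precision`, the symmetrization of the true map, and the ranking over pairs with i ≤ j are assumptions for the example, not the authors' evaluation script.

```python
import numpy as np

def topk_precision(pred, true_inter, k, relaxation=0):
    """Precision of the k highest-scoring predicted pairs (i <= j).

    pred       : (L, L) symmetric matrix of predicted interchain scores
    true_inter : (L, L) boolean matrix of true interchain contacts
    relaxation : a prediction (i, j) is a hit if any true contact lies
                 within this many residue shifts in both i and j
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(true_inter, dtype=bool)
    truth = truth | truth.T                  # treat (i, j) and (j, i) alike (assumption)
    rows, cols = np.triu_indices(pred.shape[0])
    order = np.argsort(-pred[rows, cols], kind="stable")[:k]
    hits = 0
    for i, j in zip(rows[order], cols[order]):
        window = truth[max(0, i - relaxation):i + relaxation + 1,
                       max(0, j - relaxation):j + relaxation + 1]
        hits += bool(window.any())
    return hits / max(1, len(order))
```

For a chain of length L, the Top-L/10 setting would correspond to something like `k = max(1, L // 10)`.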
Figure 4
Bar plot showing how the precision of the Top-2L interchain contact predictions changes as contact density varies, with no relax removal. (a) shows the results for DNCON2_Inter predicted contacts and (b) shows the results obtained for random prediction. High contact density leads to high precision. Relaxation has little effect on precision when contact densities are beyond 3.50 for the DNCON2_Inter prediction.
Figure 5
Rows A., B., and C. correspond to relax removal values 0, 1, and 2, respectively. For each relax removal value, column (a) shows the contact maps, column (b) shows the complex structure comparison, and column (c) shows the quality of the complex structures. The contact map comparisons are between true intrachain (blue), predicted interchain (green), and true interchain (red) contacts for 1A64. The green dots that overlap with the red dots are correct interchain contact predictions. These contacts were used to reconstruct the complex structure using Con_Complex. Column (b) compares the true homodimer structure with the structure derived from Con_Complex (golden: original chain A; cyan: reconstructed chain A; red: original chain B; green: reconstructed chain B). The TM-scores and RMSDs were obtained using TM-Align and are shown in column (c).
