Sci Rep. 2016 Jan 22:6:19848.

doi: 10.1038/srep19848.

In silico functional dissection of saturation mutagenesis: Interpreting the relationship between phenotypes and changes in protein stability, interactions and activity

Douglas E V Pires¹,², Jing Chen¹, Tom L Blundell¹, David B Ascher¹

Affiliations

¹ Department of Biochemistry, Sanger Building, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK.
² Centro de Pesquisas René Rachou, Fundação Oswaldo Cruz, Avenida Augusto de Lima 1715, Belo Horizonte, 30190-002, Brazil.

PMID: 26797105
PMCID: PMC4726175
DOI: 10.1038/srep19848

Douglas E V Pires et al. Sci Rep. 2016.

Abstract

Despite interest in associating polymorphisms with clinical or experimental phenotypes, functional interpretation of mutation data has lagged behind generation of data from modern high-throughput techniques, and the accurate prediction of the molecular impact of a mutation remains a non-trivial task. We present here an integrated knowledge-driven computational workflow designed to evaluate the effects of experimental and disease missense mutations on protein structure and interactions. We exemplify its application with analyses of saturation mutagenesis of DBR1 and Gal4 and show that the experimental phenotypes for over 80% of the mutations correlate well with predicted effects of mutations on protein stability and RNA binding affinity. We also show that analysis of mutations in VHL using our workflow provides valuable insights into the effects of mutations, and their links to the risk of developing renal carcinoma. Taken together, the analyses of the three examples demonstrate that structural bioinformatics tools, when applied in a systematic, integrated way, can rapidly analyse a given system to provide a powerful approach for predicting structural and functional effects of thousands of mutations in order to reveal molecular mechanisms leading to a phenotype.

Missense or non-synonymous mutations are nucleotide substitutions that alter the amino acid sequence of a protein. Their effects can range from modifying transcription, translation, processing and splicing, and localization to changing the stability of the protein or altering its dynamics or its interactions with other proteins, nucleic acids and ligands, including small molecules and metal ions. The advent of high-throughput techniques, including sequencing and saturation mutagenesis, has provided large amounts of phenotypic data linked to mutations. However, one of the hurdles has been understanding and quantifying the effects of a particular mutation and how they translate into a given phenotype. One approach to overcome this is to use robust, accurate and scalable computational methods to understand and correlate structural effects of mutations with disease.

Figures

**Figure 1. A proposed computational mutation analysis workflow.**
The figure depicts the proposed workflow, which can be divided into four main steps: data collection and structural analysis, *in silico* (quantitative) prediction of the effects of mutations, filtering of mutations by their predicted effect, and building of regression and classification models to link predictions with the observed phenotype.
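
As a rough illustration of the filtering step, the sketch below separates mutations predicted to have a large effect from milder ones. The `Prediction` class, the `split_by_predicted_effect` function, the -2.0 kcal/mol cutoff, the mutation labels and all numbers are hypothetical, and the sign convention (negative values are destabilizing) is an assumption; none of these come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    mutation: str          # hypothetical label, not a real mutation
    ddg_stability: float   # predicted stability change in kcal/mol (assumed: negative = destabilizing)
    ddg_affinity: float    # predicted binding affinity change, same assumed convention


def split_by_predicted_effect(predictions, cutoff=-2.0):
    """Return (large_effect, mild) lists using an assumed cutoff."""
    large_effect, mild = [], []
    for p in predictions:
        # A mutation counts as "large effect" if either prediction
        # is at or below the cutoff.
        worst = min(p.ddg_stability, p.ddg_affinity)
        (large_effect if worst <= cutoff else mild).append(p)
    return large_effect, mild


# Made-up values for illustration only.
examples = [
    Prediction("mut_a", -3.1, -0.2),
    Prediction("mut_b", -0.4, -0.3),
    Prediction("mut_c", -0.9, -2.5),
]
large_effect, mild = split_by_predicted_effect(examples)
print([p.mutation for p in large_effect])
print([p.mutation for p in mild])
```

In a setup like this, the mild group would be the input to a later regression step, while the large-effect group could be flagged directly.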

**Figure 2. Noncovalent interaction networks in DBR1.**
(A) depicts interactions between the manganese ion and the DBR1-RNA complex. The ion is coordinated by a series of interactions within the protein as well as with the RNA molecule. Mutations of these residues would, therefore, disrupt manganese binding, affecting catalytic activity directly. (B,C) depict the noncovalent interaction networks of mutated residues in the DBR1-RNA complex. Mutated residues are depicted in green and the RNA fragment in blue. Hydrogen bonds are depicted as red dotted lines, hydrophobic interactions in green and ring-ring interactions in grey. Panel B shows residue Trp99 performing a series of hydrophobic and ring interactions. Mutation from tryptophan to glycine would destabilize the protein given the loss of interactions leading to a loss of entropy when folding. Panel C shows residue His85, whose mutations are predicted to also affect RNA binding affinity. His85 makes a series of inter- and intramolecular ring interactions that would be lost by a mutation to serine.

**Figure 3. Performance analysis on classification and regression models of the phenotypic effects of DBR1 mutations.**
The left-hand graph shows the ROC curves for the binary classifiers trained with stability (DUET, SDM and mCSM-Stability) and RNA binding affinity change (mCSM-NA) predictions. Three curves are shown with the performance for the developed classifier on the complete set of mutations, the set of mild mutations and the performance of PolyPhen-2 on a selected set of mutations. The area under the curve values (AUC) for each classifier are also shown. The remaining graphs show regression results for those mutations not predicted to have large effects on stability or RNA binding (center graph) and for the final model including all mutations (right graph). Fitted log(enrichment) scores using DUET, SDM, mCSM-Stability and mCSM-NA are combined using linear equations compared to the average phenotypic results obtained by Gregory *et al.* (2014). For each data set the Pearson’s correlation coefficient (ρ) is shown in the bottom-right part of each graph and at the top-left after 10% outlier removal. Outliers are depicted in red.
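
The correlation reported before and after 10% outlier removal can be sketched as follows. The caption does not say how outliers were chosen, so this example assumes they are the 10% of points with the largest absolute difference between fitted and observed scores; the function names and the toy numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pearson_after_trimming(fitted, observed, fraction=0.10):
    """Correlation after dropping the worst-fitting fraction of points.

    Assumption: an outlier is a point with a large |fitted - observed|.
    """
    n_drop = int(len(fitted) * fraction)
    order = sorted(range(len(fitted)), key=lambda i: abs(fitted[i] - observed[i]))
    keep = order[: len(order) - n_drop]
    return pearson([fitted[i] for i in keep], [observed[i] for i in keep])


# Toy data: nine points on the diagonal plus one clear outlier.
fitted = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
observed = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 0.0]
r_all = pearson(fitted, observed)
r_trimmed = pearson_after_trimming(fitted, observed)
print(r_all, r_trimmed)
```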

**Figure 4. Heatmap of the average predicted changes upon mutation on DBR1.**
The figure shows the average prediction per residue (considering all 19 potential mutations at each position) in stability (left), RNA affinity (middle) and enrichment score (right). Residues were coloured on a scale from blue to red indicating the average effect from stabilizing to destabilizing as predicted by mCSM-Stability, mCSM-NA or the degree of reduced cell growth as predicted by the final regression model.
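
The per-residue averaging behind such a heatmap can be sketched as below. The mutation-code format (wild-type letter, position, mutant letter), the `average_per_residue` function and the numbers are assumptions made for illustration; they are not values from the study.

```python
import re
from collections import defaultdict

# Assumed code format: one-letter wild type, position, one-letter mutant.
MUTATION_CODE = re.compile(r"([A-Z])(\d+)([A-Z])")


def average_per_residue(predictions):
    """Average predicted values over all mutations at each position.

    `predictions` maps a mutation code to a predicted value.
    """
    by_position = defaultdict(list)
    for code, value in predictions.items():
        match = MUTATION_CODE.fullmatch(code)
        if match is None:
            raise ValueError(f"unrecognised mutation code: {code!r}")
        by_position[int(match.group(2))].append(value)
    return {pos: sum(vals) / len(vals) for pos, vals in sorted(by_position.items())}


# Made-up predictions; a full saturation data set would have 19 per position.
toy = {"W99G": -3.0, "W99A": -1.0, "H85S": -0.5}
averages = average_per_residue(toy)
print(averages)
```

The resulting per-position averages are what would then be mapped onto a colour scale.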

**Figure 5. Noncovalent interaction networks in Gal4.**
(A) depicts interactions between a pair of Zn²⁺ ions coordinated by six conserved cysteine residues in the Gal4-DNA complex. Mutations of these residues would, therefore, disrupt zinc binding, affecting Gal4 function. (B-D) depict the noncovalent interaction networks of mutated residues in the Gal4-DNA complex. Mutated residues are depicted in green. Hydrogen bonds are depicted as red dotted lines, hydrophobic interactions in green and ring-ring interactions in grey. Panel B shows residue Tyr40 performing a side-chain to main-chain hydrogen bond and ring interactions. The mutation Y40A is predicted to be highly destabilizing, given the removal of a large portion of the side chain and consequent loss of interactions. Panel C shows residue Val57, whose mutations are predicted to also affect protein-protein affinity. Val57 establishes a network of hydrophobic and ring interactions that would be lost by the introduction of larger or hydrophilic residues, destabilizing the homodimer. Panel D shows residue Arg15, whose mutations are predicted to also affect DNA binding affinity. Arg15 directly interacts with the DNA molecule through a weak polar interaction and a hydrogen bond. Mutations to aspartic and glutamic acids, through the introduction of an opposite charge, are predicted to destabilize the region and reduce DNA affinity.

**Figure 6. Performance analysis on regression on Gal4 mutations.**
The graph shows regression results on 10-fold cross validation for the predictive model trained on the complete set of mutations (1083) on Gal4. Fitted log(enrichment) scores using DUET, SDM, mCSM-Stability, mCSM-PPI and mCSM-NA are combined using linear equations compared to the average phenotypic results obtained by Kitzman *et al.* (2015). Pearson’s correlation coefficient (ρ) is shown in the bottom-right part of the graph and at the top-left after 10% outlier removal. Outliers are depicted in red.
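
A minimal sketch of 10-fold cross validation for a linear fit is given below. It is deliberately simplified: it uses a single predictor rather than the five listed in the caption, assigns folds by interleaving instead of random shuffling, and runs on synthetic data. The function names are hypothetical and this is not the authors' implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def k_fold_predictions(xs, ys, k=10):
    """Predict every point with a model trained without that point's fold."""
    n = len(xs)
    held_out = [None] * n
    for fold in range(k):
        test_idx = list(range(fold, n, k))  # interleaved fold assignment
        test_set = set(test_idx)
        train_x = [xs[i] for i in range(n) if i not in test_set]
        train_y = [ys[i] for i in range(n) if i not in test_set]
        slope, intercept = fit_line(train_x, train_y)
        for i in test_idx:
            held_out[i] = slope * xs[i] + intercept
    return held_out


# Synthetic, noise-free data: every held-out prediction should match exactly.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
cv_predictions = k_fold_predictions(xs, ys, k=10)
print(cv_predictions[:3])
```

The held-out predictions, rather than predictions on the training data, are what would be correlated with the experimental scores.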

**Figure 7. Heatmap of the average predicted and experimental changes upon mutation on Gal4.**
The figure shows the average prediction per residue in stability (left), DNA affinity (middle) and experimental measurement of enrichment score (right). Residues were coloured on a scale from blue to red indicating the average effect from stabilizing to destabilizing as predicted by mCSM-Stability, mCSM-NA or the degree of reduced cell growth as described experimentally. The predicted effects on protein stability and DNA affinity vary across the structure, and together they complement the experimental phenotype.

**Figure 8. Noncovalent interaction networks in VHL.**
Mutated residues are depicted in green. Proximal hydrophobic interactions are depicted as small dots, ring-ring interactions in grey and donor-pi interactions in blue. Panel A shows residue Phe136 performing a dense network of hydrophobic and ring interactions. Mutation to serine is predicted to be highly destabilizing, given the removal of a large portion of the side chain and consequent loss of interactions. Panel B shows residue Trp88, whose mutations are predicted to also affect protein-protein affinity. Trp88 establishes a network of ring interactions, as well as donor-pi interactions within its chain and with the HIF-1α peptide. Mutations to arginine or serine would disrupt these strong interactions and destabilize both the region and the protein-protein interface, reducing affinity.
