Pharmacogenomics J. 2017 Mar;17(2):128-136.

doi: 10.1038/tpj.2015.97. Epub 2016 Jan 26.

Impact of germline and somatic missense variations on drug binding sites

C Yan¹, N Pattabiraman², J Goecks³, P Lam¹, A Nayak¹, Y Pan¹, J Torcivia-Rodriguez¹, A Voskanian¹, Q Wan¹, R Mazumder¹,⁴

Affiliations

¹ Department of Biochemistry and Molecular Medicine, George Washington University, Washington, DC, USA.
² MolBox LLC, Silver Spring, MD, USA.
³ The Computational Biology Institute, George Washington University, Ashburn, VA, USA.
⁴ McCormick Genomic and Proteomic Center, George Washington University, Washington, DC, USA.

PMID: 26810135
PMCID: PMC5380835
DOI: 10.1038/tpj.2015.97

C Yan et al. Pharmacogenomics J. 2017 Mar.

Abstract

Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated, in conjunction with exome or whole-genome sequencing, can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from the Protein Data Bank (PDB), followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of them alter protein-drug binding sites. Using this method, we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding site data with both germline and somatic nsSNV data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. Among the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. Information on protein-drug binding, predicted drug target proteins and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine treatment. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.
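The mapping step described in the abstract (binding-site residues on one side, patient nsSNVs on the other) can be illustrated with a minimal Python sketch. It is not the authors' pipeline or the DrugVar implementation: the function names, the tuple layout and the toy table are assumptions made for illustration. The CA2 positions and the drug lacosamide are borrowed from the Figure 6 caption, and the second patient variant is a made-up non-binding-site position.

```python
from collections import defaultdict
from typing import NamedTuple


class Variant(NamedTuple):
    gene: str
    position: int
    ref: str
    alt: str


def parse_variant(gene, change):
    """Parse a simple missense label such as 'N67K' into a Variant."""
    return Variant(gene, int(change[1:-1]), change[0], change[-1])


def build_site_index(binding_sites):
    """Index (gene, residue position) -> set of drugs contacting that residue."""
    index = defaultdict(set)
    for gene, position, drug in binding_sites:
        index[(gene, position)].add(drug)
    return index


def scan_variants(variants, site_index):
    """Return (variant, drug) pairs where the variant falls on a binding-site residue."""
    hits = []
    for variant in variants:
        drugs = site_index.get((variant.gene, variant.position), ())
        for drug in sorted(drugs):
            hits.append((variant, drug))
    return hits


# Toy binding-site table (illustrative only, not taken from DrugVar).
BINDING_SITES = [
    ("CA2", 67, "lacosamide"),
    ("CA2", 92, "lacosamide"),
    ("CA2", 131, "lacosamide"),
]

# Hypothetical patient variants: one on a listed site, one made-up control.
patient_variants = [
    parse_variant("CA2", "N67K"),
    parse_variant("CA2", "A250V"),  # hypothetical, not a listed binding site
]

hits = scan_variants(patient_variants, build_site_index(BINDING_SITES))
for variant, drug in hits:
    print(f"{variant.gene} {variant.ref}{variant.position}{variant.alt} may affect binding of {drug}")
```

Keying the index on (gene, position) keeps each lookup constant-time, which matters when a whole exome's worth of variants is scanned; a real implementation would also need to reconcile PDB residue numbering with the reference protein sequence, which this sketch ignores.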

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Workflow for mapping nonsynonymous single-nucleotide variations (nsSNVs) on protein–drug binding sites. ATC, anatomical therapeutic chemical classification; PDB, protein data bank.

**Figure 2**
Distribution of binding sites and binding site-affecting single-nucleotide variations (SNVs) across 253 drug target proteins. The blue bar indicates the ratio of the number of drug binding sites to the target protein length, whereas the red bar shows the ratio of the number of binding site-affecting SNVs to the number of binding sites.

**Figure 3**
DrugVar website browser interface. Users can perform searches using Protein Data Bank (PDB) IDs, gene names, UniProtKB accessions and drug names or identifiers.

**Figure 4**
Circos plot representing the binding connections between 25 antineoplastic agents and their target proteins. Proteins are presented with gene names. Ribbon colors are assigned for visualization purposes and the ribbon width indicates the number of target proteins.

**Figure 5**
Structural view of protein–drug interactions. (a) Superposition (c-alpha atoms) of imatinib binding to eight target protein X-ray structures (ABL1, LCK, KIT, NQO2, ABL2, SYK, DDR1 and MAPK14). The superimposed protein structures are colored. The blue to red color represents low to high conservation. The ligand is shown bound to protein pockets. (b) Imatinib binding to the same target proteins as shown in (a). Only the side chains of binding sites are shown. (c) Imatinib binding to its target proteins. The side chains of proteins are imatinib binding sites that are mutated.

**Figure 6**
Structural representation of protein–drug binding sites. (a) Cytochrome P450 3A4 bound to bromocriptine, erythromycin, metyrapone, ritonavir and progesterone, respectively. The drugs are shown in magenta color in the protein pocket bound to amino acid residues that are shown in cyan except the mutated amino acids marked in blue. The yellow color is the heme of Cytochrome P450 3A4 (CYP3A4). (b) Superposition of energy minimized structures for the wild-type carbonic anhydrase 2 (CA2) bound to lacosamide (PDB: 3IEO) and the mutated models (N67K, Q92P and F131L) bound to the same drug. PDB, protein data bank.

**Figure 7**
Superposition of X-ray crystal structures of carbonic anhydrase 2 (CA2) and its paralogs (CA13, CA7) bound to hydroxyurea. The ribbon structure of CA2 and its paralogs is shown in pink color. The hydroxyurea in the protein pocket is shown in magenta color bound to amino acid residues that are conserved across CA2 and its paralogs.

Cited by

Rare genetic variability in human drug target genes modulates drug response and can guide precision medicine.
Zhou Y, Arribas GH, Turku A, Jürgenson T, Mkrtchian S, Krebs K, Wang Y, Svobodova B, Milani L, Schulte G, Korabecny J, Gastaldello S, Lauschke VM. Sci Adv. 2021 Sep 3;7(36):eabi6856. doi: 10.1126/sciadv.abi6856. Epub 2021 Sep 1. PMID: 34516913. Free PMC article.
Pharmacogenomics-Based Detection of Variants Involved in Pain, Anti-inflammatory and Immunomodulating Agents Pathways by Whole Exome Sequencing and Deep in Silico Investigations Revealed Novel Chemical Carcinogenesis and Cancer Risks.
Sharafshah A, Motovali-Bashi M, Keshavarz P. Iran J Med Sci. 2025 Feb 1;50(2):98-111. doi: 10.30476/ijms.2024.101852.3450. eCollection 2025 Feb. PMID: 40026294. Free PMC article.
Expanded analysis of secondary germline findings from matched tumor/normal sequencing identifies additional clinically significant mutations.
Dumbrava EI, Brusco L, Daniels M, Wathoo C, Shaw K, Lu K, Zheng X, Strong L, Litton J, Arun B, Eterovic AK, Routbort M, Patel K, Qi Y, Piha-Paul S, Subbiah V, Hong D, Rodon J, Kopetz S, Mendelsohn J, Mills GB, Chen K, Meric-Bernstam F. JCO Precis Oncol. 2019;3:PO.18.00143. doi: 10.1200/PO.18.00143. Epub 2019 Apr 11. PMID: 31517177. Free PMC article.
In silico analysis of PFN1 related to amyotrophic lateral sclerosis.
Pereira GRC, Tellini GHAS, De Mesquita JF. PLoS One. 2019 Jun 19;14(6):e0215723. doi: 10.1371/journal.pone.0215723. eCollection 2019. PMID: 31216283. Free PMC article.
