BMC Bioinformatics. 2013 Jun 13;14:190.
doi: 10.1186/1471-2105-14-190.

Utilizing protein structure to identify non-random somatic mutations

Gregory A Ryslik et al. BMC Bioinformatics. 2013.

Abstract

Background: Human cancer is caused by the accumulation of somatic mutations in tumor suppressors and oncogenes within the genome. In the case of oncogenes, recent theory suggests that there are only a few key "driver" mutations responsible for tumorigenesis. As there have been significant pharmacological successes in developing drugs that treat cancers that carry these driver mutations, several methods that rely on mutational clustering have been developed to identify them. However, these methods consider proteins as a single strand without taking their spatial structures into account. We propose an extension to current methodology that incorporates protein tertiary structure in order to increase our power when identifying mutation clustering.

Results: We have developed iPAC (identification of Protein Amino acid Clustering), an algorithm that identifies non-random somatic mutations in proteins while taking into account the three-dimensional protein structure. By using the tertiary information, we are able to detect novel clusters in proteins that are known to exhibit mutation clustering and to identify clusters in proteins that show no evidence of clustering under existing methods. For example, by combining the data in the Protein Data Bank (PDB) and the Catalogue of Somatic Mutations in Cancer, our algorithm identifies new mutational clusters in well-known cancer proteins such as KRAS and PI3KC α. Further, by utilizing the tertiary structure, our algorithm also identifies clusters in EGFR, EIF2AK2, and other proteins that are not identified by current methodology. The R package is available at: http://www.bioconductor.org/packages/2.12/bioc/html/iPAC.html.

Conclusion: Our algorithm extends the current methodology for identifying oncogenic activating driver mutations by utilizing tertiary protein structure when identifying non-random somatic residue mutation clusters.

Figures

Figure 1
KRAS α-carbons in 3D space.
Figure 2
KRAS α-carbons mapped to the x-axis using MDS.
Figure 3
An example of constructing the order statistics. Suppose we have 3 samples of a protein that is N amino acids long. A “*” above amino acid i indicates that the amino acid had a non-synonymous missense mutation in that sample. The samples are then collapsed together, and the number of mutations for each residue is shown above the box on the right. These counts form the order statistics. The first mutation is on residue 2 (X_(1) = 2), the next 3 mutations are on residue 3 (X_(2) = X_(3) = X_(4) = 3), the next mutation is on residue 5 (X_(5) = 5), and the last 2 mutations are on residue 6 (X_(6) = X_(7) = 6).
Figure 4
A comparison of NMC and iPAC over all the structures that were found to be significant. The number of structures in each category is shown along with the percentage.
Figure 5
The EGFR structure (PDB ID 2GS7), color coded by region: 1) cluster 1 is light blue and yellow, 2) cluster 2 is blue, and 3) cluster 3 is yellow. The α-carbons of the boundary amino acids 719, 751, 768, 790 and 858 are shown as purple spheres (see Table 2 for details of each cluster).
Figure 6
The 3GFT structure color coded by region: amino acids 13-60 are light blue, 62-116 are red and 118-145 are yellow. The α-carbons of the boundary amino acids 12, 61, 117 and 146 are shown as purple spheres (see Table 3 for details of each cluster).
Figure 7
The 3TV4 structure color coded by region: 1) amino acids 464-600 are light blue, 2) amino acids 601-671 are orange. The α-carbons of the mutated amino acids 464, 466, 469, 581, 596, 597, 601 and 671 are shown as purple spheres. Amino acid 600 is colored red (see Table 4 for details on each cluster).
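
The construction described in the Figure 3 caption can be illustrated with a small Python sketch. This is not the iPAC implementation (iPAC is distributed as an R package); the function name `order_statistics` and the per-sample mutation lists are hypothetical. The samples are chosen only so that the collapsed counts match the totals stated in the caption.

```python
from collections import Counter


def order_statistics(samples):
    """Collapse per-sample mutation positions into counts and order statistics.

    `samples` is a list with one entry per sample; each entry lists the
    1-based residue positions carrying a missense mutation in that sample.
    Returns (counts, ordered), where `counts` maps residue -> number of
    mutations across samples and `ordered` is the sorted list
    X_(1) <= X_(2) <= ... <= X_(n) of mutated positions.
    """
    counts = Counter()
    for mutated_positions in samples:
        counts.update(mutated_positions)

    ordered = []
    for residue in sorted(counts):
        # A residue mutated k times contributes k order statistics.
        ordered.extend([residue] * counts[residue])
    return counts, ordered


# Hypothetical assignment of mutations to 3 samples; only the collapsed
# totals (residue 2: 1, residue 3: 3, residue 5: 1, residue 6: 2) come
# from the Figure 3 caption.
samples = [
    [2, 3, 6],
    [3, 5, 6],
    [3],
]

counts, ordered = order_statistics(samples)
print(dict(sorted(counts.items())))
print(ordered)
```

With these inputs, `ordered` is the sequence of seven order statistics listed in the caption: one mutation on residue 2, three on residue 3, one on residue 5 and two on residue 6.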
