BMC Bioinformatics. 2014 Jul 3:15:231.
doi: 10.1186/1471-2105-15-231.

A spatial simulation approach to account for protein structure when identifying non-random somatic mutations

Gregory A Ryslik et al. BMC Bioinformatics. 2014.

Abstract

Background: Current research suggests that a small set of "driver" mutations are responsible for tumorigenesis while a larger body of "passenger" mutations occur in the tumor but do not progress the disease. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Based on the hypothesis that driver mutations tend to cluster in key regions of the protein, the development of cluster identification algorithms has become critical.

Results: We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html.

Conclusion: SpacePAC adds a valuable tool to the identification of mutational clusters while considering protein tertiary structure.


Figures

Figure 1
Statistic construction. In this example, we consider $r \in \{3, 9\}$ and up to 3 potential mutational hotspots in the protein. First, $\mu$ and $\sigma$ are calculated over each column. Next, we normalize each entry in the column by calculating $Z_{i,s,r} = (X_{i,s,r} - \mu_{s,r}) / \sigma_{s,r}$. We then take the maximum over each row to get $Z_0, \ldots, Z_{1000}$. The percentage of times $Z_0 \le Z_i$, where $i \in \{1, \ldots, 1000\}$, is the p-value of our observed statistic $Z_0$. Note that if $Z_0$ is greater than $Z_i$ for all 1000 simulations, we report a p-value of < 1.00E-03.
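The normalization and p-value steps in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the SpacePAC implementation: the function name `max_z_p_value` and the table layout (row 0 observed, remaining rows simulated, one column per (s, r) combination) are assumptions made for the example. It also assumes the column mean and standard deviation are taken over the whole column, and that the population standard deviation is used, since the caption does not say which.

```python
from statistics import mean, pstdev


def max_z_p_value(table):
    """Return (Z_0, p-value) for a table of counts.

    Assumed layout: table[0] is the observed row, table[1:] are simulated
    rows, and each column corresponds to one (s, r) combination.
    """
    n_cols = len(table[0])
    columns = [[row[c] for row in table] for c in range(n_cols)]

    # Column-wise mean and standard deviation (assumption: taken over the
    # whole column, observed row included; population standard deviation).
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]

    def row_max(row):
        # Normalize each entry, then take the maximum over the row.
        # Constant columns (sigma == 0) are skipped to avoid dividing by zero.
        return max(
            ((x - mu) / sd for x, mu, sd in zip(row, mus, sigmas) if sd > 0),
            default=0.0,
        )

    z = [row_max(row) for row in table]
    z0, simulated = z[0], z[1:]

    # Fraction of simulations whose statistic is at least as large as Z_0.
    p_value = sum(1 for zi in simulated if z0 <= zi) / len(simulated)
    return z0, p_value


# Toy input with 4 simulated rows instead of 1000.
z0, p = max_z_p_value([[10, 12], [1, 2], [2, 1], [3, 3], [2, 2]])
```

With 1000 simulations, a returned fraction of 0 corresponds to the caption's reported bound of < 1.00E-03 rather than a literal zero.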
Figure 2
Organizing the data example. In our example, the protein has 5 residues with mutations. The residues are sorted from largest to smallest (so residue 1 has the largest number of mutations, residue 2 the second largest number of mutations, etc.), and the inside of the table is calculated as the sum of the mutations on both residues using all the samples in the study. For instance, as residue 1 has 50 mutations and residue 2 has 40 mutations, there were 90 total samples that had a mutation either on residue 1 or 2. In the actual code, only the lower half of the table is considered and then only sequentially to decrease running time, but we present the whole table here for clarity.
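The table described in this caption can be built as follows. This is a small sketch under stated assumptions: `pairwise_sum_table` is a hypothetical helper, not a function from the R package. The counts 50 and 40 come from the caption, 30 and 20 are implied by the sums quoted for Figure 3, and the fifth count (10) is invented for illustration. The diagonal is left empty because the caption does not say what it holds.

```python
def pairwise_sum_table(counts):
    """Sort per-residue mutation counts in descending order and return
    (sorted_counts, table), where table[i][j] is the combined count for
    the residues ranked i+1 and j+1."""
    ordered = sorted(counts, reverse=True)
    n = len(ordered)
    table = [
        # Diagonal entries are None (assumption: a residue is not paired
        # with itself).
        [ordered[i] + ordered[j] if i != j else None for j in range(n)]
        for i in range(n)
    ]
    return ordered, table


# Hypothetical per-residue counts; only 50 and 40 are stated in the caption.
ordered, table = pairwise_sum_table([30, 50, 10, 40, 20])

# Print only the lower half, which is the part the caption says is used.
for i in range(1, len(ordered)):
    print([table[i][j] for j in range(i)])
```

Because the counts are sorted first, every entry is at least as large as the entries below it and to its right, which is what allows the table to be walked sequentially.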
Figure 3
Algorithm execution example. This figure refers to the data in Figure 2. The first index, i, represents the row and the second index, j, represents the column. The third index, s, represents the total number of mutations at amino acids i and j. Beginning in position (2,1,90), we then add (3,1,80) to the stack, then {(4,1,70), (3,2,70)} and so forth. After each addition to the stack, we pick the element with the highest value in the third position of the 3-tuple.
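One way to realize the traversal in this caption is a best-first walk over the lower half of the table. The sketch below is an assumption-laden illustration, not the package's code: it uses Python's `heapq` as a priority queue in place of the "stack" the caption mentions, the function name `pairs_by_descending_sum` is hypothetical, and the fifth count (10) is invented. Ties between equal sums are broken by heap order, which the caption does not specify.

```python
import heapq


def pairs_by_descending_sum(counts):
    """Yield (i, j, s) for i > j (1-based ranks) in non-increasing order of
    s = counts[i-1] + counts[j-1]. `counts` must be sorted in descending
    order."""
    n = len(counts)
    if n < 2:
        return
    # Start at position (2, 1), the largest possible pair sum.
    heap = [(-(counts[1] + counts[0]), 2, 1)]
    seen = {(2, 1)}
    while heap:
        neg_s, i, j = heapq.heappop(heap)  # element with the highest sum
        yield (i, j, -neg_s)
        # Candidates: one row down, or one column right (lower half only).
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni <= n and nj < ni and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(counts[ni - 1] + counts[nj - 1]), ni, nj))


# Counts as in the Figure 2 example; the last value is hypothetical.
order = list(pairs_by_descending_sum([50, 40, 30, 20, 10]))
```

The walk starts at (2, 1, 90), then visits (3, 1, 80), after which (4, 1, 70) and (3, 2, 70) are both waiting in the queue, matching the sequence in the caption. Stopping the generator early yields only the top-ranked pairs without filling in the whole table.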
Figure 4
Results summary. A breakdown of which biologically relevant regions are overlapped by the most significant cluster for each of the 18 proteins. Overall, 77% of the hotspots overlap a binding site or a protein domain.
Figure 5
CHRM2 clustering. The CHRM2 structure (PDB ID: 3UON) where residues 52 and 144 are highlighted.
Figure 6
FGFR3 clustering. The FGFR3 structure (PDB ID: 1RY7) where residue 248 is highlighted in blue.
Figure 7
BRAF clustering. The BRAF structure (PDB ID: 3Q96) where cluster 464-466 is shown in blue, 469-471 is shown in red and 595-597 is shown in purple. The central residue in each cluster (465, 470 and 596 for the blue, red and purple clusters, respectively) is labeled.
Figure 8
ALK clustering. The ALK structure (PDB ID: 2XBA) where cluster 1173-1175 is shown in blue and cluster 1274-1276 is shown in red. The central residue in each cluster (1174 and 1275 for the blue and red clusters, respectively) is labeled.
