Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2019 Feb 1;35(3):389-397.
doi: 10.1093/bioinformatics/bty624.

DMCM: a Data-adaptive Mutation Clustering Method to identify cancer-related mutation clusters

Affiliations

DMCM: a Data-adaptive Mutation Clustering Method to identify cancer-related mutation clusters

Xinguo Lu et al. Bioinformatics. .

Abstract

Motivation: Functional somatic mutations within coding amino acid sequences confer growth advantage in pathogenic process. Most existing methods for identifying cancer-related mutations focus on the single amino acid or the entire gene level. However, gain-of-function mutations often cluster in specific protein regions instead of existing independently in the amino acid sequences. Some approaches for identifying mutation clusters with mutation density on amino acid chain have been proposed recently. But their performance in identification of mutation clusters remains to be improved.

Results: Here we present a Data-adaptive Mutation Clustering Method (DMCM), in which kernel density estimate (KDE) with a data-adaptive bandwidth is applied to estimate the mutation density, to find variable clusters with different lengths on amino acid sequences. We apply this approach in the mutation data of 571 genes in over twenty cancer types from The Cancer Genome Atlas (TCGA). We compare the DMCM with M2C, OncodriveCLUST and Pfam Domain and find that DMCM tends to identify more significant clusters. The cross-validation analysis shows DMCM is robust and cluster cancer type enrichment analysis shows that specific cancer types are enriched for specific mutation clusters.

Availability and implementation: DMCM is written in Python and analysis methods of DMCM are written in R. They are all released online, available through https://github.com/XinguoLu/DMCM.

Supplementary information: Supplementary data are available at Bioinformatics online.

PubMed Disclaimer

Publication types