Bioinformatics. 2010 Oct 15;26(20):2578-85.
doi: 10.1093/bioinformatics/btq470. Epub 2010 Aug 16.

Semi-supervised recursively partitioned mixture models for identifying cancer subtypes

Devin C Koestler et al. Bioinformatics. 2010.

Abstract

Motivation: Patients with identical cancer diagnoses often progress differently. This disparity in disease progression and treatment response may arise because two histologically similar cancers can be entirely different diseases at the molecular level. Methods for identifying cancer subtypes associated with patient survival can be powerful tools for understanding the biochemical processes that underlie disease progression, and they provide an initial step toward more personalized therapy for cancer patients. We propose a method called semi-supervised recursively partitioned mixture models (SS-RPMM) that uses array-based genetic data and patient-level clinical data to find cancer subtypes associated with patient survival.

Results: In the proposed SS-RPMM, cancer subtypes are identified using a selected subset of genes that are associated with survival time. Because survival information is used in the gene selection step, the method is semi-supervised. Unlike other semi-supervised clustering and classification methods, SS-RPMM does not require the number of cancer subtypes, which is often unknown, to be specified in advance. In a simulation study, the proposed method compared favorably with competing semi-supervised methods, including semi-supervised clustering and supervised principal components analysis. Furthermore, an analysis of mesothelioma data using SS-RPMM revealed at least two distinct methylation profiles that are informative for survival.

Availability: The analyses in this article were carried out using R (http://www.r-project.org/).

Contact: devin_koestler@brown.edu; e_andres_houseman@brown.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
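
The gene selection step described in the Results can be illustrated with a small sketch. The abstract does not give the selection statistic in detail; the figure captions refer to loci ranked by absolute Cox score, so the code below assumes a univariate Cox score test at beta = 0 without tie correction. The function names `cox_score` and `select_loci`, the toy data, and the choice of Python (the original analyses used R) are all illustrative assumptions, not the authors' implementation. The clustering stage (RPMM) that would follow on the selected loci is not shown.

```python
import math


def cox_score(x, time, event):
    """Score statistic of a one-covariate Cox model at beta = 0 (no tie correction)."""
    u = 0.0  # sum over deaths of (x_i - risk-set mean of x)
    v = 0.0  # sum over deaths of the risk-set variance of x
    n = len(x)
    for i in range(n):
        if not event[i]:
            continue  # censored patients contribute only through risk sets
        risk = [x[j] for j in range(n) if time[j] >= time[i]]
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    return u / math.sqrt(v) if v > 0 else 0.0


def select_loci(X, time, event, m):
    """Return the indices of the m columns of X with the largest |Cox score|."""
    p = len(X[0])
    scores = [cox_score([row[j] for row in X], time, event) for j in range(p)]
    ranked = sorted(range(p), key=lambda j: abs(scores[j]), reverse=True)
    return ranked[:m], scores


# Hypothetical toy data: 6 patients (rows) by 3 loci (columns), values in [0, 1].
# Locus 0 decreases with survival time; loci 1 and 2 are unrelated to it.
X = [
    [0.9, 0.5, 0.2],
    [0.8, 0.4, 0.6],
    [0.7, 0.5, 0.3],
    [0.3, 0.4, 0.7],
    [0.2, 0.5, 0.1],
    [0.1, 0.4, 0.5],
]
time = [1, 2, 3, 4, 5, 6]   # follow-up times
event = [1, 1, 1, 1, 1, 1]  # 1 = death observed, 0 = censored

selected, scores = select_loci(X, time, event, m=1)
print(selected)  # locus 0 has the largest absolute score
```

Only the selected columns would be passed on to the unsupervised clustering stage, which is what makes the overall procedure semi-supervised: survival data guide which loci are used, but not how patients are grouped.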


Figures

Fig. 1.
Average pseudo-R² for SS-RPMM, SS-Clust, SPCA using only the first principal component, SPCA(1), and SPCA using the first and second principal components, SPCA(2), for different settings of M and K. Results are based on 100 simulations.
Fig. 2.
Heatmap of predicted class memberships for the observations in the testing set, using the average beta values for the 41 loci with the largest absolute Cox scores. Observations within each predicted class, as well as the 41 loci, were clustered using hierarchical clustering with Ward linkage and a Euclidean distance metric.
