Pac Symp Biocomput. 2020;25:551-562.

Integrated Cancer Subtyping using Heterogeneous Genome-Scale Molecular Datasets


Suzan Arslanturk et al. Pac Symp Biocomput. 2020.

Abstract

Vast repositories of heterogeneous data from existing sources present unique opportunities. Taken individually, each dataset offers solutions to important domain- and source-specific questions. Collectively, they represent complementary views of related data entities, with an aggregate information value that often well exceeds the sum of its parts. Integration of heterogeneous data is therefore paramount to i) obtain a more unified and comprehensive view of the relations, ii) achieve more robust results, iii) improve accuracy and integrity, and iv) illuminate the complex interactions among data features. In this paper, we propose a data integration methodology to identify subtypes of cancer using multiple data types (mRNA, methylation, microRNA and somatic variants) and different data scales that come from different platforms (microarray, sequencing, etc.). The Cancer Genome Atlas (TCGA) dataset is used to build the data integration and cancer subtyping framework. The proposed data integration and disease subtyping approach accurately identifies novel subgroups of patients with significantly different survival profiles. Given the current availability of vast genomic and variant data for cancer, the proposed data integration system will better differentiate cancer and patient subtypes for risk and outcome prediction and targeted treatment planning, without additional cost or loss of precious time.

Figures

Figure 1. Framework of the proposed subtyping and data integration method.
Figure 2. Kaplan-Meier survival curves of integrative genomic data clustering using the proposed approach (left), PINS (center) and CC (right).
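
Figure 2 compares subtypes by their Kaplan-Meier survival curves. For readers unfamiliar with the estimator, the following is a minimal sketch of how such a curve can be computed for one patient subgroup. It is a generic illustration, not the authors' code; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  -- follow-up duration for each patient
    events -- 1 if the event (e.g. death) was observed, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # everyone leaving the risk set at time t

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Observed events and censored patients both leave the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy subgroup: five patients, two of them censored.
times = [5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Computing one such curve per subtype and comparing them (for example with a log-rank test) is the usual way to judge whether the subgroups differ in survival.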
