A mutation profile for top-k patient search exploiting Gene-Ontology and orthogonal non-negative matrix factorization
- PMID: 26209432
- PMCID: PMC4672174
- DOI: 10.1093/bioinformatics/btv409
Erratum in
- A mutation profile for top-k patient search exploiting Gene-Ontology and orthogonal non-negative matrix factorization. Bioinformatics. 2016 Jul 1;32(13):2081. doi: 10.1093/bioinformatics/btw104. Epub 2016 Mar 15. PMID: 27153726. Free PMC article. No abstract available.
Abstract
Motivation: As the quantity of genomic mutation data grows, so does the likelihood of finding patients with similar genomic profiles for various disease inferences. However, the difficulty of identifying those patients grows as well. Similarity search over patient mutation profiles can address a range of translational bioinformatics tasks, including prognostics and treatment-efficacy prediction, and so support better clinical decision making on large volumes of data. The problem is challenging because mutation data are heterogeneous, sparse and high-dimensional.
Results: To solve this problem, we introduce a compact representation and search strategy based on Gene-Ontology and orthogonal non-negative matrix factorization. For validation, we compute the statistical significance of the association between the identified cancer subtypes and their clinical features. The results show that our method identifies and characterizes clinically meaningful tumor subtypes that, in most datasets, are comparable to or better than those found by the recently introduced Network-Based Stratification method, while also enabling real-time search. To the best of our knowledge, this is the first attempt to simultaneously characterize and represent somatic mutational data for efficient search purposes.
Availability: The implementations are available at: https://sites.google.com/site/postechdm/research/implementation/orgos.
Contact: sael@cs.stonybrook.edu or hwanjoyu@postech.ac.kr
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press.
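The abstract names the ingredients of the approach (a Gene-Ontology-based profile, orthogonal non-negative matrix factorization, and top-k search) without giving details. The following is a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. Everything in it is an assumption made for illustration: the function names (`go_term_profile`, `onmf`, `top_k`), the toy matrices, the choice of multiplicative updates for the factorization, and the use of cosine similarity for ranking.

```python
import numpy as np


def go_term_profile(mutations, gene_to_term):
    """Map a patient x gene mutation matrix to patient x GO-term counts."""
    return mutations @ gene_to_term


def onmf(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate X ~= F @ G.T with F, G >= 0 and G pushed toward G.T @ G = I.

    Uses multiplicative updates; this is one common scheme, assumed here,
    and the result is only approximately orthogonal.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, rank)) + eps
    G = rng.random((m, rank)) + eps
    for _ in range(n_iter):
        # Unconstrained factor: standard multiplicative NMF update.
        F *= (X @ G) / (F @ (G.T @ G) + eps)
        # Orthogonality-constrained factor: square-root multiplicative update.
        XtF = X.T @ F
        G *= np.sqrt(XtF / (G @ (G.T @ XtF) + eps))
    return F, G


def top_k(profiles, query, k):
    """Return indices of the k rows of `profiles` most cosine-similar to `query`."""
    norms = np.linalg.norm(profiles, axis=1) * np.linalg.norm(query)
    sims = (profiles @ query) / np.maximum(norms, 1e-12)
    return np.argsort(-sims)[:k]


# Toy data (made up): 6 patients x 5 genes, 1 = gene is mutated.
mutations = np.array([
    [1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
])
# Toy annotation (made up): 5 genes x 3 GO terms, 1 = gene annotated to term.
gene_to_term = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 1],
])

profile = go_term_profile(mutations, gene_to_term)  # patients x terms
F, G = onmf(profile, rank=2)                        # F: compact patient profiles
hits = top_k(F, F[0], k=3)                          # patients closest to patient 0
```

Here the rows of `F` play the role of the compact patient representation, so a query only compares short `rank`-dimensional vectors instead of full gene-level profiles. A production system would likely add indexing and a principled choice of `rank`, neither of which is specified in the abstract.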