Nonnegative matrix factorization: an analytical and interpretive tool in computational biology
- PMID: 18654623
- PMCID: PMC2447881
- DOI: 10.1371/journal.pcbi.1000029
Abstract
In the last decade, advances in high-throughput technologies such as DNA microarrays have made it possible to measure the expression levels of tens of thousands of genes and proteins simultaneously. The result is a large volume of biological data that requires analysis and interpretation. Nonnegative matrix factorization (NMF) was introduced as an unsupervised, parts-based learning paradigm in which a nonnegative matrix V is decomposed into two nonnegative matrices, W and H, such that V ≈ WH, via a multiplicative updates algorithm. For a p × n gene expression matrix V of observations on p genes from n samples, each column of W defines a metagene, and each column of H represents the metagene expression pattern of the corresponding sample. NMF has been applied primarily in an unsupervised setting in image and natural language processing. More recently, it has been used successfully in a variety of applications in computational biology, including molecular pattern discovery, class comparison and prediction, cross-platform and cross-species analysis, functional characterization of genes, and biomedical informatics. In this paper, we review NMF as a data analytical and interpretive tool in computational biology, with an emphasis on these applications.
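The factorization described above can be illustrated with a minimal sketch. It assumes the widely used Euclidean-distance form of the multiplicative update rules associated with Lee and Seung. The abstract does not specify which objective or variant the reviewed applications use. The function names (`nmf`, `matmul`, `transpose`, `frobenius_error`), the toy matrix, the rank `k`, the iteration count, and the small constant `eps` are all illustrative assumptions, not taken from the paper.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    B_cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in B_cols] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of the residual V - WH."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for v_row, wh_row in zip(V, WH)
               for v, wh in zip(v_row, wh_row)) ** 0.5


def nmf(V, k, n_iter=300, seed=0, eps=1e-9):
    """Approximate a nonnegative p x n matrix V by W (p x k) times H (k x n).

    Uses multiplicative updates for the squared Euclidean objective.
    eps (an assumption of this sketch) guards against division by zero.
    """
    rng = random.Random(seed)
    p, n = len(V), len(V[0])
    # Random positive initialization keeps every later update nonnegative.
    W = [[rng.random() + eps for _ in range(k)] for _ in range(p)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(h_row, a_row, d_row)]
             for h_row, a_row, d_row in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(w_row, a_row, d_row)]
             for w_row, a_row, d_row in zip(W, num, den)]
    return W, H


# Toy "expression matrix": p = 4 genes (rows) by n = 3 samples (columns).
V = [
    [1.0, 2.0, 0.0],
    [2.0, 4.0, 0.0],
    [0.0, 1.0, 3.0],
    [0.0, 3.0, 9.0],
]

W, H = nmf(V, k=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In this sketch each of the `k` columns of `W` plays the role of a metagene (a nonnegative weighting over the 4 genes), and each column of `H` gives the expression of those metagenes in one sample. Because the result depends on the random initialization, the factors are not unique; in practice several restarts are commonly compared.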
Conflict of interest statement
The author has declared that no competing interests exist.
