Genomics Proteomics Bioinformatics. 2014 Feb;12(1):48-51.
doi: 10.1016/j.gpb.2013.06.001. Epub 2013 Aug 8.

CloudNMF: a MapReduce implementation of nonnegative matrix factorization for large-scale biological datasets

Ruiqi Liao et al. Genomics Proteomics Bioinformatics. 2014 Feb.

Abstract

In the past decades, advances in high-throughput technologies have led to the generation of huge amounts of biological data that require analysis and interpretation. Recently, nonnegative matrix factorization (NMF) has been introduced as an efficient way to reduce the complexity of data and to interpret them, and it has been applied to various fields of biological research. In this paper, we present CloudNMF, a distributed open-source implementation of NMF on a MapReduce framework. Experimental evaluation demonstrated that CloudNMF is scalable and can handle huge amounts of data, which may enable various kinds of high-throughput biological data analysis in the cloud. CloudNMF is freely accessible at http://admis.fudan.edu.cn/projects/CloudNMF.html.

Keywords: Bioinformatics; MapReduce; Nonnegative matrix factorization.
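
For readers unfamiliar with NMF, the following is a minimal single-machine sketch of the factorization A ≈ WH with nonnegative W and H. The abstract does not state which update rule CloudNMF uses, so the multiplicative updates shown here (a widely used scheme commonly attributed to Lee and Seung) are an assumption, and the function names, the toy matrix, and the parameters `k`, `iterations`, and `eps` are all illustrative.

```python
import random


def matmul(X, Y):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def frobenius_error(A, W, H):
    """Frobenius norm of A - WH."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb)) ** 0.5


def nmf(A, k, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative m x n matrix A into W (m x k) and H (k x n).

    Assumption: multiplicative updates; not necessarily CloudNMF's algorithm.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Strictly positive random start so the updates are well defined.
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T A) / (W^T W H), elementwise
        Wt = transpose(W)
        num = matmul(Wt, A)
        den = matmul(matmul(Wt, W), H)
        H = [[h * p / (q + eps) for h, p, q in zip(hr, pr, qr)]
             for hr, pr, qr in zip(H, num, den)]
        # W <- W * (A H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num = matmul(A, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * p / (q + eps) for w, p, q in zip(wr, pr, qr)]
             for wr, pr, qr in zip(W, num, den)]
    return W, H


# Toy nonnegative matrix of rank 2 (illustrative data, not from the paper).
A = [[1.0, 2.0, 3.0],
     [2.0, 4.0, 6.0],
     [1.0, 0.0, 1.0],
     [3.0, 2.0, 5.0]]

W0, H0 = nmf(A, k=2, iterations=0)   # random starting point
W, H = nmf(A, k=2, iterations=200)   # after 200 update rounds
```

Comparing `frobenius_error(A, W0, H0)` with `frobenius_error(A, W, H)` shows how far the updates reduce the reconstruction error from the random start. Each round consists only of matrix products and elementwise operations, which is the kind of computation that can be split across workers.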

Figures

Figure 1. Using CloudNMF with a local Hadoop cluster
Figure 2. Using CloudNMF with Amazon Web Services
Figure 3. Performance of CloudNMF. (A) Performance of CloudNMF on four real datasets shows that runtime per iteration is linearly correlated with the number of nonzero elements in the matrix. (B) Performance of CloudNMF on simulated matrices of different sizes but with the same number of nonzero elements shows that runtime per iteration is linear in the logarithm of the matrix size. Note that the X-axis is on a logarithmic scale.
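
One way to see why runtime per iteration could track the number of nonzero elements is to write a matrix product in map/reduce form over a sparse input. The sketch below computes the columns of W^T A from a list of nonzero entries `(i, j, a_ij)`. It is an in-memory illustration only: the record layout, the function names `map_phase` and `reduce_phase`, and the toy data are assumptions, not CloudNMF's actual job design or the Hadoop API.

```python
from collections import defaultdict


def map_phase(entries, W):
    """For each nonzero a_ij, emit key j and the vector a_ij * (row i of W)."""
    for i, j, a in entries:
        yield j, [a * w for w in W[i]]


def reduce_phase(pairs, k):
    """Sum the vectors emitted for each key j, giving column j of W^T A."""
    columns = defaultdict(lambda: [0.0] * k)
    for j, vec in pairs:
        acc = columns[j]
        for r, v in enumerate(vec):
            acc[r] += v
    return dict(columns)


# Hypothetical sparse 3 x 3 matrix A stored as (row, column, value) triples.
entries = [(0, 0, 2.0), (0, 2, 1.0), (1, 1, 3.0), (2, 0, 4.0)]
# Hypothetical 3 x 2 factor W (k = 2).
W = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 2.0]]

wta_columns = reduce_phase(map_phase(entries, W), k=2)
```

The map step runs once per nonzero entry and does work proportional to `k`, so the total work for this product grows with the number of nonzeros rather than with the full matrix dimensions.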
