Review
Int J Data Min Bioinform. 2010;4(1):72-90.
doi: 10.1504/ijdmb.2010.030968.

Matrix factorisation methods applied in microarray data analysis

Andrew V Kossenkov et al. Int J Data Min Bioinform. 2010.

Abstract

Numerous methods have been applied to microarray data to group genes into clusters that show similar expression patterns. These methods assign each gene to a single group, which does not reflect the widely held view among biologists that most, if not all, genes in eukaryotes are involved in multiple biological processes and therefore will be multiply regulated. Here, we review several methods of matrix factorisation that identify patterns of behaviour in transcriptional response and assign genes to multiple patterns. We focus on these methods rather than traditional clustering methods applied to microarray data, which assign one gene to one cluster.

Figures

Figure 1
Matrix Factorization. The data matrix, D, is modeled as arising from the multiplication of a set of patterns, the rows of P, and the assignment of genes to those patterns with varying strengths, the columns of A.
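The factorisation in this caption can be illustrated with a minimal NumPy sketch. The sizes and numbers below are hypothetical, chosen only so the product is easy to check by hand; A is taken as genes x patterns and P as patterns x conditions, which is one reading of the caption rather than something the page states explicitly.

```python
import numpy as np

# Hypothetical toy problem: 4 genes, 2 patterns, 3 conditions.
# A (genes x patterns): each column holds the strength with which
# every gene is assigned to one pattern.
A = np.array([
    [1.0, 0.0],   # gene 1 follows pattern 1 only
    [0.0, 2.0],   # gene 2 follows pattern 2 only
    [0.5, 0.5],   # gene 3 belongs to both patterns equally
    [2.0, 1.0],   # gene 4 belongs to both, with different strengths
])

# P (patterns x conditions): each row is one expression pattern.
P = np.array([
    [1.0, 2.0, 3.0],   # pattern 1: rising
    [3.0, 2.0, 1.0],   # pattern 2: falling
])

# The data matrix is modelled as the product of the two factors.
D = A @ P
print(D)
```

Genes 3 and 4 have non-zero entries in both columns of A, which is the multiple-pattern membership that single-cluster assignment cannot express.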
Figure 2
Singular Value Decomposition. The initial matrix D is decomposed into the product of the left singular matrix U, the diagonal matrix of ordered singular values S, and the right singular matrix V^T. The vectors vk are the eigenassays, while vectors uI are the eigengenes.
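The decomposition described in this caption can be reproduced with `numpy.linalg.svd`. This is a sketch on random data, not the authors' analysis; the matrix shape and the random seed are assumptions made for illustration.

```python
import numpy as np

# Hypothetical data matrix (6 rows x 4 columns) filled with random values.
rng = np.random.default_rng(0)
D = rng.normal(size=(6, 4))

# Thin SVD: U is 6x4, s holds the 4 singular values, Vt is V transposed (4x4).
U, s, Vt = np.linalg.svd(D, full_matrices=False)

# NumPy returns the singular values as a vector, sorted in descending order;
# S is the corresponding diagonal matrix.
S = np.diag(s)

# Multiplying the three factors recovers the original matrix.
D_rebuilt = U @ S @ Vt
print(np.allclose(D, D_rebuilt))
```

The columns of U and the rows of Vt are orthonormal, so each singular value measures how much of D is carried by the matching pair of singular vectors.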
Figure 3
Truncated Singular Value Decomposition. It is possible to discard smaller singular values, keeping only the first p singular values that keep most of the expression information.
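Truncation can be sketched as follows. The helper name `truncated_svd`, the synthetic rank-2 data, and the use of the squared-singular-value fraction as a stand-in for "expression information" are all assumptions for illustration; the caption does not specify how retained information is measured.

```python
import numpy as np

def truncated_svd(D, p):
    """Return the rank-p approximation of D and the fraction of the
    total squared singular values kept by the first p components."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    # Keep only the first p singular values and their singular vectors.
    D_p = U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]
    retained = np.sum(s[:p] ** 2) / np.sum(s ** 2)
    return D_p, retained

# Hypothetical data: a rank-2 signal (8 x 5) plus a small amount of noise.
rng = np.random.default_rng(1)
signal = rng.normal(size=(8, 2)) @ rng.normal(size=(2, 5))
D = signal + 0.01 * rng.normal(size=(8, 5))

D_2, retained_2 = truncated_svd(D, 2)
print(f"fraction retained with p=2: {retained_2:.4f}")
```

Because the synthetic signal has rank 2, two components capture nearly all of it and the discarded singular values mostly describe noise.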
Figure 4
Atomic Domain. Bayesian Decomposition utilizes two atomic domains, one that maps to the A matrix and one to the P matrix. The kernel functions, K, allow an atom to map to multiple matrix elements, introducing correlations in the model. Here, K1 is a simple mapping, with all the amplitude of an atom going to a single matrix element, while K2 represents a kernel for a transcriptional regulator, with five genes responding in a correlated manner.
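The difference between the two kernels can be shown with a toy example. This is only an illustration of the mapping idea in the caption, not the Bayesian Decomposition implementation: the atom list, the helper `apply_kernel`, the six-gene column, and the equal 1/5 weights in K2 are all hypothetical.

```python
import numpy as np

n_genes = 6

# Hypothetical atoms: (position in the atomic domain, amplitude).
atoms = [(0, 2.0), (3, 1.5)]

def apply_kernel(K, atoms):
    """Map atoms to one column of matrix elements. Column j of K says
    how an atom at position j spreads its amplitude over the genes."""
    column = np.zeros(K.shape[0])
    for position, amplitude in atoms:
        column += amplitude * K[:, position]
    return column

# K1: simple mapping, each atom's whole amplitude goes to one element.
K1 = np.eye(n_genes)

# K2: position 0 stands for a regulator whose amplitude is shared
# equally (an assumed weighting) by genes 0..4, so those five genes
# respond together; other positions keep the simple mapping.
K2 = np.eye(n_genes)
K2[:, 0] = 0.0
K2[0:5, 0] = 1.0 / 5.0

print(apply_kernel(K1, atoms))
print(apply_kernel(K2, atoms))
```

Under K2 a single atom changes five matrix elements at once, which is how a kernel introduces correlations between genes into the model.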
