Data-adaptive test statistics for microarray data
- PMID: 16204088
- DOI: 10.1093/bioinformatics/bti1119
Abstract
Motivation: An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch that poses fundamental statistical problems for the selection process, problems that have defied easy resolution.
Results: In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the 'ground-truth', but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to indirectly minimize expected loss, and obtain results substantially more robust than conventional methods. We apply our method to simulated and oligonucleotide array data.
Availability: By request to the corresponding author.
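The abstract does not specify the family of test statistics or the exact definition of reproducibility, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a family of t-like statistics indexed by a single constant `s0` added to the standard error, and it measures reproducibility as the mean overlap between top-`k` gene lists selected on disjoint random halves of the arrays. All function and parameter names (`moderated_t`, `reproducibility`, `select_s0`, `s0`, `k`, `n_splits`) are hypothetical.

```python
import numpy as np

def moderated_t(x, y, s0):
    """Per-gene statistic: mean difference / (pooled standard error + s0).
    x, y are genes-by-arrays matrices; s0 = 0 gives the equal-variance t-statistic."""
    nx, ny = x.shape[1], y.shape[1]
    diff = x.mean(axis=1) - y.mean(axis=1)
    pooled_var = ((nx - 1) * x.var(axis=1, ddof=1)
                  + (ny - 1) * y.var(axis=1, ddof=1)) / (nx + ny - 2)
    se = np.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return diff / (se + s0)

def top_k(stat, k):
    """Indices of the k genes with the largest absolute statistic."""
    return set(np.argsort(-np.abs(stat))[:k].tolist())

def reproducibility(x, y, s0, k, n_splits, rng):
    """Assumed criterion: mean fraction of top-k genes shared between
    selections made on two disjoint random halves of the arrays."""
    hx, hy = x.shape[1] // 2, y.shape[1] // 2
    overlaps = []
    for _ in range(n_splits):
        px = rng.permutation(x.shape[1])
        py = rng.permutation(y.shape[1])
        a = top_k(moderated_t(x[:, px[:hx]], y[:, py[:hy]], s0), k)
        b = top_k(moderated_t(x[:, px[hx:]], y[:, py[hy:]], s0), k)
        overlaps.append(len(a & b) / k)
    return float(np.mean(overlaps))

def select_s0(x, y, candidates, k=50, n_splits=20, seed=0):
    """Pick the candidate s0 whose selections are most reproducible.
    No ground-truth labels for the genes are used."""
    scores = {}
    for s0 in candidates:
        rng = np.random.default_rng(seed)  # identical splits for every candidate
        scores[s0] = reproducibility(x, y, s0, k, n_splits, rng)
    best = max(scores, key=scores.get)
    return best, scores

if __name__ == "__main__":
    # Simulated data (an assumption for illustration): 1000 genes,
    # 8 arrays per group, the first 50 genes shifted in one group.
    rng = np.random.default_rng(42)
    healthy = rng.normal(size=(1000, 8))
    diseased = rng.normal(size=(1000, 8))
    diseased[:50] += 2.0
    best, scores = select_s0(healthy, diseased, [0.0, 0.25, 0.5, 1.0])
    print("selected s0:", best)
    print("reproducibility by s0:", scores)
```

Each half must contain at least two arrays per group for the variance estimate to be defined. Reusing the same random splits for every candidate keeps the comparison between values of `s0` from being driven by split-to-split noise.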
Similar articles
- Large scale data mining approach for gene-specific standardization of microarray gene expression data. Bioinformatics. 2006 Dec 1;22(23):2898-904. doi: 10.1093/bioinformatics/btl500. Epub 2006 Oct 10. PMID: 17032674
- MDQC: a new quality assessment method for microarrays based on quality control reports. Bioinformatics. 2007 Dec 1;23(23):3162-9. doi: 10.1093/bioinformatics/btm487. Epub 2007 Oct 12. PMID: 17933854
- Variance stabilization and normalization for one-color microarray data using a data-driven multiscale approach. Bioinformatics. 2006 Oct 15;22(20):2547-53. doi: 10.1093/bioinformatics/btl412. Epub 2006 Jul 28. PMID: 16877753
- Classification based upon gene expression data: bias and precision of error rates. Bioinformatics. 2007 Jun 1;23(11):1363-70. doi: 10.1093/bioinformatics/btm117. Epub 2007 Mar 28. PMID: 17392326 Review.
- Experimental design and low-level analysis of microarray data. Int Rev Neurobiol. 2004;60:25-58. doi: 10.1016/S0074-7742(04)60002-X. PMID: 15474586 Review.
Cited by
- Comparison and evaluation of methods for generating differentially expressed gene lists from microarray data. BMC Bioinformatics. 2006 Jul 26;7:359. doi: 10.1186/1471-2105-7-359. PMID: 16872483
- Empirical study of supervised gene screening. BMC Bioinformatics. 2006 Dec 18;7:537. doi: 10.1186/1471-2105-7-537. PMID: 17176468
- A unified framework for finding differentially expressed genes from microarray experiments. BMC Bioinformatics. 2007 Sep 18;8:347. doi: 10.1186/1471-2105-8-347. PMID: 17877806
- Next station in microarray data analysis: GEPAS. Nucleic Acids Res. 2006 Jul 1;34(Web Server issue):W486-91. doi: 10.1093/nar/gkl197. PMID: 16845056
- Probe-level linear model fitting and mixture modeling results in high accuracy detection of differential gene expression. BMC Bioinformatics. 2006 Aug 25;7:391. doi: 10.1186/1471-2105-7-391. PMID: 16934150