Methods for evaluating gene expression from Affymetrix microarray datasets
- PMID: 18559105
- PMCID: PMC2442103
- DOI: 10.1186/1471-2105-9-284
Abstract
Background: Affymetrix high-density oligonucleotide expression arrays are widely used across all fields of biological research for measuring genome-wide gene expression. An important step in processing oligonucleotide microarray data is to produce a single value for the expression level of each RNA transcript using one of a growing number of statistical methods. The challenge for the researcher is to decide which method is most appropriate for addressing a specific biological question with a given dataset. Although several studies have assessed the performance of a few methods in evaluating gene expression from RNA hybridization experiments with different datasets, the relative merits of the currently available methods for evaluating genome-wide gene expression from Affymetrix microarray data collected in real biological experiments remain actively debated.
Results: The present study reports a comprehensive survey of the performance of seven commonly used methods in evaluating genome-wide gene expression from a well-designed experiment using Affymetrix microarrays. The experiment profiled eight genetically divergent barley cultivars, each with three biological replicates. The resulting dataset has a balanced structure that is well suited to the present analysis. The methods were evaluated on their sensitivity for detecting differentially expressed genes, the reproducibility of expression values across replicates, and their consistency in calling differentially expressed genes. At a given false discovery rate (FDR) level, the number of genes detected as differentially expressed differed among methods by a factor of two or more. Moreover, we propose the use of genes containing single feature polymorphisms (SFPs) as an empirical test for comparing the methods' ability to detect true differential gene expression, on the basis that SFPs largely correspond to cis-acting expression regulators. The PDNN method demonstrated superiority over all other methods in every comparison, whilst the default Affymetrix MAS5.0 method was clearly inferior.
Conclusion: A comprehensive assessment of seven commonly used data extraction methods based on an extensive barley Affymetrix gene expression dataset has shown that the PDNN method has superior performance for the detection of differentially expressed genes.
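The comparison of methods by the number of genes called at a given FDR level can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg procedure for FDR control, which the abstract does not name, and uses made-up per-gene p-values; the method labels `method_A` and `method_B` and the helper functions are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH is used for FDR control; the abstract only says "FDR".
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_significant(p_values, fdr=0.05):
    """Count genes whose adjusted p-value is at or below the FDR level."""
    return sum(q <= fdr for q in benjamini_hochberg(p_values))


# Hypothetical per-gene p-values for the same five genes, as they might
# come out of a differential expression test after two different
# expression extraction methods. These numbers are illustrative only.
p_by_method = {
    "method_A": [0.001, 0.008, 0.039, 0.041, 0.60],
    "method_B": [0.020, 0.300, 0.450, 0.700, 0.90],
}

counts = {name: count_significant(p) for name, p in p_by_method.items()}
print(counts)
```

With real data, the p-values for each extraction method would come from the same statistical test applied to that method's expression values, so that differences in the counts reflect the extraction method rather than the test.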