Ranking differentially expressed genes from Affymetrix gene expression data: methods with reproducibility, sensitivity, and specificity
- PMID: 19386098
- PMCID: PMC2679019
- DOI: 10.1186/1748-7188-4-7
Abstract
Background: To identify differentially expressed genes (DEGs) from microarray data, users of the Affymetrix GeneChip system need to select both a preprocessing algorithm to obtain expression-level measurements and a way of ranking genes to obtain the most plausible candidates. We recently recommended suitable combinations of a preprocessing algorithm and gene ranking method that can be used to identify DEGs with a higher level of sensitivity and specificity. However, in addition to these recommendations, researchers also want to know which combinations enhance reproducibility.
Results: We compared eight conventional methods for ranking genes, namely weighted average difference (WAD), average difference (AD), fold change (FC), rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT), in combination with six preprocessing algorithms (PLIER, VSN, FARMS, multi-mgMOS (mmgMOS), MBEI, and GCRMA). A total of 36 real experimental datasets were evaluated on the basis of the area under the receiver operating characteristic curve (AUC) as a measure of both sensitivity and specificity. We found that the RP method performed well for VSN-, FARMS-, MBEI-, and GCRMA-preprocessed data, and the WAD method performed well for mmgMOS-preprocessed data. Our analysis of the MicroArray Quality Control (MAQC) project's datasets showed that the FC-based gene ranking methods (WAD, AD, FC, and RP) had a higher level of reproducibility: the percentages of overlapping genes (POGs) across different sites for the FC-based methods were higher overall than those for the t-statistic-based methods (modT, samT, shrinkT, and ibmT). In particular, POG values for WAD were the highest overall among the FC-based methods irrespective of the choice of preprocessing algorithm.
Conclusion: Our results demonstrate that to increase sensitivity, specificity, and reproducibility in microarray analyses, we need to select suitable combinations of preprocessing algorithms and gene ranking methods. We recommend the use of FC-based methods, in particular RP or WAD.
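The quantities named in the abstract (an FC-based ranking statistic, AUC, and POG) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the input is assumed to be log2-scale expression matrices (genes in rows, samples in columns) for two conditions, the WAD weight is assumed to follow the definition in the 2008 WAD paper (AD scaled by the gene's relative average log signal), and POG is simplified to the overlap of the top-n genes ranked by absolute statistic, ignoring the direction of change. All function names are hypothetical.

```python
import numpy as np


def ad_statistic(log_a, log_b):
    """Average difference (AD): mean log2 signal in B minus mean in A, per gene."""
    return log_b.mean(axis=1) - log_a.mean(axis=1)


def wad_statistic(log_a, log_b):
    """Weighted average difference (WAD), assumed form: AD times a weight in [0, 1].

    The weight is the gene's average log signal across both conditions,
    rescaled so the weakest gene gets 0 and the strongest gets 1.
    """
    ad = ad_statistic(log_a, log_b)
    avg = (log_a.mean(axis=1) + log_b.mean(axis=1)) / 2.0
    span = avg.max() - avg.min()
    # Guard against a constant average signal (assumption: weight 1 for all genes).
    weight = (avg - avg.min()) / span if span > 0 else np.ones_like(avg)
    return ad * weight


def top_genes(stat, n):
    """Indices of the n genes with the largest absolute statistic."""
    order = np.argsort(-np.abs(np.asarray(stat, dtype=float)), kind="stable")
    return set(order[:n].tolist())


def pog(stat_site1, stat_site2, n):
    """Percentage of overlapping genes between two top-n lists (simplified)."""
    shared = top_genes(stat_site1, n) & top_genes(stat_site2, n)
    return 100.0 * len(shared) / n


def auc(scores, is_deg):
    """Area under the ROC curve via pairwise comparison of DEG and non-DEG scores."""
    scores = np.asarray(scores, dtype=float)
    is_deg = np.asarray(is_deg, dtype=bool)
    pos = scores[is_deg][:, None]   # scores of known DEGs
    neg = scores[~is_deg][None, :]  # scores of known non-DEGs
    # Ties count as half a correct ordering.
    return float((pos > neg).mean() + 0.5 * (pos == neg).mean())


# Toy data (made up for illustration): 4 genes, 2 samples per condition.
log_a = np.array([[2.0, 2.0], [10.0, 10.0], [6.0, 6.0], [13.0, 13.0]])
log_b = np.array([[4.0, 4.0], [11.0, 11.0], [6.0, 6.0], [13.0, 13.0]])

ad = ad_statistic(log_a, log_b)
wad = wad_statistic(log_a, log_b)
```

In this toy example AD ranks the weakly expressed gene 0 first, whereas WAD ranks the strongly expressed gene 1 first, because the weight discounts differences observed at low signal intensity. In practice, AUC would be computed from the absolute statistic against a set of known DEGs, and POG from statistics obtained independently at two test sites.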