Algorithms Mol Biol. 2008 Jun 26:3:8.
doi: 10.1186/1748-7188-3-8.

A weighted average difference method for detecting differentially expressed genes from microarray data

Koji Kadota et al. Algorithms Mol Biol.

Abstract

Background: Identifying differentially expressed genes (DEGs) under different experimental conditions is an important task in many microarray studies. However, choosing a method for a particular application is difficult because performance depends on the evaluation metric, the dataset, and other factors. In addition, users of the Affymetrix GeneChip® system must choose among competing preprocessing algorithms, such as MAS, RMA, and DFW, to obtain expression-level measurements. Optimal DEG detection therefore requires a suitable combination of gene selection method and preprocessing algorithm for a given probe-level dataset.

Results: We introduce a new fold-change (FC)-based method, the weighted average difference method (WAD), for ranking DEGs. It combines the average difference with the relative average signal intensity, so that genes with high average expression across the conditions tend to be ranked highly. The idea is based on our observation that known or potential marker genes (or proteins) tend to have high expression levels. We compared WAD with seven other methods: average difference (AD), FC, rank products (RP), moderated t statistic (modT), significance analysis of microarrays (samT), shrinkage t statistic (shrinkT), and intensity-based moderated t statistic (ibmT). The evaluation used a total of 38 binary (two-class) probe-level datasets: two artificial "spike-in" datasets and 36 real experimental datasets. The results indicate that WAD outperforms the other methods when sensitivity and specificity are considered simultaneously: the area under the receiver operating characteristic curve (AUC) for WAD was the highest on average across the 38 datasets. The gene ranking for WAD was also the most consistent when subsets of top-ranked genes obtained from three differently preprocessed versions of the data (MAS, RMA, and DFW) were compared. Overall, WAD performed the best for MAS-preprocessed data, and the FC-based methods (AD, WAD, FC, or RP) performed well for RMA- and DFW-preprocessed data.
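The abstract describes WAD only in words (average difference combined with relative average signal intensity), so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes log2-scale signals, takes the weight to be the min-max scaled average signal of each gene, and ranks genes by the absolute value of the product. The function names `wad_scores` and `rank_genes` are hypothetical.

```python
def wad_scores(group_a, group_b):
    """Sketch of a weighted-average-difference score per gene.

    group_a, group_b: one list per gene holding that gene's log2 signals
    in class A and class B (assumed input layout, not from the source).
    """
    def mean(xs):
        return sum(xs) / len(xs)

    mean_a = [mean(g) for g in group_a]
    mean_b = [mean(g) for g in group_b]

    # Average difference (AD) between the two class means.
    ad = [b - a for a, b in zip(mean_a, mean_b)]

    # Average signal of each gene across both classes.
    avg = [(a + b) / 2 for a, b in zip(mean_a, mean_b)]

    # Assumption: the "relative average signal intensity" is the average
    # signal rescaled to [0, 1] over all genes (min-max scaling).
    lo, hi = min(avg), max(avg)
    span = hi - lo
    weight = [(x - lo) / span if span > 0 else 0.0 for x in avg]

    return [d * w for d, w in zip(ad, weight)]


def rank_genes(scores):
    """Return gene indices ordered from largest to smallest |score|."""
    return sorted(range(len(scores)), key=lambda i: -abs(scores[i]))


# Toy data: genes 0 and 1 share the same AD (2.0), but gene 0 is more
# highly expressed, so it receives the larger weight and ranks first.
a = [[10.0, 10.0], [2.0, 2.0], [1.0, 1.0]]
b = [[12.0, 12.0], [4.0, 4.0], [1.0, 1.0]]
scores = wad_scores(a, b)
order = rank_genes(scores)
```

In this toy example the plain AD cannot separate genes 0 and 1, whereas the weighted score favors the gene with the higher average signal, which is the behavior the abstract attributes to WAD.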

Conclusion: WAD is a promising alternative to existing methods for ranking DEGs with two classes. Its high performance should increase researchers' confidence in microarray analyses.

Figures

Figure 1
Effect of the weight (w) term in the WAD statistic for 36 real experimental datasets (Datasets 3–38). AUC values are shown for the weight term alone (w, light blue circle), AD (black circle), and WAD (red circle). Datasets 3–26 were analyzed using MAS-preprocessed data and Datasets 27–38 using RMA-preprocessed data, following the choice of preprocessing algorithm in the original papers. The average AUC values for these and the other methods are shown in Table 3. Note that WAD statistics (AD with the w term) overall give higher AUC values than AD statistics.
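For readers unfamiliar with the AUC values compared in this figure, the sketch below computes the area under the ROC curve for a gene ranking using the rank-based (Mann-Whitney) formulation: the fraction of (true DEG, non-DEG) pairs in which the true DEG has the higher score, with ties counted as one half. This is a generic illustration and not necessarily the procedure used in the paper; the function name `auc` and the toy inputs are hypothetical.

```python
def auc(scores, is_deg):
    """Area under the ROC curve for ranking scores against known labels.

    scores: one ranking score per gene (larger means more likely a DEG).
    is_deg: one boolean per gene, True for a known DEG.
    """
    pos = [s for s, t in zip(scores, is_deg) if t]
    neg = [s for s, t in zip(scores, is_deg) if not t]
    if not pos or not neg:
        raise ValueError("need at least one DEG and one non-DEG")

    # Count pairs where the DEG outranks the non-DEG; ties count half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: two known DEGs (scores 2.0 and 0.1) and two non-DEGs.
example_auc = auc([2.0, 0.4, 0.0, 0.1], [True, False, False, True])
```

An AUC of 1.0 means every true DEG is ranked above every non-DEG, and 0.5 corresponds to a random ranking. The pairwise loop is quadratic in the number of genes, so a sort-based rank-sum version would be preferable for full microarray datasets.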
