BMC Bioinformatics. 2004 May 5;5:53.
doi: 10.1186/1471-2105-5-53.

Statistical monitoring of weak spots for improvement of normalization and ratio estimates in microarrays

Igor Dozmorov et al. BMC Bioinformatics.

Abstract

Background: Several aspects of microarray data analysis are dependent on identification of genes expressed at or near the limits of detection. For example, regression-based normalization methods rely on the premise that most genes in compared samples are expressed at similar levels and therefore require accurate identification of nonexpressed genes (additive noise) so that they can be excluded from the normalization procedure. Moreover, key regulatory genes can maintain stringent control of a given response at low expression levels. If arbitrary cutoffs are used for distinguishing expressed from nonexpressed genes, some of these key regulatory genes may be unnecessarily excluded from the analysis. Unfortunately, no accurate method for differentiating additive noise from genes expressed at low levels is currently available.

Results: We developed a multistep procedure for analysis of mRNA expression data that robustly identifies the additive noise in a microarray experiment. This analysis is predicated on the fact that additive noise signals can be accurately identified by both distribution and statistical analysis.

Conclusions: Identifying additive noise in this manner allows noncorrelated weak signals to be excluded from regression-based normalization of the compared profiles, thereby maximizing the accuracy of these methods. Moreover, genes expressed at very low levels can be clearly identified because their expression distribution is stable and distinguishable from the random pattern of additive noise.
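
The exclusion step described in the Conclusions can be illustrated with a minimal sketch. It assumes log-scale intensities for the same genes on two arrays and a noise cutoff that has already been estimated. The function names, the cutoff value, and the example data are hypothetical and do not come from the paper. Only spots above the cutoff on both arrays enter the least-squares fit, and the fitted line is then applied to every spot.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalize_to_reference(reference, sample, noise_cutoff):
    """Map `sample` onto the scale of `reference`, fitting only on spots
    that exceed the (assumed, pre-estimated) noise cutoff on both arrays."""
    pairs = [(r, s) for r, s in zip(reference, sample)
             if r > noise_cutoff and s > noise_cutoff]
    slope, intercept = fit_line([s for _, s in pairs], [r for r, _ in pairs])
    return slope, intercept, [slope * s + intercept for s in sample]


# Hypothetical data: the first three spots are uncorrelated noise, the
# remaining four follow reference = 2 * sample + 1.
reference = [0.2, 0.9, 0.4, 5.0, 6.0, 7.0, 8.0]
sample = [0.8, 0.1, 0.5, 2.0, 2.5, 3.0, 3.5]
slope, intercept, normalized = normalize_to_reference(reference, sample, noise_cutoff=1.0)
```

Because the noise spots are left out of the fit, the recovered line reflects only the correlated, expressed genes. Including them would pull the slope toward the uncorrelated cloud near the origin.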

Figures

Figure 1
Normalization procedure for array data as shown on Atlas Clontech membranes. A. Histogram of averaged expression data from duplicated spots. B. Normality plot of A. C. Histogram of the data trimmed to ±2 SD about the mean. D. Histogram of the data after z transformation. E. Scatter plot showing the regression line for duplicated expression values (log transformed) after normalization. F. Scatter plot of background genes exhibiting the normality of their profile, with a mean of 0 and an SD of 1.
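
Panels C and D suggest a trim-then-standardize step, sketched below in Python. The sketch assumes that trimming is repeated until no value falls outside ±2 SD of the mean (the caption does not say whether one pass or several were used). The intensities are made up for illustration, and the function names are hypothetical.

```python
import statistics


def trim_until_stable(values, k=2.0):
    """Repeatedly drop values farther than k SD from the mean until none
    are dropped; return the kept values with their mean and SD."""
    kept = list(values)
    while True:
        mean = statistics.fmean(kept)
        sd = statistics.stdev(kept)
        inside = [v for v in kept if abs(v - mean) <= k * sd]
        if len(inside) == len(kept):
            return kept, mean, sd
        kept = inside


def z_transform(values, mean, sd):
    """Express every value in units of the noise SD about the noise mean."""
    return [(v - mean) / sd for v in values]


# Hypothetical averaged duplicate-spot intensities: eight background-like
# spots plus two clearly expressed ones.
spots = [9.0, 10.0, 11.0, 10.5, 9.5, 10.0, 10.2, 9.8, 55.0, 120.0]
background, noise_mean, noise_sd = trim_until_stable(spots)
z_all = z_transform(spots, noise_mean, noise_sd)
z_background = z_transform(background, noise_mean, noise_sd)
```

After the transformation the retained background spots have a mean of 0 and an SD of 1 by construction, which is the property panel F illustrates, while the expressed spots sit many SDs above zero.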
Figure 2
"Contamination" of the normally distributed additive noise with weak expressions on Micromax cDNA arrays. A. Histogram of gene expression distribution. B. The variability of the gene expression ratios on two arrays, ordered by expression level. C. Scatter plot of the gene expressions used in the calculation of ratios in B, with gene order preserved.
Figure 3
Localization of additive noise through an F test. A. Histogram of gene expressions from a microarray experiment. B. P value distribution showing correlation of additive noise as determined by the F test. A window of 20 genes was created to calculate the group SD, and further windows were created by shifting the index of the ordered SDs by one gene. For example, window 1 contains genes 1–20 and window 2 contains genes 2–21. As the expression level in these windows increases, the SDs become correlated and a p value threshold becomes apparent. C. Close-up histogram of the low-intensity spots used as background. The Gaussian distribution can clearly be seen; furthermore, the right-tail cutoff of the Gaussian distribution lies at the expression level that corresponds to the p value threshold for the ratio calculation in B.
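
The sliding-window indexing in panel B can be sketched as follows. This is one possible reading of the caption, not the authors' implementation. It assumes the input is a per-gene quantity already sorted by expression level, and it takes the first window as the reference for the variance ratio. Both choices, the function names, and the simulated data are assumptions made for illustration.

```python
import random
import statistics


def window_bounds(n_genes, size=20):
    """1-based inclusive (first, last) gene indices of each sliding window."""
    return [(start + 1, start + size) for start in range(n_genes - size + 1)]


def window_sds(ordered_values, size=20):
    """Group SD of each window over values sorted by expression level."""
    return [statistics.stdev(ordered_values[start:start + size])
            for start in range(len(ordered_values) - size + 1)]


def variance_ratios(sds, reference_sd):
    """F statistic of each window against a reference SD
    (assumed here to be the SD of the first window)."""
    return [(sd / reference_sd) ** 2 for sd in sds]


# Hypothetical per-gene values for 60 genes, drawn from a seeded generator.
rng = random.Random(0)
values = [rng.gauss(0.0, 1.0) for _ in range(60)]

bounds = window_bounds(len(values))
sds = window_sds(values)
ratios = variance_ratios(sds, sds[0])
```

Turning each ratio into a p value requires the F distribution with (19, 19) degrees of freedom for windows of 20 genes, for example through `scipy.stats.f.sf`. That step is left out here to keep the sketch free of third-party dependencies.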
Figure 4
Within-slide stability of the additive noise. Localization of the additive noise distribution and estimation of the normal distribution parameters were carried out as described in Materials and Methods. Parameters for the additive noise distribution (mean/SD) were estimated from the total data set (left histogram) and from each half or quarter of the slide (intermediate and right histograms).
