Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2002 Jun 28;3(7):RESEARCH0037.
doi: 10.1186/gb-2002-3-7-research0037. Epub 2002 Jun 28.

Normalization and analysis of DNA microarray data by self-consistency and local regression

Affiliations

Normalization and analysis of DNA microarray data by self-consistency and local regression

Thomas B Kepler et al. Genome Biol. .

Abstract

Background: With the advent of DNA hybridization microarrays comes the remarkable ability, in principle, to simultaneously monitor the expression levels of thousands of genes. The quantitative comparison of two or more microarrays can reveal, for example, the distinct patterns of gene expression that define different cellular phenotypes or the genes induced in the cellular response to insult or changing environmental conditions. Normalization of the measured intensities is a prerequisite of such comparisons, and indeed, of any statistical analysis, yet insufficient attention has been paid to its systematic study. The most straightforward normalization techniques in use rest on the implicit assumption of linear response between true expression level and output intensity. We find that these assumptions are not generally met, and that these simple methods can be improved.

Results: We have developed a robust semi-parametric normalization technique based on the assumption that the large majority of genes will not have their relative expression levels changed from one treatment group to the next, and on the assumption that departures of the response from linearity are small and slowly varying. We use local regression to estimate the normalized expression levels as well as the expression level-dependent error variance.

Conclusions: We illustrate the use of this technique in a comparison of the expression profiles of cultured rat mesothelioma cells under control and under treatment with potassium bromate, validated using quantitative PCR on a selected set of genes. We tested the method using data simulated under various error models and find that it performs well.

PubMed Disclaimer

Figures

Figure 1
Figure 1
A pair of Clontech microarrays hybridized as described in Materials and methods to mRNA from cultured rat mesothelioma cells. (a) Control expression patterns. (b) Expression patterns from cells after treatment with potassium bromate. Waf-1 and HSP-70 are examples of genes that seem to be upregulated as a result of potassium bromate treatment. Note the apparently consistent overall expression pattern from one array to the other, in spite of the toxic treatment received in (b).
Figure 2
Figure 2
The normalized log intensity in a pair of replicate arrays. (a) Data normalized by subtracting the mean over all spots; that is, no bias removal. (b) Data normalized by estimating the normalization function using local regression and then subtracting the inferred bias, as described in the text.
Figure 3
Figure 3
Histogram of estimated mean log abundance levels, ak, and local variance (solid line) from the potassium bromate experiment. This distribution of x and this local variance curve were used as input for the simulation studies. The dashed curve gives the variance estimated in a randomly chosen member of the simulation datasets.
Figure 4
Figure 4
Biases in the four arrays in the potassium bromate experiment (solid lines). These biases were then used as input to the data simulation as described in the text. A simulation dataset was chosen at random and biases were estimated from it (dashed lines).
Figure 5
Figure 5
The normalized log intensities from simulated data generated according to Equation 26. (a) The data normalized without local regression, as in Figure 1. (b) The same data normalized using local regression for bias removal. Note that the apparent curvature is induced simply by adding a background term with non-zero expectation.

Similar articles

Cited by

References

    1. Fodor SP, Rava RP, Huang XC, Pease AC, Holmes CP, Adams CL. Multiplexed biochemical assays with biological chips. Nature. 1993;364:555–556. - PubMed
    1. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995;270:467–470. - PubMed
    1. DeRisi J, Penland L, Brown PO, Bittner ML, Meltzer PP, Ray M, Chen Y, Su YA, Trent JM. Use of a cDNA microarray to analyze gene expression patterns in human cancer. Nat Genet. 1996;14:457–460. - PubMed
    1. Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, Brown EL. Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol. 1996;14:1675–1680. - PubMed
    1. DeRisi JL, Iyer VR, Brown PO. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science. 1997;278:680–686. - PubMed

Publication types

LinkOut - more resources