BMC Bioinformatics. 2005 Dec 9;6:293.
doi: 10.1186/1471-2105-6-293.

An algorithm for automatic evaluation of the spot quality in two-color DNA microarray experiments

Eugene Novikov et al. BMC Bioinformatics. 2005.

Abstract

Background: Although DNA microarray technologies are very powerful for the simultaneous quantitative characterization of thousands of genes, the quality of the experimental data obtained is often far from ideal. A measured microarray image is a regular collection of spots, and the intensity of light at each spot is proportional to the DNA copy number or to the expression level of the gene whose DNA clone is spotted. Spot quality control is an essential part of microarray image analysis and must be carried out at the level of individual spot identification. The problem is difficult to formalize because of the diversity of instrumental and biological factors that can influence the result.

Results: For each spot we estimate the ratio of measured fluorescence intensities, which reveals differential gene expression or a change in DNA copy number between the test and control samples. We also define a set of quality characteristics and a model for combining these characteristics into an overall spot quality value. We have developed a training procedure to evaluate the contribution of each individual characteristic to the overall quality. This procedure uses information available from replicated spots, located either in the same array or across a set of replicated arrays. It is assumed that unspoiled replicated spots must have very similar ratios, whereas poor spots yield greater diversity in the ratio estimates.
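The replicate assumption can be illustrated with a minimal sketch. The abstract does not say how the spread of replicate ratios is measured, so the function name `replicate_variability` and the choice of the standard deviation of log2 ratios are assumptions made here for illustration, and the ratio values are synthetic.

```python
import math
from statistics import stdev


def replicate_variability(ratios):
    """Spread of log2 ratios across one set of replicated spots.

    Hypothetical stand-in for a replicate-variability measure; the
    paper's actual definition is not given in the abstract.
    """
    logs = [math.log2(r) for r in ratios]
    return stdev(logs)


# Synthetic triplicates: one consistent, one with a spoiled spot.
good_triplicate = [0.27, 0.28, 0.275]
poor_triplicate = [0.27, 1.41, 0.71]

print(replicate_variability(good_triplicate))
print(replicate_variability(poor_triplicate))
```

Under this assumption, a triplicate containing a spoiled spot shows a much larger spread than a clean one, which is the signal a training procedure of this kind can exploit.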

Conclusion: The developed procedure provides an automatic tool to quantify spot quality and to identify the different types of spot deficiency that occur in DNA microarray technology. The quality value assigned to each spot can be used either to eliminate spots or to weight the contribution of each ratio estimate in follow-up analysis procedures.
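Both uses of the quality value can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, spots are represented as (ratio, quality) pairs, the 0.3 cut-off is borrowed from the "bad" spot definition in the Figure 9 caption, and averaging in log2 space is a choice made here.

```python
import math


def filter_spots(spots, threshold=0.3):
    """Keep (ratio, quality) pairs whose quality is at least `threshold`."""
    return [(ratio, q) for ratio, q in spots if q >= threshold]


def weighted_mean_log_ratio(spots):
    """Quality-weighted mean of log2 ratios over (ratio, quality) pairs."""
    total_weight = sum(q for _, q in spots)
    if total_weight == 0:
        raise ValueError("all quality weights are zero")
    return sum(q * math.log2(ratio) for ratio, q in spots) / total_weight


# Synthetic replicate spots: (ratio estimate, overall quality in [0, 1]).
spots = [(0.27, 0.95), (0.28, 0.90), (1.41, 0.10)]

print(filter_spots(spots))
print(weighted_mean_log_ratio(spots))
```

Elimination discards low-quality spots outright, while weighting keeps every spot but lets poor ones contribute little to the combined estimate.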

Figures

Figure 1
Example of a spot with a low DW parameter (0.45). Although the coefficient of determination of the linear regression plot (red line) is relatively high (0.92), it is obvious that the linear regression model is not appropriate in this case. It is possible that there are contributions (blue lines) from two different species occurring within the given spot, leading to two different Cy5/Cy3 ratios. The background pixels are grouped near the origin of the linear regression plot (cyan circle).
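The situation in this caption, a high coefficient of determination alongside a low DW value, can be reproduced with a small sketch. Several things here are assumptions: that DW denotes the Durbin-Watson statistic of the regression residuals, that pixels are ordered by Cy3 intensity before it is computed, and that the line is fitted with an intercept. The function name and the synthetic data are illustrative only.

```python
def regression_diagnostics(cy3, cy5):
    """Least-squares fit of Cy5 on Cy3 pixel intensities.

    Returns (slope, coefficient of determination, Durbin-Watson statistic).
    The slope plays the role of the regression-based Cy5/Cy3 ratio.
    """
    n = len(cy3)
    mean_x = sum(cy3) / n
    mean_y = sum(cy5) / n
    sxx = sum((x - mean_x) ** 2 for x in cy3)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cy3, cy5))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Assumption: residuals are ordered by increasing Cy3 intensity.
    ordered = sorted(zip(cy3, cy5))
    residuals = [y - (intercept + slope * x) for x, y in ordered]

    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in cy5)
    cd = 1.0 - ss_res / ss_tot
    if ss_res == 0:
        dw = float("nan")  # perfect fit: the statistic is undefined
    else:
        diffs = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, n))
        dw = diffs / ss_res
    return slope, cd, dw


# Synthetic, deliberately non-linear pixel data (not from the paper).
cy3 = list(range(1, 11))
cy5 = [x * x for x in cy3]
slope, cd, dw = regression_diagnostics(cy3, cy5)
print(slope, cd, dw)
```

A Durbin-Watson value well below 2 indicates that neighbouring residuals share the same sign, so a straight line misses systematic structure even when the coefficient of determination looks acceptable.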
Figure 2
Example of a spot with contamination. The filtering procedure removes aberrant pixels (dots with the red contours), improving the Cy5/Cy3 ratio estimation and increasing the coefficient of determination of the linear regression plot. Before filtering (red line): CD = 0.86; linear regression ratio is 1.41; segmentation ratio is 0.712. After filtering (black line): CD = 0.91; linear regression ratio is 0.275; segmentation ratio is 0.273. However, larger amounts of aberrant pixels may result in a less reliable estimation.
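The effect of removing aberrant pixels can be sketched as follows. The caption does not describe the filtering rule, so the iterative residual cut used here (drop pixels whose residual exceeds `k` times the root-mean-square residual, then refit) is a swapped-in technique chosen for illustration; `fit_line`, `filter_aberrant`, `k`, and the pixel values are all hypothetical.

```python
import math


def fit_line(points):
    """Least-squares slope and intercept for (cy3, cy5) pixel pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def filter_aberrant(points, k=2.0, max_iter=10):
    """Iteratively drop pixels whose residual exceeds k * RMS residual.

    Returns (kept pixels, slope refitted on the kept pixels).
    """
    kept = list(points)
    for _ in range(max_iter):
        slope, intercept = fit_line(kept)
        residuals = [y - (intercept + slope * x) for x, y in kept]
        sigma = math.sqrt(sum(e * e for e in residuals) / len(residuals))
        if sigma == 0:
            break
        survivors = [p for p, e in zip(kept, residuals) if abs(e) <= k * sigma]
        # Stop when nothing is removed or too few pixels would remain.
        if len(survivors) == len(kept) or len(survivors) < 3:
            break
        kept = survivors
    return kept, fit_line(kept)[0]


# Synthetic spot: 20 pixels near a ratio of 0.3 plus two bright contaminants.
noise = [0.2, -0.1, -0.2, 0.1]
pixels = [(float(x), 0.3 * x + noise[(x - 1) % 4]) for x in range(1, 21)]
pixels += [(5.0, 30.0), (6.0, 35.0)]

print(fit_line(pixels)[0])         # slope before filtering
print(filter_aberrant(pixels)[1])  # slope after filtering
```

As the caption warns, a cut of this kind works only while aberrant pixels are a minority; with heavy contamination the initial fit is itself dominated by the contaminants.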
Figure 3
Examples of spots with different types of distortions. A) Large deviation from the circular shape (GS = 1.6); B) Bright piece of contamination within a larger circular spot, resulting in a low value of the IS parameter (IS = 1.71); C) Merged spots (UB = 2.98).
Figure 4
The correspondence between the quality characteristics, quality parameters and overall quality value.
Figure 5
Quality plots (Qk versus Vk) for image 7A. Green dots – using only the CD quality parameter; red dots – using only the CRV quality parameter; blue dots – using overall quality parameter Q. The black solid line is the exponential ideal quality curve f(Vk) (Eq. (6),a). Three triplicates showing poor quality (outlined by circles) are given in the insets. The main characteristics of the spots from the selected triplicates are given in Table 3.
Figure 6
Quality plots (Qk versus Vk) for artificial images. The three generated images differ in the percentage of dust clusters relative to the number of good spots: A) 0%, B) 5% and C) 25%. For further details, see the text. The solid lines represent the user-defined ideal quality curves f(Vk): Red – exponential (Eq. (6),a); Blue – Gaussian-like (Eq. (6),b); Black – inverse (Eq. (6),c).
Figure 7
Experimental images. A) 4 × 4 blocks with 21 × 21 spots per block, spot cell size is about 10 pixels; B) 12 × 4 blocks with 15 × 15 spots per block, spot cell size is about 30 pixels. The locations of triplicates are indicated.
Figure 8
Quality plots (Qk versus Vk) using replicated arrays. Black dots – three replicated images (insets) are combined without normalization and the default quality weights are used; red dots – three replicated images (insets) are combined after the global normalization; blue dots – triplicate spots from the first array are used. The solid lines represent the exponential ideal quality curves f(Vk) (Eq. (6),a): Red line (for the red dots) – $\bar{V}$ ≈ 0.125; Blue line (for the blue dots) – $\bar{V}$ ≈ 0.08.
Figure 9
Quality analysis of poor-quality (A) and good-quality (B) images. A: (a) Poor-quality image; (b) Image A with the "bad" spots identified by the quality analysis based on its own triplicates. B: (a) Image B with the bad spots identified by the quality analysis based on the triplicates from image A; (b) Image B with the bad spots identified by the quality analysis based on its own triplicates. "Bad" spots are the spots with the overall quality below 0.3. "Bad" spots are indicated by the white crosses.
