2011 Nov 30;12:462.
doi: 10.1186/1471-2105-12-462.

Segmentation and intensity estimation for microarray images with saturated pixels

Yan Yang et al. BMC Bioinformatics.

Abstract

Background: Microarray image analysis processes scanned digital images of hybridized arrays to produce the spot-level data used as input for downstream analysis, so it can have a potentially large impact on that and all subsequent analyses. Signal saturation is an optical effect that occurs when some pixel values for highly expressed genes or peptides exceed the upper detection threshold of the scanner software (2^16 - 1 = 65,535 for 16-bit images). In practice, spots with a sizable number of saturated pixels are often flagged and discarded. Alternatively, the saturated values are used without adjustment to estimate spot intensities. The resulting expression data tend to be biased downwards and can distort high-level analyses that rely on these data. Hence, it is crucial to correct for signal saturation effectively.
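The downward bias described above can be illustrated with a small sketch. The pixel values and the function name `summarize_spot` are hypothetical and chosen only for illustration; the only figure taken from the abstract is the 16-bit threshold of 65,535.

```python
SATURATION = 2**16 - 1  # 65,535, the largest value a 16-bit image can store


def summarize_spot(true_pixels, threshold=SATURATION):
    """Clip pixel values at the threshold and report the effect on the mean."""
    # The scanner records min(true value, threshold) for every pixel.
    recorded = [min(p, threshold) for p in true_pixels]
    n_saturated = sum(1 for p in recorded if p >= threshold)
    return {
        "saturated_fraction": n_saturated / len(recorded),
        "true_mean": sum(true_pixels) / len(true_pixels),
        "recorded_mean": sum(recorded) / len(recorded),
    }


# Hypothetical "true" intensities for one bright spot; two exceed the threshold.
spot = [52000, 61000, 70000, 88000]
summary = summarize_spot(spot)
print(summary)
```

In this toy spot, half of the pixels are clipped, so the recorded mean falls below the true mean. Discarding the spot avoids the bias but loses the measurement entirely.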

Results: We developed a flexible mixture model-based segmentation and spot intensity estimation procedure that accounts for saturated pixels by incorporating a censored component in the mixture model. As demonstrated with biological data and simulation, our method extends the dynamic range of expression data beyond the saturation threshold and is effective in correcting saturation-induced bias, provided that not too much information has been lost to saturation. We further illustrate the impact of image processing on downstream classification, using data from a lymphoma diagnosis study to show that the proposed method can increase diagnostic accuracy.
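One way to read "a censored component in the mixture model" is as a right-censored likelihood: unsaturated pixels contribute a density term, while saturated pixels contribute the probability of lying at or above the threshold. The sketch below assumes univariate Gaussian components and right-censoring at S = 65535. The abstract does not give the exact model or fitting procedure, so the function names and parameter values here are illustrative assumptions, not the authors' implementation.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_sf(x, mean, sd):
    """Upper-tail probability P(X >= x) of a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def censored_mixture_loglik(pixels, weights, means, sds, threshold=65535):
    """Log-likelihood of a Gaussian mixture with right-censoring at threshold."""
    total = 0.0
    for x in pixels:
        if x >= threshold:
            # Saturated pixel: we only know the true value is >= threshold.
            lik = sum(w * normal_sf(threshold, m, s)
                      for w, m, s in zip(weights, means, sds))
        else:
            # Unsaturated pixel: ordinary mixture density.
            lik = sum(w * normal_pdf(x, m, s)
                      for w, m, s in zip(weights, means, sds))
        total += math.log(lik)
    return total


# Hypothetical two-component model (background and foreground) and pixels.
pixels = [1800, 2100, 64000, 65535, 65535]
loglik = censored_mixture_loglik(
    pixels, weights=[0.4, 0.6], means=[2000.0, 66000.0], sds=[300.0, 3000.0]
)
print(loglik)
```

Maximizing such a likelihood (for example with an EM-type algorithm) lets the fitted foreground mean exceed the threshold, which is how a censored model can extend the dynamic range. A production version would also guard against `math.log(0.0)` when every component assigns negligible probability to a pixel.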

Conclusions: The presented method adjusts for signal saturation at the segmentation stage, which classifies each pixel as foreground, background or other. As a result, the cluster membership of a pixel can differ from the one obtained when saturated values are treated as truly observed. Thus, the resulting spot intensity estimates may be more accurate than those obtained from existing methods that correct for saturation based on already segmented data. As a model-based segmentation method, our procedure is able to identify inner holes, fuzzy edges and blank spots, all of which are common in microarray images. The approach is independent of microarray platform and applicable to both single- and dual-channel microarrays.
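The point that cluster membership can change is easiest to see with posterior membership probabilities. The sketch below assumes a fitted three-component Gaussian mixture with hypothetical parameters. It assigns a saturated pixel twice, once treating the recorded value as truly observed and once treating it as right-censored. All names and numbers are illustrative assumptions rather than values from the paper.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_sf(x, mean, sd):
    """Upper-tail probability P(X >= x) of a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def memberships(x, weights, means, sds, threshold=65535, censored=True):
    """Posterior probability of each cluster for a single pixel value."""
    if censored and x >= threshold:
        # Censored view: the pixel's true value is somewhere >= threshold.
        terms = [w * normal_sf(threshold, m, s)
                 for w, m, s in zip(weights, means, sds)]
    else:
        # Regular view: the recorded value is taken at face value.
        terms = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, sds)]
    total = sum(terms)
    return [t / total for t in terms]


def assign(x, labels, weights, means, sds, censored):
    post = memberships(x, weights, means, sds, censored=censored)
    return labels[post.index(max(post))]


# Hypothetical fitted parameters; the foreground mean lies above the threshold.
labels = ["background", "intermediate", "foreground"]
weights = [0.4, 0.3, 0.3]
means = [2000.0, 60000.0, 80000.0]
sds = [500.0, 3000.0, 10000.0]

regular_label = assign(65535, labels, weights, means, sds, censored=False)
censored_label = assign(65535, labels, weights, means, sds, censored=True)
print(regular_label, censored_label)
```

With these parameters the regular assignment places the saturated pixel in the intermediate cluster, because the narrow intermediate component has the higher density at 65535. The censored assignment places it in the foreground, because most of the probability mass above the threshold belongs to the foreground component.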


Figures

Figure 1. Valley Fever diagnosis study: Array image with a hexagonal grid superimposed. The red +'s are spot centers.

Figure 2. Valley Fever diagnosis study: Boxplots of foreground median intensities for blank spots, two-cluster spots and three-cluster spots on a random block with 484 spots. The number of clusters was selected by BIC.

Figure 3. Valley Fever diagnosis study: Three saturated spots segmented by the censored Gaussian mixture model (top panel) or the regular Gaussian mixture model (bottom panel). Foreground pixels are bounded by black line segments. Intermediate pixels that are neither foreground nor background are bounded between black and white line segments.

Figure 4. Lymphoma diagnosis study: Comparison of background-subtracted median intensity estimates for four spots on 21 arrays, based on the regular Gaussian mixture model and GenePix, each with the original, uncensored data (S = 65535), as well as the censored Gaussian mixture model and the regular Gaussian mixture model, each with the artificially saturated data (S = 1000 or 800).
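The Figure 2 caption notes that the number of clusters per spot was selected by BIC. As a rough sketch of what such a selection looks like, the code below uses the common definition BIC = p ln(n) - 2 ln(L), where smaller is better, and counts 3K - 1 free parameters for a univariate K-component Gaussian mixture. Both the sign convention and the parameter count are assumptions (some software uses the opposite sign), and the log-likelihood values are hypothetical.

```python
import math


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller values are preferred here."""
    return n_params * math.log(n_obs) - 2.0 * loglik


def n_params_gmm_1d(k):
    """Free parameters of a 1-D Gaussian mixture: k means, k variances, k-1 weights."""
    return 3 * k - 1


def select_k(logliks, n_obs):
    """Pick the number of clusters with the smallest BIC.

    logliks maps each candidate k to its maximized log-likelihood.
    """
    scores = {k: bic(ll, n_params_gmm_1d(k), n_obs) for k, ll in logliks.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical maximized log-likelihoods for one spot with 400 pixels.
logliks = {1: -1250.0, 2: -1100.0, 3: -1096.0}
best_k, scores = select_k(logliks, n_obs=400)
print(best_k, scores)
```

In this example the third cluster improves the log-likelihood only slightly, so the penalty term outweighs the gain and two clusters are selected. A spot for which one cluster wins would correspond to a blank spot in the caption's terminology.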
