BMC Bioinformatics. 2007 Jun 18;8:207.
doi: 10.1186/1471-2105-8-207.

Orthogonal projections to latent structures as a strategy for microarray data normalization

Max Bylesjö et al. BMC Bioinformatics. 2007.

Abstract

Background: During the generation of microarray data, various forms of systematic bias are frequently introduced, which limits the accuracy and precision of the results. To estimate biological effects properly, these biases must be identified and discarded.

Results: We introduce a normalization strategy for multi-channel microarray data based on orthogonal projections to latent structures (OPLS), a multivariate regression method. The effect of applying the normalization methodology to single-channel Affymetrix data as well as dual-channel cDNA data is illustrated. We provide a parallel comparison with a wide range of commonly employed normalization methods with diverse properties and strengths, based on sensitivity and specificity estimated from external (spike-in) controls. On the illustrated data sets, the OPLS normalization strategy exhibits leading average true negative and true positive rates in comparison with the other evaluated methods.

Conclusion: The OPLS methodology identifies joint variation within biological samples to enable the removal of sources of variation that are non-correlated (orthogonal) to the within-sample variation. This ensures that structured variation related to the underlying biological samples is separated from the remaining, bias-related sources of systematic variation. As a consequence, the methodology does not require any explicit knowledge regarding the presence or characteristics of certain biases. Furthermore, there is no underlying assumption that the majority of elements should be non-differentially expressed, making it applicable to specialized boutique arrays.
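The filtering idea in the conclusion can be sketched with the single-response OPLS algorithm, used here only as a simplified stand-in: the abstract does not spell out the authors' procedure for multi-channel data, so this is not their implementation. The function name `remove_orthogonal_component`, the helper names and the toy matrix are illustrative assumptions. `X` holds mean-centered intensities (rows are samples) and `y` is a mean-centered design vector.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(v):
    """Scale v to unit length."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]

def times(X, w):
    """Matrix-vector product X w."""
    return [dot(row, w) for row in X]

def t_times(X, t):
    """Transposed product X^T t."""
    return [dot(col, t) for col in zip(*X)]

def remove_orthogonal_component(X, y):
    """Remove one y-orthogonal component from X (single-response OPLS step).

    Assumes X and y are already mean-centered; hypothetical helper,
    not the authors' code.
    """
    w = unit(t_times(X, y))                 # y-predictive weight vector
    t = times(X, w)                         # predictive scores
    tt = dot(t, t)
    p = [a / tt for a in t_times(X, t)]     # predictive loadings
    wp = dot(w, p)
    # Part of the loading that is orthogonal to the predictive weights
    w_o = unit([pj - wp * wj for pj, wj in zip(p, w)])
    t_o = times(X, w_o)                     # y-orthogonal scores
    tt_o = dot(t_o, t_o)
    p_o = [a / tt_o for a in t_times(X, t_o)]  # y-orthogonal loadings
    X_filtered = [[x - ti * pj for x, pj in zip(row, p_o)]
                  for row, ti in zip(X, t_o)]
    return X_filtered, t_o, p_o

# Toy data: columns mix a biological signal (y) with an alternating bias
y = [-1.0, -1.0, 1.0, 1.0]
X = [[0.0, -1.0, 2.0],
     [-2.0, -1.0, -2.0],
     [2.0, 1.0, 2.0],
     [0.0, 1.0, -2.0]]
X_filtered, t_o, p_o = remove_orthogonal_component(X, y)
```

In this toy case the removed score vector is uncorrelated with `y`, so subtracting it leaves the `y`-related part of each column untouched. No assumption about the fraction of differentially expressed elements enters the calculation.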


Figures

Figure 1
Normalization results for the H8k data set. Panel A shows, for each normalization method, the total number of microarray elements identified as differentially expressed (DE). Panel B shows the true positive (TP) and true negative (TN) rates, based on the DE calls for the external controls. The TP rates are presented using solid black bars, whereas the TN rates are presented using striped bars. Raw refers to the un-normalized data.
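As a minimal sketch of how TP and TN rates of this kind can be computed from spike-in controls, assume each control has a known DE status and each normalization method yields a set of controls called DE. The function name `tp_tn_rates` and the control identifiers are hypothetical, and the values are toy data, not results from the paper.

```python
def tp_tn_rates(called_de, true_de, all_controls):
    """Return (TP rate, TN rate) for a set of DE calls on external controls.

    TP rate = correctly called DE controls / all truly DE controls.
    TN rate = correctly uncalled non-DE controls / all truly non-DE controls.
    """
    called_de = set(called_de)
    true_de = set(true_de)
    true_non_de = set(all_controls) - true_de
    tp = len(called_de & true_de)
    tn = len(true_non_de - called_de)
    return tp / len(true_de), tn / len(true_non_de)

# Hypothetical spike-in controls: c1-c3 are truly DE, c4-c6 are not
controls = {"c1", "c2", "c3", "c4", "c5", "c6"}
spiked_de = {"c1", "c2", "c3"}
called = {"c1", "c2", "c4"}   # calls after some normalization method

tp_rate, tn_rate = tp_tn_rates(called, spiked_de, controls)
```

Here two of three DE controls are recovered and two of three non-DE controls are correctly left uncalled, so both rates are 2/3.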
Figure 2
Illustration of the array baseline difference. The first Y-orthogonal score vector $t_{o,1}$ is shown together with the average A values for each slide. The $t_{o,1}$ values (averaged per slide) are displayed using point-up, light gray triangles, whereas the average A values are displayed using point-down, dark gray triangles. The Pearson correlation coefficient between the two series is 0.992, suggesting that the score vector captures an array bias.
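The comparison in this caption amounts to averaging the score values per slide and correlating those averages with the per-slide mean A values. The sketch below shows that calculation under assumed names (`per_slide_mean`, `pearson`) and with made-up toy numbers; it does not reproduce the paper's data or its 0.992 value.

```python
import math
from collections import defaultdict

def per_slide_mean(values, slide_ids):
    """Average values within each slide; returns means in sorted slide order."""
    groups = defaultdict(list)
    for v, s in zip(values, slide_ids):
        groups[s].append(v)
    return [sum(groups[s]) / len(groups[s]) for s in sorted(groups)]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy values: two score entries and two A entries per slide (illustrative only)
slides = [1, 1, 2, 2, 3, 3]
scores = [-1.2, -0.8, 0.1, -0.1, 0.9, 1.1]   # hypothetical orthogonal scores
a_vals = [7.9, 8.1, 9.0, 9.0, 10.2, 9.8]     # hypothetical A values

r = pearson(per_slide_mean(scores, slides), per_slide_mean(a_vals, slides))
```

A coefficient close to 1 for real data would indicate, as in the caption, that the orthogonal score tracks the overall intensity level of each array.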
Figure 3
Illustration of a print-tip group effect. The eighth Y-orthogonal loading vector $p_{o,8}^{T}$ is displayed using a spatial representation of the array layout. The 48 print-tip groups are delimited using solid lines. Darker areas denote higher absolute loading values, whereas brighter areas denote lower absolute loading values. One distinct print-tip group with high-magnitude loading values can be seen in the upper right corner of the figure (indicated by the arrow), capturing a print-tip group effect.
