BMC Bioinformatics. 2008 Apr 14:9:194.
doi: 10.1186/1471-2105-9-194.

A probe-treatment-reference (PTR) model for the analysis of oligonucleotide expression microarrays

Huanying Ge et al. BMC Bioinformatics. 2008.

Abstract

Background: Microarray pre-processing usually consists of normalization and summarization. Normalization aims to remove non-biological variations across different arrays. The normalization algorithms generally require the specification of reference and target arrays. The issue of reference selection has not been fully addressed. Summarization aims to estimate the transcript abundance from normalized intensities. In this paper, we consider normalization and summarization jointly by a new strategy of reference selection.

Results: We propose a Probe-Treatment-Reference (PTR) model that streamlines normalization and summarization by allowing multiple references. We estimate the model parameters by the Least Absolute Deviations (LAD) approach and carry out the computation by median polishing. We show that the LAD estimator is robust in the sense that it has bounded influence in the three-factor PTR model. The model fitting implicitly defines an "optimal reference" for each probe-set. We evaluate the effectiveness of the PTR method on two Affymetrix spike-in data sets. Our method reduces the variation of non-differentially expressed genes and thereby increases the power to detect differentially expressed genes.
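
As an illustration of fitting a three-factor model by median polishing, the sketch below assumes an additive form on the log scale, y[p][t][r] = overall + probe[p] + treat[t] + ref[r] + residual. The abstract does not state the exact parameterization, so this form, the function name `median_polish_3way`, and the stopping rule are assumptions for illustration and not the authors' implementation. Median polishing is an iterative approximation that is commonly used in place of an exact LAD solver.

```python
from statistics import median


def median_polish_3way(y, max_iter=20, tol=1e-9):
    """Fit y[p][t][r] ~ overall + probe[p] + treat[t] + ref[r] by median polish.

    y is a nested list indexed by probe, treatment, and reference array
    (hypothetical layout). Returns (overall, probe, treat, ref, residuals).
    """
    n_p, n_t, n_r = len(y), len(y[0]), len(y[0][0])
    resid = [[[float(y[p][t][r]) for r in range(n_r)]
              for t in range(n_t)] for p in range(n_p)]
    overall = 0.0
    probe, treat, ref = [0.0] * n_p, [0.0] * n_t, [0.0] * n_r
    cells = [(p, t, r) for p in range(n_p)
             for t in range(n_t) for r in range(n_r)]

    for _ in range(max_iter):
        change = 0.0
        # Sweep each factor in turn: axis 0 = probe, 1 = treatment, 2 = reference.
        for axis, effects in enumerate((probe, treat, ref)):
            for level in range(len(effects)):
                group = [c for c in cells if c[axis] == level]
                m = median(resid[p][t][r] for p, t, r in group)
                effects[level] += m
                change += abs(m)
                for p, t, r in group:
                    resid[p][t][r] -= m
            # Move the median of the effects into the overall term so that
            # each factor's effects stay centered at zero.
            center = median(effects)
            overall += center
            for level in range(len(effects)):
                effects[level] -= center
        if change < tol:  # no sweep moved anything: converged
            break
    return overall, probe, treat, ref, resid
```

With two treatments, `treat[1] - treat[0]` is the fitted log-ratio for the probe-set, and `ref` collects the per-reference effects under the assumed additive form.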

Conclusion: Our results indicate that the reference effect is important and should be considered in microarray pre-processing. The proposed PTR method is a general framework for dealing with reference selection and can readily be applied to existing normalization algorithms such as the invariant-set, sub-array and quantile methods.
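
For readers unfamiliar with the quantile method mentioned above, here is a minimal sketch of standard quantile normalization against a pseudo-reference formed by averaging the sorted intensities of all arrays. The function name `quantile_normalize` and the simple tie handling (ties broken by position) are assumptions for illustration; this is not the PTR method itself and not the authors' code.

```python
def quantile_normalize(arrays):
    """Map every array onto the average-quantile pseudo-reference.

    arrays: list of equal-length lists of intensities, one list per array.
    Returns (reference, normalized), where reference[k] is the mean of the
    k-th smallest value across arrays.
    """
    n = len(arrays[0])
    if any(len(a) != n for a in arrays):
        raise ValueError("all arrays must have the same number of probes")

    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays)
                 for k in range(n)]

    normalized = []
    for a in arrays:
        # order[k] is the index of the probe holding the k-th smallest value.
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return reference, normalized
```

After this step every array has the same empirical distribution, namely that of the pseudo-reference.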

Figures

Figure 1
The scheme of the PTR method. It includes the reference and target selection, multiple normalization, and three-factor model fitting of summarization. Here, we only illustrate the cross strategy for the reference and target selection.
Figure 2
M-A plots of the perturbed data set using different normalization and reference selections. Top (A1-A3): invariant-set; middle (B1-B3): quantile; bottom (C1-C3): sub-array. Left column (A1, B1 and C1): the reference is the perturbed array Exp03_R1*. Middle column (A2, B2 and C2): the reference in both A2 and C2 is Exp03_R2, while the reference in B2 is the pseudo-reference defined as the average quantiles of all six arrays. Right column (A3, B3 and C3): the result obtained by the PTR method using all six arrays as references. The grey dots are non-spike-in genes; the black dots are spike-in genes, which are expected to have log-ratio M = 1. The PTR results are not affected by the perturbed array Exp03_R1* and show the smallest variation for non-spike-in genes.
Figure 3
The LOESS curves of |M| versus A for various pre-processing methods. These plots compare the PTR method with other pre-processing methods based on the variation of non-spike-in genes. The PTR method gives the smallest variation for all three normalization algorithms.
Figure 4
ROC curves of the PTR and other methods. X-axis: 1 – specificity; Y-axis: sensitivity. The PTR method performs the best in all cases.
Figure 5
Distribution of the reference effect. This box plot of the reference effect is obtained from the PTR model fitting on the perturbed data set after invariant-set normalization. The first reference array, Exp03_R1*, has been perturbed by adding noise and shows a markedly different distribution from the others.
Figure 6
The frequency of being the "implicit optimal reference". The plot shows how often each reference array serves as the "implicit optimal reference" across all probe-sets. It is computed from the residual assessment after applying the PTR method with invariant-set normalization to the data set "Expt-3-4".
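
The quantities behind the M-A plots and ROC curves in the captions above can be sketched as follows. M and A use the usual definitions (log2 ratio and mean log2 intensity). Ranking genes by |M| to trace the ROC curve is an assumption made here for illustration, since the captions do not state the scoring statistic; the function names are likewise hypothetical.

```python
import math


def ma_values(treatment, control):
    """Return (M, A): log2 ratio and mean log2 intensity per gene."""
    m = [math.log2(t) - math.log2(c) for t, c in zip(treatment, control)]
    a = [(math.log2(t) + math.log2(c)) / 2 for t, c in zip(treatment, control)]
    return m, a


def roc_points(scores, is_spike_in):
    """Return (1 - specificity, sensitivity) pairs as the threshold is lowered.

    scores: larger means stronger evidence of differential expression.
    is_spike_in: True for genes known to be differentially expressed.
    Genes with tied scores are added together, so ties yield a single point.
    """
    n_pos = sum(1 for flag in is_spike_in if flag)
    n_neg = len(is_spike_in) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one spike-in and one non-spike-in gene")

    ranked = sorted(zip(scores, is_spike_in), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, flag) in enumerate(ranked):
        if flag:
            tp += 1
        else:
            fp += 1
        is_last = i == len(ranked) - 1
        if is_last or ranked[i + 1][0] != score:
            points.append((fp / n_neg, tp / n_pos))
    return points
```

A typical use would be `m, a = ma_values(treated, control)` followed by `roc_points([abs(x) for x in m], labels)`, where `labels` marks the spike-in genes.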
