J Proteome Res. 2013 Feb 1;12(2):594-604.
doi: 10.1021/pr300624g. Epub 2013 Jan 16.

Statistical inference from multiple iTRAQ experiments without using common reference standards

Shelley M Herbrich et al. J Proteome Res.

Abstract

Isobaric tags for relative and absolute quantitation (iTRAQ) is a prominent mass spectrometry technology for protein identification and quantification that is capable of analyzing multiple samples in a single experiment. Frequently, iTRAQ experiments are carried out using an aliquot from a pool of all samples, or "masterpool", in one of the channels as a reference sample standard to estimate protein relative abundances in the biological samples and to combine abundance estimates from multiple experiments. In this manuscript, we show that using a masterpool is counterproductive. We obtain more precise estimates of protein relative abundance by using the available biological data instead of the masterpool and do not need to occupy a channel that could otherwise be used for another biological sample. In addition, we introduce a simple statistical method to associate proteomic data from multiple iTRAQ experiments with a numeric response and show that this approach is more powerful than the conventionally employed masterpool-based approach. We illustrate our methods using data from four replicate iTRAQ experiments on aliquots of the same pool of plasma samples and from a 406-sample project designed to identify plasma proteins that covary with nutrient concentrations in chronically undernourished children from South Asia.

Figures

Figure 1
Quantification of relative peptide and protein abundances in an experiment with eight technical replicates of pooled plasma samples. The normalization steps are shown for four randomly selected proteins (accession numbers gi67857790, gi221316614, gi4503635, and gi67782358, in columns 1–4). Shown are the absolute reporter ion intensities on the log10 scale for all spectra detected for the respective proteins (first row), the same data after removal of the spectrum median (second row), the relative peptide abundances calculated as the median of the row 2 values for each channel and peptide (third row), and the relative protein abundances after removing the loading effect (fourth row). The gray regions (rows 2–4) represent 25% fold changes in relative abundances. Since the same samples were loaded in the eight channels, all relative protein abundances should be equal to 1.
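The normalization steps in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `median_sweep` and the `(protein, peptide, intensities)` input layout are hypothetical, the protein-level value is assumed to be the per-channel median across peptides, and the loading effect is assumed to be the per-channel median across all proteins.

```python
from collections import defaultdict
from math import log10
from statistics import median


def median_sweep(spectra):
    """Sketch of the caption's normalization steps (hypothetical helper).

    spectra: list of (protein, peptide, intensities), where intensities holds
    one reporter ion intensity per channel.
    Returns {protein: [log10 relative abundance per channel]}.
    """
    # Rows 1-2: log10 transform, then remove each spectrum's median.
    by_peptide = defaultdict(list)
    for protein, peptide, intensities in spectra:
        logs = [log10(x) for x in intensities]
        m = median(logs)
        by_peptide[(protein, peptide)].append([v - m for v in logs])

    # Row 3: relative peptide abundance = per-channel median across spectra.
    by_protein = defaultdict(list)
    for (protein, _peptide), rows in by_peptide.items():
        by_protein[protein].append([median(col) for col in zip(*rows)])

    # Assumption: protein value = per-channel median across its peptides.
    protein_est = {
        p: [median(col) for col in zip(*rows)] for p, rows in by_protein.items()
    }

    # Row 4 (assumption): loading effect = per-channel median across proteins.
    loading = [median(col) for col in zip(*protein_est.values())]
    return {
        p: [v - l for v, l in zip(vals, loading)]
        for p, vals in protein_est.items()
    }
```

With identical samples in every channel and one channel loaded twice as heavily, the sweep should return log10 values near 0 for every protein and channel, that is, relative abundances near 1 on the original scale.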
Figure 2
Protein relative abundance estimates (left column) derived from four all-masterpool 8-plex iTRAQ experiments (top to bottom). Proteins observed in all four experiments with two or more peptides were considered (n = 130). Boxplots of the relative abundance estimates are shown for the eight channels in each of the four experiments, using linear mixed effects models estimating all relative abundances simultaneously (white), linear mixed effects models estimating all relative abundances separately for each experiment (light gray), and standard linear models estimating all relative abundances separately for each experiment (dark gray). The median absolute fold change (right column) represents a typical value for the deviation from 0 on the log2 scale, mapped back to the original fold change scale. A value of 1 represents no variability; a value of 1.1 indicates that the typical deviation is 10%. The gray regions in the left panels represent 25% fold changes in relative abundances.
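One reading of the median absolute fold change described here is the median of the absolute log2 estimates, raised back as a power of 2. The sketch below follows that reading; the function name is hypothetical and the exact definition used in the paper's Methods section may differ.

```python
from statistics import median


def median_absolute_fold_change(log2_estimates):
    """Typical deviation from 0 on the log2 scale, mapped back to a fold change."""
    return 2 ** median(abs(x) for x in log2_estimates)
```

When every estimate is 0 on the log2 scale the function returns 1 (no variability), and when the typical absolute deviation is log2(1.1) it returns about 1.1, matching the 10% interpretation in the caption.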
Figure 3
Estimates of relative abundances of retinol-binding protein 4 (RBP4) derived from 58 8-plex iTRAQ experiments (seven individual samples and one masterpool each, 406 biological samples total) versus measured plasma retinol concentration. Relative abundance estimates from different experiments were combined using the masterpool (A), the median sweep augmenting the estimates from each experiment (B), and the linear mixed effects model (eq 5), using the median sweep-derived relative abundance estimates (C). Highlighted are two experiments, one with high (lighter gray) and one with low (darker gray) average plasma retinol concentration.
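Because relative abundances are only defined within an experiment, each experiment carries its own offset when estimates are stacked. The sketch below is not the linear mixed effects model of eq 5. It swaps in a simpler fixed-effects analog, centering both the protein estimate and the response within each experiment before computing R², to illustrate why an experiment-specific intercept helps. The function name and record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def within_experiment_r2(records):
    """R^2 between abundance and response after per-experiment centering.

    records: iterable of (experiment_id, log_relative_abundance, response).
    Stand-in for a random intercept model, not the paper's eq 5.
    """
    groups = defaultdict(list)
    for exp, x, y in records:
        groups[exp].append((x, y))

    xs, ys = [], []
    for pairs in groups.values():
        # Remove the experiment-specific offset from both variables.
        mx = mean(x for x, _ in pairs)
        my = mean(y for _, y in pairs)
        xs += [x - mx for x, _ in pairs]
        ys += [y - my for _, y in pairs]

    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return sxy * sxy / (sxx * syy)
```

In a toy case where two experiments share the same slope but have different offsets, the centered R² is 1, whereas naive stacking of the uncentered estimates would blur the relationship.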
Figure 4
The strongest associations (R², y-axis) of proteins (accession numbers on the x-axis) with nutrients or indicators of acute phase response observed in our experiments (Table 2). Estimates of relative protein abundance from different experiments were combined using the linear mixed effects model (eq 5), the masterpool normalization, and the median sweep, augmenting (stacking) the estimates from each experiment. The association of RBP4 (accession number gi55743122) and plasma retinol is shown in detail in Figure 3.
Figure 5
The median absolute fold changes (top) and the root mean-squared fold changes (bottom) for seven different relative abundance estimating procedures, defined in the Methods section. Proteins observed in all four experiments with two or more peptides were included in the assessment (n = 130), and results are shown for each procedure and each iTRAQ experiment, calculating summary statistics across all proteins and channels. The masterpool-based procedures 1 and 2 clearly fare the worst. In addition, median-based procedures, for example, 5 and 7, appear to perform better than the respective mean-based procedures, for example, 4 and 6.
Figure 6
The associations (R², y-axis) of proteins (accession numbers on the x-axis) with nutrients or indicators of acute phase response observed in our experiments (a subset of Table 2). Proteins observed in at least half of the samples (203, corresponding to 29 complete experiments with seven biological samples) with two or more peptides were included in the assessment (n = 92). The R² values were derived from the random intercept model (eq 5), based on seven different procedures to estimate protein relative abundances (defined in the Methods section). Highlighted are the results for the median-based masterpool approach (2) and the median sweep ignoring peptides (5), performing the best in this assessment.
