Nat Biotechnol. 2010 Jan;28(1):83-9.
doi: 10.1038/nbt.1592. Epub 2009 Dec 13.

Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis

Noelle M Griffin et al. Nat Biotechnol. 2010 Jan.

Abstract

Replicate mass spectrometry (MS) measurements and the use of multiple analytical methods can greatly expand the comprehensiveness of shotgun proteomic profiling of biological samples. However, the inherent biases and variations in such data create computational and statistical challenges for quantitative comparative analysis. We developed and tested a normalized, label-free quantitative method termed the normalized spectral index (SI(N)), which combines three MS abundance features: peptide count, spectral count and fragment-ion (tandem MS or MS/MS) intensity. SI(N) largely eliminated variances between replicate MS measurements, permitting quantitative reproducibility and highly significant quantification of thousands of proteins detected in replicate MS measurements of the same and distinct samples. It accurately predicts protein abundance more often than the five other methods we tested. Comparative immunoblotting and densitometry further validate our method. Comparative quantification of complex data sets from multiple shotgun proteomics measurements is relevant for systems biology and biomarker discovery.
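
The abstract names the three features that SI(N) combines but does not give the formula. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the spectral index (SI) of a protein is the sum of fragment-ion intensities over every spectrum of every identified peptide, and that normalization divides by the total SI of the run and by protein length. The function names, the nested-list input layout and both normalization steps are assumptions made for illustration; the paper's online methods hold the actual definition.

```python
def spectral_index(peptides):
    """Sum fragment-ion intensities over all spectra of all peptides.

    `peptides` is a hypothetical layout: a list of peptides, each a list
    of spectra, each a list of fragment-ion intensities. The peptide
    count and spectral count enter implicitly through the nesting.
    """
    return sum(
        sum(spectrum)
        for spectra in peptides
        for spectrum in spectra
    )


def normalized_spectral_index(proteins, lengths):
    """Assumed normalization: SI / (total SI in the run) / protein length."""
    si = {name: spectral_index(peps) for name, peps in proteins.items()}
    total = sum(si.values())
    if total == 0:
        raise ValueError("run contains no fragment-ion intensity")
    return {name: si[name] / total / lengths[name] for name in si}


# Toy run with two proteins (made-up intensities and lengths).
run = {
    "A": [[[10.0, 20.0], [30.0]], [[40.0]]],  # 2 peptides, 3 spectra
    "B": [[[100.0]]],                          # 1 peptide, 1 spectrum
}
lengths = {"A": 100, "B": 50}
si_n = normalized_spectral_index(run, lengths)
```

In this toy run both proteins carry the same summed intensity, so the shorter protein B receives the larger normalized value.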

Figures

Figure 1. Statistical analysis of replicate MS measurement variation before and after normalization
The mean and 95% confidence interval (CI) for the abundance features peptide number, PN (a), spectral count, SC (b), and spectral index, SI (c), were calculated for 4 MS replicate measurements of pooled endothelial cell plasma membrane isolated from liver and were plotted using the mean diamonds and comparison circles methods. If the CIs, as indicated by the diamonds, do not overlap, the groups are significantly different. For statistical analysis of differences in mean intensities or other features between multiple replicate samples, one-way analysis of variance (ANOVA) was performed. Our null hypothesis was that all replicate samples were equal. If the null hypothesis is true, we expect the F-ratio to be ∼1 (d.f. = 5919). Our significance level was p<0.05. The x-axis represents each of the 4 replicate datasets and the y-axis represents the log of the abundance feature being examined (n=5923). The indicated normalization methods were applied separately to the SI (d-k) or SC (l) datasets and tested for differences as described above. We applied the NSAF and Rsc methods to the replicate datasets (see online methods for equations) and tested for differences (m,n). Graph of the comparison of F-ratios obtained from statistical testing of SIN, Rsc and RSI, where RSI is the Rsc equation with SI substituted for SC (n).
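
The one-way ANOVA F-ratio used in this caption can be sketched in plain Python. This is a generic textbook computation, not the authors' analysis pipeline; the function name and the toy data are hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Toy data: three "replicates" of log abundance values (made up).
replicates = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_ratio, df_b, df_w = one_way_anova_f(replicates)
```

An F-ratio near 1 means the spread between replicate means is no larger than the spread within replicates, which is what the null hypothesis predicts. If n = 5923 in the caption is the total number of observations across the 4 replicates, the within-group degrees of freedom would be 5923 - 4 = 5919, which agrees with the stated d.f.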
Figure 2. Correlation of SIN with protein abundance
(a) A protein standard mix spanning a wide dynamic range (0.5–50,000 fmol) was spiked into BSA, separated by SDS-PAGE, trypsin digested and analyzed by 2DLC. SIN values for each spiked protein were calculated, averaged and plotted against the amount of the protein standard added. (Note: many of the data points cluster close to the origin because of the large range in protein abundance; this region was zoomed and expanded for ease of visualization.) The R correlation was 0.9239. (b-d) Statistical analysis comparing the quantification of proteins across replicate measurements using 6 quantification methods (relative to known value). The mean and 95% CI for protein abundance, as determined by various relative quantitative methods, were plotted for three representative proteins from the “standard protein mixture” and compared to the actual loaded amount using ANOVA, and individual means were compared using the Tukey-Kramer HSD method. Quantitative methods that were not significantly different from the actual protein abundance (ANOVA, α=0.05) are highlighted in red. (e) The table summarizes the statistical analysis used to compare the ability of 5 quantitative methods to accurately determine the correct amount of a protein in a standard mixture across replicate datasets. The number of correct abundance determinations was determined using ANOVA, where the predicted protein amount, as determined by each method, did not deviate significantly from the mean of the actual protein amount (α=0.05). Abbreviations: SC, spectral count; AUC, area under the curve (total AUC for all identifying peptides).
Figure 3. Statistical analysis of normalization methods applied to variable protein load and distinct sample datasets
The indicated normalization methods were each applied to the 40 and 150 μg MS datasets from normal lung endothelial cell plasma membranes (ECPM). Mean and 95% CI for (a) the raw SI dataset and datasets normalized by (b) the dilution factor, (c) SIN, (d) Rsc and (e) NSAF were plotted using the mean diamonds and comparison circles. The x-axis represents the 2 different protein loads and the y-axis represents the log of the normalized abundance feature (number of common proteins, n = 2660). (f) T-ratios for statistical testing of SIN and Rsc are plotted as a function of peptide cut-off numbers (number of peptides/protein commonly identified between the samples). The α=0.05 significance line is plotted; T-ratios above this line indicate that the samples are different. (g) The ng-converted SIN values (based on initial sample load) for 2,660 proteins common between the 40 and 150 μg datasets were plotted against each other. The slope of the line is 3.72, R = 0.94. (h) Two-way clustering of ∼3,000 proteins identified in ECPM heart and kidney samples. Each column in the matrix represents a single 2D-LC-MS/MS run for either heart or kidney, based on the SIN normalized MS data. Proteins (rows) and tissues (columns) are clustered based on their similarities in protein intensity profile. Colors within the heatmap range from light blue (least prevalent) to dark red (most prevalent), illustrating the relative abundance of each protein within a particular sample.
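
The slope reported in panel (g) can be compared with the ratio of the two protein loads. This short check assumes the 150 μg dataset is on the y-axis and that ideal quantification would give a slope equal to the load ratio. Both points are assumptions made here, not statements from the caption, and the function names are hypothetical.

```python
def expected_slope(load_high_ug, load_low_ug):
    """Ratio of the two protein loads (assumed ideal slope)."""
    return load_high_ug / load_low_ug


def relative_deviation(observed, expected):
    """Signed deviation of the observed slope from the expected one."""
    return (observed - expected) / expected


expected = expected_slope(150.0, 40.0)        # loads from the caption
deviation = relative_deviation(3.72, expected)  # slope from the caption
print(f"expected {expected:.2f}, observed 3.72, deviation {deviation:.1%}")
```

Under these assumptions the reported slope lies within about 1% of the 150/40 load ratio.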
Figure 4. Comparative analysis of proteins quantified by SDS-PAGE and MS analysis
(a) Proteins in ECPM from rat lung were separated by SDS-PAGE, stained with Coomassie blue and cut into 51 slices. Each gel slice was subjected to densitometry and MS analysis. (b) The densitometry intensities for each slice were compared to the SI on the same axis, with the x-axis being the gel slice number. (c) 64 proteins found in both lung ECPM (P) and the entire lung homogenate (H) were analyzed by Western blotting to quantify protein signal by densitometry. The P/H ratio for each protein from the Western analysis is plotted against its P/H ratio from the SIN values (multiple measurements). Spearman's rho correlation between the Western and SIN ratios is ρ = 0.86; all the points fall within the 95% CI (red line). (d) The Bland-Altman plot for the two methods with 1 and 2 s.d. of the mean.
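
The two agreement statistics named in panels (c) and (d) can be sketched as follows. This is a generic illustration with made-up numbers, not the authors' data or code; the function names are hypothetical, and the Bland-Altman bands are taken as the mean difference plus or minus 1 and 2 sample standard deviations.

```python
from math import sqrt
from statistics import mean, stdev


def average_ranks(values):
    """Rank values from 1, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(a, b):
    """Spearman's rho as the Pearson correlation of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / sqrt(var_a * var_b)


def bland_altman(a, b):
    """Return per-pair means and differences, the bias, and s.d. bands."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with >= 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    bands = {k: (bias - k * sd, bias + k * sd) for k in (1, 2)}
    return means, diffs, bias, sd, bands


# Made-up P/H ratios from two hypothetical methods.
western = [1.0, 2.0, 3.0, 4.0]
ms_based = [1.5, 1.5, 3.5, 3.5]
rho = spearman_rho(western, ms_based)
means, diffs, bias, sd, bands = bland_altman(western, ms_based)
```

Spearman's rho depends only on rank order, so it tolerates a nonlinear but monotonic relation between the two methods. The Bland-Altman bands show whether the per-protein differences stay within a fixed spread around the mean difference.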
