Review
Proteomics. 2007 Aug;7(16):2904-19.
doi: 10.1002/pmic.200700267.

Protein abundance ratios for global studies of prokaryotes

Qiangwei Xia et al. Proteomics. 2007 Aug.

Abstract

The use of multidimensional capillary HPLC combined with MS/MS has allowed high qualitative and quantitative proteome coverage of prokaryotic organisms. The determination of protein abundance change between two or more conditions has matured to the point that false discovery rates can be very low and, for smaller proteomes, coverage is sufficiently high to explicitly consider false negative error. Selected aspects of using these methods for global protein abundance assessments are reviewed. These include instrumental issues that influence the reliability of abundance ratios; a comparison of sources of nonlinearity, errors, and data compression in proteomics and spotted cDNA arrays; strengths and weaknesses of spectral counting versus stable isotope metabolic labeling; and a survey of microbiological applications of global abundance analysis at the protein level. Proteomic results for two organisms that have been studied extensively using these methods are reviewed in greater detail. For the methanogenic archaeon Methanococcus maripaludis, spectral counting and metabolic labeling data are compared and the utility of proteomics for global gene regulation studies is discussed. The oral pathogen Porphyromonas gingivalis is discussed as an example of an organism in which a large percentage of the proteome differs in relative abundance between the intracellular and extracellular phenotypes.

Figures

Figure 1
Scatter plot of log10 of isotopic peptide pairs versus RSD for 1129 proteins associated with more than five isotope pairs, (A). Scatter plot of log10 isotopic heavy/light pairs versus log2 14N/15N signal intensity ratios, (B). In order to simulate a proteome-wide analysis in which all abundance ratios were known, we used AH030-54 and AH030-49, biological replicates of M. maripaludis with flipped isotope labeling, without normalization. AH030-54 was at natural abundance and AH030-49 was 15N labeled. We adjusted the total amount of AH030-54 (~1.0 mg total protein in each undiluted sample) and AH030-49 according to the results of a Bradford total protein assay such that AH030-49 was diluted approximately 16-fold. The two samples were mixed and run through the MudPIT procedure together, as described previously [61], with changes as noted here. An LTQ instrument was used with acquisition parameters as described [9]. The MAD z-score test was applied to detect outliers with a value of 3.5 or greater [61]. Then, the top five ranked isotope ratios with the lowest MAD z-scores were used to determine the reported ratio and RSD. The average RSD was 19%.
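The outlier screen and ratio selection described in the Figure 1 caption can be illustrated with a minimal sketch. It is not the authors' implementation (the procedure itself is in their reference [61]). The sketch assumes the commonly used modified z-score, 0.6745 × (x − median) / MAD, assumes that "lowest MAD z-scores" means lowest absolute score, and assumes that the reported ratio and RSD are computed on the linear scale. The function names `mad_z_scores` and `protein_ratio` and the sample values are hypothetical.

```python
from statistics import mean, median, stdev


def mad_z_scores(values, scale=0.6745):
    """Modified z-scores based on the median absolute deviation (MAD).

    The 0.6745 scale factor is an assumption (the usual convention);
    the source only names a "MAD z-score test".
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # Degenerate case: more than half the values equal the median.
        # This sketch then flags nothing.
        return [0.0 for _ in values]
    return [scale * (v - med) / mad for v in values]


def protein_ratio(log2_ratios, cutoff=3.5, top_n=5):
    """Drop outliers (|z| >= cutoff), keep the top_n ratios with the
    smallest |z|, and return (mean ratio, RSD in percent, kept log2 ratios)."""
    scores = mad_z_scores(log2_ratios)
    kept = [(abs(z), r) for z, r in zip(scores, log2_ratios) if abs(z) < cutoff]
    kept.sort(key=lambda pair: pair[0])
    best = [r for _, r in kept[:top_n]]
    linear = [2 ** r for r in best]  # back to the linear scale (assumption)
    avg = mean(linear)
    rsd = 100 * stdev(linear) / avg if len(linear) > 1 else 0.0
    return avg, rsd, best


# Hypothetical log2 peptide-pair ratios for one protein near a 16-fold
# difference (log2 = 4), with one obvious outlier.
pairs = [4.0, 4.1, 3.9, 4.05, 3.95, 4.2, 9.0]
ratio, rsd, used = protein_ratio(pairs)
print(f"ratio = {ratio:.1f}, RSD = {rsd:.1f}%, pairs used = {len(used)}")
```

Ranking by absolute score keeps the pairs closest to the median, which is one plausible reading of the caption; a different ranking rule would change which five pairs are retained.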
Figure 2
Scatter plot of log2 total spectral counts for a control population of P. gingivalis versus the ratio of spectral counts for two technical replicates of the same sample, for 1074 proteins, (A). Such plots have been used to determine error boundaries and as an aid to determining FPRs and FDRs [9]. The solid line is a LOWESS curve [103] fit to the extreme values above and below zero. The 19 values in red are false positives for abundance change as determined by a q-value cutoff of 0.01, see Table 3 and [9]. Scatter plot of log2 of sum of signal intensity measurements for the same two P. gingivalis technical replicates versus ratios, (B). The data handling procedures for the intensity data have been described [9]. Only one data point (red) generated a false positive result at a q-cutoff of 0.001, see Table 3. A total of 884 data points were plotted. Taken from Xia et al., (submitted for publication).
Figure 3
Scatter plot of log2 of sum of spectral counts for the same P. gingivalis control sample as shown in Fig. 2 and a 16-fold dilution versus the ratio of spectral counts, (A). The data were used to establish an FNR under these conditions, see Table 3 and supplementary Fig. S1. A total of 975 data points were plotted; 293 proteins shown in red had q-values less than 0.01. The solid lines are LOWESS curves defining the region of random error about zero, see Fig. 2. Scatter plot of log10 of sum of signal intensities versus log2 ratios for the same data, (B). Data points for 694 proteins were plotted; 693 (red) were significantly greater than zero (log2 scale), with q-values less than 0.001, see Table 3.
Figure 4
Scatter plot of log2 sum of spectral counts for two replicates of M. maripaludis AH030-104 versus log2 of AH030-104 ratios of spectral counts for each replicate, (A). 1374 data points were plotted; 35 false positives for non-zero ratios with q-values less than 0.01 are shown in red, see Tables 1, 2. Scatter plot of log10 sum of signal intensities versus ratios calculated from signal intensity [9], (B). The solid lines are LOWESS curves used to set boundaries on the region of random scatter about zero. Of 1214 proteins, only two (red) had q-values less than 0.001.
Figure 5
Scatter plot of log2 sum of spectral counts for M. maripaludis AH030-104 and a five-fold dilution versus log2 AH030-104/dilution spectral count ratios, (A). The lack of adequate statistical power to detect a five-fold simulated abundance change is evident, see Table 2. There were 1397 data points, of which 120 proteins shown in red had q-values less than 0.01, yielding an FNR of 92%. The LOWESS curves show the expected region of random scatter about zero, see Fig. 4. Scatter plot of log2 sum of signal intensities for AH030-104 and dilution versus log2 AH030-104/dilution signal intensity ratios, (B). Note the much greater power of the signal intensity approach [9] to detect a five-fold change. There were 1212 data points, of which 923 ORFs shown in red had q-values less than 0.001, see Table 2. The LOWESS curves show the region of expected random scatter about zero.
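The dilution experiments in Figures 3 and 5 have a known expected answer: a k-fold dilution should give a log2 ratio of log2(k) for every protein. The sketch below tabulates, from the counts stated in the captions, that expected ratio and the fraction of plotted proteins flagged as changed. It is a naive illustration only. The spectral count panels use a q-value cutoff of 0.01 and the signal intensity panels use 0.001, so the fractions are not on identical footing, and the FNR values the source reports in its Tables 2 and 3 are not necessarily equal to one minus these fractions. The helper names are hypothetical.

```python
import math


def expected_log2_ratio(dilution_fold):
    """log2 ratio expected when one sample is diluted dilution_fold-fold."""
    return math.log2(dilution_fold)


def detected_fraction(n_flagged, n_plotted):
    """Fraction of plotted proteins flagged below the q-value cutoff."""
    if n_plotted <= 0:
        raise ValueError("n_plotted must be positive")
    return n_flagged / n_plotted


# (figure panel, method, dilution fold, proteins flagged, proteins plotted),
# all taken from the figure captions.
panels = [
    ("3A", "spectral counts", 16, 293, 975),
    ("3B", "signal intensity", 16, 693, 694),
    ("5A", "spectral counts", 5, 120, 1397),
    ("5B", "signal intensity", 5, 923, 1212),
]

for label, method, fold, flagged, plotted in panels:
    print(
        f"Fig. {label} ({method}): expected log2 ratio "
        f"{expected_log2_ratio(fold):.2f}, "
        f"flagged {detected_fraction(flagged, plotted):.1%}"
    )
```

Laid out this way, the captions' point is easy to see: for the same simulated change, the signal intensity panels flag a much larger share of proteins than the spectral count panels.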

References

    1. Ishii N, Robert M, Nakayama Y, Kanai A, Tomita M. Toward large-scale modeling of the microbial cell for computer simulation. J Biotechnol. 2004;113:281–294.
    2. Souchelnytskyi S. Bridging proteomics and systems biology: what are the roads to be traveled? Proteomics. 2005;5:4123–4137.
    3. Mayya V, Han KD. Proteomic applications of protein quantification by isotope-dilution mass spectrometry. Expert Rev Proteomics. 2006;3:597–610.
    4. Domon B, Aebersold R. Mass spectrometry and protein analysis. Science. 2006;312:212–217.
    5. Quackenbush J. From 'omes to biology. Anim Genet. 2006;37 Suppl 1:48–56.
