Comparative Study
Mol Cell Proteomics. 2019 Oct;18(10):1967-1980.
doi: 10.1074/mcp.RA119.001472. Epub 2019 Jul 22.

Multibatch TMT Reveals False Positives, Batch Effects and Missing Values

Alejandro Brenes et al. Mol Cell Proteomics. 2019 Oct.

Abstract

Multiplexing strategies for large-scale proteomic analyses have become increasingly prevalent, tandem mass tags (TMT) in particular. Here we used a large iPSC proteomic experiment with twenty-four 10-plex TMT batches to evaluate the effect of integrating multiple TMT batches within a single analysis. We identified a significant inflation in the rate of protein missing values as multiple batches are integrated and show that this pattern is aggravated at the peptide level. We also show that without normalization strategies to address the batch effects, the high precision of quantitation within a single multiplexed TMT batch is not reproduced when data from multiple TMT batches are integrated. Further, the incidence of false positives was studied by using Y chromosome peptides as an internal control. The iPSC lines quantified in this data set were derived from both male and female donors, hence the peptides mapped to the Y chromosome should be absent from female lines. Nonetheless, these Y chromosome-specific peptides were consistently detected in the female channels of all TMT batches. We then used the same Y chromosome-specific peptides to quantify the level of ion coisolation as well as the effect of primary and secondary reporter ion interference. These results were used to propose solutions to mitigate the limitations of multi-batch TMT analyses. We confirm that including a common reference line in every batch increases precision by facilitating normalization across the batches, and we propose experimental designs that minimize the effect of cross-population reporter ion interference.
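The idea of normalizing across batches through a common reference line can be illustrated with a minimal Python sketch. It assumes the simplest possible scheme, in which every channel is expressed as a ratio to the reference channel of its own batch; the normalization actually used in the paper may differ. The channel name "126C" follows the Fig. 3 caption, while the function name and the intensities are hypothetical.

```python
# Minimal sketch of reference-channel normalization (an assumed scheme,
# not necessarily the one used in the paper).
# Toy data: one protein's intensity per channel in two TMT batches.
REFERENCE_CHANNEL = "126C"  # reference line channel, as named in the Fig. 3 caption


def normalize_to_reference(batch, reference=REFERENCE_CHANNEL):
    """Express every non-reference channel as a ratio to the batch's reference channel."""
    ref_intensity = batch.get(reference)
    if not ref_intensity:
        raise ValueError("reference channel is missing or has zero intensity")
    return {ch: value / ref_intensity for ch, value in batch.items() if ch != reference}


# Hypothetical intensities: batch 2 is measured 2.5x lower overall than batch 1.
batch_1 = {"126C": 100.0, "127N": 200.0, "127C": 50.0}
batch_2 = {"126C": 40.0, "127N": 80.0, "127C": 20.0}

norm_1 = normalize_to_reference(batch_1)
norm_2 = normalize_to_reference(batch_2)
print(norm_1)
print(norm_2)
```

In this toy case the raw intensities differ between batches, but the ratios to the reference agree, which is the property that makes a shared reference line useful for cross-batch comparison.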

Keywords: Tandem mass spectrometry; bioinformatics; computational biology; data analysis; data evaluation; false positives; ipsc; isobaric tags; mass spectrometry.

Figures

Fig. 1.
Protein and peptide missing values: A, Percentage of missing values for each TMT batch calculated at the protein level. B, Percentage of missing values for each TMT batch calculated at the peptide level. C, Box plot showing the results for protein missing values as a function of the number of 10-plex TMT batches (see methods). D, Box plot showing the results for peptide missing values as a function of the number of 10-plex TMT batches (see methods). For both C and D the lower and upper hinges represent the 1st and 3rd quartiles. The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge, the lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge.
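How missing values grow as batches are integrated can be sketched in Python under a simplifying assumption: each batch is reduced to the set of proteins it detected, and a protein counts as missing in a batch whenever it appears in the combined matrix but not in that batch. The function name and the protein identifiers are hypothetical, and the paper's own procedure (see its methods) is not reproduced here.

```python
# Minimal sketch: percentage of missing values when integrating k batches.
# Assumption: a batch is represented only by the set of proteins it detected.


def missing_percentage(batches):
    """Percentage of empty cells in the protein-by-batch matrix of the given batches."""
    all_proteins = set().union(*batches)
    total_cells = len(all_proteins) * len(batches)
    if total_cells == 0:
        return 0.0
    observed_cells = sum(len(batch) for batch in batches)
    return 100.0 * (total_cells - observed_cells) / total_cells


# Hypothetical detections in three batches.
toy_batches = [
    {"P1", "P2", "P3"},
    {"P1", "P2", "P4"},
    {"P1", "P3", "P5"},
]

# Integrate the first k batches for k = 1, 2, 3.
inflation = [missing_percentage(toy_batches[:k]) for k in range(1, len(toy_batches) + 1)]
print(inflation)
```

Each added batch contributes proteins the others lack, so the union grows faster than the per-batch coverage and the missing percentage rises.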
Fig. 2.
Peptide identifications and intensities: A, Number of peptides identified with MS3 intensity greater than zero in all TMT channels, colored by TMT batch. B, Histogram of the median normalized peptide intensity (see methods). C, Stacked density plot showing peptides grouped by median normalized peptide intensity quartiles and their percentage of detection across all TMT channels. D, Stacked density plot showing quartiles of identification rates for each peptide and their corresponding log10 normalized MS3 intensity.
Fig. 3.
Variation: A, Box plots showing the protein copy number coefficient of variation for all proteins detected in each 10-plex TMT batch as well as all proteins detected in all the reference line replicates (TMT channel 126C in all batches). B, Box plots showing the protein copy numbers of the 100 most abundant proteins with a coefficient of variation greater than or equal to 7.5 across all reference line replicates (TMT channel 126C in all batches). For both boxplots the lower and upper hinges represent the 1st and 3rd quartiles. The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge, the lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge.
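For readers unfamiliar with the statistic plotted here, the coefficient of variation can be computed as in the following Python sketch. It assumes the sample standard deviation and a result expressed as a percentage; the caption does not state which convention the authors used, and the copy numbers below are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """Coefficient of variation in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return 100.0 * statistics.stdev(values) / mean


# Hypothetical copy numbers of one protein in the reference channel of four batches.
reference_copy_numbers = [90.0, 110.0, 95.0, 105.0]
cv_reference = coefficient_of_variation(reference_copy_numbers)
print(round(cv_reference, 2))
```

Applied per protein, once within a batch and once across the reference line replicates of all batches, this gives the two kinds of distributions compared in panel A.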
Fig. 4.
Y chromosome peptides in female channels: Scatter plot showing the gender/incidence of false positives across 21 TMT batches and their reporter ion mass tags. Male cell lines are shown as a gray square, female cell lines are represented by a circle. The female lines (circles) are shaded to indicate the percentage of Y chromosome specific peptides that were detected in their channel within each TMT batch.
Fig. 5.
TMT channel leakage analysis: A, Box plot showing the median normalized intensity of Y chromosome specific peptides for both female and male cell lines across 21 TMT batches. B, Box plot of ratios for Y chromosome specific peptides, comparing male channels versus female channels affected by different reporter ion interference type. C, Box plot of ratios for Y chromosome specific peptides, stratified by the median normalized intensity, comparing female channels affected by different types of reporter ion interference versus female channels not subjected to reporter ion interference. For all 3 boxplots the lower and upper hinges represent the 1st and 3rd quartiles. The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the hinge, the lower whisker extends from the hinge to the smallest value at most 1.5 * IQR of the hinge.
Fig. 6.
TMT experimental design from reporter ion interference analysis: A, 5–5 grouped layout for a 10-plex TMT batch with 2 populations and 5 replicates each. Two channels are being affected by cross population primary and secondary reporter ion interference. B, optimal layout for a 10-plex TMT batch with 2 populations and 5 replicates each, with no cross population primary or secondary reporter ion interference. C, optimal 11-plex configuration for 3 populations with 3 replicates each. By leaving two empty channels, it eliminates cross population reporter ion interference. D, optimal 11-plex configuration for 3 populations with 3 replicates each, with one empty channel and one reference line channel. Only two channels suffer primary and secondary reporter ion interference.
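The difference between the grouped and the optimal 10-plex layouts can be checked with a small Python sketch. It rests on a simplified model that is an assumption here, not a statement of the paper's method: reporter ion interference is taken to occur only between neighbouring channels of the same isotope series (N or C). The channel groupings, function name, and population labels are illustrative.

```python
# Simplified model (assumption): interference only links neighbouring channels
# within the same isotope series of a 10-plex TMT batch.
N_SERIES = ["127N", "128N", "129N", "130N", "131"]
C_SERIES = ["126", "127C", "128C", "129C", "130C"]


def cross_population_pairs(layout):
    """List neighbouring same-series channel pairs that hold different populations."""
    pairs = []
    for series in (N_SERIES, C_SERIES):
        for left, right in zip(series, series[1:]):
            if left in layout and right in layout and layout[left] != layout[right]:
                pairs.append((left, right))
    return pairs


# 5-5 grouped layout: population A in the first five channels, B in the last five.
grouped = {ch: "A" for ch in ["126", "127N", "127C", "128N", "128C"]}
grouped.update({ch: "B" for ch in ["129N", "129C", "130N", "130C", "131"]})

# Series-split layout: population A on the C series, population B on the N series.
series_split = {ch: "A" for ch in C_SERIES}
series_split.update({ch: "B" for ch in N_SERIES})

print(cross_population_pairs(grouped))
print(cross_population_pairs(series_split))
```

Under this model the grouped layout has two boundaries where one population can leak into the other, while assigning each population to its own isotope series leaves none.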
