Mol Cell Proteomics. 2022 Sep;21(9):100279.
doi: 10.1016/j.mcpro.2022.100279. Epub 2022 Aug 6.

Rapid and In-Depth Coverage of the (Phospho-)Proteome With Deep Libraries and Optimal Window Design for dia-PASEF

Patricia Skowronek et al. Mol Cell Proteomics. 2022 Sep.

Abstract

Data-independent acquisition (DIA) methods have become increasingly attractive in mass spectrometry-based proteomics because they enable high data completeness and a wide dynamic range. Recently, we combined DIA with parallel accumulation-serial fragmentation (dia-PASEF) on a Bruker trapped ion mobility (IM) separated quadrupole time-of-flight mass spectrometer. This requires alignment of the IM separation with the downstream mass-selective quadrupole, leading to a more complex scheme for dia-PASEF window placement compared with DIA. To achieve high data completeness and deep proteome coverage, here we employ variable isolation windows that are placed optimally depending on precursor density in the m/z and IM plane. This is implemented in the freely available py_diAID (Python package for DIA with an automated isolation design) package. In combination with in-depth project-specific proteomics libraries and the Evosep liquid chromatography system, we reproducibly identified over 7700 proteins in a human cancer cell line in 44 min with quadruplicate single-shot injections at high sensitivity. Even at a throughput of 100 samples per day (11 min liquid chromatography gradients), we consistently quantified more than 6000 proteins in mammalian cell lysates by injecting four replicates. We found that optimal dia-PASEF window placement facilitates in-depth phosphoproteomics with very high sensitivity, quantifying more than 35,000 phosphosites in a human cancer cell line stimulated with epidermal growth factor in triplicate 21 min runs. This covers a substantial part of the regulated phosphoproteome with high sensitivity, opening the way for extensive systems-biological studies.
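
The idea of variable isolation windows that follow precursor density can be illustrated with a minimal sketch. It is not the py_diAID implementation or its API: it covers only the m/z dimension (no IM placement and no Bayesian optimization), and the function names, the 300 to 1200 m/z range, the window count, and the synthetic precursor list are all assumptions made for illustration. Window edges are set at quantiles of the precursor m/z values, so dense regions get narrow windows and sparse regions get wide ones.

```python
import random


def equal_count_windows(mz_values, n_windows, mz_min, mz_max):
    """Return contiguous (lower, upper) m/z windows holding roughly equal
    numbers of precursors. Hypothetical helper, not part of py_diAID."""
    mz = sorted(v for v in mz_values if mz_min <= v <= mz_max)
    if n_windows < 1 or len(mz) < n_windows:
        raise ValueError("need at least one precursor per window")
    # Interior edges sit at the k/n_windows quantiles of the sorted list.
    edges = [mz_min]
    for k in range(1, n_windows):
        edges.append(mz[k * len(mz) // n_windows])
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))


def count_per_window(mz_values, windows):
    """Count precursors per window; windows are half-open [lower, upper),
    except the last one, which also includes its upper bound."""
    counts = []
    last = len(windows) - 1
    for i, (lo, hi) in enumerate(windows):
        if i == last:
            counts.append(sum(lo <= v <= hi for v in mz_values))
        else:
            counts.append(sum(lo <= v < hi for v in mz_values))
    return counts


# Synthetic precursor m/z values (assumed distribution, illustration only).
rng = random.Random(0)
precursors = [rng.gauss(650.0, 150.0) for _ in range(10000)]

windows = equal_count_windows(precursors, 8, 300.0, 1200.0)
for (lo, hi), n in zip(windows, count_per_window(precursors, windows)):
    print(f"{lo:7.1f} - {hi:7.1f} m/z  width {hi - lo:6.1f}  precursors {n}")
```

In this sketch the windows near the center of the distribution come out narrow and those at the edges wide, while the precursor count per window stays nearly constant. Repeated identical m/z values could produce zero-width windows, which a real implementation would need to handle.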

Keywords: PASEF; TIMS; data-independent acquisition; phosphoproteomics; systems biology.


Conflict of interest statement

M. M. is an indirect investor in Evosep Biosystems. All other authors declare no competing interests.

Figures

Fig. 1
Principle of dia-PASEF on a timsTOF with equidistant two-dimensional isolation windows. A, schematic of a TIMS tunnel followed by quadrupole isolation. B, dia-PASEF acquisition scheme depicting three dia-PASEF scans divided into three ion mobility (IM) windows. Vertical arrows indicate the elution of the ions with decreasing electrical field, and horizontal arrows indicate the movement of the quadrupole. The pattern of the top IM windows is repeated, and the top and bottom IM windows are extended to the upper and lower IM range, respectively. C, original dia-PASEF acquisition scheme (6) plotted on a kernel density distribution of all precursors. One dia-PASEF scan is divided into three IM windows by three distinct movements of quadrupole isolation. This scheme comprises eight dia-PASEF scans with equidistant isolation width covering in total 84% of the peptide ion population. D, histogram of m/z of all peptides covered by the acquisition method in (C), and peptides not covered by the method but identified in a separately recorded spectral library. E, number of peptide ions per isolation window. F, histogram of IMs of all peptides covered by the acquisition method, and peptides not covered by the method but identified in a separately recorded spectral library. Subfigures C–F are based on a reference proteome library (see the Experimental Procedures section). DIA, data-independent acquisition; PASEF, parallel accumulation–serial fragmentation; TIMS, trapped ion mobility spectrometry.
Fig. 2
py_diAID algorithm and evaluation. A, py_diAID design of the optimal acquisition scheme and window placement for a 21 min gradient (60 SPD, Evosep) with variable widths to balance the distribution of peptide ions, providing nearly complete peptide ion coverage. The left panel illustrates the first steps of the py_diAID algorithm: defining the m/z range of interest, binning the peptide ions in the m/z dimension, and defining the scan area in the IM dimension. Middle panel, calculation of the isolation window dimensions and coordinates based on the scan area. Right panel, extension of the isolation windows to the limits of the IM ranges. The arrow at the bottom indicates that the py_diAID algorithm evaluates the new acquisition scheme, defines the following test set of scan area parameters by Bayesian optimization, and resumes with the steps in the left panel. This is repeated for a user-defined number of iterations (more details in supplemental Fig. S6). Panel A is plotted on top of a kernel density distribution based on the reference proteome library. B, average peptide identifications by the original and optimal dia-PASEF methods. C, number of peptides identified per minute over the entire retention time. D, Venn diagram showing the shared and unique peptides identified by both methods. Data in B–D are from quadruplicate injections of 200 ng tryptic HeLa digest with a 21 min gradient and analyzed with the reference proteome library. DIA, data-independent acquisition; IM, ion mobility; PASEF, parallel accumulation–serial fragmentation; py_diAID, Python package for DIA with an automated isolation design; SPD, samples per day.
Fig. 3
Workflow optimization for the 21 min gradient with project-specific deep libraries. A, peptides identified in the reference versus the project-specific deep library for 21 min runs. B, shared proteins and depth on the protein level in the two libraries. C, average peptide identification of four single-run injections. These data and those in (D) and (E) were generated from quadruplicate injections of 200 ng tryptic HeLa digest acquired with a 21 min gradient and searched with the reference (24 fractions) or project-specific library (48 fractions). D, average protein identifications and identifications with only one peptide in the single runs. E, CVs at the protein level based on the MaxLFQ algorithm of DIA-NN. Boxplots show the median (center line), 25th and 75th percentiles (lower and upper box limits, respectively), and the 1.5× interquartile range (whiskers). n = 6384 (24 fractions) and 7121 (48 fractions) shown in C. DIA, data-independent acquisition.
Fig. 4
Comparison of different gradient lengths/throughput based on single-run analysis. A, all single-run identifications and those with a CV <20% for the 11, 21, and 44 min gradients. B, CVs at the protein level based on the MaxLFQ algorithm of DIA-NN. Boxplots show the median (center line), 25th and 75th percentiles (lower and upper box limits, respectively), and the 1.5× interquartile range (whiskers). n = 6341 (11 min/100 SPD), 7121 (21 min/60 SPD), and 7802 (44 min/30 SPD) shown in panel A. C, analysis of peptide quantification in n out of four technical replicates shows that the large majority is quantified consistently. D, the number of peptides per second over the retention time for the three gradient lengths. The data were acquired in quadruplicate injections of 200 ng HeLa digest and analyzed with 48-fraction DDA–PASEF libraries, each recorded with the corresponding gradient length. 11-min library: 8553 proteins and 122,105 peptides; 21-min library: 8439 proteins and 124,155 peptides; 44-min library: 9461 proteins and 175,839 peptides. DDA, data-dependent acquisition; DIA, data-independent acquisition; PASEF, parallel accumulation–serial fragmentation.
Fig. 5
Method optimization specifically for phosphoproteomics. A, peptide distribution of a proteomics digest displayed as kernel density estimation dependent on the charge and histograms of the abundance of differently charged precursors based on our deep proteomics library. B, peptide distribution of a phosphoproteomics digest displayed as kernel density estimation and histograms of the abundance of differently charged precursors based on our phosphopeptide library. C, original dia-PASEF method plotted on top of the phosphopeptide library. D, optimal dia-PASEF method tailored to the phospholibrary. E, identified phosphosites and phosphopeptides based on quadruplicates of 100 μg EGF-stimulated and enriched HeLa digest, separated within 21 min, and searched with DIA-NN against the phospholibrary. F, AlphaMap visualization (47): protein sequence coverage of the EGF receptor (EGFR) depending on the acquisition method. DIA, data-independent acquisition; EGF, epidermal growth factor; PASEF, parallel accumulation–serial fragmentation.
Fig. 6
The dia-PASEF workflow allows the robust detection of characteristic EGF signaling events. A, numbers of all identified phosphopeptides and phosphosites before and after filtering for localization probability and data completeness. B, phosphoproteome Pearson correlation matrix. Scatter plot shows the correlation of replicates within a condition. C, volcano plot of phosphosites regulated upon 15 min of EGF treatment in HeLa cells versus untreated cells (two-sided Student’s t test, FDR <0.01 = gray, FDR <0.05 = dark gray). Proteins that are part of the GOBP term “EGFR signaling pathway” are highlighted in turquoise. D, Fisher’s exact test of proteins with significantly increased phosphosites upon EGF treatment (p < 0.002). Enrichment annotations are GOBP, GOMF, and KEGG. E, scheme of significantly upregulated phosphosites that were detected in this study and are part of the GOBP term “EGFR signaling pathway” and/or changed significantly upon EGF stimulation (FDR < 0.05). DIA, data-independent acquisition; EGF, epidermal growth factor; EGFR, EGF receptor; FDR, false discovery rate; GOBP, Gene Ontology Biological Process; GOMF, Gene Ontology Molecular Function; KEGG, Kyoto Encyclopedia of Genes and Genomes; PASEF, parallel accumulation–serial fragmentation.
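
The reproducibility measures reported in Figs. 3 and 4 (protein-level CVs, the CV <20% cutoff, and quantification in n out of four replicates) can be sketched as follows. This is an illustrative calculation only, not the authors' analysis pipeline: the helper names and the toy intensities are hypothetical, and the use of the sample standard deviation on linear-scale intensities is an assumption, since the captions do not specify it.

```python
from collections import Counter
from statistics import mean, stdev


def coefficient_of_variation(intensities):
    """CV = sample standard deviation / mean over quantified replicates.
    Returns None when fewer than two replicates carry a value."""
    values = [v for v in intensities if v is not None]
    if len(values) < 2:
        return None
    m = mean(values)
    return stdev(values) / m if m > 0 else None


def proteins_below_cutoff(table, cv_cutoff=0.20):
    """Names of proteins whose replicate CV is below the cutoff."""
    passing = []
    for name, intensities in table.items():
        cv = coefficient_of_variation(intensities)
        if cv is not None and cv < cv_cutoff:
            passing.append(name)
    return passing


def replicate_completeness(table):
    """Map 'number of replicates with a value' to 'number of proteins'."""
    return Counter(
        sum(v is not None for v in intensities)
        for intensities in table.values()
    )


# Toy quadruplicate intensities (hypothetical values; None = not quantified).
toy = {
    "P1": [100.0, 110.0, 90.0, 100.0],
    "P2": [100.0, 200.0, 50.0, None],
    "P3": [10.0, None, None, None],
}
print(proteins_below_cutoff(toy))
print(replicate_completeness(toy))
```

Proteins quantified in fewer than two replicates have no defined CV in this sketch and therefore never pass the cutoff.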

References

    1. Aebersold R., Mann M. Mass-spectrometric exploration of proteome structure and function. Nature. 2016;537:347–355.
    2. Ludwig C., Gillet L., Rosenberger G., Amon S., Collins B.C., Aebersold R. Data-independent acquisition-based SWATH-MS for quantitative proteomics: a tutorial. Mol. Syst. Biol. 2018;14:1–23.
    3. Bruderer R., Bernhardt O.M., Gandhi T., Miladinović S.M., Cheng L.Y., Messner S., et al. Extending the limits of quantitative proteome profiling with data-independent acquisition and application to acetaminophen-treated three-dimensional liver microtissues. Mol. Cell. Proteomics. 2015;14:1400–1410.
    4. Gillet L.C., Navarro P., Tate S., Röst H., Selevsek N., Reiter L., et al. Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Mol. Cell. Proteomics. 2012;11:1–17.
    5. Chapman J.D., Goodlett D.R., Masselon C.D. Multiplexed and data-independent tandem mass spectrometry for global proteome profiling. Mass Spectrom. Rev. 2014;33:452–470.
