Nat Commun. 2025 Jan 2;16(1):73.
doi: 10.1038/s41467-024-55057-5.

Robust proteome profiling of cysteine-reactive fragments using label-free chemoproteomics

George S Biggs et al. Nat Commun. 2025.

Abstract

Identifying pharmacological probes for human proteins represents a key opportunity to accelerate the discovery of new therapeutics. High-content screening approaches to expand the ligandable proteome offer the potential to expedite the discovery of novel chemical probes to study protein function. Screening libraries of reactive fragments by chemoproteomics offers a compelling approach to ligand discovery; however, optimising sample throughput, proteomic depth, and data reproducibility remains a key challenge. We report a versatile, label-free quantification proteomics platform for competitive profiling of cysteine-reactive fragments against the native proteome. This high-throughput platform combines SP4 plate-based sample preparation with rapid chromatographic gradients. Data-independent acquisition performed on a Bruker timsTOF Pro 2 consistently identified ~23,000 cysteine sites per run, with a total of ~32,000 cysteine sites profiled in HEK293T and Jurkat lysate. Crucially, this depth in cysteinome coverage is met with high data completeness, enabling robust identification of liganded proteins. In this study, 80 reactive fragments were screened in two cell lines, identifying >400 ligand-protein interactions. Hits were validated through concentration-response experiments, and the platform was utilised for hit expansion and live cell experiments. This label-free platform represents a significant step forward in high-throughput proteomics to evaluate ligandability of cysteines across the human proteome.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Workflow and cysteinome coverage from our HT-LFQ chemoproteomic method.
a Schematic of our label-free sample preparation method that allows detection of cysteine-containing peptides from lysates or live cells using a hyperreactive IA-DTB probe. b Data acquisition was performed using an Evosep One and Bruker timsTOF Pro 2, followed by identification and quantification of peptides using Spectronaut. Peptides that were bound by a covalent fragment are expected to show a reduced intensity in compound-treated samples, relative to control samples. c Our method allows detection of high numbers of peptides in HEK293T and Jurkat lysates, providing the opportunity to detect liganding events at over 30,000 cysteine residues from over 8000 proteins (n = 16 DMSO control samples from each lysate). d We see high data completeness of peptide detection, with two-thirds of peptides detected in ≥ 75% of samples, allowing more confident detection of binding events. Data shown here is from HEK293T lysate; see Supplementary Fig. 1c for Jurkat data. e Our detection of cysteine residues from an individual cell lysate (HEK293T; orange) represents approximately 40% coverage of residues that can be considered to be feasibly detectable, based on their location relative to tryptic cleavage sites (green) and the general detectability of proteins by global proteomics methods (purple), as well as the presence of disulfide bonds and post-translational modifications (see Supplementary Fig. 3). Tryptic peptides were classified as being detectable if they were 7–40 residues (not considering missed cleavages). f Distribution of proteins across protein families (top) and target development levels (bottom), as defined by the ‘Illuminating the Druggable Genome’ programme. The colour scheme in this figure follows that used in (e). GPCRs: G protein-coupled receptors; TFs: transcription factors. Parts of both (a) and (b) were created in BioRender. Cawood, E. (2025) https://BioRender.com/m32r739. Source data are provided as a Source Data file.
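The detectability criterion in panel (e), tryptic peptides of 7–40 residues with no missed cleavages, can be illustrated with a short in silico digest. This is a minimal sketch, not the authors' pipeline: the cleavage rule (after K or R, but not before P) is a commonly used trypsin convention assumed here, and the function names and the toy sequence are hypothetical.

```python
import re


def tryptic_peptides(sequence):
    """Fully digest a sequence: cleave after K or R unless the next residue is P.

    The 'not before P' exception is an assumed convention, not stated in the source.
    """
    # Zero-width split points: preceded by K/R and not followed by P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def detectable_cysteines(sequence, min_len=7, max_len=40):
    """Return 1-based positions of cysteines on peptides of min_len..max_len residues."""
    positions = []
    offset = 0  # number of residues preceding the current peptide
    for peptide in tryptic_peptides(sequence):
        if min_len <= len(peptide) <= max_len:
            positions.extend(
                offset + i + 1 for i, aa in enumerate(peptide) if aa == "C"
            )
        offset += len(peptide)
    return positions


# Toy sequence (hypothetical): peptides are MACDEFGK, CR and AAACAAAK.
toy = "MACDEFGKCRAAACAAAK"
print(tryptic_peptides(toy))
print(detectable_cysteines(toy))
```

In the toy sequence, the cysteine on the two-residue peptide CR falls outside the 7–40 window and is dropped, which is the kind of residue the caption treats as not feasibly detectable.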
Fig. 2. The 80-compound screen performed in HEK293T and Jurkat lysate using HT-LFQ chemoproteomics.
a Distribution of molecular weights (MW) and hydrogen-bond acceptor/donor (HBA/HBD) counts for the 80 chloroacetamide fragments in the screening library. b Volcano plot showing an overlay of all the data obtained in this experiment, where each data point represents the interaction measured in each cell line between a fragment (50 μM) and a cysteine-containing peptide. Liganding events are defined where a fragment shows strong, statistically-significant competition (measured by a competition ratio, CR) with IA-DTB: log2(CR) ≥ 1 and -log10(p-value) ≥ 1.3, as indicated by dotted lines. This experiment was performed with technical replicates (n = 4 for compound-treated samples, n = 16 for DMSO control samples) in both cell lines. c Distribution of liganded proteins across ‘Illuminating the Druggable Genome’ protein families (top) and target development levels (bottom). GPCRs: G protein-coupled receptors; TFs: transcription factors. d The proportion of cysteine residues that lie within or near pockets (grey) increases when considering cysteines that are liganded (log2(CR) ≥ 1) or strongly liganded (log2(CR) ≥ 2) by at least one fragment, compared to the cysteinome as a whole. e Heatmap of all the interactions detected in HEK293T lysate. For clarity, this heatmap only includes cysteines that are liganded by at least one compound with log2(CR) ≥ 1.5. f Volcano plots showing interactions detected in HEK293T lysate with active site cysteine residues – these residues show binding to a high proportion (≥10%) of fragments in the screening library. g NIT1 and NIT2 show very similar binding profiles to tertiary chloroacetamides (heatmap is clustered by molecular fingerprint), which reflects the high structural similarity of these proteins. The colour scheme used for the volcano plots/heatmap in (f, g) matches that used in (e). In the NIT1 and NIT2 structures (AlphaFold2), the side chain of the liganded cysteine residue (Cys203 and Cys153, respectively) is shown as spheres. 
All p-values were calculated using Welch’s t-test (two-sided). Source data are provided as a Source Data file.
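The liganding thresholds quoted in panel (b) can be expressed as a simple filter. The sketch below assumes the competition ratio is the mean control intensity divided by the mean compound-treated intensity; the caption does not give the exact formula, so that definition and the function names are illustrative assumptions. The p-value is taken as an input (the caption states it comes from a two-sided Welch's t-test) rather than being recomputed here.

```python
import math
from statistics import mean


def competition_ratio(control, treated):
    """Assumed definition: mean DMSO-control intensity over mean treated intensity."""
    return mean(control) / mean(treated)


def is_liganded(cr, p_value, min_log2_cr=1.0, min_neg_log10_p=1.3):
    """Apply the caption's thresholds: log2(CR) >= 1 and -log10(p) >= 1.3."""
    return (
        math.log2(cr) >= min_log2_cr
        and -math.log10(p_value) >= min_neg_log10_p
    )


# Hypothetical intensities for one cysteine-containing peptide.
cr = competition_ratio(control=[100, 110, 90, 100], treated=[25, 25, 25, 25])
print(cr, is_liganded(cr, p_value=0.01))
```

A log2(CR) of 1 corresponds to a two-fold drop in peptide intensity, and -log10(p) of 1.3 corresponds to p of roughly 0.05.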
Fig. 3. Top four specific protein-fragment interactions detected in the initial screen.
Specific interactions were detected with TPMT Cys70 (a), VCP Cys522 (b), MOB4 Cys134 (c), and MKLN1 Cys82 (d), based on comparison of compound-treated samples (50 μM, n = 4) with DMSO control samples (n = 16). Volcano plots are shown both for the cysteine residue (left) and compound (right) involved in each interaction, to highlight that each of the four cysteine residues is only liganded by one fragment in both HEK293T (orange) and Jurkat (blue) lysate, and that this cysteine site is the most strongly competed target of that compound. Dotted lines indicate the thresholds used to identify liganding events: log2(CR) ≥ 1 and -log10(p-value) ≥ 1.3. All p-values were calculated using Welch’s t-test (two-sided). Protein structures are either from the Protein Data Bank (PDB) or AlphaFold2: TPMT, PDB ID 2H11 (residues 40–245); VCP, PDB ID 5FTJ; STRIPAK (containing MOB4), PDB ID 7K36; MKLN1, AlphaFold2 model. The side chain of each targeted cysteine residue is shown as blue and yellow spheres (for the carbon and sulfur atoms, respectively). SAH S-adenosyl-L-homocysteine, ADP adenosine diphosphate. Source data are provided as a Source Data file.
Fig. 4. Concentration-response chemoproteomics experiment.
a The structures of four compounds that showed a range of promiscuity levels in the initial screen. These four compounds, along with the four compounds that showed specific interactions with a protein target (Fig. 3), were tested in a 10-point concentration-response experiment in HEK293T lysate (n = 4 for compound-treated samples; n = 25 DMSO control samples). The total number of liganding events detected for each of these eight compounds varied widely across the concentration range tested (b, c); p-values were calculated using Welch’s t-test (two-sided). d The concentration-response experiment was analysed by performing logistic regression to identify any concentration-dependent interactions between each compound and all detectable cysteine residues. e Heatmap showing the pTE50 values of all concentration-dependent interactions that were confidently identified in this experiment. The selective interactions that were identified in the initial screen are highlighted by black boxes. Source data are provided as a Source Data file.
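As a rough illustration of the concentration-response analysis in panels (d, e), the sketch below uses one common logistic (Hill-type) curve and reports the fitted midpoint as a pTE50. Several things here are assumptions rather than details from the source: the curve parameterisation, the definition pTE50 = -log10(TE50 in mol/L) by analogy with pIC50, the coarse grid-search fit, and all function names and example values.

```python
import math


def engagement(conc_m, te50_m, hill=1.0, top=1.0, bottom=0.0):
    """Fractional target engagement at a molar concentration (logistic curve)."""
    return bottom + (top - bottom) / (1.0 + (te50_m / conc_m) ** hill)


def pte50(te50_m):
    """Assumed definition: negative log10 of TE50 expressed in mol/L."""
    return -math.log10(te50_m)


def fit_te50(concs_m, observed, hill=1.0):
    """Least-squares grid search over pTE50 from 3.00 to 7.00 in 0.01 steps."""
    best_sse, best_te50 = None, None
    for step in range(401):
        candidate = 10 ** -(3.0 + step * 0.01)
        sse = sum(
            (engagement(c, candidate, hill) - y) ** 2
            for c, y in zip(concs_m, observed)
        )
        if best_sse is None or sse < best_sse:
            best_sse, best_te50 = sse, candidate
    return best_te50


# Synthetic, noise-free data generated from a TE50 of 10 uM (hypothetical).
concs = [1e-6, 3e-6, 1e-5, 3e-5, 1e-4]
observed = [engagement(c, 1e-5) for c in concs]
print(round(pte50(fit_te50(concs, observed)), 2))
```

A real analysis would typically use a proper nonlinear least-squares routine and fit the Hill slope and plateaus as well; the grid search is used here only to keep the example dependency-free.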
Fig. 5. The selectivity and potency of prioritised protein-fragment interactions.
a Concentration-response data (mean ± standard deviation) obtained for VCP Cys522, TPMT Cys70, MOB4 Cys134 and MKLN1 Cys82 in HEK293T lysate (n = 4 for compound-treated samples; n = 25 for DMSO control samples), highlighting the concentration-dependent engagement of these cysteines with their identified ligand (PP222, PP183, PP48 and PP156, respectively; black data), in contrast with the data obtained for the other seven compounds tested in the experiment (grey data points; only mean values are shown for clarity). b The strongest concentration-dependent interactions detected for PP222, PP183, PP48 and PP156, showing other cysteine residues engaged by these compounds in addition to those shown in (a). These off-target cysteine residues are as follows: PNN Cys249 (PP222 pTE50 = 4.2 ± 0.4; PP156 pTE50 = 5.5 ± 0.7), ATP6V1A Cys138 (PP183 pTE50 = 5.1 ± 0.1; PP156 pTE50 = 5.3 ± 0.1), NIT1 Cys203 (PP48 pTE50 = 5.6 ± 0.8), and NIT2 Cys153 (PP48 pTE50 = 5.4 ± 0.1). c SAR analysis around the interaction between PP48 and MOB4 Cys134, NIT1 Cys203 and NIT2 Cys153, showing the results of five-point concentration-response data acquired in HEK293T lysate (1-h incubation; n = 4 for compound-treated samples; n = 25 for DMSO control samples) and data acquired in live HEK293T cells (25 μM, 2-h incubation; n = 3 for compound-treated samples; n = 12 for DMSO control samples). Precise pTE50 values for these interactions and for other compounds in the SAR compound set can be found in Supplementary Fig. 9. Unless otherwise stated, concentration-response data is shown as mean ± standard deviation (data points ± error bars) alongside the logistic regression curve. All p-values were calculated using Welch’s t-test (two-sided). Source data are provided as a Source Data file.
