Nat Commun. 2025 Jul 14;16(1):6474.
doi: 10.1038/s41467-025-61685-2.

Immunodiagnostic plasma amino acid residue biomarkers detect cancer early and predict treatment response

Cong Tang et al. Nat Commun. 2025.

Abstract

The immune response to tumour development is frequently targeted with therapeutics but remains largely unexplored in diagnostics, despite being stronger for early-stage tumours. We present an immunodiagnostic platform to detect this. We identify a panel of amino acid residue biomarkers providing a signature of cancer-specific immune activation associated with tumour development and distinct from autoimmune and infectious diseases, measurable optically in neat blood plasma, and validate within N = 170 participants. By measuring the total concentrations of cysteine, free cysteine, lysine, tryptophan, and tyrosine protein-incorporated biomarkers and analyzing the results with supervised machine learning, we identify 78% of cancers with 0% false positive rate (N = 97) with an AUROC of 0.95. The cancer, healthy, and autoimmune/infectious biomarker patterns are statistically significantly different (p < 0.0001). Smaller-scale changes in biomarker concentrations reveal inter-patient differences in immune activation that predict treatment response. Specific concentration ranges of these biomarkers predict response to Cyclin-dependent kinase inhibitors in advanced breast cancer patients (p < 0.05), identifying 98% of responders (N = 33). Here we provide an immunodiagnostic technology platform that, to our knowledge, has not been previously reported, and prove initial clinical application in a cohort of N = 170, including proof of concept in Multi Cancer Early Detection and personalized medicine.
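The two headline metrics in the abstract, sensitivity at a 0% false positive rate and AUROC, can both be computed directly from classifier scores. The following is a minimal sketch in plain Python; the scores are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_at_zero_fpr(pos_scores, neg_scores):
    """Fraction of positives scoring strictly above every negative,
    i.e. sensitivity at the lowest threshold that yields no false positives."""
    threshold = max(neg_scores)
    return sum(p > threshold for p in pos_scores) / len(pos_scores)


# Invented classifier scores (higher = more cancer-like); NOT study data.
cancer_scores = [0.95, 0.90, 0.80, 0.70, 0.40]
healthy_scores = [0.50, 0.45, 0.30, 0.20, 0.10]

print("AUROC:", auroc(cancer_scores, healthy_scores))
print("Sensitivity at 0% FPR:", sensitivity_at_zero_fpr(cancer_scores, healthy_scores))
```

The pairwise definition of AUROC used here is equivalent to the area under the ROC curve and avoids any dependency on a plotting or machine-learning library.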

Conflict of interest statement

Competing interests: W.S., E.V.Y. and G.J.L.B. are co-founders of Proteotype Diagnostics Ltd. C.T., W.S., L.C., E.V.Y. and G.J.L.B. are stockholders of Proteotype Diagnostics Ltd. W.S. and E.V.Y. are employed by Proteotype Diagnostics Ltd. Proteotype Diagnostics Ltd owns a patent application that incorporates a method of identifying the presence and/or concentration and/or amount of proteins or proteomes, which is described in this manuscript (EP4196797A1). G.J.L.B. is a Visiting Professor at Xi’an Fengcheng Hospital. All other authors declare no conflict of interest.

Figures

Fig. 1. Detecting tumour immunosurveillance for diagnostics.
a A patient with a breast tumour provides a 5 mL whole blood sample. b The c. 70 mg/mL protein component plays roles in immunosurveillance, in which the immune system identifies and eliminates malignant cells. This process increases in early-stage cancer before cancer immunoediting allows immune escape and metastasis. It is also the target of immunosurveillance-increasing drugs, such as immune checkpoint inhibitors (ICIs) and, through a recently discovered additional mechanism, Cyclin-dependent kinase inhibitors (CDKis). c Example of immune surveillance enhancement observed upon development of prostate cancer. d The host immune response includes up to 3000 additional proteins; however, clinically meaningful quantification remains challenging due to high intra- and inter-individual variation. e We introduce the concept of taking an average across all individual protein targets. This Amino Acid Concentration Signature (AACS) provides a protein cross-section, which reduces intra- and inter-individual variation via the Law of Large Numbers.
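The variance-reduction argument in panel e can be illustrated with a small simulation. This is a sketch under a strong simplifying assumption: each protein level varies independently between individuals with the same log-normal spread, which real plasma proteins need not do. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def cv_of_average(n_proteins, n_individuals=200, sigma=0.5, seed=0):
    """Coefficient of variation, across simulated individuals, of the mean of
    n_proteins independent log-normal protein levels (assumption: independence)."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_individuals):
        levels = [rng.lognormvariate(0.0, sigma) for _ in range(n_proteins)]
        averages.append(statistics.fmean(levels))
    return statistics.stdev(averages) / statistics.fmean(averages)


cv_single = cv_of_average(1)    # one protein target
cv_thirty = cv_of_average(30)   # average over 30 targets
cv_many = cv_of_average(900)    # average over 900 targets

for label, cv in [("1", cv_single), ("30", cv_thirty), ("900", cv_many)]:
    print(f"proteins averaged: {label:>3}  inter-individual CV: {cv:.3f}")
```

Under these assumptions the inter-individual spread of the average shrinks roughly as one over the square root of the number of proteins averaged, which is the behaviour the Law of Large Numbers argument relies on.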
Fig. 2. Design of biomarkers using a biological embedding.
a Embeddings are frequently used to process complex systems with many targets for machine learning applications, such as in natural language processing. Such embeddings represent each target as a vector of component parts within a high-dimensional space, and combine the component vectors in a dimensionality reduction step, such that similar targets cluster together in reduced n-dimensional space. b Contrary to traditional embeddings that derive the relative contributions of the dataset dimensions to maximise captured variation, our biological embedding is determined by the proportional molar concentration of components in blood plasma. The immunoglobulin fraction, c. 38% of proteins, is further divided into immunoglobulin classes, with the proportional contribution shown in healthy individuals. c We performed bioinformatic analysis to determine the optimal biological embedding dimensions to capture this variability. Possible dimensions include any of the N = 20 major amino acid types, which comprise the major plasma protein fractions. We identified N = 5 amino acid types whose numbers changed substantially across the proportion-weighted fractions, such that they would be detectable in an average biological embedding signature.
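The idea of a proportion-weighted embedding can be sketched as a mole-fraction-weighted count of each residue type per protein molecule. The sequences and fractions below are toy values invented for illustration, not real plasma proteins, and the function name is hypothetical. Counting letters in a sequence also cannot distinguish free from disulfide-bonded cysteine, so this sketch covers only total residue counts.

```python
from collections import Counter


def weighted_residue_signature(fractions, sequences, residue_types="CKWY"):
    """Mole-fraction-weighted mean count of each residue type per protein molecule."""
    total = sum(fractions.values())
    signature = {aa: 0.0 for aa in residue_types}
    for name, fraction in fractions.items():
        counts = Counter(sequences[name])  # residue letter -> count in this protein
        for aa in residue_types:
            signature[aa] += (fraction / total) * counts[aa]
    return signature


# Toy sequences (hypothetical): A has 2 C, 2 K, 1 W, 1 Y; B has 2 K, 2 Y.
toy_sequences = {"protein_A": "MKCCYWK", "protein_B": "MKYYK"}

healthy = weighted_residue_signature({"protein_A": 0.6, "protein_B": 0.4}, toy_sequences)
shifted = weighted_residue_signature({"protein_A": 0.4, "protein_B": 0.6}, toy_sequences)

for aa in "CKWY":
    print(aa, round(healthy[aa], 3), "->", round(shifted[aa], 3))
```

In this toy case the lysine (K) value does not move when the protein proportions shift, because both proteins contain the same number of lysines, while cysteine, tryptophan, and tyrosine do. Residue types that differ between fractions are the ones that make a shift in proportions detectable.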
Fig. 3. Quantifying amino acids in neat patient plasma with bioorthogonal labelling.
a Our strategy involves performing five bioorthogonal labelling reactions. Each label is targeted to protein-incorporated amino acid residues of a specific amino acid type and reacts only with that amino acid type. Labels are fluorogenic, with fluorescence turning on exclusively after reaction with the targeted amino acid type of interest. Labelling is quantitative, so fluorescence intensity scales quantitatively with the concentration of the targeted amino acid type. b We identified fluorescence imaging regions where we did not observe autofluorescence from neat patient blood plasma. Fluorescence excitation spectra for labelling of protein-incorporated cysteine residues within the blood plasma of a patient with breast cancer, for the dye before mixing with the patient plasma, and for the patient plasma alone. c Because labelling is quantitative and no autofluorescence is observed from patient plasma, we use a calibration curve derived from solutions of known amino acid concentrations (Bovine Serum Albumin (BSA), Beta-Lactoglobulin (BetaLac), and Lysozyme (LYZ) of known protein sequence at known protein concentration) to determine the relationship between fluorescence intensity and amino acid concentration for each amino acid type. We used the quantitative linear fit to the protein calibration curve to transform the fluorescence intensity measured for cysteine labelling in the patient plasma samples into the corresponding concentration of cysteine amino acid residues. Data shown for triplicates of N = 20 breast cancer patient plasma samples. d The relationship between the experimentally measured Amino Acid Concentration Signature (AACS) and the theoretically calculated AACS from the known concentrations of individual plasma proteins. The AACS of Tyrosine (Tyr), Cysteine (Cys_T) and Lysine (Lys) are shown in the graph. Data shown for triplicates of N = 20 healthy patient plasma samples. e To investigate whether changes in the host immune response to tumour development drove AACS alterations, we measured the AACS of mice that had been injected with an immunologically Hot Tumour or a Cold Tumour. Note: Throughout, Cys_R refers to free cysteine residues that are not disulfide bonded, while Cys_T refers to total cysteine residues, including those involved in disulfide bonding.
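The calibration step in panel c amounts to fitting a straight line to fluorescence versus known residue concentration and then inverting that line for an unknown sample. A minimal sketch follows, assuming an ordinary least-squares fit. The calibration points are invented, idealised values, not measurements from the study, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residue_concentration(protein_conc, sequence, residue):
    """Residue concentration of a standard: protein concentration times the
    number of that residue in its sequence (same units as protein_conc)."""
    return protein_conc * sequence.count(residue)


def fluorescence_to_concentration(intensity, slope, intercept):
    """Invert the calibration line to recover residue concentration."""
    return (intensity - intercept) / slope


# Invented calibration points: residue concentration (uM) vs intensity (a.u.).
known_conc = [0.0, 50.0, 100.0, 200.0]
intensity = [10.0, 110.0, 210.0, 410.0]

slope, intercept = fit_line(known_conc, intensity)
sample_conc = fluorescence_to_concentration(310.0, slope, intercept)
print("slope:", slope, "intercept:", intercept, "sample concentration (uM):", sample_conc)
```

This inversion is only meaningful within the linear range of the calibration and relies on the caption's stated conditions that labelling is quantitative and that plasma contributes no autofluorescence in the imaging region.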
Fig. 4. Multi Cancer Early Detection using AACS.
a AACS measured from N = 77 cancer patients (blue squares) or N = 20 healthy donors (grey circles), plotted in N-dimensional space. 3/5 measured AACS dimensions are shown for visual clarity, selected by choosing the 3 dimensions with the highest feature importance in the ANOVA analysis presented in Supplementary Fig. 5. b Receiver Operating Characteristic (ROC) Curve examining the accuracy of ensemble subspace discriminant classifier performance on unseen validation data. Area Under the ROC Curve was 0.95. c Cross-validation approach scheme. d Average normalised SHAP values with standard errors for a linear Support Vector Machine (SVM) model trained and validated on the N = 97 datapoints using 5X cross-validation. e AACS measured and plotted in N-dimensional space, with patients labelled according to tumour location: breast (pink squares), prostate (blue squares), colorectal (green squares), pancreatic (black squares), and cancer-negative controls (grey circles). f K-nearest neighbour classifier performance on a held-back, unseen validation set trained and validated using cross-validation as in panel (c) to identify tumour origin. Cancers are localised to either abdominal (colorectal, pancreatic) or hormonal (breast, prostate) origin. Abdominal cancers could be identified via abdominal CT scan, and hormonal cancers could be triangulated for further imaging or biopsy, considering patient biological sex. g AACS measured and plotted for N = 77 cancer patients (blue squares) and N = 20 cancer-negative controls (red circles), with the addition of AACS for N = 20 patients with autoimmune disease (green diamond) and N = 20 patients with infection (black plus), providing controls of non-cancer heightened immune surveillance. 2/5 measured AACS dimensions are shown for visual clarity and were selected using the ANOVA analysis in Supplementary Fig. 10. An additional 3 dimensions provide the required specificity, and UMAP representations are shown in Supplementary Figs. 8 and 9. h Healthy (black diamond) and pancreatic cancer (red circles) AACS, with the clinical stage of each pancreatic cancer patient shown numerically next to its data point. Patients with metastatic cancer cluster towards the healthy distribution, highlighted schematically with a shadow, whereas patients with earlier stages of cancer have signals further from the healthy distribution.
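The combination of a k-nearest neighbour classifier with cross-validation, as named in panel f, can be sketched in plain Python. This is a generic illustration, not the authors' pipeline: the caption does not specify the fold assignment, the value of k, or the distance metric, so all of those are assumptions here, and the two-dimensional points are invented.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to query (Euclidean)."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_validated_accuracy(x, y, n_folds=5, k=3):
    """Each point is predicted by a model that never saw it during training.
    Assumption: point i belongs to fold i % n_folds."""
    correct = 0
    for fold in range(n_folds):
        train_idx = [i for i in range(len(x)) if i % n_folds != fold]
        test_idx = [i for i in range(len(x)) if i % n_folds == fold]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            correct += knn_predict(train_x, train_y, x[i], k) == y[i]
    return correct / len(x)


# Invented, well-separated 2-D points standing in for two tumour-origin groups.
points = [(0, 0), (5, 5), (0, 1), (5, 6), (1, 0), (6, 5), (1, 1), (6, 6), (0.5, 0.5), (5.5, 5.5)]
labels = ["abdominal", "hormonal"] * 5

accuracy = cross_validated_accuracy(points, labels)
print("cross-validated accuracy:", accuracy)
```

Because every test point is held out of the training set for its fold, the reported accuracy estimates performance on unseen data rather than the fit to the training data.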
Fig. 5. Companion Diagnostic using AACS.
a Cyclin-dependent kinase inhibitor (CDKi) mechanisms of action. Canonically, CDKis (including Palbociclib, Ribociclib, Abemaciclib, and assets in development) inhibit cancer cell proliferation through the cell cycle by inhibiting CDK4/6 binding to cyclin D1. This limits phosphorylation of Rb, needed for activation of E2F-associated cell cycle control genes. In addition, CDKis have recently been shown to suppress production of Regulatory T cells. These cells create immune tolerance and are associated with decreasing immunosurveillance during cancer metastasis. In this way, CDKis increase immunosurveillance. b Clinical characteristics of N = 33 Hormone Receptor positive, HER2 negative metastatic breast cancer patients who were recommended for CDKis according to current treatment guidelines. Patients were prescribed CDKis as per standard of care, but before starting treatment, their blood sample was collected and analysed using AACS. Their response was evaluated on a 6-month CT scan according to RECIST criteria. Non-responding patients had cancer progression by 6 months. c Pre-treatment AACS measured for CDKi-prescribed patients plotted in N-dimensional space, coloured according to subsequently assessed response at 6 months. 3/5 dimensions are shown for visual clarity. The 3 presented features were selected using an ANOVA feature ranking in Supplementary Fig. 13. d A linear SVM classifier was trained and validated using held-back, unseen validation data. True positive rate (sensitivity) and false negative rate (1 - sensitivity) on the validation data. This metric evaluates what percentage of true responders and non-responders are identified using the classifier. e Positive predictive value and false discovery rate for the classifier described in (d). This metric evaluates, when the classifier makes a prediction, what percentage of the time it is correct. f Average normalised SHAP values with standard errors for a linear Support Vector Machine (SVM) model trained and validated on the N = 33 datapoints using 5X cross-validation. g Among N = 33 CDKi-treated patients, 10 had progressed by the time of writing. Their plasma sample was measured again when progression was identified, and the AACS results were compared in N-dimensional space between the timepoint before they started CDKi treatment and the timepoint at which their progression was identified.
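The metrics named in panels d and e are standard ratios of confusion-matrix counts. The sketch below uses invented counts that are not results from the study, and the function name is hypothetical.

```python
def classifier_rates(tp, fn, fp, tn):
    """Standard rates from confusion-matrix counts
    (tp/fn: actual responders; fp/tn: actual non-responders)."""
    return {
        # Row-wise view: of the patients who truly responded, how many were found?
        "true_positive_rate": tp / (tp + fn),    # sensitivity
        "false_negative_rate": fn / (tp + fn),   # 1 - sensitivity
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (tn + fp),   # 1 - specificity
        # Column-wise view: of the patients predicted to respond, how many did?
        "positive_predictive_value": tp / (tp + fp),
        "false_discovery_rate": fp / (tp + fp),  # 1 - PPV
    }


# Invented counts for illustration only; NOT study results.
rates = classifier_rates(tp=8, fn=2, fp=1, tn=9)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

The true positive and false negative rates are normalised by the actual class and so describe how many true responders are identified. The positive predictive value and false discovery rate are normalised by the predicted class and so describe how often a prediction is correct.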
