Cells. 2024 Aug 6;13(16):1311.
doi: 10.3390/cells13161311.

Combined High-Throughput Proteomics and Random Forest Machine-Learning Approach Differentiates and Classifies Metabolic, Immune, Signaling and ECM Intra-Tumor Heterogeneity of Colorectal Cancer

Cristina Contini et al. Cells.

Abstract

Colorectal cancer (CRC) is a common tumor worldwide, characterized by great complexity, including inter- and intra-tumor heterogeneity and tumor microenvironment (TME) variability. Intra-tumor heterogeneity and its connections with metabolic reprogramming and epithelial-mesenchymal transition (EMT) were investigated with explorative shotgun proteomics complemented by a Random Forest (RF) machine-learning approach. Deep and superficial tumor regions and distant-site non-tumor samples from the same patients (n = 16) were analyzed. Among the 2009 proteins analyzed, 91 proteins, including 23 novel potential CRC hallmarks, showed significant quantitative changes. In addition, RF classified the three analyzed tissues with 98.4% accuracy using a set of 21 proteins. Subunit E1 of 2-oxoglutarate dehydrogenase (OGDH-E1) was the best classifying factor for the superficial tumor region, while sorting nexin-18 and coatomer-beta protein (beta-COP), implicated in protein trafficking, classified the deep region. Down- and up-regulations of metabolic checkpoints involved different proteins in superficial and deep tumors. Analogously to immune checkpoints affecting the TME, cytoskeleton and extracellular matrix (ECM) dynamics were crucial for EMT. Galectin-3, basigin, S100A9, and fibronectin, involved in TME-CRC-ECM crosstalk, were found to vary differently in the two tumor regions. The two CRC regions appeared to adopt different metabolic strategies to uncouple the Krebs cycle and cytosolic glucose metabolism, promote lipogenesis, promote amino acid synthesis, down-regulate bioenergetics in mitochondria, and up-regulate oxidative stress. Finally, correlations with the Dukes stage and budding supported the finding of novel potential CRC hallmarks and therapeutic targets.

Keywords: CRC proteomics; GRASP-1; ROS; S100A9; basigin; extracellular matrix; galectin-3; intra-tumor heterogeneity; mitochondrial metabolism; sorting nexin-18.

Conflict of interest statement

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Figures

Figure 1
(a) Confusion matrix and sensitivity/specificity of RF classification. (b) Sparkline graphic representing the relative importance of the 21 proteins selected for RF classification of the S, D, and H groups, calculated as MDA. Proteins with a higher average MDA have a greater importance in the RF model. Blue columns show the MDA scores calculated for each group. The red column shows the average MDA score.
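As a minimal sketch of how per-class sensitivity and specificity can be derived from a three-class confusion matrix like the one in panel (a): the counts below are hypothetical placeholders, not the values reported in the paper, and the function name `per_class_metrics` is illustrative only.

```python
def per_class_metrics(matrix, labels):
    """Return {label: (sensitivity, specificity)} for a square confusion matrix.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    Each class is scored one-vs-rest.
    """
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]                          # correctly assigned to class k
        fn = sum(matrix[k]) - tp                   # class k samples assigned elsewhere
        fp = sum(row[k] for row in matrix) - tp    # other samples assigned to class k
        tn = total - tp - fn - fp                  # everything else
        metrics[label] = (tp / (tp + fn), tn / (tn + fp))
    return metrics


# Hypothetical counts for 16 samples per group (S, D, H); NOT the paper's data.
labels = ["S", "D", "H"]
matrix = [
    [15, 1, 0],   # true S
    [1, 15, 0],   # true D
    [0, 0, 16],   # true H
]

metrics = per_class_metrics(matrix, labels)
accuracy = sum(matrix[i][i] for i in range(len(labels))) / sum(map(sum, matrix))

for label, (sens, spec) in metrics.items():
    print(f"{label}: sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"overall accuracy={accuracy:.3f}")
```

The one-vs-rest treatment is a common convention for multi-class problems; the record does not say how the authors computed the values in the figure.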
Figure 2
Multidimensional scaling plot showing the relationships among the three groups of samples (S, red; D, blue; H, green), using the proximity values calculated by RF. Patient identification numbers correspond to those shown in Table 1. Each group is delimited by a dispersion ellipse with a confidence of 1.6 standard deviations.
Figure 3
Heatmap of the 91 proteins showing significant changes among the three groups of samples. Each row of the heatmap represents a specific protein, and each column represents a single subject. The “RF” labels indicate the 21 proteins used for RF classification. For each protein, the color of each cell reflects the percentile value of each subject relative to all 48 subjects present in the same row. Shades of green and red represent decreasing and increasing percentiles, respectively, from the median. To obtain homogeneous blocks of color, the proteins were preliminarily ordered vertically based on the output of a hierarchical cluster analysis. Note that the 21 RF proteins are spread across all the major clusters produced by the 91 differentially expressed proteins. This suggests that the RF proteins span the full range of the different relationships among the proteins found to be varied among the three groups.
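The row-wise percentile coloring described in this caption can be sketched as follows. This is an illustration under assumptions: the caption does not state the exact percentile definition, so a mid-rank percentile is assumed here, and `row_percentiles` and the sample values are hypothetical.

```python
def row_percentiles(values):
    """Percentile rank (0-100) of each value within its own row.

    Ties share a mid-rank, so a value equal to the row median maps to
    about 50, the neutral point of a green-to-red color scale.
    """
    n = len(values)
    ranks = []
    for v in values:
        below = sum(1 for x in values if x < v)
        equal = sum(1 for x in values if x == v)
        ranks.append(100.0 * (below + 0.5 * equal) / n)
    return ranks


# Hypothetical abundances of one protein across four subjects (not real data).
row = [2.0, 8.0, 4.0, 6.0]
percentiles = row_percentiles(row)
print(percentiles)
```

Because each row is ranked independently, proteins with very different abundance ranges become comparable on one color scale, which is presumably why a percentile scale rather than raw intensity was used.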
Figure 4
Enrichment analysis via g:Profiler coupled with the Cytoscape EnrichmentMap Pipeline Collection for (a) all 2009 proteins identified in the study, where only functional networks defined by processes with p-values < 0.01 are shown, and (b) the 91 proteins showing significant variations in the statistical differential analysis. Colors refer to FDR, from the least significant (light yellow, FDR 0.04) to the most significant (red, FDR 0.002).
Figure 5
Results from correlation analysis: (a) proteins correlated with the Dukes stage and (b) with the budding index in the two tumor tissues; D in red, S in cyan; positive and negative r coefficients are reported in the plot.
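For readers unfamiliar with the r coefficient mentioned in this caption, here is a minimal sketch of a correlation between protein abundance and a clinical score. It rests on assumptions: the record does not say which coefficient the authors used, so Pearson's r is assumed, and the numeric stage coding and abundance values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical example: stage coded as 1-4 and one protein's abundance
# in four tumor samples (illustrative values, not from the paper).
stage = [1, 2, 3, 4]
abundance_up = [2.0, 4.0, 6.0, 8.0]
abundance_down = [8.0, 6.0, 4.0, 2.0]

r_up = pearson_r(stage, abundance_up)
r_down = pearson_r(stage, abundance_down)
print(r_up, r_down)
```

Since a tumor stage is an ordinal variable, a rank-based coefficient such as Spearman's would be an equally plausible choice; it can be obtained by applying the same function to the ranks of both sequences.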
Figure 6
Representation of the metabolic implications, and their probable down- or up-regulation, of the proteins that varied significantly in S (a) or D (b) samples.
