Mol Oncol. 2024 Jun;18(6):1460-1485.
doi: 10.1002/1878-0261.13622. Epub 2024 Mar 11.

Transcriptome-wide gene expression outlier analysis pinpoints therapeutic vulnerabilities in colorectal cancer

Elisa Mariella et al. Mol Oncol. 2024 Jun.

Abstract

Multiple strategies are continuously being explored to expand the drug target repertoire in solid tumors. We devised a novel computational workflow for transcriptome-wide gene expression outlier analysis that allows the systematic identification of both overexpression and underexpression events in cancer cells. Here, it was applied to expression values obtained through RNA sequencing in 226 colorectal cancer (CRC) cell lines that were also characterized by whole-exome sequencing and microarray-based DNA methylation profiling. We found cell models displaying an abnormally high or low expression level for 3533 and 965 genes, respectively. Gene expression abnormalities that have been previously associated with clinically relevant features of CRC cell lines were confirmed. Moreover, by integrating multi-omics data, we identified both genetic and epigenetic alterations underlying outlier expression values. Importantly, our atlas of CRC gene expression outliers can guide the discovery of novel drug targets and biomarkers. As a proof of concept, we found that CRC cell lines lacking expression of the MTAP gene are sensitive to treatment with a PRMT5-MTA inhibitor (MRTX1719). Finally, other tumor types may also benefit from this approach.
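The abstract does not spell out how outliers are called, so the following is only a minimal sketch of one common approach (Tukey fences on log2-transformed expression), not the authors' workflow. The function name `find_outliers`, the fence multiplier `k`, and the `pseudocount` are all illustrative assumptions.

```python
import math
from statistics import quantiles


def find_outliers(expr, k=1.5, pseudocount=1.0):
    """Return (positive, negative) outlier sample names for a single gene.

    expr maps sample name -> expression value (for example FPKM).
    k and pseudocount are illustrative defaults, NOT values from the paper.
    """
    # Log-transform so that fold differences become additive.
    logged = {s: math.log2(v + pseudocount) for s, v in expr.items()}

    # Quartiles of the log-expression distribution across samples.
    q1, _, q3 = quantiles(logged.values(), n=4, method="inclusive")
    iqr = q3 - q1

    # Tukey fences: values beyond them are flagged as outliers.
    upper = q3 + k * iqr
    lower = q1 - k * iqr

    positive = sorted(s for s, v in logged.items() if v > upper)
    negative = sorted(s for s, v in logged.items() if v < lower)
    return positive, negative


# Toy data (invented for illustration): one very high and one silent sample.
toy_expr = {
    "a": 63, "b": 127, "c": 127, "d": 255, "e": 127,
    "f": 63, "g": 255, "hi": 65535, "lo": 0,
}
toy_positive, toy_negative = find_outliers(toy_expr)
```

A real pipeline would run this per gene across all cell lines and would likely add further filters, such as the multi-step selection shown in Fig. 2A.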

Keywords: biomarkers; colorectal cancer; drug targets; gene expression outliers.

Conflict of interest statement

AB served in a consulting/advisory role for Inivata and Guardant Health. AB is a member of the scientific advisory board of NeoPhore, Inivata, and Roche Genentech CRC Advisory Board. AB receives grants/research support from NeoPhore, AstraZeneca, and Boehringer. AB is cofounder and shareholder of NeoPhore Limited and shareholder of Kither. FDN received speaker's fees from Illumina and served in a consulting/advisory role for Amgen and Pierre Fabre Pharma. SA reports personal fees from MSD Italia and a patent (international PCT patent application No. WO 2023/199255 and Italian patent application No. 102022000007535) outside the submitted work. The remaining authors declare that they have no competing interests.

Figures

Fig. 1
Genetic, transcriptomic, and epigenetic features of 226 colorectal cancer (CRC) cell lines. Circos plot delineating clinically relevant molecular features of 226 CRC cell lines. Each layer corresponds to a different genetic, transcriptomic, or epigenetic feature, organized from the outside to the inside in the following order: microsatellite status as microsatellite stable (MSS) or instable (MSI); KRAS mutational status; NRAS mutational status; BRAF mutational status; APC mutational status; TP53 mutational status; consensus molecular subtypes (CMS) as CMS1, CMS2, CMS3, or CMS4; CRC intrinsic subtypes (CRIS) as CRIS‐A, CRIS‐B, CRIS‐C, CRIS‐D, or CRIS‐E; classification based on the CpG island methylator phenotype (CIMP) as CIMP‐H, CIMP‐L, CIMP3, or CIMP4. In the layers corresponding to CMS and CRIS gene expression‐based subtypes, CRC cell lines that cannot be confidently assigned to a single subtype (FDR > 5%) were labeled as NA (not available).
Fig. 2
Identification of overexpression and underexpression events in 226 colorectal cancer (CRC) cell lines. (A) Graphical representation of the computational pipeline. The number of genes for which positive or negative outlier(s) were found after each step of the pipeline is shown on the left and the right sides, respectively. (B) Identification of extreme positive outliers for the SERPINA4 gene. Each boxplot shows the distribution of SERPINA4 expression values. Red dots are samples selected as positive outliers after each step of the pipeline. Their number is also reported between brackets in the title of each subpanel. (C) In the left subpanel, the number of extreme positive outliers is shown for each overexpressed gene. A detailed representation is reported for genes with the most elevated outlier frequency in the right subpanel (number of outlier samples ≥ 15). (D) In the left subpanel, the number of extreme negative outliers is shown for each underexpressed gene. A detailed representation is reported for genes with the most elevated outlier frequency in the right subpanel (number of outlier samples ≥ 15).
Fig. 3
Gene expression abnormalities associated with clinically relevant features of colorectal cancer (CRC) cell lines. (A) Boxplots depicting the distribution of expression values of tyrosine kinase (TK) genes whose overexpression has been previously associated with oncogene addiction in CRC cells. For each TK gene, extreme positive outliers are highlighted with dots of a particular color, while other samples are reported as gray dots. The name of each CRC cell line selected as extreme positive outlier is also indicated (in gray those that were not included in a previous drug screening). The bottom panel summarizes the matching between overexpressed TK genes and kinase inhibitors whose effect on cell viability has been previously assessed. (B) Heatmap showing DNA methylation levels of promoter probes that were found differentially methylated between MGMT extreme negative outliers and other samples. DNA methylation levels were measured as β‐values and are represented using a color scale from dark blue (low DNA methylation level) to yellow (high DNA methylation level). MGMT expression profile is shown above the heatmap and sample ordering is the same in the two graphs, from high to low MGMT expression levels. Annotation bars below the heatmap indicate – from top to bottom – samples that were scored as MGMT extreme negative outliers, the microsatellite status of each sample, and sample response to temozolomide (TMZ). Only CRC cell lines that were previously tested for TMZ sensitivity are shown in the figure.
Fig. 4
Promoter hypermethylation in extreme negative outliers occurs concomitantly with or independently from the CpG island methylator phenotype (CIMP). Genes whose underexpression events were associated with promoter hypermethylation were classified as CIMP‐associated (left subpanel) or not CIMP‐associated (right subpanel) based on the enrichment of CIMP‐positive samples (CIMP‐H and CIMP‐L) in extreme negative outliers. Each bar depicts the CIMP classes that were attributed to the extreme negative outliers of a single gene. Only genes for which at least 10 colorectal cancer (CRC) cell lines were identified as extreme negative outliers are shown in the figure.
Fig. 5
Gene amplifications and deletions leading to overexpression and underexpression events. (A) Scatter plot of all the extreme positive (red) and negative (blue) gene expression outliers that were identified. The outliers of each gene are defined by differential expression values (x‐axis), measured as log2 fold change with respect to the median gene expression, and log2‐transformed copy number variation (CNV) values (y‐axis). The gray area in the middle corresponds to the interval where extreme positive and negative outliers cannot be located according to the filters that were applied. Horizontal dashed lines correspond to thresholds used in calling gene amplifications and deletions. Outliers in which either gene overexpression was associated with gene amplification or gene underexpression was associated with gene deletion are shown in darker colors. (B) Scatter plot of PTEN differential expression values, measured as log2 fold change with respect to the median expression, and CNV values in 226 colorectal cancer (CRC) cell lines. Samples in which both PTEN underexpression and PTEN genetic deletion were found were colored in dark blue. (C) Scatter plot of SMAD4 differential expression values, measured as log2 fold change with respect to the median expression, and CNV values in 226 CRC cell lines. SMAD4 extreme negative outliers are blue colored and a darker color is used for samples in which both SMAD4 underexpression and SMAD4 genetic deletion were found. (D) Scatter plots of differential expression values, defined as log2 fold change with respect to the median gene expression, and log2‐transformed CNV values in 226 CRC cell lines for genes that are co‐overexpressed and co‐amplified with ERBB2 due to genome proximity. Extreme positive outliers are red colored. The ideogram of chromosome 17 with G‐banding pattern is shown at the top and the genomic region in which the selected genes are located (chromosome 17q12) is highlighted in red.
Fig. 6
Somatic fusion transcripts associated with the overexpression of 3′ partner genes in extreme positive outliers. Boxplots depicting the distribution of expression values for genes whose overexpression was associated with a somatic fusion transcript in extreme positive outliers. Different point shapes are used to distinguish colorectal cancer (CRC) cell lines based on their gene‐specific status. In addition, fusion‐positive extreme positive outliers are shown in different colors according to the somatic fusion transcript that was identified.
Fig. 7
Target developmental level (TDL) categories applied to overexpressed enzyme genes. (A) Enzyme genes expressed in our colorectal cancer (CRC) cell line collection were categorized into four classes according to their TDL, from high (Tclin) to low (Tdark) depth of clinical, biological, and chemical investigation. Each bar corresponds to the number of expressed enzyme genes for each TDL class and the fraction of genes for which extreme positive outliers were found is highlighted with darker colors. (B) Bubble chart of extreme positive outliers identified for kinase genes (kinome‐wide extreme positive outliers). The outliers of each kinase gene are defined by absolute gene expression (FPKM value) and differential expression (log2 fold change with respect to the median gene expression). Overexpressed kinase genes are categorized into the four TDL classes and reported along with the name of the kinase families.
Fig. 8
MTAP‐deleted colorectal cancer (CRC) cell lines are sensitive to MRTX1719, an inhibitor of the PRMT5:MTA complex. (A) Heatmap showing log2‐transformed copy number variation (CNV) values of MTAP, CDKN2A, and CDKN2B genes in 226 CRC cell lines. Log2‐transformed CNV values are represented using a color scale from red (gene amplification) to blue (gene deletion), and white indicates copy number neutrality. MTAP expression profile is shown above the heatmap, and sample ordering is the same in the two graphs, from high to low MTAP expression levels. The annotation bars below the heatmap indicate samples that were identified as MTAP extreme negative outliers. (B) MTAP protein expression in MTAP‐deleted CRC cell lines (MTAPNULL) and models in which the MTAP gene is normally expressed (MTAPWT). Hsp90 was used as loading control. (C) Cell viability (% control cells treated with DMSO) measured in MTAP‐deleted (blue bars) and wild‐type (red bars) CRC cell lines after 7‐day treatment with MRTX1719 at a relevant concentration (82.3 nM). Data represent mean ± SD of at least three independent biological replicates. The P‐value obtained with one‐way ANOVA test is reported.
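
Several captions above (Figs 5 and 8) describe differential expression as a log2 fold change with respect to the median gene expression, paired with log2‐transformed CNV values and thresholds for calling amplifications and deletions. The sketch below illustrates that pairing under stated assumptions: the threshold values, the pseudocount, and all function names are hypothetical, since the captions do not report them.

```python
import math
from statistics import median


def log2_fold_change(expr, pseudocount=1.0):
    """log2 fold change of each sample versus the median across samples.

    expr maps sample name -> expression value; pseudocount is an assumption.
    """
    med = median(expr.values())
    return {
        s: math.log2((v + pseudocount) / (med + pseudocount))
        for s, v in expr.items()
    }


def cnv_call(log2_cnv, amp_threshold=1.0, del_threshold=-1.0):
    """Classify a log2 CNV value; both thresholds are placeholders."""
    if log2_cnv >= amp_threshold:
        return "amplification"
    if log2_cnv <= del_threshold:
        return "deletion"
    return "neutral"


def cnv_explains_outlier(direction, log2_cnv):
    """True when the copy number change matches the outlier direction.

    direction is "positive" (overexpression) or "negative" (underexpression).
    """
    call = cnv_call(log2_cnv)
    return (direction == "positive" and call == "amplification") or (
        direction == "negative" and call == "deletion"
    )


# Toy values invented for illustration.
toy_lfc = log2_fold_change({"a": 7, "b": 15, "c": 31})
```

In this toy case the median is 15, so sample "a" sits one log2 unit below the median and sample "c" one unit above it.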
