NPJ Precis Oncol. 2024 Jun 8;8(1):130.
doi: 10.1038/s41698-024-00605-x.

Artificial intelligence-based epigenomic, transcriptomic and histologic signatures of tobacco use in oral squamous cell carcinoma

Chi T Viet et al. NPJ Precis Oncol.

Abstract

Oral squamous cell carcinoma (OSCC) biomarker studies rarely employ multi-omic biomarker strategies and pertinent clinicopathologic characteristics to predict mortality. In this study, we determine for the first time a combined epigenetic, gene expression, and histology signature that differentiates between patients with different tobacco use histories (heavy tobacco use with ≥10 pack-years vs. no tobacco use). Using The Cancer Genome Atlas (TCGA) cohort (n = 257) and an internal cohort (n = 40), we identify 3 epigenetic markers (GPR15, GNG12, GDNF) and 13 expression markers (IGHA2, SCG5, RPL3L, NTRK1, CD96, BMP6, TFPI2, EFEMP2, RYR3, DMTN, GPD2, BAALC, and FMO3) that are dysregulated in OSCC patients who never smoked vs. those with a ≥10 pack-year history. While mortality risk prediction based on smoking status and clinicopathologic covariates alone is inaccurate (c-statistic = 0.57), the combined epigenetic/expression and histologic signature has a c-statistic of 0.9409 in predicting 5-year mortality in OSCC patients.
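
For readers unfamiliar with the c-statistic quoted above, the following is a minimal sketch of how it can be computed for a binary outcome such as 5-year mortality. It assumes the simplest form of the statistic (the fraction of death/survivor pairs in which the patient who died received the higher predicted risk, with ties counted as one half); the study may have used a survival-time variant that handles censoring. The function name and the toy inputs are hypothetical and are not taken from the paper.

```python
def c_statistic(risk_scores, died):
    """Concordance for a binary outcome (equivalent to the ROC AUC).

    risk_scores: predicted risk per patient (higher = more likely to die).
    died: matching truthy/falsy flags for death within the follow-up window.
    """
    events = [r for r, d in zip(risk_scores, died) if d]
    non_events = [r for r, d in zip(risk_scores, died) if not d]
    if not events or not non_events:
        raise ValueError("need at least one death and one survivor")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # the patient who died was ranked higher
            elif e == n:
                concordant += 0.5   # a tie counts as half-concordant
    return concordant / (len(events) * len(non_events))


# Toy, made-up scores: three of the four death/survivor pairs are ordered correctly.
print(c_statistic([0.9, 0.2, 0.3, 0.1], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to ranking patients no better than chance and 1.0 to perfect ranking, which is the scale on which 0.57 and 0.9409 should be read.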

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. TCGA methylation analysis results.
A Volcano plot of batch-corrected data. Only differentially methylated sites with unadjusted p < 0.1 are included. Sites with unadjusted p < 0.05 and an absolute log fold change > 0.5 are considered significant. B QQ plot of the batch-corrected data, which compares the expected to the observed −log10(p) and demonstrates an inflation factor of 0.99.
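
The two quantities in this caption can be illustrated with a short sketch. It assumes the usual definition of the inflation factor (the median observed 1-df chi-square statistic divided by the median expected under the null, about 0.455) and reads the significance rule as unadjusted p < 0.05 together with an absolute log fold change above 0.5. The function names are hypothetical, and this is not the authors' analysis code.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()


def chi2_1df_from_p(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    return z * z


def inflation_factor(p_values):
    """Median observed chi-square over the median expected under the null."""
    observed = median(chi2_1df_from_p(p) for p in p_values)
    expected = _STD_NORMAL.inv_cdf(0.75) ** 2  # median of chi-square(1), ~0.455
    return observed / expected


def is_significant(p, log_fc, p_cut=0.05, lfc_cut=0.5):
    """Apply the volcano-plot rule as read from the caption (an assumption)."""
    return p < p_cut and abs(log_fc) > lfc_cut


# Evenly spaced p-values mimic a well-calibrated test, so lambda is close to 1.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
print(round(inflation_factor(uniform_p), 3))
print(is_significant(0.01, -0.8))
```

An inflation factor near 1, such as the reported 0.99, indicates that the observed p-values follow the null distribution closely and that batch correction has not left systematic inflation.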
Fig. 2. Methylation and RNA-Seq array workflow.
The analysis steps for the array data from the TCGA cohort are shown, with (A) representing the methylation array workflow and (B) representing the RNA-Seq workflow.
Fig. 3. Functional pathway analysis of RNA-Seq biomarkers.
A GO BP top 10 dysregulated pathways. B Top 10 dysregulated KEGG pathways. C Top 5 Reactome gene sets.
Fig. 4. Functional pathway analysis of methylation biomarkers.
A GO BP gene concept network. B Top 10 dysregulated KEGG pathways. C Dot plot of ORA Reactome gene sets.
Fig. 5. Digital histopathology analysis with a deep learning model designed to predict patient smoking status.
A Whole slide images (WSIs) from 203 TCGA hematoxylin and eosin-stained histopathology slides served as training data for a deep learning model constructed with the Slideflow pipeline. B Expert pathologists annotated regions of interest (ROIs) on each WSI. Within each ROI, the WSI is divided into tiles of 299 × 299 pixels. Tiles underwent stain normalization and augmentation prior to model training. C UMAP of the post-convolution layer activations from all images in the validation set. Plotted tiles are a subset of all image tiles within the validation set.
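
The tiling step in panel B can be sketched as follows. This is a simplified illustration that treats the ROI as an axis-aligned rectangle and keeps only non-overlapping tiles that fit entirely inside it; it is not Slideflow's implementation, real ROIs are arbitrary polygons, and the function name is hypothetical.

```python
def tile_origins(roi_width, roi_height, tile_px=299):
    """Return top-left (x, y) pixel coordinates of non-overlapping tiles.

    Only tiles that fit completely inside a rectangular ROI of the given
    size are returned; partial tiles at the right and bottom edges are dropped.
    """
    origins = []
    for y in range(0, roi_height - tile_px + 1, tile_px):
        for x in range(0, roi_width - tile_px + 1, tile_px):
            origins.append((x, y))
    return origins


# A 600 x 300 pixel ROI holds two full 299 x 299 tiles side by side.
print(tile_origins(600, 300))
```

Each returned coordinate would then be used to crop one tile, which is stain-normalized and augmented before being passed to the model.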
Fig. 6. Deep learning model explainability analysis.
A Heat map of the model’s logit score assigned to a given location within the image’s ROI. B UMAP of the post-convolutional layer activations from all images in the model’s validation set, labeled with the model’s smoking status prediction (1 = heavy smoker, 0 = non-smoker). C UMAP in B labeled with the ground-truth smoking status. D Heat map of the model’s uncertainty quantification of the outcome prediction assigned to a given location within the image’s ROI. E UMAP in B labeled with the uncertainty quantification. F UMAP in B labeled with the TCGA donating site. G UMAP in B labeled with anatomic site. H UMAP in B labeled with perineural invasion (PNI) status.
