Biomed Eng Online. 2023 Dec 15;22(1):125.
doi: 10.1186/s12938-023-01190-z.

Radiotranscriptomics of non-small cell lung carcinoma for assessing high-level clinical outcomes using a machine learning-derived multi-modal signature

Eleftherios Trivizakis et al. Biomed Eng Online. 2023.

Abstract

Background: Multi-omics research has the potential to holistically capture intra-tumor variability, thereby improving therapeutic decisions by incorporating the key principles of precision medicine. The purpose of this study is to identify a robust method of integrating features from different sources, such as imaging, transcriptomics, and clinical data, to predict the survival and therapy response of non-small cell lung cancer patients.

Methods: 2996 radiomics, 5268 transcriptomics, and 8 clinical features were extracted from the NSCLC Radiogenomics dataset. Radiomics and deep features were calculated based on the volume of interest in pre-treatment, routine CT examinations, and then combined with RNA-seq and clinical data. Several machine learning classifiers were used to perform survival analysis and assess the patient's response to adjuvant chemotherapy. The proposed analysis was evaluated on an unseen testing set in a k-fold cross-validation scheme. Score- and concatenation-based multi-omics were used as feature integration techniques.
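The two feature integration techniques named in the Methods can be contrasted with a minimal sketch. The abstract does not state how the per-source scores are computed, so the sketch assumes, purely for illustration, that each source is reduced to a weighted sum of its selected features. The patient IDs, feature values, and weights below are hypothetical toy values, not data from the study.

```python
from typing import Dict, List

SOURCES = ("radiomics", "deep", "transcriptomics")

# Hypothetical toy data: selected features per patient and per source.
patients: Dict[str, Dict[str, List[float]]] = {
    "P1": {"radiomics": [0.2, 1.5], "deep": [0.7, 0.1, 0.4], "transcriptomics": [2.0]},
    "P2": {"radiomics": [0.9, 0.3], "deep": [0.2, 0.8, 0.5], "transcriptomics": [0.5]},
}

# Hypothetical per-source weights. In a real pipeline these would be
# learned on the training folds only (assumption, not from the abstract).
weights: Dict[str, List[float]] = {
    "radiomics": [0.5, -0.2],
    "deep": [0.3, 0.3, 0.4],
    "transcriptomics": [1.0],
}


def concatenate(views: Dict[str, List[float]]) -> List[float]:
    """Concatenation-based integration: one long feature vector per patient."""
    return [value for source in SOURCES for value in views[source]]


def source_score(features: List[float], w: List[float]) -> float:
    """Collapse one source into a single scalar (assumed linear score)."""
    return sum(f * wi for f, wi in zip(features, w))


def score_vector(views: Dict[str, List[float]]) -> List[float]:
    """Score-based integration: one scalar per source, then stack the scalars."""
    return [source_score(views[source], weights[source]) for source in SOURCES]


concat = {pid: concatenate(views) for pid, views in patients.items()}
scores = {pid: score_vector(views) for pid, views in patients.items()}

print(len(concat["P1"]))  # 6 inputs for the downstream classifier
print(len(scores["P1"]))  # 3 inputs, one per source
```

With thousands of extracted features per source, the score-based form gives the downstream classifier far fewer inputs, which is one plausible reason it could behave more stably than plain concatenation on a cohort of roughly a hundred patients.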

Results: Six radiomic features (elongation, cluster shade, entropy, variance, gray-level non-uniformity, and maximal correlation coefficient), six deep features (NasNet-based activations), and three transcriptomic features (OTUD3, SUCGL2, and RQCD1) were found to be significant for therapy response. The examined score-based multi-omics approach improved the AUC by up to 0.10 on the unseen testing set (0.74 ± 0.06) and improved the balance between sensitivity and specificity when predicting therapy response for 106 patients. The resulting models were less biased than the single-source models, which were either highly sensitive or highly specific. Six radiomic features (kurtosis, GLRLM- and GLSZM-based non-uniformity from images with no filtering, biorthogonal, and Daubechies wavelets), seven deep features (ResNet-based activations), and seven transcriptomic features (ELP3, ZZZ3, PGRMC2, TRAK1, ATIC, USP7, and PNPLA2) were found to be significant for the survival analysis. Accordingly, the proposed score-based multi-omics approach also improved the survival analysis for 115 patients by up to 0.20 in terms of the C-index (0.79 ± 0.03).
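Both metrics reported in the Results, the AUC and the C-index, measure how often a model ranks a pair of patients correctly. The sketch below shows simplified pairwise versions of each. It is an illustration with hypothetical toy inputs, not the evaluation code used in the study, and the C-index variant ignores tied survival times.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def c_index(times: Sequence[float], events: Sequence[int], risks: Sequence[float]) -> float:
    """Simplified Harrell's concordance index.

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the model gave patient i the higher risk.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy inputs.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))                       # 0.75
print(c_index([5, 10, 12, 20], [1, 0, 1, 1], [0.9, 0.3, 0.1, 0.2]))  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking for both metrics, which is the scale on which the reported 0.74 (AUC) and 0.79 (C-index) should be read.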

Conclusions: Compared to single-source models, multi-omics integration has the potential to improve prediction performance, increase model stability, and reduce bias for both treatment response and survival analysis.

Keywords: Adjuvant chemotherapy response; Deep features; Integrative data analysis; Multi-omics score; Non-small cell lung cancer; Radiomics; Survival analysis; Transcriptomics.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Performance comparison of therapy response models with 95% confidence intervals. MV multi-view, MVS multi-view score, TSC transcriptomics, TSCS transcriptomic score, RAD radiomics, RADS radiomic score, DL deep learning, DLS deep learning score
Fig. 2
The receiver operating characteristic curves for the sigmoid SVM models, based on multi-view, single-source, and score-based analyses. SVM support vector machine, AUC area under curve
Fig. 3
Performance comparison of survival analysis models with 95% confidence intervals. MV multi-view, MVS multi-view score, TSC transcriptomics, TSCS transcriptomic score, RAD radiomics, RADS radiomic score, DL deep learning, DLS deep learning score
Fig. 4
The predicted survival function of the multi-view analysis (a) and best single-source model (b). The per-patient probability of survival for high-risk patients (R01-037, R01-039 and R01-138) appears lower in this figure than that of the low-risk patients (R01-035, R01-055 and R01-077). The deep feature score-based model (b) assigns high-risk patients a higher survival probability (R01-039, R01-106) and vice versa (R01-077)
Fig. 5
The differences between multi-view and single-source models in terms of AUC. MV multi-view, TSC transcriptomics, RAD radiomics, DL deep learning, AUC area under curve
Fig. 6
The differences between multi-view and single-source survival models in terms of concordance index. MV multi-view, TSC transcriptomics, RAD radiomics, DL deep learning
Fig. 7
The CONSORT diagram of the study. NSCLC non-small cell lung cancer, CT computed tomography
Algorithm 1
A simplified snippet of pseudo-code for the multi-view pipeline and single-source models
Fig. 8
The proposed multi-view analysis for assessing high-level clinical outcomes. This pipeline includes feature extraction from multiple sources, followed by feature selection to identify the most relevant features to the specific clinical endpoint. SMOTE was applied to balance the examined distributions on the training set. Feature integration provides unified, compact representations of patient data for machine learning classification, assessing high-level clinical outcomes. SMOTE synthetic minority oversampling technique
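The Fig. 8 caption notes that SMOTE was applied on the training set. A minimal sketch of that idea inside a k-fold loop follows. The oversampler here is a simplified SMOTE-like stand-in written for illustration (interpolation between a minority sample and one of its nearest minority neighbours). It is not the study's implementation, and the data, fold layout, and function names are hypothetical.

```python
import random
from typing import List, Tuple

Vector = List[float]


def smote_like(minority: List[Vector], n_new: int, k: int, rng: random.Random) -> List[Vector]:
    """Create n_new synthetic points between minority samples and their neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        # Sort the remaining minority samples by squared distance to the base.
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train_idx, test_idx) pairs for k interleaved folds."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def balanced_training_set(X, y, train_idx, rng, k_neighbours=2):
    """Oversample the minority class (assumed to be label 1) in the training split only."""
    X_train = [X[i] for i in train_idx]
    y_train = [y[i] for i in train_idx]
    minority = [x for x, label in zip(X_train, y_train) if label == 1]
    deficit = (len(y_train) - len(minority)) - len(minority)
    if deficit > 0:
        X_train += smote_like(minority, deficit, k_neighbours, rng)
        y_train += [1] * deficit
    return X_train, y_train


# Hypothetical toy data: 9 patients, 3 of them in the minority class.
X = [[float(i), float(i % 3)] for i in range(9)]
y = [1, 0, 0, 0, 1, 0, 0, 0, 1]

rng = random.Random(0)
for train_idx, test_idx in k_fold_indices(len(X), 3):
    X_bal, y_bal = balanced_training_set(X, y, train_idx, rng)
    # The test fold keeps its original, imbalanced distribution.
    print(len(train_idx), "->", len(y_bal), "training samples;", len(test_idx), "test samples")
```

Oversampling only after the split keeps synthetic points, which are derived from training patients, out of the held-out fold, so the reported test performance is not inflated by leaked information.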
