Front Mol Biosci. 2021 Sep 23:8:637355.
doi: 10.3389/fmolb.2021.637355. eCollection 2021.

Integrative Predictive Modeling of Metastasis in Melanoma Cancer Based on MicroRNA, mRNA, and DNA Methylation Data


Ayşegül Kutlay et al. Front Mol Biosci.

Abstract

Introduction: Despite significant progress in understanding cancer biology, predicting metastasis remains a challenge in the clinic. Transcriptional regulation is one of the critical mechanisms underlying cancer development. Although mRNA, microRNA (miRNA), and DNA methylation mechanisms have a crucial impact on the metastatic outcome, no comprehensive data mining model combines all of these aspects of transcriptional regulation for metastasis prediction. This study investigates the combined effect of miRNA, mRNA, and DNA methylation to identify the regulatory impact of genetic biomarkers for monitoring metastatic molecular signatures of melanoma.
Method: We developed multiple machine learning models that identify metastasis by integrating miRNA, mRNA, and DNA methylation markers. We used the TCGA melanoma dataset and assessed a set of predictive models for identifying metastatic melanoma samples. We compared support vector machines with different kernels, artificial neural networks, random forests, AdaBoost, and Naïve Bayes. Differentially expressed miRNA, mRNA, and methylation signatures were combined iteratively as candidate markers to reveal the impact of each new biomarker category, and the performance of the combined models was calculated in each iteration. Across all comparisons, we analyzed the choice of feature selection method and of undersampling and oversampling approaches. The biomarkers selected by the highest-performing models were further analyzed through functional enrichment for biological interpretation.
Results: In the initial model, miRNA biomarkers identified metastatic melanoma with an F-score of 81%. Adding mRNA markers to the miRNA markers increased the F-score to 92%. In the final integrated model, adding the methylation data yielded a similar F-score of 92% but produced a stable model with low variance across multiple trials.
Conclusion: Our results support the role of miRNA regulation in metastatic melanoma, as miRNA markers model metastasis outcomes with high accuracy. Moreover, evaluating miRNA together with mRNA and methylation biomarkers increases the model's power. The selected biomarkers map onto metastasis-associated pathways of melanoma, such as the "osteoclast", "Rap1 signaling", and "chemokine signaling" pathways. Source Code: https://github.com/aysegul-kt/MelonomaMetastasisPrediction/.

Keywords: DNA methylation; mRNA; machine learning; melanoma; metastasis; metastatic molecular signatures; miRNA.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Experimental pool generation process: each method is evaluated on the same experimental pool under the same conditions. miRNA, mRNA, and methylation data obtained from TCGA were processed separately and merged to generate the whole melanoma marker dataset. Then, through random splitting, 10 individual sample datasets are constructed. Each random split is saved in an undersampled and an oversampled (SMOTE) version.
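The pool construction described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names `undersample` and `build_pool`, the seed handling, and the toy data are all assumptions, and only the undersampled variant is shown.

```python
import random

def undersample(rows, labels, rng):
    """Randomly drop samples so every class matches the smallest class size."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

def build_pool(rows, labels, n_splits=10, seed=0):
    """Return n_splits independently drawn, class-balanced copies of the data."""
    pool = []
    for i in range(n_splits):
        rng = random.Random(seed + i)  # one reproducible generator per split
        pool.append(undersample(rows, labels, rng))
    return pool

# Toy stand-in for the merged marker dataset (hypothetical values).
rows = [[float(i), float(i % 3)] for i in range(30)]
labels = ["metastatic"] * 22 + ["primary"] * 8
pool = build_pool(rows, labels)
print(len(pool), len(pool[0][0]))
```

Seeding each split separately keeps every member of the pool reproducible, so all models can later be compared on identical data.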
FIGURE 2
Generation of training, validation, and unseen test data in each trial. The following steps are applied for both undersampling and oversampling: 1) The significant variables listed in the given category are selected from the dataset. 2) A set selected randomly from the whole data with an 80% ratio of each class is kept for unseen data. 3) A technique is applied to address the curse of dimensionality (for dimensionality reduction, principal component analysis is applied, and for oversampling runs, the SMOTE algorithm is used with K = 3). 4) Steps 1–3 are repeated for each data split in the experiment pool.
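Steps 2 and 3 of this caption can be illustrated with a small standard-library sketch of a per-class hold-out followed by SMOTE-style oversampling with K = 3. It is not the authors' implementation. The helper names and toy data are hypothetical, and the hold-out fraction is left as a parameter (the toy call uses 0.2 purely for illustration) because the caption's "80% ratio" can be read either way.

```python
import math
import random

def stratified_holdout(rows, labels, holdout_ratio, rng):
    """Hold out the same fraction of every class as unseen data."""
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * holdout_ratio)
        test_idx += idx[:cut]
        train_idx += idx[cut:]

    def pick(ids):
        return [rows[i] for i in ids], [labels[i] for i in ids]

    return pick(train_idx), pick(test_idx)

def smote(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Toy data (hypothetical): class 1 is the majority, class 0 the minority.
rng = random.Random(42)
rows = [[float(i), float(i * i % 7)] for i in range(40)]
labels = [1] * 30 + [0] * 10
(train_x, train_y), (test_x, test_y) = stratified_holdout(rows, labels, 0.2, rng)

# Oversample only the training part, so the unseen data stays untouched.
minority = [x for x, y in zip(train_x, train_y) if y == 0]
new_points = smote(minority, train_y.count(1) - train_y.count(0), k=3, rng=rng)
balanced_x = train_x + new_points
balanced_y = train_y + [0] * len(new_points)
```

Oversampling after the hold-out, as in the caption's step order, keeps synthetic samples out of the unseen test set.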
FIGURE 3
Model training and testing process: the experiment flow starts by applying one of two alternative dimensionality solutions, namely, PCA or feature selection. In each experiment flow, models are trained with seven machine learning algorithms (SVMs with linear, radial, and polynomial kernels, neural networks, random forests, AdaBoost, and Naive Bayes) and tested with the same unseen data. The overall flow is repeated for each data subset in the experiment pool.
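The comparison loop in this caption (every model trained per split and scored on the same unseen data) might look like the sketch below. The two classifiers are deliberately trivial, hypothetical stand-ins rather than the seven algorithms named above, and plain accuracy is used only to keep the example short.

```python
import math
from statistics import mean

class MajorityClass:
    """Baseline stand-in: always predicts the most frequent training label."""
    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
        return self

    def predict(self, X):
        return [self.label for _ in X]

class NearestCentroid:
    """Stand-in: predicts the class whose mean feature vector is closest."""
    def fit(self, X, y):
        self.centroids = {}
        for cls in set(y):
            members = [x for x, lab in zip(X, y) if lab == cls]
            self.centroids[cls] = [mean(col) for col in zip(*members)]
        return self

    def predict(self, X):
        return [
            min(self.centroids, key=lambda c: math.dist(x, self.centroids[c]))
            for x in X
        ]

def compare_models(models, splits):
    """Fit each model on every split's training part and score all models
    on that split's same unseen part; return the mean accuracy per model."""
    results = {name: [] for name in models}
    for (train_x, train_y), (test_x, test_y) in splits:
        for name, make_model in models.items():
            pred = make_model().fit(train_x, train_y).predict(test_x)
            hits = sum(p == t for p, t in zip(pred, test_y))
            results[name].append(hits / len(test_y))
    return {name: mean(vals) for name, vals in results.items()}

# One toy split (hypothetical values); a real pool would hold several.
train = ([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6]], [0, 0, 0, 1, 1])
test = ([[0.5, 0.5], [5, 5.5], [6, 5]], [0, 1, 1])
results = compare_models(
    {"majority": MajorityClass, "nearest_centroid": NearestCentroid},
    [(train, test)],
)
```

Passing model factories rather than fitted instances ensures each split starts from a fresh, untrained model.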
FIGURE 4
Illustration of the category-based analysis and the techniques applied: each evaluation criterion is represented with a code. For example, (a1) represents the predictive models that use miRNA signatures with the hybrid method (a random forest that calculates feature importance) for feature selection and undersampling as the class imbalance solution. Similarly, (d3) represents the outcomes of models that predict metastasis from significant miRNA, mRNA, and methylation biomarkers using PCA as the dimensionality solution and SMOTE as the class imbalance solution.
FIGURE 5
Illustration of the results of the category-based analysis and the techniques applied: as a result of the evaluation process, (c1) is selected as the successor model for miRNA markers. When two markers, miRNA and mRNA, are combined, the winner is identified as (d2). In the final cycle, merging all biomarkers resulted in (d3) as the successor. Overall, (d3) was the best model for predicting the metastatic outcome.
FIGURE 6
Model comparison of techniques used for miRNA biomarkers (red, sensitivity; green, predictivity; blue, accuracy; purple, F-score): Category 1, which uses a hybrid model of feature selection and an AdaBoost classifier, has the best results among all scenarios.
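The four measures plotted in these comparison figures can be computed from a confusion matrix as sketched below. Two assumptions are made here and are not confirmed by the source: "predictivity" is read as positive predictive value (precision), and the F-score as the harmonic mean of sensitivity and predictivity (F1). The labels are toy values.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fn, fp, tn

def ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def scores(y_true, y_pred, positive=1):
    tp, fn, fp, tn = confusion_counts(y_true, y_pred, positive)
    sensitivity = ratio(tp, tp + fn)    # share of true positives found
    predictivity = ratio(tp, tp + fp)   # assumed: positive predictive value
    accuracy = ratio(tp + tn, tp + fn + fp + tn)
    f_score = ratio(2 * sensitivity * predictivity, sensitivity + predictivity)
    return {
        "sensitivity": sensitivity,
        "predictivity": predictivity,
        "accuracy": accuracy,
        "f_score": f_score,
    }

# Toy labels (hypothetical): 1 = metastatic, 0 = non-metastatic.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
result = scores(y_true, y_pred)
```

Reporting the F-score alongside accuracy matters on imbalanced data, where accuracy alone can look high even if the minority class is rarely detected.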
FIGURE 7
Model comparison of techniques used for miRNA and mRNA biomarkers (red, sensitivity; green, predictivity; blue, accuracy; purple, F-score): the model listed in 4, which applies (d2), is selected as the successor model for the second cycle.
FIGURE 8
Model comparison of techniques used for miRNA, mRNA, and methylation biomarkers (red, sensitivity; green, predictivity; blue, accuracy; purple, F-score). The model listed in 4, which applies (d3), is selected as the successor model for the final cycle.
FIGURE 9
Comparison of best models for each biomarker set (red, sensitivity; green, predictivity; blue, accuracy; purple, F-score): 1) the performance of the predictive model by using miRNA, 2) the performance of the predictive model by using miRNA and mRNA markers, and 3) the performance of the predictive model by using miRNA, mRNA, and methylation markers.
FIGURE 10
Significant pathways functionally enriched in all three feature sets. As each new biomarker set is added, the significance of the pathways is evaluated. The osteoclast, Rap1 signaling, and chemokine signaling pathways showed a significant increase in the third model.
