IEEE Open J Eng Med Biol. 2020 Feb 14;1:49-56.
doi: 10.1109/OJEMB.2020.2965191. eCollection 2020.

Predicting Lymphoma Development by Exploiting Genetic Variants and Clinical Findings in a Machine Learning-Based Methodology With Ensemble Classifiers in a Cohort of Sjögren's Syndrome Patients

Konstantina D Kourou et al. IEEE Open J Eng Med Biol.

Abstract

Lymphoma development is one of the most serious clinico-pathological manifestations in patients with Sjögren's Syndrome (SS). Over recent decades, the risk of lymphomagenesis in SS patients has been studied with the aim of identifying novel biomarkers and risk factors that predict lymphoma development in this patient population.

Objective: This study explores whether the genetic susceptibility profiles of SS patients, together with known clinical, serological and histological risk factors, improve the accuracy of predicting lymphoma development in this patient population.

Methods: The potential predictive role of genetic variants and of clinical and laboratory risk factors was investigated through a Machine Learning (ML)-based framework that incorporates ensemble classifiers.

Results: Ensemble methods improve classification accuracy for approaches that are sensitive to minor perturbations in the training phase. Evaluating the proposed methodology with a 10-fold stratified cross-validation procedure yielded the following balanced accuracies: Gradient Boosting (GB): 0.7780 ± 0.1514, Random Forest (RF) Gini: 0.7626 ± 0.1787, RF Entropy: 0.7590 ± 0.1837.

Conclusions: The clinical, serological, histological and genetic findings available at early diagnosis were exploited in an attempt to establish predictive tools for clinical practice and to further the understanding of lymphoma development in SS.

Keywords: Ensemble methods; Sjögren's Syndrome; genetic variants; lymphoma prediction; machine learning.
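
The evaluation metric named in the abstract, balanced accuracy under stratified k-fold cross validation, can be illustrated with a small standard-library sketch. This is not the authors' code: the cohort, the always-majority "model" and the helper names `balanced_accuracy` and `stratified_folds` are invented for illustration, and a real pipeline would refit a classifier on the training folds rather than reuse fixed predictions.

```python
from collections import defaultdict
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall; with two classes, (sensitivity + specificity) / 2."""
    correct, total = defaultdict(int), defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        correct[truth] += int(truth == pred)
    return mean(correct[c] / total[c] for c in total)


def stratified_folds(labels, k):
    """Deal sample indices into k folds so each fold keeps the class ratio."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    dealt = 0
    for indices in by_class.values():
        for index in indices:
            folds[dealt % k].append(index)
            dealt += 1
    return folds


# Toy cohort (invented): 30 controls (0) and 10 cases (1).
labels = [0] * 30 + [1] * 10
# Placeholder "model" that always predicts the majority class.
predictions = [0] * len(labels)

folds = stratified_folds(labels, k=10)
fold_scores = [
    balanced_accuracy([labels[i] for i in fold], [predictions[i] for i in fold])
    for fold in folds
]
# Mean and standard deviation across the 10 folds.
summary = (mean(fold_scores), stdev(fold_scores))
```

On this toy cohort the majority-class predictor is right for 75% of samples, yet its balanced accuracy is 0.5 in every fold, which is why the metric suits a cohort where lymphoma cases are the minority class.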


Figures

Figure 1.
The normalized and non-normalized confusion matrices obtained for each classification model, together with the ROC curves from the evaluation of each model's performance. Each row corresponds to one classifier: the top row shows the RF Gini estimator (confusion matrices and ROC curve), and the middle and bottom rows show the RF Entropy and GB classifiers, respectively. The ROC curves shown are the mean ROC curves and AUC obtained after applying the 10-fold cross-validation procedure in the proposed ML methodology. The ROC curve of each fold is also shown for comparison, and the ±1 SD band is given around the mean ROC.
Figure 2.
Boxplot of the mean feature rankings for each variable considered by the respective estimator. RF feature selection was performed with the threshold set to “mean” and “max_features” equal to the maximum number of features in the dataset considered in each experiment (input case 1: clinical and genetic data).
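
As a rough illustration of a “mean” threshold, the sketch below keeps the features whose importance is at or above the average importance. The caption does not name the library or the exact rule used, so treating the threshold as "importance >= mean" is an assumption, and the feature names and scores are hypothetical.

```python
from statistics import mean


def select_above_mean(importances):
    """Keep features whose importance is at or above the mean importance."""
    threshold = mean(importances.values())
    return [name for name, score in importances.items() if score >= threshold]


# Hypothetical importances; not values from the study.
toy_importances = {
    "feature_a": 0.40,
    "feature_b": 0.30,
    "feature_c": 0.20,
    "feature_d": 0.10,
}
selected = select_above_mean(toy_importances)
```

Here the mean importance is 0.25, so only `feature_a` and `feature_b` are retained.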
Figure 3.
The calculated mean ROC curve and AUC (a–c), with the variance of each curve when the training set is split into 10 different subsets. This shows how the estimator output is affected by changes in the training data, and how much the splits differ from one another in 10-fold cross validation. The left ROC curve corresponds to the RF Gini estimator, and the middle and right ones to the RF Entropy and GB classifiers, respectively.
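
One common way to obtain a mean ROC curve with a ±1 SD band across folds is to interpolate each fold's curve onto a shared false-positive-rate grid and average point by point. The sketch below follows that approach under stated assumptions: it is not the authors' implementation, the two toy folds and their scores are invented, and each fold is assumed to contain at least one positive and one negative sample.

```python
from statistics import mean, pstdev


def roc_points(y_true, scores):
    """ROC curve as (fpr, tpr) lists, sweeping the threshold from high to low."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last sample of a tied score group.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
    return fpr, tpr


def interpolate(x, xs, ys):
    """Linear interpolation of y at x for non-decreasing xs starting at 0."""
    j = max(i for i, value in enumerate(xs) if value <= x)
    if j == len(xs) - 1:
        return ys[j]
    return ys[j] + (ys[j + 1] - ys[j]) * (x - xs[j]) / (xs[j + 1] - xs[j])


def auc(xs, ys):
    """Area under a piecewise-linear curve (trapezoidal rule)."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
               for i in range(len(xs) - 1))


# Two invented folds: (true labels, classifier scores).
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
]

grid = [i / 100 for i in range(101)]  # shared FPR grid
curves = []
for y_true, scores in toy_folds:
    fpr, tpr = roc_points(y_true, scores)
    interpolated = [interpolate(x, fpr, tpr) for x in grid]
    interpolated[0] = 0.0  # pin every curve to the origin
    curves.append(interpolated)

mean_tpr = [mean(column) for column in zip(*curves)]
sd_tpr = [pstdev(column) for column in zip(*curves)]  # width of the SD band
mean_auc = auc(grid, mean_tpr)
```

The spread in `sd_tpr` plays the role of the band around the mean curve: it is wide wherever the folds disagree and zero wherever they coincide.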
