2022 Aug 13;14(16):3914.
doi: 10.3390/cancers14163914.

Thyroid Disease Prediction Using Selective Features and Machine Learning Techniques

Rajasekhar Chaganti et al. Cancers (Basel). 2022.

Abstract

Thyroid disease prediction has recently emerged as an important task. Existing diagnostic approaches typically target binary classification, rely on small datasets, and do not validate their results. They also focus predominantly on model optimization, while feature engineering is less investigated. To overcome these limitations, this study presents an approach that investigates feature engineering for machine learning and deep learning models. Forward feature selection, backward feature elimination, bidirectional feature elimination, and machine learning-based feature selection using extra tree classifiers are adopted. The proposed approach can predict Hashimoto's thyroiditis (primary hypothyroid), binding protein (increased binding protein), autoimmune thyroiditis (compensated hypothyroid), and non-thyroidal syndrome (NTIS) (concurrent non-thyroidal illness). Extensive experiments show that the features selected by the extra tree classifier yield the best results, with 0.99 accuracy and F1 score, when used with the random forest classifier. The results suggest that machine learning models are a better choice for thyroid disease detection in terms of accuracy and computational complexity. K-fold cross-validation and a performance comparison with existing studies corroborate the superior performance of the proposed approach.

Keywords: bidirectional feature elimination; forward feature selection; machine learning; thyroid prediction.
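
As an illustration of the forward feature selection and k-fold cross-validation named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a greedy search that adds one feature at a time while the cross-validated accuracy keeps improving. A simple nearest-centroid classifier stands in for the random forest used in the study, and the function names (`forward_feature_selection`, `k_fold_score`, `make_toy_data`) and the synthetic four-class data are hypothetical, introduced only for this example.

```python
import random
from statistics import mean


def nearest_centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Accuracy of a nearest-centroid classifier that only sees columns `cols`.

    Assumption: this stand-in replaces the random forest used in the paper.
    """
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(cols))
        for i, c in enumerate(cols):
            acc[i] += row[c]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    correct = 0
    for row, label in zip(X_test, y_test):
        # Predict the class whose centroid is closest (squared Euclidean distance).
        pred = min(
            centroids,
            key=lambda lab: sum((row[c] - m) ** 2 for c, m in zip(cols, centroids[lab])),
        )
        correct += pred == label
    return correct / len(y_test)


def k_fold_score(X, y, cols, k=5):
    """Mean accuracy over k folds; fold f holds out rows f, f + k, f + 2k, ..."""
    n = len(y)
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train = [i for i in range(n) if i not in test_idx]
        test = sorted(test_idx)
        scores.append(
            nearest_centroid_accuracy(
                [X[i] for i in train], [y[i] for i in train],
                [X[i] for i in test], [y[i] for i in test],
                cols,
            )
        )
    return mean(scores)


def forward_feature_selection(X, y, max_features, k=5):
    """Greedily add the feature that most improves the cross-validated score."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        score, col = max((k_fold_score(X, y, selected + [c], k), c) for c in remaining)
        if score <= best_score:  # stop when no candidate improves the score
            break
        selected.append(col)
        remaining.remove(col)
        best_score = score
    return selected, best_score


def make_toy_data(n=200, seed=0):
    """Hypothetical 4-class data: columns 0 and 1 are informative, 2 and 3 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 4
        X.append([
            10.0 * (label % 2) + rng.gauss(0, 0.5),   # separates odd from even labels
            10.0 * (label // 2) + rng.gauss(0, 0.5),  # separates {0, 1} from {2, 3}
            rng.gauss(0, 1),
            rng.gauss(0, 1),
        ])
        y.append(label)
    return X, y
```

Calling `forward_feature_selection(*make_toy_data(), max_features=4)` returns the selected column indices and the cross-validated accuracy reached with them. Backward feature elimination follows the same pattern in reverse: start from all features and repeatedly drop the one whose removal hurts the score least.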

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Flow of the proposed methodology.
Figure 2
Feature impact on model performance.
Figure 3
Feature importance using machine learning-based feature selection (MLFS).
Figure 4
Feature space using different feature selection methods.
Figure 5
Feature space using different feature selection methods. (a) ML. (b) Forward feature selection (FFS). (c) Backward feature elimination (BFE). (d) Bidirectional feature elimination (BiDFE). (e) Original.
Figure 6
Deep learning models' per-epoch evaluation scores using original features and MLFS. (a) CNN accuracy using original features, (b) CNN loss using original features, (c) CNN-LSTM accuracy using original features, (d) CNN-LSTM loss using original features, (e) LSTM accuracy using original features, (f) LSTM loss using original features, (g) CNN accuracy using MLFS, (h) CNN loss using MLFS, (i) CNN-LSTM accuracy using MLFS, (j) CNN-LSTM loss using MLFS, (k) LSTM accuracy using MLFS, and (l) LSTM loss using MLFS.
Figure 7
Deep learning models' per-epoch evaluation scores using BiDFE and BFE. (a) CNN accuracy using BFE, (b) CNN loss using BFE, (c) CNN-LSTM accuracy using BFE, (d) CNN-LSTM loss using BFE, (e) LSTM accuracy using BFE, (f) LSTM loss using BFE, (g) CNN accuracy using BiDFE, (h) CNN loss using BiDFE, (i) CNN-LSTM accuracy using BiDFE, (j) CNN-LSTM loss using BiDFE, (k) LSTM accuracy using BiDFE, and (l) LSTM loss using BiDFE.
Figure 8
Deep learning models' per-epoch evaluation scores using FFS. (a) CNN accuracy using FFS, (b) CNN loss using FFS, (c) CNN-LSTM accuracy using FFS, (d) CNN-LSTM loss using FFS, (e) LSTM accuracy using FFS, and (f) LSTM loss using FFS.
