BMC Med Inform Decis Mak. 2020 Sep 15;20(1):228.
doi: 10.1186/s12911-020-01250-7.

Gradient boosting for Parkinson's disease diagnosis from voice recordings

Ibrahim Karabayir et al. BMC Med Inform Decis Mak.

Abstract

Background: Parkinson's Disease (PD) is a clinically diagnosed neurodegenerative disorder that affects both motor and non-motor neural circuits. Speech deterioration (hypokinetic dysarthria) is a common symptom, which often presents early in the disease course. Machine learning can help movement disorders specialists improve their diagnostic accuracy using non-invasive and inexpensive voice recordings.

Method: We used the "Parkinson Dataset with Replicated Acoustic Features Data Set" from the UCI Machine Learning Repository. The dataset included 44 speech-test based acoustic features from patients with PD and controls. We analyzed the data using various machine learning algorithms, including Light and Extreme Gradient Boosting, Random Forest, Support Vector Machines, K-nearest neighbors, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and logistic regression. We also performed a variable importance analysis to identify the variables that contribute most to classifying patients with PD.

Results: The cohort included a total of 80 subjects: 40 patients with PD (55% men) and 40 controls (67.5% men). Disease duration was 5 years or less for all subjects, with a mean Unified Parkinson's Disease Rating Scale (UPDRS) score of 19.6 (SD 8.1), and none were taking PD medication. The mean age for PD subjects and controls was 69.6 (SD 7.8) and 66.4 (SD 8.4) years, respectively. Our best-performing model used Light Gradient Boosting and achieved an area under the curve (AUC) of 0.951 (95% confidence interval 0.946-0.955) in 4-fold cross validation using only seven acoustic features.
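
As a rough illustration of the evaluation metric reported above, the sketch below computes a per-fold AUC over stratified 4-fold splits of a balanced 40/40 cohort. It is not the authors' pipeline: the scores are synthetic, no model is trained, and the helper names `auc` and `stratified_folds` are hypothetical. In a real cross validation, the scores in each fold would come from a classifier fitted on the other three folds.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k, rng):
    """Split indices into k folds with equal class proportions where possible."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Synthetic stand-in for the cohort: 40 PD (label 1) and 40 controls (label 0).
# The scores are made up; they are NOT model outputs or study data.
rng = random.Random(0)
labels = [1] * 40 + [0] * 40
scores = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

folds = stratified_folds(labels, k=4, rng=rng)
fold_aucs = [auc([labels[i] for i in f], [scores[i] for i in f]) for f in folds]
mean_auc = mean(fold_aucs)
print(f"per-fold AUC: {[round(a, 3) for a in fold_aucs]}, mean: {mean_auc:.3f}")
```

The abstract does not state how the 95% confidence interval was obtained, so the sketch stops at the per-fold and mean AUC rather than guessing at an interval procedure.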

Conclusions: Machine learning can accurately detect Parkinson's disease using an inexpensive and non-invasive voice recording. Light Gradient Boosting outperformed other machine learning algorithms. Such approaches could be used to inexpensively screen large patient populations for Parkinson's disease.

Keywords: Artificial intelligence; Gradient boosting; Machine learning; Parkinson’s disease; Speech test.

Conflict of interest statement

No author has any conflict of interest to report.

Figures

Fig. 1. Acoustic features used in modeling
Fig. 2. Feature selection and reclassification results for 4-fold cross validation using the LGB model
