Transparent Quality Optimization for Machine Learning-Based Regression in Neurology

Karsten Wendt¹, Katrin Trentzsch², Rocco Haase², Marie Luise Weidemann², Robin Weidemann², Uwe Aßmann¹, Tjalf Ziemssen²

Affiliations

¹ Software Technology Group, Technische Universität Dresden, 01187 Dresden, Germany.
² Center of Clinical Neuroscience, Neurological Clinic, University Hospital Carl Gustav Carus, 01307 Dresden, Germany.

PMID: 35743693
PMCID: PMC9224715
DOI: 10.3390/jpm12060908

Karsten Wendt et al. J Pers Med. 2022 May 31;12(6):908. doi: 10.3390/jpm12060908.

Abstract

The clinical monitoring of walking generates large amounts of data that contain valuable information. Machine learning (ML) has therefore rapidly entered the research arena as a way to analyze large heterogeneous datasets and make predictions from them. As such data-driven ML-based applications become applicable in more and more domains, their software qualities are coming into focus. This work provides a proof of concept for applying state-of-the-art ML technology to predict the distance travelled in the 2-min walk test, an important neurological measurement that indicates walking endurance. A transparent, lean approach was emphasized in order to optimize the results in an explainable way while meeting the specified software requirements for a generic approach. The strategy is general-purpose: a fractional-factorial design benchmark combined with standardized quality metrics, based on a minimal technology build and resulting in an optimized software prototype. Based on 400 training and 100 validation samples, the prediction achieved a relative error of 6.1%, distributed over multiple experiments with an optimized configuration. The Adadelta algorithm (LR = 0.000814, $f_{ModelSpread}$ = 5, $n_{ModelDepth}$ = 6, $n_{epoch}$ = 1000) produced the best model, with 90% of the predictions having an absolute error of <15 m. Factors such as gender, age, disease duration, or use of walking aids showed no effect on the relative error. For multiple sclerosis patients with high walking impairment (EDSS Ambulation Score ≥6), the relative difference was significant (n = 30; 24.0%; p < 0.050). The results show that it is possible to create a transparently working ML prototype for a given medical use case while meeting certain software qualities.

Keywords: deep learning; fractional factorial design benchmark; inertial measurement units; machine learning; multiple sclerosis; software quality.
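The two quality figures quoted in the abstract (a relative error and the share of predictions within 15 m) can be illustrated with a short sketch. The abstract does not spell out how the relative error was computed, so the definition used here (mean of the absolute error divided by the measured distance) is an assumption, as are the function names and the example values, which are made up and not taken from the study.

```python
def mean_relative_error(measured_m, predicted_m):
    """Mean of |predicted - measured| / measured (assumed definition)."""
    ratios = [abs(p - m) / m for m, p in zip(measured_m, predicted_m)]
    return sum(ratios) / len(ratios)


def share_within(measured_m, predicted_m, tolerance_m=15.0):
    """Fraction of predictions whose absolute error is below tolerance_m."""
    hits = [abs(p - m) < tolerance_m for m, p in zip(measured_m, predicted_m)]
    return sum(hits) / len(hits)


# Hypothetical walking distances in metres, for illustration only.
measured = [100.0, 150.0, 200.0, 50.0]
predicted = [110.0, 135.0, 190.0, 70.0]

rel = mean_relative_error(measured, predicted)
share = share_within(measured, predicted)
print(f"relative error: {rel:.1%}, share within 15 m: {share:.0%}")
```

Reporting both numbers together is useful because a low average relative error can hide a few large misses, while the threshold share shows how many individual predictions stay within a fixed tolerance.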

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Minimal viable ML-based approach for 2MWT prediction. After aggregating MLS data and manually measured walking distances from PwMS, the data are transposed to a table-based, and thus ML-compatible, representation (columns $P_{1 \ldots \max}$: patients; rows: speed, …, 2MW: features and learning objective), then prepared, split and fed into a DFFNN based on TensorFlow. The model training is based on incrementally improved configurations (FFDB) to optimize the prediction quality, expressed by predefined metrics. [Abbreviations: ML = Machine Learning; 2MWT = 2 minute Walk Test; MLS = Mobility Lab System; PwMS = People with Multiple Sclerosis; DFFNN = Deep Feed Forward Neural Network; FFDB = Fractional Factorial Design Benchmark].

**Figure 2**
LR Optimization. $MSE_{arr}$ as prediction quality ((a) normal, (b) moving average and marked minima) as a function of the LR for 7 ML training algorithms, optimizing DFFNNs of fixed shape [Abbreviations: LR = Learning Rate; ML = Machine Learning; DFFNN = Deep Feed Forward Neural Network].

**Figure 3**
Model Shape Optimization. $MSE_{arr}$ as prediction quality (a,c) and its SD (b,d) as a function of $f_{ModelSpread}$ and $n_{ModelDepth}$ for the Adadelta (best case) and the RMSProp algorithm (worst case); values are scaled for better readability. Panel titles: (a) Optimizer = Adadelta; Average, Relative RMSE for Model Spread and Depth (%); (b) Optimizer = Adadelta; Average, Relative SD for Model Spread and Depth (%); (c) Optimizer = RMSProp; Average, Relative RMSE for Model Spread and Depth (%); (d) Optimizer = RMSProp; Average, Relative SD for Model Spread and Depth (%) [Abbreviations: RMSE = Root Mean Square Error; SD = Standard Deviation].

**Figure 4**
(a) Optimization algorithm comparison. $MSE_{arr}$ as prediction quality and its SD for each algorithm; (b,c) Prediction distribution: individual results of the best regressor model after completing the FFDB as total error [Abbreviations: SD = Standard Deviation; FFDB = Fractional Factorial Design Benchmark; LR = Learning Rate].
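The caption of Figure 2 mentions a moving average with marked minima over the learning-rate sweep. A minimal sketch of that idea follows: smooth the error curve, then pick the learning rate at the minimum of the smoothed curve rather than the raw one. The window size, the edge handling, the function names and all numbers are assumptions chosen for illustration; they are not the settings or results of the study.

```python
def moving_average(values, window=3):
    """Centered moving average; near the edges only existing neighbours are used."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def best_learning_rate(learning_rates, errors, window=3):
    """Return (learning rate, smoothed error) at the minimum of the smoothed curve."""
    smoothed = moving_average(errors, window)
    best = min(range(len(smoothed)), key=smoothed.__getitem__)
    return learning_rates[best], smoothed[best]


# Hypothetical sweep: one error value per learning rate, with a noisy dip at 3e-3.
lrs = [1e-5, 3e-5, 1e-4, 3e-4, 1e-3, 3e-3, 1e-2]
errors = [9.0, 6.0, 3.0, 3.0, 9.0, 1.5, 12.0]

raw_best_lr = lrs[errors.index(min(errors))]
best_lr, best_err = best_learning_rate(lrs, errors)
print(f"raw minimum at LR={raw_best_lr}, smoothed minimum at LR={best_lr}")
```

In this made-up sweep the raw minimum sits on an isolated dip surrounded by poor results, whereas the smoothed minimum lands in a region where neighbouring learning rates also perform well, which is the usual motivation for smoothing before marking minima.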
