Applying Deep Learning Techniques to Estimate Patterns of Musical Gesture
- PMID: 33469435
- PMCID: PMC7813937
- DOI: 10.3389/fpsyg.2020.575971
Abstract
Repetitive practice is one of the most important factors in improving the performance of motor skills. This paper focuses on the analysis and classification of forearm gestures in the context of violin playing. We recorded five experts and three students performing eight traditional classical violin bow-strokes: martelé, staccato, détaché, ricochet, legato, trémolo, collé, and col legno. To record inertial motion information, we used the Myo sensor, which reports a multidimensional time-series signal. We synchronized the inertial motion recordings with audio data to extract the spatiotemporal dynamics of each gesture. We implemented and compared several state-of-the-art deep neural network architectures: convolutional neural network (CNN) models achieved recognition rates of 97.147%, 3DMultiHeaded_CNN models 98.553%, and CNN_LSTM models 99.234%. The collected data (quaternions of the violinist's bowing arm) contained sufficient information to distinguish the bowing techniques studied, and deep learning methods were capable of learning the movement patterns that distinguish these techniques. Each of the learning algorithms investigated (CNN, 3DMultiHeaded_CNN, and CNN_LSTM) produced high classification accuracy, which supports the feasibility of training such classifiers. The resulting classifiers may provide the foundation for a digital assistant that enhances musicians' time spent practicing alone by providing real-time feedback on the accuracy and consistency of their musical gestures in performance.
Keywords: CNN; CNN_LSTM; ConvLSTM; LSTM; bow-strokes; gesture recognition; music education; music interaction.
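As an illustration only, a common first step when classifying a multidimensional time-series signal such as the quaternion stream described in the abstract is to cut the recording into fixed-length, overlapping windows before passing them to a network. The sketch below is not taken from the paper: the function names, the (w, x, y, z) component ordering, the unit-length normalization, and the window and step sizes are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple

# Assumed component ordering (w, x, y, z); the abstract does not specify one.
Quaternion = Tuple[float, float, float, float]


def normalize(q: Quaternion) -> Quaternion:
    """Scale a quaternion to unit length (identity rotation if all zeros)."""
    norm = sum(c * c for c in q) ** 0.5
    if norm == 0.0:
        return (1.0, 0.0, 0.0, 0.0)
    return tuple(c / norm for c in q)


def sliding_windows(
    samples: Sequence[Quaternion], window: int, step: int
) -> List[List[Quaternion]]:
    """Split a quaternion stream into overlapping fixed-length windows.

    Hypothetical helper: `window` and `step` are counted in samples, and
    trailing samples that do not fill a whole window are dropped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        [normalize(q) for q in samples[start:start + window]]
        for start in range(0, len(samples) - window + 1, step)
    ]


if __name__ == "__main__":
    # Ten synthetic samples; real input would come from the motion sensor.
    stream = [(2.0, 0.0, 0.0, 0.0)] * 10
    windows = sliding_windows(stream, window=4, step=2)
    print(len(windows), "windows of", len(windows[0]), "samples each")
```

Each resulting window could then be labeled with the bow-stroke being performed and used as one training example; how the authors actually segmented their recordings is not stated in this record.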
Copyright © 2021 Dalmazzo, Waddell and Ramírez.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
