Sensors (Basel). 2023 Dec 29;24(1):206.

doi: 10.3390/s24010206.

Feasibility of 3D Body Tracking from Monocular 2D Video Feeds in Musculoskeletal Telerehabilitation

Carolina Clemente¹,², Gonçalo Chambel², Diogo C F Silva³,⁴,⁵, António Mesquita Montes³,⁵,⁶, Joana F Pinto², Hugo Plácido da Silva¹,⁷,⁸

Affiliations

¹ Instituto Superior Técnico (IST), Department of Bioengineering (DBE), Av. Rovisco Pais n. 1, 1049-001 Lisboa, Portugal.
² CLYNXIO, LDA, Rua Augusto Macedo, n. 6, 5 Dto., 1600-794 Lisboa, Portugal.
³ Department of Physiotherapy, Santa Maria Health School, Trav. Antero de Quental 173/175, 4049-024 Porto, Portugal.
⁴ Department of Functional Sciences, School of Health, Polytechnic Institute of Porto, Rua Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal.
⁵ Center for Rehabilitation Research, School of Health, Polytechnic Institute of Porto, Rua Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal.
⁶ Department of Physiotherapy, School of Health, Polytechnic Institute of Porto, Rua Dr. António Bernardino de Almeida 400, 4200-072 Porto, Portugal.
⁷ Instituto de Telecomunicações (IT), Av. Rovisco Pais n. 1, Torre Norte-Piso 10, 1049-001 Lisboa, Portugal.
⁸ Lisbon Unit for Learning and Intelligent Systems (LUMLIS), European Laboratory for Learning and Intelligent Systems (ELLIS), Av. Rovisco Pais n. 1, 1049-001 Lisboa, Portugal.

PMID: 38203068
PMCID: PMC10781343
DOI: 10.3390/s24010206

Carolina Clemente et al. Sensors (Basel). 2023.

Abstract

Musculoskeletal conditions affect millions of people globally; however, conventional treatments pose challenges concerning cost, accessibility, and convenience. Many telerehabilitation solutions offer an engaging alternative but rely on complex hardware for body tracking. This work explores the feasibility of using MediaPipe Pose, a model for 3D Human Pose Estimation (HPE) from monocular 2D videos, in a physiotherapy context by comparing its performance to ground truth measurements. MediaPipe Pose was evaluated on eight exercises typically performed in musculoskeletal physiotherapy sessions, with the Range of Motion (ROM) of the human joints as the evaluated parameter. The model performed best for the shoulder abduction, shoulder press, elbow flexion, and squat exercises. Results showed a mean absolute percentage error (MAPE) ranging between 14.9% and 25.0%, a Pearson's coefficient ranging between 0.963 and 0.996, and a cosine similarity ranging between 0.987 and 0.999. Some exercises (e.g., seated knee extension and shoulder flexion) posed challenges due to unusual poses, occlusions, and depth ambiguities, possibly related to a lack of training data. This study demonstrates the potential of HPE from monocular 2D videos as a markerless, affordable, and accessible solution for musculoskeletal telerehabilitation. Future work should focus on exploring variations of 3D HPE models trained on physiotherapy-related datasets, such as the Fit3D dataset, and on post-processing techniques to enhance the model's performance.

Keywords: 2D camera; 3D Human Pose Estimation; MediaPipe Pose; ROM; deep learning; monocular; musculoskeletal; telerehabilitation; videos.
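
The three agreement metrics reported in the abstract (MAPE, Pearson's coefficient, and cosine similarity) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the ROM values are hypothetical, and how the study aggregated samples (per repetition, per subject, or per exercise) is not stated here.

```python
import math
from typing import Sequence


def mape(truth: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent (truth values must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(truth, pred)) / len(truth)


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between the two series viewed as vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Hypothetical ROM amplitudes in degrees (illustrative values, not study data).
ground_truth = [90.0, 120.0, 150.0]
predicted = [80.0, 110.0, 140.0]

print("MAPE (%):", mape(ground_truth, predicted))
print("Pearson:", pearson(ground_truth, predicted))
print("Cosine similarity:", cosine_similarity(ground_truth, predicted))
```

In this made-up example the predicted amplitudes follow the ground truth with a constant 10° offset, so Pearson's coefficient is 1 and the cosine similarity is close to 1 while the MAPE is still several percent. This is one way a high correlation and a non-trivial MAPE can coexist; the abstract does not say whether that is what happened in the study.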

Conflict of interest statement

The authors declare no conflicts of interest. The sponsors had no role in the design, execution, interpretation, or writing of the study.

Figures

**Figure 1**
Example of skeletal human body representation: 33 landmarks of MediaPipe Pose, where the right-side landmarks are represented in blue, the left-side landmarks in orange, and the nose landmark in white.

**Figure 2**
Classification of 2D camera-based models for Human Pose Estimation (HPE).

**Figure 3**
Eight exercises selected for the experimental study: Shoulder Flexion/Extension (SF), Shoulder Abduction/Adduction (SA), Elbow Flexion/Extension (EF), Shoulder Press (SP), Hip Abduction/Adduction (HA), Squat (SQ), March (MCH), and Seated Knee Flexion/Extension (SKF). Shoulder press and squat exercises are illustrated by a sequence of two representative images of the movement.

**Figure 4**
Experimental setup for the data acquisition, showing some of the Qualisys cameras, two 2D cameras, and the relative position between the subject and the two 2D cameras.

**Figure 5**
Anatomical location of the six Qualisys MoCap markers.

**Figure 6**
The 3D Cartesian coordinate system of Qualisys (in orange) and its spatial relation with respect to the participant position during data acquisition.

**Figure 7**
Comparison of the normal vectors of the anatomical planes (in black) with the Qualisys coordinate system (in orange).

**Figure 8**
Relation between the participant position and the Cartesian coordinate system of the MediaPipe Pose model for three camera orientations: (a) camera plane parallel to participant frontal plane; (b) camera plane rotated around the Y-axis relative to participant frontal plane; and (c) camera plane rotated around the X-axis relative to participant frontal plane. The camera 2D coordinate system is represented by the X’-axis and Y’-axis, which are parallel to the X-axis and Y-axis of the algorithm coordinate system, respectively.

**Figure 9**
The virtual 3D coordinate system of MediaPipe Pose coincident with the normal vectors of the anatomical planes. The origin is the midpoint between the hips. The X-axis is the sagittal plane normal, the Y-axis is the transverse plane normal, and the Z-axis is the frontal plane normal. The four points (representing the shoulders and hips) are used to define the virtual 3D coordinate system.

**Figure 10**
Representation of virtual 3D coordinate system definition: (1) Z-axis or frontal plane normal; (2) Y-axis or transverse plane normal; and (3) X-axis or sagittal plane normal.

**Figure 11**
Comparison of the normal vectors of the anatomical planes (in black) with the MediaPipe Pose virtual coordinate system (in blue).

**Figure 12**
Amplitude calculation between the projected body segment vector and a reference direction.

**Figure 13**
Data alignment between the Qualisys ground truth amplitudes (in orange) and MediaPipe Pose predicted amplitudes (in blue).

**Figure 14**
Example of Qualisys ground truth (in orange) and MediaPipe Pose predicted (in blue) amplitudes for Subject 1 performing SA exercise and SKF exercise. (a,b) show the raw amplitude before the alignment procedure, and (c,d) the aligned amplitude data, before segmenting the sample to extract the exercise repetitions.

**Figure 15**
Relation between Qualisys and MediaPipe Pose motion amplitudes for (a) SA exercise and (b) SKF exercise. Each color represents a different subject, and the yellow line is the linear regression that best fits the amplitude data for the exercise; the coefficient of determination ($R^{2}$) and the linear regression equation (slope and intercept) are also shown, where y and x are the Qualisys and MediaPipe Pose amplitudes, respectively.
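
The body-centred coordinate system described in the captions of Figures 9 and 10 and the amplitude calculation of Figure 12 can be sketched as follows. This is an illustrative reconstruction from the captions only, not the authors' implementation: the cross-product construction, the axis sign conventions, the choice of the downward trunk direction as reference, and all function names and landmark coordinates are assumptions.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a: Vec) -> float:
    return math.sqrt(dot(a, a))


def unit(a: Vec) -> Vec:
    length = norm(a)
    return (a[0] / length, a[1] / length, a[2] / length)


def midpoint(a: Vec, b: Vec) -> Vec:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2)


def body_frame(l_shoulder: Vec, r_shoulder: Vec, l_hip: Vec, r_hip: Vec):
    """Build an orthonormal frame from the shoulders and hips (assumed construction)."""
    origin = midpoint(l_hip, r_hip)                  # midpoint between the hips
    lateral = sub(l_hip, r_hip)                      # right hip -> left hip
    up = sub(midpoint(l_shoulder, r_shoulder), origin)
    z_axis = unit(cross(lateral, up))                # (1) frontal plane normal
    y_axis = unit(cross(z_axis, lateral))            # (2) transverse plane normal
    x_axis = cross(y_axis, z_axis)                   # (3) sagittal plane normal
    return origin, x_axis, y_axis, z_axis


def projected_angle_deg(segment: Vec, plane_normal: Vec, reference: Vec) -> float:
    """Angle between a segment projected onto a plane and an in-plane reference."""
    n = unit(plane_normal)
    depth = dot(segment, n)
    projected = (segment[0] - depth * n[0],
                 segment[1] - depth * n[1],
                 segment[2] - depth * n[2])
    return math.degrees(math.atan2(norm(cross(projected, reference)),
                                   dot(projected, reference)))


# Hypothetical landmark positions (arbitrary units), with the arm raised sideways.
l_sh, r_sh = (1.0, 2.0, 0.0), (-1.0, 2.0, 0.0)
l_hip, r_hip = (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)
l_elbow = (2.0, 2.0, 0.5)

origin, x_axis, y_axis, z_axis = body_frame(l_sh, r_sh, l_hip, r_hip)
upper_arm = sub(l_elbow, l_sh)
down = (-y_axis[0], -y_axis[1], -y_axis[2])          # assumed reference direction
abduction = projected_angle_deg(upper_arm, z_axis, down)
print("Frontal-plane shoulder angle (deg):", abduction)
```

Projecting the segment onto an anatomical plane before measuring the angle discards the component along that plane's normal, so in this sketch the out-of-plane offset of the elbow does not change the computed amplitude.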
