Review of models for estimating 3D human pose using deep learning
- PMID: 40062308
- PMCID: PMC11888865
- DOI: 10.7717/peerj-cs.2574
Abstract
Human pose estimation (HPE) aims to detect and localize the parts of the human body and represent them as a kinematic structure, based on input data such as images and videos. Three-dimensional (3D) HPE involves determining the positions of articulated joints in 3D space. Given its wide-ranging applications, HPE has become one of the fastest-growing areas in computer vision and artificial intelligence. This review highlights the latest advances in deep-learning-based 3D HPE models, addressing major challenges such as accuracy, real-time performance, and data constraints. We assess the most widely used datasets and evaluation metrics, and compare leading algorithms in tabular form in terms of precision and computational efficiency. The review identifies key applications of HPE in industries such as healthcare, security, and entertainment. Our findings suggest that while deep learning models have made significant strides, challenges in handling occlusion, real-time estimation, and generalization remain. This study also outlines future research directions, offering a roadmap for both new and experienced researchers to further develop 3D HPE models using deep learning.
Keywords: 3D image; Deep learning; Human pose estimation; Neural network; Review; Survey.
© 2025 Salisu et al.
Conflict of interest statement
The authors declare that there are no competing interests.