Review
PeerJ Comput Sci. 2025 Feb 4;11:e2574.
doi: 10.7717/peerj-cs.2574. eCollection 2025.

Review of models for estimating 3D human pose using deep learning

Sani Salisu et al. PeerJ Comput Sci.

Abstract

Human pose estimation (HPE) detects and localizes parts of the human body and represents them as a kinematic structure, using input data such as images and videos. Three-dimensional (3D) HPE determines the positions of articulated joints in 3D space. Given its wide-ranging applications, HPE has become one of the fastest-growing areas in computer vision and artificial intelligence. This review highlights the latest advances in deep-learning-based 3D HPE models and addresses major challenges such as accuracy, real-time performance, and data constraints. We assess the most widely used datasets and evaluation metrics, and compare leading algorithms in tabular form in terms of precision and computational efficiency. The review identifies key applications of HPE in industries such as healthcare, security, and entertainment. Our findings suggest that although deep learning models have made significant strides, challenges remain in handling occlusion, real-time estimation, and generalization. This study also outlines future research directions, offering a roadmap for both new and experienced researchers to further develop 3D HPE models using deep learning.

Keywords: 3D image; Deep learning; Human pose estimation; Neural network; Review; Survey.
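
To make the abstract's terms concrete, the sketch below represents a 3D pose as named joints linked into a small kinematic tree and scores a prediction against ground truth. The abstract does not name a specific metric; mean per-joint position error (MPJPE) is used here only because it is common in the 3D HPE literature. The five-joint skeleton, the joint names, the millimetre units, and the functions `bone_lengths` and `mpjpe` are all illustrative assumptions, not taken from the review.

```python
import math
from typing import Dict, Tuple

Point3D = Tuple[float, float, float]

# Hypothetical kinematic tree: child joint -> parent joint.
# Real datasets define their own (larger) skeletons.
PARENTS: Dict[str, str] = {
    "neck": "pelvis",
    "head": "neck",
    "l_wrist": "neck",
    "r_wrist": "neck",
}


def bone_lengths(pose: Dict[str, Point3D]) -> Dict[str, float]:
    """Length of each parent->child link in the kinematic tree."""
    return {
        f"{parent}->{child}": math.dist(pose[parent], pose[child])
        for child, parent in PARENTS.items()
    }


def mpjpe(pred: Dict[str, Point3D], gt: Dict[str, Point3D]) -> float:
    """Mean Euclidean distance between predicted and true joint positions."""
    if pred.keys() != gt.keys():
        raise ValueError("pred and gt must contain the same joints")
    return sum(math.dist(pred[j], gt[j]) for j in gt) / len(gt)


# Made-up ground-truth pose, coordinates assumed to be in millimetres.
gt: Dict[str, Point3D] = {
    "pelvis": (0.0, 0.0, 0.0),
    "neck": (0.0, 500.0, 0.0),
    "head": (0.0, 700.0, 0.0),
    "l_wrist": (-600.0, 450.0, 0.0),
    "r_wrist": (600.0, 450.0, 0.0),
}

# Made-up prediction: every joint offset by (3, 4, 0), i.e. 5 mm from truth.
pred: Dict[str, Point3D] = {
    name: (x + 3.0, y + 4.0, z) for name, (x, y, z) in gt.items()
}

print(mpjpe(pred, gt))
print(bone_lengths(gt)["pelvis->neck"])
```

Because every predicted joint is displaced by the same 5 mm offset, the error comes out as 5 mm while all bone lengths are unchanged. A uniform shift like this is why some evaluations align the root joint before measuring error.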

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1. Article selection process using PRISMA protocol.
Direct component sources. https://creativecommons.org/licenses/by/4.0/.
Figure 2. The number of published articles from 2017 to 2024.
The growing research interest in the field of 3D human pose estimation (3DHPE).
Figure 3. The architecture of the proposed hybrid model to predict 3D human pose and shape (Wang et al., 2023).
Direct License. https://s100.copyright.com/CustomerAdmin/PLF.jsp?ref=cef4c52c-cb41-4996-9ed0-5b42de22dfc5.
Figure 4. 3D human pose generator (Guan et al., 2023).
Figure 5. Occ-Corrector framework (Zhao et al., 2023a).

