Sensors (Basel). 2023 Dec 25;24(1):110.
doi: 10.3390/s24010110.

SOCA-PRNet: Spatially Oriented Attention-Infused Structured-Feature-Enabled PoseResNet for 2D Human Pose Estimation

Ali Zakir et al. Sensors (Basel).

Abstract

In recent years, 2D human pose estimation (HPE) has become an integral part of advanced computer vision (CV) applications, particularly for understanding human behavior. Despite challenges such as occlusion, unfavorable lighting, and motion blur, advances in deep learning have significantly improved the performance of 2D HPE by enabling automatic feature learning from data and better model generalization. Because 2D HPE must accurately identify and classify human body joints, further optimization is essential. In response, we introduce the Spatially Oriented Attention-Infused Structured-Feature-enabled PoseResNet (SOCA-PRNet) for enhanced 2D HPE. The model incorporates a novel element, Spatially Oriented Attention (SOCA), designed to improve accuracy without significantly increasing the parameter count. Building on ResNet34 and integrating Global Context Blocks (GCBs), SOCA-PRNet captures detailed human poses precisely. Empirical evaluations demonstrate that our model outperforms existing state-of-the-art approaches, achieving a head-normalized Percentage of Correct Keypoints (PCKh@0.5) of 90.877, that is, at a threshold of 50% of the head segment length, and a Mean Precision (Mean@0.1) score of 41.137. These results underscore the potential of SOCA-PRNet in real-world applications such as robotics, gaming, and human-computer interaction, where precise and efficient 2D HPE is paramount.

Keywords: 2D human pose estimation; CV; Global Context Blocks; SOCA-PRNet.
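
To make the PCKh@0.5 metric reported in the abstract concrete, the following is a minimal sketch of how a head-normalized Percentage of Correct Keypoints can be computed. It is not the authors' evaluation code: the function name `pckh`, the input layout, and the toy coordinates are assumptions made for illustration, and the head segment length is taken as given. Conventions of the official MPII evaluation, such as joint visibility masks, are omitted.

```python
import math


def pckh(pred, gt, head_sizes, threshold=0.5):
    """Percentage of joints whose prediction lies within
    threshold * head_size of the ground-truth location.

    pred, gt: one list of (x, y) joints per person (hypothetical layout)
    head_sizes: one head segment length per person (assumed to be given)
    """
    correct = 0
    total = 0
    for joints_pred, joints_gt, head in zip(pred, gt, head_sizes):
        for (px, py), (gx, gy) in zip(joints_pred, joints_gt):
            distance = math.hypot(px - gx, py - gy)
            if distance <= threshold * head:
                correct += 1
            total += 1
    return 100.0 * correct / total


# Toy data, made up for illustration: one person with three joints.
gt = [[(0, 0), (10, 10), (20, 20)]]
pred = [[(3, 4), (10, 16), (20, 20)]]  # errors of 5, 6, and 0 pixels
head_sizes = [10.0]

score_05 = pckh(pred, gt, head_sizes, threshold=0.5)  # 2 of 3 joints within 5 px
score_01 = pckh(pred, gt, head_sizes, threshold=0.1)  # 1 of 3 joints within 1 px
```

Lowering `threshold` makes the criterion stricter, so the same predictions score lower; this is why a score at 0.1 is typically far below the score at 0.5.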

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1. Detailed architecture of our proposed SOCA-PRNet model for 2D HPE.
Figure 2. Detailed architecture of the simple baseline model for 2D HPE [6].
Figure 3. Modified ResNet with deconvolution module.
Figure 4. Visualization of the SOCA module's feature integration and of the mechanisms that generate and distribute the weights W.
Figure 5. Comparative visualization of the HardSwish and ReLU activation functions.
Figure 6. Visual analysis of 2D HPE models in terms of accuracy and parameter count.
Figure 7. Graphical illustration of the proposed model and the simple baseline models. (a) PCKh@0.5 results for the proposed model and the simple baseline models. (b) Mean and Mean@0.1 for the proposed model and the simple baseline models.
Figure 8. Qualitative pose estimation results on MPII, including viewpoint change, occlusion, and self-occlusion.
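
The two activation functions compared in Figure 5 can be sketched in a few lines. The definitions below are the standard ones (HardSwish as x · ReLU6(x + 3) / 6); how SOCA-PRNet uses them is not stated in this record, so the snippet only illustrates the functions themselves, and the sample inputs are arbitrary.

```python
def relu(x):
    """ReLU: passes positive inputs through and zeroes out negative ones."""
    return max(0.0, x)


def hard_swish(x):
    """HardSwish: x * ReLU6(x + 3) / 6, a piecewise approximation of Swish."""
    relu6 = min(max(x + 3.0, 0.0), 6.0)
    return x * relu6 / 6.0


# Arbitrary sample points chosen for illustration.
samples = [-4.0, -1.0, 0.0, 1.0, 4.0]
relu_values = [relu(x) for x in samples]
hard_swish_values = [hard_swish(x) for x in samples]
```

Unlike ReLU, HardSwish returns small negative values for inputs between -3 and 0, and it matches the identity for inputs of 3 and above.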

References

    1. Bertasius G., Feichtenhofer C., Tran D., Shi J., Torresani L. Learning temporal pose estimation from sparsely-labeled videos. Adv. Neural Inf. Process. Syst. 2019;32.
    2. Chen H., Feng R., Wu S., Xu H., Zhou F., Liu Z. 2D Human pose estimation: A survey. Multimed. Syst. 2023;29:3115–3138. doi: 10.1007/s00530-022-01019-0.
    3. Sapp B., Toshev A., Taskar B. Cascaded models for articulated pose estimation; Proceedings of the Computer Vision–ECCV 2010: 11th European Conference on Computer Vision; Heraklion, Crete, Greece. 5–11 September 2010; Berlin/Heidelberg, Germany: Springer; 2010. pp. 406–420.
    4. Wang F., Li Y. Beyond physical connections: Tree models in human pose estimation; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Portland, OR, USA. 23–28 June 2013; pp. 596–603.
    5. Cao Z., Simon T., Wei S.E., Sheikh Y. Realtime multi-person 2d pose estimation using part affinity fields; Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA. 21–26 July 2017; pp. 7291–7299.
