SPCANet: congested crowd counting via strip pooling combined attention network
- PMID: 39314741
- PMCID: PMC11419659
- DOI: 10.7717/peerj-cs.2273
Abstract
Crowd counting, an important research direction in object counting, aims to estimate the number and distribution of people in crowded places. It is widely used in public place management, crowd behavior analysis, and other scenarios, which demonstrates its practical value. Crowd-counting technology has developed rapidly in recent years. However, in highly crowded and noisy scenes, the accuracy of most models is still seriously degraded by perspective distortion, dense occlusion, and inconsistent crowd distribution. Perspective distortion causes crowds to appear in different sizes and shapes in the image, while dense occlusion and inconsistent crowd distribution prevent parts of the crowd from being captured completely. As a result, the model captures spatial information imperfectly. To address these problems, we propose a strip pooling combined attention network (SPCANet) based on normed-deformable convolution (NDConv). Introducing strip pooling lets us model long-distance dependencies more efficiently. In contrast to traditional square-kernel pooling, strip pooling uses long, narrow kernels (1×N or N×1) to deal with dense crowds, mutual occlusion, and overlap. SPCANet also introduces efficient channel attention (ECA), a mechanism that learns channel attention using a local cross-channel interaction strategy. This module generates channel attention through a fast 1D convolution, reducing model complexity while improving performance as much as possible. In extensive experiments on four mainstream datasets, Shanghai Tech Part A, Shanghai Tech Part B, UCF-QNRF, and UCF CC 50, the mean absolute error (MAE) improves on the baseline, reaching 60.9, 7.3, 90.8, and 161.1, respectively, which validates the effectiveness of SPCANet. Meanwhile, the mean squared error (MSE) decreases by 5.7% on average over the four datasets, and robustness is greatly improved.
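The two building blocks named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `strip_pool` and `eca`, the additive fusion of the two strips, and the fixed kernel weights are assumptions made for illustration. In a real network the 1D convolution weights are learned, and strip pooling is normally followed by further convolutions that are omitted here.

```python
import math

def strip_pool(feature):
    """Strip pooling sketch on one H x W map (list of lists of floats).

    Each output cell combines the mean of its whole row and the mean of its
    whole column, so it sees context along two long, narrow strips.
    """
    h, w = len(feature), len(feature[0])
    # Horizontal strip: average every row over the full width.
    row_avg = [sum(row) / w for row in feature]
    # Vertical strip: average every column over the full height.
    col_avg = [sum(feature[i][j] for i in range(h)) / h for j in range(w)]
    # Expand both strips back to H x W and fuse them by addition (assumed fusion).
    return [[row_avg[i] + col_avg[j] for j in range(w)] for i in range(h)]

def eca(channels, kernel):
    """Efficient channel attention sketch.

    channels: list of C feature maps (each H x W).
    kernel:   odd-length list of 1D convolution weights (hypothetical values;
              learned in practice).
    Returns (rescaled channels, per-channel gates).
    """
    c = len(channels)
    # Global average pooling: one scalar descriptor per channel.
    desc = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]
    pad = len(kernel) // 2
    gates = []
    for i in range(c):
        # 1D convolution across neighbouring channels, zero-padded at the ends.
        s = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - pad
            if 0 <= j < c:
                s += weight * desc[j]
        # Sigmoid turns the response into a gate in (0, 1).
        gates.append(1.0 / (1.0 + math.exp(-s)))
    # Rescale every channel by its gate.
    scaled = [[[v * gates[i] for v in row] for row in ch]
              for i, ch in enumerate(channels)]
    return scaled, gates
```

The cost of the attention step depends only on the number of channels and the kernel length, not on the spatial size, which is the sense in which a 1D convolution keeps the module lightweight.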
Keywords: Channel attention; Convolutional neural network; Crowd counting; Spatial pooling.
©2024 Yuan.
Conflict of interest statement
The authors declare there are no competing interests.