Congested Crowd Counting via Adaptive Multi-Scale Context Learning

Yani Zhang¹, Huailin Zhao², Zuodong Duan³, Liangjun Huang¹, Jiahao Deng³, Qing Zhang¹

Affiliations

¹ School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China.
² School of Electrical and Electronic Engineering, Shanghai Institute of Technology, Shanghai 201418, China.
³ School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China.

PMID: 34072408
PMCID: PMC8198824
DOI: 10.3390/s21113777

Yani Zhang et al. Sensors (Basel). 2021 May 29;21(11):3777. doi: 10.3390/s21113777.

Abstract

In this paper, we propose a novel network for crowd density estimation in congested scenes, the Adaptive Multi-scale Context Aggregation Network (MSCANet). MSCANet efficiently leverages spatial context information to estimate crowd density in complicated crowd scenes. To achieve this, a multi-scale context learning block, called the Multi-scale Context Aggregation module (MSCA), first extracts information at different scales and then adaptively aggregates it to capture the full range of crowd scales. By employing multiple MSCAs in a cascaded manner, MSCANet can deeply exploit the spatial context information and modulate preliminary features into more discriminative and scale-sensitive features, which are finally passed to a 1 × 1 convolution to obtain the crowd density estimate. Extensive experiments on three challenging crowd counting benchmarks showed that our model performs competitively against other state-of-the-art methods. To demonstrate the generality of MSCANet, we extended our method to two related tasks: crowd localization and remote sensing object counting. The results of these extension experiments also confirmed the effectiveness of MSCANet.

Keywords: crowd counting; crowd density estimation; crowd localization; multi-scale context learning; remote sensing object counting.
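
The pipeline described in the abstract (extract context at several scales, aggregate it adaptively, cascade the blocks, then apply a 1 × 1 convolution) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the abstract does not specify the layers, so the box-blur branches, the softmax-based per-channel weights standing in for channel attention, and all function names (`box_blur`, `msca_sketch`, `conv1x1`, `estimate_count`) are assumptions made for illustration only. A real model would use learned convolutional branches and learned attention weights.

```python
import math

def box_blur(channel, k):
    """Average each pixel over a (2k+1)x(2k+1) window, clipped at the borders."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [channel[j][i]
                    for j in range(max(0, y - k), min(h, y + k + 1))
                    for i in range(max(0, x - k), min(w, x + k + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

def softmax(values):
    """Numerically stable softmax over a short list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def msca_sketch(features, radii=(0, 1, 2)):
    """Toy multi-scale context aggregation over a [C][H][W] feature map.

    Assumption: each radius is one context branch (a stand-in for a learned
    branch), and per-channel softmax weights (a stand-in for channel
    attention) decide how much each branch contributes.
    """
    aggregated = []
    for channel in features:
        h, w = len(channel), len(channel[0])
        branches = [box_blur(channel, k) for k in radii]
        # Global average pooling gives one scalar descriptor per branch.
        descriptors = [sum(map(sum, b)) / (h * w) for b in branches]
        weights = softmax(descriptors)
        aggregated.append([
            [sum(wt * b[y][x] for wt, b in zip(weights, branches))
             for x in range(w)]
            for y in range(h)
        ])
    return aggregated

def conv1x1(features, channel_weights):
    """1x1 convolution to one output channel: per-pixel weighted channel sum."""
    h, w = len(features[0]), len(features[0][0])
    return [[sum(cw * f[y][x] for cw, f in zip(channel_weights, features))
             for x in range(w)]
            for y in range(h)]

def estimate_count(features, channel_weights, num_blocks=2):
    """Cascade the toy blocks, project to a density map, and sum it."""
    for _ in range(num_blocks):
        features = msca_sketch(features)
    density = conv1x1(features, channel_weights)
    return density, sum(map(sum, density))

# Hypothetical input: 2 channels of 4x4 features with one bright location.
feats = [[[0.0] * 4 for _ in range(4)] for _ in range(2)]
feats[0][1][2] = 1.0
density_map, count = estimate_count(feats, channel_weights=[0.5, 0.5])
```

The final sum reflects the usual convention in density-based crowd counting, where the predicted count is the integral of the density map. The weights here are fixed by hand, so the numbers carry no meaning beyond showing the data flow.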

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Representative examples in the UCF-QNRF dataset [17]. From left to right: input images, ground-truth, results of CSRNet [8], and the results of MSCANet. Compared to CSRNet, MSCANet can effectively handle the ambiguity of appearance between crowd and background objects.

**Figure 2**
Detailed illustration of our Adaptive Multi-scale Context Aggregation Network for crowd counting.

**Figure 3**
Different structures of multi-scale context modules. (a) Multi-scale context aggregation module (MSCA) w/o channel attention (CA); (b) cascade context pyramid module (CCPM); (c) scale pyramid module (SPM); and (d) scale-aware context module (SACM).

**Figure 4**
Visualizations of MSCANet for crowd localization on the UCF-QNRF dataset. Red points denote the ground-truth, and green points denote the estimated location results of MSCANet.

**Figure 5**
Visualization results of MSCANet for remote sensing object counting on the RSOC dataset.

**Figure 6**
Impacts of different pyramid scale settings on UCF-QNRF. From left to right: input image, ground truth, result of PS = {1}, result of PS = {1,2}, result of PS = {1,2,3}, and result of PS = {1,2,3,4}.

**Figure 7**
Impacts of CA on UCF-QNRF. From left to right: input image, ground-truth, result of MSCA w/o CA, and result of MSCA.

**Figure 8**
Visual comparison of different multi-scale context modules on UCF-QNRF. From left to right: input images, ground-truth, results of our method, results of CCPM, results of SPM, and results of SACM.

References

1. Yu Y., Huang J., Du W., Xiong N. Design and analysis of a lightweight context fusion CNN scheme for crowd counting. Sensors. 2019;19:2013. doi: 10.3390/s19092013.
2. Tong M., Fan L., Nan H., Zhao Y. Smart camera aware crowd counting via multiple task fractional stride deep learning. Sensors. 2019;19:1346. doi: 10.3390/s19061346.
3. Csönde G., Sekimoto Y., Kashiyama T. Crowd counting with semantic scene segmentation in helicopter footage. Sensors. 2020;20:4855. doi: 10.3390/s20174855.
4. Ilyas N., Shahzad A., Kim K. Convolutional-neural network-based image crowd counting: Review, categorization, analysis, and performance evaluation. Sensors. 2020;20:43. doi: 10.3390/s20010043.
5. Fortino G., Savaglio C., Spezzano G., Zhou M. Internet of Things as System of Systems: A Review of Methodologies, Frameworks, Platforms, and Tools. IEEE Trans. Syst. Man Cybern. Syst. 2020 doi: 10.1109/TSMC.2020.3042898.
