Sensors (Basel). 2022 Apr 26;22(9):3320.

doi: 10.3390/s22093320.

Meta-Knowledge and Multi-Task Learning-Based Multi-Scene Adaptive Crowd Counting

Siqi Tang¹, Zhisong Pan¹, Guyu Hu¹, Yang Wu², Yunbo Li¹

Affiliations

¹ Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China.
² Beijing Information and Communications Technology Research Center, Beijing 100036, China.

PMID: 35591010
PMCID: PMC9104539
DOI: 10.3390/s22093320

Siqi Tang et al. Sensors (Basel). 2022.

Abstract

In this paper, we propose a multi-scene adaptive crowd counting method based on meta-knowledge and multi-task learning. In practice, surveillance cameras are deployed at fixed positions in various scenes. For a surveillance system to be extensible, the ideal crowd counting method should generalize well enough to be deployed in unknown scenes. On the other hand, given the diversity of scenes, it should also adapt to each scene for better performance. These two objectives conflict, so we propose a coarse-to-fine pipeline that combines a meta-knowledge network with multi-task learning. Specifically, at the coarse-grained stage, we propose a generic two-stream network for all existing scenes to encode meta-knowledge, especially inter-frame temporal knowledge. At the fine-grained stage, the regression from the crowd density map to the overall number of people in each scene is treated as a homogeneous subtask in a multi-task framework. A robust multi-task learning algorithm is applied to learn scene-specific regression parameters for existing and new scenes, which further improves the accuracy for each specific scene. By taking advantage of multi-task learning, the proposed method can be deployed to multiple new scenes without duplicated model training. Compared with two representative methods, namely AMSNet and MAML-counting, the proposed method reduces the MAE by 10.29% and 13.48%, respectively.

Keywords: crowd counting; meta-knowledge; multi-scene adaptive; multi-task learning.
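
The fine-grained stage described in the abstract, where each scene learns its own regression from a density map to a count, can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the robust multi-task learner, so a simple mean-regularized multi-task least squares is swapped in here as an assumption. The function names (`pooled_features`, `fit_multitask`, `mae`), the grid pooling of the density map, and the parameter `lam` are all hypothetical choices made for illustration.

```python
import numpy as np


def pooled_features(density_map, grid=2):
    """Sum a 2-D density map over a grid x grid partition (assumed feature design).

    The pooled sums add up to the total of the map, so a weight vector of
    all ones reproduces the plain "sum the density map" count.
    """
    h, w = density_map.shape
    rows = np.array_split(np.arange(h), grid)
    cols = np.array_split(np.arange(w), grid)
    return np.array([density_map[np.ix_(r, c)].sum() for r in rows for c in cols])


def fit_multitask(features, counts, lam=1.0, iters=50):
    """Mean-regularized multi-task least squares (a stand-in, not the paper's method).

    features: list of (n_t, d) arrays, one per scene.
    counts:   list of (n_t,) arrays of ground-truth counts, one per scene.
    Minimizes sum_t ||X_t w_t - y_t||^2 + lam * ||w_t - w_bar||^2 by
    alternating between the per-scene weights and their shared mean.
    """
    d = features[0].shape[1]
    w_bar = np.ones(d)  # start from plain summation of the density map
    W = np.tile(w_bar, (len(features), 1))
    for _ in range(iters):
        for t, (X, y) in enumerate(zip(features, counts)):
            # Ridge regression pulled toward the shared weights w_bar.
            A = X.T @ X + lam * np.eye(d)
            W[t] = np.linalg.solve(A, X.T @ y + lam * w_bar)
        w_bar = W.mean(axis=0)  # shared knowledge across scenes
    return W, w_bar


def mae(predicted, actual):
    """Mean absolute error between predicted and ground-truth counts."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(actual))))
```

In this sketch, a new scene would contribute one more entry to `features` and `counts`, and only the small regression weights are refit; the network that produces the density maps is left untouched, which mirrors the abstract's point about avoiding duplicated model training.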

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Pipeline of the proposed coarse-to-fine multi-scene adaptive crowd counting method. At the coarse-grained stage, frame pairs from multiple known scenes are used to train a generic model with meta-knowledge. At the fine-grained stage, regressing the overall count from the estimated density maps of each scene is treated as a specific task. Multi-task learning is used to learn the regression weights of each specific scene.

**Figure 2**
Two-stream network to capture meta-knowledge.

**Figure 3**
Comparison of density maps estimated by CRSNet and our proposed TSN.

**Figure 4**
Robustness of the proposed TSN-MTL under different lighting conditions and crowd densities. Estimated crowd density maps for images collected by cameras 100,400 and 100,730. The first column is the original image, the second column is the density map estimated by TSN, and the third column is the density map estimated by TSN with MTL. The first two rows are frames collected by camera 100,400, while the next two rows are frames collected by camera 100,730.

**Figure 5**
Similarity relationship of parameters in multiple scenes.

Grants and funding

62076251/National Natural Science Foundation of China
