Sensors (Basel). 2022 Apr 26;22(9):3320.
doi: 10.3390/s22093320.

Meta-Knowledge and Multi-Task Learning-Based Multi-Scene Adaptive Crowd Counting

Siqi Tang et al. Sensors (Basel). 2022.

Abstract

In this paper, we propose a multi-scene adaptive crowd counting method based on meta-knowledge and multi-task learning. In practice, surveillance cameras are deployed at fixed positions in various scenes. For a surveillance system to be extensible, the ideal crowd counting method should generalize well enough to be deployed in unknown scenes. On the other hand, given the diversity of scenes, it should also fit each individual scene for better performance. These two objectives conflict, so we propose a coarse-to-fine pipeline that combines a meta-knowledge network with multi-task learning. Specifically, at the coarse-grained stage, we propose a generic two-stream network shared by all existing scenes to encode meta-knowledge, especially inter-frame temporal knowledge. At the fine-grained stage, the regression from the crowd density map to the overall number of people in each scene is treated as a homogeneous subtask in a multi-task framework. A robust multi-task learning algorithm is applied to learn scene-specific regression parameters for existing and new scenes, which further improves the accuracy for each specific scene. Taking advantage of multi-task learning, the proposed method can be deployed to multiple new scenes without duplicated model training. Compared with two representative methods, namely AMSNet and MAML-counting, the proposed method reduces the mean absolute error (MAE) by 10.29% and 13.48%, respectively.

Keywords: crowd counting; meta-knowledge; multi-scene adaptive; multi-task learning.
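
The reported metric can be illustrated with a minimal sketch. It assumes the usual crowd counting conventions: a count is obtained by summing a density map, MAE is the mean absolute difference between predicted and ground-truth counts, and the percentages in the abstract are relative reductions against a baseline MAE. The function names and the sample numbers are hypothetical and do not come from the paper.

```python
from typing import Sequence


def count_from_density(density_map: Sequence[Sequence[float]]) -> float:
    """Estimate a crowd count by summing every pixel of a density map."""
    return sum(sum(row) for row in density_map)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute difference between predicted and ground-truth counts."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(baseline_mae: float, new_mae: float) -> float:
    """Percentage by which new_mae is lower than baseline_mae (assumed definition)."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    toy_map = [[0.5, 0.5], [1.0, 0.0]]
    print(count_from_density(toy_map))
    print(mean_absolute_error([10, 20], [12, 17]))
    print(relative_reduction(10.0, 9.0))
```

Under this reading, a method whose MAE is 9.0 against a baseline of 10.0 would be described as reducing the MAE by 10%.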

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
The pipeline of the proposed coarse-to-fine multi-scene adaptive crowd counting method. At the coarse-grained stage, frame pairs from multiple known scenes are used to train a generic model with meta-knowledge. At the fine-grained stage, the overall counting regression from the estimated density maps of each scene is regarded as a specific task. Multi-task learning is used to learn the regression weights of each specific scene.
Figure 2
Two-stream network to capture meta-knowledge.
Figure 3
Comparison of density maps estimated by CRSNet and our proposed TSN.
Figure 4
Robustness of the proposed TSN-MTL under different lighting conditions and crowd densities. Estimated crowd density maps for frames collected by cameras 100,400 and 100,730. The first column is the original frame, the second column is the density map estimated by TSN, and the third column is the density map estimated by TSN with MTL. The first two rows are frames collected by camera 100,400, while the next two rows are frames collected by camera 100,730.
Figure 5
Similarity relationship of parameters in multiple scenes.
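
The fine-grained stage described in the Figure 1 caption, where each scene gets its own regression weights learned jointly, can be sketched as follows. This is not the paper's robust multi-task learning algorithm. It swaps in a much simpler stand-in, mean-regularized multi-task least squares, and assumes a single scalar weight per scene that maps a density-map sum to a count. The names `fit_scene_weights` and `lam`, and all sample values, are hypothetical.

```python
from typing import Dict, List, Tuple

# One sample per frame: (sum of the estimated density map, ground-truth count).
Samples = List[Tuple[float, float]]


def fit_scene_weights(
    scenes: Dict[str, Samples], lam: float = 1.0, iters: int = 50
) -> Dict[str, float]:
    """Fit one scalar regression weight per scene, coupled through a shared mean.

    Minimizes sum over scenes of sum_i (w_s * x_i - y_i)^2 + lam * (w_s - w_mean)^2
    by alternating between updating the shared mean and the per-scene weights.
    This is an assumed simplification, not the method used in the paper.
    """
    weights = {name: 1.0 for name in scenes}
    for _ in range(iters):
        # Step 1: with the weights fixed, the best shared value is their mean.
        shared = sum(weights.values()) / len(weights)
        # Step 2: with the shared value fixed, each scene has a closed-form update.
        for name, samples in scenes.items():
            sxy = sum(x * y for x, y in samples)
            sxx = sum(x * x for x, _ in samples)
            weights[name] = (sxy + lam * shared) / (sxx + lam)
    return weights


if __name__ == "__main__":
    # Illustrative data only: scene "a" undercounts, scene "b" overcounts.
    toy_scenes = {"a": [(10.0, 12.0), (20.0, 24.0)], "b": [(10.0, 9.0)]}
    print(fit_scene_weights(toy_scenes, lam=0.0))    # independent per-scene fits
    print(fit_scene_weights(toy_scenes, lam=100.0))  # weights pulled toward each other
```

With `lam` set to zero, each scene is fitted independently. A larger `lam` pulls the weights toward a common value, which is one simple way for a scene with few labeled frames to borrow strength from the others.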
