Sci Rep. 2025 Jan 22;15(1):2866.
doi: 10.1038/s41598-025-86247-w.

Multiscale regional calibration network for crowd counting

Jiamao Yu et al. Sci Rep. 2025.

Abstract

Crowd counting aims to estimate the number, density, and distribution of people in an image. Although CNN-based crowd counting methods have been effective, head-scale variation and complex backgrounds remain two major challenges. We therefore propose a multiscale regional calibration network, MRCNet, to address them. For head-scale variation, we design a multiscale aware module that runs multiple dilated convolution branches in parallel to obtain multiscale receptive fields and cope with drastic changes in head size. For complex backgrounds, we design a regional calibration module that, after the attention map is obtained, calibrates the attention weights of each region to suppress background interference. We also improve the loss function by combining L2 loss with binary cross-entropy loss. Extensive experiments on three mainstream datasets demonstrate the robustness and competitiveness of our approach.
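The idea of parallel dilated convolution branches can be illustrated with a minimal sketch. It is not the authors' implementation: it works in 1-D with plain Python lists, and the dilation rates (1, 2, 3), the all-ones kernel, and the fusion of branches by summation are assumptions chosen only for illustration. It relies on the standard relation that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (odd kernel size)."""
    half = (len(kernel) - 1) * dilation // 2  # offset that centres the kernel
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j * dilation - half
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multiscale_branches(signal, kernel, dilations=(1, 2, 3)):
    """Apply one kernel at several dilation rates in parallel.

    Fusing the branches by element-wise summation is an assumption made here,
    not something stated in the abstract.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    fused = [sum(values) for values in zip(*branches)]
    return branches, fused


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    branches, fused = multiscale_branches(impulse, [1, 1, 1])
    for d, branch in zip((1, 2, 3), branches):
        print(d, effective_kernel_size(3, d), branch)
    print(fused)
```

Feeding in a single impulse shows how each branch reaches a different distance from the centre: the same 3-tap kernel responds over spans of 3, 5, and 7 positions, which is the sense in which parallel branches provide receptive fields at several scales.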

Keywords: Crowd counting; Feature aggregation; Multiscale; Regional calibration.

Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1
Major challenges facing current crowd counting tasks. (a) The problem of head-size variation. (b) The problem of complex background.
Fig. 2
The overall structure of MRCNet. Images are first fed into the Feature Extraction Module (FEM), which uses the first 13 layers of VGG-16 for initial feature extraction. The FEM outputs features at three network depths, which are passed through the Multiscale Aware Module (MAM) and the Regional Calibration Module (RCM) for feature enhancement, yielding three levels of enhanced features. Finally, the Feature Aggregation Module (FAM) combines the three levels to produce the predicted density map. The network is trained with an improved loss function.
Fig. 3
Visualization results on different datasets. From left to right, the three columns show the input image, the ground-truth density map, and the predicted density map.
Fig. 4
Visualization results for different combinations of MAM and RCM. (a) After adding the MAM, the network adapts to drastic changes in head size. (b) In complex scenes the network makes large errors, such as counting leaves as heads; adding the RCM improves counting accuracy and effectively suppresses background interference.
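The improved loss mentioned in the Fig. 2 caption is described in the abstract as a combination of L2 loss and binary cross-entropy (BCE) loss. The following is a minimal sketch of one way such a combination could look, not the authors' formulation. Several details are assumptions: the maps are flattened to plain lists, the BCE term compares a predicted attention map against a binary mask of locations where the ground-truth density is positive, and the hypothetical `weight` parameter balances the two terms.

```python
import math


def l2_loss(pred, target):
    """Mean squared error between predicted and ground-truth density values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def bce_loss(prob, label, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, y in zip(prob, label):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(prob)


def combined_loss(pred_density, gt_density, pred_attention, weight=1.0):
    """L2 on the density map plus weight * BCE on the attention map.

    Assumption: the BCE target is 1 wherever the ground-truth density is
    positive and 0 elsewhere. `weight` is a hypothetical balancing factor.
    """
    mask = [1.0 if t > 0 else 0.0 for t in gt_density]
    return l2_loss(pred_density, gt_density) + weight * bce_loss(pred_attention, mask)


if __name__ == "__main__":
    gt = [0.0, 0.5, 0.0, 0.25]         # flattened ground-truth density map
    pred = [0.1, 0.4, 0.0, 0.25]       # flattened predicted density map
    attention = [0.2, 0.9, 0.1, 0.7]   # predicted foreground probabilities
    print(combined_loss(pred, gt, attention, weight=0.5))
```

Under these assumptions the L2 term penalizes errors in the density values themselves, while the BCE term penalizes attention placed on background regions, which matches the role the captions attribute to background suppression.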
