Sensors (Basel). 2025 Jun 14;25(12):3726.
doi: 10.3390/s25123726.

HGCS-Det: A Deep Learning-Based Solution for Localizing and Recognizing Household Garbage in Complex Scenarios

Houkui Zhou et al. Sensors (Basel). 2025.

Abstract

With the rise of deep learning, intelligent garbage detection offers a new approach to managing garbage classification. However, interference from complex environments, together with the irregular appearance of garbage, means that detection in complex scenarios remains challenging. Moreover, some existing methods fall short in either precision or real-time performance, particularly in complex garbage detection scenarios. This paper therefore proposes HGCS-Det, a YOLOv8-based model for detecting garbage in complex scenarios. HGCS-Det is designed as follows. First, a normalization attention module (NAM) is introduced to calibrate the model's attention to targets and suppress interference from environmental noise. Second, to weigh the contributions of the attention features, an Attention Feature Fusion (AFF) module is employed to complement the attention weights of each channel. Third, an Instance Boundary Reinforcement (IBR) module is established to capture the fine-grained features of garbage by combining strong gradient information with semantic information. Finally, the Slide Loss function is applied to dynamically weight the hard samples that arise in complex detection environments, improving their recognition accuracy. With only a slight increase in parameters (3.02M), HGCS-Det achieves a mean average precision (mAP) of 93.6% at 86 FPS on the public HGI30 dataset, a 3.33% higher mAP than YOLOv12, and outperforms the state-of-the-art (SOTA) methods in both efficiency and applicability. Notably, HGCS-Det remains lightweight while improving detection accuracy, enabling real-time performance even in resource-constrained environments. These characteristics make the model well suited for deployment on embedded devices and in real-world garbage classification systems, and the method can serve as a technical reference for engineering applications of garbage classification.

Keywords: Slide Loss; attention-feature fusion; garbage detection; instance boundary reinforcement; normalization attention.
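
The abstract describes Slide Loss as dynamically weighting hard samples. A minimal sketch of such a weighting is given below. It assumes the piecewise form commonly given for Slide Loss, in which the mean IoU of the batch serves as the threshold between easy and hard samples; the abstract does not state the exact variant used in HGCS-Det. The function names `slide_weight` and `weighted_losses` and the 0.1 margin are illustrative assumptions.

```python
import math


def slide_weight(iou: float, mu: float) -> float:
    """Weight for one sample given its IoU and the threshold mu.

    Assumed piecewise form (not confirmed by the abstract):
      iou <= mu - 0.1      -> 1            (clearly negative, left unchanged)
      mu - 0.1 < iou < mu  -> exp(1 - mu)  (hard samples near the threshold)
      iou >= mu            -> exp(1 - iou) (decays to 1 as iou approaches 1)
    """
    if iou <= mu - 0.1:
        return 1.0
    if iou < mu:
        return math.exp(1.0 - mu)
    return math.exp(1.0 - iou)


def weighted_losses(losses, ious):
    """Scale per-sample losses, using the mean IoU of the batch as mu."""
    mu = sum(ious) / len(ious)
    return [loss * slide_weight(iou, mu) for loss, iou in zip(losses, ious)]
```

Under these assumptions, samples whose IoU lies just below the threshold receive the largest weight, so the loss concentrates on the ambiguous cases that complex scenes tend to produce.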

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Dataset analysis for (a) the number of category instances, (b) the label spatial distribution, and (c) the label size distribution. (a) This bar chart displays the number of samples for each category across 30 classes. Different colors are used solely to distinguish each category visually, enhancing clarity and making it easier to identify individual class distributions. (b) This heatmap represents the spatial distribution of labels within the dataset. The intensity of the color indicates density, with darker blue shades signifying higher concentrations of data points. (c) This heatmap illustrates the distribution of label sizes (height vs. width). Similar to (b), darker blue shades represent higher densities, with a pronounced concentration in the upper right quadrant, indicating that larger label sizes are more prevalent in the dataset.
Figure 2
Examples of data diversity in HGI30.
Figure 3
The overall architecture of YOLOv8.
Figure 4
The structure of the HGCS-Det model. Solid arrows represent the main data flow, highlighting the direct transmission of feature maps across the Backbone, Neck, and Prediction stages. In contrast, dashed arrows signify residual or skip connections, whereby features from preceding layers are concatenated or summed with subsequent layers to retain spatial details.
Figure 5
The structure of NAM, which is made up of the combination of (a) a channel attention module and (b) a spatial attention module.
Figure 6
Visualization of the AFF module principle. The color-coded arrows trace the successive processing stages of the same image across the multi-group feature pipeline.
Figure 7
The structure of the IBR module. Solid red arrows denote the primary data flow, guiding the input through the Local Descriptor and Semantic Projector, where convolutions extract initial features, directing the process to subsequent modules for further refinement. Dashed black arrows illustrate the gradient aggregation and expansion within the Gradient Aggregation module, demonstrating how gradients are collected and expanded to enhance feature quality before advancing to the next stage.
Figure 8
The weighting strategy for Slide Loss [32].
Figure 9
Examples of visual analysis by Grad-CAM. The figure compares detection results for an apple core and a cigarette butt across four columns: the first column shows the original images without overlay, while the subsequent columns display heatmaps where colors transition from blue (low confidence) to red (high confidence), with yellow and orange indicating intermediate levels.
Figure 10
Visual analysis of the outputs of (a) VC, (b) IBE, and (c) IBR.
Figure 11
The confusion matrix results for (a) YOLOv8n and (b) HGCS-Det.
Figure 12
Comparison of the training results of HGCS-Det with those of other mainstream models on HGI30.
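
The channel attention branch of NAM shown in Figure 5(a) can be illustrated with a small sketch. It assumes the normalization-based attention formulation in which the absolute batch-normalization scale factors, normalized to sum to one, act as channel importance weights; the captions do not give the exact computation used in HGCS-Det. The helper names `nam_channel_weights` and `nam_channel_attention`, and the representation of a feature map as a list of flattened channels, are illustrative assumptions.

```python
import math


def nam_channel_weights(gammas):
    """Normalize absolute BN scale factors so that they sum to 1."""
    total = sum(abs(g) for g in gammas)
    return [abs(g) / total for g in gammas]


def nam_channel_attention(features, gammas, betas, eps=1e-5):
    """Gate each channel by a sigmoid of its weighted, normalized activations.

    features: list of channels, each a flat list of activations.
    gammas, betas: per-channel BN scale and shift (assumed already learned).
    """
    weights = nam_channel_weights(gammas)
    gated = []
    for chan, g, b, w in zip(features, gammas, betas, weights):
        mean = sum(chan) / len(chan)
        var = sum((v - mean) ** 2 for v in chan) / len(chan)
        # Batch-norm style normalization with the channel's own statistics.
        normed = [g * (v - mean) / math.sqrt(var + eps) + b for v in chan]
        # Channels with a larger scale factor get a sharper gate.
        gate = [1.0 / (1.0 + math.exp(-w * n)) for n in normed]
        gated.append([v * s for v, s in zip(chan, gate)])
    return gated
```

In this sketch a channel whose scale factor is zero receives a constant gate of 0.5, while channels with large scale factors are modulated strongly, which is the sense in which the normalization statistics calibrate attention.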
