An object detection model AAPW-YOLO for UAV remote sensing images based on adaptive convolution and reconstructed feature fusion
- PMID: 40346071
- PMCID: PMC12064822
- DOI: 10.1038/s41598-025-00239-4
An object detection model AAPW-YOLO for UAV remote sensing images based on adaptive convolution and reconstructed feature fusion
Abstract
In small object detection scenarios such as UAV aerial imagery and remote sensing, the difficulties in feature extraction are primarily due to challenges such as small object size, multi-scale variations, and background interference. To overcome these challenges, this paper presents a model for detecting small objects, AAPW-YOLO, based on adaptive convolution and reconstructed feature fusion. In the AAPW-YOLO model, we improve the standard convolution and the CSP Bottleneck with 2 Convolutions (C2f) structure in the You Only Look Once v8 (YOLOv8) backbone network by using Alterable Kernel Convolution (AKConv), which improves the network's proficiency in capturing features across various scales while considerably lowering the model's parameter count. Additionally, we introduce the Attentional Scale Sequence Fusion P2 (ASFP2) structure, which enhances the feature fusion mechanism of the Attentional Scale Sequence Fusion You Only Look Once (ASF-YOLO) and incorporates a P2 detection layer. This optimizes the feature fusion mechanism in the YOLOv8 neck, enhancing the network's ability to capture both fine details and global contextual information, while additionally decreasing the model parameters. Finally, we adopt a gradient-enhancing strategy with the Wise Intersection over Union (Wise-IoU) loss function to balance the gradient contributions from anchor boxes of different qualities during training, thereby improving regression accuracy. Experimental results show that: The proposed detection model reduces the parameter count by 30% and improves mAP@0.5 by 3.6% on the VisDrone2019 dataset; On the DOTA v1.0 dataset, the parameter count is reduced by 30%, with a 2.5% improvement in mAP@0.5. The proposed model achieves high recognition accuracy while having fewer parameters, enhancing the robustness and generalization ability of the network.
Keywords: AKConv; Feature fusion mechanism; Small object detection; Wise-IoU; YOLOv8.
© 2025. The Author(s).
Conflict of interest statement
Declarations. Competing interests: The authors declare no competing interests.
Figures















Similar articles
-
RFAG-YOLO: A Receptive Field Attention-Guided YOLO Network for Small-Object Detection in UAV Images.Sensors (Basel). 2025 Mar 30;25(7):2193. doi: 10.3390/s25072193. Sensors (Basel). 2025. PMID: 40218706 Free PMC article.
-
Fusion of multi-scale attention for aerial images small-target detection model based on PARE-YOLO.Sci Rep. 2025 Feb 8;15(1):4753. doi: 10.1038/s41598-025-88857-w. Sci Rep. 2025. PMID: 39922922 Free PMC article.
-
A multi-scale small object detection algorithm SMA-YOLO for UAV remote sensing images.Sci Rep. 2025 Mar 18;15(1):9255. doi: 10.1038/s41598-025-92344-7. Sci Rep. 2025. PMID: 40102487 Free PMC article.
-
HSF-YOLO: A Multi-Scale and Gradient-Aware Network for Small Object Detection in Remote Sensing Images.Sensors (Basel). 2025 Jul 12;25(14):4369. doi: 10.3390/s25144369. Sensors (Basel). 2025. PMID: 40732496 Free PMC article.
-
OD-YOLO: Robust Small Object Detection Model in Remote Sensing Image with a Novel Multi-Scale Feature Fusion.Sensors (Basel). 2024 Jun 3;24(11):3596. doi: 10.3390/s24113596. Sensors (Basel). 2024. PMID: 38894387 Free PMC article.
References
-
- Sandler, M., Howard, A., Zhu, M., Zhmoginov, A. & Chen, L.-C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4510–4520 (2018).
-
- Ma, N., Zhang, X., Zheng, H.-T. & Sun, J. Shufflenet v2: Practical guidelines for efficient cnn architecture design. In Proceedings of the European Conference on Computer Vision (ECCV). 116–131 (2018).
-
- Han, K. et al. Ghostnet: More features from cheap operations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1580–1589 (2020).
-
- Tan, M. & Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning. 6105–6114 (PMLR, 2019).
-
- Liu, C. et al. Yolc: You only look clusters for tiny object detection in aerial images. In IEEE Transactions on Intelligent Transportation Systems (2024).
Grants and funding
LinkOut - more resources
Full Text Sources