Sensors (Basel). 2023 Aug 15;23(16):7190. doi: 10.3390/s23167190.

UAV-YOLOv8: A Small-Object-Detection Model Based on Improved YOLOv8 for UAV Aerial Photography Scenarios

Gang Wang et al. Sensors (Basel). 2023.

Abstract

Unmanned aerial vehicle (UAV) object detection plays a crucial role in civil, commercial, and military domains. However, the high proportion of small objects in UAV images and the limited platform resources lead to low accuracy in most existing detection models embedded in UAVs, which struggle to strike a good balance between detection performance and resource consumption. To alleviate these problems, we optimize YOLOv8 and propose an object detection model for UAV aerial photography scenarios, called UAV-YOLOv8. First, Wise-IoU (WIoU) v3 is used as the bounding box regression loss; its wise gradient allocation strategy makes the model focus more on common-quality samples, improving the model's localization ability. Second, an attention mechanism called BiFormer is introduced to optimize the backbone network, which improves the model's attention to critical information. Finally, we design a feature processing module named Focal FasterNet block (FFNB) and propose two new detection scales based on this module, which allow shallow and deep features to be fully integrated. The proposed multiscale feature fusion network substantially increases the detection performance of the model and reduces the missed detection rate for small objects. The experimental results show that our model has fewer parameters than the baseline model and a mean detection accuracy 7.7% higher than that of the baseline. Compared with other mainstream models, the overall performance of our model is much better. The proposed method effectively improves the ability to detect small objects. There is still room to improve detection of small, feature-poor objects (such as bicycle-type vehicles), which we will address in subsequent research.
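For concreteness, below is a minimal sketch of the WIoU v3 loss as formulated in the original Wise-IoU paper, not the authors' exact implementation. The running mean iou_mean and the default values of alpha and delta are assumptions that the caller would maintain and tune.

import torch

def wiou_v3(pred, target, iou_mean, alpha=1.9, delta=3.0):
    # pred, target: (..., 4) boxes as (x1, y1, x2, y2).
    # iou_mean: running mean of the IoU loss, maintained by the caller.
    # Intersection and union for the plain IoU loss.
    xi1 = torch.max(pred[..., 0], target[..., 0])
    yi1 = torch.max(pred[..., 1], target[..., 1])
    xi2 = torch.min(pred[..., 2], target[..., 2])
    yi2 = torch.min(pred[..., 3], target[..., 3])
    inter = (xi2 - xi1).clamp(min=0) * (yi2 - yi1).clamp(min=0)
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou_loss = 1.0 - inter / (area_p + area_t - inter).clamp(min=1e-7)
    # WIoU v1: scale the IoU loss by a distance-based focusing term whose
    # enclosing-box denominator is detached from the gradient graph.
    xc1 = torch.min(pred[..., 0], target[..., 0])
    yc1 = torch.min(pred[..., 1], target[..., 1])
    xc2 = torch.max(pred[..., 2], target[..., 2])
    yc2 = torch.max(pred[..., 3], target[..., 3])
    enclose2 = ((xc2 - xc1) ** 2 + (yc2 - yc1) ** 2).detach().clamp(min=1e-7)
    cx_p = (pred[..., 0] + pred[..., 2]) / 2
    cy_p = (pred[..., 1] + pred[..., 3]) / 2
    cx_t = (target[..., 0] + target[..., 2]) / 2
    cy_t = (target[..., 1] + target[..., 3]) / 2
    dist2 = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    loss_v1 = torch.exp(dist2 / enclose2) * iou_loss
    # WIoU v3: a non-monotonic focusing coefficient built from the "outlier
    # degree" beta down-weights both very easy and very hard samples, so
    # gradients concentrate on common-quality samples.
    beta = iou_loss.detach() / iou_mean
    r = beta / (delta * alpha ** (beta - delta))
    return r * loss_v1

In practice, iou_mean would be updated with an exponential moving average of iou_loss over training, so that beta, and with it the gradient allocation, adapts as the model improves.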

Keywords: BiFormer; FasterNet; UAVs; WIoU; YOLOv8; small-object detection.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
The network structure of YOLOv8. The w (width) and r (ratio) parameters control the sizes of the feature maps; the size of the model can be adjusted by setting the values of w and r to meet the needs of different application scenarios (a minimal scaling sketch follows the figure list).
Figure 2
The overall structure of the proposed improved model.
Figure 3
Schematic diagram of the parameters of the loss function.
Figure 4
(a) Structure of the Bi-Level Routing Attention; (b) structure of the BiFormer block.
Figure 5
Comparison of DWConv and PConv. (a) Structure diagram of depthwise convolution; (b) structure diagram of partial convolution (a PConv sketch follows the figure list).
Figure 6
Comparison of the FasterNet block and the Focal FasterNet block. (a) Structure diagram of the FasterNet block; (b) structure diagram of our proposed module.
Figure 7
(a) Detection head of YOLOv8; (b) detection head of our proposed method.
Figure 8
Some representative images from the VisDrone2019 dataset. (a) Sparse object distribution; (b) dense object distribution; (c) low number of objects; (d) high number of objects; (e) many types of objects; (f) very small objects; (g) morning; (h) evening; (i) night.
Figure 9
Information about the manual labeling of objects in the VisDrone2019 dataset.
Figure 10
(a) Training curves of UAV-YOLOv8s and YOLOv8s for mAP; (b) training curves of UAV-YOLOv8s and YOLOv8s for precision; (c) training curves of UAV-YOLOv8s and YOLOv8s for recall.
Figure 11
(a) Confusion matrix of YOLOv8s; (b) confusion matrix of our model.
Figure 12
Inference results of three different models on the VisDrone2019 dataset. (a) Inference results of YOLOv5s; (b) inference results of YOLOv8s; (c) inference results of our model.
Figure 13
(a) Original images; (b) heat maps of YOLOv8s; (c) heat maps of our model.
Figure 14
Examples of detection results on our self-made dataset.
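As a companion to the Figure 1 caption, the sketch below illustrates how the w and r multipliers set YOLOv8's per-stage channel widths. It is a simplified reading of the released YOLOv8 n/s/m/l/x variants; BASE_CHANNELS and the helper name are illustrative, assuming r widens only the deepest backbone stage.

# Per-stage base widths, a simplified stand-in for the values in the
# public YOLOv8 configuration files (stem through stage 5, before scaling).
BASE_CHANNELS = [64, 128, 256, 512, 512]

VARIANTS = {  # model size: (w, r), following the released n/s/m/l/x variants
    "n": (0.25, 2.0),
    "s": (0.50, 2.0),
    "m": (0.75, 1.5),
    "l": (1.00, 1.0),
    "x": (1.25, 1.0),
}

def scaled_channels(w: float, r: float) -> list[int]:
    channels = [int(c * w) for c in BASE_CHANNELS]
    channels[-1] = int(BASE_CHANNELS[-1] * w * r)  # r widens only the deepest stage
    return channels

for name, (w, r) in VARIANTS.items():
    print(name, scaled_channels(w, r))
# "n" prints [16, 32, 64, 128, 256]; "s" prints [32, 64, 128, 256, 512]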
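Figures 5 and 6 contrast DWConv with the partial convolution (PConv) on which the Focal FasterNet block is built. As a reference point only, here is a minimal PyTorch sketch of PConv as described in the FasterNet paper; the class name and the default n_div=4 (a quarter of the channels are convolved) follow that paper's description, not the authors' code.

import torch
import torch.nn as nn

class PConv(nn.Module):
    # Partial convolution: convolve only the first dim // n_div channels and
    # pass the rest through untouched, cutting FLOPs and memory access
    # compared with running DWConv over every channel.
    def __init__(self, dim: int, n_div: int = 4, kernel_size: int = 3):
        super().__init__()
        self.dim_conv = dim // n_div
        self.dim_untouched = dim - self.dim_conv
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x1, x2 = torch.split(x, [self.dim_conv, self.dim_untouched], dim=1)
        return torch.cat((self.conv(x1), x2), dim=1)

In the FasterNet block this PConv is followed by two pointwise (1x1) convolutions; the paper's FFNB builds on that block, but its exact wiring is not recoverable from the figure captions alone.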

References

    1. Li Z., Zhang Y., Wu H., Suzuki S., Namiki A., Wang W. Design and Application of a UAV Autonomous Inspection System for High-Voltage Power Transmission Lines. Remote Sens. 2023;15:865. doi: 10.3390/rs15030865. - DOI
    1. Byun S., Shin I.-K., Moon J., Kang J., Choi S.-I. Road Traffic Monitoring from UAV Images Using Deep Learning Networks. Remote Sens. 2021;13:4027. doi: 10.3390/rs13204027. - DOI
    1. Bouguettaya A., Zarzour H., Kechida A., Taberkit A.M. A survey on deep learning-based identification of plant and crop diseases from UAV-based aerial images. Cluster. Comput. 2022;26:1297–1317. doi: 10.1007/s10586-022-03627-x. - DOI - PMC - PubMed
    1. Felzenszwalb P.F., Girshick R.B., McAllester D., Ramanan D. Object Detection with Discriminatively Trained Part-Based Models. IEEE Trans. Pattern Anal. Mach. Intell. 2010;32:1627–1645. doi: 10.1109/TPAMI.2009.167. - DOI - PubMed
    1. Girshick R., Donahue J., Darrell T., Malik J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation; Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition; Columbus, OH, USA. 23–28 June 2014; pp. 580–587.

LinkOut - more resources