Front Plant Sci. 2023 Aug 17;14:1223410. doi: 10.3389/fpls.2023.1223410. eCollection 2023.

TBC-YOLOv7: a refined YOLOv7-based algorithm for tea bud grading detection

Siyang Wang et al. Front Plant Sci.

Abstract

Introduction: Accurate grading identification of tea buds is a prerequisite for automated tea picking based on machine vision systems. However, current target detection algorithms struggle to detect tea bud grades in complex backgrounds. In this paper, an improved YOLOv7-based tea bud grading detection algorithm, TBC-YOLOv7, is proposed.

Methods: The TBC-YOLOv7 algorithm draws on the transformer architecture from natural language processing, integrating into YOLOv7 a transformer module based on the contextual information in the feature map, thereby facilitating self-attention learning and strengthening connections among global features. To fuse feature information at different scales, TBC-YOLOv7 employs a bidirectional feature pyramid network (BiFPN). In addition, coordinate attention is embedded at critical positions in the network to suppress irrelevant background details while focusing on the salient features of tea buds. The SIoU loss function is applied as the bounding box loss function to improve the convergence speed of the network.
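The bidirectional feature pyramid network mentioned above fuses same-resolution feature maps with learnable non-negative weights. As a minimal NumPy sketch of BiFPN-style fast normalized fusion (the function name and values are illustrative, not the paper's implementation):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shape feature maps with learnable non-negative weights,
    BiFPN-style: out = sum(w_i * f_i) / (sum(w_i) + eps)."""
    # ReLU keeps the learnable weights non-negative before normalization.
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    stacked = np.stack([np.asarray(f, dtype=np.float64) for f in features])
    # Weighted sum over the first axis, normalized by the weight total.
    return np.tensordot(w, stacked, axes=1) / (w.sum() + eps)

# Two toy 1x2 "feature maps" fused with equal weights average elementwise.
fused = fast_normalized_fusion([np.array([[1.0, 2.0]]),
                                np.array([[3.0, 6.0]])], [1.0, 1.0])
```

In a real network the weights are trained parameters and the fusion is followed by a convolution; this sketch only shows the normalization step.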

Result: The experiments indicate that TBC-YOLOv7 is effective on all grades of samples in the test set. Specifically, the model achieves precisions of 88.2% and 86.9% for the two grades, with corresponding recalls of 81.0% and 75.9%. The mean average precision reaches 87.5%, 3.4% higher than the original YOLOv7, with average precision of up to 90% for one bud with one leaf. The F1 score reaches 0.83, and the model also outperforms YOLOv7 in terms of parameter count. Finally, the model's detections correlate strongly with the manual annotation results (R² = 0.89), with a root mean square error of 1.54.
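The agreement metrics reported above (RMSE, R², F1) are standard and can be reproduced from paired counts. A minimal pure-Python sketch, where the sample data in the usage lines are illustrative rather than the paper's:

```python
import math

def rmse(actual, predicted):
    """Root mean square error between paired values."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Illustrative: averaging the two per-grade precisions/recalls from the
# abstract gives an F1 close to the reported 0.83.
f1 = f1_score((0.882 + 0.869) / 2, (0.810 + 0.759) / 2)
```

These helpers mirror what libraries such as scikit-learn provide as `mean_squared_error`, `r2_score`, and `f1_score`.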

Discussion: The TBC-YOLOv7 model proposed in this paper exhibits superior performance in visual recognition, indicating that a YOLOv7 model fused with a transformer-style module can achieve higher grading accuracy on densely growing tea buds. This enables grade detection of tea buds in practical scenarios, providing a solution and technical support for the automated picking of tea buds and the judging of their grades.

Keywords: BiFPN; CA; SIoU; YOLOv7; contextual transformer; tea bud grading detection.


Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
YOLOv7 network structure diagram. The basic CBS module appears in three colors, representing different convolutional kernel sizes and strides, denoted by k and s.
Figure 2
Neck components of the TBC-YOLOv7 network.
Figure 3
CoT module structure diagram.
Figure 4
PANet structure diagram (A) and BiFPN structure diagram (B).
Figure 5
SIoU: schematic diagram of the loss function calculation cost.
Figure 6
Coordinate attention mechanism structure diagram.
Figure 7
Tea bud samples under different environmental conditions: intense light (A); weak light (B); a single-target sample (C); multiple dense-target samples (D).
Figure 8
Data augmentation effects on tea images: (A–C) show the results of different augmentation methods. The numbers 0 and 1 label the different grades of tea buds.
Figure 9
Tea bud classification criteria. One bud with one leaf is denoted by “BOL” and one bud with two leaves is denoted by “BTL”.
Figure 10
Convergence curves for the training and validation datasets.
Figure 11
Precision–recall curve (A): the horizontal axis represents recall and the vertical axis represents precision. Confusion matrix (B): “BOL” represents the BOL grade, “BTL” the BTL grade, and “background” the background class. Rows represent the true labels; columns represent the predicted classes.
Figure 12
Heat map visualization: (A1–A3) with fewer bud targets; (B1–B3) with more bud targets. Differently colored areas represent the level of contribution to the detection.
Figure 13
TBC-YOLOv7 detection results: (A) single targets; (B) multiple targets; (C) lower light levels; (D) brighter light levels; (E) dense scenes; (F) targets varying in size.
Figure 14
Comparison and evaluation of tea bud grade detection counts: (A) predicted vs. actual numbers for BTL; (B) predicted vs. actual numbers for BOL; (C) linear regression of the model's predicted values against actual values.
Figure 15
Variation curves of mAP for the TBC-YOLOv7, SSD, Faster RCNN, YOLOv5s, and YOLOv7 models during training.
Figure 16
Visualization of predictions by the five models: (A) Faster RCNN; (B) SSD; (C) YOLOv5s; (D) YOLOv7; (E) TBC-YOLOv7. (A1–E1) show the detection of multiple tea buds under low light conditions. (A2–E2) illustrate the detection of tea buds with multiple targets under weak light conditions. (A3–E3) show dense target recognition under uniform illumination. Differences in detection results for each algorithm are marked with bold yellow boxes.
