Sci Rep. 2025 Aug 15;15(1):29955.
doi: 10.1038/s41598-025-14847-7.

Robust multiclass classification of crop leaf diseases using hybrid deep learning and Grad-CAM interpretability

Sankar Murugesan et al. Sci Rep.

Abstract

The key objective of this study is to propose an effective and accurate deep learning (DL) framework to detect and classify diseases in banana, cherry, and tomato leaves. The performance of multiple pre-trained models is compared against a newly proposed model.

The experiments used a publicly released dataset of healthy and diseased leaves from banana, cherry, and tomato plants. The dataset was uniformly split into training, validation, and test sets to obtain consistent and unbiased model evaluations. Pre-processing steps suited to DL architectures were applied so that every model received the same input. Several state-of-the-art pre-trained models serve as baselines: EfficientNetV2, ConvNeXt, Swin Transformer, and Vision Transformer (ViT). A new hybrid ConvNet-ViT model combines ConvNet layers, for local feature extraction, with ViT layers, for maintaining global context. The classifier's performance was validated with 5-fold cross-validation to guard against overfitting.

The proposed hybrid ConvNet-ViT model achieved a test classification accuracy of 99.29%, outperforming all the pre-trained models it was compared with. This finding shows that combining the local feature learning of ConvNets with the global representation capability of the ViT is effective.

The results show that the hybrid ConvNet-ViT model is an effective and accurate solution for detecting and classifying plant leaf diseases. Because it outperforms the state-of-the-art pre-trained models, it is a solid candidate for practical agricultural use. Fusing ConvNet and transformer frameworks is beneficial for improving classification performance in image-based disease detection.
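The abstract mentions 5-fold cross-validation without describing how the folds were built. As a minimal sketch only, the following shows one common way to generate five folds from a list of class labels. Stratifying by class, the helper name `stratified_k_fold`, the seed, and the toy labels are all assumptions made for illustration; they are not taken from the paper.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold keeps roughly the class balance.

    Hypothetical helper for illustration, not the authors' implementation.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal the samples round-robin into k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)

    # Each fold is held out once for validation; the rest is used for training.
    for held_out in range(k):
        val_idx = sorted(folds[held_out])
        train_idx = sorted(
            i for f, fold in enumerate(folds) if f != held_out for i in fold
        )
        yield train_idx, val_idx


# Toy labels (assumed): three classes with ten images each.
labels = ["class_a"] * 10 + ["class_b"] * 10 + ["class_c"] * 10
splits = list(stratified_k_fold(labels, k=5))
for fold_no, (train_idx, val_idx) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train_idx)} train, {len(val_idx)} validation")
```

With these toy labels each fold holds out 6 samples (2 per class) and trains on the remaining 24. In a real experiment, a model would be trained once per fold and the per-fold scores averaged.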

Keywords: Classification; ConvNet; Deep learning; Hybrid ConvNet-ViT; Plant leaf disease; Vision transformer.

Conflict of interest statement

Declarations. Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Overall architecture of proposed study.
Fig. 2. Sample dataset images of all diseases.
Fig. 3. Sample EfficientNetV2 architecture.
Fig. 4. Sample ConvNeXt architecture.
Fig. 5. Model architecture of Swin transformer.
Fig. 6. Sample ViT model architecture.
Fig. 7. Proposed hybrid ConvNet-ViT.
Algorithm 1. CNN-transformer hybrid classification.
Fig. 8. Performance metric comparison of proposed and pre-trained models.
Fig. 9. Validation accuracy and loss graph of proposed model.
Fig. 10. Confusion matrix generated from genuine test predictions for all nine classes across banana, cherry, and tomato leaf diseases.
Fig. 11. Grad-CAM visualization of leaf diseases for the three plants: (a) BU and CPM, (b) TSL and TSM, (c) TTS and TTM, (d) TTY.
Fig. 12. Accuracy comparison of proposed and other SOTA models.
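
Fig. 7 and Algorithm 1 refer to a hybrid ConvNet-ViT classifier, but this page does not give its layer configuration. The PyTorch sketch below only illustrates the general idea stated in the abstract: convolutional layers extract local features, and transformer encoder layers then model global context over the resulting tokens. The class name `HybridConvNetViT`, every layer size, the 224x224 input resolution, and the use of a class token are assumptions; only the nine output classes come from the Fig. 10 caption.

```python
import torch
import torch.nn as nn


class HybridConvNetViT(nn.Module):
    """Illustrative ConvNet + transformer classifier (not the authors' architecture)."""

    def __init__(self, num_classes=9, embed_dim=64, num_heads=4, depth=2):
        super().__init__()
        # Convolutional stem for local features: 224x224 input -> 14x14 feature map.
        self.stem = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=4, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.Conv2d(32, embed_dim, kernel_size=3, stride=4, padding=1),
            nn.BatchNorm2d(embed_dim),
            nn.ReLU(),
        )
        # Learnable class token and position embeddings (assumed, ViT-style).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, 1 + 14 * 14, embed_dim))
        # Transformer encoder for global context across all spatial tokens.
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim,
            nhead=num_heads,
            dim_feedforward=4 * embed_dim,
            batch_first=True,
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, images):
        features = self.stem(images)                      # (B, C, 14, 14)
        tokens = features.flatten(2).transpose(1, 2)      # (B, 196, C)
        cls = self.cls_token.expand(tokens.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        encoded = self.encoder(tokens)                    # (B, 197, C)
        return self.head(encoded[:, 0])                   # logits from the class token


model = HybridConvNetViT()
model.eval()
with torch.no_grad():
    # Random tensors stand in for a batch of two 224x224 RGB leaf images.
    logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)
```

The forward pass returns one row of nine logits per image. Because the position embedding has a fixed length, this sketch accepts only 224x224 inputs.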
