Front Plant Sci. 2023 Mar 10;14:1117869.
doi: 10.3389/fpls.2023.1117869. eCollection 2023.

Phenotyping grapevine red blotch virus and grapevine leafroll-associated viruses before and after symptom expression through machine-learning analysis of hyperspectral images

Erica Sawyer et al. Front Plant Sci.

Abstract

Introduction: Grapevine leafroll-associated viruses (GLRaVs) and grapevine red blotch virus (GRBV) cause substantial economic losses and concern to North America's grape and wine industries. Fast and accurate identification of these two groups of viruses is key to informing disease management strategies and limiting their spread by insect vectors in the vineyard. Hyperspectral imaging offers new opportunities for virus disease scouting.

Methods: Here we used two machine learning methods, Random Forest (RF) and 3D-Convolutional Neural Network (CNN), to identify and distinguish leaves from red blotch-infected vines, leafroll-infected vines, and vines co-infected with both viruses using spatiospectral information in the visible domain (510-710 nm). We captured hyperspectral images of about 500 leaves from 250 vines at two sampling times during the growing season (a pre-symptomatic stage at veraison and a symptomatic stage at mid-ripening). Concurrently, viral infections were determined in leaf petioles by polymerase chain reaction (PCR)-based assays using virus-specific primers and by visual assessment of disease symptoms.

Results: In binary classification of infected vs. non-infected leaves, the CNN model reached a maximum overall accuracy of 87% versus 82.8% for the RF model. Using the symptomatic dataset lowered the rate of false negatives. In multiclass classification of leaves, the CNN and RF models had maximum accuracies of 77.7% and 76.9%, respectively (averaged across both healthy and infected leaf categories). Both CNN and RF outperformed visual assessment of symptoms by experts when using RGB segmented images. Interpretation of the RF model showed that the most important wavelengths were in the green, orange, and red subregions.
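The binary results above are reported as overall accuracy and false-negative rates. As a minimal sketch of how those quantities relate, the following computes them from true and predicted labels. The label encoding (1 = infected, 0 = non-infected), the function names, and the example labels are illustrative assumptions, not the study's data or code.

```python
def binary_confusion(y_true, y_pred):
    """Count outcomes for labels where 1 = infected and 0 = non-infected."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 0 and pred == 0:
            tn += 1  # non-infected leaf correctly predicted
        elif truth == 0 and pred == 1:
            fp += 1  # non-infected leaf predicted as infected
        elif truth == 1 and pred == 0:
            fn += 1  # infected leaf predicted as non-infected
        else:
            tp += 1  # infected leaf correctly predicted
    return tn, fp, fn, tp


def summarize(tn, fp, fn, tp):
    """Derive overall accuracy and the two error rates from the counts."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tp + tn) / total,
        "false_negative_rate": fn / (fn + tp),
        "false_positive_rate": fp / (fp + tn),
    }


# Made-up labels for ten leaves (assumption, for illustration only).
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]

counts = binary_confusion(y_true, y_pred)
metrics = summarize(*counts)
print(counts, metrics)
```

A lower false-negative rate means fewer infected leaves are missed, which is the effect reported for the symptomatic dataset.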

Discussion: While differentiation between plants co-infected with GLRaVs and GRBV proved to be relatively challenging, both models showed promising accuracies across infection categories.

Keywords: Vitis vinifera L.; convolutional neural network; deep-learning; disease detection; phenomics; random forest; spectroscopy.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

Figure 1
Workflow of the different methodology steps used in this study. The acquisition was done in three vineyards in August and September 2019 and 2022. About 500 leaves were sampled for PCR analysis and imaged using a hyperspectral camera in a dark cabinet. Image pre-processing consisted of segmenting the leaves to extract the pure leaf signal and converting the radiance to reflectance using a white standard. The PCR results were predicted using random forest (RF) and convolutional neural network (CNN) models. A visual assessment by experts was also done. The results were evaluated using accuracy and confusion matrices and interpreted via variable importance rate.
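The radiance-to-reflectance step in the Figure 1 caption can be illustrated with a small sketch. It divides the leaf signal by the white-standard signal band by band. The optional dark-reference subtraction, the function name, and the sample values are assumptions added for illustration; the caption mentions only a white standard.

```python
def to_reflectance(leaf_radiance, white_radiance, dark_radiance=None):
    """Convert one pixel's per-band radiance to reflectance.

    reflectance = (leaf - dark) / (white - dark), band by band.
    The dark reference is an assumption; it defaults to zero, which
    reduces the formula to a plain ratio against the white standard.
    """
    if dark_radiance is None:
        dark_radiance = [0.0] * len(leaf_radiance)
    if not (len(leaf_radiance) == len(white_radiance) == len(dark_radiance)):
        raise ValueError("all spectra must have the same number of bands")

    reflectance = []
    for leaf, white, dark in zip(leaf_radiance, white_radiance, dark_radiance):
        denominator = white - dark
        if denominator <= 0:
            raise ValueError("white standard must be brighter than dark reference")
        reflectance.append((leaf - dark) / denominator)
    return reflectance


# Hypothetical three-band spectra for a leaf pixel and the white standard.
leaf_pixel = [12.0, 30.0, 9.0]
white_pixel = [120.0, 150.0, 90.0]

leaf_reflectance = to_reflectance(leaf_pixel, white_pixel)
print(leaf_reflectance)
```

In practice the same ratio would be applied to every segmented leaf pixel and every band of the hyperspectral cube.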
Figure 2
Examples of RGB segmented leaf images for each category using the reflectance at 525.3 nm, 555.7 nm, and 601.3 nm.
Figure 3
The convolutional neural network architecture used in this study for classification of grapevine disease. Input for the network consists of hyperspectral images of leaves, captured at 40 wavelengths. Data is then fed through two sequences of layers which are designed to extract significant features from each image (convolution), normalize values to minimize computational cost (batch normalization), adapt to nonlinearity of values (ReLU), and combine regional values to reduce the overall size of the dataset to be processed (pooling). A fully connected layer flattens data and updates network weights and parameters to improve prediction capability. A final activation function returns a class label which translates to the predicted virus classification (non-infected, leafroll, red blotch, and co-infected with both viruses).
Figure 4
Confusion matrices of binary classification using the RF (top) and CNN (bottom) models with the entire dataset (left), the pre-symptomatic dataset (center) and the symptomatic dataset (right). The top left corner shows the percentage of non-infected vines that were correctly predicted as non-infected. The top right corner shows the percentage of non-infected vines that were wrongly predicted as infected (false positive). The bottom left corner shows the percentage of infected vines that were wrongly predicted as non-infected (false negative). The bottom right corner shows the percentage of infected vines that were correctly predicted as infected. The pre-symptomatic dataset yields fewer false positives, and the symptomatic dataset yields fewer false negatives and false positives.
Figure 5
Confusion matrices averaged across the five CV folds for data with the entire dataset (left), the pre-symptomatic dataset (center), and the symptomatic dataset (right) using the RF (top) and CNN (bottom) models. The diagonal represents the percentage of each category that was correctly predicted. The three percentages below the top left corner represent the false negatives for each infected category. Two categories, non-infected and leafroll, were best predicted by both types of models. Leafroll was the best-predicted class, with a maximum of 9% false negatives using the RF model for the symptomatic dataset.
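As a rough sketch of how fold-averaged confusion matrices like those in Figure 5 might be produced, the code below converts each fold's counts to row percentages and then averages them cell by cell. Normalizing before averaging is an assumption, as are the function names and the counts, which are made up; only two folds are shown for brevity, whereas the study used five.

```python
CLASSES = ["non-infected", "leafroll", "red blotch", "co-infected"]


def row_normalize(matrix):
    """Turn counts into row percentages so each true class sums to 100."""
    return [[100.0 * cell / sum(row) for cell in row] for row in matrix]


def average_folds(matrices):
    """Average several same-sized matrices cell by cell."""
    n_folds = len(matrices)
    size = len(matrices[0])
    return [
        [sum(m[i][j] for m in matrices) / n_folds for j in range(size)]
        for i in range(size)
    ]


# Made-up counts: rows are true classes, columns are predicted classes.
fold_counts = [
    [[8, 1, 1, 0], [1, 9, 0, 0], [2, 0, 6, 2], [1, 1, 3, 5]],
    [[7, 1, 2, 0], [0, 9, 1, 0], [1, 1, 7, 1], [2, 0, 3, 5]],
]

averaged = average_folds([row_normalize(m) for m in fold_counts])

# The diagonal holds the percentage of each class predicted correctly.
per_class = {name: averaged[i][i] for i, name in enumerate(CLASSES)}
mean_accuracy = sum(per_class.values()) / len(CLASSES)
print(per_class, mean_accuracy)
```

Averaging the diagonal gives one accuracy figure that weights every category equally, which is one way to read an accuracy "averaged across both healthy and infected leaf categories".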
Figure 6
Variable importance (VI) of the RF model. The higher the importance of a band, the higher the contribution of the reflectance of this band to the model. The first row represents the VI using the entire dataset, the second row is for the pre-symptomatic dataset, and the third row is for the symptomatic dataset. The first column represents the VI for the binary classifications. The second column represents the VI for the multiclass classifications.
