Plants (Basel). 2025 Apr 25;14(9):1304.
doi: 10.3390/plants14091304.

Raman and FT-IR Spectroscopy Coupled with Machine Learning for the Discrimination of Different Vegetable Crop Seed Varieties

Stefan M Kolašinac et al. Plants (Basel).

Abstract

The aim of this research is to investigate the potential of Raman and FT-IR spectroscopy, together with linear and non-linear mathematical models, as tools for discriminating between seed varieties of paprika, tomato, and lettuce. After visual inspection of the spectra, pre-processing was applied in the following combinations: (1) smoothing + linear baseline correction + unit vector normalization; (2) smoothing + linear baseline correction + unit vector normalization + full multiplicative scatter correction; (3) smoothing + baseline correction + unit vector normalization + second-order derivative. Pre-processing was followed by Principal Component Analysis (PCA), after which several classification methods were applied: the Support Vector Machines (SVM) algorithm, Partial Least Squares Discriminant Analysis (PLS-DA), and Principal Component Analysis-Quadratic Discriminant Analysis (PCA-QDA). SVM showed the best classification power for both Raman (100.00, 99.37, and 92.71% for lettuce, paprika, and tomato varieties, respectively) and FT-IR spectroscopy (99.37, 92.50, and 97.50% for lettuce, paprika, and tomato varieties, respectively). Moreover, our novel approach of merging Raman and FT-IR spectra significantly contributed to the accuracy of some models, giving results of 100.00, 100.00, and 95.00% for lettuce, tomato, and paprika varieties, respectively. Our results indicate that Raman and FT-IR spectroscopy coupled with machine learning could be a promising tool for the rapid and rational evaluation and management of genetic resources in ex situ and in situ seed collections.
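The three pre-processing combinations, the merging of Raman and FT-IR spectra, and the PCA step can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the abstract does not name the smoothing filter, the baseline method beyond "linear", or the software used, so the moving-average smoother, the endpoint-to-endpoint baseline, the order of the steps, merging by simple concatenation, and all function names below are assumptions. The data are synthetic random numbers standing in for real spectra.

```python
import numpy as np

def smooth(spectra, window=5):
    """Moving-average smoothing along the wavenumber axis (assumed smoother)."""
    kernel = np.ones(window) / window
    return np.apply_along_axis(
        lambda s: np.convolve(s, kernel, mode="same"), 1, spectra
    )

def linear_baseline(spectra):
    """Subtract the straight line joining each spectrum's first and last point
    (one simple form of linear baseline correction; an assumption here)."""
    ramp = np.linspace(0.0, 1.0, spectra.shape[1])
    line = spectra[:, :1] + (spectra[:, -1:] - spectra[:, :1]) * ramp
    return spectra - line

def unit_vector(spectra):
    """Scale every spectrum to unit Euclidean length."""
    return spectra / np.linalg.norm(spectra, axis=1, keepdims=True)

def msc(spectra, reference=None):
    """Multiplicative scatter correction: regress each spectrum on a reference
    (the mean spectrum by default) and remove the fitted offset and slope."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)
        corrected[i] = (s - intercept) / slope
    return corrected

def second_derivative(spectra):
    """Second-order derivative by applying finite differences twice."""
    return np.gradient(np.gradient(spectra, axis=1), axis=1)

def preprocess(spectra, combo):
    """Apply combination 1, 2, or 3 as listed in the abstract."""
    x = np.asarray(spectra, dtype=float)
    x = unit_vector(linear_baseline(smooth(x)))
    if combo == 2:
        x = msc(x)
    elif combo == 3:
        x = second_derivative(x)
    return x

def merge(raman, ftir):
    """Merge the two modalities by concatenating them per sample (assumed)."""
    return np.hstack([raman, ftir])

def pca_scores(x, n_components=2):
    """PCA scores via SVD of the mean-centred data matrix."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic stand-ins: 8 samples, 200 Raman points and 150 FT-IR points.
rng = np.random.default_rng(0)
raman = rng.random((8, 200)) + 1.0
ftir = rng.random((8, 150)) + 1.0

merged = merge(preprocess(raman, 1), preprocess(ftir, 1))
scores = pca_scores(merged)
print(merged.shape, scores.shape)
```

In a full workflow, the PCA scores (or the pre-processed spectra) would then be passed to a classifier, for example scikit-learn's `SVC`, with accuracy estimated on held-out samples.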

Keywords: breeding; merging spectra; support vector machine (SVM); vegetable seed; vibrational spectroscopy.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

Figure 1
Baseline corrected and smoothed Raman spectra of paprika (A), tomato (B), and lettuce (C); AT—Atrakcija, MK—Majska kraljica, GL—Great Lake, LL—Ljubljanska ledenka, PCH—Palanačko čudo, STR—Strižanka, PK—Palanačka kapija, DF—Delfina, CS—Crvena Stena, L—Lider F1, K—King F1, VS—Volovsko srce.
Figure 2
Baseline corrected and smoothed FT-IR spectra of paprika (A), tomato (B), and lettuce (C); AT—Atrakcija, MK—Majska kraljica, GL—Great Lake, LL—Ljubljanska ledenka, PCH—Palanačko čudo, STR—Strižanka, PK—Palanačka kapija, DF—Delfina, CS—Crvena Stena, L—Lider F1, K—King F1, VS—Volovsko srce.
Figure 3
PCA score plots (PC1 vs. PC2) of lettuce varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD. AT—Atrakcija, MK—Majska kraljica, GL—Great Lake, LL—Ljubljanska ledenka.
Figure 4
PCA loading plots of lettuce varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD.
Figure 5
PCA score plots (PC1 vs. PC2) of paprika seed varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD. PCH—Palanačko čudo, STR—Strižanka, PK—Palanačka kapija, DF—Delfina.
Figure 6
PCA loading plots of paprika seed varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD.
Figure 7
PCA score plots (PC1 vs. PC2) of tomato varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD. CS—Crvena Stena, L—Lider F1, K—King F1, VS—Volovsko srce.
Figure 8
PCA loading plots of tomato seed varieties: Raman spectra combined with (A) BC+N, (B) BC+N+MSC, and (C) BC+N+MSC+2nd OD; FT-IR spectra with (D) BC+N, (E) BC+N+MSC, (F) BC+N+MSC+2nd OD.
Figure 9
Preprocessed and merged Raman and FT-IR spectra; (A) lettuce: ATR—Atrakcija, MK—Majska kraljica, GL—Great Lake, LL—Ljubljanska ledenka; (B) tomato: CS—Crvena Stena, L—Lider F1, K—King F1, VS—Volovsko srce; (C) paprika: PCH—Palanačko čudo, STR—Strižanka, PK—Palanačka kapija, DF—Delfina.
Figure 10
PCA score plots (PC1 vs. PC2) of lettuce varieties: Raman+FT-IR spectra combined with BC+N (A), BC+N+MSC (B), and BC+N+MSC+2nd OD (C); tomato varieties: Raman+FT-IR spectra with BC+N (D), BC+N+MSC (E), and BC+N+MSC+2nd OD (F); paprika varieties: Raman+FT-IR spectra with BC+N (G), BC+N+MSC (H), and BC+N+MSC+2nd OD (I). ATR—Atrakcija, MK—Majska kraljica, GL—Great Lake, LL—Ljubljanska ledenka; CS—Crvena Stena, L—Lider F1, K—King F1, VS—Volovsko srce; PCH—Palanačko čudo, STR—Strižanka, PK—Palanačka kapija, DF—Delfina.
Figure 11
Average classification accuracy for lettuce seed varieties: PLS-DA (A), PCA-QDA (B), SVM (C); paprika seed varieties: PLS-DA (D), PCA-QDA (E), SVM (F); tomato seed varieties: PLS-DA (G), PCA-QDA (H), SVM (I). Different lowercase letters within each panel (A–I) indicate statistically significant differences (p < 0.05).
Figure 12
Seed material of different varieties (first row: paprika; second row: lettuce; third row: tomato).