Sensors (Basel). 2023 Jun 24;23(13):5864.

doi: 10.3390/s23135864.

Application of Machine Learning Algorithms to Classify Peruvian Pisco Varieties Using an Electronic Nose

Celso De-La-Cruz¹, Jorge Trevejo-Pinedo², Fabiola Bravo², Karina Visurraga², Joseph Peña-Echevarría², Angela Pinedo², Freddy Rojas¹, María R Sun-Kou²

Affiliations

¹ Department of Engineering, Pontifical Catholic University of Peru, Lima 15088, Peru.
² Department of Science, Pontifical Catholic University of Peru, Lima 15088, Peru.

PMID: 37447715
PMCID: PMC10347005
DOI: 10.3390/s23135864

Celso De-La-Cruz et al. Sensors (Basel). 2023.

Abstract

Pisco is an alcoholic beverage obtained from grape juice distillation. Considered the flagship drink of Peru, it is produced following strict and specific quality standards. In this work, electronic nose measurements of the volatile compounds in pisco were analyzed with machine learning algorithms to differentiate pisco varieties. This differentiation aids in verifying beverage quality against the parameters established in its Designation of Origin. For signal processing, neural network, multiclass support vector machine, and random forest algorithms were implemented in MATLAB. In addition, data augmentation was performed using a proposed procedure based on interpolation-extrapolation. All algorithms trained with augmented data performed better and gave more reliable predictions than those trained with raw data. Among the three algorithms, neural networks achieved the best performance.

Keywords: artificial neural network; beverage quality; electronic nose; gas sensors array; random forest; support vector machine.
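
The abstract names an interpolation-extrapolation procedure for data augmentation but does not spell it out. The following is a minimal sketch of the general idea in feature space, not the authors' MATLAB implementation: a synthetic sample is placed on the line through two measured samples of the same class. The function names, the `extrapolation` parameter, and the toy feature vectors are assumptions made for illustration only.

```python
import random


def augment_class(samples, n_new, extrapolation=0.25, rng=None):
    """Generate n_new synthetic feature vectors from same-class samples.

    Each synthetic vector lies on the line through two randomly chosen
    samples a and b: a + t * (b - a). Values of t in [0, 1] interpolate
    between a and b; values outside that range extrapolate beyond them.
    The `extrapolation` margin is a hypothetical parameter of this sketch.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples per class")
    rng = rng or random.Random()
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(samples, 2)  # two distinct samples of the class
        t = rng.uniform(-extrapolation, 1.0 + extrapolation)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


def augment_dataset(data_by_class, n_new_per_class, extrapolation=0.25, seed=0):
    """Augment every class separately so labels are never mixed."""
    rng = random.Random(seed)
    return {
        label: augment_class(samples, n_new_per_class, extrapolation, rng)
        for label, samples in data_by_class.items()
    }


if __name__ == "__main__":
    # Toy feature vectors (invented values), two classes with three samples each.
    raw = {
        "P1": [[0.10, 0.80], [0.15, 0.90], [0.12, 0.85]],
        "P2": [[0.60, 0.20], [0.70, 0.25], [0.65, 0.30]],
    }
    extra = augment_dataset(raw, n_new_per_class=5)
    for label, vectors in extra.items():
        print(label, len(vectors), "synthetic samples")
```

Pairing samples only within a class keeps every synthetic point labeled consistently with its parents. A small extrapolation margin lets the synthetic data extend slightly beyond the measured samples, which is one plausible reading of the "extrapolation" half of the procedure.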

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Electronic nose. (a) Pisco sample, (b) hydraulic system, (c) sensing chamber, (d) temperature controller, (e) LabVIEW software interface.

**Figure 2**
(a) Platinum electrodes over alumina substrate, (b) gas sensor prepared from a metal oxide (MO_x), (c) arrangement of sensors inside the sensing chamber, (d) schematic representation of the sensor.

**Figure 3**
Interpolation–extrapolation in the feature space for data augmentation. M is the set formed by the points of the blue line.

**Figure 4**
Interpolation–extrapolation in the time domain for data augmentation.

**Figure 5**
Example of the rising voltage response of the first group of sensors in one trial. Legend Pi is the i-th class of pisco variety.

**Figure 6**
Structure of the neural network to classify into 6 classes (varieties and brands).

**Figure 7**
Confusion matrix obtained after training the neural network without using augmented data.

**Figure 8**
Confusion matrix obtained after training the neural network using 2000 augmented data.

**Figure 9**
Results with the first dataset. Accuracy of the prediction of the ANN with the test data after training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy for each case.

**Figure 10**
Results with the first dataset. Accuracy of the MSVM prediction with the test data after training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue).

**Figure 11**
Results with the first dataset. Accuracy of the RF prediction with the test data after training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy for each case.

**Figure 12**
Example of the rising voltage response of the second group of sensors in one trial. Legend Pi is the i-th class of pisco variety.

**Figure 13**
Results with the second dataset. Accuracy of the ANN prediction with the test data. Training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy.

**Figure 14**
Results with the second dataset. Accuracy of the MSVM prediction with the test data after training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue).

**Figure 15**
Results with the second dataset. Accuracy of the RF prediction with the test data. Training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy.

**Figure 16**
Example of the rising voltage response of the third group of sensors in one trial. Legend Pi is the i-th class of pisco variety.

**Figure 17**
Results with the third dataset. Accuracy of the ANN prediction with the test data. Training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy.

**Figure 18**
Results with the third dataset. Accuracy of the MSVM prediction with the test data after training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue).

**Figure 19**
Results with the third dataset. Accuracy of the RF prediction with the test data. Training with different amounts of augmented data (0: red; 100: blue; 500: green; 2000: light blue). Ten trainings were generated for each case. The black line shows the mean accuracy.

**Figure 20**
Results with the third dataset. Accuracy of the ANN prediction with the test data as a function of the variation of the CV parameter. Training with 500 augmented data for each CV value. The red curve indicates the average of the bar and 10 subsequent values.
