J Sci Food Agric. 2025 Jan 30;105(2):1159-1169.
doi: 10.1002/jsfa.13906. Epub 2024 Sep 18.

Machine learning and multiple linear regression models can predict ascorbic acid and polyphenol contents, and antioxidant activity in strawberries

Kazufumi Zushi et al. J Sci Food Agric.

Abstract

Background: Strawberry is a rich source of antioxidants, including ascorbic acid (ASA) and polyphenols, which have numerous health benefits. Antioxidant content and activity are often determined manually using laboratory equipment, which is destructive and time-consuming. This study constructs a prediction model for antioxidant compounds utilizing machine learning (ML) and multiple linear regression based on environmental, plant growth and agronomic fruit quality-related parameters as well as antioxidant levels. These were studied in three farms at two-week intervals during two years of cultivation.

Results: During the ML model screening, artificial neural network (ANN)-boosted models displayed a moderate coefficient of determination (R²) of 0.68-0.78 and a relative root mean square error (RRMSE) of 3.8-4.8% for polyphenols and total ASA levels, as well as a high R² of 0.96 and a low RRMSE of <3.0% for antioxidant activity. Additionally, we developed variable selection models for antioxidant activity, and variables two and five (environmental parameters and leaf length, respectively) were selected with high accuracy. The linear regression analysis between the actual and predicted antioxidant data in the ANN-boosted models revealed a good fit for all parameters in almost all training, validation and test sets. Furthermore, environmental parameters are essential in developing such reliable models.
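
For readers unfamiliar with the two accuracy metrics reported above, the following is a minimal sketch of how R² and RRMSE are commonly computed from actual and predicted values. It assumes the usual textbook definitions (R² as one minus the ratio of residual to total sum of squares, and RRMSE as RMSE divided by the mean of the actual values, expressed as a percentage). The abstract does not state the exact formulas used in the study, and the function names and sample numbers here are purely illustrative, not data from the paper.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rrmse_percent(actual, predicted):
    """Relative RMSE in percent: RMSE divided by the mean of the actual values
    (assumed definition; other normalizations exist)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return 100.0 * rmse / (sum(actual) / n)


# Made-up illustrative values, not measurements from the study.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]

print(f"R2 = {r_squared(actual, predicted):.2f}")
print(f"RRMSE = {rrmse_percent(actual, predicted):.2f}%")
```

Under these definitions, a higher R² and a lower RRMSE both indicate closer agreement between predicted and measured values, which is how the ranges quoted above are usually read.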

Conclusion: We conclude that ANN-boosted, stepwise and double-Lasso regression models can predict antioxidant compounds with enhanced accuracy, and the relevant parameters can be easily acquired on-site without the need for any specific equipment. © 2024 Society of Chemical Industry.

Keywords: Lasso regression; antioxidant compounds; artificial neural network; environmental conditions; stepwise regression.
