Disentangling data dependency using cross-validation strategies to evaluate prediction quality of cattle grazing activities using machine learning algorithms and wearable sensor data

Leonardo Augusto Coelho Ribeiro¹, Tiago Bresolin², Guilherme Jordão de Magalhães Rosa², Daniel Rume Casagrande¹, Marina de Arruda Camargo Danes¹, João Ricardo Rebouças Dórea²

Affiliations

¹ Department of Animal Science, University of Lavras, Lavras, MG 37200-900, Brazil.
² Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI 53706, USA.

PMID: 34223900
PMCID: PMC8418637
DOI: 10.1093/jas/skab206

Leonardo Augusto Coelho Ribeiro et al. J Anim Sci. 2021 Sep 1;99(9):skab206. doi: 10.1093/jas/skab206.

Abstract

Wearable sensors have been explored as an alternative for real-time monitoring of cattle feeding behavior in grazing systems. To evaluate the performance of predictive models such as machine learning (ML) techniques, data cross-validation (CV) approaches are often employed. However, due to data dependencies and confounding effects, poorly performed validation strategies may significantly inflate the prediction quality. In this context, our objective was to evaluate the effect of different CV strategies on the prediction of grazing activities in cattle using wearable sensor (accelerometer) data and ML algorithms. Six Nellore bulls (average live weight of 345 ± 21 kg) had their behavior visually classified as grazing or not-grazing for a period of 15 d. Elastic Net Generalized Linear Model (GLM), Random Forest (RF), and Artificial Neural Network (ANN) were employed to predict grazing activity (grazing or not-grazing) using 3-axis accelerometer data. For each analytical method, three CV strategies were evaluated: holdout, leave-one-animal-out (LOAO), and leave-one-day-out (LODO). Algorithms were trained using similar dataset sizes (holdout: n = 57,862; LOAO: n = 56,786; LODO: n = 56,672). Overall, GLM delivered the worst prediction accuracy (53%) compared with the ML techniques (65% for both RF and ANN), and ANN performed slightly better than RF for LOAO (73%) and LODO (64%) across CV strategies. The holdout yielded the highest nominal accuracy values for all three ML approaches (GLM: 59%, RF: 76%, and ANN: 74%), followed by LODO (GLM: 49%, RF: 61%, and ANN: 63%) and LOAO (GLM: 52%, RF: 57%, and ANN: 57%). With a larger dataset (i.e., more animals and grazing management scenarios), it is expected that accuracy could be increased. Most importantly, the greater prediction accuracy observed for holdout CV may simply indicate a lack of data independence and the presence of carry-over effects from animals and grazing management. 
Our results suggest that generalizing predictive models to unknown (not used for training) animals or grazing management may incur poor prediction quality. The results highlight the need for using management knowledge to define the validation strategy that is closer to the real-life situation, i.e., the intended application of the predictive model.
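As an illustrative sketch (not the authors' code), the three validation strategies can be expressed as index splits over labelled accelerometer records. The `Record` layout, the toy data, and both helper functions are hypothetical names introduced here; the 5-day holdout window follows the Figure 1 caption, and the 6 animals and 15 days follow the abstract.

```python
from collections import namedtuple

# Hypothetical record layout: one labelled accelerometer reading per row.
Record = namedtuple("Record", ["animal", "day", "x", "y", "z", "grazing"])


def leave_one_group_out(records, key):
    """Yield (held_out_group, train_idx, test_idx), holding out one group per fold."""
    groups = sorted({getattr(r, key) for r in records})
    for g in groups:
        test_idx = [i for i, r in enumerate(records) if getattr(r, key) == g]
        train_idx = [i for i, r in enumerate(records) if getattr(r, key) != g]
        yield g, train_idx, test_idx


def holdout_last_days(records, n_last_days):
    """Train on the earlier days and validate on the last n_last_days days."""
    days = sorted({r.day for r in records})
    test_days = set(days[-n_last_days:])
    train_idx = [i for i, r in enumerate(records) if r.day not in test_days]
    test_idx = [i for i, r in enumerate(records) if r.day in test_days]
    return train_idx, test_idx


# Toy data (assumption): 6 animals x 15 days, one placeholder record per animal-day.
records = [
    Record(animal=a, day=d, x=0.0, y=0.0, z=0.0, grazing=(a + d) % 2)
    for a in range(1, 7)
    for d in range(1, 16)
]

loao = list(leave_one_group_out(records, "animal"))  # one fold per animal
lodo = list(leave_one_group_out(records, "day"))     # one fold per day
holdout_train, holdout_test = holdout_last_days(records, n_last_days=5)

# In the holdout split every animal appears on both sides, so animal-specific
# patterns can leak from training into validation.
shared_animals = {records[i].animal for i in holdout_train} & {
    records[i].animal for i in holdout_test
}
```

In this sketch, LOAO and LODO guarantee that the held-out animal or day never contributes training rows, whereas the holdout split shares all animals between the two sets. That shared structure is the kind of dependency the abstract suggests may inflate holdout accuracy.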

Keywords: accelerometer; grazing; machine learning; validation.

Figures

**Figure 1.**
Evaluation of sward height and forage offer (A) and percentage of leaves and stem during the experiment (B). The vertical red line indicates the division of the data into the training set and the validation set (last 5 d).

**Figure 2.**
Raw data distribution sampled from one experimental day for the visually observed grazing (top) and not-grazing (bottom) behavior categories. The X, Y, and Z accelerometer axis values (g-force) are shown in blue, green, and red, respectively.

**Figure 3.**
Prediction accuracy for individually observed grazing and not-grazing behavior in Nellore cattle, using the last 5 d (day by day) as the validation set.
