Sensors (Basel). 2021 Feb 21;21(4):1492. doi: 10.3390/s21041492.

Practices and Applications of Convolutional Neural Network-Based Computer Vision Systems in Animal Farming: A Review

Guoming Li et al.

Abstract

Convolutional neural network (CNN)-based computer vision systems have been increasingly applied in animal farming to improve animal management, but current knowledge, practices, limitations, and solutions of these applications remain to be expanded and explored. The objective of this study is to systematically review applications of CNN-based computer vision systems in animal farming in terms of the five deep learning computer vision tasks: image classification, object detection, semantic/instance segmentation, pose estimation, and tracking. Cattle, sheep/goats, pigs, and poultry were the major farm animal species of concern. In this research, preparations for system development, including camera settings, inclusion of variations in data recordings, choices of graphics processing units, image preprocessing, and data labeling, were summarized. CNN architectures were reviewed based on the computer vision tasks in animal farming. Strategies of algorithm development included distribution of development data, data augmentation, hyperparameter tuning, and selection of evaluation metrics. Judgment of model performance and performance across architectures were discussed. Besides practices for optimizing CNN-based computer vision systems, system applications were also organized by year, country, animal species, and purpose. Finally, recommendations on future research were provided to develop and improve CNN-based computer vision systems for improved welfare, environment, engineering, genetics, and management of farm animals.

Keywords: animal farming; computer vision system; convolutional neural network; deep learning.


Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Figures

Figure 1
Schematic drawing of a computer vision system for monitoring animals.
Figure 2
Some milestone events in the development of artificial neural networks toward convolutional neural networks in computer vision. DBN is deep belief network, GPU is graphics processing unit, DBM is deep Boltzmann machine, CNN is convolutional neural network, LeNet is a CNN architecture proposed by Yann LeCun, AlexNet is a CNN architecture designed by Alex Krizhevsky, VGGNet is Visual Geometry Group CNN, GoogLeNet is an improved LeNet from Google, ResNet is residual network, faster R-CNN is faster region-based CNN, DenseNet is densely connected CNN, and mask R-CNN is mask region-based CNN. The CNN architectures after 2012 are not limited to those shown in this figure; the selected ones are deemed influential and have at least 8000 citations in Google Scholar.
Figure 3
Example illustrations of six computer vision tasks. Semantic and instance segmentation were combined as semantic/instance segmentation because of their many similarities, resulting in the five major computer vision tasks considered throughout the study.
Figure 4
Example illustrations of (a) a convolutional neural network and (b) a convolution with kernel size of 3 × 3, stride of 1, and no padding.
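To make the convolution arithmetic of Figure 4b concrete, the following is a minimal NumPy sketch (not from the original paper) of a single-channel 2D convolution with a 3 × 3 kernel, stride of 1, and no padding; an H × W input therefore yields an (H − 2) × (W − 2) feature map.

    import numpy as np

    def conv2d_valid(image, kernel, stride=1):
        """Single-channel 2D convolution with no padding ('valid' mode)."""
        kh, kw = kernel.shape
        h, w = image.shape
        out_h = (h - kh) // stride + 1
        out_w = (w - kw) // stride + 1
        out = np.zeros((out_h, out_w))
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
                out[i, j] = np.sum(patch * kernel)
        return out

    # A 6 x 6 input convolved with a 3 x 3 kernel (stride 1, no padding) gives a 4 x 4 output.
    image = np.arange(36, dtype=float).reshape(6, 6)
    kernel = np.ones((3, 3)) / 9.0            # simple averaging kernel, for illustration only
    print(conv2d_valid(image, kernel).shape)  # (4, 4)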
Figure 5
Frequency mentioned in the 105 publications corresponding to different camera settings: (a) sampling rate; (b) resolution; (c) camera view; (d) image type; and (e) distance between camera and surface of interest. (a,b,d): a number adjacent to a round bracket is excluded from the range, while a number adjacent to a square bracket is included. (b): 480P is 720 × 480 pixels, 720P is 1280 × 720 pixels, 1080P is 1920 × 1080 pixels, and 2160P is 3840 × 2160 pixels. (d): RGB is red, green, and blue; RGB-D is RGB and depth.
Figure 6
Frequency mentioned in the 105 publications corresponding to different variations included in data recording.
Figure 7
Frequency mentioned in the 105 publications corresponding to number of labeled images.
Figure 8
Frequency of development strategies in the 105 publications. (a) Frequency of the two development strategies; (b) frequency of different training:validation/testing ratios; and (c) frequency of different training:validation:testing ratios.
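As a hedged illustration of the development-data distributions summarized in Figure 8, the sketch below splits a list of labeled image files into training, validation, and testing subsets; the 70:15:15 ratio and the file names are illustrative assumptions, since the reviewed publications used a range of ratios.

    import random

    def split_dataset(items, train_frac=0.70, val_frac=0.15, seed=42):
        """Shuffle and split samples into training, validation, and testing subsets."""
        items = list(items)
        random.Random(seed).shuffle(items)
        n = len(items)
        n_train = int(n * train_frac)
        n_val = int(n * val_frac)
        train = items[:n_train]
        val = items[n_train:n_train + n_val]
        test = items[n_train + n_val:]        # remainder goes to testing
        return train, val, test

    # Example with hypothetical image file names.
    images = [f"pig_{i:04d}.jpg" for i in range(1000)]
    train, val, test = split_dataset(images)
    print(len(train), len(val), len(test))    # 700 150 150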
Figure 9
Frequency of different augmentation strategies in the 105 publications. Other geometric transformations include distorting, translating, shifting, reflecting, shearing, etc.
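A minimal sketch of an augmentation pipeline of the kind counted in Figure 9, written with torchvision (assuming it is installed); the specific transforms and parameter values are illustrative choices, not those of any reviewed study.

    from torchvision import transforms

    # Geometric and photometric augmentations commonly applied to animal images.
    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),                     # mirroring
        transforms.RandomRotation(degrees=15),                      # small rotations
        transforms.ColorJitter(brightness=0.2, contrast=0.2),       # lighting variation
        transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),   # scaling and cropping
        transforms.ToTensor(),
    ])
    # augmented = augment(pil_image)  # applied to a PIL image during training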
Figure 10
Frequency of different datasets mentioned in the 105 publications. PASCAL VOC is PASCAL visual object class. COCO is common objects in context. “Transfer learning” means the publications only mentioned “transfer learning” rather than specified datasets for transfer learning. Other datasets are motion analysis and re-identification set (MARS) and action recognition dataset (UCF101).
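To illustrate the transfer-learning practice behind Figure 10, the sketch below loads an ImageNet-pretrained ResNet-18 from torchvision and replaces its final layer for a hypothetical four-class task; this is one common pattern, not the procedure of any specific reviewed publication.

    import torch.nn as nn
    from torchvision import models

    num_classes = 4                            # hypothetical number of target classes
    model = models.resnet18(pretrained=True)   # ImageNet weights; newer torchvision prefers the weights= argument
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # new classification head

    # Optionally freeze the pretrained backbone and train only the new head.
    for name, param in model.named_parameters():
        if not name.startswith("fc"):
            param.requires_grad = False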
Figure 11
Frequency of values of evaluation metrics in the 105 publications. Metrics include, but are not limited to, accuracy, specificity, recall, precision, average precision, mean average precision, F1 score, and intersection over union. A number adjacent to a round bracket is excluded from the range, while a number adjacent to a square bracket is included.
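The evaluation metrics tallied in Figure 11 can be computed from basic counts and box coordinates; the following is a minimal sketch (not tied to any particular study) of precision, recall, F1 score, and intersection over union (IoU) for axis-aligned bounding boxes.

    def precision_recall_f1(tp, fp, fn):
        """Precision, recall, and F1 from true-positive, false-positive, and false-negative counts."""
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
        return precision, recall, f1

    def iou(box_a, box_b):
        """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
        x1 = max(box_a[0], box_b[0])
        y1 = max(box_a[1], box_b[1])
        x2 = min(box_a[2], box_b[2])
        y2 = min(box_a[3], box_b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union else 0.0

    print(precision_recall_f1(tp=90, fp=10, fn=20))  # (0.9, 0.818..., 0.857...)
    print(iou((0, 0, 10, 10), (5, 5, 15, 15)))       # 0.1428...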
Figure 12
Number of publications based on (a) year, (b) country, (c) animal species, and (d) purpose. One publication can include multiple countries and animal species. “Others” are Algeria, Philippines, Nigeria, Italy, France, Turkey, New Zealand, Indonesia, Egypt, Spain, Norway, Saudi Arabia, and Switzerland.

