Ecol Evol. 2020 Sep 16;10(19):10374-10383. doi: 10.1002/ece3.6692. eCollection 2020 Oct.

Improving the accessibility and transferability of machine learning algorithms for identification of animals in camera trap images: MLWIC2

Michael A Tabak et al. Ecol Evol.

Abstract

Motion-activated wildlife cameras (or "camera traps") are frequently used to remotely and noninvasively observe animals. The vast number of images collected from camera trap projects has prompted some biologists to employ machine learning algorithms to automatically recognize species in these images, or at least filter out images that do not contain animals. These approaches are often limited by model transferability, as a model trained to recognize species from one location might not work as well for the same species in different locations. Furthermore, these methods often require advanced computational skills, making them inaccessible to many biologists. We used 3 million camera trap images from 18 studies in 10 states across the United States of America to train two deep neural networks, one that recognizes 58 species, the "species model," and one that determines if an image is empty or if it contains an animal, the "empty-animal model." Our species model and empty-animal model had accuracies of 96.8% and 97.3%, respectively. Furthermore, the models performed well on some out-of-sample datasets, as the species model had 91% accuracy on species from Canada (accuracy range 36%-91% across all out-of-sample datasets) and the empty-animal model achieved an accuracy of 91%-94% on out-of-sample datasets from different continents. Our software addresses some of the limitations of using machine learning to classify images from camera traps. By including many species from several locations, our species model is potentially applicable to many camera trap studies in North America. We also found that our empty-animal model can facilitate removal of images without animals globally.
We provide the trained models in an R package (MLWIC2: Machine Learning for Wildlife Image Classification in R), which contains Shiny Applications that allow scientists with minimal programming experience to use trained models and train new models in six neural network architectures with varying depths.

Keywords: R package; computer vision; deep convolutional neural networks; image classification; machine learning; motion‐activated camera; remote sensing; species identification.


Conflict of interest statement

The authors have no conflicts of interest to declare.

Figures

FIGURE 1
Within sample validation of the species model revealed high recall and precision for most species. Median values across datasets are presented along with 95% confidence intervals. The number of datasets for each species is included in the circle next to the species name (circle sizes are proportional to the number of datasets containing each species)
FIGURE 2
Species model out‐of‐sample validation revealed variable recall and precision rates across species. Median values across datasets are presented along with 95% confidence intervals. The number of datasets for each species is included in the circle next to the species name
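The per-species recall and precision summarized in Figures 1 and 2 can be derived from paired true/predicted labels. The following is a generic illustration of that computation; the function name and toy data are hypothetical and not taken from the paper or the MLWIC2 package:

```python
from collections import Counter

def per_species_metrics(true_labels, predicted_labels):
    """Compute recall and precision for each species from paired label lists."""
    tp = Counter()  # images of a species correctly identified
    fn = Counter()  # images of a species missed (lowers recall)
    fp = Counter()  # images wrongly assigned to a species (lowers precision)
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fn[truth] += 1
            fp[pred] += 1
    species = set(true_labels) | set(predicted_labels)
    return {
        s: {
            "recall": tp[s] / (tp[s] + fn[s]) if (tp[s] + fn[s]) else None,
            "precision": tp[s] / (tp[s] + fp[s]) if (tp[s] + fp[s]) else None,
        }
        for s in species
    }

# Toy example: four images, two species
truth = ["deer", "deer", "coyote", "coyote"]
pred = ["deer", "coyote", "coyote", "coyote"]
metrics = per_species_metrics(truth, pred)
# deer: recall 0.5 (one of two deer images found), precision 1.0
# coyote: recall 1.0, precision 2/3 (one deer image mislabeled coyote)
```

In practice these values would be computed per dataset and then summarized (the figures report medians with 95% intervals across datasets).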
FIGURE 3
Models became more generalizable (i.e., out‐of‐sample accuracy increased) as the number of datasets used to train the model increased. Points represent median accuracy across out‐of‐sample datasets and lines connect the minimum and maximum of the 95% quantiles for accuracy values across these datasets
FIGURE 4
Proposed workflow for using MLWIC2 models when classifying camera trap images
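The two-stage workflow of Figure 4, first filtering empty images with the empty-animal model and then classifying the remainder with the species model, can be sketched as below. The model callables here are hypothetical stand-ins for the trained networks, not the MLWIC2 API:

```python
def classify_images(images, empty_animal_model, species_model):
    """Two-stage pipeline: discard empty frames, then identify species.

    Both models are assumed to be callables returning a
    (label, confidence) pair; they stand in for the trained networks.
    """
    results = {}
    for img in images:
        label, conf = empty_animal_model(img)
        if label == "empty":
            # Image contains no animal; skip species classification
            results[img] = ("empty", conf)
        else:
            results[img] = species_model(img)
    return results

# Toy stand-in models for demonstration only
empty_model = lambda img: ("empty", 0.99) if "blank" in img else ("animal", 0.95)
species_model = lambda img: ("deer", 0.90)

out = classify_images(["blank_001.jpg", "img_002.jpg"], empty_model, species_model)
# out == {"blank_001.jpg": ("empty", 0.99), "img_002.jpg": ("deer", 0.90)}
```

Running the cheap empty/animal filter first means the species model only processes images likely to contain animals, which is the efficiency argument the workflow makes.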

