PLoS One. 2020 Apr 15;15(4):e0228059.
doi: 10.1371/journal.pone.0228059. eCollection 2020.

Towards a fully automated surveillance of well-being status in laboratory mice using deep learning: Starting with facial expression analysis

Niek Andresen et al. PLoS One.

Abstract

Assessing the well-being of an animal is hindered by the limits of communication between humans and animals. Instead of direct communication, a variety of parameters is used to evaluate an animal's well-being. Especially in biomedical research, scientifically sound tools to assess pain, suffering, and distress in experimental animals are in high demand for ethical and legal reasons. For mice, the most commonly used laboratory animals, a valuable tool is the Mouse Grimace Scale (MGS), a coding system for facial expressions of pain in mice. We aim to develop a fully automated system for the surveillance of post-surgical and post-anesthetic effects in mice. Our work introduces a semi-automated pipeline as a first step towards this goal. A new data set of images of freely moving black-furred laboratory mice is used and provided. Images were obtained after anesthesia (with isoflurane or a ketamine/xylazine combination) and surgery (castration). We deploy two pre-trained state-of-the-art deep convolutional neural network (CNN) architectures (ResNet50 and InceptionV3) and compare them with a third CNN architecture without pre-training. Depending on the particular treatment, we achieve an accuracy of up to 99% for the recognition of the absence or presence of post-surgical and/or post-anesthetic effects on the facial expression.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Image of the data set and observation cage.
Example image of a black-furred laboratory mouse (C57BL/6JRj strain) of the dataset (left). Observation cage used for monitoring the mice after the procedures (right). Images of the mice were taken when mice were moving freely around in the observation cage.
Fig 2. Box plots of the human-evaluated Mouse Grimace Scale.
Isoflurane anesthesia (IN, left), ketamine/xylazine anesthesia (KXN, middle) and castration (C, right). IN: Scores were obtained from 30 female and 31 male C57BL/6JRj mice. KXN: Scores were obtained from 29 female and 32 male C57BL/6JRj mice. C: Scores were obtained from 19 male C57BL/6JRj mice. Data represent the mean MGS scores averaged over three human scorers. The box represents the interquartile range (IQR), box edges are the 25th and 75th percentile. The whiskers represent values which are no greater than 1.5 × IQR. Outliers were excluded from the figure. This figure contains data from Hohlbaum et al. [30, 31].
Fig 3. Own network architecture.
The image is fed through three convolutional layers with filter size 3x3 and 32, 32 and 64 filters. Two fully connected layers with 128 neurons each follow. The two output neurons give a confidence in either judgment (post-anesthetic/surgical effect or not).
Fig 4. Ground truth distribution of the “post-anesthetic/surgical effect” and “no post-anesthetic/surgical effect” class.
Isoflurane anesthesia (IN, left), ketamine/xylazine anesthesia (KXN, middle), and castration (C, right). The box represents the interquartile range (IQR), box edges are the 25th and 75th percentile. The whiskers represent values which are no greater than 1.5 × IQR. Outliers were excluded from the figure.
Fig 5. Network confidence over time for isoflurane anesthesia.
Box plots of human-labeled Mouse Grimace Scale (MGS) scores (grey) and confidence for the “post-anesthetic/surgical effect” class of the ResNet architecture (blue) for isoflurane anesthesia (IN). Scores were obtained from 33 female and 32 male C57BL/6JRj mice. MGS data represent the mean MGS scores averaged over three human scorers. The box represents the interquartile range (IQR), box edges are the 25th and 75th percentile. The whiskers represent values which are no greater than 1.5 × IQR. Outliers were excluded from the figure. This figure contains data from Hohlbaum et al. [30].
Fig 6. Network confidence over time for ketamine/xylazine anesthesia.
Box plots of human-labeled Mouse Grimace Scale (MGS) scores (grey) and confidence for the “post-anesthetic/surgical effect” class of the ResNet architecture (blue) for ketamine/xylazine anesthesia (KXN). Scores were obtained from 28 female and 30 male C57BL/6JRj mice. MGS data represent the mean MGS scores averaged over four human scorers. The box represents the interquartile range (IQR), box edges are the 25th and 75th percentile. The whiskers represent values which are no greater than 1.5 × IQR. Outliers were excluded from the figure. This figure contains data from Hohlbaum et al. [31].
Fig 7. Network confidence over time for castration.
Box plots of human-labeled Mouse Grimace Scale (MGS) scores (grey) and confidence for the “post-anesthetic/surgical effect” class of the ResNet architecture (blue) for castration (C). Scores were obtained from 19 male C57BL/6JRj mice. MGS data represent the mean MGS scores averaged over two human scorers. The box represents the interquartile range (IQR), box edges are the 25th and 75th percentile. The whiskers represent values which are no greater than 1.5 × IQR. Outliers were excluded from the figure.
Fig 8. Performance for different combinations of training and test datasets after 50 epochs of training.
These results are based on a subset of the data available for KXN. The subset has the same size as the C and IN sets and allows a fair comparison of the values. Data are given as mean accuracy (± standard deviation) in %. IN: isoflurane anesthesia; KXN: ketamine/xylazine anesthesia; C: castration; SO: subject overlap (the term subject is used as a synonym for mouse).
Fig 9. Visualization of the decision finding process using deep Taylor decomposition.
Castration (left), ketamine/xylazine anesthesia (middle), and isoflurane anesthesia (right). Mice correctly classified as “post-anesthetic/surgical effect” are shown in the top row, mice correctly classified as “no post-anesthetic/surgical effect” in the bottom row. Red color indicates that a pixel contributes to the decision.
Fig 10. Contribution of nose, whisker pad, and ears to the decision making.
Visualization of the decision finding process using deep Taylor decomposition for images generated 30 min (top row) and 2 days (bottom row) after ketamine/xylazine anesthesia. Original image (left), original image combined with heat map (middle), heat map (right). Red color indicates that a pixel contributes to the decision. Top row: The mouse was correctly classified as “post-anesthetic/surgical effect” with a confidence of 96.2%. In particular, the nose and the whisker pad seem to contribute to the decision. Bottom row: The mouse was correctly classified as “no post-anesthetic/surgical effect” with a confidence of 100.0%. The decision appears to be mainly based on the ears.
Fig 11. Contribution of piloerection and space between whiskers to the decision making.
Visualization of the decision finding process using deep Taylor decomposition for an image generated 150 min after ketamine/xylazine anesthesia. Original image (left), original image combined with heat map (middle), heat map (right). Red color indicates that a pixel contributes to the decision. Piloerection and the space between the whiskers seem to play a role in the decision-making process. The mouse was correctly classified as “post-anesthetic/surgical effect” with a confidence of 99.6%.
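
The box-plot convention repeated in the captions of Figs 2 and 4 to 7 (box edges at the 25th and 75th percentiles, whiskers limited to 1.5 × IQR, outliers left out) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' plotting code: the function name `box_plot_stats`, the inclusive quantile method, the reading of the whisker rule as Tukey-style fences at 1.5 × IQR beyond the box edges, and the example scores are all assumptions made here.

```python
from statistics import quantiles


def box_plot_stats(scores):
    """Summarize `scores` the way a Tukey-style box plot would draw them.

    Assumption: whiskers reach the most extreme values that lie within
    1.5 x IQR of the box edges; anything beyond counts as an outlier.
    """
    data = sorted(scores)
    # Quartiles: 25th percentile, median, 75th percentile.
    # The "inclusive" method is an assumption; the captions do not name one.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Made-up illustrative values, NOT data from the paper.
    example_scores = [0.0, 0.25, 0.5, 0.5, 0.75, 1.0, 1.0, 1.25, 2.0]
    for name, value in box_plot_stats(example_scores).items():
        print(f"{name}: {value}")
```

With the illustrative values above, the single score of 2.0 lies beyond the upper fence of 1.75, so a figure that excludes outliers would not show it and the upper whisker would end at 1.25.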
