Front Syst Neurosci. 2019 May 7;13:20. doi: 10.3389/fnsys.2019.00020. eCollection 2019.

DeepBehavior: A Deep Learning Toolbox for Automated Analysis of Animal and Human Behavior Imaging Data

Ahmet Arac et al. Front Syst Neurosci. 2019.

Abstract

Detailed behavioral analysis is key to understanding the brain-behavior relationship. Here, we present deep learning-based methods for analysis of behavior imaging data in mice and humans. Specifically, we use three different convolutional neural network architectures and five different behavior tasks in mice and humans and provide detailed instructions for rapid implementation of these methods for the neuroscience community. We provide examples of three-dimensional (3D) kinematic analysis in the food pellet reaching task in mice, the three-chamber test in mice, the social interaction test in freely moving mice with simultaneous miniscope calcium imaging, and 3D kinematic analysis of two upper extremity movements in humans (reaching and alternating pronation/supination). We demonstrate that the transfer learning approach accelerates training of the network when using images from these types of behavior video recordings. We also provide code for post-processing of the data after initial analysis with deep learning. Our methods expand the repertoire of available deep learning tools for behavior analysis by providing detailed instructions on implementation, applications in several behavior tests, and annotated post-processing code for detailed behavior analysis. Moreover, our methods in human motor behavior can be used in the clinic to assess motor function during recovery after an injury such as stroke.
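The transfer-learning approach mentioned above can be illustrated with a minimal sketch: initialize a convolutional network from weights pretrained on a large image dataset, then fine-tune it on labeled behavior video frames. The snippet below uses a torchvision ResNet-18 with a box-regression head purely for illustration; it is not the toolbox's actual training code, which relies on dedicated detection networks (e.g., YOLO v3, as in Figure 4).

```python
# A minimal transfer-learning sketch, assuming a regression head that predicts
# bounding-box coordinates from video frames. Illustrative only; the toolbox's
# own networks and training scripts differ.
import torch
import torch.nn as nn
from torchvision import models

# start from ImageNet-pretrained weights instead of random initialization
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# replace the classifier with a small regression head, e.g., (x, y, w, h)
model.fc = nn.Linear(model.fc.in_features, 4)

# fine-tune all layers at a small learning rate on labeled behavior frames
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()  # regression loss on box coordinates
```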

Keywords: behavior analysis; deep learning; human kinematics; motor behavior; social behavior.

Figures

Figure 1
3D paw kinematics and three-chamber social behavior analysis in mice. (A) Schematic of the skilled food pellet reaching task in head-fixed mice. This setup allows simultaneous two-photon (2P) calcium imaging or electrophysiological recordings. (B) An example of the data processing: raw video frames are fed to the deep learning algorithm to obtain the coordinates of a bounding box around the paw. (C) Representative images showing detected paws from two camera views with the paw in different positions. (D) 3D trajectory of a single reaching attempt, obtained from the 2D coordinates of the paw positions in the two camera views. (E) Kinematic parameters such as velocity-time graphs can be obtained from the 3D trajectories. (F) Raw video frame of the three-chamber test. (G) Representative analysis showing detection of the head of the mouse (red circle), the cup containing a stranger mouse (green circle), and the empty cup (blue circle). (H) Trajectory of the mouse seen in Video-3.
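A minimal sketch of the reconstruction and kinematics in panels (D,E), assuming calibrated cameras with known 3x4 projection matrices; the names here (P1, P2, pts_cam1, pts_cam2) are illustrative, not from the toolbox:

```python
# 3D reconstruction from matched 2D paw positions in two calibrated camera views
# (Figure 1D), and speed over time from the resulting trajectory (Figure 1E).
import numpy as np
import cv2

def reconstruct_3d(P1, P2, pts_cam1, pts_cam2):
    """P1, P2: (3, 4) projection matrices; pts_cam1, pts_cam2: (N, 2) pixel
    coordinates of the paw in each view. Returns (N, 3) 3D coordinates."""
    pts4d = cv2.triangulatePoints(P1, P2,
                                  pts_cam1.T.astype(np.float64),
                                  pts_cam2.T.astype(np.float64))
    return (pts4d[:3] / pts4d[3]).T  # homogeneous -> Euclidean

def speed(traj3d, fps):
    """Frame-to-frame speed of an (N, 3) trajectory sampled at fps."""
    return np.linalg.norm(np.diff(traj3d, axis=0), axis=1) * fps
```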
Figure 2
Transfer learning results in faster and more reliable training. (A) Training results on the test dataset. As the training dataset size increases from 10 to 2,065 images, the regression and confidence losses decrease and the accuracy (IoU: intersection over union) increases. For each training dataset size, transfer learning yields more accurate training than random initialization (lower regression and confidence losses, higher accuracy). (B) Training results on the training dataset. Note that the transfer learning curves in the 2,065-image group converge faster and stay stable throughout training, indicating more reliable training.
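The IoU accuracy in panel (A) compares a predicted bounding box against the labeled one; a minimal sketch for axis-aligned boxes given as (x1, y1, x2, y2):

```python
# Intersection over union (IoU) for two axis-aligned bounding boxes.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)   # overlap rectangle corners
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```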
Figure 3
Proposed workflow for processing the raw behavior videos and training the network. After pre-processing of the videos and initial training of the network, an iterative training algorithm selects images with high variability, resulting in more generalizable training for that dataset. Once the videos have been processed by the trained network, the outputs can be used for post-processing.
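One way to read the iterative step is as uncertainty-based selection: frames on which the current network is least confident are labeled and added to the training set. The sketch below is a loose interpretation written as a higher-order function; the detect, label, and train callbacks are caller-supplied placeholders, not toolbox functions.

```python
# A hypothetical sketch of the iterative training loop in Figure 3.
def iterative_training(model, unlabeled, labeled, detect, label, train,
                       rounds=3, batch=100):
    """detect(model, frame) -> confidence score; label(frame) -> labeled example;
    train(model, labeled) -> updated model. All three are caller-supplied."""
    for _ in range(rounds):
        # frames the current model is least confident about are most informative
        unlabeled.sort(key=lambda frame: detect(model, frame))
        hard, unlabeled = unlabeled[:batch], unlabeled[batch:]
        labeled += [label(frame) for frame in hard]
        model = train(model, labeled)
    return model
```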
Figure 4
Detection of two mice separately during social interaction and post-processing of the data. (A) Representative raw video frame of two mice interacting in a 45 × 45 cm chamber. Of note, one of the mice has a miniscope mounted on its head for calcium imaging of neuronal activity. (B) Separate detection of the two mice (as mouse A and mouse B) using the YOLO v3 CNN. (C) Trajectories of the body positions of the two mice in one interaction session (~7 min). (D) Distance between the two mice over time during the interaction session. The time periods when the two mice are close enough to allow interaction are highlighted in orange. A higher-magnification view of one of these close contacts is shown in the lower panel. (E) Velocity vs. time graphs can be obtained for each mouse throughout the interaction. (F) A representative distance-time graph over one of the close contacts. The distances shown are between the noses, or between the nose of one mouse and the tail of the other. In panel (F), the close contact starts with mouse B sniffing mouse A's rear (shorter tail-B to nose-A distance) but then turns into a nose-to-nose interaction. (G) A representative distance-time graph over another close contact, showing a short nose-to-nose interaction between the two mice.
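A minimal sketch of the distance computation behind panel (D), assuming (N, 2) arrays of body-center coordinates in cm for each mouse; the 5 cm threshold is illustrative, not the value used in the paper:

```python
# Inter-mouse distance over time and close-contact periods (Figure 4D).
import numpy as np

def close_contacts(pos_a, pos_b, fps, threshold_cm=5.0):
    """pos_a, pos_b: (N, 2) body positions of mouse A and mouse B (cm)."""
    dist = np.linalg.norm(pos_a - pos_b, axis=1)   # distance per frame
    close = dist < threshold_cm                    # mask of close-contact frames
    times = np.arange(len(dist)) / fps             # timestamps in seconds
    return times, dist, close
```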
Figure 5
Marker-less detection of human pose with 3D kinematics. (A) Representative images from the stereo camera system, with two cameras at different angles and detected joint poses overlaid on the 2D images. (B) 3D model reconstructed after calibration of the two camera views in panel (A), showing accurate detection of joint positions down to individual finger joints. (C) Superimposed 3D trajectories of 10 air reaches by the subject in panel (A). Velocity vs. time graphs for the right elbow (D) and right wrist (E) during these 10 reaches. (F) Hierarchical clustering of the 10 reaches based on dynamic time-aligned kernels of the 3D trajectories. The numbers indicate the reach number. (G) Shoulder vs. body angles during the 10 reaches, obtained from the 3D positions. (H) Elbow (arm vs. forearm) angles during the 10 reaches. All kinematic parameters (D,E,G,H) were obtained from the 3D model (as seen in panel B).
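Joint angles such as those in panels (G,H) follow directly from the 3D keypoints; a minimal sketch of the elbow angle as the angle between the upper-arm and forearm vectors:

```python
# Elbow angle from 3D shoulder, elbow, and wrist positions (Figure 5H).
import numpy as np

def elbow_angle(shoulder, elbow, wrist):
    """Each argument is a length-3 array of 3D coordinates. Returns degrees."""
    upper_arm = shoulder - elbow
    forearm = wrist - elbow
    cos_theta = np.dot(upper_arm, forearm) / (
        np.linalg.norm(upper_arm) * np.linalg.norm(forearm))
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
```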
Figure 6
Marker-less detection of fine hand/finger movements in 3D. (A) Representative images from the stereo camera system, focused on the hand movements, with detected joint poses in the 2D images. (B) Representative 3D models of both hands and forearms while the subject performs alternating supination/pronation movements. (C) 3D views from different angles of the same hand positions as in panels (A,B). (D) Supination angles (rotation angles of the forearm) of both the right and left hands during nine repetitive movements. (E) Hierarchical clustering based on the Euclidean distance between the supination angle curves of each rotation (supination and pronation) after the curves were aligned by dynamic time warping. Note the distinct clustering of right- and left-hand rotations, as well as the heterogeneity among rotations within each hand.
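A hedged sketch of the clustering in panel (E): pairwise dynamic-time-warping distances between angle curves, followed by average-linkage hierarchical clustering. This uses a plain DTW cost as the distance, a simplification of the alignment-then-Euclidean-distance procedure described in the caption:

```python
# DTW-based hierarchical clustering of supination-angle curves (Figure 6E).
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping cost between 1D curves."""
    cost = np.full((len(a) + 1, len(b) + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[-1, -1]

def cluster_rotations(curves, n_clusters=2):
    """curves: list of 1D angle traces. Returns a cluster label per curve."""
    k = len(curves)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(curves[i], curves[j])
    Z = linkage(squareform(dist), method='average')   # average-linkage tree
    return fcluster(Z, n_clusters, criterion='maxclust')
```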
