bioRxiv [Preprint]. 2024 Jan 19:2024.01.18.576153. doi: 10.1101/2024.01.18.576153.

DAMM for the detection and tracking of multiple animals within complex social and environmental settings

Gaurav Kaul et al.

Abstract

Accurate detection and tracking of animals across diverse environments are crucial for behavioral studies in various disciplines, including neuroscience. Recently, machine learning and computer vision techniques have become integral to the neuroscientist's toolkit, enabling high-throughput behavioral studies. Despite advancements in localizing individual animals in simple environments, the task remains challenging in complex conditions due to intra-class visual variability and environmental diversity. These limitations hinder studies in ethologically relevant conditions, such as when animals are concealed within nests or in obscured environments. Moreover, current tools are laborious and time-consuming to employ, requiring extensive, setup-specific annotation and model training/validation procedures. To address these challenges, we introduce the 'Detect Any Mouse Model' (DAMM), a pretrained object detector for localizing mice in complex environments, capable of robust performance with zero to minimal additional training on new experimental setups. Our approach involves collecting and annotating a diverse dataset that encompasses singly and multi-housed mice in various lighting conditions, experimental setups, and occlusion levels. We utilize the Mask R-CNN architecture for instance segmentation and validate DAMM's performance both with no additional training data (zero-shot inference) and with a few examples for fine-tuning (few-shot inference). DAMM excels in zero-shot inference, detecting mice, and even rats, in entirely unseen scenarios, and further improves with minimal additional training. By integrating DAMM with the SORT algorithm, we demonstrate robust tracking that performs competitively with keypoint-estimation-based methods. Finally, to advance and simplify behavioral studies, we have made DAMM accessible to the scientific community via a user-friendly Python API, shared model weights, and a Google Colab implementation.
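
As a rough illustration of the detection stage described above, the sketch below loads a Mask R-CNN detector with Detectron2 and runs it on a single frame. This is a minimal sketch, not the published DAMM API: the checkpoint path damm_weights.pth, the frame file name, and the score threshold are hypothetical placeholders.

    import cv2
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    # Standard Mask R-CNN instance-segmentation config, pointed at custom weights.
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(
        "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical checkpoint path
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1           # one class: mouse
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # illustrative threshold

    predictor = DefaultPredictor(cfg)
    frame = cv2.imread("frame.png")               # one video frame (BGR)
    outputs = predictor(frame)                    # boxes, masks, scores
    print(len(outputs["instances"]), "mice detected")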

Keywords: Animal behavior; animal detection; animal tracking; computer vision; generalization; instance segmentation; machine learning; neuroscience.

Conflict of interest statement

Competing interests. The authors declare no competing interests.

Figures

Fig. 1: Pipeline for creating the Detect Any Mouse Model (DAMM).
(A) Image dataset collection strategy. Frames were randomly extracted from an extensive archive of 12,500 videos within our laboratory (AER Lab), depicting mice in various behavioral setups. (B) Schematic illustration of the procedure used to generate instance segmentation masks for our pretraining dataset in a cost-effective and time-efficient manner. The schematic depicts the workflow of a graphical user interface we developed, which utilizes the Segment Anything Model (SAM) for dataset annotation. (C) Overview of the object detection approach, illustrating the use of Mask R-CNN, which predicts instance segments for mice within videos. (D) Evaluation of model performance on a test set of 500 examples. Left, COCO-style strict mask precision (IoU > .75). Right, example predictions of instance segmentation on test images. Our final pretraining dataset included 2,200 diverse images, which were used to train the final DAMM.
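
As a minimal sketch of the SAM-assisted annotation step in (B), assuming the segment-anything package: a single foreground click prompt is turned into a candidate instance mask. The checkpoint file name and click coordinates are placeholders, and the GUI built around this step is not shown.

    import cv2
    import numpy as np
    from segment_anything import sam_model_registry, SamPredictor

    # Load a SAM checkpoint (file name is a placeholder).
    sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")
    predictor = SamPredictor(sam)

    image = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
    predictor.set_image(image)

    # One foreground click on the animal (coordinates are illustrative).
    point = np.array([[320, 240]])
    label = np.array([1])  # 1 = foreground click
    masks, scores, _ = predictor.predict(
        point_coords=point, point_labels=label, multimask_output=True)
    best_mask = masks[int(np.argmax(scores))]  # keep the highest-scoring mask
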
Fig. 2: Detection performance evaluation of DAMM.
(A) Schematic representation of detection evaluation procedures for two use cases: one with no further fine-tuning of model parameters (zero-shot) and another that incorporates a limited set of newly annotated examples for fine-tuning the model parameters (few-shot); θ represents model parameters. (B) Mask AP75 evaluation of DAMM across five unique datasets sourced from the AER Lab. The DAMM pretraining dataset may have contained frames from these five video datasets, since both were sourced in-house. Each dataset contains 100 examples, with up to 50 allocated for training and 50 for testing. The mean and standard deviation of Mask AP75 are shown for each dataset across 0-, 20-, and 50-shot scenarios. Results are based on five randomly initialized train-test shuffles. Of note, standard deviation bars that are visually flat denote a standard deviation of 0. (C) Using the same approach as in (B), but for datasets collected outside the AER Lab. These datasets feature experimental setups that DAMM did not encounter during pretraining.
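
The Mask AP75 numbers in (B) and (C) follow standard COCO-style evaluation. Continuing the earlier Detectron2 sketch (cfg and predictor defined there), one hypothetical way to compute them is with Detectron2's COCOEvaluator; the dataset name mice_test is a placeholder for a registered COCO-format test split.

    from detectron2.data import build_detection_test_loader
    from detectron2.evaluation import COCOEvaluator, inference_on_dataset

    # "mice_test" is a placeholder name for a registered COCO-format dataset.
    evaluator = COCOEvaluator("mice_test", output_dir="./eval")
    loader = build_detection_test_loader(cfg, "mice_test")
    results = inference_on_dataset(predictor.model, loader, evaluator)
    print(results["segm"]["AP75"])  # strict mask precision at IoU > .75
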
Fig. 3: Controlled detection evaluation of DAMM.
(A-C) Organization of a controlled evaluation dataset, comprising samples conditioned on three distinct groups: (A) environments (3 types), (B) mouse coat colors (3 colors), and (C) camera types (2 types). From these categories, we generated all possible combinations, resulting in 18 mini-datasets. Each of these 18 mini-datasets contains 70 annotated frames, randomly sampled from a 5-minute video recording corresponding to the specific combination of conditions. (D-F) Mask AP75 performance averaged over all datasets containing the condition of interest, conducted for 0-shot, 5-shot, 10-shot, and 20-shot scenarios. In each scenario, we use up to 20 examples for training and 50 examples for testing.
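
As a small worked example of the factorial design in (A-C), the 18-way condition grid can be enumerated directly; the labels below are illustrative stand-ins, since the caption specifies only the number of levels per group.

    from itertools import product

    environments = ["env_1", "env_2", "env_3"]      # 3 environment types
    coat_colors = ["coat_1", "coat_2", "coat_3"]    # 3 coat colors
    cameras = ["camera_1", "camera_2"]              # 2 camera types

    combos = list(product(environments, coat_colors, cameras))
    assert len(combos) == 18  # 3 x 3 x 2 mini-datasets, 70 frames each
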
Fig. 4: Tracking evaluation of DAMM.
(A-B) Compilation of single-animal and multi-animal tracking evaluation datasets. Each dataset features approximately 1-minute-long videos in which the location and unique identity of every mouse are annotated throughout all frames. (C-D) DAMM is employed as the detection module within the Simple Online and Realtime Tracking (SORT) algorithm to track mice in videos. The evaluation showcases (C) single-object and (D) multi-object tracking accuracy (IoU > .50) of DAMM for both zero-shot and 20-shot scenarios across all tracking datasets. (E) Comparison strategy and performance of DAMM against an existing keypoint-estimation-based mouse tracking method, the DLC SuperAnimal-TopViewMouse model, which outputs keypoint predictions for top-view singly-housed mice. (F) Zero-shot tracking comparison on a subset of our previously introduced datasets that feature top-view singly-housed mice.
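
To make the SORT integration in (C-D) concrete, the sketch below shows the core frame-to-frame association step that SORT applies to detector output: boxes are matched by IoU with the Hungarian algorithm, and matches below the threshold are discarded. This is a simplified illustration with hypothetical box inputs; the full algorithm also includes Kalman-filter motion prediction and track management, omitted here.

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def iou(a, b):
        # Boxes are [x1, y1, x2, y2].
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2, y2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    def associate(track_boxes, det_boxes, iou_threshold=0.5):
        # Match last-known track boxes to new detections by maximal IoU.
        if not track_boxes or not det_boxes:
            return []
        cost = np.array([[1.0 - iou(t, d) for d in det_boxes]
                         for t in track_boxes])
        rows, cols = linear_sum_assignment(cost)  # Hungarian assignment
        return [(r, c) for r, c in zip(rows, cols)
                if 1.0 - cost[r, c] >= iou_threshold]

    # Hypothetical example: two tracks, two detections in the next frame.
    tracks = [[10, 10, 50, 50], [100, 100, 140, 140]]
    dets = [[12, 11, 52, 49], [98, 102, 139, 141]]
    print(associate(tracks, dets))  # -> [(0, 0), (1, 1)]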
