Sci Rep. 2024 Sep 12;14(1):21366. doi: 10.1038/s41598-024-72367-2.

DAMM for the detection and tracking of multiple animals within complex social and environmental settings

Gaurav Kaul et al.

Abstract

Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM's performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice, and even rats, in entirely unseen scenarios, and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking that is competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation.
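
The abstract points to a released Python API, but its exact interface is not shown on this page. As a rough orientation, here is a minimal sketch of what zero-shot inference with a Mask R-CNN looks like in detectron2 (the detector family named above); the config choice and the checkpoint filename damm_weights.pth are assumptions, not DAMM's actual packaging.

```python
# Minimal zero-shot inference sketch with a detectron2 Mask R-CNN.
# The weights path is a placeholder for the released DAMM checkpoint.
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1           # single class: mouse
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical checkpoint path
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # detection confidence cutoff

predictor = DefaultPredictor(cfg)
frame = cv2.imread("frame.png")               # BGR image, as detectron2 expects
instances = predictor(frame)["instances"].to("cpu")
print(instances.pred_boxes, instances.scores) # per-mouse boxes and confidences
```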

Keywords: Animal behavior; Animal tracking; Computer vision; Generalization; Instance segmentation.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Pipeline for creating the Detect Any Mouse Model (DAMM). (A) Image dataset collection strategy. Frames were randomly extracted from an extensive archive of 12,500 videos within our laboratory (AER Lab), depicting mice in various behavioral setups. (B) Schematic illustration of the procedure used to generate instance segmentation masks for our pretraining dataset in a cost-effective and time-efficient manner. The schematics depict the workflow of a graphical user interface we developed, which utilizes the Segment Anything Model (SAM) for dataset annotation. (C) Overview of the object detection approach, illustrating the use of Mask R-CNN, which predicts instance segments for mice within videos. (D) Evaluation of model performance on a test set of 500 examples. Left, COCO-style strict mask precision (IoU > 0.75). Right, example predictions of instance segmentation on test images. Our final pretraining dataset included 2200 diverse images, which were utilized for training the final DAMM.
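
The caption describes a SAM-based annotation GUI without detailing its internals. The sketch below shows the core interaction such a tool builds on, via the public segment_anything API: a single foreground click prompts SAM to return an instance mask. The checkpoint path and click coordinates are placeholders.

```python
# Point-prompted mask generation with the Segment Anything Model (SAM).
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)                    # SAM expects an RGB array

click = np.array([[320, 240]])                # annotator's click on the mouse
masks, scores, _ = predictor.predict(
    point_coords=click,
    point_labels=np.array([1]),               # 1 = foreground point
    multimask_output=False,
)
binary_mask = masks[0]                        # H x W boolean instance mask
```
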
Fig. 2
Detection performance evaluation of DAMM. (A) Schematic representation of detection evaluation procedures for two use cases: one with no further fine-tuning of model parameters (zero-shot) and another that incorporates a limited set of newly annotated examples for fine-tuning the model parameters (few-shot); θ represents model parameters. (B) Mask AP75 evaluation of DAMM across five unique datasets sourced from the AER Lab. The DAMM pretraining dataset may have contained frames from these five video datasets as both were sourced in-house. Each evaluation dataset contains 100 examples, with up to 50 allocated for training and 50 for testing. The mean and standard deviation of Mask AP75 are shown for each dataset across 0, 20, and 50 shot scenarios. Results are based on five randomly initialized train-test shuffles. Of note, standard deviation bars that are visually flat denote a deviation of 0. (C) Using the same approach as in (B), but for datasets collected outside the AER Lab. These datasets feature experimental setups that DAMM has not encountered during pretraining.
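
In detectron2 terms, the few-shot procedure amounts to registering a small COCO-format annotated split, fine-tuning briefly, and scoring mask AP75 with COCOEvaluator. A hedged sketch follows; dataset names, file paths, batch size, and iteration count are illustrative, not the paper's settings.

```python
# Few-shot fine-tuning and COCO-style mask AP75 evaluation (illustrative).
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# Register a 20-shot training split and a held-out test split (COCO JSON).
register_coco_instances("mice_train_20shot", {}, "train_20shot.json", "images/")
register_coco_instances("mice_test", {}, "test.json", "images/")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical pretrained checkpoint
cfg.DATASETS.TRAIN = ("mice_train_20shot",)
cfg.DATASETS.TEST = ("mice_test",)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 500                     # brief fine-tuning pass

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

evaluator = COCOEvaluator("mice_test", output_dir="./eval")
loader = build_detection_test_loader(cfg, "mice_test")
results = inference_on_dataset(trainer.model, loader, evaluator)
print(results["segm"]["AP75"])                # the strict-mask metric in the figure
```
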
Fig. 3
Controlled detection evaluation of DAMM. (A–C) Organization of a controlled evaluation dataset, comprising samples conditioned on three distinct groups: (A) environments (3 types), (B) mouse coat colors (3 colors), and (C) camera types (2 types). From these categories, we generated all possible combinations, resulting in 18 mini-datasets. Each of these 18 mini-datasets contains 70 annotated frames, randomly sampled from a 5-min video recording corresponding to the specific combination of conditions. (D–F) Mask AP75 performance averaged over all datasets containing the condition of interest, for 0-shot, 5-shot, 10-shot, and 20-shot scenarios. In each scenario, we use up to 20 examples for training and 50 for testing.
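
The 18 mini-datasets are simply the Cartesian product of the three condition groups, which a few lines of Python make explicit; the condition labels below are paraphrased placeholders, not the dataset's exact names.

```python
# Enumerate the 3 x 3 x 2 = 18 condition combinations (labels illustrative).
from itertools import product

environments = ["env_1", "env_2", "env_3"]    # 3 environment types
coat_colors = ["black", "white", "agouti"]    # 3 coat colors (assumed labels)
cameras = ["camera_1", "camera_2"]            # 2 camera types

combos = list(product(environments, coat_colors, cameras))
assert len(combos) == 18
for env, coat, cam in combos:
    print(f"{env}/{coat}/{cam}: 70 annotated frames from one 5-min recording")
```
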
Fig. 4
Tracking evaluation of DAMM. (A,B) Compilation of single-animal and multi-animal tracking evaluation datasets. Each dataset features videos with a mean duration of 45 s, in which the location and unique identity of every mouse are annotated throughout all frames. (C,D) DAMM is employed as the detection module within the Simple Online and Realtime Tracking (SORT) algorithm to track mice in videos. The evaluation shows (C) single-object and (D) multi-object tracking accuracy (IoU > 0.50) of DAMM for both zero-shot and 20-shot scenarios across all tracking datasets. (E) Comparison strategy and performance of DAMM against an existing keypoint-estimation-based mouse tracking method, the DLC SuperAnimal-TopViewMouse model, which outputs keypoint predictions for top-view singly housed mice. (F) Zero-shot tracking comparison on a subset of our previously introduced datasets featuring top-view singly housed mice.
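
SORT consumes per-frame bounding boxes with confidence scores and returns boxes tagged with persistent track IDs. The sketch below wires a detectron2 predictor into the reference SORT implementation (github.com/abewley/sort); the checkpoint path and SORT parameters are assumptions.

```python
# Per-frame detections fed to SORT for identity tracking (illustrative).
import cv2
import numpy as np
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from sort import Sort  # reference implementation: github.com/abewley/sort

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical checkpoint path
predictor = DefaultPredictor(cfg)

tracker = Sort(max_age=5, min_hits=3)         # illustrative SORT parameters
cap = cv2.VideoCapture("session.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    inst = predictor(frame)["instances"].to("cpu")
    boxes = inst.pred_boxes.tensor.numpy()    # N x 4: (x1, y1, x2, y2)
    scores = inst.scores.numpy()[:, None]     # N x 1 confidence column
    dets = np.hstack([boxes, scores])         # N x 5, SORT's expected input
    tracks = tracker.update(dets)             # rows: (x1, y1, x2, y2, track_id)
    for x1, y1, x2, y2, tid in tracks:
        print(int(tid), x1, y1, x2, y2)       # persistent per-mouse identity
cap.release()
```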

