Sci Rep. 2024 Sep 12;14(1):21366. doi: 10.1038/s41598-024-72367-2.

DAMM for the detection and tracking of multiple animals within complex social and environmental settings

Gaurav Kaul et al.

Abstract

Accurate detection and tracking of animals across diverse environments are crucial for studying brain and behavior. Recently, computer vision techniques have become essential for high-throughput behavioral studies; however, localizing animals in complex conditions remains challenging due to intra-class visual variability and environmental diversity. These challenges hinder studies in naturalistic settings, such as when animals are partially concealed within nests. Moreover, current tools are laborious and time-consuming, requiring extensive, setup-specific annotation and training procedures. To address these challenges, we introduce the 'Detect-Any-Mouse-Model' (DAMM), an object detector for localizing mice in complex environments with minimal training. Our approach involved collecting and annotating a diverse dataset of single- and multi-housed mice in complex setups. We trained a Mask R-CNN, a popular object detector in animal studies, to perform instance segmentation and validated DAMM's performance on a collection of downstream datasets using zero-shot and few-shot inference. DAMM excels in zero-shot inference, detecting mice, and even rats, in entirely unseen scenarios, and further improves with minimal training. Using the SORT algorithm, we demonstrate robust tracking that is competitive with keypoint-estimation-based methods. Notably, to advance and simplify behavioral studies, we release our code, model weights, and data, along with a user-friendly Python API and a Google Colab implementation.
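
The abstract points to a released Python API, but its exact interface is not shown on this page. As a rough orientation, here is a minimal sketch of what zero-shot inference with a Mask R-CNN looks like in detectron2 (the detector family named above); the config choice and the checkpoint filename damm_weights.pth are assumptions, not DAMM's actual packaging.

```python
# Minimal zero-shot inference sketch with a detectron2 Mask R-CNN.
# The weights path is a placeholder for the released DAMM checkpoint.
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1           # single class: mouse
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical checkpoint path
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5   # detection confidence cutoff

predictor = DefaultPredictor(cfg)
frame = cv2.imread("frame.png")               # BGR image, as detectron2 expects
instances = predictor(frame)["instances"].to("cpu")
print(instances.pred_boxes, instances.scores) # per-mouse boxes and confidences
```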

Keywords: Animal behavior; Animal tracking; Computer vision; Generalization; Instance segmentation.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Pipeline for creating the Detect Any Mouse Model (DAMM). (A) Image dataset collection strategy. Frames were randomly extracted from an extensive archive of 12,500 videos within our laboratory (AER Lab), depicting mice in various behavioral setups. (B) Schematic illustration of the procedure used to generate instance segmentation masks for our pretraining dataset in a cost-effective and time-efficient manner. The schematics depict the workflow of a graphical user interface we developed, which utilizes the Segment Anything Model (SAM) for dataset annotation. (C) Overview of the object detection approach, illustrating the use of Mask R-CNN, which predicts instance segments for mice within videos. (D) Evaluation of model performance on a test set of 500 examples. Left, COCO-style strict mask precision (IoU > 0.75). Right, example predictions of instance segmentation on test images. Our final pretraining dataset included 2200 diverse images, which were utilized for training the final DAMM.
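
The caption describes a SAM-based annotation GUI without detailing its internals. The sketch below shows the core interaction such a tool builds on, via the public segment_anything API: a single foreground click prompts SAM to return an instance mask. The checkpoint path and click coordinates are placeholders.

```python
# Point-prompted mask generation with the Segment Anything Model (SAM).
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h.pth")  # placeholder path
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("frame.png"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)                    # SAM expects an RGB array

click = np.array([[320, 240]])                # annotator's click on the mouse
masks, scores, _ = predictor.predict(
    point_coords=click,
    point_labels=np.array([1]),               # 1 = foreground point
    multimask_output=False,
)
binary_mask = masks[0]                        # H x W boolean instance mask
```
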
Fig. 2
Detection performance evaluation of DAMM. (A) Schematic representation of detection evaluation procedures for two use cases: one with no further fine-tuning of model parameters (zero-shot) and another that incorporates a limited set of newly annotated examples for fine-tuning the model parameters (few-shot); θ represents model parameters. (B) Mask AP75 evaluation of DAMM across five unique datasets sourced from the AER Lab. The DAMM pretraining dataset may have contained frames from these five video datasets as both were sourced in-house. Each evaluation dataset contains 100 examples, with up to 50 allocated for training and 50 for testing. The mean and standard deviation of Mask AP75 are shown for each dataset across 0, 20, and 50 shot scenarios. Results are based on five randomly initialized train-test shuffles. Of note, standard deviation bars that are visually flat denote a deviation of 0. (C) Using the same approach as in (B), but for datasets collected outside the AER Lab. These datasets feature experimental setups that DAMM has not encountered during pretraining.
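
In detectron2 terms, the few-shot procedure amounts to registering a small COCO-format annotated split, fine-tuning briefly, and scoring mask AP75 with COCOEvaluator. A hedged sketch follows; dataset names, file paths, batch size, and iteration count are illustrative, not the paper's settings.

```python
# Few-shot fine-tuning and COCO-style mask AP75 evaluation (illustrative).
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

# Register a 20-shot training split and a held-out test split (COCO JSON).
register_coco_instances("mice_train_20shot", {}, "train_20shot.json", "images/")
register_coco_instances("mice_test", {}, "test.json", "images/")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical pretrained checkpoint
cfg.DATASETS.TRAIN = ("mice_train_20shot",)
cfg.DATASETS.TEST = ("mice_test",)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.MAX_ITER = 500                     # brief fine-tuning pass

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

evaluator = COCOEvaluator("mice_test", output_dir="./eval")
loader = build_detection_test_loader(cfg, "mice_test")
results = inference_on_dataset(trainer.model, loader, evaluator)
print(results["segm"]["AP75"])                # the strict-mask metric in the figure
```
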
Fig. 3
Controlled detection evaluation of DAMM. (A–C) Organization of a controlled evaluation dataset, comprising samples conditioned on three distinct groups: (A) environments (3 types), (B) mouse coat colors (3 colors), and (C) camera types (2 types). From these categories, we generated all possible combinations, resulting in 18 mini-datasets. Each of these 18 mini-datasets contains 70 annotated frames, randomly sampled from a 5-min video recording corresponding to the specific combination of conditions. (D–F) Mask AP75 performance averaged over all datasets containing the condition of interest, for 0-shot, 5-shot, 10-shot, and 20-shot scenarios. In each scenario, we use up to 20 examples for training and 50 for testing.
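
The 18 mini-datasets are simply the Cartesian product of the three condition groups, which a few lines of Python make explicit; the condition labels below are paraphrased placeholders, not the dataset's exact names.

```python
# Enumerate the 3 x 3 x 2 = 18 condition combinations (labels illustrative).
from itertools import product

environments = ["env_1", "env_2", "env_3"]    # 3 environment types
coat_colors = ["black", "white", "agouti"]    # 3 coat colors (assumed labels)
cameras = ["camera_1", "camera_2"]            # 2 camera types

combos = list(product(environments, coat_colors, cameras))
assert len(combos) == 18
for env, coat, cam in combos:
    print(f"{env}/{coat}/{cam}: 70 annotated frames from one 5-min recording")
```
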
Fig. 4
Tracking evaluation of DAMM. (A,B) Compilation of single-animal and multi-animal tracking evaluation datasets. Each dataset features videos with a mean duration of 45 s, in which the location and unique identity of every mouse are annotated throughout all frames. (C,D) DAMM is employed as the detection module within the Simple Online and Realtime Tracking (SORT) algorithm to track mice in videos. The evaluation shows (C) single-object and (D) multi-object tracking accuracy (IoU > 0.50) of DAMM for both zero-shot and 20-shot scenarios across all tracking datasets. (E) Comparison strategy and performance of DAMM against an existing keypoint-estimation-based mouse tracking method, the DLC SuperAnimal-TopViewMouse model, which outputs keypoint predictions for top-view singly housed mice. (F) Zero-shot tracking comparison on a subset of our previously introduced datasets featuring top-view singly housed mice.
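
SORT consumes per-frame bounding boxes with confidence scores and returns boxes tagged with persistent track IDs. The sketch below wires a detectron2 predictor into the reference SORT implementation (github.com/abewley/sort); the checkpoint path and SORT parameters are assumptions.

```python
# Per-frame detections fed to SORT for identity tracking (illustrative).
import cv2
import numpy as np
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from sort import Sort  # reference implementation: github.com/abewley/sort

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.WEIGHTS = "damm_weights.pth"        # hypothetical checkpoint path
predictor = DefaultPredictor(cfg)

tracker = Sort(max_age=5, min_hits=3)         # illustrative SORT parameters
cap = cv2.VideoCapture("session.mp4")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    inst = predictor(frame)["instances"].to("cpu")
    boxes = inst.pred_boxes.tensor.numpy()    # N x 4: (x1, y1, x2, y2)
    scores = inst.scores.numpy()[:, None]     # N x 1 confidence column
    dets = np.hstack([boxes, scores])         # N x 5, SORT's expected input
    tracks = tracker.update(dets)             # rows: (x1, y1, x2, y2, track_id)
    for x1, y1, x2, y2, tid in tracks:
        print(int(tid), x1, y1, x2, y2)       # persistent per-mouse identity
cap.release()
```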

