DeepRodent — A Robust and Generalizable Vision Framework for Automated Rodent Monitoring

Abstract

One backbone, four synchronized predictions

Continuous, non-invasive behavioral monitoring of rodents is fundamental to neuroscience and pharmacological phenotyping, yet existing vision pipelines struggle to generalize across illumination changes, perspective distortion, cage geometry, and dense-occlusion housing conditions.

DeepRodent addresses this with a single multi-scale feature backbone feeding four jointly trained, task-specific heads: axis-aligned detection, rotation-aware oriented bounding boxes for curled or rotated animals, pixel-level instance segmentation under occlusion, and a temporal behavioral embedding. A post-processing aggregation engine then converts the head outputs into trajectory tracking, behavioral-state classification, and spatial occupancy heatmaps.

Key Results

Consistent gains, detector-agnostic

Plugging DeepRodent's prediction heads into any backbone from YOLOv8 through YOLO12 yields a consistent +2.6 to +3.1 mAP improvement while maintaining real-time inference speed suitable for continuous monitoring.

Method	Backbone	Precision	Recall	mAP₅₀	mAP₅₀₋₉₅	FPS
YOLOv8-Seg	Nano	91.7	89.6	92.8	73.9	188
YOLO11-Seg	Small	93.5	92.1	94.2	77.4	161
YOLO12-Seg	Small	93.8	92.5	94.4	78.2	156
DeepRodent (Ours)	YOLO Family	95.4	94.1	96.2	84.6	154

Full cross-environment generalization, ablation, and SOTA comparison tables are reported in the paper.

Method

Shared backbone, four task-specific heads

A multi-scale feature integration backbone (CSP-style blocks with scale-aware softmax fusion) produces one shared representation per frame, which is decoded by four heads under a single joint objective.
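The scale-aware softmax fusion mentioned above can be pictured as a normalized weighting of per-scale features. The sketch below is a minimal plain-Python illustration, not the DeepRodent implementation: it assumes each scale has already been resized to a common shape and flattened, and the names `softmax` and `fuse_scales`, as well as the use of one scalar logit per scale, are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of per-scale logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features: Sequence[Sequence[float]],
                scale_logits: Sequence[float]) -> List[float]:
    """Weighted sum of same-length feature vectors, one per scale.

    Assumption: every scale was resized to a common shape and flattened,
    so fusion reduces to an element-wise convex combination.
    """
    if len(features) != len(scale_logits):
        raise ValueError("need exactly one logit per scale")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all scales must share one feature length")
    weights = softmax(scale_logits)
    return [sum(w * f[i] for w, f in zip(weights, features))
            for i in range(length)]


if __name__ == "__main__":
    # Three scales with toy 4-element features (placeholder values).
    feats = [[1.0, 0.0, 2.0, 4.0],
             [3.0, 0.0, 2.0, 0.0],
             [5.0, 0.0, 2.0, 2.0]]
    # Equal logits give equal weights, so the result is the plain mean.
    print(fuse_scales(feats, [0.0, 0.0, 0.0]))
```

Because the weights come from a softmax they stay positive and sum to one, so a scale can be emphasized without changing the overall magnitude of the fused features. In a real model the logits would be learned parameters or predicted per location.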

Prediction function F_θ(I_t) = { B_t, M_t, O_t, E_t }

B_t

Detection

Axis-aligned bounding boxes for fast, cage-wide localization of every animal in frame.

O_t

Oriented Boxes

Rotation-aware localization for curled, rearing, or arbitrarily rotated rodents, regressed with a Gaussian-Wasserstein rotated-IoU loss.

M_t

Segmentation

Pixel-level instance masks that hold up under high-density occlusion between animals.

E_t

Temporal Embedding

A behavioral embedding feeding trajectory tracking, state classification, and occupancy heatmaps.
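The oriented-box head is described as being regressed with a Gaussian-Wasserstein loss. The general idea is to model each oriented box as a 2-D Gaussian (center as mean, rotated half-extents as covariance) and compare two boxes by the squared 2-Wasserstein distance between those Gaussians. The sketch below computes only that distance, using the closed form for 2×2 covariances; how DeepRodent normalizes it into a loss is not stated here, and the names `box_to_gaussian` and `gwd_squared` and the `(cx, cy, w, h, theta)` box layout are assumptions for illustration.

```python
import math
from typing import Tuple

# Assumed box layout: (center x, center y, width, height, angle in radians).
Box = Tuple[float, float, float, float, float]


def box_to_gaussian(box: Box):
    """Map an oriented box to a mean and a symmetric 2x2 covariance.

    The covariance is R diag(w^2/4, h^2/4) R^T, returned as the entries
    (a, b, c) of the matrix [[a, b], [b, c]].
    """
    cx, cy, w, h, theta = box
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    sw, sh = (w * w) / 4.0, (h * h) / 4.0
    a = sw * cos_t * cos_t + sh * sin_t * sin_t
    b = (sw - sh) * sin_t * cos_t
    c = sw * sin_t * sin_t + sh * cos_t * cos_t
    return (cx, cy), (a, b, c)


def gwd_squared(box1: Box, box2: Box) -> float:
    """Squared 2-Wasserstein distance between the two box Gaussians."""
    (x1, y1), (a1, b1, c1) = box_to_gaussian(box1)
    (x2, y2), (a2, b2, c2) = box_to_gaussian(box2)
    center_term = (x1 - x2) ** 2 + (y1 - y2) ** 2
    trace_sum = a1 + c1 + a2 + c2
    # Tr(S1 S2) and det(S1) det(S2) for symmetric 2x2 matrices.
    trace_prod = a1 * a2 + 2.0 * b1 * b2 + c1 * c2
    det_prod = (a1 * c1 - b1 * b1) * (a2 * c2 - b2 * b2)
    # For 2x2 PSD matrices: Tr(sqrt(M)) = sqrt(Tr(M) + 2 sqrt(det(M))).
    inner = trace_prod + 2.0 * math.sqrt(max(det_prod, 0.0))
    coupling = 2.0 * math.sqrt(max(inner, 0.0))
    # Clamp tiny negative values caused by floating-point rounding.
    return center_term + max(trace_sum - coupling, 0.0)


if __name__ == "__main__":
    upright = (50.0, 50.0, 40.0, 20.0, 0.0)
    tilted = (52.0, 49.0, 40.0, 20.0, math.radians(30))
    print(f"squared GWD: {gwd_squared(upright, tilted):.3f}")
```

A useful property of this formulation is that a box described as (w, h, θ) and the same box described as (h, w, θ + 90°) map to the same Gaussian, so the distance does not jump at the angle boundary the way a direct angle regression can.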

Joint training objective L = λ₁L_cls + λ₂L_box + λ₃L_seg + λ₄L_obb + λ₅L_temp + β·L_KL + λ₆L_domain

The objective combines a focal segmentation loss, an IoU box loss, rotated-IoU regression, KL-divergence regularization, uncertainty-guided reweighting, and a cross-domain feature-moment matching term for generalization across laboratory settings. See docs/ARCHITECTURE.md for the equation-by-equation mapping to code.
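As a rough illustration of how such an objective is assembled, the sketch below forms a weighted sum of named loss terms and includes one simple way to express a feature-moment matching penalty (squared mismatch of mean and variance between two domains). It is an assumption-laden toy, not the code referenced in docs/ARCHITECTURE.md: the function names, the term keys, the scalar inputs, and all numeric values are placeholders, and the uncertainty-guided reweighting is not modeled.

```python
from typing import Dict, Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def _variance(values: Sequence[float]) -> float:
    m = _mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def moment_gap(source: Sequence[float], target: Sequence[float]) -> float:
    """Squared mismatch of mean and variance between two feature samples.

    Illustrative stand-in for a feature-moment matching term: it is zero
    when both domains share the same first two moments.
    """
    return ((_mean(source) - _mean(target)) ** 2
            + (_variance(source) - _variance(target)) ** 2)


def joint_loss(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of named loss terms; every term must have a weight."""
    missing = sorted(set(terms) - set(weights))
    if missing:
        raise KeyError(f"no weight configured for: {missing}")
    return sum(weights[name] * value for name, value in terms.items())


if __name__ == "__main__":
    # Placeholder numbers only; they are not values from the paper.
    terms = {
        "cls": 0.40, "box": 0.30, "seg": 0.50, "obb": 0.20, "temp": 0.10,
        "kl": 0.05,
        "domain": moment_gap([0.1, 0.4, 0.7], [0.2, 0.5, 0.8]),
    }
    weights = {name: 1.0 for name in terms}  # hypothetical uniform weights
    print(f"total loss: {joint_loss(terms, weights):.4f}")
```

Keeping the terms in a dictionary makes it easy to log each component separately and to switch a term off during ablations by setting its weight to zero.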

Quickstart

From clone to trajectory in five steps

Install

# clone and install
git clone https://github.com/kaopanboonyuen/DeepRodent.git
cd DeepRodent
pip install -e .

Get a dataset

DeepRodent expects the standard YOLO-style polygon segmentation layout. No data on hand? Generate a synthetic set to smoke-test the pipeline end-to-end.

python scripts/make_toy_dataset.py --root ./data/DeepRodentDataset --n-per-split 30
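For orientation, the widely used YOLO polygon segmentation convention stores one text file per image, with one object per line: a class index followed by polygon vertices as x y pairs normalized to [0, 1]. The sketch below parses one such line and converts it to pixel coordinates. It assumes DeepRodent follows that convention exactly; `parse_label_line` and `polygon_area` are hypothetical helper names, not functions from the repository.

```python
from typing import List, Tuple

Polygon = List[Tuple[float, float]]


def parse_label_line(line: str, img_w: int, img_h: int) -> Tuple[int, Polygon]:
    """Parse 'class x1 y1 x2 y2 ...' with normalized coords into pixels."""
    parts = line.split()
    # One class id plus at least three (x, y) pairs -> odd count, >= 7 tokens.
    if len(parts) < 7 or len(parts) % 2 == 0:
        raise ValueError("expected: class x1 y1 x2 y2 x3 y3 [...]")
    class_id = int(parts[0])
    coords = [float(p) for p in parts[1:]]
    if any(not 0.0 <= c <= 1.0 for c in coords):
        raise ValueError("coordinates must be normalized to [0, 1]")
    polygon = [(coords[i] * img_w, coords[i + 1] * img_h)
               for i in range(0, len(coords), 2)]
    return class_id, polygon


def polygon_area(polygon: Polygon) -> float:
    """Polygon area in square pixels via the shoelace formula."""
    total = 0.0
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


if __name__ == "__main__":
    # A toy rectangle covering the central half of a 100 x 200 image.
    line = "0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"
    class_id, poly = parse_label_line(line, img_w=100, img_h=200)
    print(class_id, poly, polygon_area(poly))
```

A quick area check like this is a cheap way to spot degenerate or mis-scaled polygons before a long training run.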

Train

python scripts/train.py --config configs/deeprodent.yaml --epochs 100 --seed 42

Evaluate

python scripts/evaluate.py \
  --config configs/deeprodent.yaml \
  --checkpoint checkpoints/deeprodent_epoch100.pt \
  --split test

Run inference

Produces trajectory arrays, an occupancy heatmap, and per-frame behavioral-state tags.

python scripts/predict.py \
  --checkpoint checkpoints/deeprodent_epoch100.pt \
  --source path/to/video.mp4 \
  --out outputs/
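One of the inference outputs listed above is an occupancy heatmap. How scripts/predict.py computes or stores it is not specified here, so the following is only a minimal sketch of the general idea: bin per-frame centroids of one animal into a coarse grid and report the fraction of time spent in each cell. The function name `occupancy_heatmap`, its parameters, and the centroid format are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def occupancy_heatmap(centroids: Sequence[Tuple[float, float]],
                      frame_size: Tuple[int, int],
                      grid: Tuple[int, int] = (4, 4)) -> List[List[float]]:
    """Fraction of tracked frames spent in each cell of a rows x cols grid.

    centroids  : per-frame (x, y) positions in pixels for one animal
    frame_size : (width, height) of the video frame in pixels
    grid       : (rows, cols) of the output heatmap
    """
    width, height = frame_size
    rows, cols = grid
    counts = [[0] * cols for _ in range(rows)]
    kept = 0
    for x, y in centroids:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore positions outside the frame (e.g. lost tracks)
        r = min(int(y * rows / height), rows - 1)
        c = min(int(x * cols / width), cols - 1)
        counts[r][c] += 1
        kept += 1
    if kept == 0:
        return [[0.0] * cols for _ in range(rows)]
    return [[n / kept for n in row] for row in counts]


if __name__ == "__main__":
    # Toy trajectory: mostly top-left, one visit to the bottom-right.
    track = [(10.0, 10.0), (12.0, 11.0), (90.0, 90.0)]
    for row in occupancy_heatmap(track, frame_size=(100, 100), grid=(2, 2)):
        print(row)
```

Normalizing by the number of tracked frames rather than the video length keeps recordings with different durations or tracking dropouts comparable.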

Reproducibility

Everything needed to verify the numbers

All ablations are averaged over 3 random seeds using the multi-seed averaging protocol implemented in Evaluator.multi_seed_summary.
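To make the idea of multi-seed averaging concrete, the sketch below aggregates per-seed metric dictionaries into a mean and a sample standard deviation per metric. It is a standalone illustration and does not reproduce the signature or behavior of Evaluator.multi_seed_summary; the function name `summarize_seeds`, the metric key, and the numbers are placeholders, and the choice of sample (rather than population) standard deviation is an assumption.

```python
import statistics
from typing import Dict, Sequence


def summarize_seeds(runs: Sequence[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Mean and sample standard deviation per metric across seed runs."""
    if len(runs) < 2:
        raise ValueError("need at least two seeds to estimate spread")
    names = set(runs[0])
    if any(set(run) != names for run in runs):
        raise ValueError("every run must report the same metrics")
    summary = {}
    for name in sorted(names):
        values = [run[name] for run in runs]
        summary[name] = {
            "mean": statistics.mean(values),
            "std": statistics.stdev(values),  # sample std (n - 1 denominator)
        }
    return summary


if __name__ == "__main__":
    # Placeholder values for three seeds; not results from the paper.
    runs = [{"metric": 1.0}, {"metric": 2.0}, {"metric": 3.0}]
    for name, stats in summarize_seeds(runs).items():
        print(f"{name}: {stats['mean']:.2f} ± {stats['std']:.2f}")
```

Reporting the spread alongside the mean helps readers judge whether a difference between two ablation rows exceeds seed-to-seed noise.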

Ethical Considerations

A decision-support tool, not a replacement

DeepRodent is intended solely as an assistive research framework and is not designed to replace expert veterinary oversight or certified behavioral assessment by trained experimental biologists.

The underlying study used a private, non-invasive laboratory video dataset (secondary analysis of recorded observation clips only); no housing conditions were altered and no invasive procedures were performed for the purpose of data collection. All animal care and handling from the primary data source were conducted under approved IACUC protocols, in accordance with the ARRIVE guidelines and the 3Rs principles (Replacement, Reduction, Refinement).

DeepRodent should be treated as a decision-support tool requiring expert oversight, continual monitoring, and multi-center validation prior to broader deployment in experimental biology workflows.

Citation

If DeepRodent is useful for your research

@article{panboonyuen2026deeprodent,
  title   = {DeepRodent: A Robust and Generalizable Vision Framework for Automated Rodent Monitoring in Experimental Biology},
  author  = {Panboonyuen, Teerapong},
  year    = {2026},
  url     = {https://github.com/kaopanboonyuen/DeepRodent}
}