ObjectSpace

A production-quality object detection & tracking pipeline for workspace monitoring — demonstrating real-world ML engineering with self-supervised evaluation metrics.

Demo

Python 3.9+ License: MIT CI


Why This Project?

Smart workspace monitoring enables:

  • Productivity analytics — Track object interactions over time
  • Ergonomics research — Monitor desk setup and posture indicators
  • Automated inventory — Detect and track items on workspaces

This project demonstrates end-to-end ML pipeline engineering: from raw video to tracked objects with quality metrics — all without requiring ground truth annotations.


Key Features

Feature                     Description
Object Detection            Pre-trained Mask R-CNN with configurable confidence thresholds
Multi-Object Tracking       SORT algorithm with 8D Kalman filtering (see the sketch below)
Self-Supervised Evaluation  Quality metrics without ground truth
Modular Architecture        Clean separation of detection, tracking, I/O, and evaluation
Multiple Outputs            COCO JSON annotations, visualization frames, evaluation reports
CLI + Python API            Flexible usage for scripts or integration
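
For intuition on the tracking row above, here is a minimal sketch of a constant-velocity Kalman filter over an 8D state, assuming the box is parameterised as centre (cx, cy), aspect ratio a, and height h plus their velocities. This is illustrative only; the actual state layout lives in src/objectSpace/tracking/kalman.py and may differ.

import numpy as np

# Illustrative 8D constant-velocity model: state = [cx, cy, a, h, vcx, vcy, va, vh].
dt = 1.0                                      # one frame per step
F = np.eye(8)                                 # state transition: position += velocity * dt
F[:4, 4:] = dt * np.eye(4)
H = np.hstack([np.eye(4), np.zeros((4, 4))])  # only the box is observed, not the velocities

def predict(x, P, Q):
    """Propagate state mean x and covariance P one frame forward."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, R):
    """Fold a new detection z = [cx, cy, a, h] into the track estimate."""
    S = H @ P @ H.T + R                  # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(8) - K @ H) @ P
    return x, P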

Evaluation Results

The built-in evaluation framework measures tracking quality without ground truth:

Video             Overall  Continuity  Stability  Tracks  ID Switches
video1 (complex)     36.8        66.5       25.4      23            6
video2 (medium)      44.4        67.8       43.3      11            3
video4 (simple)      78.4        95.9      100.0       8            0
Average              53.2        76.7       56.3       -            -

Key Findings

  • 100% stability on simple scenes (≤8 concurrent tracks)
  • Stability degrades with scene complexity (IoU-based matching limitation)
  • Identified bottleneck: ID association in crowded scenes; Deep SORT is the recommended next step

Architecture

objectSpace/
├── src/objectSpace/
│   ├── detection/          # Mask R-CNN object detection
│   │   ├── base.py         # Abstract detector interface
│   │   └── mask_rcnn.py    # Mask R-CNN implementation
│   ├── tracking/           # SORT with Kalman filtering
│   │   ├── kalman.py       # Kalman filter implementation
│   │   ├── association.py  # IoU & Hungarian matching
│   │   └── sort_tracker.py # SORT algorithm
│   ├── evaluation/         # Self-supervised quality metrics
│   │   ├── metrics.py      # Metric dataclasses
│   │   ├── analyzer.py     # TrackingAnalyzer
│   │   ├── reporter.py     # Report generation
│   │   └── integration.py  # Pipeline integration
│   ├── io/                 # Video I/O and COCO export
│   │   ├── video.py        # Video reading
│   │   └── export.py       # COCO JSON export
│   ├── pipeline.py         # Main orchestration
│   └── config.py           # Typed configuration
├── tests/                  # Unit & integration tests
│   ├── evaluation/         # Evaluation module tests
│   ├── test_detection.py
│   └── test_tracking.py
├── examples/               # Demo notebooks
│   └── demo.ipynb          # Interactive demo
├── configs/                # YAML configurations
│   ├── default.yaml
│   └── tuned.yaml
└── assets/                 # Demo media
    └── demo.gif
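
As context for the association step (src/objectSpace/tracking/association.py pairs detections with existing tracks via IoU and the Hungarian algorithm), a minimal sketch of that idea is shown below. The function names and threshold handling are assumptions for illustration, not the project's actual API.

import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(detections, tracks, iou_threshold=0.3):
    """Match detection boxes to track boxes by maximising total IoU (Hungarian algorithm)."""
    if not detections or not tracks:
        return [], list(range(len(detections))), list(range(len(tracks)))
    cost = np.array([[1.0 - iou(d, t) for t in tracks] for d in detections])
    det_idx, trk_idx = linear_sum_assignment(cost)   # minimise 1 - IoU
    matches = [(d, t) for d, t in zip(det_idx, trk_idx)
               if 1.0 - cost[d, t] >= iou_threshold]
    matched_d = {d for d, _ in matches}
    matched_t = {t for _, t in matches}
    unmatched_dets = [i for i in range(len(detections)) if i not in matched_d]
    unmatched_trks = [i for i in range(len(tracks)) if i not in matched_t]
    return matches, unmatched_dets, unmatched_trks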

Quick Start

Installation

git clone https://github.com/SAMithila/objectSpace.git
cd objectSpace
python -m venv venv
source venv/bin/activate
pip install -e ".[dev]"

Process a Video

from objectSpace import DetectionTrackingPipeline

pipeline = DetectionTrackingPipeline()
results = pipeline.process_video("video.mp4", output_dir="output/")

Process with Evaluation

# Get tracking results + quality metrics
results, evaluation = pipeline.process_video_with_evaluation("video.mp4")

print(f"Overall Score: {evaluation.overall_score:.1f}/100")
print(f"ID Switches: {evaluation.id_switches.total_switches}")

CLI Usage

# Process single video
python process_one_video.py task3.1_video1

# Evaluate existing results
python run_evaluation.py

# Compare all videos
python compare_videos.py

Evaluation Framework

The evaluation module computes tracking quality without ground truth annotations:

Metrics

Metric            What It Measures
Continuity Score  Track completeness (gaps, fragmentation)
Stability Score   ID consistency (fewer switches = better)
Speed Score       Processing FPS vs. target
Overall Score     Weighted combination
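
The weighting itself is defined inside the evaluation module; as a rough sketch of the idea (the weights below are an assumption, though they happen to reproduce the results table above when the speed score is 0):

def overall_score(continuity, stability, speed, weights=(0.4, 0.4, 0.2)):
    """Weighted combination of per-aspect scores, all on a 0-100 scale.
    Placeholder weights; the real values live in src/objectSpace/evaluation/."""
    w_cont, w_stab, w_speed = weights
    return w_cont * continuity + w_stab * stability + w_speed * speed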

Usage

from objectSpace.pipeline import evaluate_annotations

# Evaluate existing tracking results
result = evaluate_annotations("output/video_annotations.json")

print(f"Fragmented tracks: {result.fragmentation.fragmented_tracks}")
print(f"ID switches: {result.id_switches.total_switches}")
print(f"Avg coverage: {result.fragmentation.avg_coverage_ratio:.1%}")

Compare Videos

python compare_videos.py

Output:

EVALUATION COMPARISON
================================================================================
Video                      Overall    Cont.    Stab.    Speed  Tracks
--------------------------------------------------------------------------------
task3.1_video1                36.8     66.5     25.4      0.0      23
task3.1_video2                44.4     67.8     43.3      0.0      11
task3.1_video4                78.4     95.9    100.0      0.0       8
--------------------------------------------------------------------------------
AVERAGE                       53.2     76.7     56.3      0.0      42

Configuration

Default settings in configs/default.yaml:

Parameter                    Default  Description
detector.device              auto     CPU/CUDA selection
detector.default_confidence  0.3      Detection threshold
tracker.max_age              8        Frames to keep lost tracks
tracker.iou_threshold        0.3      Minimum IoU for matching

Tuned Configuration

Based on evaluation results, configs/tuned.yaml improves performance:

tracker:
  max_age: 15          # Handles longer occlusions
  iou_threshold: 0.2   # Fewer false ID switches
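
A minimal sketch of the "YAML config with typed validation" pattern, assuming a dataclass-based layout; the project's real loader is src/objectSpace/config.py and its field names may differ:

from dataclasses import dataclass
import yaml

@dataclass
class DetectorConfig:
    device: str = "auto"             # CPU/CUDA selection
    default_confidence: float = 0.3  # detection threshold

@dataclass
class TrackerConfig:
    max_age: int = 8            # frames to keep lost tracks
    iou_threshold: float = 0.3  # minimum IoU for matching

def load_config(path):
    """Read a YAML file like configs/tuned.yaml into typed sections."""
    with open(path) as f:
        raw = yaml.safe_load(f) or {}
    return (DetectorConfig(**raw.get("detector", {})),
            TrackerConfig(**raw.get("tracker", {})))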

Development

# Run tests
pytest tests/ -v

# Run specific test module
pytest tests/evaluation/ -v

# Run with coverage
pytest tests/ --cov=objectSpace --cov-report=term-missing

Output Format

COCO JSON with Tracking

{
  "annotations": [
    {
      "id": 1,
      "image_id": 0,
      "category_id": 1,
      "bbox": [100, 100, 50, 80],
      "track_id": 0
    }
  ]
}
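
For downstream analysis, the exported annotations can be grouped by track_id with nothing but the standard library. A minimal sketch (field names match the example above; the input path is the one used in the evaluation example earlier):

import json
from collections import defaultdict

with open("output/video_annotations.json") as f:
    coco = json.load(f)

tracks = defaultdict(list)                  # track_id -> list of (frame, bbox)
for ann in coco["annotations"]:
    tracks[ann["track_id"]].append((ann["image_id"], ann["bbox"]))

for track_id, boxes in sorted(tracks.items()):
    frames = [frame for frame, _ in boxes]
    print(f"track {track_id}: {len(boxes)} boxes, frames {min(frames)}-{max(frames)}")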

Evaluation Reports

  • *_evaluation.json — Machine-readable metrics
  • *_evaluation.md — Human-readable report
  • EVALUATION_SUMMARY.md — Cross-video comparison

Technical Highlights

This project demonstrates:

  1. Modular Design — Separate concerns for detection, tracking, evaluation
  2. Type Safety — Full type hints with dataclasses
  3. Configuration Management — YAML configs with typed validation
  4. Self-Supervised ML — Quality metrics without labeled data
  5. Production Patterns — Logging, error handling, CLI interface
  6. CI/CD — GitHub Actions for automated testing

Extending

Add New Detector

from objectSpace.detection import BaseDetector

class YOLODetector(BaseDetector):
    def detect(self, frame):
        # Your implementation
        pass

Add Custom Metrics

from objectSpace.evaluation import TrackingAnalyzer

class CustomAnalyzer(TrackingAnalyzer):
    def compute_custom_metric(self, annotations):
        # Your metric logic
        pass

License

MIT License — see LICENSE for details.

Acknowledgments