
Commit d208f1d

Merge pull request #16 from dronefreak/dji-tello-object-detection-segmentation-v2
Dji tello object detection segmentation v2
2 parents 5595a24 + e95b2e0 commit d208f1d


4 files changed: +317 -12 lines changed


QUICKSTART.md

Lines changed: 43 additions & 12 deletions

@@ -28,24 +28,31 @@ python -m tello_vision.app

## First Steps

### 1. Test Detection Without Drone

Good for verifying everything works:

```bash
python examples/test_detector.py --source 0 # Webcam
```
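
For reference, a script like this boils down to a webcam loop around a detector. A minimal sketch (the detector is assumed to be constructed and loaded elsewhere; only the `detect()` call and result fields match the API added in this commit):

```python
import cv2


def run_webcam_test(detector, source: int = 0) -> None:
    """Feed webcam frames to an already-loaded detector and print what it sees."""
    cap = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = cap.read()  # BGR frame, which is what detect() expects
            if not ok:
                break
            result = detector.detect(frame)
            labels = [f"{d.class_name} {d.confidence:.2f}" for d in result.detections]
            print(f"{1.0 / max(result.inference_time, 1e-6):5.1f} FPS | {labels}")
            cv2.imshow("detections", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()
```
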
### 2. Benchmark Your Setup

See what FPS you can get:

```bash
python examples/benchmark.py
```
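
The Performance Reference tables further down come from this kind of measurement. A rough sketch of what a benchmark loop computes (the real `examples/benchmark.py` may differ in details):

```python
import time

import numpy as np


def benchmark(detector, frames, warmup: int = 10) -> dict:
    """Time detect() over a list of frames and summarize latency."""
    for frame in frames[:warmup]:
        detector.detect(frame)  # warm-up runs so model/GPU init does not skew stats

    latencies_ms = []
    for frame in frames:
        start = time.time()
        detector.detect(frame)
        latencies_ms.append((time.time() - start) * 1000.0)

    lat = np.array(latencies_ms)
    return {
        "fps": 1000.0 / lat.mean(),
        "avg_ms": float(lat.mean()),
        "std_ms": float(lat.std()),
        "min_ms": float(lat.min()),
        "max_ms": float(lat.max()),
    }
```
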
### 3. Full Drone Mode

With Tello connected:

```bash
python -m tello_vision.app
```

Controls:

- **Tab**: Takeoff
- **W/A/S/D**: Move
- **Space/Shift**: Up/Down

@@ -60,29 +67,33 @@ Controls:

Edit `config.yaml`:

**Want faster FPS?** Use a smaller model:

```yaml
detector:
  yolov8:
    model: "yolov8n-seg.pt"  # n=nano (fastest)
```

**Only track people?**

```yaml
detector:
  target_classes: ["person"]
```

**Adjust visualization:**

```yaml
visualization:
  mask_alpha: 0.4  # Mask transparency
  show_confidence: true
```

**Performance tuning:**

```yaml
processing:
  frame_skip: 1  # Process every 2nd frame (doubles FPS)
```
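
How `frame_skip` and `target_classes` typically take effect in the processing loop, as a minimal sketch (names and structure here are illustrative, not taken from `tello_vision/app.py`):

```python
def process_stream(detector, frames, frame_skip: int = 1, target_classes=("person",)):
    """Detect on every (frame_skip + 1)-th frame, reuse results on skipped frames."""
    last_detections = []
    for i, frame in enumerate(frames):
        if frame_skip and i % (frame_skip + 1) != 0:
            # Skipped frame: keep the previous detections. With frame_skip: 1
            # only every 2nd frame is processed, roughly doubling FPS.
            yield frame, last_detections
            continue
        result = detector.detect(frame)
        last_detections = [
            d for d in result.detections if d.class_name in target_classes
        ]
        yield frame, last_detections
```
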
## Project Structure

@@ -131,8 +142,9 @@ This demonstrates reactive control suitable for autonomous vehicles.

## For Self-Driving Car Work

This gives you:

- Real-time object detection pipeline
- Target tracking framework
- Reactive control examples
- Extensible architecture for adding SLAM, planning, etc.

@@ -145,19 +157,38 @@ Check `examples/object_follower.py` for autonomous navigation basics.

3. **Modify config.yaml** - Tune for your use case
4. **Extend** - Add your own detectors/controllers

## Performance Reference - NVIDIA RTX 500 Ada Generation Laptop GPU

| Model              | Size   | FPS   | Avg (ms) | Std (ms) | Min (ms) | Max (ms) | Notes               |
| ------------------ | ------ | ----- | -------- | -------- | -------- | -------- | ------------------- |
| YOLOv8n-seg        | Nano   | 207.8 | 4.8      | 0.4      | 4.4      | 8.2      | Fastest model       |
| YOLOv8s-seg        | Small  | 120.2 | 8.3      | 0.1      | 8.2      | 9.1      | Most stable latency |
| YOLOv8m-seg        | Medium | 53.2  | 18.8     | 0.5      | 16.4     | 19.6     | Balanced trade-off  |
| Detectron2 R50-FPN | Large  | 9.7   | 102.7    | 0.8      | 101.2    | 107.5    | Slow but accurate   |

---

## Performance Reference Across GPUs

| GPU         | Model   | FPS Range |
| ----------- | ------- | --------- |
| RTX 3060    | YOLOv8n | 25–30     |
| RTX 3060    | YOLOv8s | 18–22     |
| GTX 1050 Ti | YOLOv8n | 18–22     |
| CPU         | YOLOv8n | 2–3       |

---

**Summary:**

- **Fastest model:** YOLOv8n-seg (Nano) — 207.8 FPS
- **Most stable latency:** YOLOv8s-seg (Small) — ±0.1 ms
- **Performance leap:** RTX 500 Ada delivers **~7–8× speedup** over RTX 3060 for YOLOv8n.
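
As a sanity check on the first table, FPS is simply the reciprocal of the average latency: 1000 ms / 4.8 ms ≈ 208 FPS for YOLOv8n-seg, and 1000 / 102.7 ≈ 9.7 FPS for Detectron2 R50-FPN. The same arithmetic backs the summary's speedup claim: 207.8 FPS against the RTX 3060's 25–30 FPS for YOLOv8n is roughly a 7–8× gap.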

## Files to Know

- **config.yaml** - All settings
- **tello_vision/app.py** - Main application
- **tello_vision/detectors/base_detector.py** - Add custom models here
- **examples/object_follower.py** - Autonomous control reference

tello_vision/detectors/__init__.py

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@

```python
"""Detector module for various object detection/segmentation backends."""

from .base_detector import BaseDetector, Detection, DetectionResult

__all__ = ["BaseDetector", "Detection", "DetectionResult"]
```
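
QUICKSTART points to `tello_vision/detectors/base_detector.py` for adding custom models. Judging from the `Detectron2Detector` added below, a new backend is a `BaseDetector` subclass roughly along these lines; the exact abstract interface of `base_detector.py` is not part of this diff, so the attributes and required methods here are assumptions (`get_class_name()` / `get_model_info()` may also need overriding):

```python
import time

import numpy as np

from tello_vision.detectors import BaseDetector, Detection, DetectionResult


class DummyDetector(BaseDetector):
    """Hypothetical backend skeleton (not part of this commit)."""

    def load_model(self) -> None:
        # Load real weights from self.config here, then mark the detector ready.
        self.class_names = ["person"]
        self._initialized = True

    def detect(self, frame: np.ndarray) -> DetectionResult:
        if not self._initialized:
            raise RuntimeError("Model not loaded. Call load_model() first.")
        start = time.time()
        h, w = frame.shape[:2]
        # Placeholder: report one detection covering the whole frame.
        detections = [
            Detection(
                class_id=0,
                class_name=self.class_names[0],
                confidence=1.0,
                bbox=(0, 0, w, h),
                mask=None,
            )
        ]
        return DetectionResult(
            detections=detections,
            inference_time=time.time() - start,
            frame_shape=frame.shape,
        )
```
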
Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@

```python
"""Detectron2 detector implementation.

Higher quality but slower than YOLO. Good for precision applications.
"""

import time

import numpy as np

from .base_detector import BaseDetector, Detection, DetectionResult


class Detectron2Detector(BaseDetector):
    """Detectron2 Mask R-CNN detector."""

    def __init__(self, config: dict):
        super().__init__(config)
        self.predictor = None
        self.metadata = None

    def load_model(self) -> None:
        """Load Detectron2 model."""
        try:
            from detectron2 import model_zoo
            from detectron2.config import get_cfg
            from detectron2.data import MetadataCatalog
            from detectron2.engine import DefaultPredictor
        except ImportError:
            raise ImportError(
                "detectron2 not installed. Install from: "
                "https://github.com/facebookresearch/detectron2"
            )

        cfg = get_cfg()

        # Load config
        config_file = self.config.get(
            "config_file", "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
        )
        cfg.merge_from_file(model_zoo.get_config_file(config_file))

        # Set model weights
        weights = self.config.get("model_weights")
        if weights and weights.startswith("detectron2://"):
            cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_file)
        else:
            cfg.MODEL.WEIGHTS = weights or model_zoo.get_checkpoint_url(config_file)

        # Set confidence threshold
        cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = self.config.get("confidence", 0.5)

        # Set device
        device = self.config.get("device", "cuda")
        cfg.MODEL.DEVICE = device

        print(f"Loading Detectron2 model: {config_file} on {device}")

        # Create predictor
        self.predictor = DefaultPredictor(cfg)

        # Get metadata for class names
        dataset_name = config_file.split("/")[0]
        if dataset_name.startswith("COCO"):
            self.metadata = MetadataCatalog.get("coco_2017_val")
        else:
            self.metadata = MetadataCatalog.get(cfg.DATASETS.TRAIN[0])

        self.class_names = self.metadata.thing_classes

        self._initialized = True
        print(f"Detectron2 model loaded. Classes: {len(self.class_names)}")

    def detect(self, frame: np.ndarray) -> DetectionResult:
        """Run Detectron2 detection on frame.

        Args:
            frame: Input image (H, W, C) in BGR format

        Returns:
            DetectionResult with all detections
        """
        if not self._initialized:
            raise RuntimeError("Model not loaded. Call load_model() first.")

        start_time = time.time()

        # Run inference
        outputs = self.predictor(frame)

        inference_time = time.time() - start_time

        # Parse results
        detections = []
        instances = outputs["instances"].to("cpu")

        if len(instances) > 0:
            boxes = instances.pred_boxes.tensor.numpy()
            scores = instances.scores.numpy()
            classes = instances.pred_classes.numpy()

            # Get masks if available
            masks = None
            if instances.has("pred_masks"):
                masks = instances.pred_masks.numpy()

            for idx in range(len(instances)):
                bbox = boxes[idx].astype(int)

                # Get mask
                mask = None
                if masks is not None:
                    mask = masks[idx].astype(np.uint8)

                detection = Detection(
                    class_id=int(classes[idx]),
                    class_name=self.get_class_name(int(classes[idx])),
                    confidence=float(scores[idx]),
                    bbox=tuple(bbox),
                    mask=mask,
                )
                detections.append(detection)

        return DetectionResult(
            detections=detections,
            inference_time=inference_time,
            frame_shape=frame.shape,
        )

    def get_class_name(self, class_id: int) -> str:
        """Get class name from ID."""
        if 0 <= class_id < len(self.class_names):
            return self.class_names[class_id]
        return f"class_{class_id}"

    def get_model_info(self) -> dict:
        """Get model information."""
        return {
            "backend": "detectron2",
            "config": self.config.get("config_file", "unknown"),
            "device": self.device,
            "num_classes": len(self.class_names),
            "classes": self.class_names,
        }
```
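
Putting the new detector to work might look like the following; the config keys mirror the ones read in `load_model()` above, while the import path and the image file are hypothetical placeholders:

```python
import cv2

# The import path below is hypothetical; the new file's path is not shown in this view.
from tello_vision.detectors.detectron2_detector import Detectron2Detector

config = {
    "config_file": "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml",
    "model_weights": "detectron2://",  # resolves to the model-zoo checkpoint
    "confidence": 0.5,
    "device": "cuda",  # or "cpu"
}

detector = Detectron2Detector(config)
detector.load_model()

frame = cv2.imread("test.jpg")  # any BGR image; the path is a placeholder
result = detector.detect(frame)

print(f"inference: {result.inference_time * 1000:.1f} ms")
for det in result.detections:
    x1, y1, x2, y2 = det.bbox
    print(f"{det.class_name:>12} {det.confidence:.2f} at ({x1},{y1})-({x2},{y2})")
    if det.mask is not None:
        print(f"    mask pixels: {int(det.mask.sum())}")
```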

0 commit comments
