fix(annotators): clip crops, fix heatmap wrap, release capture #2389
Open
Borda wants to merge 6 commits into
- `IconAnnotator`/`CropAnnotator` called the deprecated public `overlay_image`, emitting a warning users could not avoid and breaking both at its 0.31.0 removal; extract a private `_overlay_image` and delegate the public wrapper to it
- `CropAnnotator` crashed (`cv2.error`) on detections partially outside the frame; clip boxes to scene bounds and skip degenerate boxes, matching `BlurAnnotator`/`PixelateAnnotator`
- `HeatMapAnnotator` cast per-pixel frame counts to uint8, so heat silently vanished after 256 accumulated frames; derive the mask from the float accumulator directly
- `get_video_frames_generator` leaked the `VideoCapture` when a consumer broke out early; release it in a `finally`
- add regression tests for all four

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
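The `finally`-based release described above can be sketched without OpenCV. This is a minimal illustration, not the library's code: `FakeCapture` and `frames` are hypothetical stand-ins for `cv2.VideoCapture` and `get_video_frames_generator`, and the sketch only shows that closing a generator early still runs the cleanup.

```python
from typing import Iterator, Optional, Tuple


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture (not the real class)."""

    def __init__(self, frame_count: int) -> None:
        self._frames = list(range(frame_count))
        self.released = False

    def read(self) -> Tuple[bool, Optional[int]]:
        # Mimic the (success, frame) pair returned by a capture object.
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self) -> None:
        self.released = True


def frames(capture: FakeCapture) -> Iterator[int]:
    """Yield frames and release the capture however iteration ends."""
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield frame
    finally:
        # Runs on exhaustion and also when the generator is closed early,
        # because close() raises GeneratorExit at the paused yield.
        capture.release()


capture = FakeCapture(frame_count=10)
generator = frames(capture)
first = next(generator)  # consume a single frame
generator.close()        # the consumer stops early
```

The explicit `close()` makes the cleanup deterministic in this sketch. When a consumer simply breaks out of a `for` loop, the generator is closed once it is garbage collected, which is when the `finally` block runs.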
- `train_test_split` seeded the global `random` module and shuffled the caller's list in place, so `DetectionDataset.split()` reordered its own `image_paths` and polluted process-wide randomness; use a local `random.Random` and shuffle a copy
- Pascal VOC class ids were assigned in `set`-iteration and filesystem-glob order, so the same dataset produced different `class_id` values across runs; sort class names and the loaded file list
- dataset exports keyed output files on basename, silently overwriting when two entries shared a name across directories (common after `merge()`); detect basename collisions and raise
- add regression tests for split determinism, VOC id stability, and export collisions

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
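The split fix above (local `random.Random`, shuffle a copy) might look roughly like the following. The function name `split_items` and its signature are assumptions made for illustration; the actual `train_test_split` signature is not shown in this PR description.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_items(
    items: Sequence[T], train_ratio: float, seed: int
) -> Tuple[List[T], List[T]]:
    """Hypothetical helper: deterministic split with no side effects."""
    rng = random.Random(seed)  # private generator, global state untouched
    shuffled = list(items)     # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


paths = [f"img_{i}.jpg" for i in range(10)]
original_order = list(paths)
state_before = random.getstate()

train, test = split_items(paths, train_ratio=0.8, seed=42)
```

After the call, `paths` is still in its original order and `random.getstate()` is unchanged, which are the two properties the regression tests in this PR are described as checking.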
- legacy `MeanAveragePrecision.from_tensors` appended per-image stats only when ground truth was present, so predictions on background images were never counted as false positives and mAP was inflated
- append an FP-only stats tuple for prediction-present, target-empty images, mirroring the pattern in `f1_score.py`
- add regression test asserting background false positives lower mAP

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
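The effect of skipping background images can be shown with plain precision rather than full mAP. This is a simplified sketch under that assumption: `ImageResult` and `precision` are hypothetical names, and the real metric works on per-class, confidence-sorted statistics.

```python
from typing import List, NamedTuple


class ImageResult(NamedTuple):
    """Hypothetical per-image record used only for this illustration."""

    matches: List[bool]  # one flag per prediction: True if it matched a target
    num_targets: int     # number of ground-truth objects in the image


def precision(results: List[ImageResult], count_background: bool) -> float:
    true_positives = 0
    false_positives = 0
    for result in results:
        if result.num_targets == 0 and not count_background:
            # Legacy behaviour: images without ground truth are skipped,
            # so their predictions never count against the model.
            continue
        hits = sum(result.matches)
        true_positives += hits
        false_positives += len(result.matches) - hits
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


results = [
    ImageResult(matches=[True, True], num_targets=2),    # two correct boxes
    ImageResult(matches=[False, False], num_targets=0),  # background image
]
legacy = precision(results, count_background=False)
fixed = precision(results, count_background=True)
```

Here `legacy` is 1.0 and `fixed` is 0.5: once the two predictions on the background image are counted as false positives, the score drops, which is the direction the regression test asserts for mAP.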
- benchmark guide installed `inference` from a personal dev branch that can be deleted or go stale; use the released package and add the `metrics` extra needed by the guide's evaluation code

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
- `from_tensorflow` scaled boxes in place on the array returned by `.numpy()`, which can share memory with the source tensor, corrupting caller data and double-scaling on a repeat call; copy before scaling
- `from_lmm` raised a bare `KeyError` for `MOONDREAM` and `QWEN_3_VL`, which the enum and docstring advertise; map both to their `VLM` members
- `from_deepseek_vl_2` returned a `(0,)`-shaped `xyxy` on empty output, so a zero-detection response crashed the `Detections` constructor; return `(0, 4)` like the other parsers
- `extract_ultralytics_masks` binarized bilinear-resized masks with `> 0`, dilating every mask at object boundaries; threshold at 0.5 to match Ultralytics
- add connector coverage: fake-result shims and round-trip tests (N>1, N=1, empty) for the nine previously untested `from_*` connectors and the `detection/tools/transformers.py` processors; one empty-`segments_info` panoptic case is xfail-marked pending a separate fix

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
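Why `> 0` dilates an interpolated mask can be seen in one dimension. The sketch below is a pure-Python analogue, assuming simple linear interpolation in place of the bilinear resize that `extract_ultralytics_masks` applies to 2D masks; `upsample_linear` is a hypothetical helper.

```python
from typing import List


def upsample_linear(values: List[float], factor: int) -> List[float]:
    """Insert `factor - 1` linearly interpolated samples between neighbours."""
    last = len(values) - 1
    out = []
    for i in range(last * factor + 1):
        position = i / factor
        low = int(position)
        high = min(low + 1, last)
        weight = position - low
        out.append(values[low] * (1 - weight) + values[high] * weight)
    return out


mask = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]  # the object covers two samples
resized = upsample_linear(mask, factor=2)

loose = [value > 0 for value in resized]     # rule used before the fix
strict = [value > 0.5 for value in resized]  # rule adopted in this PR
```

Interpolation produces fractional values (0.5 here) on each side of the object. The `> 0` rule keeps them, so the object covers 5 of the 11 resized samples, while the `> 0.5` rule keeps only the 3 samples that are fully inside it.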
- add a guard test asserting every `supervision.__all__` name is importable
- cover previously untested public functions: mask/polygon converters, `pad_boxes`, `box_non_max_merge`, draw primitives, and `VideoSink`/`ImageSink` round-trips
- add `DetectionDataset.split()` tests for determinism, disjoint coverage, ratio boundaries, and non-mutation of the source

---

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
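A guard of the kind described in the first bullet can be sketched generically. The helper name `missing_exports` is an assumption, and the example runs against the standard-library `json` module and a synthetic module rather than `supervision` so that it stays self-contained.

```python
import json
import types
from typing import List


def missing_exports(module: types.ModuleType) -> List[str]:
    """Return names listed in `__all__` that the module does not define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# A synthetic module that advertises a name it never defines.
broken = types.ModuleType("broken_package")
broken.present = object()
broken.__all__ = ["present", "absent"]

json_missing = missing_exports(json)      # a healthy module reports nothing
broken_missing = missing_exports(broken)  # the stale name is reported
```

A test built on this idea would assert that the returned list is empty for the package under test, so a name left in `__all__` after a rename or removal fails fast.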
Pull request overview
This PR improves robustness and determinism across Supervision’s core workflows: dataset export/splitting, image/video utilities used by annotators, and detection/metrics correctness. It also adds broad regression coverage for these edge cases.
Changes:
- Prevent silent overwrites during dataset export by detecting basename collisions; improve determinism in splitting and Pascal VOC loading.
- Refactor internal image overlay usage to avoid deprecated API warnings; harden annotators (clip/skip out-of-bounds crops, prevent heatmap uint8 wraparound).
- Fix correctness issues in adapters/utilities (TensorFlow box scaling mutability, Ultralytics mask resize thresholding) and ensure mAP penalizes background false positives; add/expand tests.
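The clip-and-skip handling for out-of-bounds crops listed above can be illustrated with a small helper. This is a sketch under stated assumptions: `clip_box` is a hypothetical name, boxes are integer `(x1, y1, x2, y2)` tuples, and the annotator's actual implementation may differ.

```python
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def clip_box(box: Box, width: int, height: int) -> Optional[Box]:
    """Clip a box to the frame; return None if nothing remains to crop."""
    x1, y1, x2, y2 = box
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    if x2 <= x1 or y2 <= y1:
        # Degenerate after clipping: zero or negative area, so skip it
        # instead of passing an empty region to the image library.
        return None
    return x1, y1, x2, y2


partly_left = clip_box((-10, 5, 20, 30), width=100, height=50)
partly_right = clip_box((90, 40, 120, 70), width=100, height=50)
fully_outside = clip_box((120, 10, 150, 20), width=100, height=50)
```

Boxes that straddle the frame edge are trimmed to the visible part, and a box that lies entirely outside yields `None` so the caller can skip it.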
Holistic assessment (per CONTRIBUTING guidance):
- Code quality: 4/5
- Testing: 5/5
- Documentation: 4/5
Reviewed changes
Copilot reviewed 30 out of 31 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `src/supervision/annotators/core.py` | Switch internal overlay to non-deprecated implementation; clip/skip invalid crop regions; fix heatmap masking to avoid wrap-related artifacts. |
| `src/supervision/utils/image.py` | Add `_overlay_image` as the non-deprecated internal overlay implementation backing the public wrapper. |
| `src/supervision/utils/video.py` | Ensure `VideoCapture.release()` is always called via `try`/`finally` in the frame generator. |
| `src/supervision/metrics/detection.py` | Record predictions on GT-empty images as false positives so AP/mAP is penalized accordingly. |
| `src/supervision/detection/core.py` | Avoid mutating TensorFlow result tensors by copying prior to in-place scaling; expand legacy LMM→VLM mapping. |
| `src/supervision/detection/utils/internal.py` | Threshold resized Ultralytics proto masks after interpolation to avoid dilation. |
| `src/supervision/detection/vlm.py` | Ensure DeepSeek-VL2 empty parses yield a correctly shaped empty `(0, 4)` `xyxy` array. |
| `src/supervision/dataset/utils.py` | Add basename collision guard for flat exports; isolate RNG usage in `train_test_split` and avoid mutating caller input. |
| `src/supervision/dataset/core.py` | Enforce basename collision checks for Pascal VOC annotation export paths. |
| `src/supervision/dataset/formats/yolo.py` | Enforce basename collision checks for YOLO label export naming. |
| `src/supervision/dataset/formats/pascal_voc.py` | Make VOC loading deterministic by sorting image paths and class name iteration. |
| `docs/how_to/benchmark_a_model.md` | Simplify installation instructions (including `metrics` extra). |
| `tests/annotators/test_core.py` | Add regressions for heatmap wraparound, deprecation-warning suppression, and out-of-bounds crop handling. |
| `tests/utils/test_image.py` | Assert public `overlay_image` matches `_overlay_image` output while allowing deprecation warnings. |
| `tests/utils/test_video.py` | Verify early generator close still releases the capture. |
| `tests/utils/test_sinks.py` | Add round-trip filesystem tests for `ImageSink` and `VideoSink`. |
| `tests/metrics/test_detection.py` | Add regression ensuring background-image false positives reduce mAP vs FP-free baseline. |
| `tests/dataset/test_utils.py` | Add regressions for RNG isolation and basename-collision detection utility. |
| `tests/dataset/test_core.py` | Add export collision regression; add deterministic and non-mutating split tests; add classification dataset folder round-trip tests. |
| `tests/dataset/formats/test_pascal_voc.py` | Add deterministic class ordering and repeatable class-id assignment regressions for VOC loads. |
| `tests/draw/test_utils.py` | Expand coverage for additional drawing and grayscale utilities. |
| `tests/detection/test_vlm.py` | Align empty-parse behavior across VLM parsers to return empty `Detections`. |
| `tests/detection/test_from_adapters.py` | Add extensive adapter routing and empty-input coverage; regression for TensorFlow mutation; LMM mapping tests. |
| `tests/detection/tools/test_transformers.py` | Add processing-unit tests for transformers detection/segmentation conversions and dispatch. |
| `tests/detection/tools/__init__.py` | Package init for transformer tool tests. |
| `tests/detection/utils/test_internal.py` | Add regression for Ultralytics mask resize thresholding behavior. |
| `tests/detection/utils/test_iou_and_nms.py` | Add tests for `box_non_max_merge` behavior. |
| `tests/detection/utils/test_converters.py` | Add polygon/mask conversion tests and round-trips. |
| `tests/detection/utils/test_boxes.py` | Add tests for new `pad_boxes` utility. |
| `tests/helpers.py` | Add additional fake result/tensor helpers to support expanded adapter/tool test coverage. |
| `tests/test_public_api.py` | Add guard that every symbol in `supervision.__all__` is importable. |
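The heatmap wraparound listed for `src/supervision/annotators/core.py` can be illustrated without NumPy. In this sketch the uint8 cast is simulated with a bitmask (`& 0xFF`), which is an assumption standing in for the array cast in the annotator, and both function names are hypothetical.

```python
from typing import List


def mask_from_uint8_cast(counts: List[float]) -> List[bool]:
    """Simulate the old path: reduce each count to 8 bits, then test > 0."""
    return [(int(count) & 0xFF) > 0 for count in counts]


def mask_from_float(counts: List[float]) -> List[bool]:
    """Fixed path: test the float accumulator directly."""
    return [count > 0 for count in counts]


# Per-pixel accumulated frame counts, as floats.
counts = [0.0, 1.0, 255.0, 256.0, 512.0]

wrapped = mask_from_uint8_cast(counts)
direct = mask_from_float(counts)
```

With the simulated cast, the pixels at 256 and 512 wrap to zero and drop out of the mask, while the direct comparison keeps every pixel that has accumulated any heat.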
Comment on lines +1435 to +1439

    predicted_objs[:, 5],
    predicted_objs[:, 4],
    np.zeros((0,)),
    )
    )
This pull request introduces several improvements and bug fixes across the annotation, dataset export, and detection utilities in the codebase. The main themes are: preventing file overwrite collisions during dataset export, improving annotation robustness and internal API usage, and fixing subtle bugs in detection and metric calculations. The changes also include expanded test coverage for new and edge-case behaviors.
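Dataset export: collision prevention and consistency

- Adds a `check_no_basename_collisions` utility to prevent silent overwrites when exporting datasets with images or annotations sharing the same basename. This check is now enforced in Pascal VOC, YOLO, and image export workflows (`src/supervision/dataset/utils.py`, `src/supervision/dataset/core.py`, `src/supervision/dataset/formats/yolo.py`).
- Pascal VOC loading now sorts image paths and class names, so class ids are stable across runs (`src/supervision/dataset/formats/pascal_voc.py`).

Annotation and image overlay improvements

- Internal overlay calls now go through a private `_overlay_image` function, preventing deprecation warnings in library-internal calls and updating all internal usages (`src/supervision/annotators/core.py`, `src/supervision/utils/image.py`).
- `CropAnnotator` clips boxes to the scene and skips degenerate ones, and `HeatMapAnnotator` derives its mask from the float accumulator to avoid uint8 wraparound (`src/supervision/annotators/core.py`, `tests/annotators/test_core.py`).

Detection and metrics bug fixes

- `from_tensorflow` copies the box array before scaling, so the source tensor is not mutated (`src/supervision/detection/core.py`).
- Resized Ultralytics masks are thresholded at 0.5 instead of `> 0` to avoid dilation (`src/supervision/detection/utils/internal.py`).
- Predictions on images without ground truth are counted as false positives in mAP (`src/supervision/metrics/detection.py`).

Other improvements

- `train_test_split` uses a local `random.Random` and shuffles a copy of its input (`src/supervision/dataset/utils.py`).
- `from_lmm` maps `MOONDREAM` and `QWEN_3_VL` to their `VLM` members, and `from_deepseek_vl_2` returns a `(0, 4)`-shaped `xyxy` on empty output (`src/supervision/detection/core.py`, `src/supervision/detection/vlm.py`).
- `get_video_frames_generator` releases the `VideoCapture` in a `finally` block (`src/supervision/utils/video.py`).
- The benchmark guide installs the released `inference` package with the `metrics` extra (`docs/how_to/benchmark_a_model.md`).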
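The basename collision check described in this PR can be sketched as follows. The helper names `find_basename_collisions` and `ensure_no_collisions` are hypothetical, as is the choice of `ValueError`; the signature of the PR's own `check_no_basename_collisions` utility is not shown here.

```python
import os
from collections import defaultdict
from typing import Dict, Iterable, List


def find_basename_collisions(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Group paths by basename and keep only names used more than once."""
    by_name: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        by_name[os.path.basename(path)].append(path)
    return {name: group for name, group in by_name.items() if len(group) > 1}


def ensure_no_collisions(paths: Iterable[str]) -> None:
    """Raise before export if two entries would write the same file."""
    collisions = find_basename_collisions(paths)
    if collisions:
        details = "; ".join(
            f"{name}: {', '.join(group)}" for name, group in collisions.items()
        )
        raise ValueError(f"Basename collisions would overwrite files: {details}")


merged_paths = ["set_a/img_001.jpg", "set_b/img_001.jpg", "set_b/img_002.jpg"]
collisions = find_basename_collisions(merged_paths)
```

Failing loudly before any file is written is the design choice here: a flat export directory can hold only one `img_001.jpg`, so the second entry would otherwise replace the first without any error.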