refactor: resolve critical code smells (statements, branches, attributes) and add metrics pipeline #2377
EdivarCr wants to merge 36 commits into
…ean_average_precision.py
…_average_recall.py
…ny-statements Refactor/remove code smell too many statements
…ny-branchs Refactor/remove code smell too many branchs
…ce-attributes Refactor/remove code smell instance attributes
Borda left a comment
Interesting PR, but then, are you serious about adding metrics files and refactoring without explaining why?
Pull request overview
This PR performs a broad refactor across Supervision’s detection/metrics utilities to reduce complexity and internal state while adding a “before/after” metrics extraction pipeline (Radon/Pylint/Pytest/Coverage/CodeCarbon) and updating tests to cover refactored behaviors.
Changes:
- Refactors several core components (metrics result objects, video processing, inference slicing, zone annotators, SAM3/VLM parsing) to reduce branching/attributes and improve structure.
- Adds/updates tests to validate refactored public accessors/configuration and to guard against regressions in parsing/alignment logic.
- Adds metric-extraction scripts and commits baseline/after metric reports for comparison.
Code quality / tests / docs (n/5): 2/5, 4/5, 2/5
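As a rough illustration of the attribute-reduction pattern summarized above (grouping size sub-results behind a `size_results` container while keeping flat accessors), a minimal sketch might look like the following. The class names `SizeResults` and `ResultSketch` and the accessor `small_objects` are assumptions made for illustration; this is not the code from the PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SizeResults:
    """Hypothetical container that groups the per-size sub-results."""

    small: Optional[float] = None
    medium: Optional[float] = None
    large: Optional[float] = None


@dataclass
class ResultSketch:
    """Hypothetical stand-in for a metrics result object."""

    score: float
    # One grouped attribute replaces three separate instance attributes.
    size_results: SizeResults = field(default_factory=SizeResults)

    # Property proxy: existing callers can keep using the flat accessor.
    @property
    def small_objects(self) -> Optional[float]:
        return self.size_results.small

    @small_objects.setter
    def small_objects(self, value: Optional[float]) -> None:
        self.size_results.small = value


result = ResultSketch(score=0.5)
result.small_objects = 0.25  # writes through to the grouped container
```

The trade-off in a design like this is that the instance-attribute count drops for the linter while the public surface stays the same, so the regression tests on accessors listed in the table below are what would guard the behavior.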
Reviewed changes
Copilot reviewed 70 out of 90 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/metrics/test_recall.py | Adds regression coverage for RecallResult size-subresult accessors. |
| tests/metrics/test_precision.py | Adds regression coverage for PrecisionResult size-subresult accessors. |
| tests/metrics/test_mean_average_recall.py | Adds regression coverage for MeanAverageRecallResult size-subresult accessors. |
| tests/metrics/test_mean_average_precision.py | Adds regression coverage for MeanAveragePrecisionResult size-subresult accessors. |
| tests/metrics/test_f1_score.py | Adds regression coverage for F1ScoreResult size-subresult accessors. |
| tests/detection/tools/test_inference_slicer.py | Adds coverage for InferenceSlicer configuration property access/mutation. |
| tests/detection/test_vlm.py | Adds coverage ensuring Gemini 2.5 class filtering keeps arrays aligned. |
| tests/detection/test_line_counter.py | Adjusts tests for refactored LineZone annotators/configuration. |
| tests/detection/test_from_sam.py | Expands SAM3 adapter test coverage for object-style results. |
| src/supervision/utils/video.py | Extracts process_video threading/queue logic into a dedicated helper class. |
| src/supervision/tracker/byte_tracker/single_object_track.py | Refactors STrack state into a dataclass-backed container with properties. |
| src/supervision/metrics/recall.py | Refactors RecallResult to group size subresults behind a size_results container. |
| src/supervision/metrics/mean_average_recall.py | Refactors MeanAverageRecallResult size subresults via mixin + grouped container. |
| src/supervision/metrics/f1_score.py | Refactors F1ScoreResult to group size subresults behind size_results. |
| src/supervision/metrics/detection.py | Splits detection confusion-matrix logic into smaller helpers. |
| src/supervision/key_points/core.py | Refactors KeyPoints.__getitem__ into smaller normalization/selection helpers. |
| src/supervision/detection/vlm.py | Refactors Gemini JSON extraction/parsing and filtering into helpers. |
| src/supervision/detection/utils/masks.py | Extracts distance-threshold resolution and label filtering into helpers. |
| src/supervision/detection/utils/iou_and_nms.py | Extracts oriented-box validation and overlap-metric normalization helpers. |
| src/supervision/detection/utils/internal.py | Extracts RLE decode + data-merge validation/merge helpers; simplifies roboflow parsing. |
| src/supervision/detection/tools/polygon_zone.py | Refactors PolygonZoneAnnotator to a config dataclass + properties. |
| src/supervision/detection/tools/inference_slicer.py | Refactors InferenceSlicer config/state into dataclasses and splits execution paths. |
| src/supervision/detection/line_zone.py | Refactors LineZone annotators to config dataclasses + property proxies. |
| src/supervision/detection/core.py | Refactors SAM3 normalization/mask-building helpers for reduced branching. |
| src/supervision/dataset/formats/coco.py | Refactors COCO segmentation construction into helpers; improves edge-case handling. |
| metrics-before-radon/raw_por_arquivo_e_total_antes.csv | Baseline Radon “raw” CSV report. |
| metrics-before-radon/mi_por_arquivo_antes.csv | Baseline Radon Maintainability Index CSV report. |
| metrics-before-radon/mi_antes.json | Baseline Radon Maintainability Index JSON report. |
| metrics-before-radon/hal_por_arquivo_antes.csv | Baseline Radon Halstead CSV report. |
| metrics-before-radon/cc_por_arquivo_antes.csv | Baseline Radon Cyclomatic Complexity CSV report. |
| metrics-before-pylint/pylint_score_antes.txt | Baseline Pylint score capture. |
| metrics-before-pylint/pylint_ranking_smells_antes.json | Baseline Pylint refactor-smell ranking. |
| metrics-before-pylint/pylint_fatal_antes.json | Baseline Pylint fatal messages (empty). |
| metrics-before-pylint/pylint_distribuicao_categorias_antes.json | Baseline Pylint message-category distribution. |
| metrics-before-codecarbon/emissions_antes.csv | Baseline CodeCarbon emissions output. |
| metrics-after-radon/raw_por_arquivo_e_total_depois.csv | Post-change Radon “raw” CSV report. |
| metrics-after-radon/mi_por_arquivo_depois.csv | Post-change Radon Maintainability Index CSV report. |
| metrics-after-radon/mi_depois.json | Post-change Radon Maintainability Index JSON report. |
| metrics-after-radon/cc_por_arquivo_depois.csv | Post-change Radon Cyclomatic Complexity CSV report. |
| metrics-after-pylint/pylint_score_depois.txt | Post-change Pylint score capture. |
| metrics-after-pylint/pylint_ranking_smells_depois.json | Post-change Pylint refactor-smell ranking. |
| metrics-after-pylint/pylint_fatal_depois.json | Post-change Pylint fatal messages (empty). |
| metrics-after-pylint/pylint_distribuicao_categorias_depois.json | Post-change Pylint message-category distribution. |
| metrics-after-codecarbon/emissions_depois.csv | Post-change CodeCarbon emissions output. |
| extract_score_before_pylint.py | Script to extract baseline Pylint score. |
| extract_score_after_pylint.py | Script to extract post-change Pylint score. |
| extract_metrics_before_radon.py | Script to generate baseline Radon reports. |
| extract_metrics_before_pytest.py | Script to generate baseline Pytest/Coverage reports. |
| extract_metrics_before_pylint.py | Script to generate baseline Pylint JSON breakdown reports. |
| extract_metrics_before_codecarbon.py | Script to generate baseline CodeCarbon emissions report. |
| extract_metrics_after_radon.py | Script to generate post-change Radon reports. |
| extract_metrics_after_pytest.py | Script to generate post-change Pytest/Coverage reports. |
| extract_metrics_after_pylint.py | Script to generate post-change Pylint JSON breakdown reports. |
| extract_metrics_after_codecarbon.py | Script to generate post-change CodeCarbon emissions report. |

with self._obb_thread_workers_lock:
    if self._obb_thread_workers_warned:
        return
    self._obb_thread_workers_warned = True

def __init__(
    self,
    tlwh: npt.NDArray[np.float32],
    score: float,
    minimum_consecutive_frames: int,

# Configuration; adjust only if necessary.

# Root directory of the cloned project; the tests start running from it.
PROJETO = "."

Before submitting
Description
General refactoring of the Supervision library aimed at architectural optimization, eliminating critical code smells identified by static analysis tools, and establishing a comparative metrics pipeline (before and after the process).
Type of Change
Motivation and Context
The main goal of this refactoring was to increase the maintainability, readability, and cohesion of the project's components. Three major groups of code smells were addressed:
- … Detections class.
- … PolygonZone, and LineZone).

Additionally, automated scripts were integrated to document ecological impact (CodeCarbon), test coverage (Pytest + Coverage), cyclomatic complexity (Radon), and style compliance (Pylint).
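To make the before/after comparison concrete, here is a minimal sketch of how two per-file metric reports could be diffed. The column names `file` and `mi` and the sample values are placeholders invented for illustration (the actual layout of the committed CSV reports is not shown here), and the helpers `load_metric` and `metric_deltas` are hypothetical.

```python
import csv
import io
from typing import Dict


def load_metric(csv_text: str, key_col: str, value_col: str) -> Dict[str, float]:
    """Map each row's key column to its numeric metric value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col]: float(row[value_col]) for row in reader}


def metric_deltas(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, float]:
    """Return after - before for every file present in both reports."""
    shared = before.keys() & after.keys()
    return {name: after[name] - before[name] for name in sorted(shared)}


# Placeholder data only; these are NOT values from the PR's reports.
before_csv = "file,mi\nvideo.py,40.0\nrecall.py,55.5\n"
after_csv = "file,mi\nvideo.py,52.5\nrecall.py,60.0\n"

deltas = metric_deltas(
    load_metric(before_csv, "file", "mi"),
    load_metric(after_csv, "file", "mi"),
)
print(deltas)
```

In practice the text would come from the report files rather than inline strings, and files present in only one report (added or deleted by the refactor) would need separate handling, since this sketch silently skips them.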
Changes Made
1. Statement Complexity Refactoring (PR #1)
- … byte_tracker/core.py tracking algorithm.
- … supervision/utils/video.py.
- … supervision/metrics/.
2. Branch Complexity Refactoring (PR #2)
- … supervision/detection/core.py and mask utilities (masks.py).
- … InferenceSlicer).
3. Instance Attribute Cleanup (PR #3)
- … mean_average_precision.py, mean_average_recall.py, f1_score.py, precision.py, recall.py) to avoid storing redundant internal states.
- … PolygonZone and LineZone).
4. Baseline and Metrics Extraction
- … extract_metrics_after_...).
- … metrics-before-* and metrics-after-* directories to audit the impact of the refactoring process.
Testing
Validation was performed by running the complete test suite via pytest: