refactor: resolve critical code smells (statements, branches, attributes) and add metrics pipeline #2377

Draft
EdivarCr wants to merge 36 commits into roboflow:develop from EdivarCr:develop

@EdivarCr

Before submitting
  • Self-reviewed the code
  • Updated documentation, follow Google-style
  • Added docs entry for autogeneration (if new functions/classes)
  • Added/updated tests
  • All tests pass locally

Description

General refactoring of the Supervision library aimed at architectural optimization, eliminating critical code smells identified by static analysis tools, and establishing a comparative metrics pipeline (before and after the process).

Type of Change

  • 🔨 Refactoring (non-breaking change which improves code structure)
  • 🧪 Test update
  • 🔧 Chore (dependencies, configs, etc.)

Motivation and Context

The main goal of this refactoring was to increase the maintainability, readability, and cohesion of the project's components. Three major groups of code smells were addressed:

  1. Too Many Statements: Reduced the number of statements in key functions (mAP calculation, trackers, and video utilities).
  2. Too Many Branches: Optimized decision flows in mask filters, NMS, inference slicing, and the core Detections class.
  3. Too Many Instance Attributes: Simplified classes that accumulated excessive internal state (especially validation metrics classes, PolygonZone, and LineZone).

Additionally, automated scripts were integrated to document ecological impact (CodeCarbon), test coverage (Pytest + Coverage), cyclomatic complexity (Radon), and style compliance (Pylint).
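
To make the cyclomatic complexity figure concrete, here is a minimal sketch that counts decision points per function with the standard-library `ast` module. It is a rough approximation for illustration only, not Radon's algorithm and not one of the scripts in this PR; `approximate_complexity` and `SAMPLE` are hypothetical names.

```python
import ast

# Node types treated as decision points in this rough approximation.
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approximate_complexity(source: str) -> dict:
    """Return {function_name: 1 + number of decision points} per function."""
    tree = ast.parse(source)
    scores = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # ast.walk also visits nested functions, so their branches
            # are attributed to the enclosing function as well.
            decisions = sum(
                isinstance(child, DECISION_NODES) for child in ast.walk(node)
            )
            scores[node.name] = 1 + decisions
    return scores


SAMPLE = '''
def classify(area):
    if area < 32:
        return "small"
    elif area < 96:
        return "medium"
    return "large"
'''

print(approximate_complexity(SAMPLE))  # prints {'classify': 3}
```

The `elif` is parsed as a second `ast.If` nested in the first one's `orelse`, which is why the sample scores 3.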

Changes Made

1. Statement Complexity Refactoring (PR #1)

  • Reduced complexity in the byte_tracker/core.py tracking algorithm.
  • Split long utility functions in supervision/utils/video.py.
  • Simplified mathematical expressions and evaluation logic in supervision/metrics/.

2. Branch Complexity Refactoring (PR #2)

  • Simplified nested conditional branches in the COCO dataset formatter.
  • Streamlined control flows in supervision/detection/core.py and mask utilities (masks.py).
  • Optimized decision-making paths in the inference slicer (InferenceSlicer).
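
As an illustration of the kind of branch reduction described above, the sketch below replaces an if/elif chain with guard clauses plus an enum lookup. The names `OverlapMetric` and `normalize_overlap_metric` are hypothetical stand-ins, not the code from this PR.

```python
from enum import Enum


class OverlapMetric(Enum):
    """Hypothetical stand-in for an overlap-metric enum."""

    IOU = "iou"
    IOS = "ios"


def normalize_overlap_metric(value):
    """Accept an enum member or a string and return the enum member."""
    # Guard clauses handle the simple cases first and exit early.
    if isinstance(value, OverlapMetric):
        return value
    if not isinstance(value, str):
        raise TypeError(f"Expected str or OverlapMetric, got {type(value).__name__}")
    # The enum lookup replaces one branch per supported metric.
    try:
        return OverlapMetric(value.strip().lower())
    except ValueError:
        valid = ", ".join(member.value for member in OverlapMetric)
        raise ValueError(
            f"Unknown overlap metric {value!r}; expected one of: {valid}"
        ) from None


print(normalize_overlap_metric(" IoU "))  # prints OverlapMetric.IOU
```

Adding a new metric then means adding an enum member rather than another branch.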

3. Instance Attribute Cleanup (PR #3)

  • Refactored evaluation classes (mean_average_precision.py, mean_average_recall.py, f1_score.py, precision.py, recall.py) to avoid storing redundant internal states.
  • Cleaned up attributes in zone counting classes (PolygonZone and LineZone).
  • Updated and aligned the corresponding test suites to reflect these architectural improvements.
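
The attribute cleanup can be pictured roughly as follows: per-size values move into one grouped container, and read-only properties keep the old accessors working. This is a minimal sketch under assumed names (`SizeResults`, `MetricResult`, `small_objects`, and so on are hypothetical), not the classes changed in this PR.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SizeResults:
    """Hypothetical container grouping the per-object-size subresults."""

    small: Optional[float] = None
    medium: Optional[float] = None
    large: Optional[float] = None


@dataclass
class MetricResult:
    """Hypothetical result object: two instance attributes instead of four."""

    score: float
    size_results: SizeResults = field(default_factory=SizeResults)

    # Property proxies keep the previous attribute-style access available.
    @property
    def small_objects(self) -> Optional[float]:
        return self.size_results.small

    @property
    def medium_objects(self) -> Optional[float]:
        return self.size_results.medium

    @property
    def large_objects(self) -> Optional[float]:
        return self.size_results.large


result = MetricResult(score=0.71, size_results=SizeResults(small=0.4, large=0.9))
print(result.small_objects, result.medium_objects, result.large_objects)
# prints 0.4 None 0.9
```

Whether callers should still be able to assign to the old attribute names is a separate design decision; this sketch exposes them as read-only.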

4. Baseline and Metrics Extraction

  • Added automated metric extraction scripts for Radon, Pytest, Pylint, and CodeCarbon (extract_metrics_after_...).
  • Saved comparative reports under metrics-before-* and metrics-after-* directories to audit the impact of the refactoring process.
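
A before/after comparison of two such reports could look like the sketch below. The column names `file` and `complexity` and the sample values are assumptions made for illustration; the actual schema of the committed reports is not shown in this description.

```python
import csv
import io


def load_scores(csv_text: str) -> dict:
    """Parse a CSV with assumed columns `file` and `complexity`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["file"]: float(row["complexity"]) for row in reader}


def compare(before: dict, after: dict) -> dict:
    """Return after-minus-before deltas for files present in both reports."""
    return {
        path: after[path] - before[path]
        for path in sorted(before.keys() & after.keys())
    }


# Made-up sample data, not results from this PR.
BEFORE = "file,complexity\na.py,12\nb.py,7\n"
AFTER = "file,complexity\na.py,8\nb.py,7\nc.py,3\n"

print(compare(load_scores(BEFORE), load_scores(AFTER)))
# prints {'a.py': -4.0, 'b.py': 0.0}
```

Files that exist in only one report (such as `c.py` here) are skipped, so new or deleted modules would need to be reported separately.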

Testing

  • I have tested this code locally
  • I have added unit tests that prove my fix is effective or that my feature works
  • All new and existing tests pass

Validation was performed by running the complete test suite via pytest:

uv run pytest

EdivarCr and others added 30 commits June 25, 2026 19:07
EdivarCr requested a review from SkalskiP as a code owner June 30, 2026 15:09
@CLAassistant


CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ EdivarCr
❌ Julian-Cardoso
You have signed the CLA already but the status is still pending? Let us recheck it.

Borda left a comment

Member

Interesting PR, but then, are you serious about adding metrics files and refactoring without explaining why?

Borda marked this pull request as draft June 30, 2026 22:58

Copilot AI left a comment

Contributor

Pull request overview

This PR performs a broad refactor across Supervision’s detection/metrics utilities to reduce complexity and internal state while adding a “before/after” metrics extraction pipeline (Radon/Pylint/Pytest/Coverage/CodeCarbon) and updating tests to cover refactored behaviors.

Changes:

  • Refactors several core components (metrics result objects, video processing, inference slicing, zone annotators, SAM3/VLM parsing) to reduce branching/attributes and improve structure.
  • Adds/updates tests to validate refactored public accessors/configuration and to guard against regressions in parsing/alignment logic.
  • Adds metric-extraction scripts and commits baseline/after metric reports for comparison.

Code quality / tests / docs (n/5): 2/5, 4/5, 2/5

Reviewed changes

Copilot reviewed 70 out of 90 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| tests/metrics/test_recall.py | Adds regression coverage for RecallResult size-subresult accessors. |
| tests/metrics/test_precision.py | Adds regression coverage for PrecisionResult size-subresult accessors. |
| tests/metrics/test_mean_average_recall.py | Adds regression coverage for MeanAverageRecallResult size-subresult accessors. |
| tests/metrics/test_mean_average_precision.py | Adds regression coverage for MeanAveragePrecisionResult size-subresult accessors. |
| tests/metrics/test_f1_score.py | Adds regression coverage for F1ScoreResult size-subresult accessors. |
| tests/detection/tools/test_inference_slicer.py | Adds coverage for InferenceSlicer configuration property access/mutation. |
| tests/detection/test_vlm.py | Adds coverage ensuring Gemini 2.5 class filtering keeps arrays aligned. |
| tests/detection/test_line_counter.py | Adjusts tests for refactored LineZone annotators/configuration. |
| tests/detection/test_from_sam.py | Expands SAM3 adapter test coverage for object-style results. |
| src/supervision/utils/video.py | Extracts process_video threading/queue logic into a dedicated helper class. |
| src/supervision/tracker/byte_tracker/single_object_track.py | Refactors STrack state into a dataclass-backed container with properties. |
| src/supervision/metrics/recall.py | Refactors RecallResult to group size subresults behind a size_results container. |
| src/supervision/metrics/mean_average_recall.py | Refactors MeanAverageRecallResult size subresults via mixin + grouped container. |
| src/supervision/metrics/f1_score.py | Refactors F1ScoreResult to group size subresults behind size_results. |
| src/supervision/metrics/detection.py | Splits detection confusion-matrix logic into smaller helpers. |
| src/supervision/key_points/core.py | Refactors KeyPoints.__getitem__ into smaller normalization/selection helpers. |
| src/supervision/detection/vlm.py | Refactors Gemini JSON extraction/parsing and filtering into helpers. |
| src/supervision/detection/utils/masks.py | Extracts distance-threshold resolution and label filtering into helpers. |
| src/supervision/detection/utils/iou_and_nms.py | Extracts oriented-box validation and overlap-metric normalization helpers. |
| src/supervision/detection/utils/internal.py | Extracts RLE decode + data-merge validation/merge helpers; simplifies roboflow parsing. |
| src/supervision/detection/tools/polygon_zone.py | Refactors PolygonZoneAnnotator to a config dataclass + properties. |
| src/supervision/detection/tools/inference_slicer.py | Refactors InferenceSlicer config/state into dataclasses and splits execution paths. |
| src/supervision/detection/line_zone.py | Refactors LineZone annotators to config dataclasses + property proxies. |
| src/supervision/detection/core.py | Refactors SAM3 normalization/mask-building helpers for reduced branching. |
| src/supervision/dataset/formats/coco.py | Refactors COCO segmentation construction into helpers; improves edge-case handling. |
| metrics-before-radon/raw_por_arquivo_e_total_antes.csv | Baseline Radon "raw" CSV report. |
| metrics-before-radon/mi_por_arquivo_antes.csv | Baseline Radon Maintainability Index CSV report. |
| metrics-before-radon/mi_antes.json | Baseline Radon Maintainability Index JSON report. |
| metrics-before-radon/hal_por_arquivo_antes.csv | Baseline Radon Halstead CSV report. |
| metrics-before-radon/cc_por_arquivo_antes.csv | Baseline Radon Cyclomatic Complexity CSV report. |
| metrics-before-pylint/pylint_score_antes.txt | Baseline Pylint score capture. |
| metrics-before-pylint/pylint_ranking_smells_antes.json | Baseline Pylint refactor-smell ranking. |
| metrics-before-pylint/pylint_fatal_antes.json | Baseline Pylint fatal messages (empty). |
| metrics-before-pylint/pylint_distribuicao_categorias_antes.json | Baseline Pylint message-category distribution. |
| metrics-before-codecarbon/emissions_antes.csv | Baseline CodeCarbon emissions output. |
| metrics-after-radon/raw_por_arquivo_e_total_depois.csv | Post-change Radon "raw" CSV report. |
| metrics-after-radon/mi_por_arquivo_depois.csv | Post-change Radon Maintainability Index CSV report. |
| metrics-after-radon/mi_depois.json | Post-change Radon Maintainability Index JSON report. |
| metrics-after-radon/cc_por_arquivo_depois.csv | Post-change Radon Cyclomatic Complexity CSV report. |
| metrics-after-pylint/pylint_score_depois.txt | Post-change Pylint score capture. |
| metrics-after-pylint/pylint_ranking_smells_depois.json | Post-change Pylint refactor-smell ranking. |
| metrics-after-pylint/pylint_fatal_depois.json | Post-change Pylint fatal messages (empty). |
| metrics-after-pylint/pylint_distribuicao_categorias_depois.json | Post-change Pylint message-category distribution. |
| metrics-after-codecarbon/emissions_depois.csv | Post-change CodeCarbon emissions output. |
| extract_score_before_pylint.py | Script to extract baseline Pylint score. |
| extract_score_after_pylint.py | Script to extract post-change Pylint score. |
| extract_metrics_before_radon.py | Script to generate baseline Radon reports. |
| extract_metrics_before_pytest.py | Script to generate baseline Pytest/Coverage reports. |
| extract_metrics_before_pylint.py | Script to generate baseline Pylint JSON breakdown reports. |
| extract_metrics_before_codecarbon.py | Script to generate baseline CodeCarbon emissions report. |
| extract_metrics_after_radon.py | Script to generate post-change Radon reports. |
| extract_metrics_after_pytest.py | Script to generate post-change Pytest/Coverage reports. |
| extract_metrics_after_pylint.py | Script to generate post-change Pylint JSON breakdown reports. |
| extract_metrics_after_codecarbon.py | Script to generate post-change CodeCarbon emissions report. |
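
For the `src/supervision/utils/video.py` entry, pulling thread and queue handling into a helper class might look roughly like the sketch below. It is a generic producer/consumer pattern under assumed names (`FrameWorker`, `callback`, `max_queue_size` are hypothetical) and uses integers in place of frames; the helper added in this PR may be structured differently.

```python
import queue
import threading
from typing import Callable, Iterable, List, Optional

_STOP = object()  # sentinel telling the worker that no more frames will arrive


class FrameWorker:
    """Hypothetical helper: runs `callback(frame, index)` in a background thread."""

    def __init__(self, callback: Callable, max_queue_size: int = 8):
        self._callback = callback
        self._queue: queue.Queue = queue.Queue(maxsize=max_queue_size)
        self._results: List[object] = []
        self._error: Optional[Exception] = None

    def _consume(self) -> None:
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            if self._error is not None:
                continue  # keep draining so the producer never blocks on a full queue
            index, frame = item
            try:
                self._results.append(self._callback(frame, index))
            except Exception as exc:
                self._error = exc

    def run(self, frames: Iterable) -> List[object]:
        self._results, self._error = [], None
        worker = threading.Thread(target=self._consume, daemon=True)
        worker.start()
        for index, frame in enumerate(frames):
            self._queue.put((index, frame))  # blocks while the queue is full
        self._queue.put(_STOP)
        worker.join()
        if self._error is not None:
            raise self._error  # surface the callback failure to the caller
        return self._results


print(FrameWorker(lambda frame, index: frame * 2).run(range(5)))
# prints [0, 2, 4, 6, 8]
```

A single consumer thread keeps results in input order; the bounded queue caps how many frames are held in memory at once.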

Comment on lines +409 to +413
with self._obb_thread_workers_lock:
    if self._obb_thread_workers_warned:
        return

    self._obb_thread_workers_warned = True
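
The lines above read like a warn-once guard protected by a lock. A minimal, self-contained sketch of that pattern follows; `WarnOnce` and the message text are hypothetical and not taken from the PR.

```python
import threading
import warnings


class WarnOnce:
    """Hypothetical helper: emits a warning at most once, even across threads."""

    def __init__(self, message: str):
        self._message = message
        self._lock = threading.Lock()
        self._warned = False

    def warn(self) -> bool:
        """Emit the warning on the first call only; return True if it was emitted."""
        # Check and set the flag under the lock so two threads cannot both pass.
        with self._lock:
            if self._warned:
                return False
            self._warned = True
        # Emit outside the lock to keep the critical section short.
        warnings.warn(self._message, stacklevel=2)
        return True


guard = WarnOnce("this option is ignored in the current mode")  # hypothetical message
print([guard.warn() for _ in range(3)])  # prints [True, False, False]
```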
Comment on lines 45 to 49
def __init__(
    self,
    tlwh: npt.NDArray[np.float32],
    score: float,
    minimum_consecutive_frames: int,
Comment on lines +6 to +9
# Configuration; adjust only if necessary.

# Root directory of the cloned project; the tests will start running from it.
PROJETO = "."
5 participants