⚡️ Speed up method ObjectDetectionEvalProcessor._change_bbox_bounds_for_image_size by 312%
#11
📄 312% (3.12x) speedup for `ObjectDetectionEvalProcessor._change_bbox_bounds_for_image_size` in `unstructured/metrics/object_detection.py`
⏱️ Runtime: 335 microseconds → 81.3 microseconds (best of 250 runs)
📝 Explanation and details
The optimization replaces NumPy's vectorized `.clip()` operations with a custom Numba-compiled function that performs explicit bounds checking in manual loops. This achieves a 312% speedup (335μs → 81.3μs), even though it appears to trade vectorized operations for explicit loops.

Key optimizations applied:
- The `@njit(cache=True, fastmath=True)` decorator compiles the clipping function to optimized machine code, eliminating Python interpreter overhead.
- `cache=True` stores the compiled version for subsequent runs, avoiding recompilation costs.
- `fastmath=True` enables aggressive floating-point optimizations.

Why this is faster than NumPy:
In the original code, fancy-indexing expressions such as `boxes[..., [0, 2]]` copy the selected columns into an intermediate array, `.clip()` allocates a second array for its result, and the assignment copies the values back into `boxes`. The original code does this twice (once for x-coordinates, once for y-coordinates). The Numba version avoids these allocations by clipping each element in place in a tight loop that the compiler can heavily optimize.

Performance characteristics from tests:
This optimization is particularly valuable for object detection pipelines that process many small to medium batches of bounding boxes, which is typical in computer vision workloads where this function would likely be called frequently during evaluation.
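The two approaches described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the PR: the function names `clip_boxes_numpy`, `clip_boxes_numba`, and `_clip_boxes_inplace` are hypothetical, boxes are assumed to be a 2-D float array in `(x1, y1, x2, y2)` order, and `img_shape` is assumed to be `(height, width)`. `cache=True` is left out of the sketch because Numba's on-disk cache needs the function to live in a source file, and a no-op fallback decorator is used when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch still runs (uncompiled) without Numba.
    def njit(*args, **kwargs):
        def wrap(fn):
            return fn
        return wrap


def clip_boxes_numpy(boxes, img_shape):
    """Baseline: fancy indexing plus .clip(), which allocates temporaries."""
    boxes[..., [0, 2]] = boxes[..., [0, 2]].clip(min=0, max=img_shape[1])
    boxes[..., [1, 3]] = boxes[..., [1, 3]].clip(min=0, max=img_shape[0])
    return boxes


# The PR description also passes cache=True; omitted here (see lead-in).
@njit(fastmath=True)
def _clip_boxes_inplace(boxes, height, width):
    """Clip every coordinate in place, with no intermediate arrays."""
    for i in range(boxes.shape[0]):
        for j in range(4):
            # Columns 0 and 2 are x (bounded by width); 1 and 3 are y.
            upper = width if j % 2 == 0 else height
            value = boxes[i, j]
            if value < 0:
                boxes[i, j] = 0
            elif value > upper:
                boxes[i, j] = upper


def clip_boxes_numba(boxes, img_shape):
    """Wrapper that keeps the (boxes, img_shape) call shape of the baseline."""
    _clip_boxes_inplace(boxes, img_shape[0], img_shape[1])
    return boxes
```

Both functions modify `boxes` in place and return it, so the second can be compared against the first on copies of the same input before timing them. Any speedup measured with this sketch will depend on array size and on whether the Numba function has already been compiled.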
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, `git checkout codeflash/optimize-ObjectDetectionEvalProcessor._change_bbox_bounds_for_image_size-mjceh1u0` and push.