⚡️ Speed up method WatermarkDecoder.reconstruct by 27%
#158
📄 27% (0.27x) speedup for WatermarkDecoder.reconstruct in invokeai/backend/image_util/imwatermark/vendor.py
⏱️ Runtime: 1.22 milliseconds → 957 microseconds (best of 119 runs)
📝 Explanation and details
The optimized code achieves a 27% speedup by eliminating inefficient byte concatenation and string operations in three key methods:
What optimizations were applied:
reconstruct_ipv4: Replaced the list comprehension with string conversion ([str(ip) for ip in list(np.packbits(bits))]) with direct .format() string formatting using indexed array access. This avoids creating an intermediate list and multiple string conversions.
reconstruct_uuid: Eliminated the expensive loop that repeatedly concatenates bytes (bstr += struct.pack(">B", nums[i])) and replaced it with a single bytes(nums[:16]) call. This removes Python-level iteration and repeated immutable bytes object creation.
reconstruct_bytes: Replaced the loop-based byte concatenation pattern with direct slicing and the bytes() constructor (bytes(nums[:end_idx])), eliminating the expensive repeated concatenation of immutable bytes objects.
Why these optimizations are faster:
The original code used bstr += ... in loops, which creates a new bytes object on each iteration because bytes are immutable, so building n bytes copies O(n²) bytes in total.
Performance impact by test type:
The optimization is particularly effective for watermark decoding workloads that process multiple or large watermarks, as the byte manipulation operations are core to the decoding process.
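The before/after patterns described above can be sketched as follows. This is a minimal illustration, not the code from vendor.py: the function names are hypothetical, a plain list of integers stands in for the output of np.packbits, and joining the IPv4 parts with "." is an assumption about how the original list comprehension was used.

```python
import struct
import uuid


def uuid_loop(nums):
    """Pattern described as the original: grow a bytes object one byte at a time."""
    bstr = b""
    for i in range(16):
        # Each += allocates a new bytes object and copies the old contents.
        bstr += struct.pack(">B", nums[i])
    return str(uuid.UUID(bytes=bstr))


def uuid_direct(nums):
    """Pattern described as the optimization: build all 16 bytes in one call."""
    return str(uuid.UUID(bytes=bytes(nums[:16])))


def ipv4_join(nums):
    """Assumed original shape: convert each value to str, then join with dots."""
    return ".".join([str(n) for n in nums])


def ipv4_format(nums):
    """Optimized shape: a single format call with indexed access."""
    return "{}.{}.{}.{}".format(nums[0], nums[1], nums[2], nums[3])


# Hypothetical packed values (each must be in range 0..255).
packed = list(range(16))
assert uuid_loop(packed) == uuid_direct(packed)
assert ipv4_join([192, 168, 0, 1]) == ipv4_format([192, 168, 0, 1])
```

Both variants in each pair return the same result; only the number of intermediate objects differs. bytes() also accepts a NumPy uint8 array slice, which is what the PR description implies with bytes(nums[:16]).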
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, git checkout codeflash/optimize-WatermarkDecoder.reconstruct-mhwwpz39 and push.