⚡️ Speed up function find_last_node by 24,591%
#189
📄 24,591% (245.91x) speedup for `find_last_node` in `src/algorithms/graph.py`
⏱️ Runtime: 101 milliseconds → 410 microseconds (best of 185 runs)
📝 Explanation and details
The optimization transforms an O(n*m) algorithm into an O(n+m) algorithm by eliminating redundant edge scanning.
Key Changes:
- Build a set `sources = {e["source"] for e in edges}` containing all source node IDs from edges.
- Replace `all(e["source"] != n["id"] for e in edges)` with `n["id"] not in sources`.

Why This Is Faster:
The original code performs a linear scan through all edges for every node being checked. With n nodes and m edges, this creates O(n*m) time complexity. For each node, it checks every edge to ensure that node isn't a source anywhere.
The optimized version builds the source set once in O(m) time, then performs O(1) hash table lookups for each node, resulting in O(n+m) total complexity.
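The two variants can be sketched side by side as follows. This is a minimal illustration, not the actual contents of `src/algorithms/graph.py`: only the `all(...)` check and the `sources` set come from the description above, while the `next(..., None)` wrapper, the function names with `_original` and `_optimized` suffixes, and the `"target"` key in the sample edges are assumptions. Node IDs are also assumed to be hashable, which the set lookup requires.

```python
from typing import Any, Dict, List, Optional

Node = Dict[str, Any]
Edge = Dict[str, Any]


def find_last_node_original(nodes: List[Node], edges: List[Edge]) -> Optional[Node]:
    # Assumed pre-optimization shape: for every candidate node, scan the
    # edge list to confirm the node never appears as a source. O(n*m).
    return next(
        (n for n in nodes if all(e["source"] != n["id"] for e in edges)),
        None,
    )


def find_last_node_optimized(nodes: List[Node], edges: List[Edge]) -> Optional[Node]:
    # Collect every source ID once, O(m), then do an average O(1)
    # membership test per node, O(n). Total: O(n+m).
    sources = {e["source"] for e in edges}
    return next((n for n in nodes if n["id"] not in sources), None)


# Hypothetical three-node linear chain: 0 -> 1 -> 2.
nodes = [{"id": 0}, {"id": 1}, {"id": 2}]
edges = [{"source": 0, "target": 1}, {"source": 1, "target": 2}]

# Both variants should agree on the node with no outgoing edge.
assert find_last_node_original(nodes, edges) == find_last_node_optimized(nodes, edges)
```

Under these assumptions both functions return the first node in `nodes` that has no outgoing edge, so the change affects running time only, not the result.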
Performance Impact:
The 245x speedup (from 101ms to 410μs) demonstrates the dramatic improvement, especially evident in large-scale test cases:
- `test_large_linear_chain` (1000 nodes): Benefits significantly, as it avoids up to 1000 × 999 = 999,000 edge comparisons.
- `test_large_fan_in` (1000 nodes): Similarly optimized from quadratic to linear scanning.

Test Case Performance:
The optimization is most beneficial for graphs with many edges relative to nodes, where the original's repeated edge scanning becomes a bottleneck. Even simple cases like `test_three_nodes_linear` benefit from avoiding redundant edge iterations.
✅ Correctness verification report:
🌀 Generated Regression Tests and Runtime
To edit these changes, run `git checkout codeflash/optimize-find_last_node-mjby12rs` and push.