Commit 14feefe

Arena implementation (#665)
## Description
<!-- Provide a brief description of the changes in this PR -->

## Related Issues
<!-- Link to any related issues using #issue_number -->

Closes #670 #666 #667 #672 #668 #661 #655 #654 #652 #436

## Checklist when merging to main
<!-- Mark items with "x" when completed -->

- [ ] No compiler warnings (if applicable)
- [ ] Code is formatted with `rustfmt`
- [ ] No useless or dead code (if applicable)
- [ ] Code is easy to understand
- [ ] Doc comments are used for all functions, enums, structs, and fields (where appropriate)
- [ ] All tests pass
- [ ] Performance has not regressed (assuming change was not to fix a bug)
- [ ] Version number has been updated in `helix-cli/Cargo.toml` and `helixdb/Cargo.toml`

## Additional Notes
<!-- Add any additional information that would be helpful for reviewers -->

<!-- greptile_comment -->

<h2>Greptile Overview</h2>

Updated On: 2025-11-07 00:19:04 UTC

<h3>Greptile Summary</h3>

This PR implements arena-based memory allocation for graph traversals and refactors the worker pool's channel selection mechanism.

**Key Changes:**

- **Arena Implementation**: Introduced `'arena` lifetime parameter throughout traversal operations (`in_e.rs`), replacing owned data with arena-allocated references for improved memory efficiency
- **Worker Pool Refactor**: Replaced `flume::Selector` with a parity-based `try_recv()`/`recv()` pattern to handle two channels (`cont_rx` and `rx`) across multiple worker threads
- **Badge Addition**: Added Manta Graph badge to README

**Issues Found:**

- **Worker Pool Channel Handling**: The new parity-based approach requires an even number of workers (≥2) and uses non-blocking `try_recv()` followed by blocking `recv()` on alternating channels. While this avoids a true busy-wait (since one `recv()` always blocks), the asymmetry means channels are polled at different frequencies, potentially causing channel starvation or unfair scheduling compared to the previous `Selector::wait()` approach.

The arena implementation appears solid and follows Rust lifetime best practices. The worker pool change seems to be addressing a specific issue with core affinity (per commit `7437cf0f`), but the trade-off in channel fairness should be monitored.

<details><summary><h3>Important Files Changed</h3></summary>

File Analysis

| Filename | Score | Overview |
|----------|-------|----------|
| README.md | 5/5 | Added Manta Graph badge to README - cosmetic documentation change with no functional impact |
| helix-db/src/helix_engine/traversal_core/ops/in_/in_e.rs | 5/5 | Refactored to use arena-based lifetimes ('arena) instead of owned data, replacing separate InEdgesIterator struct with inline closures for better memory management |
| helix-db/src/helix_gateway/worker_pool/mod.rs | 3/5 | Replaced flume Selector with parity-based try_recv/recv pattern requiring even worker count, but implementation has potential busy-wait issues that could cause high CPU usage |

</details>

<details><summary><h3>Sequence Diagram</h3></summary>

```mermaid
sequenceDiagram
    participant Client
    participant WorkerPool
    participant Worker1 as Worker (parity=true)
    participant Worker2 as Worker (parity=false)
    participant Router
    participant Storage

    Client->>WorkerPool: process(request)
    WorkerPool->>WorkerPool: Send request to req_rx channel

    par Worker1 Loop (parity=true)
        loop Every iteration
            Worker1->>Worker1: try_recv(cont_rx) - non-blocking
            alt Continuation available
                Worker1->>Worker1: Execute continuation function
            else Empty
                Worker1->>Worker1: Skip (no busy wait here)
            end
            Worker1->>Worker1: recv(rx) - BLOCKS until request
            alt Request received
                Worker1->>Router: Route request to handler
                Router->>Storage: Execute graph operation
                Storage-->>Router: Return result
                Router-->>Worker1: Response
                Worker1->>WorkerPool: Send response via ret_chan
            end
        end
    end

    par Worker2 Loop (parity=false)
        loop Every iteration
            Worker2->>Worker2: try_recv(rx) - non-blocking
            alt Request available
                Worker2->>Router: Route request to handler
                Router->>Storage: Execute graph operation
                Storage-->>Router: Return result
                Router-->>Worker2: Response
                Worker2->>WorkerPool: Send response via ret_chan
            else Empty
                Worker2->>Worker2: Skip (no busy wait here)
            end
            Worker2->>Worker2: recv(cont_rx) - BLOCKS until continuation
            alt Continuation received
                Worker2->>Worker2: Execute continuation function
            end
        end
    end

    WorkerPool-->>Client: Response
```

</details>

<!-- greptile_other_comments_section -->

<!-- /greptile_comment -->
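For readers who want to see the parity-based polling pattern from the summary in isolation, here is a minimal sketch of one loop iteration. It is not the code from `worker_pool/mod.rs`: it uses `std::sync::mpsc` rather than `flume` so that it compiles without external crates, and the names `worker_iteration` and `Handled` and the `String` payloads are hypothetical stand-ins for illustration.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Hypothetical record of what a worker handled in one iteration.
#[derive(Debug, PartialEq)]
enum Handled {
    Request(String),
    Continuation(String),
}

/// One iteration of a parity-based worker loop (illustrative sketch only).
/// parity = true:  poll `cont_rx` without blocking, then block on `rx`.
/// parity = false: poll `rx` without blocking, then block on `cont_rx`.
fn worker_iteration(
    parity: bool,
    rx: &Receiver<String>,
    cont_rx: &Receiver<String>,
) -> Vec<Handled> {
    let mut handled = Vec::new();
    if parity {
        // Non-blocking check; an empty channel is simply skipped.
        if let Ok(c) = cont_rx.try_recv() {
            handled.push(Handled::Continuation(c));
        }
        // Blocking receive; returns Err only if every sender is gone.
        if let Ok(r) = rx.recv() {
            handled.push(Handled::Request(r));
        }
    } else {
        if let Ok(r) = rx.try_recv() {
            handled.push(Handled::Request(r));
        }
        if let Ok(c) = cont_rx.recv() {
            handled.push(Handled::Continuation(c));
        }
    }
    handled
}

fn main() {
    let (tx, rx) = channel();
    let (cont_tx, cont_rx) = channel();
    tx.send("req-1".to_string()).unwrap();
    cont_tx.send("cont-1".to_string()).unwrap();
    // The parity=true worker drains the continuation first, then the request.
    println!("{:?}", worker_iteration(true, &rx, &cont_rx));
}
```

The sketch makes the asymmetry noted in the review visible: each worker blocks on exactly one channel, so a message on its other channel is only picked up after the blocking `recv()` returns. That is presumably why workers of both parities are needed for both channels to be served promptly.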
2 parents cda6dc2 + 8687d75 commit 14feefe

236 files changed: +40641 -11200 lines changed


.dockerignore

Lines changed: 0 additions & 12 deletions
This file was deleted.

.github/workflows/db_tests.yml

Lines changed: 1 addition & 1 deletion
@@ -32,4 +32,4 @@ jobs:
   - name: Run tests
     run: |
       cd helix-db
-      cargo test --profile dev helix_engine
+      cargo test --release --lib -- --skip concurrency_tests

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -6,3 +6,6 @@ test/
 .claude/
 hack/
 .DS_Store
+/tmp
+/data
+claude.md
