Commit 14feefe
Arena implementation (#665)
## Description
<!-- Provide a brief description of the changes in this PR -->
## Related Issues
<!-- Link to any related issues using #issue_number -->
Closes #670 #666 #667 #672 #668 #661 #655 #654 #652 #436
## Checklist when merging to main
<!-- Mark items with "x" when completed -->
- [ ] No compiler warnings (if applicable)
- [ ] Code is formatted with `rustfmt`
- [ ] No useless or dead code (if applicable)
- [ ] Code is easy to understand
- [ ] Doc comments are used for all functions, enums, structs, and
fields (where appropriate)
- [ ] All tests pass
- [ ] Performance has not regressed (assuming change was not to fix a
bug)
- [ ] Version number has been updated in `helix-cli/Cargo.toml` and
`helixdb/Cargo.toml`
## Additional Notes
<!-- Add any additional information that would be helpful for reviewers
-->
<!-- greptile_comment -->
<h2>Greptile Overview</h2>
Updated On: 2025-11-07 00:19:04 UTC
<h3>Greptile Summary</h3>
This PR implements arena-based memory allocation for graph traversals
and refactors the worker pool's channel selection mechanism.
**Key Changes:**
- **Arena Implementation**: Introduced `'arena` lifetime parameter
throughout traversal operations (`in_e.rs`), replacing owned data with
arena-allocated references for improved memory efficiency
- **Worker Pool Refactor**: Replaced `flume::Selector` with a
parity-based `try_recv()`/`recv()` pattern to handle two channels
(`cont_rx` and `rx`) across multiple worker threads
- **Badge Addition**: Added Manta Graph badge to README
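To make the arena change concrete, here is a minimal std-only sketch of the borrowing pattern. It is not the code from `in_e.rs`. It assumes a hypothetical `EdgeArena` that owns every `Edge` for the duration of a traversal (a plain `Vec` stands in for a real arena allocator) and a hypothetical `in_edges` method that yields `&'arena Edge` references from an inline closure rather than a dedicated iterator struct.

```rust
/// Hypothetical edge record; field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Edge {
    from: u32,
    to: u32,
    label: String,
}

/// Stand-in arena (assumption): owns every edge for one traversal.
/// A real arena allocator would hand out references in the same way.
struct EdgeArena {
    edges: Vec<Edge>,
}

impl EdgeArena {
    /// Incoming edges of `node` carrying `label`, borrowed for `'arena`.
    /// The iterator is an inline closure chain, not a separate struct,
    /// and it yields references instead of cloned, owned edges.
    fn in_edges<'arena>(
        &'arena self,
        node: u32,
        label: &'arena str,
    ) -> impl Iterator<Item = &'arena Edge> + 'arena {
        self.edges
            .iter()
            .filter(move |e| e.to == node && e.label == label)
    }
}

fn main() {
    let arena = EdgeArena {
        edges: vec![
            Edge { from: 1, to: 2, label: "follows".to_string() },
            Edge { from: 3, to: 2, label: "follows".to_string() },
            Edge { from: 2, to: 1, label: "follows".to_string() },
            Edge { from: 4, to: 2, label: "blocks".to_string() },
        ],
    };

    // The collected references cannot outlive `arena`.
    let incoming: Vec<&Edge> = arena.in_edges(2, "follows").collect();
    for edge in &incoming {
        println!("{} -[{}]-> {}", edge.from, edge.label, edge.to);
    }
}
```

Because every item borrows from the arena, the compiler rejects any use of the results after the arena is dropped, which is the guarantee the `'arena` parameter is meant to encode.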
**Issues Found:**
- **Worker Pool Channel Handling**: The new parity-based approach
requires an even number of workers (≥2) and uses non-blocking
`try_recv()` followed by blocking `recv()` on alternating channels.
While this avoids a true busy-wait (since one `recv()` always blocks),
the asymmetry means channels are polled at different frequencies,
potentially causing channel starvation or unfair scheduling compared to
the previous `Selector::wait()` approach.
The arena implementation appears solid and follows Rust lifetime best
practices. The worker pool change seems to be addressing a specific
issue with core affinity (per commit `7437cf0f`), but the trade-off in
channel fairness should be monitored.
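The parity pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in `worker_pool/mod.rs`: `std::sync::mpsc` stands in for flume (so each channel has a single consumer here), `Request`, `Continuation`, `worker_parities`, and `worker_step` are hypothetical names, and only one loop iteration is modeled so the behavior stays deterministic.

```rust
use std::sync::mpsc::{channel, Receiver};

// Hypothetical message types; the pool's real types are not shown in this summary.
type Request = String;
type Continuation = Box<dyn FnOnce() -> String + Send>;

/// Hypothetical helper: one parity flag per worker, alternating by index.
/// Returns None for an odd count or fewer than 2 workers, mirroring the
/// "even number of workers (>= 2)" requirement.
fn worker_parities(worker_count: usize) -> Option<Vec<bool>> {
    if worker_count < 2 || worker_count % 2 != 0 {
        return None;
    }
    Some((0..worker_count).map(|i| i % 2 == 0).collect())
}

/// One loop iteration of a worker. Returns what was handled, in order.
fn worker_step(
    parity: bool,
    cont_rx: &Receiver<Continuation>,
    rx: &Receiver<Request>,
) -> Vec<String> {
    let mut handled = Vec::new();
    if parity {
        // Non-blocking poll of continuations; an empty channel is skipped.
        if let Ok(cont) = cont_rx.try_recv() {
            handled.push(cont());
        }
        // Blocking wait for a request; Err means every sender was dropped.
        if let Ok(req) = rx.recv() {
            handled.push(format!("request:{req}"));
        }
    } else {
        // Mirror image: poll requests, then block on continuations.
        if let Ok(req) = rx.try_recv() {
            handled.push(format!("request:{req}"));
        }
        if let Ok(cont) = cont_rx.recv() {
            handled.push(cont());
        }
    }
    handled
}

fn main() {
    println!("parities for 4 workers: {:?}", worker_parities(4));

    let (cont_tx, cont_rx) = channel::<Continuation>();
    let (tx, rx) = channel::<Request>();

    // parity=true: the queued continuation runs first, then the request.
    cont_tx.send(Box::new(|| "continuation:a".to_string())).unwrap();
    tx.send("q1".to_string()).unwrap();
    println!("{:?}", worker_step(true, &cont_rx, &rx));

    // parity=false: no request is queued, so the poll is skipped and the
    // worker goes straight to the blocking receive on continuations.
    cont_tx.send(Box::new(|| "continuation:b".to_string())).unwrap();
    println!("{:?}", worker_step(false, &cont_rx, &rx));
}
```

The sketch also hints at the fairness concern: once a `parity=true` worker reaches `rx.recv()`, it cannot service a newly arrived continuation until a request shows up, so continuations depend on the `parity=false` workers in the meantime.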
<details><summary><h3>Important Files Changed</h3></summary>
File Analysis
| Filename | Score | Overview |
|----------|-------|----------|
| README.md | 5/5 | Added Manta Graph badge to README - cosmetic documentation change with no functional impact |
| helix-db/src/helix_engine/traversal_core/ops/in_/in_e.rs | 5/5 | Refactored to use arena-based lifetimes ('arena) instead of owned data, replacing separate InEdgesIterator struct with inline closures for better memory management |
| helix-db/src/helix_gateway/worker_pool/mod.rs | 3/5 | Replaced flume Selector with parity-based try_recv/recv pattern requiring an even worker count; one recv always blocks, so there is no true busy-wait, but the two channels are polled at different frequencies, which could cause channel starvation or unfair scheduling |
</details>
<details><summary><h3>Sequence Diagram</h3></summary>
```mermaid
sequenceDiagram
    participant Client
    participant WorkerPool
    participant Worker1 as Worker (parity=true)
    participant Worker2 as Worker (parity=false)
    participant Router
    participant Storage
    Client->>WorkerPool: process(request)
    WorkerPool->>WorkerPool: Send request to req_rx channel
    par Worker1 Loop (parity=true)
        loop Every iteration
            Worker1->>Worker1: try_recv(cont_rx) - non-blocking
            alt Continuation available
                Worker1->>Worker1: Execute continuation function
            else Empty
                Worker1->>Worker1: Skip (no busy wait here)
            end
            Worker1->>Worker1: recv(rx) - BLOCKS until request
            alt Request received
                Worker1->>Router: Route request to handler
                Router->>Storage: Execute graph operation
                Storage-->>Router: Return result
                Router-->>Worker1: Response
                Worker1->>WorkerPool: Send response via ret_chan
            end
        end
    and Worker2 Loop (parity=false)
        loop Every iteration
            Worker2->>Worker2: try_recv(rx) - non-blocking
            alt Request available
                Worker2->>Router: Route request to handler
                Router->>Storage: Execute graph operation
                Storage-->>Router: Return result
                Router-->>Worker2: Response
                Worker2->>WorkerPool: Send response via ret_chan
            else Empty
                Worker2->>Worker2: Skip (no busy wait here)
            end
            Worker2->>Worker2: recv(cont_rx) - BLOCKS until continuation
            alt Continuation received
                Worker2->>Worker2: Execute continuation function
            end
        end
    end
    WorkerPool-->>Client: Response
```
</details>
<!-- greptile_other_comments_section -->
<!-- /greptile_comment -->
File tree
236 files changed, +40641 -11200 lines changed
- .github/workflows
- helix-cli
- src
- tests
- helix-container
- src
- helix-db
- benches
- src
- helix_engine
- bm25
- reranker
- adapters
- fusion
- models
- storage_core
- tests
- concurrency_tests
- traversal_tests
- traversal_core
- ops
- bm25
- in_
- out
- source
- util
- vectors
- vector_core
- helix_gateway
- builtin
- embedding_providers
- mcp
- router
- tests
- worker_pool
- helixc
- analyzer
- methods
- generator
- parser
- protocol
- custom_serde
- utils
- helix-macros/src
- hql-tests
- tests
- add_e_borrowed_ids
- add_n
- aggregate
- benchmarks
- cloud_queries
- contains
- count
- figoai
- first
- is_in
- knowledge_graphs
- nested_remappings
- rerankers
- series
- update
- metrics/src