[feat] Add pcstore for enhanced PrefixCache performance (#393)
# Purpose
Add `pcstore` for enhanced PrefixCache performance
# Modifications
- Data is paged out to host memory and written to SSD asynchronously, which frees HBM space earlier.
- Reads and writes are aggregated at block granularity, which increases SSD I/O size and improves throughput.
- For MLA models, data is loaded from SSD once and shared across devices through DRAM, reducing SSD bandwidth pressure.
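
The first two points can be illustrated with a minimal sketch. This is not the `pcstore` API: `BlockWriter`, `dump`, and `load` are hypothetical names, plain `bytes` objects stand in for KV-cache chunks already copied to host memory, and a thread pool stands in for the asynchronous SSD writer. The idea shown is that all chunks of one block are joined into a single buffer and written with one large I/O in the background, so the caller does not wait on the SSD.

```python
import os
import tempfile
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List


class BlockWriter:
    """Hypothetical write-behind store, for illustration only."""

    def __init__(self, root: str, workers: int = 2) -> None:
        self.root = root
        self.pool = ThreadPoolExecutor(max_workers=workers)

    def _path(self, block_id: str) -> str:
        return os.path.join(self.root, f"{block_id}.bin")

    def _write_block(self, block_id: str, payload: bytes) -> int:
        # One large write per block instead of one small write per chunk.
        tmp = self._path(block_id) + ".tmp"
        with open(tmp, "wb") as f:
            f.write(payload)
        # Atomic rename so readers never observe a partially written block.
        os.replace(tmp, self._path(block_id))
        return len(payload)

    def dump(self, block_id: str, chunks: List[bytes]) -> Future:
        # Joining the chunks stands in for the device-to-host page-out;
        # once this copy exists, the device memory could be reused.
        payload = b"".join(chunks)
        # The SSD write runs in the background; the caller gets a Future.
        return self.pool.submit(self._write_block, block_id, payload)

    def load(self, block_id: str, chunk_size: int) -> List[bytes]:
        # One large read per block, then split back into fixed-size chunks.
        with open(self._path(block_id), "rb") as f:
            data = f.read()
        return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

    def close(self) -> None:
        self.pool.shutdown(wait=True)
```

A caller would invoke `dump(block_id, chunks)` and continue immediately, calling `.result()` on the returned future only when it needs to know the block has reached disk.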
# Test
- `ucm/store/test/e2e/pcstore_embed.py`
- `ucm/store/test/e2e/pcstore_fetch.py`