Greptile Summary
This PR implements dynamic resizing for embedding tables with significant architectural changes:
Key Changes:
Critical Issue Found:
Architecture Improvements:
Concerns:
Confidence Score: 1/5
Important Files Changed
Sequence Diagram
sequenceDiagram
participant App as Application
participant Table as DynamicEmbeddingTable
participant KVMap as KeyIndexMap (Hash Table)
participant VMM as VMMTensor/HostVMMTensor
participant Buffer as ExtendableBuffer
participant CUDA as CUDA/CU Driver API
Note over App,CUDA: Initialization Flow
App->>Table: Create table with capacity
Table->>Buffer: Create VmmDeviceBuffer/RegisteredHostBuffer
Buffer->>VMM: Initialize VMMTensor(capacity, dtype, device)
VMM->>CUDA: cuMemAddressReserve (reserve VA space)
VMM->>CUDA: cuMemCreate + cuMemMap (allocate initial pages)
VMM->>CUDA: cuMemSetAccess (set permissions)
VMM-->>Buffer: Return tensor wrapper
Table->>KVMap: Create hash table with capacity
Note over App,CUDA: Insert Flow (NO_EVICTION Mode)
App->>Table: insert(keys, values)
Table->>Table: Generate scores = arange(valid_rows, valid_rows+n)
Table->>KVMap: insert_and_evict(keys, scores)
alt Hash table full (eviction occurs)
KVMap-->>Table: Return evicted_keys, evicted_scores
Table->>Table: load_from_table(evicted_scores, evicted_values)
Table->>Table: rehash(capacity * 2)
Table->>KVMap: Create new hash table (2x capacity)
Table->>Table: Re-insert all entries to new hash table
Table->>Buffer: extend(capacity * 2)
Buffer->>VMM: extend(new_capacity)
VMM->>CUDA: cuMemCreate + cuMemMap (map new pages)
VMM->>CUDA: cuMemSetAccess (set new page permissions)
Table->>Table: store_to_table(scores, values)
Table->>Table: Recursive insert(evicted_keys, evicted_values)
else No eviction
Table->>Table: store_to_table(scores, values)
end
Note over App,CUDA: Lookup and Update Flow
App->>Table: update(keys, grads)
Table->>KVMap: lookup(keys) → indices
Table->>CUDA: optimizer_update_kernel(grads, table, indices)
CUDA->>VMM: Read/write embedding table via mapped memory
Note over App,CUDA: Memory Extension
Note right of VMM: All extensions preserve base address
VMM->>CUDA: Map additional pages at offset
CUDA-->>VMM: Extended virtual address range
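The insert flow in the diagram can be sketched in plain Python, as a rough model rather than the PR's implementation. `ToyExtendableBuffer` and `ToyDynamicTable` are hypothetical names; a `dict` stands in for the hash table, a list of rows stands in for the VMM-backed buffer, and the sketch assumes that in NO_EVICTION mode each score doubles as the row index. It also simplifies the diagram: it grows before the table overflows instead of evicting and then recursively re-inserting, and it handles keys one at a time instead of in batches.

```python
class ToyExtendableBuffer:
    """Hypothetical stand-in for the VMM-backed buffer: it grows in place."""

    def __init__(self, capacity, dim):
        self.dim = dim
        self.rows = [[0.0] * dim for _ in range(capacity)]

    def extend(self, new_capacity):
        # Only append new rows. Existing rows are never moved or copied,
        # mirroring "all extensions preserve base address" in the diagram.
        missing = new_capacity - len(self.rows)
        self.rows.extend([0.0] * self.dim for _ in range(missing))


class ToyDynamicTable:
    """Hypothetical model of the NO_EVICTION insert flow."""

    def __init__(self, capacity, dim):
        self.capacity = capacity
        self.valid_rows = 0
        self.index_of = {}  # key -> row index (assumed to equal the score)
        self.buffer = ToyExtendableBuffer(capacity, dim)

    def insert(self, keys, values):
        for key, value in zip(keys, values):
            if key in self.index_of:
                # Known key: overwrite its row in place.
                self.buffer.rows[self.index_of[key]] = list(value)
                continue
            if self.valid_rows == self.capacity:
                self._grow(self.capacity * 2)
            # Scores are consecutive integers starting at valid_rows,
            # as in arange(valid_rows, valid_rows + n).
            row = self.valid_rows
            self.index_of[key] = row
            self.buffer.rows[row] = list(value)
            self.valid_rows += 1

    def _grow(self, new_capacity):
        # "Rehash": rebuild the key map, then extend the buffer to match.
        self.index_of = dict(self.index_of)
        self.buffer.extend(new_capacity)
        self.capacity = new_capacity

    def lookup(self, key):
        return self.buffer.rows[self.index_of[key]]


table = ToyDynamicTable(capacity=2, dim=2)
table.insert([10, 11], [[1.0, 1.0], [2.0, 2.0]])
table.insert([12], [[3.0, 3.0]])  # table is full, so capacity doubles to 4
```

After the third insert, rows 0 and 1 still hold the values for keys 10 and 11 at the same indices, which is the property the real design gets from reserving virtual address space up front and mapping additional pages at an offset.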
policy = ScorePolicy.ASSIGN

if self._no_eviction and scores is None:
    scores = torch.arrange(
syntax: torch.arrange is not a valid PyTorch function; it should be torch.arange.
Suggested change
- scores = torch.arrange(
+ scores = torch.arange(
Description
Checklist