Commit ace918a

tweaks
Signed-off-by: Juncheng Gu <jcgu@google.com>
1 parent aca95f1 commit ace918a

File tree

1 file changed: +2 −2 lines changed

tpu_inference/distributed/offload/tpu_offload_connector.py

Lines changed: 2 additions & 2 deletions
@@ -246,7 +246,7 @@ def update(self, new_block_ids: list[int], new_token_ids: list[int]):
         self.block_ids.extend(new_block_ids)
         self.token_ids.extend(new_token_ids)
 
-        # NOTE(jcgu): is it always true? will MTP affect this judegment?
+        # NOTE(jcgu): is it always true? will MTP affect this judgement?
         # When a request is scheduled again, and the number of new tokens
         # is 1 (excluding chunked prefill), the request is in decode phase.
         if len(new_token_ids) == 1:
@@ -711,7 +711,7 @@ def _prepare_req_meta(
         has_new_tokens = adjusted_num_total_tokens > tracker.save_watermark
         should_save = False
         # Determine if a save is needed for this step
-        # when there are new token KVs (adjusted by saving behavior):
+        # when there are new token KVs:
         # 1. Prefill: always save
         # 2. Decode (with save_decode=True)
         # 2.1 regular decode (not finished): accumulate until getting a full block
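
For context, a minimal sketch of the save decision that the updated comments describe. The names here (RequestTracker, should_save_step, block_size, save_decode) are illustrative assumptions, not the actual tpu_offload_connector API; only the rules stated in the comments are reflected.

# Hypothetical sketch of the commented save rules; all names are made up.
from dataclasses import dataclass


@dataclass
class RequestTracker:
    # Number of tokens whose KVs have already been saved (offloaded).
    save_watermark: int = 0
    # Set once a re-scheduled step brings exactly one new token (decode phase).
    is_decode: bool = False


def should_save_step(tracker: RequestTracker,
                     adjusted_num_total_tokens: int,
                     block_size: int,
                     save_decode: bool) -> bool:
    """Decide whether this step should save new token KVs.

    Rules from the diff comments:
      1. Prefill: always save when there are new token KVs.
      2. Decode (only with save_decode=True):
         2.1 regular decode (not finished): accumulate until a full block.
    """
    has_new_tokens = adjusted_num_total_tokens > tracker.save_watermark
    if not has_new_tokens:
        return False
    if not tracker.is_decode:
        # Prefill phase: always save the newly produced KVs.
        return True
    if not save_decode:
        # Decode-time saving disabled by configuration.
        return False
    # Regular decode: wait until a full block of unsaved tokens has accumulated.
    new_tokens = adjusted_num_total_tokens - tracker.save_watermark
    return new_tokens >= block_size


# Example: decode step with 3 unsaved tokens and block_size=16 -> no save yet.
tracker = RequestTracker(save_watermark=13, is_decode=True)
print(should_save_step(tracker, adjusted_num_total_tokens=16,
                       block_size=16, save_decode=True))  # False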
