Commit cf8dd5b

Update readme.md
1 parent b023843 commit cf8dd5b

1 file changed: +1 -1 lines changed

LLM/Quantization/readme.md

Lines changed: 1 addition & 1 deletion
@@ -303,7 +303,7 @@ def generate(prompt, model, tokenizer, **kwargs):
 num_tokens = len(completion.split())
 latency = seconds_used*1000 / num_tokens
 token_per_sec = len(generator.tokenizer(completion)["input_ids"]) / seconds_used
-print(f"******\nTime used: {seconds_used:.3f} \nNumber of tokens: {num_tokens} \nseconds \nThroughput: {token_per_sec:.2f} Tokens/sec \nLatency: {latency:.2f} ms/token")
+print(f"******\nTotal time: {seconds_used:.3f} \nNumber of tokens: {num_tokens} \nseconds \nThroughput: {token_per_sec:.2f} Tokens/sec \nLatency: {latency:.2f} ms/token")
 ```
 
 Test the quantized model:
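
Review note: the commit only renames the label, so two quirks of the reporting code remain. The `seconds` unit is printed on its own line after the token count, and `num_tokens` is a whitespace word count while the throughput line uses the tokenizer's token count, so the latency figure is effectively ms per word. A minimal sketch of one possible cleanup follows, assuming a single token count is passed in by the caller. `format_generation_stats` is a hypothetical helper, not part of the repository.

```python
def format_generation_stats(seconds_used: float, num_tokens: int) -> str:
    """Hypothetical helper: derive throughput and latency from ONE token count."""
    if seconds_used <= 0 or num_tokens <= 0:
        raise ValueError("seconds_used and num_tokens must be positive")

    latency_ms = seconds_used * 1000 / num_tokens  # milliseconds per token
    tokens_per_sec = num_tokens / seconds_used     # tokens per second

    # Keep the unit on the same line as the value it describes.
    return (
        "******\n"
        f"Total time: {seconds_used:.3f} seconds\n"
        f"Number of tokens: {num_tokens}\n"
        f"Throughput: {tokens_per_sec:.2f} Tokens/sec\n"
        f"Latency: {latency_ms:.2f} ms/token"
    )


# Example values are illustrative only, not measurements.
print(format_generation_stats(seconds_used=2.0, num_tokens=50))
```

Inside `generate`, the count could come from the same expression the hunk already uses for throughput, `len(generator.tokenizer(completion)["input_ids"])`, so that both metrics share one definition of a token.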
