
Commit 91beef3

Update readme.md
1 parent 486d073 commit 91beef3

File tree

1 file changed: +15 -9 lines changed


LLM/Quantization/readme.md

Lines changed: 15 additions & 9 deletions
@@ -300,16 +300,22 @@ def generate(prompt, model, tokenizer, **kwargs):
     completion = generator(prompt)[0]['generated_text']
     seconds_used = time.time() - time_started
     print(completion)
-    per_token = seconds_used / len(generator.tokenizer(completion)["input_ids"])
-    print(f"******\nTime used: {seconds_used:.3f} seconds, {per_token:.3f} s/token")
+    num_tokens = len(completion.split())
+    latency = seconds_used * 1000 / num_tokens
+    token_per_sec = len(generator.tokenizer(completion)["input_ids"]) / seconds_used
+    print(f"******\nTime used: {seconds_used:.3f} seconds \nNumber of tokens: {num_tokens} \nThroughput: {token_per_sec:.2f} Tokens/sec \nLatency: {latency:.2f} ms/token")
 ```
 
-Test the full model:
-    generate("What's LLM quantization?", model_full, tokenizer)
-Output:
-TBA
-
 Test the quantized model:
-    generate("What's LLM quantization?", model_quantized, tokenizer_q)
+    generate("What's AI?", model_quantized, tokenizer_q)
+
 Output:
-TBA
+The AI 101 series is a collection of articles that will introduce you to the basics of artificial intelligence (AI). In this first article, we're going to talk about the history of AI, and how it has evolved over the years.
+The first AI system was created in the 1950s, and it was called the Logic Theorist. This system was able to solve mathematical problems using a set of rules. The Logic Theorist was followed by other AI systems, such as the General Problem Solver and the Game of Checkers.
+In the 1960s, AI researchers began to focus on developing systems that could understand natural language. This led to the development of the first chatbot, named ELIZA. ELIZA was able to hold a conversation with a human user by responding to their questions with pre-programmed responses.
+In the 1970s, AI researchers began to focus on developing systems that could learn from data. This led to the development of the first expert system, named MYCIN. MYCIN was able to diagnose diseases by analyzing data from medical records.
+******
+Time used: 28.659 seconds
+Number of tokens: 176
+Throughput: 9.00 Tokens/sec
+Latency: 162.83 ms/token
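
For context, a minimal sketch of what the full `generate` helper might look like after this change. Only the tail of the function appears in the diff; the `pipeline(...)` construction, the placement of `time_started`, and how `**kwargs` is consumed are assumptions, not shown in the commit:

```python
# Minimal sketch of the full helper; the pipeline setup and the point at
# which time_started is captured are assumptions (the diff shows only the
# tail of generate()).
import time

from transformers import pipeline


def generate(prompt, model, tokenizer, **kwargs):
    # Build a text-generation pipeline from the supplied model and tokenizer.
    generator = pipeline("text-generation", model=model, tokenizer=tokenizer, **kwargs)
    time_started = time.time()
    completion = generator(prompt)[0]['generated_text']
    seconds_used = time.time() - time_started
    print(completion)
    # num_tokens counts whitespace-separated words, while token_per_sec counts
    # tokenizer tokens, so the two metrics use different denominators.
    num_tokens = len(completion.split())
    latency = seconds_used * 1000 / num_tokens
    token_per_sec = len(generator.tokenizer(completion)["input_ids"]) / seconds_used
    print(f"******\nTime used: {seconds_used:.3f} seconds \nNumber of tokens: {num_tokens} "
          f"\nThroughput: {token_per_sec:.2f} Tokens/sec \nLatency: {latency:.2f} ms/token")
```

Note that the two figures are computed over different token counts: in the sample output, 28.659 s over 176 whitespace-separated words gives the printed 162.83 ms/token, while 9.00 tokens/sec implies roughly 258 tokenizer tokens for the same completion.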

Comments (0)