This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 6c6ec99

update readme

1 parent: bcf07ed

File tree: 1 file changed

README.md

Lines changed: 41 additions & 43 deletions
````diff
@@ -17,11 +17,9 @@
 - Quick Setup: Approximately 10-second initialization for swift deployment.
 - Enhanced Web Framework: Incorporates drogon cpp to boost web service efficiency.
 
-## Documentation
-
 ## About Nitro
 
-Nitro is a light-weight integration layer (and soon to be inference engine) for cutting edge inference engine, make deployment of AI models easier than ever before!
+Nitro is a high-efficiency C++ inference engine for edge computing, powering [Jan](https://jan.ai/). It is lightweight and embeddable, ideal for product integration.
 
 The binary of nitro after zipped is only ~3mb in size with none to minimal dependencies (if you use a GPU need CUDA for example) make it desirable for any edge/server deployment 👍.
 
````

````diff
@@ -40,37 +38,57 @@ The binary of nitro after zipped is only ~3mb in size with none to minimal depen
 
 ## Quickstart
 
-**Step 1: Download Nitro**
+**Step 1: Install Nitro**
 
-To use Nitro, download the released binaries from the release page below:
+- For Linux and MacOS
 
-[![Download Nitro](https://img.shields.io/badge/Download-Nitro-blue.svg)](https://github.com/janhq/nitro/releases)
+```bash
+curl -sfL https://raw.githubusercontent.com/janhq/nitro/main/install.sh | sudo /bin/bash -
+```
 
-After downloading the release, double-click on the Nitro binary.
+- For Windows
 
-**Step 2: Download a Model**
+```bash
+powershell -Command "& { Invoke-WebRequest -Uri 'https://raw.githubusercontent.com/janhq/nitro/main/install.bat' -OutFile 'install.bat'; .\install.bat; Remove-Item -Path 'install.bat' }"
+```
 
-Download a llama model to try running the llama C++ integration. You can find a "GGUF" model on The Bloke's page below:
+**Step 2: Downloading a Model**
 
-[![Download Model](https://img.shields.io/badge/Download-Model-green.svg)](https://huggingface.co/TheBloke)
+```bash
+mkdir model && cd model
+wget -O llama-2-7b-model.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true
+```
 
-**Step 3: Run Nitro**
+**Step 3: Run Nitro server**
 
-Double-click on Nitro to run it. After downloading your model, make sure it's saved to a specific path. Then, make an API call to load your model into Nitro.
+```bash title="Run Nitro server"
+nitro
+```
 
+**Step 4: Load model**
 
-```zsh
-curl -X POST 'http://localhost:3928/inferences/llamacpp/loadmodel' \
+```bash title="Load model"
+curl http://localhost:3928/inferences/llamacpp/loadmodel \
   -H 'Content-Type: application/json' \
   -d '{
-    "llama_model_path": "/path/to/your_model.gguf",
-    "ctx_len": 2048,
+    "llama_model_path": "/model/llama-2-7b-model.gguf",
+    "ctx_len": 512,
     "ngl": 100,
-    "embedding": true,
-    "n_parallel": 4,
-    "pre_prompt": "A chat between a curious user and an artificial intelligence",
-    "user_prompt": "USER: ",
-    "ai_prompt": "ASSISTANT: "
+  }'
+```
+
+**Step 5: Making an Inference**
+
+```bash title="Nitro Inference"
+curl http://localhost:3928/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -d '{
+    "messages": [
+      {
+        "role": "user",
+        "content": "Who won the world series in 2020?"
+      },
+    ]
   }'
 ```
 
````
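The new quickstart above walks through five steps. For reference, they chain together into a single shell session. This is a minimal sketch, assuming Nitro is already installed (Step 1), that the server listens on its default port 3928, and with the trailing commas dropped from the README's JSON payloads so they parse strictly:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Step 2: fetch a quantized Llama 2 GGUF model.
mkdir -p model
wget -O model/llama-2-7b-model.gguf \
  "https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf?download=true"

# Step 3: start the Nitro server in the background on the default port 3928.
nitro &
sleep 2  # crude wait for the server to start listening

# Step 4: load the model. The path is resolved by the server process,
# so an absolute path is safest.
curl http://localhost:3928/inferences/llamacpp/loadmodel \
  -H 'Content-Type: application/json' \
  -d "{
        \"llama_model_path\": \"$PWD/model/llama-2-7b-model.gguf\",
        \"ctx_len\": 512,
        \"ngl\": 100
      }"

# Step 5: send an OpenAI-style chat completion request.
curl http://localhost:3928/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
        "messages": [
          {"role": "user", "content": "Who won the world series in 2020?"}
        ]
      }'
```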

````diff
@@ -89,7 +107,6 @@ Table of parameters
 | `system_prompt` | String | The prompt to use for system rules. |
 | `pre_prompt` | String | The prompt to use for internal configuration. |
 
-
 ***OPTIONAL***: You can run Nitro on a different port like 5000 instead of 3928 by running it manually in terminal
 ```zsh
 ./nitro 1 127.0.0.1 5000 ([thread_num] [host] [port])
````
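The hunk above ends inside the README's port-override snippet, whose positional arguments are thread count, host, and port. As a hedged illustration (assuming the `nitro` binary sits in the current directory and a model exists at the path used earlier), overriding the port means every later API call must target it:

```bash
# Start Nitro with 4 threads, bound to all interfaces, on port 5000.
./nitro 4 0.0.0.0 5000 &

# API calls must now use port 5000 instead of the default 3928.
curl http://localhost:5000/inferences/llamacpp/loadmodel \
  -H 'Content-Type: application/json' \
  -d '{"llama_model_path": "/model/llama-2-7b-model.gguf", "ctx_len": 512, "ngl": 100}'
```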
````diff
@@ -98,32 +115,13 @@ Table of parameters
 - host : host value normally 127.0.0.1 or 0.0.0.0
 - port : the port that nitro got deployed onto
 
-**Step 4: Perform Inference on Nitro for the First Time**
-
-```zsh
-curl --location 'http://localhost:3928/inferences/llamacpp/chat_completion' \
---header 'Content-Type: application/json' \
---header 'Accept: text/event-stream' \
---header 'Access-Control-Allow-Origin: *' \
---data '{
-    "messages": [
-        {"content": "Hello there 👋", "role": "assistant"},
-        {"content": "Can you write a long story", "role": "user"}
-    ],
-    "stream": true,
-    "model": "gpt-3.5-turbo",
-    "max_tokens": 2000
-}'
-```
-
 Nitro server is compatible with the OpenAI format, so you can expect the same output as the OpenAI ChatGPT API.
 
 ## Compile from source
-To compile nitro please visit [Compile from source](docs/manual_install.md)
+To compile nitro please visit [Compile from source](docs/new/build-source.md)
 
 ### Contact
 
 - For support, please file a GitHub ticket.
 - For questions, join our Discord [here](https://discord.gg/FTk2MvZwJH).
-- For long-form inquiries, please email hello@jan.ai.
-
+- For long-form inquiries, please email hello@jan.ai.
````
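Since the retained README text states the server follows the OpenAI response format, the reply can be extracted with standard tooling. A small sketch, assuming `jq` is installed and a model has already been loaded:

```bash
# Ask a question and print only the assistant's reply, using the
# standard OpenAI chat-completions response shape.
curl -s http://localhost:3928/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": "Who won the world series in 2020?"}]}' \
  | jq -r '.choices[0].message.content'
```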
