# Cortex Monorepo

This monorepo contains two projects: CortexJS and CortexCPP.

## CortexJS: Stateful Business Backend

* All of the stateful endpoints:
  + /threads
  + /messages
  + /models
  + /runs
  + /vector_store
  + /settings
  + /?auth
  + …
* Database & Filesystem
* API Gateway
* Authentication & Authorization
* Observability

## CortexCPP: Stateless Embedding Backend

* All of the high-performance, stateless endpoints:
  + /chat/completion
  + /audio
  + /fine_tuning
  + /embeddings
  + /load_model
  + /unload_model
* Kernel - Hardware Recognition
## Project Structure

```
.
├── cortex-js/
│   ├── package.json
│   ├── README.md
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── src/
│   │   ├── controllers/
│   │   ├── modules/
│   │   ├── services/
│   │   └── ...
│   └── ...
├── cortex-cpp/
│   ├── app/
│   │   ├── controllers/
│   │   ├── models/
│   │   ├── services/
│   │   ├── ?engines/
│   │   │   ├── llama.cpp
│   │   │   ├── tensorrt-llm
│   │   │   └── ...
│   │   └── ...
│   ├── CMakeLists.txt
│   ├── config.json
│   ├── Dockerfile
│   ├── docker-compose.yml
│   ├── README.md
│   └── ...
├── scripts/
│   └── ...
├── README.md
├── package.json
├── Dockerfile
├── docker-compose.yml
└── docs/
    └── ...
```

# Installation

## Prerequisites

### **Dependencies**

Before installation, ensure that you have installed the following:

### **Hardware**

Ensure that your system meets the following requirements to run Cortex:

- **OS**:
  - macOS 13.6 or higher.
  - Windows 10 or higher.
  - Ubuntu 12.04 and later.
- **RAM (CPU Mode):**
  - 8GB for running up to 3B models.
  - 16GB for running up to 7B models.
  - 32GB for running up to 13B models.
- **VRAM (GPU Mode):**
  - 6GB can load the 3B model (int4) with `ngl` at 120 ~ full speed on CPU/GPU.
  - 8GB can load the 7B model (int4) with `ngl` at 120 ~ full speed on CPU/GPU.
  - 12GB can load the 13B model (int4) with `ngl` at 120 ~ full speed on CPU/GPU.
- **Disk**: At least 10GB for app and model download.
10739
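As a quick sanity check against the disk requirement above, the short snippet below measures free space on the current volume with standard `df`/`awk`; the 10GB threshold comes from the table, and the snippet is an illustrative sketch, not part of the Cortex CLI:

```shell
# Check that the current volume has at least 10GB free for the app and
# model downloads (threshold taken from the requirements table above).
avail_kb=$(df -Pk . | awk 'NR==2 {print $4}')
need_kb=$((10 * 1024 * 1024))  # 10GB expressed in KB
if [ "$avail_kb" -ge "$need_kb" ]; then
  echo "disk: OK (${avail_kb} KB free)"
else
  echo "disk: only ${avail_kb} KB free, need at least ${need_kb} KB"
fi
```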
```bash
> Nvidia
  Others (Vulkan)
```

3. Select CPU instructions (will be deprecated soon).
2. Once downloaded, Cortex is ready to use!

### Step 4: Pull a model

From HuggingFace

```bash
cortex pull janhq/phi-3-medium-128k-instruct-GGUF
```

From Jan Hub (TBD)

```bash
cortex pull llama3
```

### Step 5: Chat

```bash
cortex run janhq/phi-3-medium-128k-instruct-GGUF
```

## Run as an API server

```bash
cortex serve
```
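Once the server is running, it can be exercised over HTTP. The sketch below builds an OpenAI-style chat request body for the `/chat/completion` endpoint listed above and prints it; the port `3928` in the commented `curl` call is an assumption (check `config.json` for the port your build listens on), and the request shape is illustrative:

```shell
# Build an OpenAI-style chat request body (model name from Step 4).
PAYLOAD=$(cat <<'EOF'
{
  "model": "janhq/phi-3-medium-128k-instruct-GGUF",
  "messages": [{"role": "user", "content": "Hello, Cortex!"}]
}
EOF
)
echo "$PAYLOAD"
# With `cortex serve` running, send it (port 3928 is an assumption --
# check config.json; route spelling taken from the endpoint list above):
# curl http://localhost:3928/chat/completion \
#   -H "Content-Type: application/json" \
#   -d "$PAYLOAD"
```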