
Commit 0a56d50: Overview - models
1 parent e6a5548

1 file changed: docs/docs/overview.mdx (+67 -37)
@@ -10,39 +10,82 @@ import TabItem from "@theme/TabItem";

# Cortex

:::info
**Real-world Use**: Cortex.cpp powers [Jan](https://jan.ai), our on-device ChatGPT alternative.

Cortex.cpp is in active development. If you have any questions, please reach out to us on [GitHub](https://github.com/janhq/cortex.cpp/issues/new/choose) or [Discord](https://discord.com/invite/FTk2MvZwJH).
:::

![Cortex Cover Image](/img/social-card.jpg)

Cortex is a Local AI API Platform used to run and customize LLMs.

Key Features:
- Straightforward CLI (inspired by Ollama)
- Full C++ implementation, packageable into desktop and mobile apps
- Pull models from Hugging Face or the Cortex Built-in Model Library
- Models stored in universal file formats (vs. blobs)
- Swappable inference backends (default: [`llamacpp`](https://github.com/janhq/cortex.llamacpp); future: [`ONNXRuntime`](https://github.com/janhq/cortex.onnx), [`TensorRT-LLM`](https://github.com/janhq/cortex.tensorrt-llm))
- Deployable as a standalone API server, or embedded into apps like [Jan.ai](https://jan.ai/)

Cortex's roadmap is to implement the full OpenAI API, including the Tools, Runs, Multi-modal, and Realtime APIs.
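
Because Cortex targets the OpenAI API shape, existing OpenAI clients should be able to point at a local Cortex endpoint. A minimal sketch with `curl`; the port `39281` and the `/v1/chat/completions` path are assumptions following the OpenAI convention, not confirmed defaults, so check your install:

```
# Assumed local endpoint; adjust host and port to your Cortex install.
curl http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Hello from Cortex!"}]
  }'
```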

## Inference Backends
- Default: [llama.cpp](https://github.com/ggerganov/llama.cpp): cross-platform; supports most laptops, desktops, and OSes
- Future: [ONNX Runtime](https://github.com/microsoft/onnxruntime): supports Windows Copilot+ PCs and NPUs
- Future: [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM): supports NVIDIA GPUs

If GPU hardware is available, Cortex is GPU-accelerated by default.
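
Since backends are designed to be swappable, one plausible way this surfaces in the CLI is an engines subcommand. A hypothetical sketch; the subcommand names below are assumptions, not documented commands:

```
# List installed engines and their status (assumed subcommand)
cortex engines list

# Install an alternative engine when it becomes available (assumed name)
cortex engines install llama-cpp
```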

## Models
Cortex.cpp allows users to pull models from multiple model hubs, offering flexibility and extensive model access:
- [Hugging Face](https://huggingface.co)
- [Cortex Built-in Models](https://cortex.so/models)
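
Pulling uses the same CLI verb for either hub. A sketch: the alias form matches the transcript further down this page, while the `author/repo` form for Hugging Face is an assumed spelling of the hub support described above:

```
# Pull a built-in model by alias (see the table below)
cortex pull llama3.2

# Pull a GGUF model straight from a Hugging Face repo (assumed syntax; hypothetical repo id)
cortex pull bartowski/Llama-3.2-3B-Instruct-GGUF
```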

> **Note**:
> As a very general guide: you should have >8 GB of RAM available to run 7B models, 16 GB for 14B models, and 32 GB for 32B models.

### Cortex Built-in Models & Quantizations

| Model          | llama.cpp | Command                     |
| -------------- | :-------: | --------------------------- |
| phi-3.5        | ✅        | `cortex run phi3.5`         |
| llama3.2       | ✅        | `cortex run llama3.2`       |
| llama3.1       | ✅        | `cortex run llama3.1`       |
| codestral      | ✅        | `cortex run codestral`      |
| gemma2         | ✅        | `cortex run gemma2`         |
| mistral        | ✅        | `cortex run mistral`        |
| ministral      | ✅        | `cortex run ministral`      |
| qwen2.5        | ✅        | `cortex run qwen2.5`        |
| openhermes-2.5 | ✅        | `cortex run openhermes-2.5` |
| tinyllama      | ✅        | `cortex run tinyllama`      |

View all [Cortex Built-in Models](https://cortex.so/models).

Cortex supports multiple quantizations for each model:
```
❯ cortex-nightly pull llama3.2
Downloaded models:
llama3.2:3b-gguf-q2-k

Available to download:
1. llama3.2:3b-gguf-q3-kl
2. llama3.2:3b-gguf-q3-km
3. llama3.2:3b-gguf-q3-ks
4. llama3.2:3b-gguf-q4-km (default)
5. llama3.2:3b-gguf-q4-ks
6. llama3.2:3b-gguf-q5-km
7. llama3.2:3b-gguf-q5-ks
8. llama3.2:3b-gguf-q6-k
9. llama3.2:3b-gguf-q8-0

Select a model (1-9):
```
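
After a specific quantization has been downloaded, it can be run by its full tag. A sketch, assuming the `model:variant` form used by the CLI commands elsewhere on this page:

```
# Run the default quantization
cortex run llama3.2

# Run a specific quantization by its full tag
cortex run llama3.2:3b-gguf-q4-km
```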

{/*
<Tabs>
<TabItem value="Llama.cpp" label="Llama.cpp" default>
| Model ID | Variant (Branch) | Model size | CLI command |
|------------------|------------------|-------------------|------------------------------------|
@@ -86,17 +129,4 @@ Cortex.cpp supports the following list of [Built-in Models](/models):
| openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada` |

</TabItem>
</Tabs> */}
