This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Commit 041f6c6 (2 parents: fcf2fb3 + 2c44b1e)
Author: Gabrielle Ong
Merge pull request #1586 from janhq/docs
docs: Cortex.so page

9 files changed: +174 -221 lines

.github/ISSUE_TEMPLATE/bug_report.yml (18 additions, 8 deletions)

@@ -9,14 +9,14 @@ body:
       required: true
     attributes:
       label: "Cortex version"
-      description: "**Tip:** The version is in the app's bottom right corner"
-
+      description: "**Tip:** `cortex -v` outputs the version number"
+
   - type: textarea
     validations:
       required: true
     attributes:
-      label: "Describe the Bug"
-      description: "A clear & concise description of the bug"
+      label: "Describe the issue and expected behaviour"
+      description: "A clear & concise description of the issue encountered"

   - type: textarea
     attributes:

@@ -31,20 +31,30 @@ body:
     attributes:
       label: "Screenshots / Logs"
       description: |
-        You can find logs in: ~/cortex/logs
+        Please include cortex-cli.log and cortex.log files in: ~/cortex/logs/

   - type: checkboxes
     attributes:
       label: "What is your OS?"
       options:
-        - label: MacOS
         - label: Windows
-        - label: Linux
+        - label: Mac Silicon
+        - label: Mac Intel
+        - label: Linux / Ubuntu

   - type: checkboxes
     attributes:
       label: "What engine are you running?"
       options:
         - label: cortex.llamacpp (default)
         - label: cortex.tensorrt-llm (Nvidia GPUs)
-        - label: cortex.onnx (NPUs, DirectML)
+        - label: cortex.onnx (NPUs, DirectML)
+
+  - type: input
+    validations:
+      required: true
+    attributes:
+      label: "Hardware Specs eg OS version, GPU"
+      description:
+
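
The revised template now asks reporters for the version (via `cortex -v`), the two log files, and hardware specs. As a quick illustration of how a reporter might gather those details before filing, here is a minimal sketch (assumes a macOS/Linux shell; the log path is taken from the template text above):

```
# Illustrative only: collect what the updated bug_report.yml asks for.
cortex -v           # version number, per the new "Tip" in the template
ls ~/cortex/logs/   # should contain cortex-cli.log and cortex.log
uname -a            # OS details for the new "Hardware Specs" field
```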

README.md (5 additions, 5 deletions)

@@ -28,7 +28,7 @@ Cortex is a Local AI API Platform that is used to run and customize LLMs.
 Key Features:
 - Straightforward CLI (inspired by Ollama)
 - Full C++ implementation, packageable into Desktop and Mobile apps
-- Pull from Huggingface of Cortex Built-in Model Library
+- Pull from Huggingface, or Cortex Built-in Models
 - Models stored in universal file formats (vs blobs)
 - Swappable Engines (default: [`llamacpp`](https://github.com/janhq/cortex.llamacpp), future: [`ONNXRuntime`](https://github.com/janhq/cortex.onnx), [`TensorRT-LLM`](https://github.com/janhq/cortex.tensorrt-llm))
 - Cortex can be deployed as a standalone API server, or integrated into apps like [Jan.ai](https://jan.ai/)

@@ -88,22 +88,22 @@ Refer to our [Quickstart](https://cortex.so/docs/quickstart/) and
 ### API:
 Cortex.cpp includes a REST API accessible at `localhost:39281`.

-Refer to our [API documentation](https://cortex.so/api-reference) for more details
+Refer to our [API documentation](https://cortex.so/api-reference) for more details.

-## Models & Quantizations
+## Models

 Cortex.cpp allows users to pull models from multiple Model Hubs, offering flexibility and extensive model access.

 Currently Cortex supports pulling from:
-- Hugging Face: GGUF models eg `author/Model-GGUF`
+- [Hugging Face](https://huggingface.co): GGUF models eg `author/Model-GGUF`
 - Cortex Built-in Models

 Once downloaded, the model `.gguf` and `model.yml` files are stored in `~\cortexcpp\models`.

 > **Note**:
 > You should have at least 8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.

-### Cortex Model Hub & Quantizations
+### Cortex Built-in Models & Quantizations

 | Model /Engine | llama.cpp | Command |
 | -------------- | --------------------- | ----------------------------- |
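
To connect the README changes above: models are pulled either from Hugging Face (eg `author/Model-GGUF`) or from the Built-in list, and the server exposes a REST API on `localhost:39281`. A minimal usage sketch follows; the `/v1/chat/completions` route is an assumption based on the linked API reference, and the model names are placeholders:

```
# Hypothetical session; model names are illustrative.
cortex pull tinyllama                # pull a Cortex Built-in Model
cortex run tinyllama                 # start chatting via the CLI

# Query the local REST API (assumes an OpenAI-compatible chat completions route):
curl http://localhost:39281/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "tinyllama", "messages": [{"role": "user", "content": "Hello"}]}'
```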

docs/docs/architecture.mdx (0 additions, 4 deletions)

@@ -148,7 +148,3 @@ Our development roadmap outlines key features and epics we will focus on in the

 - **RAG**: Improve response quality and contextual relevance in our AI models.
 - **Cortex Python Runtime**: Provide a scalable Python execution environment for Cortex.
-
-:::info
-For a full list of Cortex development roadmap, please see [here](https://discord.com/channels/1107178041848909847/1230770299730001941).
-:::

docs/docs/overview.mdx (67 additions, 37 deletions)

@@ -10,39 +10,82 @@ import TabItem from "@theme/TabItem";

 # Cortex

-:::warning
-🚧 Cortex.cpp is currently under development. Our documentation outlines the intended behavior of Cortex, which may not yet be fully implemented in the codebase.
+:::info
+**Real-world Use**: Cortex.cpp powers [Jan](https://jan.ai), our on-device ChatGPT-alternative.
+
+Cortex.cpp is in active development. If you have any questions, please reach out to us on [GitHub](https://github.com/janhq/cortex.cpp/issues/new/choose)
+or [Discord](https://discord.com/invite/FTk2MvZwJH)
 :::

 ![Cortex Cover Image](/img/social-card.jpg)

-Cortex.cpp lets you run AI easily on your computer.
-
-Cortex.cpp is a C++ command-line interface (CLI) designed as an alternative to Ollama. By default, it runs on the `llama.cpp` engine but also supports other engines, including `ONNX` and `TensorRT-LLM`, making it a multi-engine platform.
+Cortex is a Local AI API Platform that is used to run and customize LLMs.

-## Supported Accelerators
-- Nvidia CUDA
-- Apple Metal
-- Qualcomm AI Engine
+Key Features:
+- Straightforward CLI (inspired by Ollama)
+- Full C++ implementation, packageable into Desktop and Mobile apps
+- Pull from Huggingface, or Cortex Built-in Model Library
+- Models stored in universal file formats (vs blobs)
+- Swappable Inference Backends (default: [`llamacpp`](https://github.com/janhq/cortex.llamacpp), future: [`ONNXRuntime`](https://github.com/janhq/cortex.onnx), [`TensorRT-LLM`](https://github.com/janhq/cortex.tensorrt-llm))
+- Cortex can be deployed as a standalone API server, or integrated into apps like [Jan.ai](https://jan.ai/)

-## Supported Inference Backends
-- [llama.cpp](https://github.com/ggerganov/llama.cpp): cross-platform, supports most laptops, desktops and OSes
-- [ONNX Runtime](https://github.com/microsoft/onnxruntime): supports Windows Copilot+ PCs & NPUs
-- [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM): supports Nvidia GPUs
-
-If GPU hardware is available, Cortex is GPU accelerated by default.
-
-:::info
-**Real-world Use**: Cortex.cpp powers [Jan](https://jan.ai), our on-device ChatGPT-alternative.
+Cortex's roadmap is to implement the full OpenAI API including Tools, Runs, Multi-modal and Realtime APIs.

-Cortex.cpp has been battle-tested across 1 million+ downloads and handles a variety of hardware configurations.
-:::

-## Supported Models
+## Inference Backends
+- Default: [llama.cpp](https://github.com/ggerganov/llama.cpp): cross-platform, supports most laptops, desktops and OSes
+- Future: [ONNX Runtime](https://github.com/microsoft/onnxruntime): supports Windows Copilot+ PCs & NPUs
+- Future: [TensorRT-LLM](https://github.com/NVIDIA/TensorRT-LLM): supports Nvidia GPUs

-Cortex.cpp supports the following list of [Built-in Models](/models):
+If GPU hardware is available, Cortex is GPU accelerated by default.

-<Tabs>
+## Models
+Cortex.cpp allows users to pull models from multiple Model Hubs, offering flexibility and extensive model access.
+- [Hugging Face](https://huggingface.co)
+- [Cortex Built-in Models](https://cortex.so/models)
+
+> **Note**:
+> As a very general guide: You should have >8 GB of RAM available to run the 7B models, 16 GB to run the 14B models, and 32 GB to run the 32B models.
+
+### Cortex Built-in Models & Quantizations
+| Model /Engine | llama.cpp | Command |
+| -------------- | --------------------- | ----------------------------- |
+| phi-3.5 || cortex run phi3.5 |
+| llama3.2 || cortex run llama3.1 |
+| llama3.1 || cortex run llama3.1 |
+| codestral || cortex run codestral |
+| gemma2 || cortex run gemma2 |
+| mistral || cortex run mistral |
+| ministral || cortex run ministral |
+| qwen2 || cortex run qwen2.5 |
+| openhermes-2.5 || cortex run openhermes-2.5 |
+| tinyllama || cortex run tinyllama |
+
+View all [Cortex Built-in Models](https://cortex.so/models).
+
+Cortex supports multiple quantizations for each model.
+```
+❯ cortex-nightly pull llama3.2
+Downloaded models:
+llama3.2:3b-gguf-q2-k
+
+Available to download:
+1. llama3.2:3b-gguf-q3-kl
+2. llama3.2:3b-gguf-q3-km
+3. llama3.2:3b-gguf-q3-ks
+4. llama3.2:3b-gguf-q4-km (default)
+5. llama3.2:3b-gguf-q4-ks
+6. llama3.2:3b-gguf-q5-km
+7. llama3.2:3b-gguf-q5-ks
+8. llama3.2:3b-gguf-q6-k
+9. llama3.2:3b-gguf-q8-0
+
+Select a model (1-9):
+```
+
+
+{/*
+<Tabs>
 <TabItem value="Llama.cpp" label="Llama.cpp" default>
 | Model ID | Variant (Branch) | Model size | CLI command |
 |------------------|------------------|-------------------|------------------------------------|

@@ -86,17 +129,4 @@ Cortex.cpp supports the following list of [Built-in Models](/models):
 | openhermes-2.5 | 7b-tensorrt-llm-linux-ada | 7B | `cortex run openhermes-2.5:7b-tensorrt-llm-linux-ada`|

 </TabItem>
-</Tabs>
-:::info
-Cortex.cpp supports pulling `GGUF` and `ONNX` models from the [Hugging Face Hub](https://huggingface.co). Read how to [Pull models from Hugging Face](/docs/hub/hugging-face/)
-:::
-
-## Cortex.cpp Versions
-Cortex.cpp offers three different versions of the app, each serving a unique purpose:
-- **Stable**: The official release version of Cortex.cpp, designed for general use with proven stability.
-- **Beta**: This version includes upcoming features still in testing, allowing users to try new functionality before the next official release.
-- **Nightly**: Automatically built every night, this version includes the latest updates and changes from the engineering team but may be unstable.
-
-:::info
-Each of these versions has a different CLI prefix command.
-:::
+</Tabs> */}
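
For readers of the added overview content: the pull command above lists quantization-tagged model IDs (eg `llama3.2:3b-gguf-q4-km`). A small sketch of the implied workflow, assuming the `model:tag` form shown in the pull output can also be passed to `run` (not confirmed by this diff):

```
# Pull and pick a quantization interactively, as in the docs example:
cortex pull llama3.2

# Then run the selected quantization (assumes tagged IDs are accepted by `run`):
cortex run llama3.2:3b-gguf-q4-km
```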
