DeepCogito Cogito-671B-V2.1-FP8
Cogito v2.1 671B is a large language model developed by DeepCogito, optimized for high-performance natural language processing tasks.
This model uses significantly fewer tokens than other models of similar capability because it has better reasoning capabilities. It also improves on instruction following, coding, longer queries, multi-turn conversations, and creativity.
On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models.
- Cogito v2.1 minimum specs are either 8x H200 or 8x B200.
- A network volume of 800GB is recommended to avoid redownloading the model.
- Note that the model takes about 7 minutes to load into GPU memory; we recommend keeping an active worker to minimize delays.
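As a rough illustration of why eight GPUs are listed as the minimum, the sketch below compares the approximate weight footprint with the memory the server is allowed to use. The per-GPU memory figures are assumptions (nominal 141 GB for H200 and 192 GB for B200), the 0.9 factor is the GPU_MEMORY_UTILIZATION value from the environment variables in this document, and the estimate ignores activations and runtime overhead.

```python
# Back-of-the-envelope memory check. All hardware figures are assumed
# nominal values, not measurements from this deployment.
PARAMS_BILLIONS = 671            # taken from the model name
BYTES_PER_PARAM = 1              # FP8 stores one byte per weight
GPU_COUNT = 8                    # matches TENSOR_PARALLEL_SIZE=8
GPU_MEMORY_UTILIZATION = 0.9     # value recommended in this document
HBM_GB = {"H200": 141, "B200": 192}  # assumed per-GPU memory in GB


def headroom_gb(gpu: str) -> float:
    """Memory left for KV cache and activations after loading the weights."""
    weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM  # about 671 GB
    usable_gb = HBM_GB[gpu] * GPU_COUNT * GPU_MEMORY_UTILIZATION
    return usable_gb - weights_gb


for gpu in HBM_GB:
    print(f"8x {gpu}: roughly {headroom_gb(gpu):.0f} GB left after weights")
```

Under these assumptions the weights alone occupy most of an 8x H200 node, which is consistent with the modest MAX_MODEL_LEN and MAX_NUM_BATCHED_TOKENS values recommended here.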
Optimal Environment Variables
DOWNLOAD_DIR=/runpod-volume
MAX_MODEL_LEN=16384
GPU_MEMORY_UTILIZATION=0.9
MAX_CONCURRENCY=40
HF_TOKEN=<HF TOKEN>
TENSOR_PARALLEL_SIZE=8
RUNPOD_INIT_TIMEOUT=1400
ENABLE_PREFIX_CACHING=True
MAX_NUM_BATCHED_TOKENS=2048
ENABLE_EXPERT_PARALLEL=True
Input Schema
{
"prompt": {
"type": "string",
"required": false,
"default": "What is Runpod?"
},
"system_prompt": {
"type": "string",
"required": false,
"default": ""
},
"messages": {
"type": "array",
"required": false,
"items": {
"type": "object",
"properties": {
"role": {
"type": "string",
"enum": [
"user",
"assistant",
"system"
]
},
"content": {
"type": "string"
}
},
"required": [
"role",
"content"
]
}
},
"temperature": {
"type": "number",
"required": false,
"default": 0.7,
"minimum": 0,
"maximum": 2
},
"top_k": {
"type": "number",
"required": false,
"default": -1,
"minimum": -1,
"maximum": 100
},
"seed": {
"type": "number",
"required": false,
"default": -1,
"minimum": 0,
"maximum": 4294967295
},
"top_p": {
"type": "number",
"required": false,
"default": 1,
"minimum": 0,
"maximum": 1
},
"max_tokens": {
"type": "number",
"required": false,
"default": 250,
"minimum": 1,
"maximum": 4096
}
}
Example Request
{
"input": {
"messages": [
{"role": "system", "content": "You are a helpful AI assistant. Provide clear and concise responses."},
{"role": "user", "content": "What is Cogito and what makes it special?"}
]
}
}
Example Response
{
"delayTime": 25,
"executionTime": 3153,
"id": "sync-0f3288b5-58e8-46fd-ba73-53945f5e8982-u2",
"output": [
{
"choices": [
{
"tokens": [
"Cogito-671B is a large language model developed by DeepCogito. This model provides high-quality natural language processing capabilities while maintaining computational efficiency.\n\nKey features that make Cogito special include:\n\n1. **Efficiency**: The model is optimized for performance, providing good results with lower computational requirements compared to larger models.\n\n2. **Quality**: Despite its size, it maintains high-quality outputs for various NLP tasks including text generation, summarization, and question answering.\n\n3. **Enterprise Focus**: Cogito models are designed with enterprise use cases in mind, offering reliability and consistency.\n\n4. **Research Backing**: Built on DeepCogito's extensive research in AI and machine learning, incorporating the latest advances in transformer architecture and training methodologies.\n\n5. **Versatility**: Capable of handling a wide range of natural language tasks while being more accessible than larger, more resource-intensive models."
]
}
],
"cost": 0.0001,
"usage": {
"input": 10,
"output": 100
}
}
],
"status": "COMPLETED",
"workerId": "pkej0t9bbyjrgy"
}
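The request and response shapes above can be handled with a small amount of client-side code. The following is a minimal sketch, not an official client: `build_payload` and `extract_text` are hypothetical helper names, the ranges are copied from the input schema in this document, and placing sampling parameters next to `messages` inside `input` is an assumption. `seed` is left out because its schema default and minimum disagree. Sending the payload to an endpoint is left to the reader.

```python
import json

# Ranges copied from the input schema in this document: (minimum, maximum).
PARAM_RANGES = {
    "temperature": (0, 2),
    "top_k": (-1, 100),
    "top_p": (0, 1),
    "max_tokens": (1, 4096),
}
ALLOWED_ROLES = {"user", "assistant", "system"}


def build_payload(messages, **params):
    """Validate messages and sampling parameters, then wrap them in "input"."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("message content must be a string")
    for name, value in params.items():
        if name not in PARAM_RANGES:
            raise ValueError(f"unknown parameter: {name}")
        low, high = PARAM_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    # Assumption: sampling parameters sit next to "messages" inside "input".
    return {"input": {"messages": list(messages), **params}}


def extract_text(response):
    """Join the token strings from every choice of a COMPLETED response."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"job not completed: {response.get('status')}")
    parts = []
    for item in response["output"]:
        for choice in item["choices"]:
            parts.extend(choice["tokens"])
    return "".join(parts)


payload = build_payload(
    [{"role": "user", "content": "What is Cogito and what makes it special?"}],
    temperature=0.7,
    max_tokens=250,
)
print(json.dumps(payload, indent=2))
```

Validating on the client avoids waiting on a worker only to have an out-of-range parameter rejected, and `extract_text` can be applied directly to a parsed response shaped like the example above.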