DeepCogito Cogito-671B-V2.1-FP8

Description

Cogito v2.1 671B is a large language model developed by DeepCogito, optimized for high-performance natural language processing tasks.

This model uses significantly fewer tokens than other models of similar capability, because it has better reasoning capabilities. It also improves on instruction following, coding, longer queries, multi-turn conversations, and creativity.

On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models.

GPU Specs

Cogito v2.1 requires at minimum either 8x H200 or 8x B200 GPUs.
A network volume of 800 GB is recommended so that the model does not have to be downloaded again on every start.
Note that the model takes about 7 minutes to load onto the GPUs, so we recommend keeping an active worker to minimize delays.
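
As a rough sanity check on these numbers, the sketch below estimates the weight footprint against the usable memory of an 8x H200 node. It is a back-of-the-envelope calculation, not a measurement: the 141 GB per-GPU figure and the one-byte-per-parameter FP8 assumption are ours, and real checkpoints and runtime overhead (KV cache, activations) will differ.

```python
# Back-of-the-envelope memory estimate. All figures are approximations
# and assumptions, not measurements taken from this worker.
PARAMS_BILLIONS = 671          # taken from the model name
BYTES_PER_PARAM_FP8 = 1        # assumption: FP8 stores one byte per weight
H200_MEMORY_GB = 141           # assumption: HBM capacity of a single H200
NUM_GPUS = 8                   # matches TENSOR_PARALLEL_SIZE=8
GPU_MEMORY_UTILIZATION = 0.9   # matches GPU_MEMORY_UTILIZATION=0.9

# 1e9 parameters at 1 byte each is roughly 1 GB.
weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM_FP8

# Memory the inference engine is allowed to use across all GPUs.
usable_gb = NUM_GPUS * H200_MEMORY_GB * GPU_MEMORY_UTILIZATION

# Whatever is left after the weights is available for KV cache and activations.
headroom_gb = usable_gb - weights_gb

print(f"weights:  ~{weights_gb} GB")
print(f"usable:   ~{round(usable_gb)} GB")
print(f"headroom: ~{round(headroom_gb)} GB")
```

Under these assumptions the weights alone come to roughly 671 GB, which is consistent with the recommended 800 GB network volume and with needing all eight GPUs.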

Environment Variables

Optimal Environment Variables

DOWNLOAD_DIR=/runpod-volume
MAX_MODEL_LEN=16384
GPU_MEMORY_UTILIZATION=0.9
MAX_CONCURRENCY=40
HF_TOKEN=<HF TOKEN>
TENSOR_PARALLEL_SIZE=8
RUNPOD_INIT_TIMEOUT=1400
ENABLE_PREFIX_CACHING=True
MAX_NUM_BATCHED_TOKENS=2048
ENABLE_EXPERT_PARALLEL=True
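
For illustration, the following sketch shows one way a worker could read these variables and fall back to the values listed above. The function name `load_settings` and the returned dictionary layout are hypothetical; this is not the code in handler.py or utils.py.

```python
import os


def _parse_bool(value: str) -> bool:
    """Interpret common truthy spellings such as "True", "true" or "1"."""
    return value.strip().lower() in {"1", "true", "yes"}


def load_settings(env=None) -> dict:
    """Read the worker settings from the environment (hypothetical helper).

    Defaults mirror the values listed in this README. HF_TOKEN has no
    default and is returned as None when unset.
    """
    env = os.environ if env is None else env
    return {
        "download_dir": env.get("DOWNLOAD_DIR", "/runpod-volume"),
        "max_model_len": int(env.get("MAX_MODEL_LEN", "16384")),
        "gpu_memory_utilization": float(env.get("GPU_MEMORY_UTILIZATION", "0.9")),
        "max_concurrency": int(env.get("MAX_CONCURRENCY", "40")),
        "hf_token": env.get("HF_TOKEN"),
        "tensor_parallel_size": int(env.get("TENSOR_PARALLEL_SIZE", "8")),
        "runpod_init_timeout": int(env.get("RUNPOD_INIT_TIMEOUT", "1400")),
        "enable_prefix_caching": _parse_bool(env.get("ENABLE_PREFIX_CACHING", "True")),
        "max_num_batched_tokens": int(env.get("MAX_NUM_BATCHED_TOKENS", "2048")),
        "enable_expert_parallel": _parse_bool(env.get("ENABLE_EXPERT_PARALLEL", "True")),
    }
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise locally without touching the real environment.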

Schema

{
  "prompt": {
    "type": "string",
    "required": false,
    "default": "What is Runpod?"
  },
  "system_prompt": {
    "type": "string",
    "required": false,
    "default": ""
  },
  "messages": {
    "type": "array",
    "required": false,
    "items": {
      "type": "object",
      "properties": {
        "role": {
          "type": "string",
          "enum": [
            "user",
            "assistant",
            "system"
          ]
        },
        "content": {
          "type": "string"
        }
      },
      "required": [
        "role",
        "content"
      ]
    }
  },
  "temperature": {
    "type": "number",
    "required": false,
    "default": 0.7,
    "minimum": 0,
    "maximum": 2
  },
  "top_k": {
    "type": "number",
    "required": false,
    "default": -1,
    "minimum": -1,
    "maximum": 100
  },
  "seed": {
    "type": "number",
    "required": false,
    "default": -1,
    "minimum": 0,
    "maximum": 4294967295
  },
  "top_p": {
    "type": "number",
    "required": false,
    "default": 1,
    "minimum": 0,
    "maximum": 1
  },
  "max_tokens": {
    "type": "number",
    "required": false,
    "default": 250,
    "minimum": 1,
    "maximum": 4096
  }
}
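
The schema can be read as a set of defaults plus range checks. The sketch below applies it to a request's input on the client side; the function name `apply_schema` is hypothetical and the worker's own validation may behave differently. Because the default `seed` of -1 lies below the stated minimum of 0, ranges are only checked for values the caller actually supplies.

```python
# (default, minimum, maximum) for each numeric field, copied from the schema.
NUMERIC_FIELDS = {
    "temperature": (0.7, 0, 2),
    "top_k": (-1, -1, 100),
    "seed": (-1, 0, 4294967295),
    "top_p": (1, 0, 1),
    "max_tokens": (250, 1, 4096),
}
ALLOWED_ROLES = {"user", "assistant", "system"}


def apply_schema(job_input: dict) -> dict:
    """Fill in schema defaults and reject out-of-range values (hypothetical helper)."""
    result = {
        "prompt": job_input.get("prompt", "What is Runpod?"),
        "system_prompt": job_input.get("system_prompt", ""),
    }

    # Each message needs a known role and a string content.
    if "messages" in job_input:
        for message in job_input["messages"]:
            if message.get("role") not in ALLOWED_ROLES:
                raise ValueError(f"invalid role: {message.get('role')!r}")
            if not isinstance(message.get("content"), str):
                raise ValueError("message content must be a string")
        result["messages"] = job_input["messages"]

    for name, (default, minimum, maximum) in NUMERIC_FIELDS.items():
        if name not in job_input:
            result[name] = default  # defaults are passed through unchecked
            continue
        value = job_input[name]
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number")
        if not minimum <= value <= maximum:
            raise ValueError(f"{name} must be between {minimum} and {maximum}")
        result[name] = value
    return result
```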

Input

Chat Template

{
  "input": {
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant. Provide clear and concise responses."},
      {"role": "user", "content": "What is Cogito and what makes it special?"}
    ]
  }
}
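
A payload like the one above could be sent to a deployed endpoint roughly as follows. This is a minimal sketch using only the standard library. It assumes the endpoint is reached through Runpod's synchronous `runsync` route, and the helper names `build_runsync_request` and `run_sync` are hypothetical. The endpoint ID and API key are placeholders you must supply.

```python
import json
import urllib.request


def build_runsync_request(endpoint_id: str, api_key: str, messages: list) -> urllib.request.Request:
    """Build a POST request carrying the chat messages under the "input" key."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": {"messages": messages}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_sync(request: urllib.request.Request, timeout: float = 600) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To use it, call `build_runsync_request` with your own endpoint ID, API key, and the two messages shown above, then pass the result to `run_sync`. A generous timeout is a deliberate choice here, since a cold worker needs several minutes to load the model.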

Expected Output

{
  "delayTime": 25,
  "executionTime": 3153,
  "id": "sync-0f3288b5-58e8-46fd-ba73-53945f5e8982-u2",
  "output": [
    {
      "choices": [
        {
          "tokens": [
            "Cogito-671B is a large language model developed by DeepCogito. This model provides high-quality natural language processing capabilities while maintaining computational efficiency.\n\nKey features that make Cogito special include:\n\n1. **Efficiency**: The model is optimized for performance, providing good results with lower computational requirements compared to larger models.\n\n2. **Quality**: Despite its size, it maintains high-quality outputs for various NLP tasks including text generation, summarization, and question answering.\n\n3. **Enterprise Focus**: Cogito models are designed with enterprise use cases in mind, offering reliability and consistency.\n\n4. **Research Backing**: Built on DeepCogito's extensive research in AI and machine learning, incorporating the latest advances in transformer architecture and training methodologies.\n\n5. **Versatility**: Capable of handling a wide range of natural language tasks while being more accessible than larger, more resource-intensive models."
          ]
        }
      ],
      "cost": 0.0001,
      "usage": {
        "input": 10,
        "output": 100
      }
    }
  ],
  "status": "COMPLETED",
  "workerId": "pkej0t9bbyjrgy"
}
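
Given a response shaped like the one above, the generated text sits under `output[].choices[].tokens[]`. The sketch below pulls it out and sums the usage counters. The helper names are hypothetical, and it assumes every response follows the structure of this example.

```python
def extract_text(response: dict) -> str:
    """Concatenate all token strings from a completed response."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"job not completed: {response.get('status')!r}")
    parts = []
    for item in response.get("output", []):
        for choice in item.get("choices", []):
            parts.extend(choice.get("tokens", []))
    return "".join(parts)


def usage_totals(response: dict) -> dict:
    """Sum the input and output token counts across all output items."""
    totals = {"input": 0, "output": 0}
    for item in response.get("output", []):
        usage = item.get("usage", {})
        totals["input"] += usage.get("input", 0)
        totals["output"] += usage.get("output", 0)
    return totals
```

Checking `status` first avoids silently returning an empty string when a job failed or is still in progress.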

runpod-workers/deep-cogito-cogito-671b-v2.1-FP8
