runpod-workers/deep-cogito-cogito-671b-v2.1-FP8


DeepCogito Cogito-671B-V2.1-FP8


Description

Cogito v2.1 671B is a large language model developed by DeepCogito, optimized for high-performance natural language processing tasks.

The model uses significantly fewer tokens than other models of similar capability, because it has better reasoning capabilities. It also improves on instruction following, coding, longer queries, multi-turn conversations, and creativity.

On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models.

GPU Specs

  • Cogito v2.1 requires at least 8x H200 or 8x B200 GPUs.
  • A network volume of 800GB is recommended so the model does not have to be redownloaded.
  • The model takes about 7 minutes to load onto the GPUs, so we recommend keeping an active worker to minimize delays.

Environment Variables

Optimal Environment Variables

DOWNLOAD_DIR=/runpod-volume
MAX_MODEL_LEN=16384
GPU_MEMORY_UTILIZATION=0.9
MAX_CONCURRENCY=40
HF_TOKEN=<HF TOKEN>
TENSOR_PARALLEL_SIZE=8
RUNPOD_INIT_TIMEOUT=1400
ENABLE_PREFIX_CACHING=True
MAX_NUM_BATCHED_TOKENS=2048
ENABLE_EXPERT_PARALLEL=True
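
As a rough illustration of how these variables could be read into typed settings, here is a minimal sketch. The `read_settings` helper and the `DEFAULTS` table are hypothetical and not part of the worker; the type of each variable is an assumption inferred from the values listed above.

```python
import os

# Values mirror the list above. The types are assumptions inferred from
# the values; the worker itself may parse them differently.
DEFAULTS = {
    "MAX_MODEL_LEN": 16384,
    "GPU_MEMORY_UTILIZATION": 0.9,
    "MAX_CONCURRENCY": 40,
    "TENSOR_PARALLEL_SIZE": 8,
    "RUNPOD_INIT_TIMEOUT": 1400,
    "ENABLE_PREFIX_CACHING": True,
    "MAX_NUM_BATCHED_TOKENS": 2048,
    "ENABLE_EXPERT_PARALLEL": True,
}


def read_settings(env=None):
    """Return typed settings, falling back to the values listed above."""
    env = os.environ if env is None else env
    settings = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            settings[name] = default
        elif isinstance(default, bool):
            # bool is checked first because bool is a subclass of int.
            settings[name] = raw.strip().lower() in ("1", "true", "yes")
        else:
            # int("16384") or float("0.9"), depending on the default's type.
            settings[name] = type(default)(raw)
    return settings
```

Passing a plain dict instead of `os.environ` makes the helper easy to try without touching the real environment, for example `read_settings({"MAX_MODEL_LEN": "8192"})`.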

Schema

{
  "prompt": {
    "type": "string",
    "required": false,
    "default": "What is Runpod?"
  },
  "system_prompt": {
    "type": "string",
    "required": false,
    "default": ""
  },
  "messages": {
    "type": "array",
    "required": false,
    "items": {
      "type": "object",
      "properties": {
        "role": {
          "type": "string",
          "enum": [
            "user",
            "assistant",
            "system"
          ]
        },
        "content": {
          "type": "string"
        }
      },
      "required": [
        "role",
        "content"
      ]
    }
  },
  "temperature": {
    "type": "number",
    "required": false,
    "default": 0.7,
    "minimum": 0,
    "maximum": 2
  },
  "top_k": {
    "type": "number",
    "required": false,
    "default": -1,
    "minimum": -1,
    "maximum": 100
  },
  "seed": {
    "type": "number",
    "required": false,
    "default": -1,
    "minimum": 0,
    "maximum": 4294967295
  },
  "top_p": {
    "type": "number",
    "required": false,
    "default": 1,
    "minimum": 0,
    "maximum": 1
  },
  "max_tokens": {
    "type": "number",
    "required": false,
    "default": 250,
    "minimum": 1,
    "maximum": 4096
  }
}
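
The schema can be checked on the client before a request is sent. The sketch below is one way to do that, not the worker's own validation: `apply_defaults` and `NUMERIC_FIELDS` are hypothetical names. It leaves `seed` out because the listed default (-1) falls outside the listed range, so its intended handling is unclear from the schema alone.

```python
# (default, minimum, maximum) for the numeric fields in the schema above.
# "seed" is intentionally omitted: its default of -1 is below its minimum of 0.
NUMERIC_FIELDS = {
    "temperature": (0.7, 0, 2),
    "top_k": (-1, -1, 100),
    "top_p": (1, 0, 1),
    "max_tokens": (250, 1, 4096),
}

ALLOWED_ROLES = ("user", "assistant", "system")


def apply_defaults(user_input):
    """Fill in numeric defaults and reject values outside the schema ranges."""
    params = dict(user_input)
    for name, (default, lo, hi) in NUMERIC_FIELDS.items():
        value = params.setdefault(name, default)
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
    for message in params.get("messages", []):
        # Each message needs a known role and string content.
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("message content must be a string")
    return params
```

Calling `apply_defaults({"prompt": "Hello"})` returns the prompt plus the four numeric defaults, while `apply_defaults({"temperature": 3})` raises `ValueError`.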

Input

Chat Template

{
  "input": {
    "messages": [
      {"role": "system", "content": "You are a helpful AI assistant. Provide clear and concise responses."},
      {"role": "user", "content": "What is Cogito and what makes it special?"}
    ]
  }
}
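
A payload like this can be sent from Python using only the standard library. This is a minimal sketch assuming the worker is deployed as a Runpod serverless endpoint and called through the synchronous `runsync` route (the `sync-` prefix on the job ID in the expected output suggests this). The endpoint ID and API key are placeholders you must supply, and `build_request` and `run_sync` are hypothetical helper names.

```python
import json
import urllib.request


def build_request(endpoint_id, api_key, messages):
    """Build a POST request that wraps the messages in the "input" envelope."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/runsync"
    body = json.dumps({"input": {"messages": messages}}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def run_sync(endpoint_id, api_key, messages, timeout=300):
    """Send the request and return the decoded JSON response."""
    request = build_request(endpoint_id, api_key, messages)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, `run_sync("<ENDPOINT ID>", "<API KEY>", [{"role": "user", "content": "What is Cogito?"}])` performs the network call; `build_request` on its own is useful for inspecting the request without sending it.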

Expected Output

{
  "delayTime": 25,
  "executionTime": 3153,
  "id": "sync-0f3288b5-58e8-46fd-ba73-53945f5e8982-u2",
  "output": [
    {
      "choices": [
        {
          "tokens": [
            "Cogito-671B is a large language model developed by DeepCogito. This model provides high-quality natural language processing capabilities while maintaining computational efficiency.\n\nKey features that make Cogito special include:\n\n1. **Efficiency**: The model is optimized for performance, providing good results with lower computational requirements compared to larger models.\n\n2. **Quality**: Despite its size, it maintains high-quality outputs for various NLP tasks including text generation, summarization, and question answering.\n\n3. **Enterprise Focus**: Cogito models are designed with enterprise use cases in mind, offering reliability and consistency.\n\n4. **Research Backing**: Built on DeepCogito's extensive research in AI and machine learning, incorporating the latest advances in transformer architecture and training methodologies.\n\n5. **Versatility**: Capable of handling a wide range of natural language tasks while being more accessible than larger, more resource-intensive models."
          ]
        }
      ],
      "cost": 0.0001,
      "usage": {
        "input": 10,
        "output": 100
      }
    }
  ],
  "status": "COMPLETED",
  "workerId": "pkej0t9bbyjrgy"
}
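
The generated text sits a few levels deep in this response. The sketch below shows one way to pull it out, assuming the response always has the shape shown above (`output` → `choices` → `tokens`); `extract_text` and `total_usage` are hypothetical helpers, not part of the worker.

```python
def extract_text(response):
    """Join all token strings from every choice in a completed response."""
    if response.get("status") != "COMPLETED":
        raise RuntimeError(f"job not completed: {response.get('status')!r}")
    parts = []
    for item in response.get("output", []):
        for choice in item.get("choices", []):
            parts.extend(choice.get("tokens", []))
    return "".join(parts)


def total_usage(response):
    """Sum the input and output token counts across all output items."""
    totals = {"input": 0, "output": 0}
    for item in response.get("output", []):
        usage = item.get("usage", {})
        totals["input"] += usage.get("input", 0)
        totals["output"] += usage.get("output", 0)
    return totals
```

Checking `status` first avoids silently returning an empty string when the job has failed or is still queued.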

About

hub template for cogito-671b-v2.1-FP8
