openai · jrusso1020 · Apr 28, 2026
diff --git a/.agents/plugins/marketplace.json b/.agents/plugins/marketplace.json
@@ -1406,6 +1406,18 @@
       },
       "category": "Design"
     },
+    {
+      "name": "heygen",
+      "source": {
+        "source": "local",
+        "path": "./plugins/heygen"
+      },
+      "policy": {
+        "installation": "AVAILABLE",
+        "authentication": "ON_INSTALL"
+      },
+      "category": "Design"
+    },
     {
       "name": "supabase",
       "source": {

diff --git a/plugins/heygen/.app.json b/plugins/heygen/.app.json
@@ -0,0 +1,7 @@
+{
+  "apps": {
+    "heygen": {
+      "id": "asdk_app_69418aad55e08191aa5e437b649ca2e4"
+    }
+  }
+}
diff --git a/plugins/heygen/.codex-plugin/plugin.json b/plugins/heygen/.codex-plugin/plugin.json
@@ -0,0 +1,44 @@
+{
+  "name": "heygen",
+  "version": "2.2.0",
+  "description": "Create HeyGen avatar videos and personalized video messages. Build a persistent digital identity from a photo, then generate presenter-led videos with your digital twin.",
+  "author": {
+    "name": "HeyGen",
+    "email": "developers@heygen.com",
+    "url": "https://heygen.com"
+  },
+  "homepage": "https://heygen.com",
+  "repository": "https://github.com/heygen-com/skills",
+  "license": "MIT",
+  "keywords": [
+    "heygen",
+    "avatar",
+    "identity",
+    "video",
+    "digital-twin",
+    "video-message",
+    "presenter",
+    "talking-head",
+    "ai-avatar",
+    "avatar-video"
+  ],
+  "skills": "./skills/",
+  "apps": "./.app.json",
+  "interface": {
+    "displayName": "HeyGen",
+    "shortDescription": "Avatar videos and personalized video messages",
+    "longDescription": "HeyGen Skills give your agent a face, a voice, and the ability to send video like a message. Use heygen-avatar to build a persistent digital identity from a photo and pick a voice, then heygen-video to generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline (avatar resolution, aspect ratio correction, prompt engineering, and voice selection are handled automatically).",
+    "developerName": "HeyGen",
+    "category": "Design",
+    "capabilities": ["Read", "Write"],
+    "websiteURL": "https://heygen.com",
+    "defaultPrompt": [
+      "Create my HeyGen avatar from this photo",
+      "Make a 30-second intro video of myself",
+      "Send a video update to my team about this week's progress"
+    ],
+    "brandColor": "#0a0a0a",
+    "composerIcon": "./assets/icon.png",
+    "logo": "./assets/logo.png"
+  }
+}
diff --git a/plugins/heygen/README.md b/plugins/heygen/README.md
@@ -0,0 +1,25 @@
+# heygen
+
+OpenAI Codex plugin for [HeyGen](https://heygen.com) — create AI avatar videos and personalized video messages.
+
+## What's included
+
+Two skills that chain together, plus a reference to the HeyGen app:
+
+- **heygen-avatar** — turn a photo into a persistent digital twin. Handles avatar lookup, instant-avatar creation, voice selection (or voice cloning), and writes an `AVATAR` file the video skill reads back.
+- **heygen-video** — generate identity-first presenter videos via the HeyGen v3 Video Agent pipeline. Encodes the prompting, asset routing, aspect-ratio correction, and avatar/voice resolution that good HeyGen videos need.
+- **HeyGen app reference** — `.app.json` points at the curated [HeyGen ChatGPT app](https://chatgpt.com/apps/heygen/asdk_app_69418aad55e08191aa5e437b649ca2e4).
+
+## Requirements
+
+Installing the plugin connects the HeyGen ChatGPT app automatically (OAuth on first use). That is enough for the skills to work end-to-end on the user's existing HeyGen plan credits.
+
+If you'd rather not use the app, the skills also support the HeyGen CLI: install it from <https://static.heygen.ai/cli/install.sh> and export `HEYGEN_API_KEY` (get one at <https://app.heygen.com/api>).
+
+## Source of truth
+
+The skills are authored in [`heygen-com/skills`](https://github.com/heygen-com/skills) (under `heygen-avatar/` and `heygen-video/` at the repo root) and mirrored here. The main structural delta in this mirror is the wrapping `skills/` parent directory required by the Codex plugin convention. File issues about skill content on that repo.
+
+## License
+
+MIT
diff --git a/plugins/heygen/agents/openai.yaml b/plugins/heygen/agents/openai.yaml
@@ -0,0 +1,6 @@
+interface:
+  display_name: "HeyGen"
+  short_description: "Create avatar videos and personalized video messages"
+  icon_small: "./assets/icon.png"
+  icon_large: "./assets/logo.png"
+  default_prompt: "Help me create a personalized HeyGen video message. Ask who should appear on camera, who the audience is, the key points, and the tone before generating it."
diff --git a/plugins/heygen/assets/PRISM_ORB.svg b/plugins/heygen/assets/PRISM_ORB.svg
diff --git a/plugins/heygen/assets/icon.png b/plugins/heygen/assets/icon.png
diff --git a/plugins/heygen/assets/logo.png b/plugins/heygen/assets/logo.png
diff --git a/plugins/heygen/skills/heygen-avatar/SKILL.md b/plugins/heygen/skills/heygen-avatar/SKILL.md
diff --git a/plugins/heygen/skills/heygen-avatar/agents/openai.yaml b/plugins/heygen/skills/heygen-avatar/agents/openai.yaml
@@ -0,0 +1,4 @@
+interface:
+  display_name: "HeyGen Avatar"
+  short_description: "Create reusable HeyGen avatar identities"
+  default_prompt: "Create a reusable HeyGen avatar for me from a photo or written description, then help me choose a matching voice."
diff --git a/plugins/heygen/skills/heygen-avatar/references/asset-routing.md b/plugins/heygen/skills/heygen-avatar/references/asset-routing.md
@@ -0,0 +1,86 @@
+# Asset Handling — The Classification Engine
+
+When the user provides files, URLs, or references, route each asset to the right path. The user should NEVER have to think about this.
+
+## Two Paths
+
+| Path | What happens | When to use |
+|------|-------------|-------------|
+| **A: Contextualize → Prompt** | Read/analyze the asset, extract key info, bake into script. Video Agent never sees the original. | Reference material, auth-walled content, documents where the *information* matters more than the *visual*. |
+| **B: Attach to API** | Upload the raw file via `files[]`. Video Agent analyzes, extracts graphics, uses as frames/B-roll. | Screenshots, branded assets, PDFs with important visual layouts, images the viewer should literally see. |
+| **A+B: Both** | Contextualize for script quality AND attach for visual use. | Long docs where you need to summarize but Video Agent should also have the full source. |
+
+## Classification Flow
+
+```
+1. Can Video Agent access this directly?
+   - Public URL (no auth, no paywall) → YES
+   - Private/internal URL → NO
+   - Local file → NO (must upload first)
+
+2. Should the viewer SEE this asset?
+   - Screenshot, logo, product image, chart → YES → Path B
+   - Research doc, article, context material → NO → Path A
+   - Ambiguous → Path A+B
+
+3. Is the content too long for the prompt?
+   - Short (< 500 words) → fits in prompt
+   - Long (≥ 500 words) → summarize key points, attach full doc
+```
+
+## Decision Matrix
+
+| Asset Type | Publicly Accessible? | Show On Screen? | Route |
+|-----------|---------------------|----------------|-------|
+| Screenshot / image | N/A | Yes | **B: Attach** + describe in prompt as B-roll |
+| Logo / brand asset | N/A | Yes | **B: Attach** + anchor to intro/outro |
+| Public URL to file (PDF, image, video) | Yes | Maybe | **B: Download → upload via `/v3/assets` → pass `asset_id`** + summarize |
+| Public URL to web page (HTML) | Yes | No | **A: Fetch and contextualize only.** Do NOT pass HTML URLs in `files[]`. |
+| Auth-walled URL (requires login) | No | No | **A: Ask the user to paste the content.** Never fabricate. |
+| PDF (short, text-heavy) | N/A | No | **A+B: Extract key points** + attach |
+| PDF (long, visual-rich) | N/A | Maybe | **B: Attach** + summarize top points |
+| Raw data / spreadsheet | N/A | Partially | **A: Analyze and describe** key stats. Attach if charts should appear. |
+
+## Executing Routes
+
+### Path A (Contextualize)
+- URLs: retrieve publicly accessible content with the environment's standard web/content fetch capability
+- For auth-walled content you cannot access: ask the user to paste the text directly
+- Extract the 3-5 most important points relevant to the video
+- Weave naturally into the script. Don't dump. Integrate.
+
+### Path B (Attach)
+Upload to HeyGen:
+
+**App:** upload through the HeyGen app's asset flow when available.
+**CLI:** `heygen asset create --file /path/to/file.png`
+
+Max 32MB per file. Returns JSON with the new `asset_id`.
+
+Or pass inline in `files[]`, using one of these entry forms:
+```jsonc
+{"type": "url", "url": "https://example.com/image.png"}               // public URL
+{"type": "asset_id", "asset_id": "<from upload>"}                     // previously uploaded asset
+{"type": "base64", "data": "<base64>", "content_type": "image/png"}   // inline base64
+```
+
+### Describe Asset Usage in Prompt
+Be SPECIFIC:
+- "Use the uploaded dashboard screenshot as B-roll when discussing analytics"
+- "Display the company logo in the intro and end card"
+
+### Log Classification
+In the learning log entry, record:
+```json
+"assets_classified": [{"type": "image", "route": "attach", "accessible": true, "reason": "product screenshot"}]
+```
+
+## Rules
+
+- **Never ask the user which path unless genuinely 50/50.** You're the producer. Make the call.
+- **When in doubt, do both (A+B).** Over-providing costs nothing.
+- **Always describe attached assets in the prompt.** Uploading without description = ignored.
+- **Auth-walled content is YOUR job.** Bridge the gap between your access and Video Agent's.
+- **URLs that fail:** Try the environment's standard web/content fetch capability. If login/paywall/404 → tell the user, ask for content directly. Never silently fabricate.
+- **HTML URLs cannot go in `files[]`.** Video Agent rejects `text/html`. Web pages are ALWAYS Path A only.
+- **Prefer download→upload→asset_id** over `files[]{url}`. HeyGen's servers are often blocked by CDN/WAF.
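
The Classification Flow in `asset-routing.md` above can be sketched in Python. This is an illustration only and not part of the plugin: the `Asset` fields, the `Route` enum, and `classify` are hypothetical names, and the sketch follows the three-step flow and the HTML/auth-wall rules rather than every row of the Decision Matrix.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Route(Enum):
    CONTEXTUALIZE = "A"   # read the asset and fold key points into the script
    ATTACH = "B"          # upload the raw file and pass it via files[]
    BOTH = "A+B"          # summarize for the script and attach the source


@dataclass
class Asset:
    # Hypothetical fields, chosen for this sketch only.
    is_html_page: bool = False             # a web page rather than a file
    auth_walled: bool = False              # requires a login the agent lacks
    show_on_screen: Optional[bool] = None  # None means "ambiguous"
    word_count: int = 0


def classify(asset: Asset) -> Route:
    # HTML pages cannot go in files[], so they are always Path A.
    if asset.is_html_page:
        return Route.CONTEXTUALIZE
    # Auth-walled content: the user pastes it, then it is contextualized.
    if asset.auth_walled:
        return Route.CONTEXTUALIZE
    # Ambiguous visibility: do both.
    if asset.show_on_screen is None:
        return Route.BOTH
    # The viewer should literally see it: attach.
    if asset.show_on_screen:
        return Route.ATTACH
    # Not shown on screen: long documents are summarized and attached in full.
    if asset.word_count >= 500:
        return Route.BOTH
    return Route.CONTEXTUALIZE
```

For example, `classify(Asset(show_on_screen=True))` yields `Route.ATTACH`, while a 1,200-word research document that should not appear on screen yields `Route.BOTH`.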
diff --git a/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md b/plugins/heygen/skills/heygen-avatar/references/avatar-creation.md
@@ -0,0 +1,178 @@
+# Avatar Creation API Surface
+
+This guide expands `heygen-avatar` Phase 2 (avatar creation) and Phase 3
+(voice selection) with the full API surface, field mappings, and file
+input formats. The SKILL.md gives the high-level workflow; this file is
+the reference when you need exact arguments, edge cases, or alternative
+creation paths.
+
+For *avatar discovery* (finding an existing avatar at video time), see
+[`../../heygen-video/references/avatar-discovery.md`](../../heygen-video/references/avatar-discovery.md).
+
+---
+
+## Avatar Creation: Three Types
+
+`heygen-avatar` Phase 2 supports three creation types. Pick based on what
+the user provides:
+
+| User input | Type | Flow |
+|---|---|---|
+| A photo of a real person | `photo` | Photo avatar creation |
+| A description of an appearance | `prompt` | Prompt-based avatar creation |
+| A short video recording of a real person | `video` | Digital-twin creation |
+
+All three accept an optional `avatar_group_id`, which selects one of two modes:
+- **Mode 1 (omit it):** create a new character (new group).
+- **Mode 2 (include it):** add a new look (variation) to an existing character.
+
+Always use Mode 2 when the avatar already exists and you're creating a
+variant (different outfit, orientation fix, background change). Only use
+Mode 1 for genuinely new identities.
+
+### Photo avatar (from user's photo)
+
+**App:** use the HeyGen app flow for photo avatar creation.
+
+**CLI:**
+```bash
+heygen avatar create -d '{
+  "type": "photo",
+  "name": "My Avatar",
+  "file": {"type": "url", "url": "https://example.com/headshot.jpg"},
+  "avatar_group_id": "<optional>"
+}'
+```
+
+Photo requirements:
+- JPEG or PNG
+- Min 512x512
+- Clear front-facing face
+- Good lighting
+
+### AI-generated avatar (from text prompt)
+
+**App:** use the HeyGen app flow for prompt-based avatar creation.
+
+**CLI:**
+```bash
+heygen avatar create -d '{
+  "type": "prompt",
+  "name": "Tech Presenter",
+  "prompt": "Young professional woman, modern workspace, confident smile",
+  "avatar_group_id": "<optional>"
+}'
+```
+
+Prompt limit: 1000 characters (the API spec says 200 but the actual
+enforced limit is 1000). Be descriptive — include style, features,
+expression, lighting.
+
+Optional: up to 3 `reference_images` to anchor the generated appearance.
+
+### Video avatar / digital twin (from a short recording)
+
+**App:** use the HeyGen app flow for digital-twin creation from video.
+
+**CLI:**
+```bash
+heygen avatar create -d '{
+  "type": "video",
+  "name": "My Video Avatar",
+  "file": {"type": "asset_id", "asset_id": "<uploaded_asset_id>"},
+  "avatar_group_id": "<optional>"
+}'
+```
+
+---
+
+## File Input Formats
+
+`file` accepts three forms:
+
+```jsonc
+// Public URL (no auth, no paywall)
+{ "type": "url", "url": "https://example.com/headshot.jpg" }
+
+// Pre-uploaded asset (from `heygen asset create --file <path>`)
+{ "type": "asset_id", "asset_id": "<id>" }
+
+// Inline base64
+{ "type": "base64", "data": "<base64>", "content_type": "image/png" }
+```
+
+For when each is appropriate, see
+[`references/asset-routing.md`](asset-routing.md).
+
+---
+
+## Response Shape
+
+All three types return:
+```jsonc
+{
+  "avatar_item": {
+    "id": "<look_id>",         // ephemeral — the specific look
+    "group_id": "<group_id>"   // stable — the character identity
+  }
+}
+```
+
+- `id` is the **look_id** — what you pass downstream as `avatar_id` for
+  HeyGen video generation.
+- `group_id` is the **character identity** — stable across looks. Save
+  this in the AVATAR-<NAME>.md file. Always resolve fresh look_ids at
+  video time via the avatar-looks flow rather than caching
+  a specific look_id.
+
+---
+
+## Identity Field → HeyGen Enum Mapping
+
+When building a prompt-based avatar, map identity attributes to these
+HeyGen enums:
+
+- **age**: Young Adult | Early Middle Age | Late Middle Age | Senior | Unspecified
+- **gender**: Man | Woman | Unspecified
+- **ethnicity**: White | Black | Asian American | East Asian | South East Asian | South Asian | Middle Eastern | Pacific | Hispanic | Unspecified
+- **style**: Realistic | Pixar | Cinematic | Vintage | Noir | Cyberpunk | Unspecified
+- **orientation**: square | horizontal | vertical
+- **pose**: half_body | close_up | full_body
+
+---
+
+## Voice Selection (during avatar setup)
+
+After the avatar look is created, pair it with a voice. Two paths:
+
+### Path A — Voice Design (preferred)
+
+Find matching voices via semantic search using the Voice section from
+the AVATAR file. This searches HeyGen's full voice library. No new
+voices are generated and no quota is consumed.
+
+**Language matching:** The voice design prompt should specify the target
+language from `user_language`. Example for Japanese: `"A calm, warm
+female voice. Professional but approachable. Japanese speaker."` This
+ensures semantic search returns voices in the correct language.
+
+### Path B — Voice Browse (fallback)
+
+For manual catalog browsing:
+
+**App:** browse available voices in the HeyGen app, filtered to the target language and voice characteristics when possible.
+
+**CLI:**
+```bash
+heygen voice list --type private --limit 20
+heygen voice list --type public --engine starfish --language en --gender female --limit 20
+```
+
+**ALWAYS show a playable voice preview.** Each voice response includes
+`preview_audio_url` — share it before committing.
+
+**Handling missing/broken previews:** Some voices return `null` for
+`preview_audio_url` or expose a URL that does not play. When this happens,
+note "(no preview available)" and offer to generate a short TTS sample via
+the app or the CLI: `heygen voice speech create --text "<sample>"
+--voice-id <id> --input-type plain_text --language en --locale en-US`.
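
The request and response shapes in `avatar-creation.md` above can be sketched in Python. This is a minimal illustration, not part of the plugin: `build_avatar_payload` and `group_id_from_response` are hypothetical helpers, and only the field names (`type`, `name`, `file`, `prompt`, `avatar_group_id`, `avatar_item`, `group_id`) come from the reference. The 1000-character check reflects the enforced prompt limit stated there.

```python
import json
from typing import Optional

PROMPT_LIMIT = 1000  # enforced limit according to the reference above


def build_avatar_payload(
    kind: str,
    name: str,
    *,
    file: Optional[dict] = None,
    prompt: Optional[str] = None,
    avatar_group_id: Optional[str] = None,
) -> dict:
    """Build the JSON body for one of the three creation types."""
    if kind not in ("photo", "prompt", "video"):
        raise ValueError(f"unknown avatar type: {kind!r}")
    payload: dict = {"type": kind, "name": name}
    if kind == "prompt":
        if not prompt:
            raise ValueError("prompt avatars need a text prompt")
        if len(prompt) > PROMPT_LIMIT:
            raise ValueError(f"prompt exceeds {PROMPT_LIMIT} characters")
        payload["prompt"] = prompt
    else:
        # photo and video avatars take a file in url/asset_id/base64 form
        if file is None:
            raise ValueError(f"{kind} avatars need a file input")
        payload["file"] = file
    # Omit the key entirely for a new character; include it to add a look.
    if avatar_group_id is not None:
        payload["avatar_group_id"] = avatar_group_id
    return payload


def group_id_from_response(response: dict) -> str:
    """Return the stable character identity, not the ephemeral look id."""
    return response["avatar_item"]["group_id"]


body = json.dumps(
    build_avatar_payload(
        "photo",
        "My Avatar",
        file={"type": "url", "url": "https://example.com/headshot.jpg"},
    )
)
```

The resulting `body` string is the kind of value the CLI examples pass to `heygen avatar create -d`. Leaving `avatar_group_id` out of the payload, rather than sending a placeholder, is what distinguishes a new character from a new look.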