diff --git a/.claude/skills b/.claude/skills
new file mode 120000
index 00000000..3e73f3a3
--- /dev/null
+++ b/.claude/skills
@@ -0,0 +1 @@
+../.github/skills
\ No newline at end of file
diff --git a/.github/skills/migrate-from-openinference/SKILL.md b/.github/skills/migrate-from-openinference/SKILL.md
new file mode 100644
index 00000000..f409ef7d
--- /dev/null
+++ b/.github/skills/migrate-from-openinference/SKILL.md
@@ -0,0 +1,599 @@
+---
+name: migrate-from-openinference
+description: Migrate an openinference-instrumentation-* package from https://github.com/open-telemetry/donation-openinference into this repo. Creates a new package, or — when a package for the library already exists in the repo — augments it with the coverage OpenInference adds on top. Use when a user asks to migrate or port a package from OpenInference.
+---
+
+# Migrate an OpenInference `instrumentation-*` package
+
+Migrate an `openinference-instrumentation-<source>` package from
+https://github.com/open-telemetry/donation-openinference into this
+repo. The result emits OTel GenAI semantic conventions through
+`opentelemetry-util-genai`.
+
+Two modes, decided by the "Before you start" gate below:
+
+- **Greenfield migration** — no package for the library exists yet. Create a
+  **new implementation** under `instrumentation/`. The default, and what the
+  bulk of this skill describes.
+- **Augment an existing package** — the repo already ships
+  `opentelemetry-instrumentation-genai-<lib>`. Don't re-create it; inventory
+  what it covers, diff against OpenInference, and add **only the missing
+  parts**. See [Augment mode](#augment-mode-the-package-already-exists).
+
+For a greenfield migration the major work items are: rewriting the patcher to
+method-level (step 5), mapping every request/response shape into OTel
+`InputMessage`/`OutputMessage` parts (step 6), and migrating the unit-test
+corpus while filtering openinference-framework plumbing tests out (step 7).
+
+## Inputs
+
+User specifies the source, e.g. `openinference-instrumentation-crewai`.
+
+- **Source**: It should point to one of the folders in 
+  `https://github.com/open-telemetry/donation-openinference/tree/main/python/instrumentation/`.
+  Fetch a fresh shallow clone if you don't already have one locally:
+  ```sh
+  git clone --depth=1 https://github.com/open-telemetry/donation-openinference.git /tmp/openinference
+  ```
+  and use `/tmp/openinference/python/instrumentation/<source>/` as the
+  source path in step 1.
+
+User may also provide the **target package name**. If not provided: derive it from the source name:
+- drop the leading `openinference-instrumentation-`. Remaining part should match the instrumented library name as it appears on PyPI. If it's not the case, flag it.
+- The target package name should be `opentelemetry-instrumentation-genai-<lib>` where `<lib>` is the instrumented library name (e.g. `openai`, `anthropic`, `bedrock`). For example:
+  - `openinference-instrumentation-openai` → `opentelemetry-instrumentation-genai-openai`
+  - `openinference-instrumentation-anthropic` → `opentelemetry-instrumentation-genai-anthropic`
+  Confirm the chosen name with the user.
+
+## Before you start: is there already a package for this library?
+
+Once the target name is settled, check whether the repo already ships it:
+
+```sh
+ls instrumentation/opentelemetry-instrumentation-genai-<lib> 2>/dev/null
+```
+
+- **Exists →** **augment mode**: don't scaffold a new package. OpenInference
+  is now a *second* reference to mine for coverage the existing package
+  lacks. Jump to [Augment mode](#augment-mode-the-package-already-exists).
+- **Doesn't exist →** greenfield migration; continue below.
+
+If a near-name sibling (a `-agents` / `-client` suffix) might instrument a
+*different* surface of the same vendor, confirm the target name with the user
+before deciding.
+
+## Before you start: check for native OTel instrumentation
+
+AI SDKs increasingly ship their **own** OpenTelemetry GenAI instrumentation.
+When they do, migrating the OpenInference package is redundant. So
+before writing any code, determine whether the instrumented library is
+self-instrumenting.
+
+```sh
+# 1. Does the SDK depend on the OTel API / semconv?
+pip show <lib> | grep -i requires        # or read its pyproject / METADATA
+#    a dependency on opentelemetry-api or opentelemetry-semantic-conventions
+#    is the tell.
+# 2. Does its source actually emit GenAI spans?
+python -c "import <lib>, os; print(os.path.dirname(<lib>.__file__))"
+rg -l "opentelemetry|semconv\._incubating\.attributes\.gen_ai" <site-packages>/<lib>
+```
+
+A dependency on `opentelemetry-api` (or `-semantic-conventions`) **plus**
+`gen_ai_attributes` usage in the SDK source means the library is
+self-instrumented. Confirm empirically: set a **global** `TracerProvider`
+(native hooks often activate only when a real, non-proxy provider is
+configured), make one call, and inspect the emitted spans' instrumentation
+scope.
+
+**If the library is self-instrumented, do NOT migrate the OpenInference
+package.** Pivot the work:
+
+1. **Ignore the OpenInference instrumentation entirely** — the vendor owns
+   the spans; there is nothing to re-implement, and no `src/` instrumentor /
+   patcher to write.
+2. **Write conformance tests against the native instrumentation.** Follow
+   step 8 / the `write-conformance-tests` skill, but each scenario's `run()`
+   configures providers and enables the **native** tracer (e.g. sets a global
+   `TracerProvider`) instead of calling a `*Instrumentor` — then runs the
+   emitted telemetry through weaver live-check.
+3. **Identify gaps / inconsistencies** between the native output and the
+   GenAI semconv: missing operations, wrong operation name, legacy/duplicate
+   attributes, no metrics, no content-capture controls, no util-genai
+   content modes / completion-hook / upload support, etc. Record each as an
+   `expected_violation` or a documented skip, same as a normal migration.
+4. **Write `MIGRATION_REPORT.md`** stating the library is self-instrumented,
+   the conformance results, and the gap list — that report is the
+   deliverable. **Stop and surface the finding to the user.** Do not build a
+   competing package unless they explicitly decide to (e.g. to suppress
+   native instrumentation and layer util-genai features on top).
+
+Only when the library has **no** native OTel instrumentation do you continue
+with the migration flow below.
+
+## Reference material
+
+- **OTel GenAI spans**: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai> — authoritative attribute names, spans, logs, and metrics definitions.
+- **OpenInference → OTel attribute mapping** (Arize-maintained): <https://github.com/Arize-ai/openinference/blob/e9a8746daeb184c9aabc68ca29c05909ddcccf1e/spec/genai/README.md>. Use as a quick lookup for what an OpenInference attribute *roughly* corresponds to in OTel; when the mapping disagrees with the official semconv, **the official semconv wins**.
+- **Message JSON schemas**:
+  - input messages: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/gen-ai-input-messages.json>
+  - output messages: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/gen-ai-output-messages.json>
+  - system instructions: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/gen-ai-system-instructions.json>
+  - tool definitions: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/gen-ai-tool-definitions.json>
+  - retrieval documents: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/gen-ai-retrieval-documents.json>
+
+- **Code for above models**: <https://github.com/open-telemetry/semantic-conventions-genai/tree/main/docs/gen-ai/non-normative/models.py>.
+
+## Non-negotiable rules
+
+The repo-wide rules in [AGENTS.md](../../../AGENTS.md) already apply
+(telemetry through `opentelemetry-util-genai` public surface only, no
+`type: ignore`, semconv enums over string literals, re-raise caught
+exceptions). The rules below are the ones the migration is most likely to
+violate:
+
+1. **Zero OpenInference dependencies.** No `openinference-instrumentation`,
+   no `openinference-semantic-conventions`, no `openinference-*` anywhere
+   in the migrated package's `src/` or `tests/`. Verify with
+   `rg openinference instrumentation/<target>` — output must be empty.
+2. **Public util-genai surface only.** Beyond the AGENTS.md rule, the migrated package
+   must not import any `opentelemetry.util.genai._*` module — the allowed
+   modules are enumerated in step 4.
+3. **Ignore all other OpenInference instrumentations during the migration.** The only
+   instrumentation code to read is the OpenInference package being migrated
+   plus `opentelemetry-util-genai`. Build
+   from first principles: original OpenInference code + util-genai public API +
+   official semconv spec.
+4. **Never work around gaps.** If util-genai or the GenAI semconv is
+   missing something, flag it and fail the test intentionally
+5. **Do not make OTel API calls.** **Exception:**
+   semconv attributes that exist in the registry but have no named property
+   on `InferenceInvocation` (e.g. `gen_ai.usage.cache_creation.input_tokens`,
+   `gen_ai.usage.cache_read.input_tokens`) may be set via
+   `invocation.attributes[KEY] = …` — that's still going through the
+   util-genai extension point. Import the key from
+   `opentelemetry.semconv._incubating.attributes.gen_ai_attributes`. Do
+   not invent attribute names that aren't in the semconv.
+6. **Reuse VCR cassettes.** Reuse cassettes from the OpenInference
+   tests when possible.
+7. **Conformance tests must never be silently skipped.** 
+   When instrumentation can't be made conformant due to missing
+   information, gap in semantic conventions, or in util-genai, still 
+   write the scenario and let it fail. 
+
+   Ask user to decide if they want to mark the scenario as skipped with a reason, or
+   add an `expected_violation` in the scenario that covers the missing piece.
+   All these must be documented in `MIGRATION_REPORT.md` as well, with links to the skipped scenario and the expected violation.
+
+8. **Do not modify weaver policies.**
+
+## Augment mode: the package already exists
+
+The package already has a working, conformant implementation; OpenInference
+is just another reference to mine for missing coverage. The job is a tight,
+delta-only PR that closes specific gaps — **not** a rewrite.
+
+All [Non-negotiable rules](#non-negotiable-rules) apply to every line you
+add, plus two specific to this mode:
+
+- **Don't rewrite or "improve" existing code.** Leave the existing patcher,
+  wrappers, and tests alone unless OpenInference reveals a concrete bug — and
+  then it's a *separate* PR. No opportunistic refactors.
+- **Match the existing package's conventions.** New wrappers, helpers, and
+  scenarios follow the patterns already there (helper names, wrapper
+  structure, conftest/fixture wiring, scenario shape). Don't add a second way
+  to do something the package already does.
+
+### A. Inventory what's already there
+
+Map the existing package before reading OpenInference (read, don't guess):
+patched methods (`wrap_function_wrapper` / `unwrap` in `_instrument`),
+request/response shapes its wrappers map, the unit-test matrix per method
+(`rg -c '^\s*(async )?def test_' instrumentation/<target>/tests/`),
+conformance scenarios under `tests/conformance/`, and cassettes.
+
+### B. Diff against OpenInference
+
+Run the OpenInference analysis as a greenfield migration would (the reading behind
+steps 5–6): every method it patches, every shape it parses. Subtract
+inventory A. The remainder is the work-list:
+
+- Methods OpenInference patches that the package doesn't → new wrappers (step 5).
+- Shape branches OpenInference handles that the wrappers drop → extend the
+  mapping (step 6).
+- Scenarios OpenInference covers that the package lacks → new tests /
+  conformance (steps 7–8).
+
+Coverage both already have → skip. Coverage the package has but OpenInference
+lacks → leave it; not a regression.
+
+### C. Add the delta
+
+Apply steps **5–10 to the new parts only**:
+
+- **Steps 1–3 (scaffold/rename/pyproject) skipped** — touch `pyproject.toml`
+  only for a genuinely new entry point or dependency range, `README.rst` only
+  for a new user-visible capability.
+- **Step 4** applies to anything you copy in; nothing to excise from existing
+  code.
+- **Steps 5–9** extend the existing wrappers / test utils / conformance
+  runner in place, not parallel ones.
+- **Step 10** is done; revisit `tox.ini` / pyright only for a new test factor
+  or requirements file.
+
+### D. Report and review
+
+Write `MIGRATION_REPORT.md` via the `review-migration` skill as usual — it
+detects augment mode.
+
+## Migration flow
+
+> The numbered steps below are written for a **greenfield migration**. In
+> [augment mode](#augment-mode-the-package-already-exists) skip steps 1–3,
+> and scope steps 5–10 to the delta from the inventory/diff (sections A–C
+> above).
+
+### 1. Create the target package
+
+Because the patcher, wrappers, and tests are all rewritten (steps 5–7), a
+`cp -R` of the OpenInference tree mostly creates files you immediately
+delete or overwrite. **Prefer scaffolding fresh** from the nearest existing
+package (e.g. `opentelemetry-instrumentation-genai-anthropic`) and copy over from
+OpenInference **only** what you actually reuse:
+
+- `LICENSE` (Apache-2.0).
+- Reusable cassettes (step 9), if any.
+
+```sh
+mkdir -p instrumentation/<target>/src/opentelemetry/instrumentation/genai/<lib>
+mkdir -p instrumentation/<target>/tests/conformance instrumentation/<target>/tests/cassettes
+cp <source-path>/LICENSE instrumentation/<target>/LICENSE
+```
+
+Do **not** carry over `examples/`, OpenInference's `README.md`, or its
+`CHANGELOG.md` (per-package changelogs are towncrier-generated at release
+time).
+
+(If you do `cp -R` instead, clean it up afterwards:
+`rm -rf .pytest_cache .tox .venv venv .vscode .DS_Store .claude .ruff_cache CHANGELOG.md`
+and `find . -name __pycache__ -type d -exec rm -rf {} +`.)
+
+### 2. Rename the Python module
+
+OpenInference packages live at `src/openinference/instrumentation/<lib>/`.
+Move that tree under the OTel GenAI namespace, update path according to the target package name:
+
+```sh
+mkdir -p src/opentelemetry/instrumentation/genai
+mv src/openinference/instrumentation/<lib> src/opentelemetry/instrumentation/genai/<lib>
+rm -rf src/openinference
+```
+
+Update every import. Verify zero `openinference` references remain in
+`src/`, `tests/`, README. The instrumentor class typically renames from
+`<Lib>Instrumentor` (kept as-is — same name is fine).
+
+### 3. Update `pyproject.toml`, `version.py`, and `README.rst`
+
+- `[project] name` → new package name.
+- `[project.entry-points.opentelemetry_instrumentor]` → un-prefixed lib name
+  pointing at the new module path
+  (`<lib> = "opentelemetry.instrumentation.genai.<lib>:<Lib>Instrumentor"`).
+- Hatch version path, project URLs, classifiers → new repo paths.
+- **Strip every `openinference-*` dependency.** OpenInference packages typically depend on
+  `openinference-instrumentation`, `openinference-semantic-conventions`, and
+  sometimes `opentelemetry-instrumentation` for the `BaseInstrumentor` mixin
+  — keep only the last one. Replace with `opentelemetry-instrumentation` (for
+  `BaseInstrumentor`) and the underlying SDK (`openai`, `anthropic`, …) at
+  the same range OpenInference was using. 
+- `__version__` in `version.py` should equal the value in
+  `util/opentelemetry-util-genai/src/opentelemetry/util/genai/version.py` — all
+  workspace packages share one version. Verify:
+
+  ```sh
+  diff <(grep ^__version__ instrumentation/<target>/src/opentelemetry/instrumentation/genai/<lib>/version.py) \
+       <(grep ^__version__ util/opentelemetry-util-genai/src/opentelemetry/util/genai/version.py)
+  ```
+
+- Hatchling builds **require a `README.md` or `README.rst`**. Rewrite it to
+  point at the new repo URLs and module path; drop OpenInference links, OpenInference badges,
+  `using_attributes(...)` examples, `OpenInferenceTracer` / `TraceConfig`
+  configuration, and any "OpenInference semconv" links. Include a usage snippet
+  importing from `opentelemetry.instrumentation.genai.<lib>`, a pointer to
+  `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT`, and a pointer to
+  `tests/conformance/` (no `examples/`).
+
+### 4. Drop OpenInference plumbing
+
+OpenInference ships a framework that's incompatible with this repo's util-genai model
+— excise it before touching the patcher.
+
+```sh
+rg 'openinference|OpenInferenceTracer|TraceConfig|using_attributes|using_session|using_user|using_metadata|using_tags|SpanAttributes\.|OpenInferenceMimeTypeValues|OpenInferenceSpanKindValues|safe_json_dumps' src/ tests/
+```
+
+Drop every match. The mappings:
+
+- **`OpenInferenceTracer` / `TraceConfig`** — replaced by `TelemetryHandler` from
+  `opentelemetry.util.genai.handler`. Instrumentation code does not
+  construct or pass tracers.
+- **`using_attributes(session_id=…, user_id=…, …)` / `using_session` /
+  `using_metadata` / `using_tags`** — there is **no OTel GenAI equivalent**
+  context-propagation API. Drop the calls and the tests that exercise them
+  (those go in the test "skip with reason" bucket — see step 7).
+- **`OpenInferenceSpanKindValues` / `OpenInferenceMimeTypeValues`** — drop;
+  span kind is set by util-genai based on the invocation type.
+- **`SpanAttributes.LLM_*` / `SpanAttributes.INPUT_*`** — flat-string
+  OpenInference-semconv attributes. Replaced by typed `InputMessage` / `OutputMessage`
+  payloads serialized into `gen_ai.input.messages` / `gen_ai.output.messages`
+  by util-genai. The conversion is in step 6.
+- **`safe_json_dumps`** — drop; util-genai serializes message payloads.
+
+**Any import of `opentelemetry.util.genai._<anything>` from instrumentation
+`src/` is a violation.** Public surface only:
+
+- `opentelemetry.util.genai.handler` — `TelemetryHandler`
+- `opentelemetry.util.genai.invocation` — `InferenceInvocation`,
+  `EmbeddingInvocation`, `ToolInvocation`, `WorkflowInvocation`,
+  `AgentInvocation`, `Error`, `GenAIInvocation`
+- `opentelemetry.util.genai.types` — `InputMessage`, `OutputMessage`,
+  `Text`, `ToolCallRequest`, `ToolCallResponse`, `Reasoning`,
+  `ServerToolCall`, `ServerToolCallResponse`, `GenericPart`, `Blob`,
+  `File`, `Uri`, `Modality`
+- `opentelemetry.util.genai.completion_hook`
+- `opentelemetry.util.genai.environment_variables`
+
+```sh
+rg 'from opentelemetry\.util\.genai\._' instrumentation/<target>/src/
+```
+
+Output must be empty.
+
+### 5. Rewrite patching: transport → API method level
+
+This is the largest behavioral change. OpenInference typically patches at
+the HTTP-transport layer (`OpenAI.request`, `AsyncOpenAI.request`,
+`HTTPClient._send_request`, etc.) and dispatches by `cast_to` response type.
+**That pattern does not survive the migration.** util-genai's
+`InferenceInvocation` model needs the request kwargs (`model`, `messages`,
+`tools`, `stream`, …) at call time, which only the API methods see.
+
+Replace every transport-level wrapper with method-level
+`wrap_function_wrapper` calls — one per public API endpoint — using
+**positional args only** (newer wrapt versions reject keyword args):
+
+```python
+from wrapt import wrap_function_wrapper
+from opentelemetry.instrumentation.utils import unwrap
+
+wrap_function_wrapper(
+    "openai.resources.chat.completions",   # module     (positional)
+    "Completions.create",                  # name       (positional)
+    chat_completions_wrapper,              # wrapper    (positional)
+)
+```
+
+For uninstrument, use `opentelemetry.instrumentation.utils.unwrap` (matching
+positional module + name).
+
+**Patch ALL API methods OpenInference instruments.** Walk OpenInference's
+`_instrumentor.py` / `instrumentor.py` plus the dispatch table in the
+transport accumulator and enumerate every endpoint OpenInference emits a span for —
+including ones with only generic attribute extraction (assistants, threads,
+files, fine-tuning, vector stores, batches, uploads, moderations, …). Each
+becomes one `wrap_function_wrapper` call. Dropping coverage for an endpoint
+because it's "legacy" or "rarely used" is a regression — OTel GenAI semconv
+applies to every inference and embedding API regardless of vintage. If a
+specific API has no util-genai invocation type yet, that's a gap for the
+review report (see step 11), not a reason to drop the patch.
+
+### 6. Map every request and response shape into OTel GenAI types
+
+For each wrapped method, walk OpenInference's input-parsing branch by branch and
+ensure each branch has a corresponding mapping in the new wrapper. Same
+for output parsing. A wrapper that handles `str` input but not `list`
+input (when the original SDK accepts both) is incomplete and must not
+ship.
+
+Mapping cheat sheet — OpenInference source shape on the left, OTel construct on the
+right (all types from `opentelemetry.util.genai.types` unless noted):
+
+| Source request item | OTel construct |
+|---|---|
+| User / assistant / system text message | `Input/OutputMessage(role=…, parts=[Text(content=…)])` |
+| Assistant message containing a tool/function call | `Message(role="assistant", parts=[ToolCallRequest(name=…, id=…, arguments=…)])` |
+| Tool/function result message | `Message(role="tool", parts=[ToolCallResponse(id=…, response=…)])` |
+| Reasoning / thinking item | `Message(role="assistant", parts=[Reasoning(content=…)])` |
+| Server-side tool call (web_search, file_search, code_interpreter, …) | `Message(parts=[ServerToolCall(name=…, server_tool_call=…, id=…)])` |
+| Server-side tool call result | `Message(parts=[ServerToolCallResponse(server_tool_call_response=…, id=…)])` |
+| Inline image / audio / video bytes | `Blob(mime_type=…, modality="image"\|"audio"\|"video", content=b"…")` |
+| External media URL | `Uri(mime_type=…, modality=…, uri="…")` |
+| File reference (e.g. OpenAI `file_id`) | `File(mime_type=…, modality=…, file_id="file-…")` |
+| Provider-specific item with no semconv mapping | `GenericPart(value=…)` — never silently drop. Flag those in the review report. |
+
+Output messages mirror the input mapping — `OutputMessage` serializes with
+a `parts` array (not `content`); each part has a `type` field. When
+asserting on `gen_ai.output.messages`, parse the JSON and check
+`msg["parts"]`.
+
+`Modality` is `Literal["image", "video", "audio"]`. `error.type` and span
+status come from `invocation.fail(exc)` — do not emit a separate span
+exception event.
+
+### 7. Restructure tests
+
+```text
+tests/
+  cassettes/<scenario>.yaml
+  conformance/
+    inference.py / embedding.py / ...   # see step 8
+  conftest.py
+  test_<existing>.py                    # unit tests; refactor onto helpers
+  requirements.{oldest,latest}.txt
+```
+
+**Categorize OpenInference tests before migrating.** List every test function in the
+OpenInference package and bucket each one:
+
+- ✅ **Migrate** — exercises a patched API method. Rewrite assertions
+  from flat OpenInference semconv attributes (`SpanAttributes.LLM_INPUT_MESSAGES_…`)
+  to OTel constructs: assert on `span.attributes[GenAIAttributes.GEN_AI_…]`
+  (semconv constants), and parse `gen_ai.input.messages` / `gen_ai.output.messages`
+  JSON to check the `parts` arrays.
+- ✅ **Migrate (rewrite)** — unit test for an OpenInference-internal helper
+  (attribute extractor, message parser, etc.) where the helper is gone but
+  the **parsing scenario** still applies. Rewrite as an integration test
+  that feeds the same response shape through VCR and asserts the
+  resulting OTel telemetry. This is the bucket most likely to be
+  mis-categorized — anything covering tool-call objects, refusals,
+  reasoning items, multi-content messages, token-usage breakdowns, image
+  inputs, raw-response variants (e.g. `with_raw_response`) is parsing
+  domain logic and **must** be migrated. The wrapper has to do the
+  parsing; the test has to verify it.
+- ❌ **Skip with reason** — exercises OpenInference framework plumbing with no OTel
+  equivalent: `using_attributes()` context propagation, `TraceConfig`
+  masking, `OpenInferenceTracer` behavior, OpenInference flat-attribute naming format,
+  `OpenInferenceSpanKindValues` checks. Document the reason in a comment
+  in the test file (or in `MIGRATION_REPORT.md` later — see step 11).
+  Do **not** skip a test because the API it exercises is "legacy" or
+  "new" — semconv applies to all inference APIs.
+
+Decision rule for the ✅ rewrite vs ❌ skip split: ask whether the test
+covers a *response shape* from the instrumented library, or *OpenInference framework
+behavior*. A test that constructs a tool-call object with `arguments` /
+`call_id` / `name` and verifies it's extracted into attributes covers a
+response shape — migrate it. A test that checks
+`using_attributes(session_id=…)` propagates into span attributes covers OpenInference
+framework behavior — skip it.
+
+**Sanity check before committing step 7.** Count source vs migrated tests:
+
+```sh
+rg -c '^\s*(async )?def test_' <source-path>/tests/
+rg -c '^\s*(async )?def test_' instrumentation/<target>/tests/
+```
+
+A migration that drops from 80 tests to 5 is a regression — go back to the
+"migrate" and "migrate (rewrite)" buckets and finish them.
+
+**Replace conftest boilerplate.** OpenInference conftests duplicate the
+exporter/provider/VCR plumbing that lives here in
+`opentelemetry-test-util-genai`. Don't copy OpenInference's — mirror an
+existing package's conftest (e.g.
+`instrumentation/opentelemetry-instrumentation-genai-openai/tests/conftest.py`),
+which registers the shared fixtures as plugins:
+
+```python
+pytest_plugins = [
+    "opentelemetry.test_util_genai.fixtures",
+    "opentelemetry.test_util_genai.vcr",
+]
+```
+
+The lib-specific conftest then adds only: `vcr_config` (per-package
+`filter_headers` and `before_record_response`), an `environment` autouse
+for the lib's API-key env var, library-client fixtures (e.g.
+`openai_client` / `async_openai_client`), and the `instrument_*` fixtures
+(`instrument_no_content`, `instrument_with_content`, `instrument_event_only`)
+built on the shared `instrument` context manager from
+`opentelemetry.test_util_genai.instrumentor` (see [AGENTS.md](../../../AGENTS.md) Tests).
+
+**Assertions — no shared helper module.** There is no
+`opentelemetry.test_util_genai.assertions`. Assert directly on
+`span.attributes[GenAIAttributes.GEN_AI_…]` using the semconv constants
+from `opentelemetry.semconv._incubating.attributes.gen_ai_attributes`, and
+on metric/log records from the in-memory exporters. Factor repeated checks
+into a per-package `tests/test_utils.py` (existing packages have helpers
+like `assert_all_attributes`, `assert_completion_attributes`,
+`assert_messages_attribute`, plus weather-tool fixtures). OpenInference
+helpers (`_check_llm_attributes`, etc.) map onto these — rewrite them on
+top of OTel semconv constants and parsed `gen_ai.*.messages` JSON, and keep
+tiny constants (`DEFAULT_MODEL`, sample prompts) inline or in
+`tests/test_utils.py`.
+
+**Required unit-test coverage per wrapped method.** Apply the repo test
+matrix (sync/async × happy/error, plus streaming × happy/error where the
+method streams — see [AGENTS.md](../../../AGENTS.md) Tests section)
+to **every** method patched in step 5. For the migration these are blockers for
+the migration PR, not follow-up. The error variants must verify the
+original exception is re-raised, `error.type` is recorded, and span status
+is ERROR.
+
+**`tests/requirements.{latest,oldest}.txt`** — OpenInference typically has its own
+pin file; keep only the third-party version pins (`openai==`,
+`anthropic==`, …). 
+Add current `util/opentelemetry-util-genai` and `instrumentation/opentelemetry-instrumentation-genai-<lib>` to the latest; for the oldest, use the oldest version of `opentelemetry-util-genai` that works (there is none released yet, but check).
+
+### 8. Conformance scenarios
+
+Author conformance scenarios using the **`write-conformance-tests`** skill —
+it's the generic procedure (scenario modules, the `test_conformance.py`
+runner, declared gaps, lib-specific assertions, weaver policies) and applies
+to any instrumentation. Migration-specific notes on top of that skill:
+
+- Drop OpenInference's `examples/` tree — its end-to-end demos are replaced
+  by conformance scenarios, not migrated.
+- For an operation blocked by a util-genai/semconv gap, point the
+  `expected_violations` / `xfail` `reason=` at the gap row in
+  `MIGRATION_REPORT.md`.
+
+### 9. Cassettes (or a transport proxy)
+
+- Copy cassettes from OpenInference's `tests/cassettes/` (or wherever the OpenInference package
+  parks them) into the migrated package's `tests/cassettes/`. Reuse names so existing
+  unit tests keep loading them.
+- Reuse existing cassettes for conformance scenarios when they are applicable.
+- **AI-generated cassettes.** For a cassette OpenInference lacks and you
+  can't record (no provider access), you may synthesize one from the
+  provider's API reference via AI. Start it with a
+  `# TODO: this is generated by AI, re-record` comment, mention it in the PR,
+  and open a follow-up issue to re-record it against the real provider in CI.
+
+**Transport proxy instead of cassettes.** If the OpenInference unit tests mock
+HTTP (e.g. `respx`, `httpx.MockTransport`) rather than replay recorded
+cassettes, you may do the same in the migrated package's **unit** tests — build the SDK
+client with an `httpx.MockTransport` (or equivalent) returning canned
+responses instead of `@pytest.mark.vcr`. When you go this route:
+
+- Reuse the OpenInference cassettes' recorded **response bodies** (incl. the
+  streaming SSE payloads) as the canned responses, so fidelity is preserved.
+- Still register the shared VCR plugins in `conftest.py` (the shared
+  `fixture_vcr` is autouse) and keep `vcr_config`.
+- **Use the same mechanism in conformance scenarios.** Conformance does not
+  require VCR — a scenario can build its client with the same transport mock
+  and ignore the injected `vcr` (see the `write-conformance-tests` skill).
+  Pick one mechanism (cassettes *or* transport mock) and use it consistently
+  across the whole package.
+- **Mention the choice in `MIGRATION_REPORT.md`** (the `review-migration` skill
+  flags missing cassettes; note that the package mocks the transport by
+  design so the absence is not a gap).
+
+### 10. Workspace integration
+
+Wire the new package into the workspace, `tox.ini`, and pyright per the
+**Adding a package to the workspace** section of [AGENTS.md](../../../AGENTS.md)
+— it applies to any new package, not just migrations. Migration-specific note on top:
+
+- **Leave the package out of `[tool.pyright] include`.** A migration over untyped
+  `wrapt` boundaries (`wrapped, instance, args, kwargs`) and vendor SDK members
+  produces hundreds of strict-mode errors, so don't add it to `include` until
+  typing lands — track that as a follow-up.
+
+### 11. Local checks, review, and PR
+
+Run the pre-PR checks from the **Commands** section of
+[AGENTS.md](../../../AGENTS.md) — `tox -e precommit`, `tox -e typecheck`, and
+the package's `-{oldest,latest}` (and `-conformance`) test envs.
+
+Open the PR with the `migration:openinference` label. Run the
+`review-migration` skill locally to generate `MIGRATION_REPORT.md`; iterate
+until §4 (test coverage) is clean. The review skill compares the migrated package
+against OpenInference (or any upstreams you name), so coverage gaps
+surface in one report.
+
+## See also
+
+- [AGENTS.md](../../../AGENTS.md) — general repo rules that already apply to the migration.
+- `util/opentelemetry-util-genai/AGENTS.md` — util-genai usage rules.
+- `.github/skills/write-conformance-tests/SKILL.md` — generic conformance-scenario authoring (step 8).
+- `.github/skills/review-migration/SKILL.md` — sister review skill (writes `MIGRATION_REPORT.md`).
diff --git a/.github/skills/review-migration/SKILL.md b/.github/skills/review-migration/SKILL.md
new file mode 100644
index 00000000..fcac4d7a
--- /dev/null
+++ b/.github/skills/review-migration/SKILL.md
@@ -0,0 +1,316 @@
+---
+name: review-migration
+description: Review a migrated or augmented instrumentation-genai package by comparing it against any known external upstream implementations of the same instrumentation (OpenInference, vendor-specific). Handles both greenfield migrations and augment-mode PRs that add coverage to a pre-existing package (checking the added parts, their consistency with existing code, and old-vs-new coexistence). Writes MIGRATION_REPORT.md in the migrated package root.
+---
+
+# Review a migrated instrumentation-genai package
+
+Compare the migrated package in `instrumentation/<target>/` against
+every known upstream implementation of the same instrumentation, and
+write a `MIGRATION_REPORT.md` in the migrated package root listing only
+what's missing or wrong.
+
+## Inputs and upstreams
+
+User supplies the **migrated package name**, e.g.
+`opentelemetry-instrumentation-genai-openai`. Strip
+`opentelemetry-instrumentation-genai-` to get `<lib>`; that drives the
+upstream lookup.
+
+**Upstreams come from the user, with a default fallback:**
+
+- **If the user names one or more upstreams** (a repo, a directory path, a
+  vendor's own SDK-side instrumentation, a Logfire / Pydantic AI package,
+  …), compare against exactly those — one column per upstream. Use them
+  verbatim; don't second-guess or substitute.
+- **If the user names none**, default to OpenInference: search
+  `https://github.com/open-telemetry/donation-openinference` under
+  `python/instrumentation/` for a package matching `<lib>` (typically
+  `openinference-instrumentation-<lib>`, but confirm by listing the
+  directory — the name may differ). Fetch a shallow clone if you don't
+  already have one locally:
+
+  ```sh
+  git clone --depth=1 https://github.com/open-telemetry/donation-openinference.git /tmp/openinference
+  ls /tmp/openinference/python/instrumentation/ | rg <lib>
+  ```
+
+If there are no upstreams to compare against (no user-named upstream resolves and no
+OpenInference match), this isn't a migration — bail with a one-line note.
+
+## Greenfield vs augment mode
+
+Detect the mode before reviewing — it changes what counts as a problem:
+
+- **Greenfield migration** — the migration *created* the package; everything in
+  `instrumentation/<target>/` is the migrated package.
+- **Augment mode** — the package **already existed** and the migration only
+  *adds* coverage mined from the upstream. The diff modifies a pre-existing
+  package.
+
+Tell them apart from the PR diff / git history / unstaged changes: if the
+package's `src/` and tests predate the migration commits, it's augment mode.
+
+In augment mode the deliverable shifts to: the **added** coverage's
+completeness vs the upstream, its **consistency** with the pre-existing code,
+and **old-vs-new coexistence** concerns. Do **not** flag pre-existing
+coverage the upstream also has as missing. Per-section adjustments are inline
+below.
+
+## Rules
+
+- **Deliverable is `instrumentation/<target>/MIGRATION_REPORT.md`.** Write
+  there, not stdout. The file is gitignored — regenerate freely.
+- **Report problems, not work.** No "✅ matches", no "all tests migrated",
+  no recap of structure. The PR diff already shows what's there; the
+  report is for what's missing or wrong.
+- **Don't justify empty findings.** When a section/table has no problems,
+  emit `_none_` (or skip the section, where the rule says so) and stop.
+  Do not list which greps you ran, which env vars you searched for, or
+  why you concluded clean. Clean means clean.
+- **Tables only when there are ≥1 problem rows.** Empty section: `_none_`.
+- **Read, don't guess.** Every claim from actual code or command output.
+  Don't infer from package names or READMEs. Before listing something as
+  missing, grep for it in the migrated package's `src/` and `tests/`.
+- **Snapshot of the PR head**, not aspirational state.
+
+## Report structure
+
+First line:
+
+```markdown
+# Migration review: <target-package>
+
+Mode: greenfield migration | augment existing package
+
+Compared against:
+- OpenInference: `openinference-instrumentation-<lib>` (add link)
+- <other upstream>: `<path>`, `<link>` (omit the line if no other upstream
+  is in scope)
+```
+
+State the mode on the second line. In augment mode, one sentence after it
+naming what the PR adds (the gaps it closes) orients the reviewer.
+
+Sections render in order. Each is bounded to its problems.
+
+### 1. Instrumented API surface
+
+Table of every API method any source patches, marking methods the
+migrated package does **not** patch. **Be exhaustive.** Enumerate every
+endpoint upstream emits a span for, even if attribute extraction
+is generic. Do not collapse rows.
+
+For each upstream, walk the actual instrumentation code, not docs:
+
+- **Method-level patching** (the shape of this migration, and of any
+  method-level upstream): read every `wrap_function_wrapper(...)` call in
+  `_instrument()`. Each call = one row.
+- **Transport-level patching** (typical of OpenInference, e.g.
+  `openai.OpenAI.request` / `AsyncOpenAI.request`): enumerate every
+  `cast_to` response type the accumulator/dispatch table handles **and**
+  every other endpoint that flows through the wrapped method (assistants,
+  threads, files, fine-tuning, images, audio, vector_stores, batches,
+  uploads, moderations, …). Spans with only generic attribute extraction
+  still count — list them, and note "generic span only" in the Notes.
+- **Anything else** (vendor SDK hooks, monkey-patched class methods,
+  decorator-based instrumentation): walk the actual entry points the
+  upstream registers and list one row per emitted span site.
+
+One column per upstream that exists, plus `This package`:
+
+| API method | OpenInference | This package | Notes |
+|---|---|---|---|
+| `openai.resources.chat.completions.Completions.create` | ✅ | ✅ | |
+| `openai.resources.responses.Responses.create` | ✅ | ❌ | |
+| `openai.resources.beta.assistants.Assistants.create` | ✅ (generic span only) | ❌ | |
+
+- ✅ = patched. ❌ = at least one upstream patches it, this migration doesn't.
+  — = not patched (and that's expected — upstream doesn't patch it either).
+- Sort `❌` rows to the top.
+- For `❌` rows, the **Notes** cell must name the GenAI semconv operation
+  the method maps to (`chat`, `embeddings`, `text_completion`,
+  `invoke_agent`, `invoke_workflow`, `execute_tool`, `realtime`, …) — or
+  "no semconv operation defined yet" if the spec hasn't caught up. 
+
+Do not render a separate `**Gaps:**` bullet list below the table — the
+table itself is the gap list.
+
+**Augment mode.** Split `This package` into `Existed` and `Added this PR`:
+
+| API method | OpenInference | Existed | Added this PR | Notes |
+|---|---|---|---|---|
+| `…chat.completions.Completions.create` | ✅ | ✅ | — | already covered |
+| `…responses.Responses.create` | ✅ | ❌ | ✅ | added by this PR |
+| `…batches.Batches.create` | ✅ | ❌ | ❌ | still missing — `chat` |
+
+`❌` in both columns (upstream patches it, the package still doesn't) is the
+gap; sort it to the top.
+
+### 2. Gaps and open issues
+
+Genuine **tooling/util gaps** — things the migration couldn't do because
+`opentelemetry-util-genai` and/or the GenAI semconv doesn't yet support them
+(missing util-genai factory, attribute not in the registry yet). 
+Reference failing/skipped tests if any.
+
+Do NOT list "OTel semconv doesn't cover X" as a gap when X is just an
+inference API variant — semconv applies to all inference APIs.
+
+| Gap | File / test | Upstream issue | Notes |
+|---|---|---|---|
+
+### 3. Significant behavioral changes
+
+Material differences vs upstream — equivalences (different code shape,
+same emitted telemetry) are not listed. Look for: patching strategy
+(transport-level vs method-level); error recording (exception event vs
+`error.type` + status); token counting (cache details folded vs
+separate); message format (flat OpenInference attributes vs JSON `parts` array);
+streaming wrapper shape; attributes set unconditionally vs gated on
+`is_recording()` / `should_capture_content()`; anything emitted by
+upstream that this migration deliberately drops without a one-line rationale.
+
+| Aspect | Upstream | This package | Notes |
+|---|---|---|---|
+
+### 3b. Consistency and old-vs-new coexistence (augment mode only)
+
+**Greenfield migration: skip.** Here the risk isn't "is it conformant" but "do
+the additions fit the package they landed in." Compare the **added** code
+against the **pre-existing** code (not the upstream); flag only real
+problems:
+
+- **Divergent patterns** — added wrappers/helpers/fixtures re-implementing
+  something the package already has a convention for.
+- **Duplicated scaffolding** — added tests not reusing the package's
+  `tests/test_utils.py` helpers or conformance runner.
+- **Telemetry contradictions** — an added method emitting an operation name,
+  attribute shape, span-kind, or content-capture gating that disagrees with
+  what pre-existing methods emit for the analogous case.
+- **Coexistence hazards** — added code that alters pre-existing paths (shared
+  module-level state, `_instrument()` ordering, a dependency range the old
+  code wasn't tested against), or a `pyproject`/`tox` edit affecting existing
+  tests.
+
+| Concern | Pre-existing | Added | Notes |
+|---|---|---|---|
+
+`_none_` if the additions are consistent and isolated.
+
+### 4. Test coverage
+
+Three checklists. Render only **missing** cells; if every checklist is
+clean, render `_Test coverage complete._` and skip the subsections.
+
+Gaps in this section are **blockers for the migration PR**, not
+follow-up work — address them before merge. Do not list §4 items as
+follow-up issues in §5.
+
+**Augment mode.** The matrix applies to methods **added this PR** (§1
+`Added this PR` = ✅) — those block merge. A pre-existing method missing a
+variant is a §5 follow-up, not a §4 blocker.
+
+Also flag here: **🟡 missing-cassette** — any scenario or unit test that
+references a cassette not committed under `tests/cassettes/`. And:
+**unreferenced cassettes** — one-line count of `tests/cassettes/*.yaml`
+files no test/scenario opens. Skip if 0.
+
+For each unreferenced cassette, walk git history before naming a cause:
+
+1. Check whether the same cassette is also unreferenced in the upstream
+   it came from (for each unreferenced file, search the upstream's
+   `tests/` for any reference). If yes, it's an **inherited upstream
+   orphan** — say so plainly; the migration is not responsible.
+2. If the cassette IS referenced upstream but not in the migrated package, the migration
+   dropped a test. That is **not** a §4b cosmetic note — it's a §4a
+   missing variant (or a missing scenario, depending on what the test
+   covered). Add it to §4a's table with the upstream test name in Notes,
+   and do not rationalize ("tests were renamed/removed during the migration"
+   without naming the commit is speculation, not analysis).
+
+Never write "appear to be" / "likely" / "safe to delete" without
+evidence. Either you walked the history and know, or you didn't — in
+which case say "history not checked" so the reviewer does it.
+
+#### 4a. Unit-test matrix per wrapped method
+
+For each method in §1 row where `This package` = ✅:
+
+| Variant | Required when |
+|---|---|
+| **sync × happy** | always |
+| **sync × error** | always — verify the original exception is re-raised unmodified, `error.type` is recorded, span status is ERROR |
+| **async × happy** | the lib exposes an async counterpart |
+| **async × error** | the lib exposes an async counterpart |
+| **streaming × happy** | the method accepts `stream=True` or returns a stream wrapper |
+| **streaming × error** | flag if streaming is supported but no error path is exercised at all or error is not recorded on telemetry |
+| **async streaming × happy** | the method accepts `stream=True` or returns an async stream wrapper |
+| **async streaming × error** | flag if streaming is supported but no error path is exercised at all or error is not recorded on telemetry |
+
+Identify variants by reading the migrated package's `src/` (`is_streaming(kwargs)`,
+async `def`, `Stream` / `AsyncStream` wrappers).
+
+| Wrapped method | Missing variants | Notes |
+|---|---|---|
+
+#### 4b. Conformance scenarios
+
+For each distinct GenAI semconv operation the migrated package emits (`chat`,
+`embeddings`, `execute_tool`, `invoke_agent`, `invoke_workflow`,
+`create_agent`, …) there should be at least one happy-path scenario file under
+`tests/conformance/<op>.py` driven by `run_conformance(...)`. More
+scenarios per operation are fine but never required.
+
+| Operation | Scenario file | Status |
+|---|---|---|
+
+Mention if conformance scenario is skipped, there are expected_violations,
+or `uv run tox -e py312-test-instrumentation-genai-<lib>` fails.
+
+#### 4c. Docstring / README coverage
+
+| Asset | Required content | Status |
+|---|---|---|
+| `README.rst` | install snippet, usage snippet importing from `opentelemetry.instrumentation.genai.<lib>`, pointer to `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` and the util-genai README, pointer to `tests/conformance/` (no `examples/`) | |
+| `src/opentelemetry/instrumentation/genai/<lib>/__init__.py` module docstring | usage example using the new module path; configuration section listing the env vars users actually need | |
+
+Render only rows that are missing/wrong/stale (e.g. README still imports
+from `openinference.instrumentation.<lib>`, install command points
+at the `openinference-instrumentation-<lib>` package name, links to an
+OpenInference URL).
+
+**Refactor misses.** One bullet *only if* unit tests re-implement generic
+semconv shape checks inline instead of factoring them into the migrated package's
+`tests/test_utils.py` helpers (e.g. `assert_all_attributes`,
+`assert_completion_attributes`, `assert_messages_attribute`). There is no
+shared `opentelemetry.test_util_genai.assertions` module — assert directly
+on `span.attributes[GenAIAttributes.GEN_AI_…]` using the semconv constants.
+
+### 5. Follow-up work
+
+What goes in this PR vs. a follow-up:
+
+- **In this PR**: §4 test coverage gaps. These block merge. In augment mode,
+  scoped to the methods added this PR, plus any §3b consistency problem in the
+  added code (it ships here).
+- **Follow-up**: §1 ❌ rows (new instrumented methods), §2 util-genai /
+  semconv gaps, §3 behavioral parity items, and — augment mode — pre-existing
+  coverage gaps from §4 — each a **separate** PR, one logical change apiece.
+
+Suggest an issue title per follow-up item, grouped by type (API surface /
+util-genai gaps / behavioral parity); name the upstream that covers it and
+the semconv operation where relevant. **Do not file the issues** — listing
+them is the deliverable; filing AI-generated issues is against the
+contributor policy, so the human author decides which to open. If there
+are none, render `_No follow-up issues recommended._`
+
+## See also
+
+- `.github/skills/migrate-from-openinference/SKILL.md` — the migration skill; it
+  runs this review at its final step to produce `MIGRATION_REPORT.md`.
+- `.github/skills/write-conformance-tests/SKILL.md` — authoring the
+  conformance scenarios this report checks in §4b.
+- `.github/instructions/instrumentation.instructions.md` — the copilot
+  PR-review rules for `instrumentation/**`; generic instrumentation
+  violations are flagged there and not repeated in this report.
\ No newline at end of file
diff --git a/.github/skills/write-conformance-tests/SKILL.md b/.github/skills/write-conformance-tests/SKILL.md
new file mode 100644
index 00000000..c1c8a6cd
--- /dev/null
+++ b/.github/skills/write-conformance-tests/SKILL.md
@@ -0,0 +1,362 @@
+---
+name: write-conformance-tests
+description: Author GenAI conformance-test scenarios for an OpenTelemetry instrumentation package — Scenario subclasses under tests/conformance/, the test_conformance.py runner, declared gaps, lib-specific assertions, and weaver live-check policies. Use when adding or updating conformance tests for any instrumentation, whether greenfield or ported.
+---
+
+# Write GenAI conformance tests
+
+Conformance tests validate that an instrumentation package emits telemetry
+matching the [GenAI semantic conventions](https://github.com/open-telemetry/semantic-conventions-genai)
+via Weaver live-check. They apply to **any** instrumentation package —
+greenfield or ported — and don't depend on how the package was built.
+
+This skill covers authoring the `tests/conformance/<scenario>.py` modules and
+the `tests/test_conformance.py` runner. For the always-on rules that hold even
+without this skill loaded, see the **Conformance tests** section of
+[AGENTS.md](../../../AGENTS.md).
+
+## One scenario per operation
+
+Put one scenario per emitted semconv operation under `tests/conformance/`.
+Write a scenario for **every** semconv operation the library emits, even one
+currently blocked by a util-genai or semconv gap. Skipping the scenario hides
+the gap; writing it records the gap (see [Declared gaps](#declared-gaps)). 
+**Never** drop a scenario file because it would fail today.
+
+## Recommended scenarios
+
+Cover the scenarios below that apply to the library. Skip a row only when the
+library genuinely can't perform that operation (e.g. an inference-only
+client has no `embeddings` scenario).
+
+**LLM client instrumentations:**
+
+| Scenario | File | Covers |
+|---|---|---|
+| Inference | `inference.py` | A `chat` operation. |
+| Tool calling | `tool_calling.py` | A `chat` turn where the model returns tool calls and a follow-up turn feeds tool results back. Asserts tool calls and tool results are present on input/output **messages**. weaver will validate the format. Do **not** expect `execute_tool` spans unless the client library itself instruments tool execution — most don't; tool execution is the caller's code. |
+| Multimodal content | `multimodal.py` | A `chat` turn carrying the **non-text parts** the client accepts (inline image/audio bytes, media URLs, file refs, …), asserting each round-trips onto the messages. Cover only the part types the library emits — see [Message-part coverage](#message-part-coverage). |
+| Reasoning | `reasoning.py` | A `chat` turn against a reasoning model where the response carries reasoning/thinking content, asserting a `reasoning` part lands on an output message (and `gen_ai.usage.output_tokens` / reasoning-token attributes if the library records them). Only when the client surfaces reasoning content — see [Message-part coverage](#message-part-coverage). |
+| Embeddings | `embedding.py` | An `embeddings` operation. |
+
+**Agent / orchestration instrumentations:**
+
+| Scenario | File | Covers |
+|---|---|---|
+| Agent invocation with tooling | `invoke_agent.py` | An `invoke_agent` run that calls at least one tool — expects `invoke_agent` plus the nested `execute_tool` / `chat` spans the framework emits. |
+| Multi-agent orchestration | `multi_agent.py` | One agent handing off to / invoking another — expects nested `invoke_agent` spans under the orchestrator. |
+| Workflows | `invoke_workflow.py` | An `invoke_workflow` run wrapping the agent/tool spans it drives. |
+
+## Message-part coverage
+
+Weaver validates a part's *shape*, not *which* part types a scenario
+exercised — a text-only scenario leaves the package's image/audio/file/tool
+mapping unverified. So exercise **every non-text part type the library can
+emit** and assert it landed on a message. Cover only what the package
+instruments: walk its wrappers (the step-6 mapping for a port) for which
+`opentelemetry.util.genai.types` parts they produce.
+
+| Part `type` | util-genai type | Emitted when the library accepts… |
+|---|---|---|
+| `text` | `Text` | plain text (always) |
+| `tool_call` / `tool_call_response` | `ToolCallRequest` / `ToolCallResponse` | function/tool calling — covered by `tool_calling.py` |
+| `server_tool_call` / `server_tool_call_response` | `ServerToolCall` / `ServerToolCallResponse` | vendor server-side tools (web_search, code_interpreter, …) |
+| `reasoning` | `Reasoning` | reasoning / thinking items |
+| `blob` | `Blob` | inline image/audio/video **bytes** (`modality` distinguishes them) |
+| `uri` | `Uri` | an external media **URL** (`modality`) |
+| `file` | `File` | a **file reference** / id (`modality`) |
+| `generic` | `GenericPart` | a provider item with no semconv mapping — flag, don't drop |
+
+Group by shared turn/cassette — typically one `multimodal.py` for the
+image/audio/file/url inputs the client accepts, `tool_calling.py` for tool
+parts, and `reasoning.py` for `reasoning` parts (a reasoning model emits
+those on output messages, not input). `type` alone gives `blob` / `uri` /
+`file`; to tell image from
+audio from video, read the part's `modality` with a `_part_fields` helper
+returning `(type, modality)` tuples (defined alongside `_part_types` in
+[Lib-specific assertions](#lib-specific-assertions)):
+
+```python
+    def validate(self, report: LiveCheckReport) -> None:
+        super().validate(report)
+        chat_spans = [
+            entry["span"] for entry in report["samples"]
+            if "span" in entry
+            and _attr(entry["span"], "gen_ai.operation.name") == "chat"
+        ]
+        input_parts = {
+            (t, m) for span in chat_spans
+            for t, m in _part_fields(_attr(span, "gen_ai.input.messages"))
+        }
+        # e.g. an inline image + an audio URL were sent
+        assert ("blob", "image") in input_parts, f"no image blob, saw {input_parts}"
+        assert ("uri", "audio") in input_parts, f"no audio uri, saw {input_parts}"
+```
+
+If a part type the library accepts can't round-trip yet (a util-genai/semconv
+gap), still write the scenario and record it as a
+[declared gap](#declared-gaps) — never silently omit the part.
+
+## Scenario modules
+
+Each scenario module defines a subclass of `Scenario` from
+`opentelemetry.test_util_genai.conformance`. It sets the `expected_spans` /
+`expected_metrics` ClassVars and implements
+`run(self, *, tracer_provider, meter_provider, logger_provider, vcr)`.
+Drive instrumentation through the shared `instrument` context manager (not
+`instr.instrument()` / `trace.set_tracer_provider`). The runner injects an
+already-configured `vcr`, so a cassette-based scenario just calls
+`vcr.use_cassette(...)`:
+
+```python
+# tests/conformance/inference.py
+from typing import Any
+
+from opentelemetry.instrumentation.genai.<lib> import <Lib>Instrumentor
+from opentelemetry.sdk._logs import LoggerProvider
+from opentelemetry.sdk.metrics import MeterProvider
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.test_util_genai.conformance import Scenario
+from opentelemetry.test_util_genai.instrumentor import instrument
+
+
+class InferenceScenario(Scenario):
+    expected_spans = ("chat",)
+    expected_metrics = (
+        "gen_ai.client.operation.duration",
+        "gen_ai.client.token.usage",
+    )
+
+    def run(
+        self,
+        *,
+        tracer_provider: TracerProvider,
+        meter_provider: MeterProvider,
+        logger_provider: LoggerProvider,
+        vcr: Any,
+    ) -> None:
+        with instrument(
+            <Lib>Instrumentor(),
+            tracer_provider=tracer_provider,
+            logger_provider=logger_provider,
+            meter_provider=meter_provider,
+            content_capture="SPAN_ONLY",
+        ):
+            with vcr.use_cassette("inference.yaml"):
+                ...  # call the patched API
+```
+
+One operation per scenario. No env vars, no logging config.
+
+**VCR cassettes are not required — a transport mock works too.** Mock HTTP
+however the package's **unit** tests already do, and use the **same pattern
+across every scenario in that package** (don't mix cassettes and transport
+mocks within one lib). If the package mocks the transport (e.g.
+`httpx.MockTransport`, `respx`) instead of replaying cassettes, build the
+client with that transport inside `run()` and ignore the injected `vcr`:
+
+```python
+    def run(self, *, tracer_provider, meter_provider, logger_provider, vcr) -> None:
+        with instrument(
+            <Lib>Instrumentor(),
+            tracer_provider=tracer_provider,
+            logger_provider=logger_provider,
+            meter_provider=meter_provider,
+            content_capture="SPAN_ONLY",
+        ):
+            client = <Lib>(transport=httpx.MockTransport(_handler))  # canned response
+            client.<method>(...)  # call the patched API
+```
+
+`vcr` stays in the signature either way (the runner always passes it).
+
+## Declared gaps
+
+**Declared gaps** go in the `expected_violations` ClassVar (a tuple of
+`ExpectedViolation`), not `xfail`. `run_conformance` fails on *undeclared*
+weaver violations and on declared violations weaver no longer reports — so
+a known util-genai/semconv gap is recorded as an `expected_violation` that
+fails loudly the moment it's fixed.
+
+When the gap is too big to capture as individual `expected_violations` — the
+whole operation can't run yet — skip the entire scenario instead, via
+`pytest.mark.xfail` / `skip` on the parametrize entry in
+`test_conformance.py`. Don't invent a one-off `reason=` string: **ask the
+user to file a tracking bug first**, then use that issue as the skip
+`reason=` (e.g. `reason="blocked by open-telemetry/...#1234"`) so the skip
+is traceable and gets revisited when the bug is fixed. **Never** drop the
+scenario file itself — a skipped scenario still records that the operation
+exists; a deleted one hides it.
+
+## Lib-specific assertions
+
+**Lib-specific assertions** go in a `validate(self, report)` override on the
+scenario (call `super().validate(report)` first) — there is no
+`_local_assertions.py` / `LocalAssertions` hook. Common lib-specific shapes:
+
+- **Vendor server-tool payloads** (`code_interpreter`, `web_search`, …).
+- **Vendor-specific finish reasons** outside semconv's enum (`stop`,
+  `length`, `content_filter`, `tool_call`, `error`).
+- **Provider-specific `error.type`** — exception class names from the
+  underlying SDK.
+
+`validate` receives the weaver `LiveCheckReport`. Read span samples from
+`report["samples"]` (each `entry["span"]["attributes"]` is a list of
+`{"name", "value"}`) and seen metrics from
+`report["statistics"]["seen_registry_metrics"]`. Always call
+`super().validate(report)` first so the `expected_spans` / `expected_metrics`
+checks still run:
+
+```python
+# tests/conformance/tool_calling.py
+from __future__ import annotations
+
+import json
+from typing import Any
+
+from opentelemetry.test.weaver_live_check import LiveCheckReport
+from opentelemetry.test_util_genai.conformance import Scenario
+
+
+class ToolCallingScenario(Scenario):
+    expected_spans = ("chat",)
+    expected_metrics = ("gen_ai.client.operation.duration",)
+
+    def run(self, *, tracer_provider, meter_provider, logger_provider, vcr) -> None:
+        ...  # drive a tool-calling turn — see "Scenario modules" above
+
+    def validate(self, report: LiveCheckReport) -> None:
+        super().validate(report)  # keep the expected_spans / _metrics checks
+
+        # Lib-specific: weaver validates the *shape* of each message part, but
+        # not that a tool call actually round-tripped. Assert the model's
+        # tool_call landed on an output message and the tool result was fed
+        # back on an input message (across the two chat turns).
+        chat_spans = [
+            entry["span"]
+            for entry in report["samples"]
+            if "span" in entry
+            and _attr(entry["span"], "gen_ai.operation.name") == "chat"
+        ]
+        assert chat_spans, "no chat span emitted"
+
+        output_part_types = {
+            t for span in chat_spans
+            for t in _part_types(_attr(span, "gen_ai.output.messages"))
+        }
+        input_part_types = {
+            t for span in chat_spans
+            for t in _part_types(_attr(span, "gen_ai.input.messages"))
+        }
+        assert "tool_call" in output_part_types, (
+            f"expected a tool_call part on an output message, saw {output_part_types}"
+        )
+        assert "tool_call_response" in input_part_types, (
+            f"expected a tool_call_response part on an input message, saw {input_part_types}"
+        )
+
+
+def _attr(span: dict[str, Any], name: str) -> Any:
+    for attr in span["attributes"]:
+        if attr["name"] == name:
+            return attr["value"]
+    return None
+
+
+def _part_types(messages_json: str | None) -> list[str]:
+    # gen_ai.{input,output}.messages is a JSON string of
+    # [{"role": ..., "parts": [{"type": ..., ...}]}].
+    messages = json.loads(messages_json) if messages_json else []
+    return [part["type"] for message in messages for part in message["parts"]]
+
+
+def _part_fields(messages_json: str | None) -> list[tuple[str, str | None]]:
+    # Like _part_types, but keeps modality so image/audio/video are
+    # distinguishable on blob/uri/file parts (None for parts without one).
+    messages = json.loads(messages_json) if messages_json else []
+    return [
+        (part["type"], part.get("modality"))
+        for message in messages
+        for part in message["parts"]
+    ]
+```
+
+## The test_conformance.py runner
+
+`tests/test_conformance.py` guards collection with
+`pytest.importorskip("opentelemetry.test.weaver_live_check")`, parametrizes
+the scenario *instances*, and calls `run_conformance(scenario, vcr=vcr,
+weaver=weaver_live_check)` — it builds its own providers from the weaver
+OTLP endpoint, so don't pass providers/exporters:
+
+```python
+import pytest
+
+pytest.importorskip("opentelemetry.test.weaver_live_check")
+
+from opentelemetry.test.weaver_live_check import WeaverLiveCheck  # noqa: E402
+from opentelemetry.test_util_genai.conformance import (  # noqa: E402
+    Scenario,
+    run_conformance,
+)
+
+from .conformance.embedding import EmbeddingScenario
+from .conformance.inference import InferenceScenario
+
+
+@pytest.mark.parametrize(
+    "scenario",
+    [InferenceScenario(), EmbeddingScenario()],
+    ids=lambda s: type(s).__name__,
+)
+def test_conformance(
+    scenario: Scenario, vcr, weaver_live_check: WeaverLiveCheck
+) -> None:
+    run_conformance(scenario, vcr=vcr, weaver=weaver_live_check)
+```
+
+Do not write a separate `test_weaver_live.py`; weaver is already wired
+through `run_conformance`. The `weaver_live_check` fixture skips the test
+(rather than passing without the gate) only when it can't start (unsupported
+platform / network) — never wrap it in a try/except, which would silently
+disable the gate.
+
+## Weaver policies
+
+`weaver_live_check` enforces `policies/genai_content_validation.rego`
+(content-attribute JSON shape) and `policies/genai_span_validation.rego`
+(span name format, per-op expected attributes) — read these policy files
+when authoring scenarios; they're authoritative.
+
+`weaver_live_check` downloads the pinned weaver binary on first use (cached
+under `~/.cache/otel-conformance/weaver/`).
+
+## Recorded HTTP (cassettes or transport mock)
+
+A scenario replays one HTTP interaction per operation. Use whichever
+mechanism the package's unit tests use, consistently across all of its
+scenarios:
+
+- **VCR cassette** — `vcr.use_cassette("<scenario>.yaml")`, one committed
+  cassette per operation under `tests/cassettes/`.
+- **Transport mock** — build the SDK client with an `httpx.MockTransport`
+  (or `respx`) returning a canned response; no cassette file needed.
+
+Pick the one the lib already follows and don't mix the
+two within a package.
+
+**AI-generated cassettes.** Lacking provider access, you may synthesize a
+cassette from the provider's API reference via AI. Start the cassette with a
+`# TODO: this is generated by AI, re-record` comment, mention it in the PR,
+and open a follow-up issue to re-record it against the real provider in CI.
+
+## Running
+
+```sh
+uv run tox -e py312-test-instrumentation-genai-<lib>-conformance
+```
+
+The `*-conformance` tox envs target `tests/test_conformance.py` directly; the
+regular `*-{oldest,latest}` envs `--ignore` it so they don't need the
+OTLP/gRPC exporter or `weaver_live_check`.
+
diff --git a/.gitignore b/.gitignore
index fa7bfddc..67ff44c6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -73,3 +73,6 @@ policies/_schemas.rego
 
 # Conformance test weaver live-check reports
 weaver_reports/
+
+# Migration review reports generated by the review-migration skill
+MIGRATION_REPORT.md
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 337764c2..9c00f9dc 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -128,6 +128,22 @@ uv run tox -e changelog-preview
 If your change doesn't need an entry (pure docs/tooling), add the
 `Skip Changelog` label to the PR.
 
+## Skills
+
+This repo ships skills (under `.github/skills/`) that automate the heavy,
+repeatable contribution flows. Trigger them deliberately when
+you start one of these tasks:
+
+- **`migrate-from-openinference`** — migrate an `openinference-instrumentation-*`
+  package into this repo as an OTel GenAI package, or augment an existing
+  package with the coverage OpenInference adds on top.
+- **`review-migration`** — review a ported or augmented package against its
+  upstream implementation and write `MIGRATION_REPORT.md`.
+- **`write-conformance-tests`** — author conformance scenarios and the
+  `test_conformance.py` runner for an instrumentation package.
+
+Please contribute back anything you learn while using the skills that could help improve them!
+
 ## Keep PRs small
 
 One logical change per PR. Don't bundle unrelated fixes, refactors, or