From 880c594e00a147681edc80cdb93f340949f6b4be Mon Sep 17 00:00:00 2001
From: yairallouche
Date: Tue, 2 Jun 2026 15:08:24 +0300
Subject: [PATCH] docs: add With Mellea section to BYOA guide and align terminology

Step 4 of the Build Your Own Adapter guide claimed custom adapters were not
supported by Mellea, contradicting mellea_build_your_own_adapter.md which
shows exactly how to invoke them via the lower-level Intrinsic interface.

Replace that note with an accurate framing and add a "With Mellea"
subsection alongside the existing "With HuggingFace" and "With vLLM"
subsections, reusing the guide's running example for consistency and
linking to the standalone Mellea BYOA guide for full detail.

Also standardize the term to "Build Your Own Adapter" across the tutorial
docs (titles, headings, and cross-links).
---
 tutorials/README.md                           |  6 +--
 tutorials/guides/build_your_own_adapter.md    | 50 ++++++++++++++++++-
 .../guides/compare_inference_throughput.md    |  2 +-
 .../guides/mellea_build_your_own_adapter.md   |  6 +--
 .../guides/mellea_with_granite_switch.md      |  2 +-
 5 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/tutorials/README.md b/tutorials/README.md
index 817501a..23f1a4b 100644
--- a/tutorials/README.md
+++ b/tutorials/README.md
@@ -22,7 +22,7 @@ Step-by-step walkthroughs covering adapter function invocation, pipeline constru
 | Guide | Description |
 |-------|-------------|
 | [Using Mellea with Granite Switch](guides/mellea_with_granite_switch.md) | Connect Mellea to a Granite Switch model |
-| [Bring Your Own Adapter](guides/build_your_own_adapter.md) | Train, compose, and use custom adapters |
+| [Build Your Own Adapter](guides/build_your_own_adapter.md) | Train, compose, and use custom adapters |
 | [Compare Inference Throughput](guides/compare_inference_throughput.md) | Compare LoRA vs aLoRA based models in an inference race setup |
 
 
@@ -57,11 +57,11 @@ Best for: Seeing how adapter functions compose into multi-step applications
 
 
 
-### Path 3: Bring Your Own Adapter
+### Path 3: Build Your Own Adapter
 
 Best for: Custom adapter function development
 
-1. [Bring Your Own Adapter Guide](guides/build_your_own_adapter.md)
+1. [Build Your Own Adapter Guide](guides/build_your_own_adapter.md)
 2. [Configure Your Own Adapter Guide](guides/mellea_build_your_own_adapter.md)
 3. [Compose Your Checkpoint](notebooks/compose_granite_switch.ipynb)
 
diff --git a/tutorials/guides/build_your_own_adapter.md b/tutorials/guides/build_your_own_adapter.md
index 7af4d92..c7227a0 100644
--- a/tutorials/guides/build_your_own_adapter.md
+++ b/tutorials/guides/build_your_own_adapter.md
@@ -1,4 +1,4 @@
-# Bring Your Own Adapter (BYOA)
+# Build Your Own Adapter (BYOA)
 
 This guide explains how to train your own adapter (aLoRA or LoRA) and compose it into a Granite Switch model.
 
@@ -183,7 +183,7 @@ The base model's tokenizer and generation assets (`generation_config.json`, `mer
 
 ## Step 4: Use the Composed Model
 
-> **Note:** Custom (BYOA) adapters are not supported by [Mellea](https://github.com/generative-computing/mellea). Mellea only supports the official IBM Granite Library adapters. To invoke your custom adapters, use the chat template directly as shown below.
+You can invoke your custom adapter directly via the model's chat template (with HuggingFace or vLLM), or through [Mellea](https://github.com/generative-computing/mellea). Note that Mellea's *high-level* wrappers (Guardian, RAG, Core) are specific to the official IBM Granite Library adapters; a custom (BYOA) adapter is invoked by name through Mellea's lower-level `Intrinsic` interface, as shown in the **With Mellea** section below.
 
 ### With HuggingFace
 
@@ -247,6 +247,52 @@ response = client.chat.completions.create(
 print(response.choices[0].message.content)
 ```
 
+### With Mellea
+
+Mellea can drive the same vLLM server. Its high-level wrappers (Guardian, RAG, Core) only cover the official IBM Granite Library adapters, but a custom adapter is invoked by name through the lower-level `Intrinsic` interface:
+
+```bash
+pip install mellea
+```
+
+```python
+import json
+
+from mellea.backends.model_options import ModelOption
+from mellea.backends.openai import OpenAIBackend
+from mellea.stdlib.context import ChatContext
+from mellea.stdlib.components import Message, Intrinsic
+import mellea.stdlib.functional as mfuncs
+
+# Connect to the vLLM server from the "With vLLM" section above.
+# load_embedded_adapters=True autoloads the adapters composed into the model.
+backend = OpenAIBackend(
+    model_id="./composed-model",
+    base_url="http://localhost:8000/v1",
+    api_key="unused",  # vLLM doesn't require auth by default
+    load_embedded_adapters=True,
+)
+
+# Invoke your custom adapter by name (the io.yaml `name` from Step 2/3).
+context = ChatContext().add(Message("assistant", "Paris is the capital of France."))
+action = Intrinsic("uncertainty")
+
+out, _ = mfuncs.act(
+    action,
+    context,
+    backend,
+    model_options={ModelOption.TEMPERATURE: 0.0},
+    strategy=None,
+)
+
+# The adapter's io.yaml `response_format` (see Step 3) forces JSON output, and its
+# `transformations` rename `score` to `certainty`.
+result = json.loads(str(out))
+print(result["certainty"])
+```
+
+For the full walkthrough — including helper functions that wrap your adapter so it behaves like Mellea's built-in intrinsics — see [Build Your Own Adapter with Mellea](mellea_build_your_own_adapter.md).
+
 ## Next Steps
 
 - **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - minimal embedded-adapter invocation via the HuggingFace backend
diff --git a/tutorials/guides/compare_inference_throughput.md b/tutorials/guides/compare_inference_throughput.md
index 961172f..2ed2719 100644
--- a/tutorials/guides/compare_inference_throughput.md
+++ b/tutorials/guides/compare_inference_throughput.md
@@ -87,4 +87,4 @@ raced simultaneously.
 
 - **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - minimal embedded-adapter invocation via the HuggingFace backend
 - **[Using Mellea with Granite Switch](mellea_with_granite_switch.md)** - deeper Mellea integration details
-- **[Bring Your Own Adapter](build_your_own_adapter.md)** - train a custom adapter and compose it in
+- **[Build Your Own Adapter](build_your_own_adapter.md)** - train a custom adapter and compose it in
diff --git a/tutorials/guides/mellea_build_your_own_adapter.md b/tutorials/guides/mellea_build_your_own_adapter.md
index 46779ef..5853f09 100644
--- a/tutorials/guides/mellea_build_your_own_adapter.md
+++ b/tutorials/guides/mellea_build_your_own_adapter.md
@@ -1,4 +1,4 @@
-# Bring Your Own Adapter with Mellea
+# Build Your Own Adapter with Mellea
 
 This guide explains how to configure your own adapter with Mellea to be used by Granite Switch model.
 
@@ -6,7 +6,7 @@ This guide explains how to configure your own adapter with Mellea to be used by
 Together, Mellea + Granite Switch + vLLM provide a production-ready inference stack for adapter-based AI applications that can utilize custom adapters.
 
 - See [Mellea With Granite Switch](mellea_with_granite_switch.md) for a detailed explanation of how granite-switch and Mellea work together.
-- See [Bring Your Own Adapter](build_your_own_adapter.md) for info on how to train your own adapter.
+- See [Build Your Own Adapter](build_your_own_adapter.md) for info on how to train your own adapter.
 - See Mellea's [Lora and aLoRA adapters](https://docs.mellea.ai/advanced/lora-and-alora-adapters) for info on how to train your own custom adapters using Mellea.
 
 ## Prerequisites
@@ -61,7 +61,7 @@ out, _ = mfuncs.act(
 )
 
 # Adapter / Intrinsic processing in Mellea utilizes the io.yaml format forcing the output
-# to be a json. See the "Bring Your Own Adapter" linked example above.
+# to be a json. See the "Build Your Own Adapter" linked example above.
 result = json.loads(str(out))
 print(result)
 ```
diff --git a/tutorials/guides/mellea_with_granite_switch.md b/tutorials/guides/mellea_with_granite_switch.md
index e9946fd..2a8850c 100644
--- a/tutorials/guides/mellea_with_granite_switch.md
+++ b/tutorials/guides/mellea_with_granite_switch.md
@@ -247,7 +247,7 @@ print(f"Citations: {citations}")
 ## Next Steps
 
 - **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - Minimal embedded-adapter invocation via the HuggingFace backend
-- **[Bring Your Own Adapter](build_your_own_adapter.md)** - Train a custom adapter and compose it in
+- **[Build Your Own Adapter](build_your_own_adapter.md)** - Train a custom adapter and compose it in
 - **[Compare Inference Throughput](compare_inference_throughput.md)** - Benchmark ALORA vs LoRA on a 6-step RAG pipeline
 - **[Mellea Repository](https://github.com/generative-computing/mellea)** - Full documentation
 - **[Granite Models](https://huggingface.co/ibm-granite)**