6 changes: 3 additions & 3 deletions tutorials/README.md
@@ -22,7 +22,7 @@ Step-by-step walkthroughs covering adapter function invocation, pipeline constru
| Guide | Description |
|-------|-------------|
| [Using Mellea with Granite Switch](guides/mellea_with_granite_switch.md) | Connect Mellea to a Granite Switch model |
| [Bring Your Own Adapter](guides/build_your_own_adapter.md) | Train, compose, and use custom adapters |
| [Build Your Own Adapter](guides/build_your_own_adapter.md) | Train, compose, and use custom adapters |
| [Compare Inference Throughput](guides/compare_inference_throughput.md) | Compare LoRA vs aLoRA based models in an inference race setup |


@@ -57,11 +57,11 @@ Best for: Seeing how adapter functions compose into multi-step applications



### Path 3: Bring Your Own Adapter
### Path 3: Build Your Own Adapter

Best for: Custom adapter function development

1. [Bring Your Own Adapter Guide](guides/build_your_own_adapter.md)
1. [Build Your Own Adapter Guide](guides/build_your_own_adapter.md)
2. [Configure Your Own Adapter Guide](guides/mellea_build_your_own_adapter.md)
3. [Compose Your Checkpoint](notebooks/compose_granite_switch.ipynb)

50 changes: 48 additions & 2 deletions tutorials/guides/build_your_own_adapter.md
@@ -1,4 +1,4 @@
# Bring Your Own Adapter (BYOA)
# Build Your Own Adapter (BYOA)

This guide explains how to train your own adapter (aLoRA or LoRA) and compose it into a Granite Switch model.

@@ -183,7 +183,7 @@ The base model's tokenizer and generation assets (`generation_config.json`, `mer

## Step 4: Use the Composed Model

> **Note:** Custom (BYOA) adapters are not supported by [Mellea](https://github.com/generative-computing/mellea). Mellea only supports the official IBM Granite Library adapters. To invoke your custom adapters, use the chat template directly as shown below.
You can invoke your custom adapter directly via the model's chat template (with HuggingFace or vLLM), or through [Mellea](https://github.com/generative-computing/mellea). Note that Mellea's *high-level* wrappers (Guardian, RAG, Core) are specific to the official IBM Granite Library adapters; a custom (BYOA) adapter is invoked by name through Mellea's lower-level `Intrinsic` interface, as shown in the **With Mellea** section below.

### With HuggingFace

@@ -247,6 +247,52 @@ response = client.chat.completions.create(
print(response.choices[0].message.content)
```

### With Mellea

Mellea can drive the same vLLM server. Its high-level wrappers (Guardian, RAG, Core) only cover the official IBM Granite Library adapters, but a custom adapter is invoked by name through the lower-level `Intrinsic` interface:

```bash
pip install mellea
```

```python
import json

from mellea.backends.model_options import ModelOption
from mellea.backends.openai import OpenAIBackend
from mellea.stdlib.context import ChatContext
from mellea.stdlib.components import Message, Intrinsic
import mellea.stdlib.functional as mfuncs

# Connect to the vLLM server from the "With vLLM" section above.
# load_embedded_adapters=True autoloads the adapters composed into the model.
backend = OpenAIBackend(
    model_id="./composed-model",
    base_url="http://localhost:8000/v1",
    api_key="unused",  # vLLM doesn't require auth by default
    load_embedded_adapters=True,
)

# Invoke your custom adapter by name (the io.yaml `name` from Step 2/3).
context = ChatContext().add(Message("assistant", "Paris is the capital of France."))
action = Intrinsic("uncertainty")

out, _ = mfuncs.act(
    action,
    context,
    backend,
    model_options={ModelOption.TEMPERATURE: 0.0},
    strategy=None,
)

# The adapter's io.yaml `response_format` (see Step 3) forces JSON output, and its
# `transformations` rename `score` to `certainty`.
result = json.loads(str(out))
print(result["certainty"])
```
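
If the adapter output might not always be well-formed, the final parsing step can be made more defensive. The following is a minimal sketch, not part of Mellea or Granite Switch: `parse_certainty` is a hypothetical helper name, and it assumes the adapter returns a JSON object with a numeric `certainty` field, as in the example above.

```python
import json


def parse_certainty(raw: str) -> float:
    """Extract the `certainty` value from an adapter's JSON output.

    Hypothetical helper: assumes the output is a JSON object that
    carries a numeric `certainty` field.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Surface the raw text so a malformed generation is easy to debug.
        raise ValueError(f"adapter output is not valid JSON: {raw!r}") from exc

    if not isinstance(payload, dict) or "certainty" not in payload:
        raise ValueError(f"adapter output has no 'certainty' field: {raw!r}")

    return float(payload["certainty"])
```

Under these assumptions, `parse_certainty(str(out))` could stand in for `json.loads(str(out))["certainty"]`, turning a malformed response into a single `ValueError` that includes the raw text.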

For the full walkthrough — including helper functions that wrap your adapter so it behaves like Mellea's built-in intrinsics — see [Build Your Own Adapter with Mellea](mellea_build_your_own_adapter.md).

## Next Steps

- **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - minimal embedded-adapter invocation via the HuggingFace backend
2 changes: 1 addition & 1 deletion tutorials/guides/compare_inference_throughput.md
@@ -87,4 +87,4 @@ raced simultaneously.

- **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - minimal embedded-adapter invocation via the HuggingFace backend
- **[Using Mellea with Granite Switch](mellea_with_granite_switch.md)** - deeper Mellea integration details
- **[Bring Your Own Adapter](build_your_own_adapter.md)** - train a custom adapter and compose it in
- **[Build Your Own Adapter](build_your_own_adapter.md)** - train a custom adapter and compose it in
6 changes: 3 additions & 3 deletions tutorials/guides/mellea_build_your_own_adapter.md
@@ -1,12 +1,12 @@
# Bring Your Own Adapter with Mellea
# Build Your Own Adapter with Mellea

This guide explains how to configure your own adapter with Mellea for use with a Granite Switch model.

## Overview

Together, Mellea + Granite Switch + vLLM provide a production-ready inference stack for adapter-based AI applications that can utilize custom adapters.
- See [Mellea With Granite Switch](mellea_with_granite_switch.md) for a detailed explanation of how granite-switch and Mellea work together.
- See [Bring Your Own Adapter](build_your_own_adapter.md) for info on how to train your own adapter.
- See [Build Your Own Adapter](build_your_own_adapter.md) for info on how to train your own adapter.
- See Mellea's [Lora and aLoRA adapters](https://docs.mellea.ai/advanced/lora-and-alora-adapters) for info on how to train your own custom adapters using Mellea.

## Prerequisites
@@ -61,7 +61,7 @@ out, _ = mfuncs.act(
)

# Adapter / Intrinsic processing in Mellea utilizes the io.yaml format forcing the output
# to be a json. See the "Bring Your Own Adapter" linked example above.
# to be a json. See the "Build Your Own Adapter" linked example above.
result = json.loads(str(out))
print(result)
```
2 changes: 1 addition & 1 deletion tutorials/guides/mellea_with_granite_switch.md
@@ -247,7 +247,7 @@ print(f"Citations: {citations}")
## Next Steps

- **[Hello Adapter](../notebooks/hello_adapter.ipynb)** - Minimal embedded-adapter invocation via the HuggingFace backend
- **[Bring Your Own Adapter](build_your_own_adapter.md)** - Train a custom adapter and compose it in
- **[Build Your Own Adapter](build_your_own_adapter.md)** - Train a custom adapter and compose it in
- **[Compare Inference Throughput](compare_inference_throughput.md)** - Benchmark aLoRA vs LoRA on a 6-step RAG pipeline
- **[Mellea Repository](https://github.com/generative-computing/mellea)** - Full documentation
- **[Granite Models](https://huggingface.co/ibm-granite)**