From 7fd33d3b3f08a6ba25d950c8eea01e6bf4cb4335 Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Thu, 15 Jan 2026 09:29:43 +0000
Subject: [PATCH 1/9] WIP - add pricing information to overview

---
 src/pages/docs/ai-transport/index.mdx | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx
index ceec10b83a..57969480a0 100644
--- a/src/pages/docs/ai-transport/index.mdx
+++ b/src/pages/docs/ai-transport/index.mdx
@@ -145,3 +145,20 @@ Take a look at some example code running in-browser of the sorts of features you
     },
   ]}
+
+## Pricing
+
+AI Transport uses Ably's [usage based billing model](/pricing) at your package rates. Your consumption costs will depend on the number of messages inbound (published to Ably) and outbound (delivered to subscribers), and how long channels or connections are active. [Contact Ably](/contact) to discuss options for Enterprise pricing and volume discounts.
+
+The cost of streaming token responses over Ably depends on:
+
+- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2000-3000 tokens and a deep reasoning response could be 50000+ tokens.
+- the rate at which your agent publishes tokens to Ably and the number of messages it uses to do so. Some LLMs output every token as a single event, while others batch multiple tokens together. Similarly, your agent may publish tokens as they are received from the LLM or perform its own processing and batching first.
+- the number of subscribers receiving the response
+- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose
+
+
+- message-per-response Ably will automatically
+
+- message-per-token you are in control, you can turn on server side batching to group messages together in a batching interval. Higher batching interval increases latency but reduces total number of messages, lower batching interval delivers messages quickly.
+[server-side batching](/docs/messages/batch#server-side)

From 20d5dbeba6377153dc9d4e4fc0999c55a07a11a3 Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Thu, 15 Jan 2026 16:03:00 +0000
Subject: [PATCH 2/9] WIP - pricing notes

---
 src/pages/docs/ai-transport/index.mdx         | 14 ++++-------
 .../platform/pricing/examples/ai-chatbot.mdx  | 23 +++++++++++++++++++
 2 files changed, 28 insertions(+), 9 deletions(-)
 create mode 100644 src/pages/docs/platform/pricing/examples/ai-chatbot.mdx

diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx
index 57969480a0..ef96f64479 100644
--- a/src/pages/docs/ai-transport/index.mdx
+++ b/src/pages/docs/ai-transport/index.mdx
@@ -148,17 +148,13 @@ Take a look at some example code running in-browser of the sorts of features you
 ## Pricing
 
-AI Transport uses Ably's [usage based billing model](/pricing) at your package rates. Your consumption costs will depend on the number of messages inbound (published to Ably) and outbound (delivered to subscribers), and how long channels or connections are active. [Contact Ably](/contact) to discuss options for Enterprise pricing and volume discounts.
+AI Transport uses Ably's [usage based billing model](/docs/platform/pricing) at your package rates. Your consumption costs will depend on the number of messages inbound (published to Ably) and outbound (delivered to subscribers), and how long channels or connections are active. [Contact Ably](https://ably.com/contact) to discuss options for Enterprise pricing and volume discounts.
 
 The cost of streaming token responses over Ably depends on:
 
-- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2000-3000 tokens and a deep reasoning response could be 50000+ tokens.
+- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2,000-3,000 tokens and a deep reasoning response could be over 50,000 tokens.
 - the rate at which your agent publishes tokens to Ably and the number of messages it uses to do so. Some LLMs output every token as a single event, while others batch multiple tokens together. Similarly, your agent may publish tokens as they are received from the LLM or perform its own processing and batching first.
-- the number of subscribers receiving the response
-- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose
+- the number of subscribers receiving the response.
+- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.
 
-
-- message-per-response Ably will automatically
-
-- message-per-token you are in control, you can turn on server side batching to group messages together in a batching interval. Higher batching interval increases latency but reduces total number of messages, lower batching interval delivers messages quickly.
-[server-side batching](/docs/messages/batch#server-side)
+*** Link to worked example(s) ***

diff --git a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
new file mode 100644
index 0000000000..5a433988fa
--- /dev/null
+++ b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -0,0 +1,23 @@
+---
+title: AI support chatbot
+meta_description: "Calculate AI Transport pricing for conversations with an AI chatbot. Example shows how using the message-per-response pattern and modifying the append rollup window can generate cost savings."
+meta_keywords: "chatbot, support chat, token streaming, token cost, AI Transport pricing, Ably AI Transport pricing, stream cost, Pub/Sub pricing, realtime data delivery, Ably Pub/Sub pricing"
+intro: "This example uses consumption-based pricing for an AI support chatbot use case, where a single agent is publishing tokens to user over AI Transport."
+---
+
+### Assumptions
+
+The scale and features used in this calculation.
+
+### Cost summary
+
+The high level cost breakdown for this scenario. Messages are billed for both inbound (published to Ably) and outbound (delivered to subscribers).
+
+### Effect
+
+
+- message-per-response Ably will automatically
+
+- message-per-token you are in control, you can turn on server side batching to group messages together in a batching interval. Higher batching interval increases latency but reduces total number of messages, lower batching interval delivers messages quickly.
+[server-side batching](/docs/messages/batch#server-side)
+

From 2fb5f0c3c3e0caec3145dda3f5827b40d0f3500e Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Thu, 15 Jan 2026 22:19:13 +0000
Subject: [PATCH 3/9] Add worked example for AI chatbot use case

---
 src/pages/docs/ai-transport/index.mdx        |  2 +-
 .../platform/pricing/examples/ai-chatbot.mdx | 44 ++++++++++++++++---
 2 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx
index ef96f64479..8224a44abc 100644
--- a/src/pages/docs/ai-transport/index.mdx
+++ b/src/pages/docs/ai-transport/index.mdx
@@ -157,4 +157,4 @@ The cost of streaming token responses over Ably depends on:
 - the number of subscribers receiving the response.
 - the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.
 
-*** Link to worked example(s) ***
+For example, an AI support chatbot sending a response of 250 tokens at 70 tokens/s to a single client using the [message-per-response](/docs/ai-transport/features/token-streaming/message-per-response) pattern would consume 90 inbound messages, 90 outbound messages and 90 persisted messages. See the [AI support chatbot pricing example](/docs/platform/pricing/examples/ai-chatbot) for a full breakdown of the costs in this scenario.

diff --git a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
index 5a433988fa..41d08bc5d4 100644
--- a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
+++ b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -9,15 +9,49 @@ intro: "This example uses consumption-based pricing for an AI support chatbot us
 
 ### Assumptions
 
 The scale and features used in this calculation.
 
+| Scale | Features |
+|-------|----------|
+| 4 user prompts to get to resolution | ✓ Message-per-response |
+| 250 tokens per LLM response | |
+| 70 appends per second from agent | |
+| 3 minute average chat duration | |
+| 1 million chats | |
+
 ### Cost summary
 
-The high level cost breakdown for this scenario. Messages are billed for both inbound (published to Ably) and outbound (delivered to subscribers).
+The high level cost breakdown for this scenario. Messages are billed for both inbound (published to Ably) and outbound (delivered to subscribers). Creating the "Message updates and deletes" [channel rule](/docs/ai-transport/features/token-streaming/message-per-response#enable) will automatically enable message persistence.
+
+| Item | Calculation | Cost |
+|------|-------------|------|
+| Messages | 1092M × $2.50/M | $2730.00 |
+| Connection minutes | 6M × $1.00/M | $6.00 |
+| Channel minutes | 3M × $1.00/M | $3.00 |
+| Package fee | | [See plans](/pricing) |
+| **Total** | | **~$2739.00/M chats** |
+
+### Message breakdown
+
+How the message cost breaks down. The message-per-response pattern includes [automatic rollup of append events](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) to reduce consumption costs and avoid rate limits.
+
+| Type | Calculation | Inbound | Outbound | Total messages | Cost |
+|------|-------------|---------|----------|----------------|------|
+| User prompts | 1M chats × 4 prompts | 4M | 4M | 8M | $20.00 |
+| Agent responses | 1M chats x 4 responses x 250 token events per response | 360M | 360M | 720M | $1800.00 |
+| Persisted messages | Every inbound message is persisted | 364M | 0 | 364M | $910.00 |
 
-### Effect
+### Effect of append rollup
 
+The calculation above uses the default append rollup window of 40ms, chosen to control costs with minimum impact on responsiveness. For a text chatbot use case, you could increase the window to 200ms without noticably impacting the user experience.
 
-
-- message-per-response Ably will automatically
+| Rollup window | Inbound response messages | Total messages | Cost |
+|---------------|---------------------------|----------------|------|
+| 40ms | 360 per chat | 1092M | $2730.00/M chats |
+| 100ms | 144 per chat | 444M | $1110.00/M chats |
+| 200ms | 72 per chat | 228M | $570.00/M chats |
 
-
-- message-per-token you are in control, you can turn on server side batching to group messages together in a batching interval. Higher batching interval increases latency but reduces total number of messages, lower batching interval delivers messages quickly.
-[server-side batching](/docs/messages/batch#server-side)
+

From bbc18766ca836d443fea85a1372624c6be3729df Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Fri, 16 Jan 2026 10:02:22 +0000
Subject: [PATCH 4/9] Update nav

---
 src/data/nav/platform.ts | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/data/nav/platform.ts b/src/data/nav/platform.ts
index ce1a3b6682..383a459861 100644
--- a/src/data/nav/platform.ts
+++ b/src/data/nav/platform.ts
@@ -124,6 +124,15 @@ export default {
         link: '/docs/platform/pricing/limits',
         name: 'Limits',
       },
+      {
+        name: 'Pricing examples',
+        pages: [
+          {
+            link: '/docs/platform/pricing/examples/ai-chatbot',
+            name: 'AI support chatbot',
+          },
+        ],
+      },
       {
         link: '/docs/platform/pricing/faqs',
         name: 'Pricing FAQs',

From b4260513f59598af3dca9d6d618c2b59511dfba6 Mon Sep 17 00:00:00 2001
From: rainbowFi
Date: Fri, 16 Jan 2026 12:28:12 +0000
Subject: [PATCH 5/9] Apply suggestions from code review

Co-authored-by: Paddy Byers
---
 src/pages/docs/ai-transport/index.mdx                   | 2 +-
 src/pages/docs/platform/pricing/examples/ai-chatbot.mdx | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx
index 8224a44abc..14051f4e2b 100644
--- a/src/pages/docs/ai-transport/index.mdx
+++ b/src/pages/docs/ai-transport/index.mdx
@@ -152,7 +152,7 @@ AI Transport uses Ably's [usage based billing model](/docs/platform/pricing) at
 
 The cost of streaming token responses over Ably depends on:
 
-- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2,000-3,000 tokens and a deep reasoning response could be over 50,000 tokens.
+- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response might be around 300 tokens, a coding session can be 2,000-3,000 tokens and a deep reasoning response could be over 50,000 tokens.
 - the rate at which your agent publishes tokens to Ably and the number of messages it uses to do so. Some LLMs output every token as a single event, while others batch multiple tokens together. Similarly, your agent may publish tokens as they are received from the LLM or perform its own processing and batching first.
 - the number of subscribers receiving the response.
 - the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.

diff --git a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
index 41d08bc5d4..f1cdd51b2c 100644
--- a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
+++ b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -19,7 +19,7 @@ The scale and features used in this calculation.
 
 ### Cost summary
 
-The high level cost breakdown for this scenario. Messages are billed for both inbound (published to Ably) and outbound (delivered to subscribers). Creating the "Message updates and deletes" [channel rule](/docs/ai-transport/features/token-streaming/message-per-response#enable) will automatically enable message persistence.
+The high level cost breakdown for this scenario is given in the table below. Messages are billed for both inbound (published to Ably) and outbound (delivered to subscribers). Enabling the "Message updates, deletes and appends" [channel rule](/docs/ai-transport/features/token-streaming/message-per-response#enable) will automatically enable message persistence.
 
 | Item | Calculation | Cost |
 |------|-------------|------|
@@ -29,9 +29,9 @@ The high level cost breakdown for this scenario is given in the table below. Mes
 
-### Message breakdown
+### Message usage breakdown
 
-How the message cost breaks down. The message-per-response pattern includes [automatic rollup of append events](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) to reduce consumption costs and avoid rate limits.
+Several factors influence the total message usage. The message-per-response pattern includes [automatic rollup of append events](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) to reduce consumption costs and avoid rate limits.
 
 | Type | Calculation | Inbound | Outbound | Total messages | Cost |
 |------|-------------|---------|----------|----------------|------|

From 5852272a5601021cee49f641abfe35aae571a166 Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Fri, 16 Jan 2026 13:42:40 +0000
Subject: [PATCH 6/9] Update based on comments from JamieB's example PR

---
 .../platform/pricing/examples/ai-chatbot.mdx | 26 ++++++++++---------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
index f1cdd51b2c..aa705e54b6 100644
--- a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
+++ b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -13,7 +13,7 @@ The scale and features used in this calculation.
 |-------|----------|
 | 4 user prompts to get to resolution | ✓ Message-per-response |
 | 250 tokens per LLM response | |
-| 70 appends per second from agent | |
+| 75 appends per second from agent | |
 | 3 minute average chat duration | |
 | 1 million chats | |
 
@@ -23,11 +23,11 @@ The high level cost breakdown for this scenario is given in the table below. Mes
 | Item | Calculation | Cost |
 |------|-------------|------|
-| Messages | 1092M × $2.50/M | $2730.00 |
-| Connection minutes | 6M × $1.00/M | $6.00 |
-| Channel minutes | 3M × $1.00/M | $3.00 |
+| Messages | 1092M × $2.50/M | $2730 |
+| Connection minutes | 6M × $1.00/M | $6 |
+| Channel minutes | 3M × $1.00/M | $3 |
 | Package fee | | [See plans](/pricing) |
-| **Total** | | **~$2739.00/M chats** |
+| **Total** | | **~$2739/M chats** |
 
 ### Message usage breakdown
 
@@ -35,9 +35,10 @@ Several factors influence the total message usage. The message-per-response patt
 | Type | Calculation | Inbound | Outbound | Total messages | Cost |
 |------|-------------|---------|----------|----------------|------|
-| User prompts | 1M chats × 4 prompts | 4M | 4M | 8M | $20.00 |
-| Agent responses | 1M chats x 4 responses x 250 token events per response | 360M | 360M | 720M | $1800.00 |
-| Persisted messages | Every inbound message is persisted | 364M | 0 | 364M | $910.00 |
+| User prompts | 1M chats × 4 prompts | 4M | 4M | 8M | $20 |
+| Agent responses | 1M chats x 4 responses x 250 token events per response | 360M | 360M | 720M | $1800 |
+| Persisted messages | Every inbound message is persisted | 364M | 0 | 364M | $910 |
+| **Total** | | **728M** | **364M** | **1092M** | **$2730** |
 
 ### Effect of append rollup
 
@@ -45,12 +46,13 @@ The calculation above uses the default append rollup window of 40ms, chosen to c
 | Rollup window | Inbound response messages | Total messages | Cost |
 |---------------|---------------------------|----------------|------|
-| 40ms | 360 per chat | 1092M | $2730.00/M chats |
-| 100ms | 144 per chat | 444M | $1110.00/M chats |
-| 200ms | 72 per chat | 228M | $570.00/M chats |
+| 40ms | 360 per chat | 1092M | $2730/M chats |
+| 100ms | 144 per chat | 444M | $1110/M chats |
+| 200ms | 72 per chat | 228M | $570/M chats |

From 2c91ce29cf7f50d91ea3ab531d809b6b13e09cc4 Mon Sep 17 00:00:00 2001
From: Fiona Corden
Date: Fri, 16 Jan 2026 15:22:48 +0000
Subject: [PATCH 7/9] Fixup following Paddy review

---
 src/pages/docs/ai-transport/index.mdx        |  2 +-
 .../platform/pricing/examples/ai-chatbot.mdx | 27 +++++++------------
 2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/src/pages/docs/ai-transport/index.mdx b/src/pages/docs/ai-transport/index.mdx
index 14051f4e2b..ce6a635c99 100644
--- a/src/pages/docs/ai-transport/index.mdx
+++ b/src/pages/docs/ai-transport/index.mdx
@@ -157,4 +157,4 @@ The cost of streaming token responses over Ably depends on:
 - the number of subscribers receiving the response.
 - the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.
 
-For example, an AI support chatbot sending a response of 250 tokens at 70 tokens/s to a single client using the [message-per-response](/docs/ai-transport/features/token-streaming/message-per-response) pattern would consume 90 inbound messages, 90 outbound messages and 90 persisted messages. See the [AI support chatbot pricing example](/docs/platform/pricing/examples/ai-chatbot) for a full breakdown of the costs in this scenario.
+For example, suppose an AI support chatbot sends a response of 300 tokens, each as a discrete update, using the [message-per-response](/docs/ai-transport/features/token-streaming/message-per-response) pattern, and with a single client subscribed to the channel. With AI Transport's [append rollup](/docs/ai-transport/messaging/token-rate-limits#per-response), this will result in usage of 100 inbound messages, 100 outbound messages and 100 persisted messages. See the [AI support chatbot pricing example](/docs/platform/pricing/examples/ai-chatbot) for a full breakdown of the costs in this scenario.
diff --git a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
index aa705e54b6..1d56f6e146 100644
--- a/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
+++ b/src/pages/docs/platform/pricing/examples/ai-chatbot.mdx
@@ -1,5 +1,5 @@
 ---
-title: AI support chatbot
+title: AI support chatbot pricing example
 meta_description: "Calculate AI Transport pricing for conversations with an AI chatbot. Example shows how using the message-per-response pattern and modifying the append rollup window can generate cost savings."
 meta_keywords: "chatbot, support chat, token streaming, token cost, AI Transport pricing, Ably AI Transport pricing, stream cost, Pub/Sub pricing, realtime data delivery, Ably Pub/Sub pricing"
 intro: "This example uses consumption-based pricing for an AI support chatbot use case, where a single agent is publishing tokens to user over AI Transport."
@@ -12,7 +12,7 @@ The scale and features used in this calculation.
 | Scale | Features |
 |-------|----------|
 | 4 user prompts to get to resolution | ✓ Message-per-response |
-| 250 tokens per LLM response | |
+| 300 token events per LLM response | |
 | 75 appends per second from agent | |
 | 3 minute average chat duration | |
 | 1 million chats | |
@@ -23,32 +23,25 @@ The high level cost breakdown for this scenario is given in the table below. Mes
 | Item | Calculation | Cost |
 |------|-------------|------|
-| Messages | 1092M × $2.50/M | $2730 |
+| Messages | 1212M × $2.50/M | $3030 |
 | Connection minutes | 6M × $1.00/M | $6 |
 | Channel minutes | 3M × $1.00/M | $3 |
 | Package fee | | [See plans](/pricing) |
-| **Total** | | **~$2739/M chats** |
+| **Total** | | **~$3039/M chats** |
 
 ### Message usage breakdown
 
 Several factors influence the total message usage. The message-per-response pattern includes [automatic rollup of append events](/docs/ai-transport/features/token-streaming/token-rate-limits#per-response) to reduce consumption costs and avoid rate limits.
 
+- Agent stream time: 300 token events ÷ 75 appends per second = 4 seconds of streaming per response
+- Messages published after rollup: 4 seconds x 25 messages/s = **100 messages per response**
+
 | Type | Calculation | Inbound | Outbound | Total messages | Cost |
 |------|-------------|---------|----------|----------------|------|
 | User prompts | 1M chats × 4 prompts | 4M | 4M | 8M | $20 |
-| Agent responses | 1M chats x 4 responses x 250 token events per response | 360M | 360M | 720M | $1800 |
-| Persisted messages | Every inbound message is persisted | 364M | 0 | 364M | $910 |
-| **Total** | | **728M** | **364M** | **1092M** | **$2730** |
-
-### Effect of append rollup
-
-The calculation above uses the default append rollup window of 40ms, chosen to control costs with minimum impact on responsiveness. For a text chatbot use case, you could increase the window to 200ms without noticably impacting the user experience.
-
-| Rollup window | Inbound response messages | Total messages | Cost |
-|---------------|---------------------------|----------------|------|
-| 40ms | 360 per chat | 1092M | $2730/M chats |
-| 100ms | 144 per chat | 444M | $1110/M chats |
-| 200ms | 72 per chat | 228M | $570/M chats |
+| Agent responses | 1M chats x 4 responses x 100 messages per response | 400M | 400M | 800M | $2000 |
+| Persisted messages | Every inbound message is persisted | 404M | 0 | 404M | $1010 |
+| **Total** | | **808M** | **404M** | **1212M** | **$3030** |
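The message-usage arithmetic in the worked example of the final patch above can be sanity-checked with a short script. This is an illustrative sketch using only the figures from the example tables (300 token events per response, 75 appends/s, a 25 messages/s publish rate after append rollup, 1 million chats with 4 prompts each, and the example's $2.50 per million messages rate); it is not an official pricing calculator, and the rates are the example's assumptions, not quoted prices.

```python
# Sanity-check of the AI chatbot pricing example's message arithmetic.
# All figures come from the example tables in the patch series above;
# the $2.50/M message rate is the example's illustrative rate only.

TOKEN_EVENTS_PER_RESPONSE = 300  # discrete append events per LLM response
APPENDS_PER_SECOND = 75          # rate the agent publishes appends
ROLLED_UP_RATE = 25              # messages/s reaching Ably after append rollup
CHATS = 1_000_000
PROMPTS_PER_CHAT = 4             # each prompt gets one streamed response
PRICE_PER_MILLION_MSGS = 2.50    # $ per million messages (example rate)

# 300 events / 75 per second = 4 s of streaming; 4 s x 25 msg/s = 100 messages
stream_seconds = TOKEN_EVENTS_PER_RESPONSE / APPENDS_PER_SECOND
messages_per_response = int(stream_seconds * ROLLED_UP_RATE)

# Inbound = user prompts plus rolled-up response messages for every chat
inbound = CHATS * PROMPTS_PER_CHAT * (1 + messages_per_response)
outbound = inbound   # a single subscriber receives every inbound message
persisted = inbound  # every inbound message is persisted

total_messages = inbound + outbound + persisted
cost = total_messages / 1_000_000 * PRICE_PER_MILLION_MSGS

print(messages_per_response, total_messages, cost)
```

Running this reproduces the figures in the final cost table: 100 messages per response, 1212M total messages, and a message cost of $3030 per million chats.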