20 changes: 16 additions & 4 deletions docs/cloud/capacity-modes.mdx
@@ -67,12 +67,24 @@ RPS and OPS are lower-level measures to control and balance request rates at th

### What happens when my Actions Rate exceeds my Limit?

When your Action rate exceeds your quota, Temporal Cloud throttles Actions until the rate matches your quota.
Throttling means limiting the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.
Your work is never lost and will continue at the limited pace until APS returns below the limit.
When your Action rate exceeds your quota, Temporal Cloud throttles Actions.
Throttling limits the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.

**How throttling works:**
- Low-priority operations are throttled first; higher-priority operations (like starting or signaling Workflows) continue when possible.
- Rate limiting is not instantaneous, so usage may briefly exceed your limit before throttling takes effect.
- When throttled, the server returns `ResourceExhausted` errors that SDK clients automatically retry.
- If throttling persists beyond the SDK's retry limit, client calls can fail.

**To avoid data loss during throttling:**
- Log any failed client calls (with payloads) so you can retry or backfill later.
- Set up [limit metrics](/cloud/metrics/openmetrics/metrics-reference#limit-metrics) to alert when approaching your limits.

See [Throttling behavior](/cloud/limits#throttling-behavior) for more details.
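
As a concrete illustration of the logging recommendation above, here is a minimal sketch using the Python SDK. The workflow name, task queue, and logging approach are placeholders rather than anything prescribed by Temporal; SDKs in other languages surface the same `ResourceExhausted` status on their client errors.

```python
# Minimal sketch: record throttled Workflow starts so they can be backfilled.
# "OrderWorkflow", "orders", and the logging setup are placeholders.
import json
import logging

from temporalio.client import Client
from temporalio.service import RPCError, RPCStatusCode

logger = logging.getLogger("workflow-starts")


async def start_or_record(client: Client, workflow_id: str, payload: dict) -> None:
    try:
        # The SDK retries ResourceExhausted internally; this except branch only
        # runs once those retries are exhausted.
        await client.start_workflow(
            "OrderWorkflow",
            payload,
            id=workflow_id,
            task_queue="orders",
        )
    except RPCError as err:
        if err.status == RPCStatusCode.RESOURCE_EXHAUSTED:
            # Persist enough context, including the payload, to retry or
            # backfill this start once throttling subsides.
            logger.error(
                "Throttled start_workflow; recording for backfill: %s",
                json.dumps({"workflow_id": workflow_id, "payload": payload}),
            )
        else:
            raise
```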

Your rate limits can be adjusted automatically over time or provisioned manually with Capacity Modes.

We recommend tracking your Actions Rate and Limits using Temporal metrics to assess your use case's specific needs.
See [Monitoring Trends Against Limits](/cloud/service-health#rps-aps-rate-limits) to track usage trends.

:::note Actions that don't count against APS
15 changes: 15 additions & 0 deletions docs/evaluate/temporal-cloud/limits.mdx
@@ -62,6 +62,7 @@ The following limits apply at the Namespace level.
- Automatically increases (and decreases) based on the last 7 days of APS usage. Will never go below the default limit.
- See [Capacity Modes](/cloud/capacity-modes).
- [Contact support](/cloud/support#support-ticket).
- What happens when you exceed the limit: See [Throttling behavior](#throttling-behavior) below.

See the [Actions page](/cloud/actions) for the list of actions.

@@ -87,6 +88,20 @@ See the [glossary](/glossary#requests-per-second-rps) for more about RPS.

See the [operations list](/references/operation-list) for the list of operations.

### Throttling behavior

When you exceed your APS, RPS, or OPS limits, Temporal Cloud throttles requests. Here's what happens:

1. **Priority-based throttling**: Low-priority operations are throttled first. Higher-priority operations like `StartWorkflowExecution`, `SignalWorkflowExecution`, and `UpdateWorkflowExecution` continue to go through when possible. Temporal Cloud uses [throttling priorities similar to those in the open source server](https://github.com/temporalio/temporal/blob/main/service/frontend/configs/quotas.go#L66).
2. **Throttling latency**: Rate limiting is not instantaneous, so usage may briefly exceed your limit before throttling takes effect.
3. **ResourceExhausted errors**: When throttled, the server returns a `ResourceExhausted` gRPC error. SDK clients automatically retry these based on the default gRPC retry policy.
4. **Potential failure**: If throttling persists beyond the SDK's retry limit, client calls fail. This means work _can_ be lost if you don't handle these failures.
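
To make point 4 concrete, here is a rough sketch (Python SDK, placeholder Signal and Workflow IDs) of an application-level backoff you might wrap around a Signal call for cases where throttling outlasts the SDK's built-in retries:

```python
# Sketch only: retry a Signal with exponential backoff and jitter after the
# SDK's own ResourceExhausted retries are exhausted. Names are placeholders.
import asyncio
import random

from temporalio.client import Client
from temporalio.service import RPCError, RPCStatusCode


async def signal_with_backoff(client: Client, workflow_id: str, payload: dict) -> None:
    handle = client.get_workflow_handle(workflow_id)
    delay = 1.0
    for attempt in range(5):
        try:
            await handle.signal("order-updated", payload)
            return
        except RPCError as err:
            if err.status != RPCStatusCode.RESOURCE_EXHAUSTED or attempt == 4:
                raise
            # Backing off gives the Namespace time to drop back under its
            # APS/RPS limit before the next attempt.
            await asyncio.sleep(delay + random.uniform(0, delay))
            delay *= 2
```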

**Best practices for handling throttling:**
- Log any failed `StartWorkflowExecution`, `SignalWorkflowExecution`, or `UpdateWorkflowExecution` calls on the client side, including the payload, so you can retry or backfill later.
- Set up [Cloud metrics](/cloud/metrics/openmetrics/metrics-reference#limit-metrics) to alert when throttling occurs and when you approach your limits.
- Consider [Provisioned Capacity](/cloud/capacity-modes#provisioned-capacity) if you have predictable spikes or need guaranteed throughput.
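
For the metrics recommendation, one possible shape of an automated check is sketched below, assuming your Cloud metrics are already scraped into Prometheus. The Prometheus address is a placeholder, and the metric name should be verified against the Cloud metrics reference linked above before you rely on it.

```python
# Sketch only: poll Prometheus for recent ResourceExhausted errors on a
# Namespace. PROM_URL is a placeholder; confirm the metric and label names
# against the Cloud metrics reference for your account.
import requests

PROM_URL = "http://prometheus.internal:9090"  # placeholder


def throttled_recently(namespace: str) -> bool:
    query = (
        "sum(rate(temporal_cloud_v0_resource_exhausted_error_count"
        f'{{temporal_namespace="{namespace}"}}[5m]))'
    )
    resp = requests.get(
        f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10
    )
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    return bool(results) and float(results[0]["value"][1]) > 0
```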

### Schedules rate limit

- Scope: Namespace