20 changes: 16 additions & 4 deletions docs/cloud/capacity-modes.mdx
@@ -67,12 +67,24 @@ RPS and OPS are lower-level measures to control and balance request rates at th

### What happens when my Actions Rate exceeds my Limit?

When your Action rate exceeds your quota, Temporal Cloud throttles Actions until the rate matches your quota.
Throttling means limiting the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.
Your work is never lost and will continue at the limited pace until APS returns below the limit.
When your Action rate exceeds your quota, Temporal Cloud throttles Actions.
Throttling limits the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.

**How throttling works:**
- Low-priority operations are throttled first; higher-priority operations (like starting or signaling Workflows) continue when possible.
- Rate limiting is not instantaneous, so usage may briefly exceed your limit before throttling takes effect.
- When throttled, the server returns `ResourceExhausted` errors that SDK clients automatically retry.
- If throttling persists beyond the SDK's retry limit, client calls can fail.

**To avoid data loss during throttling:**
- Log any failed client calls (with payloads) so you can retry or backfill later.
- Set up [limit metrics](/cloud/metrics/openmetrics/metrics-reference#limit-metrics) to alert when approaching your limits.

See [Throttling behavior](/cloud/limits#throttling-behavior) for more details.
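
As a concrete illustration of the logging recommendation above, here is a minimal sketch using the Python SDK. The workflow name, task queue, and logging approach are placeholders rather than anything prescribed by Temporal; SDKs in other languages surface the same `ResourceExhausted` status on their client errors.

```python
# Minimal sketch: record throttled Workflow starts so they can be backfilled.
# "OrderWorkflow", "orders", and the logging setup are placeholders.
import json
import logging

from temporalio.client import Client
from temporalio.service import RPCError, RPCStatusCode

logger = logging.getLogger("workflow-starts")


async def start_or_record(client: Client, workflow_id: str, payload: dict) -> None:
    try:
        # The SDK retries ResourceExhausted internally; this except branch only
        # runs once those retries are exhausted.
        await client.start_workflow(
            "OrderWorkflow",
            payload,
            id=workflow_id,
            task_queue="orders",
        )
    except RPCError as err:
        if err.status == RPCStatusCode.RESOURCE_EXHAUSTED:
            # Persist enough context, including the payload, to retry or
            # backfill this start once throttling subsides.
            logger.error(
                "Throttled start_workflow; recording for backfill: %s",
                json.dumps({"workflow_id": workflow_id, "payload": payload}),
            )
        else:
            raise
```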

Your rate limits can be adjusted automatically over time or provisioned manually with Capacity Modes.

We recommend tracking your Actions Rate and Limits using Temporal metrics to assess your use case's specific needs.
See [Monitoring Trends Against Limits](/cloud/service-health#rps-aps-rate-limits) to track usage trends.

:::note Actions that don't count against APS
15 changes: 15 additions & 0 deletions docs/evaluate/temporal-cloud/limits.mdx
@@ -62,6 +62,7 @@ The following limits apply at the Namespace level.
- Automatically increases (and decreases) based on the last 7 days of APS usage. Will never go below the default limit.
- See [Capacity Modes](/cloud/capacity-modes).
- [Contact support](/cloud/support#support-ticket).
- What happens when you exceed the limit: See [Throttling behavior](#throttling-behavior) below.

See the [Actions page](/cloud/actions) for the list of actions.

@@ -87,6 +88,20 @@ See the [glossary](/glossary#requests-per-second-rps) for more about RPS.

See the [operations list](/references/operation-list) for the list of operations.

### Throttling behavior

When you exceed your APS, RPS, or OPS limits, Temporal Cloud throttles requests. Here's what happens:

1. **Priority-based throttling**: Low-priority operations are throttled first. Higher-priority operations like `StartWorkflowExecution`, `SignalWorkflowExecution`, and `UpdateWorkflowExecution` continue to go through when possible. Temporal Cloud uses [throttling priorities similar to those in the open source server](https://github.com/temporalio/temporal/blob/main/service/frontend/configs/quotas.go#L66).
2. **Throttling latency**: Rate limiting is not instantaneous, so usage may briefly exceed your limit before throttling takes effect.
3. **ResourceExhausted errors**: When throttled, the server returns a `ResourceExhausted` gRPC error. SDK clients automatically retry these based on the default gRPC retry policy.
4. **Potential failure**: If throttling persists beyond the SDK's retry limit, client calls fail. This means work _can_ be lost if you don't handle these failures.
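
To make point 4 concrete, here is a rough sketch (Python SDK, placeholder Signal and Workflow IDs) of an application-level backoff you might wrap around a Signal call for cases where throttling outlasts the SDK's built-in retries:

```python
# Sketch only: retry a Signal with exponential backoff and jitter after the
# SDK's own ResourceExhausted retries are exhausted. Names are placeholders.
import asyncio
import random

from temporalio.client import Client
from temporalio.service import RPCError, RPCStatusCode


async def signal_with_backoff(client: Client, workflow_id: str, payload: dict) -> None:
    handle = client.get_workflow_handle(workflow_id)
    delay = 1.0
    for attempt in range(5):
        try:
            await handle.signal("order-updated", payload)
            return
        except RPCError as err:
            if err.status != RPCStatusCode.RESOURCE_EXHAUSTED or attempt == 4:
                raise
            # Backing off gives the Namespace time to drop back under its
            # APS/RPS limit before the next attempt.
            await asyncio.sleep(delay + random.uniform(0, delay))
            delay *= 2
```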

**Best practices for handling throttling:**
- Log any failed `StartWorkflowExecution`, `SignalWorkflowExecution`, or `UpdateWorkflowExecution` calls on the client side, including the payload, so you can retry or backfill later.
- Set up [Cloud metrics](/cloud/metrics/openmetrics/metrics-reference#limit-metrics) to alert when throttling occurs and when you approach your limits.
- Consider [Provisioned Capacity](/cloud/capacity-modes#provisioned-capacity) if you have predictable spikes or need guaranteed throughput.
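
For the metrics recommendation, one possible shape of an automated check is sketched below, assuming your Cloud metrics are already scraped into Prometheus. The Prometheus address is a placeholder, and the metric name should be verified against the Cloud metrics reference linked above before you rely on it.

```python
# Sketch only: poll Prometheus for recent ResourceExhausted errors on a
# Namespace. PROM_URL is a placeholder; confirm the metric and label names
# against the Cloud metrics reference for your account.
import requests

PROM_URL = "http://prometheus.internal:9090"  # placeholder


def throttled_recently(namespace: str) -> bool:
    query = (
        "sum(rate(temporal_cloud_v0_resource_exhausted_error_count"
        f'{{temporal_namespace="{namespace}"}}[5m]))'
    )
    resp = requests.get(
        f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10
    )
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    return bool(results) and float(results[0]["value"][1]) > 0
```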

### Schedules rate limit

- Scope: Namespace