feat(exception-capture): add client-side token bucket rate limiting #662

Merged
hpouillot merged 5 commits into main from feat/exception-bucketed-rate-limiter on Jun 15, 2026

Conversation

@hpouillot hpouillot commented Jun 12, 2026

💡 Motivation and Context

Addresses the long-standing TODO in posthog/exception_capture.py: exception autocapture has no client-side rate limiting, so a crash loop can flood the ingestion queue.

This ports the BucketedRateLimiter from posthog-js (packages/core/src/utils/bucketed-rate-limiter.ts) and applies it to exception autocapture as an opt-in feature:

  • disabled by default — without the flag, behavior is identical to released versions (the limiter is not even constructed)
  • one token bucket per exception type (the Python equivalent of $exception_list[0].type, falling back to "Exception")
  • rate-limited exceptions are skipped before reaching the ingestion queue, and the client logs "Skipping exception capture because of client rate limiting." like the other SDKs
  • defaults are more generous than the browser/Node SDKs' 10 / 1 / 10s because one server process aggregates exceptions across many users' requests

New Client options (also available as module-level settings):

| Option | Default | Description |
| --- | --- | --- |
| enable_exception_autocapture_rate_limiting | False | Opt into client-side rate limiting |
| exception_autocapture_bucket_size | 50 | Max burst of captures per exception type (clamped to 0–100) |
| exception_autocapture_refill_rate | 10 | Tokens restored per refill interval |
| exception_autocapture_refill_interval_seconds | 10 | Seconds between refills |
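
To make the defaults above concrete, here is a small arithmetic sketch. `RateLimitSettings` and its methods are hypothetical names used only for illustration and are not part of the SDK; the only behavior taken from this PR is the default values and the 0–100 clamp on the bucket size. Whether the other two options are clamped is not stated here, so the sketch leaves them alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitSettings:
    """Hypothetical container mirroring the three numeric options above."""

    bucket_size: int = 50
    refill_rate: int = 10
    refill_interval_seconds: float = 10.0

    def clamped_bucket_size(self) -> int:
        # The PR states the bucket size is clamped to the range 0-100.
        return max(0, min(100, self.bucket_size))

    def tokens_restored_per_second(self) -> float:
        # Upper bound on the sustained capture rate per exception type.
        return self.refill_rate / self.refill_interval_seconds


defaults = RateLimitSettings()
browser_like = RateLimitSettings(bucket_size=10, refill_rate=1)
print(defaults.tokens_restored_per_second())      # tokens per second, per type
print(browser_like.tokens_restored_per_second())
```

Under these assumptions the Python defaults restore ten times as many tokens per second as the 10 / 1 / 10s settings, which matches the stated intent of being more generous for server processes.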

Deviations from the JS source, since this SDK runs in threaded server processes:

  • guarded by a threading.Lock (the on_bucket_rate_limited callback fires outside the lock)
  • injectable monotonic clock for tests

One JS quirk is preserved for cross-SDK parity: the call that drains the bucket is itself reported as limited, so a burst lets bucket_size - 1 events through.
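
The behavior described above (lock-guarded buckets, a callback fired outside the lock, an injectable monotonic clock, and the draining call being reported as limited) can be sketched roughly as follows. This is a minimal illustration and not the merged implementation: the class name `SketchBucketLimiter`, the method name `consume_rate_limit`, and the whole-interval refill with timestamp carry-over are assumptions modeled on the description in this PR.

```python
import threading
import time
from typing import Callable, Dict, Hashable, List, Optional


class SketchBucketLimiter:
    """Illustrative per-key token bucket; names here are hypothetical."""

    def __init__(
        self,
        bucket_size: int = 50,
        refill_rate: int = 10,
        refill_interval_seconds: float = 10.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
        on_bucket_rate_limited: Optional[Callable[[Hashable], None]] = None,
    ) -> None:
        self._bucket_size = bucket_size
        self._refill_rate = refill_rate
        self._interval = refill_interval_seconds
        self._clock = clock
        self._callback = on_bucket_rate_limited
        self._lock = threading.Lock()
        # key -> [tokens, timestamp of the last applied refill]
        self._buckets: Dict[Hashable, List[float]] = {}

    def consume_rate_limit(self, key: Hashable) -> bool:
        """Return True when the caller should drop the event."""
        with self._lock:
            now = self._clock()
            bucket = self._buckets.get(key)
            if bucket is None:
                bucket = [self._bucket_size, now]
                self._buckets[key] = bucket
            else:
                # Only whole intervals refill; the remainder carries over.
                intervals = int((now - bucket[1]) // self._interval)
                if intervals > 0:
                    refilled = bucket[0] + intervals * self._refill_rate
                    bucket[0] = min(self._bucket_size, refilled)
                    bucket[1] += intervals * self._interval
            if bucket[0] <= 0:
                return True
            bucket[0] -= 1
            # Preserved quirk: the call that drains the bucket is limited too.
            limited = bucket[0] == 0
        if limited and self._callback is not None:
            self._callback(key)  # runs after the lock is released
        return limited


fake_now = [0.0]
limiter = SketchBucketLimiter(
    bucket_size=10, refill_rate=1, clock=lambda: fake_now[0]
)
passed = sum(not limiter.consume_rate_limit("ValueError") for _ in range(15))
print(passed)  # 9, matching the size-10 integration test described in this PR
```

Firing the callback after the `with` block exits means a slow or re-entrant callback cannot block other threads that are consuming tokens.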

💚 How did you test it?

  • posthog/test/test_bucketed_rate_limiter.py: 25 tests porting the posthog-js spec (bucketed-rate-limiter.spec.ts) — consumption, refill math, partial intervals, bucket isolation, callback semantics, stop(), timestamp carry-over — plus Python-specific tests: parameter clamping, a 10-thread concurrency test asserting exactly bucket_size - 1 of 200 contended consumes pass, ExceptionCapture integration (15 same-type exceptions on a size-10 bucket → 9 captured; a different type still passes), disabled-by-default (100 captures pass untouched, no limiter constructed), enabled defaults, and config pass-through from Client kwargs to the limiter.
  • Existing posthog/test/test_exception_capture.py (15 tests) still passes.
  • ruff format, ruff check, and mypy are clean.

📝 Checklist

  • I reviewed the submitted code.
  • I added tests to verify the changes.
  • I updated the docs if needed.
  • No breaking change or entry added to the changelog.

If releasing new changes

  • Ran sampo add to generate a changeset file

🤖 Generated with Claude Code

Port the posthog-js BucketedRateLimiter (packages/core/src/utils/
bucketed-rate-limiter.ts) and apply it to exception autocapture with
the same settings as the browser and Node SDKs: one bucket per
exception type, bucket size 10, refilling 1 token per 10 seconds.

Resolves the long-standing TODO in exception_capture.py.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
github-actions Bot commented Jun 12, 2026

posthog-python Compliance Report

Date: 2026-06-15 09:50:29 UTC
Duration: 176149ms

✅ All Tests Passed!

45/45 tests passed


Capture Tests

29/29 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Format Validation.Event Has Required Fields | Passed | 517ms |
| Format Validation.Event Has Uuid | Passed | 1507ms |
| Format Validation.Event Has Lib Properties | Passed | 1507ms |
| Format Validation.Distinct Id Is String | Passed | 1507ms |
| Format Validation.Token Is Present | Passed | 1507ms |
| Format Validation.Custom Properties Preserved | Passed | 1507ms |
| Format Validation.Event Has Timestamp | Passed | 1507ms |
| Retry Behavior.Retries On 503 | Passed | 9518ms |
| Retry Behavior.Does Not Retry On 400 | Passed | 3506ms |
| Retry Behavior.Does Not Retry On 401 | Passed | 3507ms |
| Retry Behavior.Respects Retry After Header | Passed | 9514ms |
| Retry Behavior.Implements Backoff | Passed | 23517ms |
| Retry Behavior.Retries On 500 | Passed | 7516ms |
| Retry Behavior.Retries On 502 | Passed | 7512ms |
| Retry Behavior.Retries On 504 | Passed | 7513ms |
| Retry Behavior.Max Retries Respected | Passed | 23515ms |
| Deduplication.Generates Unique Uuids | Passed | 1510ms |
| Deduplication.Preserves Uuid On Retry | Passed | 7511ms |
| Deduplication.Preserves Uuid And Timestamp On Retry | Passed | 14515ms |
| Deduplication.Preserves Uuid And Timestamp On Batch Retry | Passed | 7517ms |
| Deduplication.No Duplicate Events In Batch | Passed | 1504ms |
| Deduplication.Different Events Have Different Uuids | Passed | 1507ms |
| Compression.Sends Gzip When Enabled | Passed | 1507ms |
| Batch Format.Uses Proper Batch Structure | Passed | 1507ms |
| Batch Format.Flush With No Events Sends Nothing | Passed | 1005ms |
| Batch Format.Multiple Events Batched Together | Passed | 1506ms |
| Error Handling.Does Not Retry On 403 | Passed | 3509ms |
| Error Handling.Does Not Retry On 413 | Passed | 3508ms |
| Error Handling.Retries On 408 | Passed | 7509ms |

Feature_Flags Tests

16/16 tests passed

| Test | Status | Duration |
| --- | --- | --- |
| Request Payload.Request With Person Properties Device Id | Passed | 1008ms |
| Request Payload.Flags Request Uses V2 Query Param | Passed | 1006ms |
| Request Payload.Flags Request Hits Flags Path Not Decide | Passed | 1007ms |
| Request Payload.Flags Request Omits Authorization Header | Passed | 1007ms |
| Request Payload.Token In Flags Body Matches Init | Passed | 1007ms |
| Request Payload.Groups Round Trip | Passed | 1006ms |
| Request Payload.Groups Default To Empty Object | Passed | 1007ms |
| Request Payload.Person Properties Distinct Id Auto Populated When Caller Omits It | Passed | 1007ms |
| Request Payload.Disable Geoip False Propagates As Geoip Disable False | Passed | 1007ms |
| Request Payload.Disable Geoip Omitted Defaults To False | Passed | 1007ms |
| Request Payload.Flag Keys To Evaluate Contains Only Requested Key | Passed | 1007ms |
| Request Lifecycle.No Flags Request On Init Alone | Passed | 503ms |
| Request Lifecycle.No Flags Request On Normal Capture | Passed | 1507ms |
| Request Lifecycle.Two Flag Calls Produce Two Remote Requests | Passed | 1012ms |
| Request Lifecycle.Mock Response Value Is Returned To Caller | Passed | 1001ms |
| Side Effect Events.Get Feature Flag Captures Feature Flag Called Event | Passed | 1510ms |

greptile-apps Bot commented Jun 12, 2026

### Issue 1 of 1
posthog/test/test_bucketed_rate_limiter.py:223-248
**Integration test placed in wrong test module**

`test_exception_capture_rate_limits_per_exception_type` exercises `ExceptionCapture` end-to-end (including its `sys.excepthook` side-effect and `close()` teardown) — it is an integration test for `ExceptionCapture`, not for `BucketedRateLimiter`. Having it here means anyone reading `test_exception_capture.py` gets an incomplete picture of that class's tested behaviour. It belongs in `test_exception_capture.py` alongside the other `ExceptionCapture` tests.

Reviews (1): Last reviewed commit: "feat(exception-capture): add client-side..."

Comment thread posthog/test/test_bucketed_rate_limiter.py Outdated
hpouillot and others added 3 commits June 12, 2026 13:04
Expose exception_autocapture_bucket_size, exception_autocapture_refill_rate
and exception_autocapture_refill_interval_seconds on Client and the
module-level API, passed through to ExceptionCapture's rate limiter.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Bucket size 50 refilling 10 tokens per 10 seconds (was 10/1/10, the
browser SDK defaults) since one server process aggregates exceptions
across many users' requests. Defaults now live on ExceptionCapture and
are referenced by Client and the module-level API.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add enable_exception_autocapture_rate_limiting (default False) on
Client, the module-level API and ExceptionCapture. The limiter is only
constructed when enabled, so default behavior is unchanged from
released versions.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@hpouillot hpouillot marked this pull request as ready for review June 12, 2026 12:13
@hpouillot hpouillot requested a review from a team as a code owner June 12, 2026 12:13
greptile-apps Bot commented Jun 12, 2026

Reviews (2): Last reviewed commit: "feat(exception-capture): make rate limit..."

@hpouillot hpouillot requested review from a team, ablaszkiewicz and cat-ph June 12, 2026 12:47
Comment thread posthog/exception_capture.py
…eptions

PostHog groups exceptions by $exception_list[0].type, which is the root
cause (exceptions_from_error_tuple reverses the walked chain), not the
wrapping exception. Walk the chain so e.g. `RuntimeError from
ZeroDivisionError` is keyed on ZeroDivisionError, matching server-side
grouping and posthog-js.

Also move the ExceptionCapture integration tests out of
test_bucketed_rate_limiter.py into test_exception_capture.py so each
file covers a single unit.
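
The chain walk described in this commit message can be sketched as below. This is an illustration under assumptions, not the merged code: `root_exception_type_name` is a hypothetical helper, and following `__context__` when no explicit `__cause__` is set is a guess at how implicit chaining is handled. The "Exception" fallback is taken from the PR description.

```python
from typing import Optional


def root_exception_type_name(exc: Optional[BaseException]) -> str:
    """Return the type name of the innermost (root-cause) exception."""
    seen = set()  # guards against cyclic exception chains
    root = exc
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        root = current
        if current.__cause__ is not None:
            # Explicit chaining: `raise X from Y`
            current = current.__cause__
        elif not current.__suppress_context__:
            # Implicit chaining: raised while handling another exception
            current = current.__context__
        else:
            current = None
    if root is None:
        return "Exception"  # fallback key named in the PR description
    return type(root).__name__


try:
    try:
        1 / 0
    except ZeroDivisionError as inner:
        raise RuntimeError("wrapped") from inner
except RuntimeError as outer:
    captured = outer

print(root_exception_type_name(captured))  # ZeroDivisionError
```

Keying the bucket on the root cause means a crash loop that wraps the same underlying error in different outer exceptions still drains a single bucket.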

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@hpouillot hpouillot merged commit b9f3208 into main Jun 15, 2026
31 checks passed
@hpouillot hpouillot deleted the feat/exception-bucketed-rate-limiter branch June 15, 2026 09:56

3 participants