
Replace custom Sentry HTTP client with standard sentry SDK#902

Open
svarlamov wants to merge 4 commits into main from devin/1774989709-sentry-sdk-migration

Conversation


svarlamov (Member) commented Mar 31, 2026

Summary

Replaces the hand-rolled SentryClient (manual DSN parsing, X-Sentry-Auth header construction, raw JSON payloads via minreq) with the official sentry Rust SDK (v0.37, ureq transport).

What changed:

  • Added sentry crate with backtrace, contexts, ureq, rustls features
  • Replaced SentryClient struct and its from_dsn/send_event methods with sentry::Client + sentry::Hub
  • Events are now constructed as proper sentry::protocol::Event structs instead of raw serde_json::Value blobs
  • Extracted helpers: resolve_enterprise_dsn, resolve_oss_dsn, build_sentry_client, base_sentry_tags, context_to_extra, parse_sentry_level, parse_timestamp, send_event_to_client
  • SentryClients struct holds long-lived Arc<sentry::Client> instances, created once in the flush loop and reused across all flush cycles
  • Original event timestamps are preserved (RFC 3339 → SystemTime) instead of relying on SDK auto-timestamping
  • Explicit sentry_clients.flush(5s) call after each batch ensures the async transport delivers events before the batch is considered complete
  • PostHog integration, metrics, CAS sync, and TelemetryEnvelope IPC are unchanged

Updates since last revision

  • Pre-epoch timestamp safety: parse_timestamp now guards against negative i64 values from dt.timestamp() — pre-epoch or malformed timestamps fall back to SystemTime::now() instead of panicking from unsigned wraparound.
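The guard described above can be sketched with plain std types. This is a hypothetical helper, not the PR's actual code: the real `parse_timestamp` works from a chrono `DateTime`, here reduced to the raw seconds/nanoseconds pair it produces.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Hypothetical stand-in for the guard in parse_timestamp: dt.timestamp()
// can return a negative i64 for pre-epoch dates, and casting that to u64
// would wrap to a huge value and panic in the SystemTime addition.
fn timestamp_to_system_time(secs: i64, subsec_nanos: u32) -> SystemTime {
    if secs >= 0 {
        UNIX_EPOCH + Duration::new(secs as u64, subsec_nanos)
    } else {
        // Pre-epoch or malformed: fall back to the current time.
        SystemTime::now()
    }
}
```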

Review & Testing Checklist for Human

  • Flush timeout vs. flush interval: sentry_clients.flush(Duration::from_secs(5)) blocks the spawn_blocking task for up to 5s if the network is slow. FLUSH_INTERVAL is 3s, so a slow flush could cause batches to accumulate. Additionally, the flush() return value (bool indicating whether all events were sent) is currently ignored — if the timeout is reached, events are silently dropped with no logging. Verify this is acceptable or add a debug_log on timeout.
  • Dependency footprint: The Cargo.lock gained ~1000 lines of new transitive deps including hyper, hyper-util, tokio-rustls, http, http-body, etc. Confirm this is acceptable for binary size and compile time, and that ureq transport is actually being used (not the default reqwest/curl transport pulling in hyper transitively via some other path).
  • Timestamp precision: parse_timestamp uses dt.timestamp() which truncates sub-second precision from the original RFC 3339 timestamps. The old code passed the full string through to Sentry's JSON API. Verify whether sub-second precision matters for your Sentry dashboards.
  • DSN changes at runtime: SentryClients is constructed once when the flush loop starts. If DSN configuration changes at runtime (e.g., config file update, env var change), the clients won't pick up the new DSNs until the daemon restarts. Confirm this matches the expected behavior.
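On the first checklist item, one way to stop dropping the `flush()` result silently is a thin wrapper that logs on timeout. This is a sketch against a stand-in closure rather than the real `sentry::Client`; the assumption (matching the SDK's documented signature) is that flush takes an `Option<Duration>` and returns `true` only when all queued events were delivered in time.

```rust
use std::time::Duration;

// Hypothetical wrapper: `flush` stands in for sentry::Client::flush.
// Surfacing a `false` return turns silent event loss into a log line.
fn flush_and_report(flush: impl Fn(Option<Duration>) -> bool) -> bool {
    let delivered = flush(Some(Duration::from_secs(5)));
    if !delivered {
        eprintln!("sentry flush timed out after 5s; events may have been dropped");
    }
    delivered
}
```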

Suggested test plan: Deploy to a test environment with valid OSS/Enterprise DSNs configured. Trigger error, performance, and message telemetry events, then verify they appear in both Sentry dashboards with correct tags, extra data, level, release version, and timestamps that match event creation time (not flush time).

Notes

  • minreq is still a dependency (used by PostHog and CAS flush paths).
  • The old SentryClient::send_event() was fully synchronous (blocking HTTP call per event, with status code check). The new code enqueues events via hub.capture_event() (non-blocking) then calls client.flush() once at the end of the batch — this should be more efficient when sending multiple events per cycle, but individual per-event error responses are no longer surfaced.
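The enqueue-then-flush pattern in the second note can be illustrated with a small std-only stand-in. `BatchSender` here is hypothetical, playing the role of `hub.capture_event` (non-blocking enqueue) plus `client.flush` (drain once per batch); the real SDK's transport worker does the sending on a background thread.

```rust
use std::sync::mpsc;

// Hypothetical model of the batch pattern: captures are non-blocking
// sends into a queue; a single flush at the end drains the whole batch.
struct BatchSender {
    tx: mpsc::Sender<String>,
    rx: mpsc::Receiver<String>,
}

impl BatchSender {
    fn new() -> Self {
        let (tx, rx) = mpsc::channel();
        BatchSender { tx, rx }
    }

    /// Non-blocking, like hub.capture_event: the caller never waits on I/O.
    fn capture(&self, event: &str) {
        let _ = self.tx.send(event.to_string());
    }

    /// Drains every enqueued event, like one client.flush per batch.
    fn flush(&self) -> Vec<String> {
        self.rx.try_iter().collect()
    }
}
```

The trade-off noted above falls out directly: per-event HTTP status codes disappear, because delivery happens once per batch instead of once per capture.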

Link to Devin session: https://app.devin.ai/sessions/32562d5f120d422fb3a6fa92035193f1
Requested by: @svarlamov



Since the project has moved to daemon mode with a long-lived process,
replace the hand-rolled SentryClient (manual DSN parsing, auth headers,
minreq HTTP POSTs) with the official sentry Rust SDK.

Changes:
- Add sentry crate (0.37) with ureq transport, backtrace, contexts
- Replace SentryClient struct with sentry::Client + sentry::Hub
- Build proper sentry::protocol::Event structs instead of raw JSON
- Use sentry::Hub::capture_event for dual DSN (OSS + Enterprise) support
- Extract DSN resolution, client building, and tag helpers into functions
- Keep PostHog, metrics, CAS, and TelemetryEnvelope IPC unchanged

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>


devin-ai-integration bot and others added 2 commits April 1, 2026 17:58
…icit flush

- Hoist sentry::Client instances to telemetry_flush_loop scope so they
  are created once and reused across all flush cycles (avoids spinning up
  transport worker threads every 3s).
- Preserve original event timestamps by parsing RFC 3339 strings from
  the TelemetryEnvelope into SystemTime, instead of relying on sentry
  SDK auto-timestamping at flush time.
- Add explicit sentry_clients.flush() after capturing events to ensure
  the async transport delivers all enqueued events before the batch is
  considered complete.
- Remove #[allow(dead_code)] from ErrorEvent/PerformanceEvent timestamp
  fields since they are now actively used.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

Guard against negative i64 from dt.timestamp() being cast to u64,
which would wrap to u64::MAX and panic in SystemTime addition.
Now falls back to SystemTime::now() for pre-epoch or unparseable
timestamps.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>

devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 11 additional findings in Devin Review.

Comment on lines +419 to +421
let secs = dt.timestamp();
if secs >= 0 {
Some(SystemTime::UNIX_EPOCH + std::time::Duration::from_secs(secs as u64))

🟡 parse_timestamp truncates sub-second precision from event timestamps

The parse_timestamp function uses dt.timestamp() which returns only whole seconds, discarding all sub-second precision from the RFC 3339 timestamp string. The old code passed the timestamp string directly into the JSON payload (e.g., "timestamp": error.timestamp), preserving full millisecond/nanosecond precision. Now every Sentry event timestamp is rounded down to the nearest second.

Fix: include subsecond nanoseconds

Use dt.timestamp_subsec_nanos() to reconstruct a full-precision SystemTime:

Some(SystemTime::UNIX_EPOCH + std::time::Duration::new(secs as u64, dt.timestamp_subsec_nanos()))
Suggested change:

      let secs = dt.timestamp();
      if secs >= 0 {
    -     Some(SystemTime::UNIX_EPOCH + std::time::Duration::from_secs(secs as u64))
    +     Some(SystemTime::UNIX_EPOCH + std::time::Duration::new(secs as u64, dt.timestamp_subsec_nanos()))

