fix(control-plane): surface upstream errors via respondWithSdk helper#1678
Open
offendingcommit wants to merge 3 commits into
Open
fix(control-plane): surface upstream errors via respondWithSdk helper#1678offendingcommit wants to merge 3 commits into
offendingcommit wants to merge 3 commits into
Conversation
Pre-commit hook auto-sync caught drift between hindsight-docs/ sources and the skills/hindsight-docs/ mirror. No content authored here.
Closes vectorize-io#1677. The SDK (@hey-api/client-fetch shape) returns `{data, error, response}` and does not throw on non-2xx upstream responses. Route handlers were doing `NextResponse.json(response.data, {status: 200})` without checking `response.error` first. When the upstream API 5xx'd, `response.data` was `undefined`, and Node's spec'd `Response.json(undefined)` threw `TypeError: Value is not JSON serializable`. The catch block logged that TypeError as if it were the failure, masking the real upstream error and hard-coding the response status to 500. Introduce `src/lib/sdk-response.ts::respondWithSdk(result, label, status?)` that: - Detects `result.error !== undefined || result.data === undefined` - Logs the upstream HTTP status + upstream error detail - Returns a NextResponse with the upstream status code (502 fallback when the SDK had no Response object — i.e. network-level failure) - Surfaces the upstream detail in the body as `{error, upstream: {status, detail}}` so the dashboard can show a useful message - On success, serializes `result.data` with the requested status (default 200; pass 201 for create endpoints) Refactor 17 SDK-backed route files to use the helper. Routes that parse a request body keep a minimal try/catch around `await request.json()` and return 400 on malformed JSON (a small UX improvement over the prior 500). Routes that use raw `fetch()` (documents PATCH, operations retry POST) and the observations route (which does post-fetch transformation of `response.data.items`) are left untouched — they don't exhibit the bug. Add vitest + 12 durable tests covering the helper (success path with custom status, failure pass-through for 500/503/429, body shape includes `upstream.detail`, regression assertion that NO TypeError escapes when data is undefined, default-502 for network-level failures with no Response object). Wire `npm test --workspace=hindsight-control-plane` into the existing `build-control-plane` and `build-hindsight-all` CI jobs so the helper stays load-bearing. Browser UX is unchanged on the happy path. On failures, operators now see the real upstream status code and error body in both logs and the response.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #1677.
Problem
Every SDK-backed route handler in
hindsight-control-planehad the same shape:The SDK (
@hey-api/client-fetch) returns{data, error, response}and does not throw on non-2xx. When the upstream API 5xx'd,response.datawasundefined, and Node'sResponse.json(undefined)throwsTypeError: Value is not JSON serializable. The catch block logged that TypeError as if it were the actual failure, hiding the real upstream error, and hard-coding the response status to500regardless of whether upstream said503,429,502, etc.Fix
New helper
src/lib/sdk-response.ts::respondWithSdk(result, label, status?):result.error !== undefined || result.data === undefinedNextResponsewith the upstream status code (502 fallback for network-level failures with no Response object){error: <label>, upstream: {status, detail}}— dashboards/clients now have the real upstream messageresult.datawith the requested status (default 200; pass 201 for create endpoints)17 SDK-backed routes refactored to use it. Routes that parse a request body keep a minimal try/catch around
await request.json()and now return 400 on malformed JSON (a small UX improvement over the prior 500).Out of scope (intentional)
documents/[documentId]/route.tsPATCH andbanks/[bankId]/operations/[operationId]/route.tsPOST use rawfetch()rather than the SDK — they don't exhibit the bug and are unchanged.banks/[bankId]/observations/route.tsdoes post-fetch transformation ofresponse.data.itemsand doesn't cleanly compose with the helper without restructuring — left for a separate PR.Tests
hindsight-control-plane/src/lib/sdk-response.test.ts— 12 vitest cases:upstream.detailin body, logs toconsole.error, falls back to 502 when noResponseobject (network-level), treatsundefined data + undefined erroras failure rather than 200ing withnullexpect(() => respondWithSdk(failingResult, label)).not.toThrow()— directly guards theResponse.json(undefined)TypeError that started this whole threadCI
Wired
npm test --workspace=hindsight-control-planeinto the existingbuild-control-planeandbuild-hindsight-alljobs in.github/workflows/test.yml. Tests run before the build step, so a regression breaks CI fast.Verification
Backwards compatibility
response.data.upstreamfield but keeps the same top-levelerrorstring, and the HTTP status code can now be the real upstream code (503, 429, 502) instead of always 500. Clients keying off literal status500would notice; the dashboard'sfetchApialready toasts on any 4xx/5xx so no UI changes are needed."<label>:"with{upstreamStatus, upstreamError}payload instead of an arbitraryErrorobject. Materially clearer.