feat(attachments): pre-submit cost preview for images and PDFs [GET-24] #56
…GET-24] The agent in lib/attachment-cost.ts and the drift tests in tests/unit/attachment-cost.test.ts treat every number in this spec as ground truth. Update this file in lockstep with the constants when Anthropic publishes a change. Allow docs/*-spec.md through the gitignore so feature specs (long-lived, tracked) are separate from audit reports (local, ignored).
New pure-function agent computes input-token cost for image and PDF attachments before send. Image cost mirrors Anthropic's published algorithm exactly (resize long edge to per-model cap, apply w*h/750, clamp to maxTokens). PDF cost is returned as Anthropic's published 1500-3000 per-page range; never collapsed to a midpoint. PDF page count comes from a focused regex over the file's head + tail windows. No heavy dependency: pdfjs-dist would have added ~600 KB gzipped and DOM-dependency friction for what is a one-shot read. Drift tests assert every value in Anthropic's published vision table verbatim for both Sonnet 4.6 and Opus 4.7. If Anthropic changes the formula or caps, those tests fail next CI run and force a docs review. Spec pinned in docs/attachment-cost-spec.md.
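The resize, divide, clamp sequence described in this commit can be sketched as follows. This is a minimal illustration, not the code in lib/attachment-cost.ts: the function name `estimateImageTokens`, the `ImageCap` type, and the rounding choices (`Math.round` for the resized edges, `Math.ceil` for the token count) are assumptions. The cap values are the ones quoted in the PR description.

```typescript
// Hypothetical sketch of the image-cost steps; names and rounding are assumptions.
interface ImageCap {
  maxLongPx: number; // longest edge allowed before the image is downscaled
  maxTokens: number; // hard ceiling on tokens charged per image
}

// Cap values as quoted in the PR description.
const OPUS_4_7_CAP: ImageCap = { maxLongPx: 2576, maxTokens: 4784 };
const DEFAULT_CAP: ImageCap = { maxLongPx: 1568, maxTokens: 1568 };

function estimateImageTokens(
  width: number,
  height: number,
  cap: ImageCap = DEFAULT_CAP,
): number {
  if (width <= 0 || height <= 0) return 0;

  // Step 1: scale down so the long edge fits the per-model cap (never scale up).
  const longEdge = Math.max(width, height);
  const scale = longEdge > cap.maxLongPx ? cap.maxLongPx / longEdge : 1;
  const w = Math.round(width * scale);
  const h = Math.round(height * scale);

  // Step 2: apply w * h / 750. Rounding up is an assumption of this sketch.
  const raw = Math.ceil((w * h) / 750);

  // Step 3: clamp to the per-model token ceiling.
  return Math.min(raw, cap.maxTokens);
}
```

Under these assumptions a 1568x1568 image on the default cap clamps to 1568 tokens, which lines up with the "+1.6k from image 1568x1568" breakdown example in the overlay commit.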
PreSubmitInput now carries optional attachmentTokensLow, attachmentTokensHigh, attachmentBreakdown, attachmentWarnings, hasUnknownImage, hasPdf. PreSubmitEstimate exposes both bounds for tokens and session %, separating text and attachment contributions. Gate: when text is below MIN_DRAFT_CHARS but attachments are present, still produce an estimate (the attachment alone is the user's draft). The DRAFT_ESTIMATE pre-send fallback path in inject.ts passes no attachment fields; behavior unchanged for that path. Warning fires on the LOW projection so a PDF range does not raise a false alarm whenever the high end happens to spike above 90 percent. Model comparison gates on the HIGH projection so PDF-heavy drafts still surface "switch models" advice when the worst case is large. Tests cover sums, range pass-through, attachments-only path, threshold gate, hasPdf and hasUnknownImage flags, and the warning asymmetry between LOW and HIGH bounds.
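The asymmetry between the two gates can be illustrated with a small sketch. The names `Projection`, `Gates`, and `gateOnProjection` are hypothetical, and the model-comparison threshold is passed in as a parameter because its value is not stated in this thread. The 90 percent warning threshold comes from the commit messages; treating it as inclusive is an assumption based on the later ">= 90 %" wording.

```typescript
// Hypothetical sketch of the LOW/HIGH gating; not the code in lib/pre-submit.ts.
interface Projection {
  sessionPctLow: number;  // projected session % using the low attachment bound
  sessionPctHigh: number; // projected session % using the high attachment bound
}

interface Gates {
  showWarning: boolean;
  showModelComparison: boolean;
}

const WARN_THRESHOLD_PCT = 90; // threshold quoted in the commit messages

function gateOnProjection(p: Projection, modelCompareThresholdPct: number): Gates {
  return {
    // Warning uses the LOW bound: a wide PDF range alone should not raise it.
    showWarning: p.sessionPctLow >= WARN_THRESHOLD_PCT,
    // Comparison uses the HIGH bound: the worst case is enough to suggest a switch.
    showModelComparison: p.sessionPctHigh > modelCompareThresholdPct,
  };
}
```

With a projection of 50 to 95 percent, this sketch shows the model comparison but suppresses the warning, which is the behavior the commit describes.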
The orchestrator now maintains an attachmentMap keyed by a stable file fingerprint. On file-input change events delegated at the form level, image dimensions are read from a transient blob URL (revoked on load) and PDF page counts from a head + tail byte window. Bytes never leave the browser. DOM mutations on the form drive the existing onComposeInput debounce. Reconciliation prunes map entries whose filename no longer appears in the form's textContent (the user removed the attachment via the UI). TOKEN_BATCH and SPA navigation clear the map. Bug fix: onComposeInput previously read textContent from the form parent, which inflated the char count by the length of every attached filename. Now reads only from composeBoxRef. Wires computeAttachmentCost into the pre-submit estimate; nothing else in the orchestrator changes.
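A minimal sketch of the fingerprint and pruning steps follows, assuming the fingerprint is built from name, size, and lastModified as the PR description states. The names `FileLike`, `TrackedAttachment`, `fingerprint`, and `pruneRemoved` are hypothetical, and the DOM is replaced by a plain string so the logic can run outside a browser.

```typescript
// Hypothetical sketch; the real orchestrator lives in entrypoints/claude-ai.content.ts.
interface FileLike {
  name: string;
  size: number;
  lastModified: number;
}

interface TrackedAttachment {
  filename: string;
}

// Stable key: the same file picked twice maps to the same entry.
function fingerprint(file: FileLike): string {
  return `${file.name}|${file.size}|${file.lastModified}`;
}

// Drops entries whose filename no longer appears in the visible text.
// Returns how many entries were removed.
function pruneRemoved(map: Map<string, TrackedAttachment>, visibleText: string): number {
  let removed = 0;
  map.forEach((tracked, key) => {
    if (!visibleText.includes(tracked.filename)) {
      map.delete(key); // deleting during Map.forEach is safe in JavaScript
      removed++;
    }
  });
  return removed;
}
```

The review later in this thread points out that a substring check like this keeps an entry alive if the user types the filename into the draft, so a production version would likely scope the search to the attachment cards.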
…-24]
Draft row now displays low-to-high token and session % ranges when a
PDF is attached, since Anthropic publishes PDF cost as a 1500-3000
per-page range. Per-attachment breakdown lines surface contributions
("+1.6k from image 1568x1568", "+12k to 24k from PDF 8 pages"). A
small italic disclosure renders only when a PDF is present, noting
that documents with charts may cost more (Anthropic does not publish
the per-page image overhead).
Hard warnings (PDF page caps exceeded) get their own row in the rust
accent so they read more urgently than the existing projection
warning. Unknown-cost images render with "?" instead of a token
figure; the image still appears in the breakdown so the user knows
it is attached.
Two new CSS classes: lco-draft-breakdown and lco-draft-disclosure.
The existing lco-draft-warning is unchanged; lco-draft-hard-warning
adds a higher-emphasis variant.
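The PDF range and the breakdown label quoted above can be reproduced with a short sketch. The per-page constants are the 1500 and 3000 figures from the commit message; the helper names `pdfTokenRange`, `formatK`, and `pdfBreakdownLabel`, and the "k" formatting rule (one decimal below 10k, whole thousands otherwise), are assumptions chosen to match the two example strings, not the overlay's actual formatter.

```typescript
// Hypothetical sketch; not the code in ui/overlay.ts or lib/attachment-cost.ts.
const PDF_TOKENS_PER_PAGE_LOW = 1500;
const PDF_TOKENS_PER_PAGE_HIGH = 3000;

interface TokenRange {
  low: number;
  high: number;
}

// The range is kept as two bounds and never collapsed to a midpoint.
function pdfTokenRange(pages: number): TokenRange {
  return {
    low: pages * PDF_TOKENS_PER_PAGE_LOW,
    high: pages * PDF_TOKENS_PER_PAGE_HIGH,
  };
}

// Assumed formatting rule: one decimal below 10k, whole thousands otherwise.
function formatK(tokens: number): string {
  const k = tokens / 1000;
  const text = k < 10 ? k.toFixed(1) : String(Math.round(k));
  return `${text.replace(/\.0$/, "")}k`;
}

function pdfBreakdownLabel(pages: number): string {
  const { low, high } = pdfTokenRange(pages);
  return `+${formatK(low)} to ${formatK(high)} from PDF ${pages} pages`;
}
```

For an 8-page PDF the range is 12,000 to 24,000 tokens, so the label reads "+12k to 24k from PDF 8 pages", the same string used as the example in this commit message.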
📝 Walkthrough
This PR implements deterministic token cost estimation for image and PDF attachments in the Claude AI compose interface. It introduces modules for computing image tokens (with per-model resolution capping) and PDF token ranges, integrates attachment cost tracking into the draft pre-submit flow with low/high projections, updates the overlay UI to display attachment breakdowns and warnings, and adds comprehensive test coverage and specification documentation.
Sequence Diagram(s)
sequenceDiagram
actor User
participant ContentScript as Content Script<br/>(entrypoints/claude-ai.content.ts)
participant AttachmentLib as Attachment Cost Lib<br/>(lib/attachment-cost.ts)
participant PDFLib as PDF Page Count Lib<br/>(lib/pdf-page-count.ts)
participant PreSubmitLib as Pre-submit Lib<br/>(lib/pre-submit.ts)
participant OverlayUI as Overlay UI<br/>(ui/overlay.ts)
User->>ContentScript: Add attachment (image/PDF)
ContentScript->>ContentScript: Extract file from input,<br/>track by fingerprint
alt Image Attachment
ContentScript->>ContentScript: Read dimensions via<br/>blob URL
else PDF Attachment
ContentScript->>PDFLib: countPdfPages(bytes)
PDFLib-->>ContentScript: page count or null
end
ContentScript->>AttachmentLib: computeAttachmentCost(descriptors, model)
AttachmentLib->>AttachmentLib: computeImageTokens() & computePdfTokenRange()<br/>for each attachment
AttachmentLib-->>ContentScript: {low, high, breakdown, warnings}
ContentScript->>PreSubmitLib: computePreSubmitEstimate({...draft, attachment costs})
PreSubmitLib->>PreSubmitLib: Compute low/high totals,<br/>session percentages,<br/>comparisons
PreSubmitLib-->>ContentScript: {estimatedTokens, estimatedTokensHigh,<br/>breakdown, warnings, hasPdf, ...}
ContentScript->>OverlayUI: Render draftEstimate
OverlayUI->>OverlayUI: Format "~X to Y tokens",<br/>display breakdown rows,<br/>show disclosure & warnings
OverlayUI-->>User: Display cost estimate
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
🧹 Nitpick comments (4)
ui/overlay.ts (1)
435-443: Consider a clearer separator when joining multiple hard warnings.
If the attachment agent ever emits two or more warnings (e.g., page-cap plus a future size-cap rule) and the strings don't terminate in punctuation, joining with a single space will run them together. Either ensure agent strings always end with `.` or use a more visible delimiter such as `' · '`. The same low-impact note applies to the breakdown line at line 408 if you anticipate multi-attachment drafts.
Optional tweak:
    - elDraftHardWarning.textContent = hard.join(' ');
    + elDraftHardWarning.textContent = hard.join(' · ');

tests/unit/pdf-page-count.test.ts (1)
79-86: Test name oversells what it covers.
The fixture contains a `/Type /Pages` root with `/Count 7`, so strategy 1 succeeds and strategy 2 (LEAF_PAGE) is never run. This test confirms the root path returns 7, not that the leaf regex avoids matching `/Pages`. To exercise the negative lookahead, drop the `/Count` so strategy 1 fails, e.g.:
Suggested addition:
    + it('LEAF_PAGE does not match /Type /Pages when fallback runs', () => {
    +   // No /Count anywhere -> strategy 1 fails, strategy 2 runs.
    +   const text = `%PDF-1.4
    +2 0 obj << /Type /Pages /Kids [3 0 R] >> endobj
    +%%EOF
    +`;
    +   expect(countPdfPages(pdfBytes(text))).toBeNull();
    + });

entrypoints/claude-ai.content.ts (1)
755-769: Image load can hang and leak the blob URL.
Neither `onload` nor `onerror` is guaranteed to fire (slow decode, abandoned tab, very large or corrupt image), and there is no timeout. The blob URL stays alive (and the promise never settles), keeping the file's memory pinned for the lifetime of the page. Add a timeout that revokes the URL and resolves `null`, or use `createImageBitmap(file)`, which gives you `width`/`height` directly and rejects on failure without needing a blob URL.
Proposed fix using createImageBitmap:
    - function readImageDimensions(file: File): Promise<{ width: number; height: number } | null> {
    -   return new Promise(resolve => {
    -     const url = URL.createObjectURL(file);
    -     const img = new Image();
    -     img.onload = () => {
    -       URL.revokeObjectURL(url);
    -       resolve({ width: img.naturalWidth, height: img.naturalHeight });
    -     };
    -     img.onerror = () => {
    -       URL.revokeObjectURL(url);
    -       resolve(null);
    -     };
    -     img.src = url;
    -   });
    - }
    + async function readImageDimensions(file: File): Promise<{ width: number; height: number } | null> {
    +   try {
    +     const bmp = await createImageBitmap(file);
    +     const dims = { width: bmp.width, height: bmp.height };
    +     bmp.close();
    +     return dims;
    +   } catch {
    +     return null;
    +   }
    + }

lib/pre-submit.ts (1)
205-219: Redundant Math.max on line 206.
`estimatedTokensHigh` is always `>= estimatedTokens` (same `textTokens` plus `attachmentTokensHigh >= attachmentTokensLow`), so `Math.max(estimatedTokens, estimatedTokensHigh)` simplifies to `estimatedTokensHigh`. Not a bug; the comment on lines 203-204 already states the intent. Consider dropping the `Math.max` for clarity, or add an invariant comment if the defensive check is intentional.
Optional simplification:
    - const compareTokens = Math.max(estimatedTokens, estimatedTokensHigh);
    - if (estimatedSessionPctHigh !== null && estimatedSessionPctHigh > MODEL_COMPARE_THRESHOLD_PCT && pctPerInputToken !== null) {
    + if (estimatedSessionPctHigh !== null && estimatedSessionPctHigh > MODEL_COMPARE_THRESHOLD_PCT && pctPerInputToken !== null) {
    @@
    - estimatedPct: compareTokens * rate,
    + estimatedPct: estimatedTokensHigh * rate,
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/attachment-cost-spec.md`:
- Around line 113-118: The markdown uses unlabeled fenced code blocks and
unescaped asterisks in math expressions causing MD040/MD037; update the fenced
blocks containing PDF_TOKENS_PER_PAGE_LOW and PDF_TOKENS_PER_PAGE_HIGH (and the
algorithm fence around the `maxLongPx, maxTokens =` snippet) to use a language
label like ```text, and wrap the inline math expressions (`low = N * 1500`,
`high = N * 3000`) in backticks so the `*` is not treated as emphasis; this
fixes the markdownlint warnings while keeping the displayed content identical.
In `@entrypoints/claude-ai.content.ts`:
- Around line 803-814: The filename-substring check is unreliable: instead of
scanning composeFormRef.textContent for tracked.filename, change the pruning
logic in the attachment reconciliation block (where composeFormRef,
attachmentMap and tracked.filename are used) to only search the attachment-card
subtree under composeFormRef (e.g., querySelectorAll for the attachment card
nodes or their filename elements) or, when available, compare tracked.filename
against the live input.files list; iterate the actual attachment-card nodes or
input.files to decide which keys to delete from attachmentMap so user-typed text
does not prevent removal.
In `@lib/pdf-page-count.ts`:
- Around line 22-23: The COUNT_THEN_TYPE_PAGES regex can produce cross-object
false positives; update the regexes so the inter-key window is much tighter
(e.g., use {0,100} instead of {0,8192}) and add a negative lookahead to forbid
the dictionary end token ">>" between the keys so both keys must be in the same
dictionary for TYPE_PAGES_THEN_COUNT and COUNT_THEN_TYPE_PAGES, and then change
the extraction logic to prefer matches from TYPE_PAGES_THEN_COUNT and only fall
back to COUNT_THEN_TYPE_PAGES if the forward-pattern produced zero matches; also
ensure the MAX_PLAUSIBLE_PAGES check remains and add a unit test covering a
font/different-dictionary `/Count` preceding the real page-tree to prevent
regression.
---
Nitpick comments:
In `@entrypoints/claude-ai.content.ts`:
- Around line 755-769: The readImageDimensions function can hang and leak the
blob URL because img.onload/img.onerror may never fire; change it to use
createImageBitmap(file) to get width/height and rely on its rejection for
failures, and if createImageBitmap is unavailable add a fallback that sets a
timeout to revoke the blob URL and resolve null; update the function
(readImageDimensions) to call createImageBitmap(file).then(bitmap => { const
dims = {width: bitmap.width, height: bitmap.height}; bitmap.close(); resolve
dims }).catch(() => resolve(null)), and ensure any created blob URL is revoked
in the fallback and the timeout clears on success to avoid memory leaks.
In `@lib/pre-submit.ts`:
- Around line 205-219: The code uses Math.max(estimatedTokens,
estimatedTokensHigh) to compute compareTokens even though estimatedTokensHigh is
always >= estimatedTokens; remove the redundant Math.max and set compareTokens =
estimatedTokensHigh (or replace usages of compareTokens with
estimatedTokensHigh) in the block that builds modelComparisons (symbols:
compareTokens, estimatedTokensHigh, estimatedTokens, modelComparisons,
pctPerInputToken, classifyModelTier, MODEL_COMPARE_THRESHOLD_PCT) to simplify
the logic; if you intended a defensive check instead, add a short invariant
comment above the assignment explaining that estimatedTokensHigh >=
estimatedTokens is guaranteed so Math.max is unnecessary.
In `@tests/unit/pdf-page-count.test.ts`:
- Around line 79-86: Rename this test to reflect that it verifies the root
/Count path (e.g., "returns /Count from Pages root") and do not claim it
exercises the leaf-page negative-lookahead; keep the fixture as-is and the
assertion calling countPdfPages(pdfBytes(text)) === 7. Then add a new test that
drops the "/Count 7" from the Pages root and includes explicit leaf page objects
(e.g., objects with "/Type /Page") so strategy 1 fails and strategy 2
(LEAF_PAGE) runs; in that new test call countPdfPages(pdfBytes(text)) and assert
the expected leaf-derived page count (e.g., 2). Reference countPdfPages and
pdfBytes when locating code to change or add tests.
In `@ui/overlay.ts`:
- Around line 435-443: The current join of attachment warnings uses a single
space which can run messages together; update the logic that sets
elDraftHardWarning.textContent (reading from
state.draftEstimate?.attachmentWarnings) to join with a clearer delimiter (for
example ' · ' or '\n') instead of ' ', and apply the same change to the similar
breakdown join logic used elsewhere (e.g., the breakdown line near where
attachment warnings are assembled) so multiple warnings remain readable even
when they lack terminal punctuation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0224510b-db2d-4dc2-82c9-452f3eae27e8
📒 Files selected for processing (12)
- .gitignore
- docs/attachment-cost-spec.md
- entrypoints/claude-ai.content.ts
- lib/attachment-cost.ts
- lib/pdf-page-count.ts
- lib/pre-submit.ts
- tests/audit/overlay-state-audit.test.ts
- tests/unit/attachment-cost.test.ts
- tests/unit/pdf-page-count.test.ts
- tests/unit/pre-submit.test.ts
- ui/overlay-styles.ts
- ui/overlay.ts
…ack [GET-24]
Claude.ai's modern compose container is not a <form> element, so the form-scoped change listener never fired. Move the listener to document.documentElement with capture so file-input change events are caught regardless of where the input sits relative to the editor.
Replace findFormParent with findComposeRegion: in addition to FORM and FIELDSET, accept any ancestor whose subtree contains an input[type=file], or the widest ancestor walked within 8 levels. composeFormRef is now populated for the modern DOM, which restores attachment-card removal detection via filename reconciliation.
When the PDF page-count parser returns null (encrypted, fully compressed, malformed), still register the attachment with pageCount: null. The agent emits an unknown-cost breakdown row instead of silently dropping the file, so the user sees that Saar tracked the upload.
The document-level listener is attached once at init and stays put across SPA navigations (documentElement is stable). fileChangeListenerAttached is no longer reset on navigation.
Tests: 2 new cases for the null-page-count agent path. 1633 passing.
…ew [GET-24]
The previous PR shipped attachment cost preview as a passive cost
display. Anthropic's own PDF guidance is more direct: "Dense PDFs can
fill the context window before reaching the page limit." Saar should
warn users about that risk before they hit send.
Three additions to the AI Usage Coach surface:
1. Context-window projection. PreSubmitInput now accepts
currentContextPct (passed by the orchestrator from state.contextPct).
PreSubmitEstimate exposes projectedContextPctLow / High and
contextWindowSize so the overlay can show "would use ~N% to N% of
1000k context" right under the token figure. The warning fires on
the HIGH projection at >= 90 % of context, the same threshold used
for session warnings.
2. Aggregate request-size warning. AttachmentDescriptor now carries
fileSize so the agent can sum bytes across attachments. Warns at
> 30 MB ("approaching") and > 32 MB ("exceeds") per Anthropic's
"Maximum request size: 32 MB".
3. Coaching copy mirrors Anthropic's published advice. "Split the
document into sections" comes verbatim from the PDF docs; the
agent surfaces it instead of inventing alternative phrasing.
Spec updated in lockstep: docs/attachment-cost-spec.md now lists the
five active warning thresholds, the trigger conditions, and the
Anthropic source for each. Drift tests pin every constant.
Tests: 11 new cases covering 491-page-PDF overrun, 200K-model overrun,
the 90-100% soft warning band, and aggregate file-size triggers at
both 30 MB and 32 MB. 1644 passing.
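The two numeric checks added in this commit can be sketched as below. The 30 MB and 32 MB thresholds and the 90 percent context threshold come from the commit message; the function names, the `SizeVerdict` type, and the choice of binary megabytes (1024 * 1024 bytes) are assumptions of this sketch, not confirmed details of lib/attachment-cost.ts or lib/pre-submit.ts.

```typescript
// Hypothetical sketch of the aggregate-size and context-window checks.
const MB = 1024 * 1024; // assumption: binary megabytes
const REQUEST_SIZE_WARN_BYTES = 30 * MB; // "approaching"
const REQUEST_SIZE_MAX_BYTES = 32 * MB;  // "exceeds"
const CONTEXT_WARN_PCT = 90;

type SizeVerdict = "ok" | "approaching" | "exceeds";

// Sums fileSize across attachments and classifies the total.
function classifyRequestSize(fileSizes: number[]): SizeVerdict {
  const total = fileSizes.reduce((sum, n) => sum + n, 0);
  if (total > REQUEST_SIZE_MAX_BYTES) return "exceeds";
  if (total > REQUEST_SIZE_WARN_BYTES) return "approaching";
  return "ok";
}

// Projects context usage for both bounds; the warning keys off the HIGH bound.
function projectContextPct(
  currentContextPct: number,
  tokensLow: number,
  tokensHigh: number,
  contextWindowSize: number,
): { low: number; high: number; warn: boolean } {
  const low = currentContextPct + (tokensLow * 100) / contextWindowSize;
  const high = currentContextPct + (tokensHigh * 100) / contextWindowSize;
  return { low, high, warn: high >= CONTEXT_WARN_PCT };
}
```

As a usage example, a draft at 10 percent of a 1,000,000-token window that adds 150,000 to 300,000 attachment tokens projects to 25 to 40 percent and does not warn, while the same attachment on a 200,000-token window would.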
Summary
Pre-submit cost preview now covers image and PDF attachments, the first Wave-1 feature that helps Free users see what an attachment will cost before they pay for it. Image cost is computed from Anthropic's published `width * height / 750` formula with per-model resize and token caps; PDF cost is reported as Anthropic's 1500-3000 tokens-per-page range, never collapsed to a midpoint, because Anthropic itself publishes a range. Drift tests pin every value in Anthropic's published vision table verbatim, so any future formula change forces a docs review on the next CI run.
Type of Change
`feat`: New feature
What Was Changed
New files
- `lib/attachment-cost.ts`: pure agent. `computeImageTokens(w, h, model)`, `computePdfTokenRange(pages)`, `computeAttachmentCost(attachments, model)`. Per-model resize-and-cap table for images (Opus 4.7: 2576 px / 4784 tokens; everything else: 1568 px / 1568 tokens). PDF page-cap warnings derived from `getContextWindowSize` so a new 1M-context model lands without a code edit.
- `lib/pdf-page-count.ts`: hand-rolled page-count parser. Scans the head + tail windows for the page-tree root's `/Count` entry; falls back to counting leaf `/Type /Page` objects. Returns `null` for encrypted, fully compressed, or malformed PDFs (overlay shows "?"). Avoids the ~600 KB pdfjs-dist payload.
- `tests/unit/attachment-cost.test.ts`: drift tests against Anthropic's published image table (Sonnet 4.6 and Opus 4.7), PDF page-cap warnings, mixed attachments, unknown-model fallback, edge cases.
- `tests/unit/pdf-page-count.test.ts`: page-tree extraction, intermediate-node max selection, leaf-page fallback, garbage / empty / implausible-count rejection.
- `docs/attachment-cost-spec.md`: single source of truth for the math. Verified against Anthropic docs on 2026-04-26. Updated in lockstep with the constants when Anthropic publishes a change.
Modified
- `lib/pre-submit.ts`: `PreSubmitInput` accepts optional `attachmentTokensLow / High`, `attachmentBreakdown`, `attachmentWarnings`, `hasUnknownImage`, `hasPdf`. `PreSubmitEstimate` exposes both bounds. Gate relaxes when attachments are present so attachments-only drafts produce an estimate. Warning fires on the LOW projection (no false alarms from PDF range spikes); model comparisons gate on the HIGH projection.
- `entrypoints/claude-ai.content.ts`: file-input `change` events delegated at the form level. Image dimensions read via blob URL (revoked synchronously). PDF bytes read locally via `File.slice` and parsed with `countPdfPages`. Attachment map keyed by name+size+lastModified; reconciled on DOM mutations. `TOKEN_BATCH` and SPA navigation clear the map. Bug fix: `onComposeInput` previously read `textContent` from the form parent, inflating the char count by every attached filename; now reads from the contenteditable only.
- `ui/overlay.ts` and `ui/overlay-styles.ts`: main draft row shows low-to-high range when a PDF is attached. New rows: per-attachment breakdown, italic PDF disclosure ("PDFs with charts or images may cost more"), and hard-warning row in rust accent for cap violations.
- `tests/unit/pre-submit.test.ts`: extended with attachment cases, range pass-through, threshold gate, warning asymmetry.
- `tests/audit/overlay-state-audit.test.ts`: updated to the new `PreSubmitEstimate` shape.
- `.gitignore`: `docs/*.md` is still ignored, but `docs/*-spec.md` is now tracked. Lets long-lived feature specs live alongside the local audit reports.
How to Test
- `bun run compile`: TypeScript clean.
- `bun run test`: 1631 passing (60 new).
- `bun run build`: `claude-ai.js` at 69 KB, under the 100 KB ceiling.
- `.output/chrome-mv3` and open claude.ai.
- `~Nk tokens` plus a breakdown line `+Nk from image WxH`. Send the message; the row clears.
- `~Nk to Nk tokens · ~N% to N% of session`, a breakdown line `+Nk to Nk from PDF N pages`, and the italic disclosure about charts and images.
- `N PDF pages exceeds the 100-page limit on this model.`
Checklist
- (`bun run test`)
- (`bun run compile`)
- (`bun run build`)
Related Issues
Closes GET-24.
Notes for Reviewer
Why a hand-rolled PDF parser instead of pdfjs-dist: the issue suggested pdfjs-dist with lazy load. pdfjs-dist is ~600 KB gzipped and brings DOM-dependency friction in MV3. We only need the page count, which lives in the page-tree root that sits in the textual portion of ~95 percent of standard PDFs. The 30-line regex parser ships zero dependencies and returns `null` cleanly when the PDF is encrypted or fully compressed (overlay falls back to "?"). Filed as a Wave-2 follow-up to swap to pdfjs-dist via an offscreen document if accuracy becomes a complaint.
Why PDF cost is reported as a range, not a midpoint: Anthropic publishes per-page cost as 1,500-3,000 tokens "depending on content density" plus an undisclosed per-page image-rendering overhead. Collapsing that to a single number would imply false precision. Inherent error band is ±33 percent from Anthropic's own publication; no formula will fix it. The overlay shows the range and the disclosure; users see range honesty.
Empirical calibration is filed for Wave 2. Calling Anthropic's `count_tokens` endpoint with the actual prompt + attachments would push our error to near zero. That requires API-key plumbing, a request-budget plan, and a privacy review; out of scope here.
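The two-strategy page count described in the reviewer notes (page-tree `/Count` first, leaf `/Type /Page` objects as a fallback) can be sketched roughly as follows. This is not the parser in lib/pdf-page-count.ts: the regexes, the 200-character window, the `MAX_PLAUSIBLE_PAGES` value, and the function names are all assumptions, and the sketch works on an already-decoded string rather than head + tail byte windows.

```typescript
// Hypothetical sketch of a regex-based PDF page count; all patterns are assumptions.
const MAX_PLAUSIBLE_PAGES = 100000; // assumed sanity ceiling

// Both keys must sit in the same dictionary: [^>] stops the scan at ">>".
const TYPE_THEN_COUNT = /\/Type\s*\/Pages\b[^>]{0,200}?\/Count\s+(\d+)/g;
const COUNT_THEN_TYPE = /\/Count\s+(\d+)[^>]{0,200}?\/Type\s*\/Pages\b/g;
// Leaf pages only: the lookahead rejects /Pages and any longer name.
const LEAF_PAGE = /\/Type\s*\/Page(?![A-Za-z])/g;

// Collects the first capture group of every match as a number.
function captures(re: RegExp, text: string): number[] {
  const found: number[] = [];
  re.lastIndex = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) found.push(Number(m[1]));
  return found;
}

function countPagesInText(text: string): number | null {
  // Strategy 1: take the largest plausible /Count among page-tree nodes,
  // since intermediate nodes carry smaller counts than the root.
  const counts = captures(TYPE_THEN_COUNT, text)
    .concat(captures(COUNT_THEN_TYPE, text))
    .filter(n => n > 0 && n <= MAX_PLAUSIBLE_PAGES);
  if (counts.length > 0) return Math.max(...counts);

  // Strategy 2: count leaf page objects directly.
  const leaves = (text.match(LEAF_PAGE) || []).length;
  return leaves > 0 ? leaves : null; // null means "unknown", shown as "?"
}
```

A sketch like this fails on dictionaries with nested `<< >>` entries between the two keys and on compressed object streams, which is consistent with the notes' point that the parser returns `null` rather than guessing.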