feat: add Anthropic (Claude) as a categorizer backend by fnziman · Pull Request #32 · eshaffer321/itemize

fnziman · 2026-06-17T00:06:15Z

Summary

Makes the LLM categorizer backend pluggable so itemize can use either OpenAI (existing) or Anthropic Claude. Motivated by the # CLAUDE_API_KEY=your-claude-api-key placeholder already sitting in .env.example — the integration point was anticipated; this wires it up.

Design

Architectural moves (no behavior change for OpenAI users):

Rename categorizer.OpenAIClient → categorizer.ChatClient. The contract is provider-agnostic; the name should be too.
Move the concrete OpenAI HTTP client out of internal/domain/categorizer/openai_client.go into a new internal/adapters/clients/openai/ package. Per AGENTS.md, the domain layer must stay pure (no HTTP/IO) — putting Anthropic alongside OpenAI in domain/ would compound an existing violation, while putting Anthropic in adapters/ and leaving OpenAI in domain/ would be inconsistent. Cleanest fix is to move OpenAI out too. Happy to drop this move if you'd rather keep the diff smaller — the rest of the PR still works with openai_client.go left where it is.
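
To illustrate why the rename is mechanical, here is a minimal sketch of a provider-agnostic contract with interchangeable backends. The method name `Complete`, the `chatRequest`/`chatResponse` types, and `fakeClient` are hypothetical stand-ins; the real `categorizer.ChatClient` signature is not shown in this description.

```go
package main

import (
	"context"
	"fmt"
)

// chatRequest and chatResponse are hypothetical; the PR's real types are not shown here.
type chatRequest struct {
	Prompt   string
	WantJSON bool
}

type chatResponse struct {
	Content string
}

// chatClient stands in for categorizer.ChatClient: a single provider-agnostic method.
type chatClient interface {
	Complete(ctx context.Context, req chatRequest) (chatResponse, error)
}

// fakeClient is a stub backend used only for this illustration.
type fakeClient struct {
	name string
}

func (f fakeClient) Complete(_ context.Context, req chatRequest) (chatResponse, error) {
	return chatResponse{Content: f.name + ": " + req.Prompt}, nil
}

// categorize depends only on the interface, so it never needs to know the provider.
func categorize(ctx context.Context, c chatClient, item string) (string, error) {
	resp, err := c.Complete(ctx, chatRequest{Prompt: item, WantJSON: true})
	if err != nil {
		return "", err
	}
	return resp.Content, nil
}

// demo runs the same categorize call against two different backends.
func demo() []string {
	var out []string
	for _, c := range []chatClient{fakeClient{name: "openai"}, fakeClient{name: "anthropic"}} {
		s, err := categorize(context.Background(), c, "milk")
		if err != nil {
			s = "error: " + err.Error()
		}
		out = append(out, s)
	}
	return out
}

func main() {
	for _, line := range demo() {
		fmt.Println(line)
	}
}
```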

New code:

internal/adapters/clients/anthropic/ — implements categorizer.ChatClient via Anthropic's Messages API. Raw net/http, mirroring the existing OpenAI client's shape (no new SDK dependency).
JSON forcing: Anthropic has no response_format: json_object. When the categorizer asks for JSON, the adapter appends an assistant message with content { to the request. Claude continues from that prefill and emits the rest of the JSON object; the adapter puts the leading { back and returns the result in the existing ChatCompletionResponse shape. No schema coupling, no tool definitions, no consumer changes.
AnthropicConfig and CategorizerConfig in internal/infrastructure/config/config.go.
newChatClient factory in internal/adapters/clients/clients.go. Selection rules:
- Explicit CATEGORIZER_PROVIDER=openai|anthropic wins.
- Otherwise auto-detect from which API key is set.
- Both keys set without explicit provider → default to OpenAI (preserves status quo) + warning log.
- Neither key set → fail fast with a clear error.
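
The JSON-forcing step described in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the adapter's actual code: the `message` type and the `withJSONPrefill` and `reassembleJSON` helpers are hypothetical names, and the continuation string in `main` is simulated rather than returned by the API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// message is a hypothetical stand-in for one chat turn in the request.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// jsonPrefill is the assistant-turn prefix that nudges the model to continue a JSON object.
const jsonPrefill = "{"

// withJSONPrefill returns a copy of msgs with an assistant turn containing "{" appended.
func withJSONPrefill(msgs []message) []message {
	out := make([]message, 0, len(msgs)+1)
	out = append(out, msgs...)
	return append(out, message{Role: "assistant", Content: jsonPrefill})
}

// reassembleJSON puts the prefilled brace back in front of the model's continuation
// and checks that the result is valid JSON before handing it to the caller.
func reassembleJSON(continuation string) (string, error) {
	full := jsonPrefill + continuation
	if !json.Valid([]byte(full)) {
		return "", fmt.Errorf("model output is not valid JSON: %q", full)
	}
	return full, nil
}

func main() {
	msgs := withJSONPrefill([]message{{Role: "user", Content: "Categorize: milk"}})
	fmt.Println(len(msgs), msgs[len(msgs)-1].Role)

	// Simulated continuation: the API reply contains only the text after the prefill.
	full, err := reassembleJSON(`"category": "Groceries"}`)
	fmt.Println(full, err)
}
```

Validating the reassembled string is an addition in this sketch; whether the adapter validates or leaves that to the categorizer's json.Unmarshal is not stated here.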

Defaults

Anthropic model: claude-haiku-4-5-20251001 — chosen for speed/cost parity with the existing gpt-5.4-nano default.
New env vars: ANTHROPIC_API_KEY (also CLAUDE_API_KEY), ANTHROPIC_MODEL, CATEGORIZER_PROVIDER. All optional; existing OpenAI-only setups continue to work unchanged.
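
A minimal sketch of how the key alias and model default could be resolved. The helper names (`firstEnv`, `loadAnthropic`, `anthropicSettings`) are hypothetical, and the order in which the two key variables are checked is an assumption; the description only says both are accepted.

```go
package main

import (
	"fmt"
	"os"
)

// defaultAnthropicModel is the default named in this PR.
const defaultAnthropicModel = "claude-haiku-4-5-20251001"

// anthropicSettings is a hypothetical, trimmed-down stand-in for AnthropicConfig.
type anthropicSettings struct {
	APIKey string
	Model  string
}

// firstEnv returns the first non-empty value among keys, or "" if none is set.
// lookup is injected so the logic can be exercised without touching the real environment.
func firstEnv(lookup func(string) string, keys ...string) string {
	for _, k := range keys {
		if v := lookup(k); v != "" {
			return v
		}
	}
	return ""
}

// loadAnthropic resolves the API key (assumption: ANTHROPIC_API_KEY is checked
// before CLAUDE_API_KEY) and falls back to the default model when ANTHROPIC_MODEL is unset.
func loadAnthropic(lookup func(string) string) anthropicSettings {
	s := anthropicSettings{
		APIKey: firstEnv(lookup, "ANTHROPIC_API_KEY", "CLAUDE_API_KEY"),
		Model:  firstEnv(lookup, "ANTHROPIC_MODEL"),
	}
	if s.Model == "" {
		s.Model = defaultAnthropicModel
	}
	return s
}

func main() {
	s := loadAnthropic(os.Getenv)
	fmt.Printf("key set: %t, model: %s\n", s.APIKey != "", s.Model)
}
```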

Backwards compatibility

OPENAI_API_KEY / OPENAI_APIKEY / OPENAI_MODEL paths are untouched.
Existing config.yaml files without anthropic / categorizer blocks still work.
The categorizer.Categorizer constructor and CategorizeItems signatures are unchanged.

Tests

Package	Coverage	Notes
`internal/adapters/clients/anthropic`	90.2%	mock HTTP server: success / JSON prefill+reassembly / structured 4xx / opaque 5xx / empty content
`internal/adapters/clients/openai`	80.8%	parity tests for the moved code, same shape as anthropic
`internal/adapters/clients`	69.7%	every branch of `newChatClient`: explicit / auto-detect / precedence / missing-key / unknown provider
`internal/domain/categorizer`	95.1%	unchanged — mock renamed `MockOpenAIClient` → `MockChatClient`

go test ./... -race passes; golangci-lint run clean for files in this PR (two pre-existing QF1012 warnings in categorizer.go:257,262 are present on main already and untouched here).
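
For readers unfamiliar with the mock-server style used in these tests, the following is a rough sketch using net/http/httptest. It is not the PR's test code: `firstText` and `newMockServer` are hypothetical helpers, and the response struct models only the `content` array of a Messages API reply.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// messagesResponse models only the part of a Messages API reply this sketch reads.
type messagesResponse struct {
	Content []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"content"`
}

// firstText posts body to baseURL+"/v1/messages" and returns the first content block's text.
func firstText(baseURL, apiKey, body string) (string, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/messages", strings.NewReader(body))
	if err != nil {
		return "", err
	}
	req.Header.Set("x-api-key", apiKey)
	req.Header.Set("anthropic-version", "2023-06-01")
	req.Header.Set("content-type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("anthropic: status %d: %s", resp.StatusCode, raw)
	}
	var parsed messagesResponse
	if err := json.Unmarshal(raw, &parsed); err != nil {
		return "", err
	}
	if len(parsed.Content) == 0 {
		return "", fmt.Errorf("anthropic: empty content")
	}
	return parsed.Content[0].Text, nil
}

// newMockServer starts a local server that always replies with the given status and body.
func newMockServer(status int, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
		io.WriteString(w, body)
	}))
}

func main() {
	ok := newMockServer(200, `{"content":[{"type":"text","text":"hello"}]}`)
	defer ok.Close()
	text, err := firstText(ok.URL, "test-key", `{}`)
	fmt.Println(text, err)

	bad := newMockServer(500, "upstream exploded")
	defer bad.Close()
	_, err = firstText(bad.URL, "test-key", `{}`)
	fmt.Println(err)
}
```

Pointing the client at the test server's URL keeps the success, 5xx, and empty-content cases runnable without an account or network access.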

Test plan

go test ./... -race -cover — all green
go vet ./... — clean
golangci-lint run ./... — clean for this PR's files
go build ./... — clean
Manual smoke A — OpenAI path unchanged: OPENAI_API_KEY=... ./itemize walmart -dry-run -days 14 -verbose
Manual smoke B — Claude path live: ANTHROPIC_API_KEY=... CATEGORIZER_PROVIDER=anthropic ./itemize walmart -dry-run -days 14 -verbose
Manual smoke C — both keys set with no CATEGORIZER_PROVIDER → picks OpenAI + logs warning
Manual smoke D — neither key set → fails fast with clear error

(Live smokes A and B require accounts; happy to run them and report back if useful, or leave them to your discretion.)

Out of scope (intentionally not included)

Ollama backend: the OLLAMA_ENDPOINT line in .env.example is preserved but not wired. Separate work.
Switch to the official anthropic-sdk-go: kept raw net/http to mirror the OpenAI client and avoid adding a dependency. Easy to swap later.
Streaming responses: not needed for categorization.
Per-backend prompt tuning: the current prompt is generic enough; if Claude needs adjustments after live testing, that can go in a follow-up.

Notes for the reviewer

If you'd prefer the rename/move dropped: revert the move portion (keep openai_client.go in domain/) and update the import in clients.go. The Anthropic addition still stands on its own.
If claude-haiku-4-5-20251001 isn't the model you'd default to, it's a one-line change in LoadFromEnv().
If you want different selection rules (e.g., prefer Anthropic when both keys are set, or always fail when both are set without an explicit provider), that's a swap of one switch branch in newChatClient.
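
To make the "one switch branch" point concrete, here is a sketch of the selection rules as a single switch. It is an approximation, not the body of newChatClient: `pickProvider` is a hypothetical name, it returns a provider string rather than a client, and treating an explicit provider with a missing key as an error is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	providerOpenAI    = "openai"
	providerAnthropic = "anthropic"
)

// pickProvider applies the selection rules from the PR description.
// ambiguous is true when both keys are set and no provider was named,
// which is the case where the PR logs a warning.
func pickProvider(explicit, openAIKey, anthropicKey string) (provider string, ambiguous bool, err error) {
	switch {
	case explicit == providerOpenAI || explicit == providerAnthropic:
		// Assumption: naming a provider whose key is missing is an error.
		key := openAIKey
		if explicit == providerAnthropic {
			key = anthropicKey
		}
		if key == "" {
			return "", false, fmt.Errorf("CATEGORIZER_PROVIDER=%s but its API key is not set", explicit)
		}
		return explicit, false, nil
	case explicit != "":
		return "", false, fmt.Errorf("unknown CATEGORIZER_PROVIDER %q", explicit)
	case openAIKey != "" && anthropicKey != "":
		// This is the branch to change if a different tie-break is preferred.
		return providerOpenAI, true, nil
	case openAIKey != "":
		return providerOpenAI, false, nil
	case anthropicKey != "":
		return providerAnthropic, false, nil
	default:
		return "", false, errors.New("no LLM API key set: provide OPENAI_API_KEY or ANTHROPIC_API_KEY")
	}
}

func main() {
	p, warn, err := pickProvider("", "sk-openai", "sk-ant")
	fmt.Println(p, warn, err)
}
```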

The categorizer already depended on a chat-completion interface (misleadingly named OpenAIClient); only the concrete OpenAI HTTP implementation was hardcoded. This makes the backend pluggable so users can pick OpenAI or Claude via config or env vars, mirroring the placeholder already present in .env.example.

Architectural moves (mechanical, no behavior change):
* Rename categorizer.OpenAIClient -> categorizer.ChatClient (the contract is provider-agnostic, the name should be too).
* Move the concrete OpenAI HTTP client out of internal/domain/categorizer/ into a new internal/adapters/clients/openai/ package, renaming RealOpenAIClient -> openai.Client. Per AGENTS.md the domain layer must stay pure; putting Anthropic alongside OpenAI in domain/ would compound the existing violation.

New code:
* internal/adapters/clients/anthropic/ implements categorizer.ChatClient via the Messages API. When the categorizer asks for JSON (response_format json_object), the adapter prefills the assistant turn with "{" so Claude emits guaranteed-shape JSON the categorizer can json.Unmarshal unchanged.
* AnthropicConfig and CategorizerConfig structs in config.go.
* newChatClient factory in clients.go: explicit CATEGORIZER_PROVIDER wins; otherwise auto-detect from whichever key is set; if both are set, OpenAI is preferred with a warn log (preserves status quo).

Defaults:
* Anthropic model: claude-haiku-4-5-20251001 (speed/cost parity with the existing gpt-5.4-nano default).
* Env vars: ANTHROPIC_API_KEY (CLAUDE_API_KEY also accepted), ANTHROPIC_MODEL, CATEGORIZER_PROVIDER. Existing OPENAI_* setups continue to work unchanged.

Tests:
* anthropic/client_test.go: 90% coverage. Mock HTTP server, success / prefill+JSON / structured error / opaque 5xx / empty content.
* openai/client_test.go: 81% coverage. Parity tests for the moved code.
* clients_test.go: every selection branch (explicit/auto/precedence/error).
* categorizer tests pass unchanged (mock renamed MockOpenAIClient -> MockChatClient).

Docs:
* README: "Choosing an LLM backend" subsection.
* AGENTS.md: env vars + architecture note that the categorizer is pluggable and where the concrete backends live.
* config.yaml + .env.example: anthropic / categorizer blocks added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-06-20T20:43:33Z

Codecov Report

❌ Patch coverage is 85.81560% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.07%. Comparing base (e64d568) to head (bd552e1).
⚠️ Report is 2 commits behind head on main.

Files with missing lines	Patch %	Lines
internal/adapters/clients/anthropic/client.go	85.91%	5 Missing and 5 partials ⚠️
internal/adapters/clients/clients.go	86.95%	6 Missing ⚠️
internal/domain/categorizer/categorizer.go	71.42%	2 Missing ⚠️
internal/infrastructure/config/config.go	83.33%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
+ Coverage   56.78%   58.07%   +1.29%     
==========================================
  Files          43       44       +1     
  Lines        4857     4995     +138     
==========================================
+ Hits         2758     2901     +143     
+ Misses       1934     1918      -16     
- Partials      165      176      +11

Flag	Coverage Δ
unittests	`58.07% <85.81%> (+1.29%)`	⬆️

Files with missing lines	Coverage Δ
internal/adapters/clients/openai/client.go	`73.68% <100.00%> (ø)`
internal/domain/categorizer/categorizer.go	`93.65% <71.42%> (ø)`
internal/infrastructure/config/config.go	`79.01% <83.33%> (+1.07%)`	⬆️
internal/adapters/clients/clients.go	`72.72% <86.95%> (+72.72%)`	⬆️
internal/adapters/clients/anthropic/client.go	`85.91% <85.91%> (ø)`

... and 1 file with indirect coverage changes

fnziman and others added 2 commits June 16, 2026 20:05

fix: default Anthropic categorizer model

bd552e1

eshaffer321 merged commit 517b0e2 into eshaffer321:main Jun 20, 2026
14 checks passed

eshaffer321 merged 2 commits into eshaffer321:main from fnziman:feat/llm-pluggable-categorizer