feat: add Anthropic (Claude) as a categorizer backend #32

Merged: eshaffer321 merged 2 commits into eshaffer321:main from fnziman:feat/llm-pluggable-categorizer on Jun 20, 2026

fnziman commented Jun 17, 2026
Contributor

Summary

Makes the LLM categorizer backend pluggable so itemize can use either OpenAI (existing) or Anthropic Claude. Motivated by the # CLAUDE_API_KEY=your-claude-api-key placeholder already sitting in .env.example — the integration point was anticipated; this wires it up.

Design

Architectural moves (no behavior change for OpenAI users):

  • Rename categorizer.OpenAIClient → categorizer.ChatClient. The contract is provider-agnostic; the name should be too.
  • Move the concrete OpenAI HTTP client out of internal/domain/categorizer/openai_client.go into a new internal/adapters/clients/openai/ package. Per AGENTS.md, the domain layer must stay pure (no HTTP/IO) — putting Anthropic alongside OpenAI in domain/ would compound an existing violation, while putting Anthropic in adapters/ and leaving OpenAI in domain/ would be inconsistent. Cleanest fix is to move OpenAI out too. Happy to drop this move if you'd rather keep the diff smaller — the rest of the PR still works with openai_client.go left where it is.
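
A minimal sketch of what the renamed, provider-agnostic contract could look like. Only the names ChatClient and ChatCompletionResponse come from this PR; the method name, its signature, the ChatRequest type, and the stub backend are assumptions made for illustration and may not match the real code.

```go
package main

import (
	"context"
	"fmt"
)

// ChatRequest is a hypothetical, simplified request type.
// WantJSON stands in for OpenAI's response_format: json_object.
type ChatRequest struct {
	System   string
	User     string
	WantJSON bool
}

// ChatCompletionResponse is reduced to a single field here; the real
// struct in the codebase is not shown in the PR description.
type ChatCompletionResponse struct {
	Content string
}

// ChatClient is the provider-agnostic contract. The method name and
// signature are assumptions for this sketch.
type ChatClient interface {
	CreateChatCompletion(ctx context.Context, req ChatRequest) (*ChatCompletionResponse, error)
}

// stubClient is a fake backend that always returns a canned reply.
type stubClient struct{ reply string }

func (s stubClient) CreateChatCompletion(_ context.Context, _ ChatRequest) (*ChatCompletionResponse, error) {
	return &ChatCompletionResponse{Content: s.reply}, nil
}

// categorize depends only on the interface, so any backend
// (OpenAI, Anthropic, or a test stub) can be passed in.
func categorize(ctx context.Context, c ChatClient, item string) (string, error) {
	resp, err := c.CreateChatCompletion(ctx, ChatRequest{User: item, WantJSON: true})
	if err != nil {
		return "", fmt.Errorf("categorize %q: %w", item, err)
	}
	return resp.Content, nil
}

// demo wires the stub backend into categorize.
func demo() (string, error) {
	return categorize(context.Background(), stubClient{reply: `{"category":"Groceries"}`}, "milk")
}

func main() {
	out, err := demo()
	fmt.Println(out, err)
}
```

Because the domain code sees only the interface, the concrete HTTP clients can live under adapters/ without the categorizer importing them.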

New code:

  • internal/adapters/clients/anthropic/ — implements categorizer.ChatClient via Anthropic's Messages API. Raw net/http, mirroring the existing OpenAI client's shape (no new SDK dependency).
  • JSON forcing: Anthropic has no response_format: json_object. When the categorizer asks for JSON, the adapter appends an assistant message with content { to the request — Claude continues from the prefill and emits a JSON object, which the adapter reassembles into the existing ChatCompletionResponse shape. No schema coupling, no tool definitions, no consumer changes.
  • AnthropicConfig and CategorizerConfig in internal/infrastructure/config/config.go.
  • newChatClient factory in internal/adapters/clients/clients.go. Selection rules:
    • Explicit CATEGORIZER_PROVIDER=openai|anthropic wins.
    • Otherwise auto-detect from which API key is set.
    • Both keys set without explicit provider → default to OpenAI (preserves status quo) + warning log.
    • Neither key set → fail fast with a clear error.
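
The selection rules above can be sketched as a small pure function. This is not the actual newChatClient; the function name selectProvider, its parameters, the error messages, and the choice to error when an explicit provider has no key are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// selectProvider returns the backend name to use and whether a warning
// should be logged. explicit is the value of CATEGORIZER_PROVIDER ("" if unset).
func selectProvider(explicit, openAIKey, anthropicKey string) (provider string, warn bool, err error) {
	// Rule 1: an explicit provider wins.
	switch explicit {
	case "openai":
		if openAIKey == "" {
			return "", false, errors.New("CATEGORIZER_PROVIDER=openai but no OpenAI key is set")
		}
		return "openai", false, nil
	case "anthropic":
		if anthropicKey == "" {
			return "", false, errors.New("CATEGORIZER_PROVIDER=anthropic but no Anthropic key is set")
		}
		return "anthropic", false, nil
	case "":
		// No explicit choice: fall through to auto-detection below.
	default:
		return "", false, fmt.Errorf("unknown CATEGORIZER_PROVIDER %q", explicit)
	}

	switch {
	case openAIKey != "" && anthropicKey != "":
		// Rule 3: both keys set, keep the status quo and ask the caller to warn.
		return "openai", true, nil
	case openAIKey != "":
		// Rule 2: auto-detect from the only key present.
		return "openai", false, nil
	case anthropicKey != "":
		return "anthropic", false, nil
	default:
		// Rule 4: fail fast.
		return "", false, errors.New("no LLM API key set: provide an OpenAI or Anthropic key")
	}
}

func main() {
	p, warn, err := selectProvider("", "sk-openai", "sk-anthropic")
	fmt.Println(p, warn, err)
}
```

Keeping the decision separate from client construction makes each branch easy to cover with table-driven tests.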

Defaults

  • Anthropic model: claude-haiku-4-5-20251001 — chosen for speed/cost parity with the existing gpt-5.4-nano default.
  • New env vars: ANTHROPIC_API_KEY (also CLAUDE_API_KEY), ANTHROPIC_MODEL, CATEGORIZER_PROVIDER. All optional; existing OpenAI-only setups continue to work unchanged.
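
As an illustration of how the env vars and default could be resolved, here is a hedged sketch. The env var names and the default model come from this PR; the AnthropicConfig fields, the helper names, and the precedence of ANTHROPIC_API_KEY over CLAUDE_API_KEY are assumptions and may differ from the real LoadFromEnv().

```go
package main

import (
	"fmt"
	"os"
)

// Default model named in the PR description.
const defaultAnthropicModel = "claude-haiku-4-5-20251001"

// AnthropicConfig fields are hypothetical; the real struct is not shown in the PR.
type AnthropicConfig struct {
	APIKey string
	Model  string
}

// firstEnv returns the value of the first variable in names that is non-empty.
// getenv is injected so the lookup can be tested without touching the process env.
func firstEnv(getenv func(string) string, names ...string) string {
	for _, n := range names {
		if v := getenv(n); v != "" {
			return v
		}
	}
	return ""
}

// loadAnthropicConfig reads the key (with the CLAUDE_API_KEY alias) and
// applies the default model unless ANTHROPIC_MODEL overrides it.
func loadAnthropicConfig(getenv func(string) string) AnthropicConfig {
	cfg := AnthropicConfig{
		// Assumed precedence: ANTHROPIC_API_KEY first, then the alias.
		APIKey: firstEnv(getenv, "ANTHROPIC_API_KEY", "CLAUDE_API_KEY"),
		Model:  defaultAnthropicModel,
	}
	if m := getenv("ANTHROPIC_MODEL"); m != "" {
		cfg.Model = m
	}
	return cfg
}

func main() {
	cfg := loadAnthropicConfig(os.Getenv)
	fmt.Printf("model=%s key set=%t\n", cfg.Model, cfg.APIKey != "")
}
```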

Backwards compatibility

  • OPENAI_API_KEY / OPENAI_APIKEY / OPENAI_MODEL paths are untouched.
  • Existing config.yaml files without anthropic / categorizer blocks still work.
  • The categorizer.Categorizer constructor and CategorizeItems signatures are unchanged.

Tests

| Package | Coverage | Notes |
| --- | --- | --- |
| internal/adapters/clients/anthropic | 90.2% | mock HTTP server: success / JSON prefill+reassembly / structured 4xx / opaque 5xx / empty content |
| internal/adapters/clients/openai | 80.8% | parity tests for the moved code, same shape as anthropic |
| internal/adapters/clients | 69.7% | every branch of newChatClient: explicit / auto-detect / precedence / missing-key / unknown provider |
| internal/domain/categorizer | 95.1% | unchanged; mock renamed MockOpenAIClient → MockChatClient |
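
For readers unfamiliar with the mock-HTTP-server style of test mentioned in the table, the sketch below shows the general pattern with net/http/httptest, covering a success, a structured 4xx, and an opaque 5xx. The helper names call and newMock and the error message formats are hypothetical; this is not the PR's test code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// apiError models an Anthropic-style error body:
// {"type":"error","error":{"type":"...","message":"..."}}
type apiError struct {
	Error struct {
		Type    string `json:"type"`
		Message string `json:"message"`
	} `json:"error"`
}

// call posts an empty JSON object to url and maps non-2xx replies to errors.
// Structured bodies yield a detailed error; anything else is passed through raw.
func call(url string) (string, error) {
	resp, err := http.Post(url, "application/json", strings.NewReader("{}"))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode >= 200 && resp.StatusCode < 300 {
		return string(body), nil
	}
	var ae apiError
	if json.Unmarshal(body, &ae) == nil && ae.Error.Message != "" {
		return "", fmt.Errorf("api error %d (%s): %s", resp.StatusCode, ae.Error.Type, ae.Error.Message)
	}
	return "", fmt.Errorf("api error %d: %s", resp.StatusCode, body)
}

// newMock starts a local server that always replies with status and body.
func newMock(status int, body string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(status)
		io.WriteString(w, body)
	}))
}

func main() {
	srv := newMock(502, "upstream down")
	defer srv.Close()
	_, err := call(srv.URL)
	fmt.Println(err)
}
```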

go test ./... -race passes; golangci-lint run clean for files in this PR (two pre-existing QF1012 warnings in categorizer.go:257,262 are present on main already and untouched here).

Test plan

  • go test ./... -race -cover — all green
  • go vet ./... — clean
  • golangci-lint run ./... — clean for this PR's files
  • go build ./... — clean
  • Manual smoke A — OpenAI path unchanged: OPENAI_API_KEY=... ./itemize walmart -dry-run -days 14 -verbose
  • Manual smoke B — Claude path live: ANTHROPIC_API_KEY=... CATEGORIZER_PROVIDER=anthropic ./itemize walmart -dry-run -days 14 -verbose
  • Manual smoke C — both keys set with no CATEGORIZER_PROVIDER → picks OpenAI + logs warning
  • Manual smoke D — neither key set → fails fast with clear error

(Live smokes A and B require accounts; happy to run them and report back if useful, or leave them to your discretion.)

Out of scope (intentionally not included)

  • Ollama backend — the OLLAMA_ENDPOINT line in .env.example is preserved but not wired. Separate work.
  • Switch to the official anthropic-sdk-go — kept raw net/http to mirror the OpenAI client and avoid adding a dependency. Easy to swap later.
  • Streaming responses — not needed for categorization.
  • Per-backend prompt tuning: the current prompt is generic enough; if Claude needs adjustments after live testing, that can go in a follow-up.

Notes for the reviewer

  • If you'd prefer the rename/move dropped: revert the move portion (keep openai_client.go in domain/) and update the import in clients.go. The Anthropic addition still stands on its own.
  • If claude-haiku-4-5-20251001 isn't the model you'd default to, it's an easy one-line change in LoadFromEnv().
  • If you want different selection rules (e.g., prefer Anthropic when both keys are set, or always fail when both are set without an explicit provider), that is a swap of one switch branch in newChatClient.

fnziman and others added 2 commits June 16, 2026 20:05
The categorizer already depended on a chat-completion interface
(misleadingly named OpenAIClient); only the concrete OpenAI HTTP
implementation was hardcoded. This makes the backend pluggable so
users can pick OpenAI or Claude via config or env vars, mirroring the
placeholder already present in .env.example.

Architectural moves (mechanical, no behavior change):

* Rename categorizer.OpenAIClient -> categorizer.ChatClient (the
  contract is provider-agnostic, the name should be too).
* Move the concrete OpenAI HTTP client out of internal/domain/categorizer/
  into a new internal/adapters/clients/openai/ package, renaming
  RealOpenAIClient -> openai.Client. Per AGENTS.md the domain layer
  must stay pure; putting Anthropic alongside OpenAI in domain/ would
  compound the existing violation.

New code:

* internal/adapters/clients/anthropic/ implements categorizer.ChatClient
  via the Messages API. When the categorizer asks for JSON
  (response_format json_object), the adapter prefills the assistant
  turn with "{" so Claude emits guaranteed-shape JSON the categorizer
  can json.Unmarshal unchanged.
* AnthropicConfig and CategorizerConfig structs in config.go.
* newChatClient factory in clients.go: explicit CATEGORIZER_PROVIDER
  wins; otherwise auto-detect from whichever key is set; if both are
  set, OpenAI is preferred with a warn log (preserves status quo).
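
The assistant-prefill approach described in the first bullet can be sketched as follows. The request and response shapes follow the public Messages API (model, max_tokens, system, messages; content blocks with type and text), but the function names, the max_tokens value, and the json.Valid check are assumptions added for illustration, not the adapter's actual code.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type messagesRequest struct {
	Model     string    `json:"model"`
	MaxTokens int       `json:"max_tokens"`
	System    string    `json:"system,omitempty"`
	Messages  []message `json:"messages"`
}

type messagesResponse struct {
	Content []struct {
		Type string `json:"type"`
		Text string `json:"text"`
	} `json:"content"`
}

// jsonPrefill is the partial assistant turn the model is asked to continue.
const jsonPrefill = "{"

// buildRequest appends the prefill as a trailing assistant message when JSON is wanted.
func buildRequest(model, system, user string, wantJSON bool) messagesRequest {
	req := messagesRequest{
		Model:     model,
		MaxTokens: 1024, // arbitrary value for the sketch
		System:    system,
		Messages:  []message{{Role: "user", Content: user}},
	}
	if wantJSON {
		req.Messages = append(req.Messages, message{Role: "assistant", Content: jsonPrefill})
	}
	return req
}

// parseResponse decodes a raw Messages API response body.
func parseResponse(raw string) (messagesResponse, error) {
	var resp messagesResponse
	err := json.Unmarshal([]byte(raw), &resp)
	return resp, err
}

// reassemble joins the text blocks and, for JSON requests, puts back the
// prefill that the model continued from but did not repeat.
func reassemble(resp messagesResponse, wantJSON bool) (string, error) {
	var text string
	for _, b := range resp.Content {
		if b.Type == "text" {
			text += b.Text
		}
	}
	if text == "" {
		return "", errors.New("empty content")
	}
	if wantJSON {
		text = jsonPrefill + text
		if !json.Valid([]byte(text)) { // extra safety check, an assumption of this sketch
			return "", fmt.Errorf("reassembled text is not valid JSON: %q", text)
		}
	}
	return text, nil
}

func main() {
	req := buildRequest("claude-haiku-4-5-20251001", "You categorize items.", "milk", true)
	body, _ := json.Marshal(req)
	fmt.Println(string(body))
}
```

The reassembled string can then be placed in the existing response shape so the categorizer's json.Unmarshal path stays unchanged.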

Defaults:

* Anthropic model: claude-haiku-4-5-20251001 (speed/cost parity with
  the existing gpt-5.4-nano default).
* Env vars: ANTHROPIC_API_KEY (CLAUDE_API_KEY also accepted),
  ANTHROPIC_MODEL, CATEGORIZER_PROVIDER. Existing OPENAI_* setups
  continue to work unchanged.

Tests:

* anthropic/client_test.go: 90% coverage. Mock HTTP server,
  success / prefill+JSON / structured error / opaque 5xx / empty content.
* openai/client_test.go: 81% coverage. Parity tests for the moved code.
* clients_test.go: every selection branch (explicit/auto/precedence/error).
* categorizer tests pass unchanged (mock renamed
  MockOpenAIClient -> MockChatClient).

Docs:

* README: "Choosing an LLM backend" subsection.
* AGENTS.md: env vars + architecture note that the categorizer is
  pluggable and where the concrete backends live.
* config.yaml + .env.example: anthropic / categorizer blocks added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov Bot commented Jun 20, 2026

Codecov Report

❌ Patch coverage is 85.81560% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.07%. Comparing base (e64d568) to head (bd552e1).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| internal/adapters/clients/anthropic/client.go | 85.91% | 5 Missing and 5 partials ⚠️ |
| internal/adapters/clients/clients.go | 86.95% | 6 Missing ⚠️ |
| internal/domain/categorizer/categorizer.go | 71.42% | 2 Missing ⚠️ |
| internal/infrastructure/config/config.go | 83.33% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #32      +/-   ##
==========================================
+ Coverage   56.78%   58.07%   +1.29%     
==========================================
  Files          43       44       +1     
  Lines        4857     4995     +138     
==========================================
+ Hits         2758     2901     +143     
+ Misses       1934     1918      -16     
- Partials      165      176      +11     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 58.07% <85.81%> (+1.29%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| internal/adapters/clients/openai/client.go | 73.68% <100.00%> (ø) |
| internal/domain/categorizer/categorizer.go | 93.65% <71.42%> (ø) |
| internal/infrastructure/config/config.go | 79.01% <83.33%> (+1.07%) ⬆️ |
| internal/adapters/clients/clients.go | 72.72% <86.95%> (+72.72%) ⬆️ |
| internal/adapters/clients/anthropic/client.go | 85.91% <85.91%> (ø) |

... and 1 file with indirect coverage changes

@eshaffer321 eshaffer321 merged commit 517b0e2 into eshaffer321:main Jun 20, 2026
14 checks passed