Description
The Docker dev stack always starts a bundled ollama container, and the app
service hard-depends on it (depends_on: ollama -> service_healthy). Consequences:
- The whole stack refuses to come up unless Ollama is healthy, even though only
/forms/fill actually needs the LLM.
- There's no first-class way to run a host-native Ollama (Metal/GPU on
macOS) — the bundled container can't use the Mac GPU.
- No supported "run without Ollama" path for working on non-LLM features
(templates, weather/zipcode, Whisper transcription).
Proposed change
Make Ollama a swappable backend with two supported modes:
Bundled (default) — make up / make fireform
- Ollama runs in Docker, gated behind a
bundled-ollama compose profile.
Native — make up-native
- Compose runs without the profile, so no Ollama container; the app reaches a
host-native Ollama via host.docker.internal:11434.
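The two modes differ only in whether the compose profile is enabled. A minimal Python sketch of the command each Makefile target might end up running, assuming the targets shell out to `docker compose` against docker/dev/compose.yml; `compose_command` is a hypothetical helper for illustration, not something that exists in the repo:

```python
from typing import List

COMPOSE_FILE = "docker/dev/compose.yml"  # path named in this issue
BUNDLED_PROFILE = "bundled-ollama"       # profile name proposed in this issue


def compose_command(mode: str) -> List[str]:
    """Return the `docker compose` argv for a mode (hypothetical helper)."""
    base = ["docker", "compose", "-f", COMPOSE_FILE]
    if mode == "bundled":
        # Enabling the profile is what brings the ollama service into the stack.
        return base + ["--profile", BUNDLED_PROFILE, "up"]
    if mode == "native":
        # No profile: the profiled ollama service is skipped entirely.
        return base + ["up"]
    raise ValueError(f"unknown mode: {mode!r}")
```

Keeping the profile on the default path means make up behaves as it does today, while native mode is simply the same command without the profile.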
Concretely:
- docker/dev/compose.yml: gate ollama behind profiles: ["bundled-ollama"];
drop app's hard depends_on: ollama (app/services/llm.py already retries
the connection); add extra_hosts: "host.docker.internal:host-gateway".
- Makefile: add up-native target + COMPOSE_NATIVE (omits the profile);
default COMPOSE keeps the profile so existing behavior is unchanged.
- docker/.env.example: document bundled vs native OLLAMA_HOST.
- README.md: update run-from-source instructions.
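Dropping the hard depends_on is only safe because the app tolerates Ollama being unavailable at startup. The sketch below illustrates that kind of retry loop in general terms. It is not the code in app/services/llm.py; `wait_for_ollama`, its parameters, and the backoff schedule are all assumptions made for illustration:

```python
import time
from typing import Callable


def wait_for_ollama(
    probe: Callable[[], bool],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() until it succeeds or the attempts run out.

    probe is any zero-argument callable that returns True once Ollama
    answers (for example an HTTP GET against OLLAMA_HOST). Connection
    errors are treated as "not ready yet" rather than as fatal.
    """
    for attempt in range(attempts):
        try:
            if probe():
                return True
        except OSError:
            pass  # connection refused or DNS failure: not ready yet
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False
```

With this shape, only the code path that needs the LLM (here /forms/fill) has to wait or fail; the rest of the stack can start regardless of Ollama's state.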
Open question / follow-up
make up-native currently only prints a reminder to set
OLLAMA_HOST=http://host.docker.internal:11434 in .env.dev. If that edit is
missed, the app silently falls back to the container address and fills fail —
the same "fails silently depending on which path you follow" footgun as #382.
Consider having up-native export OLLAMA_HOST for its compose invocation so
native mode works without hand-editing.
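One way to picture the suggested fix: the native invocation carries OLLAMA_HOST itself instead of relying on a hand-edited .env.dev. A minimal Python sketch, assuming compose.yml passes OLLAMA_HOST from the invoking environment through to the app service (if the value only comes from an env_file, an exported variable would not reach the container); `native_env` is a hypothetical helper:

```python
import os
from typing import Dict, Mapping, Optional

# Address given in this issue for reaching a host-native Ollama.
NATIVE_OLLAMA_HOST = "http://host.docker.internal:11434"


def native_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with OLLAMA_HOST forced for native mode."""
    env = dict(os.environ if base is None else base)
    # Always override: a stale container address is exactly the silent
    # failure this follow-up is trying to avoid.
    env["OLLAMA_HOST"] = NATIVE_OLLAMA_HOST
    return env
```

The result could be passed as `env=native_env()` to `subprocess.run` when launching compose; the Makefile equivalent would be to set the variable inline on the up-native command.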
Acceptance criteria
- make up starts the full stack incl. bundled Ollama (unchanged behavior)
- make up-native starts everything except Ollama and talks to a host-native instance
- .env.example and README.md document both modes
Related: #382