[ENHANCEMENT]: Decouple Ollama from the Docker stack — opt-in bundled container + native-host mode

## Description

The Docker dev stack always starts a bundled `ollama` container, and the `app`
service hard-depends on it (`depends_on: ollama -> service_healthy`). Consequences:

- The whole stack refuses to come up unless Ollama is healthy, even though only
  `/forms/fill` actually needs the LLM.
- There's no first-class way to run a **host-native** Ollama (Metal/GPU on
  macOS) — the bundled container can't use the Mac GPU.
- There's no supported "run without Ollama" path for working on non-LLM features
  (templates, weather/zipcode, Whisper transcription).

## Proposed change

Make Ollama a swappable backend with two supported modes:

**Bundled (default)** — `make up` / `make fireform`
- Ollama runs in Docker, gated behind a `bundled-ollama` compose profile.

**Native** — `make up-native`
- Compose runs without the profile, so no Ollama container; the app reaches a
  host-native Ollama via `host.docker.internal:11434`.

Concretely:
- `docker/dev/compose.yml`: gate `ollama` behind `profiles: ["bundled-ollama"]`;
  drop `app`'s hard `depends_on: ollama` (`app/services/llm.py` already retries
  the connection); add `extra_hosts: "host.docker.internal:host-gateway"`.
- `Makefile`: add `up-native` target + `COMPOSE_NATIVE` (omits the profile);
  default `COMPOSE` keeps the profile so existing behavior is unchanged.
- `docker/.env.example`: document bundled vs native `OLLAMA_HOST`.
- `README.md`: update run-from-source instructions.
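
As a rough sketch of the `compose.yml` part of the list above (not the actual diff): only the keys named in this issue are shown, and everything else about the two services is assumed to stay as it is today.

```yaml
# docker/dev/compose.yml (illustrative fragment, not the full file)
services:
  app:
    # ...existing app settings stay unchanged (assumed)...
    # The hard `depends_on: ollama` is dropped, so `app` can start
    # even when no Ollama container exists.
    extra_hosts:
      # Makes host.docker.internal resolve to the host gateway; Docker
      # Desktop already provides this name, but plain Linux engines need
      # the mapping.
      - "host.docker.internal:host-gateway"

  ollama:
    # ...existing ollama settings stay unchanged (assumed)...
    # Only started when the profile is explicitly enabled.
    profiles: ["bundled-ollama"]
```

With this gating, `docker compose --profile bundled-ollama up` would start the bundled container, while a plain `docker compose up` would skip it, which is the difference `COMPOSE` and `COMPOSE_NATIVE` are meant to encode.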

## Open question / follow-up

`make up-native` currently only **prints a reminder** to set
`OLLAMA_HOST=http://host.docker.internal:11434` in `.env.dev`. If that edit is
missed, the app silently falls back to the container address and `/forms/fill`
requests fail. This is the same "fails silently depending on which path you
follow" footgun as #382. Consider having `up-native` export `OLLAMA_HOST` for
its compose invocation so native mode works without hand-editing.
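
One possible shape for this, sketched under assumptions: the app reads `OLLAMA_HOST` from its container environment, and the bundled address is `http://ollama:11434` (a guess based on the service name and Ollama's default port; substitute whatever the container address really is).

```yaml
# docker/dev/compose.yml (hypothetical fragment)
services:
  app:
    environment:
      # Compose interpolates this when it parses the file. If OLLAMA_HOST
      # is unset or empty in the environment Compose runs in, the bundled
      # default after ":-" is used (assumed address, see above).
      OLLAMA_HOST: ${OLLAMA_HOST:-http://ollama:11434}
```

`up-native` could then set `OLLAMA_HOST=http://host.docker.internal:11434` in the environment of its own compose invocation. Values under `environment:` take precedence over those loaded through `env_file:`, so a stale entry in `.env.dev` would not override it in that setup.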

## Acceptance criteria

- [ ] `make up` starts the full stack, including bundled Ollama (unchanged behavior)
- [ ] `make up-native` starts everything except Ollama and talks to a host-native instance
- [ ] Stack starts cleanly when Ollama is absent; only LLM-dependent endpoints degrade
- [ ] `.env.example` and `README.md` document both modes
- [ ] No regression to weather / zipcode / Whisper / template flows

Related: #382

