
fix(docker): resolve 8 Docker deployment bugs#1926

Open

mrveiss wants to merge 8 commits into Dev_new_gui from fix/docker-bugs-batch
Conversation


mrveiss (Owner) commented Mar 18, 2026

Summary

Batch fix for 8 Docker deployment bugs discovered during the #1809 review, ordered from quick wins to most complex.

Files changed

  • docker-compose.yml — ChromaDB depends_on, OLLAMA_HOST fix, LlamaIndex vars, Celery worker service
  • docker/.env.docker — OLLAMA_HOST hostname fix, LlamaIndex endpoint vars
  • docker/backend/Dockerfile — Copy shared config files
  • docker/frontend/Dockerfile — Clear Vite build args for proxy mode
  • docker/slm/Dockerfile — Add git, migration entrypoint, healthcheck tuning
  • docker/slm/entrypoint.sh — New: runs migrations before app start

Test plan

  • docker compose config validates without errors
  • docker compose build succeeds for all services
  • docker compose up -d starts all services including worker
  • Backend connects to ChromaDB without startup errors
  • LLM/Ollama URLs are well-formed (no http://http:// prefix)
  • RAG/knowledge operations reach Ollama container
  • SLM tables exist after first boot
  • Frontend API calls work through nginx proxy (no direct container hostname)
  • Celery worker processes async tasks from Redis queues

Closes #1910, closes #1908, closes #1909, closes #1877, closes #1895, closes #1879, closes #1892, closes #1893

mrveiss added 8 commits March 18, 2026 23:20

The backend connects to ChromaDB for RAG/vector operations but only declared depends_on for Redis. Without an explicit dependency, a ChromaDB startup race causes connection errors on first boot.
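The fix amounts to a small docker-compose addition; a minimal sketch, assuming the service names autobot-backend, autobot-redis, and autobot-chromadb (the actual names in docker-compose.yml may differ):

```yaml
services:
  autobot-backend:
    depends_on:
      autobot-redis:
        condition: service_started
      autobot-chromadb:
        condition: service_healthy  # wait for ChromaDB's healthcheck before starting the backend
```

Note that condition: service_healthy only works if the ChromaDB service defines a healthcheck; a bare depends_on list merely orders container startup and does not wait for readiness.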

SSOT config expects AUTOBOT_OLLAMA_HOST as a bare hostname (e.g.
"autobot-ollama") but docker-compose and .env.docker set it to a full
URL ("http://autobot-ollama:11434"). Code that constructs URLs from
the host field produces malformed "http://http://..." URLs.

Fix: set HOST to hostname only, add separate OLLAMA_PORT and
OLLAMA_ENDPOINT vars for direct URL use.
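The malformation is easy to reproduce in shell; a hypothetical illustration of what happens when code builds URLs as "http://$HOST:$PORT" but the host variable already contains a scheme and port:

```shell
BAD_HOST="http://autobot-ollama:11434"   # old value: already a full URL
GOOD_HOST="autobot-ollama"               # fixed value: bare hostname only
OLLAMA_PORT=11434

# URL construction from the host field, as described in the commit message
echo "http://${BAD_HOST}:${OLLAMA_PORT}"   # -> http://http://autobot-ollama:11434:11434 (malformed)
echo "http://${GOOD_HOST}:${OLLAMA_PORT}"  # -> http://autobot-ollama:11434
```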

AUTOBOT_LLAMAINDEX_LLM_ENDPOINT and AUTOBOT_LLAMAINDEX_EMBEDDING_ENDPOINT
default to http://127.0.0.1:11434 in SSOT config. In Docker, localhost is
the container itself — Ollama is on the Docker network at autobot-ollama.
Without these vars, all knowledge/RAG operations fail.
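In .env.docker terms, the override is two lines pointing both LlamaIndex endpoints at the Ollama container (values assembled from the hostname and port described above):

```
AUTOBOT_LLAMAINDEX_LLM_ENDPOINT=http://autobot-ollama:11434
AUTOBOT_LLAMAINDEX_EMBEDDING_ENDPOINT=http://autobot-ollama:11434
```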

The SLM git_tracker service errors every 5 minutes because git is not installed in the Docker image. Adding git to the runtime dependency package list (~50 MB) eliminates the log noise.
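As a sketch, assuming the SLM image is Debian-based (an Alpine base would use apk add --no-cache git instead):

```dockerfile
# Install git so the git_tracker service can shell out to it; adds ~50 MB
RUN apt-get update \
    && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
```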

The frontend Dockerfile hardcoded VITE_BACKEND_HOST=autobot-backend, a Docker-internal hostname unreachable from browsers. The frontend already supports proxy mode (an empty host yields relative URLs through nginx), so clearing the build args lets nginx handle API routing.
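A minimal Dockerfile sketch of the change; VITE_BACKEND_HOST comes from the PR, while the build command is an assumption about how the frontend image is built:

```dockerfile
# Empty host makes Vite bake relative URLs into the bundle,
# so the browser calls the same origin and nginx proxies to the backend
ARG VITE_BACKEND_HOST=""
ENV VITE_BACKEND_HOST=${VITE_BACKEND_HOST}
RUN npm run build
```

Because Vite inlines VITE_* variables at build time, the value must be cleared before the build step; changing it at container runtime has no effect on an already-built bundle.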

The backend logs warnings about missing llm_models.yaml and permission_rules.yaml on Docker startup. These files live in autobot-infrastructure/shared/config/, which isn't copied into the backend image. Add COPY instructions to place them in the backend config directory.
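A sketch of the COPY instructions, assuming the build context includes autobot-infrastructure/ and that the backend reads config from /app/config (both paths are assumptions; docker/backend/Dockerfile is authoritative):

```dockerfile
# Place the shared config files where the backend expects them (paths assumed)
COPY autobot-infrastructure/shared/config/llm_models.yaml /app/config/llm_models.yaml
COPY autobot-infrastructure/shared/config/permission_rules.yaml /app/config/permission_rules.yaml
```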

docker-compose.yml had no Celery worker, so async tasks (Ansible deploys, code indexing, background jobs) queued in Redis but were never executed.

Adds autobot-worker service using the backend image with celery worker
command, listening on all defined queues (celery, deployments,
provisioning, services) with concurrency=2.
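In docker-compose terms, the new service could look like the sketch below; the Celery app module (app.celery_app), image name, and Redis service name are assumptions, while the queue list and concurrency come from the PR description:

```yaml
  autobot-worker:
    image: autobot-backend           # reuses the backend image (name assumed)
    command: >
      celery -A app.celery_app worker
      -Q celery,deployments,provisioning,services
      --concurrency=2
    depends_on:
      - autobot-redis                # tasks are consumed from Redis queues
```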

On first boot, SLM API calls fail with 500 errors because tables
don't exist yet. While the SLM lifespan already runs migrations,
an explicit entrypoint ensures migrations complete before uvicorn
starts accepting connections.

Also increases start_period from 30s to 60s and retries from 3 to 5
to accommodate first-boot migration time.
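The entrypoint could be sketched as follows; the migration command (alembic upgrade head) and the uvicorn module path are assumptions, since the PR only specifies the migrate-then-serve ordering:

```shell
#!/bin/sh
set -e

# Run database migrations before the API starts accepting connections
alembic upgrade head

# exec replaces the shell so signals reach uvicorn directly (module path assumed)
exec uvicorn slm.main:app --host 0.0.0.0 --port 8000
```

Using exec for the final command matters in Docker: it makes uvicorn PID 1 so that docker stop delivers SIGTERM to the server rather than to an intermediate shell.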