fix(docker): resolve 8 Docker deployment bugs #1926
Open
mrveiss wants to merge 8 commits into Dev_new_gui from
Conversation
Backend connects to ChromaDB for RAG/vector operations but only had depends_on for Redis. Without this, ChromaDB startup race causes connection errors on first boot.
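The added dependency might look like the following compose fragment (a minimal sketch; the backend service name and the existing Redis entry are assumed from context, not copied from the actual file):

```yaml
services:
  autobot-backend:
    depends_on:
      - autobot-redis      # already present
      - autobot-chromadb   # new: start ChromaDB before the backend
```

Note that plain `depends_on` only orders container startup; if `autobot-chromadb` defines a healthcheck, the `condition: service_healthy` form would additionally wait for readiness rather than just container start.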
SSOT config expects AUTOBOT_OLLAMA_HOST as a bare hostname (e.g. "autobot-ollama"), but docker-compose and .env.docker set it to a full URL ("http://autobot-ollama:11434"). Code that constructs URLs from the host field then produces malformed "http://http://..." URLs. Fix: set the host to the hostname only, and add separate OLLAMA_PORT and OLLAMA_ENDPOINT vars for direct URL use.
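Concretely, the `.env.docker` change could look like this (a sketch; the exact spelling of the new port/endpoint variable names is an assumption based on the description above):

```dotenv
# Before (code that prepends a scheme produces http://http://...):
# AUTOBOT_OLLAMA_HOST=http://autobot-ollama:11434

# After: hostname only, plus explicit vars for direct URL use
AUTOBOT_OLLAMA_HOST=autobot-ollama
OLLAMA_PORT=11434
OLLAMA_ENDPOINT=http://autobot-ollama:11434
```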
AUTOBOT_LLAMAINDEX_LLM_ENDPOINT and AUTOBOT_LLAMAINDEX_EMBEDDING_ENDPOINT default to http://127.0.0.1:11434 in SSOT config. In Docker, localhost is the container itself — Ollama is on the Docker network at autobot-ollama. Without these vars, all knowledge/RAG operations fail.
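Based on the description, the added environment entries would point both endpoints at the Ollama service on the Docker network (values assembled from the hostname and port named above):

```dotenv
# Override the SSOT defaults of http://127.0.0.1:11434, which inside a
# container resolve to the container itself rather than the Ollama service
AUTOBOT_LLAMAINDEX_LLM_ENDPOINT=http://autobot-ollama:11434
AUTOBOT_LLAMAINDEX_EMBEDDING_ENDPOINT=http://autobot-ollama:11434
```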
SLM git_tracker service errors every 5 minutes because git is not installed in the Docker image. Adding git to the runtime deps package list (~50MB) eliminates the log noise.
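A sketch of the Dockerfile change, assuming a Debian-based image (the ~50MB figure is consistent with apt; an Alpine image would use `apk add --no-cache git` instead):

```dockerfile
# Add git to the runtime dependency layer so the git_tracker service
# stops failing every 5 minutes
RUN apt-get update \
 && apt-get install -y --no-install-recommends git \
 && rm -rf /var/lib/apt/lists/*
```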
Frontend Dockerfile hardcoded VITE_BACKEND_HOST=autobot-backend which is a Docker-internal hostname unreachable from browsers. The frontend already supports proxy mode (empty host = relative URLs through nginx), so clearing the build args lets nginx handle API routing.
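The frontend Dockerfile change might be as simple as the following sketch (the exact `ARG`/`ENV` arrangement is an assumption; only the variable name comes from the description):

```dockerfile
# Before: ARG VITE_BACKEND_HOST=autobot-backend  (Docker-internal, unreachable from browsers)
# After: leave the build arg empty so the built frontend emits relative URLs
# and nginx proxies API requests to the backend
ARG VITE_BACKEND_HOST=
ENV VITE_BACKEND_HOST=${VITE_BACKEND_HOST}
```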
Backend logs warnings about missing llm_models.yaml and permission_rules.yaml on Docker startup. These files live in autobot-infrastructure/shared/config/ which isn't copied into the backend image. Add COPY instructions to place them in the backend config directory.
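The added COPY instructions could look like this sketch (the destination directory `/app/config/` is an assumption; the source paths come from the description):

```dockerfile
# Place the shared config files where the backend expects them,
# silencing the missing-file warnings on startup
COPY autobot-infrastructure/shared/config/llm_models.yaml /app/config/llm_models.yaml
COPY autobot-infrastructure/shared/config/permission_rules.yaml /app/config/permission_rules.yaml
```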
docker-compose.yml had no Celery worker — async tasks (Ansible deploys, code indexing, background jobs) queued in Redis but never executed. Adds autobot-worker service using the backend image with celery worker command, listening on all defined queues (celery, deployments, provisioning, services) with concurrency=2.
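A sketch of the new worker service, assuming the backend image name and the Celery application module path (neither is stated in the description; the queue names and concurrency are):

```yaml
autobot-worker:
  # Reuses the backend image so the worker sees the same code and deps
  image: autobot-backend
  command: >
    celery -A app.celery_app worker
    --loglevel=info
    --concurrency=2
    -Q celery,deployments,provisioning,services
  depends_on:
    - autobot-redis
```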
On first boot, SLM API calls fail with 500 errors because tables don't exist yet. While the SLM lifespan already runs migrations, an explicit entrypoint ensures migrations complete before uvicorn starts accepting connections. Also increases start_period from 30s to 60s and retries from 3 to 5 to accommodate first-boot migration time.
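A minimal sketch of `docker/slm/entrypoint.sh`, assuming Alembic migrations and a uvicorn app target (both are assumptions; the PR only states that migrations run before uvicorn accepts connections):

```shell
#!/bin/sh
# Run migrations to completion before the API starts accepting connections,
# so first-boot requests never hit missing tables
set -e
alembic upgrade head
exec uvicorn slm.main:app --host 0.0.0.0 --port 8000
```

The healthcheck tuning (start_period 30s → 60s, retries 3 → 5) complements this: the container is given enough grace time for first-boot migrations before it is marked unhealthy.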
This was referenced Mar 18, 2026
Summary
Batch fix for 8 Docker deployment bugs discovered during #1809 review. Ordered from quick wins to most complex:
- Added `depends_on: autobot-chromadb` to backend (prevents RAG startup race)
- Changed `AUTOBOT_OLLAMA_HOST` from full URL to hostname-only (prevents malformed `http://http://...` URLs)
- Added `AUTOBOT_LLAMAINDEX_LLM_ENDPOINT` and `AUTOBOT_LLAMAINDEX_EMBEDDING_ENDPOINT` (RAG defaults to localhost without these)
- Added `git` package to SLM Docker image (eliminates git_tracker log spam)
- Cleared Vite build args in the frontend Dockerfile so nginx proxy mode handles API routing
- Copied `llm_models.yaml` and `permission_rules.yaml` into backend image (fixes missing config warnings)
- Added `autobot-worker` Celery service to docker-compose (async tasks now actually execute)
- Added SLM migration entrypoint so migrations complete before uvicorn starts

Files changed
- `docker-compose.yml`: ChromaDB depends_on, OLLAMA_HOST fix, LlamaIndex vars, Celery worker service
- `docker/.env.docker`: OLLAMA_HOST hostname fix, LlamaIndex endpoint vars
- `docker/backend/Dockerfile`: Copy shared config files
- `docker/frontend/Dockerfile`: Clear Vite build args for proxy mode
- `docker/slm/Dockerfile`: Add git, migration entrypoint, healthcheck tuning
- `docker/slm/entrypoint.sh`: New: runs migrations before app start

Test plan
- `docker compose config` validates without errors
- `docker compose build` succeeds for all services
- `docker compose up -d` starts all services, including the worker
- Ollama URLs are well-formed (no `http://http://` prefix)

Closes #1910, closes #1908, closes #1909, closes #1877, closes #1895, closes #1879, closes #1892, closes #1893