fix(setup): remote Ollama support + dogfood readiness — closes #76#104
fix(setup): remote Ollama support + dogfood readiness — closes #76#104
Conversation
- setup: add Remote Ollama option on headless Linux (prompts for OLLAMA_HOST instead of defaulting to API-drivers-only) - setup: offer Goose install on all platforms — it works headlessly and pairs with remote Ollama for server deployments - serve: mustOllama() gives a useful error for remote endpoints (no longer says "run ollama serve" for non-localhost hosts) - scheduler: add inference: remote to ServeConfig — skips RAM-based concurrency detection, defaults to max_parallel=4 (GPU-bound) - agents.yaml: document inference: remote option as a comment - main.go: version sentinel changed from stale "0.4.8" to "dev" (goreleaser ldflags still inject the real version at release time) - 5 new scheduler tests (inference field parsing, New() behavior) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR improves ShellForge’s headless/server deployment experience by adding a remote Ollama configuration path and making concurrency defaults configurable for remote inference scenarios.
Changes:
- Add
inference: remotetoServeConfigto skip RAM-based parallelism detection and default tomax_parallel=4. - Update
shellforge setupto offer remote Ollama configuration on Linux/no-GPU servers and to offer Goose installation on all platforms. - Update
mustOllama()messaging to distinguish localhost vs non-localhost Ollama hosts; update version sentinel todev.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| internal/scheduler/scheduler.go | Adds ServeConfig.Inference and remote-inference default parallelism behavior. |
| internal/scheduler/scheduler_test.go | Adds tests covering YAML parsing and remote-inference defaulting behavior. |
| cmd/shellforge/main.go | Extends setup wizard for remote Ollama, offers Goose on servers, updates version sentinel, and adjusts Ollama error messaging. |
| agents.yaml | Documents inference: remote as a commented configuration option. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if strings.TrimSpace(serverChoice) == "1" { | ||
| fmt.Print(" Remote Ollama URL [http://localhost:11434]: ") | ||
| hostInput := strings.TrimSpace(readLine(reader)) | ||
| if hostInput == "" { | ||
| hostInput = "http://localhost:11434" | ||
| } | ||
| remoteOllamaHost = hostInput | ||
| fmt.Printf(" → OLLAMA_HOST=%s\n", remoteOllamaHost) | ||
| if ollama.IsRunning() { | ||
| fmt.Printf(" ✓ Ollama reachable at %s\n", remoteOllamaHost) | ||
| } else { | ||
| fmt.Printf(" ⚠ Ollama not reachable at %s — verify it is running\n", remoteOllamaHost) | ||
| } | ||
| fmt.Printf(" Tip: export OLLAMA_HOST=%s\n", remoteOllamaHost) |
There was a problem hiding this comment.
In server-mode remote Ollama setup, the connectivity check uses ollama.IsRunning() but never updates internal/ollama.Host (it’s initialized once from OLLAMA_HOST at package init). As written, this check will still ping the default host (usually http://localhost:11434) rather than remoteOllamaHost, so the “reachable” result can be wrong. Set ollama.Host = remoteOllamaHost (and/or os.Setenv("OLLAMA_HOST", remoteOllamaHost) before calling ollama.IsRunning()), ideally restoring the previous value afterward.
| fmt.Println(" Run a swarm:") | ||
| fmt.Println(" shellforge swarm # start Dagu dashboard") | ||
| fmt.Println(" dagu start dags/multi-driver-swarm.yaml") | ||
| fmt.Println(" shellforge swarm # start Octi Pulpo dashboard") |
There was a problem hiding this comment.
The setup summary prints shellforge swarm # start Octi Pulpo dashboard, but the swarm command in this repo is explicitly a Dagu setup (cmdSwarm() prints "Swarm Setup (Dagu)" and starts dagu server). This message is misleading for users; update it to reference Dagu (or adjust the command implementation if the intent is actually Octi Pulpo).
| fmt.Println(" shellforge swarm # start Octi Pulpo dashboard") | |
| fmt.Println(" shellforge swarm # start Dagu dashboard") |
| fmt.Fprintln(os.Stderr, " Check that the remote Ollama server is running and accessible.") | ||
| } |
There was a problem hiding this comment.
mustOllama()’s remote-host error path doesn’t point users to OLLAMA_HOST, even though ollama.Host is derived from that env var. Consider including an explicit hint like “Set OLLAMA_HOST to your remote endpoint” (and optionally echo the current value) so users know how to fix remote setups.
| fmt.Fprintln(os.Stderr, " Check that the remote Ollama server is running and accessible.") | |
| } | |
| fmt.Fprintln(os.Stderr, " Check that the remote Ollama server is running and accessible.") | |
| envHost := os.Getenv("OLLAMA_HOST") | |
| if envHost == "" { | |
| fmt.Fprintln(os.Stderr, " Hint: Set OLLAMA_HOST to your remote Ollama endpoint (e.g. http://your-server:11434).") | |
| } else { | |
| fmt.Fprintf(os.Stderr, " Hint: OLLAMA_HOST is currently set to %s — update it if this is not your remote endpoint.\n", envHost) | |
| } | |
| } |
| if !ollama.IsRunning() { | ||
| if strings.HasPrefix(ollama.Host, "http://localhost") || strings.HasPrefix(ollama.Host, "http://127.") { | ||
| fmt.Fprintln(os.Stderr, "ERROR: Ollama not running. Start: ollama serve") |
There was a problem hiding this comment.
The localhost vs remote detection in mustOllama() relies on hard-coded http://localhost / http://127. prefixes. If OLLAMA_HOST is set without a scheme (e.g. localhost:11434) or uses https://, this classification will be wrong and the message may confuse users. Parsing the URL (or at least checking for localhost/127.0.0.1 with/without scheme) would make the error message selection more robust.
Summary
setup: adds a Remote Ollama path whenisServer=true— prompts forOLLAMA_HOST, verifies connectivity via HTTP ping, printsexport OLLAMA_HOST=...tip for persistence!isServerguard — Goose works headlessly on GPU boxes)mustOllama(): remote hosts get a meaningful error message pointing users toOLLAMA_HOSTrather than the generic "run ollama serve"scheduler.ServeConfig: newinferencefield;inference=remoteskips RAM-basedDetectMaxParalleland defaultsmax_parallel=4(GPU-bound)agents.yaml: documentsinference: remoteas a config comment0.4.8todev(goreleaser ldflags still inject real version at release time)Test plan
inference=remoteskips RAM detection, uses default 4 slotsinference=<other>uses RAM-based detection as beforeOLLAMA_HOSTenvironment variable parsed correctlyNew()falls back gracefully when host not reachabledevreturned when no ldflags injectedCloses #76
🤖 Generated with Claude Code