SmartHealth-LLM is a multi-agent health assistant backend + frontend stack.
It includes:
- FastAPI backend with agent orchestration
- Specialized agents (`conversation`, `symptom_matcher`, `disease_info`, `reasoning`)
- Local/vector retrieval + optional internet fallback
- Built-in run metrics collection and export
- Excel-based evaluation runner for batch query testing
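For orientation, the routing idea can be sketched roughly as follows. This is a hypothetical illustration, not the actual backend API; the names and the keyword heuristic are made up:

```python
from dataclasses import dataclass, field

@dataclass
class RunPlan:
    intent: str
    agents_planned: list[str] = field(default_factory=list)

def plan_agents(message: str) -> RunPlan:
    """Toy router: casual chat goes to the conversation agent,
    medical-looking queries go to the full medical workflow."""
    medical_terms = ("fever", "cough", "pain", "symptom", "disease")
    if any(term in message.lower() for term in medical_terms):
        return RunPlan(intent="medical",
                       agents_planned=["symptom_matcher", "disease_info", "reasoning"])
    return RunPlan(intent="conversation", agents_planned=["conversation"])

print(plan_agents("I have fever and cough"))
# RunPlan(intent='medical', agents_planned=['symptom_matcher', 'disease_info', 'reasoning'])
```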
Repository layout:

- `backend/`: FastAPI app, agents, tools, prompts, models
- `frontend/`: React client
- `scripts/`: Bootstrap + evaluation scripts
- `tests/`: Pytest suites
- `Dockerfile.backend`: Backend container
- `docker-compose.yml`: Full-stack local Docker run
Bootstrap the project:

```bash
./scripts/bootstrap.sh
```

This does:
- create `.venv`
- install backend dependencies
- install frontend dependencies (`npm ci`)
- create `backend/.env` from `backend/.env.example` if missing
Or via make:

```bash
make setup
```

Useful commands: `make dev-backend`, `make dev-frontend`, `make test-backend`, `make test-all`, `make docker-up`, `make docker-down`.
Copy and edit the backend env file:

```bash
cp backend/.env.example backend/.env
```

Important vars:
- `GROQ_API_KEY` (if using the Groq adapter)
- `SERPER_API_KEY` (optional, enables live web fallback)
- `OLLAMA_HOST` (default `http://localhost:11434`)
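As a rough sketch of how these variables could be consumed (the real backend's settings loading may differ):

```python
import os

# Hypothetical settings access; the actual backend may use a settings class instead.
GROQ_API_KEY = os.getenv("GROQ_API_KEY")        # needed only for the Groq adapter
SERPER_API_KEY = os.getenv("SERPER_API_KEY")    # optional: enables live web fallback
OLLAMA_HOST = os.getenv("OLLAMA_HOST", "http://localhost:11434")  # documented default

if not SERPER_API_KEY:
    print("SERPER_API_KEY unset: internet fallback disabled, local/vector retrieval only")
```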
Run the backend:

```bash
source .venv/bin/activate
cd backend
uvicorn app.main:app --reload --port 7860
```

Backend URL: http://localhost:7860
Run the frontend:

```bash
cd frontend
npm start
```

Frontend URL: http://localhost:3000
Optional frontend API base override:
```bash
REACT_APP_API_URL=http://localhost:7860 npm start
```

Run the full stack with Docker:

```bash
cp backend/.env.example backend/.env
# fill required keys in backend/.env
docker compose up --build
```

URLs:
- Frontend: http://localhost:3000
- Backend: http://localhost:7860
Usage:
- Open the frontend at http://localhost:3000
- Start a chat with normal text:
  - casual message -> conversation agent path
  - symptom/disease question -> medical agent workflow
- Backend endpoint used by the frontend: `POST /chat/send`
- Optional debug run with logs: `POST /debug/debug_chat_send`
- Check run analytics with: `GET /metrics/summary`, `GET /metrics/runs`
Health endpoints:
- `GET /` -> backend status message
- `GET /health/status` -> `{"status":"ok"}`
- `GET /health/ping`
- `GET /health/live`
- `GET /health/ready`

Chat endpoints:
- `POST /chat/send`
- `POST /chat/history`
- `POST /chat/clear`
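A quick smoke test against a locally running backend might look like this (a sketch using the third-party `requests` package):

```python
import requests

BASE = "http://localhost:7860"

# Hit each health endpoint listed above and print its status code.
for path in ("/", "/health/status", "/health/ping", "/health/live", "/health/ready"):
    resp = requests.get(BASE + path, timeout=5)
    print(path, resp.status_code)

# /health/status is documented to return {"status": "ok"}.
assert requests.get(BASE + "/health/status", timeout=5).json() == {"status": "ok"}
```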
Chat example:
```bash
curl -X POST http://localhost:7860/chat/send \
  -H "Content-Type: application/json" \
  -d '{"message":"I have fever and cough","session_id":"demo-1"}'
```
Debug endpoint:
- `POST /debug/debug_chat_send`

Metrics endpoints:
- `GET /metrics/summary`
- `GET /metrics/runs?limit=50`
- `POST /metrics/save-local`
- `POST /metrics/reset`
Save metrics locally:
```bash
curl -X POST http://localhost:7860/metrics/save-local \
  -H "Content-Type: application/json" \
  -d '{"filepath":"metrics_store/session_metrics.json","limit":500}'
```

Per run:
- routing: intent, planned/executed agents
- tool usage: local DB calls/success, vector DB calls/success, internet calls/success
- memory usage: recall/save counts, context items used
- latency and status
- final output and relevance score
Aggregate summary fields include:
- `local_data_usage_rate`
- `internet_usage_rate`
- `web_fallback_rate`
- `local_hit_success_rate`
- `avg_relevance_score`
- `avg_latency_ms`, `p95_latency_ms`
- `medical_query_rate`, `conversation_query_rate`
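To sanity-check the summary, a few aggregates can be recomputed from a locally saved metrics file. The sketch below assumes the export is a JSON list of per-run records with the field names shown; adjust to the real schema:

```python
import json
from statistics import mean

# Assumed schema: a JSON list of per-run records (see "Per run" above).
with open("metrics_store/session_metrics.json") as f:
    runs = json.load(f)

latencies = sorted(r["latency_ms"] for r in runs)
print({
    "avg_latency_ms": mean(latencies),
    # nearest-rank approximation of the 95th percentile
    "p95_latency_ms": latencies[int(0.95 * (len(latencies) - 1))],
    "medical_query_rate": sum(r["intent"] == "medical" for r in runs) / len(runs),
})
```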
Script: `scripts/run_excel_eval.py`

Create a template:

```bash
source .venv/bin/activate
python scripts/run_excel_eval.py --input eval_queries.xlsx --create-template
```

This creates an Excel file with a `queries` column. Put one query per row under `queries`.
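If you prefer to build the input file yourself, the assumed shape is just a single `queries` column, e.g. with pandas (requires `pandas` and `openpyxl`); the real `--create-template` output may include extra columns:

```python
import pandas as pd

# One query per row under the `queries` column.
pd.DataFrame({"queries": [
    "I have fever and cough",
    "hello there",
]}).to_excel("eval_queries.xlsx", index=False)
```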
Run the evaluation:

```bash
python scripts/run_excel_eval.py \
  --input eval_queries.xlsx \
  --output eval_queries_evaluated.xlsx
```

The output file writes results back into the same rows, with columns like:
- `run_id`, `status`, `error`
- `intent`, `agents_planned`, `agents_executed`
- `conversation_output`, `symptom_matcher_output`, `disease_info_output`, `reasoning_output`
- `final_output`
- metrics columns (`latency_ms`, `relevance`, local/internet usage, memory usage, tool errors)
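A quick way to inspect the results afterwards (column names as listed above; requires `pandas` and `openpyxl`):

```python
import pandas as pd

df = pd.read_excel("eval_queries_evaluated.xlsx")
print(df["status"].value_counts())  # how many runs succeeded vs errored
print(df[["queries", "intent", "latency_ms", "relevance"]].head())
```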
Backend tests:
```bash
source .venv/bin/activate
pytest -q tests/backend
```

All tests:

```bash
source .venv/bin/activate
pytest -q
```
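For reference, a minimal backend test in the spirit of the suites under `tests/backend` might look like this; it is a sketch assuming the app is importable as `app.main:app` (matching the uvicorn target above) and that `httpx`, required by `TestClient`, is installed:

```python
from fastapi.testclient import TestClient

from app.main import app  # assumed import path, matching the uvicorn target

client = TestClient(app)

def test_health_status():
    resp = client.get("/health/status")
    assert resp.status_code == 200
    assert resp.json() == {"status": "ok"}
```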
Deployment options follow. Free-tier details change over time; the notes below are accurate as checked on February 9, 2026.

Hugging Face Spaces

Why:
- Free CPU Basic hardware is available.
- Good for public demo sharing.
How:
- Create a new Docker Space on Hugging Face.
- Connect your GitHub repo (or push repo files directly).
- Ensure `backend/.env` values are set as Space Secrets (for keys).
- Build/deploy using `Dockerfile.backend` or the full-stack approach you choose.
- Verify the health endpoint after deployment.
Notes:
- Free hardware has limits (CPU/RAM/storage).
- Disk storage on the default free setup is limited and non-persistent for app runtime data.
- Official references:
Render

Why:
- Free web services are available for testing/hobby preview.
How:
- Create a Render account and connect GitHub repo.
- Create a new Web Service from this repo.
- Set build/start commands for the backend:
  - Build: `pip install -r backend/requirements.txt`
  - Start: `cd backend && uvicorn app.main:app --host 0.0.0.0 --port $PORT`
- Add environment variables from `backend/.env.example`.
- Deploy and test `GET /health/status`.
Notes:
- Free services have usage/feature limits and are not recommended for production.
- Official references:
Railway

Why:
- Very quick deploy workflow.
How:
- Create Railway project from GitHub repo.
- Add a backend service with start command:
  `cd backend && uvicorn app.main:app --host 0.0.0.0 --port $PORT`
- Add env vars from `backend/.env.example`.
- Deploy and validate the health + chat endpoints.
Notes:
- The current pricing model includes trial credits and then low-cost paid usage.
- Use it for fast testing if a small paid budget is acceptable.
- Official references:
Fly.io

Best when you need:
- reliability, scaling control, networking/security compliance.

Notes:
- Historical free allowances changed; verify current plan terms before choosing.
- Use it mainly if you want Fly's multi-region container model.
Detailed reproducible setup and hosting notes: `docs/REPRODUCIBLE_SETUP_AND_HOSTING.md`
License: MIT