cd infra/gcp
# Copy and fill in your variables
cp terraform.tfvars.example terraform.tfvars
# Create state bucket (once)
gsutil mb gs://mockstack-tfstate
# Init and apply
terraform init
terraform plan -var-file=terraform.tfvars
terraform apply -var-file=terraform.tfvars
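The "once" in the bucket step can be made safe to re-run. The following is a minimal sketch, assuming gsutil is installed and authenticated; the function name ensure_state_bucket is illustrative and not part of this repository.

```shell
# ensure_state_bucket BUCKET_URL
# Create the Terraform state bucket only if it does not exist yet.
# Note: "gsutil ls -b" also fails when the bucket exists but is not
# accessible to you, in which case the subsequent "mb" will fail too.
ensure_state_bucket() {
  bucket="$1"
  if gsutil ls -b "$bucket" >/dev/null 2>&1; then
    echo "exists: $bucket"
  else
    gsutil mb "$bucket"
  fi
}
```

Usage: ensure_state_bucket gs://mockstack-tfstate, followed by terraform init as above.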
CI/CD (GitHub Actions)
Add these GitHub Secrets:
GCP_PROJECT_ID
GCP_WORKLOAD_IDENTITY_PROVIDER
GCP_SA_EMAIL
GCP_REGION (e.g. us-central1)
Push to main → .github/workflows/deploy.yml runs automatically.
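The four secrets can also be set from a terminal with the GitHub CLI instead of the web UI. This is a sketch, assuming gh is installed and authenticated for this repository and that the values are already set as variables in the current shell; set_deploy_secrets is a hypothetical helper name.

```shell
# set_deploy_secrets
# Copy the four deploy variables from the current shell into GitHub Secrets.
# Stops at the first variable that is unset or empty.
set_deploy_secrets() {
  for name in GCP_PROJECT_ID GCP_WORKLOAD_IDENTITY_PROVIDER GCP_SA_EMAIL GCP_REGION; do
    eval "value=\${$name:-}"   # indirect lookup of the variable called $name
    [ -n "$value" ] || { echo "missing: $name" >&2; return 1; }
    gh secret set "$name" --body "$value"
  done
}
```

Usage: set the four variables in the shell (for example GCP_REGION=us-central1), then run set_deploy_secrets from inside the repository checkout.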
Database Migrations
Migrations run automatically in CI before deploying new service revisions.
Manual run:
cd chat-service
DATABASE_URL=postgresql://mock:mock@<cloud-sql-proxy-host>:5432/mockstack \
alembic upgrade head
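For repeated manual runs, the URL can be assembled from its parts so that only the proxy host changes between environments. This is a sketch only: database_url and run_migrations are hypothetical helper names, and the defaults (port 5432, database mockstack) are taken from the example above. Passwords containing characters such as @ or / would need URL-encoding first, which this sketch does not do.

```shell
# database_url USER PASSWORD HOST [PORT] [DBNAME]
# Print a PostgreSQL URL in the same shape as the manual-run example.
database_url() {
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "${4:-5432}" "${5:-mockstack}"
}

# run_migrations USER PASSWORD HOST [PORT] [DBNAME]
# Run "alembic upgrade head" in chat-service without changing the caller's
# working directory (the subshell keeps the cd local).
run_migrations() {
  ( cd chat-service && DATABASE_URL="$(database_url "$@")" alembic upgrade head )
}
```

Usage: run_migrations mock mock 127.0.0.1, assuming the Cloud SQL proxy is listening locally.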
Environment Variables Reference
chat-service (production)
| Variable | Required | Description |
| --- | --- | --- |
| DATABASE_URL | ✅ | Cloud SQL asyncpg URL |
| REDIS_URL | ✅ | Memorystore URL |
| JWT_SECRET | ✅ | Strong random secret (32+ chars) |
| ENV | ✅ | production |
| CORS_ORIGINS | ✅ | Comma-separated allowed origins |
| AI_GRPC_TARGET | ✅ | ai-service:50051 (internal) |
workers (production)
| Variable | Required | Description |
| --- | --- | --- |
| DATABASE_URL | ✅ | Cloud SQL asyncpg URL |
| REDIS_URL | ✅ | Memorystore URL |
| MAX_RETRIES | ✅ | Job retry attempts (default: 3) |
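A preflight check can catch a missing variable before a deploy rather than at runtime. The sketch below uses only the variable names from the two tables above; the function names are illustrative and not part of the services themselves.

```shell
# missing_vars NAME...
# Print the name of every listed variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup of the variable called $name
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}

# Required variables per service, as listed in the reference tables.
chat_service_missing() {
  missing_vars DATABASE_URL REDIS_URL JWT_SECRET ENV CORS_ORIGINS AI_GRPC_TARGET
}
workers_missing() {
  missing_vars DATABASE_URL REDIS_URL MAX_RETRIES
}

# jwt_secret_long_enough: succeed when JWT_SECRET has at least 32 characters.
jwt_secret_long_enough() {
  secret="${JWT_SECRET:-}"
  [ "${#secret}" -ge 32 ]
}
```

Usage: an empty result from chat_service_missing or workers_missing means every required variable is set. The length check only enforces the 32+ character rule; it says nothing about how random the secret is.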
Health Checks
| Endpoint | Expected |
| --- | --- |
| GET /health (chat-service) | {"status": "ok"} |
| Redis PING | PONG |
| Cloud SQL pg_isready | exit 0 |
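The three checks can be combined into one script. This is a sketch, assuming curl, redis-cli and pg_isready are available on a host that can reach Memorystore and Cloud SQL (for example inside the VPC or through a proxy). CHAT_SERVICE_URL and PG_HOST are assumed variable names, not ones defined elsewhere in this document.

```shell
# health_body_ok BODY
# Succeed when the JSON body contains "status": "ok".
health_body_ok() {
  printf '%s' "$1" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}

# check_all
# Run the three checks from the table and return non-zero if any fails.
check_all() {
  failed=0
  if body="$(curl -fsS "${CHAT_SERVICE_URL}/health")" && health_body_ok "$body"; then
    echo "chat-service: ok"
  else
    echo "chat-service: FAIL"; failed=1
  fi
  if [ "$(redis-cli -u "$REDIS_URL" PING)" = "PONG" ]; then
    echo "redis: ok"
  else
    echo "redis: FAIL"; failed=1
  fi
  if pg_isready -h "$PG_HOST" -p 5432 >/dev/null; then
    echo "cloud-sql: ok"
  else
    echo "cloud-sql: FAIL"; failed=1
  fi
  return "$failed"
}
```

The body check is a plain pattern match rather than a JSON parse, which keeps the script free of extra dependencies but would also accept a larger document that merely contains that key and value.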
Scaling
chat-service: Cloud Run autoscales 1–10 instances by CPU/concurrency.
ai-service: Cloud Run autoscales 1–5 instances (CPU-bound).
workers: Cloud Run Job, scheduled or triggered by queue depth.
PostgreSQL: Scale up Cloud SQL tier manually; consider read replicas for analytics queries.
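The instance bounds above map onto two Cloud Run flags. A sketch, assuming the service name chat-service-prod and region us-central1 used in the rollback example; set_scaling is a hypothetical helper. If these limits are managed by the Terraform configuration in infra/gcp, changing them there is likely the better route, since a manual change would drift from state.

```shell
# set_scaling SERVICE MIN MAX REGION
# Set the autoscaling bounds of a Cloud Run service.
set_scaling() {
  gcloud run services update "$1" \
    --min-instances "$2" --max-instances "$3" --region "$4"
}
```

Usage: set_scaling chat-service-prod 1 10 us-central1.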
Rollback
# Roll back Cloud Run to previous revision
gcloud run services update-traffic chat-service-prod \
--to-revisions=PREVIOUS_REVISION=100 --region us-central1
# Roll back DB migration
cd chat-service && alembic downgrade -1
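PREVIOUS_REVISION in the Cloud Run command is a placeholder for a real revision name. One way to look it up is sketched below, assuming that the second-newest revision is the one to return to. That is not always true (for example when the newest revision never received traffic), so confirm the name before shifting traffic. The helper names are illustrative.

```shell
# list_revisions SERVICE REGION
# Print revision names for a service, newest first, one per line.
list_revisions() {
  gcloud run revisions list --service "$1" --region "$2" \
    --sort-by='~metadata.creationTimestamp' --format='value(metadata.name)'
}

# previous_revision
# Read newest-first revision names on stdin and print the second one.
previous_revision() {
  sed -n '2p'
}

# rollback_to_previous SERVICE REGION
# Send 100% of traffic to the second-newest revision.
rollback_to_previous() {
  prev="$(list_revisions "$1" "$2" | previous_revision)"
  [ -n "$prev" ] || { echo "no previous revision found" >&2; return 1; }
  gcloud run services update-traffic "$1" \
    --to-revisions="${prev}=100" --region "$2"
}
```

Usage: rollback_to_previous chat-service-prod us-central1.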
Operational Runbooks
High CPU on chat-service
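Check that /docs is disabled (set ENV=production)
Check rate limit keys in Redis (pattern ratelimit:*); on a production instance prefer SCAN over KEYS, because KEYS blocks the server while it walks the whole keyspace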
Scale up min-instances or upgrade Cloud Run CPU allocation
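For the Redis step, redis-cli can iterate keys with SCAN instead of KEYS. A sketch, assuming REDIS_URL points at the Memorystore instance and that rate limit keys share the ratelimit: prefix shown above. The REDIS_CLI override is an addition here so the command can be swapped out, and count_ratelimit_keys is a hypothetical helper.

```shell
# count_ratelimit_keys
# Count keys matching ratelimit:* using incremental SCAN iteration.
count_ratelimit_keys() {
  ${REDIS_CLI:-redis-cli} -u "$REDIS_URL" --scan --pattern 'ratelimit:*' \
    | wc -l | tr -d ' '
}
```

A count that is unusually large or keeps growing points at clients hammering the service (or keys that never expire); a small, stable count suggests the CPU load comes from elsewhere.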
Worker DLQ growing
Inspect queue:dlq:queue:reminders and queue:dlq:queue:analytics in Redis
Fix the root cause (DB connectivity, schema mismatch)
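The inspection step can be scripted. This sketch assumes the dead-letter queues are plain Redis lists, which the text above does not confirm; running TYPE on a key will tell, and streams or sorted sets would need XLEN or ZCARD instead of LLEN. The REDIS_CLI override and the helper names are illustrative.

```shell
# dlq_depth KEY
# Print the number of entries in one dead-letter queue (assumed to be a list).
dlq_depth() {
  ${REDIS_CLI:-redis-cli} -u "$REDIS_URL" LLEN "$1"
}

# dlq_report
# Print "key depth" for both dead-letter queues named in the runbook.
dlq_report() {
  for key in queue:dlq:queue:reminders queue:dlq:queue:analytics; do
    printf '%s %s\n' "$key" "$(dlq_depth "$key")"
  done
}
```

To look at the first few failed jobs without removing them, LRANGE with a key and the range 0 4 returns up to five entries.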