fix(deploy): remove startCommand from railway.json to restore prod #310

Merged
DevanshuNEU merged 1 commit into OpenCodeIntel:main from DevanshuNEU:fix/railway-port-expansion-restore-prod
May 15, 2026

@DevanshuNEU DevanshuNEU commented May 15, 2026

Collaborator

Summary

Production is down. api.opencodeintel.com is timing out on every request. The Railway healthcheck failure on PR #302 took the live replica with it; subsequent retries cannot boot.

Root cause: Railway changed how startCommand in railway.json is evaluated. $PORT is now passed to uvicorn as the literal string "$PORT" instead of being shell-expanded, so uvicorn crashes on every start attempt. Because the server never comes up, the healthcheck at /health fails all 11 attempts over 5 minutes.
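
To make the failure mode concrete, here is a minimal Python sketch of the difference between running a command without a shell (the argument arrives verbatim) and running it through a shell (the variable is expanded first). It does not reproduce Railway's internals; the port value 8123 and the tiny child program are assumptions for illustration, and the shell half assumes a POSIX `sh`.

```python
import os
import subprocess
import sys

# Assumed value for illustration; Railway injects its own PORT at runtime.
env = {**os.environ, "PORT": "8123"}

# A tiny child program that prints the first argument it received.
child = [sys.executable, "-c", "import sys; print(sys.argv[1])"]

# No shell involved: "$PORT" is handed to the child exactly as written.
literal = subprocess.run(
    child + ["$PORT"], env=env, capture_output=True, text=True, check=True
).stdout.strip()

# Through a shell: sh expands $PORT before the command runs.
expanded = subprocess.run(
    'printf %s "$PORT"', shell=True, env=env, capture_output=True, text=True, check=True
).stdout.strip()

print("without shell:", literal)
print("with shell:   ", expanded)

# A port option that must be an integer cannot accept the unexpanded string.
try:
    int(literal)
except ValueError:
    print("unexpanded value is not a valid port number")
```

Any program that requires an integer port fails the same way on the unexpanded string, which is consistent with the crash-on-boot behavior described above.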

The same railway.json deployed successfully for PR #293 two months ago. A full diff of every file between PR #293 (last working) and PR #302 (failed) shows zero runtime changes: only docs and backend/CLAUDE.md were touched. No Dockerfile, requirements.txt, or backend code changed. This is a platform-side behavior change, not a code regression.

Fix

Remove the startCommand line from railway.json. The Dockerfile's built-in CMD already handles boot correctly:

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--proxy-headers"]

Bonus side effect: this also restores --proxy-headers, which the deleted startCommand was silently dropping (the comment in backend/Dockerfile:32-33 flagged this for the proxy IP allowlist use case).
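
The shape of the config change can be sketched in Python as dropping one key under `deploy` and leaving everything else alone. The `BEFORE` dictionary is a hypothetical stand-in: the real railway.json is not shown in this PR, so its values (including the exact startCommand string) are assumptions.

```python
import json

# Hypothetical stand-in for the pre-fix railway.json; values are assumed.
BEFORE = {
    "deploy": {
        "startCommand": "uvicorn main:app --host 0.0.0.0 --port $PORT",
        "healthcheckPath": "/health",
        "healthcheckTimeout": 300,
        "restartPolicyType": "ON_FAILURE",
    }
}


def drop_start_command(config: dict) -> dict:
    """Return a copy of config without deploy.startCommand; other keys are kept."""
    deploy = {k: v for k, v in config.get("deploy", {}).items() if k != "startCommand"}
    return {**config, "deploy": deploy}


after = drop_start_command(BEFORE)
print(json.dumps(after, indent=2))
```

With no startCommand override in place, the platform falls back to the image's own CMD, which is the behavior this fix relies on.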

Risk

Low. A failed deploy cannot take prod down any further; it is already fully offline. If this fix is wrong, the deploy fails again and we stay where we are. If it is right, prod comes back on the latest main.

Process note

This hotfix skipped the /oci-design ADR gate (Phase 1F warn-only hook fired). Justification: production restoration is higher priority than the design-gate process. After prod recovers, I will backfill an ADR or dogfood-finding entry documenting the bypass and the Railway platform behavior change.

Test plan

  • Merge to main
  • Railway auto-deploy triggers
  • Build + Deploy + Healthcheck all pass green
  • curl https://api.opencodeintel.com/health returns 200
  • Frontend at opencodeintel.com loads playground without CORS / network errors
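
The curl check in the test plan can also be scripted as a bounded retry loop, similar in spirit to a platform healthcheck. This is a rough sketch, not Railway's actual probe logic: the function names, the default of 11 attempts, and the delay are assumptions chosen to mirror the numbers quoted in the summary.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status for url, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return exc.code
    except OSError:
        # Connection refused, DNS failure, or timeout.
        return 0


def wait_until_healthy(
    url: str,
    attempts: int = 11,
    delay: float = 1.0,
    probe: Callable[[str], int] = http_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Probe url up to `attempts` times; return True on the first 200."""
    for attempt in range(1, attempts + 1):
        if probe(url) == 200:
            return True
        if attempt < attempts:
            sleep(delay)
    return False


if __name__ == "__main__":
    ok = wait_until_healthy("https://api.opencodeintel.com/health", attempts=3)
    print("healthy" if ok else "still failing")
```

The probe and sleep functions are injectable so the retry logic can be exercised without a network connection.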

Summary by CodeRabbit

  • Chores
    • Modified deployment configuration settings by removing a start command specification.

Railway changed how startCommand is evaluated; the $PORT variable was
being passed as the literal string instead of being shell-expanded,
causing uvicorn to crash on every boot and the /health probe to time
out across 11 attempts. The Dockerfile's built-in CMD already binds
to the EXPOSE'd port with --proxy-headers, so removing the override
restores boot.

Same railway.json shipped fine for PR OpenCodeIntel#293 two months ago, and no
runtime code, Dockerfile, or requirements changed between OpenCodeIntel#293 and
the failing OpenCodeIntel#302 deploy (only docs touched). Root cause is a Railway
platform behavior change.

Hotfix: skipped /oci-design gate (Phase 1F warn) because prod is
fully down. Backfilling an ADR or dogfood finding after recovery.

vercel Bot commented May 15, 2026

@DevanshuNEU is attempting to deploy a commit to the Dev's projects Team on Vercel.

A member of the Team first needs to authorize it.

coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d8caa2d2-8ec3-478e-b939-e46bf99220aa

📥 Commits

Reviewing files that changed from the base of the PR and between 106ff72 and ebc3bde.

📒 Files selected for processing (1)
  • railway.json
💤 Files with no reviewable changes (1)
  • railway.json

📝 Walkthrough

The PR removes the startCommand field from the Railway deployment configuration in railway.json. Other deploy settings—restart policy, max retries, and healthcheck configuration—remain unchanged.

Changes

Deployment Configuration

| Layer / File(s) | Summary |
| --- | --- |
| Railway startCommand removal (railway.json) | The deploy.startCommand entry containing the uvicorn startup command is removed; remaining deploy configuration (restart policy, healthcheck, timeout) stays in place. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: removing startCommand from railway.json to restore production deployment after a platform-side behavior change broke the healthcheck. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

vercel Bot commented May 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
opencodeintel Ignored Ignored Preview May 15, 2026 4:06pm

@DevanshuNEU DevanshuNEU merged commit 7a29b89 into OpenCodeIntel:main May 15, 2026
8 checks passed