RunsOn v2 → v3 migration
RunsOn v3 is now a stable production release. This issue tracks migrating our self-hosted GitHub Actions runner stack from v2 to v3 across the affected lecture/course repos.
Why now
v3 exists because AWS is deprecating App Runner (announced April 2026), which the v2 control plane runs on. RunsOn rebuilt the orchestrator on plain ECS/Fargate, so v2 is on borrowed time, tied to App Runner's eventual shutdown.
- No forced shutdown — v2 keeps running for now; no emergency.
- v2 support ends 30 July 2026 — after that, bug fixes & improvements land only on v3 (v2 keeps running until AWS actually switches off App Runner).
- Action: migrate deliberately on our schedule before the deadline, not in a scramble when AWS pulls the plug.
What changes
v3 is a breaking release. RunsOn explicitly says do not upgrade the v2 stack in place — you deploy a fresh v3 stack with a new (separate) GitHub App. v2 and v3 can run side by side, which lets us cut over repo-by-repo by moving the App installation. The runs-on=${{ github.run_id }}/... workflow label stays generic; routing is determined by which App is installed on a repo.
The one in-repo change — the disk label:
Every RunsOn workflow we have uses .../disk=large. In v3 the disk= label is parsed but ignored (no longer provisions a volume) and only emits a deprecation warning. Replacement is volume=:
- disk=large → volume=80gb
- custom: volume=100gb:gp3:500mbs:4000iops (size:type:throughput:iops)
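To make the size:type:throughput:iops layout concrete, here is a minimal Python sketch that splits such a label into its fields. It reflects only our reading of the format quoted above and is not RunsOn's own parser; `VolumeSpec` and `parse_volume_label` are hypothetical names, and treating the trailing fields as optional is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VolumeSpec:
    """Fields of a volume= label (hypothetical helper type)."""
    size_gb: int
    type: Optional[str] = None
    throughput_mbs: Optional[int] = None
    iops: Optional[int] = None


def _strip_unit(field: str, unit: str) -> int:
    """Return the integer in front of `unit`, e.g. '500mbs' -> 500."""
    if not field.endswith(unit):
        raise ValueError(f"expected a value ending in {unit!r}, got {field!r}")
    return int(field[: -len(unit)])


def parse_volume_label(label: str) -> VolumeSpec:
    """Parse 'volume=100gb:gp3:500mbs:4000iops' into a VolumeSpec.

    Assumption: fields are positional and trailing ones may be omitted.
    """
    key, _, value = label.partition("=")
    if key != "volume" or not value:
        raise ValueError(f"not a volume label: {label!r}")
    parts = value.split(":")
    if len(parts) > 4:
        raise ValueError(f"too many fields in {label!r}")
    spec = VolumeSpec(size_gb=_strip_unit(parts[0], "gb"))
    if len(parts) > 1:
        spec.type = parts[1]
    if len(parts) > 2:
        spec.throughput_mbs = _strip_unit(parts[2], "mbs")
    if len(parts) > 3:
        spec.iops = _strip_unit(parts[3], "iops")
    return spec
```

Something like this could be used in a review script to sanity-check the labels we write before a cutover.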
⚠️ Gotcha: if a repo is pointed at a v3 stack without updating the label, the runner silently boots with the image's default root volume instead of "large" — GPU build jobs (JAX caches, conda envs, datasets) can then fail with "no space left on device". So the label edit must happen together with the stack cutover for each repo — not before (v2 won't understand volume=) and not after.
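Because the label edit has to land in the same change as the cutover, it may help to script it. The sketch below is one possible way to do the substitution on a workflow file's text; `migrate_disk_label` is a hypothetical helper, and the sample `runs-on` line in the usage note is illustrative rather than copied from our workflows.

```python
import re

# Match disk=large only as a whole label segment, so that something like
# 'disk=larger' or 'mydisk=large' is left alone.
_DISK_LARGE = re.compile(r"(?<![\w-])disk=large(?![\w-])")


def migrate_disk_label(workflow_text: str, replacement: str = "volume=80gb") -> tuple[str, int]:
    """Return the rewritten text and the number of labels replaced."""
    return _DISK_LARGE.subn(replacement, workflow_text)
```

For example, `migrate_disk_label("runs-on: runs-on=${{ github.run_id }}/image=quantecon_ubuntu2404/disk=large")` returns the line with `volume=80gb` in place of `disk=large`, plus a count of 1. A count of 0 on a workflow we expected to change is a signal to look at the file by hand.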
Custom images (e.g. quantecon_ubuntu2404) are unaffected by v3 — image defs in .github/runs-on.yml stay valid — but the AMI id is account+region-specific, so deploy the v3 stack in the same AWS account/region so the AMI still resolves.
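One cheap way to check the same-account/same-region requirement is to compare the two stack ARNs, since an ARN has the form `arn:partition:service:region:account-id:resource`. The following is a minimal sketch; `stack_location` and `same_account_and_region` are hypothetical helper names, and the ARNs in the usage note are made-up placeholders.

```python
def stack_location(stack_arn: str) -> tuple[str, str]:
    """Return (account_id, region) parsed from a CloudFormation stack ARN."""
    parts = stack_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "cloudformation":
        raise ValueError(f"not a CloudFormation stack ARN: {stack_arn!r}")
    region, account_id = parts[3], parts[4]
    return account_id, region


def same_account_and_region(v2_stack_arn: str, v3_stack_arn: str) -> bool:
    """True when both stacks live in the same AWS account and region."""
    return stack_location(v2_stack_arn) == stack_location(v3_stack_arn)
```

With placeholder values, `same_account_and_region("arn:aws:cloudformation:us-east-1:111111111111:stack/v2/abc", "arn:aws:cloudformation:us-east-1:111111111111:stack/v3/def")` is true, while changing either the region or the account id makes it false. This only confirms the stacks are co-located; it does not prove the AMI itself resolves.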
Affected repos
| Repo | RunsOn workflows | Image |
|---|---|---|
| lecture-jax | ci, cache, publish, collab | quantecon_ubuntu2404 / ubuntu24-gpu-x64 |
| lecture-python.myst | ci, cache, publish, collab | same |
| lecture-python-programming | ci, cache, publish | same |
| lecture-stats | ci, cache, publish, collab | same |
| iuj_feb_2026, scipy_tutorial_2026 | course repos | same |
Not affected (no RunsOn): lecture-python-intro (uses ubuntu-latest + legacy quantecon-large), lecture-python-advanced.myst (ubuntu-latest).
Migration plan — pilot on lecture-stats
lecture-stats is the ideal test ground: it's slated for deprecation (low risk) yet exercises the full matrix — GPU family (g4dn.2xlarge), the custom AMI image, the RunsOn default GPU image (ubuntu24-gpu-x64 in collab.yml), and the disk label across all four workflow types. If it passes, the bigger repos are essentially copy-paste.
- Deploy the v3 stack in AWS (CloudFormation one-click, ~10 min), same account/region as v2.
- Register + install the new GitHub App on lecture-stats only.
- Combined cutover on lecture-stats: switch the 4 labels disk=large → volume=80gb in the same change that points it at the v3 App; confirm the AMI resolves.
- Run all four workflows green and confirm disk headroom is adequate.
- Roll out to lecture-jax, lecture-python.myst, lecture-python-programming (then course repos).
- Decommission the v2 stack once everything is cut over, before 30 July 2026.
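Before and after each repo's cutover, a quick audit can confirm that no workflow still carries a `disk=` label. The sketch below assumes workflows live under `.github/workflows` in a local checkout; `find_stale_disk_labels` is a hypothetical helper, not part of any RunsOn tooling.

```python
from pathlib import Path


def find_stale_disk_labels(repo_root: str) -> list[tuple[str, int, str]]:
    """List (file name, line number, stripped line) for every line that
    still mentions a disk= label in the repo's workflow files."""
    hits = []
    workflows = Path(repo_root) / ".github" / "workflows"
    # The pattern covers both .yml and .yaml workflow files.
    for path in sorted(workflows.glob("*.y*ml")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if "disk=" in line:
                hits.append((path.name, lineno, line.strip()))
    return hits
```

An empty result for a repo that has been moved to the v3 App is the expected state; any hit there means a job could boot with the default root volume.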
AWS v3 install (CloudFormation path)
The v3 built-in CF template assumes github.com + RunsOn's embedded networking — correct for us. (GHES / existing-VPC would need the Terraform path; not our case.)
- Prereqs: AWS CloudFormation perms (same account+region as v2); GitHub org admin on QuantEcon; existing RunsOn license key (plain or SSM ref; reuse ours, no trial needed).
- Launch CloudFormation stack, key params:
  - GitHub org: QuantEcon
  - License key (or SSM ref)
  - Email for cost & alert reports
  - Environment name: e.g. v3 (distinct from v2, since they run side by side)
  - AppSize: small/medium/high/xhigh preset (replaces v2 CPU/mem/queue knobs); small/medium is plenty for our repo count
  - Optional hardening: EnableWAF, EnableAdminRoutes
- RunsOn auto-creates a private GitHub App (org-only; creds stay in our AWS account).
- When the stack completes, open the RunsOnEntryPoint HTTPS URL from CloudFormation Outputs → click "Register app" → install on lecture-stats.
- Stack creates: 1 VPC, 2 public + 2 private subnets, EC2 security group, API Gateway + Lambda ingress, 1 ECS/Fargate worker service. No App Runner.
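If we would rather fetch the entry point programmatically than click through the console, the value can be read from a DescribeStacks response. The sketch below works on a plain dict shaped like that response (a `Stacks` list whose items carry `Outputs` with `OutputKey`/`OutputValue`), so it makes no AWS call itself; `get_stack_output` is a hypothetical helper and the URL in the usage note is a placeholder.

```python
def get_stack_output(describe_stacks_response: dict, key: str = "RunsOnEntryPoint") -> str:
    """Return the value of one output from a DescribeStacks-style response."""
    stacks = describe_stacks_response.get("Stacks", [])
    if not stacks:
        raise LookupError("response contains no stacks")
    # Outputs are absent until the stack has finished creating.
    for output in stacks[0].get("Outputs", []):
        if output.get("OutputKey") == key:
            return output["OutputValue"]
    raise KeyError(key)
```

Given `{"Stacks": [{"Outputs": [{"OutputKey": "RunsOnEntryPoint", "OutputValue": "https://example.invalid"}]}]}`, the function returns the placeholder URL. A `KeyError` most likely means the stack has not completed yet.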
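Checklist
- Install the new v3 GitHub App on lecture-stats
- lecture-stats: disk=large → volume=80gb across ci/cache/publish/collab
- Confirm the quantecon_ubuntu2404 AMI resolves under v3
- Run all lecture-stats workflows (green + disk headroom)
- Roll out to lecture-jax
- Roll out to lecture-python.myst
- Roll out to lecture-python-programming
- Roll out to the course repos (iuj_feb_2026, scipy_tutorial_2026)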
References
/cc @mmcky