feat(migrate): add Graviton (ARM64) cost optimization #86

Open
herosjourney wants to merge 17 commits into awslabs:main from herosjourney:feat/graviton-cost-optimization

@herosjourney herosjourney commented Jun 29, 2026

Contributor

Depends on

⚠️ Stacked on #78 — please merge that first.

Summary

Add Graviton (ARM64) cost optimization to the gcp-to-aws migration skill. When compute workloads are eligible for AWS Graviton processors, the skill now defaults to ARM64 instance types (~15–20% hourly price savings) and surfaces the decision appropriately through the migration phases.

Problem

The migration skill always emitted x86 instance types regardless of workload compatibility. Graviton instances offer 15–20% lower per-hour pricing at the same vCPU/memory spec, but the skill had no mechanism to detect eligibility, ask the user when ambiguous, or emit ARM64 targets in Terraform output.

Solution

| Area | Change |
| --- | --- |
| Shared references | New graviton.md (single source of truth for tiers, mappings, phase behavior) and schema-graviton.md (JSON schemas for graviton_profile, architecture_comparison, design block) |
| Discover | Emit graviton_profile per compute service from app code signals (Step 2.7 in discover-app-code.md) and IaC signals (discover-iac.md) |
| Clarify | Add Q11b decision table: defaults to Graviton when all services are ready; only asks when conditional/unknown with risk signals |
| Design | Branch on cpu_architecture preference to emit Graviton or x86 instance families per service |
| Estimate | Model hourly price discount only; emit architecture_comparison block in estimation-infra.json; add Graviton rows to pricing-cache.md |
| Generate | Emit ARM64 in Terraform (Fargate runtime_platform, Lambda architectures, EC2 Graviton families); add "Graviton Migration Notes" to output docs |
| SKILL.md | Register graviton.md conditional loading rule |
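
To make the Generate row concrete, here is a minimal sketch of how the ARM64 selection could be expressed per target service. The Terraform argument names (`runtime_platform.cpu_architecture`, `architectures`, `instance_type`) follow the AWS provider; the function name, the `kind` values, and the family table are hypothetical and are not taken from this PR's files. The only x86-to-Graviton pair named in this PR is t3 to t4g (compute.md Example 3), so the table holds just that entry.

```python
from typing import Any, Dict, Optional

# Assumption: only the t3 -> t4g pair is listed because it is the one
# mapping this PR names explicitly; the real mapping lives in graviton.md.
X86_TO_GRAVITON_FAMILY = {"t3": "t4g"}


def arm64_terraform_args(kind: str, instance_type: Optional[str] = None) -> Dict[str, Any]:
    """Return the Terraform arguments that select ARM64 for one service.

    `kind` is a hypothetical label: "fargate", "lambda", or "ec2".
    """
    if kind == "fargate":
        # aws_ecs_task_definition: runtime_platform block
        return {
            "runtime_platform": {
                "cpu_architecture": "ARM64",
                "operating_system_family": "LINUX",
            }
        }
    if kind == "lambda":
        # aws_lambda_function: architectures argument
        return {"architectures": ["arm64"]}
    if kind == "ec2":
        if instance_type is None:
            raise ValueError("ec2 requires an instance_type")
        family, _, size = instance_type.partition(".")
        graviton_family = X86_TO_GRAVITON_FAMILY.get(family)
        if graviton_family is None:
            # Unmapped family: leave the instance type unchanged.
            return {"instance_type": instance_type}
        return {"instance_type": f"{graviton_family}.{size}"}
    raise ValueError(f"unsupported kind: {kind}")
```

For example, `arm64_terraform_args("ec2", "t3.medium")` yields `{"instance_type": "t4g.medium"}`, while an unmapped family is passed through untouched rather than guessed.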

Compatibility tiers

  • ready — Python, Node.js, Go, PHP, Ruby, pure-JVM Java, .NET 6+ on Linux, managed services (RDS MySQL/PostgreSQL, Aurora, ElastiCache, Lambda, Fargate)
  • conditional — JNI deps, niche C extensions, Rust/C/C++ recompile, native gem extensions, vendor AMIs
  • incompatible — Windows/.NET Framework, GPU/CUDA, RDS SQL Server
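
The tiers above can be read as a precedence rule, sketched below. This is an illustration only: the signal and runtime labels are hypothetical stand-ins for the detection tables in schema-graviton.md, and the ordering (incompatible wins over conditional, which wins over ready, with everything else `unknown`) is an assumption rather than something this PR states.

```python
from typing import Iterable

# Hypothetical labels derived from the tier list above.
INCOMPATIBLE_SIGNALS = {"windows", "dotnet-framework", "gpu", "cuda", "rds-sqlserver"}
CONDITIONAL_SIGNALS = {"jni", "c-extension", "native-gem", "vendor-ami"}
CONDITIONAL_RUNTIMES = {"rust", "c", "cpp"}  # need an ARM64 recompile
READY_RUNTIMES = {"python", "nodejs", "go", "php", "ruby", "java", "dotnet6-linux"}


def classify_tier(runtime: str, signals: Iterable[str]) -> str:
    """Classify one compute service as ready, conditional, incompatible, or unknown."""
    found = set(signals)
    if found & INCOMPATIBLE_SIGNALS:
        return "incompatible"
    if found & CONDITIONAL_SIGNALS or runtime in CONDITIONAL_RUNTIMES:
        return "conditional"
    if runtime in READY_RUNTIMES:
        return "ready"
    # No recognized runtime and no recognized signal.
    return "unknown"
```

Under these assumptions a pure-JVM Java service is `ready`, the same service with a JNI dependency is `conditional`, and any service with a CUDA signal is `incompatible` regardless of runtime.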

Design decisions

  • Default to Graviton (no Clarify question) when ALL services are ready — consistent with existing db.t4g defaults
  • Model only hourly price savings in Estimate; never count performance uplift in automated math
  • Report rendering is deferred to a follow-up after #78 ("Enforce comprehensive migration HTML reports with post-write validation") merges (avoids file conflicts with the validator)
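
The first design decision amounts to a small predicate over the discovered profiles. The sketch below covers only the two cases stated in this description (skip when every profile is ready, ask when a conditional or unknown profile carries risk signals); the full decision table in clarify-compute.md has more rows, such as the IaC-only case. The `tier` and `risk_signals` keys are assumed field names, not a copy of the graviton_profile schema.

```python
from typing import Any, Iterable, Mapping


def should_ask_q11b(profiles: Iterable[Mapping[str, Any]]) -> bool:
    """Return True when Q11b should be asked instead of auto-defaulting."""
    for profile in profiles:
        risky_tier = profile.get("tier") in ("conditional", "unknown")
        # A non-empty risk_signals list is what triggers the question.
        if risky_tier and profile.get("risk_signals"):
            return True
    return False


# Hypothetical profiles for illustration.
all_ready = [{"service": "api", "tier": "ready", "risk_signals": []}]
with_risk = all_ready + [{"service": "worker", "tier": "conditional", "risk_signals": ["jni"]}]
```

With these inputs, `should_ask_q11b(all_ready)` is `False` (the Graviton default is applied silently) and `should_ask_q11b(with_risk)` is `True`.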

Test plan

  • Validated on SF Beach Terraform fixture (Cloud Run → Fargate ARM64, Cloud SQL → RDS db.t4g, Redis → ElastiCache cache.t4g)
  • Confirmed Q11b fires only when conditional/unknown profiles with risk signals exist
  • Confirmed Q11b is skipped (default applied) when all profiles are ready
  • estimation-infra.json emits architecture_comparison block with correct savings math
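
As a rough illustration of the savings math in the last item, the sketch below compares monthly cost at two hourly prices and nothing else, in line with the rule that performance uplift is never counted. The output keys are hypothetical (the real shape is defined in schema-graviton.md), 730 hours per month is an assumed convention, and the prices in the usage note are placeholders, not values from pricing-cache.md.

```python
from typing import Dict

HOURS_PER_MONTH = 730  # assumption: a common monthly-hours convention


def architecture_comparison(x86_hourly: float, arm64_hourly: float,
                            instance_count: int = 1) -> Dict[str, float]:
    """Compare estimated monthly costs using the hourly price difference only."""
    x86_monthly = x86_hourly * HOURS_PER_MONTH * instance_count
    arm64_monthly = arm64_hourly * HOURS_PER_MONTH * instance_count
    savings = x86_monthly - arm64_monthly
    # Guard against a zero baseline so the percentage is always defined.
    savings_pct = (savings / x86_monthly * 100) if x86_monthly else 0.0
    return {
        "x86_monthly": round(x86_monthly, 2),
        "arm64_monthly": round(arm64_monthly, 2),
        "monthly_savings": round(savings, 2),
        "savings_pct": round(savings_pct, 1),
    }
```

With placeholder prices of 0.10 and 0.08 per hour, one instance comes to 73.0 versus 58.4 per month, a saving of 14.6 or 20.0%.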

Files changed (13)

All within migrate/plugins/migration-to-aws/skills/gcp-to-aws/:

  • SKILL.md — conditional load rule
  • references/shared/graviton.md — new
  • references/shared/schema-graviton.md — new
  • references/shared/pricing-cache.md — Graviton pricing rows
  • references/shared/schema-estimate-infra.md — architecture_comparison schema
  • references/phases/discover/discover-app-code.md — Step 2.7
  • references/phases/discover/discover-iac.md — Graviton signals from machine_type
  • references/phases/clarify/clarify-compute.md — Q11b decision table
  • references/phases/clarify/clarify.md — batch index + defaults table
  • references/design-refs/compute.md — branch on cpu_architecture
  • references/phases/estimate/estimate-infra.md — architecture_comparison emission
  • references/phases/generate/generate-artifacts-infra.md — ARM64 Terraform
  • references/phases/generate/generate-artifacts-docs.md — Graviton Migration Notes section

Logan Kleier and others added 17 commits June 17, 2026 09:34
Expand generate-artifacts-report.md with combined TCO, security baseline
component costs, gap analysis, and assumptions sections; add an HTML validator
script, reference fixture, and tests so stub appendices fail validation instead
of shipping as complete reports.

Co-authored-by: Cursor <cursoragent@cursor.com>
…ults.

Fix broken TOC anchors in the reference fixture; validate nav links match
section ids; require exactly one <section id> per required block; scope
GuardDuty checks to cost/security appendices; gate exec-tco on both estimate
JSON files; default to rename-on-fail; expand test coverage to 11 cases.

Co-authored-by: Cursor <cursoragent@cursor.com>
…RT_OK scope.

Validate reference HTML with estimation-infra/ai fixture JSON; replace dead
GuardDuty regex with component dollar matching; document that REPORT_OK means
structure complete not numerically audited; expand tests to 14 cases.

Co-authored-by: Cursor <cursoragent@cursor.com>
Add Generate-phase report overview, REPORT_OK scope, pytest/validator
commands, and fix stale skill/docs paths in Development section.

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace the hardcoded SF Beach path with STUB_FAIL pytest coverage, add a
committed stub fixture for manual CLI validation, and use word-boundary
regex in _dollar_amount_present so $13 does not match $130.

Co-authored-by: Cursor <cursoragent@cursor.com>
Drop project-specific structural-reference wording from the PR-facing
README and reference HTML footer.

Co-authored-by: Cursor <cursoragent@cursor.com>
Add readability gates (no Rubric:/Section N headings), rewrite the reference
report with security teaser + appendix split, and document conventions in the
spec so generated reports stay exec-friendly.

Co-authored-by: Cursor <cursoragent@cursor.com>
Require exec-security-teaser when security_baseline exists, a verdict banner
when recommendation exists, and --migration-dir canary checks on real runs;
refresh the stub regression fixture and wire --migration-dir into Generate Step 4.

Co-authored-by: Cursor <cursoragent@cursor.com>
Reframe Stay if, risk severity, deferred callout, and estimate/report spec
so phased infra+AI migration proceeds while analytics target is evaluated
in parallel with AWS account team or data partner.

Co-authored-by: Cursor <cursoragent@cursor.com>
Correct observability messaging (5 GB logs / 10 metrics / 10 alarms always-free vs GCP's larger tier), subtract allowances in Step 4, and recascade SF Beach reference totals ($118→$112 infra, ~$503 combined savings).

Co-authored-by: Cursor <cursoragent@cursor.com>
Ensure user-facing cost output consistently uses 'estimated monthly
costs' phrasing across the estimate phase summaries, report template,
README, and report fixture. Adds an explicit cost-labeling rule to the
estimate orchestrator and each sub-estimate Present Summary.
…xec flow

Apply Anthropic frontend-design skill principles to the migration report:

- "Name things by what people control, not how the system is built":
  add a post-write gate so executive-flow sections (decision-summary,
  exec-*) cannot expose artifact filenames (*.json) or Terraform resource
  IDs (aws_<resource>.<name>). Those identifiers stay in the technical
  appendices, which remain exempt. Gated under --no-readability.
- "Structure is information": clarify the numbered-heading ban targets
  decorative labels only — genuine sequences (cluster order, phased weeks,
  migration phases, rollback steps) keep their numbering.
- "One name per concept": require a single consistent label for the
  recommended model and cost tier across verdict, tables, and appendices.

Updates the reference fixture to reader-facing copy in exec-tco and
exec-security-teaser, documents check awslabs#14 + the sequence nuance in
validate-migration-report.md and generate-artifacts-report.md, and adds
4 tests (28 pass).
Adds Graviton/ARM64 as the default architecture for eligible compute,
wired through all six phases of the gcp-to-aws skill.

New canonical references:
- shared/graviton.md — compatibility tiers, GCP→Graviton mapping, per-phase rules
- shared/schema-graviton.md — graviton_profile, cpu_architecture, architecture_comparison

Phase wiring:
- Discover (app-code + IaC): emit graviton_profile per compute service
- Clarify: Q11b — conditional Graviton question; auto-default when all services are tier=ready
- Design (compute): branch on design_constraints.cpu_architecture; emit graviton block
- Estimate: Part 2C architecture_comparison (hourly price discount only) + EC2 dual-arch MCP recipe
- Generate: emit runtime_platform ARM64 / architectures=["arm64"] / Graviton instance types

Pricing cache: add verified us-east-1 ARM64 rows (t4g/m7g/c7g/r7g, Fargate ARM64,
Lambda arm64); uncached sizes fall back to the awspricing MCP.

SKILL.md: register graviton.md as a conditional reference (compute/db/cache present
or graviton_profile exists); add Graviton CPU-architecture default.

Scope: gcp-to-aws only. Heroku parity and the report numeric cross-check
(pairs with PR awslabs#78) are follow-ups.
…fy Graviton logic

(a) Report/validator honesty:
- Soften the over-stated claim that the report validator enforces Graviton
  numeric consistency. PR awslabs#78's validate-migration-report.py is a
  structural/readability gate and explicitly does NOT audit dollar figures.
- Document architecture_comparison as an optional field in
  schema-estimate-infra.md (additive; outside PR awslabs#78's edited region).
- Specify report rendering + a validate-migration-report.py numeric assertion
  as an explicit follow-up that lands on top of PR awslabs#78 (graviton.md
  "Report rendering"). No edits to PR awslabs#78-owned report/validator files.

(c) Clarify fallbacks:
- Replace Q11b fire/skip prose with an exhaustive decision table.
- Define "risk signals" precisely, mirroring schema-graviton.md detection tables.

Avoids collision with open PR awslabs#78 (no shared report/validator file edits).
Addresses second Cursor pass:
1. clarify.md orchestrator now registers Q11b — phase index (Q8–Q11b),
   Category C question list, and a validation-checklist item requiring
   cpu_architecture when compute is present (written even when auto-defaulted).
2. clarify-compute.md Q11b decision table: explicit row for IaC-only
   profiles (source=iac, no app_code profile; e.g. machine_type-only
   conditional) → ask, removing the strict-agent ambiguity.
3. Document database/cache cpu_architecture propagation as a tracked v1
   limitation (managed DB/cache stay Graviton-default; no compatibility
   risk — consistency follow-up).
4. Mark billing-only graviton_profile as planned/not-yet-emitted in
   schema-graviton.md; Q11b row 1 covers billing-sourced compute meanwhile.
5. compute.md Example 3: t3.medium → t4g.medium (Graviton default).

No edits to PR awslabs#78-owned files.
…Notes

Closes the load-condition inconsistency surfaced while tracing outputs:

- SKILL.md conditional-load condition for graviton.md now reads
  "Design, Estimate, Generate" (was "Design and Estimate only") — so the
  Generate phase is authorized to load it, matching the existing reference
  to graviton.md in generate-artifacts-infra.md.
- graviton.md top note updated to match.
- generate-artifacts-docs.md: add a "Graviton Migration Notes" subsection
  (Section 4, infra track) — services moved to arm64, conditional caveats,
  the arm64 docker build step, and the load-test-before-downsizing note.
  Renders only when a compute resource targets arm64; skipped on x86 opt-out.

Result: the Graviton arm64 guidance now reliably reaches the generated
MIGRATION_GUIDE, not just the Terraform.
Optional polish from review: add a Q11b (CPU architecture) row to the
Defaults Table with the auto-default values, and name Q11b explicitly in
the Batch 2 planning table. The Step 5 structural checklist is left as-is
since the handoff Validation Checklist gate (#4) already enforces
cpu_architecture when compute is present.
@herosjourney herosjourney requested a review from a team as a code owner June 29, 2026 00:30