khaledzaky.com

My personal website and blog — khaledzaky.com

Built with Astro and Tailwind CSS, deployed on AWS, with an AI-powered blog agent that researches, drafts, and publishes posts with human-in-the-loop approval.


Architecture Overview

```mermaid
graph LR
    subgraph "Content & Build"
        A[Markdown Posts] --> B[Astro v5 + Tailwind]
        B --> C[Static HTML/CSS/JS]
    end

    subgraph "CI/CD"
        D[GitHub Push] --> E[AWS CodeBuild]
        E --> F[S3 Bucket]
        F --> G[CloudFront CDN]
    end

    C --> D
    G --> H[khaledzaky.com]
```

```mermaid
graph TD
    subgraph "AI Blog Agent"
        EM[Email — your draft/bullets/ideas] --> IG[Ingest Lambda]
        CLI[CLI Trigger] --> SF
        IG --> SF[Step Functions]
        SF --> R[Research Lambda — Tavily web search + enrich]
        R -->|Claude Sonnet 4.6| D[Draft Lambda — voice profile + polish]
        D --> VC[Verify Lambda — URL + citation check]
        VC --> CH[Chart Lambda — SVG generation]
        CH --> N[Notify Lambda]
        N -->|SNS Email| U[Human Review]
        U -->|Approve| AP[Approve Lambda]
        U -->|Request Revisions| AP
        U -->|Reject| AP
        AP -->|Approved| P[Publish Lambda]
        AP -->|Revise with Feedback| D
        P -->|GitHub API — post + charts| GH[GitHub Commit]
        GH --> CB[CodeBuild Auto-Deploy]
    end

    subgraph "AWS Services"
        SES[Amazon SES] -.-> IG
        S3[S3 — drafts, charts, voice profile] -.-> N
        S3 -.-> P
        S3 -.-> CH
        S3 -.-> D
        BK[Amazon Bedrock] -.-> R
        BK -.-> D
        TV[Tavily Web Search] -.-> R
        AG[API Gateway] -.-> AP
    end
```

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Astro v5 with Tailwind CSS v4 (@tailwindcss/vite + @tailwindcss/typography) |
| Content | Markdown with Astro content collections |
| Build | AWS CodeBuild (Node.js 20) |
| Hosting | Amazon S3 (OAC-locked) + CloudFront (HTTPS-only, compressed, security headers) |
| TLS | AWS Certificate Manager |
| AI Model | Claude Sonnet 4.6 via Amazon Bedrock (with voice profile) |
| Web Search | Tavily API (real-time web sources for citations) |
| Charts & Diagrams | SVG bar/donut charts (from numeric data) + conceptual diagrams: comparison, progression, stack, convergence, venn (LLM-detected, code-rendered). All support light/dark mode via CSS custom properties. SVGs are inlined at runtime via client-side JS for dark mode support |
| Orchestration | AWS Step Functions |
| Approval | API Gateway HTTP API + Lambda |
| Notifications | Amazon SNS (email) |
| Email Ingest | Amazon SES (inbound) + Route 53 MX |
| DNS | Amazon Route 53 |
| Secrets | AWS SSM Parameter Store (SecureString) — GitHub token + Tavily API key |
| Source Control | GitHub (master branch, webhook-triggered deploys) |

Project Structure

```
khaledzaky.com/
├── src/
│   ├── components/       # Astro components (Header, Footer, SectionCard, CredibilityRow)
│   ├── content/blog/     # Markdown blog posts (content collection)
│   ├── layouts/          # BaseLayout, BlogPost layout
│   ├── pages/            # index, about, work, blog routes (rss.xml.js)
│   ├── plugins/          # Rehype plugins (lazy images)
│   └── styles/           # Global CSS (Tailwind v4 @theme + design tokens)
├── public/               # Static assets (images, favicon)
├── agent/                # AI blog agent (Lambda functions + IaC)
│   ├── research/         # Enriches author's points with data & citations
│   ├── draft/            # Polishes author content using voice profile
│   ├── chart/            # Renders SVG charts + conceptual diagrams (5 types)
│   │   └── renderers/    # Modular renderers: bar, pie, comparison, progression, stack, convergence, venn
│   ├── notify/           # SNS email with one-click approve/revise/reject
│   ├── approve/          # API Gateway handler for approval + revision feedback
│   ├── publish/          # Commits posts + chart images to GitHub
│   ├── ingest/           # SES email trigger — parses author content & directives
│   ├── voice-profile.md  # Author voice & style guide (injected into prompts)
│   ├── tests/            # Smoke tests (handler imports, signatures, renderers)
│   ├── ruff.toml         # Python linter configuration
│   ├── template.yaml     # CloudFormation template
│   └── deploy.sh         # One-command deployment script
├── infra/                # Site infrastructure IaC
│   ├── template.yaml     # CloudFormation — CloudFront, IAM, monitoring, CloudTrail
│   ├── storage.yaml      # CloudFormation — S3 bucket (us-east-2)
│   └── deploy.sh         # Deployment script (requires email, cert ARN, zone ID)
├── .github/workflows/   # GitHub Actions CI (Node build + audit, Python lint + tests)
├── RECOVERY.md           # Disaster recovery runbook
├── buildspec.yml         # AWS CodeBuild build specification
├── astro.config.mjs      # Astro configuration
└── package.json
```

Running Locally

```bash
# Clone the repo
git clone https://github.com/kzaky/khaledzaky.com.git
cd khaledzaky.com

# Install dependencies
npm install

# Start dev server
npm run dev
# → http://localhost:4321

# Build for production
npm run build
# → outputs to dist/
```

Deployment

Deployment is fully automated. Pushing to master triggers AWS CodeBuild, which:

  1. Installs dependencies (npm ci)
  2. Audits dependencies (npm audit --audit-level=high)
  3. Builds the site (npm run build)
  4. Syncs dist/ to the S3 bucket (--delete to remove stale files)
  5. Invalidates the CloudFront cache (/*)

```mermaid
sequenceDiagram
    participant Dev as Developer
    participant GH as GitHub
    participant CB as CodeBuild
    participant S3 as S3 Bucket
    participant CF as CloudFront

    Dev->>GH: git push origin master
    GH->>CB: Webhook trigger
    CB->>CB: npm ci && npm audit && npm run build
    CB->>S3: aws s3 sync dist/
    CB->>CF: create-invalidation (/*)
    CF->>CF: Cache refreshed
```

AI Blog Agent

The blog agent is your editor, not your ghostwriter. You provide your draft, bullets, or ideas — the agent enriches them with research and data, polishes the prose in your voice, generates data-driven charts and conceptual diagrams, and publishes with your approval.

How It Works

  1. Trigger — Send an email to blog@khaledzaky.com with your draft/bullets in the body, or run the CLI
  2. Ingest (email only) — SES receives the email; Ingest Lambda parses author content and optional directives (Categories, Tone, Hero)
  3. Research — Generates 5-8 targeted search queries via Claude Haiku, fetches results via Tavily (8 results/query), fetches full article text for top results, then enriches the author's points with supporting data and verified inline citations ([text](url) format). A thinking plan pass (Sonnet invoke_model+thinking) frames research angles. A cross-reference fact-check pass (Haiku) verifies key claims against sources before they reach the draft
  4. Draft — Claude Sonnet 4.6 (with extended thinking via invoke_model) plans and polishes the author's content using an injected voice profile. Four deterministic Haiku passes follow: chart placeholder insertion, diagram placeholder insertion, citation audit, and voice profile compliance audit. Accepts Goal/Avoid/Analogies directives from the ingest step
  5. Chart & Diagram — Handles two types of visuals: (1) matches structured data points to <!-- CHART: --> placeholders and renders SVG bar/donut charts, (2) parses <!-- DIAGRAM: --> placeholders and renders conceptual SVG diagrams (comparison, progression, stack, convergence, venn). All visuals use the site's color palette with light/dark mode support (CSS custom properties + .dark class)
  6. Notify — Draft (with charts and diagrams) is saved to S3 and a full-text email is sent with a presigned download link and three one-click actions
  7. Review — The pipeline pauses and waits for human action (up to 7 days):
    • Approve — publishes the post and charts immediately
    • Request Revisions — opens a feedback form; the agent revises and re-sends
    • Reject — discards the draft
  8. Publish — On approval, the post, chart images, and diagram SVGs are committed to GitHub via API, triggering auto-deploy
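
The placeholder handling in step 5 can be illustrated with a short sketch. The exact placeholder syntax is an assumption here (a free-text spec after `CHART:` or `DIAGRAM:`), and `find_placeholders`, `replace_placeholders`, and the `render` callback are hypothetical names, not the agent's actual code.

```python
import re
from typing import Callable, NamedTuple

# Assumed shape: <!-- CHART: free text --> or <!-- DIAGRAM: free text -->
PLACEHOLDER_RE = re.compile(r"<!--\s*(CHART|DIAGRAM):\s*(.*?)\s*-->", re.DOTALL)


class Placeholder(NamedTuple):
    kind: str   # "CHART" or "DIAGRAM"
    spec: str   # text between the colon and the closing -->
    start: int  # character offsets in the draft
    end: int


def find_placeholders(markdown: str) -> list[Placeholder]:
    """List every chart/diagram placeholder in document order."""
    return [
        Placeholder(m.group(1), m.group(2), m.start(), m.end())
        for m in PLACEHOLDER_RE.finditer(markdown)
    ]


def replace_placeholders(markdown: str, render: Callable[[str, str], str]) -> str:
    """Swap each placeholder for whatever render(kind, spec) returns."""
    return PLACEHOLDER_RE.sub(lambda m: render(m.group(1), m.group(2)), markdown)
```

A caller would pass a `render` function that returns an image reference or inline SVG for each placeholder.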

Deploying the Agent

Prerequisites:

  • AWS CLI configured with appropriate credentials
  • GitHub Personal Access Token stored in SSM:
    ```bash
    aws ssm put-parameter --name "/blog-agent/github-token" \
      --type SecureString --value "ghp_YOUR_TOKEN"
    ```
  • Amazon Bedrock model access enabled for Anthropic Claude

Deploy:

```bash
cd agent
./deploy.sh your-email@example.com
```

Confirm the SNS email subscription when you receive it.

Triggering a New Post

Option 1: Email (preferred)

Send an email from your authorized address to blog@khaledzaky.com:

  • Subject = your blog topic or title idea
  • Body = your draft, bullets, ideas, or stream of consciousness
  • Optional directives: Categories: tech, cloud, Tone: more technical, Hero: yes, Goal: reader takeaway, Avoid: vendor comparisons, Analogies: distributed tracing

The agent uses your content as the skeleton and polishes it in your voice.
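
As a rough illustration of how the optional directives could be separated from the body, here is a minimal sketch. It assumes one `Name: value` directive per line; `parse_email_body` is a hypothetical helper, and the real Ingest Lambda may parse email differently.

```python
import re

# Directive names listed above; matching is case-insensitive.
KNOWN_DIRECTIVES = ("categories", "tone", "hero", "goal", "avoid", "analogies")
DIRECTIVE_RE = re.compile(
    r"^\s*(%s)\s*:\s*(.+?)\s*$" % "|".join(KNOWN_DIRECTIVES), re.IGNORECASE
)


def parse_email_body(body: str) -> tuple[dict, str]:
    """Split an email body into (directives, author_content)."""
    directives: dict = {}
    content_lines = []
    for line in body.splitlines():
        match = DIRECTIVE_RE.match(line)
        if match:
            directives[match.group(1).lower()] = match.group(2)
        else:
            content_lines.append(line)

    # Normalize the two directives that are not free text.
    if "categories" in directives:
        directives["categories"] = [
            c.strip() for c in directives["categories"].split(",") if c.strip()
        ]
    if "hero" in directives:
        directives["hero"] = directives["hero"].lower() in ("yes", "true", "1")

    return directives, "\n".join(content_lines).strip()
```

One limitation of this approach: a content line that happens to start with a directive name and a colon is treated as a directive.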

Option 2: CLI

```bash
aws stepfunctions start-execution \
  --state-machine-arn $(aws cloudformation describe-stacks \
    --stack-name blog-agent \
    --query 'Stacks[0].Outputs[?OutputKey==`StateMachineArn`].OutputValue' \
    --output text) \
  --input '{"topic": "Your topic here", "categories": ["tech", "cloud"], "author_content": "Your bullets and ideas..."}'
```
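
The same trigger can be sketched in Python. This is an illustration, not part of the repo: `find_state_machine_arn` and `start_post` are hypothetical helpers, and the clients are passed in so the functions can be exercised without AWS credentials. With boto3 installed, you would pass `boto3.client("cloudformation")` and `boto3.client("stepfunctions")`.

```python
import json


def find_state_machine_arn(cfn_client, stack_name: str = "blog-agent") -> str:
    """Read the StateMachineArn output from the CloudFormation stack."""
    stacks = cfn_client.describe_stacks(StackName=stack_name)["Stacks"]
    for output in stacks[0].get("Outputs", []):
        if output["OutputKey"] == "StateMachineArn":
            return output["OutputValue"]
    raise LookupError(f"StateMachineArn output not found on stack {stack_name}")


def start_post(sfn_client, state_machine_arn, topic, categories, author_content) -> str:
    """Start one pipeline execution and return its execution ARN."""
    payload = {
        "topic": topic,
        "categories": list(categories),
        "author_content": author_content,
    }
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),  # Step Functions expects a JSON string
    )
    return response["executionArn"]
```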

Agent Architecture

```mermaid
stateDiagram-v2
    [*] --> Research
    Research --> Draft
    Draft --> GenerateCharts
    GenerateCharts --> NotifyForReview
    NotifyForReview --> CheckApproval: Human clicks link
    CheckApproval --> Publish: Approved
    CheckApproval --> Revise: Request Revisions
    CheckApproval --> Rejected: Rejected
    Revise --> GenerateCharts: Re-draft + re-chart
    Research --> PipelineFailed: Error (after retries)
    Draft --> PipelineFailed: Error (after retries)
    GenerateCharts --> PipelineFailed: Error (after retries)
    NotifyForReview --> PipelineFailed: Error (after retries)
    Publish --> PipelineFailed: Error (after retries)
    Publish --> [*]
    Rejected --> [*]
    PipelineFailed --> [*]
```
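
To make the "Error (after retries)" transitions concrete, the sketch below computes the wait schedule a Step Functions retrier produces. The field names follow the Amazon States Language, but the values in `RETRY_POLICY` are assumptions for illustration, not the settings in `template.yaml`.

```python
# Hypothetical retrier, shaped like an Amazon States Language "Retry" entry.
RETRY_POLICY = {
    "ErrorEquals": ["States.ALL"],
    "IntervalSeconds": 2,   # wait before the first retry
    "MaxAttempts": 3,       # number of retries before Catch takes over
    "BackoffRate": 2.0,     # multiplier applied to each subsequent wait
}


def retry_delays(policy: dict) -> list[float]:
    """Seconds waited before each retry: IntervalSeconds * BackoffRate**n."""
    return [
        policy["IntervalSeconds"] * policy["BackoffRate"] ** n
        for n in range(policy["MaxAttempts"])
    ]


print(retry_delays(RETRY_POLICY))  # [2.0, 4.0, 8.0]
```

Once the last retry fails, the Catch block routes the execution to PipelineFailed.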

Why Step Functions Instead of LangChain/LangGraph

The blog agent uses AWS Step Functions with single-purpose Lambda functions instead of an agentic framework like LangChain or LangGraph. This is a deliberate choice:

  • Deterministic workflow — The pipeline is a known sequence (Research → Draft → Chart → Review → Publish), not a dynamic reasoning loop. Step Functions is purpose-built for this.
  • Debuggability — Every step is visible in the Step Functions console with full input/output. Agent framework loops are significantly harder to trace and debug.
  • Cost predictability — Fixed number of LLM calls per run. No risk of runaway tool-use loops.
  • Native HITL — waitForTaskToken handles human-in-the-loop approval without needing a persistence backend (Postgres, Redis) or custom resume logic.
  • Operational maturity — Retries with exponential backoff, catch blocks, DLQ, X-Ray tracing, CloudWatch alarms, and per-function IAM least privilege come from the infrastructure, not framework abstractions.
  • No framework dependency — Plain Python + Bedrock SDK. No version churn, no breaking API changes, no transitive dependency surface.

The tradeoff: the agent can't dynamically decide to call tools or branch its own reasoning. If the Draft step finds a research gap, it can't autonomously loop back. That's acceptable here: the workflow is well-defined, and its simplicity is a feature, not a limitation.
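
For readers unfamiliar with the waitForTaskToken pattern, here is a minimal sketch of what the approval side can look like. `handle_decision` and the `decision`/`feedback` output fields are assumptions for illustration; only `send_task_success(taskToken=..., output=...)` is the real Step Functions API call. The client is injected, so in a Lambda you would pass `boto3.client("stepfunctions")`.

```python
import json

VALID_ACTIONS = ("approve", "revise", "reject")


def handle_decision(sfn_client, task_token: str, action: str, feedback: str = "") -> dict:
    """Resume the paused execution with the reviewer's decision.

    The state machine is assumed to branch on output["decision"] in a
    Choice state after the wait, so every outcome resumes via success.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    output = {"decision": action, "feedback": feedback}
    sfn_client.send_task_success(taskToken=task_token, output=json.dumps(output))
    return output
```

Because the paused state lives inside Step Functions, the handler needs no database lookup: the task token in the approval link is enough to resume the right execution.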

Security

  • Secrets — GitHub token and Tavily API key stored in SSM Parameter Store as SecureString, never in code or environment variables
  • IAM — Each Lambda function has its own IAM role scoped to only the permissions it needs (e.g., Approve can only send task tokens, Ingest can only read inbound email and start executions, Publish can only read drafts and access the GitHub token)
  • API Gateway — Approval endpoint is public but uses one-time Step Functions task tokens that expire after 7 days
  • S3 — AES-256 server-side encryption enabled; all public access blocked (4/4 settings); Origin Access Control (OAC) restricts reads to CloudFront only; S3 website hosting disabled
  • SES — TLS required on inbound email; spam and virus scanning enabled; only authorized sender processed
  • Encryption — SSM parameters use AWS-managed KMS
  • No hardcoded credentials — All sensitive values injected via environment variables or SSM at runtime
  • Lifecycle — Draft objects auto-expire after 90 days
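
As an illustration of the runtime secret lookup, the sketch below reads a SecureString from SSM and caches it for the life of the Lambda container. `read_secret` and the cache are hypothetical; `get_parameter(Name=..., WithDecryption=True)` is the real SSM call, and the parameter name comes from the deployment prerequisites. In a Lambda you would pass `boto3.client("ssm")`.

```python
# Module-level cache: survives across warm invocations of one container.
_secret_cache: dict[str, str] = {}


def read_secret(ssm_client, name: str) -> str:
    """Fetch and decrypt a SecureString parameter, once per container."""
    if name not in _secret_cache:
        response = ssm_client.get_parameter(Name=name, WithDecryption=True)
        _secret_cache[name] = response["Parameter"]["Value"]
    return _secret_cache[name]


# Usage (requires AWS credentials and boto3):
#   token = read_secret(boto3.client("ssm"), "/blog-agent/github-token")
```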

Cost Estimate

The agent is designed to be extremely cheap to run:

| Resource | Cost |
| --- | --- |
| Lambda (9 functions, ~30s/invocation) | ~$0.00 per post |
| Step Functions (1 execution) | ~$0.00 per post |
| Bedrock Claude Sonnet 4.6 + Haiku (~10 calls/post: research query gen, enrichment, data extraction, cross-ref fact-check, draft, chart placement, diagram detection, citation audit, voice audit, citation verification) | ~$0.12 per post |
| Tavily web search (5-8 LLM-generated queries/post, free tier: 1,000/month) | ~$0.00 |
| S3 (draft storage) | ~$0.00 |
| SNS (1 email) | ~$0.00 |
| API Gateway (1-3 requests) | ~$0.00 |
| SES (1 inbound email) | ~$0.00 |
| Total per post | ~$0.12 |

At 10 posts/month, the agent costs roughly $1.20/month. The website infrastructure itself costs ~$3.50/month (primarily Route 53 hosted zone fees).
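
The arithmetic behind these figures, as a small sketch you can adapt to your own posting volume. The two constants are the estimates quoted above; `monthly_cost` is a hypothetical helper.

```python
from decimal import Decimal

COST_PER_POST = Decimal("0.12")          # agent cost per post (Bedrock dominates)
SITE_INFRA_PER_MONTH = Decimal("3.50")   # website hosting, mostly Route 53


def monthly_cost(posts_per_month: int) -> tuple[Decimal, Decimal]:
    """Return (agent cost, agent + site infrastructure) in USD per month."""
    agent = COST_PER_POST * posts_per_month
    return agent, agent + SITE_INFRA_PER_MONTH


agent, total = monthly_cost(10)
print(f"agent: ${agent}, agent + site: ${total}")
# agent: $1.20, agent + site: $4.70
```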

Infrastructure Hardening

The hosting infrastructure has been hardened across security, performance, and cost:

| Area | Detail |
| --- | --- |
| S3 Access | Public access fully blocked; Origin Access Control (OAC) restricts reads to CloudFront distribution ARN only |
| HTTPS | HTTP requests 301-redirect to HTTPS; TLS 1.2 minimum enforced |
| Security Headers | CSP, HSTS (preload), X-Frame-Options, X-Content-Type-Options, Referrer-Policy, X-XSS-Protection, Permissions-Policy, X-Robots-Tag via CloudFront response headers policy |
| Compression | Gzip + Brotli enabled on CloudFront |
| URL Rewriting | CloudFront Function handles index.html resolution (replaces S3 website hosting) |
| Custom Errors | 403 and 404 mapped to /404.html |
| TLS Certificate | Wildcard ACM cert (*.khaledzaky.com + apex), auto-renewing |
| Price Class | PriceClass_100 (NA + EU edge locations) |
| HTTP/2 + HTTP/3 | Both enabled on CloudFront |

Infrastructure as Code

All infrastructure is managed via CloudFormation across three stacks:

| Stack | Region | Resources |
| --- | --- | --- |
| khaledzaky-infra | us-east-1 | CloudFront distribution, OAC, security headers policy, index rewrite function, IAM role, Route 53 health check, CloudWatch alarm + dashboard, CloudTrail |
| khaledzaky-storage | us-east-2 | S3 site bucket (versioning, AES-256 + BucketKey, 90-day lifecycle) |
| blog-agent | us-east-1 | 9 Lambda functions, Step Functions (with retry/catch), SNS, S3 drafts bucket, SQS DLQ, API Gateway (throttled: 5 req/s, burst 10), 3 CloudWatch alarms |

Resources not in CFN (import not supported): CodeBuild project, AWS Budget, S3 bucket policy, Lambda/CodeBuild log group retention (managed via CLI).

Monitoring & Observability

| Area | Detail |
| --- | --- |
| Uptime | Route 53 HTTPS health check (30s interval) → CloudWatch alarm → SNS email if site goes down |
| Dashboard | CloudWatch dashboard (9 sections): CloudFront traffic + origin latency + error rates, Route 53 health check + alarm status grid, CodeBuild deploy frequency + duration, Lambda invocations/errors/duration/throttles (all 9 functions), Step Functions pipeline pass/fail, API Gateway (approve + upload endpoints), SNS delivery, Bedrock token usage (Sonnet 4.6 + Haiku 4.5), billing vs budget + S3 size |
| Alerting | CloudWatch alarms for: pipeline execution failures, Lambda errors, API Gateway 5xx. All notify via SNS |
| Logging | Structured JSON logging (with correlation IDs) on all 9 Lambda functions; 30-day retention on Lambda + CodeBuild log groups, 90-day on Step Functions |
| Dead Letter Queue | SQS DLQ on Ingest Lambda catches failed async invocations from SES |
| Error Handling | Step Functions Retry (with exponential backoff) on all Task states; Catch → PipelineFailed for unrecoverable errors |
| Audit | CloudTrail multi-region trail → S3 (management events) |
| Tracing | X-Ray active on all 9 Lambda functions + Step Functions |
| Budget | $25/month with 80% and 100% email alerts |
| SEO | Google Search Console verified, sitemap + RSS autodiscovery, JSON-LD schema |
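
One way to get structured JSON logs with a correlation ID using only the standard library is sketched below. The field names, the `get_logger` helper, and the way the ID travels in the event are all assumptions; the agent's own logging module may differ.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Set by the LoggerAdapter below; None if logged without one.
            "correlation_id": getattr(record, "correlation_id", None),
        })


def get_logger(name: str, stream=None) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines from root handlers
    if not logger.handlers:
        stream_handler = logging.StreamHandler(stream)
        stream_handler.setFormatter(JsonFormatter())
        logger.addHandler(stream_handler)
    return logger


def lambda_handler(event: dict, context=None) -> dict:
    # Reuse the ID from an earlier step, or mint one at the start of the pipeline.
    correlation_id = event.get("correlation_id") or str(uuid.uuid4())
    log = logging.LoggerAdapter(get_logger("research"), {"correlation_id": correlation_id})
    log.info("research started")
    # Pass the ID along so the next state logs under the same value.
    return {**event, "correlation_id": correlation_id}
```

Carrying the ID in the state output lets a single CloudWatch Logs Insights filter pull every line for one pipeline run across functions.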

License & Copyright

Copyright Khaled Zaky. All rights reserved for the following — you may not reuse without written permission:

  • src/content/blog/ (blog post content)
  • public/img/ (personal images)

The code and styles are licensed under the MIT License.
