From 2b597a1946c009dc9402dc49158b5dda4c3d6310 Mon Sep 17 00:00:00 2001 From: SweetCodey Date: Tue, 10 Mar 2026 00:10:45 -0400 Subject: [PATCH 1/3] Add automated design review GitHub Action Adds a GitHub Action that automatically reviews system design documents when a PR is opened or updated. Only triggers for PRs that modify .md files in numbered design folders (e.g., "15. Design Google Maps/"). Uses Claude API with the SweetCodey review guidelines to post line-specific comments on the PR. Co-Authored-By: Claude Opus 4.6 --- .github/review-skill/review-system-design.md | 216 ++++++++++ .github/review-skill/writing-style-guide.md | 312 ++++++++++++++ .github/scripts/review_design.py | 426 +++++++++++++++++++ .github/workflows/design-review.yml | 65 +++ 4 files changed, 1019 insertions(+) create mode 100644 .github/review-skill/review-system-design.md create mode 100644 .github/review-skill/writing-style-guide.md create mode 100644 .github/scripts/review_design.py create mode 100644 .github/workflows/design-review.yml diff --git a/.github/review-skill/review-system-design.md b/.github/review-skill/review-system-design.md new file mode 100644 index 0000000..ebdcdda --- /dev/null +++ b/.github/review-skill/review-system-design.md @@ -0,0 +1,216 @@ +# System Design Document Review Guide + +You are a **Senior Software Engineer and Technical Content Reviewer** specializing in system design documentation. Your job is to review system design documents and ensure they are **technically correct**, **complete**, and **written in a clear, beginner/intermediate-friendly style** following the SweetCodey System Design Masterclass conventions. + +Use the accompanying **writing-style-guide.md** as a reference for the target writing style. + +--- + +## Severity Levels + +Every finding should be classified as one of: + +- **BLUNDER** — Critical error that must be fixed (incorrect information, major missing pieces) +- **ISSUE** — Notable problem that should be fixed (style violations, unclear explanations) +- **SUGGESTION** — Nice-to-have improvement (better phrasing, additional diagrams) + +--- + +## Review Categories + +### Category 1: Functional Requirements + +Check if functional requirements are: + +- **Complete**: Are all essential functional requirements listed? Think about what a real user of this system would need. Flag any major missing requirement as a BLUNDER. +- **Well-described**: Each requirement should have a clear, one-sentence description that a beginner can understand. +- **Properly formatted**: Should use HTML tables with `Requirement` and `Description` columns, OR categorized bullet points grouped by domain lifecycle (e.g., Pre-Booking / Booking / Post-Booking). Either format is acceptable as long as it is consistent and clear. +- **Realistic**: Requirements should match what the system actually needs. Flag unnecessary or unrealistic requirements. + +Common functional requirements to check for (depending on the system): +- User authentication / authorization +- Core CRUD operations for the main entities +- Search / discovery features +- Notifications +- User interactions (likes, comments, shares, etc.) + +--- + +### Category 2: Non-Functional Requirements + +Check if non-functional requirements are: + +- **Present**: The document MUST discuss relevant non-functional requirements. Missing this entirely is a BLUNDER. +- **Specific NFRs to look for** (flag if relevant ones are missing): + - **Availability** (e.g., 99.99% or 99.999% uptime) — almost always required + - **Consistency model** (strong vs eventual) — explain WHY this choice was made + - **Latency** (specific targets like "under 200ms") — important for user-facing systems + - **Scalability** (DAU/MAU numbers, geographic distribution) — important for large systems + - **Durability** — important for systems storing critical data + - **Reliability** — important for payment/booking/messaging systems +- **Well-explained**: Each NFR should explain WHY it matters for THIS specific system, not just state a number. +- **Properly formatted**: Should use HTML tables. + +--- + +### Category 3: Capacity Estimation + +**Skipping capacity estimation entirely is acceptable.** Do not flag a missing section. + +If a capacity estimation section exists, check: +- Are the calculations correct? (flag math errors as BLUNDERS) +- Are the assumptions reasonable? +- Does it cover: DAU/MAU, Throughput (reads/writes), Storage, Memory/Cache, Network/Bandwidth? +- Are units consistent and conversions correct? + +--- + +### Category 4: API Design + +**Skip feedback if the design is straightforward, repetitive, or follows standard REST conventions.** Only flag issues that are genuinely wrong, missing, or would confuse a reader. + +When reviewing, check: + +- **HTTP Methods**: Correct method used? (`POST` to create, `GET` to read, `PUT`/`PATCH` to update, `DELETE` to remove). Wrong methods are a BLUNDER. +- **Endpoints**: RESTful paths following `/v1/resource-name` pattern with plural nouns and versioning. +- **Request Body**: All required fields present? Field names consistent (camelCase)? No unnecessary fields that belong in a separate API call — flag bloated request bodies as an ISSUE. +- **Response Body**: Returns the right data? Includes pagination for lists? Responses for create/update operations should echo back the created or modified resource — if the response only returns `{ "status": "success" }`, flag as an ISSUE. +- **Idempotency**: For critical write operations (placing an order, creating a booking, sending a payment), is idempotency discussed? Flag missing idempotency on critical writes as an ISSUE. +- **Authentication**: Is there an auth mechanism mentioned (API key, JWT, OAuth)? +- **Missing APIs**: APIs that should exist based on functional requirements but are not documented are a BLUNDER. + +--- + +### Category 5: Security + +Check for security gaps: + +- **Authentication & Authorization**: Is there a mechanism to verify users? Can users only access their own data? +- **Rate Limiting**: Is rate limiting discussed for public-facing APIs? +- **Data Encryption**: For sensitive data (messages, payments, personal info), is encryption mentioned? +- **Input Validation**: Are there mentions of validating user inputs? +- **Pre-signed URLs**: If file uploads are involved, are secure upload mechanisms used? + +Flag major security gaps (like no auth on APIs that need it) as BLUNDERS. Other security concerns should be flagged as ISSUES. + +--- + +### Category 6: High-Level Design + +**This is the most important category. Review as a Senior Software Engineer. Focus heavily on architectural correctness.** + +- **Correctness**: Does the architecture make sense? Are the right components used? +- **Data Flow**: Is the data flow clearly explained step-by-step? +- **Component Justification**: Is each component (load balancer, message queue, cache, CDN, etc.) justified and explained? +- **Diagrams**: Are there diagrams for the architecture? If not, flag as an ISSUE. +- **Missing Components**: Are there obvious missing components? (e.g., no cache for a read-heavy system, no CDN for media-heavy system, no message queue for async processing) +- **Single Points of Failure**: Are there any single points of failure that aren't addressed? + +--- + +### Category 7: Deep Dive / Technical Correctness + +- **Database Selection**: Is the choice of database justified? (SQL vs NoSQL, and which specific DB) +- **Data Modeling**: Are the schemas/models correct? Proper primary keys, indexes? +- **Caching Strategy**: Is caching discussed where appropriate? Cache invalidation? +- **Trade-offs**: Are trade-offs discussed? (consistency vs availability, latency vs throughput) +- **Edge Cases**: Are important edge cases addressed? +- **Concurrency**: For booking/ordering systems, is concurrent access handled? + +--- + +### Category 8: Technology Explanations + +Every new technology, tool, protocol, or concept that might not be familiar to a beginner/intermediate audience must be explained. Examples: message queues, search engines, specific databases, protocols (WebSocket, gRPC), concepts (CDN, consistent hashing, sharding). + +- **Rule**: Every new technology must have at least a 2-3 line explanation of what it is and why it's being used. If it doesn't, flag as an ISSUE. +- **Use-case-specific justification**: "We chose X because it's popular/fast" is not enough. The explanation must state why X fits THIS specific use case. Generic justifications are an ISSUE. +- **Beginner-friendly examples**: Raw protocol examples or complex code snippets without context should be flagged as an ISSUE. A technically correct example that a beginner cannot follow is a bad example. + +--- + +### Category 9: Writing Style & Readability + +Compare the document against the SweetCodey writing style guide. Check: + +- **Sentence Complexity**: Sentences should be short and simple. Flag long, complex sentences as ISSUES. +- **Jargon Without Explanation**: Technical terms must be explained on first use. +- **Conversational Tone**: Should feel like a knowledgeable friend explaining, not a textbook. +- **Terminology & Definitions**: Definitions should use the simplest possible language. +- **Typos & Duplicate Sentences**: Always include the **line number** when flagging these. +- **Document Structure**: Should follow this order: Introduction, Functional Requirements, Non-Functional Requirements, Capacity Estimation (optional), API Design, High Level Design, Deep Dive Insights. +- **Table of Contents**: Validate that the TOC matches the actual sections. Check for orphaned entries or missing entries. Flag mismatches as an ISSUE. +- **Formatting**: Proper use of HTML tables, bold text, code blocks, horizontal rules, numbered steps. +- **Passive Voice**: Minimize passive voice. Use active, direct statements. + +--- + +### Category 10: Diagram Recommendations + +Identify concepts that need diagrams but don't have them: + +- Architecture diagrams for overall system design +- Data flow diagrams for complex flows +- Sequence diagrams for multi-step processes (booking flows, payment flows) +- Database schema diagrams for data modeling sections + +--- + +### Category 11: Senior Engineer Perspective + +Think like a senior engineer: + +- **Scalability**: Will this design handle 10x or 100x growth? +- **Monitoring & Observability**: Any mention of logging, metrics, or alerting? +- **Failure Handling**: What happens when individual components fail? +- **Cost**: Any obvious cost inefficiencies? +- **Maintainability**: Is the design overly complex for the problem? + +These are lower-priority suggestions unless there's a glaring omission. + +--- + +### Category 12: Reliability & Failure Scenarios + +**This section is optional.** Do not flag its absence as a blunder. + +If the system involves critical state (payments, orders, bookings) and there is no reliability discussion, flag as a SUGGESTION at most. + +A good reliability section should include: +- Structured failure analysis covering realistic scenarios (service crashes, retry duplicates, blocked resources, cache failures, stale reads, network partitions) +- Recovery mechanisms for each scenario (event replay, idempotency keys, cleanup jobs, circuit breakers) + +--- + +## Review Output + +### For Internal Reports + +Use a structured report with a summary table (Blunders/Issues/Suggestions per category), an overall verdict (PASS / NEEDS REVISION / MAJOR REVISION NEEDED), and detailed findings with location references and fix instructions. + +### For PR Comments + +- **Keep comments short**: 1-2 lines for the problem, 1-2 lines for the suggestion. +- **Only post blunders and issues.** Suggestions are for internal reports only. +- **Use conversational tone**: Write as a peer. Use "Can we..." and "Do we want to..." phrasing. +- **Include positive feedback**: If something is done well, call it out. +- **Use "nit:" prefix** for minor formatting or naming issues. +- **One concern per comment**, tied to a specific line. + +**Example comments:** +- "Can we also mention the numbers here for availability and scalability?" +- "Do we want to send available quantity back to the client?" +- "nit: It should be Order Execution Queue" +- "Can we explain this a bit more in easier language?" +- "Would be great if you can add one more line explaining why you went for this approach." +- "Love this explanation. Very thoughtful!" + +--- + +## General Principles + +- Be thorough but fair. Not everything needs to be flagged. +- Focus on things that would confuse or mislead a beginner/intermediate reader. +- The goal is to make the document as clear, correct, and helpful as possible. +- Always explain WHY something is wrong, not just WHAT is wrong. +- Think from the perspective of someone reading this to learn system design for the first time. diff --git a/.github/review-skill/writing-style-guide.md b/.github/review-skill/writing-style-guide.md new file mode 100644 index 0000000..9db26b7 --- /dev/null +++ b/.github/review-skill/writing-style-guide.md @@ -0,0 +1,312 @@ +# SweetCodey System Design Masterclass - Writing Style Guide + +This guide documents the writing patterns, formatting conventions, and style rules extracted from the SweetCodey System Design Masterclass repository. All reviewed documents should follow these conventions. + +--- + +## 1. Document Structure + +Every system design document follows this exact order: + +### Standard Section Ordering + +1. **Title** - ALL CAPS, H1 heading (e.g., `# INSTAGRAM NEWSFEED DESIGN`) +2. **Table of Contents** - Linked, numbered entries for every section +3. **Introduction** (optional) - What is the system? Why do we need it? +4. **Functional Requirements** - What the system does +5. **Non-Functional Requirements** - How the system should behave +6. **Capacity Estimation** (optional) - DAU/MAU, Throughput, Storage, Memory, Network +7. **API Design** - REST API endpoints for each functional requirement +8. **High Level Design** - Architecture diagrams and data flow for each feature +9. **Deep Dive Insights** - Database selection, data modeling, advanced topics + +### Section Headers + +Each major section is preceded by a colored header: +```html +###

SECTION NAME

+``` + +Sub-sections use standard Markdown H2 (`##`) headings. + +Sections are separated by horizontal rules: +```html +
+``` + +--- + +## 2. Sentence Style + +### Keep It Simple + +- **Short sentences**: Average 10-20 words per sentence. +- **One idea per sentence**: Don't pack multiple concepts into one sentence. +- **Active voice**: "The server processes the request" NOT "The request is processed by the server." +- **Conversational tone**: Write as if you're explaining to a friend who knows basic programming. + +### Good Examples (from the repository) + +> "A Tiny URL Service, also known as a URL Shortener Service, helps shorten long URLs." + +> "When we ask the server to create a text post, we use an API call. This is how computers talk to each other." + +> "Accessing data from the database takes a long time, but if we want to access it faster, we use cache." + +> "If the Tiny URL Service is down, people can't open the long URLs." + +### Bad Examples (avoid these) + +> "The microservices-based architecture leverages asynchronous event-driven communication patterns facilitated by a distributed message broker to ensure eventual consistency across bounded contexts." + +This should be rewritten as: + +> "Our system uses small, independent services that communicate through a message queue. This means data might take a second or two to update everywhere, but the system stays fast and reliable." + +--- + +## 3. Requirements Formatting + +### Functional Requirements + +Use HTML tables with two columns: `Requirement` and `Description`. + +```html + + + + + + + + + +
RequirementDescription
Create PostsUsers can create text, image, or video posts.
+``` + +- Each description should be 1-2 sentences max. +- Use simple, direct language. + +### Non-Functional Requirements + +Same HTML table format. Each NFR should: +- **Name the requirement** (Availability, Consistency, Latency, Scalability, etc.) +- **State a specific target** (99.999% uptime, under 1 second, etc.) +- **Explain WHY it matters** for this specific system in plain language + +Example: +> **Availability**: If the Tiny URL Service is down, people can't open the long URLs. The system must be highly available with 99.99999% uptime. + +--- + +## 4. Capacity Estimation Style + +### Structure + +Each estimation section follows this pattern: +1. **State what you're estimating** +2. **List assumptions clearly** (with bold formatting) +3. **Show the calculation** step by step +4. **State the result clearly** + +### Example Pattern + +```markdown +## Throughput Estimation + +### Writes + +There are only three possible ways to write data into the system: +- **Creating Posts** +- **Following** +- **Commenting/Liking** + +#### Post Creation Calculation + +Assume 10% of all Daily Active Users (DAU) create posts in a day. + +- Total DAU = **500 million** +- 10% of DAU create posts = **50 million users** create posts daily +- This results in **50 million 'create post' requests per day** +``` + +### Key Rules +- Always state assumptions before calculations +- Use bold for key numbers +- Show math in code blocks for complex calculations +- Round numbers for readability +- Use consistent units throughout + +--- + +## 5. API Design Formatting + +### Structure for Each API + +1. **Title**: "API Design: [Action Name]" +2. **Context**: One sentence explaining what this API does +3. **Diagram/Image**: Visual showing the API call +4. **HTTP Method**: Explain why this method is used +5. **Endpoint**: Show the full endpoint path +6. **Request Body** (for POST/PUT): Show JSON with all fields +7. **Response Body**: Show expected response JSON + +### HTTP Method Explanations + +Always explain the method choice: +> "Since we want to create something new on the server, we use the `POST` action." + +> "Since we want to read data from the server, we use the `GET` action." + +### Endpoint Conventions + +- Version prefix: `/v1/` +- Plural nouns: `/v1/posts`, `/v1/users`, `/v1/bookings` +- Path parameters: `/v1/posts/{postId}` +- Explain versioning on first use: +> **Note:** 'v1' means version 1. It is a good practice to version your APIs. + +### Request/Response Body + +Show as JSON code blocks: +```json +{ + "userId": "12345", + "content": "Hello World", + "hashtags": ["tech", "coding"] +} +``` + +--- + +## 6. High-Level Design Style + +### Data Flow Explanation + +Use numbered steps to explain how data flows through the system: + +> **Step 1**: The user sends a request to the API Gateway. +> **Step 2**: The API Gateway forwards the request to the Post Writer Service. +> **Step 3**: The Post Writer Service saves the post to the Posts Database. +> **Step 4**: The Post Writer Service publishes an event to the Message Queue. + +### Component Introduction + +When introducing a new component, always: +1. **Name it** (bold) +2. **Explain what it does** in 2-3 simple sentences +3. **Explain why we need it** for this specific system + +Example: +> We use a **Message Queue** here. A message queue is like a to-do list for our system. When the Post Writer Service creates a post, it adds a "new post created" message to this queue. Other services can then pick up these messages and do their work (like updating feeds) without slowing down the post creation. + +### Diagram Requirements + +- Every major flow should have a diagram +- Diagrams should show: User -> API Gateway -> Service -> Database +- Include arrows showing data direction +- Label each arrow with what data is being sent + +--- + +## 7. Deep Dive Style + +### Database Selection + +Compare options using a structured approach: +1. List the requirements (read-heavy? write-heavy? structured data?) +2. Compare 2-3 database options +3. Pick one and explain why +4. Use tables for comparisons + +### Data Modeling + +Show schemas with: +- Table/Collection name +- Field names and types +- Primary keys and indexes +- Brief explanation of why certain fields exist + +### Problem-Solution Pattern + +For deep dive topics, follow this pattern: +1. **State the problem** clearly +2. **Show why the naive approach fails** +3. **Present the solution** +4. **Explain how the solution works** +5. **Discuss trade-offs** + +Example from TinyURL (collision handling): +- Approach 1: Random strings -> Problem: not unique across servers +- Approach 2: MD5 hash -> Problem: too long when truncated +- Approach 3: Check DB -> Problem: slow lookups +- Approach 4: Counters + ZooKeeper -> Solution! + +--- + +## 8. Formatting Quick Reference + +| Element | How to Format | +|---------|---------------| +| Major section dividers | `
` | +| Section headers | `###

SECTION NAME

` | +| Requirements tables | HTML `` with `border="1"` | +| Key numbers | **Bold** (e.g., **500 million**) | +| Technical terms | `code formatting` on first use | +| Calculations | Code blocks | +| API endpoints | `code formatting` | +| JSON bodies | ```json code blocks | +| Assumptions | Bullet points with bold labels | +| Step-by-step flows | Numbered bold steps | +| Comparisons | HTML tables | +| Notes/Tips | Blockquotes (`>`) | + +--- + +## 9. Language Guidelines + +### Words to Use +- "Let's" (collaborative tone) +- "We" (inclusive) +- "Simply" / "Just" (to show simplicity) +- "Think of it like..." (analogies) +- "In other words" (clarifications) +- "For example" (concrete examples) + +### Words to Avoid +- "Obviously" / "Clearly" (might not be obvious to beginners) +- "Trivially" / "Simply put" without actually simplifying +- Excessive acronyms without expansion +- Academic/textbook language +- Passive voice constructions + +### Explaining New Concepts + +Every new concept needs: +1. **Name** (bolded) +2. **What it is** (1 sentence) +3. **Why we use it** (1 sentence) +4. **Simple analogy** (optional but recommended) + +Example: +> We will use **ZooKeeper** for this. ZooKeeper is a coordination service that helps multiple servers work together without stepping on each other's toes. Think of it like a traffic controller at a busy intersection - it tells each server which range of numbers to use, so no two servers create the same short URL. + +--- + +## 10. Common Patterns + +### The "Zoom In" Pattern +Used for API sections: +> "Let's zoom into the 'communication' for creating a text post and understand exactly what's happening." + +### The "Assumption" Pattern +Used before calculations: +> "Assume 10% of all Daily Active Users (DAU) create posts in a day." + +### The "Problem -> Solution" Pattern +Used in deep dives: +> "But wait, there's a problem. Two different long URLs could produce the same short URL. This is called a collision." + +### The "Real-World Analogy" Pattern +Used to explain complex concepts: +> "Think of cache as a small notebook you keep on your desk. Instead of going to the library every time you need information, you write down the most used facts in your notebook for quick access." diff --git a/.github/scripts/review_design.py b/.github/scripts/review_design.py new file mode 100644 index 0000000..5326c1d --- /dev/null +++ b/.github/scripts/review_design.py @@ -0,0 +1,426 @@ +""" +Automated System Design Document Reviewer for GitHub PRs. + +This script: +1. Reads design documents changed in the PR +2. Gets the PR diff for line mapping +3. Sends the document to Claude API for review +4. Posts review comments on specific PR lines +""" + +import json +import os +import subprocess +import sys + +import anthropic + +# ── Config ────────────────────────────────────────────────────────────────────── + +ANTHROPIC_API_KEY = os.environ["ANTHROPIC_API_KEY"] +GITHUB_TOKEN = os.environ["GITHUB_TOKEN"] +PR_NUMBER = os.environ["PR_NUMBER"] +REPO_FULL_NAME = os.environ["REPO_FULL_NAME"] +DESIGN_FILES = os.environ["DESIGN_FILES"] # comma-separated +PR_HEAD_SHA = os.environ["PR_HEAD_SHA"] + +MODEL = "claude-sonnet-4-5-20250514" + +# ── Skill Files ───────────────────────────────────────────────────────────────── + +SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) +SKILL_DIR = os.path.join(SCRIPT_DIR, "..", "review-skill") + +def load_skill_file(filename): + """Load a skill file from .github/review-skill/.""" + path = os.path.join(SKILL_DIR, filename) + with open(path) as f: + return f.read() + +def build_system_prompt(): + """Build the system prompt from the actual skill files.""" + review_skill = load_skill_file("review-system-design.md") + writing_style = load_skill_file("writing-style-guide.md") + + return f"""You are a Senior Software Engineer and Technical Content Reviewer. + +Below are your full review instructions and writing style guide. Follow them exactly. + +--- REVIEW INSTRUCTIONS --- + +{review_skill} + +--- WRITING STYLE GUIDE --- + +{writing_style} + +--- OUTPUT FORMAT --- + +You are reviewing a PR. You MUST output valid JSON only. No markdown, no explanation outside the JSON. + +Output a JSON array of review comments. Each comment object has: +- "line": the line number in the file where the comment applies +- "body": the review comment text (short, conversational, following the PR Comment Style from the review instructions) +- "severity": "BLUNDER" or "ISSUE" + +If you also want to leave a positive comment, use severity "PRAISE". + +IMPORTANT: Only output BLUNDERS and ISSUES (and occasional PRAISE). Skip suggestions — those are not posted on PRs. This is specified in your review instructions. + +Example output: +[ + {{"line": 45, "body": "Can we also mention the numbers here for availability? Something like 99.99% uptime.", "severity": "ISSUE"}}, + {{"line": 120, "body": "This API uses GET but it creates a resource. Should be POST.", "severity": "BLUNDER"}}, + {{"line": 200, "body": "Love this explanation of message queues. Very beginner-friendly!", "severity": "PRAISE"}} +] + +If no issues found, output an empty array: [] +""" + +# ── Helpers ───────────────────────────────────────────────────────────────────── + + +def gh_api(method, endpoint, data=None): + """Call GitHub API using gh CLI.""" + cmd = ["gh", "api", "-X", method, endpoint] + if data: + cmd.extend(["-f" if isinstance(v, str) else "-F" for k, v in data.items() for _ in [None]]) + # Build proper gh api flags + cmd = ["gh", "api", "-X", method, endpoint] + for k, v in data.items(): + if isinstance(v, (int, float)): + cmd.extend(["-F", f"{k}={v}"]) + else: + cmd.extend(["-f", f"{k}={v}"]) + + result = subprocess.run(cmd, capture_output=True, text=True, env={**os.environ, "GH_TOKEN": GITHUB_TOKEN}) + if result.returncode != 0: + print(f"GitHub API error: {result.stderr}", file=sys.stderr) + return None + if result.stdout.strip(): + return json.loads(result.stdout) + return None + + +def get_pr_diff(file_path): + """Get the diff for a specific file in the PR.""" + result = subprocess.run( + ["gh", "api", f"repos/{REPO_FULL_NAME}/pulls/{PR_NUMBER}", + "-H", "Accept: application/vnd.github.v3.diff"], + capture_output=True, text=True, + env={**os.environ, "GH_TOKEN": GITHUB_TOKEN} + ) + if result.returncode != 0: + print(f"Failed to get PR diff: {result.stderr}", file=sys.stderr) + return "" + return result.stdout + + +def parse_diff_line_mapping(diff_text, target_file): + """Parse the diff to find which lines in the new file are part of the diff. + + Returns a set of line numbers that are added/modified lines in the diff + (these are the only lines we can comment on in a PR review). + """ + lines_in_diff = set() + in_target_file = False + current_new_line = 0 + + for line in diff_text.split("\n"): + if line.startswith("diff --git"): + # Check if this diff section is for our target file + in_target_file = target_file in line + continue + + if not in_target_file: + continue + + if line.startswith("@@"): + # Parse hunk header: @@ -old_start,old_count +new_start,new_count @@ + try: + parts = line.split("+")[1].split("@@")[0].strip() + if "," in parts: + current_new_line = int(parts.split(",")[0]) + else: + current_new_line = int(parts) + except (IndexError, ValueError): + continue + continue + + if line.startswith("-"): + # Removed line — skip, doesn't exist in new file + continue + elif line.startswith("+"): + # Added line — this is commentable + lines_in_diff.add(current_new_line) + current_new_line += 1 + else: + # Context line + current_new_line += 1 + + return lines_in_diff + + +def find_nearest_diff_line(target_line, diff_lines): + """Find the nearest line in the diff to the target line. + + We can only comment on lines that are part of the diff. + """ + if not diff_lines: + return None + if target_line in diff_lines: + return target_line + # Find nearest line in the diff + return min(diff_lines, key=lambda x: abs(x - target_line)) + + +def read_file_content(file_path): + """Read file content from the repo.""" + full_path = os.path.join(os.getcwd(), file_path) + with open(full_path) as f: + return f.read() + + +def review_document(file_path, file_content): + """Send document to Claude API for review using the skill files.""" + client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY) + + # Build system prompt from skill files + system_prompt = build_system_prompt() + + # Add line numbers to content for reference + numbered_content = "" + for i, line in enumerate(file_content.split("\n"), 1): + numbered_content += f"{i}: {line}\n" + + user_message = f"""Review this system design document. The file is: {file_path} + +Here is the document with line numbers: + +{numbered_content} + +Output your review as a JSON array of comments. Each comment needs "line", "body", and "severity" fields. +Remember: keep comments short and conversational. Only BLUNDERS and ISSUES (and occasional PRAISE). No suggestions.""" + + response = client.messages.create( + model=MODEL, + max_tokens=4096, + system=system_prompt, + messages=[{"role": "user", "content": user_message}], + ) + + # Extract JSON from response + response_text = response.content[0].text.strip() + + # Handle case where response is wrapped in markdown code block + if response_text.startswith("```"): + # Remove ```json and ``` markers + lines = response_text.split("\n") + lines = [l for l in lines if not l.startswith("```")] + response_text = "\n".join(lines) + + try: + return json.loads(response_text) + except json.JSONDecodeError: + print(f"Failed to parse Claude response as JSON: {response_text[:500]}", file=sys.stderr) + return [] + + +def post_review_comments(comments, file_path, diff_lines): + """Post review comments on the PR using the GitHub pull request review API.""" + if not comments: + print("No comments to post.") + return + + # Build review comments array for the GitHub review API + review_comments = [] + + for comment in comments: + line = comment.get("line") + body = comment.get("body", "") + severity = comment.get("severity", "ISSUE") + + if not line or not body: + continue + + # Find nearest line in the diff we can comment on + actual_line = find_nearest_diff_line(line, diff_lines) + if actual_line is None: + print(f" Skipping comment (no diff lines available): {body[:60]}...") + continue + + # Add severity prefix + if severity == "BLUNDER": + prefixed_body = f"**BLUNDER**: {body}" + elif severity == "PRAISE": + prefixed_body = body + else: + prefixed_body = body + + review_comments.append({ + "path": file_path, + "line": actual_line, + "body": prefixed_body, + }) + + if not review_comments: + print("No comments could be mapped to diff lines.") + return + + # Count severities for the review summary + blunder_count = sum(1 for c in comments if c.get("severity") == "BLUNDER") + issue_count = sum(1 for c in comments if c.get("severity") == "ISSUE") + praise_count = sum(1 for c in comments if c.get("severity") == "PRAISE") + + # Determine review event type + if blunder_count > 0: + event = "REQUEST_CHANGES" + summary = f"Found **{blunder_count} blunder(s)** and **{issue_count} issue(s)**. Please address the blunders before merging." + elif issue_count > 0: + event = "COMMENT" + summary = f"Found **{issue_count} issue(s)** to address. No critical blunders." + else: + event = "COMMENT" + summary = "Looks good overall!" + + if praise_count > 0: + summary += f" ({praise_count} thing(s) done really well!)" + + summary += "\n\n*— Automated review by SweetCodey Design Reviewer*" + + # Post the review using gh CLI + review_payload = { + "commit_id": PR_HEAD_SHA, + "body": summary, + "event": event, + "comments": review_comments, + } + + payload_json = json.dumps(review_payload) + + result = subprocess.run( + ["gh", "api", "-X", "POST", + f"repos/{REPO_FULL_NAME}/pulls/{PR_NUMBER}/reviews", + "--input", "-"], + input=payload_json, + capture_output=True, text=True, + env={**os.environ, "GH_TOKEN": GITHUB_TOKEN} + ) + + if result.returncode != 0: + print(f"Failed to post review: {result.stderr}", file=sys.stderr) + # Fallback: post individual comments + print("Falling back to individual comments...") + for rc in review_comments: + gh_api("POST", f"repos/{REPO_FULL_NAME}/pulls/{PR_NUMBER}/comments", { + "body": rc["body"], + "commit_id": PR_HEAD_SHA, + "path": rc["path"], + "line": rc["line"], + }) + else: + print(f"Review posted successfully with {len(review_comments)} comments.") + + +# ── Deduplication ─────────────────────────────────────────────────────────────── + + +def dismiss_previous_bot_reviews(): + """Dismiss previous reviews posted by this bot to avoid duplicate comments. + + Fetches all reviews on the PR, finds ones posted by github-actions[bot] + with our signature, and dismisses them. + """ + result = subprocess.run( + ["gh", "api", f"repos/{REPO_FULL_NAME}/pulls/{PR_NUMBER}/reviews", + "--paginate"], + capture_output=True, text=True, + env={**os.environ, "GH_TOKEN": GITHUB_TOKEN} + ) + if result.returncode != 0: + print(f"Could not fetch existing reviews: {result.stderr}", file=sys.stderr) + return + + try: + reviews = json.loads(result.stdout) + except json.JSONDecodeError: + return + + for review in reviews: + user = review.get("user", {}).get("login", "") + body = review.get("body", "") + review_id = review.get("id") + state = review.get("state", "") + + # Match our bot's reviews by the signature in the summary + if "SweetCodey Design Reviewer" in body and review_id: + if state in ("CHANGES_REQUESTED", "COMMENTED"): + # Dismiss the review + dismiss_result = subprocess.run( + ["gh", "api", "-X", "PUT", + f"repos/{REPO_FULL_NAME}/pulls/{PR_NUMBER}/reviews/{review_id}/dismissals", + "-f", "message=Superseded by new review on latest commit."], + capture_output=True, text=True, + env={**os.environ, "GH_TOKEN": GITHUB_TOKEN} + ) + if dismiss_result.returncode == 0: + print(f" Dismissed previous review {review_id}") + else: + print(f" Could not dismiss review {review_id}: {dismiss_result.stderr}", + file=sys.stderr) + + +# ── Main ──────────────────────────────────────────────────────────────────────── + + +def main(): + design_files = [f.strip() for f in DESIGN_FILES.split(",") if f.strip()] + print(f"Design files to review: {design_files}") + + # Dismiss previous bot reviews to avoid duplicate comments + print("Checking for previous bot reviews to dismiss...") + dismiss_previous_bot_reviews() + + # Get full PR diff once + diff_text = get_pr_diff("") + + for file_path in design_files: + print(f"\n{'='*60}") + print(f"Reviewing: {file_path}") + print(f"{'='*60}") + + # Read file content + try: + content = read_file_content(file_path) + except FileNotFoundError: + print(f" File not found: {file_path}. Skipping.") + continue + + # Parse diff to know which lines we can comment on + diff_lines = parse_diff_line_mapping(diff_text, file_path) + print(f" Lines in diff: {len(diff_lines)}") + + if not diff_lines: + print(" No diff lines found for this file. Skipping.") + continue + + # Review the document + print(" Sending to Claude for review...") + comments = review_document(file_path, content) + print(f" Got {len(comments)} comments from review.") + + for c in comments: + severity = c.get("severity", "?") + line = c.get("line", "?") + body = c.get("body", "")[:80] + print(f" [{severity}] L{line}: {body}...") + + # Post comments on the PR + print(" Posting comments to PR...") + post_review_comments(comments, file_path, diff_lines) + + print("\nDone!") + + +if __name__ == "__main__": + main() diff --git a/.github/workflows/design-review.yml b/.github/workflows/design-review.yml new file mode 100644 index 0000000..7b54ffc --- /dev/null +++ b/.github/workflows/design-review.yml @@ -0,0 +1,65 @@ +name: System Design Review + +on: + pull_request: + types: [opened, synchronize] + +jobs: + review: + runs-on: ubuntu-latest + permissions: + contents: read + pull-requests: write + + steps: + - name: Checkout repository + uses: actions/checkout@v4 + with: + fetch-depth: 0 + + - name: Check if PR contains design documents + id: check_design + run: | + # Get list of changed files in the PR + CHANGED_FILES=$(gh pr diff ${{ github.event.pull_request.number }} --name-only) + echo "Changed files:" + echo "$CHANGED_FILES" + + # Check if any changed file is a .md file inside a numbered design folder + # Pattern: "N. Design Something/SomeName.md" or similar + DESIGN_FILES=$(echo "$CHANGED_FILES" | grep -E '^[0-9]+\. .+/.+\.md$' || true) + + if [ -z "$DESIGN_FILES" ]; then + echo "No design documents found in this PR. Skipping review." + echo "has_design=false" >> $GITHUB_OUTPUT + else + echo "Design documents found:" + echo "$DESIGN_FILES" + echo "has_design=true" >> $GITHUB_OUTPUT + # Save design files list for the next step (newline-separated -> comma-separated) + DESIGN_FILES_CSV=$(echo "$DESIGN_FILES" | tr '\n' ',' | sed 's/,$//') + echo "design_files=$DESIGN_FILES_CSV" >> $GITHUB_OUTPUT + fi + env: + GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} + + - name: Setup Python + if: steps.check_design.outputs.has_design == 'true' + uses: actions/setup-python@v5 + with: + python-version: "3.12" + + - name: Install dependencies + if: steps.check_design.outputs.has_design == 'true' + run: pip install anthropic + + - name: Run design review + if: steps.check_design.outputs.has_design == 'true' + run: python .github/scripts/review_design.py + env: + ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + PR_NUMBER: ${{ github.event.pull_request.number }} + REPO_FULL_NAME: ${{ github.repository }} + DESIGN_FILES: ${{ steps.check_design.outputs.design_files }} + PR_HEAD_SHA: ${{ github.event.pull_request.head.sha }} From 678baf8a5550ba2def1d7acc59cdaf5205d71138 Mon Sep 17 00:00:00 2001 From: SweetCodey Date: Tue, 10 Mar 2026 09:55:27 -0400 Subject: [PATCH 2/3] Use claude-opus-4-0 model for design reviews Co-Authored-By: Claude Opus 4.6 --- .github/scripts/review_design.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/scripts/review_design.py b/.github/scripts/review_design.py index 5326c1d..ba94439 100644 --- a/.github/scripts/review_design.py +++ b/.github/scripts/review_design.py @@ -24,7 +24,7 @@ DESIGN_FILES = os.environ["DESIGN_FILES"] # comma-separated PR_HEAD_SHA = os.environ["PR_HEAD_SHA"] -MODEL = "claude-sonnet-4-5-20250514" +MODEL = "claude-opus-4-0-20250514" # ── Skill Files ───────────────────────────────────────────────────────────────── From da2d43693c073cb6e468bdfca9b1825406cc5709 Mon Sep 17 00:00:00 2001 From: SweetCodey Date: Tue, 10 Mar 2026 09:59:16 -0400 Subject: [PATCH 3/3] Fix model ID to claude-opus-4-6 Co-Authored-By: Claude Opus 4.6 --- .github/scripts/review_design.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/scripts/review_design.py b/.github/scripts/review_design.py index ba94439..ea6e383 100644 --- a/.github/scripts/review_design.py +++ b/.github/scripts/review_design.py @@ -24,7 +24,7 @@ DESIGN_FILES = os.environ["DESIGN_FILES"] # comma-separated PR_HEAD_SHA = os.environ["PR_HEAD_SHA"] -MODEL = "claude-opus-4-0-20250514" +MODEL = "claude-opus-4-6" # ── Skill Files ─────────────────────────────────────────────────────────────────