Add 60 interview-style problems and tracking index #1
Open
shiningflash wants to merge 11 commits into
Adds 9 interview-style fundamentals problems with full question and solution markdown files, including diagrams and concrete examples. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 SQL-focused interview problems with worked examples and EXPLAIN plan walkthroughs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8 system design problems covering electricity retailer platform, banking widgets, surge pricing, streaming aggregations, billing pipelines, real-time driver tracking, year-in-review batch, and notification dedup. Each includes architecture diagrams, capacity math, and risk discussion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 scenario-based interview problems covering silent data bugs, cost spikes, analyst trust, pipeline ownership transfer, executive pressure, and Kafka data loss recovery. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 cloud-decision interview problems comparing Lambda vs Cloud Run, scheduled serverless jobs, BigQuery vs Snowflake, S3 vs warehouse storage, managed Airflow vs self-hosted, and BigQuery access control models. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 data modeling problems covering Airbnb-style schema, subscription history with valid_from/to, mixing facts and dimensions, explaining grain, and current state vs event history. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 debugging-scenario interview problems covering region-zero revenue, silent task success with empty output, sudden query slowdowns, vague user reports, and recurring partition anomalies. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5 cost/performance problems covering BigQuery bill investigation, Spark job tuning, daily-data hourly-scan waste, the 'throw more memory' reflex, and partitioning vs clustering vs materialized views. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 streaming-fundamentals problems covering watermarks, Kafka per-partition ordering, and diagnosing growing consumer lag before scaling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 people/process problems covering analyst onboarding, fast-vs-right trade-offs, metric ownership disputes, blameless postmortems, inheriting undocumented pipelines, breaking dbt changes with many consumers, and Airflow scheduler scaling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add rows for problems 6-65 across 10 new categories (Fundamentals, SQL Thinking, System Design, Scenarios, Cloud Decisions, Data Modeling, Debugging, Cost and Performance, Streaming, People and Process) plus expanded legend and difficulty guide. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
PROBLEMS.md: a single index table linking every problem to its question and solution, with category, topic, and difficulty tags suitable for filtering on a website.
What each problem contains
Every problem has its own folder with:
question.md: scenario, task list, and what a good answer should cover.
solution.md (or solution.py for code-style ones): a walkthrough written the way an experienced engineer would explain it on a whiteboard.
Solutions include diagrams (ASCII / mermaid-style boxes) where the topic benefits from visualization, capacity math where relevant, common-mistake lists, and bonus follow-up questions to anticipate.
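The per-problem layout described above could be checked mechanically. The following is a minimal sketch, not part of this PR: it assumes every problem folder is a direct child of a single root directory (the repository's real nesting may differ), and the helper names `missing_files` and `check_layout` are hypothetical.

```python
from pathlib import Path


def missing_files(problem_dir: Path) -> list[str]:
    """Return the required files that are absent from one problem folder."""
    missing = []
    if not (problem_dir / "question.md").is_file():
        missing.append("question.md")
    # A solution may be markdown or, for code-style problems, a Python file.
    has_solution = any(
        (problem_dir / name).is_file() for name in ("solution.md", "solution.py")
    )
    if not has_solution:
        missing.append("solution.md or solution.py")
    return missing


def check_layout(root: Path) -> dict[str, list[str]]:
    """Map each incomplete problem folder name to the files it lacks.

    Assumption: problem folders sit directly under `root`; hidden
    folders such as .git are skipped.
    """
    report = {}
    for folder in sorted(root.iterdir()):
        if not folder.is_dir() or folder.name.startswith("."):
            continue
        gaps = missing_files(folder)
        if gaps:
            report[folder.name] = gaps
    return report
```

Calling `check_layout(Path("."))` from the directory that holds the problem folders returns an empty dict when every folder has a question and a solution. If problems are grouped into category subfolders, the check would need to be run once per category.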
Format consistency
All new problems follow the same shape as the existing Problems 1-4:
Coverage
Test plan