Repo: sauremilk/drift · Package: drift-analyzer · Command: drift · Requires: Python 3.11+
What is drift?
Drift is a deterministic static analyzer for architectural drift in AI-accelerated Python repositories. It detects architecture erosion through cross-file coherence problems such as pattern fragmentation, architecture violations, and structural hotspots before they become normal team habits.
Who is it for?
- Python teams with fast-growing codebases where architecture matters
- Tech leads who want fast structural feedback, not just style or type checks
- Teams using AI coding tools and seeing more cross-file drift across modules
```bash
pip install drift-analyzer
drift analyze --repo .
```

That gives you a drift score, the hottest modules, and actionable findings in one run.
- Not sure where to start? Use the central docs routing page: Start Here.
- Casual user: install drift, run `drift analyze --repo .`, and start with Quick Start and Configuration.
- Evaluator: review Example Findings, Trust and Evidence, and Stability and Release Status before deciding on rollout.
- Contributor: use CONTRIBUTING.md once you are ready to submit a fix, improve docs, or work on signal quality.
- Core maintainer: use CONTRIBUTING.md, DEVELOPER.md, and POLICY.md for the full quality, architecture, and release guardrails.
The PyPI classifier remains Development Status :: 3 - Alpha intentionally.
That is not a claim that the whole tool is immature. It is a conservative release signal for a product whose core Python analysis is already usable, while some adjacent surfaces still have mixed maturity.
| Area | Status | What that means today |
|---|---|---|
| Core Python analysis | Stable | Primary analysis path, CLI usage, and main signal set are the most production-ready parts of drift. |
| CI and SARIF workflow | Stable | Suitable for report-only rollout now, then selective gating once teams calibrate findings locally. |
| TypeScript support | Experimental | Optional support exists, but Python remains the primary target and the more validated path. |
| Embeddings-based parts | Optional / experimental | Not required for the core detector path and should be treated as exploratory add-ons. |
| Benchmark methodology | Evolving | Public and reproducible, but still conservative in its claims and not the final word on every repository shape. |
Why keep Alpha for now: release signaling should reflect the least mature user-facing surfaces, not only the strongest path. Drift already has stable core workflows, but the overall product story still includes experimental and evolving areas.
See Stability and Release Status for the explicit matrix and the criteria for a future move toward Beta.
DRIFT SCORE 0.52
Top finding: PFS 0.85 Error handling split 4 ways at src/api/routes.py:42
Next action: consolidate variants into one shared pattern
```yaml
- uses: sauremilk/drift@v1
  with:
    fail-on: none
    upload-sarif: "true"
```

Start report-only first. Tighten to `fail-on: high` once the team understands the signal quality in its own repo.
```bash
git clone https://github.com/sauremilk/drift.git
cd drift/examples/demo-project
pip install drift-analyzer
drift analyze --repo .
```

The demo project contains intentional drift patterns, so you get useful findings immediately.
When your team uses GitHub Copilot, Cursor, or other AI coding tools, code passes CI while the repository quietly accumulates architectural drift:
- Pattern fragmentation: error handling is implemented 4 different ways across the same service
- Boundary violations: the API layer imports directly from the database layer
- Silent duplication: AI generates a new validator instead of finding the existing one
- Churn hotspots: the same files change every sprint because the structure is unclear
Your linter, type checker, and test suite won't catch this. Drift does — deterministically, without any LLM in the pipeline. That makes drift useful for architectural drift detection in AI-accelerated Python codebases, with architecture erosion analysis and cross-file coherence findings that teams can act on.
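To make pattern fragmentation concrete, here is a hypothetical module (not taken from drift's own test data) in which three functions handle the same failure in three incompatible ways. Each function is defensible when reviewed on its own; the inconsistency only shows up when they are read side by side. Whether drift reports this exact snippet is not claimed here.

```python
"""Hypothetical example: one failure mode, three incompatible handling styles."""
from typing import Optional


class ConnectorError(Exception):
    """Custom exception used by only one of the variants below."""


def parse_port_bare(raw: str) -> int:
    # Variant 1: a bare except swallows every exception, not just bad input.
    try:
        return int(raw)
    except:  # noqa: E722 (deliberately bad, for illustration)
        return 0


def parse_port_custom(raw: str) -> int:
    # Variant 2: translate the failure into a domain-specific exception.
    try:
        return int(raw)
    except ValueError as exc:
        raise ConnectorError(f"invalid port: {raw!r}") from exc


def parse_port_silent(raw: str) -> Optional[int]:
    # Variant 3: silent fallback; callers must remember to check for None.
    try:
        return int(raw)
    except ValueError:
        return None
```

The same invalid input now yields `0`, an exception, or `None` depending on which helper a caller happens to import, which is the kind of cross-file inconsistency a per-file check has no reason to notice.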
- Ruff / formatters / type checkers: local correctness and style signals, not cross-module coherence.
- Semgrep / CodeQL / security scanners: risky flows and policy violations, not whether patterns fragment across a codebase.
- Sonar / maintainability dashboards: broad quality heuristics, not a drift-specific score grounded in reproducible signal families.
Current public evidence: 15 real-world repositories in the study corpus, 6 scoring signals, and 4 report-only signals kept out of the composite score until their precision improves. Full study → · Trust & limitations
Problem: A FastAPI service has 4 connectors, each implementing error handling differently — bare except, custom exceptions, retry decorators, and silent fallbacks.
Solution:
```bash
drift analyze --repo . --sort-by impact --max-findings 5
```

Output: PFS finding with score 0.96, "26 error_handling variants in connectors/". It shows exactly which files diverge and suggests consolidation.
Problem: A database model file imports directly from the API layer, creating a circular dependency that breaks test isolation.
Solution:
```bash
drift check --fail-on high
```

Output: AVS finding, "DB import in API layer at src/api/auth.py:18", which blocks the CI pipeline until the import direction is fixed.
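One way to picture a boundary check like this is a small `ast` walk that flags imports from a forbidden package. This is an illustrative sketch using only the standard library, not drift's implementation; the function name `find_forbidden_imports` and the `db` layer name are hypothetical.

```python
import ast


def find_forbidden_imports(source: str, forbidden_prefix: str) -> list[tuple[int, str]]:
    """Return (line, module) pairs for absolute imports under forbidden_prefix."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            # Match the package itself or any submodule of it.
            if module == forbidden_prefix or module.startswith(forbidden_prefix + "."):
                hits.append((node.lineno, module))
    return sorted(hits)


# Hypothetical API-layer module that reaches straight into the database layer.
api_module = "import json\nfrom db.models import User\nimport db.session\n"
print(find_forbidden_imports(api_module, "db"))
```

The sketch only inspects one file and one rule; relative imports and dynamic imports are deliberately ignored to keep it short.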
Problem: AI code generation created 6 identical _run_async() helper functions across separate task files instead of finding the existing shared utility.
Solution:
```bash
drift analyze --repo . --format json | jq '.findings[] | select(.signal=="MDS")'
```

Output: MDS findings listing all 6 locations with similarity scores ≥ 0.95, enabling a single extract-to-shared-module refactoring.
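If `jq` is not available, the same filter can be sketched in Python. The `findings` and `signal` keys are taken from the `jq` expression above; every other field in the sample report (`file`, `score`) is a hypothetical stand-in, since the full JSON schema is not shown here.

```python
import json


def findings_for_signal(report: dict, signal: str) -> list[dict]:
    """Mirror of: jq '.findings[] | select(.signal=="MDS")'."""
    return [f for f in report.get("findings", []) if f.get("signal") == signal]


# Hypothetical report; in practice this would be the JSON printed by
# `drift analyze --repo . --format json`.
sample = json.loads("""
{"findings": [
  {"signal": "MDS", "file": "tasks/a.py", "score": 0.97},
  {"signal": "PFS", "file": "api/routes.py", "score": 0.85},
  {"signal": "MDS", "file": "tasks/b.py", "score": 0.95}
]}
""")

for finding in findings_for_signal(sample, "MDS"):
    print(finding["file"], finding["score"])
```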
If you are evaluating drift, the fastest way to understand the value is to look at concrete findings rather than abstract signal names.
See docs-site/product/example-findings.md for 5 short examples with code, the likely finding, why it matters, and how to fix it:
- Pattern fragmentation: three incompatible error-handling patterns in one module
- Mutant duplicate: two copied formatter functions that will drift apart later
- Architecture violation: a `db/` module importing from `api/`
- Doc-implementation drift: README structure that no longer matches the repo
- Temporal volatility: a small file that became a churn hotspot in git history
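A churn hotspot can be approximated by counting how often each file appears in the commit history. The sketch below parses the text produced by `git log --name-only --pretty=format:` and only illustrates the idea; it is not how drift computes temporal volatility, and the sample log is made up.

```python
from collections import Counter


def churn_counts(name_only_log: str) -> Counter:
    """Count how many commits touched each path in `git log --name-only` output."""
    # With an empty --pretty format, the output is just file paths,
    # with blank lines separating commits.
    paths = (line.strip() for line in name_only_log.splitlines())
    return Counter(path for path in paths if path)


# Made-up output for three commits.
sample_log = """src/api/routes.py
src/utils/valid.py

src/api/routes.py

src/api/routes.py
src/db/models.py
"""

print(churn_counts(sample_log).most_common(1))
```

Raw counts like these favor large, central files, so a real volatility signal would need some normalization; that part is left out of the sketch.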
```yaml
name: Drift
on: [push, pull_request]
jobs:
  drift:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - uses: sauremilk/drift@v1
        with:
          fail-on: none          # report findings without blocking CI
          upload-sarif: "true"   # findings appear as PR annotations
```

Once the team has reviewed findings for a few sprints, tighten the gate:
```yaml
- uses: sauremilk/drift@v1
  with:
    fail-on: high          # block only high-severity findings
    upload-sarif: "true"
```

```bash
drift check --fail-on none   # report-only
drift check --fail-on high   # block on high-severity findings
```

```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: drift
        name: drift
        entry: drift check --fail-on high
        language: system
        pass_filenames: false
        always_run: true
```

More setup paths:
╭─ drift analyze myproject/ ──────────────────────────────────────────────────╮
│ DRIFT SCORE 0.52 │ 87 files │ 412 functions │ AI: 34% │ 2.1s │
╰──────────────────────────────────────────────────────────────────────────────╯
Module Drift Ranking
Module Score Findings Top Signal
─────────────────────────────────────────────────────────────
src/api/routes/ 0.71 12 PFS 0.85
src/services/auth/ 0.58 7 AVS 0.72
src/db/models/ 0.41 4 MDS 0.61
┌──┬────────┬───────┬──────────────────────────────────────┬──────────────────────┐
│ │ Signal │ Score │ Title │ Location │
├──┼────────┼───────┼──────────────────────────────────────┼──────────────────────┤
│◉ │ PFS │ 0.85 │ Error handling split 4 ways │ src/api/routes.py:42 │
│◉ │ AVS │ 0.72 │ DB import in API layer │ src/api/auth.py:18 │
│○ │ MDS │ 0.61 │ 3 near-identical validators │ src/utils/valid.py │
└──┴────────┴───────┴──────────────────────────────────────┴──────────────────────┘
Drift currently scores six signal families and reports four additional report-only signals:
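| Signal | Name | Role |
|---|---|---|
| PFS | Pattern Fragmentation | Scoring |
| AVS | Architecture Violations | Scoring |
| MDS | Mutant Duplicates | Scoring |
| EDS | Explainability Deficit | Scoring |
| TVS | Temporal Volatility | Scoring |
| SMS | System Misalignment | Scoring |
| DIA | Doc-Implementation Drift | Report-only, weight 0.00 |
| BEM | Broad Exception Monoculture | Report-only, weight 0.00 |
| TPD | Test Polarity Deficit | Report-only, weight 0.00 |
| GCD | Guard Clause Deficit | Report-only, weight 0.00 |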
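The effect of a `0.00` weight can be shown with a weighted mean. This is a sketch under loud assumptions: the equal weights for the six scoring signals and the per-signal scores are invented for illustration, and the real scoring model may combine signals differently. Only the signal names and the `0.00` weights of the report-only signals come from the list above.

```python
# Hypothetical weights: equal for scoring signals, 0.00 for report-only ones.
WEIGHTS = {
    "PFS": 1.0, "AVS": 1.0, "MDS": 1.0, "EDS": 1.0, "TVS": 1.0, "SMS": 1.0,
    "DIA": 0.0, "BEM": 0.0, "TPD": 0.0, "GCD": 0.0,
}


def composite(scores: dict[str, float]) -> float:
    """Weighted mean of per-signal scores; zero-weight signals drop out."""
    total_weight = sum(WEIGHTS[name] for name in scores)
    if total_weight == 0:
        return 0.0
    return sum(WEIGHTS[name] * value for name, value in scores.items()) / total_weight


# Invented per-signal scores.
base = {"PFS": 0.8, "AVS": 0.6, "MDS": 0.4}
with_report_only = {**base, "DIA": 1.0, "BEM": 1.0}

# Report-only findings are still listed, but they do not move the composite.
assert composite(base) == composite(with_report_only)
print(round(composite(base), 2))
```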
Signal details and scoring model:
Data sourced from STUDY.md §9 and benchmark_results/.
| Capability | drift | SonarQube | pylint / mypy | jscpd / CPD |
|---|---|---|---|---|
| Pattern Fragmentation (N variants per module) | Yes | No | No | No |
| Near-Duplicate Detection (AST structural) | Yes | Partial (text) | No | Yes (text) |
| Architecture Violation (layer + circular deps) | Yes | Partial | No | No |
| Temporal Volatility (churn anomalies) | Yes | No | No | No |
| System Misalignment (novel imports) | Yes | No | No | No |
| Composite Health Score | Yes | Yes (different) | No | No |
| Zero Config (no server needed) | Yes | No (server) | Partial | Yes |
| SARIF Output (GitHub Code Scanning) | Yes | Yes | No | No |
| TypeScript Support | Optional ¹ | Yes | No | Yes |
¹ Experimental via drift-analyzer[typescript]. Python is the primary target.
Drift is designed to complement linters and security scanners, not replace them. Recommended stack: linter (style) + type checker (types) + drift (coherence) + security scanner (SAST).
Full comparison: STUDY.md §9 — Tool Landscape Comparison
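The difference between text-based and AST-structural duplicate detection in the table can be illustrated with a small standard-library sketch. It replaces every identifier with a placeholder before comparing, so two functions that differ only in naming compare as equal. This is a simplified illustration, not drift's MDS algorithm; `structural_fingerprint` is a hypothetical helper.

```python
import ast


class _Anonymize(ast.NodeTransformer):
    """Replace identifiers so only the code's shape remains."""

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = "_"
        self.generic_visit(node)  # continue into arguments and body
        return node

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = "_"
        return node

    def visit_Name(self, node: ast.Name) -> ast.AST:
        node.id = "_"
        return node


def structural_fingerprint(source: str) -> str:
    """Dump the anonymized AST; equal dumps mean equal structure."""
    tree = _Anonymize().visit(ast.parse(source))
    return ast.dump(tree)


a = "def validate_email(value):\n    return '@' in value\n"
b = "def check_address(text):\n    return '@' in text\n"
c = "def validate_email(value):\n    return value.count('@') == 1\n"

print(structural_fingerprint(a) == structural_fingerprint(b))  # renamed copy
print(structural_fingerprint(a) == structural_fingerprint(c))  # different logic
```

A plain text comparison treats `a` and `b` as different because the names differ, while the anonymized trees match. Exact equality is the crudest possible measure; a graded similarity score would need a tree-distance metric on top.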
- Python teams using AI coding tools (Copilot, Cursor, Cody) in existing codebases
- Tech leads who want to catch structural erosion before it becomes team habit
- CI pipelines that need a deterministic architecture check without LLM infrastructure
Teams often describe drift as an architectural linter for repositories where GitHub Copilot and similar assistants accelerate local delivery faster than shared design conventions can keep up.
Good fit:
- teams with Python 3.11+ already available locally and in CI
- repositories with 20+ files and recurring refactors across modules
- teams using AI assistance enough that copy-modify drift and boundary erosion are real review problems
Poor fit:
- tiny repos where a few findings would dominate the score
- teams looking for bug finding, security review, or strict pass/fail quality gates on day one
- teams without Python 3.11+ in their execution path yet
Drift works best on Python repositories with 20+ files and some history. If you see too many findings on the first run:
- Start with `drift check --fail-on none` to just observe.
- Focus on findings with score ≥ 0.7; those have the strongest signal.
- Ignore generated code or vendor directories (configure exclusions in `drift.yaml`).
Drift is probably not the right tool if:
- you expect bug finding, security scanning, or type safety enforcement
- you need zero false positives on a tiny repository from day one
- you want one absolute score to replace code review judgment
Drift is most useful when teams treat the score as orientation and the findings as investigation prompts.
The safest adoption path is progressive:
- Start with `drift analyze` locally and review the top findings.
- Add `drift check` in CI as a report-only check for a short period.
- Gate only on `high` findings once the team understands the output.
- Tune config and policies only after reviewing real findings in your repo.
Recommended guides:
Public claims safe to repeat for v0.6.0: Drift is deterministic, benchmarked on 15 real-world repositories in the current study corpus, and uses 6 scoring signals plus 4 report-only signals (DIA, BEM, TPD, GCD) with weight `0.00` until precision improves.
What's limited: Benchmark validation is single-rater; not yet independently replicated. Small repos can be noisy. Temporal signals depend on clone depth. The composite score is orientation, not a verdict.
What's next: Independent external validation, multi-rater ground truth, signal-specific confidence intervals.
Drift is designed to earn trust through determinism and reproducibility:
- no LLMs in the detection pipeline
- reproducible CLI and CI output
- signal-specific interpretation instead of score-only messaging
- explicit benchmarking and known-limitations documentation
The drift score measures structural entropy, not code quality. Keep these principles in mind:
- Interpret deltas, not snapshots. Use `drift trend` to track changes over time. A single score in isolation has limited meaning.
- Temporary increases are expected during migrations. Two coexisting patterns (old and new) will raise PFS/MDS signals. This is the migration happening, not a problem.
- Deliberate polymorphism is not erosion. Strategy, Adapter, and Plugin patterns produce structural similarity that MDS flags as duplication. Findings include a `deliberate_pattern_risk` hint; verify intent before acting.
- The score rewards reduction, not correctness. Deleting code lowers the score just like refactoring does. Do not optimize for a low score; optimize for understood, intentional structure.
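The "deltas, not snapshots" principle can be sketched as a comparison of successive scores. The score history below is invented, and the helper does not call `drift trend`; it only shows the arithmetic of reading changes instead of absolute values.

```python
def score_deltas(history: list[float]) -> list[float]:
    """Return the change between each pair of consecutive scores."""
    return [round(later - earlier, 2) for earlier, later in zip(history, history[1:])]


# Invented composite scores from four consecutive runs.
history = [0.48, 0.52, 0.61, 0.55]
deltas = score_deltas(history)

# A rise during a migration followed by a drop reads very differently
# from the final snapshot (0.55) taken on its own.
print(deltas)
```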
For a detailed discussion of epistemological boundaries (what drift can and cannot see), see STUDY.md §14.
Drift vs. erosion: Without `layer_boundaries` in `drift.yaml`, drift detects emergent drift, meaning structural patterns that diverge without explicit prohibition. With configured `layer_boundaries`, drift additionally performs conformance checking against a defined architecture. Both modes are complementary: drift does not replace dedicated architecture conformance frameworks (e.g. PyTestArch for executable layer rules in pytest), but catches cross-file coherence issues those tools do not model.
Start with the strongest, most actionable findings first. If a signal is noisy for your repository shape, tune or de-emphasize it instead of forcing an early hard gate.
Further reading:
We welcome bug reports, signal improvements, and documentation fixes. If you run drift on your codebase and get surprising results — good or bad — please open an issue or start a discussion.
See CONTRIBUTING.md for setup instructions and good first issues.
- Getting Started
- How It Works
- Benchmarking and Trust
- Product Strategy
- Contributor Guide
- Developer Guide
Drift has a working CLI, GitHub Action, configuration, JSON/SARIF output, benchmark material, and active tests.
Current release posture:
- PyPI classifier remains Alpha intentionally
- core Python analysis: stable
- CI and SARIF workflow: stable
- TypeScript support: experimental
- embeddings-based parts: optional / experimental
- benchmark methodology: evolving
Rationale and matrix: Stability and Release Status
MIT. See LICENSE.
