Sentinel-Audits/sentinelaudit-oss

Sentinel Audit

License: Apache-2.0 Scope: Selective OSS Security Workflow

SentinelAudit is an AI-assisted smart contract security workflow for Ethereum and the broader EVM ecosystem.

This repository is the selective open-source surface of the project: reusable triage, analysis, and validation tooling that sits underneath the product.

Getting Started

Core public surfaces:

  • LLM triage harnesses
  • Slither runner helper modules
  • benchmark corpus and scorecard scripts
  • release and evaluation docs

Start here:

  • OSS_MODULES.md
  • CONTRIBUTING.md
  • CODE_OF_CONDUCT.md

SentinelAudit is more than "run Slither and summarize it." The system is designed to:

  • assemble a compilable workspace from selected repo files
  • run static analysis against that workspace
  • structure findings into report-ready objects
  • carry bounty scope into triage, validation, and dossier generation
  • separate first-party findings from dependency risk
  • keep lower-signal output in research lanes instead of promoting everything
  • support validation workflows instead of stopping at scanner output

The current product direction is captured in ROADMAP.md.

Open-Source Scope

This repo is meant to expose reusable security workflow building blocks:

  • structured triage harnesses
  • repo-aware compile and scan helpers
  • public benchmark corpus and scorecard generation
  • validation runner patterns
  • release and evaluation methodology

Private product layers such as billing, auth, customer history, and internal audit-intelligence operations are intentionally kept out of the public surface.

Public Structure

SentinelAudit is being prepared for a selective public release.

The near-term plan is:

  • publish reusable tooling and method-heavy pieces first
  • keep billing, auth, customer data paths, and internal intelligence flows private
  • avoid breaking the current product control plane while opening modules intentionally

See:

  • PUBLIC_RELEASE_BOUNDARY.md
  • OSS_MODULES.md
  • PUBLIC_RELEASE_CHECKLIST.md
  • PUBLIC_REPO_SETUP.md
  • CONTRIBUTING.md
  • CODE_OF_CONDUCT.md
  • SECURITY.md
  • LICENSE
  • NOTICE

Before publishing any slice of the repo, run:

bun run audit:public-surface

Architecture

flowchart LR
    U["User"] --> W["web<br/>Next.js UI"]
    W --> B["backend<br/>Cloudflare Worker + Hono"]
    B --> S["slither runner<br/>compile + scan"]
    B --> L["llm-worker<br/>structuring + fixes + dossiers"]
    B --> E["echidna runner<br/>optional fuzzing"]
    B --> R["R2 + DB<br/>workspace, findings, jobs, events"]
    S --> B
    L --> B
    E --> B
    B --> W

Audit Model

flowchart TD
    A["Selected repo files"] --> B["Repo-aware workspace expansion"]
    B --> C["Compile context<br/>foundry/remappings/config/vendor roots"]
    C --> D["Slither detectors"]
    D --> E["Normalized findings"]
    E --> F["Semantic fact extraction"]
    F --> F2["Dimensional fact extraction<br/>selected accounting paths"]
    F2 --> G["Structured report objects"]
    G --> H["Promotion policy"]
    H --> I["Report findings"]
    H --> J["Needs review"]
    H --> K["Research notes"]
    G --> L["Dependency findings lane"]
    I --> N["Validation lane<br/>fuzz / deterministic / manual PoC"]
    N --> O["Bounty dossier + export pack"]
    H --> M["Evaluation harnesses<br/>goldset + repo + auditor review set"]

Scope Philosophy

Sentinel now distinguishes between:

  • first-party findings: issues in the user's project code
  • dependency findings: vendored or third-party code such as lib/, vendor/, node_modules/, and package imports
  • research notes: lower-signal output kept for manual review

Dependencies are still analyzed when needed for compilation and context, but they should not dominate the main report verdict by default.
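
The first-party versus dependency split above can be sketched as a path classifier. This is a minimal illustration, not the project's actual logic: `classifyPath`, `ScopeLane`, and the rule that a leading `@scope` segment marks a package import are all assumptions made for the example.

```typescript
// Hypothetical sketch: classify a repo-relative path into a scope lane.
type ScopeLane = "first_party" | "dependency";

// Directory names the README lists as vendored or third-party roots.
const DEPENDENCY_DIRS = new Set<string>(["lib", "vendor", "node_modules"]);

function classifyPath(repoPath: string): ScopeLane {
  // Normalize Windows separators and drop empty or "." segments.
  const segments = repoPath
    .replace(/\\/g, "/")
    .split("/")
    .filter((s) => s.length > 0 && s !== ".");

  // Assumption: a path that starts with "@scope/..." is a package import.
  if (segments[0]?.startsWith("@")) return "dependency";

  // The last segment is the file name, so only directory segments are checked.
  const dirs = segments.slice(0, -1);
  return dirs.some((d) => DEPENDENCY_DIRS.has(d)) ? "dependency" : "first_party";
}
```

A heuristic this simple would also treat a first-party `src/lib/` directory as dependency code, so a real classifier would likely need repo configuration (remappings, vendor roots) rather than directory names alone.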

Trust Model

Sentinel is now built around a stricter promotion rule:

  • detector output is not enough
  • semantic facts come first
  • promotion policy decides whether something becomes:
    • report_finding
    • needs_review
    • research_note

The core questions Sentinel tries to answer automatically are:

  • who can call this path
  • what protection exists
  • who controls the dangerous argument
  • whether state is finalized before external interaction
  • whether the affected code is first-party or dependency code
  • whether arithmetic/accounting logic mixes units like assets, shares, prices, or fee scales in a suspicious way

If those answers are weak or incomplete, Sentinel should hold the finding back instead of pretending to have higher confidence than it does.
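
To make the "hold back when answers are weak" rule concrete, here is a minimal sketch of a promotion function. The `SemanticFacts` shape, the field names, and the thresholds are hypothetical and chosen only for illustration; the three outcome labels are the ones listed above. It assumes the finding has already been identified as first-party.

```typescript
type Promotion = "report_finding" | "needs_review" | "research_note";

// Hypothetical fact record. `undefined` means the fact could not be extracted.
interface SemanticFacts {
  callerIsUntrusted?: boolean;        // who can call this path
  hasEffectiveGuard?: boolean;        // what protection exists
  attackerControlsArgument?: boolean; // who controls the dangerous argument
  stateFinalizedBeforeCall?: boolean; // state finalized before external interaction
}

function promote(facts: SemanticFacts): Promotion {
  const answers = [
    facts.callerIsUntrusted,
    facts.hasEffectiveGuard,
    facts.attackerControlsArgument,
    facts.stateFinalizedBeforeCall,
  ];
  const missing = answers.filter((a) => a === undefined).length;

  // Incomplete evidence: hold the finding back instead of overstating confidence.
  // The cut-off of three missing facts is an illustrative threshold.
  if (missing > 0) {
    return missing >= 3 ? "research_note" : "needs_review";
  }

  const exposed = facts.callerIsUntrusted === true && facts.hasEffectiveGuard === false;
  const dangerous =
    facts.attackerControlsArgument === true || facts.stateFinalizedBeforeCall === false;

  if (exposed && dangerous) return "report_finding";
  return exposed || dangerous ? "needs_review" : "research_note";
}
```

The design point is that detector output never appears as an input: only extracted facts can move a finding into the main report.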

For accounting-sensitive findings, Sentinel also uses a narrow dimensional reasoning layer. This is not a blanket pass over every detector; it is applied only on selected monetary paths where unit confusion can materially affect exploitability and promotion confidence.
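
One way to picture dimensional reasoning is to attach unit exponents to each value and flag additions whose operands disagree. The sketch below is an assumption-laden illustration: the `Unit` names, the `Dimension` representation, and the helper functions are hypothetical, and price is modeled as a derived quantity (assets per share) rather than a base unit.

```typescript
type Unit = "assets" | "shares" | "fee_scale";
// A dimension maps each base unit to an integer exponent; absent means 0.
type Dimension = Partial<Record<Unit, number>>;

const UNITS: Unit[] = ["assets", "shares", "fee_scale"];

// Multiplication adds exponents, division subtracts them.
function combine(a: Dimension, b: Dimension, sign: 1 | -1): Dimension {
  const out: Dimension = {};
  for (const u of UNITS) {
    const exp = (a[u] ?? 0) + sign * (b[u] ?? 0);
    if (exp !== 0) out[u] = exp;
  }
  return out;
}

const mul = (a: Dimension, b: Dimension): Dimension => combine(a, b, 1);
const div = (a: Dimension, b: Dimension): Dimension => combine(a, b, -1);

// Addition and comparison are only meaningful between equal dimensions.
function sameDimension(a: Dimension, b: Dimension): boolean {
  return UNITS.every((u) => (a[u] ?? 0) === (b[u] ?? 0));
}

const ASSETS: Dimension = { assets: 1 };
const SHARES: Dimension = { shares: 1 };
const PRICE_PER_SHARE = div(ASSETS, SHARES); // assets^1 * shares^-1

// shares * (assets / shares) has the dimension of assets, so adding it to assets is fine.
const redeemed = mul(SHARES, PRICE_PER_SHARE);
const okToAdd = sameDimension(redeemed, ASSETS);
// Adding raw shares to assets mixes units and would be flagged.
const suspicious = !sameDimension(SHARES, ASSETS);
```

Under this model a flagged mismatch would be one more fact feeding promotion confidence, not a finding on its own.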

Services

Typical local services:

  • web at http://localhost:3000
  • backend at http://localhost:8787
  • workers/llm-worker at http://localhost:8788
  • api/slither at http://localhost:8080

One-Time Setup

  1. Install dependencies:
cd web && bun install
cd ../backend && bun install
cd ../workers/llm-worker && bun install
cd ../..
  2. Prepare env files:
  • web/.env.local
    • set NEXT_PUBLIC_API_URL=http://localhost:8787
  • backend/.dev.vars
    • local backend worker config
  • workers/llm-worker/.dev.vars
    • copy from workers/llm-worker/.dev.vars.example
  3. Make sure Docker Desktop is running for Slither.

Local Run

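From the repository root, the directory that contains web/, backend/, and workers/ (D:\projects\audit\apps in the original setup):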

bun run dev:slither
bun run dev:echidna
bun run dev:llm
bun run dev:backend
bun run dev:web

Local Verification

bun run check:local

This currently verifies:

  • web TypeScript compiles
  • backend test suite passes
  • llm-worker local-safe tests pass

Important Local Notes

  • local report structuring happens after a job reaches READY, not when the report page refreshes
  • local backend should use LLM_WORKER_URL=http://127.0.0.1:8788
  • backend and llm-worker must share the same LLM_WORKER_TOKEN
  • local backend sends inline findings to the local worker because Wrangler's simulated local R2 is not shared across services
  • backend/.dev.vars should point SLITHER_RUNNER_URL at http://localhost:8080
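
The notes above can be turned into a small pre-flight check. This is a sketch under assumptions: it presumes both `.dev.vars` files use plain `KEY=VALUE` lines and that the worker file also defines `LLM_WORKER_TOKEN`; `parseDevVars` and `checkLocalConfig` are hypothetical helper names, not scripts that ship with the repo.

```typescript
type Vars = Record<string, string>;

// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseDevVars(text: string): Vars {
  const vars: Vars = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no key before "=", ignore the line
    const key = line.slice(0, eq).trim();
    // Strip one surrounding quote character on each side, if present.
    const value = line.slice(eq + 1).trim().replace(/^["']|["']$/g, "");
    vars[key] = value;
  }
  return vars;
}

// Returns a list of human-readable problems; an empty list means the config looks consistent.
function checkLocalConfig(backend: Vars, worker: Vars): string[] {
  const problems: string[] = [];
  if (backend["LLM_WORKER_URL"] !== "http://127.0.0.1:8788") {
    problems.push("backend LLM_WORKER_URL should be http://127.0.0.1:8788");
  }
  if (backend["SLITHER_RUNNER_URL"] !== "http://localhost:8080") {
    problems.push("backend SLITHER_RUNNER_URL should be http://localhost:8080");
  }
  const token = backend["LLM_WORKER_TOKEN"];
  if (!token || token !== worker["LLM_WORKER_TOKEN"]) {
    problems.push("backend and llm-worker must share the same LLM_WORKER_TOKEN");
  }
  return problems;
}
```

Feeding it the contents of backend/.dev.vars and workers/llm-worker/.dev.vars before starting the services would surface a token or URL mismatch early.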

What AI Is Responsible For

AI is used as a constrained layer on top of deterministic inputs.

Good uses:

  • structuring normalized findings
  • exploitability-oriented triage
  • fix generation
  • bounty dossier generation
  • Echidna harness planning/generation
  • proof-plan generation for deterministic tests and manual PoCs
  • deterministic test generation and execution with repo-aware Foundry/Hardhat selection

Bad uses:

  • inventing vulnerabilities from raw detector output
  • promoting dependency noise into first-party findings
  • acting like a one-click replacement for human audit judgment

What Still Matters Most

The next serious quality leap is not "more AI."

It is automatic repo-local semantic extraction and stronger promotion policy:

  • who can call this
  • what protects it
  • what input is attacker-controlled
  • what state changes before and after sensitive calls
  • whether the code is first-party or vendored

That is the path from "AI-assisted scanner" to "real audit triage system."

Validation Model

Sentinel now treats validation as a first-class lane after findings are structured.

  • fuzz_target
    • suitable for Echidna harness generation and counterexample hunting
  • deterministic_test
    • better validated through a crisp transaction sequence, Foundry test, and state assertions
  • manual_poc
    • needs an attacker walkthrough, evidence capture, and reviewer-facing proof
  • review_only
    • useful context, but not worth automated proof generation

Validation evidence can strengthen a finding:

  • successful Echidna counterexamples are merged back into report findings
  • successful deterministic tests from Foundry or Hardhat are merged back into report findings
  • validated findings rise in report ordering
  • bounty dossiers and exports now carry validation posture explicitly
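
A rough sketch of how validation evidence could influence report ordering follows. The `ReportFinding` shape and the choice to use validation as a tie-breaker within the same severity are assumptions; the README only states that validated findings rise in the ordering. The four validation modes are the ones listed above.

```typescript
type Severity = "high" | "medium" | "low";
type ValidationMode = "fuzz_target" | "deterministic_test" | "manual_poc" | "review_only";

// Hypothetical finding shape for illustration.
interface ReportFinding {
  id: string;
  severity: Severity;
  validationMode: ValidationMode;
  validated: boolean; // true once a counterexample or deterministic test succeeded
}

const SEVERITY_RANK: Record<Severity, number> = { high: 0, medium: 1, low: 2 };

// Merge successful validation evidence back into a finding without mutating it.
function markValidated(finding: ReportFinding): ReportFinding {
  return { ...finding, validated: true };
}

// Sort by severity first, then put validated findings ahead of unvalidated ones.
function orderForReport(findings: ReportFinding[]): ReportFinding[] {
  return [...findings].sort((a, b) => {
    const bySeverity = SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity];
    if (bySeverity !== 0) return bySeverity;
    return Number(b.validated) - Number(a.validated);
  });
}
```

A policy where validated findings outrank everything regardless of severity would be equally consistent with the description; only the comparator would change.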

Bounty Workflow

Sentinel's bounty mode is not just a themed report.

It now supports:

  • scope setup on /bounty
  • fresh audits with bounty scope attached from the start
  • attaching bounty scope to existing completed audits
  • scope-aware LLM triage during structuring
  • bounty dossier generation
  • bounty pack export
  • proof planning for manual PoC and deterministic test work
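
As an illustration of what scope-aware triage might check, the sketch below tests whether a finding's file falls inside a bounty scope. The `BountyScope` shape (path prefixes with out-of-scope entries taking precedence) is purely hypothetical; the README does not describe how scope is represented.

```typescript
// Hypothetical scope representation: repo-relative path prefixes.
interface BountyScope {
  inScope: string[];
  outOfScope: string[];
}

// True when `path` equals the prefix or sits underneath it as a directory.
function matchesPrefix(path: string, prefix: string): boolean {
  const p = prefix.replace(/\/+$/, ""); // drop trailing slashes
  return path === p || path.startsWith(p + "/");
}

// Out-of-scope entries win over in-scope entries.
function isInBountyScope(scope: BountyScope, path: string): boolean {
  if (scope.outOfScope.some((x) => matchesPrefix(path, x))) return false;
  return scope.inScope.some((x) => matchesPrefix(path, x));
}
```

For example, with `inScope: ["src/"]` and `outOfScope: ["src/mocks"]`, a finding in `src/Vault.sol` stays eligible for the dossier while one in `src/mocks/MockToken.sol` does not.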

Evaluation Harnesses

Sentinel now has three complementary trust harnesses inside workers/llm-worker:

  • snippet gold set
    • verdict quality on isolated cases
  • repo benchmark set
    • repo-shaped regression fixtures
  • auditor review set
    • whether a finding is strong enough to deserve the main report
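
A harness of this kind ultimately compares predicted verdicts against expected ones. The following is a minimal scoring sketch, not the repo's scorecard script: `GoldCase`, `scoreVerdicts`, and the "over-promoted" metric are hypothetical names and choices for illustration.

```typescript
type Verdict = "report_finding" | "needs_review" | "research_note";

// Hypothetical gold-set entry: a case id and the verdict a reviewer expects.
interface GoldCase {
  id: string;
  expected: Verdict;
}

interface Scorecard {
  total: number;
  correct: number;
  overPromoted: number; // predicted a stronger lane than expected
  missing: number;      // no prediction produced for the case
  accuracy: number;
}

const VERDICT_RANK: Record<Verdict, number> = {
  research_note: 0,
  needs_review: 1,
  report_finding: 2,
};

function scoreVerdicts(cases: GoldCase[], predicted: Map<string, Verdict>): Scorecard {
  let correct = 0;
  let overPromoted = 0;
  let missing = 0;
  for (const c of cases) {
    const p = predicted.get(c.id);
    if (p === undefined) {
      missing += 1;
    } else if (p === c.expected) {
      correct += 1;
    } else if (VERDICT_RANK[p] > VERDICT_RANK[c.expected]) {
      overPromoted += 1;
    }
  }
  const total = cases.length;
  return { total, correct, overPromoted, missing, accuracy: total === 0 ? 0 : correct / total };
}
```

Tracking over-promotion separately from plain accuracy matches the stated goal of keeping weak findings out of the main report.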

Useful commands:

cd workers/llm-worker
bun run test:triage
bun run eval:triage
bun run eval:triage:repo
bun run eval:triage:auditor

Production Learning Loop

Sentinel now has the start of a real production-learning loop.

For each completed audit, the backend stores a private audit intelligence artifact in R2 under:

  • results/internal/audit-intelligence/<jobId>.json

The goal is to make real audit runs useful for improving Sentinel over time without relying on a fixed target list. These artifacts are designed for:

  • batch download and offline analysis
  • finding detector families that still create noisy needs_review output
  • spotting low semantic-fact coverage in real repos
  • building new benchmark fixtures and auditor review cases from real usage
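
For the offline-analysis use case, a sketch of the kind of aggregation these artifacts could support is shown below. Only the R2 key pattern comes from this README; the `AuditIntelligenceArtifact` shape and its field names are assumptions, since the real artifact schema is private.

```typescript
// Key pattern taken from the README: results/internal/audit-intelligence/<jobId>.json
function auditIntelligenceKey(jobId: string): string {
  return `results/internal/audit-intelligence/${jobId}.json`;
}

// Hypothetical artifact shape used only for this example.
interface AuditIntelligenceArtifact {
  jobId: string;
  findings: { detector: string; promotion: "report_finding" | "needs_review" | "research_note" }[];
}

// Count needs_review findings per detector across a batch, noisiest first.
function needsReviewByDetector(artifacts: AuditIntelligenceArtifact[]): [string, number][] {
  const counts = new Map<string, number>();
  for (const artifact of artifacts) {
    for (const f of artifact.findings) {
      if (f.promotion !== "needs_review") continue;
      counts.set(f.detector, (counts.get(f.detector) ?? 0) + 1);
    }
  }
  // Sort by count descending, then by detector name for a stable result.
  return Array.from(counts.entries()).sort(
    (a, b) => b[1] - a[1] || a[0].localeCompare(b[0]),
  );
}
```

The detectors at the top of such a list would be natural candidates for new benchmark fixtures or tighter promotion rules.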

More Docs

  • backend/README.md
  • web/README.md
  • workers/llm-worker/README.md
  • ROADMAP.md

About

Selective open-source smart contract security tooling. AI-assisted triage harnesses, repo-aware Slither helpers, and validation workflow components.
