feat: add IDOR testing as a 6th parallel pipeline #154
Closed
mesutgungor wants to merge 166 commits into KeygraphHQ:main
fixes
Simplified
typo
italics
assets
Updated Discord invite links in README.md to use a permanent invite link that will not expire.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
chore: added logging
…he actual content
Simplified deliverable management by removing automatic copying to ~/Documents/pentest-deliverables/. All deliverables now remain only in <target-repo>/deliverables/, eliminating file duplication and improving UX.
Changes:
- Removed savePermanentDeliverables() function from src/setup/deliverables.js
- Removed function call and related console output from shannon.mjs
- Removed unused 'os' import from deliverables.js
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove unnecessary screenshot storage to reduce file I/O and disk usage:
- Removed screenshot directory creation
- Removed --output-dir flag from Playwright MCP setup
- Agents can still take screenshots, but they won't persist to disk
Screenshots were not being used by any part of Shannon for analysis or reporting, making their storage unnecessary overhead.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
…healing
## Unified Audit System (v3.0)
- Implemented crash-safe, append-only logging to audit-logs/{hostname}_{sessionId}/
- Added session.json with comprehensive metrics (timing, cost, attempts)
- Agent execution logs with turn-by-turn detail
- Prompt snapshots saved to audit-logs/.../prompts/{agent}.md
- SessionMutex prevents race conditions during parallel execution
- Self-healing reconciliation before every CLI command
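The commit does not show how SessionMutex is built. As a rough illustration of the idea only, a per-session lock can be sketched as a promise chain: each caller waits for the previous holder before running. The class body, the `runExclusive` method name, and its signature below are assumptions for illustration, not the project's code.

```typescript
// Hypothetical sketch of a per-session mutex; not Shannon's actual SessionMutex.
class SessionMutex {
  // Tail of the promise chain for each session id; new tasks wait on it.
  private tails = new Map<string, Promise<void>>();

  async runExclusive<T>(sessionId: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(sessionId) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // The next caller waits until the previous holder AND this one are done.
    this.tails.set(sessionId, previous.then(() => current));
    await previous;
    try {
      return await task();
    } finally {
      // Always release, even if the task throws, so later callers are not stuck.
      release();
    }
  }
}
```

Tasks submitted for the same session id run strictly one after another, while different session ids do not block each other. This sketch never removes finished entries from the map, which a long-lived process would want to handle.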
## Session Metadata Standardization
- Fixed critical bug: standardized on 'id' field (not 'sessionId') throughout codebase
- Updated: shannon.mjs (recon, report), src/phases/pre-recon.js
- Added validation in AuditSession to fail fast on incorrect field usage
- JavaScript shorthand syntax was causing wrong field names
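To make the shorthand pitfall concrete, here is a minimal sketch. The `SessionMetadata` shape, the `webUrl` field, and the `assertSessionMetadata` helper are assumptions for illustration; the commit only states that `id` is the canonical field and that validation now fails fast.

```typescript
// Hypothetical metadata shape; only the `id` field name comes from the commit.
type SessionMetadata = { id: string; webUrl: string };

const sessionId = "abc123";
const webUrl = "https://example.test";

// Shorthand uses the variable name as the key, so this object carries
// `sessionId`, not `id`. On a loosely typed object nothing flags it.
const wrong: Record<string, string> = { sessionId, webUrl };

// Writing the key explicitly produces the field the rest of the code expects.
const right: SessionMetadata = { id: sessionId, webUrl };

// Fail-fast check in the spirit of the validation described above.
function assertSessionMetadata(meta: Record<string, unknown>): void {
  const id = meta["id"];
  if (typeof id !== "string" || id.length === 0) {
    throw new Error("session metadata must use a non-empty 'id' field");
  }
  if ("sessionId" in meta) {
    throw new Error("unexpected 'sessionId' field; use 'id'");
  }
}
```

Calling `assertSessionMetadata(wrong)` throws immediately, while `assertSessionMetadata(right)` passes.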
## Schema Improvements
- session.json: Added cost_usd per phase, removed redundant final_cost_usd
- Renamed 'percentage' -> 'duration_percentage' for clarity
- Simplified agent metrics to single total_cost_usd field
- Removed unused validation object from schema
## Legacy System Removal
- Removed savePromptSnapshot() - prompts now only saved by audit system
- Removed target repo pollution (prompt-snapshots/ no longer created)
- Single source of truth: audit-logs/{hostname}_{sessionId}/prompts/
## Export Script Simplification
- Removed JSON export mode (session.json already exists)
- CSV-only export with clean columns: agent, phase, status, attempts, duration_ms, cost_usd
- Tested on real session data
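A CSV export with the columns listed above could look roughly like the following. The `AgentRow` type and the `toCsv` and `csvCell` helpers are hypothetical; only the column names are taken from the commit message, and the real script reads its rows from session.json.

```typescript
// Hypothetical row type; field names mirror the CSV columns in the commit.
type AgentRow = {
  agent: string;
  phase: string;
  status: string;
  attempts: number;
  duration_ms: number;
  cost_usd: number;
};

const COLUMNS = ["agent", "phase", "status", "attempts", "duration_ms", "cost_usd"] as const;

// Quote a cell only when it contains a comma, quote, or newline.
function csvCell(value: string | number): string {
  const text = String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Emit a header line followed by one line per agent.
function toCsv(rows: AgentRow[]): string {
  const lines = [COLUMNS.join(",")];
  for (const row of rows) {
    lines.push(COLUMNS.map((column) => csvCell(row[column])).join(","));
  }
  return lines.join("\n");
}
```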
## Documentation
- Updated CLAUDE.md with audit system architecture
- Added .gitignore entry for audit-logs/
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Reasoning:
- Shannon is a local CLI tool with direct filesystem access
- Manual file editing (JSON, rm -rf) is simpler than reconciliation script
- Automatic reconciliation runs before every command (built-in)
- If auto-reconciliation has bugs, fix the code, don't create workarounds
- Over-engineered for a local development tool
For recovery: Just delete .shannon-store.json or edit JSON files directly
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Added comprehensive header comment explaining use case
- Documents data source (session.json from audit-logs)
- CSV output format and use cases clearly described
- Includes usage examples and note about raw data access
- Removes need for separate docs/ folder in repo
Docs were design artifacts, not needed in open source repo. All relevant documentation now lives in code comments.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Reasoning:
- Pollutes target repo with run-metadata.json
- Redundant with audit system (session.json has all metadata)
- Less useful than comprehensive audit logs
- Target repos should stay clean - only deliverables belong there
All debugging info now lives in audit-logs/{hostname}_{sessionId}/session.json
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
ROOT CAUSE:
- Exploitation phase checked session.validationResults to determine eligibility
- validationResults field was removed during audit system refactor
- Field never existed in session schema, so all exploits were skipped
THE FIX:
- Exploitation phase now validates queue files directly when checking eligibility
- Reads exploitation_queue.json and checks if vulnerabilities array is non-empty
- No need to store validation results - just re-validate on demand
CHANGES:
1. runParallelExploit() now calls safeValidateQueueAndDeliverable() directly
2. Removed validationResults parameter from markAgentCompleted()
3. Simplified calculateVulnerabilityAnalysisSummary() - no longer needs validation data
4. Simplified calculateExploitationSummary() - no longer needs validation data
IMPACT:
- Exploitation agents will now run when vulnerabilities are found
- Queue files are the single source of truth for eligibility
- Simpler architecture - no duplicate state storage
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
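The eligibility rule described in this fix (run the exploit agent only when the queue's vulnerabilities array is non-empty) can be sketched as below. This is an illustration under assumptions: `parseQueue` and `shouldRunExploit` are hypothetical names, the queue is passed in as a raw JSON string rather than read from disk, and the project's safeValidateQueueAndDeliverable() may check more than this.

```typescript
// Assumed queue shape: the commit only says the file has a `vulnerabilities` array.
type ExploitationQueue = { vulnerabilities: unknown[] };

// Parse the raw file content; return null for malformed or unexpected input.
function parseQueue(raw: string): ExploitationQueue | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null) {
      const candidate = parsed as { vulnerabilities?: unknown };
      if (Array.isArray(candidate.vulnerabilities)) {
        return { vulnerabilities: candidate.vulnerabilities };
      }
    }
  } catch {
    // Malformed JSON is treated as "no usable queue".
  }
  return null;
}

// Hypothetical eligibility check: re-derived from the queue on every call,
// so no validation result needs to be stored in the session.
function shouldRunExploit(raw: string): boolean {
  const queue = parseQueue(raw);
  return queue !== null && queue.vulnerabilities.length > 0;
}
```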
…c-ai/claude-agent-sdk
Anthropic rebranded the SDK in 2025 from "Claude Code SDK" to "Claude Agent SDK". Updated all references across package.json, Dockerfile, and documentation to use the current @anthropic-ai/claude-agent-sdk package.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Remove unused files and exports to improve codebase maintainability:
Phase 1 - Deleted files (5):
- login_resources/generate-totp-standalone.mjs (replaced by MCP tool)
- mcp-server/src/tools/index.js (unused barrel export)
- mcp-server/src/utils/index.js (unused barrel export)
- mcp-server/src/validation/index.js (unused barrel export)
- src/agent-status.js (deprecated 309-line status manager)
Phase 2 - Removed unused exports (3):
- mcp-server/src/index.js: shannonHelperServer constant
- mcp-server/src/utils/error-formatter.js: createFileSystemError function
- src/utils/git-manager.js: cleanWorkspace (now internal-only)
Phase 3 - Unexported internal functions (4):
- src/checkpoint-manager.js: runSingleAgent, runAgentRange, runParallelVuln, runParallelExploit (internal use only)
All Shannon CLI commands tested and verified working.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Strip _shannon-* suffix from workflow IDs so logs command finds audit-logs stored under the workspace name.
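A minimal sketch of the suffix stripping, assuming the workflow ID has the form `<workspace>_shannon-<something>`. The exact ID format and the `workspaceFromWorkflowId` name are assumptions; the commit only says a `_shannon-*` suffix is removed.

```typescript
// Hypothetical helper: drop everything from the first "_shannon-" onward.
// IDs without that marker are returned unchanged.
function workspaceFromWorkflowId(workflowId: string): string {
  return workflowId.replace(/_shannon-.*$/, "");
}
```

With this pattern, a workspace name that itself contains `_shannon-` would be cut at the first occurrence, so a real implementation may need a stricter match.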
…-and-claude-cli
feat: add MSYS path fix, Claude Code CLI, and Windows instructions
- Early exit when all agents already completed instead of running empty workflow
- Descriptive error when deliverables missing from disk despite session.json success
- Quote $WORKSPACE in shannon CLI to prevent word splitting
Query functionality is redundant with the Temporal Web UI at http://localhost:8233. Removes query.ts, CLI handler, npm script, and all documentation references.
feat: add named workspaces with resume support
- Delete 4 dead files: pre-recon.ts, tool-checker.ts, input-validator.ts, environment.ts
- Remove runClaudePromptWithRetry() and its now-unused imports from claude-executor.ts
- De-export unused symbols: AGENT_ORDER, getParallelGroups, logError, isRouterMode, showHelp, displayTimingSummary
- De-export unused types: ProcessingState, ProcessingResult, SdkMessage, MessageDispatchResult, MessageDispatchContext
- Remove dead import (path from zx) in session-manager.ts and deprecated comment in config.ts

- Delete unused src/cli/ui.ts, remove zod dependency, drop 4 dead functions (logError, handleToolError, getRetryDelay, displayTimingSummary)
- Remove 8 unused types/interfaces and 3 duplicate formatting utils from audit/utils.ts
- Narrow export surface: make 7 message-handler functions private, remove unused audit re-exports, unexport AgentDefinition and path constants
- Remove unused runClaudePrompt params (sessionMetadata, attemptNumber) and update caller
- Enable tsconfig noUnusedLocals, noUnusedParameters, noImplicitReturns, noImplicitOverride, noFallthroughCasesInSwitch

- Remove 4 duplicate file I/O functions from audit/utils.ts, re-export from utils/file-io.ts
- Consolidate AgentEndResult interface into new types/audit.ts
- Use exported AgentDefinition from types/agents.ts in session-manager.ts
- Rename AgentMetrics to AgentAuditMetrics to disambiguate from temporal/shared.ts

…cation
- Add DI container (src/services/) with AgentExecutionService, ConfigLoaderService, and ExploitationCheckerService: pure domain logic with no Temporal dependencies
- Introduce Result<T, E> type and ErrorCode enum for code-based error classification in classifyErrorForTemporal, replacing scattered string matching
- Consolidate billing/spending cap detection into utils/billing-detection.ts with shared pattern lists across message-handlers, claude-executor, and error-handling
- Extract LogStream abstraction for append-only logging with backpressure, used by both AgentLogger and WorkflowLogger
- Simplify activities.ts from inline lifecycle logic to thin wrappers delegating to services, with heartbeat and error classification
- Expand config-parser with human-readable AJV errors, security validation, and rule type-specific checks
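As a sketch of what code-based error classification with a Result type can look like: the code values, the `CodedError` shape, and the `parsePort` and `isRetryable` helpers below are all hypothetical, and a const object stands in for the enum the commit mentions. It is not the project's classifyErrorForTemporal.

```typescript
// Hypothetical error codes; a const object is used here in place of an enum.
const ErrorCode = {
  ConfigInvalid: "CONFIG_INVALID",
  BillingLimit: "BILLING_LIMIT",
  Unknown: "UNKNOWN",
} as const;
type ErrorCode = (typeof ErrorCode)[keyof typeof ErrorCode];

// A Result is either a success carrying a value or a failure carrying an error.
type Result<T, E> = { ok: true; value: T } | { ok: false; error: E };
type CodedError = { code: ErrorCode; message: string };

// Example producer: returns a coded error instead of throwing.
function parsePort(input: string): Result<number, CodedError> {
  const port = Number(input);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    return { ok: false, error: { code: ErrorCode.ConfigInvalid, message: `invalid port: ${input}` } };
  }
  return { ok: true, value: port };
}

// Classification switches on the code, not on substrings of the message.
// Which codes count as retryable is an assumption made for this sketch.
function isRetryable(code: ErrorCode): boolean {
  switch (code) {
    case ErrorCode.ConfigInvalid:
      return false;
    case ErrorCode.BillingLimit:
    case ErrorCode.Unknown:
      return true;
  }
}
```

Because the classifier sees a closed set of codes, the compiler can flag a missing case when a new code is added, which string matching on messages cannot do.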
- Add ActivityLogger interface wrapping Temporal's Context.current().log
- Thread logger parameter through claude-executor, message-handlers, git-manager, prompt-manager, reporting, and agent validators
- Remove chalk dependency from all service/activity files; CLI files keep console.log for terminal output
- Replace colorFn: ChalkInstance parameter with structured logger.info/warn/error calls
- Use replay-safe `log` import from @temporalio/workflow in workflows.ts

- Move error-handling, git-manager, prompt-manager, queue-validation, and reporting into src/services/
- Delete src/constants.ts; relocate AGENT_VALIDATORS and MCP_AGENT_MAPPING into session-manager.ts alongside agent definitions
- Delete src/utils/output-formatter.ts; absorb filterJsonToolCalls and getAgentPrefix into ai/output-formatters.ts
- Extract ActivityLogger interface into src/types/activity-logger.ts to break temporal/ → services circular dependency
- Consolidate VulnType, ExploitationDecision into types/agents.ts and SessionMetadata into types/audit.ts
- Remove dead timingResults/costResults globals from utils/metrics.ts and all consumers

- Remove empty section markers (// === ... ===, // --- ... ---) that duplicate JSDoc or function names
- Remove "what" comments that restate the next line of code (e.g. // Save to disk, // Check for retryable patterns)
- Remove file-level descriptions that restate the filename (e.g. // Pure functions for formatting console output)
- Fix "Added by client" comment referencing implementation history → "Used for audit correlation"
- Preserve all WHY comments: error classification groups, billing/session limit explanations, ESM interop, exactOptionalPropertyTypes, mutex reasoning

…nd agent-execution
- client.ts: extract parseCliArgs, resolveWorkspace, buildPipelineInput, display helpers, waitForWorkflowResult from startPipeline
- workflows.ts: extract runSequentialPhase, buildPipelineConfigs, aggregatePipelineResults to reduce workflow body
- agent-execution.ts: add failAgent private method to deduplicate rollback+audit+error pattern in steps 6-8

- Add // N. Description steps to temporal layer (client, activities, workflows)
- Add steps to AI layer (claude-executor: runClaudePrompt, buildMcpServers)
- Add steps to services layer (prompt-manager, config-parser, git-manager)
- Add steps to audit layer (metrics-tracker, audit-session)
- Update CLAUDE.md comment guidelines with clearer numbered-step vs section-divider guidance
docs: add WSL2 setup guide for Windows users
refactor: decompose activities into services layer with structured error handling
- Add preflight activity that validates repo path, config, and credentials before agent execution
- Add formatWorkflowError() with pipe-delimited segments for multi-line log rendering
- Add remediation hints for common failures (auth, billing, config errors)
- Add REPO_NOT_FOUND, AUTH_FAILED, BILLING_ERROR codes with error classification
- Add formatErrorBlock() in WorkflowLogger for indented error display
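One way to read "pipe-delimited segments for multi-line log rendering" is that an error is carried as a single line and split back into indented lines for display. The sketch below follows that reading; `encodeSegments` and `renderErrorBlock` are hypothetical names and the segment layout is an assumption, not the project's formatWorkflowError() or formatErrorBlock().

```typescript
// Hypothetical encoder: pack code, message, and optional hint into one line.
function encodeSegments(code: string, message: string, hint?: string): string {
  const parts = [`[${code}]`, message];
  if (hint !== undefined) {
    parts.push(`Hint: ${hint}`);
  }
  return parts.join("|");
}

// Hypothetical renderer: one indented line per segment.
function renderErrorBlock(encoded: string, indent: string = "  "): string {
  return encoded
    .split("|")
    .map((segment) => indent + segment.trim())
    .join("\n");
}
```

A message that itself contains `|` would be split in the wrong place, so a real implementation would need escaping or a different delimiter.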
…orkflow-errors.ts
Replaces validateApiKey and validateOAuthToken (direct fetch calls) with a single SDK-based query using claude-haiku-4-5-20251001. Uses SDKAssistantMessageError types for structured error classification and returns human-readable error messages for each failure case.
…ation
feat: add preflight validation phase with structured error reporting
ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN were not forwarded to the SDK subprocess environment, causing router mode to fail with "Authentication failed: Invalid API key" as the subprocess hit Anthropic directly with the placeholder key.
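The fix amounts to making sure the router variables reach the child process. A minimal sketch of an explicit allowlist follows; `buildSubprocessEnv` and the `FORWARDED_VARS` list are hypothetical, and including ANTHROPIC_API_KEY in the list is an assumption based on the "placeholder key" mentioned above.

```typescript
// Hypothetical allowlist of variables the subprocess needs.
const FORWARDED_VARS = ["ANTHROPIC_API_KEY", "ANTHROPIC_BASE_URL", "ANTHROPIC_AUTH_TOKEN"] as const;

// Copy only allowlisted, non-empty variables from the parent environment.
function buildSubprocessEnv(parent: Record<string, string | undefined>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const name of FORWARDED_VARS) {
    const value = parent[name];
    if (value !== undefined && value !== "") {
      env[name] = value;
    }
  }
  return env;
}
```

If the base URL is left out of this set, the subprocess falls back to the default endpoint with the placeholder key, which matches the failure described in the commit.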
…through
fix: pass router env vars to SDK subprocess
Implements Insecure Direct Object Reference (IDOR) vulnerability analysis and exploitation as a new parallel agent pair in the pentesting pipeline, addressing a gap in the Business Logic Testing coverage (WSTG-BUSLOGIC).
The IDOR agent is distinct from the existing authz agent: authz checks whether access control guards exist on endpoints, while IDOR specifically audits whether object ownership is enforced at the data access layer (e.g. missing AND user_id = $currentUser in queries).
Changes:
- Add idor-vuln and idor-exploit agents to ALL_AGENTS and VulnType
- Add playwright-agent6 for isolated parallel browser execution
- Register agents in AGENTS, AGENT_PHASE_MAP, MCP_AGENT_MAPPING, AGENT_VALIDATORS
- Add idor to VULN_TYPE_CONFIG in queue-validation service
- Include idor_exploitation_evidence.md in final report assembly
- Add IDOR_ANALYSIS, IDOR_QUEUE, IDOR_EVIDENCE deliverable types to MCP server
- Add runIdorVulnAgent and runIdorExploitAgent activity functions
- Wire IDOR into buildPipelineConfigs (pipeline grows from 5 to 6 pairs)
- Add vuln-idor.txt: full analysis prompt covering direct/indirect references, mass assignment, cross-object references, and enumeration feasibility
- Add exploit-idor.txt: exploitation prompt with sequential enumeration, UUID substitution, filename forging, and write/delete IDOR techniques
- Add pipeline-testing variants for fast iteration
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
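The distinction between an endpoint guard and an ownership check can be illustrated with an in-memory stand-in for the data layer. Everything here (`Invoice`, the sample records, both lookup functions) is hypothetical and only mirrors the "missing AND user_id = $currentUser" pattern named in the description.

```typescript
// Hypothetical record type and data; stands in for a database table.
type Invoice = { id: number; ownerId: number; total: number };

const invoices: Invoice[] = [
  { id: 1, ownerId: 100, total: 250 },
  { id: 2, ownerId: 200, total: 75 },
];

// Vulnerable pattern: any authenticated caller can fetch any id.
// Equivalent to a query with WHERE id = $id and no owner condition.
function getInvoiceUnscoped(id: number): Invoice | undefined {
  return invoices.find((invoice) => invoice.id === id);
}

// Scoped pattern: the owner condition is part of the lookup itself.
// Equivalent to WHERE id = $id AND user_id = $currentUser.
function getInvoiceScoped(id: number, currentUserId: number): Invoice | undefined {
  return invoices.find((invoice) => invoice.id === id && invoice.ownerId === currentUserId);
}
```

User 200 asking for invoice 1 gets the record from `getInvoiceUnscoped` but `undefined` from `getInvoiceScoped`, even if both sit behind the same login check. That gap is what an IDOR analysis looks for.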
Collaborator
Apologies for the close; this was automatically closed due to a maintenance operation on the main branch. Unfortunately GitHub doesn't allow reopening after a history rewrite.
Implements Insecure Direct Object Reference (IDOR) vulnerability analysis and exploitation as a new parallel agent pair in the pentesting pipeline, addressing a gap in the Business Logic Testing coverage (WSTG-BUSLOGIC).
The IDOR agent is distinct from the existing authz agent: authz checks whether access control guards exist on endpoints, while IDOR specifically audits whether object ownership is enforced at the data access layer (e.g. missing AND user_id = $currentUser in queries).
Changes: