Audit remediation + autonomous skill dispatcher + LLM decision capture#1

Open
perlantir wants to merge 56 commits into main from fix/audit-phase-0-cleanup

Conversation


@perlantir perlantir commented Apr 14, 2026

Summary

All hermulti-side work for the hipp0 GBrain-tier program. Scope expanded from the initial audit remediation to include Phase 2 (signal detector) and Phase 5 (autonomous skill dispatcher, Tier B).

Audit remediation (earlier commits)

  • Hermes closed outcome loop: turn-boundary record_outcome, outcome_signals.py, reflection NULL-outcome backfill, closed-loop integration tests
  • Async/sync stability fixes, multi-agent context routing, WAL race-safe drain, prune_sessions N+1
  • Cost governor (daily budget), routing classifier with feedback edge, outcome inference drift detector
  • Bench hot-path gates + CI drift monitors
  • See earlier commits for full detail

Phase 2: signal detector (1 commit)

  • feat(signals): extract_decision_signals() + DecisionSignal dataclass added to agent/outcome_signals.py; record_decision() added to Hipp0MemoryProvider; wired into turn loop with try/except fallback. Regex-based (5 patterns for explicit decisions, rejections, confidence inference). 5 tests.
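A minimal sketch of the regex-based extractor described above. The names `DecisionSignal` and `extract_decision_signals` come from the PR; the specific patterns and confidence values here are illustrative assumptions, not the shipped five:

```python
import re
from dataclasses import dataclass

@dataclass
class DecisionSignal:
    """One captured decision-like statement from a turn."""
    kind: str          # "decision" | "rejection"
    text: str          # the matched phrase
    confidence: float  # coarse confidence derived from pattern strength

# Illustrative patterns only -- the real module ships 5 of these,
# covering explicit decisions, rejections, and confidence inference.
_PATTERNS = [
    (re.compile(r"\b(?:we|i)(?:'ll| will) (?:use|go with) (.+?)[.!\n]", re.I), "decision", 0.9),
    (re.compile(r"\bdecided (?:to|on) (.+?)[.!\n]", re.I), "decision", 0.8),
    (re.compile(r"\b(?:don't|do not|won't) use (.+?)[.!\n]", re.I), "rejection", 0.8),
]

def extract_decision_signals(text: str) -> list[DecisionSignal]:
    signals = []
    for pattern, kind, conf in _PATTERNS:
        for m in pattern.finditer(text):
            signals.append(DecisionSignal(kind=kind, text=m.group(1).strip(), confidence=conf))
    return signals
```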

Phase 5: autonomous skill dispatcher (Tier B, 5 commits)

Upgrades from regex-only signal capture to a full skill execution framework driven by LLM with structured-output actions.

  • feat(skills): SkillLoader parses RESOLVER.md + SKILL.md files. Custom YAML frontmatter parser (no external dep). Reads from HIPP0_SKILLS_DIR or defaults to /root/audit/hipp0ai/skills.
  • feat(skills): TriggerMatcher — regex-first matching of events against skill triggers, with optional LLM-classifier fallback for ambiguous events. Pre-compiles trigger phrases to EventType tags and literal-substring regexes.
  • feat(skills): SkillRunner — builds LLM prompt from skill body + event, parses structured JSON actions, dispatches to hipp0_provider.record_decision / record_outcome / log / noop. Fully typed via LLMClient and Hipp0ProviderProto Protocols — no hard coupling to concrete classes.
  • feat(skills): SkillDispatcher orchestrator with priority ordering:
    • brain-ops READ phase fires FIRST on PRE_TASK events
    • signal-detector runs in PARALLEL (fire-and-forget, never blocks) on INBOUND/OUTBOUND messages
    • Other matched skills run SEQUENTIALLY (awaited)
    • brain-ops WRITE phase fires LAST on POST_DECISION / POST_OUTCOME
  • feat(skills): Turn loop integration: _get_skill_dispatcher() lazy-init helper on AIAgent. When enabled, dispatches SkillEvent(OUTBOUND_MESSAGE) via fire-and-forget asyncio.create_task(). Regex fallback preserves Phase 2 semantics when dispatcher is off.

Uses hermulti's existing auxiliary_client.py via AuxiliaryLLMAdapter (prefers async clients, falls back to sync call_llm through a thread executor).

Config flags

  • HIPP0_SKILL_DISPATCHER=auto|on|off (default: auto, enabled iff LLM configured)
  • HIPP0_SKILL_LLM_PROVIDER=anthropic|codex|openai-codex (default: probe all)
  • HIPP0_SKILLS_DIR (default: /root/audit/hipp0ai/skills)
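The flag semantics above can be sketched as a small resolver. The env var names are from the PR; the function name `resolve_dispatcher_mode` is a hypothetical helper for illustration:

```python
import os

def resolve_dispatcher_mode(llm_configured: bool) -> bool:
    """Decide whether the skill dispatcher runs, mirroring the flag
    semantics above: on/off are explicit, auto defers to LLM availability."""
    mode = os.environ.get("HIPP0_SKILL_DISPATCHER", "auto").lower()
    if mode == "on":
        return True
    if mode == "off":
        return False
    # "auto": enabled iff an LLM provider is configured
    return llm_configured
```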

Test plan

  • pytest tests/skills/ — 122 tests covering loader (5), matcher (9), runner (8), dispatcher (8) + integration
  • pytest tests/agent/test_decision_signals.py — 5 Phase 2 regex tests
  • Combined: 127 tests green
  • py_compile run_agent.py parses cleanly after integration edit
  • Dispatcher verified fire-and-forget — cannot block turn loop on LLM slowness/failure
  • Regex fallback preserves Phase 2 behavior when dispatcher disabled
  • All earlier audit-remediation tests pass
  • Pre-commit hook (em-dash/secret scrub) passes on all commits

Companion PR

Paired with hipp0ai/#3 which adds the hipp0-side GBrain-tier work (phases 1-5).

root added 28 commits April 14, 2026 06:04
Remove unused typing.Any import in gateway/persistent_agent_router.py.
Simple pure helper that classifies a user turn as "positive"/"negative"/None
based on explicit feedback markers. Used by the upcoming turn-boundary
record_outcome hook and the reflection NULL-outcome backfill to close the
signal loop.
…_turn

Before this, record_outcome had zero callers in run_agent.py — the outcome
column on sessions was only ever set by gateway/telegram reaction handlers,
leaving the CLI path (and most sessions) permanently NULL. Now every
completed turn tries to infer a coarse positive/negative signal from user
feedback markers and persists it via DecisionDB.record_outcome. Wrapped in
try/except so outcome recording can never break the turn.
Sessions that stay NULL for more than AGED_NULL_OUTCOME_DAYS (3) never
received a reaction signal and are unlikely to. Run the turn-boundary
heuristic over the last user message and, when it yields a confident
label, persist it so subsequent reflection cycles can actually learn
from the session instead of bucketing it under "no outcome recorded".
Neutral/unknown cases are left NULL by design.
…eline

Exercises all three Phase 1 pieces composed: infer_outcome_from_turn ->
SessionDB.record_outcome -> reflection._query_sessions backfill + bucket
assignment. Uses a temp state.db via the isolate_hermes_home conftest
fixture and monkeypatches reflection._state_db_path to point at it.

Also fixes a stale DecisionDB class reference in outcome_signals.py
docstring (the class is actually SessionDB).
…loop

Calling asyncio.run() from within a thread that already owns a running
event loop (the gateway dispatches handle_message via an executor) raises
RuntimeError. Mirror the thread-pool bridge pattern used in
agent/context_references.py so the Anthropic image fallback works in
both sync CLI paths and gateway concurrency.
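The thread-pool bridge pattern referenced above can be sketched as follows. This is a generic illustration of the technique, not the code in agent/context_references.py; the helper name is hypothetical:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def run_coro_blocking(coro):
    """Run *coro* to completion from sync code, whether or not the current
    thread already owns a running event loop.

    With no running loop, asyncio.run() is safe. With one (e.g. a gateway
    executor thread), asyncio.run() raises RuntimeError, so we ship the
    coroutine to a fresh thread that owns its own loop and block on it.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no running loop here: plain path
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```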
asyncio.get_event_loop().run_until_complete() inside a coroutine (or any
thread with a live loop) is a hard crash under Python 3.12's stricter
event-loop policy. Make gather_reflection_input an async function that
awaits _try_compile_context directly, and have run_reflection call it
rather than duplicating the inline session/tool-usage/memory gather.

Top-level cron entry (run_reflection_job) still owns the only
asyncio.run() call — the ticker dispatches it from a worker thread with
no running loop, which is the supported pattern.
Exercises the two async-in-sync hazards Phase 2 fixed:

* _describe_image_for_anthropic_fallback called directly from a live
  event loop (plus 10-way asyncio.to_thread fan-out) — would raise
  "asyncio.run() cannot be called from a running event loop" without
  the thread-pool bridge.
* gather_reflection_input awaited from inside a running loop — would
  raise via get_event_loop().run_until_complete() without the async
  conversion.

Verified the suite fails against HEAD~2 (pre-fix) and passes against
HEAD.
Adds classify_task() heuristic that routes delegate invocations to
the cheapest compile mode: skip-compile for self-contained tasks,
technical/full compile for debugging, user/fast for preferences,
and a default fast compile otherwise. Wired into invoke() so the
right context reaches the subagent without always hitting the
broad full-compile path.
Adds PersistentDelegateTool.invoke_batch: a single broad compile
against the joined tasks, then per-subagent slicing of the returned
decisions via token-overlap scoring. Each subagent's invoke() accepts
a precompiled CompiledContext so the broad-compile output can be
passed through without re-hitting /api/compile.

Same-project guard: mixed-project batches fall back to the unchanged
per-task parallel path to avoid cross-project context leakage.

The natural run_agent.py fan-out site (_execute_tool_calls_concurrent)
uses a thread pool over sync tool handlers; wiring the async batch
path through it is out of scope for this commit. Callers and the
integration test exercise invoke_batch directly; the TTL cache in
the next commit absorbs the same fan-out when the batch path isn't
used.
…ntent

Adds _drop_redundant_compiled: for each compiled decision, compute
token-overlap against the tail of the parent's recent messages; drop
decisions already carried by the conversation (>=80% token overlap).
If the whole compile is redundant, zero it out so the prompt block
doesn't nag the model with duplicates. Wired into invoke() via
parent_agent._session_messages or an explicit recent_messages kwarg.
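A sketch of the token-overlap redundancy check, assuming naive whitespace tokenization; the real `_drop_redundant_compiled` may tokenize differently, and these helper names are illustrative:

```python
def token_overlap(decision_text: str, recent_text: str) -> float:
    """Fraction of the decision's tokens already present in recent messages."""
    decision_tokens = set(decision_text.lower().split())
    if not decision_tokens:
        return 1.0
    recent_tokens = set(recent_text.lower().split())
    return len(decision_tokens & recent_tokens) / len(decision_tokens)

def drop_redundant(decisions: list[str], recent_text: str, threshold: float = 0.8) -> list[str]:
    """Drop compiled decisions the conversation already carries
    (>= threshold token overlap, 80% per the commit message)."""
    return [d for d in decisions if token_overlap(d, recent_text) < threshold]
```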
In-memory dict keyed on (project_id, sha256(task)[:16], fast_mode,
namespace) with a 300s TTL. Consulted before provider.compile() and
populated with non-degraded results only (avoids pinning the agent
in degraded mode past the 5m window). Absorbs N-subagent fan-out
when invoke_batch isn't used.
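The cache described above can be sketched as below. Key shape and 300s TTL are from the commit message; the class and method names are assumptions:

```python
import hashlib
import time

class CompileCache:
    """In-memory TTL cache for compile() results."""

    def __init__(self, ttl_seconds: float = 300.0):
        self._ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    @staticmethod
    def key(project_id: str, task: str, fast_mode: bool, namespace: str):
        digest = hashlib.sha256(task.encode()).hexdigest()[:16]
        return (project_id, digest, fast_mode, namespace)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]
            return None
        return value

    def put(self, key, value, degraded: bool = False):
        # Only cache healthy results -- never pin the agent in
        # degraded mode for the rest of the TTL window.
        if degraded:
            return
        self._store[key] = (time.monotonic() + self._ttl, value)
```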
tests/test_task_classifier.py: pure unit coverage for classify_task
(self-contained skip, technical/user namespace routing, default).

tests/integration/test_multiagent_routing.py: spawns 3 heterogeneous
delegate tasks via invoke_batch, asserts exactly 1 compile call
reaches the mock HIPP0, each subagent gets a task-relevant slice,
user_facts propagate to every subagent, and a separate case that
verifies the 5m TTL cache absorbs a repeat compile.
Skills can now be auto-created from reflection proposals, capped at 1 per
cycle and gated by an evidence eval: the candidate must be anchored to at
least one NEGATIVE-outcome session in the lookback window that mentions a
topic token from the proposed skill name/hint. Proposals without prior-
failure evidence are logged as skill_eval_gate_failed and skipped.
…delta

After a skill is created by the reflection gate, wire it into the outcome
ledger so its value can be measured:

- Baseline: same-topic outcomes in the 7d BEFORE creation are written as
  kind='baseline' rows in a new skill_outcomes table (created on demand).
- Matches: up to 3 recent sessions whose first-user-message contains a
  topic token are written as kind='match' rows.
- New record_skill_outcome_for_session hook lets the outcome pipeline
  append kind='post' rows so an A/B delta can be computed later.

An A/B summary (baseline total / positive / negative / ratio) is appended
to the reflection log at registration time.
Add _propose_unused_skill_deprecation(), called once per reflection cycle.
Cross-references skill SKILL.md mtimes against the reflection input's
tool_usage map: skills with mtime >30d old whose name tokens do not appear
in any recently-used tool are logged as skill_deprecation_proposal entries
in the reflection log. Never auto-deletes — a human reviews the log.
Add _prune_reflection_log() invoked at the end of each reflection cycle.
Reads the per-agent reflection_log.jsonl, drops JSON entries whose
timestamp is older than REFLECTION_LOG_RETENTION_DAYS (180), and rewrites
atomically via a .tmp sibling. Fail-open for malformed lines or entries
missing a timestamp — we keep them rather than silently destroying rows
we can't parse.
… + 2m cooldown

Prevent pileup on dead HIPP0 by short-circuiting compile() to
degraded-mode while OPEN. CLOSED -> OPEN after 3 unavailable events
inside a 60s sliding window; HALF_OPEN probe after 2m cooldown; CLOSE
on probe success; re-OPEN on probe failure.
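A minimal sketch of that state machine, with the thresholds from the commit message (3 events / 60s window / 2m cooldown); class and method names are assumptions:

```python
import time

class CircuitBreaker:
    """CLOSED -> OPEN after `threshold` failures inside a sliding window;
    HALF_OPEN probe after `cooldown`; CLOSE on probe success, re-OPEN on
    probe failure."""

    def __init__(self, threshold=3, window=60.0, cooldown=120.0, clock=time.monotonic):
        self.threshold, self.window, self.cooldown = threshold, window, cooldown
        self._clock = clock
        self._failures = []     # timestamps of recent unavailable events
        self._opened_at = None  # None => CLOSED

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "CLOSED"
        if self._clock() - self._opened_at >= self.cooldown:
            return "HALF_OPEN"
        return "OPEN"

    def allow_request(self) -> bool:
        # CLOSED and HALF_OPEN both let a request through; OPEN short-circuits.
        return self.state != "OPEN"

    def record_failure(self) -> None:
        now = self._clock()
        if self.state == "HALF_OPEN":
            self._opened_at = now  # probe failed: re-open, restart cooldown
            return
        self._failures = [t for t in self._failures if now - t <= self.window]
        self._failures.append(now)
        if len(self._failures) >= self.threshold:
            self._opened_at = now

    def record_success(self) -> None:
        self._failures.clear()
        self._opened_at = None  # probe succeeded: CLOSE
```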
4xx replay failures are contract bugs — don't silently drop. Move them
to dead_letter.jsonl next to pending.jsonl with timestamp, status, and
error body for operator inspection. Dead-lettered entries are never
retried.

Adds `hermes wal status` CLI showing per-agent WAL depth, dead-letter
depth, and oldest entry age.
Prepend `[STALE MEMORY: last successful compile {N}m ago]` to the
rendered compile block when the circuit breaker is OPEN or the last
successful compile is older than 30 minutes, so the model knows recall
may be out of date.
The `f.get("key") or f.get("fact_key")` fallback masked a HIPP0
contract bug: legacy `fact_key`/`fact_value` entries kept rendering
under the current endpoint shape. Require `key` strictly; log-and-drop
malformed entries so contract violations surface.
Cap the compile-context fetch in gather_reflection_input at 5s so a
slow/dead HIPP0 can't stall the reflection pipeline. On timeout, log
and proceed without compiled context.
Bump schema to v9: rebuild messages_fts with session_id, role and
timestamp UNINDEXED so session/role filters can be served directly
from the FTS layer instead of round-tripping to the messages table.
Migration drops the old virtual table and triggers, recreates them
with the new column set, and repopulates from messages.

Add a 5-minute TTL in-process cache on list_sessions_rich() for the
top-10 recent sessions hot path (offset=0, limit<=10). Dashboard
polling absorbs the churn without hammering SQLite.
Add TrajectoryCompressor.compress_many_async(entries) for ad-hoc batch
callers: runs process_entry_async for every entry concurrently behind
an asyncio.Semaphore(10) via asyncio.gather, preserving input order.
Caps outbound LLM summarization fan-out independently of
max_concurrent_requests (which governs the full-directory pipeline).
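The bounded-concurrency shape above can be sketched as below; this is the general pattern, not the TrajectoryCompressor method itself, and the parameter names are illustrative:

```python
import asyncio

async def compress_many_async(entries, process_entry_async, limit: int = 10):
    """Run process_entry_async over every entry concurrently, capped at
    *limit* in flight, preserving input order in the result list."""
    sem = asyncio.Semaphore(limit)

    async def bounded(entry):
        async with sem:
            return await process_entry_async(entry)

    # gather() returns results in the order the awaitables were passed,
    # regardless of the order in which they complete.
    return await asyncio.gather(*(bounded(e) for e in entries))
```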
…_rough

Route all ad-hoc `len(x) // 4` token-estimation sites through the
existing `agent.model_metadata.estimate_tokens_rough` /
`estimate_messages_tokens_rough` helpers so capacity, compression, and
cost-estimate math share one source of truth.

Sites swapped:
- trajectory_compressor.count_tokens() tokenizer fallback
- agent/prompt_builder.estimate_prompt_tokens()
- agent/hipp0_memory_provider degraded-compile fallback
- tools/skills_tool._estimate_tokens()
- gateway/platforms/web_platform (transcript/soul/input/output estimates)
- scripts/sample_and_compress tokenizer fallback

Adds tests/test_token_estimation.py with fixture messages and
sensible-range assertions.
…+ CI gate

Phase 10 final gate. Extends tests/integration/test_closed_loop.py to cover
the full task -> subagent -> compile -> outcome inference -> record_outcome
-> reflection backfill -> recompile re-ranks chain end-to-end, using a
FakeHipp0Provider that records every call and simulates the hipp0-side
trust-multiplier effect to prove HERMES-side wiring.

Adds two failure-mode tests:
  - record_outcome silently dropped -> second compile keeps baseline ranking
  - provider.record_outcome raises -> caller surfaces the error

Adds a dedicated closed-loop CI job to .github/workflows/tests.yml so a
regression fails with a clear signal independent of the main test suite.
_drain_wal() read the WAL into memory then rewrote it with
write_text() — a concurrent _wal_append between those two steps
would be silently clobbered. Serialize both operations under an
asyncio.Lock and use atomic replace for rewrites.

WAL and dead_letter.jsonl hold full conversation payloads but were
created with umask defaults (commonly 0o644), exposing memory to
other local users on shared hosts. Create both via os.open() with
mode 0o600.

Tests added:
  - test_wal_files_are_mode_0o600 — permission regression
  - test_drain_rewrite_preserves_concurrent_append — race regression
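The drain-under-lock and 0o600-creation fixes can be sketched together; this simplified `Wal` class is an illustration of the pattern, not the provider's actual WAL code:

```python
import asyncio
import os

class Wal:
    """Serialize append/drain under one asyncio.Lock so a concurrent
    append cannot be clobbered by a drain rewrite; create files 0o600
    so other local users can't read conversation payloads."""

    def __init__(self, path: str):
        self.path = path
        self._lock = asyncio.Lock()

    @staticmethod
    def _open_private(path, flags):
        return os.open(path, flags, 0o600)

    async def append(self, line: str) -> None:
        async with self._lock:
            with open(self.path, "a", opener=self._open_private) as f:
                f.write(line + "\n")

    async def drain(self, replay) -> None:
        async with self._lock:  # appends wait; nothing lands between read and rewrite
            try:
                with open(self.path) as f:
                    lines = f.read().splitlines()
            except FileNotFoundError:
                return
            remaining = [l for l in lines if not replay(l)]
            tmp = self.path + ".tmp"
            with open(tmp, "w", opener=self._open_private) as f:
                f.write("".join(l + "\n" for l in remaining))
            os.replace(tmp, self.path)  # atomic rewrite
```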
prune_sessions_older_than() issued one DELETE FROM messages + one
DELETE FROM sessions per session_id in a Python loop. Replace with
two IN-list statements so a 10k-session prune collapses from 20k
round-trips to 2.
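The two-statement shape can be sketched as follows; table and column names are assumptions based on the commit message:

```python
import sqlite3

def prune_sessions(conn: sqlite3.Connection, session_ids: list[str]) -> None:
    """Delete messages and sessions with two IN-list statements instead
    of two statements per session_id in a Python loop."""
    if not session_ids:
        return
    placeholders = ",".join("?" for _ in session_ids)
    conn.execute(f"DELETE FROM messages WHERE session_id IN ({placeholders})", session_ids)
    conn.execute(f"DELETE FROM sessions WHERE id IN ({placeholders})", session_ids)
    conn.commit()
```

Note SQLite's default host-parameter limit (commonly 999 or 32766 depending on version); very large prunes would need chunking, which the sketch omits.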
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

47:+# Circuit breaker tuning for compile(). Three unavailable events inside
61:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
126:+    # Minutes since the provider's last successful compile(). Set when
180:+        # Wall-clock timestamp of the last successful compile(). Used by
200:+            return self._degraded_compile(
1692:+    async def noop_compile(_name):
1732:+    async def noop_compile(_name):
1973:+    Records every compile() and record_outcome() call, and uses its own
1974:+    outcome state to re-rank decisions on subsequent compile() calls.
1983:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@perlantir perlantir changed the title from "Audit phases 0-10 + security/perf close-out" to "Audit phases 0-15 + security/perf close-out" on Apr 14, 2026
root added 3 commits April 14, 2026 19:54
…ter)

Pure-CPU microbenches on the per-turn classifiers: perf_counter sampler
with baseline JSON in tests/bench/budgets.json, 1.4x tolerance. CI runs
serial (no xdist) so worker variance doesn't pollute p95. HERMES_BENCH_UPDATE=1
reseeds the baseline.
…comes

Handler aggregates the routing-outcomes JSONL and returns per-class
decision_count / outcome distribution / positive_rate. Auth-gated via
the existing _check_auth path so it respects HERMES_API_TOKEN.

Consumed by Phase 13's nightly threshold-tuning job and on-demand
dashboards; returns 503 when tools.routing_outcomes can't be imported
(e.g. trimmed-down distribution) rather than taking the server down.
Adds FaultyHipp0Provider with switchable compile/record faults:
- compile: hipp0_500, circuit_open, budget_exceeded (BudgetExceeded)
- record:  wal_full (OSError ENOSPC), circuit_open

Three new parametrized tests:
1. compile faults must raise a distinguishable exception the turn loop
   can match on (no silent swallowing).
2. record_outcome faults on WAL / circuit must propagate typed errors,
   while the local SessionDB record path stays functional so the turn
   loop keeps making progress with the remote down.
3. recovery test: after a transient compile fault, the next compile
   succeeds (guards against sticky-failure regressions).

13/13 closed-loop tests green (8 existing + 5 new).
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

556:+# Circuit breaker tuning for compile(). Three unavailable events inside
570:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
635:+    # Minutes since the provider's last successful compile(). Set when
689:+        # Wall-clock timestamp of the last successful compile(). Used by
709:+            return self._degraded_compile(
3014:+    async def noop_compile(_name):
3054:+    async def noop_compile(_name):
3295:+    Records every compile() and record_outcome() call, and uses its own
3296:+    outcome state to re-rank decisions on subsequent compile() calls.
3305:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

root and others added 6 commits April 15, 2026 07:33
…or passive decision capture

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…r as fallback

Adds agent/skills/llm_adapter.py (AuxiliaryLLMAdapter) bridging the
dispatcher's minimal LLMClient Protocol to hermulti's auxiliary_client
primitives. In run_agent.py, the existing regex-only decision-signal
capture now first tries the SkillDispatcher (fire-and-forget
OUTBOUND_MESSAGE event) and falls back to extract_decision_signals when
the dispatcher is disabled or no LLM is configured.
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

556:+# Circuit breaker tuning for compile(). Three unavailable events inside
570:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
635:+    # Minutes since the provider's last successful compile(). Set when
689:+        # Wall-clock timestamp of the last successful compile(). Used by
709:+            return self._degraded_compile(
4070:+    async def noop_compile(_name):
4110:+    async def noop_compile(_name):
4351:+    Records every compile() and record_outcome() call, and uses its own
4352:+    outcome state to re-rank decisions on subsequent compile() calls.
4361:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

@perlantir perlantir changed the title from "Audit phases 0-15 + security/perf close-out" to "Audit remediation + autonomous skill dispatcher + LLM decision capture" on Apr 15, 2026
root and others added 6 commits April 15, 2026 17:26
All in-scope test directories (bench, cron, e2e, environments,
honcho_plugin, plugins) pass cleanly. Document the 4 cron tests that
skip when the optional croniter package is unavailable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ateway progress topics

- test_models_dev: add autouse fixture to save/restore module-global
  _models_dev_cache. Was overwriting it with SAMPLE_REGISTRY, breaking
  downstream opencode-go detection via list_authenticated_providers.
- test_run_progress_topics: pre-load tools.terminal_tool so the tool
  registry is populated before gateway.run emits progress. Previously
  the emoji depended on test ordering (registry loaded vs not).
- test_internal_event_bypass_pairing: redirect gateway.pairing.PAIRING_DIR
  to tmp_path so the _rate_limits.json state does not leak between tests
  via the real ~/.hermes/platforms/pairing directory.
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

574:+# Circuit breaker tuning for compile(). Three unavailable events inside
588:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
653:+    # Minutes since the provider's last successful compile(). Set when
707:+        # Wall-clock timestamp of the last successful compile(). Used by
727:+            return self._degraded_compile(
4336:+    async def noop_compile(_name):
4376:+    async def noop_compile(_name):
4617:+    Records every compile() and record_outcome() call, and uses its own
4618:+    outcome state to re-rank decisions on subsequent compile() calls.
4627:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

root and others added 3 commits April 15, 2026 18:25
Companion test to hipp0ai e2e scenario 04. Wires a RecordingLLM returning
a record_decision action and a RecordingProvider, then dispatches an
OUTBOUND_MESSAGE and asserts the action propagates to
provider.record_decision() with the expected payload.

Lives under tests/integration/ so the fast unit loop can skip it via
collection-time filtering; relies on /root/audit/hipp0ai/skills/ being
mounted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Hipp0MemoryProvider.record_decision was POSTing {title, content, ...}
  to /api/decisions. hipp0 rejects this: the route requires description
  (not content) and project_id. Without these every captured decision
  silently swallowed a 400 in the try/except and returned False.
- test_multi_turn_conversation used a literal string for session_id and
  `content` on the decision payload. hipp0 enforces session_id is a
  UUID, so the test always failed. Register a hermes agent, start a
  real session to obtain a UUID, and use `description` on the decision.
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

4196:+    proj = httpx.post(
4206:+    reg = httpx.post(
4219:+    httpx.post(
4233:+    start = httpx.post(
4246:+    end = httpx.post(

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

574:+# Circuit breaker tuning for compile(). Three unavailable events inside
588:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
653:+    # Minutes since the provider's last successful compile(). Set when
707:+        # Wall-clock timestamp of the last successful compile(). Used by
727:+            return self._degraded_compile(
4621:+    async def noop_compile(_name):
4661:+    async def noop_compile(_name):
4902:+    Records every compile() and record_outcome() call, and uses its own
4903:+    outcome state to re-rank decisions on subsequent compile() calls.
4912:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.

root and others added 5 commits April 15, 2026 19:11
Wipes tools.approval module-level dicts (_gateway_queues,
_gateway_notify_cbs, _session_approved, _permanent_approved, _pending)
before and after every approval-related test in tests/gateway/ so
xdist workers cannot observe torn state from sibling runs.

tools/approval.py already serializes every mutation through _lock,
so thread-safety of the module itself is already covered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
save_permanent_allowlist() iterates via list(patterns) outside the lock.
If another thread calls approve_permanent() during that iteration,
CPython raises "Set changed size during iteration". Three call sites
(check_dangerous_command, check_all_command_guards x2) previously passed
the live set; now they copy under _lock first.

Completes the thread-safety audit started in 5cda48f. The autouse
isolation fixture already covers cross-test state pollution under xdist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
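The copy-under-lock fix can be sketched as below; this is a generic illustration of the pattern, with simplified stand-ins for the tools/approval.py module globals:

```python
import threading

_lock = threading.Lock()
_permanent_approved: set[str] = set()

def approve_permanent(pattern: str) -> None:
    with _lock:
        _permanent_approved.add(pattern)

def snapshot_allowlist() -> set[str]:
    """Copy under the lock so callers can iterate the snapshot safely
    while other threads mutate the live set. Passing the live set to
    an iterating consumer risks "Set changed size during iteration"."""
    with _lock:
        return set(_permanent_approved)
```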
@github-actions

⚠️ Supply Chain Risk Detected

This PR contains patterns commonly associated with supply chain attacks. This does not mean the PR is malicious — but these patterns require careful human review before merging.

⚠️ WARNING: Outbound network calls (POST/PUT)

Outbound POST/PUT requests in new code could be data exfiltration. Verify the destination URLs are legitimate.

Matches (first 10):

4268:+    proj = httpx.post(
4278:+    reg = httpx.post(
4291:+    httpx.post(
4305:+    start = httpx.post(
4318:+    end = httpx.post(

⚠️ WARNING: marshal/pickle/compile usage

These can deserialize or construct executable code objects.

Matches:

574:+# Circuit breaker tuning for compile(). Three unavailable events inside
588:+    """Minimal circuit breaker for Hipp0MemoryProvider.compile().
653:+    # Minutes since the provider's last successful compile(). Set when
707:+        # Wall-clock timestamp of the last successful compile(). Used by
727:+            return self._degraded_compile(
4778:+    async def noop_compile(_name):
4818:+    async def noop_compile(_name):
5059:+    Records every compile() and record_outcome() call, and uses its own
5060:+    outcome state to re-rank decisions on subsequent compile() calls.
5069:+    async def compile(self, task_description: str, **kwargs):

Automated scan triggered by supply-chain-audit. If this is a false positive, a maintainer can approve after manual review.
