Releases: architehc/selfware

v0.3.0 — Benchmark Harness, Visual Verification, Computer Control

27 Mar 22:52

What's New

Benchmark Infrastructure

  • selfware bench --suite throughput,multilang --report — one-command benchmark suite
  • selfware auto-config --endpoint <url> --save — auto-detect endpoint capabilities
  • SWE-bench evaluation pipeline with fuzzy patch application (68% apply rate)
  • Multi-language coding benchmark: Rust, Python, JavaScript/TypeScript, Go

Visual Verification System

  • Hard gating: blocks progress on visual assertion failures (configurable)
  • Stuck-loop detection: screenshot hashing detects repeated screens
  • Durable screenshot storage at ~/.selfware/visual_evidence/
  • Assertion lifecycle: set → checked each step → cleared on pass
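
The stuck-loop bullet above can be pictured with a small sketch. It assumes screenshots arrive as raw bytes and that a loop means the same hash seen several times in a row; `StuckLoopDetector`, `observe`, and `threshold` are hypothetical names, not Selfware's actual API, and the standard library's `DefaultHasher` stands in for whatever hash the project really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical detector: reports a stuck loop once the same screenshot
/// hash has been seen `threshold` times in a row.
pub struct StuckLoopDetector {
    last_hash: Option<u64>,
    repeats: usize,
    threshold: usize,
}

impl StuckLoopDetector {
    pub fn new(threshold: usize) -> Self {
        Self { last_hash: None, repeats: 0, threshold }
    }

    /// Feed the raw bytes of the latest screenshot. Returns true when the
    /// current frame is at least the `threshold`-th consecutive identical one.
    pub fn observe(&mut self, screenshot: &[u8]) -> bool {
        let mut hasher = DefaultHasher::new();
        screenshot.hash(&mut hasher);
        let hash = hasher.finish();

        if self.last_hash == Some(hash) {
            self.repeats += 1;
        } else {
            // A different screen resets the streak.
            self.last_hash = Some(hash);
            self.repeats = 1;
        }
        self.repeats >= self.threshold
    }
}
```

An exact byte hash only catches pixel-identical frames; a perceptual hash would tolerate small rendering differences, at the cost of an extra dependency.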

Computer Control (Linux)

  • Keyboard: type_text, press_key, key_combo via xdotool
  • Mouse: move_to, click, double_click, scroll, drag via xdotool
  • Window management: list/focus/active via wmctrl + xdotool
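
As a rough illustration of how such actions might map onto xdotool, the sketch below builds argument lists for the real `xdotool type`, `key`, `mousemove`, and `click` subcommands. The `InputAction` enum and both function names are assumptions made for this example; they are not taken from Selfware's source.

```rust
use std::process::Command;

/// Hypothetical action type; variant names loosely mirror the release notes.
pub enum InputAction {
    TypeText(String),
    /// A key name or combo in xdotool syntax, e.g. "Return" or "ctrl+c".
    PressKey(String),
    MoveTo { x: u32, y: u32 },
    /// X11 button number: 1 = left, 2 = middle, 3 = right.
    Click { button: u8 },
}

/// Build the xdotool argument list for an action without running anything.
pub fn xdotool_args(action: &InputAction) -> Vec<String> {
    match action {
        // "--" ends option parsing so text starting with '-' is typed literally.
        InputAction::TypeText(text) => vec!["type".into(), "--".into(), text.clone()],
        InputAction::PressKey(key) => vec!["key".into(), key.clone()],
        InputAction::MoveTo { x, y } => {
            vec!["mousemove".into(), x.to_string(), y.to_string()]
        }
        InputAction::Click { button } => vec!["click".into(), button.to_string()],
    }
}

/// Execute the action. Requires xdotool on PATH and a running X11 session.
pub fn run(action: &InputAction) -> std::io::Result<bool> {
    let status = Command::new("xdotool").args(xdotool_args(action)).status()?;
    Ok(status.success())
}
```

Keeping argument construction separate from execution makes the mapping testable on machines without an X server.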

Memory & Storage

  • Memory consolidation ("sleep") system — sessions auto-compact to long-term storage
  • Encrypted KV store (AES-256-GCM with PBKDF2 key derivation)
  • HNSW batch import for vector store

Code Quality

  • 54-issue review across 20 perspectives, 40+ resolved
  • File decomposition: execution.rs, context_management.rs, config/mod.rs, and vector_store.rs split into smaller files
  • 6,285 tests passing, 0 failures
  • Release profile: LTO thin, strip, codegen-units=1
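
For readers unfamiliar with these settings, the release profile bullet would typically correspond to a Cargo.toml fragment like the one below. This is a reconstruction from the bullet, not a copy of the project's manifest; in particular, `strip = true` is an assumption, since the note does not say which strip level is used.

```toml
[profile.release]
lto = "thin"        # cross-crate link-time optimization, cheaper than "fat"
strip = true        # assumed value; removes symbols from the binary
codegen-units = 1   # slower builds, better optimization
```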

Security

  • Git tools: path validation via SafetyConfig
  • extra_body validation: blocks model/messages/tools override
  • CI: secret scanning, benchmark regression detection
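
The extra_body check can be sketched as a reserved-key filter. The version below assumes extra_body is a flat string map, which is a simplification (a real request body would be JSON), and `validate_extra_body` is a hypothetical name. Only the three blocked keys come from the note above.

```rust
use std::collections::BTreeMap;

/// Keys that caller-supplied extras must not override (from the release note).
const RESERVED_KEYS: [&str; 3] = ["model", "messages", "tools"];

/// Returns Ok(()) when no reserved key is present, otherwise Err with the
/// offending keys in sorted order (BTreeMap iterates keys in order).
pub fn validate_extra_body(
    extra_body: &BTreeMap<String, String>,
) -> Result<(), Vec<String>> {
    let blocked: Vec<String> = extra_body
        .keys()
        .filter(|key| RESERVED_KEYS.contains(&key.as_str()))
        .cloned()
        .collect();

    if blocked.is_empty() {
        Ok(())
    } else {
        Err(blocked)
    }
}
```

Rejecting the request outright, rather than silently dropping the keys, makes a misconfigured or malicious override visible to the caller.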

Benchmark Results (Qwen3.5-27B, 2x4090)

| Metric           | Value         |
|------------------|---------------|
| Throughput (32x) | 290 tok/s     |
| Multi-language   | 12/12 (100%)  |
| SWE-bench Lite   | 5/300 resolved |
| Patch apply rate | 68% (fuzzy)   |

Full Changelog: v0.2.2...v0.3.0

Selfware v0.2.2

11 Mar 12:27

Release 0.2.2.

Highlights:

  • fixes queued input handling and retry suppression
  • improves process tracking, reuse, and port reservation
  • adds persistent session/debug logging and debug commands
  • aligns local browser/http/vision workflows
  • includes broader end-to-end coverage and release packaging cleanup

Full Changelog: v0.2.0...v0.2.2

Selfware v0.2.0

09 Mar 13:03

Selfware v0.1.4

08 Mar 02:48

Selfware v0.1.3

07 Mar 19:50

Selfware v0.1.2

07 Mar 18:33

Selfware v0.1.1

07 Mar 07:27

What's Changed

📦 Other Changes

  • Fix CI blockers and complete TUI enhancements by @architehc in #17
  • ci: bump actions/upload-artifact from 4 to 6 by @dependabot[bot] in #1
  • ci: bump actions/cache from 4 to 5 by @dependabot[bot] in #2
  • ci: bump softprops/action-gh-release from 1 to 2 by @dependabot[bot] in #3
  • ci: bump codecov/codecov-action from 4 to 5 by @dependabot[bot] in #4
  • ci: bump actions/checkout from 4 to 6 by @dependabot[bot] in #5
  • deps: bump colored from 2.2.0 to 3.1.1 by @dependabot[bot] in #6
  • deps: bump whoami from 1.6.1 to 2.1.1 by @dependabot[bot] in #7
  • deps: bump git2 from 0.19.0 to 0.20.4 by @dependabot[bot] in #8
  • deps: bump nix from 0.28.0 to 0.30.1 by @dependabot[bot] in #10
  • deps: bump toml from 0.8.23 to 1.0.1+spec-1.1.0 by @dependabot[bot] in #11
  • deps: bump thiserror from 1.0.69 to 2.0.18 by @dependabot[bot] in #12
  • deps: bump pulldown-cmark from 0.9.6 to 0.13.0 by @dependabot[bot] in #14
  • deps: bump rusqlite from 0.31.0 to 0.38.0 by @dependabot[bot] in #15
  • deps: bump reqwest from 0.12.28 to 0.13.2 by @dependabot[bot] in #9
  • deps: bump reedline from 0.32.0 to 0.45.0 by @dependabot[bot] in #13
  • Fix pulldown-cmark 0.13 API changes and TUI improvements by @architehc in #18
  • Add developer experience improvements by @architehc in #19
  • Refactor feature flags into semantic groups by @architehc in #20
  • Fix unit tests for context and git modules by @architehc in #21
  • ci: bump actions/checkout from 4 to 6 by @dependabot[bot] in #37
  • ci: bump actions/cache from 4 to 5 by @dependabot[bot] in #35
  • ci: bump github/codeql-action from 3 to 4 by @dependabot[bot] in #36
  • ci: bump actions/download-artifact from 4 to 8 by @dependabot[bot] in #33
  • ci: bump codecov/codecov-action from 4 to 5 by @dependabot[bot] in #34
  • deps: bump the rust-minor group with 3 updates by @dependabot[bot] in #38
  • deps: bump nix from 0.30.1 to 0.31.2 by @dependabot[bot] in #40
  • deps: bump reedline from 0.45.0 to 0.46.0 by @dependabot[bot] in #39
  • deps: bump thiserror from 1.0.69 to 2.0.18 by @dependabot[bot] in #43
  • deps: bump tiktoken-rs from 0.7.0 to 0.9.1 by @dependabot[bot] in #26
  • deps: bump criterion from 0.5.1 to 0.8.2 by @dependabot[bot] in #28
  • deps: bump dirs from 5.0.1 to 6.0.0 by @dependabot[bot] in #24

Full Changelog: v0.1.0...v0.1.1

Selfware v0.1.0

14 Feb 02:52

Initial Release

Features

  • Core agent framework with PDVR cognitive cycle
  • 53 built-in tools for file operations, git, cargo, search
  • Safety system with hardened shell command validation
  • Path traversal protection with symlink chain detection
  • Checkpoint system for task persistence
  • Multi-agent collaboration support
  • TUI mode with ratatui
  • Docker support
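
To make the path traversal bullet concrete, here is a minimal sketch of a lexical containment check. It assumes all paths are meant to stay under a single workspace root; `resolve_within` is a hypothetical helper, not the project's function. It does not touch the filesystem, so it does not detect symlinks; a fuller check would also inspect each component with `std::fs::symlink_metadata` or compare against `std::fs::canonicalize`.

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically resolve `candidate` against `root`.
/// Returns None if the path is absolute or climbs above `root` via "..".
pub fn resolve_within(root: &Path, candidate: &Path) -> Option<PathBuf> {
    let mut resolved = root.to_path_buf();
    let mut depth = 0usize; // components currently pushed below root

    for component in candidate.components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return None; // would escape the root
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths and Windows prefixes are rejected outright.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```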

Security

  • Regex-based shell command validation with obfuscation detection
  • Base64-encoded command detection
  • Command chaining detection
  • Symlink attack prevention
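
Command chaining detection could look roughly like the sketch below. The release describes a regex-based validator; this example swaps in plain substring matching to stay dependency-free, and both the token list and the name `find_chaining` are assumptions. It is deliberately conservative: separators inside quoted strings are flagged too.

```rust
/// Tokens that allow one shell line to run more than one command.
/// This list is illustrative, not the project's actual rule set.
const CHAIN_TOKENS: [&str; 7] = ["&&", "||", ";", "|", "`", "$(", "\n"];

/// Return every chaining token found in `command`, in list order.
/// Note that "||" also matches "|", so both are reported for `a || b`.
pub fn find_chaining(command: &str) -> Vec<&'static str> {
    CHAIN_TOKENS
        .iter()
        .copied()
        .filter(|token| command.contains(*token))
        .collect()
}
```

An empty result means no chaining token was seen; a caller would still need the other checks listed above, such as obfuscation and Base64 detection.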

Platforms

  • Linux x86_64
  • macOS Apple Silicon (ARM64)
  • Windows x86_64

Note: Linux ARM64 and macOS Intel builds require additional cross-compilation setup.

🤖 Generated with Claude Code