Merged
22 commits
c892591
tests: add data-driven test fixtures for rule matcher
williballenthin Apr 2, 2026
f6d7655
tests: address PR review feedback on matcher fixtures
williballenthin May 11, 2026
81ce4ad
tests: migrate count/range tests from test_match.py to YAML fixtures
williballenthin May 11, 2026
420387a
tests: migrate string/regex tests from test_match.py to YAML fixtures
williballenthin May 11, 2026
8b6b7d8
tests: migrate feature tests from test_match.py to YAML fixtures
williballenthin May 11, 2026
12af3e2
tests: migrate rule composition tests from test_match.py to YAML fixt…
williballenthin May 11, 2026
ebe73c6
tests: migrate evaluation tests from test_rules.py to YAML fixtures
williballenthin May 11, 2026
c34d42d
tests: migrate engine operator tests from test_engine.py
williballenthin May 11, 2026
3111712
tests: migrate dynamic span/call tests to YAML fixtures
williballenthin May 11, 2026
be00371
tests: enable paranoid dual-algorithm verification in fixture tests
williballenthin May 11, 2026
848db44
tests: add negative and edge-case logic fixtures
williballenthin May 11, 2026
50c27bb
tests: add bytes, os-any wildcard, number edge case, and characterist…
williballenthin May 11, 2026
c6de5a9
tests: add scope isolation, file scope, and multi-match fixtures
williballenthin May 11, 2026
36d8cc0
tests: add dynamic process, thread, call, and span boundary fixtures
williballenthin May 11, 2026
4cdfd75
tests: add intermediate namespace prefix matching fixture
williballenthin May 11, 2026
4f2eba2
tests: fix lint issues in test_match_fixtures.py
williballenthin May 11, 2026
7f9322b
tests: fix path resolution after fixtures.py → fixtures/__init__.py move
williballenthin May 11, 2026
934c620
tests: make ppid optional in dynamic fixture process headers
williballenthin Jun 11, 2026
35978bf
tests: remove redundant 3-part instruction line format from DSL parser
williballenthin Jun 11, 2026
accc28e
tests: remove tagged YAML array address form and stale feature list f…
williballenthin Jun 11, 2026
527398e
tests: clarify scope of data-driven matcher tests vs test_rules.py in…
williballenthin Jun 11, 2026
6896529
tests: fix readme formatting
williballenthin Jun 11, 2026
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -190,6 +190,7 @@ This release includes Ghidra PyGhidra support, performance improvements, depende
 
 ### Development
 
+- tests: add data-driven rule matcher fixtures with a show-features-like DSL and authoring documentation #2985
 - doc: document that default output shows top-level matches only; -v/-vv show nested matches @devs6186 #1410
 - doc: fix typo in usage.md, add documentation links to README @devs6186 #2274
 - doc: add table comparing ways to consume capa output (CLI, IDA, Ghidra, dynamic sandbox, web) @devs6186 #2273
5 changes: 3 additions & 2 deletions tests/fixtures.py → tests/fixtures/__init__.py
@@ -41,8 +41,9 @@
 from capa.features.extractors.dnfile.extractor import DnfileFeatureExtractor
 
 logger = logging.getLogger(__name__)
-CD = Path(__file__).resolve().parent
-FIXTURE_MANIFEST_DIR = CD / "fixtures" / "features"
+_FIXTURES_DIR = Path(__file__).resolve().parent
+CD = _FIXTURES_DIR.parent
+FIXTURE_MANIFEST_DIR = _FIXTURES_DIR / "features"
 DNFILE_TESTFILES = CD / "data" / "dotnet" / "dnfile-testfiles"
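
The path change follows from how `__file__` resolves once `tests/fixtures.py` becomes `tests/fixtures/__init__.py`: the module's parent is now the `fixtures` directory, so the `tests` directory sits one level higher. A minimal sketch of that reasoning, assuming an illustrative `/repo` checkout path; `describe_layout` is a hypothetical helper, not part of the PR:

```python
from pathlib import Path


def describe_layout(module_file: str) -> dict:
    """Mirror the path arithmetic from the diff for an arbitrary module path."""
    fixtures_dir = Path(module_file).resolve().parent  # .../tests/fixtures
    tests_dir = fixtures_dir.parent                    # .../tests (the old CD)
    return {
        "fixtures_dir": fixtures_dir,
        "tests_dir": tests_dir,
        "manifest_dir": fixtures_dir / "features",
        "dnfile_testfiles": tests_dir / "data" / "dotnet" / "dnfile-testfiles",
    }


# "/repo" is an assumed location; the file does not need to exist.
layout = describe_layout("/repo/tests/fixtures/__init__.py")
print(layout["manifest_dir"])
print(layout["dnfile_testfiles"])
```

Keeping `CD` pointed at the `tests` directory means existing users of `CD / "data" / ...` keep working without further edits.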


74 changes: 74 additions & 0 deletions tests/fixtures/matcher/README.md
@@ -0,0 +1,74 @@
Data-driven matcher tests. Each test pairs a rule fragment, a synthetic feature listing, and the exact matches that capa should report. These test the matcher itself, not end-to-end binary analysis. Tests for rule parsing and representation (e.g. how a rule looks after deserialization) belong in `test_rules.py`, not here.

Fixture files live under `static/` and `dynamic/` directories, organized by theme (e.g. `logic.yml`, `scopes.yml`, `strings.yml`). Flavor is inferred from the directory. The pytest entrypoint and DSL parser both live in `tests/test_match_fixtures.py`.

```sh
pytest -q tests/test_match_fixtures.py
pytest -q tests/test_match_fixtures.py -k <term>
```
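
As a rough illustration of "flavor is inferred from the directory", the sketch below walks a fixture root and pairs each YAML file with its flavor. It is not the collection code in `tests/test_match_fixtures.py`; `iter_fixture_files` is a hypothetical name, and the files are created in a temporary directory only so the example runs on its own.

```python
import tempfile
from pathlib import Path


def iter_fixture_files(root: Path):
    """Yield (flavor, path) pairs; the flavor is the top-level directory name."""
    for flavor in ("static", "dynamic"):
        for path in sorted((root / flavor).glob("*.yml")):
            yield flavor, path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-in fixture files; real ones would contain YAML lists of test cases.
    for rel in ("static/logic.yml", "static/strings.yml", "dynamic/call.yml"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text("[]\n")
    discovered = [(flavor, path.name) for flavor, path in iter_fixture_files(root)]

print(discovered)
```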

## Fixture file format

Each file is a YAML list. Each element is one test case.

```yaml
- name: and-both-present
  description: and requires all children to match
  rules:
    - name: and-match
      description: should match because the function contains both mov and number 0x10
      scopes:
        static: function
      features:
        - and:
            - mnemonic: mov
            - number: 0x10
  features: |
    func: 0x402000
    bb: 0x402000: basic block
    insn: 0x402000: mnemonic(mov)
    insn: 0x402000: number(0x10)
  expect:
    matches:
      and-match:
        - 0x402000
```

The `name` field is a stable human-readable identifier that appears in pytest ids. The `description` explains the behavior under test. Rules are specified under `rules` with `name`, `scopes`, and `features` at the top level (no `meta:` wrapper needed); the loader fills in the missing scope side with `unsupported`. The `features` field is a block string or list of strings in the DSL described below. Expected results go in `expect.matches`, mapping rule names to exact match locations.
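
To make the "no `meta:` wrapper" shorthand concrete, here is a minimal sketch of how a fragment could be expanded into a full rule document. It assumes the usual capa rule layout of `rule.meta` plus `rule.features`; `wrap_rule_fragment` is a hypothetical helper, and the real loader may differ in detail.

```python
def wrap_rule_fragment(fragment: dict) -> dict:
    """Expand a fixture rule fragment into a full rule document (sketch)."""
    scopes = dict(fragment["scopes"])
    # Fill in whichever scope side the fixture author left out.
    scopes.setdefault("static", "unsupported")
    scopes.setdefault("dynamic", "unsupported")

    # Everything except the logic tree is treated as rule metadata here.
    meta = {key: value for key, value in fragment.items() if key != "features"}
    meta["scopes"] = scopes
    return {"rule": {"meta": meta, "features": fragment["features"]}}


fragment = {
    "name": "and-match",
    "scopes": {"static": "function"},
    "features": [{"and": [{"mnemonic": "mov"}, {"number": 0x10}]}],
}
wrapped = wrap_rule_fragment(fragment)
print(wrapped["rule"]["meta"]["scopes"])
```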

Optional fields: `base address` (static only, defaults to `0`) and `options.span size` (patches `SPAN_SIZE` for that test).

Keep tests small and focused: each test case should have its own minimal feature set. Prefer many small individual tests over grouped rules sharing features.

## Feature DSL

Line prefixes for static tests: `global:`, `file:`, `func:`, `bb:`, `insn:`.
Line prefixes for dynamic tests: `global:`, `file:`, `proc:`, `thread:`, `call:`.

Static examples:
```
global: global: os(windows)
file: 0x402345: characteristic(embedded pe)
func: 0x401000
func: 0x401000: string(hello world)
bb: 0x401000: basic block
insn: 0x401000: mnemonic(mov)
insn: 0x401000: offset(0x402000) -> 0x402000
insn: 0x401000: string(key: value)
```

Dynamic examples:
```
proc: sample.exe (pid=3052)
thread: 3064
call: 11: api(LdrGetProcedureAddress)
call: 11: string(AddVectoredExceptionHandler)
```

`-> <addr>` overrides the feature location. Feature text may contain `: `. Dynamic call IDs must be unique within a test and can be used as shorthand in `expect.matches`.
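
The line shapes above can be split with a bounded `split`, which is one way to let feature text contain `: `. The following is a sketch under that assumption, not the parser from `tests/test_match_fixtures.py`; `DslLine`, `parse_dsl_line`, and the override regex are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape of a location override: "<feature>) -> <addr>" at end of line.
OVERRIDE = re.compile(r"^(?P<feature>.*\)) -> (?P<location>\S+)$")


class DslLine(NamedTuple):
    prefix: str
    address: str
    feature: Optional[str]
    location: Optional[str]


def parse_dsl_line(line: str) -> DslLine:
    # Split at most twice so that ": " inside the feature text survives.
    parts = line.strip().split(": ", 2)
    prefix, address = parts[0], parts[1]
    feature = parts[2] if len(parts) == 3 else None
    location = None
    if feature is not None:
        match = OVERRIDE.match(feature)
        if match:
            feature = match.group("feature")
            location = match.group("location")
    return DslLine(prefix, address, feature, location)


for text in (
    "func: 0x401000",
    "insn: 0x401000: string(key: value)",
    "insn: 0x401000: offset(0x402000) -> 0x402000",
):
    print(parse_dsl_line(text))
```

A string feature whose own text ends in `) -> something` would be misread by this sketch; a real parser needs a stricter rule for that case.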

## Address syntax

String forms: `0x401000`, `base address+0x100`, `file+0x20`, `token(0x1234)`, `token(0x1234)+0x10`, `global`, `process{pid:3052}`, `process{pid:3052,tid:3064}`, `process{pid:3052,tid:3064,call:11}` (with optional `ppid:`).

Dynamic tests may use a bare integer call ID in `expect.matches` when that call ID is unique within the test.
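
As an illustration of the `process{...}` forms, the sketch below turns such a string into a dictionary of integer fields. It covers only the process-style addresses; `parse_process_address` is a hypothetical helper and does not reflect how the fixture loader represents addresses internally.

```python
import re

PROCESS = re.compile(r"^process\{(?P<fields>[^}]*)\}$")


def parse_process_address(text: str) -> dict:
    """Parse 'process{pid:3052,tid:3064,call:11}' into {'pid': 3052, ...}."""
    match = PROCESS.match(text)
    if match is None:
        raise ValueError(f"not a process-style address: {text!r}")
    fields = {}
    for item in match.group("fields").split(","):
        key, _, value = item.partition(":")
        # Base 0 accepts both decimal and 0x-prefixed values.
        fields[key.strip()] = int(value, 0)
    return fields


print(parse_process_address("process{pid:3052}"))
print(parse_process_address("process{pid:3052,tid:3064,call:11}"))
```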
53 changes: 53 additions & 0 deletions tests/fixtures/matcher/dynamic/call.yml
@@ -0,0 +1,53 @@
- name: call-scope-single-api
  description: call scope matches a single API at the correct call
  rules:
    - name: call-api
      scopes:
        dynamic: call
      features:
        - api: GetSystemTimeAsFileTime
  features: |
    proc: sample.exe (pid=3052)
    thread: 3064
    call: 8: api(GetSystemTimeAsFileTime)
    call: 9: api(GetSystemInfo)
  expect:
    matches:
      call-api:
        - 8

- name: call-scope-multiple-matches
  description: call scope reports multiple matching calls
  rules:
    - name: call-multi
      scopes:
        dynamic: call
      features:
        - api: LdrGetDllHandle
  features: |
    proc: sample.exe (pid=3052)
    thread: 3064
    call: 10: api(LdrGetDllHandle)
    call: 11: api(LdrGetProcedureAddress)
    call: 12: api(LdrGetDllHandle)
  expect:
    matches:
      call-multi:
        - 10
        - 12

- name: call-scope-no-match
  description: call scope does not match when no call has the required feature
  rules:
    - name: call-absent
      scopes:
        dynamic: call
      features:
        - api: CreateFileW
  features: |
    proc: sample.exe (pid=3052)
    thread: 3064
    call: 8: api(GetSystemTimeAsFileTime)
    call: 9: api(GetSystemInfo)
  expect:
    matches: {}
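
The expectations in these call-scope cases can be reproduced with a toy model in which each call is evaluated on its own feature set. This is only a sketch of the behavior under test, not capa's matcher; `match_call_scope` is a hypothetical function.

```python
def match_call_scope(required_api: str, calls: dict) -> list:
    """Return the IDs of calls whose own feature set contains the API."""
    return sorted(call_id for call_id, apis in calls.items() if required_api in apis)


# Mirrors the call-scope-multiple-matches fixture: call ID -> APIs seen at that call.
calls = {
    10: {"LdrGetDllHandle"},
    11: {"LdrGetProcedureAddress"},
    12: {"LdrGetDllHandle"},
}
print(match_call_scope("LdrGetDllHandle", calls))
print(match_call_scope("CreateFileW", calls))
```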
40 changes: 40 additions & 0 deletions tests/fixtures/matcher/dynamic/process.yml
@@ -0,0 +1,40 @@
- name: process-scope-basic
  description: process scope matches features aggregated across threads
  rules:
    - name: process-apis
      scopes:
        dynamic: process
      features:
        - and:
            - api: CreateFileW
            - api: WriteFile
  features: |
    proc: sample.exe (pid=3052)
    thread: 3064
    call: 1: api(CreateFileW)
    thread: 3065
    call: 2: api(WriteFile)
  expect:
    matches:
      process-apis:
        - "process{pid:3052}"

- name: process-scope-no-match
  description: process scope does not match when features are split across processes
  rules:
    - name: process-split
      scopes:
        dynamic: process
      features:
        - and:
            - api: CreateFileW
            - api: WriteFile
  features: |
    proc: sample.exe (pid=3052)
    thread: 3064
    call: 1: api(CreateFileW)
    proc: other.exe (pid=3053)
    thread: 4000
    call: 2: api(WriteFile)
  expect:
    matches: {}
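
The two process-scope cases differ only in where the features are aggregated. A toy model of that idea, assuming features from every thread of a process are pooled before the `and` is evaluated; `match_process_and` is a hypothetical function and not capa's implementation:

```python
from collections import defaultdict


def match_process_and(required: set, calls: list) -> list:
    """Return pids whose pooled APIs (across all threads) include every required API."""
    apis_by_pid = defaultdict(set)
    for pid, _tid, api in calls:
        apis_by_pid[pid].add(api)
    return sorted(pid for pid, apis in apis_by_pid.items() if required <= apis)


required = {"CreateFileW", "WriteFile"}

# process-scope-basic: both APIs in one process, on different threads.
same_process = [(3052, 3064, "CreateFileW"), (3052, 3065, "WriteFile")]
# process-scope-no-match: the APIs are split across two processes.
split_processes = [(3052, 3064, "CreateFileW"), (3053, 4000, "WriteFile")]

print(match_process_and(required, same_process))
print(match_process_and(required, split_processes))
```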