pk-doctor: _parse_frontmatter substring-splits on '---' and truncates spec on Markdown table separators #74

@projectious

Summary

pk-doctor's `_parse_frontmatter` helpers use `text.split("---", 2)` to slice out YAML frontmatter. That is a substring split, not a line-anchored split, so any Note/Artifact whose body contains a Markdown table separator like `|---|---|` gets its frontmatter truncated at the first inline `---`. `yaml.safe_load` then sees a fragment of `spec:` with only `title`/`body`, and downstream schema validation reports spurious `<root>: 'type' is a required property` / `<root>: 'state' is a required property` errors.

Affected files

  • context/skills/processkit/pk-doctor/scripts/checks/schema_filename.py (_parse_frontmatter at line ~122)
  • context/skills/processkit/pk-doctor/scripts/checks/commands_consistency.py (_parse_frontmatter at line ~27)

(Both helpers are independent copies of the same logic.)

Repro

A v2 Note like:

```markdown
---
apiVersion: processkit.projectious.work/v2
kind: Note
metadata:
  id: NOTE-...
  created: '2026-05-19T22:05:06+00:00'
spec:
  title: 'Round 17 — scaling analysis'
  body: |
    Some intro text.

    | Scale-up need | Existing-primitive coverage |
    |---|---|
    | Row | Row |

  type: reference
  state: captured
---

(body content)
```

Running pk-doctor reports:

```
ERROR schema.invalid | <root>: 'type' is a required property
ERROR schema.invalid | <root>: 'state' is a required property
```

even though the frontmatter is valid YAML and the spec block does contain `type` and `state`. In this derived project the bug surfaced 14 ERRORs across 7 BriskWillow round notes (each with one or more inline `|---|---|` table separators in body content).

Root cause

`text.split("---", 2)` matches the substring `---` anywhere, including inside the body's literal block scalar, so `parts[1]` holds only the content between line 1 (`---`) and the first inline `---`. The second on-disk `---` line never participates in the split.
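The truncation can be reproduced without pk-doctor at all. The snippet below is a minimal sketch that applies the same `str.split` call to a made-up document (the `text` value is illustrative, not a real project file) and shows where `parts[1]` ends.

```python
# Illustrative document: frontmatter whose spec.body contains a Markdown table.
text = (
    "---\n"
    "kind: Note\n"
    "spec:\n"
    "  title: demo\n"
    "  body: |\n"
    "    | A | B |\n"
    "    |---|---|\n"
    "    | 1 | 2 |\n"
    "  type: reference\n"
    "  state: captured\n"
    "---\n"
    "\n"
    "(body content)\n"
)

# Same call as in the issue: a plain substring split.
parts = text.split("---", 2)
frontmatter = parts[1]

# The second match is the first '---' inside the table separator row,
# so everything after it, including type/state, is cut off.
print("type: reference" in frontmatter)  # False
print(repr(frontmatter[-6:]))            # '\n    |'
```

The fragment that survives is still parseable YAML, which is presumably why the failure shows up later as a schema error rather than as a YAML parse error.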

Suggested fix

Use a line-anchored split:

```python
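import re

# (?m) makes ^ and $ match at line boundaries, so only a line consisting of
# '---' (plus optional trailing spaces/tabs) counts as a delimiter.
_parts = re.split(r"(?m)^---[ \t]*$\n?", text, maxsplit=2)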
```

Both `_parse_frontmatter` copies should be updated; a shared helper in `common.py` would also prevent future drift.
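A shared helper along these lines might look as follows. This is only a sketch under assumptions: the name `split_frontmatter`, its return type, and the sample document are hypothetical and not part of pk-doctor today. It returns the raw frontmatter text so each caller can keep its own `yaml.safe_load` call.

```python
import re
from typing import Optional, Tuple

# A delimiter is a line consisting only of '---' plus optional spaces/tabs.
_DELIM = re.compile(r"(?m)^---[ \t]*$\n?")


def split_frontmatter(text: str) -> Optional[Tuple[str, str]]:
    """Return (frontmatter, body), or None if there is no leading block.

    Hypothetical helper name, shown only to illustrate the suggested fix.
    """
    parts = _DELIM.split(text, maxsplit=2)
    # Expect: empty prefix, frontmatter, remainder of the file.
    if len(parts) < 3 or parts[0].strip():
        return None
    return parts[1], parts[2]


# Illustrative input with a table separator inside the frontmatter.
sample = (
    "---\n"
    "spec:\n"
    "  body: |\n"
    "    | A | B |\n"
    "    |---|---|\n"
    "  type: reference\n"
    "  state: captured\n"
    "---\n"
    "\n"
    "(body content)\n"
)

result = split_frontmatter(sample)
```

Because the table row is indented and starts with `|`, it does not match the line-anchored pattern, so `type` and `state` stay inside the returned frontmatter. Files with CRLF line endings are not handled by this pattern and would need `\r?` added before `$` if they occur in the project.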

Severity

ERROR-class false positive on otherwise valid entities. Causes `run_pk_release_audit` and routine `pk-doctor` runs to gate on nonexistent schema violations whenever a body contains a Markdown table — a very common pattern in Notes and Artifacts.

🤖 Reported via Claude Code
