[DISCUSSION] Standardizing Rule Matching Semantics

I've started working out how to describe the rule matching semantics in detail, and in the process I have come to believe that we should no longer treat the virtual output as an optional fallback buffer. Instead, we should have ONE concrete matching path for each "kind" of rule.

On the explicit-rule-chaining branch, there are two distinct kinds of rules: anchor rules and chained rules. An anchor rule is any rule that matches on the triecodes in the buffer alone. A chained rule is a rule that matches on one or more triecodes and then on the match index of a previous match (that is, it chains off the rule that was previously matched). This distinction has always existed conceptually, but it is now explicit in the matching algorithm, so we can decide matching behavior based on it.

In the original system, since the addition of the vout fallback buffer, all rules could match on either the input buffer (literal keypresses) or the vout fallback buffer (if enabled). Priority was given to the literal input buffer so that the fallback wouldn't interfere with chained rules. At this time, I simply disabled the vout fallback, because mixing it with chained rules was full of traps that would break rules in unexpected ways. Nevertheless, it was very useful for users who didn't rely so heavily on chained rules and wanted to write rules for common suffixes that would work both on words generated by other rules and on words that didn't have rules.

### The Problem
With the explicit-rule-chaining branch, it is no longer a concern that rules that work on the vout will interfere with chained rules, because any rule that would interfere will produce a warning from the generator, and the interaction can be fixed in a straightforward manner. However, the reverse problem still exists: anchor rules that are intended to match on the actual output can match in weird ways on the literal input triecodes that triggered chained rules. Here is a somewhat realistic example rule set:
```
_*ou -> _though
_*or -> _thorough
ou@ -> ough
gh@ -> ghly
```
Now, typing `_*or@` gives `_thoroughly`, but typing `_*ou@` gives `_thoughgh`?! This is a very surprising interaction for users. The `ou@` rule was preferred over the `gh@` rule here because `ou@` matches on three keys of literal input, while `gh@` matches on only two keys of literal input (because the `u` keypress produced both the `g` and the `h`).

### Proposal
Match overlapping non-chained rules *only* on the virtual output. Put more explicitly:
- Anchor rules match on the output and *may* overlap with the output of previously matched rules.
- Chained rules match on one or more literal keypresses following a match on the previous rule in the chain.

For the previous example, that means the `ou@ -> ough` rule is never allowed to match after typing `_*ou`, because `ou@` is an anchor rule, and anchor rules only match on the output. So `gh@` is the only match, and we get the desired `_thoughly` output. If the `gh@` rule didn't exist, we still wouldn't match on `ou@`. We would instead get something like `_thoughn` (from my rule `h@ -> hn`). This isn't useful, but at least it isn't surprising!
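To make the proposed behavior concrete, here is a minimal Python sketch of anchor-only matching on the output. It is not the project's implementation: the names `press` and `type_keys` are hypothetical, sequences are treated as plain strings rather than triecodes, and I am assuming that a rule `seq -> out` fires when its last key is pressed while the output ends with the rest of `seq`, with the longest sequence winning.

```python
# Hypothetical sketch: every rule here is treated as an anchor rule that
# matches only on the current output, never on the literal keypresses.
RULES = {
    "_*ou": "_though",
    "_*or": "_thorough",
    "ou@": "ough",
    "gh@": "ghly",
}

def press(output, key, rules):
    """Return the new output after pressing `key`."""
    best = None
    for seq in rules:
        context, last = seq[:-1], seq[-1]
        # The rule is a candidate if its last key was just pressed and the
        # rest of its sequence is what the output currently ends with.
        if key == last and output.endswith(context):
            if best is None or len(seq) > len(best):
                best = seq  # assumption: prefer the longest sequence
    if best is None:
        return output + key  # no rule matched, so the key is output literally
    context = best[:-1]
    # Replace the matched tail of the output with the rule's replacement.
    return output[: len(output) - len(context)] + rules[best]

def type_keys(keys, rules):
    output = ""
    for key in keys:
        output = press(output, key, rules)
    return output

print(type_keys("_*ou@", RULES))  # _thoughly
print(type_keys("_*or@", RULES))  # _thoroughly
```

Under these assumptions, `ou@` cannot fire after `_*ou`, because the output at that point ends in `gh`, not `ou`. Swapping `gh@ -> ghly` for `h@ -> hn` in the table yields `_thoughn`, as described above.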

If the user wants to recover the old behavior, they can simply write a proper chained rule for the cases where it is desired. I haven't actually found any case in my own ruleset that requires this yet, but it would be similar to how I needed to add `s*@ -> sks` to override the implicit `s*@ -> sknow` behavior when I originally enabled the vout buffer.

### Explanation for Users
Here is a rough draft of how I would explain the rule matching semantics to users:

There are two kinds of rules: anchor rules and chained rules.
- A rule is an **anchor rule** if no other rule has a sequence that is a prefix of this rule's sequence.
- A rule is a **chained rule** if one or more rules have sequences that are prefixes of this rule's sequence. For a chained rule, the longest prefix rule is called the **sub-rule**.

Example: If we have the rules:
```
_* -> _the
_*t -> _that
_*ts -> _that's
```
- `_* -> _the` is an anchor rule, because none of the other rules have sequences that are prefixes of `_*`
- `_*t -> _that` is a chained rule, because `_*` is a prefix of `_*t`, and `_* -> _the` is its sub-rule
- `_*ts -> _that's` is a chained rule, because both of the previous rules have sequences that are prefixes of `_*ts`, and `_*t -> _that` is its sub-rule, because `_*t` is a longer prefix than `_*`
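The classification above can be sketched in a few lines of Python. This is only an illustration of the definitions, assuming sequences can be compared as plain strings; the function name `classify` and the dictionary layout are hypothetical, not part of the generator.

```python
def classify(rules):
    """Map each sequence to ("anchor", None) or ("chained", sub_rule_sequence)."""
    result = {}
    for seq in rules:
        # Every other rule whose sequence is a prefix of this one.
        prefixes = [p for p in rules if p != seq and seq.startswith(p)]
        if prefixes:
            # The sub-rule is the rule with the longest prefix sequence.
            result[seq] = ("chained", max(prefixes, key=len))
        else:
            result[seq] = ("anchor", None)
    return result

rules = {"_*": "_the", "_*t": "_that", "_*ts": "_that's"}
for seq, (kind, sub_rule) in classify(rules).items():
    print(seq, kind, sub_rule)
# _* anchor None
# _*t chained _*
# _*ts chained _*t
```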

#### Matching Anchor Rules
The sequence of an anchor rule is matched against the output as it stands when the last character in the sequence is pressed. This output can come from any combination of regular keypresses and the output of previously matched rules.
**\<insert examples\>**

#### Matching Chained Rules
The sequence of a chained rule has two parts:
- the **prefix**, which is exactly the sequence of the sub-rule
- the **suffix**, which is everything that comes after the prefix (but usually only one symbol)

To match, it is first verified that the last keys pressed match the **suffix**. Then, it is verified that the last keypress immediately before the suffix triggered the sub-rule.
**\<insert examples\>**
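As a rough illustration of the two checks for chained rules, here is a small Python sketch. The `KeyEvent` record and the `chained_rule_matches` function are hypothetical names, and I am assuming that the engine remembers, for each keypress, which rule (if any) that keypress triggered. The real branch tracks a match index rather than a sequence string.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class KeyEvent:
    key: str
    # Sequence of the rule this keypress triggered, or None (assumed bookkeeping).
    triggered: Optional[str] = None

def chained_rule_matches(history: List[KeyEvent], prefix: str, suffix: str) -> bool:
    """`history` ends with the key that was just pressed."""
    n = len(suffix)
    if len(history) < n + 1:
        return False
    # Step 1: the last keys pressed must spell out the suffix.
    if "".join(event.key for event in history[-n:]) != suffix:
        return False
    # Step 2: the keypress immediately before the suffix must have
    # triggered the sub-rule, whose sequence is the prefix.
    return history[-n - 1].triggered == prefix

# Typing `_*t` with the rules `_* -> _the` and `_*t -> _that`:
history = [KeyEvent("_"), KeyEvent("*", triggered="_*"), KeyEvent("t")]
print(chained_rule_matches(history, prefix="_*", suffix="t"))  # True

# A plain `t` after unrelated keys does not chain off anything:
print(chained_rule_matches([KeyEvent("a"), KeyEvent("t")], "_*", "t"))  # False
```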

