Add tag-based test filtering#8
Conversation
Restructure existing keys in config.toml into sections for better extensibility, and introduce a Config class that centralizes config key definitions, section routing, and default value lookup, replacing duplicated defaults across tests. Now a [devices] section is required and the old flat format is rejected.
Factor out the tuple parsing logic to support tags in later commits
Tests can now carry tags by appending a list to the registration tuple:
- 3-tuple (tags only): ("test-name", callback, ["tag1", "tag2"])
- 4-tuple (dep_fn + tags): ("test-name", callback, dep_fn, ["tag1"])
Tags are stored as a frozenset in the Test definition, defaulting to
empty for backward compatibility.
Implement a recursive descent parser that compiles tag expressions like "!(benchmark | experimental)" into a matcher object. The matcher takes a list of tags and returns whether they satisfy the expression. The parser supports boolean operators (&, |, !) and parentheses for grouping. Identifiers consist of alphanumeric characters and underscores, and must not start with a digit.
Pass the full Test struct to matches() to enable tag-based filtering in later commits.
The CLI --tags option accepts a boolean tag expression to filter tests by their tags. A tag filter built from the expression is AND'd with existing path/state filters, narrowing the selected tests. A default expression can be set in config.toml under [filter].tags. The --tags option overrides it completely.
Show each test's tags on the list
lorelei-sakai
left a comment
There was a problem hiding this comment.
Generally speaking this looks fine. Most of my comments are about parts of the implmentation that seem awkward or over-complicated.
We should communicate to existing users that the current config file format will break once you commit this and people will have to reshape their config files; it could lead to a bit of surprise otherwise.
| if cfg.get(key) is None: | ||
| raise ValueError(f"config.toml: missing required key '{key}' in [devices]") | ||
| if not cfg.get("disable_by_id_check"): | ||
| check_dev(cfg.get("metadata_dev"), "metadata_dev") |
There was a problem hiding this comment.
check_dev here really doesn't operation on anything except the config, so it probably makes more sense as a function in the Config class.
Then you could call it simply as cfg.check_dev("data_dev") with fewer arguments.
Possibly the entire validate() function should also be defined in the Config class as well, though that makes less difference in terms of code complexity.
| def __str__(self): | ||
| return str(self._cfg) | ||
|
|
||
| def __init__(self): |
There was a problem hiding this comment.
Now that you have an actual Config class, the Fixture class (which seems to just be a holder for the config) doesn't seem to serve much purpose. I'd like to get rid of one of the other and only have one class.
That can be a future cleanup, though.
There was a problem hiding this comment.
That makes sense. My thinking was that Fixture will likely grow to carry other environmental information beyond just configs. However, I agree we can revisit this for a cleanup later.
| raise ValueError(f"test entry must have 2-3 elements, got {len(entry)}") | ||
| path, callback, third = entry | ||
| if callable(third): | ||
| return path, callback, third, None |
There was a problem hiding this comment.
This parsing dance seems really awkward. Did you consider using named parameters instead of trying to guess what the third argument is by type?
In particular, I think we could make the third argument always "tags" and then require a named argument for tests that want to set dep_fn. There are currently only three tests that register a dep_fn, and it's for enforcing a dependency that I hope to remove later anyway, so in the long run we can probably make dep_fn go away entirely.
There was a problem hiding this comment.
Tuples don't have named parameters. Switching to dicts introduces more verbosity, and using NamedTuple requires a class name on every test definition. Both introduce a lot of churn.
Making the third argument always "tags" could be a feasible direction since only three tests need a dep_fn. Those three tests could be registered separately via register() directly.
| self._tests = {} | ||
|
|
||
| def register(self, path, callback, dep_fn=None): | ||
| def register(self, path, callback, dep_fn=None, tags=None): |
There was a problem hiding this comment.
If you make tags default to an empty list, I think it simplifies the assignment later. You should then be able to always say t = frozenset(tags) or even inline it.
There was a problem hiding this comment.
Use empty iterable default to avoid if-else:
def register(self, path, callback, dep_fn=None, tags=()):
path = _normalise_path(path)
self._tests[path] = Test(dep_fn, callback, frozenset(tags))
The type check on tags and dep_fn could be moved from _parse_test_entry() to register()
| self._tests[path] = Test(dep_fn, callback, t) | ||
|
|
||
| def register_batch(self, prefix, tests, batch_dep_fn=None): | ||
| def register_batch(self, prefix, tests, batch_dep_fn=None, batch_tags=None): |
There was a problem hiding this comment.
I think the case is even stronger here for making batch_tags the third argument and requiring a named-parameter to set the batch_dep_fn. There is literally no current test that sets a batch_dep_fn so you won't even have to update anything.
There was a problem hiding this comment.
Agree. Moving batch_tags to the third positional argument makes it more convenient for callers:
def register_batch(self, prefix, tests, batch_tags=(), batch_dep_fn=None):
| @@ -4,31 +4,31 @@ | |||
|
|
|||
There was a problem hiding this comment.
The plit between patches 7 and 8 feels awkward ("Extend TestFilter to obtain Test properties in matches()" and "Add tag-based test filtering via --tags option"). It's not at all clear in patch 7 why you want to pass all the test properties down through all the filters. Maybe pull the TagFilter class into the earlier patch?
This probably doesn't matter than much, feel free to ignore it.
There was a problem hiding this comment.
I had similar thoughts, but keeping the refactor separate makes the diff easier to review since it's a pure interface change across many files. The separation also makes it clear that the mechanical update is correct on its own, independent of the feature addition.
I'll update the commit message to explain why the interface is changing:
Extend TestFilter.matches() to accept the Test struct
Pass the full Test struct alongside the path so that filters can
access test properties like tags. This is needed for the TagFilter
introduced in the next commit.
There was a problem hiding this comment.
The extra info in the commit message seems like enough, thanks.
| if args.show_tags: | ||
| tags = tests.get_tags(p) | ||
| tag_str = f"[{', '.join(sorted(tags))}]" if tags else "" | ||
| print(f"{tag_str.ljust(30)}", end=" ") |
There was a problem hiding this comment.
It's not a big deal, probably but, I'm concerned about what this formatting will look like. tree_line() justifies everything so the test name will take up 50 columns. You're adding another 30 columns, so the results will necessarily show up past column 80 here, which is awkward if you're using an 80-character-wide terminal (like I do).
Another concern is that if there are three or four tags, they may take more than 30 columns to print.
Perhaps you could steal some columns from tree_line? Perhaps you could put the tags after the results columns? Perhaps you could deliberately add a second line for the tags of each test?
(Also tags here is guaranteed to be a frozenset I think, you shouldn't need the if/else.)
| arg_result_set(list_p) | ||
| arg_run_nr(list_p) | ||
| list_p.add_argument( | ||
| "-T", |
There was a problem hiding this comment.
Is it worth adding a long form of this argument too, like --show-tags?
There was a problem hiding this comment.
I initially chose -T since the verb-prefixed --show-tags felt redundant to the list subcommand, but I could add both for discoverability.
There was a problem hiding this comment.
The verb is kind of redundant, yeah. The longer tag could be just --tags? It just feels weird to only have the short option when most other command arguments don't do that.
There was a problem hiding this comment.
--tags is used for specifying tag expression and the list subcommand supports --tags, so the name is taken, unless we rename the --tags option. I haven't figured out a good long option name for -T.
pytest's -m option (marker expression) doesn't have the correspond long option name.
| if token is None: | ||
| raise ParseError("unexpected end of expression, expected an identifier or '('") | ||
| if token in ('&', '|', ')'): | ||
| raise ParseError(f"unexpected '{token}', expected an identifier or '('") |
There was a problem hiding this comment.
It looks to me like if "and" shows up in the wrong place, it'll be reported as "unexpected '&'" won't it? (Similarly for "or".)
There was a problem hiding this comment.
Right, keywords "and", "or", "not" are replaced. The error message could be ambiguous.
| self.name = name | ||
|
|
||
| def match(self, tags): | ||
| return self.name in tags |
There was a problem hiding this comment.
So, I take it the tag names accepted are all arbitrary, not limited to a set actually used in the tree? If I type "smoek" instead of "smoke" I just don't get any tests run, no warning/error? (Worse, "tag1|tag2" when I spell one wrong would still run the tests with the other tag, making it less obvious that something was wrong.)
Add a tagging system for test categorization and filtering. Tests can carry tags like
smoke,benchmark, orexperimentalto classify their purpose beyond the existing path-based organization. A boolean expression parser lets users select tests by tags at the command line or through a config default.Motivation
The test suite contains tests of varying nature: parameter validation, smoke tests, IO-intensive functional tests, benchmarks, and stress tests. However, they are organized along a single axis (by feature, subsystem, or module). Inspired by
pytestmarkers, tags add a second axis so developers can run only fast deterministic tests during development, or exclude benchmarks in CI, without reorganizing the test tree.Changes
config.tomlinto[devices]and[run]sections with a newConfigclass that centralizes key definitions and defaults.register_batch()supports per-test and batch-level tags, merged at registration time."!experimental"or"smoke & !benchmark"into matcher objects. Supports boolean operators (&,|,!), parentheses, and keyword aliases (and,or,not).--tagsfiltering: A new--tagsCLI option filters tests by tag expression, AND'd with existing path/state filters. A default expression can be set inconfig.tomlunder[filter].tags; the CLI option overrides it.list -Tshows each test's tags, andlist-tagssubcommand summarizes all tags with test counts.Registration examples
Usage