compare: scope PR-comment upsert per (deployment, triggering_env)#545
Conversation
`check_regression_comment` matched on the literal substrings "Comparison between" and "Time Period from" -- present in every regression-table comment regardless of which deployment or triggering env produced it. So when several matrix entries (e.g. oss-standalone + oss-cluster-02-primaries, or msmarco + ftsb) run `compare --pull-request N` against the same PR, the second one finds the first's comment via that generic predicate and `edit`s it in place, silently overwriting the previous setup's results. Net effect: only whichever setup finishes last is visible on the PR. - Add `build_upsert_marker(baseline_deployment, comparison_deployment, triggering_env)` returning a stable hidden HTML comment string. The marker excludes architecture so multi-arch sections of one comment still converge to a single upsertable comment as each arch finishes. - Extend `check_regression_comment(comments, marker=None)` to filter on the marker when provided. When no comment carries the marker we return (False, -1) so the caller posts a fresh comment; any pre-fix legacy comment on the PR is left in place as a one-time orphan rather than overwritten with a body for a different setup. With `marker=None` the legacy generic predicate is preserved for callers that don't (yet) compute a marker. - In `compare_command_logic`, compute the marker once right after the deployment names are resolved, thread it into the existing matcher call, and append it to `comment_body` before the upsert when `is_actionable_pr` is true. The same marker is added on both create and edit paths so the equality short-circuit at the diff comparison continues to work (the existing comment's stored body already contains the marker). - Tests cover: marker determinism, distinctness by deployment + triggering_env, prefix-collision safety (oss-standalone vs oss-standalone-threads-8), legacy-predicate fallback when no marker is passed, no-match returns (False, -1), legacy comments ignored when a marker is provided, last-match wins on duplicates. Refs: discussion on RediSearchEnterprise#462
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #545 +/- ##
==========================================
+ Coverage 47.50% 47.67% +0.17%
==========================================
Files 74 74
Lines 8907 8917 +10
==========================================
+ Hits 4231 4251 +20
+ Misses 4676 4666 -10 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@kei-nan if u bump the version in pyproject, we can merge and inmediately do a release so it would be seen |
I don't have permissions in this project, can't even press the merge button. |
I mean, in the PR change the version in pyproject. Then I will be able to merge and release a version |
Summary
check_regression_commentmatches on the literal substrings"Comparison between"and"Time Period from"— strings that appear in every regression-table comment regardless of which deployment or triggering env produced it. So when several matrix entries (e.g.oss-standalone+oss-cluster-02-primaries, or msmarco + ftsb) each runcompare --pull-request Nagainst the same PR, the second invocation finds the first's comment via that generic predicate and callsregression_comment.edit(comment_body)(compare.py:876), silently overwriting the previous setup's results. Net effect on the PR: only whichever setup finishes last is visible.This PR scopes the upsert per
(deployment, triggering_env)tuple so each setup gets its own comment.Changes
New helper
build_upsert_marker(baseline_deployment_name, comparison_deployment_name, tf_triggering_env)returns a stable hidden HTML comment string, e.g.Architecture is intentionally not part of the key — in multi-arch mode a single comment renders one H2 section per arch and we want it to converge to a single upsertable comment as each arch finishes.
check_regression_comment(comments, marker=None)now filters on the marker when provided. When no comment carries the marker the function returns(False, -1)so the caller posts a fresh comment; any pre-fix legacy comment is left in place as a one-time orphan rather than overwritten with a body for a different setup. Whenmarker=Nonethe legacy generic predicate is preserved, so out-of-tree callers that don't yet pass a marker keep their existing behavior.compare_command_logiccomputes the marker once right afterbaseline_deployment_name/comparison_deployment_nameare resolved (already done at compare.py:364-371), threads it into the existing matcher call, and appends it tocomment_bodybefore posting whenis_actionable_pris true. The same marker is added on both create and edit paths so the equality short-circuit at the diff comparison (thecomment_body == old_regression_comment_bodycheck) continues to work — the previously-posted body already contains the marker.Backward compatibility
check_regression_comment(comments)(nomarker=kwarg) hit the legacy generic predicate — unchanged.redisbench-adminwill gain one fresh marker-tagged comment alongside the orphan the first time the new code runs. From that point on, repeat runs of the same(deployment, triggering_env)find their marker and update in place.Test plan
python -m pytest tests/test_compare.py tests/test_compare_validation.py→ 20 passed, 9 skipped (the 9 skipped are RTS-dependent integration tests; the 8 new unit tests are pure and all pass).black --check redisbench_adminclean.python -m flake8 redisbench_adminclean.from redisbench_admin.compare.compare import compare_command_logic, build_upsert_marker, check_regression_commentsucceeds.New unit tests cover:
oss-standaloneis not a substring ofoss-standalone-threads-8's marker, even though their deployment names share a prefix).marker=None.(False, -1)returned when no comment carries the marker.Context
Discovered while debugging missing benchmark comments on redislabsdev/RediSearchEnterprise#462 — fixing the deployment-name filter there exposed this upsert-collision as the next layer of the same problem.