Skip to content

feat: Improve progress reporting during identifier conversion#23186

Open
thykel wants to merge 8 commits into
devfrom
feat/add-remaining-project-counter-into-admin
Open

feat: Improve progress reporting during identifier conversion#23186
thykel wants to merge 8 commits into
devfrom
feat/add-remaining-project-counter-into-admin

Conversation

@thykel
Copy link
Copy Markdown
Contributor

@thykel thykel commented May 12, 2026

Ticket

https://community.openproject.org/projects/stream-jira-exit/work_packages/74903/activity

What are you trying to accomplish?

Show a very basic progress indicator: Number of remaining projects to convert.

Screenshots

Screenshot 2026-05-14 at 10 46 56 Screenshot 2026-05-14 at 10 50 03

What approach did you choose and why?

Merge checklist

  • Added/updated tests
  • Added/updated documentation in Lookbook (patterns, previews, etc)
  • Tested major browsers (Chrome, Firefox, Edge, ...)

akabiru added a commit that referenced this pull request May 13, 2026
The [A-Z][A-Z0-9_]* shape that defines a semantic project identifier
appears in three places that each redefine it inline: the validator's
body check, the work-package semantic-id pattern composed into the
route constraint, and the wiki reverse-link parser.

Lift the unanchored shape to Projects::Identifier::SEMANTIC_FORMAT and
have WorkPackage::SemanticIdentifier::SEMANTIC_ID_PATTERN compose from
it via .source, with ID_ROUTE_CONSTRAINT in turn composing from that.
The validator keeps its own anchored start/body patterns because each
pattern produces a distinct error message — composing the two from
SEMANTIC_FORMAT would obscure that contract.

Split out of #22976 so #23186 can introduce its CLASSIC_IDENTIFIER_CHARS
counterpart without conflicting with the larger feature PR.
@thykel thykel marked this pull request as ready for review May 14, 2026 14:22
@github-actions
Copy link
Copy Markdown

Caution

The provided work package version does not match the core version

Details:

Please make sure that:

  • The work package version OR your pull request target branch is correct

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 14, 2026

Deploying openproject with PullPreview

Field Value
Latest commit 8bec621
Job deploy
Status ✅ Deploy successful
Preview URL https://pr-23186-add-remaining-proj-ip-178-105-9-66.my.opf.run:443

View logs

Copy link
Copy Markdown
Member

@akabiru akabiru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do my own review in the morning, but here are some notes review agents flagged verbatim.

Comment on lines +125 to +126
def with_non_classic_identifier
where("identifier !~ ?", "^#{CLASSIC_FORMAT_CHARS.source}$")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍎 with_non_classic_identifier and classic_identifier_format? disagree on all-numeric input

The new scope is built from CLASSIC_FORMAT_CHARS (no anti-all-numeric guard) while classic_identifier_format? uses CLASSIC_FORMAT (which has the guard). So "123" is classic in SQL and non-classic in Ruby — and the scope name implies it's the complement of the method.

It happens to be safe today only because RevertInstanceToClassicIdsJob#perform (app/workers/project_identifiers/revert_instance_to_classic_ids_job.rb:47) also skips numerics via the Ruby method, so the rows SQL misses are the same rows the job would skip anyway. But the scope is now a public class method on Project and the next caller won't carry that guarantee.

Verification
CLASSIC_FORMAT_CHARS = /[a-z0-9\-_]+/
CLASSIC_FORMAT       = /\A(?!\d+\z)#{CLASSIC_FORMAT_CHARS}\z/
sql_anchored = /^#{CLASSIC_FORMAT_CHARS.source}$/

# input            ruby_classic  sql_classic  divergence
# "my-project"     true          true
# "PROJ"           false         false
# "MY_PROJ_1"      false         false
# "123"            false         true         DIVERGES
# "abc_42"         true          true

Simplest fix: drop CLASSIC_FORMAT_CHARS entirely, encode the lookahead in one SQL regex. Postgres ~/!~ supports (?!…).

Suggested change
def with_non_classic_identifier
where("identifier !~ ?", "^#{CLASSIC_FORMAT_CHARS.source}$")
def with_non_classic_identifier
where("identifier !~ ?", '^(?!\\d+$)[a-z0-9_-]+$')
end

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the lookahead doesn't work for SQL in this query 👀 and if nothing else, it's a bit faster without it.

Copy link
Copy Markdown
Contributor Author

@thykel thykel May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, interestingly enough, we may want to include all-numeric IDs here for projects created before 2021.

=> Relevant Matrix thread

(Though, that will require adding a little bit of extra error handling for the back-conversion, because conversion back to 123 would yield a validation error.)

Comment thread app/models/projects/identifier.rb
Comment on lines +74 to +78
if pending_count.zero?
I18n.t("#{scope}.zero")
else
I18n.t(scope, count: pending_count)
end
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍎 I18n already routes count: 0 to the zero: key — drop the manual branch

Rails I18n auto-resolves zero: under a pluralized entry when count: 0 is passed. The explicit if pending_count.zero? + "#{scope}.zero" string interpolation is reimplementing existing behaviour, and it reads as if zero were a peer key rather than a CLDR plural form.

Verification
# /tmp/i18n_check.rb — store_translations with {zero, one, other}
count=0: Finalizing conversion...
count=1: 1 project remaining.
count=7: 7 projects remaining.

If you add a parallel zero: to classic_conversion_status_message (see comment on config/locales/en.yml), the whole method collapses to a symmetric single call:

Suggested change
if pending_count.zero?
I18n.t("#{scope}.zero")
else
I18n.t(scope, count: pending_count)
end
if pending_count.zero?
I18n.t("#{scope}.zero")
else
I18n.t(scope, count: pending_count)
end

→ becomes just I18n.t(scope, count: pending_count).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I thought that automatic routing doesn't actually work and 0 gets direct to other. Will re-check.

Comment thread config/locales/en.yml
Comment on lines +456 to 460
classic_conversion_status_message:
one: Project identifiers are being converted to classic format. 1 project remaining.
other: Project identifiers are being converted to classic format. %{count} projects remaining.
workflows:
tabs:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍎 Add zero: to classic_conversion_status_message for symmetry

Forward direction has a zero: finalizing message; reverse direction doesn't. The poll can render count: 0 against the other plural ("0 projects remaining") for one tick — Project.with_non_classic_identifier.count can hit zero before IdentifierAutofix.reversion_in_progress? flips. Adding the parallel key also unlocks collapsing the Ruby branch in identifier_settings_form_component.rb#in_progress_banner_message.

Suggested change
classic_conversion_status_message:
one: Project identifiers are being converted to classic format. 1 project remaining.
other: Project identifiers are being converted to classic format. %{count} projects remaining.
workflows:
tabs:
classic_conversion_status_message:
zero: Finalizing conversion to classic format...
one: Project identifiers are being converted to classic format. 1 project remaining.
other: Project identifiers are being converted to classic format. %{count} projects remaining.

I18n.t("admin.settings.work_packages_identifier.in_progress.classic_conversion_status_message",
count: pending_count)
else
pending_count = ProjectIdentifiers::PendingProjectsFinder.project_ids.size
Copy link
Copy Markdown
Member

@akabiru akabiru May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the proposal here on adopting a cache is valid- as some projects will take less than 10s and afaik, cache is an opt-in; not guaranteed to be present. But the indexing point is sth worth exploring?


🍊 PendingProjectsFinder.project_ids is three unindexed queries every 5s

PendingProjectsFinder.project_ids (app/services/project_identifiers/pending_projects_finder.rb:40-58 — not in this PR) materializes three Sets per call:

  • ProblematicIdentifiers#scope.ids — 4 OR'd regex predicates on projects.identifier (no functional index; the file's own perf note acknowledges this).
  • WorkPackage.unsequenced.distinct.pluck(:project_id) — scan filtered on sequence_number IS NULL.
  • WorkPackage.non_semantic.distinct.pluck(:project_id)JOIN projects + per-row work_packages.identifier IS DISTINCT FROM projects.identifier || '-' || sequence_number::text. No index supports the concat.

This runs every 5s (poll_for_changes_interval_value: 5000 below) for every admin polling the page, against the same work_packages rows the conversion job is mutating. On a large instance that's seconds of DB time per poll while the job is contending for the same rows.

A monotonically-decreasing counter doesn't need second-level freshness — short cache window is enough:

Suggested change
pending_count = ProjectIdentifiers::PendingProjectsFinder.project_ids.size
pending_count = Rails.cache.fetch("wp_identifier_pending_count", expires_in: 10.seconds) do
ProjectIdentifiers::PendingProjectsFinder.project_ids.size
end

Alternative: add PendingProjectsFinder.count that aggregates the three buckets in one SQL union — the caller doesn't actually need the IDs, only the size.

context "when reversion is in progress" do
before do
allow(ProjectIdentifiers::IdentifierAutofix).to receive(:reversion_in_progress?).and_return(true)
allow(Project).to receive(:with_non_classic_identifier).and_return(double(count: 3))
Copy link
Copy Markdown
Member

@akabiru akabiru May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍊 Stubbing with_non_classic_identifier defeats the test

allow(Project).to receive(:with_non_classic_identifier).and_return(double(count: 3)) stubs out the very SQL scope this PR introduces. If the regex breaks (see the asymmetry comment on identifier.rb), this spec still passes.

The same file uses real create_list/shared_let data elsewhere — let the scope run for real here too. Even stronger: add a dedicated spec/models/projects/identifier_spec.rb#with_non_classic_identifier covering edge cases (all-digit, uppercase, mixed-case, underscores) — the regex split is exactly the surface that wants direct coverage.

Suggested change
allow(Project).to receive(:with_non_classic_identifier).and_return(double(count: 3))
create_list(:project, 3, identifier: "PROJ#{_1}")

(And drop the allow(Project).to receive(:with_non_classic_identifier)… stub.)

The forward-direction stub on PendingProjectsFinder.project_ids (lines 58, 67, 76) is fine — it's a module-level pure function, not an AR scope.

it "returns 200 with a turbo stream replacing the form component in change_in_progress state" do
get :status, format: :turbo_stream

expect(response).to have_http_status(:ok)
Copy link
Copy Markdown
Member

@akabiru akabiru May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍊 Both #status examples assert the same turbo-stream wrapper — the state difference isn't exercised

The in-progress branch (state: :change_in_progress) and the completed branch (state: :completed) render different components, but both assertions check only that some turbo-stream replaces the wrapper. A future change that returns :edit while in progress, or swaps the component class entirely, ships green.

Assert against the rendered body for the meaningful difference:

Suggested change
expect(response).to have_http_status(:ok)
expect(response).to have_http_status(:ok)
expect(response).to have_turbo_stream(action: "replace", target: component_target)
expect(response.body).to include("projects remaining").or include("Finalizing conversion")

Mirror with Successfully updated work package identifier format. in the completed example (line 60-62).

Comment on lines +38 to +39
# Simple character set meant only for SQL matching. No anchoring or anti-all-numeric guard.
CLASSIC_FORMAT_CHARS = /[a-z0-9\-_]+/
Copy link
Copy Markdown
Member

@akabiru akabiru May 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍏 Comment reads as if the SQL is unanchored

Simple character set meant only for SQL matching. No anchoring or anti-all-numeric guard.

The constant itself isn't anchored, but the only caller (line 126) adds ^…$ inline — so the comment narrates a property of this Regexp object that's invisible at the use site. If the regex-asymmetry comment is taken and CLASSIC_FORMAT_CHARS is removed, this disappears with it; otherwise tighten to "Character class only — callers anchor."

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants