Skip to content

fix(core): drop locale code lowercasing that broke mixed-case locales#1552

Open
ching-kuo wants to merge 1 commit into
emdash-cms:mainfrom
ching-kuo:fix/locale-code-casing
Open

fix(core): drop locale code lowercasing that broke mixed-case locales#1552
ching-kuo wants to merge 1 commit into
emdash-cms:mainfrom
ching-kuo:fix/locale-code-casing

Conversation

@ching-kuo

Copy link
Copy Markdown

What does this PR do?

localeCode schema was calling .toLowerCase() on the validated value, rewriting e.g. zh-TW to zh-tw before the query ran. Rows are stored with the casing from defaultLocale and SQLite/D1 compare strings case-sensitively, so the filter matched zero rows — admin content list returned an empty envelope with no error.

Removing the transform keeps read and write casing symmetric.

Simple test is also added for 3 different cases.

Closes #1551

Type of change

  • Bug fix
  • Feature (requires maintainer-approved Discussion)
  • Refactor (no behavior change)
  • Translation
  • Documentation
  • Performance improvement
  • Tests
  • Chore (dependencies, CI, tooling)

Checklist

  • I have read CONTRIBUTING.md
  • pnpm typecheck passes
  • pnpm lint passes
  • pnpm test passes (or targeted tests for my change)
  • pnpm format has been run
  • I have added/updated tests for my changes (if applicable)
  • User-visible strings in the admin UI are wrapped for translation (if applicable). Do not include messages.po changes except in translation PRs — a workflow extracts catalogs on merge to main.
  • I have added a changeset (if this PR changes a published package)
  • New features link to an approved Discussion: https://github.com/emdash-cms/emdash/discussions/...

AI-generated code disclosure

  • This PR includes AI-generated code — model/tool:

Screenshots / test output

localeCode schema was calling .toLowerCase() on the validated value,
rewriting e.g. zh-TW to zh-tw before the query ran. Rows are stored
with the casing from defaultLocale and SQLite/D1 compare strings
case-sensitively, so the filter matched zero rows — admin content list
returned an empty envelope with no error.

Removing the transform keeps read and write casing symmetric.
@changeset-bot

changeset-bot Bot commented Jun 19, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: a88fd5e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 16 packages
Name Type
emdash Patch
@emdash-cms/cloudflare Patch
@emdash-cms/sandbox-workerd Patch
@emdash-cms/fixture-perf-site Patch
@emdash-cms/perf-demo-site Patch
@emdash-cms/cache-demo-site Patch
@emdash-cms/do-demo-site Patch
@emdash-cms/do-solo-demo-site Patch
@emdash-cms/admin Patch
@emdash-cms/auth Patch
@emdash-cms/blocks Patch
@emdash-cms/gutenberg-to-portable-text Patch
@emdash-cms/x402 Patch
create-emdash Patch
@emdash-cms/auth-atproto Patch
@emdash-cms/plugin-embeds Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions Bot added area/core size/S review/needs-review No maintainer or bot review yet labels Jun 19, 2026
@github-actions

github-actions Bot commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@ching-kuo

Copy link
Copy Markdown
Author

I have read the CLA Document and I hereby sign the CLA

github-actions Bot added a commit that referenced this pull request Jun 19, 2026

@emdashbot emdashbot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach

The fix targets a real, correctly-diagnosed bug. localeCode ran .toLowerCase() on the validated value, so a ?locale=zh-TW filter became zh-tw before the query ran. Content rows are written with the casing of the configured defaultLocale (the seed/import/runtime paths set locale directly from getI18nConfig().defaultLocale and never go through this schema), and SQLite/D1 compare TEXT case-sensitively (no NOCASE collation on the locale column — migration 019_i18n.ts). So the filter matched zero rows and the admin content list came back empty with no error. Removing the transform to make read/write casing symmetric is the right minimal fix and fits EmDash's conventions. The added unit test reproduces the regression at the schema level (it fails under the old .transform(...)), and the changeset is present, scoped to emdash as patch, and written as user-facing release notes. So the change is the right change, solving the right problem.

What I checked

  • Traced every localeCode consumer (contentListQuery, contentCreateBody, searchQuery, searchSuggestQuery) — all four are locale filter/input fields; none rely on the lowercasing.
  • Confirmed the stored locale comes from defaultLocale via seed/apply.ts, import/, and handleContentCreate (body.locale ?? getI18nConfig()?.defaultLocale), none of which normalize casing — so the asymmetry the PR describes is real.
  • Confirmed the repository layer filters with a bare WHERE locale = ? (database/repositories/content.ts:510,852) and the FTS search path with AND c.locale = ? (search/query.ts:240,373) — both case-sensitive, so the same bug applied to search and is fixed by the same change.
  • Confirmed the locale column is plain TEXT with no COLLATE NOCASE, so the case-sensitive comparison premise holds on SQLite and Postgres.
  • Confirmed the test would fail under the old transform and passes now, and that no other test asserted the old lowercasing behavior.
  • Confirmed the changeset package name (emdash) and bump level (patch) are correct.

Headline

The fix resolves the reported bug for seed/import/runtime-created content. However, it can regress sites that created content through the admin UI under the previously-released lowercasing code on a mixed-case defaultLocale. The admin create flow always sends an explicit locale: defaultLocale through contentCreateBodylocaleCode, so under the old released code those rows were stored lowercased (zh-TWzh-tw). After this PR the now-case-preserving ?locale=zh-TW filter no longer matches them, so they vanish from the admin list — and UNIQUE(slug, locale) won't catch a re-created entry because zh-twzh-TW. See the line-anchored finding for the recommended remediation (a canonicalizing migration or a case-insensitive query), which would fix the reported bug without orphaning the lowercased rows the predecessor code wrote.

Net: approach is correct and the code is clean, but I'd hold for a migration/query-level fix (or an explicit decision that the affected population is acceptable) before merging.

.string()
.regex(/^[a-z]{2,3}(-[a-z0-9]{2,8})*$/i, "Invalid locale code")
.transform((v) => v.toLowerCase());
export const localeCode = z.string().regex(/^[a-z]{2,3}(-[a-z0-9]{2,8})*$/i, "Invalid locale code");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[needs fixing] Removing the .toLowerCase() transform correctly fixes the reported bug for content stored with defaultLocale casing (seed/import/runtime paths write locale directly from getI18nConfig().defaultLocale, bypassing this schema, so e.g. zh-TW rows existed while the old filter lowercased ?locale=zh-TW to zh-tw and matched nothing).

But the transform being removed has shipped in released versions (it's in the repo's first commit), and the admin create flow does not bypass the schema: router.tsx always sends locale: pickerLocale (pickerLocale = locale ?? manifest?.i18n?.defaultLocale), which is parsed by contentCreateBodylocaleCode. Under the old code, localeCode.parse("zh-TW") returned "zh-tw", so handleContentCreate stored those rows with locale = 'zh-tw' (see api/handlers/content.ts:640, effectiveLocale = body.locale ?? defaultLocale).

After this PR, the list filter preserves casing (?locale=zh-TW), and the locale column is plain TEXT with no COLLATE NOCASE (migration 019_i18n.ts), so SQLite/D1 compare case-sensitively. Net effect on a mixed-case-defaultLocale site (e.g. zh-TW) that created content via the admin UI under the released code:

  • Those existing rows are stored as zh-tw.
  • WHERE locale = 'zh-TW' no longer matches them → they vanish from the admin content list.

This trades one empty-list bug for another on a different (and likely larger) population of sites — any non-English site using canonical BCP 47 casing that created entries through the admin UI. A secondary effect: because uniqueness is UNIQUE(slug, locale) and zh-twzh-TW, an editor re-creating an entry with the same slug at the now-correct zh-TW casing won't get a conflict, producing duplicate slugs across the two casings.

The robust fix is to canonicalize what's stored rather than just stop normalizing the input — e.g. a forward-only migration that rewrites every ec_*.locale to the configured casing (looping _emdash_collections like migration 013), or making the filter case-insensitive (WHERE locale = ? COLLATE NOCASE on SQLite / LOWER(locale) = LOWER(?) on Postgres) so the query matches regardless of how a row was historically stored. Either approach fixes the reported bug without orphaning the lowercased rows this PR's predecessor wrote. A changeset note calling out the migration step would also be appropriate if a migration is added.

@ching-kuo

Copy link
Copy Markdown
Author

The bot review looks valid, I didn't consider that posts created in the admin page go through the schema and get stored with a lowercase locale (e.g. en-US becomes en-us).
Imported content skips the schema and keeps the source casing (en-US), so the two end up stored differently.

A few possible solutions here:

  1. Canonicalizing migration: forward-only, loop _emdash_collections l and rewrite each ec_*.locale to the configured casing (same pattern as the 013_scheduled_publishing migration), skipping any UNIQUE(slug, locale) collisions.
  2. Case-insensitive comparison: COLLATE NOCASE on SQLite, LOWER(locale) = LOWER(?) on Postgres.

I personally prefer the canonicalizing migration, since a migration likely runs only once, while the case-insensitive comparison causes a performance hit on every locale-filtered query (it can't use the idx_*_locale index).

Happy to implement the migration if that direction sounds good.

@github-actions github-actions Bot added review/awaiting-author Reviewed; waiting on the author to respond and removed review/needs-review No maintainer or bot review yet labels Jun 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/core cla: signed review/awaiting-author Reviewed; waiting on the author to respond size/S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

localeCode schema lowercases the ?locale= filter, returning zero rows for locales with uppercase subtags

1 participant