CLIM-1339: Download processing finch error #692
Open
renoirb wants to merge 5 commits into
Description
On the Download page, requesting data for certain Health Region selections (InteractiveRegionOption.HEALTH) failed in Finch: the job errored, no file was produced, and the user received an error email instead of their data.

Root cause. The selected region's display name is sent to Finch as the output_name input, the label Finch uses to name the output file. Two of the 119 health-region names contain a forward slash:

- North Shore/Coast Garibaldi Health Service Delivery Area
- Thompson/Cariboo Health Service Delivery Area

Finch builds its output path from output_name and reads the / as a directory separator, so it attempts to write into a sub-directory that does not exist. The failing jobs show two message variants but a single cause: the / split.

Fix. Before output_name is pushed to the Finch inputs, replace the path separators / and \ with _. North Shore/Coast Garibaldi… is then sent as …North Shore_Coast Garibaldi…, which Finch accepts and stores as …north_shore_coast_garibaldi…, matching a known-good run.

Why only the slash. Finch already normalizes the rest of the name itself: it lower-cases, replaces spaces and punctuation with _, ASCII-folds accents, and transliterates non-ASCII letters. Verified against real jobs:

- Région de l'Estrie was submitted unescaped (accent + apostrophe + spaces) and succeeded, returning …region_de_l_estrie….zip. So accents, apostrophes and dashes are not the cause.
- Tłı̨chǫ Community Services Agen (which includes letters that NFKD does not fold, such as ł and ı) was also submitted unescaped and succeeded, returning …tlicho_community_services_agen…. Finch transliterates ł→l, ı→i, ǫ→o (to find this region on the map, click near 'Northwest Territories').

The slash is special only because Finch consumes it as a path separator before that normalization runs.
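The replacement described under "Fix" can be sketched as below. This is a minimal illustration, not the code from this PR: the helper name sanitizeOutputName and its signature are hypothetical, and the only assumption about behaviour is the one stated above, namely that / and \ are each replaced with _ and everything else is left for Finch to normalize.

```typescript
// Hypothetical helper: the name and signature are illustrative, not taken from the PR.
function sanitizeOutputName(name: string): string {
  // Replace every forward slash and backslash with an underscore.
  // All other characters (accents, apostrophes, spaces) pass through unchanged.
  return name.replace(/[\/\\]/g, "_");
}

const examples: string[] = [
  "North Shore/Coast Garibaldi Health Service Delivery Area",
  "Thompson/Cariboo Health Service Delivery Area",
  "Région de l'Estrie",
];

for (const name of examples) {
  console.log(`${name} -> ${sanitizeOutputName(name)}`);
}
```

A character class with the global flag keeps the change narrow: names that contain neither separator come back identical, so only the two affected health regions produce a different string.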
A scan of all 119 names in the GeoServer CDC:health layer found exactly 2 with / (the two above); the rest carry accents, dashes, apostrophes and even non-Latin letters, all of which Finch handles on its own. The slash fix therefore covers the failure for the entire layer.

Scope & safety.

- The change touches only the output_name string.
- output_name is write-only on the frontend (nothing reads it back), so this cannot affect request matching or any other input.
- Names without a path separator send the same output_name as before.

Unrelated data note: Tłı̨chǫ Community Services Agen… (region id 116) is truncated in the CDC:health source layer (it should read "Agency"). It downloads correctly, because Finch transliterates the name to tlicho_community_services_agen, so this is a separate layer data-quality issue.

Related ticket