Skip to content

fix: handle FileNotFoundError when temporary file is missing in analysis workers#395

Merged
Pierlou merged 2 commits intomainfrom
fix/raise-tmpfile-not-found-exception
Apr 3, 2026
Merged

fix: handle FileNotFoundError when temporary file is missing in analysis workers#395
Pierlou merged 2 commits intomainfrom
fix/raise-tmpfile-not-found-exception

Conversation

@bolinocroustibat
Copy link
Copy Markdown
Contributor

@bolinocroustibat bolinocroustibat commented Mar 4, 2026

Following the incident between Feb 26 and Mar 4, where a duplicate worker was running and could not find temporary files, analysis jobs were failing with a FileNotFoundError.
This exception was not caught by the outer except (ParseException, IOException) block, so it propagated without the useful attributes we have for ExceptionWithSentryDetails.

Why not just add FileNotFoundError to the outer except tuple?

The outer handler calls handle_parse_exception(e, ...), which accesses e.step and e.sentry_event_id — attributes defined by our custom ExceptionWithSentryDetails base class (from which ParseException and IOException both inherit). A native FileNotFoundError has none of these attributes, so passing it directly to handle_parse_exception would raise an AttributeError.

Instead, we wrap the read_or_download_file call in each analysis module with an inner try/except FileNotFoundError that re-raises as an IOException (with explicit message and context). This keeps the outer error handling pipeline intact and consistent with how all other errors are already handled in these functions.

@bolinocroustibat bolinocroustibat self-assigned this Mar 4, 2026
Copy link
Copy Markdown
Contributor

@Pierlou Pierlou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! I am wondering whether we'd prefer to have the catch in read_or_download_file like:

if tmp_file:
   try:
      return open(file_path, "rb")
   except FileNotFoundError:
       ...

so that the logic is within the function. We should have the infos to pass on to Sentry in the check

…sis workers

Wrap read_or_download_file calls in csv, geojson and parquet analysis
with a FileNotFoundError catch that re-raises as IOException.

Made-with: Cursor
@bolinocroustibat bolinocroustibat force-pushed the fix/raise-tmpfile-not-found-exception branch from 3fc245e to ad5ce14 Compare March 4, 2026 11:11
@bolinocroustibat
Copy link
Copy Markdown
Contributor Author

bolinocroustibat commented Mar 4, 2026

Thanks for the fix! I am wondering whether we'd prefer to have the catch in read_or_download_file like:

if tmp_file:
try:
return open(file_path, "rb")
except FileNotFoundError:
...
so that the logic is within the function. We should have the infos to pass on to Sentry in the check

@Pierlou That would indeed avoid the repetition, but that would make read_or_download_file helper depending on a custom IOException, meaning we would have to import and know the custom exceptions of the project here. While maybe we would prefer to keep read_or_download_file easy to be used everywhere as it's a simple helper?

I don't know what to think, both approaches make sense

Copy link
Copy Markdown

@buzcu buzcu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe instead of repeating the same safeguards in a 3 similarly close place it is better to only have it in read_or_download_file. It increases readability. Best regards

@Pierlou Pierlou merged commit ce1e060 into main Apr 3, 2026
5 checks passed
@Pierlou Pierlou deleted the fix/raise-tmpfile-not-found-exception branch April 3, 2026 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants