Conversation
There was a problem hiding this comment.
Thanks for the fix! I am wondering whether we'd prefer to have the catch in read_or_download_file like:
if tmp_file:
try:
return open(file_path, "rb")
except FileNotFoundError:
...so that the logic is within the function. We should have the infos to pass on to Sentry in the check
…sis workers Wrap read_or_download_file calls in csv, geojson and parquet analysis with a FileNotFoundError catch that re-raises as IOException. Made-with: Cursor
3fc245e to
ad5ce14
Compare
@Pierlou That would indeed avoid the repetition, but that would make I don't know what to think, both approaches make sense |
buzcu
left a comment
There was a problem hiding this comment.
I believe instead of repeating the same safeguards in a 3 similarly close place it is better to only have it in read_or_download_file. It increases readability. Best regards
Following the incident between Feb 26 and Mar 4, where a duplicate worker was running and could not find temporary files, analysis jobs were failing with a
FileNotFoundError.This exception was not caught by the outer
except (ParseException, IOException)block, so it propagated without the useful attributes we have forExceptionWithSentryDetails.Why not just add
FileNotFoundErrorto the outerexcepttuple?The outer handler calls
handle_parse_exception(e, ...), which accessese.stepande.sentry_event_id— attributes defined by our customExceptionWithSentryDetailsbase class (from whichParseExceptionandIOExceptionboth inherit). A nativeFileNotFoundErrorhas none of these attributes, so passing it directly tohandle_parse_exceptionwould raise anAttributeError.Instead, we wrap the
read_or_download_filecall in each analysis module with an innertry/except FileNotFoundErrorthat re-raises as anIOException(with explicit message and context). This keeps the outer error handling pipeline intact and consistent with how all other errors are already handled in these functions.