While looking through fossology/report.py, I noticed that the regex used to extract the report ID can silently return an empty string when the API message does not end with digits.
Current code:
report_id = re.search("[0-9]*$", response.json()["message"])
return report_id[0]
Since * allows zero repetitions, [0-9]*$ always succeeds: when there are no trailing digits, it matches the empty string at the end of the message. In those cases, the function returns "" instead of raising an error.
Impact
If the API response format changes or an unexpected message is returned, generate_report silently returns an empty string:
''
That value may later get passed into download_report, resulting in confusing downstream errors (404s, invalid requests, etc.) instead of failing early with a clearer parsing error.
Minimal reproduction
import re
message = "Report has been queued."
result = re.search("[0-9]*$", message)
print(repr(result[0]))
Current output:
''
With the normal API response format this works because the message happens to end with digits:
"Report will be generated in the back ground, report id is 42"
but the parsing is fragile if the format changes.
Proposed fix
Require at least one trailing digit and raise an error if parsing fails:
match = re.search(r"[0-9]+$", response.json()["message"])
if not match:
    raise FossologyApiError(
        f"Could not parse report ID from response: {response.json()['message']}",
        response,
    )
return match[0]
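To make the intended behavior easy to check outside the library, here is a minimal standalone sketch of the same parsing logic. The helper name parse_report_id and the ReportIdParseError class are hypothetical and exist only for this example; the actual fix would raise FossologyApiError as shown above.

```python
import re


class ReportIdParseError(ValueError):
    """Hypothetical stand-in for FossologyApiError, used only in this sketch."""


def parse_report_id(message: str) -> str:
    """Return the trailing run of digits in message, or raise if there is none."""
    # "+" requires at least one digit, so a message without a trailing ID gives None.
    match = re.search(r"[0-9]+$", message)
    if match is None:
        raise ReportIdParseError(
            f"Could not parse report ID from response: {message}"
        )
    return match[0]


# The normal message format still yields the ID.
print(parse_report_id("Report will be generated in the back ground, report id is 42"))

# A message without trailing digits now fails early instead of returning "".
try:
    parse_report_id("Report has been queued.")
except ReportIdParseError as err:
    print(f"parse failed: {err}")
```

With this shape, the unexpected message surfaces as a parsing error at the call site rather than as an empty ID passed further down.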
Questions ⭐️
- Is the API response message guaranteed to always end with a numeric report ID across Fossology server versions?
- Would returning the report ID as int instead of str make sense here for stronger type safety?
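For the second question, this is roughly what an int-returning variant could look like. It is only a sketch: the function name parse_report_id_as_int is hypothetical, a plain ValueError stands in for the library's own exception, and changing the return type would presumably affect existing callers that expect a str.

```python
import re


def parse_report_id_as_int(message: str) -> int:
    """Return the trailing digits of message as an int, or raise ValueError."""
    match = re.search(r"[0-9]+$", message)
    if match is None:
        # ValueError is a placeholder here, not the library's exception type.
        raise ValueError(f"Could not parse report ID from response: {message}")
    # int() normalizes the value, so any leading zeros are dropped.
    return int(match[0])


report_id = parse_report_id_as_int(
    "Report will be generated in the back ground, report id is 42"
)
print(report_id, type(report_id).__name__)
```

One trade-off is that the conversion is lossy for IDs with leading zeros, which may or may not matter depending on how the server formats them.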
Environment:
- fossology-python: 3.5.0
- Python: 3.10+
I would like to work on this fix and open a PR for it, if that sounds good.