Normalize UTF-8 encoding across environments#305
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughSet UTF-8 process env vars in Changes
Sequence Diagram(s)(Skipped — changes are encoding/error-handling focused and do not introduce a new multi-component control flow.) Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cli.py`:
- Line 5: The file imports the unused symbol "os" in cli.py; remove the unused
import statement (delete the line "import os") so the entrypoint only imports
and calls main() from fosslight_dependency.cli; ensure no other code in cli.py
references os and that any module needing os imports it locally.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
70ebb94 to
bae4bf2
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/fosslight_dependency/_package_manager.py (1)
120-125:⚠️ Potential issue | 🟠 Major
check_output()return value is being compared to an integer, creating unreachable code paths.Lines 121 and 134 compare the
subprocess.check_output()result against0, but this function returns decoded stdout (a string) on success and raises an exception on failure—it never returns an exit code. These comparisons therefore produce incorrect logic:
- Line 121:
if ret != 0:always evaluates to True (string ≠ 0), making the else branch unreachable- Line 134:
if ret == 0:always evaluates to False (string ≠ 0), making that error handler dead codeThis can leave empty or invalid Gradle output processed as success.
Proposed fix
ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret != 0: + if ret.strip(): self.parse_dependency_tree(ret) else: self.set_direct_dependencies(False) logger.warning(f"Fail to run {cmd}")ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret == 0: - ret_task = False - logger.error(f'Fail to run {cmd}') if os.path.isfile(self.input_file_name):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/_package_manager.py` around lines 120 - 125, The code incorrectly treats subprocess.check_output() like it returns an exit code; update the call in the method that invokes subprocess.check_output(cmd, ...) to either (A) use subprocess.run(cmd, shell=True, capture_output=True, text=True) and then check result.returncode == 0, call self.parse_dependency_tree(result.stdout) on success and self.set_direct_dependencies(False) with a logger.warning including result.stderr on failure, or (B) wrap subprocess.check_output in a try/except subprocess.CalledProcessError: on success pass the returned stdout string to self.parse_dependency_tree(ret), and in the except branch call self.set_direct_dependencies(False) and log the exception; ensure you remove the comparisons against 0 (the symbols to change are the check_output call and the branches that call self.parse_dependency_tree and self.set_direct_dependencies).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fosslight_dependency/package_manager/Pypi.py`:
- Around line 209-215: Replace the raw bytes handling of subprocess output:
update the subprocess.run calls that assign to cmd_ret to pass text=True,
encoding='utf-8', errors='replace' so stderr is returned as a decoded string
(tolerating non-UTF-8 sequences), and then remove the manual .decode('utf-8')
calls; also update subsequent checks (e.g., the
stderr.lower().startswith('error:') and any err_msg construction using
cmd_ret.stderr) to operate on the string returned in cmd_ret.stderr. Ensure you
apply this change for both subprocess.run invocations that produce cmd_ret and
where cmd_ret.stderr is decoded/inspected.
---
Outside diff comments:
In `@src/fosslight_dependency/_package_manager.py`:
- Around line 120-125: The code incorrectly treats subprocess.check_output()
like it returns an exit code; update the call in the method that invokes
subprocess.check_output(cmd, ...) to either (A) use subprocess.run(cmd,
shell=True, capture_output=True, text=True) and then check result.returncode ==
0, call self.parse_dependency_tree(result.stdout) on success and
self.set_direct_dependencies(False) with a logger.warning including
result.stderr on failure, or (B) wrap subprocess.check_output in a try/except
subprocess.CalledProcessError: on success pass the returned stdout string to
self.parse_dependency_tree(ret), and in the except branch call
self.set_direct_dependencies(False) and log the exception; ensure you remove the
comparisons against 0 (the symbols to change are the check_output call and the
branches that call self.parse_dependency_tree and self.set_direct_dependencies).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 776811aa-6059-4b50-a058-009e3c68cdb8
📒 Files selected for processing (5)
cli.pysrc/fosslight_dependency/_package_manager.pysrc/fosslight_dependency/package_manager/Npm.pysrc/fosslight_dependency/package_manager/Pypi.pysrc/fosslight_dependency/package_manager/Yarn.py
✅ Files skipped from review due to trivial changes (1)
- cli.py
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/fosslight_dependency/_package_manager.py (1)
120-125:⚠️ Potential issue | 🔴 CriticalFix incorrect string/int comparisons in subprocess.check_output() calls.
subprocess.check_output()returns stdout as a string and raises on non-zero exit. Comparing the return value to integers (lines 121 and 134) is semantically incorrect:
- Line 121:
if ret != 0:is always True for non-empty output, making the else clause (lines 124–125) unreachable dead code.- Line 134:
if ret == 0:is always False, making that entire error branch (lines 135–136) unreachable.For
allDeps, check if output exists:if ret.strip():instead ofif ret != 0:.
ForgenerateLicenseTxt, usesubprocess.check_call()instead ofcheck_output()since the output is never used.Proposed fixes
ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret != 0: + if ret.strip(): self.parse_dependency_tree(ret) else: self.set_direct_dependencies(False) - logger.warning(f"Fail to run {cmd}") + logger.warning(f"No output from {cmd}")- ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret == 0: - ret_task = False - logger.error(f'Fail to run {cmd}') + subprocess.check_call(cmd, shell=True)Also applies to: 133–136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/_package_manager.py` around lines 120 - 125, In allDeps where subprocess.check_output(cmd, ...) is used, replace the integer comparison with a truthy check on the returned string (e.g., if ret.strip(): call self.parse_dependency_tree(ret) else call self.set_direct_dependencies(False) and log the failure); in generateLicenseTxt replace subprocess.check_output with subprocess.check_call (since output isn't used) and add proper exception handling for subprocess.CalledProcessError to run the error branch (logger.warning and set_direct_dependencies(False)); update any checks that currently compare ret == 0 or ret != 0 to use ret.strip() or try/except as appropriate (references: methods allDeps, generateLicenseTxt, parse_dependency_tree, set_direct_dependencies).
♻️ Duplicate comments (1)
src/fosslight_dependency/package_manager/Pypi.py (1)
209-215:⚠️ Potential issue | 🟠 MajorLet
subprocess.run()decode stderr safely.Lines 213 and 231 still do strict
.decode('utf-8')on raw stderr bytes. A single invalid byte fromvirtualenvorpipwill raiseUnicodeDecodeErrorand hide the real failure. Decode insubprocess.run()instead withtext=True, encoding='utf-8', errors='replace'.🐛 Proposed fix
- cmd_ret = subprocess.run(cmd, shell=True, stderr=subprocess.PIPE) + cmd_ret = subprocess.run( + cmd, + shell=True, + stderr=subprocess.PIPE, + text=True, + encoding='utf-8', + errors='replace', + ) if cmd_ret.returncode != 0: ret = False err_msg = f"return code({cmd_ret.returncode})" - elif cmd_ret.stderr.decode('utf-8').strip().lower().startswith('error:'): + elif cmd_ret.stderr.strip().lower().startswith('error:'): ret = False err_msg = f"stderr msg({cmd_ret.stderr})" @@ - cmd_ret = subprocess.run(cmd, shell=True, stderr=subprocess.PIPE) + cmd_ret = subprocess.run( + cmd, + shell=True, + stderr=subprocess.PIPE, + text=True, + encoding='utf-8', + errors='replace', + ) if cmd_ret.returncode != 0: ret = False err_msg = f"return code({cmd_ret.returncode})" - elif cmd_ret.stderr.decode('utf-8').strip().lower().startswith('error:'): + elif cmd_ret.stderr.strip().lower().startswith('error:'): ret = False err_msg = f"stderr msg({cmd_ret.stderr})"#!/bin/bash python - <<'PY' bad = b'\xfferror: failed' try: bad.decode('utf-8') except UnicodeDecodeError as e: print(type(e).__name__) print(bad.decode('utf-8', errors='replace')) PYAlso applies to: 227-233
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/package_manager/Pypi.py` around lines 209 - 215, The code currently calls subprocess.run(...) and later does cmd_ret.stderr.decode('utf-8') which can raise UnicodeDecodeError; update the subprocess.run calls that assign to cmd_ret to pass text=True, encoding='utf-8', errors='replace' so stderr is already a safe string, then remove explicit .decode(...) usages (use cmd_ret.stderr directly); apply this change for the subprocess.run usages around the cmd_ret variable in Pypi.py (the blocks checking cmd_ret.returncode and the stderr.startswith('error:') checks).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fosslight_dependency/package_manager/Npm.py`:
- Around line 149-159: The failure branches for subprocess invocation in
Npm.parse_direct_dependencies leave err_msg empty, so callers lose the stderr
reason; update the block handling subprocess.run (the result variable) to set
err_msg to a descriptive string (e.g., include result.stderr and optionally
result.returncode) when no output or when result.returncode > 1 (in the same
places where ret is set to False) so that parse_direct_dependencies() / Yarn
caller receive the stderr-based error message; ensure existing debug branch for
returncode == 1 remains unchanged.
In `@src/fosslight_dependency/package_manager/Pypi.py`:
- Around line 274-277: The subprocess.call invocations in Pypi.py that write to
tmp_list.txt and tmp_pip_license_output.json rely on external env vars for
UTF-8; update both subprocess.call calls to pass an explicit env built from
os.environ with PYTHONUTF8='1' and PYTHONIOENCODING='utf-8' (e.g., env =
dict(os.environ, PYTHONUTF8='1', PYTHONIOENCODING='utf-8')) and pass that env to
subprocess.call so the child process always emits UTF-8 output before the files
are reopened with encoding='utf-8'.
---
Outside diff comments:
In `@src/fosslight_dependency/_package_manager.py`:
- Around line 120-125: In allDeps where subprocess.check_output(cmd, ...) is
used, replace the integer comparison with a truthy check on the returned string
(e.g., if ret.strip(): call self.parse_dependency_tree(ret) else call
self.set_direct_dependencies(False) and log the failure); in generateLicenseTxt
replace subprocess.check_output with subprocess.check_call (since output isn't
used) and add proper exception handling for subprocess.CalledProcessError to run
the error branch (logger.warning and set_direct_dependencies(False)); update any
checks that currently compare ret == 0 or ret != 0 to use ret.strip() or
try/except as appropriate (references: methods allDeps, generateLicenseTxt,
parse_dependency_tree, set_direct_dependencies).
---
Duplicate comments:
In `@src/fosslight_dependency/package_manager/Pypi.py`:
- Around line 209-215: The code currently calls subprocess.run(...) and later
does cmd_ret.stderr.decode('utf-8') which can raise UnicodeDecodeError; update
the subprocess.run calls that assign to cmd_ret to pass text=True,
encoding='utf-8', errors='replace' so stderr is already a safe string, then
remove explicit .decode(...) usages (use cmd_ret.stderr directly); apply this
change for the subprocess.run usages around the cmd_ret variable in Pypi.py (the
blocks checking cmd_ret.returncode and the stderr.startswith('error:') checks).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 0f53914a-94a9-4186-8386-c0be38a11ac5
📒 Files selected for processing (5)
cli.pysrc/fosslight_dependency/_package_manager.pysrc/fosslight_dependency/package_manager/Npm.pysrc/fosslight_dependency/package_manager/Pypi.pysrc/fosslight_dependency/package_manager/Yarn.py
✅ Files skipped from review due to trivial changes (1)
- cli.py
Signed-off-by: woocheol <jayden6659@gmail.com>
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/fosslight_dependency/_package_manager.py (2)
133-144:⚠️ Potential issue | 🔴 CriticalError handling is broken —
ret == 0is alwaysFalsefor string output.Same issue as above:
subprocess.check_output(..., encoding='utf-8')returns a string. The conditionif ret == 0:(line 134) compares a string to an integer, which is alwaysFalse. This means:
- Lines 135-136 (
ret_task = Falseand error log) will never execute- Command failures that raise
CalledProcessErrorare caught at line 142, but that's the only failure path now- If the command succeeds but produces unexpected output, no error is logged
The logic appears inverted from the original intent — it seems like the condition should trigger on failure, not success.
Proposed fix
if ret_plugin: cmd = f"{cmd_gradle} generateLicenseTxt" try: ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret == 0: - ret_task = False - logger.error(f'Fail to run {cmd}') + # check_output raises CalledProcessError on non-zero exit + # If we reach here, command succeeded if os.path.isfile(self.input_file_name): logger.info('Automatically run android-dependency-scanning plugin and generate output.') self.plugin_auto_run = True else: + ret_task = False logger.warning('Automatically run android-dependency-scanning plugin, but fail to generate output.') except Exception as e: logger.error(f'Fail to run {cmd}: {e}') ret_task = False🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/_package_manager.py` around lines 133 - 144, The current check compares the string returned by subprocess.check_output to 0, which never matches; replace the call to subprocess.check_output(...) with subprocess.run(..., shell=True, capture_output=True, encoding='utf-8') (or otherwise obtain a numeric returncode), then test if result.returncode != 0 to set ret_task = False and log the error (include result.stdout/result.stderr for context); still keep the existing exception handler, and afterwards use os.path.isfile(self.input_file_name) to set self.plugin_auto_run and log success/failure as before.
120-128:⚠️ Potential issue | 🟠 Major
subprocess.check_outputreturns decoded output, not a return code — comparison is incorrect.
subprocess.check_output(..., encoding='utf-8')returns the command's stdout as a string, not an exit code. The conditionif ret != 0:at line 121 compares a string to an integer, which is alwaysTruefor any non-empty output. This accidentally works becauseparse_dependency_tree(ret)receives the output string as intended.However, the
elsebranch (lines 123-125) that logs failure will never execute becauseret != 0is always truthy for non-empty strings. If the command succeeds but produces empty output, the logic will incorrectly log a failure.Also note:
check_outputraisesCalledProcessErroron non-zero exit, so theret == 0check is unreachable for failed commands anyway.Proposed fix
try: ret = subprocess.check_output(cmd, shell=True, encoding='utf-8') - if ret != 0: - self.parse_dependency_tree(ret) - else: - self.set_direct_dependencies(False) - logger.warning(f"Fail to run {cmd}") + if ret and ret.strip(): + self.parse_dependency_tree(ret) + else: + self.set_direct_dependencies(False) + logger.warning(f"No output from {cmd}") except Exception as e: self.set_direct_dependencies(False) logger.warning(f"Cannot print 'depends on' information. (fail {cmd}: {e})")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/_package_manager.py` around lines 120 - 128, The code incorrectly treats subprocess.check_output's return value as an exit code; replace that logic in the block using subprocess.check_output(...) with one of two fixes: either call subprocess.run(cmd, shell=True, encoding='utf-8', stdout=subprocess.PIPE, stderr=subprocess.PIPE, check=False) and then check run_result.returncode to decide whether to call parse_dependency_tree(run_result.stdout) or set_direct_dependencies(False) and log the failure via logger.warning(f"Fail to run {cmd}: {run_result.stderr or run_result.returncode}"), or keep check_output and handle CalledProcessError in an except CalledProcessError as e block where on success you call parse_dependency_tree(ret) and on failure you call set_direct_dependencies(False) and log e.output/e.returncode; ensure references to parse_dependency_tree(...) and set_direct_dependencies(False) remain and include the command and error details in logger.warning.
♻️ Duplicate comments (1)
src/fosslight_dependency/package_manager/Pypi.py (1)
209-215:⚠️ Potential issue | 🟠 MajorStrict UTF-8 decode on stderr can raise
UnicodeDecodeError.Lines 213 and 231 decode stderr with strict UTF-8:
cmd_ret.stderr.decode('utf-8'). Ifvirtualenvorpipemit non-UTF-8 bytes (e.g., locale-dependent error messages), this will raiseUnicodeDecodeErrorand mask the original subprocess failure.Proposed fix
- cmd_ret = subprocess.run(cmd, shell=True, stderr=subprocess.PIPE) + cmd_ret = subprocess.run(cmd, shell=True, stderr=subprocess.PIPE, text=True, encoding='utf-8', errors='replace') if cmd_ret.returncode != 0: ret = False err_msg = f"return code({cmd_ret.returncode})" - elif cmd_ret.stderr.decode('utf-8').strip().lower().startswith('error:'): + elif cmd_ret.stderr.strip().lower().startswith('error:'): ret = False err_msg = f"stderr msg({cmd_ret.stderr})"Apply the same change to the second
subprocess.runat line 227.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/fosslight_dependency/package_manager/Pypi.py` around lines 209 - 215, The code decodes subprocess stderr strictly with cmd_ret.stderr.decode('utf-8') which can raise UnicodeDecodeError; update the subprocess handling in Pypi.py to decode stderr using a tolerant codec (e.g., cmd_ret.stderr.decode('utf-8', errors='replace') or errors='ignore') before calling .strip().lower(), and apply the same fix to the other subprocess.run call that also inspects cmd_ret.stderr; ensure you build err_msg from the safely decoded stderr when setting err_msg and when checking for stderr starting with 'error:'.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/fosslight_dependency/_package_manager.py`:
- Around line 133-144: The current check compares the string returned by
subprocess.check_output to 0, which never matches; replace the call to
subprocess.check_output(...) with subprocess.run(..., shell=True,
capture_output=True, encoding='utf-8') (or otherwise obtain a numeric
returncode), then test if result.returncode != 0 to set ret_task = False and log
the error (include result.stdout/result.stderr for context); still keep the
existing exception handler, and afterwards use
os.path.isfile(self.input_file_name) to set self.plugin_auto_run and log
success/failure as before.
- Around line 120-128: The code incorrectly treats subprocess.check_output's
return value as an exit code; replace that logic in the block using
subprocess.check_output(...) with one of two fixes: either call
subprocess.run(cmd, shell=True, encoding='utf-8', stdout=subprocess.PIPE,
stderr=subprocess.PIPE, check=False) and then check run_result.returncode to
decide whether to call parse_dependency_tree(run_result.stdout) or
set_direct_dependencies(False) and log the failure via logger.warning(f"Fail to
run {cmd}: {run_result.stderr or run_result.returncode}"), or keep check_output
and handle CalledProcessError in an except CalledProcessError as e block where
on success you call parse_dependency_tree(ret) and on failure you call
set_direct_dependencies(False) and log e.output/e.returncode; ensure references
to parse_dependency_tree(...) and set_direct_dependencies(False) remain and
include the command and error details in logger.warning.
---
Duplicate comments:
In `@src/fosslight_dependency/package_manager/Pypi.py`:
- Around line 209-215: The code decodes subprocess stderr strictly with
cmd_ret.stderr.decode('utf-8') which can raise UnicodeDecodeError; update the
subprocess handling in Pypi.py to decode stderr using a tolerant codec (e.g.,
cmd_ret.stderr.decode('utf-8', errors='replace') or errors='ignore') before
calling .strip().lower(), and apply the same fix to the other subprocess.run
call that also inspects cmd_ret.stderr; ensure you build err_msg from the safely
decoded stderr when setting err_msg and when checking for stderr starting with
'error:'.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7c9b2ead-b1ef-488f-963e-3080408ea05a
📒 Files selected for processing (5)
cli.pysrc/fosslight_dependency/_package_manager.pysrc/fosslight_dependency/package_manager/Npm.pysrc/fosslight_dependency/package_manager/Pypi.pysrc/fosslight_dependency/package_manager/Yarn.py
✅ Files skipped from review due to trivial changes (1)
- cli.py
|
@woocheol-lge , description 어떤 이슈로 수정한 패치인지 보완 부탁드립니다. |
Description
Normalize UTF-8 encoding across environments