Fix Ghidra comment read fidelity #199

Open
amattas wants to merge 2 commits into binsync:main from amattas:pr/ghidra-comment-read-fidelity

amattas commented Jun 15, 2026

Problem

Ghidra stores several comment slots for each code unit: plate, pre, EOL, post, and repeatable comments. The current declib Ghidra reader only looks at a subset of those slots and collapses multiple comments at the same address by overwriting earlier text. In practice this loses comments when round-tripping state from Ghidra into clients such as BinSync.

Two concrete failure modes this fixes:

  • EOL and PRE comments at the same address are not both preserved.
  • PLATE, POST, and REPEATABLE comments are not represented when reading comments back from Ghidra.

Solution

  • Read all populated Ghidra comment slots in Ghidra display order.
  • Preserve multiple comments at one address by joining them with explicit slot labels.
  • Set decompiled=True only when the merged comment is purely pseudocode/PRE text; mixed disassembly+pseudocode comments remain disassembly-side comments.
  • Add a Ghidra import-stubbed unit test for slot mapping and mixed comment behavior.
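The merge rule described in the bullets above can be sketched in plain Python, without any Ghidra dependency. This is an illustration only, not the code in this PR: the `MergedComment` type, the `merge_slots` helper, the `[SLOT]` label format, the exact slot order, and the choice to leave a lone comment unlabeled are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Assumed display order for the five Ghidra comment slots (hypothetical choice).
SLOT_ORDER = ("PLATE", "PRE", "EOL", "REPEATABLE", "POST")


@dataclass
class MergedComment:
    """Hypothetical container for one merged comment at an address."""
    addr: int
    text: str
    decompiled: bool


def merge_slots(addr: int, slots: Dict[str, Optional[str]]) -> Optional[MergedComment]:
    """Join every populated slot at `addr` into a single comment.

    `slots` maps a slot name to its text; missing or empty slots are skipped.
    """
    populated = [(name, slots[name]) for name in SLOT_ORDER if slots.get(name)]
    if not populated:
        return None

    if len(populated) == 1:
        # A lone comment keeps its text as-is (assumption); only a lone PRE
        # comment counts as a pure pseudocode comment.
        name, text = populated[0]
        return MergedComment(addr, text, decompiled=(name == "PRE"))

    # Several slots at one address: label each so none overwrites another.
    # A mixed comment stays on the disassembly side.
    text = "\n".join(f"[{name}] {body}" for name, body in populated)
    return MergedComment(addr, text, decompiled=False)
```

With this sketch, `merge_slots(0x1000, {"EOL": "a", "PRE": "b"})` keeps both texts and leaves `decompiled` false, while a PRE-only address yields `decompiled=True`.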

Verification

  • PYTEST_ADDOPTS='-p no:cacheprovider' conda run -n ghidra python -m pytest tests/test_ghidra_comments.py -q -> 1 passed
  • PYTHONPYCACHEPREFIX=/tmp/declib-pyc conda run -n ghidra python -m compileall -q declib tests
  • git diff --check
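For readers unfamiliar with import-stubbed tests, the general pattern is to register placeholder modules in `sys.modules` before importing the code under test, so the test can run without a Ghidra installation. The sketch below shows that pattern only; it is not the contents of tests/test_ghidra_comments.py. The `install_ghidra_stubs` helper is hypothetical, and the assumption that the reader uses the `CodeUnit` comment constants (with these integer values) should be checked against the Ghidra version in use.

```python
import sys
import types


def install_ghidra_stubs() -> types.ModuleType:
    """Register fake `ghidra.*` packages so Ghidra imports resolve in tests.

    Returns the stubbed `ghidra.program.model.listing` module.
    """
    parent = None
    for name in (
        "ghidra",
        "ghidra.program",
        "ghidra.program.model",
        "ghidra.program.model.listing",
    ):
        mod = types.ModuleType(name)
        mod.__path__ = []  # mark the stub as a package so submodules resolve
        sys.modules[name] = mod
        if parent is not None:
            # Expose the child as an attribute of its parent package.
            setattr(parent, name.rsplit(".", 1)[1], mod)
        parent = mod

    class CodeUnit:
        # Assumed to mirror Ghidra's CodeUnit comment-type constants.
        EOL_COMMENT = 0
        PRE_COMMENT = 1
        POST_COMMENT = 2
        PLATE_COMMENT = 3
        REPEATABLE_COMMENT = 4

    parent.CodeUnit = CodeUnit
    return parent


# Install the stubs first, then import as the code under test would.
install_ghidra_stubs()
from ghidra.program.model.listing import CodeUnit  # noqa: E402
```

The stubs have to be installed before the module under test is imported, because its top-level Ghidra imports run at import time.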

Notes

This PR is independent of the BinSync PRs. It only changes how declib reads comments from Ghidra.

amattas added 2 commits June 14, 2026 23:11
_comments() read only EOL and PRE comments, let PRE silently overwrite
EOL at the same address, and never read PLATE comments, so
function-level comments written by _set_comment could not be read
back. Read all five Ghidra comment types per code unit and join the
populated ones, tagging entry-point PLATE comments with func_addr so
they round-trip through _set_comment. Iterate functions directly for
func_addr context and drop the now-unused __function_code_units helper.