Fix Ghidra comment read fidelity #199
Open
amattas wants to merge 2 commits into
_comments() read only EOL and PRE comments and let PRE silently overwrite EOL at the same address, and never read PLATE comments -- so function-level comments written by _set_comment could not be read back out. Read all five Ghidra comment types per code unit and join the populated ones, tagging entry-point PLATE comments with func_addr so they round-trip through _set_comment. Iterate functions directly for func_addr context and drop the now-unused __function_code_units helper.
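The difference between the old overwrite behavior and the merged read can be sketched in plain Python, without touching the Ghidra API. This is only an illustration: the slot names, the join order, the newline separator, the record layout, and every function name here are assumptions, not the code in this PR.

```python
# Hypothetical slot names standing in for Ghidra's five comment types.
# The order and the newline separator are assumptions for illustration.
SLOT_ORDER = ("plate", "pre", "eol", "post", "repeatable")


def overwrite_comment_slots(slots):
    """Mimic the old behavior: read only EOL and PRE, and let PRE win."""
    merged = None
    for name in ("eol", "pre"):
        if slots.get(name):
            merged = slots[name]
    return merged


def merge_comment_slots(slots, sep="\n"):
    """Join every populated slot in a fixed order instead of overwriting."""
    parts = []
    for name in SLOT_ORDER:
        text = slots.get(name)
        if text and text.strip():
            parts.append(text.strip())
    return sep.join(parts) if parts else None


def read_comment(addr, slots, func_entry=None):
    """Build a comment record; tag func_addr only for an entry-point plate."""
    text = merge_comment_slots(slots)
    if text is None:
        return None
    record = {"addr": addr, "comment": text, "func_addr": None}
    if slots.get("plate") and func_entry is not None and addr == func_entry:
        record["func_addr"] = func_entry
    return record


slots = {"plate": "parse header", "pre": "check magic", "eol": "len in rax"}
old_view = overwrite_comment_slots(slots)   # EOL and PLATE text are lost
new_view = read_comment(0x401000, slots, func_entry=0x401000)
```

In this sketch the old reader returns only the PRE text, while the merged reader keeps all three populated slots and marks the record as function-level because the plate comment sits on the function entry point.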
Problem
Ghidra stores several comment slots for each code unit: plate, pre, EOL, post, and repeatable comments. The current declib Ghidra reader only looks at a subset of those slots and collapses multiple comments at the same address by overwriting earlier text. In practice this loses comments when round-tripping state from Ghidra into clients such as BinSync.
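Two concrete failure modes this fixes:
- When an EOL and a PRE comment exist at the same address, the PRE text silently overwrites the EOL text.
- PLATE comments are never read, so function-level comments written by _set_comment cannot be read back.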
Solution
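A minimal sketch of one way the decompiled flag could be derived, assuming the rule is that a merged comment counts as decompiled only when PRE is the sole populated slot. The function name and the slot dictionary are hypothetical and are not taken from declib.

```python
def is_decompiled_comment(slots):
    """Return True only when PRE is the single populated comment slot.

    `slots` is a hypothetical mapping of slot name to comment text.
    Empty or whitespace-only entries are treated as unpopulated.
    """
    populated = {name for name, text in slots.items() if text and text.strip()}
    return populated == {"pre"}


pure_pre = is_decompiled_comment({"pre": "loop over entries"})
mixed = is_decompiled_comment({"pre": "loop over entries", "eol": "counter"})
```

Under this assumption a comment that mixes PRE text with any other slot stays on the disassembly side.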
- decompiled=True only when the merged comment is purely pseudocode/PRE text; mixed disassembly+pseudocode comments remain disassembly-side comments.
Verification
- PYTEST_ADDOPTS='-p no:cacheprovider' conda run -n ghidra python -m pytest tests/test_ghidra_comments.py -q -> 1 passed
- PYTHONPYCACHEPREFIX=/tmp/declib-pyc conda run -n ghidra python -m compileall -q declib tests
- git diff --check
Notes
This PR is independent of the BinSync PRs. It only changes how declib reads comments from Ghidra.