Add pure Python JSFuck, JJEncode, and AAEncode decoders #8
Merged
Re-implement JSFuck and JJEncode decoders without Node.js dependency. JSFuck decoder uses a recursive-descent parser with JS type coercion semantics to evaluate expressions and capture Function() arguments. JJEncode decoder extracts payloads via octal/hex escape pattern matching. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
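The type-coercion rules this commit relies on can be illustrated with a small sketch. This is not the PR's `_JSValue` code: `UNDEFINED`, `to_string`, `to_number`, and `js_add` are hypothetical names, and only the subset of JS semantics that JSFuck primitives exercise (arrays, booleans, `undefined`, integers, strings) is modelled.

```python
import math
import re

UNDEFINED = object()  # hypothetical sentinel standing in for JS `undefined`
_INT_RE = re.compile(r"[+-]?[0-9]+")


def to_string(value):
    """Approximate JS ToString for the value kinds JSFuck produces."""
    if value is UNDEFINED:
        return "undefined"
    if isinstance(value, bool):  # test bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, str):
        return value
    if isinstance(value, list):  # Array join: undefined elements become ""
        return ",".join("" if x is UNDEFINED else to_string(x) for x in value)
    if isinstance(value, float) and math.isnan(value):
        return "NaN"
    return str(value)  # plain integers; other floats are outside this sketch


def to_number(value):
    """Approximate JS ToNumber (integers and NaN only)."""
    if value is UNDEFINED:
        return math.nan
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, (int, float)):
        return value
    text = to_string(value).strip()  # arrays are converted via their string form
    if text == "":
        return 0
    return int(text) if _INT_RE.fullmatch(text) else math.nan


def js_add(left, right):
    """Approximate the JS binary `+` operator."""
    if isinstance(left, list):
        left = to_string(left)
    if isinstance(right, list):
        right = to_string(right)
    if isinstance(left, str) or isinstance(right, str):
        return to_string(left) + to_string(right)
    return to_number(left) + to_number(right)
```

Under these assumptions, the JSFuck idiom `(![]+[])[+[]]` corresponds to `js_add(False, [])[to_number([])]`, which selects the first character of `"false"`.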
…radix) The JJEncode decoder previously used naive regex matching for literal octal escapes, which failed on all real samples. Replaced with a symbol-table simulation that parses the 6-statement JJEncode structure, resolves property references token-by-token, and processes JS escape sequences. Achieves 100% decode rate on all 21 pure JJEncode samples. JSFuck decoder gains toString(radix) support via receiver tracking in the postfix parser, enabling expressions like (10)["toString"](36) → "a". Added recursion guard with sys.setrecursionlimit for deeply nested input. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
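For reference, the `toString(radix)` behaviour mentioned above can be approximated for integers as follows. `js_int_to_string` is a hypothetical helper written for illustration, not the decoder's actual implementation, and fractional numbers are deliberately not handled.

```python
import string

_DIGITS = string.digits + string.ascii_lowercase  # "0".."9" then "a".."z"


def js_int_to_string(value: int, radix: int = 10) -> str:
    """Mimic Number.prototype.toString(radix) for integer receivers."""
    if not 2 <= radix <= 36:
        # JS raises RangeError here; ValueError is the closest Python analogue.
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    remaining = abs(value)
    digits = []
    while remaining:
        remaining, digit = divmod(remaining, radix)
        digits.append(_DIGITS[digit])
    return sign + "".join(reversed(digits))


print(js_int_to_string(10, 36))  # a
```

JSFuck uses this trick to obtain letters that cannot be extracted from strings such as `"false"` or `"undefined"`.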
The deobfuscator now runs an outer generate→re-parse loop (up to 5 cycles) so that running it twice on the same input produces identical output. This fixes ConstantProp missing opportunities that only appear after re-parsing. Also removes dead code: jj_decode_via_eval (identical to jj_decode), unused _JSFUCK_RE regex, and unused import re in jsfuck_decode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
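The outer loop described in this commit is a bounded fixed-point iteration. A minimal sketch, assuming a single `one_pass` callable that parses, transforms, and regenerates the source (the names `run_to_fixed_point` and `one_pass` are hypothetical, not taken from `pyjsclear`):

```python
from typing import Callable

MAX_CYCLES = 5  # the "up to 5 cycles" bound from the commit message


def run_to_fixed_point(
    source: str,
    one_pass: Callable[[str], str],
    max_cycles: int = MAX_CYCLES,
) -> str:
    """Re-run one_pass until its output stops changing or the bound is hit."""
    for _ in range(max_cycles):
        result = one_pass(source)
        if result == source:  # nothing changed: a fixed point was reached
            break
        source = result
    return source
```

Output is idempotent only when a fixed point is reached within the bound; an input that still changes after `max_cycles` passes could produce different output on a second run.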
Replace GPL-derived jj_decode and aa_decode with clean-room implementations based on Yosuke Hasegawa's public encoding specs. Both decoders are purely iterative (no recursion) and integrate as pre-passes in the deobfuscator pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add webcrack MIT license file, update NOTICE and THIRD_PARTY_LICENSES with proper attribution for all upstream projects. Update README license section. Remove redundant deobfuscator_prepasses_test.py (tests already covered in deobfuscator_test.py). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3c2cad5 to 4b94d8e
…rative conversions Convert JSFuck parser from mutual recursion to iterative state machine with explicit value/continuation stacks, removing the sys.setrecursionlimit hack. Add depth-limited recursion and iterative paren stripping to JJEncode evaluator. Add RecursionError safety net in Deobfuscator.execute(). Revert traverser.py and parser.py to original recursive versions since esprima-bounded AST depth (~60 levels) makes them safe. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
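One way iterative paren stripping could work is sketched below. `strip_outer_parens` is a hypothetical name, and the sketch assumes parentheses never appear inside string literals, which the real evaluator would have to account for.

```python
def _match_of_first_paren(expr: str) -> int:
    """Return the index of the ')' closing the first '(' in expr, or -1."""
    depth = 0
    for index, char in enumerate(expr):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth == 0:
                return index
    return -1


def strip_outer_parens(expr: str) -> str:
    """Remove redundant enclosing parentheses with a loop instead of recursion."""
    expr = expr.strip()
    while len(expr) >= 2 and expr[0] == "(" and expr[-1] == ")":
        # Only strip when the first "(" is closed by the final ")";
        # "(a)+(b)" must be left alone.
        if _match_of_first_paren(expr) != len(expr) - 1:
            break
        expr = expr[1:-1].strip()
    return expr
```

Because the loop keeps no call frames, nesting deeper than Python's default recursion limit is handled without touching `sys.setrecursionlimit`. Each iteration rescans the string, so this version is quadratic in the nesting depth.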
…r handling
- AAEncode: replace U+30FC (fullwidth ー) with U+FF70 (halfwidth ｰ) in all replacement patterns; real AAEncode output uses the halfwidth variant
- JSFuck: exclude whitespace from the detection ratio and raise the threshold to 95% to avoid false positives on minified JS
- JSFuck/JJEncode: replace bare `except Exception` with specific exception types so unexpected errors propagate
- Document the single-arg limitation in _Parser._call
- Fix the RecursionError comment to accurately describe esprima as the source
- Restore test_recursive_deobfuscation, verifying the pre-pass→pipeline path
- Fix the AAEncode test to use the correct U+FF70 escape sequences
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
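The JSFuck detection heuristic in this commit can be sketched as a character-ratio check. The function name, the `>=` comparison, and the handling of empty input are assumptions; only the six-character JSFuck alphabet, the whitespace exclusion, and the 95% threshold come from the text above.

```python
JSFUCK_CHARS = frozenset("[]()!+")  # the six characters JSFuck is built from
THRESHOLD = 0.95                    # 95%, per the commit message


def looks_like_jsfuck(code: str, threshold: float = THRESHOLD) -> bool:
    """Return True when nearly all non-whitespace characters are JSFuck characters."""
    significant = [char for char in code if not char.isspace()]
    if not significant:
        return False  # assumption: empty or blank input is never JSFuck
    hits = sum(1 for char in significant if char in JSFUCK_CHARS)
    return hits / len(significant) >= threshold
```

Excluding whitespace keeps pretty-printed JSFuck from falling below the threshold, while the high threshold keeps ordinary minified code, which is dense in brackets but still contains identifiers, from being misclassified.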
…n safety Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
JS octal escapes allow at most 2 digits when the first digit is 4-7 (max \77 = 63), and 3 digits when the first digit is 0-3 (max \377 = 255). The decoder was greedily consuming up to 3 digits regardless, causing \401 to produce U+0101 instead of space + '1'. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
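The digit-count rule can be sketched as follows. `read_octal_escape` and `decode_octal_escapes` are hypothetical helpers for illustration; they handle only octal escapes and leave every other backslash sequence untouched.

```python
from typing import Tuple

OCTAL_DIGITS = "01234567"


def read_octal_escape(text: str, pos: int) -> Tuple[str, int]:
    """Decode one octal escape whose first digit is at text[pos].

    Returns the decoded character and the index just past the escape.
    """
    first = text[pos]
    if first not in OCTAL_DIGITS:
        raise ValueError("not an octal escape")
    # First digit 0-3 allows three digits (max \377); 4-7 allows two (max \77).
    max_len = 3 if first in "0123" else 2
    end = pos + 1
    while end < len(text) and end - pos < max_len and text[end] in OCTAL_DIGITS:
        end += 1
    return chr(int(text[pos:end], 8)), end


def decode_octal_escapes(text: str) -> str:
    """Replace every backslash-octal escape in text with its character."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in OCTAL_DIGITS:
            char, i = read_octal_escape(text, i + 1)
            out.append(char)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(repr(decode_octal_escapes("\\401")))  # ' 1'
```

With the length cap in place, `\401` is read as `\40` (a space) followed by a literal `1`, rather than as the single code point U+0101.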
…n safety Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Replace Node.js-dependent decoders with pure Python implementations and improve the deobfuscator pipeline:
- … `_JSValue`), resolves `Function()` constructor chains to extract payloads
- … `sys.setrecursionlimit` guards to handle deeply nested inputs without crashing
- … `licenses/LICENSE-webcrack`, updated NOTICE and THIRD_PARTY_LICENSES for all upstream projects

Key changes
- `pyjsclear/transforms/jsfuck_decode.py`
- `pyjsclear/transforms/jj_decode.py`
- `pyjsclear/transforms/aa_decode.py`
- `pyjsclear/deobfuscator.py`
- `licenses/`, `NOTICE`, `THIRD_PARTY_LICENSES.md`

Test plan
- … `subprocess` or `shutil` references in `pyjsclear/`; fully standalone

🤖 Generated with Claude Code