
Add pure Python JSFuck, JJEncode, and AAEncode decoders #8

Merged: itamarga merged 10 commits into main from feat/pure-python-jsfuck-jjencode on Mar 13, 2026

@itamarga itamarga commented Mar 12, 2026

Summary

Replace Node.js-dependent decoders with pure Python implementations and improve the deobfuscator pipeline:

  • JSFuck decoder — recursive-descent parser/evaluator with JS type coercion semantics (_JSValue), resolves Function() constructor chains to extract payloads
  • JJEncode decoder — clean-room rewrite (Apache-2.0) that handles octal/hex escape extraction, real-world variable naming, and multi-layer encoding
  • AAEncode decoder — clean-room rewrite (Apache-2.0) fixing U+30FC/U+FF70 Katakana long vowel handling
  • Outer re-parse loop — after AST transforms, generates code and re-parses to catch changes that only become visible after regeneration (max 5 cycles)
  • Recursion safety — custom parsers (JSFuck, JJEncode) use sys.setrecursionlimit guards to handle deeply nested inputs without crashing
  • License attribution — added licenses/LICENSE-webcrack, updated NOTICE and THIRD_PARTY_LICENSES for all upstream projects
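As an illustration of the JS type coercion semantics the JSFuck bullet refers to, here is a minimal sketch of how `+`, `!`, and string/number conversion behave on the handful of values JSFuck builds from. It is not the PR's `_JSValue` class: the names `UNDEFINED`, `to_js_string`, `to_js_number`, `js_not`, and `js_add` are hypothetical, and only arrays, booleans, `undefined`, strings, and plain decimal numbers are modelled.

```python
import math
import re

UNDEFINED = object()  # hypothetical sentinel standing in for JS `undefined`


def to_js_string(value):
    """Approximate JS ToString for the small value set used here."""
    if value is UNDEFINED:
        return "undefined"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, list):
        # Array.prototype.join renders undefined elements as "".
        return ",".join("" if v is UNDEFINED else to_js_string(v) for v in value)
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        if value.is_integer():
            return str(int(value))
        return repr(value)
    if isinstance(value, int):
        return str(value)
    return value  # already a string


def to_js_number(value):
    """Approximate JS ToNumber (decimal literals only; no hex or exponents)."""
    if value is UNDEFINED:
        return math.nan
    if isinstance(value, bool):
        return 1 if value else 0
    if isinstance(value, (int, float)):
        return value
    text = to_js_string(value).strip()
    if text == "":
        return 0
    if re.fullmatch(r"[+-]?\d+", text):
        return int(text)
    if re.fullmatch(r"[+-]?(\d+\.\d*|\.\d+)", text):
        return float(text)
    return math.nan


def js_truthy(value):
    if value is UNDEFINED:
        return False
    if isinstance(value, list):
        return True  # every object, including [], is truthy in JS
    if isinstance(value, float) and math.isnan(value):
        return False
    return bool(value)


def js_not(value):
    return not js_truthy(value)


def js_add(left, right):
    """Binary `+`: arrays become strings first, then string concat wins."""
    if isinstance(left, list):
        left = to_js_string(left)
    if isinstance(right, list):
        right = to_js_string(right)
    if isinstance(left, str) or isinstance(right, str):
        return to_js_string(left) + to_js_string(right)
    return to_js_number(left) + to_js_number(right)


# (![] + [])[+[]] evaluates to "f" in JavaScript.
false_string = js_add(js_not([]), [])        # "false"
print(false_string[to_js_number([])])        # f
```

With these rules, `!![] + !![]` becomes `js_add(True, True)`, which is `2`, and `[][[]] + []` becomes `js_add(UNDEFINED, [])`, which is `"undefined"`. Those are the building blocks JSFuck uses to spell out arbitrary characters.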

Key changes

| File | What changed |
| --- | --- |
| pyjsclear/transforms/jsfuck_decode.py | New pure Python JSFuck decoder (593 lines) |
| pyjsclear/transforms/jj_decode.py | New clean-room JJEncode decoder (718 lines) |
| pyjsclear/transforms/aa_decode.py | Clean-room rewrite with Katakana bugfix |
| pyjsclear/deobfuscator.py | Pre-pass ordering (JSFuck → AAEncode → JJEncode → eval-packer), outer re-parse loop |
| licenses/, NOTICE, THIRD_PARTY_LICENSES.md | License attribution updates |

Test plan

  • 45 JSFuck tests: detection, JS coercion semantics, tokenizer, parser, constructor chain resolution
  • 11 JJEncode tests: detection, octal/hex extraction, escape over-consumption fix
  • 9 AAEncode tests: detection, Katakana edge cases, U+30FC/U+FF70 handling
  • Deobfuscator integration tests for all pre-passes
  • Full test suite passes (1741 tests)
  • No subprocess or shutil references in pyjsclear/ — fully standalone
  • No GPL code in source; license audit clean

🤖 Generated with Claude Code

itamarga and others added 5 commits March 12, 2026 13:13
Re-implement JSFuck and JJEncode decoders without Node.js dependency.
JSFuck decoder uses a recursive-descent parser with JS type coercion
semantics to evaluate expressions and capture Function() arguments.
JJEncode decoder extracts payloads via octal/hex escape pattern matching.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…radix)

The JJEncode decoder previously used naive regex matching for literal
octal escapes, which failed on all real samples. Replaced with a
symbol-table simulation that parses the 6-statement JJEncode structure,
resolves property references token-by-token, and processes JS escape
sequences. Achieves 100% decode rate on all 21 pure JJEncode samples.

JSFuck decoder gains toString(radix) support via receiver tracking in
the postfix parser, enabling expressions like (10)["toString"](36) → "a".
Added recursion guard with sys.setrecursionlimit for deeply nested input.
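For reference, the `toString(radix)` behaviour mentioned above can be sketched for integers as follows. This is an illustrative stand-alone function, not the decoder's code; the name `js_int_to_string` is hypothetical, and fractional numbers (which JS also supports) are left out.

```python
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def js_int_to_string(value: int, radix: int = 10) -> str:
    """Mimic Number.prototype.toString(radix) for integer values only."""
    if not 2 <= radix <= 36:
        # JavaScript throws a RangeError for radixes outside 2..36.
        raise ValueError("radix must be between 2 and 36")
    if value == 0:
        return "0"
    sign = "-" if value < 0 else ""
    remaining = abs(value)
    digits = []
    while remaining:
        remaining, digit = divmod(remaining, radix)
        digits.append(DIGITS[digit])
    return sign + "".join(reversed(digits))


print(js_int_to_string(10, 36))  # a
print(js_int_to_string(35, 36))  # z
```

JSFuck relies on this trick to obtain letters that cannot be picked out of strings such as `"false"` or `"undefined"`.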

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The deobfuscator now runs an outer generate→re-parse loop (up to 5 cycles)
so that running it twice on the same input produces identical output. This
fixes ConstantProp missing opportunities that only appear after re-parsing.
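The loop described here is a bounded fixed-point iteration. A minimal sketch, assuming a hypothetical `one_cycle` callable that stands in for one parse → transform → generate pass (the real pipeline's function names are not shown in this PR):

```python
import re


def run_to_fixed_point(code, one_cycle, max_cycles=5):
    """Re-run one_cycle until the output stops changing or the cap is hit."""
    for _ in range(max_cycles):
        new_code = one_cycle(code)
        if new_code == code:
            break  # stable: another pass would change nothing
        code = new_code
    return code


def fold_first_addition(code):
    """Toy stand-in for a transform pass: fold only the first `a+b` it sees."""
    return re.sub(
        r"(\d+)\+(\d+)",
        lambda m: str(int(m.group(1)) + int(m.group(2))),
        code,
        count=1,
    )


result = run_to_fixed_point("1+2+3", fold_first_addition)
print(result)  # 6
# Feeding the result back in changes nothing, which is the idempotence goal.
print(run_to_fixed_point(result, fold_first_addition) == result)  # True
```

Idempotence only holds when the loop converges before `max_cycles` is reached; an input that needs more cycles than the cap would still change on a second run.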

Also removes dead code: jj_decode_via_eval (identical to jj_decode), the unused
_JSFUCK_RE regex, and the unused `import re` in jsfuck_decode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace GPL-derived jj_decode and aa_decode with clean-room
implementations based on Yosuke Hasegawa's public encoding specs.
Both decoders are purely iterative (no recursion) and integrate
as pre-passes in the deobfuscator pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add webcrack MIT license file, update NOTICE and THIRD_PARTY_LICENSES
with proper attribution for all upstream projects. Update README license
section. Remove redundant deobfuscator_prepasses_test.py (tests already
covered in deobfuscator_test.py).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@itamarga itamarga force-pushed the feat/pure-python-jsfuck-jjencode branch from 3c2cad5 to 4b94d8e Compare March 12, 2026 17:46
itamarga and others added 5 commits March 12, 2026 18:07
…rative conversions

Convert JSFuck parser from mutual recursion to iterative state machine with
explicit value/continuation stacks, removing the sys.setrecursionlimit hack.
Add depth-limited recursion and iterative paren stripping to JJEncode evaluator.
Add RecursionError safety net in Deobfuscator.execute(). Revert traverser.py
and parser.py to original recursive versions since esprima-bounded AST depth
(~60 levels) makes them safe.
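The general idea of trading recursion for an explicit stack can be shown on a much smaller problem than the JSFuck grammar. The sketch below parses nested array literals such as `[[],[[]]]` without any recursive calls; `parse_nested_arrays` is a hypothetical helper written for this illustration, not the parser from this PR.

```python
def parse_nested_arrays(source: str) -> list:
    """Parse text like "[[],[[]]]" into nested lists using an explicit stack."""
    root = []        # synthetic container for the top-level values
    stack = [root]   # stack[-1] is the array currently being filled
    for ch in source:
        if ch == "[":
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif ch == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced ']'")
            stack.pop()
        elif ch == "," or ch.isspace():
            continue
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if len(stack) != 1:
        raise ValueError("unbalanced '['")
    return root


print(parse_nested_arrays("[[],[[]]]"))  # [[[], [[]]]]

# Nesting far beyond Python's default recursion limit parses fine,
# because depth only grows the `stack` list, not the call stack.
deep = parse_nested_arrays("[" * 50_000 + "]" * 50_000)
```

A recursive-descent version of the same parser would need `sys.setrecursionlimit` to survive the second input, which is the kind of workaround this commit removes.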

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…r handling

- AAEncode: replace U+30FC (fullwidth ー) with U+FF70 (halfwidth ｰ) in all
  replacement patterns; real AAEncode output uses the halfwidth variant
- JSFuck: exclude whitespace from detection ratio, raise threshold to 95%
  to avoid false positives on minified JS
- JSFuck/JJEncode: replace bare `except Exception` with specific exception
  types so unexpected errors propagate
- Document single-arg limitation in _Parser._call
- Fix RecursionError comment to accurately describe esprima as the source
- Restore test_recursive_deobfuscation verifying pre-pass→pipeline path
- Fix AAEncode test to use correct U+FF70 escape sequences
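The JSFuck detection change above can be sketched as a character-ratio check. This is an assumption-laden illustration: the function name `looks_like_jsfuck`, the `>=` comparison, and the treatment of empty input are guesses, and the real detector may apply further conditions such as a minimum length.

```python
JSFUCK_CHARS = frozenset("[]()!+")


def looks_like_jsfuck(code: str, threshold: float = 0.95) -> bool:
    """Return True when nearly all non-whitespace characters are JSFuck symbols."""
    significant = [ch for ch in code if not ch.isspace()]
    if not significant:
        return False
    hits = sum(1 for ch in significant if ch in JSFUCK_CHARS)
    return hits / len(significant) >= threshold


print(looks_like_jsfuck("(![] + [])[+[]]"))   # True
print(looks_like_jsfuck("var a=[1,2,3];"))    # False
```

Dropping whitespace from the denominator keeps pretty-printed JSFuck from falling under the threshold, while the high threshold keeps bracket-heavy minified code from being misdetected.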

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n safety

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
JS octal escapes allow at most 2 digits when the first digit is 4-7
(max \77 = 63), and 3 digits when the first digit is 0-3 (max \377 =
255). The decoder was greedily consuming up to 3 digits regardless,
causing \401 to produce U+0101 instead of a space (\40) followed by '1'.
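The digit-count rule can be sketched as follows. The helper names `read_octal_escape` and `decode_octal_escapes` are hypothetical, and the sketch handles only octal escapes, leaving every other escape sequence untouched.

```python
from typing import Tuple

OCTAL_DIGITS = "01234567"


def read_octal_escape(text: str, pos: int) -> Tuple[str, int]:
    """Decode the octal escape whose first digit is text[pos].

    Returns the decoded character and the index just past the escape.
    """
    if pos >= len(text) or text[pos] not in OCTAL_DIGITS:
        raise ValueError("expected an octal digit")
    # A leading 0-3 allows three digits (up to \377); a leading 4-7 only two.
    max_digits = 3 if text[pos] in "0123" else 2
    end = pos
    while end < len(text) and end - pos < max_digits and text[end] in OCTAL_DIGITS:
        end += 1
    return chr(int(text[pos:end], 8)), end


def decode_octal_escapes(text: str) -> str:
    """Replace backslash-octal escapes in text; leave everything else as is."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in OCTAL_DIGITS:
            char, i = read_octal_escape(text, i + 1)
            out.append(char)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


print(repr(decode_octal_escapes(r"a\401")))  # 'a 1'
print(repr(decode_octal_escapes(r"\101")))   # 'A'
```

A greedy three-digit read would turn `\401` into `chr(0o401)`, which is U+0101, matching the bug described in the commit message.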

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n safety

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@itamarga itamarga changed the title Add pure Python JSFuck and JJEncode decoders Add pure Python JSFuck, JJEncode, and AAEncode decoders Mar 13, 2026
@itamarga itamarga merged commit d43b95b into main Mar 13, 2026
3 checks passed
@itamarga itamarga deleted the feat/pure-python-jsfuck-jjencode branch March 14, 2026 14:54