Merged
36 changes: 36 additions & 0 deletions .github/workflows/fuzz.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
name: Fuzz Testing

on:
  push:
    branches: [develop]

jobs:
  fuzz:
    runs-on: ubuntu-latest
    timeout-minutes: 5
    strategy:
      fail-fast: false
      matrix:
        target:
          - fuzz_deobfuscate
          - fuzz_parser
          - fuzz_generator
          - fuzz_transforms
          - fuzz_expression_simplifier
          - fuzz_string_decoders
          - fuzz_scope
          - fuzz_traverser
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install project
        run: |
          python -m pip install --upgrade pip
          pip install -e .
      - name: Run fuzz target (${{ matrix.target }})
        run: |
          chmod +x tests/fuzz/run_local.sh
          tests/fuzz/run_local.sh ${{ matrix.target }} 60
28 changes: 28 additions & 0 deletions .github/workflows/publish.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
name: Publish to PyPI

on:
  release:
    types: [published]

permissions:
  contents: read

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install build dependencies
        run: |
          python -m pip install --upgrade pip
          pip install build
      - name: Build package
        run: python -m build
      - name: Publish to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
        with:
          password: ${{ secrets.PYPI_TOKEN }}
4 changes: 2 additions & 2 deletions .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,9 @@ name: Tests

on:
push:
branches: [main]
branches: [main, develop]
pull_request:
branches: [main]
branches: [main, develop]

jobs:
test:
Expand Down
24 changes: 24 additions & 0 deletions CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,30 @@ pytest tests/test_regression.py # end-to-end regression (47k files)

Test helpers in `conftest.py`: `roundtrip(code, TransformClass)`, `parse_expr(expr)`, `normalize(code)`.

### Fuzz Testing

Eight fuzz targets in `tests/fuzz/` cover the full pipeline, parser, generator, transforms, expression simplifier, string decoders, scope analysis, and AST traversal. The targets are OSS-Fuzz compatible (atheris/libFuzzer).

```bash
# Run all targets for 10s each (standalone, no extra deps)
tests/fuzz/run_local.sh all 10

# Run a single target for 60s
tests/fuzz/run_local.sh fuzz_deobfuscate 60

# With atheris (requires clang + libFuzzer):
CLANG_BIN=$(which clang) pip install atheris
tests/fuzz/run_local.sh fuzz_deobfuscate 300
```

Fuzz helpers in `tests/fuzz/conftest_fuzz.py`: `bytes_to_js(data)`, `bytes_to_ast_dict(data)`, `run_fuzzer(target_fn)`. Targets use atheris when available, otherwise a standalone random-based fuzzer.
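
The atheris-or-standalone fallback described above could look roughly like the sketch below. It is an illustration only, not the code in `tests/fuzz/conftest_fuzz.py`: the names `standalone_fuzz` and `run_fuzzer_sketch`, their parameters, and the defaults are all assumptions made for this example.

```python
import random
import sys

try:
    import atheris  # optional; needs clang + libFuzzer to install
except ImportError:
    atheris = None


def standalone_fuzz(target_fn, iterations=1000, max_len=256, seed=0):
    """Hypothetical random-based fallback: feed random byte strings to target_fn."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    for _ in range(iterations):
        data = rng.randbytes(rng.randint(0, max_len))
        target_fn(data)  # any uncaught exception counts as a finding
    return iterations


def run_fuzzer_sketch(target_fn):
    """Hypothetical dispatcher: coverage-guided with atheris, else random."""
    if atheris is not None:
        atheris.Setup(sys.argv, target_fn)
        atheris.Fuzz()  # runs until libFuzzer stops or the target crashes
    else:
        standalone_fuzz(target_fn)
```

A target is then just a function that takes `bytes` and raises only on a genuine bug. Keeping the fallback seeded is a design choice in this sketch so that CI failures can be replayed locally.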

## CI/CD

- **Tests** (`.github/workflows/tests.yml`): pytest on push/PR to `main` and `develop` (Python 3.11–3.13)
- **Fuzz** (`.github/workflows/fuzz.yml`): 8 fuzz targets on push to `develop` (60s each, standalone fuzzer)
- **Publish** (`.github/workflows/publish.yml`): build + publish to PyPI on GitHub Release (requires `PYPI_TOKEN` secret)

## Safety Guarantees

- Never crashes on valid JS (parse failure → fallback hex decode → return original)
Expand Down
5 changes: 5 additions & 0 deletions MANIFEST.in
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
recursive-include pyjsclear *.py
include LICENSE
include README.md
include NOTICE
include THIRD_PARTY_LICENSES.md
12 changes: 3 additions & 9 deletions NOTICE
Original file line number Diff line number Diff line change
Expand Up @@ -13,13 +13,7 @@ This product is a derivative work based on the following projects:
Licensed under the Apache License, Version 2.0
https://github.com/ben-sb/javascript-deobfuscator

3. esprima2 (v5.0.1)
Copyright JS Foundation and other contributors
Licensed under the BSD 2-Clause License
https://github.com/s0md3v/esprima2

This Python library re-implements the deobfuscation algorithms and transform
logic from the above Node.js/Babel-based tools (1, 2) in pure Python. No
source code was directly copied; the implementations were written from scratch
following the same algorithmic approaches. esprima2 (3) is used as a runtime
dependency for JavaScript parsing.
logic from the above Node.js/Babel-based tools in pure Python. No source code
was directly copied; the implementations were written from scratch following
the same algorithmic approaches.
27 changes: 15 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
<p align="center">
<img src="PyJSClear.png" alt="PyJSClear" width="200">
<img src="https://raw.githubusercontent.com/intezer/PyJSClear/main/PyJSClear.png" alt="PyJSClear" width="200">
</p>

# PyJSClear
Expand All @@ -14,11 +14,16 @@ into a single Python library with no Node.js dependency.
## Installation

```bash
pip install -r requirements.txt # install runtime dependencies
pip install -e . # install PyJSClear
pip install pyjsclear
```

For development:

# For development/testing
pip install -r test-requirements.txt
```bash
git clone https://github.com/intezer/PyJSClear.git
cd PyJSClear
pip install -e .
pip install pytest
```

## Usage
Expand All @@ -42,16 +47,16 @@ cleaned = deobfuscate_file("input.js")

```bash
# File to stdout
python -m pyjsclear input.js
pyjsclear input.js

# File to file
python -m pyjsclear input.js -o output.js
pyjsclear input.js -o output.js

# Stdin to stdout
cat input.js | python -m pyjsclear -
cat input.js | pyjsclear -

# With custom iteration limit
python -m pyjsclear input.js --max-iterations 20
pyjsclear input.js --max-iterations 20
```

## What it does
Expand Down Expand Up @@ -133,7 +138,5 @@ This project is a derivative work based on
[obfuscator-io-deobfuscator](https://github.com/ben-sb/obfuscator-io-deobfuscator)
(Apache 2.0) and
[javascript-deobfuscator](https://github.com/ben-sb/javascript-deobfuscator)
(Apache 2.0), and uses [esprima2](https://github.com/s0md3v/esprima2)
(BSD 2-Clause) for JavaScript parsing.
See [THIRD_PARTY_LICENSES.md](THIRD_PARTY_LICENSES.md) and
(Apache 2.0). See [THIRD_PARTY_LICENSES.md](THIRD_PARTY_LICENSES.md) and
[NOTICE](NOTICE) for full attribution.
34 changes: 0 additions & 34 deletions THIRD_PARTY_LICENSES.md
Original file line number Diff line number Diff line change
Expand Up @@ -237,37 +237,3 @@ https://github.com/ben-sb/javascript-deobfuscator/blob/master/LICENSE).
**Features derived from this project:** hex escape decoding (`--he`),
static array unpacking (`--su`), property access transformation (`--tp`).

---

## esprima2

- **Version:** 5.0.1
- **Author:** Somdev Sangwan (s0md3v)
- **Repository:** https://github.com/s0md3v/esprima2
- **License:** BSD 2-Clause License

```
Copyright JS Foundation and other contributors, https://js.foundation/

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
```

**Usage:** Runtime dependency — JavaScript parser providing ESTree-compatible AST with ES2024 support.
2 changes: 1 addition & 1 deletion pyjsclear/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@
from .deobfuscator import Deobfuscator


__version__ = '0.1.0'
__version__ = '0.1.1'


def deobfuscate(code, max_iterations=50):
Expand Down
53 changes: 39 additions & 14 deletions pyjsclear/transforms/string_revealer.py
Original file line number Diff line number Diff line change
Expand Up @@ -261,8 +261,14 @@ def _process_obfuscatorio_pattern(self):

# Step 5: Find and execute rotation
rotation_result = self._find_and_execute_rotation(
body, array_func_name, string_array, primary_decoder, all_wrappers,
all_decoder_aliases, alias_decoder_map=alias_decoder_map, all_decoders=decoders,
body,
array_func_name,
string_array,
primary_decoder,
all_wrappers,
all_decoder_aliases,
alias_decoder_map=alias_decoder_map,
all_decoders=decoders,
)

# Update the AST array to reflect rotation so future passes
Expand Down Expand Up @@ -304,7 +310,6 @@ def _process_obfuscatorio_pattern(self):
indices_to_remove.add(array_func_idx)
self._remove_body_indices(body, *indices_to_remove)


def _find_string_array_function(self, body):
"""Find the string array function declaration.

Expand Down Expand Up @@ -631,8 +636,14 @@ def _find_and_execute_rotation(

if expr.get('type') == 'CallExpression':
if self._try_execute_rotation_call(
expr, array_func_name, string_array, decoder, wrappers, decoder_aliases,
alias_decoder_map=alias_decoder_map, all_decoders=all_decoders,
expr,
array_func_name,
string_array,
decoder,
wrappers,
decoder_aliases,
alias_decoder_map=alias_decoder_map,
all_decoders=all_decoders,
):
return (i, None)

Expand All @@ -641,16 +652,29 @@ def _find_and_execute_rotation(
if sub.get('type') != 'CallExpression':
continue
if self._try_execute_rotation_call(
sub, array_func_name, string_array, decoder, wrappers, decoder_aliases,
alias_decoder_map=alias_decoder_map, all_decoders=all_decoders,
sub,
array_func_name,
string_array,
decoder,
wrappers,
decoder_aliases,
alias_decoder_map=alias_decoder_map,
all_decoders=all_decoders,
):
return (i, sub)

return None

def _try_execute_rotation_call(
self, call_expr, array_func_name, string_array, decoder, wrappers, decoder_aliases,
alias_decoder_map=None, all_decoders=None,
self,
call_expr,
array_func_name,
string_array,
decoder,
wrappers,
decoder_aliases,
alias_decoder_map=None,
all_decoders=None,
):
"""Try to parse and execute a single rotation call expression. Returns True on success."""
callee = call_expr.get('callee')
Expand Down Expand Up @@ -680,7 +704,11 @@ def _try_execute_rotation_call(
return False

self._execute_rotation(
string_array, operation, wrappers, decoder, stop_value,
string_array,
operation,
wrappers,
decoder,
stop_value,
alias_decoder_map=alias_decoder_map,
)
return True
Expand Down Expand Up @@ -1125,10 +1153,7 @@ def _find_simple_rotation(self, body, array_name):
if expr.get('type') == 'CallExpression':
candidates.append(expr)
elif expr.get('type') == 'SequenceExpression':
candidates.extend(
sub for sub in expr.get('expressions', [])
if sub.get('type') == 'CallExpression'
)
candidates.extend(sub for sub in expr.get('expressions', []) if sub.get('type') == 'CallExpression')

for call_expr in candidates:
callee = call_expr.get('callee')
Expand Down
2 changes: 1 addition & 1 deletion pyjsclear/utils/string_decoders.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,7 +106,7 @@ def type(self):
return DecoderType.RC4

def get_string(self, index, key=None):
if key is None:
if not key:
return None
# Include key in cache to avoid collisions with different RC4 keys
cache_key = (index, key)
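
The effect of switching the guard from `key is None` to `not key` can be seen with two stand-in functions. These are hypothetical simplifications for illustration, not the real RC4 decoder: they only model the guard and return a placeholder tuple instead of decoding anything.

```python
def get_string_old(index, key=None):
    # Old guard: only a missing key is rejected.
    if key is None:
        return None
    return ("decoded", index, key)  # placeholder for the real RC4 decode


def get_string_new(index, key=None):
    # New guard: a missing key and an empty key are both rejected.
    if not key:
        return None
    return ("decoded", index, key)  # placeholder for the real RC4 decode


# An empty key passes the old guard but is rejected by the new one.
print(get_string_old(0, ""))
print(get_string_new(0, ""))
```

With the old guard, an empty string key would reach the decode step; with the new guard it returns `None` early, the same as a missing key.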
Expand Down
30 changes: 28 additions & 2 deletions pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,16 +4,42 @@ build-backend = "setuptools.build_meta"

[project]
name = "pyjsclear"
version = "0.1.0"
dynamic = ["version"]
description = "Pure Python JavaScript deobfuscator"
readme = "README.md"
license = "Apache-2.0"
requires-python = ">=3.11"
dependencies = ["esprima2>=5.0.1"]
keywords = ["javascript", "deobfuscator", "deobfuscation", "security", "malware-analysis", "ast"]
authors = [
{name = "Intezer Labs", email = "info@intezer.com"},
]
classifiers = [
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"Intended Audience :: Information Technology",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
"Programming Language :: Python :: 3.13",
"Topic :: Security",
"Topic :: Software Development :: Libraries :: Python Modules",
"Typing :: Typed",
]

[project.urls]
Homepage = "https://github.com/intezer/PyJSClear"
Repository = "https://github.com/intezer/PyJSClear"
Issues = "https://github.com/intezer/PyJSClear/issues"

[project.scripts]
pyjsclear = "pyjsclear.__main__:main"

[tool.setuptools.packages.find]
include = ["pyjsclear*"]

[tool.setuptools.dynamic]
version = {attr = "pyjsclear.__version__"}

[tool.black]
line-length = 120
target-version = ['py311']
Expand All @@ -34,4 +60,4 @@ schema_pattern = """(?s)\
(build|ci|docs|feat|fix|perf|refactor|style|test|chore|revert|bump)\
(\\(\\S+\\))?!?:\
( [^\\n\\r]+)\
((\\n\\n.*)|(\\s*))?$"""
((\\n\\n.*)|(\\s*))?$"""