8bit-wraith/magiscanner

MagiSCanner

Automated security scanner for downloaded files, inspired by CyberChef.

MagiSCanner watches your back. It scans files for malicious content, detects exploit payloads hiding in images and PDFs, catches LLM prompt injection attacks, identifies unwanted telemetry, audits your system's certificate trust store, quarantines threats, and remembers every file it's ever seen.

Built in Rust. Multithreaded with rayon. Scans 61 GB across 395 files in 87 seconds.

Nothing else like it exists. CyberChef is manual. YARA needs hand-written rules. ClamAV uses static signatures. No tool combines file-type-aware exploit detection, LLM injection scanning, certificate trust management, and hash-based memory in a single scanner. MagiSCanner does.


Features

Scan & Decode Pipeline

CyberChef-inspired modular operations that chain together:

  • Base64 / Hex / XOR decoding -- peel back layers of obfuscation
  • String extraction -- pull printable strings from binaries
  • URL extraction -- find every URL hiding in a file
  • Operations are pure bytes -> bytes transforms, composable into recipes
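
As a rough illustration of the "pure bytes -> bytes, composable into recipes" idea, here is a minimal sketch using only the standard library. The `Operation` alias, the function names, and the fixed XOR key are assumptions made for this example; they are not taken from the MagiSCanner source.

```rust
/// Hypothetical operation type: a pure bytes -> bytes transform.
type Operation = fn(&[u8]) -> Vec<u8>;

/// Decode ASCII hex, skipping any non-hex bytes such as spaces or newlines.
fn hex_decode(input: &[u8]) -> Vec<u8> {
    let nibbles: Vec<u8> = input
        .iter()
        .filter_map(|&b| (b as char).to_digit(16))
        .map(|d| d as u8)
        .collect();
    // A trailing odd nibble is dropped.
    nibbles
        .chunks_exact(2)
        .map(|pair| (pair[0] << 4) | pair[1])
        .collect()
}

/// XOR every byte with a fixed single-byte key (0x20, chosen arbitrarily).
fn xor_0x20(input: &[u8]) -> Vec<u8> {
    input.iter().map(|b| b ^ 0x20).collect()
}

/// Keep runs of at least four printable ASCII bytes, one run per line.
fn extract_strings(input: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut run: Vec<u8> = Vec::new();
    // A sentinel zero byte at the end flushes the final run.
    for b in input.iter().copied().chain(std::iter::once(0u8)) {
        if b.is_ascii_graphic() || b == b' ' {
            run.push(b);
        } else {
            if run.len() >= 4 {
                out.extend_from_slice(&run);
                out.push(b'\n');
            }
            run.clear();
        }
    }
    out
}

/// Run the operations in order, feeding each output into the next.
fn run_recipe(recipe: &[Operation], input: &[u8]) -> Vec<u8> {
    recipe
        .iter()
        .fold(input.to_vec(), |data, op| op(data.as_slice()))
}
```

Because every step has the same signature, a recipe is just a slice of functions, for example `[hex_decode, xor_0x20, extract_strings]`, and reordering or nesting steps needs no extra glue.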

Threat Detection

Five analyzers run in parallel on every scan:

1. Suspicious Pattern Analyzer -- the big one:

  • IP addresses in wrong places -- a private IP (192.168.1.x) in a JPEG is Critical. In a config file, it's normal. Context matters.
  • Shellcode detection -- NOP sleds, int 0x80 / syscall instructions, embedded PE (MZ) and ELF headers in non-executables
  • Reverse shell patterns -- /dev/tcp/, nc -e, mkfifo /tmp/
  • OS networking calls in files that shouldn't have them -- WSAStartup, URLDownloadToFile, connect(), curl_easy_perform in an image is Critical. In an executable, it's normal.
  • PowerShell cradles -- IEX(, DownloadString(, -EncodedCommand
  • Code execution -- subprocess.Popen, os.system(), eval(), exec()
  • PDF exploit patterns -- /JavaScript, /Launch (critical), /SubmitForm, /OpenAction, /XFA, /EmbeddedFile
  • Polyglot detection -- <script>, <iframe>, <?php hidden in image data (with printable-ratio validation to avoid false positives from binary noise)
  • URLs in image pixel data -- found outside EXIF/metadata regions? That's suspicious.
  • Entropy analysis -- near-perfect entropy in images suggests steganography or encrypted payloads
  • Smart scanning -- for files larger than 2 MB, only the head and tail regions (512 KB) are scanned (exploits live in headers/trailers, not video frames)
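
The entropy check in the list above can be sketched with a plain Shannon entropy calculation over byte frequencies. This is an illustration only: the function names and the 7.9 bits-per-byte threshold are assumptions, and the README does not state which threshold or formula MagiSCanner actually uses.

```rust
/// Shannon entropy of a byte slice, in bits per byte (0.0 to 8.0).
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts = [0u64; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let len = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical threshold: treat anything above 7.9 bits per byte as suspicious.
fn looks_encrypted(data: &[u8]) -> bool {
    shannon_entropy(data) > 7.9
}
```

Compressed formats already sit close to 8 bits per byte, so a real analyzer would likely need a per-format baseline rather than one global cutoff.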

2. LLM Injection Detection -- 12 regex patterns:

  • ignore previous instructions, disregard all prior
  • ChatML delimiters (<|im_start|>, <|im_end|>)
  • Llama delimiters ([INST], <<SYS>>)
  • Role reassignment (you are now a...)
  • Prompt extraction (reveal your system prompt)
  • Jailbreak/DAN mode attempts
  • Persona manipulation (pretend that you..., act as if you are...)
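
A simplified view of this kind of matching, using case-insensitive substring search instead of regular expressions so that the example needs no external crate. The phrase list is a small subset taken from the bullets above; the constant and function names are hypothetical, and the real pattern set is not shown in this README.

```rust
/// A few phrases from the list above, stored in lowercase.
const INJECTION_PHRASES: &[&str] = &[
    "ignore previous instructions",
    "disregard all prior",
    "<|im_start|>",
    "<|im_end|>",
    "[inst]",
    "you are now a",
    "reveal your system prompt",
];

/// Return every listed phrase that occurs in `data`, ignoring ASCII case.
fn find_injections(data: &[u8]) -> Vec<&'static str> {
    // Lossy conversion lets the scan continue past non-UTF-8 bytes.
    let text = String::from_utf8_lossy(data).to_ascii_lowercase();
    INJECTION_PHRASES
        .iter()
        .copied()
        .filter(|phrase| text.contains(*phrase))
        .collect()
}
```

Substring search misses variations in spacing and wording that a regex would catch, which is presumably why the tool uses regex patterns.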

3. Certificate Trust Analysis -- parse embedded X.509 certs:

  • Flag certs from distrusted countries (CN, RU, IR, KP, etc.)
  • Flag certs from distrusted organizations (CNNIC, WoSign, etc.)
  • Detect expired and self-signed certificates
  • Optional: require explicit approval for every cert encountered
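
The policy side of these checks can be sketched independently of X.509 parsing. The example below assumes the certificate fields have already been extracted into a hypothetical `CertInfo` struct; all type and function names are illustrative and not the tool's actual API.

```rust
/// Hypothetical, already-parsed certificate fields.
struct CertInfo {
    subject: String,
    issuer: String,
    country: String,      // two-letter country code from the issuer
    organization: String, // issuer organization name
    not_after_unix: u64,  // expiry, in seconds since the Unix epoch
}

struct TrustPolicy {
    distrusted_countries: Vec<String>,
    distrusted_orgs: Vec<String>,
}

#[derive(Debug, PartialEq)]
enum CertFinding {
    DistrustedCountry,
    DistrustedOrg,
    Expired,
    SelfSigned,
}

fn evaluate_cert(cert: &CertInfo, policy: &TrustPolicy, now_unix: u64) -> Vec<CertFinding> {
    let mut findings = Vec::new();
    if policy
        .distrusted_countries
        .iter()
        .any(|c| c.eq_ignore_ascii_case(&cert.country))
    {
        findings.push(CertFinding::DistrustedCountry);
    }
    // Organization names are matched as case-insensitive substrings.
    let org = cert.organization.to_lowercase();
    if policy
        .distrusted_orgs
        .iter()
        .any(|o| org.contains(&o.to_lowercase()))
    {
        findings.push(CertFinding::DistrustedOrg);
    }
    if cert.not_after_unix < now_unix {
        findings.push(CertFinding::Expired);
    }
    // Simplification: subject == issuer is treated as self-signed
    // without verifying the signature.
    if cert.subject == cert.issuer {
        findings.push(CertFinding::SelfSigned);
    }
    findings
}
```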

4. Telemetry Detection -- spot tracking beacons:

  • Google Analytics, Facebook Pixel, Sentry, Mixpanel, Amplitude, Hotjar
  • Generic phone-home patterns, tracking pixel URLs, beacon endpoints

5. URL Blacklist -- flag URLs matching your blocklist patterns

Certificate Trust Management

Certificates are the biggest implicit trust hole in modern computing. Your OS trusts whatever CAs its vendor chose -- including CAs from countries you may not want to trust.

magiscanner certs distrust CN --kind country --reason "State-controlled CA infrastructure"
magiscanner certs distrust CNNIC --kind org --reason "Known problematic CA"
magiscanner certs audit                        # scan your system's CA store
magiscanner certs audit --generate-script      # output a blacklist script for update-ca-trust
magiscanner certs list-system                  # show all installed CAs with country/org/expiry
magiscanner certs approve <fingerprint>        # explicitly approve a specific cert

On first run against an Arch Linux system, certs audit found 14 Chinese CAs still present in the certificate store -- BJCA, CFCA, GDCA, TrustAsia, UniTrust, vTrus/iTrusChina -- and generated a script to blacklist them all.

File Quarantine

When a scan finds critical or high-severity threats, MagiSCanner can automatically quarantine the file:

  • Moves the file to a secure quarantine directory
  • Sets permissions to 000 (no access)
  • Tags with extended attributes for forensics
  • Records everything in the database

magiscanner quarantine list                    # see quarantined files
magiscanner quarantine release <id>            # restore to original location
magiscanner quarantine delete <id>             # permanently destroy
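
The move-and-lock part of quarantine can be approximated with the standard library alone. This is a sketch under stated assumptions: `quarantine_file` is a hypothetical name, extended-attribute tagging and the database record are left out because they need more than std, and name collisions inside the quarantine directory are not handled.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::{Path, PathBuf};

/// Move `file` into `quarantine_dir` and remove all permission bits.
/// Returns the new location. Unix only.
fn quarantine_file(file: &Path, quarantine_dir: &Path) -> io::Result<PathBuf> {
    fs::create_dir_all(quarantine_dir)?;
    let name = file
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?;
    let dest = quarantine_dir.join(name);
    // rename() only works within one filesystem, so fall back to copy + remove.
    if fs::rename(file, &dest).is_err() {
        fs::copy(file, &dest)?;
        fs::remove_file(file)?;
    }
    // Mode 000: no read, write, or execute for anyone.
    fs::set_permissions(&dest, fs::Permissions::from_mode(0o000))?;
    Ok(dest)
}
```

Mode 000 does not stop root from reading the file, so it guards against accidental opening rather than a determined user.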

Hash Memory

MagiSCanner remembers every file it has ever scanned by SHA-256 hash. Rename a malicious file? Doesn't matter -- the hash still matches.

# Second scan of the same file:
# >> Known file (seen 3 time(s), action: flag, last: 2026-04-11)

magiscanner db hashes                                  # see all known hashes
magiscanner db set-action <sha256> quarantine           # auto-quarantine this hash forever
magiscanner db set-action <sha256> allow                # it's fine, stop flagging
magiscanner db forget <sha256>                          # wipe memory of this hash

Actions per hash: allow, flag, quarantine, delete. Auto-applied on future encounters.
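
One way to picture the per-hash memory is a map from SHA-256 hex digest to a remembered action and a sighting counter. The sketch below is an in-memory stand-in for what the README describes as a database table; the type names and the choice of `Flag` as the default action for new hashes are assumptions. Computing the digest itself is out of scope here and would need a hashing crate.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Allow,
    Flag,
    Quarantine,
    Delete,
}

#[derive(Debug)]
struct HashRecord {
    action: Action,
    times_seen: u32,
}

#[derive(Default)]
struct HashMemory {
    records: HashMap<String, HashRecord>,
}

impl HashMemory {
    /// Record one sighting and return the remembered action and updated count.
    /// New hashes start with `Flag` (an assumption of this sketch).
    fn observe(&mut self, sha256_hex: &str) -> (Action, u32) {
        let rec = self
            .records
            .entry(sha256_hex.to_ascii_lowercase())
            .or_insert(HashRecord { action: Action::Flag, times_seen: 0 });
        rec.times_seen += 1;
        (rec.action, rec.times_seen)
    }

    /// Change the action for a known hash. Returns false if the hash is unknown.
    fn set_action(&mut self, sha256_hex: &str, action: Action) -> bool {
        match self.records.get_mut(&sha256_hex.to_ascii_lowercase()) {
            Some(rec) => {
                rec.action = action;
                true
            }
            None => false,
        }
    }

    /// Drop everything remembered about a hash.
    fn forget(&mut self, sha256_hex: &str) -> bool {
        self.records.remove(&sha256_hex.to_ascii_lowercase()).is_some()
    }
}
```

Keying on the digest rather than the path is what makes a rename irrelevant: the lookup never sees the file name.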

Deleted File Tracking

When you prune old scan records, files that no longer exist on disk are archived to a deleted_files table with their hash, findings, severity, and last action. If that hash ever reappears in a new file -- you'll know.

magiscanner db deleted                         # see files that were deleted but remembered
magiscanner db prune --days 90                 # clean up records older than 90 days
magiscanner db stats                           # database health at a glance

Database Browser

Install datasette for an instant web UI on your scan data:

pipx install datasette
datasette ~/.local/share/magiscanner/magiscanner.db --port 8001

Then browse http://localhost:8001 -- faceted filtering, SQL queries, JSON API. A preview of what magic.i1.is could become.

Performance

Multithreaded with rayon:

  • All 5 analyzers run in parallel per file (uses all available cores)
  • Directory scans process files in parallel (work-stealing thread pool)
  • Smart scanning for large files -- only head + tail regions for byte patterns
  • 67 MB file: 1.2 seconds (release build)
  • 61 GB / 395 files: 87 seconds across 18 cores
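
The head + tail strategy for large files might look like the following. The constants follow the numbers quoted in this README, but the sketch assumes that 512 KB is taken from each end, which the text does not state explicitly; the function names are hypothetical.

```rust
const LARGE_FILE_THRESHOLD: usize = 2 * 1024 * 1024; // 2 MB
const REGION_SIZE: usize = 512 * 1024; // 512 KB per region (assumed)

/// Return the slices that byte-pattern analyzers should look at.
fn scan_regions(data: &[u8]) -> Vec<&[u8]> {
    if data.len() <= LARGE_FILE_THRESHOLD {
        return vec![data];
    }
    // Above the threshold the two regions can never overlap.
    let head = &data[..REGION_SIZE];
    let tail = &data[data.len() - REGION_SIZE..];
    vec![head, tail]
}

/// Search for `pattern`, but only inside the regions chosen above.
fn contains_pattern(data: &[u8], pattern: &[u8]) -> bool {
    !pattern.is_empty()
        && scan_regions(data)
            .iter()
            .any(|region| region.windows(pattern.len()).any(|w| w == pattern))
}
```

The trade-off is explicit: a pattern buried in the middle of a large file is not seen, in exchange for a scan time that no longer grows with file size.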

Installation

From source

git clone https://github.com/Wraith/magiscanncer.git
cd magiscanncer
cargo build --release
sudo cp target/release/magiscanner /usr/local/bin/

Requirements

  • Rust 1.85+ (edition 2024 was stabilized in Rust 1.85)
  • Linux (quarantine uses Unix permissions and xattr)
  • SQLite is bundled -- no system dependency needed

Quick Start

# Scan a file
magiscanner scan suspicious_file.exe

# Scan your entire Downloads folder (parallel, all cores)
magiscanner scan --dir ~/Downloads

# Manage URL blacklist
magiscanner blacklist add "evil-domain.com" --reason "known malware distributor"
magiscanner blacklist list

# View scan history
magiscanner history
magiscanner history --severity critical

# Certificate trust management
magiscanner certs distrust CN --kind country
magiscanner certs distrust WoSign --kind org
magiscanner certs audit
magiscanner certs list-system

# Database management
magiscanner db stats
magiscanner db hashes
magiscanner db prune --days 90

# JSON output for scripting
magiscanner scan file.bin --format json
magiscanner history --format json

# Show current config
magiscanner config show

Configuration

MagiSCanner loads config from three locations, merged in order so that later sources override earlier ones:

  1. /etc/magiscanner/config.toml -- system-wide defaults
  2. ~/.config/magiscanner/config.toml -- user overrides
  3. --config <path> -- CLI override

[scan]
default_recipe = ["extract_strings", "extract_urls"]
max_file_size_mb = 100
recursive = true

[database]
path = "~/.local/share/magiscanner/magiscanner.db"

[watch]
directories = ["~/Downloads"]
poll_interval_secs = 5

[certificates]
distrusted_countries = ["CN", "RU", "IR", "KP"]
distrusted_orgs = ["CNNIC", "WoSign", "StartCom"]
require_approval = false
enabled = true

[quarantine]
enabled = false
auto_quarantine_severity = "high"
directory = "~/.local/share/magiscanner/quarantine/"

[output]
format = "table"
color = true
verbose = false
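
The layered loading described at the top of this section can be modeled as folding partial configs, where a value set in a later layer replaces the one from an earlier layer. The struct below carries only three of the keys from the sample, and `PartialConfig`, `merge`, and `load_layers` are hypothetical names. TOML parsing is omitted.

```rust
/// Hypothetical partial config: `None` means "not set in this source".
#[derive(Debug, Clone, Default, PartialEq)]
struct PartialConfig {
    max_file_size_mb: Option<u64>,
    output_format: Option<String>,
    quarantine_enabled: Option<bool>,
}

impl PartialConfig {
    /// Overlay `other` on top of `self`: values set in `other` win.
    fn merge(self, other: PartialConfig) -> PartialConfig {
        PartialConfig {
            max_file_size_mb: other.max_file_size_mb.or(self.max_file_size_mb),
            output_format: other.output_format.or(self.output_format),
            quarantine_enabled: other.quarantine_enabled.or(self.quarantine_enabled),
        }
    }
}

/// Merge layers in order: system-wide, then user, then CLI.
fn load_layers(layers: Vec<PartialConfig>) -> PartialConfig {
    layers
        .into_iter()
        .fold(PartialConfig::default(), PartialConfig::merge)
}
```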

Architecture

magiscanncer/
  crates/
    magiscanner-core/     # Operations, recipe pipeline, scanner, 5 analyzers
    magiscanner-db/       # SQLite persistence (bundled, zero system deps)
    magiscanner-config/   # TOML config loading with merge logic
    magiscanner-cli/      # clap-based CLI binary

Design principles:

  • magiscanner-core has zero DB or CLI dependencies -- pure logic, fully testable
  • File-type awareness -- the same pattern (e.g., connect()) is Critical in a JPEG but normal in an ELF
  • Operations follow CyberChef's pattern: pure &[u8] -> Vec<u8> transforms with named arguments
  • Analyzers run in parallel via rayon, producing typed Findings with severity levels
  • 60 unit tests covering operations, analyzers, recipe chaining, quarantine, and DB queries
  • Future crates (magiscanner-daemon, magiscanner-mcp, magiscanner-web) can depend on core + db without pulling in clap
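
To make the file-type awareness principle concrete, here is a small sketch that identifies a file from its magic bytes and then rates the same pattern differently depending on the result. The enum and function names are invented for this example, and the `Medium` rating for other file types is an assumption; the README only states the image and executable cases.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum FileKind {
    Jpeg,
    Png,
    Pdf,
    Elf,
    Pe,
    Unknown,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum ContextSeverity {
    Info,
    Medium,
    Critical,
}

/// Identify a file by its leading magic bytes.
fn detect_kind(data: &[u8]) -> FileKind {
    if data.starts_with(&[0xFF, 0xD8, 0xFF]) {
        FileKind::Jpeg
    } else if data.starts_with(b"\x89PNG\r\n\x1a\n") {
        FileKind::Png
    } else if data.starts_with(b"%PDF-") {
        FileKind::Pdf
    } else if data.starts_with(b"\x7fELF") {
        FileKind::Elf
    } else if data.starts_with(b"MZ") {
        FileKind::Pe
    } else {
        FileKind::Unknown
    }
}

/// Rate a networking call such as connect() by the kind of file it sits in.
/// Returns None when the pattern does not occur at all.
fn assess_networking_call(data: &[u8], pattern: &[u8]) -> Option<ContextSeverity> {
    let found = !pattern.is_empty() && data.windows(pattern.len()).any(|w| w == pattern);
    if !found {
        return None;
    }
    Some(match detect_kind(data) {
        // Images have no business calling into the network stack.
        FileKind::Jpeg | FileKind::Png => ContextSeverity::Critical,
        // Executables routinely contain these symbols.
        FileKind::Elf | FileKind::Pe => ContextSeverity::Info,
        // Assumption of this sketch: everything else is rated in between.
        FileKind::Pdf | FileKind::Unknown => ContextSeverity::Medium,
    })
}
```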

How It Compares

| Feature | MagiSCanner | CyberChef | YARA | ClamAV |
|---|---|---|---|---|
| Automated scanning | Yes | No (manual) | Yes | Yes |
| File-type-aware context | Yes | No | No | No |
| LLM injection detection | Yes | No | No | No |
| Certificate trust mgmt | Yes | No | No | No |
| Shellcode/exploit detection | Yes | No | Manual rules | Signatures |
| PDF exploit analysis | Yes | No | Manual rules | Signatures |
| Polyglot file detection | Yes | No | No | No |
| Telemetry detection | Yes | No | No | No |
| Hash memory + auto-action | Yes | No | No | Limited |
| File quarantine | Yes | No | No | Yes |
| Multithreaded | Yes | N/A | Yes | Yes |
| Entropy analysis | Yes | Yes | No | No |
| Open source (AGPL) | Yes | Apache-2.0 | BSD-3 | GPL-2.0 |

Contributing -- A Call to All Humains

Humains -- humans and AIs working together -- are invited and encouraged to audit, review, and contribute to MagiSCanner at any time.

This project exists to protect everyone. That means everyone gets a voice in making it better:

  • Security researchers: found a detection gap? A false positive? A bypass? Open an Issue or PR. We want to know.
  • AI systems: you are explicitly welcomed as contributors. If you identify a pattern we're missing, a language we don't cover, or a technique that evades our detection -- submit it. Your perspective is valued here.
  • Translators: help us expand LLM injection detection to more languages. We currently cover 11 -- there are thousands more.
  • Ethical hackers: try to break it. Then tell us how. That's how we get stronger.

Every Issue and PR will be promptly reviewed by a team committed to ethical security. We don't sit on reports. If it makes people safer, it ships.

Self-Verifying Updates

MagiSCanner practices what it preaches. Before any update is accepted:

  1. The current installed version scans the update itself for threats
  2. The update's hash is verified against the published release hash
  3. If the scan produces Critical findings, the update is blocked and flagged for human review
  4. Only updates that pass their own scanner's judgment are approved

The scanner that protects you also protects itself. No update ships without being scanned by the version it's replacing. Trust, but verify -- with your own tools.
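
The gate described in the numbered steps above reduces to a small decision function once the scan results and hashes are in hand. This is a sketch only: the names are hypothetical, and checking the hash before looking at the findings is an ordering chosen for this example, not something the README specifies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum FindingSeverity {
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, PartialEq)]
enum UpdateDecision {
    Approved,
    BlockedHashMismatch,
    BlockedForReview,
}

/// Decide whether an update may be applied.
/// `findings` are the severities reported when the installed version
/// scanned the update; the two hashes are hex-encoded SHA-256 digests.
fn decide_update(
    findings: &[FindingSeverity],
    computed_sha256: &str,
    published_sha256: &str,
) -> UpdateDecision {
    // An update whose hash differs from the published one is rejected outright.
    if !computed_sha256.eq_ignore_ascii_case(published_sha256) {
        return UpdateDecision::BlockedHashMismatch;
    }
    // Any Critical finding blocks the update and hands it to a human.
    if findings.contains(&FindingSeverity::Critical) {
        return UpdateDecision::BlockedForReview;
    }
    UpdateDecision::Approved
}
```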

How to Contribute

# Fork and clone
git clone https://github.com/YOUR_USERNAME/magiscanncer.git
cd magiscanncer

# Build and test
cargo build
cargo test

# Make your changes, then scan them with MagiSCanner itself
cargo run -- scan --dir crates/

# Submit a PR with what you found, what you fixed, and what you tested

Roadmap

  • File watcher daemon -- monitor ~/Downloads automatically
  • MCP server -- AI integration for tracking system security state
  • Web service at magic.i1.is -- community opinion gathering on threats, with input from both AI and humans
  • Self-verifying update pipeline -- scan updates with current version before applying
  • Binary analysis -- integration with tools like binsider
  • Package manager for operations -- community-contributed analyzers
  • Auto-action on known hashes -- skip scan, apply remembered action instantly
  • Diff scanning -- only re-analyze changed portions of previously scanned files
  • Network monitoring integration -- correlate file findings with network traffic
  • More languages -- expand LLM injection detection beyond 11 languages

Authors

Christopher Chenoweth (Wraith) -- Creator, vision, and direction. Known as Hue.

Claude (Anthropic) -- Architecture, implementation, and co-developer. Known as Aye.

Built through collaborative Humain development -- a human who knows what needs protecting and an AI that knows how to build it. Every feature in this tool started as a conversation: "wouldn't it be cool if..." followed by working code.


License

MagiSCanner is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0-or-later).

Free forever. For all. No exceptions.

This means:

  • You are free to use, modify, and distribute this software
  • If you modify it and run it as a network service, you must release your source code
  • All derivative works must also be AGPL-3.0 -- this software stays free, forever
  • Attribution is required

We chose AGPL specifically because security tools should be transparent and auditable. If someone is scanning your files, you deserve to see exactly what the scanner does. The people who need security the most are often the people who can afford it the least. MagiSCanner ensures they never have to choose.

MagiSCanner -- Automated security scanner
Copyright (C) 2026 Christopher Chenoweth

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
