Security: foprc/sauron

Security Policy

Security Model

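Sauron is a local-only OCR tool. All image processing happens on the user's machine using PaddleOCR. There is no network server, no cloud API, and no telemetry. The only network access described in this policy is the model download performed by download_models.py.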

Trust boundary: The process boundary. Callers (the CLI user, the MCP client) are trusted principals. Sauron does not defend against a malicious user on the same machine — that is the OS's job.

MCP transport: The MCP server communicates via stdin/stdout with a single connected process (e.g., Claude Desktop). No network ports are opened. The server accepts arbitrary file paths by design; the stdio transport ensures only the directly connected client can send requests.

Mitigations

| Threat | Mitigation |
| --- | --- |
| Path traversal in tar extraction (CVE-2007-4559) | download_models.py uses TarFile.extractall(filter='data'), with a manual validation fallback for Python builds that lack extraction filters. |
| Symlink escape in directory scanning | scanner.py verifies that each discovered file resolves within the scan root via Path.resolve().is_relative_to(). Symlinks pointing outside the root are silently skipped. |
| Output path escape in batch processing | batch.py verifies the resolved destination stays within output_dir before writing any result files. |
| Model integrity | download_models.py verifies SHA-256 checksums of downloaded model tarballs before extraction. |
| Code injection | No use of eval(), exec(), pickle, yaml.load(), or subprocess with shell=True anywhere in the codebase. |
| Serialization | JSON-only serialization (json.dumps / json.loads). No binary deserialization of untrusted data. |
| File write atomicity | All file writes use a temp-file-then-rename pattern to prevent partial writes. |
| GUI surface | Depends on opencv-python-headless (no GUI event loop or window creation). |
| MCP isolation | stdio-only transport; no HTTP server, no open ports. |
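
As an illustration of the model-integrity and tar-extraction rows, the following is a minimal sketch of "verify the checksum, then extract safely". It is not the code in download_models.py: the function names `sha256_of` and `safe_extract`, and the exact rules in the fallback branch, are assumptions made for this example.

```python
import hashlib
import tarfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def safe_extract(tar_path: Path, dest: Path, expected_sha256: str) -> None:
    """Verify the checksum first, then extract without leaving dest.

    Hypothetical helper for illustration; names and policy are assumptions.
    """
    if sha256_of(tar_path) != expected_sha256.lower():
        raise ValueError(f"checksum mismatch for {tar_path.name}")

    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        if hasattr(tarfile, "data_filter"):
            # Extraction filters are available: let the stdlib reject
            # absolute paths, '..' escapes, and special files.
            tar.extractall(dest, filter="data")
            return

        # Fallback for builds without extraction filters: validate every
        # member before extracting anything.
        root = dest.resolve()
        for member in tar.getmembers():
            if not (member.isfile() or member.isdir()):
                raise ValueError(f"refusing non-regular member: {member.name}")
            target = (root / member.name).resolve()
            if target != root and root not in target.parents:
                raise ValueError(f"member escapes destination: {member.name}")
        tar.extractall(dest)
```

Checking the digest before opening the archive means a tampered tarball is rejected before any member is parsed or written. The fallback in this sketch is deliberately stricter than the 'data' filter, since it refuses links outright.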
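
The temp-file-then-rename pattern from the file write atomicity row can be sketched as follows. This is an assumed shape, not Sauron's actual writer: `atomic_write_text` is a hypothetical name, and the `os.fsync` call is an extra durability step that the policy does not mention.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_text(dest: Path, text: str, encoding: str = "utf-8") -> None:
    """Write text so readers see either the old file or the complete new one.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest)
    # The temp file lives in the destination directory because os.replace
    # is only atomic when both paths are on the same filesystem.
    fd, tmp_name = tempfile.mkstemp(
        dir=dest.parent, prefix=dest.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding=encoding) as fh:
            fh.write(text)
            fh.flush()
            os.fsync(fh.fileno())  # optional: push data to disk before renaming
        os.replace(tmp_name, dest)  # atomically swap the new file into place
    except BaseException:
        # On any failure, remove the temp file and leave dest untouched.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

A call such as `atomic_write_text(Path("out/result.json"), json.dumps(result))` would then never leave a half-written result file behind if the process is interrupted mid-write.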

Known Limitations

  • TOCTOU (time-of-check to time-of-use) in path validation: ocr_engine.py checks exists() / is_file() before PaddleOCR opens the file. A file could be replaced between the check and the open. This is standard for CLI tools and low-risk in the local-only threat model.
  • Error messages contain file paths: Acceptable for a local tool. Do not expose Sauron error output to untrusted parties.
  • MCP server accepts arbitrary file paths: By design. The MCP server can read any file the process has access to. If you need path restrictions, add allowlist logic in mcp_server.py:handle_call_tool().
  • PaddlePaddle model deserialization: PaddleOCR loads model files via PaddlePaddle's inference engine. Model files are trusted input (downloaded from official sources with checksum verification). Sauron does not defend against maliciously crafted model files — that is PaddlePaddle's responsibility.
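
For readers who do want the path restrictions mentioned under "MCP server accepts arbitrary file paths", one possible shape for the allowlist check is sketched below. The function name `resolve_allowed` and the idea of passing a list of allowed roots are assumptions for this example; how it would be wired into mcp_server.py:handle_call_tool() depends on code not shown in this policy.

```python
from pathlib import Path


def resolve_allowed(raw_path: str, allowed_roots: list[Path]) -> Path:
    """Return the resolved path if it lies under an allowed root, else raise.

    Hypothetical helper for illustration only. Resolving first means that
    '..' segments and symlinks are followed before the containment check.
    """
    candidate = Path(raw_path).expanduser().resolve()
    for root in allowed_roots:
        if candidate.is_relative_to(root.resolve()):
            return candidate
    # Keep the message generic so it does not echo the rejected path.
    raise PermissionError("path is outside the allowed directories")
```

The same resolve-then-is_relative_to check is what the Mitigations section describes for scanner.py. Like the exists() / is_file() check above, it is still subject to TOCTOU: the path could be swapped after validation, so it narrows what a client can request rather than providing a hard guarantee.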

Supported Versions

| Version | Supported |
| --- | --- |
| 0.1.x | Yes |

Reporting a Vulnerability

If you discover a security vulnerability in Sauron, please report it responsibly:

  1. Do NOT open a public issue for security vulnerabilities
  2. Email dorian@truewatch.com with:
    • Description of the vulnerability
    • Steps to reproduce
    • Potential impact
    • Suggested fix (if any)

We will acknowledge receipt within 48 hours and aim to provide a fix or mitigation plan within 7 days.

Out of Scope

  • Vulnerabilities in upstream dependencies (PaddleOCR, PaddlePaddle). Please report these to their respective projects.
  • Issues that require physical access to the machine running Sauron
  • Attacks that require the user to run Sauron with elevated privileges (root/admin)
