Skip to content

New rule: detect suspicious encoded blobs in comments (beyond base64) #19

Description

@sarahxsanders

Context

From Vincent's review on PR #12 — the current prompt_injection_base64_in_comment rule only catches base64. Other encoding schemes could also be used to smuggle payloads in comments.

What to catch

  • Long hex strings in comments (e.g., \x48\x65\x6c\x6c\x6f)
  • URL-encoded blobs (%48%65%6C%6C%6F)
  • Other obfuscation patterns (rot13, unicode escapes, etc.)

Considerations

  • High false-positive risk. Legitimate code comments often contain:
    • SHA hashes (40+ hex chars)
    • JWTs in test fixtures
    • UUIDs
    • Color codes, memory addresses
  • Needs careful threshold tuning per encoding type.
  • Better as companion rules (one per encoding type) rather than broadening the existing base64 rule.

Origin

PR #12 review comment by @gewenyu99

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions