[Code scan] Notebook HTML renderer does not escape JSON content #120

Description

@njzjz

This issue was found by a Codex global code scan of the repository.

Affected code:

dargs/dargs/notebook.py

Lines 260 to 266 in b4db564

buff.append('"')
if isinstance(self.arg, Argument):
    buff.append(self.arg.name)
elif isinstance(self.arg, Variant):
    buff.append(self.arg.flag_name)
elif isinstance(self.arg, str):
    buff.append(self.arg)

dargs/dargs/notebook.py

Lines 300 to 308 in b4db564

doc_body = html.escape(self.arg.doc.strip())
if doc_body:
    buff.append("<hr/>")
    doc_body = re.sub(r"""\n+""", "\n", doc_body)
    doc_body = doc_body.replace("\n", linebreak)
    doc_body = re.sub(
        r"`+(.*?)`+", r'<span class="dargs-doc-code">\1</span>', doc_body
    )
    doc_body = re.sub(r"\*(.+)\*", r"<i>\1</i>", doc_body)

dargs/dargs/notebook.py

Lines 350 to 356 in b4db564

buff.append(
    json.dumps(self.data, indent=2)
    .replace(" ", "&nbsp;")
    .replace(
        "\n", f"""</code>{linebreak}{indent}<code class="dargs-code">"""
    )
)

Problem:
print_html() builds HTML by appending JSON keys, JSON values, and generated doc fragments directly into a string. Values produced by json.dumps() are not HTML-escaped before being inserted into <code> elements.

Reproducer:

from dargs import Argument
from dargs.notebook import print_html

html = print_html({"x": "<script>alert(1)</script>"}, [Argument("x", str)])
print("<script>alert(1)</script>" in html)

Observed behavior:
The returned HTML contains the raw <script> tag.

Expected behavior:
User-provided JSON keys and values, plus any generated text inserted into HTML, should be escaped before rendering in notebooks.
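
Possible fix (sketch):
One way to get the expected behavior is to pass the json.dumps() output through the standard-library html.escape() before the &nbsp; and line-break substitutions. The snippet below is a standalone illustration, not the dargs implementation: the helper name render_json_value and its linebreak/indent parameters are hypothetical, chosen only to mirror the affected lines. JSON keys written via buff.append(self.arg.name) and similar calls would presumably need the same html.escape() treatment.

```python
import html
import json


def render_json_value(data, linebreak="<br/>", indent=""):
    """Hypothetical helper (not part of dargs): dump data as JSON, HTML-escaped."""
    dumped = json.dumps(data, indent=2)
    # Escape &, <, > and quotes first, so the entities and tags added
    # below are not themselves escaped.
    escaped = html.escape(dumped)
    return escaped.replace(" ", "&nbsp;").replace(
        "\n", f'</code>{linebreak}{indent}<code class="dargs-code">'
    )


rendered = render_json_value({"x": "<script>alert(1)</script>"})
print("<script>" in rendered)  # False
```

Escaping before the substitutions matters: running html.escape() afterwards would turn the generated &nbsp; entities and <code> tags into literal text.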
