26 changes: 14 additions & 12 deletions DOCS/features/reporting/report-templates/word-template-styles.mdx
@@ -11,16 +11,18 @@ You can configure many variables for a style in a Word document. In some cases,

These styles are called by name:

| **Style Name** | **Description** |
|--------------------|-------------------------------------------------------------------------------------------------------------------------------|
| `CodeBlock` | Style text evidence and anything in the WYSIWYG editor's code editor (must be a *Paragraph* style type). |
| `CodeInline` | Style runs of text formatted as code in the WYSIWYG editor (must be a *Character* style type). |
| `Number List` | Style numbered (ordered) lists. |
| `Bullet List` | Style bulleted (unordered) lists. |
| `Caption` | Built-in style used for captions below evidence and lines preceded by the *`{{.caption}}`* expression. |
| `List Paragraph` | Built-in base style used for bulleted and numbered lists; the fallback built-in style if your template lacks customized styles. |
| `BlockQuote` | Style used for block quotes. |
| `Table Grid` | Built-in style used for tables. |
| **Style Name** | **Description** |
|---------------------|---------------------------------------------------------------------------------------------------------------------------------|
| `CodeBlock` | Style text evidence and anything in the WYSIWYG editor's code editor (must be a *Paragraph* style type). |
| `CodeInline` | Style runs of text formatted as code in the WYSIWYG editor (must be a *Character* style type). |
| `Number List` | Style numbered (ordered) lists. |
| `Bullet List` | Style bulleted (unordered) lists. |
| `Caption` | Built-in style used for captions below evidence and lines preceded by the *`{{.caption}}`* expression. |
| `List Paragraph` | Built-in base style used for bulleted and numbered lists; the fallback built-in style if your template lacks customized styles. |
| `BlockQuote` | Style used for block quotes. |
| `Table Grid` | Built-in style used for tables. |
| `Footnote Reference`| Built-in style used for footnote numbers. |
| `Footnote Text` | Built-in style used for footnote text. |


<Check>
@@ -30,7 +32,7 @@ You can choose not to create these list styles, but lists will probably not look

When you create a list in Word, the application applies *List Paragraph* and additional styling depending on your selection (numbered or bulleted). The style will appear as *List Paragraph,DAI2* or similar.

This style **does not exist** in your template until you use it once, so Ghostwriter can't default to using it. (See below.)
This style **does not exist** in your template until you use it once, so Ghostwriter can't default to using it (see below).

Create a numbered list, open the styles tab, and save the style as a new style named *Number List*. Repeat this for bulleted lists, naming that style *Bullet List*.

@@ -40,7 +42,7 @@ Feel free to modify the indentation for nested list items and any other style va
<Check>
**Note on Built-in Styles**

Word offers many, many built-in styles you might expect to be available to Ghostwriter; however, these styles only exist in the Word *application*. Word will only add a style to your template's styles.xml when you use it to keep file size down.
Word offers many built-in styles you might expect to be available to Ghostwriter; however, these styles exist only in the Word *application*. To keep file size down, Word adds a style to your template's internal _styles.xml_ only when you use it. Such styles have these attributes applied: `<w:semiHidden/><w:unhideWhenUsed/>`.

This means a style like *Caption* will not exist in your template until you've applied it or created it yourself.
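
As a rough way to check this yourself, the sketch below lists the styles a template actually defines by reading _styles.xml_ straight out of the .docx archive with the Python standard library. The helper names `defined_style_names` and `missing_styles` are hypothetical and are not part of Ghostwriter, and the case-insensitive comparison is an assumption made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def defined_style_names(docx_file):
    """Return the names of styles actually defined in word/styles.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/styles.xml"))
    names = set()
    # Only <w:style> elements count; <w:latentStyles> entries are skipped
    for style in root.iter(f"{{{W_NS}}}style"):
        name = style.find(f"{{{W_NS}}}name")
        if name is not None:
            names.add(name.get(f"{{{W_NS}}}val"))
    return names


def missing_styles(docx_file, expected):
    """Return the expected style names that the template does not define."""
    # Assumption for this sketch: compare names case-insensitively
    defined = {name.lower() for name in defined_style_names(docx_file)}
    return [name for name in expected if name.lower() not in defined]
```

Calling `missing_styles("template.docx", ["Caption", "Footnote Text"])` on your own template would return whichever of those names still need to be applied or created in Word.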

Binary file modified DOCS/sample_reports/template.docx
Binary file not shown.
2 changes: 2 additions & 0 deletions compose/local/django/Dockerfile
@@ -21,6 +21,8 @@ RUN apk --no-cache add build-base curl \
# Rust and Cargo required by the ``cryptography`` Python package
&& apk --no-cache add rust \
&& apk --no-cache add cargo \
# Git for installing packages from GitHub
&& apk --no-cache add git \
&& pip install --no-cache-dir -U setuptools pip

COPY ./requirements /requirements
2 changes: 2 additions & 0 deletions compose/production/django/Dockerfile
@@ -22,6 +22,8 @@ RUN apk --no-cache add build-base curl \
# Rust and Cargo required by the ``cryptography`` Python package
&& apk --no-cache add rust \
&& apk --no-cache add cargo \
# Git for installing packages from GitHub
&& apk --no-cache add git \
&& addgroup -S django \
&& adduser -S -G django django \
&& pip install --no-cache-dir -U setuptools pip
73 changes: 70 additions & 3 deletions ghostwriter/modules/reportwriter/base/docx.py
@@ -4,12 +4,14 @@
import os
import re

from docxtpl import DocxTemplate, RichText as DocxRichText
from docx.opc.exceptions import PackageNotFoundError
from docx import Document
from docx.enum.style import WD_STYLE_TYPE
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Inches, Pt
from docx.image.exceptions import UnrecognizedImageError
from docx.opc.exceptions import PackageNotFoundError
from docx.oxml.ns import qn
from docx.shared import Inches, Pt
from docxtpl import DocxTemplate, RichText as DocxRichText

from ghostwriter.commandcenter.models import CompanyInformation, ReportConfiguration
from ghostwriter.modules.reportwriter.base import ReportExportTemplateError
@@ -32,6 +34,8 @@
    "Caption",
    "List Paragraph",
    "Blockquote",
    "footnote text",  # Lowercase to match style name
    "footnote reference",  # Lowercase to match style name
] + [f"Heading {i}" for i in range(1, 7)]

_img_desc_replace_re = re.compile(r"^\s*\[\s*([a-zA-Z0-9_]+)\s*\]\s*(.*)$")
@@ -128,8 +132,65 @@ def run(self) -> io.BytesIO:

        out = io.BytesIO()
        self.word_doc.save(out)

        # Post-process to clean up separator footnotes (remove extra empty paragraphs)
        out = self._cleanup_footnote_separators(out)

        return out

    def _cleanup_footnote_separators(self, docx_bytes: io.BytesIO) -> io.BytesIO:
        """
        Remove extra empty paragraphs from separator footnotes.

        Some Word templates have extra empty paragraphs in the separator and
        continuationSeparator footnotes, which causes unwanted spacing between
        the footnote separator line and the actual footnotes.

        This post-processes the saved DOCX to avoid interfering with docxtpl's
        template rendering.
        """
        try:
            docx_bytes.seek(0)
            doc = Document(docx_bytes)

            # Access the footnotes part (requires accessing internal python-docx members)
            # pylint: disable=protected-access
            if not hasattr(doc._part, '_footnotes_part') or doc._part._footnotes_part is None:
                docx_bytes.seek(0)
                return docx_bytes

            footnotes_part = doc._part._footnotes_part
            footnotes_element = footnotes_part._element
            # pylint: enable=protected-access

            modified = False
            for footnote in footnotes_element:
                # Only clean separator footnotes (id=-1 or id=0)
                footnote_id = footnote.get(qn("w:id"))
                if footnote_id in ("-1", "0"):
                    # Find all paragraph elements
                    paragraphs = list(footnote.iterchildren(qn("w:p")))
                    # Keep only the first paragraph (which contains the separator)
                    for para in paragraphs[1:]:
                        footnote.remove(para)
                        modified = True

            if modified:
                # Save the modified document
                out = io.BytesIO()
                doc.save(out)
                out.seek(0)
                return out

            docx_bytes.seek(0)
            return docx_bytes

        except Exception as e:  # pylint: disable=broad-exception-caught
            # Log but don't fail the report generation
            logger.warning("Failed to cleanup footnote separators: %s", e)
            docx_bytes.seek(0)
            return docx_bytes

def create_styles(self):
"""
Creates default styles
@@ -360,6 +421,12 @@ def lint(cls, report_template: ReportTemplate) -> Tuple[List[str], List[str]]:
if style == "List Paragraph":
if document_styles[style].type != WD_STYLE_TYPE.PARAGRAPH:
warnings.append("List Paragraph style is not a paragraph style (see documentation)")
if style == "footnote text":
if document_styles[style].type != WD_STYLE_TYPE.PARAGRAPH:
warnings.append("Footnote Text style is not a paragraph style (see documentation)")
if style == "footnote reference":
if document_styles[style].type != WD_STYLE_TYPE.CHARACTER:
warnings.append("Footnote Reference style is not a character style (see documentation)")
if "Table Grid" not in document_styles:
errors.append("Template is missing a required style (see documentation): Table Grid")
if report_template.p_style and report_template.p_style not in document_styles:
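
The separator cleanup in `_cleanup_footnote_separators` can be illustrated on a raw footnotes XML string. This is a standalone sketch that uses the standard library's `xml.etree.ElementTree` rather than the python-docx objects the change itself relies on; the function name `trim_separator_footnotes` is hypothetical, and the sketch assumes the footnotes are direct children of the root element.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def trim_separator_footnotes(footnotes_xml):
    """Keep only the first <w:p> in separator footnotes (w:id of -1 or 0).

    Returns the updated XML string and the number of paragraphs removed.
    """
    root = ET.fromstring(footnotes_xml)
    removed = 0
    for footnote in root.findall(f"{{{W_NS}}}footnote"):
        if footnote.get(f"{{{W_NS}}}id") not in ("-1", "0"):
            continue  # Regular footnotes are left untouched
        # The first paragraph holds the separator line; drop the rest
        for extra in footnote.findall(f"{{{W_NS}}}p")[1:]:
            footnote.remove(extra)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

Returning the removal count mirrors the `modified` flag in the method above: a caller can skip re-saving the document when nothing changed.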
119 changes: 108 additions & 11 deletions ghostwriter/modules/reportwriter/richtext/docx.py
@@ -249,6 +249,101 @@ def tag_div(self, el, **kwargs):
else:
super().tag_div(el, **kwargs)

    def tag_span(self, el, *, par, **kwargs):
        """Override tag_span to handle footnotes."""
        if "footnote" in el.attrs.get("class", []):
            self.make_footnote(el, par=par, **kwargs)
        else:
            super().tag_span(el, par=par, **kwargs)

    def make_footnote(self, el, *, par=None, **kwargs):
        """
        Handle <span class="footnote"> elements by creating a Word footnote.

        The footnote content is the text content of the element.
        A footnote reference is inserted at the current position in the paragraph.
        """
        if par is None:
            logger.warning("Footnote found outside of a paragraph, skipping")
            return

        # Get the footnote content from the element's text
        footnote_content = el.get_text().strip()
        if not footnote_content:
            return

        # Emit any pending segment break before adding footnote
        self.text_tracking.force_emit_pending_segment_break()

        # Calculate the next footnote ID by finding the max existing ID
        # This is simpler and more reliable than the paragraph-based algorithm
        # which doesn't work well for table cells or dynamically-built documents
        max_existing_id = 0
        for footnote in self.doc.footnotes:
            max_existing_id = max(max_existing_id, footnote.id)
        next_footnote_id = max_existing_id + 1

        # Add footnote reference to the run and create the footnote
        paragraph_element = par._p.add_r()
        paragraph_element.add_footnoteReference(next_footnote_id)
        new_footnote = self.doc._add_footnote(next_footnote_id)

        # Add the footnote paragraph with the footnote reference mark at the start
        # This is required for Word to properly display the footnote number
        footnote_paragraph = new_footnote.add_paragraph()

        # Track if either style is missing
        missing_footnote_text = False
        missing_footnote_ref = False

        # Try to apply "Footnote Text" style to the paragraph
        try:
            footnote_paragraph.style = "Footnote Text"
        except KeyError:
            missing_footnote_text = True

        # Create a run for the footnote reference number
        try:
            footnote_ref_run = footnote_paragraph.add_run()
            footnote_ref_run.style = "Footnote Reference"
            footnote_ref_run.font.superscript = True
        except KeyError:
            missing_footnote_ref = True
            footnote_ref_run = footnote_paragraph._p.add_r()
            run_properties = OxmlElement("w:rPr")
            style_element = OxmlElement("w:rStyle")
            style_element.set(qn("w:val"), "FootnoteReference")
            run_properties.append(style_element)
            vert_align = OxmlElement("w:vertAlign")
            vert_align.set(qn("w:val"), "superscript")
            run_properties.append(vert_align)
            footnote_ref_run.insert(0, run_properties)

        # Add the footnote reference mark
        footnote_ref_element = OxmlElement("w:footnoteRef")
        if hasattr(footnote_ref_run, "_r"):
            footnote_ref_run._r.append(footnote_ref_element)
        else:
            footnote_ref_run.append(footnote_ref_element)

        # Add a space and the footnote text with "Footnote Text" character style
        text_run = footnote_paragraph.add_run(" " + footnote_content)
        try:
            text_run.style = "Footnote Text"
        except KeyError:
            missing_footnote_text = True
            text_run.font.size = Pt(10)

        # If either style is missing, apply default formatting to the paragraph
        if missing_footnote_text or missing_footnote_ref:
            pf = footnote_paragraph.paragraph_format
            pf.line_spacing = 1.0
            pf.space_before = 0
            for run in footnote_paragraph.runs:
                run.font.size = Pt(10)

# ...existing code...

    def create_table(self, rows, cols, **kwargs):
        table = self.doc.add_table(rows=rows, cols=cols, style="Table Grid")
        table.autofit = True
@@ -325,6 +420,8 @@ def tag_span(self, el, *, par, **kwargs):
            ref_name = el.attrs["data-gw-ref"]
            self.text_tracking.force_emit_pending_segment_break()
            self.make_cross_ref(par, ref_name)
        elif "footnote" in el.attrs.get("class", []):
            self.make_footnote(el, par=par, **kwargs)
        else:
            super().tag_span(el, par=par, **kwargs)

@@ -503,17 +600,17 @@ def make_evidence(self, par, evidence):
try:
self._make_image(par, file_path)
except UnrecognizedImageError as e:
logger.exception(
"Evidence file known as %s (%s) was not recognized as a %s file.",
evidence["friendly_name"],
file_path,
extension,
)
error_msg = (
f'The evidence file, `{evidence["friendly_name"]},` was not recognized as a {extension} file. '
"Try opening it, exporting as desired type, and re-uploading it."
)
raise ReportExportTemplateError(error_msg) from e
logger.exception(
"Evidence file known as %s (%s) was not recognized as a %s file.",
evidence["friendly_name"],
file_path,
extension,
)
error_msg = (
f'The evidence file, `{evidence["friendly_name"]},` was not recognized as a {extension} file. '
"Try opening it, exporting as desired type, and re-uploading it."
)
raise ReportExportTemplateError(error_msg) from e

if self.global_report_config.figure_caption_location == "bottom":
par_caption = self.doc.add_paragraph()
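
The numbering and layout that `make_footnote` aims for can be sketched directly on WordprocessingML elements. The example below uses only the standard library, not the python-docx calls from the change, and the helper names `w`, `next_footnote_id`, and `add_footnote` are hypothetical. It assumes, as the method above does, that the next ID is one more than the largest existing ID, with a floor of zero.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W_NS)


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def next_footnote_id(footnotes_root):
    """Return max existing ID + 1; separator IDs (-1, 0) never exceed the floor of 0."""
    ids = [int(f.get(w("id"))) for f in footnotes_root.findall(w("footnote"))]
    return max([0, *ids]) + 1


def add_footnote(footnotes_root, text):
    """Append a footnote holding a reference mark, a space, and the text."""
    footnote_id = next_footnote_id(footnotes_root)
    footnote = ET.SubElement(footnotes_root, w("footnote"), {w("id"): str(footnote_id)})
    paragraph = ET.SubElement(footnote, w("p"))

    # First run: the automatic footnote number, superscripted
    ref_run = ET.SubElement(paragraph, w("r"))
    ref_props = ET.SubElement(ref_run, w("rPr"))
    ET.SubElement(ref_props, w("vertAlign"), {w("val"): "superscript"})
    ET.SubElement(ref_run, w("footnoteRef"))

    # Second run: a leading space plus the footnote text
    text_run = ET.SubElement(paragraph, w("r"))
    text_el = ET.SubElement(text_run, w("t"), {XML_SPACE: "preserve"})
    text_el.text = " " + text
    return footnote_id
```

The body paragraph would also need a run containing a `w:footnoteReference` element with the same ID; that half is omitted here to keep the sketch short.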
8 changes: 8 additions & 0 deletions ghostwriter/modules/reportwriter/richtext/pptx.py
@@ -63,6 +63,12 @@ def style_run(self, run, style):
        if "font_color" in style:
            run.font.color.rgb = PptxRGBColor(*style["font_color"])

    def tag_footnote(self, el, **kwargs):  # pylint: disable=unused-argument
        """
        Handle <span class="footnote"> elements - PowerPoint doesn't support footnotes,
        so we silently ignore them.
        """

    def tag_br(self, el, *, par=None, **kwargs):
        self.text_tracking.new_block()
        if par is not None:
@@ -187,6 +193,8 @@ def tag_span(self, el, *, par, **kwargs):
            run = par.add_run()
            run.text = f"See {ref_name}"
            run.font.italic = True
        elif "footnote" in el.attrs.get("class", []):
            self.tag_footnote(el, par=par, **kwargs)
        else:
            super().tag_span(el, par=par, **kwargs)
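
Because the PowerPoint exporter ignores footnote spans, their text never reaches the slide. The effect can be approximated with the standard library's `html.parser`, as in the sketch below; `FootnoteStripper` and `strip_footnotes` are hypothetical names, and the real exporter walks a parsed element tree rather than streaming tags like this.

```python
from html.parser import HTMLParser


class FootnoteStripper(HTMLParser):
    """Collect visible text while skipping <span class="footnote"> content."""

    def __init__(self):
        super().__init__()
        self.parts = []      # Text kept for the slide
        self.footnotes = []  # Text that was dropped
        self._depth = 0      # Greater than 0 while inside a footnote span
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        # Count nested spans so the matching close tag ends the footnote
        if self._depth or "footnote" in classes:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "span" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.footnotes.append("".join(self._buffer).strip())
                self._buffer = []

    def handle_data(self, data):
        (self._buffer if self._depth else self.parts).append(data)


def strip_footnotes(html):
    """Return (visible_text, dropped_footnotes) for an HTML fragment."""
    parser = FootnoteStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts), parser.footnotes
```

Keeping the dropped text in a separate list is a design choice of this sketch only; `tag_footnote` above discards it silently.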
