
fix: protect special characters in math from markdown table parsers (#1462) #8000

Open
hugogu wants to merge 3 commits into requarks:main from hugogu:fix/katex

hugogu commented May 10, 2026

Change Summary

Wiki.js uses markdown-it-attrs, which interprets curly braces inside inline math ($...$) as attribute directives and strips them from the formula. Additionally, markdown table parsers split cells at both | and & characters, breaking formulas that contain those symbols.
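To make the table half of the problem concrete, here is a minimal sketch of a cell splitter that treats every pipe as a delimiter. `splitCells` and `mathRow` are hypothetical names for illustration; this is a simplified stand-in and not the table parser Wiki.js actually uses.

```javascript
// Simplified stand-in for a table parser (an assumption, not Wiki.js code):
// drop the outer pipes, then split the row on every remaining pipe.
function splitCells (row) {
  return row.trim().replace(/^\||\|$/g, '').split('|').map(cell => cell.trim())
}

// A two-column row whose second cell contains |...| inside inline math.
const mathRow = '| $\\mathfrak{c}$ | $|\\mathbb{R}| = \\mathfrak{c}$ |'

// The pipes inside the formula are treated as column delimiters,
// so the row yields 4 cells instead of the intended 2.
console.log(splitCells(mathRow).length) // 4
```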

This fix replaces {, }, |, and & inside math expressions with Unicode Private Use Area placeholders during markdown parsing, then restores them before passing to KaTeX/MathJax for rendering.

  • U+E000 / U+E001: temporary replacements for { / }
  • U+E002: temporary replacement for | (table cell delimiter)
  • U+E003: temporary replacement for & (table cell delimiter in multiline tables, used by LaTeX cases/arrays)

The placeholder approach was chosen over HTML escaping because it preserves LaTeX environments like \begin{array} that were broken by the previous {{}} escaping method.
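The protect/restore round trip described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names `protectMath` and `restoreMath`, the `MATH_RE` pattern, and the exact code points (assumed to be U+E000 to U+E003) are all hypothetical, and escaped dollar signs (`\$`) are not handled.

```javascript
// Assumed mapping from special characters to Private Use Area placeholders.
const PLACEHOLDERS = {
  '{': '\uE000',
  '}': '\uE001',
  '|': '\uE002',
  '&': '\uE003'
}

// Reverse lookup: placeholder -> original character.
const ORIGINALS = Object.fromEntries(
  Object.entries(PLACEHOLDERS).map(([ch, ph]) => [ph, ch])
)

// Matches $$...$$ blocks (may span lines) or $...$ inline math (single line only).
const MATH_RE = /\$\$[\s\S]+?\$\$|\$[^$\n]+?\$/g

// Before markdown parsing: hide special characters that occur inside math.
function protectMath (src) {
  return src.replace(MATH_RE, math =>
    math.replace(/[{}|&]/g, ch => PLACEHOLDERS[ch])
  )
}

// Before handing the formula to the math renderer: put the characters back.
function restoreMath (text) {
  return text.replace(/[\uE000-\uE003]/g, ph => ORIGINALS[ph])
}
```

In this sketch, characters outside math are left alone, so the pipes that delimit table cells still work, and `restoreMath(protectMath(src))` returns the original source.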

Fixes #1581
Fixes #1462

Test

# Issue #1462 Test Cases

# LaTeX Test

## Special Functions

| Symbol | Meaning | Definition/Example |
|------|------|----------|
| $\zeta(s)$ | Riemann zeta function | $\zeta(2) = \frac{\pi^2}{6}$ |
| $\Gamma(x)$ | Gamma function | $\Gamma(n) = (n-1)!$ |
| $\Gamma(z)$ | Gamma function (complex) | $\Gamma(z) = \int_0^{\infty} t^{z-1}e^{-t}dt$ |
| $B(x, y)$ | Beta function | $B(x,y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}$ |
| $\psi(x)$ | Digamma function | $\psi(x) = \frac{\Gamma'(x)}{\Gamma(x)}$ |
| $\mathrm{Heaviside}(x)$ | Heaviside step | $H(x) = \begin{cases} 1 & x \geq 0 \\ 0 & x < 0 \end{cases}$ |
| $\delta(x)$ | Dirac delta function | $\int_{-\infty}^{\infty} \delta(x)dx = 1$ |
| $\mathrm{sinc}(x)$ | sinc function | $\mathrm{sinc}(x) = \frac{\sin x}{x}$ |
| $\mathrm{rect}(x)$ | Rectangular function | $\mathrm{rect}(x) = \begin{cases} 1 & |x| \leq 1/2 \\ 0 & \text{otherwise} \end{cases}$ |
| $\sigma_x, \sigma_y, \sigma_z$ | Pauli matrices | $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ |
| $\mathfrak{c}$ | Cardinality of the continuum | $|\mathbb{R}| = \mathfrak{c}$ |
| $\sum_{n \in S}$ | Sum over a set | $\sum_{n \in \mathbb{N}} \frac{1}{n^2}$ |

## Array Environment

$\begin{array}{lrc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}$

text $\begin{array}{lrc} 1 & 2 & 3 \\ 4 & 5 & 6 \end{array}$ text

## Simple Fractions

$\frac{1}{3}$

text $\frac{1}{3}$ text

## Fractions with Superscripts

$\frac{1^{2}}{3^{4}}$

$\frac{1}{3^{4}}$

$\frac{1^{2}}{3}$

## Einstein Field Equation

$G_{\mu\nu} + \Lambda g_{\mu\nu} = \frac{8\pi G}{c^4} T_{\mu\nu}$

## Block Array

$$
\begin{array}{lrc}
1 & 2 & 3 \\
4 & 5 & 6
\end{array}
$$

The current latest build (2.5.314) renders it as
image

and

image

After the fix, all LaTeX is rendered correctly:

image

and

image

Wiki.js uses markdown-it-attrs which interprets curly braces inside
inline math ($...$) as attribute directives, stripping them from the
formula. Additionally, markdown table parsers split cells at both `|`
and `&` characters, breaking formulas containing those symbols.

This fix replaces `{`, `}`, `|`, and `&` inside math expressions with
Unicode Private Use Area placeholders during markdown parsing, then
restores them before passing to KaTeX/MathJax for rendering.

- `U+E000` / `U+E001`: temporary replacements for `{` / `}`
- `U+E002`: temporary replacement for `|` (table cell delimiter)
- `U+E003`: temporary replacement for `&` (table cell delimiter in
  multiline tables, used by LaTeX cases/arrays)

The placeholder approach was chosen over HTML escaping because it
preserves LaTeX environments like `\begin{array}` that were broken
by the previous `{{}}` escaping method.

Fixes requarks#1581
Fixes requarks#1462

Co-authored-by: Claude <noreply@anthropic.com>
AI-model: kimi-for-coding/k2p6
auto-assign bot requested a review from NGPixel May 10, 2026 08:14
hugogu and others added 2 commits May 12, 2026 22:49
Unclosed $ delimiters would span across multiple lines, causing
protectMathPipes() to corrupt table cell delimiters in unrelated
content. Limit inline math matching to the same line to prevent
false matches.

Co-authored-by: Claude <noreply@anthropic.com>
AI-model: kimi-for-coding/k2p6
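
The effect of limiting inline math to a single line can be illustrated with two patterns. This is a hedged sketch, not the PR's code: `protectPipes`, both regular expressions, and the sample `doc` are assumptions made for illustration; only the name `protectMathPipes()` appears in the commit message, and its real implementation may differ.

```javascript
// Pattern without a line restriction: an unclosed $ can pair with a $
// that appears several lines later.
const MULTILINE_INLINE = /\$([^$]+?)\$/g

// Same-line pattern: [^$\n] makes the match fail at a line break.
const SAME_LINE_INLINE = /\$([^$\n]+?)\$/g

// Replace pipes inside every span matched by the given pattern
// with the assumed placeholder U+E002.
function protectPipes (src, pattern) {
  return src.replace(pattern, span => span.replace(/\|/g, '\uE002'))
}

const doc = [
  'Price: 5$ (unclosed delimiter)',
  '',
  '| a | b |',
  '|---|---|',
  '| $x$ | y |'
].join('\n')

// The unrestricted pattern pairs the stray $ with the $ in the last row,
// so the pipes of the table rows in between get replaced.
console.log(protectPipes(doc, MULTILINE_INLINE).includes('\uE002')) // true

// The same-line pattern only matches $x$, which contains no pipes,
// so the document is returned unchanged.
console.log(protectPipes(doc, SAME_LINE_INLINE) === doc) // true
```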

Development

Successfully merging this pull request may close these issues.

Inline LaTeX rendering fails for certain commands if it's not surrounded by text
Katex subscript rendering issue
