Merged
Binary file modified xkcd-script/font/xkcd-script.otf
Binary file not shown.
13,901 changes: 7,837 additions & 6,064 deletions xkcd-script/font/xkcd-script.sfd

Large diffs are not rendered by default.

Binary file modified xkcd-script/font/xkcd-script.ttf
Binary file not shown.
Binary file modified xkcd-script/font/xkcd-script.woff
Binary file not shown.
25 changes: 25 additions & 0 deletions xkcd-script/generator/README.md
@@ -0,0 +1,25 @@
# xkcd-script font generation pipeline

Each script runs in order inside the `fontbuilder` Docker image; `run.sh` orchestrates them and accepts an optional starting step (`./run.sh 5` skips pt1–pt4).

## Stages

| # | Script | What it does |
|---|---|---|
| 1 | `pt1_character_extraction.py` | Extract character strokes from `handwriting_minimal.png` (scikit-image). |
| 2 | `pt2_character_classification.py` | Cluster strokes into lines (k-means, fixed seed). |
| 3 | `pt3_ppm_to_svg.py` | Convert per-character PPM → SVG via `potrace`. |
| 4 | `pt4_additional_sources.py` | Trace extra glyphs from comic panels and `extras/`. |
| 5 | `pt5_svg_to_font.py` | Import SVG glyphs into a FontForge SFD; apply stroke normalisation, weight nudges, math-symbol imports. |
| 6 | `pt6_derived_chars.py` | Build derived/composed glyphs: diacritics, ligatures, Greek, IPA, combining marks, math cmap aliases (U+1D400 block via altuni). |
| 7 | `pt7_font_properties.py` | Apply kerning and GPOS anchors; pin CFF hints and OS/2 metrics. Output is `xkcd-script-pt7.sfd`, the **base** font used for everything downstream. |
| 8 | `pt8_derivatives.py` | Orchestrator. Runs each `pt8X_*.py` derivative step in turn. |
| 9 | `pt9_gen_reprod_font.py` | Scrub the SFD for reproducibility, freeze CFF charstrings, generate committed binaries (otf/ttf/woff). |

## Derivatives (pt8)

`pt7` produces a single kitchen-sink base SFD with everything — Latin, Greek, math symbols and aliases, ligatures, combining marks. Each `pt8X_<name>.py` reads that base and either writes its own derivative SFD or extracts data from it to splice elsewhere; `pt8_derivatives.py` runs them with `runpy`.

Today there is no live derivative font: the sole entry, `pt8a_mathjax3.py`, only extracts extensible-glyph outline data into `../xkcd-mathjax3.js`. The display-sized large operators that used to live in a separate mathjax3 WOFF are now stylistic alternates in the base font (`ss01`).

Because derivatives can only subtract or overlay what pt7 already has, **everything plausibly useful belongs in pt7**. Don't pre-strip pt7 for size — that loses the subtractive option.
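
The orchestration described in this section could look roughly like the sketch below. It is not the repository's `pt8_derivatives.py`: the function name `run_derivatives`, the glob pattern and the name-ordered execution are assumptions made for illustration. Only `runpy.run_path` from the standard library is relied on.

```python
import glob
import os
import runpy


def run_derivatives(script_dir):
    """Run every pt8X_*.py script found in script_dir, in name order.

    Hypothetical helper: the real orchestrator may list its steps
    explicitly instead of globbing.
    """
    ran = []
    # pt8[a-z]_*.py matches pt8a_mathjax3.py but not pt8_derivatives.py.
    pattern = os.path.join(script_dir, 'pt8[a-z]_*.py')
    for path in sorted(glob.glob(pattern)):
        # run_name='__main__' makes each step behave as if run directly.
        runpy.run_path(path, run_name='__main__')
        ran.append(os.path.basename(path))
    return ran
```

Each step runs in a fresh module namespace, so a derivative script has to load the pt7 base SFD itself and cannot rely on variables left behind by an earlier step.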
Binary file added xkcd-script/generator/extras/sqrt_vertical.png
14 changes: 12 additions & 2 deletions xkcd-script/generator/pt4_additional_sources.py
@@ -72,17 +72,23 @@ def _clean_potrace_svg(raw_svg_path, clean_svg_path):
os.remove(clean_svg_path + '.sfd')


def extract_symbol(arr, y0, y1, x0, x1, name, exclude=None):
def extract_symbol(arr, y0, y1, x0, x1, name, exclude=None, pad=0):
    """Crop glyph region, upsample, binarise, run potrace, clean, save SVG.

    exclude: optional list of (y0, y1, x0, x1) regions in full-image coordinates
        to blank out (set to background) before potrace, for removing
        artefacts that cannot be separated by tightening the main crop.
    pad: white-pixel border added around the crop before upsampling. Use
        when the source PNG is tightly cropped and ink touches the edge;
        potrace otherwise produces edge artefacts where contours run into
        the canvas boundary.
    """
    crop = arr[y0:y1, x0:x1].copy()
    if exclude:
        for ey0, ey1, ex0, ex1 in exclude:
            crop[ey0 - y0:ey1 - y0, ex0 - x0:ex1 - x0] = 255
    if pad:
        crop = np.pad(crop, pad, mode='constant', constant_values=255)
    upsample = SPECIALUPSAMPLE.get(name, UPSAMPLE)
    big = Image.fromarray(crop).resize(
        (crop.shape[1] * upsample, crop.shape[0] * upsample),
@@ -175,14 +181,18 @@ def extract_symbol(arr, y0, y1, x0, x1, name, exclude=None):
    ('right_half_arrow', '2343_mathematical_symbol_fight_2x__right_half_arrow'), # ⇀ U+21C0 source
    ('right_lim_arrow', '2343_mathematical_symbol_fight_2x__right_lim_arrow'), # → U+2192 source
    ('triangle', '2343_mathematical_symbol_fight_2x__triangle'), # △ U+25B3 source
    ('circled_times', '2034_equations_2x__circled_times'), # ⊗ U+2297 source
    ('sqrt_vertical', 'sqrt_vertical'), # √ tall surd, unencoded glyph `radical.tall`
    ('braceleft_tall', '2435_geothmetic_meandian_2x__brace__shortened'), # { tall brace for MathJax stretchy assembly
    ('parenleft_tall', '2059_modified_bayes_theorem_2x__lparen'), # ( tall paren for MathJax stretchy assembly (\binom, \left(, pmatrix); right paren is mirrored in pt5
]

print('Extracting hand-drawn extras...')
for name, filename in EXTRAS:
    src_path = os.path.join(EXTRAS_DIR, f'{filename}.png')
    arr_extra = np.array(Image.open(src_path).convert('L'))
    h, w = arr_extra.shape
    extract_symbol(arr_extra, 0, h, 0, w, name)
    extract_symbol(arr_extra, 0, h, 0, w, name, pad=10)


# ---------------------------------------------------------------------------
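
A dependency-free sketch of the crop, exclude and pad order used in `extract_symbol`, with nested lists standing in for the NumPy array. The function name `crop_exclude_pad` and the clamping of exclusion boxes to the crop are assumptions for illustration; the real code relies on array slicing and `np.pad`.

```python
WHITE = 255  # background value for an 8-bit greyscale image


def crop_exclude_pad(image, y0, y1, x0, x1, exclude=None, pad=0):
    """Crop rows y0:y1 and columns x0:x1, blank excluded boxes, add a border.

    `image` is a list of rows of pixel values. Stand-in for illustration,
    not the real extract_symbol.
    """
    crop = [row[x0:x1] for row in image[y0:y1]]  # slices copy the rows
    for ey0, ey1, ex0, ex1 in exclude or []:
        # Exclusion boxes use full-image coordinates, so shift them by the
        # crop origin. Clamping is an addition here; a plain slice with a
        # negative start would wrap around instead.
        for y in range(max(ey0 - y0, 0), min(ey1 - y0, len(crop))):
            for x in range(max(ex0 - x0, 0), min(ex1 - x0, len(crop[y]))):
                crop[y][x] = WHITE
    if pad:
        width = (len(crop[0]) if crop else 0) + 2 * pad
        top = [[WHITE] * width for _ in range(pad)]
        bottom = [[WHITE] * width for _ in range(pad)]
        middle = [[WHITE] * pad + row + [WHITE] * pad for row in crop]
        crop = top + middle + bottom
    return crop
```

Padding after the exclusion step keeps the exclusion coordinates simple: they only ever need the crop origin subtracted, never the border width.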
66 changes: 63 additions & 3 deletions xkcd-script/generator/pt5_svg_to_font.py
@@ -784,13 +784,38 @@ def _import_comic_glyph(font, name, svg_path, target_top, weight_delta=0):
_math_axis = _xh / 2


def _import_math_centered(name, cp, target_top, weight_delta=0):
    """Import a math symbol SVG, then centre it at the math axis."""
def _mirror_glyph_x(src, dst_name):
    """Create an unencoded glyph `dst_name` that is `src` flipped on X.

    psMat.scale(-1, 1) puts the glyph in negative-x territory; we shift it
    back so the leftmost contour sits at x=20 (matching _import_comic_glyph's
    side-bearing) and set advance width to enclose it.
    """
    dst = font.createChar(-1, dst_name)
    dst.clear()
    for cont in src.foreground:
        dst.foreground += cont
    dst.transform(psMat.scale(-1, 1))
    bb = dst.boundingBox()
    dst.transform(psMat.translate(-bb[0] + 20, 0))
    dst.width = int(round(dst.boundingBox()[2] + 20))
    return dst


def _import_math_centered(name, cp, target_top, weight_delta=0, dst_name=None):
    """Import a math symbol SVG, then centre it at the math axis.

    cp=None creates an unencoded glyph at dst_name (defaults to name);
    otherwise the centred copy is mapped to cp.
    """
    svg = os.path.join(_COMIC_CHARS_DIR, f'{name}.svg')
    g = _import_comic_glyph(font, name, svg, target_top=target_top, weight_delta=weight_delta)
    bb = g.boundingBox()
    g.transform(psMat.translate(0, _math_axis - (bb[1] + bb[3]) / 2))
    ch = font.createMappedChar(cp)
    if cp is None:
        ch = font.createChar(-1, dst_name or name)
    else:
        ch = font.createMappedChar(cp)
    ch.clear()
    for cont in g.foreground:
        ch.foreground += cont
@@ -802,6 +827,7 @@ def _import_math_centered(name, cp, target_top, weight_delta=0):
_import_math_centered('right_lim_arrow', 0x2192, _xh, weight_delta=20) # →
_import_math_centered('right_double_arrow', 0x21D2, _xh, weight_delta=20) # ⇒
_import_math_centered('right_half_arrow', 0x21C0, _xh, weight_delta=20) # ⇀
_import_math_centered('circled_times', 0x2297, _xh, weight_delta=10) # ⊗

# △ U+25B3 WHITE UP-POINTING TRIANGLE — baseline to cap height.
_tri_svg = os.path.join(_COMIC_CHARS_DIR, 'triangle.svg')
@@ -820,6 +846,40 @@ def _import_math_centered(name, cp, target_top, weight_delta=0):
_ch.width = _tri_src.width


# Hand-drawn near-vertical surd as the unencoded glyph `radical.tall`,
# for math renderers to use on tall radicands where extending the natural
# √'s (U+221A) diagonal would lean too far. Reached via the runtime
# cut-and-extend renderer in xkcd-mathjax3.js, not via a cmap entry.
_import_math_centered('sqrt_vertical', None, 1.12 * font.em, weight_delta=0,
                      dst_name='radical.tall')


# Hand-drawn tall left brace as the unencoded glyph `braceleft.tall`, used
# by xkcd-mathjax3.js's overlay pass for \begin{cases} and other stretchy
# braces that MathJax CHTML would otherwise assemble from the missing
# U+23A7–U+23AA pieces. Reached via the runtime cut-and-extend renderer,
# not via cmap. target_top is sized to keep the imported pen weight close
# to the source's natural stroke: the shortened source is 276 px tall, so
# ≈0.7×em scales the strokes by ~2.5×, similar to other math glyphs from
# this comic.
_import_math_centered('braceleft_tall', None, 0.7 * font.em, weight_delta=20,
                      dst_name='braceleft.tall')


# Hand-drawn tall left parenthesis as the unencoded glyph `parenleft.tall`,
# used by xkcd-mathjax3.js's overlay pass for \binom, \left(...\right) around
# fractions, and \begin{pmatrix} where MathJax would otherwise draw a
# letter-height ( inside a tall reserved box. Reached via the runtime
# cut-and-extend renderer, not via cmap.
_paren_l = _import_math_centered('parenleft_tall', None, 0.8 * font.em,
                                 weight_delta=0, dst_name='parenleft.tall')

# `parenright.tall`: horizontal mirror of parenleft.tall (x negated). Kept as a
# separate glyph so font-level uses see a proper closing paren; the JS
# renderer mirrors from parenleft.tall on its own and never reads this one,
# so we don't export it to EXTENSIBLE_GLYPHS.
_mirror_glyph_x(_paren_l, 'parenright.tall')
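
The arithmetic inside `_mirror_glyph_x` can be checked without FontForge. The sketch below applies the same three steps (flip, shift to the left side-bearing, derive the advance) to plain `(x, y)` tuples; `mirror_points_x` is a hypothetical name and the tuples are a stand-in for real contours.

```python
SIDE_BEARING = 20  # font units; the value used by _mirror_glyph_x


def mirror_points_x(points, side_bearing=SIDE_BEARING):
    """Flip (x, y) points horizontally and re-seat them at the left bearing.

    Returns (new_points, advance_width).
    """
    flipped = [(-x, y) for x, y in points]  # same effect as scale(-1, 1)
    min_x = min(x for x, _ in flipped)
    # Move the leftmost point to x == side_bearing.
    shifted = [(x - min_x + side_bearing, y) for x, y in flipped]
    max_x = max(x for x, _ in shifted)
    # Advance encloses the outline plus a matching right bearing.
    return shifted, int(round(max_x + side_bearing))
```

A reflection also reverses contour winding, so code that mirrors real outlines may need a direction fix such as `correctDirection()` afterwards; whether that is required here depends on FontForge behaviour not shown in this diff.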


# ---------------------------------------------------------------------------
# Save
# ---------------------------------------------------------------------------
225 changes: 225 additions & 0 deletions xkcd-script/generator/pt6_derived_chars.py
@@ -1298,6 +1298,231 @@ def _greek_lc_to_uc(font, lc_cp, uc_cp, snap=True, weight_delta=0):
_g.width = _dx + font[_right].width


# ---------------------------------------------------------------------------
# Math cmap aliases (for MathJax / pasted Unicode math text)
# ---------------------------------------------------------------------------
# MathJax CHTML references math italic / bold / Greek codepoints directly
# (U+1D400-block). Rather than ship dedicated glyphs, we add cmap aliases so
# each math codepoint resolves to the existing Latin/Greek letterform. The
# aliases attach via glyph.altuni so no extra glyph entries are created — the
# size cost is only the additional cmap entries.

def _add_altuni(glyph, codepoint):
    cur = list(glyph.altuni) if glyph.altuni else []
    cur.append((codepoint, -1, 0))
    glyph.altuni = tuple(cur)


# Greek codepoint sequences for the U+1D6A8.. and U+1D6E2.. math blocks.
# Not a simple 0x0391+i offset: position 17 is the "capital theta symbol" (maps to
# plain Θ), and U+03A2 is an unassigned hole so position 18 jumps straight to Σ (U+03A3).
_GREEK_UPPER = [
    0x0391, 0x0392, 0x0393, 0x0394, 0x0395, 0x0396, 0x0397, 0x0398,
    0x0399, 0x039A, 0x039B, 0x039C, 0x039D, 0x039E, 0x039F, 0x03A0,
    0x03A1, 0x0398, 0x03A3, 0x03A4, 0x03A5, 0x03A6, 0x03A7, 0x03A8, 0x03A9,
]
_GREEK_LOWER = [
    0x03B1, 0x03B2, 0x03B3, 0x03B4, 0x03B5, 0x03B6, 0x03B7, 0x03B8,
    0x03B9, 0x03BA, 0x03BB, 0x03BC, 0x03BD, 0x03BE, 0x03BF, 0x03C0,
    0x03C1, 0x03C2, 0x03C3, 0x03C4, 0x03C5, 0x03C6, 0x03C7, 0x03C8, 0x03C9,
]

_math_aliases = {} # src_cp -> dst_cp

# Math Italic Latin: U+1D434–U+1D467 (U+1D455 is a Unicode hole)
for _i in range(26):
    _math_aliases[0x1D434 + _i] = 0x0041 + _i
for _i in range(26):
    _src = 0x1D44E + _i
    if _src != 0x1D455:
        _math_aliases[_src] = 0x0061 + _i

# Canonical math symbols defined by Unicode to look like a specific letter glyph.
# U+1D455 is unassigned because U+210E already exists for math-italic h.
_math_aliases[0x210E] = ord('h') # ℎ PLANCK CONSTANT (math italic h by definition)
_math_aliases[0x210F] = ord('h') # ℏ PLANCK CONSTANT OVER TWO PI
_math_aliases[0x2113] = ord('l') # ℓ SCRIPT SMALL L (math italic l by definition)
_math_aliases[0x03D5] = 0x03C6 # ϕ GREEK PHI SYMBOL → φ

# Math Bold Latin: U+1D400–U+1D433
for _i in range(26):
    _math_aliases[0x1D400 + _i] = 0x0041 + _i
for _i in range(26):
    _math_aliases[0x1D41A + _i] = 0x0061 + _i

# Math Bold Italic Latin: U+1D468–U+1D49B
for _i in range(26):
    _math_aliases[0x1D468 + _i] = 0x0041 + _i
for _i in range(26):
    _math_aliases[0x1D482 + _i] = 0x0061 + _i

# Math Bold digits: U+1D7CE–U+1D7D7
for _i in range(10):
    _math_aliases[0x1D7CE + _i] = 0x0030 + _i

# Math Italic Greek capitals: U+1D6E2–U+1D6FA, small: U+1D6FC–U+1D714
for _i, _cp in enumerate(_GREEK_UPPER):
    _math_aliases[0x1D6E2 + _i] = _cp
for _i, _cp in enumerate(_GREEK_LOWER):
    _math_aliases[0x1D6FC + _i] = _cp

# Math Italic Greek variants
_math_aliases[0x1D715] = 0x2202 # ∂ partial differential
_math_aliases[0x1D716] = 0x03F5 # ϵ epsilon symbol
_math_aliases[0x1D717] = 0x03B8 # θ theta symbol
_math_aliases[0x1D718] = 0x03BA # κ kappa symbol
_math_aliases[0x1D719] = 0x03C6 # φ phi symbol
_math_aliases[0x1D71A] = 0x03C1 # ρ rho symbol
_math_aliases[0x1D71B] = 0x03C0 # π pi symbol

# Math Bold Greek capitals: U+1D6A8–U+1D6C0, small: U+1D6C2–U+1D6DA
for _i, _cp in enumerate(_GREEK_UPPER):
    _math_aliases[0x1D6A8 + _i] = _cp
for _i, _cp in enumerate(_GREEK_LOWER):
    _math_aliases[0x1D6C2 + _i] = _cp

for _src_cp, _dst_cp in sorted(_math_aliases.items()):
    _add_altuni(font[_dst_cp], _src_cp)
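
The alias table can be exercised without a font object. The sketch below rebuilds a subset of it (math italic Latin plus math bold Greek capitals) and groups the result into the `(codepoint, -1, 0)` tuples that `_add_altuni` appends. The names `build_aliases` and `altuni_by_glyph` are hypothetical, and the Greek list is generated from ranges rather than spelled out, which is an assumption about equivalence that the assertions in the usage note check.

```python
from collections import defaultdict

# Capital Greek order in the math blocks: index 17 is the theta symbol
# (mapped to plain Θ) and U+03A2 is skipped because it is unassigned.
GREEK_UPPER = (list(range(0x0391, 0x03A2)) + [0x0398]
               + list(range(0x03A3, 0x03AA)))


def build_aliases():
    """Return {math_codepoint: base_codepoint} for a subset of the blocks."""
    aliases = {}
    for i in range(26):
        aliases[0x1D434 + i] = 0x41 + i        # math italic capitals
        if 0x1D44E + i != 0x1D455:             # reserved slot for italic h
            aliases[0x1D44E + i] = 0x61 + i    # math italic small letters
    aliases[0x210E] = ord('h')                 # U+210E fills the U+1D455 gap
    for i, cp in enumerate(GREEK_UPPER):
        aliases[0x1D6A8 + i] = cp              # math bold Greek capitals
    return aliases


def altuni_by_glyph(aliases):
    """Group aliases per base codepoint as (codepoint, -1, 0) tuples."""
    grouped = defaultdict(list)
    for src, dst in sorted(aliases.items()):
        grouped[dst].append((src, -1, 0))
    return {dst: tuple(entries) for dst, entries in grouped.items()}
```

For example, `build_aliases()[0x1D6B9]` is `0x0398` and `build_aliases()[0x1D6BA]` is `0x03A3`, which is the index-17 and index-18 behaviour described in the comment above `_GREEK_UPPER`.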


# ---------------------------------------------------------------------------
# ⋅ U+22C5 DOT OPERATOR — period scaled to 75%, centred on the math axis.
# Used inline in math contexts (e.g. "5⋅3"); math-axis centring lines it up
# with operators like +/− rather than the baseline-hugging period dot.
# ---------------------------------------------------------------------------
_MATH_AXIS = 260
_CDOT_WIDTH = 220
_period_g = font[ord('.')]
_period_bb = _period_g.boundingBox()
_period_cx = (_period_bb[0] + _period_bb[2]) / 2
_period_cy = (_period_bb[1] + _period_bb[3]) / 2
_cdot = font.createChar(0x22C5, 'uni22C5')
_cdot.clear()
_cdot.addReference(_period_g.glyphname, psMat.compose(
    psMat.scale(0.75),
    psMat.translate(_CDOT_WIDTH / 2 - _period_cx * 0.75,
                    _MATH_AXIS - _period_cy * 0.75),
))
_cdot.width = _CDOT_WIDTH
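
The dot-operator reference combines a scale with a translation whose offsets are computed in already-scaled coordinates. The sketch below reproduces that arithmetic with plain 6-tuples in PostScript `[a b c d tx ty]` order. The helper names are hypothetical, and it assumes that `psMat.compose` applies its first argument first, which is what the offsets in the source imply.

```python
def scale(s):
    return (s, 0.0, 0.0, s, 0.0, 0.0)


def translate(dx, dy):
    return (1.0, 0.0, 0.0, 1.0, dx, dy)


def compose(m1, m2):
    """Matrix for 'apply m1, then m2' in PostScript [a b c d tx ty] form."""
    a1, b1, c1, d1, x1, y1 = m1
    a2, b2, c2, d2, x2, y2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            x1 * a2 + y1 * c2 + x2, x1 * b2 + y1 * d2 + y2)


def apply_matrix(m, x, y):
    a, b, c, d, tx, ty = m
    return (a * x + c * y + tx, b * x + d * y + ty)


def centred_scale(cx, cy, factor, width, axis):
    """Scale about the origin, then move the scaled centre to (width/2, axis)."""
    return compose(scale(factor),
                   translate(width / 2 - cx * factor, axis - cy * factor))
```

With a source centre of `(60, 40)`, a factor of 0.75, a width of 220 and an axis of 260, `apply_matrix(centred_scale(60, 40, 0.75, 220, 260), 60, 40)` lands on `(110, 260)`: the scaled dot sits in the middle of its advance and on the math axis.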


# ---------------------------------------------------------------------------
# ∘ U+2218 RING OPERATOR — reuses the ring contours extracted from Å (the
# same shape used as a diacritic above å/ů). The hand-drawn ring already
# has the right pen weight and proportions for an operator-sized glyph,
# so we just copy it and recentre on the math axis.
# ---------------------------------------------------------------------------
_CIRC_WIDTH = 440
_circ_layer = fontforge.layer()
for c in _ring_mark.foreground:
    _circ_layer += c
_circ = font.createChar(0x2218, 'uni2218')
_circ.clear()
_circ.foreground = _circ_layer
_circ_bb = _circ.boundingBox()
_circ.transform(psMat.translate(
    _CIRC_WIDTH / 2 - (_circ_bb[0] + _circ_bb[2]) / 2,
    _MATH_AXIS - (_circ_bb[1] + _circ_bb[3]) / 2,
))
_circ.width = _CIRC_WIDTH


# ---------------------------------------------------------------------------
# ∗ U+2217 ASTERISK OPERATOR — asterisk recentred on the math axis.
# MathJax emits U+2217 for `*` in math mode (and for \ast); without this
# glyph `*` falls back to .notdef. We copy the asterisk outlines and
# translate them so their vertical centre lands on _MATH_AXIS, matching
# the height of +/−/⋅ rather than the higher-sitting text asterisk.
# ---------------------------------------------------------------------------
_AST_WIDTH = 475
_ast_src = font[ord('*')]
_ast_bb = _ast_src.boundingBox()
_ast_cy = (_ast_bb[1] + _ast_bb[3]) / 2
_ast = font.createChar(0x2217, 'uni2217')
_ast.clear()
_ast.addReference(_ast_src.glyphname, psMat.translate(0, _MATH_AXIS - _ast_cy))
_ast.width = _AST_WIDTH


# ---------------------------------------------------------------------------
# ∑ U+2211 / ∏ U+220F — inline-sized large operators.
# The base font carries the Greek capitals Σ/Π for letter use; we mint
# separate `summation`/`product` glyphs at the math codepoints so the
# ss01 substitution (wired up in pt7) doesn't also affect Greek text.
# Outlines are copied (not referenced) so the .disp variants below can
# scale and thin them independently of the source letters.
# ---------------------------------------------------------------------------

def _copy_glyph_at(font, src_name, cp, dst_name):
    src = font[src_name]
    layer = fontforge.layer()
    for c in src.foreground:
        layer += c
    g = font.createChar(cp, dst_name)
    g.clear()
    g.foreground = layer
    g.width = src.width
    return g

_copy_glyph_at(font, 'Sigma', 0x2211, 'summation')
_copy_glyph_at(font, 'Pi', 0x220F, 'product')


# ---------------------------------------------------------------------------
# Display-sized large operators (∑ ∏ ∫) as stylistic alternates.
#
# Unencoded glyphs reached via the OpenType ss01 feature (wired up in pt7).
# The base U+2211 / U+220F / U+222B forms stay at inline size; ss01 swaps
# them for these enlarged variants in display contexts (MathJax uses a
# font-feature-settings CSS rule scoped to display-mode <mjx-mo>).
# ---------------------------------------------------------------------------

def make_display_operator(font, src_name, dst_name, target_h, weight=0, rbear=0):
    """Scale src to target_h, optionally thin strokes, centre on _MATH_AXIS.

    Creates an unencoded glyph named dst_name. rbear sets the right
    bearing in font units (advance = right glyph edge + rbear); MathJax
    CHTML ignores font advances anyway, so the oversized rbears used for
    ∏/∫ only matter for non-MathJax consumers.
    """
    src = font[src_name]
    src_layer = fontforge.layer()
    for c in src.foreground:
        src_layer += c
    src_width = src.width

    g = font.createChar(-1, dst_name)
    g.clear()
    g.foreground = src_layer
    g.width = src_width

    bb = g.boundingBox()
    scale = target_h / (bb[3] - bb[1])
    g.transform(psMat.scale(scale))

    if weight != 0:
        g.correctDirection()
        g.removeOverlap()
        g.changeWeight(weight)
        g.correctDirection()
        g.addExtrema()

    bb2 = g.boundingBox()
    g.transform(psMat.translate(0, _MATH_AXIS - (bb2[3] + bb2[1]) / 2))

    bb3 = g.boundingBox()
    g.width = round(bb3[2] + rbear)

    print(f" {dst_name}: scale={scale:.3f} weight={weight} "
          f"bounds={g.boundingBox()} advance={g.width}")


_upem = font.em
make_display_operator(font, 'Sigma', 'summation.disp', _upem, weight=-20, rbear=20)
make_display_operator(font, 'Pi', 'product.disp', _upem, weight=-20, rbear=5000)
make_display_operator(font, 'integral', 'integral.disp', round(1.4 * _upem), weight=-15, rbear=5000)


# ---------------------------------------------------------------------------
# Save
# ---------------------------------------------------------------------------