Increase the CSS spec coverage #5

Open
samoht wants to merge 977 commits into main from spec-parser

@samoht samoht commented May 4, 2026

No description provided.

samoht added 28 commits May 16, 2026 22:35
CSS Transforms 1 sec. 11: collapse a [transform] function to its
shortest spec-equivalent spelling. [canonicalise_transform] runs at
the top of [pp_transform] under [Pp.minified]:

- [translate3d(X, 0, 0)] -> [translate(X)], [translate3d(0, Y, 0)]
  -> [translateY(Y)], etc.
- [translate(X, 0)] -> [translate(X)], [translate(0, Y)] ->
  [translateY(Y)].
- [scale(X, 1)] -> [scaleX(X)], [scale(1, Y)] -> [scaleY(Y)].
- [scale(X, X)] -> [scale(X)] (matching axes collapse).
- [scale3d(X, Y, 1)] -> [scale(X, Y)] (Z=1 drops 3d).
- [rotateZ(A)] -> [rotate(A)] (Z is the default rotation axis).
- [rotate3d(1, 0, 0, A)] -> [rotateX(A)], same for Y / Z.

Also exposes [Values.length_is_zero] in [values.mli] for the
[translate*] canonicalisations.
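
A minimal sketch of a few of the rewrites listed above, assuming a simplified transform type whose lengths and angles are plain floats. The constructor names and [canonicalise] are hypothetical stand-ins, not cascade's actual AST; the [scale3d] and [rotate3d] cases are left out for brevity.

```ocaml
(* Hypothetical, simplified transform AST: lengths and angles are floats. *)
type transform =
  | Translate of float * float option
  | Translate_y of float
  | Translate3d of float * float * float
  | Scale of float * float option
  | Scale_x of float
  | Scale_y of float
  | Rotate of float
  | Rotate_z of float

(* Map a transform function to a shorter spec-equivalent spelling. *)
let canonicalise = function
  | Translate3d (x, 0., 0.) -> Translate (x, None)
  | Translate3d (0., y, 0.) -> Translate_y y
  | Translate (x, Some 0.) -> Translate (x, None)
  | Translate (0., Some y) -> Translate_y y
  | Scale (x, Some y) when x = y -> Scale (x, None)
  | Scale (x, Some 1.) -> Scale_x x
  | Scale (1., Some y) -> Scale_y y
  | Rotate_z a -> Rotate a
  | t -> t
```

Anything that matches no rewrite is returned unchanged, so the function is safe to run on every transform before printing.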
The length-only gate caught regressions in cascade's competitiveness
but not in its correctness: a shorter cascade output that silently
drops or rewrites declarations would still pass. Add a strict
[Cascade_diff.Css_compare.equal ~mode:`Canonical] check first, so a
size win against the shortest oracle only counts when cascade's
output is also cascade-canonically equivalent to that oracle.

The two checks now layer:
 1. cascade output must canonical-equal the shortest oracle, OR
    the test fails with a "semantically not equivalent" message;
 2. given that, cascade output must be no longer than the shortest
    oracle.
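
The layering can be sketched as follows. This is an illustration only: [canonical] is a hypothetical placeholder that strips spaces, standing in for the real canonical-equality check, and the [verdict] type is invented for the example.

```ocaml
(* Placeholder canonicaliser (assumption): only removes spaces. *)
let canonical s = String.concat "" (String.split_on_char ' ' s)

type verdict = Pass | Not_equivalent | Too_long

(* Stage 1: equivalence. Stage 2: size, only reached when stage 1 holds. *)
let check ~oracle ~actual =
  if canonical oracle <> canonical actual then Not_equivalent
  else if String.length actual > String.length oracle then Too_long
  else Pass
```

Putting equivalence first means a shorter but wrong output is reported as a semantic failure rather than counted as a size win.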

Current state: 72/72 sites trip stage 1, because cascade and every
oracle each pick their own canonical form for many constructs (e.g.
csso's rule factorisation, which cascade does not reverse; font-name
casing; escape encodings). The work backlog is clear: every cascade
canonicalisation gap surfaces as one or more semantic mismatches in
this run. Runtime is now 8 s.
Four more shortest-spelling rewrites:

- [display]: [Multi (Block, Block)] / [Multi (Inline, Block)] /
  [Multi (Run_in, Block)] collapse to the outside keyword (CSS
  Display 3 sec. 2: [<display-outside> flow] is just the outside,
  so [block flow] -> [block]).
- [timing-function]: add [Cubic_bezier (0, 0, 1, 1)] -> [linear]
  (CSS Easing 1 sec. 3, the identity easing curve).
- [font-family] platform-name printers: multi-word names ([Helvetica
  Neue], [Trebuchet MS], [Times New Roman], ...) emit unquoted under
  minify via a new [pp_font_family_name] helper - CSS Fonts 4
  sec. 4.1 lets a [<family-name>] be a sequence of [<custom-ident>]
  words without quotes, and that's the shorter spelling. Pretty mode
  still quotes for readability.
- [<angle>]: [shortest_angle_unit] picks the shortest spelling under
  minify, so [.25turn] (7 chars) -> [90deg] (5 chars), [100grad] ->
  [90deg], etc. [rad] stays since the decimal expansion of [pi]
  fractions is always longer than the [deg] equivalent.
Two cascade canonicalisations that converge with the cssnano / csso
canonical form even when bytes don't change:

1. [lib/optimize.ml] [factorise_group] now commits the factored shape
   when [after_size <= before_size] (ties go to the factored form),
   not just on strict savings. cssnano-style rule shapes like

       .sw_b3 { border:...; overflow:... }
       .sa_drw, .sw_b3 { background:#fff }
       .sa_drw { margin:0; padding:0 }

   now emerge even on inputs where the byte count is identical to two
   separate rules carrying the shared declaration.

2. [lib/selector.ml] [List] selectors are sorted alphabetically under
   minify by their printed form. CSS Selectors 4 sec. 4.2 treats a
   selector list as an unordered set, so sorting is semantics-
   preserving and matches the cssnano / csso convention. Pretty mode
   keeps source order.

Side effects on the interop corpora:
 - keithamus 100 -> 94 failures (selector sort closes a cluster of
   small ordering gaps).
 - satcss stays at 72 failures vs the strict canonical-equality gate;
   cascade still differs from the oracles on shorthand recomposition,
   selector-set membership choices, and a handful of other
   canonicalisations.
CSS Backgrounds 3 sec. 3.10: under minify, drop a longhand whose value
equals the [background] shorthand initial - [repeat], [scroll],
[padding-box] (origin), [border-box] (clip). The resolved cascade
value is unchanged; cascade was emitting the redundant defaults
([background: red url(bg.png) 0% 0% repeat scroll] kept the [repeat
scroll]).
Match the recent angle-unit / font-family-unquote / display-flow /
transform-canonical / cubic-bezier-linear / background-defaults
rewrites:

- test_values: angle vectors switch to the shortest unit ([360deg] ->
  [1turn], etc.).
- test_declaration, test_properties, test_inline: multi-word font
  family names emit unquoted under minify ([\"Helvetica Neue\"] ->
  [Helvetica Neue]).
- test_declaration, test_properties: transform / angle oracles
  ([.25turn], [100grad] -> [90deg]; [rotateZ(180deg)] -> [rotate]).
- test_stylesheet, test_selector: selector lists sort into the
  canonical order ([.b, .a] -> [.a, .b]; [.stop, .end] -> [.end,
  .stop]).
- test_context, test_css: dependent fixtures resync.
- README: document the selector-list sort canonicalisation under
  [--minify].
- bin/cmd_diff: dune fmt drop of a redundant block.
The earlier "default-pick the first rule's value into the shared
block" extension was cascade-unsafe in one case: when a later rule in
the group redeclares the property with the SAME value as the first
rule (e.g. [.a{red}.b{blue}.c{red}]), default-picking [red] into the
shared block at position 1 drops [.c]'s position-3 override of [.b].

Test [optimize.030] "same declarations are not grouped across
source-order competitor" pins this and was correctly catching the
regression.

Revert to the strictly-safe rule: a declaration is "common" only when
every rule in the group declares it IDENTICALLY (same property and
same value). This loses the microsoft-3 text-align factorisation that
csso produces, but preserves cascade semantics for every element.

Single-declaration factoring (introduced in the previous commit) is
kept: the byte-budget check rejects the cases where it would regress,
and the rule shapes for cases like
[.sa_drw,.sw_b3{background:#fff}] still emerge.
The canonical-equality check was running [Css_compare.equal Canonical]
on (shortest.raw, actual), which internally parses + minifies BOTH
sides. But [actual = cascade.minify(input)] is already the cascade-
canonical form, and [cascade.minify] is idempotent on its own output,
so re-canonicalising it is wasted work.

Replace with a direct check: canonicalise the oracle once via
[cascade_minify], then string-compare against [actual]. Half the parse
work per failing site.

SatCSS interop runtime: 8.3 s -> 2.2 s.
Re-enable default-pick (hoist the first rule's value into the shared
block even when later rules have different values), but gate it on
selector-overlap analysis: a rule whose value matches the default still
keeps an override in the leftover when an earlier overlapping rule
declares a different value.

   .a { red }                  .a, .b, .c { red }
   .b { blue }    becomes      .b { blue }
   .c { red }                  .c { red }       (kept: .b overlaps .c)

   .textcenter { ...; left }   .textcenter, .textleft, .textright {
   .textleft   { ...; ctr }        ...; left
   .textright  { ...; rgt }    }
                               .textleft { ctr }
                               .textright { rgt }

The first form preserves [.b.c -> red]; the second produces csso's
microsoft-3 shape.
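
A rough sketch of the leftover decision for a single property, under loud assumptions: [overlaps] is a conservative placeholder that treats every pair of selectors as possibly matching the same element, and the record and function names are hypothetical rather than cascade's.

```ocaml
type rule = { selector : string; value : string }

(* Conservative placeholder (assumption): any two selectors may overlap. *)
let overlaps _ _ = true

(* Returns (shared selectors, default value, leftover overrides). *)
let default_pick rules =
  match rules with
  | [] -> None
  | first :: _ ->
      let default = first.value in
      let indexed = List.mapi (fun i r -> (i, r)) rules in
      (* An earlier overlapping rule with a different value forces rule [i]
         to keep its own declaration even when it equals the default. *)
      let earlier_override i r =
        List.exists
          (fun (j, e) ->
            j < i && e.value <> default && overlaps e.selector r.selector)
          indexed
      in
      let leftover =
        List.filter_map
          (fun (i, r) ->
            if r.value <> default || earlier_override i r then Some r else None)
          indexed
      in
      Some (List.map (fun r -> r.selector) rules, default, leftover)
```

On the [.a]/[.b]/[.c] example this keeps [.c { red }] in the leftover because [.b { blue }] precedes it.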

Code shape:
 - [factorable_default] picks the first rule's declaration when every
   rule in the group declares the same property.
 - [earlier_overrides_overlap] is the safety predicate driving
   leftover emission.
 - Per-rule leftover filters [keep_in_leftover ~default_decl ~i decl].

Net effect: keithamus 96 -> 94 failures, cascade lib tests all pass
(1180/1180), satcss interop unchanged at 72 failures (canonical-form
differences remain in shorthand recomposition and rule-set choices, not
in factorisation).
…eep quoted on generic-name collision

Three minify-policy refinements for [@font-face]:

- [Font_face.pp_src_modifiers]: [format("woff2")] emits as
  [format(woff2)] when the argument matches one of the known
  [<font-format>] keywords (CSS Fonts 4 sec. 4.3 accepts either
  spelling; the bare keyword is shorter).
- [Font_face.pp_src_entry] [Local] arm: [local("test")] emits as
  [local(test)] when the body parses as a [<custom-ident>] (no
  special chars, no leading digit). Brings cascade in line with the
  unquoted-when-safe convention already used elsewhere.
- [pp_font_family]'s [Name] arm and [font_family_of_quoted_name]:
  preserve the quotes when the spelling collides with a generic
  family / CSS-wide keyword ([serif], [sans-serif], [inherit], ...).
  Quoted single-word names also stay as [Name _] (custom family)
  rather than re-matching the generic enum, so [font-family:
  "serif"] round-trips as itself instead of becoming the [serif]
  generic.
… path in peek_utf8_at

Three perf passes that left satcss runtime essentially flat (2.3 s
before, 2.4 s after; parse-bound for inputs averaging 50 KB), but cut
individual allocation hotspots:

 - [lib/optimize.ml] cache [Declaration.property_name] per declaration
   via physical-equality Hashtbl. The factorisation pass calls it in
   O(N M) nested loops; each call previously allocated a Buffer inside
   [property_name]. Allocations attributed to [property_name] dropped
   ~3x.

 - [lib/optimize.ml] cache [canonical_selector_key] per selector AST.
   [merge_rules.merge_adjacent] otherwise re-serialises every selector
   in every pairwise call.

 - [lib/reader.ml] inline the ASCII fast path into [peek_utf8_at] so
   ASCII bytes skip the [first_utf8_chunk_at] indirection and avoid
   the intermediate [Some (Uchar)] allocation.
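
One way to build a physical-equality cache like the first item is the functorial [Hashtbl.Make] interface with [( == )] as the equality. The sketch below is an assumption about the shape, not the code from [lib/optimize.ml]; the [decl] record, the [calls] counter, and the body of [property_name] are invented for illustration.

```ocaml
type decl = { prop : string; value : string }

(* Keys compare by physical identity; hashing stays structural, which is
   consistent because physically equal immutable values hash alike. *)
module Phys = Hashtbl.Make (struct
  type t = decl
  let equal = ( == )
  let hash = Hashtbl.hash
end)

let calls = ref 0

(* Stand-in for the expensive printer-based name computation. *)
let property_name d =
  incr calls;
  String.lowercase_ascii d.prop

let cache : string Phys.t = Phys.create 16

let cached_property_name d =
  match Phys.find_opt cache d with
  | Some name -> name
  | None ->
      let name = property_name d in
      Phys.add cache d name;
      name
```

A global table like this is not safe to share across domains without a lock or a per-domain instance.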

Also: hook [Memtrace.trace_if_requested] into the interop runner so
[MEMTRACE=file dune exec test/interop/test.exe -- <dir>] dumps a CTF
trace; nox-memtrace's [hotspots.exe] then ranks allocators.

Remaining headroom is in cascade's parser / lexer / optimizer
allocation density (~50 MB allocated per ~50 KB minify); closing that
needs a structural pass on the inner loops, not point caches.
Two CSS Selectors 4 minify rewrites in [Selector.pp]:

- [:not(:not(X))] collapses to [X] (sec. 5, double negation).
- [:is(s1, s2, ...)] unwraps to the selector list [s1, s2, ...]
  when each argument is structurally simple ([Element] / [Class] /
  [Id] / [Universal] / [Attribute], or a [Compound] of those).
  Per sec. 17 the specificity is per-member there, so the unwrapped
  list matches the same elements with the same per-selector
  specificity. Conservative on nested [:is] / [:not] / [:has] /
  [:where] arguments since their specificity rules differ.

[is_unwrap_safe_is_arg] codifies the structural predicate.
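
The two rewrites can be sketched on a toy selector type. The constructors and [rewrite] below are hypothetical, and the sketch assumes the [:is(...)] being unwrapped is the whole top-level selector, since unwrapping inside a larger compound would change its meaning.

```ocaml
(* Toy selector AST (assumption), much smaller than a real one. *)
type selector =
  | Element of string
  | Class of string
  | Id of string
  | Universal
  | Compound of selector list
  | Not of selector list
  | Is of selector list
  | List of selector list

let is_leaf = function
  | Element _ | Class _ | Id _ | Universal -> true
  | Compound _ | Not _ | Is _ | List _ -> false

(* Structurally simple: a leaf, or a compound made only of leaves. *)
let is_simple = function
  | Compound parts -> List.for_all is_leaf parts
  | s -> is_leaf s

let rewrite = function
  | Not [ Not [ x ] ] -> x
  | Is args when List.for_all is_simple args -> List args
  | s -> s
```

Arguments containing nested [:is] / [:not] fail [is_simple], so they are left wrapped.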
[Css_compare.equal ~mode:`Tree] (parse both, structurally compare ASTs,
no minify) seemed like a natural alternative to a second cascade.minify
pass on the oracle. In practice it's slower (5.3 s vs 2.4 s on satcss)
because the tree-diff walk is O(N M) on rule pairs and allocates more
than a single parse+minify+string-compare cycle.

Keep the [cascade_minify shortest.raw |> string compare against actual]
shape; the test format docstring already mentions the [`Canonical] path
for the implementor's reference.
Oracle CSS is already minified by the upstream tool; running cascade's
optimizer on it (then string-comparing the result to [actual]) is
wasted work. Parse oracle and cascade output once each, compare the
parsed ASTs via OCaml structural equality.

SatCSS runtime: 2.4 s -> 2.2 s. The remaining floor is ~144 parses
across 72 input/oracle pairs at ~14 ms each; further wins need either
cascade-parser speedups or cached oracle ASTs.
[selector_changes] scanned [all_added_candidates] linearly for every
removed rule, recomputing [decls_signature] on each pair. With N
removed and M added rules that's O(N M) signature builds and string
sorts.

Index added rules by their declaration signature (the property-value
list cascade considers identity-preserving across selector renames)
into a hashtable, so each removed rule does an O(1) bucket lookup and
the inner predicate only checks selector overlap. Buckets are
typically small.
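
The indexing idea can be sketched with the standard [Hashtbl]. The [rule] record and the helper names are assumptions; the signature here is simply the sorted property-value list.

```ocaml
type rule = { selector : string; decls : (string * string) list }

(* Order-insensitive signature of a rule's declarations. *)
let signature r = List.sort compare r.decls

(* Build the index once: signature -> added rules carrying it. *)
let index added =
  let tbl = Hashtbl.create 64 in
  List.iter (fun r -> Hashtbl.add tbl (signature r) r) added;
  tbl

(* Each removed rule does one bucket lookup instead of a linear scan. *)
let candidates tbl removed = Hashtbl.find_all tbl (signature removed)
```

The selector-overlap predicate would then run only over the rules returned by [candidates].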

Also cache [decl_to_prop_value] by physical declaration identity so
[decls_signature] doesn't re-print the same declaration repeatedly
across the diff.

Reddit semantic diff (the largest SatCSS pair): 1.5 s -> 0.35 s
([cascade diff --diff=tree]). SatCSS interop runtime stays at 2.2 s
because parsing dominates there - the tree-diff path mostly fires on
mismatched pairs.

(Also adds a temporary identity stub for [Selector.top_level_is_unwrap]
introduced by a parallel WIP so the build is unblocked.)
[lib/diff/tree_diff.ml] cache [extract_base_parent_selector] per
selector AST. Called by [selectors_share_parent_ast] inside
[selector_changes] for every (added, removed) bucket member; each call
otherwise re-runs [Css.Selector.to_string].

[lib/lexer.ml] [consume_ident_sequence] now scans the source ahead
looking for the next [\\]. When the ident is escape-free (the common
case) we emit one [String.sub src start len] instead of growing a
[Buffer] byte-by-byte and contents-ing it at the end. Escape paths
fall back to the previous Buffer-based loop, primed with the bytes
we've already seen.
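
A simplified sketch of the escape-free fast path. The byte classification in [is_ident_byte] is an assumption that approximates CSS ident characters, and the escape case just returns [None] where the real lexer falls back to its Buffer loop.

```ocaml
(* Approximate ident bytes (assumption): ASCII alphanumerics, '-', '_',
   and any non-ASCII byte. *)
let is_ident_byte c =
  match c with
  | 'a' .. 'z' | 'A' .. 'Z' | '0' .. '9' | '-' | '_' -> true
  | c -> Char.code c >= 0x80

(* Returns the ident and the index just past it, or None on an escape. *)
let consume_ident src start =
  let n = String.length src in
  let rec scan i = if i < n && is_ident_byte src.[i] then scan (i + 1) else i in
  let stop = scan start in
  if stop < n && src.[stop] = '\\' then None
  else Some (String.sub src start (stop - start), stop)
```

The common case costs a single [String.sub] allocation instead of growing a buffer one byte at a time.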

[lib/selector.ml] also picks up the user's [top_level_is_unwrap] hook
in [to_string] / [to_buffer] (parallel WIP).

Net effect on satcss runtime: unchanged at 2.2 s (allocations modest;
parse cost dominates). Reddit [cascade diff --diff=tree] holds at
0.4 s.
Move the per-record [compute_case] (parse input + minify, parse oracle,
structural compare) into an [Eio_main.run] + [Eio.Executor_pool] block
ahead of alcotest. The pool sizes to one less than the recommended
domain count; each case submits as a unit of work and the results are
collected into a Hashtbl that the alcotest cases look up by name.

Also drop the global Hashtbl caches I added in [lib/optimize.ml] and
[lib/diff/tree_diff.ml] - they're not thread-safe and would either need
a Mutex or a per-domain handle to survive the pool. The sequential win
they gave (~1% of the previous 2.2s) is dominated by the parallelism
win.

SatCSS interop: 2.2 s -> 1.04 s. From session start: 9.5 min -> 1.04 s,
~550x speedup.
Companion to the [:is()] minify unwrap in [lib/selector.ml]: the
public [pp] / [to_string] apply the unwrap at the entry point so
every caller sees the canonical form. The [.mli] declaration
documents the rule (CSS Selectors 4 sec. 17, equal-specificity
unwrap of [:is(s1, s2, ...)] to a flat selector list when each
argument is structurally simple).
After dropping the global Hashtbl caches the leftover wrappers were
all just identity aliases. Inline them at the call sites:
[canonical_selector_key_cached] -> [canonical_selector_key],
[prop_name_of] -> [decl_property]. Drop the no-op [reset_decl_cache]
and [reset_parent_cache] and their callers.
…nify

CSS Forms 4 sec. 5 / Selectors 4 sec. 5.10: the form-state pseudo-class
pairs are mutual complements over the form-element set, so under minify
[:not(:enabled)] -> [:disabled], [:not(:valid)] -> [:invalid],
[:not(:required)] -> [:optional] (and the symmetric inversions).
Three pairs collapsed for now; matches keithamus selectors-advanced
0005 / 0008 / 0009 + the lightning / cssnano convention.
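
Taking the commit's premise as given, the three pairs amount to a small lookup table. The toy [selector] type and the function names are hypothetical.

```ocaml
(* Complement table for the three form-state pairs named above. *)
let complement = function
  | "enabled" -> Some "disabled"
  | "disabled" -> Some "enabled"
  | "valid" -> Some "invalid"
  | "invalid" -> Some "valid"
  | "required" -> Some "optional"
  | "optional" -> Some "required"
  | _ -> None

type selector = Pseudo of string | Not of selector

(* Rewrite :not(:p) to the complement of :p when one is known. *)
let rewrite = function
  | Not (Pseudo p) as s -> (
      match complement p with Some q -> Pseudo q | None -> s)
  | s -> s
```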
CSS Sizing 3 sec. 3.1: the [min-width] / [min-height] (and the
logical [min-inline-size] / [min-block-size]) properties have [auto]
as their initial value, not [0]. Under minify the [Initial] keyword
rewrites to [Auto], which is the shorter spec-equivalent spelling
and round-trips identically.

New [pp_length_min_max] helper wraps [pp_length_percentage] for the
[Min_*] arms of [pp_property_value]; the [Max_*] arms keep
[pp_length_percentage] (their initial value is the generic [none],
which already has its own [Length None] spelling).
CSS Display 3 sec. 2.6: the two-value [display: <outside> <inside>]
form [inline flow-root] is spec-equivalent to the legacy
[inline-block] keyword. Add the [Multi (Inline, Flow_root)] arm
next to the existing [Multi (Block, Flow_root)] -> [flow-root]
collapse.
CSS Fonts 4 sec. 4.1: [font-family] is a fallback list - the cascade
resolution picks the first available family, so a later duplicate
never wins. Under minify drop the duplicates ([Helvetica, Arial,
Helvetica, sans-serif] -> [Helvetica, Arial, sans-serif]). The
first-occurrence rule preserves source position semantics.
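
A first-occurrence dedup can be sketched over a plain string list. Exact string matching is an assumption here; the real pass may compare family names differently.

```ocaml
(* Keep the first occurrence of each name, preserving source order. *)
let dedup_first names =
  let seen = Hashtbl.create 8 in
  List.filter
    (fun name ->
      if Hashtbl.mem seen name then false
      else (
        Hashtbl.add seen name ();
        true))
    names
```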
…onent token

[read_flex_flow]'s loop was [if not is_done then read_part; loop],
but [is_done] doesn't stop on a [;] / [!important] / etc., so a
declaration like [flex-flow: wrap row;] would parse [wrap], parse
[row], then trip [read_flex_flow_part] on the trailing [;] and
reject the whole value. Rewrite the loop to (1) stop once both
direction and wrap slots are filled (CSS Flexbox 1 sec. 6.3 caps
at two values) and (2) break on the first iteration where
[read_flex_flow_part] makes no progress / raises instead of
propagating the error.
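
The rewritten loop shape can be sketched over a pre-split token list. The types and helpers are hypothetical; the real reader works on a stream rather than on strings.

```ocaml
type direction = Row | Row_reverse | Column | Column_reverse
type wrap = Nowrap | Wrap | Wrap_reverse

let direction_of = function
  | "row" -> Some Row
  | "row-reverse" -> Some Row_reverse
  | "column" -> Some Column
  | "column-reverse" -> Some Column_reverse
  | _ -> None

let wrap_of = function
  | "nowrap" -> Some Nowrap
  | "wrap" -> Some Wrap
  | "wrap-reverse" -> Some Wrap_reverse
  | _ -> None

(* Returns the filled slots and the unconsumed tokens. *)
let read_flex_flow tokens =
  let rec loop dir wrap tokens =
    match (dir, wrap, tokens) with
    (* (1) both slots filled, or input exhausted: stop. *)
    | Some _, Some _, _ | _, _, [] -> (dir, wrap, tokens)
    | _, _, tok :: rest -> (
        match (dir, direction_of tok, wrap, wrap_of tok) with
        | None, (Some _ as d), _, _ -> loop d wrap rest
        | _, _, None, (Some _ as w) -> loop dir w rest
        (* (2) no progress: stop and leave the token for the caller. *)
        | _ -> (dir, wrap, tokens))
  in
  loop None None tokens
```

The trailing [;] is left in the remainder instead of rejecting the whole value.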
CSS Values 4 sec. 10.7: [clamp(min, val, max)] computes [max(min,
min(val, max))]. When all three arguments are byte-identical (the
clamp window degenerates to a single point), the function value is
just that argument; emit the bare value instead of the
three-argument call. Conservative on values containing variables
since the textual equality check is post-rewrite (after the
[strip_top_level_calc_arg] pass).
…ical-equal

[optimize] Add [drop_shadowed_declarations]: when two rules share a
canonical selector key, drop earlier declarations whose property name is
rewritten by a later same-selector rule at same-or-stronger importance.
Safe regardless of intermediate rules (a later same-selector write masks
the earlier value for every element the rule could match), so cleancss /
csso both rely on this to collapse repeated declaration blocks.
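
A quadratic sketch of the shadowing rule, assuming the selector string already is the canonical key. The record types and function names are illustrative only, and rules emptied by the pass are left in place.

```ocaml
type decl = { prop : string; value : string; important : bool }
type rule = { selector : string; decls : decl list }

(* [later] shadows [d] when it writes the same property at
   same-or-stronger importance. *)
let shadowed_by later d =
  List.exists
    (fun l -> l.prop = d.prop && (l.important || not d.important))
    later.decls

let rec drop_shadowed = function
  | [] -> []
  | r :: rest ->
      let same = List.filter (fun l -> l.selector = r.selector) rest in
      let decls =
        List.filter
          (fun d -> not (List.exists (fun l -> shadowed_by l d) same))
          r.decls
      in
      { r with decls } :: drop_shadowed rest
```

An earlier [!important] declaration survives a later non-important write, since the later one would not win the cascade.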

[test/interop] satcss now compares cascade output against the shortest
oracle via [Cascade_diff.Css_compare.equal ~mode:`Canonical] instead of
raw OCaml AST equality. Strict-AST forced cascade to match each oracle's
spelling quirks (e.g. [rgba(...)] vs hex in [box-shadow]); canonical lets
cascade-better wins (shorter spelling, sorted selector lists) pass.
[should_not_combine] previously refused to combine any rule whose
selector contained a descendant combinator (e.g. [#u .back_org] and
[#u .bri] with identical declarations stayed as two rules). The block
was Tailwind-specific source-shape compatibility, not a cascade-safety
requirement; cleancss / csso happily combine such rules. Drop the
blanket gate and rely on the existing cascade-safety checks
([can_combine_rules] + [disjoint_from] over delayed summaries) to
preserve semantics. Remove the now-unused [has_descendant_combinator].
Cleanup: drop the [of string] sidecar on the [length] math primitives
and carry the typed argument values directly.

- [Clamp of length * length * length] replaces [Clamp of string]; the
  reader [read_clamp_length] parses each comma-separated slot through
  [read_length] (nested [calc] / [var] / [min] / [max] arguments Just
  Work), and [pp_length]'s [Clamp] arm prints each slot back via
  [pp_length].
- [Min of length list] / [Max of length list] similarly type each
  arg.
- [Minmax of length * length] replaces [Minmax of string].
- New [pp_typed_math_call] emits [name(arg1, arg2, ...)] from a
  typed arg list.
- [pp_length]'s [Clamp (X, X, X)] arm collapses to [X] under minify
  (CSS Values 4 sec. 10.7 degenerate clamp window).
- [reduce_length_min_max] / [try_reduce_typed_min_max] reduce
  [min()] / [max()] of same-unit literal lengths to the chosen
  argument, recursively (so [min(min(1px, 2px), 3px)] -> [1px]).
- [context.ml]'s [simplify_length_min / max / clamp] take the typed
  args directly; no more [read_math_args] / [string_of_math_args]
  round-trip.
- Drop the dead string helpers: [simple_dimension_of_string],
  [split_top_level_commas], [reduce_min_max_pairs], [pick_min_max_value],
  [same_unit_min_max_pairs], [update_min_max_group],
  [format_simple_dimension], [try_reduce_min_max_args],
  [normalize_math_args], [emit_math_args], [pp_math_call],
  [is_leading_zero], [strip_top_level_calc_arg], [read_raw_math_args],
  [top_level_arg_count].
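
The typed shape makes the reductions plain pattern matches. The sketch below is a simplification under stated assumptions: only [Px] literals are reduced (the commit covers any same-unit literals), the clamp collapse is folded into the same pass rather than the printer, and all names are hypothetical.

```ocaml
type length =
  | Px of float
  | Em of float
  | Min of length list
  | Max of length list
  | Clamp of length * length * length

(* Reduce when every argument is a Px literal; otherwise rebuild the call. *)
let pick choose rebuild args =
  let px = List.filter_map (function Px v -> Some v | _ -> None) args in
  match px with
  | first :: rest when List.length px = List.length args ->
      Px (List.fold_left choose first rest)
  | _ -> rebuild args

let rec reduce = function
  | Min args -> pick Float.min (fun l -> Min l) (List.map reduce args)
  | Max args -> pick Float.max (fun l -> Max l) (List.map reduce args)
  | Clamp (a, b, c) ->
      let a = reduce a and b = reduce b and c = reduce c in
      (* Degenerate window: all three slots are the same value. *)
      if a = b && b = c then a else Clamp (a, b, c)
  | (Px _ | Em _) as l -> l
```

Because arguments are reduced first, nested calls collapse from the inside out.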

Drive-by: include the user's [tree_diff.ml] / [selector.ml] /
[optimize.ml] / [lexer.ml] / [font_face.ml] dune-fmt + WIP commits
that landed alongside.
samoht added 30 commits May 27, 2026 22:24
@scope bounds were stored as raw strings (Scope of string option * string option
* block) and re-parsed on every print and canonicalisation. Store them as
Selector.t option instead: the parser builds the selector (an unparseable bound
is Selector.Invalid, matching cascade's no-raw-text policy), the printer renders
it directly, and canonicalize_scope_selector / scope_selector_in_context become
plain Selector.canonicalize / substitute_nesting with no string round-trip. Drops
pp_scope_selector's parse-print hack and compact_scope_combinators.

Also drop the oklab-none merge guard: merging adjacent rules with identical
declaration blocks is always spec-equivalent, so cascade groups them even when a
value uses oklab() with a none channel (a smaller, valid merge Lightning CSS
declines). Removes value_uses_oklab_none / color_uses_oklab_none.
The document context's scope is conceptually a selector but was stored as a raw
string. Make it Selector.t option (and ?scope:Selector.t on the document
constructor) so callers pass a structured selector rather than text.
@page selectors were kept as a raw string, validated by a hand-rolled string
walker and re-compacted on print. Model them: page_selector { page_name;
page_pseudos } with page_pseudo = Page_first|Page_left|Page_right|Page_blank, and
Page / Page_with_margins now carry a page_selector list (empty = bare @page). The
lenient validator becomes parse_page_selectors that builds the list; the printer
renders it directly, dropping minify_page_selector and the string-compaction
helpers.
A factoring scan re-evaluates a growing prefix and, for each step, re-rendered
every rule to measure the before-size - O(window^2) rule_pp_size calls per scan.
Cache the size on factor_rule_summary (which already holds the rule during the
scan) and sum the cached sizes via gap_before_size, making it O(window).

Cuts ~9% of optimizer allocation on the SatCSS corpus (rule_pp_size /
pp_length). Wall-clock is unchanged: after the earlier structural-comparison
work the path is CPU-bound rather than GC-bound, so this is allocation hygiene
rather than a speedup.
- Shorten over-long physical-identity fuzz test identifiers to the
  house 4-underscore limit (e.g.
  test_merge_box_shorthand_longhands_preserves_element_identity ->
  test_merge_box_identity), keeping the descriptive alcotest labels.
- Order_maintenance.create -> v (idiomatic OCaml constructor name);
  update callers in rule_pool.ml and test_order_maintenance.ml.
- Add module doc comments to the order-maintenance, rule-merge, and
  rule-pool test .mli files.
- test_order_maintenance: Printf.sprintf -> Fmt.str.
Group the physical-identity test cases into their own [identity_cases]
list appended to [suite], keeping the suite function under the 50-line
threshold. No test changes.
Relax the same-selector gap merge and equal-anchor factoring guards from
"the merged selector must beat the intervening rule" to "their
specificities differ on the overlap". A strictly higher- or
lower-specificity competitor is decided by specificity, not source order,
so reordering equal declarations past it is unobservable; only a tie keeps
source order observable and must still block.

Run merge_same_selector_gaps inside the fixpoint pipeline rather than as a
one-shot before the loop, so a merge that only becomes possible after
another pass reorders rules converges within a single optimize call. Drop
the `Stylesheet`-scope gating: specificity reasoning is world-independent,
so the passes hold in every scope.
CSS Cascade 6.1: among same origin/layer/importance, specificity is evaluated
before source order; only a specificity *tie* defers to order. So moving a rule
across an overlapping intervening rule is unobservable unless they tie on the
satisfiable intersection and share a property. Relax both guards from "the moved
selector must beat the intervening one" to "they tie on an overlapping property":

- same-selector gap merge: block only when the *merged* rule (anchor + candidate
  declarations) ties an intervening rule on a shared property; a strictly higher-
  or lower-specificity competitor, or one writing only other properties, is safe
  to cross.
- equal-anchor factoring: use the tie-based blocker (was scope-gated to
  Stylesheet only).

These are pure cascade-safety facts (specificity is world-independent), so the
gap passes now run in every scope. merge_same_selector_gaps moves into the
fixpoint pipeline so a merge it enables only after another pass reorders rules
settles within one optimize rather than needing a second call (fixes the fuzz
minify-fixpoint regression). Tie+conflict cases (.a{c}.b{c}.a, .a/.b/.c,
.a.x/.b.x/.a.y) still block.
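
The tie-based blocker reduces to a small predicate. This sketch assumes the two rules are already known to overlap and represents each rule by a specificity triple plus the property names it writes; the record and function names are hypothetical.

```ocaml
type rule = { spec : int * int * int; props : string list }

(* Moving [moved] across [intervening] is observable only when the two
   tie on specificity and write at least one common property. *)
let blocks ~moved ~intervening =
  moved.spec = intervening.spec
  && List.exists (fun p -> List.mem p intervening.props) moved.props
```

A competitor with strictly higher or lower specificity, or one writing only other properties, does not block the move.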
Drop the special-case that serialised the combinator following :scope with
surrounding spaces (":scope > .a"). :scope now uses the same combinator
printer as every other compound, so ":scope > .a" minifies to ":scope>.a",
matching ordinary selectors and what Lightning/esbuild emit.

Update the selector oracles that pinned the old spaced output, and make
dom_selector_boundary assert the minified selector output instead of only
checking that parsing succeeds.
Pick the size-minimal member subset when factoring shared declarations
instead of greedily hoisting every rule that carries them: a rule joins the
shared group only when its selector entry is cheaper than duplicating the
declaration inline, so long selectors keep the declaration inline.

Step over a nested-rule factor boundary when the boundary rule shares the
factored value and its nested rules do not touch the factored properties,
and decompose an already-grouped declaration back into adjacent rules when
inlining beats repeating a long shared selector.

Add safety and quality tests for value-aware ties, nesting-boundary
stepping, subset selection, and long-selector inlining, plus a known_larger
list in the satcss interop documenting DOM-dependent cases cascade
correctly leaves larger than SatCSS.
The optimizer now merges non-adjacent same-selector rules across a
non-conflicting intervening rule, groups equal blocks across a
higher-specificity pseudo-class competitor, and merges/nests the cross
example. Update the captured output and prose to match, and correct the
[.btn:hover] specificity from (0,1,1) to (0,2,0) (:hover is a
pseudo-class).
Split try_factor_equal_anchor's recursive [scan] into a top-level
[scan_equal_anchor] (threading [first]) to drop under the 50-line function
threshold, leaving the entry point as a short seed-fold. Use Fmt.str rather
than Printf.sprintf in the satcss interop test, adding fmt to that test's
dune libraries.
Switch to the upstream memtrace package; it exposes the same Memtrace
module API the code already uses, so no source changes are needed.
oklch(50% 0.2 none) and oklab(50% 0.1 0.2) cannot fold to hex (a none hue
and an out-of-gamut colour respectively), so optimize keeps the modern
form but folds the 50% lightness to .5. Add the optimize+minify canonical
form (~optimized) next to the faithful pp form (~expected), which is
unchanged.
…t pp

The percentage/number spelling of a colour's lightness is a node
distinction (Pct vs Num), so choosing the shorter form at print time
broke round-trip (.685 re-parses as Num, 68.5% as Pct). Move the
shortest-spelling choice into normalize_color (AST canonicalisation,
per-space scale) and make pp_color_lightness a faithful serialiser.
A default-zero spread or blur is a stored field (Some Zero), so dropping
it at print time failed round-trip (0 1px 3px 0 re-parses with spread =
Some Zero, 0 1px 3px with spread = None). Make pp_shadow_spread and
pp_shadow_blur faithful serialisers and drop the redundant trailing
default in normalize_shadow, contiguously from the end: spread drops
freely, blur drops only when no spread follows (else 0 1px 0 5px would
rebind 5px as the blur). A var() colour keeps an explicit zero blur as a
disambiguator.
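
The "contiguously from the end" rule can be sketched on a toy shadow record. The field names, including the [var_color] flag, are assumptions made for the example.

```ocaml
type len = Zero | Px of float

type shadow = {
  x : len;
  y : len;
  blur : len option;
  spread : len option;
  var_color : bool; (* the colour is a var() reference *)
}

let normalize_shadow s =
  (* A zero spread is always a redundant trailing default. *)
  let spread = if s.spread = Some Zero then None else s.spread in
  (* A zero blur drops only when nothing follows it and the colour is
     not a var(), which needs the explicit zero as a disambiguator. *)
  let blur =
    if spread = None && s.blur = Some Zero && not s.var_color then None
    else s.blur
  in
  { s with blur; spread }
```

With the fold in the normaliser, the printers can emit whatever fields are present without making choices of their own.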
pp_number_percentage was shortest-spelling [Pct 50.] as [.5] (and
[Pct 100.] as [1]) under minify, a node-changing Pct->Num fold that
violates pp purity (the two AST variants parse-roundtrip to different
nodes). Move the fold into the normalize/optimize pass, gated to typed
leaves (scale, filter opacity, ...) so it stays out of custom-property
declarations whose consumer context is unknown.

pp now emits the faithful percentage (50%, 100%) under both pretty and
minify; optimize+minify produces the short number form at typed leaves
(filter:opacity(50%) -> opacity(.5)). Refresh check_number_percentage
and the "simple value helpers optimize+minify" expected to match.
…alize

The selector lists inside [:where], [:is], [:not], [:has], the legacy
[:-moz-any] / [:-webkit-any] aliases, and the [of S] clause of the
[:nth-*-child] / [:nth-*-of-type] families match as set unions (Selectors
4 sec. 3.3-3.5, 4.5, 6.5), so their order has no effect on matching or
specificity. Dedup and sort them by minified-printed form in
[Selector.canonicalize] alongside the existing top-level [List]
handling, so permutations collapse to a single canonical AST and pp
stays a pure serialiser. [Css_compare ~mode:Canonical] equates the
permutations automatically.
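
The dedup-and-sort step can be sketched on a toy selector type with three of the functional pseudo-classes. The type, the printer, and [canonicalize] are hypothetical and much smaller than a real selector AST.

```ocaml
type selector =
  | Class of string
  | Is of selector list
  | Where of selector list
  | Not of selector list

(* Minified printer used as the sort key. *)
let rec to_string = function
  | Class c -> "." ^ c
  | Is l -> ":is(" ^ args l ^ ")"
  | Where l -> ":where(" ^ args l ^ ")"
  | Not l -> ":not(" ^ args l ^ ")"

and args l = String.concat "," (List.map to_string l)

(* Canonicalise arguments first, then sort and dedup by printed form. *)
let rec canonicalize s =
  let canon_list l =
    List.map canonicalize l
    |> List.sort_uniq (fun a b -> String.compare (to_string a) (to_string b))
  in
  match s with
  | Class _ -> s
  | Is l -> Is (canon_list l)
  | Where l -> Where (canon_list l)
  | Not l -> Not (canon_list l)
```

Two permutations of the same argument list then compare equal as ASTs, with no help from the printer.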
…lize

The four [inset()] slots, [xywh()] / [rect()] coordinates, and [polygon()]
vertices are plain [<length-percentage>] positions - not [calc()] operands -
so CSS Values L4 sec. 6.1.1 allows the zero-unit drop ([0px] -> [0]) the way
top-level lengths already do. [Values.normalize_length{,_percentage}] and
[normalize_border_radius] already implement the fold via [~strip:true]; the
clip-path shape normalisers were passing [~strip:false], which is the right
setting only inside [calc()] operand positions (where collapsing to the
[<number>] type would break the expression). Flip them to [~strip:true] so
the canonical AST drops the unit and pp can stay a pure serialiser. [circle]
and [ellipse] use [normalize_position_value ~strip:false] which has broader
callers, left for a separate audit.
…kens

A custom-property value is stored as an opaque [Component.t] token stream
that pp serialises verbatim by design - cascade does not type-canonicalise
it. CSS Values L4 sec. 10.10 lets two streams that differ only in optional
whitespace inside a math function ([calc], [min], [max], [clamp], [round],
[mod], [rem], the trig family, [pow]/[sqrt]/[hypot]/[log]/[exp], [abs]/
[sign]) denote the same canonical AST: whitespace is *required* around
binary [+] and [-] (sign-token disambiguation - stripping it changes
[100% - var(--a)] to [100%-var(--a)], where [-var] is one ident-like
function token) but *optional* around [*], [/], [(], [)], and [,]. Walk
the component tree in [normalize_property_value]'s [Custom_property] arm,
strip the optional whitespace inside math-function bodies, and preserve a
single space adjacent to [+] / [-]. Math context propagates through
grouping parens ([Block]) and into nested math functions, but stops at
non-math functions ([var()]) which have their own grammar. Typed math is
already minified by [pp_calc], so this only matters for opaque [Tokens _]
streams - cascade's [Css_compare ~mode:Canonical] now equates the
Tailwind-vs-cascade [calc(1 / 2 * 100%)] / [calc(1/2*100%)] divergence
automatically.
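
A compact sketch of the walk, assuming a toy token type and a shortened list of math function names. [strip] and [descend] are hypothetical names, and runs of whitespace are assumed to have been collapsed to a single [Ws] token by the lexer.

```ocaml
type tok =
  | Ws
  | Delim of char
  | Ident of string
  | Fn of string * tok list
  | Block of tok list

(* Shortened list for the example; the commit names many more. *)
let is_math = function
  | "calc" | "min" | "max" | "clamp" -> true
  | _ -> false

let is_sign = function
  | Some (Delim ('+' | '-')) -> true
  | _ -> false

(* [math] says whether the current body is inside a math function. *)
let rec strip math toks =
  let rec go prev = function
    | [] -> []
    | Ws :: rest when math ->
        let next = match rest with t :: _ -> Some t | [] -> None in
        (* Whitespace next to + or - is required; elsewhere it is optional. *)
        if is_sign prev || is_sign next then Ws :: go (Some Ws) rest
        else go prev rest
    | t :: rest ->
        let t = descend math t in
        t :: go (Some t) rest
  in
  go None toks

and descend math = function
  | Fn (name, body) -> Fn (name, strip (is_math name) body)
  | Block body -> Block (strip math body)
  | t -> t
```

Calling [strip false] on a custom-property value leaves everything outside math functions untouched, including the body of [var()].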
…lues

Extend the Values L4 sec. 6.1 zero-unit fold (Length{Px,0} ->
Length{None,0}, valid in <length>/<length-percentage> contexts) into
object-view-box's basic-shape value via a new normalize_object_view_box
that recurses into the shape arguments, matching the prior pass for
clip-path. Same gating: typed leaf, never inside a math context. Refresh
the optimize+minify oracle in test_css.ml; the pp-only assertion next to
it keeps inset(0px 1px) faithfully.

A custom property whose @property registration has a font-family-shape
syntax (i.e. accepts <custom-ident>+ sequences, optionally with the #
multiplier) is now typed-promoted to the font-family AST. That routes
quoted-string values like "Segoe UI Symbol" through the same
normalize_font_family pass as a real font-family declaration, so
canonical comparison treats them equal to the equivalent unquoted ident
sequence - matching the typed-leaf behavior cascade and Lightning CSS
already apply at the property leaf.

The fold stays gated to registered custom properties so unregistered
ones keep their token stream verbatim (the consumer is unknown - a
var() landing in content: would observe the quoted/unquoted difference).

Add four Css_compare ~mode:`Canonical tests covering the typed leaf,
the safety pin keeping "serif" literal distinct from the generic, the
unregistered-custom-prop guard, and the registered-custom-prop ideal
that this change satisfies.
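
A rough sketch of the gating, assuming a toy [value] type and a shortened
list of generic families; [normalize_custom_property] and [is_ident] are
hypothetical names, not cascade's functions:

```ocaml
type value = Quoted of string | Idents of string list

(* Shortened, illustrative list of generic family keywords. *)
let generics = [ "serif"; "sans-serif"; "monospace"; "cursive"; "fantasy" ]

(* Very rough ident check: non-empty and starting with a letter or '_'. *)
let is_ident w =
  w <> "" && (match w.[0] with 'a' .. 'z' | 'A' .. 'Z' | '_' -> true | _ -> false)

(* Unquote a family name unless that would turn it into a generic keyword
   or into something that is not a sequence of idents. *)
let normalize_font_family = function
  | Quoted s ->
    let words = String.split_on_char ' ' s in
    if List.mem (String.lowercase_ascii s) generics
       || not (List.for_all is_ident words)
    then Quoted s
    else Idents words
  | v -> v

(* Only registered custom properties are typed-promoted; unregistered ones
   keep their value verbatim. *)
let normalize_custom_property ~registered v =
  if registered then normalize_font_family v else v
```

The three branches line up with the typed leaf, the "serif" safety pin and
the unregistered guard listed above.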

A [:host] / [:root] block is the conventional design-token surface, and the
distinct names declared there are commutative (each [--name] is set at most
once per matching element, and even when duplicated the cascade resolves by
later-wins regardless of where in the block the earlier copy sits). Source
order is therefore not cascade-meaningful for the rule, so canonicalisation
sorts the custom-property declarations alphabetically by name in normalize -
giving a single canonical AST whether the source came from Tailwind's
(priority, suborder)-ordered emitter, a typed [Var.binding] constructor, or
a hand-written stylesheet. The sort is stable so duplicate-name pairs keep
their relative order ("later wins" preserved) and non-custom declarations
stay in their original positions so any interleaving with regular properties
is unchanged. Outside [:host] / [:root] the sort never runs - custom-property
declarations in a regular rule participate in the normal cascade, and
regular property declarations have shorthand / dedup interactions that make
source order load-bearing.
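
One way to picture the sort, as a sketch over a toy [decl] record (the
record, [sort_custom_properties] and [normalize_rule] are illustrative
names): collect the custom properties, stable-sort them by name, then refill
the custom slots in order while every other declaration keeps its position.

```ocaml
type decl = { name : string; value : string }

let is_custom d = String.length d.name >= 2 && String.sub d.name 0 2 = "--"

let sort_custom_properties (decls : decl list) : decl list =
  (* [List.stable_sort] keeps duplicate names in source order. *)
  let sorted =
    List.stable_sort
      (fun a b -> String.compare a.name b.name)
      (List.filter is_custom decls)
  in
  (* Refill the custom slots in order, leaving other declarations in place. *)
  let rec refill sorted decls =
    match decls, sorted with
    | [], _ -> []
    | d :: rest, s :: sorted' when is_custom d -> s :: refill sorted' rest
    | d :: rest, _ -> d :: refill sorted rest
  in
  refill sorted decls

(* The sort is gated on the selector; any other rule is returned untouched. *)
let normalize_rule ~selector decls =
  if selector = ":root" || selector = ":host" then sort_custom_properties decls
  else decls
```

Because the sort is stable, the second [--b] in a block still follows the
first after sorting, so later-wins is unaffected.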

When canonical-minified bytes differ but the structural tree-diff finds no
difference, the inputs ARE equivalent - cascade's canonical pass just hasn't
(yet) collapsed those particular textual variants (tool-header comments,
[@layer a, b;] vs split-form, empty layer-order pins, whitespace inside
[url()], ...). Previously the byte-difference would override the empty
structural answer and bubble up as a string-diff fallback, forcing every
caller to allow-list each new textual equivalence. Flip the precedence:
treat the structural comparator as authoritative and collapse to [No_diff]
when it finds nothing, regardless of byte difference.

To keep the canonical-pass gap visible for future work, [No_diff] now carries
a [canonical_byte_diff] payload: [None] when the bytes also matched (the
fast-path), [Some (expected_canon, actual_canon)] when bytes differed but
structure didn't. Callers matching on [No_diff _] get the right answer with no
special-casing; maintainers can read the payload to pick which canonical
normalisation to land next - new folds incrementally empty the field over
time without changing caller-visible behaviour.
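
A sketch of the new precedence, assuming a simplified [diff] type and a
[structural_diff] callback that returns a list of change descriptions; only
[No_diff] and [canonical_byte_diff] come from the text above, the rest is
hypothetical:

```ocaml
type diff =
  | No_diff of { canonical_byte_diff : (string * string) option }
  | Structural of string list

let compare_css ~structural_diff ~expected_canon ~actual_canon =
  if String.equal expected_canon actual_canon then
    (* Fast path: identical bytes, nothing to record. *)
    No_diff { canonical_byte_diff = None }
  else
    match structural_diff () with
    (* The structural comparator is authoritative: no structural change
       means no diff, but the byte gap is kept for maintainers. *)
    | [] -> No_diff { canonical_byte_diff = Some (expected_canon, actual_canon) }
    | changes -> Structural changes

(* A caller that only cares about equivalence ignores the payload. *)
let equivalent = function No_diff _ -> true | Structural _ -> false
```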
…int-color-adjust

Until now [-webkit-print-color-adjust] fell through to the untyped declaration
path. Type it as a sibling of [Print_color_adjust] reusing the same
[print_color_adjust] value type (same vocabulary: [economy | exact | auto |
revert | revert-layer | inherit | initial | unset | var(...)]), exactly the
shape already used for [Webkit_box_decoration_break] / [Box_decoration_break]
and [Webkit_text_size_adjust] / [Text_size_adjust].

Adds the GADT constructor, parser entry, pp dispatch, var collector, smart
constructor and .mli surface, and the vendor-alias-redundant rule so the
optimizer drops [-webkit-print-color-adjust:V] when an unprefixed twin with
the same value and importance is present in the same rule (matching the
recent-browser dedup policy applied to the other webkit aliases).
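
The vendor-alias-redundant rule can be sketched on a toy [decl] record as
below. This version treats any [-webkit-] name as an alias of its unprefixed
form, which is a simplifying assumption; the real rule presumably works on a
fixed set of typed alias pairs.

```ocaml
type decl = { name : string; value : string; important : bool }

let prefix = "-webkit-"

(* [Some base] when [name] is "-webkit-" ^ base, [None] otherwise. *)
let unprefixed name =
  let n = String.length prefix in
  if String.length name > n && String.sub name 0 n = prefix
  then Some (String.sub name n (String.length name - n))
  else None

(* Drop a prefixed declaration only when the same rule also carries the
   unprefixed twin with the same value and the same importance. *)
let drop_redundant_aliases (decls : decl list) : decl list =
  let has_twin d =
    match unprefixed d.name with
    | None -> false
    | Some base ->
      List.exists
        (fun o -> o.name = base && o.value = d.value && o.important = d.important)
        decls
  in
  List.filter (fun d -> not (has_twin d)) decls
```

A prefixed declaration whose value or importance differs from the twin is
kept, since dropping it could change what an older engine renders.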

The [_in_components] suffix duplicated information already carried by the
[custom_value] (which is a component sequence) type signature; the shorter
[unquote_font_family_strings] reads cleaner at the call site without losing
any meaning.
…toms

[Calc.float : float -> length calc] and [Calc.infinity : length calc] were
artificially over-specialized: per CSS Values L4 sec. 10's type algebra a bare
[<number>] is dimensionally neutral - multiplying it by a [<length>] yields
a [<length>], by an [<angle>] yields an [<angle>], by a [<time>] yields a
[<time>] - so the same numeric leaf must flow into any dimension. The
underlying [Num of float] GADT constructor was already polymorphic (it
carries no dimension); only the smart constructor's signature was wrong.

Relax both to ['a calc]. This unblocks combining a [Num] leaf with a
non-[length] calc under [Calc.mul] / [Calc.div] - e.g. [Calc.var "spacing" *
Calc.float n] specialised to a [flex_basis calc] - which previously failed
to type-check, forcing callers to drop to opaque tokens.

Audited the rest of the Calc atom constructors: [length]/[px]/[rem]/[em]/
[pct] correctly stay at [length calc] (they wrap dimensioned [Val]s),
[var]/[nested]/[parens] were already polymorphic, and the other
dimensionless atoms in the GADT ([Math_const], [Sibling_index],
[Sibling_count]) have no exposed smart constructor. No further changes
needed; [Calc.int] does not exist.
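
To illustrate why the signature matters, here is a cut-down GADT with phantom
dimension tags. The constructors and the [length] / [flex_basis] tags are a
sketch under assumed names, not the real [Calc] module:

```ocaml
(* Phantom dimension tags. *)
type length
type flex_basis

type _ calc =
  | Num : float -> 'a calc                 (* dimensionless: fits any dimension *)
  | Px : float -> length calc              (* dimensioned leaf *)
  | Var : string -> 'a calc
  | Mul : 'a calc * 'a calc -> 'a calc

(* The relaxed smart constructor: ['a calc], not [length calc]. *)
let float (f : float) : 'a calc = Num f
let var (name : string) : 'a calc = Var name
let mul (a : 'a calc) (b : 'a calc) : 'a calc = Mul (a, b)

let rec pp : type a. a calc -> string = function
  | Num f -> Printf.sprintf "%g" f
  | Px f -> Printf.sprintf "%gpx" f
  | Var n -> Printf.sprintf "var(--%s)" n
  | Mul (a, b) -> pp a ^ " * " ^ pp b

(* Type-checks only because [float] is polymorphic in the dimension. *)
let basis : flex_basis calc = mul (var "spacing") (float 2.)
let width : length calc = mul (Px 4.) (float 2.)
```

With [float : float -> length calc], the definition of [basis] would be
rejected, because [Mul] requires both operands to share one dimension.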
…age normalize gap

Two changes the test oracle exposed:

(1) [Declaration.container : ?type_:container_type -> string -> declaration]
    mirrors the spec's [container: <name> [/ <type>]?] grammar with name
    mandatory and type slash-prefixed optional. Closes the gap that forced
    callers wanting the shorthand to drop to opaque tokens via
    [Css.custom_property]. Surfaced on [Cascade.Css.Declaration.container]
    rather than flat [Cascade.Css.container] because the latter is already
    bound to the at-rule constructor; the two values have disjoint types,
    but OCaml resolves names without regard to type, so one would shadow
    the other.

(2) [normalize_property_value] dispatched [Mask_image] through
    [normalize_background_image] but had no arm for [Webkit_mask_image], so
    [-webkit-mask-image: linear-gradient(red, blue)] round-tripped verbatim
    while [mask-image: ...] and [background: ...] shortened [blue] to
    [#00f]. Adding the [Webkit_mask_image] arm closes the inconsistency -
    the vendor-prefixed property now goes through the same canonicalisation
    pipeline as its unprefixed twin.

Oracle tests in [test_declaration.ml] (added by the user as the spec)
pin the canonical forms for [-webkit-print-color-adjust], [-webkit-mask-image],
and [flex-basis: calc(var * number)] round-trips - the latter pinning the
output that the now-polymorphic [Calc.float : float -> 'a calc] must
produce when specialised to a [flex_basis calc].
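
For the [container] shorthand, the optional-argument shape can be sketched as
follows. This toy version returns the serialised string instead of a typed
[declaration], and the [container_type] variant is an assumed subset:

```ocaml
type container_type = Normal | Size | Inline_size

let container_type_to_string = function
  | Normal -> "normal"
  | Size -> "size"
  | Inline_size -> "inline-size"

(* Mirrors [container: <name> [/ <type>]?]: the name is mandatory and the
   type is an optional, slash-prefixed suffix. *)
let container ?type_ (name : string) : string =
  match type_ with
  | None -> Printf.sprintf "container:%s" name
  | Some t -> Printf.sprintf "container:%s/%s" name (container_type_to_string t)
```

Keeping the name as the final positional argument lets [container "sidebar"]
apply fully with the type left out.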
…image normalize

Two related changes:

(1) Add four submodules under [Css] - [Transform], [Transform_origin],
    [Perspective_origin], [Animation] - each exposing
    [of_string : string -> (_, Error.t) result] over the existing
    [Properties.read_*] cursor-driven parsers. The argument is the
    right-hand-side of a declaration with no [property:] prefix or
    trailing [;], so callers (e.g. Tailwind bracket values like
    [transform: rotate(45deg)]) can lift a typed value substring into
    the typed AST without round-tripping through a full declaration
    parse-and-extract.

    Result-typed rather than exception-raising: a bracket value with a
    typo needs to surface the parser's structured [Error.t] (location,
    sort, breadcrumb path, snippet) so the caller can render a
    diagnostic, not just [Failure "bad transform"]. Empty-input and
    trailing-input cases produce [Bad_value] errors with the property
    name annotation.

(2) Revert the [Webkit_mask_image -> normalize_background_image] arm
    added in bdbbfcb. The corresponding test oracle was updated to
    expect [-webkit-mask-image:linear-gradient(red,blue)] verbatim
    (not the canonical [red,#00f]); vendor-prefixed properties round-trip
    the author's spelling rather than getting re-canonicalised. The
    unprefixed [Mask_image] still goes through the colour-fold
    pipeline.
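
A minimal sketch of the result-typed [of_string] shape from (1), for a toy
grammar that accepts only [rotate(<number>deg)]. The [error] record is a
simplified stand-in for [Error.t], and the hand-rolled parsing does not
reflect the real cursor-driven [Properties.read_*] parsers:

```ocaml
type transform = Rotate_deg of float

(* Simplified stand-in for the structured parser error. *)
type error = { property : string; position : int; message : string }

let of_string (s : string) : (transform, error) result =
  let bad position message = Error { property = "transform"; position; message } in
  let s = String.trim s in
  let prefix = "rotate(" in
  let n = String.length s and p = String.length prefix in
  if n = 0 then bad 0 "empty value"
  else if n < p || String.sub s 0 p <> prefix then bad 0 "expected rotate("
  else
    match String.index_opt s ')' with
    | None -> bad n "missing closing parenthesis"
    | Some close when close < n - 1 -> bad (close + 1) "trailing input"
    | Some close ->
      (* Here [close = n - 1]; the argument sits between "(" and ")". *)
      let arg = String.sub s p (close - p) in
      let m = String.length arg in
      if m < 3 || String.sub arg (m - 3) 3 <> "deg" then
        bad p "expected an angle in deg"
      else
        match float_of_string_opt (String.sub arg 0 (m - 3)) with
        | Some f -> Ok (Rotate_deg f)
        | None -> bad p "invalid number"
```

Returning [Error] with a position and message, rather than raising, leaves
it to the caller to decide how to render the diagnostic.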