From 442c8d933786b6243071dee91d9bade137b93993 Mon Sep 17 00:00:00 2001 From: Rowan Cockett Date: Mon, 2 Feb 2026 18:43:12 -0700 Subject: [PATCH 1/3] =?UTF-8?q?=F0=9F=8E=AF=20Core=20Node=20Types=20and=20?= =?UTF-8?q?Naming=20Conventions?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- content/RFC0003/index.md | 299 +++++++++++++++++++++++++++++++++++++++ content/RFC0003/myst.yml | 11 ++ 2 files changed, 310 insertions(+) create mode 100644 content/RFC0003/index.md create mode 100644 content/RFC0003/myst.yml diff --git a/content/RFC0003/index.md b/content/RFC0003/index.md new file mode 100644 index 0000000..6e861b2 --- /dev/null +++ b/content/RFC0003/index.md @@ -0,0 +1,299 @@ +--- +title: Core Node Types and Naming Conventions +--- + +This RFC defines the **initial set of non-controversial OXA node types** and establishes a **naming convention** for distinguishing block-level and inline-level content. + +The goal of this RFC is not to be exhaustive, but to lock down the _boring, obvious, and widely shared_ parts of document structure so that tooling can rely on a stable core while more complex or contested structures are addressed in later RFCs.Context + +Across authoring and publishing systems — including word processors, Markdown dialects, HTML, Pandoc, MyST, Quarto, Stencila, JATS — there is strong convergence around a small set of structural elements. These elements form the backbone of narrative documents and are consistently represented as a traversable tree of blocks and inline content. + +By defining these nodes early, OXA establishes a predictable baseline that: + +- Enables generic traversal and transformation +- Supports round-tripping across ecosystems +- Minimizes early bikeshedding +- Creates a clear extension path for future node types + +## Proposed Core Node Types + +### Block-Level Nodes + +These nodes represent structural units that occupy their own place in the document tree. 
+ +- **Heading** +- **Paragraph** +- **Code** +- **ThematicBreak** + +These block nodes are sufficient to represent the majority of narrative scientific documents in a structured, tool-agnostic way. + +### Inline-Level Nodes + +These nodes represent content that appears _within_ block nodes and participates in inline flow. + +- **Text** +- **Emphasis** +- **Strong** +- **Superscript** +- **Subscript** +- **InlineCode** + +Inline nodes are always expected to appear within the `children` array of a block or inline container (See RFC0002). + +## Explicitly Out of Scope + +The following structures are intentionally excluded from this RFC because they introduce additional complexity best handled later: + +- **Quotes / BlockQuotes** — deferred due to attribution, provenance, and citation considerations +- **Tables** — complex structure, dedicated RFC +- **Lists** — ordered, unordered, and definition lists introduce additional hierarchy and semantics +- **Figures, images, and media** — handled in later media-focused RFCs +- **Citations and bibliographies** — require identifier and relationship models + +Excluding these nodes at this stage is a deliberate choice to keep the initial core small, stable, and easy to implement. + +## Naming Conventions and Design Trade-offs + +A key decision in this RFC is the **naming pattern used to distinguish block-level and inline-level nodes**. + +Several conventions were considered: + +- `BlockCode` / `InlineCode` +- `CodeBlock` / `CodeInline` +- `Code` / `InlineCode` + +Each option has trade-offs. + +### Considerations + +1. **Consistency for future extensions** + - New node types should not require revisiting naming decisions later. + - The naming scheme should gracefully handle unforeseen additions. + +2. **Avoiding semantic overreach** + - Some existing nodes (e.g. `BlockQuote`) blur the line between semantic meaning and layout, complicating symmetric naming. 
+ +### Chosen Pattern: `Code` / `InlineCode` + +This RFC proposes the `Code` / `InlineCode` naming pattern as the default convention: + +> When a node type has no clear default level (unlike, say, `Text`, which is inherently inline) and both block and inline variants exist, the inline variant is explicitly prefixed with `Inline`. + +Benefits of this approach: + +- Gives each new node type its natural default level (block or inline) as the unprefixed name, matching common document structure +- Avoids forcing symmetric pairs where they are awkward or misleading (e.g. `Quote` / `InlineQuote`) +- Leaves room for future inline-only extensions without renaming existing block types + +For example: + +- `Code` (block-level) +- `InlineCode` (inline-level) + +This pattern also enables future extensions such as: + +- `List` (block-level) +- `InlineList` (inline-level, e.g. for structured a/b/c points within a sentence) + +Importantly, this avoids a breaking change if an inline variant is introduced later: we do not need to retroactively rename an existing block type to something like "block list". + +The trade-off is that block and inline variants are not grouped adjacently in file listings (e.g. `Code` vs `InlineCode`). This is considered an acceptable and minor cost. + +## Node Definitions and Examples + +This section provides more concrete, implementation-oriented examples of the proposed core nodes. The examples below are meant to clarify intent and guide implementers. + +### Abstract Node Shapes + +#### `Literal` + +```typescript +interface Literal { + value: string; +} +``` + +**Literal** represents a leaf node whose primary content is a scalar value. In OXA, `Literal` nodes carry their content via the `value` field rather than `children`. + +Examples of `Literal` nodes include `Text` and `InlineCode`. + +#### `Parent` + +```typescript +interface Parent { + children: [Node]; +} +``` + +**Parent** represents a node that contains other nodes. 
Parent nodes define the traversable document tree and are the primary mechanism by which structure and ordering are expressed. + +### Block-Level Nodes + +#### `Heading` + +```typescript +interface Heading extends Parent { + type: 'Heading'; + level: number; + children: [Inline]; +} +``` + +**Heading** represents the heading for a section of content. + +- `level` indicates the heading depth (e.g. 1–6) +- Content is expressed via inline children + +Example: + +```yaml +{ type: 'Heading', level: 1, children: [{ type: 'Text', value: 'Introduction' }] } +``` + +#### `Paragraph` + +```typescript +interface Paragraph extends Parent { + type: 'Paragraph'; + children: [Inline]; +} +``` + +**Paragraph** represents a unit of prose. It contains inline content such as text, emphasis, and inline code. + +Example: + +```yaml +{ + type: 'Paragraph', + children: + [ + { type: 'Text', value: 'Run ' }, + { type: 'InlineCode', value: 'make build' }, + { type: 'Text', value: ' to compile the project.' }, + ], +} +``` + +#### `Code` + +```typescript +interface Code extends Literal { + type: 'Code'; + language?: string; +} +``` + +**Code** represents a block of preformatted text, typically source code. + +- `language` (optional) indicates the programming language +- Content is stored in the `value` field + +Example: + +```yaml +{ type: 'Code', language: 'python', value: 'print("Hello, world")' } +``` + +This node is conceptually paired with `InlineCode`, but is block-level. + +#### `ThematicBreak` + +```typescript +interface ThematicBreak { + type: 'ThematicBreak'; +} +``` + +**ThematicBreak** represents a thematic or structural division between sections of content. + +Example: + +```yaml +{ type: 'ThematicBreak' } +``` + +### Inline-Level Nodes + +#### `Text` + +```typescript +interface Text extends Literal { + type: 'Text'; +} +``` + +**Text** represents unformatted character data. 
+ +Example: + +```yaml +{ type: 'Text', value: 'Hello world' } +``` + +#### `Emphasis` + +```typescript +interface Emphasis extends Parent { + type: 'Emphasis'; + children: [Inline]; +} +``` + +**Emphasis** represents stressed emphasis of its contents. + +Example: + +```yaml +{ type: 'Emphasis', children: [{ type: 'Text', value: 'important' }] } +``` + +#### `Strong` + +```typescript +interface Strong extends Parent { + type: 'Strong'; + children: [Inline]; +} +``` + +**Strong** represents strong importance or emphasis. + +#### `InlineCode` + +```typescript +interface InlineCode extends Literal { + type: 'InlineCode'; +} +``` + +**InlineCode** represents short fragments of code appearing within prose. + +- `language` (optional) indicates the programming language +- content is stored in the `value` field + +Example: + +```yaml +{ type: 'InlineCode', language: 'bash', value: 'ls -la' } +``` + +## Implications + +If accepted, this RFC: + +- Establishes a minimal, interoperable set of node types for early OXA implementations +- Provides a clear naming convention for block vs inline nodes +- Sets a precedent for how future node types should be introduced + +Subsequent RFCs can build on this foundation to introduce: + +- Lists and tables +- Quotes and attribution-aware structures +- Media and figures +- Citations and identifiers + +## Decision + +Acceptance of this RFC establishes the initial core vocabulary and naming conventions for OXA schemas, enabling early implementations to converge while leaving space for future evolution. 
diff --git a/content/RFC0003/myst.yml b/content/RFC0003/myst.yml new file mode 100644 index 0000000..adee8f2 --- /dev/null +++ b/content/RFC0003/myst.yml @@ -0,0 +1,11 @@ +# See docs at: https://mystmd.org/guide/frontmatter +version: 1 +extends: + - ../rfc.yml +project: + id: 018d2b38-553f-4f4c-b9a3-9368c6b4f2e9 + short_title: Core Node Types and Naming Conventions + date: 2026-02-02 + authors: + - rowanc1 + - nokome From a23a5a33dff44f6e751f72c412c295061eaec7d0 Mon Sep 17 00:00:00 2001 From: Rowan Cockett Date: Mon, 2 Feb 2026 18:49:24 -0700 Subject: [PATCH 2/3] Add subscript/superscript and abstract --- content/RFC0003/index.md | 38 +++++++++++++++++++++++++++++++++++++- 1 file changed, 37 insertions(+), 1 deletion(-) diff --git a/content/RFC0003/index.md b/content/RFC0003/index.md index 6e861b2..973260b 100644 --- a/content/RFC0003/index.md +++ b/content/RFC0003/index.md @@ -1,10 +1,12 @@ --- title: Core Node Types and Naming Conventions +abstract: | + Defines the initial set of non-controversial OXA node types (block and inline) and establishes a naming convention so tooling can rely on a stable core. --- This RFC defines the **initial set of non-controversial OXA node types** and establishes a **naming convention** for distinguishing block-level and inline-level content. -The goal of this RFC is not to be exhaustive, but to lock down the _boring, obvious, and widely shared_ parts of document structure so that tooling can rely on a stable core while more complex or contested structures are addressed in later RFCs.Context +The goal of this RFC is not to be exhaustive, but to lock down the _boring, obvious, and widely shared_ parts of document structure so that tooling can rely on a stable core while more complex or contested structures are addressed in later RFCs. 
Across authoring and publishing systems — including word processors, Markdown dialects, HTML, Pandoc, MyST, Quarto, Stencila, JATS — there is strong convergence around a small set of structural elements. These elements form the backbone of narrative documents and are consistently represented as a traversable tree of blocks and inline content. @@ -260,6 +262,40 @@ interface Strong extends Parent { **Strong** represents strong importance or emphasis. +#### `Superscript` + +```typescript +interface Superscript extends Parent { + type: 'Superscript'; + children: [Inline]; +} +``` + +**Superscript** represents content that should be rendered above the baseline (e.g. exponents, ordinal suffixes). + +Example: + +```yaml +{ type: 'Superscript', children: [{ type: 'Text', value: '2' }] } +``` + +#### `Subscript` + +```typescript +interface Subscript extends Parent { + type: 'Subscript'; + children: [Inline]; +} +``` + +**Subscript** represents content that should be rendered below the baseline (e.g. chemical formulae, variable indices). + +Example: + +```yaml +{ type: 'Subscript', children: [{ type: 'Text', value: '2' }] } +``` + #### `InlineCode` ```typescript From 6d433d3d69b698347241f249c0ffd44ef4ca0813 Mon Sep 17 00:00:00 2001 From: Rowan Cockett Date: Mon, 2 Feb 2026 18:50:01 -0700 Subject: [PATCH 3/3] Updates --- content/RFC0003/index.md | 1 - 1 file changed, 1 deletion(-) diff --git a/content/RFC0003/index.md b/content/RFC0003/index.md index 973260b..502865c 100644 --- a/content/RFC0003/index.md +++ b/content/RFC0003/index.md @@ -14,7 +14,6 @@ By defining these nodes early, OXA establishes a predictable baseline that: - Enables generic traversal and transformation - Supports round-tripping across ecosystems -- Minimizes early bikeshedding - Creates a clear extension path for future node types ## Proposed Core Node Types
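The RFC text in this series says the core nodes should enable "generic traversal and transformation". As a rough illustration of what that could look like, the sketch below restates the proposed node shapes as a TypeScript union and walks the tree. It is not part of the patch and makes several assumptions: the names `OxaNode`, `TextNode`, `visit`, and `toPlainText` are hypothetical; the RFC's `[Inline]` notation is read as "a list of inline nodes" (`Inline[]`); and `language` is treated as optional on `InlineCode` because the RFC prose and example mention it, even though the interface in the patch does not declare it.

```typescript
// Illustrative sketch only. Shapes follow the RFC; helper names are hypothetical.

// The RFC calls this node `Text`; it is named TextNode here only to avoid
// clashing with the DOM's global `Text` type in script contexts.
interface TextNode { type: 'Text'; value: string }
// `language` is an assumption taken from the RFC prose, not from its interface.
interface InlineCode { type: 'InlineCode'; value: string; language?: string }
interface Emphasis { type: 'Emphasis'; children: Inline[] }
interface Strong { type: 'Strong'; children: Inline[] }
interface Superscript { type: 'Superscript'; children: Inline[] }
interface Subscript { type: 'Subscript'; children: Inline[] }
type Inline = TextNode | InlineCode | Emphasis | Strong | Superscript | Subscript;

interface Heading { type: 'Heading'; level: number; children: Inline[] }
interface Paragraph { type: 'Paragraph'; children: Inline[] }
interface Code { type: 'Code'; value: string; language?: string }
interface ThematicBreak { type: 'ThematicBreak' }
type Block = Heading | Paragraph | Code | ThematicBreak;

type OxaNode = Block | Inline;

// Depth-first, pre-order walk: Parent-style nodes are the ones with `children`.
function visit(node: OxaNode, fn: (n: OxaNode) => void): void {
  fn(node);
  if ('children' in node) {
    node.children.forEach((child) => visit(child, fn));
  }
}

// Literal-style nodes contribute `value`; Parent-style nodes concatenate children.
function toPlainText(node: OxaNode): string {
  if ('value' in node) return node.value;
  if ('children' in node) return node.children.map((c) => toPlainText(c)).join('');
  return ''; // ThematicBreak has neither `value` nor `children`
}

// The Paragraph example from the RFC, expressed as a typed value.
const paragraph: Paragraph = {
  type: 'Paragraph',
  children: [
    { type: 'Text', value: 'Run ' },
    { type: 'InlineCode', value: 'make build' },
    { type: 'Text', value: ' to compile the project.' },
  ],
};

const seen: string[] = [];
visit(paragraph, (n) => seen.push(n.type));
console.log(seen.join(' > '));
console.log(toPlainText(paragraph));
```

Because every node is either `Literal`-like (has `value`) or `Parent`-like (has `children`), both helpers can branch on the presence of those two fields without enumerating node types, which is one way a small, stable core could keep generic tooling simple.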