diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 0000000..241c527
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,259 @@
+# Architecture
+
+_Last updated: 2026-05-25_
+
+This document explains the moving parts of `slackify-markdown-python` and the
+design decisions behind the trickier bits. It assumes you've already read the
+README and understand what the library does at a surface level (convert
+CommonMark / GFM-flavored Markdown into Slack's `mrkdwn` flavor).
+
+## File layout
+
+```
+src/slackify_markdown/
+├── __init__.py     # exports `slackify_markdown(text) -> str`
+├── service.py      # thin entry: SlackifyMarkdown(text).slackify()
+├── slackify.py     # the renderer (everything interesting lives here)
+└── utils.py        # escape_specials() — &, <, >, preserves Slack mentions
+tests/
+└── test_convert.py # pytest suite
+```
+
+## Parsing pipeline
+
+We use `markdown-it-py` as the parser. We extend its `RendererHTML` class and
+override the per-token handlers (`paragraph_open`, `bullet_list_close`, etc.)
+to emit Slack `mrkdwn` instead of HTML.
+
+```
+markdown text
+     │
+     ▼
+slackify() ── scrub STX from input (see "newline sentinel" below)
+     │
+     ▼
+MarkdownIt(gfm-like).render(text)
+     │   produces a flat token stream:
+     │   [paragraph_open, inline(text+strong+...), paragraph_close,
+     │    bullet_list_open, list_item_open, paragraph_open(hidden), ...,
+     │    bullet_list_close]
+     ▼
+SlackifyMarkdown.render(tokens)
+     │   filters to SUPPORTED_TOKENS, then delegates to RendererHTML.render
+     │   which dispatches each token to the matching handler method
+     │   on our class. Handlers return strings, which are concatenated.
+     ▼
+post-process: cap structural-newline runs, materialize sentinel → \n, rstrip
+     │
+     ▼
+final mrkdwn string
+```
+
+The handlers are mostly straightforward: `strong_open` returns `*`,
+`em_open` returns `_`, `link_open/close` build `<url|text>`, etc.
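
The dispatch model described above can be illustrated with a toy stand-in. This is a minimal sketch, not `markdown-it-py`'s own classes: `Tok` and `ToyRenderer` are hypothetical names, and the real renderer receives `(tokens, idx, options, env)` rather than a single token. It only shows the shape of the idea: one handler method per token type, each returning a string, with the results concatenated in stream order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tok:
    """Hypothetical stand-in for a parser token: a type name plus optional text."""
    type: str
    content: str = ""


class ToyRenderer:
    """Looks up a method named after each token type and joins the results."""

    def render(self, tokens: List[Tok]) -> str:
        # Each handler sees only its own token; the outputs are joined blindly.
        return "".join(getattr(self, tok.type)(tok) for tok in tokens)

    def text(self, tok: Tok) -> str:
        return tok.content

    def strong_open(self, tok: Tok) -> str:
        return "*"

    def strong_close(self, tok: Tok) -> str:
        return "*"

    def em_open(self, tok: Tok) -> str:
        return "_"

    def em_close(self, tok: Tok) -> str:
        return "_"


# Token stream roughly corresponding to "**bold** and *it*".
stream = [
    Tok("strong_open"), Tok("text", "bold"), Tok("strong_close"),
    Tok("text", " and "),
    Tok("em_open"), Tok("text", "it"), Tok("em_close"),
]
print(ToyRenderer().render(stream))
```

Because no handler can see its neighbours, any spacing decision that depends on what comes next has to be deferred to a post-processing step.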
+
+## Format mappings
+
+| Markdown | Slack mrkdwn | Notes |
+|---|---|---|
+| `# Heading` | `*Heading*` | All 6 levels collapse to Slack's single bold form |
+| `**bold**` | `*bold*` | |
+| `*italic*` | `_italic_` | |
+| `~~strike~~` | `~strike~` | |
+| `` `code` `` | `` `code` `` | |
+| `[txt](url)` | `<url\|txt>` | |
+| `<url>` autolink | `<url>` | |
+| `- item` | `• item` | (4-space indent per nest level) |
+| `1. item` | `1. item` | |
+| `> quote` | `> quote` | Single-line prefix; multi-line currently flows as plain |
+| Fenced ```` ``` ```` | ```` ``` ```` | Content preserved verbatim, including blank lines |
+
+## The "structural newline" cap and STX sentinel
+
+This is the only nontrivial piece of machinery in the renderer. It exists
+because `markdown-it-py`'s default renderer model is a **flat token stream
+with string-concatenation**, and that model produces ugly newline cascades
+when block elements close in chains.
+
+### The problem
+
+When a deeply nested list ends, several close-handlers fire back-to-back.
+Each one independently emits `\n` (or `\n\n`) for "structural separation."
+But they don't know about each other. So a 3-level list ending into a
+paragraph produces:
+
+```
+last_paragraph_close (hidden, tight):  \n
+inner_list_close:                      \n
+mid_list_close:                        \n
+outer_list_close:                      \n
+                                     = "\n\n\n\n" before next block
+```
+
+That's 3 blank lines between the list and the paragraph. Visually broken.
+
+Blockquotes have the same shape: inner `paragraph_close` emits `\n\n`, then
+`blockquote_close` adds another `\n` = `\n\n\n` (2 blank lines instead of 1).
+
+### The fix — sentinel + cap regex
+
+We replace every "structural" newline emitted by a close-handler with a
+sentinel character `NEW_LINE = "\x02"` (U+0002 STX, ASCII Start of Text).
+Then in `render()`, after all handlers have run, we:
+
+1. Cap runs of 3+ sentinels down to 2 with one regex.
+2. Replace every sentinel with a real `\n`.
+ +```python +NEW_LINE = "\x02" +_NEW_LINE_CAP_RE = re.compile(NEW_LINE + "{3,}") + +def render(self, tokens, options, env): + final = [t for t in tokens if t.type in self.SUPPORTED_TOKENS] + rendered = super().render(final, options, env) + rendered = self._NEW_LINE_CAP_RE.sub(self.NEW_LINE * 2, rendered) + rendered = rendered.replace(self.NEW_LINE, "\n") + return rendered.rstrip("\n") + "\n" +``` + +Net effect: any structural-newline cascade collapses to exactly one blank +line, regardless of how deep the close-chain is. + +### Why a sentinel — why not just `re.sub(r"\n{3,}", "\n\n", rendered)`? + +Code blocks. A fenced ```` ``` ```` block can legitimately contain runs of +blank lines in its content. If we ran the cap regex against the rendered +output directly, we'd corrupt user code. + +By using a sentinel instead of real `\n` for *structural* newlines, we +separate the alphabets: + +- Close-handlers emit `\x02` (sentinel) for "I'm contributing to block + separation" +- Code-block handlers emit literal `\n` for content +- The cap regex only ever sees / cares about runs of `\x02` +- The final `replace` materializes sentinels into real `\n` +- Code-block `\n` is untouched throughout + +### Why STX (U+0002) specifically? + +This is well-trodden territory. The most-installed Python markdown library, +`python-markdown`, uses **STX (U+0002) and ETX (U+0003)** as boundary +markers for its own internal placeholders (see `markdown/util.py` — +`AMP_SUBSTITUTE`, `INLINE_PLACEHOLDER`, etc.). We use the same convention +for the same reason: STX is an ASCII control character that essentially +never appears in real user text, and is safe to manipulate as a normal +character everywhere we touch it. 
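
The "separate alphabets" argument above can be checked with a small standalone comparison. This is a sketch under assumptions: `naive_cap` and `sentinel_cap` are hypothetical helper names, and the input strings are hand-built stand-ins for renderer output rather than anything produced by the library. The same cascade of four structural newlines precedes a code block that contains a run of blank lines.

```python
import re

NEW_LINE = "\x02"  # same sentinel value the renderer uses
CAP_RE = re.compile(NEW_LINE + "{3,}")


def naive_cap(rendered: str) -> str:
    """Cap runs of real newlines; cannot tell structure from code content."""
    return re.sub(r"\n{3,}", "\n\n", rendered)


def sentinel_cap(rendered: str) -> str:
    """Cap runs of sentinels only, then turn the survivors into real newlines."""
    capped = CAP_RE.sub(NEW_LINE * 2, rendered)
    return capped.replace(NEW_LINE, "\n")


# Code-block content uses real newlines, including three blank lines.
code_block = "```\nline1\n\n\n\nline2\n```\n"

# Same output twice: once with real newlines for structure, once with sentinels.
with_real_newlines = "• item" + "\n" * 4 + code_block + "After."
with_sentinels = "• item" + NEW_LINE * 4 + code_block + "After."

print(repr(naive_cap(with_real_newlines)))   # blank lines inside the fence are collapsed
print(repr(sentinel_cap(with_sentinels)))    # fence content is left as written
```

Both versions collapse the cascade before the fence to one blank line. Only the naive version also rewrites the blank lines between `line1` and `line2`.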
+ +Other choices we considered and rejected: + +| Choice | Verdict | Why | +|---|---|---| +| `\n{3,}` regex on real newlines | rejected | Eats blank lines inside code blocks | +| Split output on `` ``` ``, regex evens | rejected | Special-casing; readable but ugly | +| `` (Private Use Area) | rejected | "Reserved for cooperating apps to define" — Unicode FAQ explicitly warns this collides with real PUA usage | +| `﷐` (Unicode noncharacter) | rejected | Same valid-Unicode-codepoint risk per FAQ | +| `\x00` NULL | rejected | Python source files cannot contain literal NULL; shells / argv / `os.exec*` all reject NULL | +| **`\x02` STX** | **chosen** | Battle-tested by python-markdown; ASCII-safe in source files, shells, JSON, filesystems | +| Source-level fix (track container state, lookahead, kill cascade at emit) | viable but rejected for now | ~30 lines + new state vs. 1 regex; would couple close-handlers | + +### Collision safety — the input scrub + +STX *can* appear in user input — `markdown-it-py` does not strip ASCII +control characters during normalization. So before parsing, `slackify()` +runs a one-line scrub: + +```python +text = self.markdown_text.replace(self.NEW_LINE, "") +``` + +This guarantees that no STX reaches the renderer except via our own +close-handlers, so the cap-then-materialize logic can't be confused. + +The trade-off: a literal STX a user typed in their Markdown will be +silently dropped. In practice nobody types ASCII control characters into +Markdown by accident, so this is a non-issue. + +## State on the renderer + +Two pieces of state, both reset to default per-instance: + +- `self._in_heading: bool` — set by `heading_open`, cleared by `heading_close`. + Used by `strong_open/close` to suppress `**` inside `# **Bold**` headings + (otherwise Slack `mrkdwn` collides: both heading and bold map to `*`, + producing malformed `**text**` output). 
+- `self._list_depth: int` — incremented by `bullet_list_open` and + `ordered_list_open`, decremented by the corresponding closes. Used by + `list_item_open` to choose the right bullet glyph (`•` / `◦` / `▪` for + depths 1/2/3+) and to compute the leading indent (`4 * (depth - 1)` spaces). + +We never look at sibling/parent token relationships beyond the one-token-back +implicit "did we just see X" via these flags. Anything more sophisticated +would push us toward the AST-walker design (see below). + +## Known limitations + +These all stem from the same root cause: a flat-token-stream renderer with +no structural context can't compute things that depend on tree shape. + +1. **Multi-paragraph items in lists don't carry the list indent.** When a + list item contains a second paragraph or a code block, the continuation + block flows back to column 0 instead of being indented to match the + item. Fixing this needs the renderer to know "I'm currently inside a + `list_item` at depth N" when handling a `code_block` or non-first + `paragraph` token. +2. **Hardbreak / softbreak continuation lines inside list items lose + indent** for the same reason as (1). +3. **Bullet glyphs only have 3 distinct shapes** (`•`, `◦`, `▪`); deeper + nesting reuses `▪` but indent keeps growing. This matches Slack's own + native rendering of deeply nested lists. +4. **Multi-line blockquotes** only get the `> ` prefix on the first + paragraph. Lines after the first `paragraph_close` inside a blockquote + flow as plain content. + +## Would a real AST renderer fix this? + +Yes. The cascade is purely an artifact of `markdown-it-py`'s sequential +token-stream renderer model. Each `*_close` handler returns a string in +isolation, and they get concatenated blindly. 
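
For contrast with the flat handler model, the sketch below walks a small nested structure recursively. It is an illustration only: it uses plain tuples as a hypothetical tree rather than `markdown_it.tree.SyntaxTreeNode`, and `walk_list` and `render_list_then_paragraph` are made-up names, not part of the library. The bullet glyphs and the 4-space indent unit are the ones this document describes. Depth is passed down the recursion, so the glyph and indent come from the call itself, and the separator before the next block is emitted once by the caller.

```python
from typing import List, Tuple

BULLETS = ("•", "◦", "▪")   # glyphs for depths 1, 2, 3+
INDENT_UNIT = "    "        # 4 spaces per nesting level

# Hypothetical tree shape: each item is (text, list_of_child_items).
Item = Tuple[str, list]


def walk_list(items: List[Item], depth: int = 1) -> List[str]:
    """Return one output line per item, indenting children by their depth."""
    glyph = BULLETS[min(depth, len(BULLETS)) - 1]
    indent = INDENT_UNIT * (depth - 1)
    lines: List[str] = []
    for text, children in items:
        lines.append(f"{indent}{glyph} {text}")
        lines.extend(walk_list(children, depth + 1))
    return lines


def render_list_then_paragraph(items: List[Item], paragraph: str) -> str:
    """The caller owns block separation, so exactly one blank line is emitted."""
    return "\n".join(walk_list(items)) + "\n\n" + paragraph + "\n"


tree = [("L1", [("L2", [("L3", [("L4", [])])])])]
print(render_list_then_paragraph(tree, "After the list."))
```

However deep the list goes, the separator is written in one place, so there is no cascade to cap afterwards.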
+ +A tree-walker renderer would have full structural context: when walking +into a `list_item` node it could push indent state; when emitting the last +child of a top-level `list` node it could emit *exactly* the right +separator for what comes next; multi-paragraph items would naturally +indent because the walker knows it's inside an item. + +`markdown-it-py` ships with `markdown_it.tree.SyntaxTreeNode` which can +build a tree from a token list. Migrating would mean writing a recursive +`walk(node) -> str` method that owns its own indent / spacing state, +replacing both the per-handler emit model and the sentinel cap. + +This is ~50 lines of refactor and probably the right long-term move. It +would obsolete the sentinel + cap and resolve all four known limitations +above. Not done yet because the current setup works for real Slack +content and the cap is a 5-line fix that buys ~80% of the value. + +Tracked in [issue #19](https://github.com/thesmallstar/slackify-markdown-python/issues/19). + +## Test coverage + +`tests/test_convert.py` contains 60 tests covering: + +- All single-token mappings (bold, italic, strike, links, mentions, etc.) +- Tight lists, loose lists, mixed lists, deep nesting up to 5 levels +- Blockquotes with inner content +- Code blocks with special characters and blank-line preservation +- The STX cascade-cap (verified that runs of 3+ blank lines collapse to 1) +- The STX input scrub (verified that user-input STX cannot corrupt output) +- A large "complex_markdown" integration test that exercises most features + together +- 10 explicitly complex / edge-case tests (deep nesting, mixed ordered/ + unordered, code with specials, loose lists, blockquote+list, all heading + levels, multi-blank-line collapse-with-code-preservation, link with + nested formatting, inline-code + mentions, sentinel scrub). 
+ +Run with: + +```bash +PYTHONPATH=src python3 -m pytest tests/ -v +``` diff --git a/src/slackify_markdown/slackify.py b/src/slackify_markdown/slackify.py index 489cae9..4ff6d4d 100644 --- a/src/slackify_markdown/slackify.py +++ b/src/slackify_markdown/slackify.py @@ -49,12 +49,24 @@ class SlackifyMarkdown(RendererHTML): "softbreak", ] + _BULLETS_BY_DEPTH = ("•", "◦", "▪") + _INDENT_UNIT = " " + # U+0002 STX (Start of Text) is the "structural newline" sentinel. + # Close-handlers emit this instead of "\n" so render() can cap structural- + # newline runs at 2 (one blank line) without touching real \n inside code + # blocks. Same approach python-markdown uses for its placeholders. + # User input is scrubbed of NEW_LINE in slackify() so collisions are + # impossible. See architecture.md for the full rationale. + NEW_LINE = "\x02" + _NEW_LINE_CAP_RE = re.compile(NEW_LINE + "{3,}") + def __init__(self, markdown_text: str): super().__init__() self.markdown_text = markdown_text self._in_heading = False + self._list_depth = 0 - # this is not correctly done, we need to check in an deopth for children, + # this is not correctly done, we need to check in an depth for children, # the library offers allowed tokens/tags. Move to that instead of this :), todo. def render( self, tokens: List[Token], options: Dict[str, Any], env: Dict[str, Any] @@ -65,6 +77,11 @@ def render( final_tokens.append(token) rendered = super().render(final_tokens, options, env) + # Cap structural-newline runs at 2 (one blank line), then materialize + # to real \n. Code blocks emit real \n directly, so their content is + # not affected by the cap. + rendered = self._NEW_LINE_CAP_RE.sub(self.NEW_LINE * 2, rendered) + rendered = rendered.replace(self.NEW_LINE, "\n") return rendered.rstrip("\n") + "\n" def hardbreak( @@ -86,6 +103,10 @@ def softbreak( return "\n" def slackify(self) -> str: + # Scrub the sentinel char from user input so it can't collide with our + # newline-cap machinery in render(). 
markdown-it-py does not strip + # ASCII control chars, so we have to do it here. + text = self.markdown_text.replace(self.NEW_LINE, "") md = MarkdownIt( "gfm-like", renderer_cls=type(self), @@ -96,7 +117,7 @@ def slackify(self) -> str: }, ).disable("table") - return md.render(self.markdown_text) + return md.render(text) def text( self, @@ -125,7 +146,7 @@ def heading_close( env: Dict[str, Any], ) -> str: self._in_heading = False - return "*\n\n" + return f"*{self.NEW_LINE}{self.NEW_LINE}" def strong_open( self, @@ -257,6 +278,7 @@ def bullet_list_open( options: Dict[str, Any], env: Dict[str, Any], ) -> str: + self._list_depth += 1 return "" def bullet_list_close( @@ -266,7 +288,8 @@ def bullet_list_close( options: Dict[str, Any], env: Dict[str, Any], ) -> str: - return "" + self._list_depth -= 1 + return self.NEW_LINE def list_item_open( self, @@ -275,10 +298,11 @@ def list_item_open( options: Dict[str, Any], env: Dict[str, Any], ) -> str: + indent = self._INDENT_UNIT * max(self._list_depth - 1, 0) if tokens[idx].info: - return f"{tokens[idx].info}. " - else: - return "• " + return f"{indent}{tokens[idx].info}. " + depth_idx = min(max(self._list_depth - 1, 0), len(self._BULLETS_BY_DEPTH) - 1) + return f"{indent}{self._BULLETS_BY_DEPTH[depth_idx]} " def list_item_close( self, @@ -296,7 +320,7 @@ def ordered_list_open( options: Dict[str, Any], env: Dict[str, Any], ) -> str: - + self._list_depth += 1 return "" def ordered_list_close( @@ -306,7 +330,8 @@ def ordered_list_close( options: Dict[str, Any], env: Dict[str, Any], ) -> str: - return "" + self._list_depth -= 1 + return self.NEW_LINE def paragraph_open( self, @@ -327,8 +352,8 @@ def paragraph_close( # Tight-list items have hidden paragraph tokens; they only need a # single newline between items, not a blank-line block separator. 
if tokens[idx].hidden: - return "\n" - return "\n\n" + return self.NEW_LINE + return f"{self.NEW_LINE}{self.NEW_LINE}" def blockquote_open( self, @@ -346,7 +371,7 @@ def blockquote_close( options: Dict[str, Any], env: Dict[str, Any], ) -> str: - return "\n" + return self.NEW_LINE def image( self, diff --git a/tests/test_convert.py b/tests/test_convert.py index caf6c00..95a7ad7 100644 --- a/tests/test_convert.py +++ b/tests/test_convert.py @@ -115,17 +115,21 @@ def greet(name): • _Innovative solutions_ • *Cutting-edge technology* • ~Disruptive strategies~ + *Features* 1. *User-Friendly Interface* -• Intuitive design -• Responsive layouts + ◦ Intuitive design + ◦ Responsive layouts + 2. *Performance* -• High-speed processing -• Low latency + ◦ High-speed processing + ◦ Low latency + 3. *Security* -• Data encryption -• Regular security audits + ◦ Data encryption + ◦ Regular security audits + *Code Example* Here's a simple Python function: @@ -136,7 +140,6 @@ def greet(name): ``` > "Code is like humor. When you have to explain it, it’s bad." – _Cory House_ - *Links and Images* For more information, visit our . @@ -417,3 +420,362 @@ def test_user_mention(): # mrkdown = '[](http://atlassian.com "Atlassian")' # slack = '\n' # assert slackify_markdown(mrkdown) == slack + + +# --------------------------------------------------------------------------- +# Complex / edge-case tests. These exercise the structural-newline cap, code +# block content preservation, deep + mixed list nesting, blockquotes, and the +# STX sentinel scrub. See architecture.md for the rationale behind these. +# --------------------------------------------------------------------------- + + +def test_deeply_nested_bullet_list_four_levels(): + """Beyond the 3 distinct bullet glyphs, deeper levels reuse the deepest + glyph but keep adding indent. 
Trailing block ends with exactly one blank + line (cap fires on the cascading list_closes).""" + markdown = ( + "- L1\n" + " - L2\n" + " - L3\n" + " - L4 falls back to deepest bullet\n" + " - L5 also falls back\n" + "\n" + "After the list." + ) + expected = ( + "• L1\n" + " ◦ L2\n" + " ▪ L3\n" + " ▪ L4 falls back to deepest bullet\n" + " ▪ L5 also falls back\n" + "\n" + "After the list.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_mixed_ordered_unordered_three_levels(): + """Ordered -> unordered -> ordered -> unordered. Each item_open uses the + right glyph; ordered numbering restarts on the inner level.""" + markdown = ( + "1. Outer ordered\n" + " - Unordered child\n" + " 1. Re-ordered grandchild\n" + " - Final bullet\n" + "2. Second outer\n" + "\n" + "After." + ) + expected = ( + "1. Outer ordered\n" + " ◦ Unordered child\n" + " 1. Re-ordered grandchild\n" + " ▪ Final bullet\n" + "\n" + "2. Second outer\n" + "\n" + "After.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_code_block_preserves_specials_and_blank_lines(): + """Asterisks, underscores, tildes, HTML entities, and multi-blank-line + spacing all survive inside a fenced code block. The newline cap MUST NOT + touch code-block content.""" + markdown = ( + "Intro.\n\n" + "```\n" + "x = '*not bold*'\n" + "y = '_not italic_'\n" + "z = '~not strike~'\n" + "html = '
&
'\n" + "\n" + "\n" + "\n" + "blank_lines_preserved = True\n" + "```\n\n" + "Outro." + ) + expected = ( + "Intro.\n\n" + "```\n" + "x = '*not bold*'\n" + "y = '_not italic_'\n" + "z = '~not strike~'\n" + "html = '
&
'\n" + "\n\n\n" + "blank_lines_preserved = True\n" + "```\n" + "Outro.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_loose_list_paragraph_per_item(): + """A list with blank lines between items becomes "loose" — each item's + paragraph_close emits a blank-line separator. The continuation paragraph + inside the first item is rendered as its own block.""" + markdown = ( + "- First item paragraph one.\n\n" + " First item paragraph two.\n\n" + "- Second item.\n\n" + "After list." + ) + expected = ( + "• First item paragraph one.\n\n" + "First item paragraph two.\n\n" + "• Second item.\n\n" + "After list.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_blockquote_with_inner_list_and_trailing_paragraph(): + """Blockquote containing a paragraph, a list, and a trailing paragraph. + The blockquote prefix "> " is emitted only on the first inner paragraph + (current behavior); inner lists and trailing paragraphs flow as plain + blocks. Cap keeps everything to single blank-line separators.""" + markdown = ( + "> Intro quoted.\n>\n" + "> - first bullet inside quote\n" + "> - second bullet\n>\n" + "> Trailing quoted paragraph.\n\n" + "After quote." + ) + expected = ( + "> Intro quoted.\n\n" + "• first bullet inside quote\n" + "• second bullet\n\n" + "Trailing quoted paragraph.\n\n" + "After quote.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_all_heading_levels_followed_by_body(): + """All six heading levels collapse to Slack's single bold style (*x*), + each separated by one blank line, followed by a body paragraph.""" + markdown = ( + "# H1\n" + "## H2\n" + "### H3\n" + "#### H4\n" + "##### H5\n" + "###### H6\n\n" + "Body paragraph." 
+ ) + expected = ( + "*H1*\n\n" + "*H2*\n\n" + "*H3*\n\n" + "*H4*\n\n" + "*H5*\n\n" + "*H6*\n\n" + "Body paragraph.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_multiple_blank_lines_collapse_outside_code_only(): + """Many blank lines between paragraphs collapse to ONE blank line. + Same many-blank-lines INSIDE a code block are preserved verbatim.""" + markdown = ( + "Para1.\n\n\n\n\n" + "Para2.\n\n" + "```\nline1\n\n\n\n\nline2\n```\n\n\n\n" + "Para3." + ) + expected = ( + "Para1.\n\n" + "Para2.\n\n" + "```\nline1\n\n\n\n\nline2\n```\n" + "Para3.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_link_with_nested_formatting_and_special_url(): + """Bold/italic/strike inside the link text are preserved, and URL query + params with & are not double-escaped. Bare autolink gets the same + shape.""" + markdown = ( + "See [**bold _italic ~strike~_** text](https://example.com/path?q=a&b=c) here.\n\n" + "Plain autolink." + ) + expected = ( + "See here.\n\n" + "Plain autolink.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_inline_code_and_slack_mentions_preserved(): + """Slack mention syntax <@U...>, , <#C...|name> must survive + escape_specials unchanged. Inline code with `<`, `>`, `&` stays literal.""" + markdown = ( + "Hi <@U12345>, try `if a < b && c > d: pass` then ping .\n\n" + "Channel: <#C99999|general>." + ) + expected = ( + "Hi <@U12345>, try `if a < b && c > d: pass` then ping .\n\n" + "Channel: <#C99999|general>.\n" + ) + assert slackify_markdown(markdown) == expected + + +def test_user_input_with_sentinel_chars_does_not_corrupt(): + """If user input contains literal STX (the internal newline sentinel), + slackify() scrubs it before parsing so it cannot be misread as a + structural break. ETX is unrelated and passes through to output.""" + stx = chr(0x02) + etx = chr(0x03) + markdown = "Hello" + stx + "world and " + etx + " too." 
+ out = slackify_markdown(markdown) + # STX was stripped from input; "Hello" and "world" sit adjacent. + assert stx not in out + assert "Helloworld" in out + # ETX is not our sentinel — it survives. + assert etx in out + + +def test_full_document_with_all_patterns(): + """Single integration test covering: all six heading levels, bold + + italic + strike + inline code + links + autolinks, ordered list, + nested bullets up to depth 3, mixed bold parent items with sub-lists, + fenced code blocks (plain and language-tagged) sitting between + paragraphs and after lists, a verbatim Markdown table (Slack does not + render tables, so it passes through), a blockquote containing an + inline list and a trailing paragraph, Slack mentions of all flavors, + URLs with query strings and ampersands, inline code containing + HTML-special chars, and a tail with many blank lines collapsing to + one. If anything in the renderer regresses, this test will catch it.""" + markdown = ( + "# Project Documentation\n" + "\n" + "A quick **overview** with _emphasis_, ~~deprecated~~, and `inline_code()`. " + "Visit our [home](https://example.com/?q=a&b=c) or autolink " + ".\n" + "\n" + "## Features\n" + "\n" + "The system supports the following capabilities:\n" + "\n" + "1. **Authentication** — multiple providers\n" + " - [OAuth2](https://example.com/oauth)\n" + " - [SAML](https://example.com/saml)\n" + " - Local password fallback\n" + "2. **Storage**\n" + " - Disk\n" + " - S3\n" + " - Versioning\n" + " - Lifecycle rules\n" + " - GCS\n" + "3. 
**Reliability**\n" + "\n" + "## Quick start\n" + "\n" + "Install the package:\n" + "\n" + "```\n" + "pip install slackify-markdown\n" + "```\n" + "\n" + "Then use it in your code:\n" + "\n" + "```python\n" + "from slackify_markdown import slackify_markdown\n" + "\n" + 'result = slackify_markdown("# Hello\\n\\n- item *one*\\n- item _two_")\n' + "print(result)\n" + "```\n" + "\n" + "The code above will produce mrkdwn ready to post to Slack.\n" + "\n" + "## API surface\n" + "\n" + "| Function | Description |\n" + "|----------|-------------|\n" + "| slackify_markdown(text) | Convert Markdown to mrkdwn |\n" + "\n" + "(Tables are passed through as raw text — Slack does not render Markdown tables.)\n" + "\n" + "## Things to watch out for\n" + "\n" + "> **Important:** Slack mentions like <@U12345>, <#C99999|general>, " + "and must round-trip unchanged.\n" + ">\n" + "> - bullet inside a quoted callout\n" + "> - second bullet\n" + ">\n" + "> Trailing quoted paragraph.\n" + "\n" + "### Example with everything nested\n" + "\n" + "- **Outer item with bold**\n" + " - Sub item with a [link](https://example.com)\n" + " - Sub item with `inline code`\n" + "- **Second outer**\n" + "\n" + "After the deeply mixed list comes a code-heavy paragraph: " + '`if x < 0 && y > 0: raise ValueError("&negative")`.\n' + "\n" + "#### Conclusion\n" + "\n" + "That's it — multiple blank lines below should collapse to one:\n" + "\n" + "\n" + "\n" + "\n" + "End." + ) + expected = ( + "*Project Documentation*\n\n" + "A quick *overview* with _emphasis_, ~deprecated~, and `inline_code()`. " + "Visit our or autolink " + ".\n\n" + "*Features*\n\n" + "The system supports the following capabilities:\n\n" + "1. *Authentication* — multiple providers\n" + " ◦ \n" + " ◦ \n" + " ◦ Local password fallback\n\n" + "2. *Storage*\n" + " ◦ Disk\n" + " ◦ S3\n" + " ▪ Versioning\n" + " ▪ Lifecycle rules\n\n" + " ◦ GCS\n\n" + "3. 
*Reliability*\n\n" + "*Quick start*\n\n" + "Install the package:\n\n" + "```\npip install slackify-markdown\n```\n" + "Then use it in your code:\n\n" + "```\nfrom slackify_markdown import slackify_markdown\n\n" + 'result = slackify_markdown("# Hello\\n\\n- item *one*\\n- item _two_")\n' + "print(result)\n" + "```\n" + "The code above will produce mrkdwn ready to post to Slack.\n\n" + "*API surface*\n\n" + "| Function | Description |\n" + "|----------|-------------|\n" + "| slackify_markdown(text) | Convert Markdown to mrkdwn |\n\n" + "(Tables are passed through as raw text — Slack does not render Markdown tables.)\n\n" + "*Things to watch out for*\n\n" + "> *Important:* Slack mentions like <@U12345>, <#C99999|general>, " + "and must round-trip unchanged.\n\n" + "• bullet inside a quoted callout\n" + "• second bullet\n\n" + "Trailing quoted paragraph.\n\n" + "*Example with everything nested*\n\n" + "• *Outer item with bold*\n" + " ◦ Sub item with a \n" + " ◦ Sub item with `inline code`\n\n" + "• *Second outer*\n\n" + "After the deeply mixed list comes a code-heavy paragraph: " + '`if x < 0 && y > 0: raise ValueError("&negative")`.\n\n' + "*Conclusion*\n\n" + "That's it — multiple blank lines below should collapse to one:\n\n" + "End.\n" + ) + assert slackify_markdown(markdown) == expected