streaming

Use the Mdz component (see usage) when you have complete content upfront. For content that arrives incrementally (e.g. from an LLM), use MdzStreamParser with MdzStreamState and MdzStream. The parser emits opcodes as rendering instructions and never re-parses; the state applies them as fine-grained Svelte mutations.

The streaming design is derived from @pngwn's ideas in this Bluesky thread (pngwn.at), which originated the approach mdz implements: restrict the dialect so streaming is tractable, render optimistically and correct when wrong, minimize work by never re-parsing, and emit serializable target-agnostic opcodes instead of a tree.

Demo

The interactive demo on the docs page feeds the parser one character at a time to show how constructs build incrementally. The basic wiring looks like this:


import {MdzStreamParser} from '@fuzdev/mdz/mdz_stream_parser.js';
import {MdzStreamState} from '@fuzdev/mdz/mdz_stream_state.svelte.js';

const parser = new MdzStreamParser();
const stream = new MdzStreamState();

// feed chunks as they arrive
parser.feed(chunk);
stream.apply_batch(parser.take_opcodes());

// when done
parser.finish();
stream.apply_batch(parser.take_opcodes());

<MdzStream {stream} />

Three rendering paths

mdz ships three ways to turn .mdz text into a rendered tree. Pick based on whether the content is complete up front or streaming, and on how much control you need over the surrounding markup.

Path 1: Mdz component (default)

For inline use in a Svelte template, with content known up front:

<Mdz content="**bold** text" />

Internally this calls mdz_parse and renders via MdzNodeView. It is best for documentation pages, alerts, and tooltips: anything where you have the full string before render. With svelte_preprocess_mdz (see /docs/svelte_preprocess_mdz), static strings also compile away at build time. To skip the parse, pass a pre-parsed tree instead of a string, <Mdz nodes={tree} />, for content parsed ahead of render (e.g. a docs site pre-parsing its markdown at build time).

Path 2: mdz_parse + MdzNodeView (one-shot, manual)

For control over wrapper markup, custom whitespace handling, or non-default CSS:

import {mdz_parse} from '@fuzdev/mdz/mdz.js';
import MdzNodeView from '@fuzdev/mdz/MdzNodeView.svelte';

const nodes = mdz_parse(content);

<div class="custom white-space:pre">
	{#each nodes as node}
		<MdzNodeView {node} />
	{/each}
</div>

Same input, same tree as path 1, but you own the surrounding container. mdz_parse is the canonical reference parser — fixture tests pin its output as the source of truth.

Path 3: MdzStreamParser + MdzStreamState + MdzStream

For content that arrives in chunks:

import {MdzStreamParser} from '@fuzdev/mdz/mdz_stream_parser.js';
import {MdzStreamState} from '@fuzdev/mdz/mdz_stream_state.svelte.js';

const parser = new MdzStreamParser();
const stream = new MdzStreamState();

// feed chunks as they arrive
parser.feed(chunk);
stream.apply_batch(parser.take_opcodes());

// when done
parser.finish();
stream.apply_batch(parser.take_opcodes());

<MdzStream {stream} />

MdzStreamParser emits opcodes — small, serializable rendering instructions — as bytes arrive. MdzStreamState applies them to a reactive Svelte 5 tree. MdzStream walks that tree and produces DOM. Each layer is replaceable; opcodes are target-agnostic.
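The feed, take, and apply steps above can be wrapped in a small helper. The sketch below is only an illustration: ParserLike, StateLike, and pump_chunks are hypothetical names, typed structurally around the four methods shown above (feed, finish, take_opcodes, apply_batch), and it assumes take_opcodes drains the pending opcodes. The fake parser and state exist only so the example runs without the library.

```typescript
// Structural stand-ins for the methods used above. These interfaces are
// assumptions for illustration, not types exported by mdz.
interface ParserLike<TOpcode> {
	feed(chunk: string): void;
	finish(): void;
	take_opcodes(): TOpcode[]; // assumed to drain the pending opcodes
}

interface StateLike<TOpcode> {
	apply_batch(opcodes: TOpcode[]): void;
}

// Feed every chunk, applying each batch as soon as it is available,
// then flush whatever the parser resolves at end of input.
function pump_chunks<TOpcode>(
	chunks: Iterable<string>,
	parser: ParserLike<TOpcode>,
	state: StateLike<TOpcode>,
): void {
	for (const chunk of chunks) {
		parser.feed(chunk);
		state.apply_batch(parser.take_opcodes());
	}
	parser.finish();
	state.apply_batch(parser.take_opcodes());
}

// Fakes so the sketch runs standalone: one string "opcode" per call.
let pending: string[] = [];
const fake_parser: ParserLike<string> = {
	feed: (chunk) => {
		pending.push('fed:' + chunk);
	},
	finish: () => {
		pending.push('finished');
	},
	take_opcodes: () => {
		const taken = pending;
		pending = [];
		return taken;
	},
};
const applied_batches: string[][] = [];
const fake_state: StateLike<string> = {
	apply_batch: (opcodes) => {
		applied_batches.push(opcodes);
	},
};

pump_chunks(['**bo', 'ld**'], fake_parser, fake_state);
```

For an asynchronous source such as an LLM response stream, the same loop can be written with for await over an AsyncIterable<string>.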

Picking a path

The split is by input regime. The sync parser (paths 1–2) owns random-access input — content you already have as a complete string. The streaming parser (path 3) owns append-only input — content that arrives over time. They implement one grammar; parity tests bind them, with the sync parser as the normative reference.

Use path 1 when the content is fixed at write time or arrives from a synchronous source. The svelte_preprocess_mdz preprocessor can collapse the call to a static render.

Use path 2 when you need custom wrapping markup but still parse all-at-once.

Use path 3 when chunks arrive over time. For the same final input, the output tree is identical to that of paths 1 and 2, outside the documented adversarial cases (see Determinism and chunk boundaries).

Opcode design

A streaming parser can't backtrack — it must emit something coherent for every byte it consumes. mdz handles this with optimistic opens and explicit reverts.

When the parser sees ** it doesn't know yet whether it'll form bold or end up as literal **. It emits open Bold immediately. If a closing ** arrives, it emits close Bold and the speculation succeeded. If a paragraph break or EOF interrupts first, it emits revert Bold — the consumer drops the wrapper, re-parents the children to the grandparent, and prepends the literal ** delimiter as text.
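The revert behavior described above can be sketched with a toy consumer. Everything here is hypothetical: ToyNode, ToyOpcode, and the field names are invented for illustration, and whether the real revert opcode carries the literal delimiter is an assumption. The real definitions live in mdz_opcodes.ts.

```typescript
// Hypothetical shapes for illustration; the real opcode and node types
// may differ (for example, revert carrying `delimiter` is an assumption).
interface ToyNode {
	id: number;
	type: string;
	content: string; // only meaningful for Text nodes
	children: ToyNode[];
	parent: ToyNode | null;
}

type ToyOpcode =
	| {op: 'open'; id: number; type: string}
	| {op: 'close'; id: number}
	| {op: 'text'; id: number; content: string}
	| {op: 'revert'; id: number; delimiter: string};

function apply_opcodes(opcodes: ToyOpcode[]): ToyNode {
	const root: ToyNode = {id: 0, type: 'Root', content: '', children: [], parent: null};
	const by_id: Record<number, ToyNode | undefined> = {0: root};
	let current = root; // innermost open container
	for (const opcode of opcodes) {
		if (opcode.op === 'open' || opcode.op === 'text') {
			const node: ToyNode = {
				id: opcode.id,
				type: opcode.op === 'open' ? opcode.type : 'Text',
				content: opcode.op === 'text' ? opcode.content : '',
				children: [],
				parent: current,
			};
			current.children.push(node);
			by_id[node.id] = node;
			if (opcode.op === 'open') current = node;
		} else if (opcode.op === 'close') {
			current = by_id[opcode.id]?.parent ?? root;
		} else {
			const wrapper = by_id[opcode.id];
			const parent = wrapper?.parent;
			if (!wrapper || !parent) continue;
			// Drop the wrapper, lift its children into the parent, and
			// put the literal delimiter back in front of them as text.
			const literal: ToyNode = {
				id: -wrapper.id, // toy id scheme, not mdz's
				type: 'Text',
				content: opcode.delimiter,
				children: [],
				parent,
			};
			for (const child of wrapper.children) child.parent = parent;
			parent.children.splice(parent.children.indexOf(wrapper), 1, literal, ...wrapper.children);
			by_id[wrapper.id] = undefined;
			if (current === wrapper) current = parent;
		}
	}
	return root;
}

// Compact debug rendering: containers as Type[...], text as quoted strings.
function describe_tree(node: ToyNode): string {
	if (node.type === 'Text') return JSON.stringify(node.content);
	return node.type + '[' + node.children.map((child) => describe_tree(child)).join(', ') + ']';
}

const closed = apply_opcodes([
	{op: 'open', id: 1, type: 'Paragraph'},
	{op: 'open', id: 2, type: 'Bold'},
	{op: 'text', id: 3, content: 'hi'},
	{op: 'close', id: 2},
	{op: 'close', id: 1},
]);

const reverted = apply_opcodes([
	{op: 'open', id: 1, type: 'Paragraph'},
	{op: 'open', id: 2, type: 'Bold'},
	{op: 'text', id: 3, content: 'hi'},
	{op: 'revert', id: 2, delimiter: '**'},
	{op: 'close', id: 1},
]);
```

In this toy, the closed sequence keeps the Bold wrapper, while the reverted sequence leaves the paragraph holding a literal ** text node followed by the original text.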

The opcode types are:

open — open a container (Paragraph, Bold, Italic, Link, Heading, List, ListItem, Codeblock, etc.)
close — close the previously opened container, with deferred metadata resolved at close time (heading id, link reference). May carry discard: true for whitespace-only paragraphs the consumer should drop.
text — create a leaf Text or Code node
append_text — extend the last text node (avoids one node per character during plain runs)
trim_text — drop trailing characters from a text node (used for trailing-newline trim at block close)
void — create a self-contained leaf (Hr)
revert — undo an optimistic inline open (block structure is never speculative)
wrap — retroactively wrap an existing text node in a Link (auto-links only — may also split trailing punctuation; see below)

The full type definition is in mdz_opcodes.ts.
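As a rough picture of how the text-related opcodes interact, here is a minimal sketch of a consumer for text, append_text, and trim_text only. The field names (id, content, count) are assumptions made for this example and are not taken from mdz_opcodes.ts. In particular, the real append_text targets the last text node and may not carry an id.

```typescript
// Hypothetical opcode shapes; field names are assumptions for illustration.
type TextOpcode =
	| {op: 'text'; id: number; content: string}
	| {op: 'append_text'; id: number; content: string}
	| {op: 'trim_text'; id: number; count: number};

// Applies text opcodes to a table of text-node contents keyed by node id.
function apply_text_opcodes(opcodes: TextOpcode[]): Record<number, string> {
	const texts: Record<number, string> = {};
	for (const opcode of opcodes) {
		if (opcode.op === 'text') {
			texts[opcode.id] = opcode.content; // create the leaf
			continue;
		}
		const existing: string | undefined = texts[opcode.id];
		if (existing === undefined) continue; // unknown id: ignored in this sketch
		if (opcode.op === 'append_text') {
			texts[opcode.id] = existing + opcode.content; // extend in place
		} else {
			// trim_text: drop `count` trailing characters
			texts[opcode.id] = existing.slice(0, Math.max(0, existing.length - opcode.count));
		}
	}
	return texts;
}

// A plain run streamed one character at a time, then a trailing-newline trim.
const text_nodes = apply_text_opcodes([
	{op: 'text', id: 1, content: 'h'},
	{op: 'append_text', id: 1, content: 'i'},
	{op: 'append_text', id: 1, content: '\n'},
	{op: 'trim_text', id: 1, count: 1},
]);
```

The run ends up as a single node whose content has the trailing newline removed, which is the point of append_text: character-by-character streaming does not multiply nodes.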

Why `wrap` exists

Auto-detected URLs (https://..., /path, ./relative) are the one case where neither optimistic-open nor hold-until-terminator gives a good streaming feel.

If the parser opens a Link optimistically on every leading h, every word starting with h flashes blue before reverting. If it instead holds all bytes until a terminator, a 40-character URL creates a 40-character pause in the rendered output — readers see a stutter.

The wrap opcode resolves both problems. The URL streams as ordinary visible text. When the terminator finally arrives, wrap retroactively re-parents that text node inside a Link. The text content never changes — only its parent changes. Readers see prose flowing naturally, then a single moment where the URL upgrades from prose to clickable link. No flash, no pause.

wrap also handles trailing punctuation trim. For https://fuz.dev., the . is not part of the URL. wrap carries trim_end and trim_id fields that split the text node — the URL portion goes inside the Link, the trailing punctuation becomes a sibling Text node after it.
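The re-parenting and trailing-punctuation split can be sketched as follows. This is a toy model under stated assumptions: the node shapes, the target_id and link_id fields, and the reading of trim_end as a count of trailing characters are all hypothetical. Only the names trim_end and trim_id come from the text above.

```typescript
// Hypothetical inline node shapes for illustration.
interface ToyText {
	kind: 'Text';
	id: number;
	content: string;
}
interface ToyLink {
	kind: 'Link';
	id: number;
	reference: string;
	children: ToyText[];
}
type ToyInline = ToyText | ToyLink;

// Hypothetical wrap opcode. trim_end is assumed to count trailing
// characters that are not part of the URL.
interface WrapOpcode {
	op: 'wrap';
	target_id: number; // existing Text node to wrap
	link_id: number; // id for the new Link
	trim_end: number;
	trim_id: number; // id for the split-off trailing Text
}

function apply_wrap(siblings: ToyInline[], opcode: WrapOpcode): ToyInline[] {
	const result: ToyInline[] = [];
	for (const node of siblings) {
		if (node.kind !== 'Text' || node.id !== opcode.target_id) {
			result.push(node);
			continue;
		}
		const split_at = Math.max(0, node.content.length - opcode.trim_end);
		const url = node.content.slice(0, split_at);
		const trailing = node.content.slice(split_at);
		// The text node keeps its id; only its parent (and trimmed tail) change.
		result.push({kind: 'Link', id: opcode.link_id, reference: url, children: [{...node, content: url}]});
		if (trailing !== '') result.push({kind: 'Text', id: opcode.trim_id, content: trailing});
	}
	return result;
}

const before_wrap: ToyInline[] = [
	{kind: 'Text', id: 1, content: 'see '},
	{kind: 'Text', id: 2, content: 'https://fuz.dev.'},
];
const after_wrap = apply_wrap(before_wrap, {op: 'wrap', target_id: 2, link_id: 3, trim_end: 1, trim_id: 4});
```

After the wrap, the URL text sits inside a Link with its original id, and the trailing period is a separate sibling Text node.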

Determinism and chunk boundaries

The opcode sequence is not deterministic across different chunk sizes. The same input fed as one chunk versus many produces different intermediate text/append_text splits and different optimistic/revert sequences along the way.

The final rendered tree is deterministic, apart from the residual divergence classes listed at the end of this section. Bold, italic, and strikethrough open optimistically when no closer is visible yet, and four mechanisms keep the chunked result identical to the one-shot parse:

Greedy rejection carries over. One-shot parsing rejects an italic opener whose first closer candidate fails its word boundary. When that candidate only arrives in a later chunk, the streaming parser reverts the already-open container (revert_failed_close) — so the _user_id field never stays italic, no matter how it's chunked. (Bold and strikethrough have no boundary checks to fail — their doubled delimiters pair anywhere.)
A one-character hold at the delimiter. A * or ~ that is the last buffered character waits for the next character to learn whether it doubles into a delimiter; a potential italic closer at the buffer end waits because its word boundary depends on what follows.
Failed closers can re-open. After a failed-closer revert, the same delimiter is re-tried as a fresh opener, matching where the one-shot parse continues.
EOF gating. At finish() the buffer is complete, so open decisions stop speculating: bold, italic, and strikethrough open only when their confirmed closer is already in the final buffer, and an inline-code candidate without one degrades to literal text. Content held back during streaming (e.g. behind an undecided backtick) still parses like the one-shot parse when the stream ends.
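The one-character hold can be illustrated with a toy scanner that only distinguishes * from **. This is not mdz's parser: ToyDelimiterScanner and scan_in_chunks are hypothetical, and the sketch covers just the idea that a trailing * waits for the next character, so the chunk boundary cannot change the outcome.

```typescript
// Toy scanner: emits '**', '*', or single characters as tokens.
// A '*' that ends the buffered input is held until the next character
// (or finish) decides whether it doubles into a delimiter.
class ToyDelimiterScanner {
	private held_star = false;
	private tokens: string[] = [];

	feed(chunk: string): void {
		for (let i = 0; i < chunk.length; i++) {
			const char = chunk.charAt(i);
			if (this.held_star) {
				this.held_star = false;
				if (char === '*') {
					this.tokens.push('**');
					continue;
				}
				this.tokens.push('*'); // the held star was a single delimiter
			}
			if (char === '*') this.held_star = true;
			else this.tokens.push(char);
		}
	}

	// At end of input nothing more can arrive, so a held star resolves as single.
	finish(): void {
		if (this.held_star) {
			this.held_star = false;
			this.tokens.push('*');
		}
	}

	// Drains the tokens decided so far.
	take_tokens(): string[] {
		const taken = this.tokens;
		this.tokens = [];
		return taken;
	}
}

function scan_in_chunks(input: string, chunk_size: number): string[] {
	const scanner = new ToyDelimiterScanner();
	const tokens: string[] = [];
	for (let i = 0; i < input.length; i += chunk_size) {
		scanner.feed(input.slice(i, i + chunk_size));
		tokens.push(...scanner.take_tokens());
	}
	scanner.finish();
	tokens.push(...scanner.take_tokens());
	return tokens;
}

const hold_input = 'a**b*c*';
const one_shot_tokens = scan_in_chunks(hold_input, hold_input.length);

// Every chunk size yields the same final token list as the one-shot scan.
const chunk_sizes_agree = [1, 2, 3, 4, 5, 6].every(
	(size) => scan_in_chunks(hold_input, size).join('|') === one_shot_tokens.join('|'),
);
```

Looping over chunk sizes and comparing against the one-shot result is the same shape of check the parity tests are described as making, here on a much smaller grammar.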

Four residual divergence classes remain:

Italic-bounded code spans. An inline-code candidate held across chunks can decide text-vs-code bounded by a wrongly-optimistic italic — one that opened before its failed closer was visible, where the one-shot parse greedy-rejects it and scans unbounded. This needs an _-bearing code span chunked so the italic opens before the span's closing backtick arrives; italic is the only wedge, since it's the only delimiter whose one-shot form rejects on a failed first closer.
Unclosed optimistic code spans. An inline-code candidate that opened optimistically and never closes consumes its tail as raw code text, so formatting inside it never forms once EOF flattens it back to text — the parser never re-parses.
EOF-flat links and tags. finish() doesn't open links or tags, so link/tag syntax held into EOF parses flat.
Block interrupts across optimistic inlines. A column-0 block line (heading, HR, fence, list marker, quote prefix) interrupts the paragraph even when an optimistic inline spans it — the one-shot parse, knowing the closer exists, swallows the line as inline text instead; streaming can't know, and interrupting matches the one-shot parse whenever no closer ever arrives.

Chunked-equals-one-shot is asserted across chunk sizes in src/test/mdz_parser_parity.test.ts, including failed-closer and delimiter-run inputs.

Append-only invariant

Once emitted, opcodes are never mutated or removed. This means:

A consumer can persist the opcode stream and replay it later.
A network protocol can carry opcodes from parser to renderer.
The renderer never re-parses.

The revert and trim_text opcodes look retroactive but aren't — they're new opcodes the consumer interprets as "drop these nodes" or "shorten this string". The stream itself only grows.

Stated precisely, the invariant is no implicit retroactivity. Corrections to already-emitted output are allowed — but they must be bounded, local, and expressed in the stream itself (revert, wrap, trim_text), so the set of things that can ever visually change is enumerable and testable. Re-parsing is the unbounded, implicit form of correction — "anything in this region may now differ, go figure out what" — which pushes diffing onto every consumer and excludes write-once targets. That is what mdz bans, and it's why the dialect is restricted: every construct must be decidable within a bounded hold or correctable with a local opcode.

This is the core of pngwn's opcode insight: because no opcode is ever mutated or removed, the stream itself is the incremental interface — no tree to produce, no diffing to minimize work, and any target (a Svelte tree, HTML, native views) can consume it.
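A minimal sketch of what the append-only property buys: a log that only grows can be serialized and replayed later to reach the same result as the live consumer. The ToyOpcode shape and the OpcodeLog class are hypothetical and far simpler than mdz's opcodes. They stand in for any serializable instruction stream.

```typescript
// Hypothetical two-opcode stream: append text, or trim trailing characters.
// The trim is a correction expressed as a new opcode, not an edit to an old one.
type ToyOpcode = {op: 'append'; content: string} | {op: 'trim'; count: number};

function apply_opcode(rendered: string, opcode: ToyOpcode): string {
	if (opcode.op === 'append') return rendered + opcode.content;
	return rendered.slice(0, Math.max(0, rendered.length - opcode.count));
}

// Append-only log: entries are added, never mutated or removed.
class OpcodeLog {
	private readonly entries: ToyOpcode[] = [];

	append(batch: ToyOpcode[]): void {
		for (const opcode of batch) this.entries.push(opcode);
	}

	size(): number {
		return this.entries.length;
	}

	serialize(): string {
		return JSON.stringify(this.entries);
	}
}

// Rebuilds the rendered output from a persisted log, with no parser involved.
function replay(serialized: string): string {
	const opcodes = JSON.parse(serialized) as ToyOpcode[];
	let rendered = '';
	for (const opcode of opcodes) rendered = apply_opcode(rendered, opcode);
	return rendered;
}

// Live consumption in batches, logging as we go.
const log = new OpcodeLog();
let live_output = '';
const live_batches: ToyOpcode[][] = [
	[{op: 'append', content: 'hello'}],
	[{op: 'append', content: ' world\n'}, {op: 'trim', count: 1}],
];
for (const batch of live_batches) {
	log.append(batch);
	for (const opcode of batch) live_output = apply_opcode(live_output, opcode);
}

const replayed_output = replay(log.serialize());
```

Because the trim is itself an entry in the log, replay needs no knowledge of how the stream was batched or what the parser was thinking at the time.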

Consumers

mdz ships two opcode consumers:

mdz_opcodes_to_nodes(opcodes) — replays an opcode array into the same MdzNode[] tree that mdz_parse produces. Used by tests to assert parity. Useful when you want the static tree shape but already have opcodes (e.g. cached from an earlier stream).
MdzStreamState — applies opcodes to a reactive Svelte 5 tree of MdzStreamNode instances. Each node's content, children, and metadata fields are $state, so Svelte updates only what changed. MdzStream renders the tree.

The two consumers' outputs are structurally equivalent (parity tests assert this). MdzStreamState is built for fine-grained reactivity and keeps per-id node identity for granular updates, so it skips a couple of tidies (adjacent-Text merging, single-tag paragraph unwrap) that mdz_opcodes_to_nodes applies at tree-build time.
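To make the adjacent-Text merging tidy concrete, here is a small sketch of such a pass over a toy tree. It is an assumption about what the tidy does in principle, not a copy of mdz_opcodes_to_nodes, and ToyNode is a hypothetical shape rather than MdzNode.

```typescript
// Hypothetical node shape for illustration (not MdzNode).
interface ToyNode {
	type: string;
	content?: string;
	children?: ToyNode[];
}

// Returns a new list where runs of sibling Text nodes are merged into one,
// recursing into containers. The input is left untouched.
function merge_adjacent_text(nodes: ToyNode[]): ToyNode[] {
	const merged: ToyNode[] = [];
	for (const node of nodes) {
		const previous = merged[merged.length - 1];
		if (node.type === 'Text' && previous !== undefined && previous.type === 'Text') {
			merged[merged.length - 1] = {
				type: 'Text',
				content: (previous.content ?? '') + (node.content ?? ''),
			};
		} else if (node.children) {
			merged.push({...node, children: merge_adjacent_text(node.children)});
		} else {
			merged.push({...node});
		}
	}
	return merged;
}

// A reverted bold leaves a literal '**' Text next to the original Text.
const untidy: ToyNode[] = [
	{type: 'Text', content: '**'},
	{type: 'Text', content: 'hi '},
	{type: 'Italic', children: [{type: 'Text', content: 'a'}, {type: 'Text', content: 'b'}]},
];
const tidy = merge_adjacent_text(untidy);
```

A reactive consumer that wants stable per-id nodes has a reason to skip this pass, since merging would discard the identity of the nodes it folds together.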

Limitations

Residual divergences under adversarial input, as documented above — italic-bounded and unclosed code spans under adversarial chunking, link/tag syntax held into EOF, and column-0 block lines interrupting paragraphs spanned by an optimistic inline.
The opcode sequence varies with chunking; outside the residual divergences above, the final tree does not.
No partial-revert — once a container closes, it's committed. Mid-render edits aren't supported.
Single-pass — backtracking would defeat the streaming guarantee. Ambiguous syntax (e.g. unclosed [) renders as visible text via revert rather than re-parsing.
A table's header holds one line — a pipe row (| a | b |) renders only once its delimiter row (| - | - |) arrives, the bounded lookahead that tells a table from a paragraph of literal pipes; body rows then stream one per line. This is intentional, not a stall: there's no useful partial render of a header before it's known to be a table.
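The table-header hold in the last item can be pictured with a toy line gate. This is a hypothetical sketch, not mdz's table logic: the regular expressions are simplified guesses at what counts as a pipe row and a delimiter row, the event strings are invented, and each non-table line is treated as its own paragraph for brevity.

```typescript
// Simplified, assumed row shapes: '| a | b |' and '| - | - |'.
const is_pipe_row = (line: string): boolean => /^\|.*\|\s*$/.test(line);
const is_delimiter_row = (line: string): boolean => /^\|(\s*-+\s*\|)+\s*$/.test(line);

// Holds a candidate header for exactly one line, then commits either way.
class ToyTableGate {
	private held_header: string | null = null;
	private in_table = false;

	feed_line(line: string): string[] {
		const events: string[] = [];
		if (this.held_header !== null) {
			const header = this.held_header;
			this.held_header = null;
			if (is_delimiter_row(line)) {
				// The lookahead confirmed a table: release the header.
				this.in_table = true;
				events.push('table_header:' + header);
				return events;
			}
			// Not a table after all: the held row was literal pipes.
			events.push('paragraph:' + header);
		}
		if (this.in_table) {
			if (is_pipe_row(line)) {
				events.push('table_row:' + line); // body rows stream one per line
				return events;
			}
			this.in_table = false;
		}
		if (is_pipe_row(line)) this.held_header = line;
		else events.push('paragraph:' + line);
		return events;
	}

	// At end of input a still-held row can no longer become a table.
	finish(): string[] {
		const header = this.held_header;
		this.held_header = null;
		return header === null ? [] : ['paragraph:' + header];
	}
}

const table_gate = new ToyTableGate();
const after_header = table_gate.feed_line('| a | b |');
const after_delimiter = table_gate.feed_line('| - | - |');
const after_body = table_gate.feed_line('| 1 | 2 |');

const prose_gate = new ToyTableGate();
const prose_first = prose_gate.feed_line('| a | b |');
const prose_second = prose_gate.feed_line('just literal pipes above');
```

The hold is bounded to a single line in both outcomes, which is what keeps it a lookahead rather than a stall.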