mdz_stream_parser.ts
import {MdzStreamParser} from '@fuzdev/mdz/mdz_stream_parser.js';
Streaming opcode parser for mdz content.
Feed chunks via feed(), retrieve opcodes via take_opcodes(), call finish() at end.
The opcode sequence is not deterministic across chunk boundaries — the same input
fed in different chunk sizes may produce different text/append_text splits and
different optimistic/revert sequences. The final tree (via mdz_opcodes_to_nodes)
matches the one-shot result: optimistic opens are corrected by revert opcodes
at the first failed closer, at run close (paragraph, heading, or list-item
run), or at EOF, and at EOF the delimiter-paired opens
(bold/italic/strikethrough) and inline-code candidates are gated on a closer
scan of the complete final buffer, so held tails parse like the one-shot
parse. The residual divergence classes are:
- Backtick-adjacent chunking: an inline-code candidate held across chunks can make its text-vs-code decision bounded by a wrongly-optimistic italic (opened before its failed closer was visible), where the one-shot parse greedy-rejects that italic and scans unbounded. `` _ __`` chunked at the italic stays flat text where one-shot parses the code span. Italic is the only wedge (the only delimiter whose one-shot form rejects on a failed first closer); see try_code's hold and code_search_limit.
- Optimistic inline code unclosed at EOF: a `` ` `` that opened optimistically consumes its tail as raw code text, so formatting inside it never forms (`` hx **b** z` stays flat where one-shot parses the bold); the parser never re-parses, so the EOF revert can only flatten it to text.
- Link/tag opens at EOF: finish() doesn't open links or tags, so a held tail containing complete [text](url) / <Tag>…</Tag> syntax parses flat.
- Block elements interrupt optimistic inlines: at column 0 a heading/HR/fence/list/blockquote line interrupts the open paragraph even when an optimistic inline container spans it (**a\n# h\nb** parses the heading here; the one-shot parse, knowing the closer exists, swallows the line as bold text). Inherent: the swallow is only correct when a closer eventually arrives, which streaming can't know; interrupting matches the one-shot parse on the no-closer flip side (**a\n# h) and renders blocks promptly.
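One way to look for the divergence classes above is to feed the same input at several chunk sizes and compare each final result with the one-shot result. The sketch below is a hypothetical test helper, not part of the library: `chunk_string`, `find_divergent_chunk_sizes`, and both callbacks are assumed names. In practice the callbacks would wrap the one-shot parser and a feed()/finish()/mdz_opcodes_to_nodes pipeline.

```typescript
// Splits input into consecutive chunks of at most `size` UTF-16 code units.
// Note: this can split a surrogate pair across two chunks.
function chunk_string(input: string, size: number): string[] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const chunks: string[] = [];
  for (let i = 0; i < input.length; i += size) {
    chunks.push(input.slice(i, i + size));
  }
  return chunks;
}

// Returns the chunk sizes whose streamed result differs from the one-shot result.
// Results are compared by their JSON serialization, so TTree must be JSON-serializable.
function find_divergent_chunk_sizes<TTree>(
  input: string,
  sizes: readonly number[],
  parse_one_shot: (input: string) => TTree, // hypothetical: wraps the one-shot parse
  parse_chunks: (chunks: string[]) => TTree, // hypothetical: wraps the streaming parse
): number[] {
  const expected = JSON.stringify(parse_one_shot(input));
  return sizes.filter(
    (size) => JSON.stringify(parse_chunks(chunk_string(input, size))) !== expected,
  );
}
```

Comparing final trees rather than opcode sequences follows the note above that opcode sequences legitimately differ across chunkings.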
Lifecycle: one parser instance per stream — feed() any number of times,
finish() exactly once, then a final take_opcodes(). There is no
reset(); calling feed() or finish() after finish() is undefined.
To restart, construct a new parser (and a new consumer — see
MdzStreamState).
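The lifecycle can be sketched as a small driver. This is a minimal illustration under stated assumptions: `StreamParserLike`, `run_stream`, and `FakeParser` are hypothetical names, and `FakeParser` is a stand-in used only to show the call order. It is not the real MdzStreamParser, and its choice to throw after finish() is its own, since the real behavior is undefined.

```typescript
// Hypothetical structural type: only the three methods the lifecycle names.
interface StreamParserLike<TOpcode> {
  feed(chunk: string): void;
  finish(): void;
  take_opcodes(): TOpcode[];
}

// Drives one parser instance through its whole lifecycle:
// feed() per chunk, finish() exactly once, then a final take_opcodes().
function run_stream<TOpcode>(
  parser: StreamParserLike<TOpcode>,
  chunks: readonly string[],
  on_opcodes: (opcodes: TOpcode[]) => void,
): void {
  for (const chunk of chunks) {
    parser.feed(chunk);
    // Drain after each feed so the consumer sees opcodes promptly.
    const opcodes = parser.take_opcodes();
    if (opcodes.length > 0) on_opcodes(opcodes);
  }
  parser.finish();
  const final_opcodes = parser.take_opcodes();
  if (final_opcodes.length > 0) on_opcodes(final_opcodes);
}

// Stand-in parser for demonstration only; emits string "opcodes".
class FakeParser implements StreamParserLike<string> {
  private queue: string[] = [];
  private finished = false;

  feed(chunk: string): void {
    if (this.finished) throw new Error('feed() after finish()');
    this.queue.push(`text:${chunk}`);
  }

  finish(): void {
    if (this.finished) throw new Error('finish() called twice');
    this.finished = true;
    this.queue.push('end');
  }

  take_opcodes(): string[] {
    const out = this.queue;
    this.queue = [];
    return out;
  }
}
```

To restart, a caller would construct a fresh parser and pass it to `run_stream` again rather than reuse the finished one.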
feed
Feed a chunk of text to the parser.
Opcodes are accumulated and retrieved via take_opcodes().
type (chunk: string): void
chunk: string
returns: void
finish
Signal end of input. Resolves all pending state: closes open blocks, reverts unclosed optimistic opens, trims trailing newlines.
Trailing-newline trimming is handled in one place: trim_trailing_newline()
called at the top of close_paragraph() and close_codeblock_at_eof(),
before either function reverts its inner stack. The trim sees the
just-flushed text node's last_text_id (or a still-accumulated \n) and
emits a trim_text opcode. Revert opcodes only fire after.
Optimistic-container revert is handled by close_paragraph and
close_heading (each reverts everything above its own block frame via
revert_above), so no separate revert pass is needed — optimistic
containers can only exist inside an open Paragraph or Heading (parser
invariant).
type (): void
returns: void
take_opcodes
Drain and return all accumulated opcodes. Destructive — empties the internal queue. The returned array is owned by the caller.
type (): MdzOpcode[]
returns: MdzOpcode[]
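The drain semantics (destructive, caller-owned result) can be illustrated with a swap-and-return queue. This is a sketch of one way to get that behavior. `OpcodeQueue` is a hypothetical name, and whether MdzStreamParser is implemented this way internally is an assumption.

```typescript
// Minimal queue whose take() hands the accumulated array to the caller
// and replaces it with a fresh one, so later pushes never touch the
// array the caller already holds.
class OpcodeQueue<T> {
  private items: T[] = [];

  push(item: T): void {
    this.items.push(item);
  }

  // Destructive drain: returns everything accumulated so far and
  // leaves the queue empty.
  take(): T[] {
    const taken = this.items;
    this.items = [];
    return taken;
  }
}
```

Because the returned array is never reused by the queue, the caller can mutate or retain it freely, and a second drain with no intervening pushes returns an empty array.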