mdz_stream_parser.ts

Streaming opcode parser for mdz.

Fed chunks of text (e.g., from LLM output), emits opcodes as rendering instructions. Makes optimistic assumptions about ambiguous syntax and emits revert opcodes to correct when wrong. Never re-parses.

The design was independently arrived at but shares goals with @pngwn.at's Penguin-Flavoured Markdown (PFM) (https://bsky.app/profile/pngwn.at/post/3mi527zntb22n): restrict the syntax so streaming is tractable, render optimistically and correct when wrong, emit serializable opcodes to avoid re-parsing, and keep the opcodes target-agnostic so any renderer can consume them. mdz diverges in one respect: the Svelte consumer (MdzStreamState) does build a reactive tree from opcodes, because the platform dictates this, but mutations are fine-grained via $state, not diffed.

The parser is split across sibling modules: this file holds the public MdzStreamParser class and the process_loop / process_inline orchestrators. Per-category handlers (block / inline / link / url / text) live in mdz_stream_parser_*.ts as free functions taking the shared MdzStreamParserState as first argument.

Usage:

const parser = new MdzStreamParser();
parser.feed('hello **bold');
const ops1 = parser.take_opcodes(); // open Paragraph, text "hello ", open Bold, text "bold"
parser.feed('** world');
const ops2 = parser.take_opcodes(); // close Bold, text " world"
parser.finish();
const ops3 = parser.take_opcodes(); // close Paragraph
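The feed / take_opcodes / finish cycle above generalizes to any sequence of chunks. The following is a minimal sketch under stated assumptions: StreamParserLike and pump_chunks are hypothetical names introduced here for illustration (they are not part of the module), and the Opcode type parameter is a placeholder rather than the library's MdzOpcode.

```typescript
// Hypothetical structural interface covering only the three methods
// documented on this page. `Opcode` is a placeholder type parameter.
interface StreamParserLike<Opcode> {
	feed(chunk: string): void;
	finish(): void;
	take_opcodes(): Opcode[];
}

// Hypothetical helper: feeds each chunk, drains after every feed,
// then finishes and drains once more so closing opcodes are not lost.
function pump_chunks<Opcode>(
	parser: StreamParserLike<Opcode>,
	chunks: Iterable<string>,
	apply: (ops: Opcode[]) => void,
): void {
	for (const chunk of chunks) {
		parser.feed(chunk);
		const ops = parser.take_opcodes();
		if (ops.length > 0) apply(ops);
	}
	parser.finish();
	const tail = parser.take_opcodes();
	if (tail.length > 0) apply(tail);
}
```

Assuming MdzStreamParser satisfies this shape, as the method signatures documented here suggest, it could be passed as the first argument. Draining after every feed keeps the renderer in step with the incoming text; the final drain after finish() picks up whatever closing or revert opcodes were still pending.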

Declarations

MdzStreamParser

mdz_stream_parser.ts

import {MdzStreamParser} from '@fuzdev/fuz_ui/mdz_stream_parser.js';

Streaming opcode parser for mdz content. Feed chunks via feed(), retrieve opcodes via take_opcodes(), call finish() at end.

The opcode sequence is not deterministic across chunk boundaries: the same input fed in different chunk sizes may produce different text/append_text splits and different optimistic/revert sequences. The final tree (via mdz_opcodes_to_nodes) matches the one-shot result for every input except one case: italic (_..._) whose opening and closing delimiters straddle a chunk boundary. Italic is non-optimistic (it requires a confirmed closer in the buffer), so when the closer arrives only in a later chunk, the opening _ has already been emitted as text and italic cannot be applied retroactively. See try_italic in mdz_stream_parser_inline.ts.
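One way to probe this property is to render the same input at every chunk size and compare each result with the one-shot result. The sketch below is illustrative only: split_into_chunks and find_chunk_size_mismatches are hypothetical helpers, and render is a caller-supplied stand-in for "feed these chunks, call finish(), build the final tree, and serialize it to a string".

```typescript
// Split `text` into consecutive chunks of at most `size` UTF-16 code units.
// Note: this can split a surrogate pair across two chunks.
function split_into_chunks(text: string, size: number): string[] {
	if (!Number.isInteger(size) || size < 1) {
		throw new RangeError('size must be a positive integer');
	}
	const chunks: string[] = [];
	for (let i = 0; i < text.length; i += size) {
		chunks.push(text.slice(i, i + size));
	}
	return chunks;
}

// Returns every chunk size whose rendered result differs from the
// one-shot result. `render` is an assumption supplied by the caller.
function find_chunk_size_mismatches(
	text: string,
	render: (chunks: string[]) => string,
): number[] {
	const expected = render([text]);
	const mismatches: number[] = [];
	for (let size = 1; size < text.length; size++) {
		if (render(split_into_chunks(text, size)) !== expected) {
			mismatches.push(size);
		}
	}
	return mismatches;
}
```

Per the description above, a check of this kind should compare final trees rather than raw opcode arrays, since the opcode sequence itself legitimately varies with chunking. Inputs containing _..._ italic would be expected to report mismatches at chunk sizes that separate the two delimiters.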

feed

Feed a chunk of text to the parser. Opcodes are accumulated and retrieved via take_opcodes().

type (chunk: string): void

chunk: string
returns void

finish

Signal end of input. Resolves all pending state: closes open blocks, reverts unclosed optimistic opens, trims trailing newlines.

Trailing-newline trimming is handled in one place: trim_trailing_newline() is called at the top of close_paragraph() and close_codeblock_at_eof(), before either function reverts its inner stack. The trim sees the just-flushed text node's last_text_id (or a still-accumulated \n) and emits a trim_text opcode. Revert opcodes fire only after the trim.

Optimistic-container revert is handled by close_paragraph and close_heading (each pops its own inner stack), so no separate revert_all_optimistic is needed — optimistic containers can only exist inside an open Paragraph or Heading (parser invariant).

type (): void

returns void

take_opcodes

Drain and return all accumulated opcodes. Destructive — empties the internal queue. The returned array is owned by the caller.

type (): MdzOpcode[]

returns MdzOpcode[]
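The destructive drain and caller ownership described for take_opcodes can be pictured as a swap of the internal array. This is a toy sketch of that pattern, not the library's internal implementation; DrainQueue and its methods are hypothetical names.

```typescript
// Toy queue illustrating drain-by-swap.
class DrainQueue<T> {
	private items: T[] = [];

	push(item: T): void {
		this.items.push(item);
	}

	// Hands the current array to the caller and starts a fresh one,
	// so later pushes never mutate what the caller already received.
	take(): T[] {
		const taken = this.items;
		this.items = [];
		return taken;
	}
}
```

With a swap like this, a second take() right after the first returns an empty array, and the caller is free to mutate or retain the returned array without affecting the queue.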

Depends on