mdz_helpers.ts

Shared constants and pure helper functions for mdz parsers.

Used by both the single-pass parser (mdz.ts) and the two-phase lexer+parser (mdz_lexer.ts + mdz_token_parser.ts).

Declarations

48 declarations

A_LOWER

A_UPPER

AMPERSAND

APOSTROPHE

ASTERISK

AT

BACKTICK

COLON

COMMA

DOLLAR

EQUALS

EXCLAMATION

extract_single_tag

(nodes: MdzNode[]): MdzElementNode | MdzComponentNode | null

nodes

type MdzNode[]

returns

MdzElementNode | MdzComponentNode | null

HASH

HR_HYPHEN_COUNT

HTTP_PREFIX_LENGTH

HTTPS_PREFIX_LENGTH

HYPHEN

is_at_absolute_path

(text: string, index: number): boolean

Check if position in text is the start of an absolute path (starts with /). Must be preceded by whitespace or be at the start of the string. Rejects // (comments/protocol-relative) and / (bare slash).

text

type string

index

type number

returns

boolean
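The rules above can be sketched as follows. This is a minimal illustration, not the module's actual source: the name `is_at_absolute_path_sketch`, the `is_whitespace` helper, the set of whitespace characters (space, tab, newline), and the reading of "bare slash" as a slash followed by whitespace or end of input are all assumptions.

```typescript
// Assumed ASCII values for constants this module exports by name.
const SLASH = 47;
const SPACE = 32;
const TAB = 9;
const NEWLINE = 10;

// Hypothetical helper: which characters count as whitespace is an assumption.
const is_whitespace = (char_code: number): boolean =>
	char_code === SPACE || char_code === TAB || char_code === NEWLINE;

const is_at_absolute_path_sketch = (text: string, index: number): boolean => {
	// Must start with a slash.
	if (text.charCodeAt(index) !== SLASH) return false;
	// Must be at the start of the string or preceded by whitespace.
	if (index > 0 && !is_whitespace(text.charCodeAt(index - 1))) return false;
	// Reject a bare slash at the end of the input.
	if (index + 1 >= text.length) return false;
	const next = text.charCodeAt(index + 1);
	// Reject `//` (comments, protocol-relative URLs) and a slash followed by whitespace.
	if (next === SLASH || is_whitespace(next)) return false;
	return true;
};

console.log(is_at_absolute_path_sketch('see /docs/api', 4)); // true
console.log(is_at_absolute_path_sketch('a/b', 1)); // false
console.log(is_at_absolute_path_sketch('//cdn.example.com', 0)); // false
```

Checking the preceding character keeps slashes inside ordinary text such as `and/or` from being treated as path starts.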

is_at_relative_path

(text: string, index: number): boolean

Check if position in text is the start of a relative path (./ or ../). Must be preceded by whitespace or be at the start of the string. Requires at least one path character after the prefix.

text

type string

index

type number

returns

boolean
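A rough sketch of the check described above, under stated assumptions: the function name `is_at_relative_path_sketch` and the `is_whitespace` helper are hypothetical, and "path character" is simplified here to "any non-whitespace character", which may be looser than what the module actually accepts.

```typescript
// Assumed ASCII values for constants this module exports by name.
const PERIOD = 46;
const SLASH = 47;
const SPACE = 32;
const TAB = 9;
const NEWLINE = 10;

// Hypothetical helper: which characters count as whitespace is an assumption.
const is_whitespace = (char_code: number): boolean =>
	char_code === SPACE || char_code === TAB || char_code === NEWLINE;

const is_at_relative_path_sketch = (text: string, index: number): boolean => {
	// Must be at the start of the string or preceded by whitespace.
	if (index > 0 && !is_whitespace(text.charCodeAt(index - 1))) return false;
	// Match the prefix: one or two periods followed by a slash.
	let cursor = index;
	if (text.charCodeAt(cursor) !== PERIOD) return false;
	cursor++;
	if (text.charCodeAt(cursor) === PERIOD) cursor++;
	if (text.charCodeAt(cursor) !== SLASH) return false;
	cursor++;
	// Require at least one character after the prefix (simplified to non-whitespace).
	if (cursor >= text.length) return false;
	return !is_whitespace(text.charCodeAt(cursor));
};

console.log(is_at_relative_path_sketch('open ./src/index.ts', 5)); // true
console.log(is_at_relative_path_sketch('../lib', 0)); // true
console.log(is_at_relative_path_sketch('./', 0)); // false
```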

is_letter

(char_code: number): boolean

Check if character code is a letter (A-Z, a-z).

char_code

type number

returns

boolean
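A check like this is typically two range comparisons on character codes. The sketch below assumes the exported `A_UPPER`, `Z_UPPER`, `A_LOWER`, and `Z_LOWER` constants hold the standard ASCII codes; the name `is_letter_sketch` is hypothetical and the module's real implementation may differ.

```typescript
// Assumed ASCII values for constants this module exports by name.
const A_UPPER = 65;
const Z_UPPER = 90;
const A_LOWER = 97;
const Z_LOWER = 122;

const is_letter_sketch = (char_code: number): boolean =>
	(char_code >= A_UPPER && char_code <= Z_UPPER) ||
	(char_code >= A_LOWER && char_code <= Z_LOWER);

console.log(is_letter_sketch('g'.charCodeAt(0))); // true
console.log(is_letter_sketch('7'.charCodeAt(0))); // false
```

Working on numeric codes from `charCodeAt` avoids allocating one-character strings in a parser's inner loop.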

is_tag_name_char

(char_code: number): boolean

Check if character code is valid for tag name (letter, number, hyphen, underscore).

char_code

type number

returns

boolean
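As an illustration, the four accepted classes (letter, number, hyphen, underscore) can be tested directly on the character code. The constant values are assumed to be standard ASCII and `is_tag_name_char_sketch` is a hypothetical name, not the module's source.

```typescript
// Assumed ASCII values for constants this module exports by name.
const A_UPPER = 65;
const Z_UPPER = 90;
const A_LOWER = 97;
const Z_LOWER = 122;
const ZERO = 48;
const NINE = 57;
const HYPHEN = 45;
const UNDERSCORE = 95;

const is_tag_name_char_sketch = (char_code: number): boolean =>
	(char_code >= A_UPPER && char_code <= Z_UPPER) ||
	(char_code >= A_LOWER && char_code <= Z_LOWER) ||
	(char_code >= ZERO && char_code <= NINE) ||
	char_code === HYPHEN ||
	char_code === UNDERSCORE;

console.log(is_tag_name_char_sketch('-'.charCodeAt(0))); // true
console.log(is_tag_name_char_sketch('>'.charCodeAt(0))); // false
```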

is_valid_path_char

(char_code: number): boolean

Check if character code is valid in URI path per RFC 3986. Validates against the pchar production plus path/query/fragment separators.

Valid characters:

- unreserved: A-Z a-z 0-9 - . _ ~
- sub-delims: ! $ & ' ( ) * + , ; =
- path allowed: : @
- separators: / ? #
- percent-encoding: %

char_code

type number

returns

boolean
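The character classes listed above amount to a whitelist. A minimal sketch of that idea follows; the name `is_valid_path_char_sketch`, the `ALLOWED_PUNCTUATION` string, and the lookup strategy are assumptions made for illustration, and the real function may use a different mechanism.

```typescript
// All non-alphanumeric characters from the list above:
// unreserved punctuation, sub-delims, `:` `@`, separators, and `%`.
const ALLOWED_PUNCTUATION = "-._~!$&'()*+,;=:@/?#%";

const is_valid_path_char_sketch = (char_code: number): boolean => {
	// Only ASCII can match; this also guards against NaN from out-of-range reads.
	if (!(char_code >= 0 && char_code < 128)) return false;
	// Alphanumerics: A-Z, a-z, 0-9.
	if (
		(char_code >= 65 && char_code <= 90) ||
		(char_code >= 97 && char_code <= 122) ||
		(char_code >= 48 && char_code <= 57)
	) {
		return true;
	}
	return ALLOWED_PUNCTUATION.indexOf(String.fromCharCode(char_code)) !== -1;
};

console.log(is_valid_path_char_sketch('~'.charCodeAt(0))); // true
console.log(is_valid_path_char_sketch('['.charCodeAt(0))); // false
console.log(is_valid_path_char_sketch(' '.charCodeAt(0))); // false
```

A whitelist means scanning stops at the first character outside the set, such as whitespace, brackets, or braces.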

is_word_char

(char_code: number): boolean

Check if character is part of a word for word boundary detection. Used to prevent intraword emphasis with _ and ~ delimiters.

Formatting delimiters (*, _, ~) are NOT word characters - they're transparent. Only alphanumeric characters (A-Z, a-z, 0-9) are considered word characters.

This prevents false positives with snake_case identifiers while allowing adjacent formatting like **bold**_italic_.

char_code

type number

returns

boolean
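To make the word-boundary idea concrete, here is a small sketch. `is_word_char_sketch` follows the description above (alphanumerics only); `is_intraword_sketch` is a hypothetical helper, not part of the module, showing one way a parser might use it to decide whether a `_` or `~` sits inside a word.

```typescript
// Only A-Z, a-z, and 0-9 count as word characters (standard ASCII ranges).
const is_word_char_sketch = (char_code: number): boolean =>
	(char_code >= 65 && char_code <= 90) ||
	(char_code >= 97 && char_code <= 122) ||
	(char_code >= 48 && char_code <= 57);

// Hypothetical helper: a delimiter is intraword when both neighbors are word characters.
const is_intraword_sketch = (text: string, index: number): boolean =>
	index > 0 &&
	index < text.length - 1 &&
	is_word_char_sketch(text.charCodeAt(index - 1)) &&
	is_word_char_sketch(text.charCodeAt(index + 1));

// The underscore in snake_case has letters on both sides, so it is not a delimiter.
console.log(is_intraword_sketch('snake_case', 5)); // true
// The underscore after ** is preceded by a formatting delimiter, so it can open emphasis.
console.log(is_intraword_sketch('**bold**_italic_', 8)); // false
```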

LEFT_ANGLE

LEFT_BRACKET

LEFT_PAREN

MAX_HEADING_LEVEL

MIN_CODEBLOCK_BACKTICKS

NEWLINE

NINE

PERCENT

PERIOD

PLUS

QUESTION

RIGHT_ANGLE

RIGHT_BRACKET

RIGHT_PAREN

SEMICOLON

SLASH

SPACE

TAB

TILDE

trim_trailing_punctuation

(url: string): string

Trim trailing punctuation from URL/path per RFC 3986 and GFM rules.

- Trims simple trailing: .,;:!?]
- Balanced logic for () only (valid in path components)
- Invalid chars like [] {} are already stopped by whitelist, but ] trimmed as fallback

Optimized to avoid O(n²) string slicing - tracks end index and slices once at the end.

url

type string

returns

string
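The approach described above (track an end index, slice once) might look roughly like the sketch below. It is an assumption-laden illustration, not the module's source: the name `trim_trailing_punctuation_sketch` is hypothetical, and the balancing rule used here (drop a trailing `)` only while closing parentheses outnumber opening ones) follows the GFM autolink convention.

```typescript
// Assumed ASCII values for constants this module exports by name.
const LEFT_PAREN = 40;
const RIGHT_PAREN = 41;

// Simple trailing punctuation from the list above.
const SIMPLE_TRAILING = '.,;:!?]';

const trim_trailing_punctuation_sketch = (url: string): string => {
	// Count parentheses once up front so the loop below stays linear.
	let open_count = 0;
	let close_count = 0;
	for (let i = 0; i < url.length; i++) {
		const char_code = url.charCodeAt(i);
		if (char_code === LEFT_PAREN) open_count++;
		else if (char_code === RIGHT_PAREN) close_count++;
	}

	// Move the end index backwards instead of slicing on every step.
	let end = url.length;
	while (end > 0) {
		const char_code = url.charCodeAt(end - 1);
		if (SIMPLE_TRAILING.indexOf(url.charAt(end - 1)) !== -1) {
			end--;
		} else if (char_code === RIGHT_PAREN && close_count > open_count) {
			// Unbalanced closing paren: treat it as surrounding prose.
			close_count--;
			end--;
		} else {
			break;
		}
	}

	// Slice once at the end, and only if something was trimmed.
	return end === url.length ? url : url.slice(0, end);
};

console.log(trim_trailing_punctuation_sketch('https://example.com/a).'));
// https://example.com/a
console.log(trim_trailing_punctuation_sketch('https://example.com/wiki/Foo_(bar)'));
// https://example.com/wiki/Foo_(bar)
```

Slicing inside the loop would copy the string on every trimmed character; adjusting an index keeps the work proportional to the input length.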

UNDERSCORE

Z_LOWER

Z_UPPER

ZERO
