{{count}}, {{date, datetime}}), the $t() nesting refs, and any markup tags.
@astilba/core provides the masking and validation logic to make that reliable.
The problem masking solves
Left unprotected, an MT engine will happily “translate” the parts it shouldn’t:
- {{count}} items might come back as {{cuenta}} elementos: the variable is renamed, so interpolation silently breaks.
- A formatter keyword like one/other inside a token, or a $t() ref name, can be translated to uno/otros, which then resolves to nothing.
- A markup tag can be dropped or rewritten.
Mask, translate, unmask
maskTokens replaces every non-text token (interpolation, nesting, markup) with an opaque
sentinel drawn from the Unicode private-use area (U+E000…U+E001). The formatter keyword
and the $t() ref name live inside the masked span, so the engine never even sees them
to translate.
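The round trip can be illustrated with a minimal, self-contained sketch. It does not call @astilba/core: the token shape, the function names (maskSketch, unmaskSketch), and the indexed sentinel layout (U+E000, an index, U+E001) are assumptions made for illustration, and the real maskTokens may differ in all three.

```typescript
// Hypothetical token shape for this sketch; the real ValueToken type may differ.
type SketchToken =
  | { kind: "text"; value: string }
  | { kind: "placeholder"; raw: string }; // interpolation, nesting, or markup

const OPEN = "\uE000";
const CLOSE = "\uE001";

// Replace each non-text token with an indexed sentinel: OPEN + index + CLOSE.
function maskSketch(tokens: SketchToken[]): { masked: string; originals: string[] } {
  const originals: string[] = [];
  let masked = "";
  for (const token of tokens) {
    if (token.kind === "text") {
      masked += token.value;
    } else {
      masked += OPEN + String(originals.length) + CLOSE;
      originals.push(token.raw);
    }
  }
  return { masked, originals };
}

// Restore each sentinel to its original placeholder text, wherever it ended up.
// An unknown index is left untouched rather than replaced with "undefined".
function unmaskSketch(translated: string, originals: string[]): string {
  return translated.replace(
    /\uE000(\d+)\uE001/g,
    (match: string, index: string) => originals[Number(index)] ?? match,
  );
}

const maskedExample = maskSketch([
  { kind: "placeholder", raw: "{{count}}" },
  { kind: "text", value: " items" },
]);
// Stand-in for the MT call: only the visible text changes.
const translatedExample = maskedExample.masked.replace(" items", " elementos");
const restoredExample = unmaskSketch(translatedExample, maskedExample.originals);
```

Because the variable name sits inside the masked span, the stand-in engine has nothing to rename, and the restored string carries the original {{count}}.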
One guard on masking
maskTokens throws (MASK_VALIDATION) if the literal text already contains a reserved
sentinel delimiter (U+E000 / U+E001). This is rare but legal in real values — private-use
glyphs from icon fonts like Material Icons or Nerd Fonts — and masking it would be
ambiguous. Strip or escape those characters before masking.
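A pre-flight check before masking might look like the sketch below. The helper names (containsReserved, stripReserved) are hypothetical and not part of @astilba/core; stripping is only one option, and it discards the glyph, so escaping may be preferable where the character matters.

```typescript
// Detect the two reserved delimiters (U+E000 and U+E001) in literal text.
function containsReserved(text: string): boolean {
  // A non-global regex keeps .test() free of lastIndex state between calls.
  return /[\uE000\uE001]/.test(text);
}

// Remove the reserved delimiters so the text can be masked unambiguously.
function stripReserved(text: string): string {
  return text.replace(/[\uE000\uE001]/g, "");
}

const iconLabel = "\uE000 Home";
const safeLabel = containsReserved(iconLabel) ? stripReserved(iconLabel) : iconLabel;
```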
After translation: two complementary checks
Once the translation comes back, astilba offers two checks, each suited to a different point in the pipeline.
validateSentinels: operate on the still-masked string
If you still have the masked string the engine returned (before unmasking),
validateSentinels checks that every sentinel was returned exactly once, unmodified, and
that the engine invented none. Reordering is allowed (target languages reorder freely); pass
requireOrder: true to also assert original order.
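The rules above (each sentinel returned exactly once, none invented, order optional) can be sketched as follows. This is an illustration rather than the library's implementation: checkSentinelsSketch, the report shape, and the sentinel layout are assumptions, and only the requireOrder option name comes from the text. The sketch also assumes each sentinel appears once in the source.

```typescript
interface SentinelReport {
  ok: boolean;
  missing: string[];
  duplicated: string[];
  invented: string[];
  outOfOrder: boolean;
}

// Collect every U+E000 ... U+E001 span in order of appearance.
function findSentinels(text: string): string[] {
  return text.match(/\uE000[^\uE000\uE001]*\uE001/g) ?? [];
}

function checkSentinelsSketch(
  sourceMasked: string,
  translatedMasked: string,
  options: { requireOrder?: boolean } = {},
): SentinelReport {
  const expected = findSentinels(sourceMasked);
  const returned = findSentinels(translatedMasked);
  const timesReturned = (s: string) => returned.filter((r) => r === s).length;

  // A sentinel the engine mangled no longer matches, so it shows up as missing.
  const missing = expected.filter((s) => timesReturned(s) === 0);
  const duplicated = expected.filter((s) => timesReturned(s) > 1);
  const invented = returned.filter((s) => expected.indexOf(s) === -1);
  const outOfOrder =
    options.requireOrder === true && returned.join("") !== expected.join("");
  const ok =
    missing.length === 0 && duplicated.length === 0 && invented.length === 0 && !outOfOrder;
  return { ok, missing, duplicated, invented, outOfOrder };
}

const sentinel = (i: number) => "\uE000" + String(i) + "\uE001";
const sourceMasked = sentinel(0) + " of " + sentinel(1);
const swapped = sentinel(1) + " de " + sentinel(0);

const reordered = checkSentinelsSketch(sourceMasked, swapped);
const strictOrder = checkSentinelsSketch(sourceMasked, swapped, { requireOrder: true });
const dropped = checkSentinelsSketch(sourceMasked, "de " + sentinel(0));
```

In this sketch the swapped translation passes by default and fails only when requireOrder is set, while the translation that lost a sentinel fails either way.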
validatePlaceholderTokens: operate on restored tokens
validatePlaceholderTokens is the fail-closed placeholder validator. It compares a
source value’s tokens against its translation’s tokens and fails if any placeholder was
added, dropped, or modified. Placeholder identity is taken directly from the canonical fields (variable + format for interpolation, ref + options for nesting, raw for markup), so a value and its own translation carry byte-identical placeholders, and no syntax-specific normalisation is needed.
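A strict comparison keyed on those canonical fields could be sketched like this. The PlaceholderSketch type and the function names are hypothetical stand-ins for the library's token model, and treating the comparison as order-insensitive is an assumption of the sketch, not a documented behaviour.

```typescript
// Hypothetical placeholder shapes; field names follow the prose above.
type PlaceholderSketch =
  | { kind: "interpolation"; variable: string; format?: string }
  | { kind: "nesting"; ref: string; options?: string }
  | { kind: "markup"; raw: string };

// Build an identity key from the canonical fields only.
function identityKey(p: PlaceholderSketch): string {
  switch (p.kind) {
    case "interpolation":
      return JSON.stringify(["interpolation", p.variable, p.format ?? null]);
    case "nesting":
      return JSON.stringify(["nesting", p.ref, p.options ?? null]);
    case "markup":
      return JSON.stringify(["markup", p.raw]);
  }
}

// Fail closed: any added, dropped, or modified placeholder makes this false.
function samePlaceholdersSketch(
  source: PlaceholderSketch[],
  translated: PlaceholderSketch[],
): boolean {
  const a = source.map(identityKey).sort();
  const b = translated.map(identityKey).sort();
  return a.length === b.length && a.every((key, i) => key === b[i]);
}

const sourceTokens: PlaceholderSketch[] = [
  { kind: "interpolation", variable: "count" },
  { kind: "markup", raw: "<b>" },
];
const renamedTokens: PlaceholderSketch[] = [
  { kind: "interpolation", variable: "cuenta" },
  { kind: "markup", raw: "<b>" },
];

const unchangedOk = samePlaceholdersSketch(sourceTokens, sourceTokens);
const renamedOk = samePlaceholdersSketch(sourceTokens, renamedTokens);
```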
The string-entry form
The one place a raw string must be re-tokenized is a translation returned from MT: it was never in the model, so it has no token view. validatePlaceholders(source, translated, tokenize) takes the adapter’s tokenizer by injection, tokenizes both sides, and defers to
validatePlaceholderTokens. The i18next adapter pre-binds its tokenizer so you get an
ergonomic two-argument validatePlaceholders(source, translated) — see
the adapter reference.
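The injection and pre-binding pattern can be sketched as below. Everything here is illustrative: the tokenizer is a toy that recognises only {{...}} spans and returns plain strings, not the i18next adapter's tokenizer, and the names validateWithTokenizer and validateBound are hypothetical.

```typescript
// Hypothetical tokenizer type: returns the placeholders found in a raw string.
type SketchTokenizer = (value: string) => string[];

// Three-argument form: the tokenizer is injected by the caller.
function validateWithTokenizer(
  source: string,
  translated: string,
  tokenize: SketchTokenizer,
): boolean {
  const a = tokenize(source).slice().sort();
  const b = tokenize(translated).slice().sort();
  return a.length === b.length && a.every((p, i) => p === b[i]);
}

// Toy tokenizer for illustration: matches only {{...}} spans.
const tokenizeCurly: SketchTokenizer = (value) => value.match(/\{\{[^}]*\}\}/g) ?? [];

// Two-argument form: an adapter can pre-bind its own tokenizer.
const validateBound = (source: string, translated: string): boolean =>
  validateWithTokenizer(source, translated, tokenizeCurly);

const keptOk = validateBound("{{count}} items", "{{count}} elementos");
const renamedFails = validateBound("{{count}} items", "{{cuenta}} elementos");
```

Keeping the tokenizer as a parameter means the core check stays independent of any one placeholder syntax, while each adapter exposes the shorter call.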
An adapter wanting looser placeholder matching (for example, normalising whitespace) can
pre-normalise its tokens before calling
validatePlaceholderTokens. By default the check
is strict, because a value and its translation carry byte-identical placeholders.
Related
- The canonical model: the ValueToken kinds masking operates on.
- @astilba/core API: the masking function signatures.