Which function do I want?¶
This is the most important distinction in disarm, and the one newcomers most often get wrong. disarm performs two different mappings that look similar but are opposites, backed by two separate tables.
The common mistake is reaching for transliterate to defend against homoglyph
spoofing. It does the opposite mapping — it will turn a Cyrillic р into
r and leave the spoof readable.
| If you want to… | Use | Mapping | Example |
|---|---|---|---|
| Defend against homoglyph / look-alike spoofing | normalize_confusables, strip_obfuscation |
visual (Unicode TR39) | Cyrillic р → Latin p |
| Romanize text to readable ASCII | transliterate |
phonetic / standards-based (BGN/PCGN, ISO 9, GOST) | Cyrillic р → Latin r; Київ → Kyiv (uk profile) |
| Flag spoofed hostnames / IDNs | is_suspicious_hostname |
analysis (no rewrite) | аpple.com → suspicious |
Visual mapping — for security¶
normalize_confusables and strip_obfuscation fold visually confusable
characters to their prototypes, per Unicode TR39. A Cyrillic р (U+0440) and a
Latin p (U+0070) look identical, so the visual mapping sends the Cyrillic one
to p. This is what reverses a homoglyph substitution, and it is the basis of
disarm's adversarial-text defence.
Phonetic mapping — for readability¶
transliterate is a romanizer: it maps by sound and by transliteration
standard, not by appearance. It sends Cyrillic р to r (its phonetic value),
producing readable ASCII like Київ → Kyiv (with the uk language profile).
This is the right tool for
catalog keys, slugs, and search indexing — but it is not a security control,
because it leaves a look-alike spoof intact.
Rule of thumb¶
If the goal is "is this text trying to fool a human or a matcher?", use the visual functions. If the goal is "make this text readable / indexable in ASCII", use
transliterate. When in doubt, normalize confusables first, then transliterate.
The function names above are shared across every binding; only the spelling and call convention change per language (see your language's Getting started page).