Zero Width Character Detector Examples
The Zero Width Character Detector finds and removes invisible Unicode characters that can cause string comparison failures, bypass security filters, and create subtle bugs. Below are examples of each character type and common scenarios where they appear.
Zero-Width Space (U+200B)
This character is invisible but breaks string equality. The two strings below look identical but are not:
String A: "helloworld"
String B: "helloworld" (contains U+200B between "hello" and "world")
Detector output:
Found 1 zero-width character:
Position 5: U+200B ZERO WIDTH SPACE
Context: "hello[]world"
After removal:
"helloworld" (clean, no hidden characters)
Zero-Width Non-Joiner (U+200C)
Used in some scripts to prevent ligatures, but problematic in plain text data:
Input: "filename" (U+200C inserted after "fi" and after "le")
Detector output:
Found 2 zero-width characters:
Position 2: U+200C ZERO WIDTH NON-JOINER
Position 5: U+200C ZERO WIDTH NON-JOINER
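Output like the above could be produced along the following lines. This is a minimal sketch, not the tool's actual implementation: the names ZERO_WIDTH_NAMES, findZeroWidth, and showContext are hypothetical, and positions are assumed to be zero-based UTF-16 indexes, which matches the sample output because every listed code point fits in a single code unit.

```javascript
// Names for the eight code points this page lists as detectable.
const ZERO_WIDTH_NAMES = new Map([
  ["\u200B", "ZERO WIDTH SPACE"],
  ["\u200C", "ZERO WIDTH NON-JOINER"],
  ["\u200D", "ZERO WIDTH JOINER"],
  ["\u200E", "LEFT-TO-RIGHT MARK"],
  ["\u200F", "RIGHT-TO-LEFT MARK"],
  ["\u2060", "WORD JOINER"],
  ["\uFEFF", "ZERO WIDTH NO-BREAK SPACE"],
  ["\u180E", "MONGOLIAN VOWEL SEPARATOR"],
]);

// Hypothetical helper: one record per hidden character, in order of appearance.
function findZeroWidth(text) {
  const hits = [];
  for (let i = 0; i < text.length; i++) {
    const name = ZERO_WIDTH_NAMES.get(text[i]);
    if (name !== undefined) {
      const hex = text.charCodeAt(i).toString(16).toUpperCase().padStart(4, "0");
      hits.push({ position: i, codePoint: `U+${hex}`, name });
    }
  }
  return hits;
}

// Hypothetical helper: render a "Context" line by showing each hidden character as [].
function showContext(text) {
  return Array.from(text, (ch) => (ZERO_WIDTH_NAMES.has(ch) ? "[]" : ch)).join("");
}

// Escapes are used so the hidden characters stay visible in the source.
const sample = "fi\u200Cle\u200Cname";
for (const hit of findZeroWidth(sample)) {
  console.log(`Position ${hit.position}: ${hit.codePoint} ${hit.name}`);
}
// Position 2: U+200C ZERO WIDTH NON-JOINER
// Position 5: U+200C ZERO WIDTH NON-JOINER
console.log(showContext(sample)); // fi[]le[]name
```

Writing the test strings with \u escapes rather than pasting the raw characters keeps the example itself auditable.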
Zero-Width Joiner (U+200D)
Used in emoji sequences, but can appear in regular text unexpectedly:
Input: "adminuser" (U+200D between "admin" and "user")
Detector output:
Found 1 zero-width character:
Position 5: U+200D ZERO WIDTH JOINER
Context: "admin[]user"
This string would not match "adminuser" in a security check, even though they look the same.
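One way to make such a comparison robust is to strip the hidden characters from both sides before comparing. The sketch below assumes the eight code points listed on this page are the only ones of interest; HIDDEN and sameVisibleText are hypothetical names.

```javascript
// Matches the eight code points listed on this page.
// The range \u200B-\u200F covers ZWSP, ZWNJ, ZWJ, LRM, and RLM.
const HIDDEN = /[\u200B-\u200F\u2060\uFEFF\u180E]/g;

// Hypothetical helper: compare two strings after dropping hidden characters.
function sameVisibleText(a, b) {
  return a.replace(HIDDEN, "") === b.replace(HIDDEN, "");
}

const spoofed = "admin\u200Duser";
console.log(spoofed === "adminuser");               // false
console.log(sameVisibleText(spoofed, "adminuser")); // true
```

Whether to compare leniently like this or to reject the input outright depends on the application; for identifiers, rejecting is usually the safer choice.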
Word Joiner (U+2060)
Prevents line breaks but is invisible in most editors:
Input: "donotbreak" (U+2060 after "do" and after "not")
Detector output:
Found 2 zero-width characters:
Position 2: U+2060 WORD JOINER
Position 6: U+2060 WORD JOINER
Directional Marks (U+200E, U+200F)
Left-to-right and right-to-left marks affect text rendering direction:
Input: "username" (LRM at start, RLM at end)
Detector output:
Found 2 zero-width characters:
Position 0: U+200E LEFT-TO-RIGHT MARK
Position 9: U+200F RIGHT-TO-LEFT MARK
Security Attack Example — Username Spoofing
An attacker registers a username that looks like "admin" but contains hidden characters:
Displayed: admin
Actual code points: a d m i n U+200B
Detector output:
WARNING: Zero-width characters detected in input.
Position 5: U+200B ZERO WIDTH SPACE
This string does NOT equal "admin".
Potential homograph or filter bypass attempt.
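A registration form could apply the same check before accepting a name. The sketch below rejects the input rather than silently cleaning it, so the user sees why the name was refused; validateUsername and its return shape are assumptions for illustration, not part of the tool.

```javascript
// Non-global pattern: exec() reports only the first hidden character.
const ZERO_WIDTH = /[\u200B-\u200F\u2060\uFEFF\u180E]/;

// Hypothetical registration check.
function validateUsername(name) {
  const match = ZERO_WIDTH.exec(name);
  if (match) {
    const hex = match[0].charCodeAt(0).toString(16).toUpperCase();
    return {
      ok: false,
      reason: `hidden character U+${hex} at position ${match.index}`,
    };
  }
  return { ok: true, reason: null };
}

console.log(validateUsername("admin").ok);           // true
console.log(validateUsername("admin\u200B").reason); // hidden character U+200B at position 5
```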
Content Filter Bypass Example
A banned word with zero-width characters inserted to evade keyword filters:
Input: "spam" (U+200B inserted between "sp" and "am")
Because of the hidden character, this input does not match a filter looking for the literal string "spam". The detector reveals the character so the filter can be updated to strip zero-width characters before matching.
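A keyword filter that strips zero-width characters before matching might look like the sketch below. The bannedWords list and the containsBannedWord helper are illustrative assumptions; a real moderation filter would need more than substring matching.

```javascript
const INVISIBLE = /[\u200B-\u200F\u2060\uFEFF\u180E]/g;
const bannedWords = ["spam"]; // illustrative list

// Hypothetical filter: match against the visible text only.
// The original input is left untouched; only the copy used for matching is stripped.
function containsBannedWord(text) {
  const visible = text.replace(INVISIBLE, "").toLowerCase();
  return bannedWords.some((word) => visible.includes(word));
}

const evasive = "Buy sp\u200Bam now";
console.log(evasive.includes("spam"));    // false: the literal match is defeated
console.log(containsBannedWord(evasive)); // true: caught after stripping
```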
Code File Audit
Zero-width characters in source code can cause hard-to-debug issues:
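const apiKey = "abc123"; // declared name contains U+200C between "api" and "Key"
console.log(apiKey); // ReferenceError: apiKey is not defined
Detector output:
Found 1 zero-width character in identifier:
Line 1, Position 9: U+200C ZERO WIDTH NON-JOINER
The variable name "apiKey" contains a hidden character.
This will cause a ReferenceError when referenced as "apiKey".
JavaScript allows U+200C and U+200D inside identifiers, so the declaration parses without error; a U+200B in the same place raises a SyntaxError instead.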
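Auditing a whole file means reporting a line number as well as a position. The sketch below is one way to do that, assuming 1-based line numbers and 0-based positions within the line; auditSource is a hypothetical name. The sample uses U+200C, which JavaScript accepts inside an identifier.

```javascript
// Hypothetical audit: report every hidden character with its line and column.
function auditSource(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((line, lineIndex) => {
    // A fresh global pattern per line; matchAll requires the g flag.
    const pattern = /[\u200B-\u200F\u2060\uFEFF\u180E]/g;
    for (const match of line.matchAll(pattern)) {
      const hex = match[0].charCodeAt(0).toString(16).toUpperCase();
      findings.push({
        line: lineIndex + 1,   // 1-based line number
        position: match.index, // 0-based offset within the line
        codePoint: `U+${hex}`,
      });
    }
  });
  return findings;
}

const code = 'const api\u200CKey = "abc123";\nconsole.log(apiKey);';
for (const f of auditSource(code)) {
  console.log(`Line ${f.line}, Position ${f.position}: ${f.codePoint}`);
}
// Line 1, Position 9: U+200C
```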
Cleaning User Input
Before storing or comparing user-submitted text, strip all zero-width characters:
Raw input: "John Doe" (contains U+200B)
Cleaned: "John Doe" (safe for storage and comparison)
Removal options available in the tool:
- Remove all zero-width characters (recommended for most cases)
- Remove specific types only (e.g., keep ZWJ for emoji support)
- Replace with visible markers [U+200B] for manual review
- Show character count and positions before removing
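The options above could be modeled as a single cleaning function. This is a sketch under assumptions: cleanText, the keep list, and the markers flag are hypothetical names, not the tool's actual interface.

```javascript
// Hypothetical cleaner.
//   keep:    characters to leave in place (e.g. ["\u200D"] to preserve emoji sequences)
//   markers: replace with a visible [U+XXXX] tag instead of deleting
function cleanText(text, { keep = [], markers = false } = {}) {
  let count = 0;
  const cleaned = text.replace(/[\u200B-\u200F\u2060\uFEFF\u180E]/g, (ch) => {
    if (keep.includes(ch)) return ch;
    count++;
    if (!markers) return "";
    const hex = ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, "0");
    return `[U+${hex}]`;
  });
  // count is the number of characters removed or marked.
  return { cleaned, count };
}

console.log(cleanText("John\u200B Doe"));                            // cleaned: "John Doe", count: 1
console.log(cleanText("John\u200B Doe", { markers: true }).cleaned); // John[U+200B] Doe
console.log(cleanText("a\u200Db\u200Bc", { keep: ["\u200D"] }).count); // 1
```

Keeping U+200D is worth considering whenever the text may contain emoji, since removing it splits a joined sequence into its component emoji.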
All Detectable Zero-Width Characters
- U+200B — Zero Width Space
- U+200C — Zero Width Non-Joiner
- U+200D — Zero Width Joiner
- U+200E — Left-to-Right Mark
- U+200F — Right-to-Left Mark
- U+2060 — Word Joiner
- U+FEFF — Zero Width No-Break Space (BOM)
- U+180E — Mongolian Vowel Separator
Common Use Cases
- Sanitizing user input before storing in a database
- Auditing content for hidden characters before publishing
- Debugging string comparison failures that look like they should match
- Security auditing of usernames and identifiers
- Content moderation — detecting filter bypass attempts
- Cleaning text copied from PDFs or web pages
- Auditing source code files for hidden characters in identifiers
Paste any text into the detector to reveal the invisible characters listed above, along with their Unicode code points and exact positions. The cleaned output contains none of the characters on that list.