ToolPilot

Unicode Inspector

Inspect every Unicode character in your text: codepoint, name, category, UTF-8 bytes. Detect invisible characters and compare NFC/NFD normalizations.

Unicode Inspector: full analysis of every character

Why use this Unicode inspector?

Unicode includes over 149,000 characters across 161 scripts. Some are invisible (zero-width space, BOM, directional marks) and can cause display bugs, security vulnerabilities, or unexpected behavior in your applications. This inspector reveals every character with its codepoint (U+XXXX), official name, category, and UTF-8 hex bytes.

All processing happens in your browser. Your text is never sent to a remote server. You can safely analyze sensitive data (passwords, tokens, confidential content). The tool automatically detects invisible characters and flags them visually in red.

The inspector also compares the NFC (composed) and NFD (decomposed) normalization forms of your text. Two strings that look identical can still compare as unequal when one is stored in NFC and the other in NFD, so this comparison helps verify string compatibility across systems (databases, APIs, macOS vs. Linux file systems).

Who uses this Unicode inspector?

Developers
Debug encoding issues by identifying the exact codepoints of each character. Spot stray BOMs (U+FEFF) at the start of files, zero-width spaces (U+200B) in copy-pasted strings, or directional marks (LRM/RLM) that break bidirectional rendering.
Security researchers
Detect homoglyph attacks by comparing codepoints of visually identical characters. The Latin "a" (U+0061) and the Cyrillic "а" (U+0430) share the same category (Ll) but are clearly distinguished by their codepoints and Unicode names.
Linguists
Study combining diacritical marks (U+0300 through U+036F) and compare NFC/NFD forms of accented text. Check whether an "é" is a precomposed character (U+00E9) or a base letter "e" followed by a combining acute accent (U+0065 + U+0301).
QA testers
Validate Unicode string handling in your applications. Verify that your system correctly processes surrogate pairs, invisible characters, and different normalization forms. The UTF-16 unit count and the codepoint count differ whenever the text contains characters beyond the Basic Multilingual Plane, which makes length-handling bugs easy to spot.
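
The invisible-character check that developers and QA testers rely on can be sketched in a few lines of JavaScript. This is only an illustration, not the inspector's actual code: the function name findInvisible and the list of flagged codepoints are assumptions based on the characters named on this page.

```javascript
// Hypothetical list of invisible codepoints, limited to the ones this page mentions.
const INVISIBLE = new Map([
  [0x200b, "ZERO WIDTH SPACE"],
  [0x200c, "ZERO WIDTH NON-JOINER"],
  [0x200d, "ZERO WIDTH JOINER"],
  [0x200e, "LEFT-TO-RIGHT MARK"],
  [0x200f, "RIGHT-TO-LEFT MARK"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE (BOM)"],
]);

// Returns one entry per invisible character found in the text.
function findInvisible(text) {
  const hits = [];
  let index = 0; // position counted in codepoints, not UTF-16 units
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (INVISIBLE.has(cp)) {
      hits.push({
        index,
        codepoint: "U+" + cp.toString(16).toUpperCase().padStart(4, "0"),
        name: INVISIBLE.get(cp),
      });
    }
    index += 1;
  }
  return hits;
}

// A stray BOM at the start and a zero-width space before the "c".
console.log(findInvisible("\uFEFFab\u200Bc"));
```

For the sample string, the sketch reports U+FEFF at position 0 and U+200B at position 3.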

How does the Unicode inspector work?

Paste or type your text in the input area. The inspector analyzes each character individually using JavaScript's Unicode iteration (which correctly handles surrogate pairs for emojis and characters beyond the Basic Multilingual Plane).
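
As a rough illustration of why codepoint-aware iteration matters, the sketch below compares a string's UTF-16 length with its codepoint count. The function name countUnits is hypothetical; the only things taken as given are standard JavaScript behaviors: String.prototype.length counts UTF-16 units, while the string iterator used by Array.from yields whole codepoints.

```javascript
// Counts a string three ways: UTF-16 units, codepoints, and unique codepoints.
function countUnits(text) {
  const codepoints = Array.from(text); // the string iterator keeps surrogate pairs together
  return {
    utf16Units: text.length, // .length counts UTF-16 code units
    codepoints: codepoints.length,
    unique: new Set(codepoints).size,
  };
}

// "a", U+1F600 (an emoji outside the BMP), "a"
console.log(countUnits("a\u{1F600}a"));
// → { utf16Units: 4, codepoints: 3, unique: 2 }
```

The emoji occupies two UTF-16 units but is a single codepoint, which is exactly the discrepancy the two counters expose.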

For each character, the tool displays: the character itself (or a red "INVISIBLE" badge), its U+XXXX codepoint, its Unicode name, its category (Lu, Ll, Nd, Po, Cf, etc.), and its UTF-8 bytes in hexadecimal. Statistics show the total codepoint count, UTF-16 unit count, unique codepoints, and invisible character count.
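
The per-character fields listed above could be computed along these lines. This is a sketch under stated assumptions, not the tool's source: inspect and generalCategory are hypothetical names, the category lookup relies on Unicode property escapes in regular expressions, and the bytes come from the standard TextEncoder. Unicode names are left out because JavaScript has no built-in API for them; a tool would need to ship its own name table.

```javascript
// The 30 Unicode general categories, by their two-letter aliases.
const CATEGORIES = [
  "Lu", "Ll", "Lt", "Lm", "Lo", "Mn", "Mc", "Me", "Nd", "Nl", "No",
  "Pc", "Pd", "Ps", "Pe", "Pi", "Pf", "Po", "Sm", "Sc", "Sk", "So",
  "Zs", "Zl", "Zp", "Cc", "Cf", "Cs", "Co", "Cn",
];
// One anchored regex per category, e.g. /^\p{Lu}$/u.
const CATEGORY_TESTS = CATEGORIES.map((cat) => [cat, new RegExp(`^\\p{${cat}}$`, "u")]);

function generalCategory(ch) {
  for (const [cat, re] of CATEGORY_TESTS) {
    if (re.test(ch)) return cat;
  }
  return "Cn"; // fallback: treat anything unmatched as unassigned
}

const encoder = new TextEncoder(); // always encodes to UTF-8

function inspect(text) {
  return Array.from(text, (ch) => ({
    char: ch,
    codepoint: "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"),
    category: generalCategory(ch),
    // Note: a lone surrogate is encoded as U+FFFD (EF BF BD) by TextEncoder.
    utf8: Array.from(encoder.encode(ch), (b) => b.toString(16).toUpperCase().padStart(2, "0")).join(" "),
  }));
}

console.log(inspect("A\u00E9\u20AC"));
```

For "A", "é", and "€", the sketch yields U+0041 / Lu / 41, U+00E9 / Ll / C3 A9, and U+20AC / Sc / E2 82 AC.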

The NFC/NFD section compares the two normalization forms of your text. If codepoints differ between NFC and NFD, the tool flags it explicitly. All processing is local: no network requests, no stored data.
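
A comparison of this kind can be approximated with the standard String.prototype.normalize method. The helper names toCodepoints and compareNormalization are hypothetical, and the sketch makes no claim about how the inspector itself is written.

```javascript
// Formats every codepoint of a string as U+XXXX.
function toCodepoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

// Normalizes the text both ways and reports whether the codepoint sequences differ.
function compareNormalization(text) {
  const nfc = toCodepoints(text.normalize("NFC"));
  const nfd = toCodepoints(text.normalize("NFD"));
  return { nfc, nfd, differ: nfc.join(" ") !== nfd.join(" ") };
}

console.log(compareNormalization("\u00E9"));
// → { nfc: [ 'U+00E9' ], nfd: [ 'U+0065', 'U+0301' ], differ: true }
```

Plain ASCII text such as "abc" is identical in both forms, so differ is false for it.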

Frequently Asked Questions

How do I detect invisible characters in text?
Paste your text into the inspector. Every invisible character (zero-width space, ZWNJ, ZWJ, BOM, directional marks LRM/RLM) is flagged in red with an "INVISIBLE" badge. The counter at the top shows the total number of invisible characters detected.
What is the difference between NFC and NFD?
NFC (Normalization Form C, canonical composition) combines a base character and its diacritics into a single codepoint when possible (e.g., é = U+00E9). NFD (Normalization Form D, canonical decomposition) separates the base character and the diacritic (e.g., e + combining acute accent = U+0065 U+0301). The inspector displays both forms and their codepoints for any text.
Is my text sent to a server?
No. Unicode analysis happens entirely in your browser via JavaScript. No data is transmitted, stored, or logged. Your text stays private.
How does this tool help detect homoglyph attacks?
The inspector shows the exact codepoint (U+XXXX) and Unicode name of every character. Two visually identical characters from different Unicode blocks (for example, Latin "a" U+0061 and Cyrillic "а" U+0430) are clearly distinguished by their codepoints and names, even when they share the same category.
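
To make the homoglyph check concrete, here is a minimal sketch that compares two look-alike strings codepoint by codepoint. The function name diffCodepoints and the sample strings are illustrative assumptions; the inspector itself shows codepoints for one text at a time, so this simply automates the side-by-side comparison a researcher would do by eye.

```javascript
// Formats a single character as U+XXXX, or null when the position is empty.
function formatCodepoint(ch) {
  if (ch === undefined) return null;
  return "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
}

// Lists every position (in codepoints) where the two strings differ.
function diffCodepoints(a, b) {
  const left = Array.from(a);
  const right = Array.from(b);
  const diffs = [];
  const length = Math.max(left.length, right.length);
  for (let i = 0; i < length; i++) {
    if (left[i] !== right[i]) {
      diffs.push({ index: i, left: formatCodepoint(left[i]), right: formatCodepoint(right[i]) });
    }
  }
  return diffs;
}

// The second string uses Cyrillic U+0430 in place of the first Latin "a".
console.log(diffCodepoints("paypal", "p\u0430ypal"));
// → [ { index: 1, left: 'U+0061', right: 'U+0430' } ]
```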