How Text Is Stored as Binary
Every character you type is stored in memory as a number, and that number is stored in binary. The mapping from characters to numbers is defined by a character encoding standard. The best-known is ASCII (American Standard Code for Information Interchange), which maps 128 characters to 7-bit numbers (0–127), usually stored one per 8-bit byte with the top bit set to 0. Most modern systems use UTF-8, which is backward-compatible with ASCII for the first 128 characters and can encode all 1,112,064 valid Unicode code points using 1–4 bytes each.
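As a quick illustration of the character-to-number mapping, the sketch below looks up a character's code point and counts how many bytes UTF-8 needs for it. The helper name `describeChar` is made up for this example; `codePointAt` and `TextEncoder` are standard JavaScript and Web APIs.

```javascript
// Hypothetical helper: report a character's code point and its UTF-8 size.
function describeChar(char) {
  const codePoint = char.codePointAt(0);                      // numeric value of the character
  const utf8Length = new TextEncoder().encode(char).length;   // bytes needed in UTF-8
  return { codePoint, utf8Length };
}

console.log(describeChar('A')); // { codePoint: 65, utf8Length: 1 }
console.log(describeChar('€')); // { codePoint: 8364, utf8Length: 3 }
```

ASCII characters such as `A` take a single byte, while `€` falls outside the ASCII range and needs three.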
ASCII Binary Reference
| Character | Decimal | Binary (8-bit) |
|---|---|---|
| Space | 32 | 0010 0000 |
| A | 65 | 0100 0001 |
| Z | 90 | 0101 1010 |
| a | 97 | 0110 0001 |
| z | 122 | 0111 1010 |
| 0 | 48 | 0011 0000 |
| 9 | 57 | 0011 1001 |
| ! | 33 | 0010 0001 |
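
Rows like the ones in this table can be generated rather than looked up. The following is a minimal sketch; `asciiRow` is a hypothetical helper name, and it assumes the input is a single ASCII character.

```javascript
// Hypothetical helper: build one reference-table row for an ASCII character.
function asciiRow(char) {
  const code = char.charCodeAt(0);
  if (code > 127) throw new RangeError('not an ASCII character');
  const bits = code.toString(2).padStart(8, '0');
  const label = char === ' ' ? 'Space' : char;
  // Split the 8 bits into two groups of four, matching the table layout.
  return `| ${label} | ${code} | ${bits.slice(0, 4)} ${bits.slice(4)} |`;
}

console.log(asciiRow('A')); // | A | 65 | 0100 0001 |
console.log(asciiRow(' ')); // | Space | 32 | 0010 0000 |
```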
Decoding "Hello" from Binary
The word "Hello" in binary, 8 bits per character, with each byte decoded to its decimal value and character:
01001000 = 72 = 'H'
01100101 = 101 = 'e'
01101100 = 108 = 'l'
01101100 = 108 = 'l'
01101111 = 111 = 'o'
Result: Hello
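The binary-to-decimal step in the walkthrough above can be done by hand: read the bits left to right, doubling the running total and adding each bit. The sketch below shows that arithmetic without relying on `parseInt`; `bitsToDecimal` is a hypothetical helper name.

```javascript
// Hypothetical helper: convert a string of bits to a number by hand,
// doubling the running total and adding each bit (left to right).
function bitsToDecimal(bits) {
  let value = 0;
  for (const bit of bits) {
    value = value * 2 + (bit === '1' ? 1 : 0);
  }
  return value;
}

const helloBits = ['01001000', '01100101', '01101100', '01101100', '01101111'];
const codes = helloBits.map(bitsToDecimal);
console.log(codes);                         // [ 72, 101, 108, 108, 111 ]
console.log(String.fromCharCode(...codes)); // Hello
```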
JavaScript: Text ↔ Binary
// Text to binary (8 bits per character; assumes ASCII input,
// since characters with codes above 255 produce more than 8 bits)
function textToBinary(str) {
  return str.split('').map(char =>
    char.charCodeAt(0).toString(2).padStart(8, '0')
  ).join(' ');
}
// Binary to text (expects whitespace-separated 8-bit groups)
function binaryToText(binary) {
  return binary.trim().split(/\s+/).map(byte =>
    String.fromCharCode(parseInt(byte, 2))
  ).join('');
}
console.log(textToBinary('Hi'));
// → "01001000 01101001"
console.log(binaryToText('01001000 01101001'));
// → "Hi"
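The decoder above trusts its input: a group such as `0100100` (7 bits) or `01002000` would still be passed to `parseInt` and quietly produce the wrong character. One possible stricter variant, sketched here under the assumption that every group must be exactly 8 binary digits, rejects anything else. The name `binaryToTextStrict` is hypothetical.

```javascript
// Hypothetical stricter variant of binaryToText: rejects malformed groups.
function binaryToTextStrict(binary) {
  return binary.trim().split(/\s+/).map(group => {
    if (!/^[01]{8}$/.test(group)) {
      throw new SyntaxError(`Expected 8 binary digits, got "${group}"`);
    }
    return String.fromCharCode(parseInt(group, 2));
  }).join('');
}

console.log(binaryToTextStrict('01001000 01101001')); // Hi
```

Note that this version also throws on an empty string, because there is no valid group to decode.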
UTF-8 and Multi-Byte Characters
For characters outside the ASCII range, UTF-8 uses multiple bytes.
The euro sign € (U+20AC) encodes as 3 bytes: 0xE2 0x82 0xAC,
which in binary is 11100010 10000010 10101100.
Emoji like 😀 (U+1F600) use 4 bytes: 0xF0 0x9F 0x98 0x80.
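This is why you can't treat every 8-bit group in a UTF-8 binary string as one character:
the leading bits of the first byte (0, 110, 1110, or 11110) tell you whether the sequence is 1, 2, 3, or 4 bytes long, and every continuation byte starts with 10.
The TextEncoder and TextDecoder APIs handle this correctly in browsers and in Node.js.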
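A UTF-8-aware version of the text ↔ binary conversion can be sketched with these two APIs. The function names `utf8ToBinary` and `binaryToUtf8` are made up for this example, and the input to the decoder is assumed to be whitespace-separated 8-bit groups.

```javascript
// Hypothetical helpers: UTF-8-aware text <-> binary conversion.
function utf8ToBinary(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  return Array.from(bytes, b => b.toString(2).padStart(8, '0')).join(' ');
}

function binaryToUtf8(binary) {
  const bytes = Uint8Array.from(binary.trim().split(/\s+/), bits => parseInt(bits, 2));
  return new TextDecoder().decode(bytes);      // reassembles multi-byte sequences
}

console.log(utf8ToBinary('€'));
// 11100010 10000010 10101100
console.log(binaryToUtf8('11110000 10011111 10011000 10000000'));
// 😀
```

By default `TextDecoder` replaces invalid byte sequences with U+FFFD instead of failing; constructing it as `new TextDecoder('utf-8', { fatal: true })` makes it throw a `TypeError` on malformed input.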