Common Encoding Issues
- Mojibake — garbled text from reading a file with the wrong encoding
- Question marks or boxes replacing characters — bytes that are invalid in the assumed encoding, or characters the target encoding or font cannot represent
- Accented characters showing as two characters — UTF-8 read as Latin-1
- Smart quotes showing as “ — UTF-8 read as Windows-1252
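The symptoms above can be reproduced in a few lines of Python, which may help when diagnosing which mismatch you are looking at. This is a minimal sketch using only built-in codecs; the variable names are illustrative, and `cp1252` is Python's name for Windows-1252.

```python
# Each symptom is the same bytes decoded with the wrong codec.
utf8_bytes = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

# UTF-8 bytes read as Latin-1: one accented letter becomes two characters.
as_latin1 = utf8_bytes.decode("latin-1")

# A UTF-8 left smart quote (U+201C) read as Windows-1252 becomes three characters.
quote_garbled = "\u201c".encode("utf-8").decode("cp1252")

# Windows-1252 bytes read as UTF-8 are usually invalid UTF-8; with
# errors="replace" the decoder substitutes U+FFFD, often drawn as a box or "?".
replaced = "é".encode("cp1252").decode("utf-8", errors="replace")

print(as_latin1, quote_garbled, replaced)
```

The last case shows why the direction of the mismatch matters: reading UTF-8 as a single-byte encoding yields extra characters, while reading a single-byte encoding as UTF-8 yields replacement characters or a decoding error.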
Encoding Quick Reference
- UTF-8 — universal, handles all Unicode, web standard, use this by default
- UTF-16 — used internally by Windows, Java, JavaScript strings
- ASCII — English only, 7-bit, legacy systems
- ISO-8859-1 (Latin-1) — Western European, legacy web pages
- Windows-1252 — Windows Western European, Excel exports
- UTF-8 with BOM — UTF-8 with a byte order mark, Excel compatibility
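As a rough illustration of how these encodings differ at the byte level, the sketch below encodes one accented character with Python's built-in codecs. The codec names (`cp1252` for Windows-1252, `utf-8-sig` for UTF-8 with BOM) are Python's spellings, and the sample character is an arbitrary choice.

```python
sample = "é"  # arbitrary sample character

# Python codec names for the encodings in the list above
codecs_to_try = ["utf-8", "utf-16-le", "latin-1", "cp1252", "utf-8-sig"]

encoded = {name: sample.encode(name) for name in codecs_to_try}
for name, data in encoded.items():
    print(f"{name:10} {len(data)} byte(s): {data!r}")

# ASCII has no code for "é", so a strict encode raises an error.
try:
    sample.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print("ASCII can encode it:", ascii_ok)
```

Note that `utf-8-sig` produces the same bytes as `utf-8` with the three-byte mark `EF BB BF` in front, which is the signature Excel looks for.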
Use TechConverter's Text Encoding Converter to convert text between any encoding instantly in your browser — no data is sent to any server.
Examples
Example 1: UTF-8 to ASCII (with fallback)
Input (UTF-8): "Café résumé naïve"
Output (ASCII): "Cafe resume naive"
Note: ASCII cannot represent accented characters.
The converter replaces them with their closest ASCII equivalents
or marks them as unconvertible.
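One way to approximate this fallback in Python is to decompose accented letters and drop the accents. The sketch below is an assumption about how such a fallback could work, not a description of the converter's actual implementation; `to_ascii` and its `placeholder` parameter are hypothetical names.

```python
import unicodedata


def to_ascii(text: str, placeholder: str = "?") -> str:
    """Hypothetical helper: strip accents, mark anything else as unconvertible."""
    # NFKD splits "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Whatever is still outside ASCII has no close equivalent here.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in stripped)


print(to_ascii("Café résumé naïve"))  # Cafe resume naive
print(to_ascii("€5"))                 # ?5
```

This conversion is lossy: the accents cannot be recovered from the ASCII output, so keep the original UTF-8 text if you may need it later.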
Example 2: Windows-1252 to UTF-8 (fixing mojibake)
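Problem: A CSV file exported from Excel shows garbled text:
"é" instead of "é"
"“" instead of "“"
"’" instead of "’"
Cause: The file is UTF-8 encoded but is being read as Windows-1252, so each multi-byte UTF-8 character appears as two or three Windows-1252 characters.
Fix: Re-encode the garbled text as Windows-1252 to recover the original bytes, then decode those bytes as UTF-8: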
Input encoding: Windows-1252
Output encoding: UTF-8
Input: "élève"
Output: "élève"
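The same repair can be sketched in Python. Text garbled this way consists of UTF-8 bytes that were decoded as Windows-1252, so encoding it back to Windows-1252 recovers the original bytes, which can then be decoded as UTF-8. `fix_mojibake` is a hypothetical helper name, and returning the input unchanged on failure is a design assumption.

```python
def fix_mojibake(garbled: str) -> str:
    """Hypothetical helper: undo UTF-8 text that was decoded as Windows-1252."""
    try:
        # Recover the original bytes, then decode them with the right codec.
        return garbled.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text was not garbled in this particular way; leave it alone.
        return garbled


print(fix_mojibake("Ã©lÃ¨ve"))  # élève
print(fix_mojibake("élève"))      # élève (already correct, returned unchanged)
```

The `try`/`except` guard matters because running the round trip on text that is already correct would otherwise raise an error rather than fix anything.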
Example 3: Python Encoding Conversion
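# Read a file with unknown encoding and convert to UTF-8
import chardet  # third-party package: pip install chardet

# Detect encoding (open in binary mode so the raw bytes are preserved)
with open('legacy_file.txt', 'rb') as f:
    raw = f.read()
detected = chardet.detect(raw)
print(f"Detected encoding: {detected['encoding']}")  # e.g., 'ISO-8859-1'

# Detection is a statistical guess; chardet reports None when it cannot decide,
# so fall back to UTF-8 in that case
encoding = detected['encoding'] or 'utf-8'

# Convert to UTF-8
text = raw.decode(encoding)
with open('converted_file.txt', 'w', encoding='utf-8') as f:
    f.write(text)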