What Is a File Hash?
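A file hash (also called a digest) is a fixed-length string produced by running a file's contents through a cryptographic hash function. The same file always produces the same hash, and any change to the file, even a single bit, produces a completely different one. This makes hashes well suited to verifying file integrity: if the computed hash matches an expected value obtained from a trusted source, the file is almost certainly intact and unmodified. That guarantee is only as strong as the algorithm's collision resistance, which is why MD5 and SHA-1 are no longer recommended where an attacker could have crafted the file.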
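
The single-bit sensitivity described above is easy to observe directly. The following is a minimal sketch using Python's standard `hashlib`; the sample input and the variable names are illustrative assumptions, not part of any particular tool.

```python
import hashlib

original = b"hello world"
# Flip the lowest bit of the first byte: 'h' (0x68) becomes 'i' (0x69).
modified = bytes([original[0] ^ 0x01]) + original[1:]

digest_a = hashlib.sha256(original).hexdigest()
digest_b = hashlib.sha256(modified).hexdigest()

print(digest_a)
print(digest_b)

# Count how many of the 256 output bits differ between the two digests.
diff_bits = bin(int(digest_a, 16) ^ int(digest_b, 16)).count("1")
print(f"{diff_bits} of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits should differ, even though the inputs differ by only one bit.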
Common Use Cases
- Download verification: Software publishers post SHA-256 hashes alongside downloads. You compute the hash of your downloaded file and compare it to the published value.
- Deduplication: Find duplicate files by comparing their hashes instead of comparing the files byte by byte.
- Version control: Git uses SHA-1 (transitioning to SHA-256) to identify every commit, tree, and blob object.
- Digital signatures: Sign the hash of a document rather than the document itself — much faster for large files.
- Forensics: Investigators record hashes of evidence files (historically MD5 or SHA-1) at collection time so they can later show that a file has not changed since then.
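
As an illustration of the deduplication use case, the sketch below groups files under a directory by their SHA-256 digest. The helper names `sha256_of` and `find_duplicates` are hypothetical, and the approach assumes that hashing every file is acceptable; a real tool might compare file sizes first to skip unnecessary hashing.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hash a file in chunks so large files are never fully loaded into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return {digest: [paths]} for every digest shared by two or more files under root."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_digest[sha256_of(path)].append(path)
    # Keep only digests that map to more than one file.
    return {d: paths for d, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates("photos")` (a hypothetical directory) would return a dictionary whose values are lists of paths with identical contents.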
Computing File Hashes
# Linux (GNU coreutils; may be missing on macOS)
sha256sum file.iso
md5sum file.iso
sha1sum file.iso
# macOS (shasum)
shasum -a 256 file.iso
shasum -a 512 file.iso
# Windows PowerShell
Get-FileHash file.iso -Algorithm SHA256
Get-FileHash file.iso -Algorithm MD5
# Python
import hashlib

def file_hash(path, algorithm='sha256'):
    # Read in 64 KiB chunks so large files are not loaded into memory at once.
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    return h.hexdigest()

print(file_hash('ubuntu.iso'))  # sha256 by default
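
Computing a hash is only half of download verification; the result still has to be compared with the published value. The following is a minimal sketch of that comparison. The function name `verify_file` is hypothetical, and it assumes the published hash is a plain hex string, possibly in upper case or with stray whitespace.

```python
import hashlib


def verify_file(path, expected_hex, algorithm="sha256"):
    """Return True if the file's digest matches expected_hex, ignoring case and whitespace."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    # Published hashes are often pasted in upper case or with a trailing newline.
    expected = expected_hex.strip().lower()
    return h.hexdigest() == expected
```

A call such as `verify_file("file.iso", published_hash)` returns `True` only when the two values match. The check is meaningful only if the published hash came from a source you trust, ideally over a different channel than the download itself.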
Which Algorithm to Use?
| Algorithm | Output | Use for | Avoid for |
|---|---|---|---|
| MD5 | 128-bit (32 hex) | Non-security checksums, deduplication | Security, passwords |
| SHA-1 | 160-bit (40 hex) | Legacy systems, Git | New security applications |
| SHA-256 | 256-bit (64 hex) | File integrity, TLS, code signing | Password hashing (use a dedicated function such as bcrypt) |
| SHA-512 | 512-bit (128 hex) | Applications that want a larger security margin | Constrained environments |
| BLAKE3 | 256-bit by default (64 hex) | High-performance hashing | Legacy compatibility |
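
The output sizes in the table can be checked against Python's `hashlib`. This is a small sketch; the function name `digest_sizes` is hypothetical. BLAKE3 is left out because it is not part of the standard library, and MD5 may be unavailable on builds restricted to FIPS-approved algorithms.

```python
import hashlib


def digest_sizes(names=("md5", "sha1", "sha256", "sha512")):
    """Map each algorithm name to (output bits, hex digest length)."""
    sizes = {}
    for name in names:
        h = hashlib.new(name)
        # digest_size is in bytes; each byte is 8 bits and 2 hex characters.
        sizes[name] = (h.digest_size * 8, len(h.hexdigest()))
    return sizes


for name, (bits, hex_len) in digest_sizes().items():
    print(f"{name}: {bits}-bit, {hex_len} hex characters")
```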