CSV Diff: Comparing Spreadsheets
CSV diff compares two CSV files and shows which rows were added, removed, or modified. Unlike a text diff, which compares line by line, a CSV diff understands the file's structure: it can match rows by a key column and show field-level changes within each row. This is useful for auditing data changes, validating ETL outputs, and tracking changes in exported reports.
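To make the contrast concrete, here is a minimal sketch that compares the same two files both ways. The sample data is invented for illustration, and the first column is assumed to be the key. The two files hold identical rows in a different order, so a positional comparison reports changes while a key-aware comparison reports none.

```javascript
const oldCsv = 'id,name\n1,Alice\n2,Bob';
const newCsv = 'id,name\n2,Bob\n1,Alice'; // same rows, different order

// Text-style comparison: pair up lines by position.
const oldLines = oldCsv.split('\n');
const newLines = newCsv.split('\n');
const changedLines = oldLines.filter((line, i) => line !== newLines[i]);

// Key-aware comparison: index data rows by the first column (assumed to be the key).
const byKey = lines => new Map(lines.slice(1).map(l => [l.split(',')[0], l]));
const oldByKey = byKey(oldLines);
const changedRows = [...byKey(newLines)].filter(([key, line]) => oldByKey.get(key) !== line);

console.log(changedLines.length); // 2
console.log(changedRows.length);  // 0
```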
CSV Diff Strategies
| Strategy | Use Case |
|---|---|
| Line-by-line | Simple files where row order is stable |
| Key-based | Files with a unique ID column — matches rows by key |
| Hash-based | Large files — hash each row to quickly find changes |
| Semantic | Normalize values (trim, lowercase) before comparing |
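The semantic and hash-based strategies can be combined, as in the rough sketch below: each row is normalized, then reduced to a fingerprint so that membership checks are cheap. The helper names (`normalizeRow`, `fnv1a`, `rowFingerprint`, `hashDiff`) are hypothetical, and the 32-bit FNV-1a hash is used only to keep the example dependency-free. A production tool would more likely use a stronger digest to reduce the chance of collisions.

```javascript
// Semantic step: trim whitespace and lowercase each field before comparing.
const normalizeRow = fields => fields.map(f => f.trim().toLowerCase());

// Hash step: FNV-1a (32-bit) over the UTF-16 code units of a string.
// Chosen only for brevity; collisions are possible with a 32-bit hash.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Fingerprint a row: normalize, join with a separator unlikely to occur in data, hash.
const rowFingerprint = fields => fnv1a(normalizeRow(fields).join('\u001f'));

// Rows whose fingerprint is missing from the other file are reported as added or removed.
function hashDiff(oldRows, newRows) {
  const oldHashes = new Set(oldRows.map(rowFingerprint));
  const newHashes = new Set(newRows.map(rowFingerprint));
  return {
    added: newRows.filter(r => !oldHashes.has(rowFingerprint(r))),
    removed: oldRows.filter(r => !newHashes.has(rowFingerprint(r))),
  };
}

const result = hashDiff(
  [['1', 'Alice'], ['2', 'Bob']],
  [['1', ' alice '], ['3', 'Carol']]
);
console.log(result.added.length, result.removed.length);
```

Without a key column, this approach cannot tell a modified row from a removed row plus an added one. It only reports which normalized rows exist in one file and not the other.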
Key-Based CSV Diff
function csvDiff(oldCsv, newCsv, keyColumn = 'id') {
  // Naive parser: assumes no quoted fields, embedded commas, or embedded newlines.
  const parse = csv => {
    const [header, ...rows] = csv.trim().split(/\r?\n/);
    const cols = header.split(',');
    return rows.map(row => {
      const vals = row.split(',');
      return Object.fromEntries(cols.map((c, i) => [c, vals[i]]));
    });
  };

  // Index each file's rows by the key column (a later duplicate key overwrites an earlier one).
  const oldRows = new Map(parse(oldCsv).map(r => [r[keyColumn], r]));
  const newRows = new Map(parse(newCsv).map(r => [r[keyColumn], r]));

  const added = [...newRows.values()].filter(r => !oldRows.has(r[keyColumn]));
  const removed = [...oldRows.values()].filter(r => !newRows.has(r[keyColumn]));
  // A row is modified when its key exists in both files but the serialized rows differ.
  // JSON.stringify is order-sensitive, so both files must list columns in the same order.
  const modified = [...newRows.values()].filter(r => {
    const old = oldRows.get(r[keyColumn]);
    return old && JSON.stringify(old) !== JSON.stringify(r);
  });

  return { added, removed, modified };
}
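The function above reports which rows were modified but not which fields changed. One possible way to get field-level detail is sketched below. `fieldChanges` is a hypothetical helper, not part of the function above, and it assumes rows are plain objects keyed by column name, as produced by the parser.

```javascript
// List the columns whose values differ between two versions of the same row.
function fieldChanges(oldRow, newRow) {
  // Union of columns, so a column present in only one version is still reported.
  const columns = new Set([...Object.keys(oldRow), ...Object.keys(newRow)]);
  const changes = [];
  for (const column of columns) {
    if (oldRow[column] !== newRow[column]) {
      changes.push({ column, from: oldRow[column], to: newRow[column] });
    }
  }
  return changes;
}

const changes = fieldChanges(
  { id: '7', name: 'Ann', city: 'Oslo' },
  { id: '7', name: 'Ann', city: 'Bergen' }
);
console.log(changes); // [ { column: 'city', from: 'Oslo', to: 'Bergen' } ]
```

Applied to each entry in `modified`, together with the matching old row, this would yield a per-row list of changed columns.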