How Broken Link Detection Works
A broken link finder crawls a website, extracts the hyperlinks from each page, and sends an HTTP HEAD request to each URL to check whether it returns a successful response, typically 200 OK. Links that return 404 (Not Found), 410 (Gone), or 500 (Internal Server Error), or that time out, are flagged as broken. Internal links are followed recursively so that every page on the site is crawled; external links are checked with a single request each and are not crawled further.
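The split between internal and external links can be sketched as follows. This is a minimal illustration rather than a full crawler: `partitionLinks` and the `example.com` URLs are hypothetical names chosen here, and "internal" is assumed to mean "same origin as the site's start URL".

```javascript
// Split a page's links into internal ones (same origin, to be crawled)
// and external ones (checked with one request each, never crawled).
// Assumption: "internal" means same origin as siteUrl.
function partitionLinks(links, siteUrl) {
  const siteOrigin = new URL(siteUrl).origin;
  const internal = new Set();
  const external = new Set();
  for (const link of links) {
    let parsed;
    try {
      parsed = new URL(link, siteUrl); // resolve relative links
    } catch {
      continue; // skip malformed URLs
    }
    // Only http(s) links can be checked with an HTTP request
    if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') continue;
    parsed.hash = ''; // a fragment does not change which resource is fetched
    (parsed.origin === siteOrigin ? internal : external).add(parsed.href);
  }
  return { internal: [...internal], external: [...external] };
}

const { internal, external } = partitionLinks(
  ['/about#team', '/about', 'https://other.example/x', 'mailto:a@b.c'],
  'https://example.com/'
);
console.log(internal); // same-origin pages to crawl
console.log(external); // other sites to check once
```

Using sets means each URL is queued only once, even when many pages link to it.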
HTTP Status Codes for Links
| Status | Meaning | Action |
|---|---|---|
| 200 | OK — link works | None needed |
| 301/302 | Redirect (permanent/temporary) | Update to final URL |
| 404 | Not Found | Fix or remove link |
| 410 | Gone (permanent) | Remove link |
| 500 | Internal Server Error | Check later |
| Timeout | No response | Retry or remove |
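The table above can be turned into a small lookup, sketched here under a few assumptions: `recommendAction` is a hypothetical helper name, status `0` stands in for "no response" (timeout or network failure), and codes not listed in the table fall back to a manual review.

```javascript
// Actions copied from the status table.
// Assumption: 0 represents "no response" (timeout or network failure).
const ACTIONS = {
  0: 'Retry or remove',
  200: 'None needed',
  301: 'Update to final URL',
  302: 'Update to final URL',
  404: 'Fix or remove link',
  410: 'Remove link',
  500: 'Check later'
};

// Return the suggested action for an HTTP status code.
function recommendAction(status) {
  // Codes outside the table are left for a human to review (assumption).
  return ACTIONS[status] ?? 'Review manually';
}

console.log(recommendAction(404));
```

For example, `recommendAction(410)` returns `'Remove link'`.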
Simple Link Checker in Node.js
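// Requires Node.js 18+ (global fetch and AbortController).
// Check a single URL; resolves to { url, status, ok } and never throws.
async function checkLink(url, timeout = 5000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeout);
  try {
    const res = await fetch(url, {
      method: 'HEAD',
      signal: controller.signal,
      redirect: 'follow'
    });
    return { url, status: res.status, ok: res.ok };
  } catch (err) {
    // Network failure or timeout (abort): report status 0
    return { url, status: 0, ok: false, error: err.message };
  } finally {
    clearTimeout(timer); // always clear the timer, even when fetch rejects
  }
}

// Extract absolute http(s) links from HTML.
// Regex-based, so it only sees quoted href attributes.
function extractLinks(html, baseUrl) {
  const matches = html.matchAll(/href=["']([^"']+)["']/gi);
  const links = [];
  for (const m of matches) {
    try {
      const u = new URL(m[1], baseUrl);
      // Skip mailto:, tel:, javascript: and other non-HTTP schemes
      if (u.protocol === 'http:' || u.protocol === 'https:') links.push(u.href);
    } catch {
      // Skip hrefs that cannot be parsed as URLs
    }
  }
  return links;
}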
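To check every link on a page without opening hundreds of connections at once, the two functions above can be driven by a small worker pool. The following is a sketch under stated assumptions: `checkAll` and its default `concurrency` of 5 are hypothetical choices made here, and `check` is any async function shaped like `checkLink`.

```javascript
// Run `check` over every URL with at most `concurrency` requests in flight.
// `check` is any async function shaped like checkLink: (url) => Promise<result>.
async function checkAll(urls, check, concurrency = 5) {
  const unique = [...new Set(urls)]; // check each URL only once
  const results = new Array(unique.length);
  let next = 0; // index of the next URL to hand out

  // Each worker pulls the next unchecked URL until none remain.
  async function worker() {
    while (next < unique.length) {
      const i = next++;
      results[i] = await check(unique[i]);
    }
  }

  const workers = Array.from(
    { length: Math.min(concurrency, unique.length) },
    worker
  );
  await Promise.all(workers);
  return results; // same order as the de-duplicated input
}
```

Calling `checkAll(extractLinks(html, pageUrl), checkLink)` and filtering the results on `!r.ok` yields the broken links for a page, where `html` and `pageUrl` are the fetched page body and its address.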