What Does a URL Parser Do?
A URL parser breaks a URL into its individual components — protocol, host, port, path, query parameters, and fragment — displaying each part separately. This makes it easy to inspect, debug, and understand any URL without manually scanning the string.
URL Structure (RFC 3986)
Full URL:
https://api.example.com:8443/v2/users/42?format=json&lang=en#profile
Components:
Protocol: https
Host: api.example.com
Port: 8443
Path: /v2/users/42
Query: format=json&lang=en
Fragment: profile
Parsed query parameters:
format → json
lang → en
Parsing a Simple Web URL
Input: https://www.example.com/blog/hello-world
Protocol: https
Subdomain: www
Domain: example.com
TLD: com
Host: www.example.com
Port: (default: 443 for HTTPS)
Path: /blog/hello-world
Path segments:
[0] blog
[1] hello-world
Query: (none)
Fragment: (none)
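The path segment listing above can be approximated in Python with the standard library. This is a minimal sketch, not the tool's own implementation; the helper name path_segments is made up for illustration.

```python
from urllib.parse import urlsplit, unquote

def path_segments(url):
    """Return the non-empty, percent-decoded segments of the URL path."""
    path = urlsplit(url).path
    # Split on "/" first and decode afterwards, so an encoded slash (%2F)
    # stays inside its own segment instead of creating a new one.
    return [unquote(seg) for seg in path.split("/") if seg]

segments = path_segments("https://www.example.com/blog/hello-world")
for index, segment in enumerate(segments):
    print(f"[{index}] {segment}")
# [0] blog
# [1] hello-world
```

Empty segments (from a leading, trailing, or doubled slash) are dropped here, which is a simplification that may not suit every use.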
Parsing a URL with Query Parameters
Input: https://shop.example.com/search?q=laptop&brand=dell&price_max=1500&sort=price_asc&page=2
Protocol: https
Host: shop.example.com
Path: /search
Query string: q=laptop&brand=dell&price_max=1500&sort=price_asc&page=2
Parsed parameters:
q → laptop
brand → dell
price_max → 1500
sort → price_asc
page → 2
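One way to get the same ordered parameter list in Python is parse_qsl from urllib.parse. The sketch below assumes you want to keep the original order and any blank values; all values come back as strings, so numeric parameters such as page still need converting.

```python
from urllib.parse import urlsplit, parse_qsl

url = ("https://shop.example.com/search"
       "?q=laptop&brand=dell&price_max=1500&sort=price_asc&page=2")

# parse_qsl returns (key, value) pairs in their original order and
# keeps repeated keys as separate pairs.
pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)

for key, value in pairs:
    print(f"{key} -> {value}")

page = int(dict(pairs)["page"])  # values are strings until converted
```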
Parsing a URL with Encoded Characters
Input: https://example.com/search?q=hello%20world&city=New%20York&tag=caf%C3%A9
Decoded query parameters:
q → hello world (was: hello%20world)
city → New York (was: New%20York)
tag → café (was: caf%C3%A9)
Path encoding:
Encoded: /files/my%20document%20(2024).pdf
Decoded: /files/my document (2024).pdf
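The decoding shown above can be sketched in Python as follows. parse_qs decodes percent-escapes in query values (as UTF-8 by default), while unquote does the same for a path string.

```python
from urllib.parse import urlsplit, parse_qs, unquote

url = "https://example.com/search?q=hello%20world&city=New%20York&tag=caf%C3%A9"

# Query values are percent-decoded while they are parsed.
params = parse_qs(urlsplit(url).query)
print(params)  # {'q': ['hello world'], 'city': ['New York'], 'tag': ['café']}

# Paths are decoded separately with unquote.
decoded_path = unquote("/files/my%20document%20(2024).pdf")
print(decoded_path)  # /files/my document (2024).pdf
```

Note that parse_qs also treats a literal + in the query as a space, whereas unquote leaves + unchanged.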
Parsing an API URL
Input: https://api.github.com/repos/owner/repo/issues?state=open&labels=bug&per_page=30&page=1
Protocol: https
Host: api.github.com
Path: /repos/owner/repo/issues
Path segments:
[0] repos
[1] owner
[2] repo
[3] issues
Query parameters:
state → open
labels → bug
per_page → 30
page → 1
Parsing a URL with Fragment
Input: https://docs.example.com/guide/getting-started#installation
Protocol: https
Host: docs.example.com
Path: /guide/getting-started
Fragment: installation
Note: The fragment (#installation) is processed by the browser only.
It is NEVER sent to the server in the HTTP request.
Fragments are typically used for in-page navigation and client-side routing.
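To see which part of the URL would actually reach the server, the fragment can be split off with urldefrag. This is a small illustrative sketch using the example URL above.

```python
from urllib.parse import urldefrag

url = "https://docs.example.com/guide/getting-started#installation"

# urldefrag separates the fragment from the rest of the URL.
without_fragment, fragment = urldefrag(url)

print(without_fragment)  # https://docs.example.com/guide/getting-started
print(fragment)          # installation
```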
Parsing a Non-Standard Port URL
Input: http://localhost:3000/api/v1/health
Protocol: http
Host: localhost
Port: 3000 (explicit, non-default)
Default port for HTTP: 80
Path: /api/v1/health
Note: Port 3000 is commonly used for local development servers.
In production, the standard ports (80 for HTTP, 443 for HTTPS) are typically used and can be omitted from the URL.
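A parser can report the effective port by falling back to the scheme's default when none is given. The sketch below assumes only HTTP and HTTPS matter; the DEFAULT_PORTS table and the effective_port helper are hypothetical names, not part of any library.

```python
from urllib.parse import urlsplit

# Assumption: only these two schemes are of interest.
DEFAULT_PORTS = {"http": 80, "https": 443}

def effective_port(url):
    """Return the explicit port if present, else the scheme default (or None)."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("http://localhost:3000/api/v1/health"))       # 3000
print(effective_port("https://www.example.com/blog/hello-world"))  # 443
```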
Parsing an International Domain Name (IDN)
Input: https://münchen.de/events?year=2024
Unicode form: münchen.de
Punycode form: xn--mnchen-3ya.de
Protocol: https
Host (Unicode): münchen.de
Host (Punycode): xn--mnchen-3ya.de
Path: /events
Query:
year → 2024
Note: DNS lookups use the Punycode form. Browsers usually display the Unicode form, but may fall back to Punycode for hostnames that look deceptive.
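The Unicode-to-Punycode conversion can be sketched with Python's built-in idna codec. Be aware that this codec implements the older IDNA 2003 rules, so results for some hostnames may differ from what current browsers produce; for this example the two agree.

```python
from urllib.parse import urlsplit

url = "https://münchen.de/events?year=2024"
unicode_host = urlsplit(url).hostname

# Encode each label of the hostname to its ASCII (Punycode) form.
punycode_host = unicode_host.encode("idna").decode("ascii")
print(punycode_host)  # xn--mnchen-3ya.de

# Decoding goes the other way.
print(punycode_host.encode("ascii").decode("idna"))  # münchen.de
```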
URL Reconstruction
After parsing, the tool reconstructs the URL from components to verify round-trip accuracy:
Original: https://api.example.com:8443/v2/users?id=42&format=json#details
Reconstructed: https://api.example.com:8443/v2/users?id=42&format=json#details
Match: ✓
If the reconstructed URL differs from the original, the parser normalized
something along the way, such as percent-encoding, letter case in the scheme or host, or a default port.
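A round-trip check of this kind can be sketched in Python with urlsplit and urlunsplit. The round_trips helper is a made-up name, and the comparison is a plain string match, which is only one possible definition of "round-trip accuracy".

```python
from urllib.parse import urlsplit, urlunsplit

def round_trips(url):
    """Return True if splitting and rejoining the URL gives the same string."""
    return urlunsplit(urlsplit(url)) == url

original = "https://api.example.com:8443/v2/users?id=42&format=json#details"
print(round_trips(original))  # True

# urlsplit lowercases the scheme, so this URL does not round-trip exactly.
print(round_trips("HTTPS://api.example.com/v2/users"))  # False
```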
URL Parsing in JavaScript
// Built-in URL API
const url = new URL('https://api.example.com:8443/v2/users?id=42&format=json#details');
console.log(url.protocol); // "https:"
console.log(url.hostname); // "api.example.com"
console.log(url.port); // "8443"
console.log(url.pathname); // "/v2/users"
console.log(url.hash); // "#details"
// Query parameters
const params = url.searchParams;
console.log(params.get('id')); // "42"
console.log(params.get('format')); // "json"
// Iterate all params
for (const [key, value] of params) {
  console.log(`${key}: ${value}`);
}
URL Parsing in Python
from urllib.parse import urlparse, parse_qs
url = 'https://api.example.com:8443/v2/users?id=42&format=json#details'
parsed = urlparse(url)
print(parsed.scheme) # https
print(parsed.netloc) # api.example.com:8443
print(parsed.hostname) # api.example.com
print(parsed.port) # 8443
print(parsed.path) # /v2/users
print(parsed.fragment) # details
# Parse query string
params = parse_qs(parsed.query)
print(params) # {'id': ['42'], 'format': ['json']}
Common Use Cases
- Debugging failing API calls by inspecting each URL component
- Extracting query parameters from redirect URLs
- Validating URL structure in web application testing
- Understanding redirect chains and URL transformations
- Security analysis — checking for suspicious schemes or hosts
- Building URL manipulation utilities
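As an illustration of the security-analysis use case, a parsed URL can be checked against an allow-list of schemes and hosts. This is a deliberately simple sketch: ALLOWED_SCHEMES and is_allowed are hypothetical names, and a real check would likely need more rules than these two.

```python
from urllib.parse import urlsplit

# Assumption: only web URLs are acceptable.
ALLOWED_SCHEMES = {"http", "https"}

def is_allowed(url, allowed_hosts):
    """Return True if the URL uses an allowed scheme and an allowed hostname."""
    parts = urlsplit(url)
    return parts.scheme in ALLOWED_SCHEMES and parts.hostname in allowed_hosts

hosts = {"api.example.com"}
print(is_allowed("https://api.example.com:8443/v2/users", hosts))   # True
print(is_allowed("javascript:alert(1)", hosts))                     # False
# The text before "@" is userinfo, not the host, so this is rejected.
print(is_allowed("https://api.example.com@evil.test/login", hosts)) # False
```

Comparing the parsed hostname rather than searching the raw string is the point of the example: the third URL contains the trusted name but actually points at a different host.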