Need to standardize URLs for comparison, deduplication, or storage? This guide covers URL normalization via API, including normalization rules, API parameters, implementation examples, and best practices.
What is URL Normalization?
URL normalization (also called canonicalization) is the process of transforming URLs into a consistent, standardized format. Multiple URLs can point to the same resource, and normalization helps identify them as equivalent.
For example, these URLs typically all point to the same page:
- https://EXAMPLE.COM/path
- https://example.com/path/
- https://example.com/path?utm_source=google
Normalization Rules
Common normalization operations include:
Case Normalization
Convert the scheme and host to lowercase. Leave the path as-is, since paths may be case-sensitive.
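As a rough local illustration of this rule (not the API's implementation), the sketch below uses Python's standard urllib.parse. The function name lowercase_scheme_and_host is made up for this example, and the choice to leave any user:password@ prefix untouched is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_scheme_and_host(url: str) -> str:
    """Lowercase the scheme and host; leave path, query, and fragment untouched."""
    parts = urlsplit(url)
    # Split off optional userinfo ("user:pass@"), which can be case-sensitive.
    userinfo, sep, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + sep + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


print(lowercase_scheme_and_host("HTTPS://Example.COM/Path/Page"))
# https://example.com/Path/Page
```

Note that the path keeps its original casing, which matches the rule above.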
Removing Default Ports
Remove :80 for HTTP and :443 for HTTPS.
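A minimal local sketch of this rule, assuming only the two defaults named above. The helper name strip_default_port and the DEFAULT_PORTS mapping are illustrative and not part of any API.

```python
from urllib.parse import urlsplit, urlunsplit

# Only the two defaults mentioned in this guide; extend as needed.
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url: str) -> str:
    """Remove the port only when it is the default for the URL's scheme."""
    parts = urlsplit(url)
    default = DEFAULT_PORTS.get(parts.scheme)
    netloc = parts.netloc
    if default is not None:
        suffix = f":{default}"
        if netloc.endswith(suffix):
            netloc = netloc[: -len(suffix)]
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(strip_default_port("https://example.com:443/path"))   # https://example.com/path
print(strip_default_port("https://example.com:8443/path"))  # unchanged
```

A port is only dropped when it matches the scheme, so http://example.com:443/ is left alone.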
Removing Trailing Slashes
Pick one convention, with or without a trailing slash, and apply it consistently.
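The sketch below shows one way such a policy could look locally. It borrows the "remove", "add", and "keep" mode names from the API's trailing_slash parameter, but the function apply_trailing_slash itself is hypothetical and may not match the API's behavior in edge cases.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_trailing_slash(url: str, mode: str = "remove") -> str:
    """Apply a trailing-slash policy to the path component only."""
    parts = urlsplit(url)
    path = parts.path
    if mode == "remove":
        # Keep the root path "/" so the URL still has a path.
        if len(path) > 1 and path.endswith("/"):
            path = path.rstrip("/") or "/"
    elif mode == "add":
        if not path.endswith("/"):
            path += "/"
    elif mode != "keep":
        raise ValueError(f"unknown mode: {mode!r}")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(apply_trailing_slash("https://example.com/path/?a=1"))  # https://example.com/path?a=1
print(apply_trailing_slash("https://example.com/path", "add"))  # https://example.com/path/
```

In this sketch, "add" also appends a slash to file-like paths such as /index.html, which is rarely what you want, so a real implementation would likely need an extra check.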
Removing Tracking Parameters
Strip utm_*, fbclid, gclid, and other tracking parameters.
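As an illustration, a local filter for tracking parameters might look like the following. The parameter lists are assumptions based on the names mentioned above, not the API's full list, and strip_tracking_params is a made-up helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed lists for this sketch; the API may strip more parameters than these.
TRACKING_PREFIXES = ("utm_",)
TRACKING_NAMES = {"fbclid", "gclid"}


def strip_tracking_params(url: str) -> str:
    """Drop query parameters that look like tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_NAMES
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(strip_tracking_params("https://example.com/path?utm_source=google&name=test&fbclid=abc"))
# https://example.com/path?name=test
```

One caveat: parsing and re-encoding the query can change how the remaining values are percent-encoded, so compare results against the original when exact byte-for-byte preservation matters.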
Using the URL Normalization API
TinyFn provides an endpoint for URL normalization:
POST https://api.tinyfn.io/v1/url/normalize
Headers:
  X-API-Key: your-api-key
  Content-Type: application/json
Request body:
{
  "url": "HTTPS://Example.com:443/path/?utm_source=google&name=test#section"
}
Response:
{
  "original": "HTTPS://Example.com:443/path/?utm_source=google&name=test#section",
  "normalized": "https://example.com/path?name=test",
  "changes": [
    "lowercase_scheme",
    "lowercase_host",
    "remove_default_port",
    "remove_trailing_slash",
    "remove_tracking_params",
    "remove_fragment"
  ]
}
Parameters
| Parameter | Type | Description |
|---|---|---|
| url | string | URL to normalize (required) |
| remove_tracking | boolean | Remove tracking parameters (default: true) |
| remove_fragment | boolean | Remove hash fragment (default: true) |
| trailing_slash | string | "remove", "add", or "keep" (default: "remove") |
| sort_params | boolean | Sort query parameters alphabetically (default: true) |
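As a sketch of how these options might be combined, the helper below builds a request body using the defaults from the table. It assumes the options are sent as top-level JSON fields next to url, which the table suggests but does not state outright. build_normalize_payload is a hypothetical helper, not part of any TinyFn client.

```python
import json

ALLOWED_TRAILING_SLASH = {"remove", "add", "keep"}


def build_normalize_payload(
    url: str,
    *,
    remove_tracking: bool = True,
    remove_fragment: bool = True,
    trailing_slash: str = "remove",
    sort_params: bool = True,
) -> dict:
    """Build a JSON-serializable body; defaults mirror the parameter table."""
    if trailing_slash not in ALLOWED_TRAILING_SLASH:
        raise ValueError(f"trailing_slash must be one of {sorted(ALLOWED_TRAILING_SLASH)}")
    # Assumption: options sit at the top level of the body, alongside "url".
    return {
        "url": url,
        "remove_tracking": remove_tracking,
        "remove_fragment": remove_fragment,
        "trailing_slash": trailing_slash,
        "sort_params": sort_params,
    }


payload = build_normalize_payload(
    "https://example.com/docs/#install",
    remove_fragment=False,
    trailing_slash="keep",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary could then be passed as the json argument of requests.post in the Python example later in this guide.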
Code Examples
JavaScript / Node.js
// Run as an ES module (top-level await) on Node.js 18+ or in a browser.
const response = await fetch(
  'https://api.tinyfn.io/v1/url/normalize',
  {
    method: 'POST',
    headers: {
      'X-API-Key': 'your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      url: 'HTTPS://Example.com/path/?utm_source=google'
    })
  }
);
const { normalized } = await response.json();
console.log(normalized); // "https://example.com/path"
Python
import requests

response = requests.post(
    'https://api.tinyfn.io/v1/url/normalize',
    headers={'X-API-Key': 'your-api-key'},
    json={'url': 'HTTPS://Example.com/path/?utm_source=google'},
    timeout=10,  # avoid hanging indefinitely on network problems
)
response.raise_for_status()  # fail early on 4xx/5xx responses
normalized = response.json()['normalized']
print(normalized)  # "https://example.com/path"
cURL
curl -X POST "https://api.tinyfn.io/v1/url/normalize" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"url": "HTTPS://Example.com/path/?utm_source=google"}'
Common Use Cases
- URL Deduplication: Identify duplicate URLs in databases
- SEO Analysis: Detect duplicate content issues
- Web Crawling: Avoid crawling the same page multiple times
- Analytics: Aggregate traffic to equivalent URLs
- Link Comparison: Compare URLs for equality
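To make the deduplication use case concrete, here is a small sketch that groups URLs by their normalized form while keeping the originals. toy_normalize is a deliberately simplified local stand-in for the API call (an assumption for this example), and group_duplicates is a hypothetical helper. In practice you would pass in a function that calls the normalization endpoint.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List
from urllib.parse import urlsplit, urlunsplit


def toy_normalize(url: str) -> str:
    """Hypothetical local stand-in for the API call; covers only a few rules."""
    parts = urlsplit(url)
    # Drop utm_* parameters and empty segments from the query string.
    query = "&".join(
        p for p in parts.query.split("&") if p and not p.lower().startswith("utm_")
    )
    path = parts.path.rstrip("/") or "/"
    # Passing "" as the last element drops the fragment.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, query, ""))


def group_duplicates(
    urls: Iterable[str], normalize: Callable[[str], str] = toy_normalize
) -> Dict[str, List[str]]:
    """Map each normalized URL to the list of original URLs that produced it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return dict(groups)


urls = [
    "https://EXAMPLE.COM/path",
    "https://example.com/path/",
    "https://example.com/path?utm_source=google",
    "https://example.com/other",
]
for key, originals in group_duplicates(urls).items():
    print(key, "<-", originals)
```

Keeping the original URLs in each group also lines up with the advice to store both the original and the normalized version.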
Best Practices
- Be consistent: Use the same normalization rules across your system
- Preserve meaningful params: Don't remove params that change content
- Test thoroughly: Some URLs break when normalized aggressively
- Store both versions: Keep original for reference, normalized for comparison
Try the URL Normalization API
Get your free API key and start normalizing URLs in seconds.