URL Normalization API: The Complete Guide

Need to standardize URLs for comparison, deduplication, or storage? This guide covers URL normalization via API: the normalization rules, common pitfalls, and implementation examples.

What is URL Normalization?

URL normalization (also called canonicalization) is the process of transforming URLs into a consistent, standardized format. Multiple URLs can point to the same resource, and normalization helps identify them as equivalent.

For example, these URLs typically all point to the same page:

  • https://EXAMPLE.COM/path
  • https://example.com/path/
  • https://example.com/path?utm_source=google

Normalization Rules

Common normalization operations include:

Case Normalization

Convert the scheme and host to lowercase. Leave the path unchanged, because paths may be case-sensitive.
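As a rough local illustration of this rule (not the API's own implementation), the sketch below uses Python's standard urllib.parse. The function name lowercase_scheme_and_host is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_scheme_and_host(url: str) -> str:
    """Lowercase the scheme and host; leave path, query, and fragment as-is.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    # Keep any "user:password@" prefix untouched; only the host (and port)
    # portion of the network location is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = f"{userinfo}{at}{hostport.lower()}"
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


print(lowercase_scheme_and_host("HTTPS://EXAMPLE.COM/Path"))
# https://example.com/Path
```

Note that the path keeps its original casing, since /Path and /path can be different resources.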

Removing Default Ports

Remove :80 for HTTP and :443 for HTTPS.
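The rule can be sketched locally as follows. This is an assumption-laden illustration using urllib.parse, not how the API does it internally; strip_default_port and DEFAULT_PORTS are names invented for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports for the two schemes named in the rule above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def strip_default_port(url: str) -> str:
    """Drop the port only when it is the default for the URL's scheme.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.port is None or DEFAULT_PORTS.get(parts.scheme) != parts.port:
        return url  # no port, or a non-default port: leave the URL alone
    # Preserve any userinfo, then cut the ":port" suffix off the host part.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    host_only = hostport.rsplit(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{userinfo}{at}{host_only}"))


print(strip_default_port("https://example.com:443/path"))  # https://example.com/path
print(strip_default_port("http://example.com:8080/"))      # http://example.com:8080/
```

A port is removed only when it matches its own scheme, so https://example.com:80/ is left untouched.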

Removing Trailing Slashes

Standardize on one form: either always with or always without a trailing slash.
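Here is a minimal local sketch of this rule. The mode names mirror the "remove", "add", and "keep" values of the API's trailing_slash parameter, but the function apply_trailing_slash itself is hypothetical and the API may handle edge cases differently.

```python
from urllib.parse import urlsplit, urlunsplit


def apply_trailing_slash(url: str, mode: str = "remove") -> str:
    """Apply one trailing-slash policy to the path of a URL.

    Hypothetical helper for illustration only.
    """
    if mode not in ("remove", "add", "keep"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "keep":
        return url
    parts = urlsplit(url)
    path = parts.path
    if mode == "remove" and len(path) > 1 and path.endswith("/"):
        # Never strip the root path "/" itself.
        path = path.rstrip("/") or "/"
    elif mode == "add" and not path.endswith("/"):
        path += "/"
    return urlunsplit(parts._replace(path=path))


print(apply_trailing_slash("https://example.com/path/?a=1"))    # https://example.com/path?a=1
print(apply_trailing_slash("https://example.com/path", "add"))  # https://example.com/path/
```

The "add" mode in this sketch appends a slash to every path, including file-like ones such as /page.html, so a real implementation would likely need a stricter rule.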

Removing Tracking Parameters

Strip utm_*, fbclid, gclid, and other tracking parameters.
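To make the rule concrete, here is a small local sketch using urllib.parse. The set of names covers only utm_*, fbclid, and gclid from the text above; the full list the API strips is not documented here, and strip_tracking_params is a name chosen for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Only the parameters named in this guide; a real list would be longer.
TRACKING_NAMES = {"fbclid", "gclid"}
TRACKING_PREFIXES = ("utm_",)


def strip_tracking_params(url: str) -> str:
    """Remove tracking query parameters and keep everything else in order.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_NAMES
        and not key.lower().startswith(TRACKING_PREFIXES)
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_tracking_params("https://example.com/path?utm_source=google&name=test"))
# https://example.com/path?name=test
```

Because the query string is parsed and re-encoded, the remaining parameters may come back with slightly different percent-encoding than the input (for example, %20 becomes +).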

Note: Be careful with path normalization. Some servers treat /path and /path/ as different resources. The API allows configuring this behavior.

Using the URL Normalization API

TinyFn provides a single endpoint for URL normalization:

API Request

POST https://api.tinyfn.io/v1/url/normalize
Headers:
  X-API-Key: your-api-key
  Content-Type: application/json

{
  "url": "HTTPS://Example.com:443/path/?utm_source=google&name=test#section"
}

Response

{
  "original": "HTTPS://Example.com:443/path/?utm_source=google&name=test#section",
  "normalized": "https://example.com/path?name=test",
  "changes": [
    "lowercase_scheme",
    "lowercase_host",
    "remove_default_port",
    "remove_trailing_slash",
    "remove_tracking_params",
    "remove_fragment"
  ]
}
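One way to use the changes list is to flag results that deserve a second look. The sketch below parses the sample response shown above; the choice of which changes count as "risky" is purely an assumption made for this example, not something the API defines.

```python
import json

# Sample response copied from the example above.
SAMPLE_RESPONSE = """
{
  "original": "HTTPS://Example.com:443/path/?utm_source=google&name=test#section",
  "normalized": "https://example.com/path?name=test",
  "changes": [
    "lowercase_scheme",
    "lowercase_host",
    "remove_default_port",
    "remove_trailing_slash",
    "remove_tracking_params",
    "remove_fragment"
  ]
}
"""

# Hypothetical policy: changes that could alter which resource is addressed.
RISKY_CHANGES = {"remove_trailing_slash", "remove_fragment"}


def risky_changes(result: dict) -> list[str]:
    """Return the applied changes that this sketch treats as worth reviewing."""
    return sorted(RISKY_CHANGES.intersection(result.get("changes", [])))


result = json.loads(SAMPLE_RESPONSE)
print(result["normalized"])   # https://example.com/path?name=test
print(risky_changes(result))  # ['remove_fragment', 'remove_trailing_slash']
```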

Parameters

Parameter        Type     Description
url              string   URL to normalize (required)
remove_tracking  boolean  Remove tracking parameters (default: true)
remove_fragment  boolean  Remove hash fragment (default: true)
trailing_slash   string   "remove", "add", or "keep" (default: "remove")
sort_params      boolean  Sort query parameters alphabetically (default: true)
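The request example above only sends url, so where the optional parameters go is not shown. The sketch below assumes they are top-level fields in the same JSON body; treat that placement, and the helper name build_normalize_payload, as assumptions to verify against the API reference.

```python
import json


def build_normalize_payload(
    url: str,
    *,
    remove_tracking: bool = True,
    remove_fragment: bool = True,
    trailing_slash: str = "remove",
    sort_params: bool = True,
) -> dict:
    """Build a request body with the documented defaults filled in.

    Assumption: options are sent as top-level JSON fields next to "url".
    """
    if trailing_slash not in ("remove", "add", "keep"):
        raise ValueError('trailing_slash must be "remove", "add", or "keep"')
    return {
        "url": url,
        "remove_tracking": remove_tracking,
        "remove_fragment": remove_fragment,
        "trailing_slash": trailing_slash,
        "sort_params": sort_params,
    }


# Keep the fragment and the trailing slash, for example for in-page links.
payload = build_normalize_payload(
    "https://example.com/docs/#install",
    remove_fragment=False,
    trailing_slash="keep",
)
body = json.dumps(payload)  # string to send as the POST body
```

Validating trailing_slash on the client side catches typos before a request is spent on them.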

Code Examples

JavaScript / Node.js

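// Assumes a runtime with global fetch and top-level await
// (for example, an ES module in Node.js 18+).
const response = await fetch(
  'https://api.tinyfn.io/v1/url/normalize',
  {
    method: 'POST',
    headers: {
      'X-API-Key': 'your-api-key',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      url: 'HTTPS://Example.com/path/?utm_source=google'
    })
  }
);
// Fail loudly on HTTP errors instead of parsing an error body as a result.
if (!response.ok) {
  throw new Error(`Normalization request failed: ${response.status}`);
}
const { normalized } = await response.json();
console.log(normalized); // "https://example.com/path"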

Python

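import requests

response = requests.post(
    'https://api.tinyfn.io/v1/url/normalize',
    headers={'X-API-Key': 'your-api-key'},
    json={'url': 'HTTPS://Example.com/path/?utm_source=google'},
    timeout=10,  # avoid hanging indefinitely on network problems
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
normalized = response.json()['normalized']
print(normalized)  # "https://example.com/path"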

cURL

curl -X POST "https://api.tinyfn.io/v1/url/normalize" \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{"url": "HTTPS://Example.com/path/?utm_source=google"}'

Common Use Cases

  • URL Deduplication: Identify duplicate URLs in databases
  • SEO Analysis: Detect duplicate content issues
  • Web Crawling: Avoid crawling the same page multiple times
  • Analytics: Aggregate traffic to equivalent URLs
  • Link Comparison: Compare URLs for equality
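The deduplication use case can be sketched like this. The deduplicate function takes any normalizer as a callable, so in practice you would pass a wrapper around the API call; simple_normalize here is only a local stand-in invented for the example and applies far fewer rules than the API.

```python
from typing import Callable, Iterable
from urllib.parse import urlsplit, urlunsplit


def simple_normalize(url: str) -> str:
    """Local stand-in for the API: lowercase host, drop trailing slash and fragment."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, parts.query, ""))


def deduplicate(urls: Iterable[str], normalize: Callable[[str], str]) -> list[str]:
    """Keep the first original URL seen for each normalized form."""
    first_seen: dict[str, str] = {}
    for url in urls:
        first_seen.setdefault(normalize(url), url)
    return list(first_seen.values())


urls = [
    "https://EXAMPLE.COM/path",
    "https://example.com/path/",
    "https://example.com/other",
]
print(deduplicate(urls, simple_normalize))
# ['https://EXAMPLE.COM/path', 'https://example.com/other']
```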

Best Practices

  1. Be consistent: Use the same normalization rules across your system
  2. Preserve meaningful params: Don't remove params that change content
  3. Test thoroughly: Some URLs break when normalized aggressively
  4. Store both versions: Keep original for reference, normalized for comparison
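As one possible take on the "store both versions" practice, the sketch below keeps the original and normalized forms side by side in SQLite and indexes the normalized column for lookups. The table and function names are hypothetical, and the normalized values would come from the API in a real system.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store holding both forms of each URL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE links ("
        "id INTEGER PRIMARY KEY, "
        "original TEXT NOT NULL, "
        "normalized TEXT NOT NULL)"
    )
    # Comparisons and lookups go through the normalized form.
    conn.execute("CREATE INDEX idx_links_normalized ON links (normalized)")
    return conn


def save_link(conn: sqlite3.Connection, original: str, normalized: str) -> None:
    """Store the URL exactly as received, next to its normalized form."""
    conn.execute(
        "INSERT INTO links (original, normalized) VALUES (?, ?)",
        (original, normalized),
    )


def find_originals(conn: sqlite3.Connection, normalized: str) -> list[str]:
    """Return every original URL that maps to the given normalized form."""
    rows = conn.execute(
        "SELECT original FROM links WHERE normalized = ? ORDER BY id",
        (normalized,),
    )
    return [row[0] for row in rows]


conn = open_store()
save_link(conn, "https://EXAMPLE.COM/path", "https://example.com/path")
save_link(conn, "https://example.com/path/", "https://example.com/path")
print(find_originals(conn, "https://example.com/path"))
# ['https://EXAMPLE.COM/path', 'https://example.com/path/']
```

Keeping the original column means you can re-normalize later if your rules change.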

Try the URL Normalization API

Get your free API key and start normalizing URLs in seconds.

Get Free API Key
