URLs are a common attack vector for web applications. Attackers exploit improperly handled URLs to steal data, redirect users to malicious sites, or access internal services. This guide covers the most common URL-based vulnerabilities and how to prevent them.
Key Takeaways
- Never trust user-provided URLs—always validate scheme, host, and structure
- Block javascript:, data:, and vbscript: schemes to prevent XSS
- Use allowlists for redirect URLs to prevent open redirect attacks
- Validate URLs server-side, not just client-side
- Use URL parsing libraries instead of regex for validation
Common URL-Based Attacks
Before you can defend against attacks, you need to understand how they work. This section covers the four most common URL-based vulnerabilities: JavaScript URLs (XSS), open redirects, SSRF, and parameter pollution. For each attack, you'll learn the mechanism and see prevention code you can use directly.
JavaScript URLs (XSS)
The javascript: scheme executes JavaScript code when used in links, forms, or redirects. This is one of the most common XSS vectors:
<!-- Dangerous: User input as href -->
<a href="javascript:alert('XSS')">Click me</a>
<!-- Dangerous: Redirect to user-provided URL -->
<script>
window.location = userInput; // If userInput is "javascript:..."
</script>
<!-- Dangerous: Image onerror with javascript -->
<img src="x" onerror="location='javascript:alert(1)'">
Prevention: Block the javascript:, data:, and vbscript: schemes:
function isSafeUrl(url) {
  try {
    const parsed = new URL(url, window.location.origin);
    const dangerousSchemes = ['javascript:', 'data:', 'vbscript:'];
    return !dangerousSchemes.includes(parsed.protocol);
  } catch {
    return false;
  }
}

// Usage
const userUrl = getUserInput();
if (isSafeUrl(userUrl)) {
  window.location = userUrl;
} else {
  console.error('Blocked dangerous URL');
}
This validation function blocks the most dangerous schemes by checking the parsed URL's protocol. The try-catch handles malformed URLs gracefully. Use this or similar validation anywhere you process user-provided URLs—links, redirects, fetch targets, or dynamic content.
Open Redirect Attacks
Open redirects occur when an application redirects users to a URL from an untrusted source. Attackers use this for phishing:
# Legitimate login URL
https://yourbank.com/login?redirect=/dashboard
# Attacker's phishing URL
https://yourbank.com/login?redirect=https://evil.com/fake-login
# User sees yourbank.com in the URL but gets redirected to evil.com
Prevention: Use an allowlist of valid redirect destinations:
// Server-side validation (Node.js example)
function isValidRedirect(url) {
  try {
    // Relative URLs have no host of their own; they resolve against
    // the base and inherit its hostname, so they pass the check below
    const parsed = new URL(url, 'https://yoursite.com');
    // Only allow relative URLs or your own domains
    const allowedHosts = ['yoursite.com', 'www.yoursite.com'];
    return allowedHosts.includes(parsed.hostname);
  } catch {
    return false;
  }
}
// For relative URLs, also check the path
function isValidRelativeRedirect(path) {
  // Block protocol-relative URLs that bypass domain checks
  // (browsers also treat a leading backslash like a forward slash)
  if (path.startsWith('//') || path.startsWith('/\\')) return false;
  // Block javascript: etc.
  if (path.includes(':')) {
    const beforeColon = path.split(':')[0].toLowerCase();
    const dangerousSchemes = ['javascript', 'data', 'vbscript'];
    if (dangerousSchemes.includes(beforeColon)) return false;
  }
  // Only allow paths starting with /
  return path.startsWith('/');
}
The code provides two levels of validation: isValidRedirect() for absolute URLs and isValidRelativeRedirect() for paths. Note how we block protocol-relative URLs (//evil.com), which bypass simple domain checks by inheriting the current page's scheme while redirecting to a different host.
Server-Side Request Forgery (SSRF)
SSRF attacks trick servers into making requests to internal resources:
# Application fetches user-provided URLs for previews
POST /api/preview
{ "url": "https://example.com/article" }
# Attacker provides internal URLs
POST /api/preview
{ "url": "http://localhost:8080/admin" }
{ "url": "http://169.254.169.254/metadata" } # AWS metadata
{ "url": "http://internal-service.local/secret" }
Prevention: Validate and restrict URLs server-side:
import dns from 'dns';
import { promisify } from 'util';

const dnsLookup = promisify(dns.lookup);

async function isSafeToFetch(urlString) {
  try {
    const url = new URL(urlString);
    // Only allow HTTPS
    if (url.protocol !== 'https:') {
      return false;
    }
    // Block private/internal hostnames
    const blockedPatterns = [
      /^localhost$/i,
      /^127\.\d+\.\d+\.\d+$/,
      /^10\.\d+\.\d+\.\d+$/,
      /^172\.(1[6-9]|2\d|3[01])\.\d+\.\d+$/,
      /^192\.168\.\d+\.\d+$/,
      /^169\.254\.\d+\.\d+$/,
      /\.local$/i,
      /\.internal$/i,
    ];
    if (blockedPatterns.some(p => p.test(url.hostname))) {
      return false;
    }
    // Resolve DNS and check the IP as well
    const { address } = await dnsLookup(url.hostname);
    if (blockedPatterns.some(p => p.test(address))) {
      return false; // DNS resolved to a private IP
    }
    return true;
  } catch {
    return false;
  }
}
SSRF prevention requires both hostname and IP validation. The key insight is resolving DNS before validating—attackers can register domains that resolve to internal IPs. Checking the resolved address closes that hole, but it does not by itself stop DNS rebinding: the name can resolve differently when the fetch actually happens. To close that gap, connect to the exact IP you validated (pin it) rather than resolving a second time.
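One building block for that is a pure classifier for the resolved address, reusable both at validation time and when pinning the outbound connection. A minimal sketch, with the caveat that the isPrivateAddress name and IPv4-only scope are illustrative assumptions, not a complete blocklist:

```javascript
// Classify a resolved IPv4 address as private/internal before allowing
// an outbound fetch. Unparseable input is treated as unsafe by default.
function isPrivateAddress(ip) {
  const octets = ip.split('.').map(Number);
  if (octets.length !== 4 ||
      octets.some(o => Number.isNaN(o) || o < 0 || o > 255)) {
    return true; // fail closed on anything that is not a clean IPv4
  }
  const [a, b] = octets;
  return (
    a === 127 ||                          // loopback
    a === 10 ||                           // 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // 172.16.0.0/12
    (a === 192 && b === 168) ||           // 192.168.0.0/16
    (a === 169 && b === 254) ||           // link-local / cloud metadata
    a === 0                               // 0.0.0.0/8
  );
}
```

A production version would also need to handle IPv6 (::1, unique-local addresses) and IPv4-mapped IPv6 forms, which regex-style blocklists routinely miss.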
HTTP Parameter Pollution
Attackers add extra parameters to exploit inconsistent server-side handling:
# Original URL
/transfer?to=alice&amount=100
# Attacker adds duplicate parameter
/transfer?to=alice&amount=100&to=attacker
# Different servers/frameworks handle duplicates differently:
# - Some use first value (to=alice)
# - Some use last value (to=attacker)
# - Some combine them (to=alice,attacker)
Prevention: Use consistent parameter handling:
// Always use .get() for single values - it returns the first
const url = new URL(request.url);
const to = url.searchParams.get('to'); // First value only

// Or explicitly reject duplicates
function getUniqueParam(searchParams, key) {
  const values = searchParams.getAll(key);
  if (values.length > 1) {
    throw new Error(`Duplicate parameter: ${key}`);
  }
  return values[0];
}
The safest approach is to either always use the first value (get()) or explicitly reject duplicates. The getUniqueParam() helper shuts down duplicate-parameter attacks by throwing an error as soon as it detects tampering.
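To make the duplicate handling concrete, here is what URLSearchParams (a standard API, nothing assumed) reports for the polluted query string from the example above:

```javascript
// Inspect the polluted query string from the /transfer example
const polluted = new URLSearchParams('to=alice&amount=100&to=attacker');

const first = polluted.get('to');    // first value only
const all = polluted.getAll('to');   // every value, exposing the duplicate
const tampered = all.length > 1;     // flag the request for rejection
```

Because getAll() always surfaces every occurrence, a duplicate check at the top of a handler costs one line and removes any ambiguity about which value downstream code will see.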
Understanding attacks is half the battle. Now let's look at comprehensive validation patterns you can apply across your application.
URL Validation Best Practices
Good validation is your primary defense against URL-based attacks. This section covers the essential patterns: using proper parsing, validating schemes and hosts, and ensuring server-side validation. These patterns apply to any language or framework.
Use the URL API, Not Regex
URLs are complex. Regex-based validation is error-prone and can be bypassed:
// BAD: Regex is incomplete and bypassable
const urlRegex = /^https?:\/\/[^\s]+$/;
urlRegex.test('javascript:alert(1)//https://'); // false here, but slightly looser regexes pass it
urlRegex.test('https://evil.com\\@good.com');   // true, but the URL is misleading

// GOOD: Use the URL API
function validateUrl(input) {
  try {
    const url = new URL(input);
    // Check scheme
    if (!['http:', 'https:'].includes(url.protocol)) {
      return { valid: false, error: 'Invalid protocol' };
    }
    // Check host
    if (!url.hostname) {
      return { valid: false, error: 'Missing hostname' };
    }
    return { valid: true, url };
  } catch (e) {
    return { valid: false, error: 'Invalid URL format' };
  }
}
The regex example shows how easy it is to write incomplete validation. The URL API example is both simpler and more secure—it handles edge cases automatically and gives you clean access to URL components for further validation.
Always Validate the Scheme
function hasAllowedScheme(url, allowedSchemes = ['http:', 'https:']) {
  try {
    const parsed = new URL(url);
    return allowedSchemes.includes(parsed.protocol);
  } catch {
    return false;
  }
}

// For user-facing links, only allow http/https
if (!hasAllowedScheme(userUrl)) {
  throw new Error('Invalid URL scheme');
}

// For internal services, you might allow other schemes
const internalSchemes = ['http:', 'https:', 'amqp:', 'redis:'];
if (!hasAllowedScheme(serviceUrl, internalSchemes)) {
  throw new Error('Invalid service URL');
}
The function accepts a configurable list of allowed schemes, defaulting to http and https for user-facing URLs. For internal services, you can expand the allowlist to include protocols like redis or amqp—but never expand it for user-provided URLs.
Validate the Host
function isAllowedHost(url, allowedHosts) {
  try {
    const parsed = new URL(url);
    // Exact match
    if (allowedHosts.includes(parsed.hostname)) {
      return true;
    }
    // Subdomain match (e.g., allow *.example.com)
    for (const allowed of allowedHosts) {
      if (allowed.startsWith('*.')) {
        const domain = allowed.slice(2);
        if (parsed.hostname === domain ||
            parsed.hostname.endsWith('.' + domain)) {
          return true;
        }
      }
    }
    return false;
  } catch {
    return false;
  }
}

// Usage
const allowedHosts = ['api.example.com', '*.cdn.example.com'];
if (!isAllowedHost(userUrl, allowedHosts)) {
  throw new Error('Host not allowed');
}
The wildcard subdomain pattern (*.cdn.example.com) lets you allow any subdomain while blocking other domains. This is useful for CDNs or multi-tenant applications, but be careful—subdomain takeover vulnerabilities can turn this into an attack vector.
Always Validate Server-Side
Client-side validation can be bypassed. Always validate on the server:
// Express middleware for URL validation
function validateRedirectUrl(req, res, next) {
  const redirect = req.query.redirect;
  if (!redirect) {
    return next();
  }
  try {
    const url = new URL(redirect, `https://${req.hostname}`);
    // Only allow same-origin redirects
    if (url.hostname !== req.hostname) {
      return res.status(400).json({ error: 'Invalid redirect URL' });
    }
    // Block dangerous schemes
    if (url.protocol !== 'https:') {
      return res.status(400).json({ error: 'HTTPS required' });
    }
    req.validatedRedirect = url.pathname + url.search;
    next();
  } catch {
    return res.status(400).json({ error: 'Invalid URL format' });
  }
}
The middleware follows a consistent pattern: validate the redirect URL before using it, only allow same-origin redirects, and block dangerous schemes. Any server stack can provide the same guarantees, as long as its URL parsing library resolves relative URLs against a base the way JavaScript's URL API does here.
Validation prevents bad URLs from entering your system. Sanitization ensures URLs are safe when leaving your system.
URL Sanitization
Even validated URLs need proper handling when used in different contexts. This section covers encoding for URL construction, escaping for HTML, and stripping sensitive data from URLs before logging or display.
Properly Encode User Input
// WRONG: String concatenation without encoding
const searchUrl = `/search?q=${userInput}`;
// If userInput is "a&admin=true", URL becomes /search?q=a&admin=true
// RIGHT: Use URLSearchParams for automatic encoding
const url = new URL('/search', window.location.origin);
url.searchParams.set('q', userInput);
// If userInput is "a&admin=true", URL becomes /search?q=a%26admin%3Dtrue
The parameter pollution example shows why string concatenation is dangerous: an attacker could inject extra parameters like admin=true. Using URLSearchParams automatically encodes the ampersand and equals sign, making parameter injection impossible.
Escaping for HTML Context
// When inserting URLs into HTML, escape for the context

// In href attributes
function safeHref(url) {
  if (!isSafeUrl(url)) {
    return '#blocked';
  }
  return url
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;');
}

// In JavaScript strings
function safeJsString(url) {
  return url
    .replace(/\\/g, '\\\\')
    .replace(/'/g, "\\'")
    .replace(/"/g, '\\"')
    .replace(/</g, '\\u003C')
    .replace(/>/g, '\\u003E');
}

// Better: Use textContent or data attributes
element.dataset.url = url; // Set as a DOM attribute, no HTML parsing involved
element.href = url;        // DOM API handles encoding
Different contexts require different escaping. For href attributes, escape HTML entities. For JavaScript strings, escape quotes and angle brackets. The best approach is to use DOM APIs (dataset, href) which handle escaping automatically—avoid building HTML strings with user data.
Remove Credentials from URLs
function stripCredentials(urlString) {
  try {
    const url = new URL(urlString);
    url.username = '';
    url.password = '';
    return url.toString();
  } catch {
    return urlString;
  }
}

// Usage
const cleanUrl = stripCredentials('https://user:pass@example.com/path');
// "https://example.com/path"

// For logging/display, always strip credentials
console.log(`Fetching: ${stripCredentials(requestUrl)}`);
The URL API makes credential stripping simple—just set username and password to empty strings. Always strip credentials before logging, displaying URLs to users, or including them in error reports where they might be visible.
Beyond validation and sanitization, several browser APIs require special security handling when working with URLs.
Secure URL Handling Patterns
Browser APIs like window.open(), postMessage, and form submissions all involve URLs and require careful handling. This section provides secure patterns for each, protecting your users from cross-origin attacks.
Safe window.open()
// DANGEROUS: Opens any URL
window.open(userUrl);

// SAFE: Validate and use noopener
function safeOpen(url) {
  if (!isSafeUrl(url)) {
    console.error('Blocked dangerous URL');
    return null;
  }
  // noopener prevents the new page from accessing window.opener
  // noreferrer prevents sending the referrer header
  return window.open(url, '_blank', 'noopener,noreferrer');
}

// For links, use rel="noopener noreferrer"
// <a href={url} target="_blank" rel="noopener noreferrer">Link</a>
The noopener option is critical—without it, the opened page can access your page via window.opener and redirect it (potentially to a phishing page). The noreferrer option adds privacy by not sending the referrer header. Always use both for external links.
Safe postMessage Origins
// DANGEROUS: Accepts messages from any origin
window.addEventListener('message', (event) => {
  handleMessage(event.data); // Attacker can send messages from any site
});

// SAFE: Validate the origin
const ALLOWED_ORIGINS = ['https://trusted.example.com'];
window.addEventListener('message', (event) => {
  if (!ALLOWED_ORIGINS.includes(event.origin)) {
    console.warn(`Blocked message from: ${event.origin}`);
    return;
  }
  handleMessage(event.data);
});

// When sending, always specify the target origin
targetWindow.postMessage(data, 'https://trusted.example.com');
// NEVER use '*' as the target origin with sensitive data
The postMessage API is a common source of vulnerabilities. Always validate event.origin against an allowlist before processing messages, and always specify the exact target origin when sending—using '*' delivers the message no matter what origin the target window has navigated to, so a hijacked or redirected frame receives your data.
Validate Form Actions
// Validate form action before submission
document.querySelectorAll('form').forEach(form => {
  form.addEventListener('submit', (event) => {
    const action = form.action || window.location.href;
    if (!isSafeUrl(action)) {
      event.preventDefault();
      console.error('Blocked form submission to unsafe URL');
      return;
    }
    // Also validate the action is on your domain
    const url = new URL(action);
    if (url.hostname !== window.location.hostname) {
      event.preventDefault();
      console.error('Blocked cross-domain form submission');
    }
  });
});
This pattern adds a safety net for dynamically created or modified forms. By validating the action URL on submit, you catch cases where JavaScript or malicious input might have changed the form destination. This is especially important for forms with user-editable fields that might be reflected in the action URL.
For defense in depth, combine these patterns with Content Security Policy headers.
Content Security Policy
Content Security Policy is your browser-level defense against injection attacks. Even if an attacker bypasses your application-level validation, CSP can prevent the exploit from executing. This section shows CSP directives specifically relevant to URL security.
CSP provides an additional layer of defense against URL-based attacks:
# Block inline scripts and dangerous URL schemes
Content-Security-Policy:
default-src 'self';
script-src 'self' https://cdn.example.com;
form-action 'self';
base-uri 'self';
navigate-to 'self' https://*.example.com;
# Explanation:
# - default-src 'self' - Only load resources from same origin
# - script-src - Only allow scripts from self and your CDN
# - form-action - Only allow form submissions to same origin
# - base-uri - Prevent base tag injection
# - navigate-to - Restrict where the page can navigate (experimental; not yet supported by browsers)
This CSP configuration blocks inline scripts, restricts resource loading to your own origin and trusted CDN, prevents forms from submitting to external domains, and locks down the base URL. Start restrictive and loosen only as needed—it's easier to allow things than to discover you've blocked a legitimate feature.
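To actually send the header, most frameworks let you set it in one middleware. A minimal sketch, assuming an Express-style app; the directive list mirrors the policy above, minus the experimental navigate-to:

```javascript
// Build the policy once; directives are joined with '; ' per the CSP grammar
const CSP_POLICY = [
  "default-src 'self'",
  "script-src 'self' https://cdn.example.com",
  "form-action 'self'",
  "base-uri 'self'",
].join('; ');

// Express-style middleware: attach the header to every response
function cspMiddleware(req, res, next) {
  res.setHeader('Content-Security-Policy', CSP_POLICY);
  next();
}
```

In Express you would register it with app.use(cspMiddleware); libraries such as Helmet package the same idea with maintained defaults.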
Let's wrap up with a checklist you can use to audit your URL handling.
Security Checklist
Use this checklist when reviewing code that handles URLs. Each item links back to the relevant section in this guide. Print this out and keep it near your desk during code reviews.
| Check | Description | Priority |
|---|---|---|
| Block dangerous schemes | Reject javascript:, data:, vbscript: | Critical |
| Validate redirect URLs | Use allowlists, reject external domains | Critical |
| Use URL API for parsing | Avoid regex, handle edge cases properly | High |
| Server-side validation | Never trust client-side validation alone | Critical |
| Encode user input | Use URLSearchParams or equivalent | High |
| Strip credentials | Remove user:pass before logging/display | High |
| Validate SSRF targets | Block private IPs, localhost, metadata endpoints | Critical |
| Set CSP headers | Defense in depth against XSS | High |
| Use rel="noopener" | On all target="_blank" links | Medium |
| Validate postMessage origins | Check event.origin against allowlist | High |
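As a starting point for such an audit, the critical checks above can be folded into one helper. A sketch only—the auditUrl name and result shape are illustrative, not from any specific library:

```javascript
// Combine the checklist's critical checks for user-provided absolute URLs:
// parse with the URL API, reject dangerous schemes, reject embedded credentials.
function auditUrl(input) {
  let url;
  try {
    url = new URL(input); // URL API, not regex
  } catch {
    return { ok: false, reason: 'unparseable' };
  }
  if (!['http:', 'https:'].includes(url.protocol)) {
    return { ok: false, reason: 'scheme' }; // blocks javascript:, data:, vbscript:
  }
  if (url.username || url.password) {
    return { ok: false, reason: 'credentials' }; // strip or reject user:pass
  }
  return { ok: true, url };
}
```

Redirect-allowlist and SSRF checks are deliberately left out here, since they depend on your domain list and network layout; bolt them on after this baseline passes.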