Ever noticed %20 in a URL? That's a space that's been URL encoded (also called percent-encoding). It might look like random gibberish, but it's actually a critical part of how the web works. This guide explains why we need encoding, how it works under the hood, and—most importantly—how to use it correctly in your code without introducing bugs.
Key Takeaways
- Use encodeURIComponent() for parameter values: it encodes every character that acts as a delimiter in a query string
- The URL API handles encoding automatically, so prefer it over manual encoding
- Double encoding (%2520 instead of %20) accounts for roughly 40% of URL encoding bugs
- Keep encoded URLs under 2,000 characters for maximum browser compatibility
- Always decode user-facing URLs for readability
Why URL Encoding Exists
Understanding why encoding exists helps you use it correctly. This section explains the historical and technical reasons behind URL encoding, showing you the specific characters that cause problems and why browsers need this translation layer to work reliably.
URLs were designed in the early days of the web with a very limited character set. The current specification (RFC 3986) allows only 66 unreserved characters: letters (A-Z, a-z), digits (0-9), and four special characters (- . _ ~). Everything else, including every non-ASCII character across Unicode's more than 1.1 million code points, needs to be converted into a format that URLs can safely carry.
This isn't just a technical limitation from the 1990s that we're stuck with. The restricted character set serves an important purpose: certain characters have special meanings in URLs. A ? marks the start of query parameters. An & separates parameters. A # indicates a fragment. If these characters appeared in your data, how would a browser know the difference between a ? that starts the query string and a ? that's part of a search term?
"URIs include a 'reserved' set of characters that are allowed but may have special meaning. These characters are: : / ? # [ ] @ ! $ & ' ( ) * + , ; ="
The Problem in Action
Let's say you're building a search feature and a user searches for "Tom & Jerry". Without encoding, you might construct a URL like this:
https://example.com/search?q=Tom & Jerry
This URL is broken. Here's what the browser actually sees:
- Parameter 1: q=Tom (with a trailing space)
- Parameter 2: Jerry (with a leading space and an empty value)
The browser interprets & as the separator between parameters, not as part of the search term. Your search returns results for "Tom " instead of "Tom & Jerry".
With proper encoding, the same search works correctly:
https://example.com/search?q=Tom%20%26%20Jerry
Now the browser knows that %20 is a space character and %26 is an ampersand character. Both are part of the value, not URL syntax.
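One way to see the difference for yourself is to hand both query strings to the standard URLSearchParams parser. This is a minimal sketch: the variable names are illustrative, and a server-side parser may differ in small details.

```javascript
// Parse the unencoded query string: the & splits it into two parameters.
const brokenParams = new URLSearchParams("q=Tom & Jerry");
const brokenEntries = [...brokenParams.entries()];
console.log(brokenEntries); // [["q", "Tom "], [" Jerry", ""]]

// Parse the encoded query string: one parameter, value intact.
const encodedParams = new URLSearchParams("q=Tom%20%26%20Jerry");
console.log(encodedParams.get("q")); // "Tom & Jerry"
```

The unencoded version silently produces a second parameter whose name is " Jerry", which is exactly the kind of bug that goes unnoticed until someone searches for text containing an ampersand.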
Why This Matters For Your Code
Data integrity
Without encoding, user input containing special characters will corrupt your URLs. Search queries, form data, and API parameters can all contain characters that break URL structure. Proper encoding ensures the data arrives intact on the other end.
Security
URL injection attacks exploit improper encoding. If an attacker can insert unencoded & or = characters, they might be able to add extra parameters to your URLs, potentially bypassing authentication or manipulating application state. Always encode user input.
Interoperability
Properly encoded URLs work everywhere—all browsers, all servers, all programming languages handle them consistently. Non-encoded special characters may work in some environments and fail mysteriously in others.
Now that you understand why encoding matters, let's look at the mechanics of how it actually works under the hood.
How URL Encoding Works
The encoding process converts characters into a universal format that every server and browser understands. This section breaks down the byte-level mechanics, showing you exactly how characters transform from human-readable text to their encoded equivalents.
"A percent-encoding mechanism is used to represent a data octet in a component when that octet's corresponding character is outside the allowed set or is being used as a delimiter."
The encoding process is straightforward: each character that needs encoding gets converted to a percent sign (%) followed by two hexadecimal digits representing the character's byte value.
For ASCII characters, this is simple—each character is one byte. A space character has the byte value 32, which is 20 in hexadecimal, so it becomes %20. An ampersand is byte value 38 (hex 26), so it becomes %26.
The table below shows common ASCII characters and their encoded forms. Each row demonstrates the conversion from character to byte value to hexadecimal to the final percent-encoded format. Understanding this mapping helps you recognize encoded characters when debugging URLs.
| Character | Byte Value | Hex | Encoded | Why it needs encoding |
|---|---|---|---|---|
| Space | 32 | 20 | %20 | Not allowed in URLs |
| & | 38 | 26 | %26 | Separates query parameters |
| = | 61 | 3D | %3D | Separates key from value |
| ? | 63 | 3F | %3F | Starts query string |
| # | 35 | 23 | %23 | Starts fragment identifier |
| % | 37 | 25 | %25 | Used for encoding itself |
| / | 47 | 2F | %2F | Path separator |
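The conversion in each table row can be sketched in a few lines. The function name percentEncodeAscii is hypothetical and only exists to illustrate the steps; in real code you would call the built-in encoding functions instead.

```javascript
// Percent-encode one ASCII character: character -> byte value -> two hex digits.
function percentEncodeAscii(ch) {
  const byteValue = ch.charCodeAt(0); // e.g. " " -> 32
  if (byteValue > 0x7f) {
    // Non-ASCII characters must be converted to UTF-8 bytes first.
    throw new RangeError("ASCII characters only");
  }
  const hex = byteValue.toString(16).toUpperCase().padStart(2, "0"); // 32 -> "20"
  return "%" + hex;
}

console.log(percentEncodeAscii(" ")); // "%20"
console.log(percentEncodeAscii("&")); // "%26"
console.log(percentEncodeAscii("=")); // "%3D"
```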
Unicode and Non-ASCII Characters
Modern URLs often contain characters from non-English languages, emoji, or other Unicode characters. With nearly 150,000 characters in Unicode 15.0, this covers everything from accented letters to Chinese characters to emoji. These require special handling because they're represented as multiple bytes in UTF-8 encoding: a single emoji code point can expand to 12 characters when encoded (4 bytes × 3 characters per %XX).
The process for Unicode characters is: first convert the character to its UTF-8 byte sequence, then encode each byte with the percent notation.
# Single UTF-8 byte (ASCII-compatible)
a → a (no encoding needed - safe character)
# Two UTF-8 bytes
é → %C3%A9 (bytes: 0xC3, 0xA9)
café → caf%C3%A9
# Three UTF-8 bytes
€ → %E2%82%AC (bytes: 0xE2, 0x82, 0xAC)
# Four UTF-8 bytes (emoji)
🎉 → %F0%9F%8E%89 (bytes: 0xF0, 0x9F, 0x8E, 0x89)
This example shows how character complexity affects encoding length. Single-byte ASCII characters like "a" stay unchanged, while multi-byte UTF-8 characters like emoji expand dramatically. A single emoji can become 12 characters when encoded, which is why URL length limits matter more than you might expect.
You don't need to do this conversion manually—encoding functions in every programming language handle it automatically. But understanding the process helps when debugging encoding issues.
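If you want to watch the two steps happen, here is a rough sketch that uses the standard TextEncoder to get the UTF-8 bytes and then percent-encodes each one. The function name percentEncodeUtf8 and the UNRESERVED pattern are assumptions made for illustration, not part of any library.

```javascript
// Characters RFC 3986 treats as unreserved; these pass through unchanged.
const UNRESERVED = /^[A-Za-z0-9\-._~]$/;

function percentEncodeUtf8(text) {
  const encoder = new TextEncoder(); // always produces UTF-8
  let out = "";
  // for...of walks the string by code point, so an emoji stays in one piece.
  for (const ch of text) {
    if (UNRESERVED.test(ch)) {
      out += ch;
      continue;
    }
    // Step 1: character -> UTF-8 bytes. Step 2: each byte -> %XX.
    for (const byte of encoder.encode(ch)) {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncodeUtf8("caf\u00E9")); // "caf%C3%A9" (café)
console.log(percentEncodeUtf8("€"));         // "%E2%82%AC"
console.log(percentEncodeUtf8("🎉"));        // "%F0%9F%8E%89"
```

For these inputs the output matches what encodeURIComponent() returns, which is a handy sanity check when you suspect a byte-level problem.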
With the mechanics understood, let's look at how to apply this knowledge in JavaScript, where choosing the right function is critical.
JavaScript: Choosing the Right Function
JavaScript provides two encoding functions that serve different purposes, and using the wrong one is one of the most common sources of URL bugs. This section explains exactly when to use each function, with practical examples you can copy directly into your code.
encodeURIComponent() — For Data Values
Use encodeURIComponent() when you're encoding a piece of data that will be inserted into a URL—like a search query, a parameter value, or a path segment. It encodes everything that could possibly have special meaning in a URL.
// Encoding a search query
const query = "Tom & Jerry";
const url = "https://example.com/search?q=" + encodeURIComponent(query);
// Result: "https://example.com/search?q=Tom%20%26%20Jerry" ✓
// Encoding a path segment
const filename = "my report (final).pdf";
const downloadUrl = "https://example.com/files/" + encodeURIComponent(filename);
// Result: "https://example.com/files/my%20report%20(final).pdf" ✓
// Encoding a redirect URL parameter
const returnUrl = "https://mysite.com/dashboard?tab=settings";
const loginUrl = "https://auth.com/login?redirect=" + encodeURIComponent(returnUrl);
// Result: The entire returnUrl is encoded, including its ? and = characters ✓
The code above demonstrates three common scenarios: encoding a search query, encoding a filename in a path, and encoding a complete URL as a parameter value. Notice how the characters that would otherwise act as URL syntax (spaces, &, ?, =, : and /) are percent-encoded, while the parentheses in the filename are left alone because encodeURIComponent() treats them as safe. The URL structure stays intact and the data is preserved exactly.
encodeURIComponent() encodes the 11 reserved characters that encodeURI() leaves alone (; , / ? : @ & = + $ #), along with spaces, quotes, angle brackets, and every non-ASCII character. It does not encode letters, digits, or - _ . ! ~ * ' ( ), which is why the parentheses in the filename above survive. Use it whenever you're inserting dynamic data into a URL.
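A quick check in a console shows which characters encodeURIComponent() actually touches. The second half is a sketch of a stricter variant for cases where a receiving system also expects ! ' ( ) * to be escaped; the name encodeRFC3986Component is hypothetical, and the approach follows a commonly used pattern of post-processing the built-in's output.

```javascript
const reservedSet = ";,/?:@&=+$#";
const untouchedSet = "-_.!~*'()";

console.log(encodeURIComponent(reservedSet));
// "%3B%2C%2F%3F%3A%40%26%3D%2B%24%23"
console.log(encodeURIComponent(untouchedSet) === untouchedSet); // true

// Stricter variant: additionally escape ! ' ( ) * after the built-in has run.
function encodeRFC3986Component(str) {
  return encodeURIComponent(str).replace(
    /[!'()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeRFC3986Component("my report (final).pdf"));
// "my%20report%20%28final%29.pdf"
```

Most servers accept either form, so the stricter variant is only worth reaching for when something downstream rejects or misreads the unescaped characters.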
encodeURI() — For Complete URLs with Spaces
Use encodeURI() only when you have a complete, valid URL that just happens to contain spaces or other characters that need encoding, but you want to keep the URL structure intact.
// A URL with spaces in the path
const url = "https://example.com/my documents/report.pdf";
encodeURI(url);
// Result: "https://example.com/my%20documents/report.pdf" ✓
// Notice: :// and / are NOT encoded - URL structure preserved
// Compare with encodeURIComponent - WRONG for complete URLs
encodeURIComponent(url);
// Result: "https%3A%2F%2Fexample.com%2Fmy%20documents%2Freport.pdf" ✗
// The entire URL is mangled - not usable
This comparison highlights the critical difference: encodeURI() preserves the URL structure (the slashes, colons, and question marks stay intact), while encodeURIComponent() treats everything as data and encodes it all. Using the wrong one destroys either your data or your URL.
encodeURI() does NOT encode: ; , / ? : @ & = + $ # because these are valid URL characters. This is exactly why it's wrong for encoding parameter values—it won't encode the characters that need encoding in that context.
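To make that concrete, here is a small sketch that runs the same parameter value through both functions and then parses the result. The URLs and variable names are made up for the example.

```javascript
// encodeURI() returns the reserved characters unchanged.
const reservedChars = ";,/?:@&=+$#";
console.log(encodeURI(reservedChars) === reservedChars); // true

const paramValue = "rock & roll";

// Wrong tool: the & survives and splits the query string.
const viaEncodeURI = "https://example.com/search?q=" + encodeURI(paramValue);
console.log(viaEncodeURI); // "https://example.com/search?q=rock%20&%20roll"
console.log(new URL(viaEncodeURI).searchParams.get("q")); // "rock "

// Right tool: the & becomes %26 and the value arrives intact.
const viaComponent = "https://example.com/search?q=" + encodeURIComponent(paramValue);
console.log(viaComponent); // "https://example.com/search?q=rock%20%26%20roll"
console.log(new URL(viaComponent).searchParams.get("q")); // "rock & roll"
```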
The Simple Rule
Building a URL piece by piece with dynamic data? Use encodeURIComponent() for each piece of data.
Have a complete URL string that just needs spaces fixed? Use encodeURI() (rare).
In practice, you'll use encodeURIComponent() about 99% of the time. According to HTTP Archive data, the average URL length is 77 characters, with query strings accounting for roughly 23% of that length. Complex web apps can have URLs with 50+ query parameters—all requiring proper encoding. When in doubt, use encodeURIComponent().
// Real-world example: Building an API URL with multiple parameters
const baseUrl = "https://api.example.com/search";
const searchTerm = "shoes (size > 10)";
const category = "men's footwear";
const priceRange = "50-100";
const url = `${baseUrl}?q=${encodeURIComponent(searchTerm)}&category=${encodeURIComponent(category)}&price=${encodeURIComponent(priceRange)}`;
// Result: https://api.example.com/search?q=shoes%20(size%20%3E%2010)&category=men's%20footwear&price=50-100
This real-world example shows template literals with multiple encoded parameters. Notice that the spaces and the greater-than sign are percent-encoded, while the parentheses and the apostrophe pass through unchanged because encodeURIComponent() treats them as safe. The base URL structure remains untouched. This is the pattern you'll use most often in production code.
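When the parameter list grows, the template literal gets hard to read. One possible refactor is a tiny helper that encodes both keys and values; buildQuery is a hypothetical name, and this sketch assumes every value can be converted to a string.

```javascript
// Hypothetical helper: join key/value pairs, encoding both sides.
function buildQuery(params) {
  return Object.entries(params)
    .map(([key, value]) =>
      `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");
}

const apiUrl = "https://api.example.com/search?" + buildQuery({
  q: "shoes (size > 10)",
  category: "men's footwear",
  price: "50-100",
});

console.log(apiUrl);
// https://api.example.com/search?q=shoes%20(size%20%3E%2010)&category=men's%20footwear&price=50-100
```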
While manual encoding works, there's a better approach that eliminates these decisions entirely.
The Better Way: URL API
Modern JavaScript offers built-in APIs that handle encoding automatically, eliminating the guesswork about which function to use. This section shows you the URL and URLSearchParams APIs, which are available in all current browsers and in Node.js and should be your default approach for URL manipulation.
"The URL interface is used to parse, construct, normalize, and encode URLs. It provides properties for easily reading and modifying the components of a URL."
This is the recommended approach for most use cases: less code, fewer bugs, and you don't have to think about which encoding function to use. Note that URLSearchParams writes spaces as + rather than %20, following the form-encoding convention; in a query string both decode to a space.
// Building a URL with parameters - encoding handled automatically
const url = new URL("https://example.com/search");
url.searchParams.set("q", "Tom & Jerry");
url.searchParams.set("category", "cartoons");
url.searchParams.set("year", "1940s");
console.log(url.toString());
// "https://example.com/search?q=Tom+%26+Jerry&category=cartoons&year=1940s"
// Reading parameters - decoding handled automatically
const params = new URLSearchParams(window.location.search);
const query = params.get("q"); // "Tom & Jerry" - decoded automatically
The code above demonstrates the key advantage: you work with plain strings and the API handles all encoding and decoding. The ampersand in "Tom & Jerry" gets encoded to %26 automatically, and when you read it back, you get the original string without any manual decoding.
The URL API also handles edge cases that are easy to get wrong with manual string concatenation, like multiple values for the same parameter:
const url = new URL("https://example.com/filter");
// Adding multiple values for the same parameter
url.searchParams.append("color", "red");
url.searchParams.append("color", "blue");
url.searchParams.append("size", "large");
console.log(url.toString());
// "https://example.com/filter?color=red&color=blue&size=large"
// Reading all values for a parameter
const colors = url.searchParams.getAll("color"); // ["red", "blue"]
This example shows how the API handles multiple values for the same parameter, a common pattern for filters and checkboxes. The append() method adds values without replacing, and getAll() retrieves them as an array. This is much cleaner than manual string manipulation.
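The same API also edits existing URLs and copes with non-ASCII text. The following sketch starts from an already-built URL, removes a parameter, and adds a Unicode value; the URL and parameter names are illustrative.

```javascript
const filterUrl = new URL("https://example.com/filter?color=red&color=blue&size=large");

// delete() removes every value stored under the name.
filterUrl.searchParams.delete("color");
console.log(filterUrl.searchParams.has("color")); // false

// set() accepts raw text; "caf\u00E9" is café, followed by a space and an emoji.
filterUrl.searchParams.set("q", "caf\u00E9 🎉");

console.log(filterUrl.toString());
// "https://example.com/filter?size=large&q=caf%C3%A9+%F0%9F%8E%89"

// Reading the value back returns the original text.
console.log(filterUrl.searchParams.get("q") === "caf\u00E9 🎉"); // true
```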
JavaScript isn't the only language with encoding functions. Let's see how other languages handle the same challenges.
Encoding in Other Languages
URL encoding isn't just a JavaScript concern: every backend language needs to handle it correctly. This section walks through a ready-to-use Python example and points to the equivalent built-ins in PHP and Go.
Every language has encoding functions, but the naming varies. Here's how to do it correctly in common languages:
from urllib.parse import quote, quote_plus, urlencode
# quote() - similar to encodeURIComponent, but leaves / unencoded by default
query = "Tom & Jerry"
encoded = quote(query) # "Tom%20%26%20Jerry"
# quote_plus() - like quote, but spaces become + instead of %20
encoded_plus = quote_plus(query) # "Tom+%26+Jerry"
# urlencode() - build query string from dict (recommended for multiple params)
params = {"q": "Tom & Jerry", "category": "cartoons", "year": "1940s"}
query_string = urlencode(params) # "q=Tom+%26+Jerry&category=cartoons&year=1940s"
# For path segments, use quote with safe="" to encode slashes too
path_segment = "my/file name.pdf"
encoded_path = quote(path_segment, safe="") # "my%2Ffile%20name.pdf"
Each language has its own conventions: Python distinguishes between quote() for paths and quote_plus() for query strings; PHP's http_build_query() builds complete query strings from arrays; Go's url.Values type provides a map-like interface for parameters. The key insight is to use the built-in query building functions rather than manual string concatenation.
Even with the right functions, encoding bugs are common. Let's examine the most frequent mistakes and how to avoid them.
Common Mistakes and How to Fix Them
Even experienced developers make encoding mistakes. This section covers the four most common problems, with clear before-and-after examples showing what goes wrong and how to fix it. Understanding these patterns will save you hours of debugging.
URL encoding bugs are among the most common issues in web development. A study of 500+ StackOverflow questions about URL encoding found that double encoding accounts for roughly 40% of all encoding-related bugs.
1. Double Encoding
This happens when you encode a string that's already encoded. The % characters get encoded again, turning %20 into %2520 (because % encodes to %25).
// The mistake: encoding an already-encoded URL
const alreadyEncoded = "https://example.com/search?q=hello%20world"; // Already encoded
const broken = encodeURI(alreadyEncoded);
// Result: "https://example.com/search?q=hello%2520world" ✗
// The %20 became %2520 - double encoded!
// The fix: only encode raw, unencoded values
const query = "hello world"; // Raw value, not encoded
const url = "https://example.com/search?q=" + encodeURIComponent(query);
// Result: "https://example.com/search?q=hello%20world" ✓
The fix is straightforward: track whether your data is already encoded, and only encode raw values. If you're receiving a value from an external source and are unsure of its state, decode it first with decodeURIComponent(), then re-encode when building your URL.
How to spot it: Look for %25 in your URLs, especially %2520 (double-encoded space) or %253D (double-encoded equals sign). If you see these, something is being encoded twice.
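That check is easy to automate in a log scanner or a unit test. The sketch below is only a heuristic, and looksDoubleEncoded is a hypothetical helper name: a correctly encoded value whose raw text literally contains "%20" will also match, so treat a hit as a prompt to investigate rather than proof of a bug.

```javascript
// "%25" followed by two hex digits usually means a "%XX" sequence was encoded again.
const DOUBLE_ENCODED = /%25[0-9A-Fa-f]{2}/;

function looksDoubleEncoded(url) {
  return DOUBLE_ENCODED.test(url);
}

console.log(looksDoubleEncoded("https://example.com/search?q=hello%2520world")); // true
console.log(looksDoubleEncoded("https://example.com/search?q=hello%20world"));   // false
console.log(looksDoubleEncoded("https://example.com/sale?discount=100%25"));     // false
```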
2. Using encodeURI for Parameter Values
encodeURI() deliberately doesn't encode characters like &, =, and ? because they're valid in URLs. But when these characters appear in your data, they need to be encoded.
// The mistake: using encodeURI for a parameter value
const callback = "https://mysite.com/callback?token=abc";
const brokenUrl = "https://auth.com/login?redirect=" + encodeURI(callback);
// Result: "https://auth.com/login?redirect=https://mysite.com/callback?token=abc"
// The ? and = in callback are NOT encoded, so the value is ambiguous; an & or # in it would cut the value short ✗
// The fix: use encodeURIComponent for data
const fixedUrl = "https://auth.com/login?redirect=" + encodeURIComponent(callback);
// Result: "https://auth.com/login?redirect=https%3A%2F%2Fmysite.com%2Fcallback%3Ftoken%3Dabc" ✓
The comparison shows how encodeURI() leaves structural characters intact (which breaks the URL when they appear in data), while encodeURIComponent() properly encodes everything. When your data itself contains URLs, as is common with OAuth redirect URIs, you must use encodeURIComponent().
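The damage becomes visible as soon as the nested URL carries more than one parameter of its own. This sketch uses a made-up callback with a second parameter and parses both results with the URL API to show where each piece ends up.

```javascript
const callbackUrl = "https://mysite.com/callback?token=abc&next=/home";

// encodeURI leaves the inner & alone, so "next" leaks into the outer URL.
const withEncodeURI = new URL("https://auth.com/login?redirect=" + encodeURI(callbackUrl));
console.log(withEncodeURI.searchParams.get("redirect")); // "https://mysite.com/callback?token=abc"
console.log(withEncodeURI.searchParams.get("next"));     // "/home"

// encodeURIComponent keeps the whole callback inside one parameter.
const withComponent = new URL("https://auth.com/login?redirect=" + encodeURIComponent(callbackUrl));
console.log(withComponent.searchParams.get("redirect") === callbackUrl); // true
console.log(withComponent.searchParams.get("next"));                     // null
```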
3. Plus Sign vs. %20 Confusion
Spaces can be encoded as either %20 or +. Both usually work, but they come from different standards:
- %20 is the URL standard (RFC 3986) and is always safe
- + comes from HTML form encoding (application/x-www-form-urlencoded) and is only valid in query strings
The problem arises when your data contains actual plus signs:
// If your data contains a plus sign, it MUST be encoded
const equation = "1+1=2";
// URLSearchParams encodes + as %2B (correct)
const params = new URLSearchParams();
params.set("eq", equation);
console.log(params.toString()); // "eq=1%2B1%3D2" ✓
// When decoded, you get back "1+1=2" correctly
// Without proper encoding, the + is interpreted as a space
// "eq=1+1=2" would decode as "eq=1 1=2" - wrong!
The takeaway: if your data might contain actual plus signs (math equations, phone numbers with country codes like +1), always use URLSearchParams which handles this correctly. The plus sign ambiguity is one of the most subtle encoding bugs because it only appears with specific data.
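The decoding side has the same trap: decodeURIComponent() knows nothing about the form-encoding convention, so it leaves + as a literal plus sign. Here is a minimal sketch of the difference, with a hypothetical decodeFormValue helper for the cases where you have to parse a form-encoded string by hand.

```javascript
const formEncoded = "Tom+%26+Jerry";

// decodeURIComponent treats + as an ordinary character.
console.log(decodeURIComponent(formEncoded)); // "Tom+&+Jerry"

// URLSearchParams applies form decoding, so + becomes a space.
console.log(new URLSearchParams("q=" + formEncoded).get("q")); // "Tom & Jerry"

// Hypothetical helper for manual parsing: turn + into a space, then decode.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, " "));
}

console.log(decodeFormValue("Tom+%26+Jerry")); // "Tom & Jerry"
console.log(decodeFormValue("1%2B1%3D2"));     // "1+1=2"
```

The order matters in the helper: replacing + before decoding means an encoded %2B is still restored to a real plus sign afterwards.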
4. Forgetting to Decode
When you receive URL-encoded data (from query parameters, form submissions, etc.), you need to decode it before using it:
// Reading from URL parameters - decoding is automatic with URLSearchParams
const params = new URLSearchParams(window.location.search);
const searchTerm = params.get("q"); // Automatically decoded ✓
// If you're parsing manually, don't forget to decode
const rawQuery = "q=Tom%20%26%20Jerry";
const value = rawQuery.split("=")[1]; // "Tom%20%26%20Jerry" - still encoded!
const decoded = decodeURIComponent(value); // "Tom & Jerry" ✓
The key lesson: prefer URLSearchParams which decodes automatically, and always decode before displaying values to users or using them in your application logic. Showing Tom%20%26%20Jerry to a user is a poor experience.
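One caveat when decoding by hand: decodeURIComponent() throws a URIError on malformed input such as a stray % sign. A defensive wrapper along these lines can help; safeDecode is a hypothetical name, and falling back to the raw string is just one possible policy.

```javascript
// Decode a value, falling back to the raw string if it is not valid percent-encoding.
function safeDecode(value) {
  try {
    return decodeURIComponent(value);
  } catch (err) {
    if (err instanceof URIError) {
      return value; // malformed input: return it unchanged
    }
    throw err; // anything else is unexpected
  }
}

console.log(safeDecode("Tom%20%26%20Jerry")); // "Tom & Jerry"
console.log(safeDecode("100%"));              // "100%"
```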
Beyond correctness, there's another practical concern: URL length. Encoding can dramatically increase your URL size.
Browser and Server URL Length Limits
URLs with extensive encoding can become surprisingly long, and every environment has limits. This section covers the practical constraints you'll encounter across browsers and servers, helping you design URLs that work everywhere.
Encoding can significantly increase URL length—a single Unicode character can expand from 1 to 12 characters. Keep these limits in mind:
The table below shows URL length limits for major browsers and servers. Note the wide variation—while Chrome supports multi-megabyte URLs, older systems and many servers have much stricter limits.
| Browser/Server | Maximum URL Length |
|---|---|
| Chrome | 2MB (2,097,152 characters) |
| Firefox | 64KB (65,536 characters) |
| Safari | 80KB (approximately) |
| Internet Explorer / Edge Legacy | 2,083 characters |
| Apache (default) | 8,190 characters |
| Nginx (default) | 8KB (8,192 characters) |
| IIS | 16,384 characters |
For maximum compatibility across all browsers and servers, keep encoded URLs under 2,000 characters. If you need to pass more data, consider using POST requests with a request body instead.
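A length check is cheap to add wherever you build URLs from user data. This sketch measures the encoded URL against the conservative 2,000-character budget suggested above; MAX_URL_LENGTH and checkUrlLength are hypothetical names.

```javascript
const MAX_URL_LENGTH = 2000; // conservative budget, not a hard browser limit

// Build the URL with the URL API, then measure its encoded form.
function checkUrlLength(baseUrl, params) {
  const url = new URL(baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return { length: url.href.length, ok: url.href.length < MAX_URL_LENGTH };
}

const shortCheck = checkUrlLength("https://example.com/search", { q: "Tom & Jerry" });
console.log(shortCheck.ok); // true

// 200 emoji look short on screen but encode to 12 characters each.
const longCheck = checkUrlLength("https://example.com/search", { q: "🎉".repeat(200) });
console.log(longCheck.ok); // false
```

If the check fails, moving the data into a POST body is usually simpler than trying to trim the URL.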
URL Encoding Reference Table
For quick reference when debugging or manually checking URLs, this table lists the 30 most commonly encoded characters. Bookmark this section; you'll refer to it often when investigating encoding issues.
| Character | Name | Encoded | Common Use Case |
|---|---|---|---|
| (space) | Space | %20 | Search queries, file names |
| ! | Exclamation | %21 | Sometimes left unencoded |
| " | Double quote | %22 | JSON data in URLs |
| # | Hash/Pound | %23 | Fragment identifiers in data |
| $ | Dollar | %24 | Currency values |
| % | Percent | %25 | Percentage values |
| & | Ampersand | %26 | Text with "and" symbol |
| ' | Apostrophe | %27 | Names like O'Brien |
| ( | Open paren | %28 | Mathematical expressions |
| ) | Close paren | %29 | Mathematical expressions |
| * | Asterisk | %2A | Wildcards, multiplication |
| + | Plus | %2B | Addition, phone numbers |
| , | Comma | %2C | Lists, number formatting |
| / | Slash | %2F | Dates, paths in data |
| : | Colon | %3A | Times, port numbers in data |
| ; | Semicolon | %3B | List separators |
| < | Less than | %3C | Comparisons, HTML in data |
| = | Equals | %3D | Equations, assignments |
| > | Greater than | %3E | Comparisons, HTML in data |
| ? | Question mark | %3F | Questions in text |
| @ | At sign | %40 | Email addresses |
| [ | Open bracket | %5B | Array notation |
| \ | Backslash | %5C | Windows paths, escapes |
| ] | Close bracket | %5D | Array notation |
| ^ | Caret | %5E | Regex patterns |
| ` | Backtick | %60 | Template literals |
| { | Open brace | %7B | JSON in URLs |
| \| | Pipe | %7C | Separators, filters |
| } | Close brace | %7D | JSON in URLs |
| ~ | Tilde | %7E | Usually safe (unreserved) |
Testing Your Encoding
The best way to catch encoding bugs is to test with challenging inputs before deployment. Each input below tests a different encoding scenario, from basic spaces to Unicode emoji. Run your parameter values through your encoding logic and compare against the expected output; if any of these fail, you have an encoding bug.
| Test Input | Should Encode To (in parameter value) |
|---|---|
| hello world | hello%20world |
| a=1&b=2 | a%3D1%26b%3D2 |
| 100% | 100%25 |
| https://x.com/?a=1 | https%3A%2F%2Fx.com%2F%3Fa%3D1 |
| Tom+Jerry | Tom%2BJerry |
| café | caf%C3%A9 |
| 🎉 | %F0%9F%8E%89 |
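The table translates directly into an automated check. The sketch below runs each case through encodeURIComponent(); to test your own code, you could swap in whatever encoding function your application uses. The names encodingCases and encodingFailures are illustrative.

```javascript
// [input, expected encoding as a parameter value]
const encodingCases = [
  ["hello world", "hello%20world"],
  ["a=1&b=2", "a%3D1%26b%3D2"],
  ["100%", "100%25"],
  ["https://x.com/?a=1", "https%3A%2F%2Fx.com%2F%3Fa%3D1"],
  ["Tom+Jerry", "Tom%2BJerry"],
  ["caf\u00E9", "caf%C3%A9"], // café
  ["🎉", "%F0%9F%8E%89"],
];

// Collect every case where encoding or the decode round trip goes wrong.
const encodingFailures = encodingCases.filter(([input, expected]) => {
  const actual = encodeURIComponent(input);
  return actual !== expected || decodeURIComponent(actual) !== input;
});

console.log(encodingFailures.length === 0 ? "all cases pass" : encodingFailures);
```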