encodeURIComponent vs encodeURI: URL Encoding Cheatsheet

Why URLs need encoding

URLs have a small alphabet. Letters, digits, a few punctuation marks, and that is it. Anything else (spaces, accents, special characters) has to be rewritten in a form the URL knows how to carry.

The standard form is “percent encoding”: each forbidden character becomes a percent sign followed by two hex digits. A space becomes %20, an accented é becomes %C3%A9, and so on.
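
To make the mechanism concrete, here is a minimal hand-rolled sketch of percent encoding. The function name percentEncode is hypothetical and exists only for illustration; it keeps the unreserved characters and rewrites every other UTF-8 byte as % plus two hex digits. In real code you would call the built-in functions instead.

```javascript
// Illustrative only: a hand-rolled percent encoder (hypothetical helper).
// Unreserved characters pass through; every other UTF-8 byte becomes %XX.
const UNRESERVED = /^[A-Za-z0-9._~-]$/;

function percentEncode(text) {
  let out = "";
  for (const byte of new TextEncoder().encode(text)) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // safe ASCII character, copied as-is
    } else {
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b"));       // a%20b
console.log(percentEncode("caf\u00e9")); // caf%C3%A9
```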

This sounds simple. In practice, a large share of the URL bugs in production come from encoding the wrong things, encoding the right things twice, or using the wrong function for the job.

The two JavaScript functions, and when each one wins

JavaScript ships with two URL encoding functions. They look similar and they are very different.

encodeURIComponent() is the one you almost always want. It encodes everything except letters, digits, and a small set of safe characters (- _ . ! ~ * ' ( )). It treats its input as a single piece (a parameter value, a query field, a path segment), and it makes that piece safe regardless of where it lands.

encodeURI() assumes its input is already a complete URL. It encodes spaces, non-ASCII characters, and a handful of unsafe ASCII characters such as quotes and angle brackets, but it deliberately leaves characters such as : / ? & = # alone because those are part of the URL structure.

The rule of thumb:

Encoding a value that goes into a parameter or path segment: encodeURIComponent.
Encoding a string that is supposed to be a complete URL with structure already in place: encodeURI.

Using encodeURI where you needed encodeURIComponent is the bug that leaves ?q=cats&dogs as ?q=cats&dogs instead of producing ?q=cats%26dogs. The user’s literal & inside their search becomes a real parameter separator, and the URL means something different from what you intended.
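
A short sketch of that bug, using the standard URLSearchParams parser to show roughly what a server would see. The variable names are illustrative only.

```javascript
// A search term typed by the user, containing a literal ampersand.
const userInput = "cats&dogs";

const wrong = "?q=" + encodeURI(userInput);          // & is left alone
const right = "?q=" + encodeURIComponent(userInput); // & becomes %26

// Parse both query strings the way a typical server-side parser would.
const wrongParams = new URLSearchParams(wrong);
const rightParams = new URLSearchParams(right);

console.log(wrong, "->", wrongParams.get("q")); // ?q=cats&dogs -> cats
console.log(right, "->", rightParams.get("q")); // ?q=cats%26dogs -> cats&dogs
console.log([...wrongParams.keys()]);           // [ 'q', 'dogs' ]
```

With encodeURI the search term is cut short and a stray dogs parameter appears.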

The plus sign ambiguity

URLs have two ways to encode a space: %20 and +.

%20 is correct everywhere. + is only valid in the query string portion of a URL (the bit after ?), and it is a legacy from HTML form encoding.

Most browsers handle both. Some servers do not. If you encode a path segment using + for spaces, some servers will deliver it as a literal +, and your file with the name My File.pdf becomes inaccessible because the URL goes to My+File.pdf.

Safe rule: always use %20 for spaces. encodeURIComponent does this correctly; form-style encoders such as URLSearchParams, and many that ship with frameworks, emit + instead.
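
The difference is easy to see side by side. This is a small sketch with illustrative variable names; the file name comes from the example above.

```javascript
const fileName = "My File.pdf";

// Component encoding: spaces become %20, safe in paths and queries.
const pathStyle = encodeURIComponent(fileName);

// Form-style encoding: URLSearchParams writes spaces as +.
const formStyle = new URLSearchParams({ name: fileName }).toString();

// decodeURIComponent does NOT turn + back into a space...
const decodedPlus = decodeURIComponent("My+File.pdf");

// ...while a form-style parser does.
const decodedForm = new URLSearchParams("name=My+File.pdf").get("name");

console.log(pathStyle);   // My%20File.pdf
console.log(formStyle);   // name=My+File.pdf
console.log(decodedPlus); // My+File.pdf
console.log(decodedForm); // My File.pdf
```

The same string therefore round-trips differently depending on which decoder sits on the other end, which is why + outside the query string is risky.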

The URL encoder on AldeaCode lets you convert in either direction and shows the difference between path style and query style encoding. Useful when you receive a malformed URL and need to figure out which encoding broke.

The reserved characters

The URL spec reserves certain characters because they have meaning in the URL structure:

: / ? # [ ] @ ! $ & ' ( ) * + , ; =

These characters are not always encoded. Some need encoding only in certain positions. ? is harmless inside the query string but ends the path if it appears unencoded in a path segment. & is fine outside the query string but separates parameters inside it.

encodeURIComponent is the safe choice because it encodes all of them except ! ' ( ) *, which rarely carry structural meaning in practice. It is the closest thing to “encode this, no matter where it goes”.

Two characters often confuse people:

~ (tilde): unreserved, never needs encoding, often encoded anyway by paranoid implementations.
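' (apostrophe): reserved under the current URL spec (RFC 3986), but encodeURIComponent leaves it alone and it rarely needs encoding. Some implementations encode it anyway. If your URL has %27 everywhere, that is an apostrophe that probably did not need to be encoded.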
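
To check which reserved characters encodeURIComponent actually touches, you can run the whole list through it. The encodeStrict helper below is a hypothetical name for a common workaround that also encodes ! ' ( ) * when a consumer insists on it; treat it as a sketch, not a required step.

```javascript
const reserved = ":/?#[]@!$&'()*+,;=";
const encoded = encodeURIComponent(reserved);
console.log(encoded); // %3A%2F%3F%23%5B%5D%40!%24%26'()*%2B%2C%3B%3D

// Hypothetical helper: also encode the five characters that
// encodeURIComponent leaves alone.
function encodeStrict(value) {
  return encodeURIComponent(value).replace(
    /[!'()*]/g,
    (ch) => "%" + ch.charCodeAt(0).toString(16).toUpperCase()
  );
}

console.log(encodeStrict("it's (fine)!*")); // it%27s%20%28fine%29%21%2A
console.log(encodeURIComponent("~"));       // ~
```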

Double encoding: the silent bug

The most common URL bug is encoding something twice. Once on the way out, once again by a framework or middleware that did not realise it was already encoded.

A space becomes %20. Encode the encoded value again, and %20 becomes %2520 (the percent itself gets encoded as %25). The URL still works in the sense that it parses, but the value the server receives is %20, not a space.

If your URLs end up with %25 sprinkled through them, you are double encoding somewhere. The fix is to find the second encoder and remove it, or to decode once before encoding.
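
The symptom and one possible fix, as a rough sketch. The encodeOnce helper is hypothetical and implements the "decode once before encoding" option; it assumes the input is either raw or encoded exactly once, and it cannot tell a raw value that happens to contain the literal text %20 from an already encoded one.

```javascript
const once = encodeURIComponent("a b"); // a%20b
const twice = encodeURIComponent(once); // a%2520b

console.log(once, twice);
console.log(decodeURIComponent(twice)); // a%20b  (not "a b")

// Hypothetical helper: decode first (if the value decodes cleanly),
// then encode exactly once.
function encodeOnce(value) {
  let decoded;
  try {
    decoded = decodeURIComponent(value);
  } catch {
    decoded = value; // not valid percent-encoding, treat as raw text
  }
  return encodeURIComponent(decoded);
}

console.log(encodeOnce("a b"));   // a%20b
console.log(encodeOnce("a%20b")); // a%20b
console.log(encodeOnce("100%"));  // 100%25
```

Removing the second encoder is still the cleaner fix; a helper like this only hides the problem.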

Encoding accents and non-Latin characters

UTF-8 is the standard for non-ASCII characters in URLs. The character é becomes %C3%A9 (two bytes of UTF-8, percent encoded as six characters total).

Some old systems encode accented characters in their Latin-1 form (%E9 for é) instead. Some lenient parsers accept both, but strict UTF-8 decoders do not: JavaScript’s decodeURIComponent throws a URIError on a lone %E9. When in doubt, use UTF-8 encoding, which is what encodeURIComponent produces.

For Asian characters, emojis, and other multi-byte UTF-8 characters, the rule is the same. encodeURIComponent handles them correctly. Each character expands to several percent-encoded bytes (most emojis take four bytes, which is twelve characters once encoded), so a URL with a few emojis can grow significantly.
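
A quick sketch of how much each kind of character grows, and of what happens when a Latin-1 style escape reaches a UTF-8 decoder. The characters are written as Unicode escapes so the example does not depend on the file encoding.

```javascript
const accent = encodeURIComponent("\u00e9");      // é, 2 bytes
const kanji = encodeURIComponent("\u65e5\u672c"); // two CJK characters, 3 bytes each
const emoji = encodeURIComponent("\u{1F600}");    // one emoji, 4 bytes

console.log(accent, accent.length); // %C3%A9 6
console.log(kanji, kanji.length);   // %E6%97%A5%E6%9C%AC 18
console.log(emoji, emoji.length);   // %F0%9F%98%80 12

// A Latin-1 style escape is not valid UTF-8, so decoding fails.
let latin1Error = null;
try {
  decodeURIComponent("%E9");
} catch (err) {
  latin1Error = err;
}
console.log(latin1Error instanceof URIError); // true
```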

A practical 30 second rule

For any value going into a URL:

const safe = encodeURIComponent(value);
const url = `https://example.com/api?q=${safe}`;

For a URL you assemble from scratch, encode each piece separately. Never trust a string from a user as already URL safe. Never run encodeURI on a value, only on a complete URL.
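
One way to follow that advice is a small builder that encodes every piece on its own. The buildUrl function and its parameters are hypothetical names for this sketch; it assumes the origin is trusted and already well formed, and it uses %20 rather than + for spaces.

```javascript
// Hypothetical helper: assemble a URL from a trusted origin,
// untrusted path segments, and untrusted query parameters.
function buildUrl(origin, pathSegments, params) {
  // Encode each path segment separately so a "/" inside a segment stays data.
  const path = pathSegments.map((segment) => encodeURIComponent(segment)).join("/");

  // Encode keys and values separately so "&" and "=" inside them stay data.
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");

  return query ? `${origin}/${path}?${query}` : `${origin}/${path}`;
}

const url = buildUrl("https://example.com", ["files", "My File.pdf"], {
  q: "cats&dogs",
  lang: "es",
});
console.log(url);
// https://example.com/files/My%20File.pdf?q=cats%26dogs&lang=es
```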

The URL encoder, the URL cleaner when you need to strip tracking parameters, and the find and replace for normalising URLs in batches all run in your browser, no upload. Two rules end most encoding bugs: use encodeURIComponent for values, and never encode the same string twice.
