RFC 1738: Uniform Resource Locators (URL) specification
The specification for URLs (RFC 1738, December 1994) poses a problem: it restricts the characters allowed in URLs to a limited subset of the US-ASCII character set:
"...Only alphanumerics [0-9a-zA-Z], the special characters "$-_.+!*'()," [not including the quotes - ed], and reserved characters used for their reserved purposes may be used unencoded within a URL."
HTML, on the other hand, allows the entire range of the ISO-8859-1 (ISO-Latin) character set to be used in documents, and HTML4 expands the allowable range to include all of the Unicode character set as well. Non-ISO-8859-1 characters (characters above FF hex/255 decimal in the Unicode set) cannot be used in URLs at all, because there is not yet a safe way to specify character set information in the URL content [RFC 2396].
URLs should be encoded everywhere in an HTML document that a URL is referenced to import an object (the A, APPLET, AREA, BASE, BGSOUND, BODY, EMBED, FORM, FRAME, IFRAME, ILAYER, IMG, ISINDEX, INPUT, LAYER, LINK, OBJECT, SCRIPT, SOUND, TABLE, TD, TH, and TR elements).
What characters need to be encoded and why?
How are characters URL encoded?
URL encoding of a character consists of a "%" symbol, followed by the two-digit hexadecimal representation (case-insensitive) of the ISO-Latin code point for the character.
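The rule above can be illustrated with a minimal Python sketch. The function names `url_encode_char` and `url_decode_char` are hypothetical, chosen only for this example. Python's `latin-1` codec is used as the ISO-Latin (ISO-8859-1) mapping, so characters above FF hex are rejected rather than encoded.

```python
def url_encode_char(ch: str) -> str:
    """Return "%" followed by the two-digit hex ISO-8859-1 code point of ch."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    # encode() raises UnicodeEncodeError for characters above U+00FF,
    # which have no ISO-8859-1 code point.
    code_point = ch.encode("latin-1")[0]
    return "%{:02X}".format(code_point)


def url_decode_char(token: str) -> str:
    """Reverse url_encode_char; hex digits are accepted in either case."""
    if len(token) != 3 or token[0] != "%":
        raise ValueError("expected a token of the form %XX")
    return bytes([int(token[1:], 16)]).decode("latin-1")


print(url_encode_char(" "))    # %20
print(url_encode_char("é"))    # %E9
print(url_decode_char("%e9"))  # é
```

The sketch emits uppercase hex digits, but because the representation is case-insensitive, the decoder accepts "%e9" and "%E9" alike.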
URL encoding converter
The box below converts content between its unencoded and encoded forms. The initial input is treated as "unencoded" (hit 'Convert' first to start in the encoded state). To allow actual URLs to be encoded, this converter does not encode URL syntax characters (the ";", "/", "?", ":", "@", "=", "#" and "&" characters). If you also need to encode these characters for any reason, see the "Reserved characters" table above for the appropriate encoded values.
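The converter's described behaviour can be approximated with a short Python sketch. This is not the page's own implementation. The names `encode_url`, `decode_url`, `ALLOWED_UNENCODED` and `URL_SYNTAX` are assumptions made for illustration. The sketch leaves alone the alphanumerics, the special characters quoted from RFC 1738, and the eight URL syntax characters listed above, and percent-encodes everything else using ISO-8859-1 code points.

```python
import string
from urllib.parse import unquote

# Characters RFC 1738 allows unencoded: alphanumerics plus $-_.+!*'(),
ALLOWED_UNENCODED = set(string.ascii_letters + string.digits + "$-_.+!*'(),")
# URL syntax characters the converter deliberately leaves untouched.
URL_SYNTAX = set(";/?:@=#&")


def encode_url(text: str) -> str:
    """Percent-encode every character outside the two sets above."""
    out = []
    for ch in text:
        if ch in ALLOWED_UNENCODED or ch in URL_SYNTAX:
            out.append(ch)
        else:
            # Raises UnicodeEncodeError for characters above U+00FF.
            out.append("%{:02X}".format(ch.encode("latin-1")[0]))
    return "".join(out)


def decode_url(text: str) -> str:
    """Turn %XX sequences back into ISO-8859-1 characters."""
    return unquote(text, encoding="latin-1")


encoded = encode_url("http://example.com/a b?name=José&x=1")
print(encoded)              # http://example.com/a%20b?name=Jos%E9&x=1
print(decode_url(encoded))  # http://example.com/a b?name=José&x=1
```

Note that a literal "%" in the input is itself encoded as "%25", so running `encode_url` on already-encoded text encodes it a second time.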