Let's say I want to put a non-ASCII character in an HTML file; for instance, '→'. Is there any reason why I should enter it as an entity such as '&rarr;' instead of just putting '→' in the HTML file? Assume my HTML file is encoded and transmitted in some Unicode format.
3 Answers
Those two final statements are big assumptions.
For example, we have a web app that uses AJAX in its literal sense: we use it for loading XML documents on the fly. If the XML response does not carry the correct encoding declaration (the charset in its Content-Type header), or lacks one altogether, then any non-ASCII characters (smart quotes, long dashes, some special whitespace, even the é in "Café") make Internet Explorer fall on its arse every single time. The AJAX request just fails and fires off a JavaScript error.
However, if we do a server-side replace of all the Unicode characters with their HTML entities, everything works just fine.
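The answer doesn't say how the server-side replace is implemented, so here is only a minimal sketch in Python of what such a step might look like. The function name `escape_non_ascii` is made up for illustration. It emits numeric character references rather than named entities, since an XML document only predefines a handful of named ones.

```python
def escape_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper, not the answerer's actual code. ASCII characters,
    including markup such as '<' and '&', are passed through unchanged.
    """
    # The "xmlcharrefreplace" error handler turns each unencodable character
    # into a decimal reference such as &#8594; instead of raising an error.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(escape_non_ascii("Café → done"))  # Caf&#233; &#8594; done
```

The output is pure ASCII, so it survives a missing or wrong charset declaration, at the cost of a less readable source.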
Of course, if your file is served with the correct Content-Type header, including its charset, then this shouldn't be a problem for any modern browser.
Just to add to the excellent accepted answer: on the whole, ASCII files are much more portable across various editors.
"However, if we do a server-side replace of all the Unicode characters with their HTML entities, everything works just fine."
This assumes all characters can be replaced with named HTML entities, which they can't. Use the correct headers and spot these issues (using the wrong header) early, instead of being confused when they occur later.
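As a rough illustration of "use the correct headers", the sketch below encodes an XML document as UTF-8 and labels the response to match. The helper name `build_xml_response` and the sample document are assumptions for the example; how the headers are actually sent depends on whatever server framework is in use.

```python
from typing import Dict, Tuple


def build_xml_response(document: str) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for an XML document sent as UTF-8.

    Hypothetical helper: the point is only that the charset named in
    Content-Type matches the encoding actually used for the body.
    """
    body = document.encode("utf-8")
    headers = {
        "Content-Type": "application/xml; charset=utf-8",
        # Length is counted in bytes, not characters.
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_xml_response("<name>Café →</name>")
print(headers["Content-Type"])  # application/xml; charset=utf-8
```

With the declaration and the bytes in agreement, the literal characters can stay in the document and no entity replacement is needed.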