May 27, 2021 19:00 XML
XML documents can contain non-ASCII characters, such as the Norwegian æ, ø, å or the French ê, è, é.
To avoid errors, specify the XML encoding in the document, or save XML files as Unicode.
When you load an XML document, you can get two different errors that indicate encoding problems:
An invalid character was found in text content.
You get this error if your XML contains non-ASCII characters and the file was saved as single-byte ANSI (or ASCII) with no encoding specified.
Switch from current encoding to specified encoding not supported.
You get this error if your XML file was saved as double-byte Unicode (UTF-16) but declares an 8-bit encoding (windows-1252, ISO-8859-1, UTF-8).
You also get it in the opposite case: the file was saved as single-byte ANSI (or ASCII) but declares a double-byte encoding (UTF-16).
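The three mismatches described above can be reproduced outside a browser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. It assumes that "ANSI" means the windows-1252 code page (cp1252 in Python), and the helper name try_parse is made up for this illustration. Python's parser reports these conditions in its own wording, so the messages it prints will not match the ones quoted above.

```python
import xml.etree.ElementTree as ET

BODY = "<note><message>Norwegian: æøå. French: êèé</message></note>"


def try_parse(data: bytes) -> str:
    """Return 'ok' if the bytes parse as XML, otherwise the parser's error text."""
    try:
        ET.fromstring(data)
        return "ok"
    except ET.ParseError as exc:
        return f"error: {exc}"


# Case 1: single-byte file with non-ASCII characters and no encoding declared.
# (Assumption: "ANSI" is taken to be windows-1252, i.e. Python's cp1252.)
ansi_no_declaration = ('<?xml version="1.0"?>' + BODY).encode("cp1252")

# Case 2: file saved as UTF-16 that declares a single-byte encoding.
utf16_declares_latin1 = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY
).encode("utf-16")

# Case 3: single-byte file that declares UTF-16.
latin1_declares_utf16 = (
    '<?xml version="1.0" encoding="UTF-16"?>' + BODY
).encode("latin-1")

# Control: UTF-8 bytes need no declaration, because UTF-8 is the XML default.
utf8_no_declaration = ('<?xml version="1.0"?>' + BODY).encode("utf-8")

for label, data in [
    ("ANSI, nothing declared", ansi_no_declaration),
    ("UTF-16, ISO-8859-1 declared", utf16_declares_latin1),
    ("ISO-8859-1, UTF-16 declared", latin1_declares_utf16),
    ("UTF-8, nothing declared", utf8_no_declaration),
]:
    print(f"{label}: {try_parse(data)}")
```

Only the control case should parse. The other three fail before any element content is delivered, which is why a browser shows an error page instead of the note.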
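Older versions of Windows Notepad save files as single-byte ANSI (the system code page, for example windows-1252) by default. Since Windows 10 version 1903, Notepad defaults to UTF-8.
If you select Save As..., you can choose ANSI, UTF-8, Unicode (UTF-16), or Unicode big endian.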
Save the following XML three times: as ANSI, as UTF-8, and as Unicode (note that the document does not contain an encoding attribute).
<?xml version="1.0"?>
<note>
  <from>Jani</from>
  <to>Tove</to>
  <message>Norwegian: æøå. French: êèé</message>
</note>
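To see what actually changes between the three saves, it can help to look at the bytes rather than the rendered page. The sketch below encodes the same note in memory with Python's built-in codecs. The mapping of Notepad's labels to codec names (ANSI to cp1252, Unicode to utf-16) is an assumption for a Western European Windows setup.

```python
# The sample note, written out with an XML declaration that names no encoding.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<note>\n"
    "  <from>Jani</from>\n"
    "  <to>Tove</to>\n"
    "  <message>Norwegian: æøå. French: êèé</message>\n"
    "</note>\n"
)

# Assumed mapping from Notepad's "Save As" labels to Python codec names.
CODECS = {"ANSI": "cp1252", "UTF-8": "utf-8", "Unicode": "utf-16"}

encoded = {label: SAMPLE.encode(codec) for label, codec in CODECS.items()}

for label, data in encoded.items():
    # Show the size and the first four bytes of each variant in hex.
    print(f"{label:8} {len(data):4} bytes, first bytes: {data[:4].hex(' ')}")
```

The ANSI and UTF-8 variants are identical except for the six accented letters, which take one byte each in cp1252 and two bytes each in UTF-8. The UTF-16 variant uses two bytes for every character and starts with a byte order mark, which is what lets a parser recognize it even though no encoding is declared.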
Drag each file into your browser and look at the result. Different browsers display different results.
Experiment with different encoding declarations:
<?xml version="1.0" encoding="us-ascii"?>
<?xml version="1.0" encoding="windows-1252"?>
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml version="1.0" encoding="UTF-8"?>
<?xml version="1.0" encoding="UTF-16"?>
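The same experiment can be run as a script instead of by hand. This is a rough sketch, again using Python's xml.etree.ElementTree rather than a browser, so the results describe that parser only. The function name outcome and the cp1252 stand-in for "ANSI" are assumptions made for the illustration. It combines every declaration with every way of saving the file and reports whether the message survives intact.

```python
import xml.etree.ElementTree as ET

TEXT = "Norwegian: æøå. French: êèé"
DECLARED = ["us-ascii", "windows-1252", "ISO-8859-1", "UTF-8", "UTF-16"]
# Assumed mapping from Notepad's "Save As" labels to Python codec names.
SAVED_AS = {"ANSI": "cp1252", "UTF-8": "utf-8", "Unicode": "utf-16"}


def outcome(declared: str, codec: str) -> str:
    """Declare one encoding, save with another, and classify what the parser does."""
    document = (
        f'<?xml version="1.0" encoding="{declared}"?>'
        f"<note><message>{TEXT}</message></note>"
    )
    try:
        message = ET.fromstring(document.encode(codec)).findtext("message")
    except ET.ParseError:
        return "parse error"
    # A successful parse is not enough: the text may have been decoded wrongly.
    return "ok" if message == TEXT else "garbled text"


for declared in DECLARED:
    row = ", ".join(
        f"{label}: {outcome(declared, codec)}" for label, codec in SAVED_AS.items()
    )
    print(f"{declared:13} {row}")
```

The "garbled text" result is the one to watch for. When a UTF-8 file declares ISO-8859-1, every byte is a valid ISO-8859-1 character, so the document parses without any error and the accented letters silently turn into two-character sequences.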