May 29, 2021
Most novice programmers run into garbled text at some point when writing HTML. So how do you fix it?
To display an HTML page correctly, the browser has to know which character set the page uses. The browser then decodes the bytes of your HTML with that character set. If it decodes them with a different one, the text comes out garbled.
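To see why the character set matters, here is a minimal sketch in Python (used only for illustration; the variable names are made up for this example). It encodes a string as UTF-8 and then decodes the same bytes with the wrong and with the right character set. A browser that picks the wrong character set garbles text in the same way.

```python
# Illustrative sketch: the same bytes decoded with two different character sets.
text = "café"

# The bytes an editor writes to disk when the file is saved as UTF-8.
raw_bytes = text.encode("utf-8")

# Decoding with the wrong character set yields garbled text.
garbled = raw_bytes.decode("iso-8859-1")

# Decoding with the character set that was actually used restores the text.
restored = raw_bytes.decode("utf-8")

print(garbled)   # cafÃ©
print(restored)  # café
```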
The earliest character set used on the World Wide Web was ASCII. ASCII covers the digits 0 to 9, the uppercase and lowercase English letters, and a number of special characters, but nothing beyond that, so it cannot represent most other languages.
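As a rough illustration of that limit, the Python sketch below uses a hypothetical helper, `is_ascii_only`, to check whether a string fits in ASCII. Plain English text does; accented or Chinese text does not.

```python
# Hypothetical helper: report whether a string can be encoded as 7-bit ASCII.
def is_ascii_only(text: str) -> bool:
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(is_ascii_only("Hello, World! 0123"))  # True
print(is_ascii_only("Grüße"))               # False
print(is_ascii_only("你好"))                 # False
```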
Unicode is already supported in XML, Java, JavaScript, and LDAP. It is developed by the Unicode Consortium, whose goal is to replace existing character sets with the standard Unicode Transformation Formats (UTFs).
Unicode can be implemented with different character encodings; the most commonly used are UTF-8 and UTF-16.
Character set | Description |
---|---|
UTF-8 | Characters in UTF-8 are 1 to 4 bytes long. UTF-8 can represent any character in the Unicode standard and is backward compatible with ASCII. UTF-8 is the preferred encoding for web pages and e-mail. |
UTF-16 | The 16-bit Unicode Transformation Format is a variable-length encoding that can represent the entire Unicode repertoire, using one or two 16-bit code units (2 or 4 bytes) per character. UTF-16 is used mainly inside operating systems and runtime environments, such as Microsoft Windows 2000/XP/2003/Vista/CE, Java, and .NET. |
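The byte lengths in the table can be checked with a short Python sketch. The sample characters are arbitrary choices for this example, and `utf-16-le` is used so that no byte order mark is added to the count.

```python
# Byte length of single characters in UTF-8 and UTF-16 (little-endian, no BOM).
samples = ["A", "é", "中", "😀"]

lengths = {
    ch: (len(ch.encode("utf-8")), len(ch.encode("utf-16-le")))
    for ch in samples
}

for ch, (utf8_len, utf16_len) in lengths.items():
    print(f"U+{ord(ch):04X}: UTF-8 {utf8_len} byte(s), UTF-16 {utf16_len} byte(s)")

# ASCII text is byte-for-byte identical in UTF-8 (backward compatibility).
ascii_compatible = "Hello".encode("utf-8") == "Hello".encode("ascii")
print(ascii_compatible)  # True
```

UTF-8 spends one byte on ASCII characters and up to four on others, while UTF-16 always uses two or four, which is one reason UTF-8 suits mostly-ASCII markup such as HTML.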
Declaring UTF-8 in the page's head tells the browser how to decode the page, which prevents garbled text as long as the file itself is actually saved as UTF-8. Code demo:
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>
<p>Text with non-ASCII characters: é, ü, 中文</p>
</body>
</html>
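The idea behind the declaration can be sketched in Python: look for a `<meta charset>` declaration near the start of the raw bytes and decode with the character set it names. The helper `decode_html` and the pattern `META_CHARSET` are hypothetical names for this example, and real browsers follow a more elaborate detection algorithm, so treat this as an illustration only.

```python
import re

# Hypothetical pattern: matches <meta charset="..."> in raw HTML bytes.
META_CHARSET = re.compile(
    rb"""<meta\s+charset\s*=\s*["']?([A-Za-z0-9_-]+)""", re.IGNORECASE
)

def decode_html(raw: bytes, fallback: str = "utf-8") -> str:
    """Decode HTML bytes using the declared charset, or a fallback."""
    # The declaration is expected early in the document, so scan only the start.
    match = META_CHARSET.search(raw[:1024])
    encoding = match.group(1).decode("ascii") if match else fallback
    try:
        return raw.decode(encoding, errors="replace")
    except LookupError:  # the declared label is not a known encoding
        return raw.decode(fallback, errors="replace")

page = '<html><head><meta charset="UTF-8"><title>Título</title></head></html>'
print(decode_html(page.encode("utf-8")))
```

The declaration only helps when it matches the bytes on disk: a file saved as ISO-8859-1 but declared as UTF-8 will still display incorrectly.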
ISO character sets are standard character sets defined by the International Organization for Standardization (ISO) for different alphabets.
The table below lists the ISO character sets used around the world.
Character set | Description | Used in |
---|---|---|
ISO-8859-1 | Latin alphabet part 1 | North America, Western Europe, Latin America, Caribbean, Canada, Africa |
ISO-8859-2 | Latin alphabet part 2 | Eastern Europe |
ISO-8859-3 | Latin alphabet part 3 | Southeastern Europe, Esperanto, and miscellaneous others |
ISO-8859-4 | Latin alphabet part 4 | Scandinavia and the Baltics (and others not covered by ISO-8859-1) |
ISO-8859-5 | Latin/Cyrillic part 5 | Languages that use the Cyrillic alphabet, such as Bulgarian, Belarusian, Russian, and Macedonian |
ISO-8859-6 | Latin/Arabic part 6 | Languages that use the Arabic alphabet |
ISO-8859-7 | Latin/Greek part 7 | Modern Greek, as well as mathematical symbols derived from Greek |
ISO-8859-8 | Latin/Hebrew part 8 | Languages that use the Hebrew alphabet |
ISO-8859-9 | Latin 5 part 9 | Turkish. Identical to ISO-8859-1 except that the Icelandic letters are replaced with Turkish ones. |
ISO-8859-10 | Latin 6 | Nordic languages, including Sami (Lappish) and Greenlandic Inuit |
ISO-8859-15 | Latin 9 (aka Latin 0) | Similar to ISO-8859-1, but the euro sign and a few other characters replace some less-used symbols |
ISO-2022-JP | Latin/Japanese part 1 | Japanese |
ISO-2022-JP-2 | Latin/Japanese part 2 | Japanese |
ISO-2022-KR | Latin/Korean part 1 | Korean |
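Because each ISO-8859 part assigns its own characters to the upper half of the byte range, the same byte can mean different things in different parts. The Python sketch below illustrates two of the differences mentioned in the table; `decode_byte` is a hypothetical helper written for this example.

```python
# Hypothetical helper: decode a single byte value with a given character set.
def decode_byte(value: int, charset: str) -> str:
    return bytes([value]).decode(charset)

# 0xA4 is the generic currency sign in ISO-8859-1 but the euro sign in ISO-8859-15.
print(decode_byte(0xA4, "iso-8859-1"))   # ¤
print(decode_byte(0xA4, "iso-8859-15"))  # €

# 0xF0 is the Icelandic eth in ISO-8859-1 but a Turkish letter in ISO-8859-9.
print(decode_byte(0xF0, "iso-8859-1"))   # ð
print(decode_byte(0xF0, "iso-8859-9"))   # ğ
```

This is why a page written in one ISO character set and read in another shows a mix of correct and wrong characters: the ASCII range is shared, and only the upper half differs.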