
ES6 string extensions


May 08, 2021 ES6




1. Unicode notation for characters

ES6 strengthens Unicode support. A character can be written in the form \uxxxx, where xxxx is the character's Unicode code point.

  "\u0061"
  // "a"

However, this notation is limited to characters with code points between \u0000 and \uFFFF. Characters outside this range must be written as two double-byte units (a surrogate pair).

  "\uD842\uDFB7"
  // "𠮷"

  "\u20BB7"
  // " 7"

The code above shows that if \u is followed directly by a value greater than 0xFFFF (such as \u20BB7), JavaScript reads it as \u20BB followed by 7. Because \u20BB frequently has no glyph in the display font, the output typically appears as a blank followed by a 7.

ES6 improves on this: put the code point in braces and the character is interpreted correctly.

  "\u{20BB7}"
  // "𠮷"

  "\u{41}\u{42}\u{43}"
  // "ABC"

  let hello = 123;
  hell\u{6F} // 123

  '\u{1F680}' === '\uD83D\uDE80'
  // true

In the code above, the last example shows that the brace notation is equivalent to the four-byte UTF-16 encoding.
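The equivalence between the brace notation and a surrogate pair can be checked by hand. The following is a minimal sketch of the standard UTF-16 arithmetic; the helper name toSurrogatePair is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: split a code point above 0xFFFF into the two
// UTF-16 code units (high and low surrogate) that represent it.
function toSurrogatePair(codePoint) {
  if (codePoint <= 0xFFFF || codePoint > 0x10FFFF) {
    throw new RangeError('expected a code point in 0x10000..0x10FFFF');
  }
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits of the offset
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits of the offset
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1F680);
console.log(high.toString(16), low.toString(16));            // d83d de80
console.log(String.fromCharCode(high, low) === '\u{1F680}'); // true
```

In everyday code String.fromCodePoint does this conversion for you; the sketch only makes the relationship visible.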

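With this notation, JavaScript has six ways to represent a character (the literal character itself plus the five escapes below).

  '\z' === 'z' // true
  '\172' === 'z' // true (octal escape; a SyntaxError in strict mode)
  '\x7A' === 'z' // true
  '\u007A' === 'z' // true
  '\u{7A}' === 'z' // true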

2. The string iterator interface

ES6 adds an iterator interface to strings (see the Iterator chapter for details), so a string can be traversed with a for...of loop.

  for (let codePoint of 'foo') {
    console.log(codePoint)
  }
  // "f"
  // "o"
  // "o"

Besides traversing strings, the biggest advantage of this iterator is that it recognizes code points larger than 0xFFFF, which a traditional for loop cannot.

  let text = String.fromCodePoint(0x20BB7);

  for (let i = 0; i < text.length; i++) {
    console.log(text[i]);
  }
  // " "
  // " "

  for (let i of text) {
    console.log(i);
  }
  // "𠮷"

In the code above, the string text contains only one character, but the for loop treats it as two characters (neither printable), whereas the for...of loop correctly recognizes the single character.
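Because the spread syntax and Array.from also use the string iterator, they give a quick way to count or list code points rather than UTF-16 code units. A small sketch:

```javascript
const text = String.fromCodePoint(0x20BB7);

console.log(text.length);      // 2 (UTF-16 code units)
console.log([...text].length); // 1 (code points, via the string iterator)

// List the code point of every character in a mixed string.
const codePoints = Array.from('a' + text, ch => ch.codePointAt(0).toString(16));
console.log(codePoints.join(' ')); // 61 20bb7
```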

3. Entering U+2028 and U+2029 directly

JavaScript strings allow a character to be entered directly or in its escaped form. For example, the Unicode code point of "中" is U+4E2D; you can type the character directly into a string or type its escaped form \u4e2d, and the two are equivalent.

  '中' === '\u4e2d' // true

However, JavaScript specifies five characters that cannot be used directly in a string and must be written in escaped form.

  • U+005C: reverse solidus (backslash)
  • U+000D: carriage return
  • U+2028: line separator
  • U+2029: paragraph separator
  • U+000A: line feed

For example, a string cannot contain a backslash directly; it must be escaped as \\ or \u005c.

The problem with this rule is that the JSON format allows U+2028 (line separator) and U+2029 (paragraph separator) to appear directly inside a string. As a result, JSON produced by a server could raise a syntax error when it was evaluated as JavaScript source, even though JSON.parse itself accepts these characters.

  const json = '"\u2028"';
  JSON.parse(json); // valid JSON
  eval(json);       // SyntaxError before ES2019

The JSON format is frozen (RFC 7159) and cannot be changed. To eliminate this error, ES2019 allows JavaScript strings to contain U+2028 (line separator) and U+2029 (paragraph separator) directly.

  const PS = eval("'\u2029'");

Under this change, the code above no longer throws an error.

Note that template strings already allowed both characters to be entered directly. Regular expressions still do not allow them, which is not a problem because JSON cannot contain regular expressions anyway.
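To see the ES2019 behavior on a given engine, one option is to evaluate a piece of JSON text that contains a raw U+2028 as JavaScript source. This is only a sketch and assumes an environment where eval is permitted; on engines that predate ES2019 the eval branch falls into the catch block.

```javascript
// A JSON text whose string value is a raw U+2028. The escape is used here
// only to build the character; the resulting text contains it unescaped.
const json = '"\u2028"';

// JSON.parse accepts the raw line separator.
const parsed = JSON.parse(json);
console.log(parsed === '\u2028'); // true

// Evaluating the same text as JavaScript source works on ES2019+ engines;
// older engines throw a SyntaxError here.
let evaluated;
try {
  evaluated = eval(json);
} catch (err) {
  evaluated = undefined;
}
console.log(evaluated === '\u2028');
```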

4. Changes to JSON.stringify()

According to the standard, JSON data must be encoded in UTF-8. However, JSON.stringify() could return strings that do not conform to the UTF-8 standard.

Specifically, the UTF-8 standard states that code points between 0xD800 and 0xDFFF cannot be used alone and must be used in pairs. For example, \uD834\uDF06 is two code points that must be paired together to represent the character 𝌆. This is a workaround for representing characters whose code points are greater than 0xFFFF. Using \uD834 or \uDF06 alone, or in reverse order, is illegal, because \uDF06\uD834 has no corresponding character.

The problem with JSON.stringify() is that it could return a lone code point between 0xD800 and 0xDFFF.

  JSON.stringify('\u{D834}') // "\u{D834}"

To ensure that legal UTF-8 is returned, ES2019 changed the behavior of JSON.stringify(). If it encounters a lone code point between 0xD800 and 0xDFFF, or a pairing that does not exist, it returns an escape sequence and leaves it to the application to decide what to do next.

  JSON.stringify('\u{D834}') // '"\\ud834"'
  JSON.stringify('\uDF06\uD834') // '"\\udf06\\ud834"'
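The escaped output still round-trips through JSON.parse, which can be confirmed with a short check. The sketch below assumes an engine that implements the ES2019 behavior.

```javascript
const lone = '\uD834';              // a lone high surrogate
const json = JSON.stringify(lone);

// The result is the 8-character text "\ud834" (quotes, backslash, u, 4 hex digits).
console.log(json.length);               // 8
console.log(JSON.parse(json) === lone); // true

// A valid surrogate pair is emitted as the character itself, not escaped.
console.log(JSON.stringify('\uD834\uDF06').length); // 4
```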

5. Template strings

In traditional JavaScript, an output template was usually written like this (jQuery is used below).

  $('#result').append(
    'There are <b>' + basket.count + '</b> ' +
    'items in your basket, ' +
    '<em>' + basket.onSale +
    '</em> are on sale!'
  );

This style is cumbersome, so ES6 introduced template strings to solve the problem.

  $('#result').append(`
    There are <b>${basket.count}</b> items
    in your basket, <em>${basket.onSale}</em>
    are on sale!
  `);

A template string is an enhanced string delimited by backticks (`). It can be used as an ordinary string, to define a multi-line string, or to embed variables in a string.

  // Ordinary string
  `In JavaScript '\n' is a line-feed.`

  // Multi-line string
  `In JavaScript this is
   not legal.`

  console.log(`string text line 1
  string text line 2`);

  // Embedding variables in a string
  let name = "Bob", time = "today";
  `Hello ${name}, how are you ${time}?`

The template strings in the code above are all delimited by backticks. To use a backtick inside a template string, escape it with a backslash.

  let greeting = `\`Yo\` World!`;

If you use a template string for a multi-line string, all spaces and indentation are preserved in the output.

  $('#list').html(`
  <ul>
    <li>first</li>
    <li>second</li>
  </ul>
  `);

In the code above, all spaces and line breaks inside the template string are preserved; for example, there is a line break before the <ul> tag. If you don't want this line break, you can remove it with the trim method.

  $('#list').html(`
  <ul>
    <li>first</li>
    <li>second</li>
  </ul>
  `.trim());

To embed a variable in a template string, write the variable name inside ${}.

  function authorize(user, action) {
    if (!user.hasPrivilege(action)) {
      throw new Error(
        // The traditional way to write this was
        // 'User '
        // + user.name
        // + ' is not authorized to do '
        // + action
        // + '.'
        `User ${user.name} is not authorized to do ${action}.`);
    }
  }

Inside the braces you can put any JavaScript expression: you can perform operations and reference object properties.

  let x = 1;
  let y = 2;

  `${x} + ${y} = ${x + y}`
  // "1 + 2 = 3"

  `${x} + ${y * 2} = ${x + y * 2}`
  // "1 + 4 = 5"

  let obj = {x: 1, y: 2};
  `${obj.x + obj.y}`
  // "3"

Functions can also be called inside a template string.

  function fn() {
    return "Hello World";
  }

  `foo ${fn()} bar`
  // foo Hello World bar

If the value in the braces is not a string, it is converted to a string by the usual rules. For example, if the braces contain an object, its toString method is called by default.
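The conversion rule is easy to observe with a few different value types. In this sketch the Point class is a made-up example that supplies its own toString method.

```javascript
// Hypothetical class with a custom toString, used only for illustration.
class Point {
  constructor(x, y) {
    this.x = x;
    this.y = y;
  }
  toString() {
    return `(${this.x}, ${this.y})`;
  }
}

const p = new Point(1, 2);
console.log(`p = ${p}`);             // "p = (1, 2)"  (custom toString)
console.log(`${[1, 2, 3]}`);         // "1,2,3"       (Array.prototype.toString)
console.log(`${{ a: 1 }}`);          // "[object Object]"
console.log(`${null} ${undefined}`); // "null undefined"
```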

If the variable in the template string is not declared, an error is reported.

  // The variable place is not declared
  let msg = `Hello, ${place}`;
  // Error

Because the inside of the braces is executed as JavaScript code, if the braces contain a string literal, it is output as is.

  `Hello ${'World'}`
  // "Hello World"

Template strings can even be nested.

  const tmpl = addrs => `
    <table>
    ${addrs.map(addr => `
      <tr><td>${addr.first}</td></tr>
      <tr><td>${addr.last}</td></tr>
    `).join('')}
    </table>
  `;

In the code above, another template string is embedded in a substitution of the outer template string. It is used as follows.

  const data = [
    { first: '<Jane>', last: 'Bond' },
    { first: 'Lars', last: '<Croft>' },
  ];

  console.log(tmpl(data));
  // <table>
  //
  //   <tr><td><Jane></td></tr>
  //   <tr><td>Bond</td></tr>
  //
  //   <tr><td>Lars</td></tr>
  //   <tr><td><Croft></td></tr>
  //
  // </table>

If you need a reference to the template string itself and want to evaluate it only when needed, you can write it as a function.

  let func = (name) => `Hello ${name}!`;
  func('Jack') // "Hello Jack!"

In the code above, the template string is the return value of a function. Calling the function is equivalent to evaluating the template string.

6. Example: template compilation

Let's look at an example that generates a real template from a template string.

  let template = `
  <ul>
    <% for(let i=0; i < data.supplies.length; i++) { %>
      <li><%= data.supplies[i] %></li>
    <% } %>
  </ul>
  `;

The code above places an ordinary template inside a template string. The template uses <% ... %> to hold JavaScript code and <%= ... %> to output a JavaScript expression.

How do we compile this template string?

One idea is to convert it into a string of JavaScript statements like the following.

  echo('<ul>');
  for(let i=0; i < data.supplies.length; i++) {
    echo('<li>');
    echo(data.supplies[i]);
    echo('</li>');
  };
  echo('</ul>');

This conversion uses regular expressions.

  let evalExpr = /<%=(.+?)%>/g;
  let expr = /<%([\s\S]+?)%>/g;

  template = template
    .replace(evalExpr, '`); \n  echo( $1 ); \n  echo(`')
    .replace(expr, '`); \n $1 \n  echo(`');

  template = 'echo(`' + template + '`);';

Then wrap template in a function and return it.

  let script =
  `(function parse(data){
    let output = "";

    function echo(html){
      output += html;
    }

    ${ template }

    return output;
  })`;

  return script;

Put the pieces above together into a template compilation function, compile.

  function compile(template){
    const evalExpr = /<%=(.+?)%>/g;
    const expr = /<%([\s\S]+?)%>/g;

    template = template
      .replace(evalExpr, '`); \n  echo( $1 ); \n  echo(`')
      .replace(expr, '`); \n $1 \n  echo(`');

    template = 'echo(`' + template + '`);';

    let script =
    `(function parse(data){
      let output = "";

      function echo(html){
        output += html;
      }

      ${ template }

      return output;
    })`;

    return script;
  }

The compile function is used as follows.

  let parse = eval(compile(template));
  div.innerHTML = parse({ supplies: [ "broom", "mop", "cleaner" ] });
  //   <ul>
  //     <li>broom</li>
  //     <li>mop</li>
  //     <li>cleaner</li>
  //   </ul>

7. Tagged templates

Template strings can do more than the above. A template string can follow a function name, and that function is then called to process the template string. This is known as the tagged template feature.

  alert`hello`
  // equivalent to
  alert(['hello'])

A tagged template is not really a template but a special form of function call. The "tag" is the function, and the template string that follows it is its argument.

However, if the template string contains variables, it is no longer a simple call: the template string is first processed into multiple arguments, and then the function is called.

  let a = 5;
  let b = 10;

  tag`Hello ${ a + b } world ${ a * b }`;
  // equivalent to
  tag(['Hello ', ' world ', ''], 15, 50);

In the code above, the template string is preceded by the identifier tag, which is a function. The value of the whole expression is the return value of tag after it has processed the template string.

The function tag receives several arguments in turn.

  function tag(stringArr, value1, value2){
    // ...
  }

  // equivalent to

  function tag(stringArr, ...values){
    // ...
  }

The first argument of tag is an array whose members are the parts of the template string that are not substitutions; that is, the substitutions occur between the first and second members of the array, between the second and third, and so on.

The remaining arguments of tag are the values of the substitutions. In this example the template string contains two substitutions, so tag receives the two arguments value1 and value2.

The actual values of all the arguments of tag are as follows.

  • First argument: ['Hello ', ' world ', '']
  • Second argument: 15
  • Third argument: 50

That is, the tag function is actually called as follows.

  tag(['Hello ', ' world ', ''], 15, 50)

We can write the tag function however we need. Here is one way to write it, together with the result of running it.

  let a = 5;
  let b = 10;

  function tag(s, v1, v2) {
    console.log(s[0]);
    console.log(s[1]);
    console.log(s[2]);
    console.log(v1);
    console.log(v2);

    return "OK";
  }

  tag`Hello ${ a + b } world ${ a * b}`;
  // "Hello "
  // " world "
  // ""
  // 15
  // 50
  // "OK"

Here is a more complex example.

  let total = 30;
  let msg = passthru`The total is ${total} (${total*1.05} with tax)`;

  function passthru(literals) {
    let result = '';
    let i = 0;

    while (i < literals.length) {
      result += literals[i++];
      if (i < arguments.length) {
        result += arguments[i];
      }
    }

    return result;
  }

  msg // "The total is 30 (31.5 with tax)"

The example above shows how to put the arguments back together in their original positions.

Using rest parameters, passthru can be written as follows.

  function passthru(literals, ...values) {
    let output = "";
    let index;
    for (index = 0; index < values.length; index++) {
      output += literals[index] + values[index];
    }

    output += literals[index]
    return output;
  }

An important application of tagged templates is filtering HTML strings to prevent users from entering malicious content.

  let message =
    SaferHTML`<p>${sender} has sent you a message.</p>`;

  function SaferHTML(templateData) {
    let s = templateData[0];
    for (let i = 1; i < arguments.length; i++) {
      let arg = String(arguments[i]);

      // Escape special characters in the substitution.
      s += arg.replace(/&/g, "&amp;")
              .replace(/</g, "&lt;")
              .replace(/>/g, "&gt;");

      // Don't escape special characters in the template.
      s += templateData[i];
    }
    return s;
  }

In the code above, the sender variable is typically supplied by the user. After it has been processed by the SaferHTML function, the special characters inside it are escaped.

  let sender = '<script>alert("abc")</script>'; // malicious code
  let message = SaferHTML`<p>${sender} has sent you a message.</p>`;

  message
  // <p>&lt;script&gt;alert("abc")&lt;/script&gt; has sent you a message.</p>

Another application of tagged templates is multilingual conversion (internationalization).

  i18n`Welcome to ${siteName}, you are visitor number ${visitorNumber}!`
  // "欢迎访问xxx,您是第xxxx位访问者!"
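The text does not show how an i18n tag might be implemented. One possible approach, sketched below, rebuilds a lookup key from the literal parts of the template and looks it up in a translation table. The translations table, the {n} placeholder convention, and the sample values are all assumptions made for this illustration; real internationalization libraries work differently.

```javascript
// Hypothetical translation table. Keys are the literal parts of the
// template joined with {n} placeholders for the substitutions.
const translations = {
  'Welcome to {0}, you are visitor number {1}!': '欢迎访问{0},您是第{1}位访问者!',
};

function i18n(strings, ...values) {
  // Rebuild a key such as "Welcome to {0}, you are visitor number {1}!".
  const key = strings.reduce((acc, part, i) => acc + '{' + (i - 1) + '}' + part);
  // Fall back to the original wording when no translation is known.
  const pattern = translations[key] ?? key;
  // Put each substitution value back in place of its placeholder.
  return pattern.replace(/\{(\d+)\}/g, (match, n) => String(values[Number(n)]));
}

const siteName = 'example.com';
const visitorNumber = 42;
const greeting = i18n`Welcome to ${siteName}, you are visitor number ${visitorNumber}!`;
console.log(greeting); // "欢迎访问example.com,您是第42位访问者!"
```

Because the placeholders are numbered, a translation is free to reorder the substitutions, which plain string concatenation cannot do.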

Template strings by themselves cannot replace template libraries such as Mustache, because they have no conditionals or loops, but with tag functions you can add these features yourself.

  // The hashTemplate function below
  // is a custom template-processing function
  let libraryHtml = hashTemplate`
    <ul>
      #for book in ${myBooks}
        <li><i>#{book.title}</i> by #{book.author}</li>
      #end
    </ul>
  `;

In addition, you can even use tagged templates to embed other languages in JavaScript.

  jsx`
    <div>
      <input
        ref='input'
        onChange='${this.handleChange}'
        defaultValue='${this.state.value}' />
        ${this.state.value}
     </div>
  `

The code above converts a DOM string into a React object through the jsx function. You can find a concrete implementation of the jsx function on GitHub.

Here is a hypothetical example of running Java code inside JavaScript through a java function.

  java`
  class HelloWorldApp {
    public static void main(String[] args) {
      System.out.println("Hello World!"); // Display the string.
    }
  }
  `
  HelloWorldApp.main();

The first argument of the template handler (the array of template strings) also has a raw property.

  console.log`123`
  // ["123", raw: Array[1]]

In the code above, the argument that console.log receives is actually an array. The array has a raw property that holds the original string with its escapes left unprocessed.

Take a look at the example below.

  tag`First line\nSecond line`

  function tag(strings) {
    console.log(strings.raw[0]);
    // strings.raw[0] is "First line\\nSecond line"
    // prints "First line\nSecond line"
  }

In the code above, the first argument of the tag function, strings, has a raw property that also points to an array. The members of that array correspond exactly to those of the strings array. For example, if the strings array is ["First line\nSecond line"], then the strings.raw array is ["First line\\nSecond line"]. The only difference between the two is that the backslashes in the string are escaped. For example, strings.raw treats \n as the two characters \ and n rather than as a line break. This is designed to make it easy to obtain the original template before its escapes are processed.
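The difference between the processed string and its raw form shows up directly in their lengths. The sketch below uses a made-up tag named show, plus the built-in String.raw tag, which returns the raw form of a template.

```javascript
// Hypothetical tag that returns both views of the first string part.
function show(strings) {
  return { cooked: strings[0], raw: strings.raw[0] };
}

const parts = show`First line\nSecond line`;
console.log(parts.cooked.length); // 22 (\n is one line-break character)
console.log(parts.raw.length);    // 23 (\ and n are two characters)

// The built-in String.raw tag yields the same raw text.
console.log(String.raw`First line\nSecond line` === parts.raw); // true
```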

8. Limitations of template strings

As mentioned earlier, tagged templates can embed other languages. However, template strings process escape sequences by default, which makes some languages impossible to embed.

For example, LaTeX can be embedded in a tagged template.

  function latex(strings) {
    // ...
  }

  let document = latex`
  \newcommand{\fun}{\textbf{Fun!}}  // works
  \newcommand{\unicode}{\textbf{Unicode!}} // error
  \newcommand{\xerxes}{\textbf{King!}} // error

  Breve over the h goes \u{h}ere // error
  `

In the code above, the template string assigned to the variable document is perfectly legal LaTeX, but the JavaScript engine reports errors. The cause is string escaping.

A template string treats \u00FF and \u{42} as Unicode escapes, so \unicode fails to parse; and \x56 is treated as a hexadecimal escape, so \xerxes also fails. In other words, \u and \x have their own meaning in LaTeX, but JavaScript tries to interpret them as escapes.

To solve this problem, ES2018 relaxed the restrictions on string escapes in tagged templates. When an illegal escape is encountered, the processed string becomes undefined instead of an error being thrown, and the original string is still available from the raw property.

  function tag(strs) {
    strs[0] === undefined
    strs.raw[0] === "\\unicode and \\u{55}";
  }
  tag`\unicode and \u{55}`

In the code above, the template string would previously have caused an error, but because the restrictions on string escapes were relaxed, the JavaScript engine sets the first string to undefined. The raw property still holds the original string, so the tag function can still process it.

Note that this relaxation applies only when a tagged template parses the string. In an untagged template string, an illegal escape still throws an error.

  let bad = `bad escape sequence: \unicode`; // Error
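With the relaxed rule, a tag that embeds another language can work from the raw strings alone. The following is a minimal sketch, assuming an ES2018+ engine; this latex tag and the cookedIsUndefined helper are written for illustration only and do no actual LaTeX processing.

```javascript
// Hypothetical tag: reassemble the template from strings.raw so that
// sequences such as \unicode or \textbf are kept exactly as written.
function latex(strings, ...values) {
  let out = strings.raw[0];
  values.forEach((value, i) => {
    out += String(value) + strings.raw[i + 1];
  });
  return out;
}

// Hypothetical helper showing that the processed string is undefined
// when the template contains an illegal escape.
function cookedIsUndefined(strings) {
  return strings[0] === undefined;
}

const label = 'Unicode!';
const source = latex`\newcommand{\unicode}{\textbf{${label}}}`;
console.log(source);                       // \newcommand{\unicode}{\textbf{Unicode!}}
console.log(cookedIsUndefined`\unicode`);  // true
```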