Nginx basic data structure


May 23, 2021 Nginx Getting started



The basic data structure

In pursuit of maximum efficiency, the author of Nginx implemented a number of distinctive Nginx-style data structures and common functions. For example, Nginx provides strings that carry their own length, as well as ngx_copy, a string copy function optimized according to compiler options. Therefore, when writing Nginx modules, we should call the APIs provided by Nginx wherever possible, even though some of them are merely macro wrappers around glibc functions. In this section, we introduce string, list, buffer, chain, and other basic data structures, together with techniques and caveats for using the related APIs.

ngx_str_t

The string wrapper and the API for string operations are found in ngx_string.h|c under src/core in the Nginx source tree. Nginx provides a length-carrying string structure, ngx_str_t, defined as follows:

    typedef struct {
        size_t      len;
        u_char     *data;
    } ngx_str_t;

In this structure, data points to the first character of the string data, and the end of the string is determined by len rather than by a terminating '\0'. Handling strings in Nginx code is therefore quite different from what we are used to. Always remember that these strings do not end with '\0', and manipulate them through the string API provided by Nginx whenever possible.

So what does Nginx gain from this? First, because the length is stored explicitly, the string length does not have to be recomputed again and again. Second, Nginx can reference the same piece of string memory repeatedly: data can point into any memory and len marks the end, so there is no need to make a private copy of the string (if the string had to end with '\0' and the original could not be modified, a copy would be unavoidable). Among the members of the ngx_http_request_t structure there are many such strings, for example request_line, uri, and args. Their data pointers point into the buffer that received the request, so args, for instance, does not need to be copied out. This avoids a great deal of unnecessary memory allocation and copying.
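The zero-copy referencing described above can be sketched outside Nginx. This is a minimal illustration, not Nginx code: demo_str_t is a stand-in for ngx_str_t, and split_uri is a hypothetical helper that produces two views into one receive buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char u_char;

typedef struct {
    size_t   len;
    u_char  *data;
} demo_str_t;                      /* stand-in for ngx_str_t */

/*
 * Splits "path?args" into two views of the same buffer.
 * Nothing is copied: both results point into line.
 */
void
split_uri(u_char *line, size_t len, demo_str_t *uri, demo_str_t *args)
{
    u_char  *q;

    assert(line != NULL && uri != NULL && args != NULL);

    q = memchr(line, '?', len);

    uri->data = line;

    if (q == NULL) {
        uri->len = len;
        args->len = 0;
        args->data = NULL;
        return;
    }

    uri->len = (size_t) (q - line);
    args->data = q + 1;
    args->len = len - uri->len - 1;
}
```

Because uri and args share the buffer, writing through either of them changes what the other sees, which is why the next paragraph urges caution when modifying strings.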

Because of this property, strings in Nginx must be modified with care. Before modifying a string, consider whether it may be modified at all and whether the change will affect other references to the same memory. We will see this when we look at ngx_unescape_uri later. Nginx strings also cause a practical problem: most of the system API functions provided by glibc expect a string terminated by '\0', so str->data cannot be passed to them directly. The usual practice is to allocate str->len + 1 bytes, copy the string, and set the last byte to '\0'. A hackier approach is to back up the byte immediately after the last character, overwrite it with '\0', and restore it from the backup after the call. That only works if you are sure the byte is writable and lies within allocated memory, and it is generally not recommended. Next, let us look at the string API provided by Nginx.
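The "allocate len + 1 bytes and copy" practice can be sketched as follows. It is an illustration under assumptions: demo_str_t stands in for ngx_str_t, str_to_cstr is a hypothetical helper, and malloc is used only so that the sketch compiles on its own (a module would normally take the memory from a pool).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char u_char;

typedef struct {
    size_t   len;
    u_char  *data;
} demo_str_t;                      /* stand-in for ngx_str_t */

/* Returns a NUL-terminated copy of s, or NULL on failure; the caller frees it. */
char *
str_to_cstr(const demo_str_t *s)
{
    char  *p;

    assert(s != NULL);

    p = malloc(s->len + 1);
    if (p == NULL) {
        return NULL;
    }

    if (s->len > 0) {
        memcpy(p, s->data, s->len);
    }

    p[s->len] = '\0';              /* the extra byte holds the terminator */

    return p;
}
```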

    #define ngx_string(str)     { sizeof(str) - 1, (u_char *) str }

ngx_string(str) is a macro that constructs an Nginx string from an ordinary string str that ends with '\0'. The argument must be a string literal, because the macro uses the sizeof operator to compute the length.

    #define ngx_null_string     { 0, NULL }

ngx_null_string initializes a variable, at its definition, as an empty string: len is 0 and data is NULL.

    #define ngx_str_set(str, text)                                               \
        (str)->len = sizeof(text) - 1; (str)->data = (u_char *) text

ngx_str_set sets the string str to text. text must be a string literal, because the length is computed with sizeof.

    #define ngx_str_null(str)   (str)->len = 0; (str)->data = NULL

ngx_str_null sets the string str to an empty string: len is 0 and data is NULL.

These four macros must be used with care. ngx_string and ngx_null_string expand to the "{ , }" form, so they can only be used as initializers in a definition, for example:

    ngx_str_t str = ngx_string("hello world");
    ngx_str_t str1 = ngx_null_string;

Used as shown below, they cause an error. This comes from the C rules for assigning to structure variables, which are not covered here.

    ngx_str_t str, str1;
    str = ngx_string("hello world");    // compile error
    str1 = ngx_null_string;             // compile error

In this case, use ngx_str_set and ngx_str_null instead:

    ngx_str_t str, str1;
    ngx_str_set(&str, "hello world");    
    ngx_str_null(&str1);

With the compound literals introduced in C99, you can also write:

    ngx_str_t str, str1;
    str  = (ngx_str_t) ngx_string("hello world");
    str1 = (ngx_str_t) ngx_null_string;

Also note that the argument passed to ngx_string and ngx_str_set must be a string literal; otherwise you get unexpected results. The macros use sizeof() internally, so if a u_char * is passed in, sizeof yields the size of the pointer rather than the length of the string. For example:

    ngx_str_t str;
    u_char *a = (u_char *) "hello world";
    ngx_str_set(&str, a);    // problem: len is computed from the pointer size

It is also worth noting that ngx_str_set and ngx_str_null each expand to two statements, so when one of them is used on its own as the body of an if/for/while statement, braces are required. Without braces, code such as the following goes wrong:

    ngx_str_t str;
    if (cond)
        ngx_str_set(&str, "true");     // problem: braces are needed
    else
        ngx_str_set(&str, "false");    // problem: braces are needed
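Both pitfalls can be reproduced outside Nginx. In the sketch below, demo_str_t and demo_str_set are stand-ins that copy the shape of ngx_str_t and ngx_str_set so the code compiles alone; the three set_from_* functions are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char u_char;

typedef struct {
    size_t   len;
    u_char  *data;
} demo_str_t;                      /* stand-in for ngx_str_t */

/* Same two-statement expansion as ngx_str_set. */
#define demo_str_set(str, text)                                              \
    (str)->len = sizeof(text) - 1; (str)->data = (u_char *) text

/* Literal argument: sizeof sees the array type char[12], so len becomes 11. */
void
set_from_literal(demo_str_t *s)
{
    assert(s != NULL);
    demo_str_set(s, "hello world");
}

/* Pointer argument: sizeof sees a pointer, so len becomes sizeof(char *) - 1. */
void
set_from_pointer(demo_str_t *s, char *a)
{
    assert(s != NULL);
    demo_str_set(s, a);
}

/* Braces keep both statements of the macro inside each branch. */
void
set_from_flag(demo_str_t *s, int cond)
{
    assert(s != NULL);

    if (cond) {
        demo_str_set(s, "true");
    } else {
        demo_str_set(s, "false");
    }
}
```

On a typical 64-bit platform set_from_pointer yields a length of 7 no matter how long the string is, which is exactly the kind of silent error the text warns about.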

    void ngx_strlow(u_char *dst, u_char *src, size_t n);

Converts the first n characters of src to lowercase and stores them in dst. The caller must ensure that dst points to at least n bytes of writable space. The operation does not modify the source string. To convert a string in place, pass the same pointer for both arguments:

    ngx_strlow(str->data, str->data, str->len);

    ngx_strncmp(s1, s2, n)

Case-sensitive string comparison of at most the first n characters.

    ngx_strcmp(s1, s2)

Case-sensitive string comparison without a length limit.

    ngx_int_t ngx_strcasecmp(u_char *s1, u_char *s2);

Case-insensitive string comparison without a length limit.

    ngx_int_t ngx_strncasecmp(u_char *s1, u_char *s2, size_t n);

Case-insensitive string comparison of at most the first n characters.

    u_char * ngx_cdecl ngx_sprintf(u_char *buf, const char *fmt, ...);
    u_char * ngx_cdecl ngx_snprintf(u_char *buf, size_t max, const char *fmt, ...);
    u_char * ngx_cdecl ngx_slprintf(u_char *buf, u_char *last, const char *fmt, ...);

The three functions above format strings. The second parameter of ngx_snprintf, max, gives the size of buf, while ngx_slprintf marks the end of buf with last. The second or third function is recommended; ngx_sprintf is dangerous and prone to buffer overflow vulnerabilities. In addition to being compatible with glibc-style format strings, this family of functions adds conversion specifiers that make it convenient to format Nginx types, such as %V for the ngx_str_t structure. The Nginx source file ngx_string.c documents them:

    /*
     * supported formats:
     *    %[0][width][x][X]O        off_t
     *    %[0][width]T              time_t
     *    %[0][width][u][x|X]z      ssize_t/size_t
     *    %[0][width][u][x|X]d      int/u_int
     *    %[0][width][u][x|X]l      long
     *    %[0][width|m][u][x|X]i    ngx_int_t/ngx_uint_t
     *    %[0][width][u][x|X]D      int32_t/uint32_t
     *    %[0][width][u][x|X]L      int64_t/uint64_t
     *    %[0][width|m][u][x|X]A    ngx_atomic_int_t/ngx_atomic_uint_t
     *    %[0][width][.width]f      double, max valid number fits to %18.15f
     *    %P                        ngx_pid_t
     *    %M                        ngx_msec_t
     *    %r                        rlim_t
     *    %p                        void *
     *    %V                        ngx_str_t *
     *    %v                        ngx_variable_value_t *
     *    %s                        null-terminated string
     *    %*s                       length and string
     *    %Z                        '\0'
     *    %N                        '\n'
     *    %c                        char
     *    %%                        %
     *
     *  reserved:
     *    %t                        ptrdiff_t
     *    %S                        null-terminated wchar string
     *    %C                        wchar
     */

Note in particular that the specifier we use most often is %V, which formats an ngx_str_t structure. The argument passed for it must be a pointer; otherwise the program will crash with a core dump. This is also the easiest mistake to make. For example:

    ngx_str_t str = ngx_string("hello world");
    u_char buffer[1024];
    ngx_snprintf(buffer, 1024, "%V", &str);    // note: pass the address of str
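%V exists only in Nginx's own formatter. For comparison, the nearest standard C idiom for printing a length-delimited string is %.*s, sketched below. demo_str_t stands in for ngx_str_t and format_uri is a hypothetical helper; this is an analogy, not how ngx_snprintf is implemented.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

typedef unsigned char u_char;

typedef struct {
    size_t   len;
    u_char  *data;
} demo_str_t;                      /* stand-in for ngx_str_t */

/* Writes "uri=<s>" into buf and returns what snprintf returns. */
int
format_uri(char *buf, size_t size, const demo_str_t *s)
{
    assert(s != NULL && s->len <= INT_MAX);

    /* %.*s takes an int precision followed by the character pointer. */
    return snprintf(buf, size, "uri=%.*s", (int) s->len,
                    (const char *) s->data);
}
```

The standard idiom takes the length and the pointer as two separate arguments, whereas %V takes a single pointer to the structure, which is why passing the structure by value goes wrong.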

    void ngx_encode_base64(ngx_str_t *dst, ngx_str_t *src);
    ngx_int_t ngx_decode_base64(ngx_str_t *dst, ngx_str_t *src);

These two functions base64-encode and base64-decode src into dst. Before calling them, make sure dst has enough space for the result. If you do not know the exact size, call ngx_base64_encoded_length and ngx_base64_decoded_length first to estimate the maximum space required.
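The size estimate can be illustrated with plain arithmetic. The two macros below are assumed to match the Nginx definitions in ngx_string.h (4 output characters for every 3 input bytes, rounded up, and the reverse); check them against your source tree. The toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror ngx_base64_encoded_length / ngx_base64_decoded_length. */
#define toy_base64_encoded_length(len)  ((((len) + 2) / 3) * 4)
#define toy_base64_decoded_length(len)  ((((len) + 2) / 4) * 3)

/* Bytes to reserve in dst before encoding src_len bytes. */
size_t
toy_encode_capacity(size_t src_len)
{
    assert(src_len <= SIZE_MAX / 2);   /* keep the arithmetic from wrapping */
    return toy_base64_encoded_length(src_len);
}

/* Upper bound on the bytes produced by decoding src_len base64 characters. */
size_t
toy_decode_capacity(size_t src_len)
{
    assert(src_len <= SIZE_MAX / 2);
    return toy_base64_decoded_length(src_len);
}
```

For the 11-byte string "hello world" the encoder needs 16 bytes, and decoding those 16 characters needs at most 12, one more than the real result because of padding.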

    uintptr_t ngx_escape_uri(u_char *dst, u_char *src, size_t size,
        ngx_uint_t type);

Encodes src in one of several ways, depending on type. If dst is NULL, the function returns the number of characters that need to be escaped, from which the required space can be computed. type can be one of:

    #define NGX_ESCAPE_URI         0
    #define NGX_ESCAPE_ARGS        1
    #define NGX_ESCAPE_HTML        2
    #define NGX_ESCAPE_REFRESH     3
    #define NGX_ESCAPE_MEMCACHED   4
    #define NGX_ESCAPE_MAIL_AUTH   5
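The "call once with dst == NULL to count, then allocate and call again" pattern can be sketched with a toy escaper. Everything here is an assumption made for illustration: demo_escape uses a simplified rule (escape anything that is not alphanumeric or one of "/-_.~"), not Nginx's per-type tables, and escape_alloc is a hypothetical wrapper. Each escaped byte grows from one byte to three, so the output needs len + 2 * n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char u_char;

/* Toy rule, not Nginx's tables. */
static int
needs_escape(u_char c)
{
    if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9'))
    {
        return 0;
    }

    return c == '\0' || strchr("/-_.~", c) == NULL;
}

/*
 * With dst == NULL, returns how many characters need escaping.
 * Otherwise writes the escaped text and returns the end of the output.
 */
uintptr_t
demo_escape(u_char *dst, const u_char *src, size_t size)
{
    static const u_char  hex[] = "0123456789ABCDEF";
    size_t               n = 0;

    if (dst == NULL) {
        while (size--) {
            if (needs_escape(*src++)) {
                n++;
            }
        }
        return (uintptr_t) n;
    }

    while (size--) {
        if (needs_escape(*src)) {
            *dst++ = '%';
            *dst++ = hex[*src >> 4];
            *dst++ = hex[*src & 0xf];
            src++;
        } else {
            *dst++ = *src++;
        }
    }

    return (uintptr_t) dst;
}

/* Pass 1 counts, pass 2 writes. The caller frees the result. */
u_char *
escape_alloc(const u_char *src, size_t len, size_t *out_len)
{
    size_t   n;
    u_char  *dst;

    assert(src != NULL && out_len != NULL);

    n = (size_t) demo_escape(NULL, src, len);

    dst = malloc(len + 2 * n);
    if (dst == NULL) {
        return NULL;
    }

    demo_escape(dst, src, len);
    *out_len = len + 2 * n;

    return dst;
}
```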

    void ngx_unescape_uri(u_char **dst, u_char **src, size_t size, ngx_uint_t type);

Decodes src. type can be 0, NGX_UNESCAPE_URI, or NGX_UNESCAPE_REDIRECT. If it is 0, all characters in src are decoded. With NGX_UNESCAPE_URI or NGX_UNESCAPE_REDIRECT, decoding stops at the first '?' and the characters after it are left alone. The difference between the two is that NGX_UNESCAPE_URI decodes every character that needs decoding, while NGX_UNESCAPE_REDIRECT decodes only non-visible characters.

    uintptr_t ngx_escape_html(u_char *dst, u_char *src, size_t size);

Encodes HTML tags.

Only some commonly used functions are introduced here, so that you can get familiar with them first. When something is unclear in practice, the fastest and most direct approach is to read the source: look at the implementation of the function, or at how Nginx itself calls it. The code is the best documentation.

ngx_pool_t

ngx_pool_t is a very important data structure. It is used in many important places, and many important data structures use it as well. So what exactly is it? Simply put, it provides a mechanism for managing a set of resources (such as memory and files) so that they are used and released in a uniform way. This removes the worry about when each of the various resources should be released and whether any release has been forgotten.

Take memory management as an example. Whenever we need memory, we obtain it from an ngx_pool_t object, and at some final point we destroy that ngx_pool_t object, at which time all of that memory is freed. We no longer have to call malloc and free for each piece of memory, and we do not have to worry that some malloc'ed block was never freed, because when the ngx_pool_t object is destroyed, all memory allocated from it is released together.

As another example, suppose we use a series of files that must all be closed eventually. We can register them all with one ngx_pool_t object, and when that object is destroyed, all of the files are closed.

From these two examples we can see that, with ngx_pool_t, all resources are released together at the moment the object is destroyed. This raises an issue: the lifetime of these resources (the time for which they are held) is essentially the same as the lifetime of the ngx_pool_t, although ngx_pool_t does provide a few operations for releasing resources early. From the point of view of pure efficiency this is not optimal. Say we need resources A, B, and C in turn, where A is no longer needed once we start using B, and neither A nor B is needed once we start using C. Without ngx_pool_t, we might acquire A from the system, use it, and release it; then acquire B, use it, and release it; and finally do the same for C. With an ngx_pool_t object managing all three, A, B, and C are released only at the very end, after C has been used. This does increase the amount of resources the program holds over a period of time, but it also reduces the programmer's work of managing the lifetime of each resource separately. Every gain comes with a loss. It is a trade-off, and the choice depends on which side matters more in the specific situation.

Consider a typical use of ngx_pool_t in Nginx. For each HTTP request it processes, Nginx creates an ngx_pool_t object associated with that request. All resources needed during processing are obtained from this object, and when processing of the request is complete, they are all released along with the destruction of the associated ngx_pool_t object.
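The "allocate freely, release once" idea can be sketched with a toy pool. This is not how ngx_pool_t is implemented: the toy simply threads every malloc'ed block onto one list, and all toy_ names are hypothetical. It only illustrates the lifetime model of one pool per request.

```c
#include <assert.h>
#include <stdlib.h>

/* Header placed in front of every allocation so the pool can find it later. */
typedef union toy_block_u {
    union toy_block_u  *next;
    long double         align;     /* keeps the payload suitably aligned */
} toy_block_t;

typedef struct {
    toy_block_t  *blocks;
} toy_pool_t;

toy_pool_t *
toy_create_pool(void)
{
    return calloc(1, sizeof(toy_pool_t));
}

void *
toy_palloc(toy_pool_t *pool, size_t size)
{
    toy_block_t  *b;

    assert(pool != NULL);

    b = malloc(sizeof(toy_block_t) + size);
    if (b == NULL) {
        return NULL;
    }

    b->next = pool->blocks;
    pool->blocks = b;

    return b + 1;                  /* payload starts right after the header */
}

/* Frees everything at once and reports how many allocations were released. */
size_t
toy_destroy_pool(toy_pool_t *pool)
{
    size_t        n = 0;
    toy_block_t  *b, *next;

    assert(pool != NULL);

    for (b = pool->blocks; b != NULL; b = next) {
        next = b->next;
        free(b);
        n++;
    }

    free(pool);

    return n;
}

/* Models one request: allocate as needed, release once at the end. */
size_t
toy_handle_request(void)
{
    toy_pool_t  *pool = toy_create_pool();

    if (pool == NULL) {
        return 0;
    }

    (void) toy_palloc(pool, 128);  /* e.g. a header buffer */
    (void) toy_palloc(pool, 32);   /* e.g. a small per-request structure */
    (void) toy_palloc(pool, 4096); /* e.g. a body buffer */

    return toy_destroy_pool(pool);
}
```

The request code never calls free on an individual allocation, which is the convenience, and nothing is returned before the end of the request, which is the cost discussed above.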

The ngx_pool_t structure and its operations are defined in src/core/ngx_palloc.h|c.

    typedef struct ngx_pool_s        ngx_pool_t; 

    struct ngx_pool_s {
        ngx_pool_data_t       d;
        size_t                max;
        ngx_pool_t           *current;
        ngx_chain_t          *chain;
        ngx_pool_large_t     *large;
        ngx_pool_cleanup_t   *cleanup;
        ngx_log_t            *log;
    };

An ordinary user of ngx_pool_t does not need to care about the role of each field in the structure, so the fields are not explained in detail here. Where it is needed to explain how an operation is used, the relevant fields are described.

The operations on ngx_pool_t are explained one by one below.

    ngx_pool_t *ngx_create_pool(size_t size, ngx_log_t *log);

Creates a pool whose initial node has size size. log is the object used to write log output during later operations on the pool. A note on choosing size: it must be less than or equal to NGX_MAX_ALLOC_FROM_POOL and greater than sizeof(ngx_pool_t).

Choosing a value greater than NGX_MAX_ALLOC_FROM_POOL is wasteful, because the space beyond that limit is never used. (This applies only to the first memory block managed by the ngx_pool_t object; when the free part of the first block is used up, later allocations obtain new blocks.)

Choosing a value smaller than sizeof(ngx_pool_t) crashes the program, because part of the initial memory block is used to store the ngx_pool_t structure itself.

When an ngx_pool_t object is created, its max field is set to the smaller of size - sizeof(ngx_pool_t) and NGX_MAX_ALLOC_FROM_POOL. Once the first block is used up, further allocations from the pool have to request more memory from the operating system. When the requested size is less than or equal to max, a new block is allocated and linked into the list managed by the d field (actually the d.next field). When the requested size is larger than max, the memory obtained from the system is attached to the list managed by the large field. We will call these the small-block chain and the large-block chain.
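The rule for max and the routing between the two chains can be written out as arithmetic. The two constants below are illustrative assumptions only (an 80-byte header and a 4095-byte limit); the real values depend on the platform and the Nginx build, and the toy_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins; the real values come from the Nginx build. */
#define TOY_POOL_HEADER     80     /* assumed sizeof(ngx_pool_t) */
#define TOY_MAX_ALLOC     4095     /* assumed NGX_MAX_ALLOC_FROM_POOL */

/* max is the smaller of (size - header) and the limit. */
size_t
toy_pool_max(size_t size)
{
    size_t  avail;

    assert(size > TOY_POOL_HEADER);

    avail = size - TOY_POOL_HEADER;

    return avail < TOY_MAX_ALLOC ? avail : TOY_MAX_ALLOC;
}

/* Requests larger than max are served from the large-block chain. */
int
toy_uses_large_chain(size_t request, size_t max)
{
    return request > max;
}
```

With these assumed numbers, a 1024-byte pool gets max = 944, while a 16384-byte pool is capped at 4095, so a 5000-byte request goes to the large-block chain in both cases.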

    void *ngx_palloc(ngx_pool_t *pool, size_t size); 

Allocates size bytes from the pool. Note that the starting address of the memory returned by this function is aligned to NGX_ALIGNMENT. Alignment speeds up processing, but may waste a small amount of memory.


    void *ngx_pnalloc(ngx_pool_t *pool, size_t size); 

Allocates size bytes from the pool. Unlike the function above, the memory returned by this function is not aligned.
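The difference between the aligned and unaligned variants comes down to rounding the next free address up to a boundary. The sketch below shows that rounding and the bytes it can waste; align_up and align_waste are hypothetical names, and the bit trick assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds p up to the next multiple of a; a must be a power of two. */
unsigned char *
align_up(unsigned char *p, uintptr_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);

    return (unsigned char *) (((uintptr_t) p + (a - 1)) & ~(a - 1));
}

/* Bytes skipped (wasted) in order to reach the aligned address. */
size_t
align_waste(unsigned char *p, uintptr_t a)
{
    return (size_t) (align_up(p, a) - p);
}
```

An allocator that skips this step, as the unaligned variant does, packs small byte strings tightly at the cost of possibly misaligned addresses, which is fine for character data.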

    void *ngx_pcalloc(ngx_pool_t *pool, size_t size);

This function also allocates size bytes and zeroes the allocated memory. Internally it calls ngx_palloc.


    void *ngx_pmemalign(ngx_pool_t *pool, size_t size, size_t alignment);

Allocates size bytes aligned to the specified alignment. Memory obtained this way is managed on the large-block chain regardless of its size.

    ngx_int_t ngx_pfree(ngx_pool_t *pool, void *p);

Frees one block on the large-block chain, that is, one of the blocks managed by the large field. The implementation walks the large-block list sequentially, so it is inefficient. If the block is found in the list, it is freed and NGX_OK is returned; otherwise NGX_DECLINED is returned.

Because this operation is inefficient, it generally does not need to be called unless the block is very large and really should be released promptly. After all, the memory is always released when the pool is destroyed.

    ngx_pool_cleanup_t *ngx_pool_cleanup_add(ngx_pool_t *p, size_t size); 

The cleanup field of ngx_pool_t manages a special list in which each item records a resource that needs to be released. Each node describes for itself how its resource is released, which provides a great deal of flexibility. It means that ngx_pool_t can manage not only memory but any resource that needs releasing, for example closing or deleting files. Here is the type of each node in the list:

    typedef struct ngx_pool_cleanup_s  ngx_pool_cleanup_t;
    typedef void (*ngx_pool_cleanup_pt)(void *data);

    struct ngx_pool_cleanup_s {
        ngx_pool_cleanup_pt   handler;
        void                 *data;
        ngx_pool_cleanup_t   *next;
    };

  • data: The resource that corresponds to the node.

  • handler: A function pointer to the function that releases the resource referred to by data. The function takes a single argument, data.

  • next: Points to the next element in the list.

By now the use of ngx_pool_cleanup_add should be fairly clear. But what is the size parameter for? size is the size of the resource that the data field will point to, and the function allocates size bytes for data.

For example, suppose we need to delete a file at the end. We call this function with size set to the size of the string that holds the file name, which adds an item to the cleanup list and returns the newly added node. We then copy the file name into the memory that the node's data field points to, and assign the handler field a function that deletes the file. That function must of course match the prototype void (*ngx_pool_cleanup_pt)(void *data).
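The file-deletion example can be sketched with a toy cleanup list. The node layout copies ngx_pool_cleanup_s, but everything else is a stand-in: the toy_ functions are hypothetical, malloc replaces pool allocation, and the standard remove function is used to delete the file. Running handlers newest first is an assumption of this toy, a natural result of pushing each node at the head of the list.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*toy_cleanup_pt)(void *data);

typedef struct toy_cleanup_s  toy_cleanup_t;

struct toy_cleanup_s {
    toy_cleanup_pt   handler;
    void            *data;
    toy_cleanup_t   *next;
};

typedef struct {
    toy_cleanup_t  *cleanup;
} toy_pool_t;

/* Adds a node and reserves size bytes for its data field. */
toy_cleanup_t *
toy_cleanup_add(toy_pool_t *pool, size_t size)
{
    toy_cleanup_t  *c;

    assert(pool != NULL);

    c = malloc(sizeof(toy_cleanup_t));
    if (c == NULL) {
        return NULL;
    }

    c->data = size ? malloc(size) : NULL;
    if (size && c->data == NULL) {
        free(c);
        return NULL;
    }

    c->handler = NULL;
    c->next = pool->cleanup;
    pool->cleanup = c;

    return c;
}

/* Runs every handler, newest first, then frees the nodes. */
void
toy_run_cleanups(toy_pool_t *pool)
{
    toy_cleanup_t  *c, *next;

    for (c = pool->cleanup; c != NULL; c = next) {
        next = c->next;
        if (c->handler != NULL) {
            c->handler(c->data);
        }
        free(c->data);
        free(c);
    }

    pool->cleanup = NULL;
}

/* Handler matching the void (*)(void *data) prototype. */
void
toy_delete_file(void *data)
{
    (void) remove((const char *) data);
}

/* Registers path for deletion when the cleanups run. */
int
toy_delete_file_later(toy_pool_t *pool, const char *path)
{
    size_t          len = strlen(path) + 1;
    toy_cleanup_t  *c = toy_cleanup_add(pool, len);

    if (c == NULL) {
        return -1;
    }

    memcpy(c->data, path, len);    /* data holds its own copy of the name */
    c->handler = toy_delete_file;

    return 0;
}
```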

    void ngx_destroy_pool(ngx_pool_t *pool);

This function frees all the memory held by the pool and, in turn, calls the function pointed to by the handler field of every element in the list managed by the cleanup field, releasing all resources managed by the pool. The ngx_pool_t that pool points to is released as well and can no longer be used.

    void ngx_reset_pool(ngx_pool_t *pool);

This function frees the memory on the pool's large-block chain and marks the blocks on the small-block chain as available again. Items on the cleanup list are not processed.

ngx_array_t

ngx_array_t is the array structure used inside Nginx. In terms of storage it resembles the built-in arrays of the C language that everyone knows: the region that actually stores the data is one large contiguous block of memory. In addition to that memory, however, the array carries some metadata describing it. Let us look at the definition in detail. ngx_array_t is defined in src/core/ngx_array.c|h.

    typedef struct ngx_array_s       ngx_array_t;
    struct ngx_array_s {
        void        *elts;
        ngx_uint_t   nelts;
        size_t       size;
        ngx_uint_t   nalloc;
        ngx_pool_t  *pool;
    };

  • elts: Points to the actual data storage area.

  • nelts: The number of elements currently in the array.

  • size: The size of a single array element, in bytes.

  • nalloc: The capacity of the array, that is, the maximum number of elements it can hold without triggering an expansion. When nelts reaches nalloc and another element is stored, the array is expanded to twice its previous capacity. In effect, a new block of memory twice the size of the old one is allocated and the existing data is copied into it.

  • pool: The memory pool from which the array allocates its memory.

The operations on ngx_array_t are described below.

    ngx_array_t *ngx_array_create(ngx_pool_t *p, ngx_uint_t n, size_t size);

Creates a new array object and returns it.

  • p: The memory pool from which the array allocates memory.
  • n: The initial capacity of the array, that is, the maximum number of elements it can hold without expansion.
  • size: The size of a single element, in bytes.

    void ngx_array_destroy(ngx_array_t *a);

Destroys the array object and returns the memory it allocated to the memory pool.

    void *ngx_array_push(ngx_array_t *a);

Appends one new element to array a and returns a pointer to it. The returned pointer must be cast to the concrete element type, after which you assign to the new element itself or, if the elements are of a complex type, to each of its fields.

    void *ngx_array_push_n(ngx_array_t *a, ngx_uint_t n);

Appends n elements to array a and returns a pointer to the first of the appended elements.

    static ngx_inline ngx_int_t ngx_array_init(ngx_array_t *array, ngx_pool_t *pool, ngx_uint_t n, size_t size);

If an array object was allocated on the heap and has been destroyed with ngx_array_destroy, call this function if you want to use it again.

If an array object is allocated on the stack, this function must be called to initialize it before it can be used.

Note: because memory is allocated with ngx_palloc, the old memory is not freed when the array expands, which wastes memory. It is therefore best to plan the capacity of the array in advance and settle it once at creation or initialization, avoiding repeated expansions.
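The doubling behaviour and the memory it leaves behind can be sketched with a toy array. This is not ngx_array_t: toy_array_t uses malloc instead of a pool, and the abandoned field is a counter added only for this illustration, recording how many bytes a pool-backed array would leave unused after each expansion.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    void    *elts;
    size_t   nelts;
    size_t   size;
    size_t   nalloc;
    size_t   abandoned;            /* toy-only: bytes left behind by growth */
} toy_array_t;

int
toy_array_init(toy_array_t *a, size_t n, size_t size)
{
    assert(a != NULL && n > 0 && size > 0);

    a->elts = malloc(n * size);
    if (a->elts == NULL) {
        return -1;
    }

    a->nelts = 0;
    a->size = size;
    a->nalloc = n;
    a->abandoned = 0;

    return 0;
}

/* Returns a pointer to the new slot, doubling the capacity when full. */
void *
toy_array_push(toy_array_t *a)
{
    void  *p;

    if (a->nelts == a->nalloc) {
        p = malloc(2 * a->nalloc * a->size);
        if (p == NULL) {
            return NULL;
        }

        memcpy(p, a->elts, a->nelts * a->size);

        a->abandoned += a->nalloc * a->size;
        free(a->elts);             /* a pool-backed array would not free here */

        a->elts = p;
        a->nalloc *= 2;
    }

    return (char *) a->elts + a->size * a->nelts++;
}
```

Starting from a capacity of 2, five pushes trigger two expansions (to 4 and then 8) and strand the equivalent of 6 elements; starting from a capacity of 8 would strand nothing.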

ngx_hash_t

ngx_hash_t is Nginx's own hash table implementation. Its definition and implementation are in src/core/ngx_hash.h|c. The implementation of ngx_hash_t is much the same as the hash tables described in data structure textbooks. Common collision resolution methods include linear probing, quadratic probing, and separate chaining. ngx_hash_t uses the most common one, separate chaining, which is also the method used by the hash tables in the STL.

However, the implementation of ngx_hash_t has several notable features:

  1. Unlike other hash table implementations, which allow elements to be inserted and deleted, ngx_hash_t can only be initialized once. After the whole hash table has been built, elements can be neither deleted nor inserted.
  2. The chaining in ngx_hash_t does not actually use a linked list. Each bucket is a contiguous region of storage that can almost be treated as an array. This is possible because initialization includes a pre-calculation pass that works out in advance how many elements will land in each bucket, so the size of each bucket is known beforehand. A linked list is then unnecessary and a contiguous region of storage suffices, which also saves some memory.

From the above description, we can see that using ngx_hash_t takes just two steps: initialize the table first, then look things up in it. Let us go through them in detail.

Initialization of ngx_hash_t.

    ngx_int_t ngx_hash_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names,
        ngx_uint_t nelts);

Let us look at the initialization function first. Its first parameter, hinit, is a collection of initialization parameters. names is an array of all the keys needed to initialize the ngx_hash_t, and nelts is the number of keys. Below is the ngx_hash_init_t type, which provides the basic information needed to initialize a hash table.

    typedef struct {
        ngx_hash_t       *hash;
        ngx_hash_key_pt   key;

        ngx_uint_t        max_size;
        ngx_uint_t        bucket_size;

        char             *name;
        ngx_pool_t       *pool;
        ngx_pool_t       *temp_pool;
    } ngx_hash_init_t;

  • hash: If this field is NULL, it points to the newly created hash table after the initialization function returns. If it is not NULL, all the data is inserted into the hash table that this field points to during initialization.

  • key: Points to the hash function that generates a hash value from a string. The Nginx source code provides a default implementation, ngx_hash_key_lc.

  • max_size: The number of buckets in the hash table. The larger this field, the less likely elements are to collide when stored and the fewer elements each bucket holds, which makes lookups faster. Of course, the larger this value, the more memory is wasted (though in practice not much).

  • bucket_size: The maximum size of each bucket, in bytes. If, while initializing a hash table, some bucket turns out to be unable to hold all the elements that belong to it, initialization of the hash table fails.

  • name: The name of the hash table.

  • pool: The pool from which the hash table allocates memory.

  • temp_pool: A temporary pool used by the hash table, which can be released and destroyed once initialization is complete.

Next is the structure of the elements in the array that stores the keys of the hash table.

    typedef struct {
        ngx_str_t         key;
        ngx_uint_t        key_hash;
        void             *value;
    } ngx_hash_key_t;

The meaning of key and value is obvious and needs no explanation. key_hash is the value computed from key with the hash function.

Having gone through these two structures, you should now understand how this function is used. It returns NGX_OK if the hash table is initialized successfully, and NGX_ERROR otherwise.

    void *ngx_hash_find(ngx_hash_t *hash, ngx_uint_t key, u_char *name, size_t len);

Looks up the value for a key in hash. The key parameter is actually the hash value computed from the real key, which is name. len is the length of name.

If the lookup is successful, a pointer to value is returned, otherwise NULL is returned.
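The build-once design with contiguous buckets can be sketched with a toy table. It is a simplified illustration and not the ngx_hash_t layout: all toy_ names are hypothetical, malloc replaces the pools, and bucket sizes are counted in elements rather than bytes. The hash recurrence (multiply by 31, add the lowercased character) is what I understand ngx_hash_key_lc to use; treat that as an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char  *name;             /* key, compared byte for byte at lookup */
    size_t       len;
    void        *value;
} toy_elt_t;

typedef struct {
    toy_elt_t  *elts;              /* all elements, grouped by bucket */
    size_t     *start;             /* bucket b is elts[start[b]..start[b+1]) */
    size_t      size;              /* number of buckets */
} toy_hash_t;

/* Lowercasing hash: key = key * 31 + tolower(c). */
size_t
toy_hash_key_lc(const char *data, size_t len)
{
    size_t  i, key = 0;

    for (i = 0; i < len; i++) {
        key = key * 31 + (size_t) tolower((unsigned char) data[i]);
    }

    return key;
}

/* Built once: count per bucket, turn counts into offsets, then place. */
int
toy_hash_init(toy_hash_t *h, size_t size, const toy_elt_t *names, size_t n)
{
    size_t  i, b, *fill;

    assert(h != NULL && size > 0);

    h->size = size;
    h->elts = malloc((n ? n : 1) * sizeof(toy_elt_t));
    h->start = calloc(size + 1, sizeof(size_t));
    fill = calloc(size, sizeof(size_t));

    if (h->elts == NULL || h->start == NULL || fill == NULL) {
        free(h->elts);
        free(h->start);
        free(fill);
        return -1;
    }

    for (i = 0; i < n; i++) {      /* pre-calculation pass */
        h->start[toy_hash_key_lc(names[i].name, names[i].len) % size + 1]++;
    }

    for (b = 0; b < size; b++) {   /* prefix sums give each bucket's offset */
        h->start[b + 1] += h->start[b];
    }

    for (i = 0; i < n; i++) {
        b = toy_hash_key_lc(names[i].name, names[i].len) % size;
        h->elts[h->start[b] + fill[b]++] = names[i];
    }

    free(fill);

    return 0;
}

/* key is the hash of name, computed by the caller as with ngx_hash_find. */
void *
toy_hash_find(const toy_hash_t *h, size_t key, const char *name, size_t len)
{
    size_t  i, b = key % h->size;

    for (i = h->start[b]; i < h->start[b + 1]; i++) {
        if (h->elts[i].len == len
            && memcmp(h->elts[i].name, name, len) == 0)
        {
            return h->elts[i].value;
        }
    }

    return NULL;
}
```

Because the table never changes after toy_hash_init, each bucket is just a slice of one array, and a lookup scans that slice without following any pointers.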

ngx_hash_wildcard_t

Nginx implements the ngx_hash_wildcard_t hash table to handle matching of domain names that contain wildcards. It supports two kinds of wildcard domain names. One has the wildcard at the front, for example "*.abc.com", where the asterisk may also be omitted to give ".abc.com". Such a key matches www.abc.com, qqq.www.abc.com, and so on. The other has the wildcard at the end, for example "mail.xxx.*". Note that, unlike a leading wildcard, a trailing wildcard cannot be omitted. Such a key matches domain names such as mail.xxx.com, mail.xxx.com.cn, and mail.xxx.net.

Note that one ngx_hash_wildcard_t hash table can contain either keys with leading wildcards or keys with trailing wildcards, but not both. A variable of type ngx_hash_wildcard_t is built with the function ngx_hash_wildcard_init and queried with ngx_hash_find_wc_head or ngx_hash_find_wc_tail. ngx_hash_find_wc_head queries a hash table whose keys have leading wildcards, and ngx_hash_find_wc_tail queries one whose keys have trailing wildcards.

The following details the use of these functions.

    ngx_int_t ngx_hash_wildcard_init(ngx_hash_init_t *hinit, ngx_hash_key_t *names,
        ngx_uint_t nelts);

This function is used to build a hash table that can contain wildcard keys.

  • hinit: The collection of parameters for building a wildcard hash table. For a description of the corresponding type, see the description of the ngx_hash_init function under ngx_hash_t.

  • names: An array of all the wildcard keys used to build the hash table. Note that the keys here have already been preprocessed. For example, "*.abc.com" or ".abc.com" becomes "com.abc." after preprocessing, and "mail.xxx.*" becomes "mail.xxx.". Why this form? This requires a brief look at how the wildcard hash table works. Building a table of this type actually builds a "linked list" of hash tables, "linked" through the keys in the tables. For "*.abc.com", for instance, two hash tables are built. The first contains an entry with the key com, whose value holds a pointer to the second hash table. The second contains an entry with the key abc, whose value holds a pointer to the value that corresponds to *.abc.com. When querying, say, www.abc.com, com is looked up first, which leads to the second-level hash table; abc is then looked up there, and so on, until the entry found at some level holds a real value rather than a pointer to a next-level hash table, at which point the query ends. One point needs special attention: the two lowest bits of the value of each element in the names array must be 0 (they are used for a special purpose). If this condition is not met, lookups in the hash table do not produce correct results.

  • nelts: The number of elements in the names array.

The function returns NGX_OK on success and NGX_ERROR otherwise.
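The key preprocessing described for the names parameter can be sketched as a small string transformation. preprocess_wildcard is a hypothetical helper that follows only the three examples given above; the preprocessing that Nginx itself performs may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrites a wildcard key into the preprocessed form described above.
 * out must have room for strlen(key) + 2 bytes. Returns 0 on success,
 * -1 if key has neither a leading nor a trailing wildcard.
 */
int
preprocess_wildcard(const char *key, char *out)
{
    size_t       len;
    const char  *end, *p;

    assert(key != NULL && out != NULL);

    len = strlen(key);

    /* Trailing wildcard: "mail.xxx.*" becomes "mail.xxx." */
    if (len > 2 && key[len - 1] == '*' && key[len - 2] == '.') {
        memcpy(out, key, len - 1);
        out[len - 1] = '\0';
        return 0;
    }

    /* Leading wildcard: accept "*.abc.com" and ".abc.com". */
    if (len > 2 && key[0] == '*' && key[1] == '.') {
        key += 2;
    } else if (len > 1 && key[0] == '.') {
        key += 1;
    } else {
        return -1;
    }

    /* Emit the labels from right to left, each followed by a dot. */
    end = key + strlen(key);

    while (end > key) {
        p = end;
        while (p > key && p[-1] != '.') {
            p--;
        }

        memcpy(out, p, (size_t) (end - p));
        out += end - p;
        *out++ = '.';

        end = (p > key) ? p - 1 : key;
    }

    *out = '\0';

    return 0;
}
```

Reversing the labels puts the most general part of the domain first, which matches the level-by-level lookup (com, then abc) described above.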

    void *ngx_hash_find_wc_head(ngx_hash_wildcard_t *hwc, u_char *name, size_t len);

Queries a hash table whose keys have leading wildcards.

  • hwc: Pointer to the hash table object.
  • name: The domain name to look up, for example www.abc.com.
  • len: The length of name.

Returns the value corresponding to the matching wildcard key, or NULL if nothing matches.

    void *ngx_hash_find_wc_tail(ngx_hash_wildcard_t *hwc, u_char *name, size_t len);

Queries a hash table whose keys have trailing wildcards.

For the parameters and return value, see the description of the previous function.

ngx_hash_combined_t

The combined hash table type is defined as follows:

    typedef struct {
        ngx_hash_t            hash;
        ngx_hash_wildcard_t  *wc_head;
        ngx_hash_wildcard_t  *wc_tail;
    } ngx_hash_combined_t;

As its definition shows, this type actually contains three hash tables: a normal hash table, a hash table with leading wildcards, and a hash table with trailing wildcards.

Nginx provides this type as a convenient container for the three kinds of hash tables. When a set of keys includes both wildcard and non-wildcard keys, you can query them all through one object without having to think about which type of hash table a given key belongs to.

When building such a combined hash table, you first define a variable of this type and then build each of the three sub-hash tables it contains separately.

To query this type of hash table, Nginx provides a convenience function, ngx_hash_find_combined.

    void *ngx_hash_find_combined(ngx_hash_combined_t *hash, ngx_uint_t key,
    u_char *name, size_t len);

The function queries the three sub-hash tables of the combined hash table in turn and returns the lookup result as soon as a match is found; that is, if more than one match is possible, only the first is returned.

  • hash: The combined hash table object.
  • key: The hash value calculated from name.
  • name: The actual content of the key.
  • len: The length of name.

Returns the result of the query, or NULL if nothing is found.
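The "first match wins" behaviour can be sketched with three toy tables. Everything here is an assumption made for illustration: the demo_ names are hypothetical, the tables are plain arrays searched linearly rather than real hash tables, and the function returns an int (0 for "not found") instead of a void pointer.

```c
#include <stddef.h>
#include <string.h>

/* Toy entry type; a real table would be hashed and would store void *. */
typedef struct {
    const char *key;
    int         value;
} demo_entry_t;

static const demo_entry_t demo_exact[]   = { { "www.abc.com", 1 } };
static const demo_entry_t demo_wc_head[] = { { ".abc.com",    2 } }; /* *.abc.com  */
static const demo_entry_t demo_wc_tail[] = { { "mail.xxx.",   3 } }; /* mail.xxx.* */

#define DEMO_COUNT(a)  (sizeof(a) / sizeof((a)[0]))

/* Try exact keys, then leading wildcards, then trailing wildcards. */
static int
demo_find_combined(const char *name)
{
    size_t  i, klen, len = strlen(name);

    for (i = 0; i < DEMO_COUNT(demo_exact); i++) {
        if (strcmp(name, demo_exact[i].key) == 0) {
            return demo_exact[i].value;
        }
    }

    /* leading wildcard: name must end with the stored suffix */
    for (i = 0; i < DEMO_COUNT(demo_wc_head); i++) {
        klen = strlen(demo_wc_head[i].key);
        if (len > klen
            && strcmp(name + len - klen, demo_wc_head[i].key) == 0)
        {
            return demo_wc_head[i].value;
        }
    }

    /* trailing wildcard: name must start with the stored prefix */
    for (i = 0; i < DEMO_COUNT(demo_wc_tail); i++) {
        klen = strlen(demo_wc_tail[i].key);
        if (len > klen && strncmp(name, demo_wc_tail[i].key, klen) == 0) {
            return demo_wc_tail[i].value;
        }
    }

    return 0;   /* not found */
}
```

Note that www.abc.com would also satisfy the \*.abc.com entry, but the exact table is consulted first, so its value is the one returned.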

ngx_hash_keys_arrays_t

As you saw above, the keys must be preprocessed before a wildcard hash table (ngx_hash_wildcard_t) is built, which is somewhat tedious. When a set of keys mixes keys without wildcards and keys with wildcards, we need to build three hash tables: one for normal keys, one for keys with leading wildcards, and one for keys with trailing wildcards (or you can combine the three into a single ngx_hash_combined_t). For this case, Nginx provides this helper type to make building these hash tables easier.

This type and its related functions are also defined in src/core/ngx_hash.h|c. Let's first look at the definition of the type.

    typedef struct {
        ngx_uint_t        hsize;

        ngx_pool_t       *pool;
        ngx_pool_t       *temp_pool;

        ngx_array_t       keys;
        ngx_array_t      *keys_hash;

        ngx_array_t       dns_wc_head;
        ngx_array_t      *dns_wc_head_hash;

        ngx_array_t       dns_wc_tail;
        ngx_array_t      *dns_wc_tail_hash;
    } ngx_hash_keys_arrays_t;
  • hsize: The number of buckets of the hash tables to be built. This parameter applies to all three types of hash tables built from the information in this structure.

  • pool: The pool used to build these hash tables.

  • temp_pool: A temporary pool that may be used while this structure and the final three hash tables are being built. The temp_pool can be destroyed once the build is complete; it only accounts for temporary memory consumption.

  • keys: An array that holds all non-wildcard keys.

  • keys_hash: A two-dimensional array whose first dimension is the bucket number: keys_hash[i] holds all keys whose hash value modulo hsize equals i. Suppose there are three keys, key1, key2, and key3, and their hash values modulo hsize are all i; the three keys are then stored in order in keys_hash[i][0], keys_hash[i][1], and keys_hash[i][2]. This field is used while keys are being added, to record keys and detect conflicts, that is, duplicate keys.

  • dns_wc_head: Stores the leading-wildcard keys after preprocessing. For example, \*.abc.com becomes "com.abc." and is stored in this array.

  • dns_wc_tail: Stores the trailing-wildcard keys after preprocessing. For example, mail.xxx.\* becomes "mail.xxx." and is stored in this array.

  • dns_wc_head_hash: Used while keys are being added, to record leading-wildcard keys and detect conflicts, i.e. duplicates.

  • dns_wc_tail_hash: Used while keys are being added, to record trailing-wildcard keys and detect conflicts, i.e. duplicates.

After defining a variable of this type and assigning the pool and temp_pool fields, you can call ngx_hash_add_key to add all keys to the structure. The function automatically classifies and checks normal keys, keys with leading wildcards, and keys with trailing wildcards, and stores them in the corresponding fields. You can then check whether the three arrays keys, dns_wc_head, and dns_wc_tail are empty to decide whether to build the normal hash table, the leading-wildcard hash table, and the trailing-wildcard hash table (when building these three types of hash tables, you can use the keys, dns_wc_head, and dns_wc_tail arrays respectively).

Once you have built the three hash tables, you can combine them in an ngx_hash_combined_t object and use ngx_hash_find_combined to look keys up. Alternatively, keep three separate variables for the three hash tables and decide for yourself when to query which one.
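The role of a field like keys_hash, bucketing keys by hash value modulo hsize so that duplicates can be detected cheaply, can be sketched as below. This is a simplified stand-in, not the Nginx implementation: the demo_ names, the fixed slot limit, and the return codes are all assumptions made for the example.

```c
#include <string.h>

#define DEMO_HSIZE   7      /* number of buckets, plays the role of hsize */
#define DEMO_SLOTS   4      /* toy limit of keys per bucket */

#define DEMO_OK      0
#define DEMO_ERROR  -1
#define DEMO_BUSY   -2      /* key already present */

typedef struct {
    const char *keys[DEMO_HSIZE][DEMO_SLOTS];   /* keys[i][j]: j-th key of bucket i */
    unsigned    count[DEMO_HSIZE];              /* keys stored in each bucket */
} demo_keys_hash_t;

static unsigned
demo_hash(const char *s)
{
    unsigned  h = 0;

    while (*s != '\0') {
        h = h * 31 + (unsigned char) *s++;
    }

    return h;
}

/* Add key to its bucket unless an equal key is already stored there. */
static int
demo_add_key(demo_keys_hash_t *kh, const char *key)
{
    unsigned  i, b = demo_hash(key) % DEMO_HSIZE;

    for (i = 0; i < kh->count[b]; i++) {
        if (strcmp(kh->keys[b][i], key) == 0) {
            return DEMO_BUSY;
        }
    }

    if (kh->count[b] == DEMO_SLOTS) {
        return DEMO_ERROR;          /* toy bucket is full */
    }

    kh->keys[b][kh->count[b]++] = key;

    return DEMO_OK;
}
```

Only the keys that land in the same bucket have to be compared, which is why the duplicate check stays cheap as the number of keys grows. The structure must be zeroed (for example with memset) before the first call.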

    ngx_int_t ngx_hash_keys_array_init(ngx_hash_keys_arrays_t *ha, ngx_uint_t type);

Initializes the structure, mainly the ngx_array_t fields it contains. Returns NGX_OK on success.

  • ha: Pointer to the structure object.

  • type: This field takes one of two values, NGX_HASH_SMALL or NGX_HASH_LARGE, indicating the type of hash table to be built. With NGX_HASH_SMALL, the number of buckets and the array element sizes are smaller; NGX_HASH_LARGE is the opposite.

    ngx_int_t ngx_hash_add_key(ngx_hash_keys_arrays_t *ha, ngx_str_t *key,
    void *value, ngx_uint_t flags);

Typically, this function is called in a loop to add a set of key-value pairs to the structure. It returns NGX_OK when the key is added successfully; NGX_BUSY means the key is a duplicate.

  • ha: Pointer to the structure object.

  • key: The key to add; the parameter name is self-explanatory.

  • value: The value to associate with the key; the parameter name is self-explanatory.

  • flags: Two flag bits can be set, NGX_HASH_WILDCARD_KEY and NGX_HASH_READONLY_KEY. To set both, combine them with the bitwise OR operator. If NGX_HASH_READONLY_KEY is set, the key is not converted to lowercase when its hash value is calculated; otherwise it is. If NGX_HASH_WILDCARD_KEY is set, the key may contain wildcards, which are handled accordingly. If neither flag is needed, pass 0.
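The effect of the read-only flag on hashing can be sketched as follows. The DEMO_ flags and demo_key_hash are hypothetical stand-ins written for this example; they imitate the behaviour described for flags and are not the Nginx definitions.

```c
#include <ctype.h>

/* Hypothetical stand-ins for the two flag bits described above. */
#define DEMO_WILDCARD_KEY  1u
#define DEMO_READONLY_KEY  2u

/*
 * Hash a key, lowercasing each character first unless the
 * read-only flag is set. The wildcard flag does not affect hashing here.
 */
static unsigned long
demo_key_hash(const char *key, unsigned flags)
{
    unsigned long  h = 0;
    unsigned char  c;

    for ( ; *key != '\0'; key++) {
        c = (unsigned char) *key;

        if (!(flags & DEMO_READONLY_KEY)) {
            c = (unsigned char) tolower(c);
        }

        h = h * 31 + c;
    }

    return h;
}
```

Without the read-only flag, "ABC.com" and "abc.com" hash to the same value; with it, they do not. Both flags are passed together as DEMO_WILDCARD_KEY | DEMO_READONLY_KEY.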

For an example of how this data structure is used, see src/http/ngx_http.c.

ngx_chain_t

An Nginx filter module processes data passed to it from another filter module or from a handler module (in effect, the HTTP response to be sent to the client). This data is passed in the form of a linked list (ngx_chain_t), and it may arrive in several batches. That is, the filter's handler may be called multiple times, each time with its own ngx_chain_t.

The structure is defined in src/core/ngx_buf.h|c. Let's take a look at the definition of ngx_chain_t.

    typedef struct ngx_chain_s       ngx_chain_t;

    struct ngx_chain_s {
        ngx_buf_t    *buf;
        ngx_chain_t  *next;
    };

Of the two fields, next points to the next node of the list and buf points to the actual data. Appending a node to the list is therefore easy: point the next pointer of the last element at the new node and set the new node's next to NULL.
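The append operation just described can be sketched like this. The demo_ types are simplified stand-ins for ngx_buf_t and ngx_chain_t, and demo_chain_append is a hypothetical helper, so treat it as an illustration rather than Nginx code.

```c
#include <stddef.h>

/* Simplified stand-in for ngx_buf_t. */
typedef struct {
    const char *data;
} demo_buf_t;

/* Same shape as ngx_chain_t: a buf pointer and a next pointer. */
typedef struct demo_chain_s  demo_chain_t;

struct demo_chain_s {
    demo_buf_t    *buf;
    demo_chain_t  *next;
};

/* Append cl to the chain whose head pointer is *head. */
static void
demo_chain_append(demo_chain_t **head, demo_chain_t *cl)
{
    demo_chain_t  **ll = head;

    /* find the pointer that currently terminates the chain */
    while (*ll != NULL) {
        ll = &(*ll)->next;
    }

    cl->next = NULL;
    *ll = cl;
}

/* Count the nodes of a chain. */
static size_t
demo_chain_length(const demo_chain_t *cl)
{
    size_t  n = 0;

    for ( ; cl != NULL; cl = cl->next) {
        n++;
    }

    return n;
}
```

Walking with a pointer to the link (ll) rather than a pointer to the node means an empty chain needs no special case: the first append simply sets the head.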

    ngx_chain_t *ngx_alloc_chain_link(ngx_pool_t *pool);

The function creates an ngx_chain_t object and returns a pointer to it, or NULL on failure.

    #define ngx_free_chain(pool, cl)                                             \
        cl->next = pool->chain;                                                  \
        pool->chain = cl

The macro frees an object of type ngx_chain_t. If you want to release an entire chain, iterate over the list and apply this macro to each node.

Note: Releasing an ngx_chain_t does not actually free its memory. The object is simply hung on a list held in the pool's chain field. The next time an ngx_chain_t is allocated from this pool, the first element is quickly taken off pool->chain and returned; if that list is empty, the object is allocated for real with ngx_palloc on the pool.
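The recycling scheme in the note can be sketched as follows. This is a minimal model under stated assumptions: demo_pool_t has only the chain field, malloc stands in for the pool allocation, and all demo_ names are hypothetical.

```c
#include <stdlib.h>

typedef struct demo_chain_s  demo_chain_t;

struct demo_chain_s {
    void          *buf;
    demo_chain_t  *next;
};

/* Toy pool: only the free list of recycled chain links is modelled. */
typedef struct {
    demo_chain_t  *chain;
} demo_pool_t;

/* Reuse a recycled link if one is available, otherwise allocate. */
static demo_chain_t *
demo_alloc_chain_link(demo_pool_t *pool)
{
    demo_chain_t  *cl = pool->chain;

    if (cl != NULL) {
        pool->chain = cl->next;     /* pop the first recycled link */
        return cl;
    }

    /* assumption: malloc replaces the pool allocation used by Nginx */
    return malloc(sizeof(demo_chain_t));
}

/* "Free" a link by pushing it onto the pool's free list. */
static void
demo_free_chain(demo_pool_t *pool, demo_chain_t *cl)
{
    cl->next = pool->chain;
    pool->chain = cl;
}
```

After a link is released with demo_free_chain, the next call to demo_alloc_chain_link returns that same link without touching the allocator.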

ngx_buf_t

ngx_buf_t is the actual data of each node in an ngx_chain_t list. The structure is in fact an abstraction that represents some specific kind of data: the data may be in a buffer in memory, it may be part of a file, or it may be pure metadata (metadata tells the reader of the list to treat the data it has read differently).

The data structure is defined in src/core/ngx_buf.h|c. Let's look at its definition.

    struct ngx_buf_s {
        u_char          *pos;
        u_char          *last;
        off_t            file_pos;
        off_t            file_last;

        u_char          *start;         /* start of buffer */
        u_char          *end;           /* end of buffer */
        ngx_buf_tag_t    tag;
        ngx_file_t      *file;
        ngx_buf_t       *shadow;

        /* the buf's content could be changed */
        unsigned         temporary:1;

        /*
         * the buf's content is in a memory cache or in a read only memory
         * and must not be changed
         */
        unsigned         memory:1;

        /* the buf's content is mmap()ed and must not be changed */
        unsigned         mmap:1;

        unsigned         recycled:1;
        unsigned         in_file:1;
        unsigned         flush:1;
        unsigned         sync:1;
        unsigned         last_buf:1;
        unsigned         last_in_chain:1;

        unsigned         last_shadow:1;
        unsigned         temp_file:1;

        /* STUB */ int   num;
    };
  • pos: When the data that the buf points to is in memory, the pos points to where the data begins.

  • last: When the data that the buf points to is in memory, last points to where the data ends.

  • file_pos: When the data that the buf points to is in the file, file_pos points to the offset in the file where the data started.

  • file_last: When the data that the buf points to is in the file, file_last points to the offset in the file where the data ends.

  • start: When the data that the buf points to is in memory, the block of memory as a whole may hold content that is spread across several bufs (for example, when other data is inserted in the middle of a piece of data, the piece has to be split). The start and end of these bufs then point to the start and end addresses of the memory block, while pos and last point to the beginning and end of the data actually contained in this buf.

  • end: See the explanation of start.

  • tag: This is actually a void * value; the user can associate any object with it, as long as the object is meaningful to the user.

  • file: When the buf contains content in the file, the file field points to the corresponding file object.

  • shadow: When this buf is a complete copy of all the fields of another buf, the two bufs actually point to the same block of memory or the same part of the same file, and the shadow fields of the two bufs point to each other. When releasing two such bufs, you must be especially careful and decide in advance where each resource is released; releasing a resource more than once may crash the program!

  • temporary: When set to 1, the buf's content is in a user-created block of memory and can be changed during filter processing without causing problems.

  • memory: When set to 1, the buf's content is in memory but must not be changed by the filter processing it.

  • mmap: When set to 1, the buf's content is in memory that was mapped from a file with mmap, and must not be changed by the filter processing it.

  • recycled: The buf can be recycled, that is, released. This field is typically used together with the shadow field: for a buf that was created with ngx_create_temp_buf and is the shadow of another buf, this field can be set to indicate that the buf can be released.

  • in_file: When set to 1, the buf's content is in a file.

  • flush: When a chain contains a buf whose flush field is 1, the data of that chain is output even if it is not the final data (last_buf marks that everything to be output is complete). The output is not limited by the postpone_output setting, though it is still subject to other conditions such as the send rate.

  • last_buf: The data may be passed to a filter in several chains; this field set to 1 indicates that this is the last buf of all.

  • last_in_chain: This buf is the last one in the current chain. Note that a buf with last_in_chain set is not necessarily the last_buf, but a buf with last_buf set must also be last_in_chain. This is because the data is passed to a filter module in multiple chains.

  • last_shadow: When you create a shadow of a buf, you typically set the last_shadow of the newly created buf to 1.

  • temp_file: Because of memory limits, some buf content sometimes has to be written to a temporary file on disk; this flag is then set.

To create this object, you can allocate it directly from an ngx_pool_t and then assign values to the relevant fields as needed. You can also use the two macros defined for this:

    #define ngx_alloc_buf(pool)  ngx_palloc(pool, sizeof(ngx_buf_t))
    #define ngx_calloc_buf(pool) ngx_pcalloc(pool, sizeof(ngx_buf_t))

These two macros are used like functions, and their meaning is self-evident.

To create a buf whose temporary field is 1 (that is, its content can be modified by subsequent filter modules), you can use the ngx_create_temp_buf function directly.

    ngx_buf_t *ngx_create_temp_buf(ngx_pool_t *pool, size_t size);

The function creates an ngx_buf_t object and returns a pointer to it, or NULL on failure.

For an object created this way, start and end point to the beginning and end of the newly allocated memory, and pos and last both point to its beginning, so that subsequent operations can store data in that memory.

  • pool: The pool from which the buf and the memory it uses are allocated.
  • size: The size of the memory used by the buf.
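How pos, last, start, and end relate right after creation, and how last advances as data is stored, can be sketched as below. The demo_ names are hypothetical, only the fields needed for the illustration are modelled, and malloc/calloc stand in for pool allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Only the fields needed for this illustration. */
typedef struct {
    unsigned char *pos;     /* start of the data held by the buf */
    unsigned char *last;    /* end of the data held by the buf */
    unsigned char *start;   /* start of the memory block */
    unsigned char *end;     /* end of the memory block */
    unsigned       temporary:1;
} demo_buf_t;

/* Create a buf with size bytes of writable memory (size > 0 assumed). */
static demo_buf_t *
demo_create_temp_buf(size_t size)
{
    demo_buf_t  *b = calloc(1, sizeof(demo_buf_t));

    if (b == NULL) {
        return NULL;
    }

    b->start = malloc(size);
    if (b->start == NULL) {
        free(b);
        return NULL;
    }

    b->pos = b->start;          /* no data yet: pos == last == start */
    b->last = b->start;
    b->end = b->start + size;
    b->temporary = 1;

    return b;
}

/* Copy up to n bytes into the free space [last, end) and advance last. */
static size_t
demo_buf_write(demo_buf_t *b, const void *src, size_t n)
{
    size_t  room = (size_t) (b->end - b->last);

    if (n > room) {
        n = room;
    }

    memcpy(b->last, src, n);
    b->last += n;

    return n;
}
```

After writing five bytes into a fresh eight-byte buf, last - pos is 5 and three bytes of room remain before end.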

To complement ngx_buf_t, Nginx defines the following macros for convenience.

    #define ngx_buf_in_memory(b)        (b->temporary || b->memory || b->mmap)

Returns whether the contents of this buf are in memory.

    #define ngx_buf_in_memory_only(b)   (ngx_buf_in_memory(b) && !b->in_file)

Returns whether the contents of this buf are only in memory and not in the file.

    #define ngx_buf_special(b)                                                   \
        ((b->flush || b->last_buf || b->sync)                                    \
         && !ngx_buf_in_memory(b) && !b->in_file)

Returns whether the buf is a special buf that contains only special flags and does not contain real data.

    #define ngx_buf_sync_only(b)                                                 \
        (b->sync                                                                 \
         && !ngx_buf_in_memory(b) && !b->in_file && !b->flush && !b->last_buf)

Returns whether the buf is a special buf that contains only the sync flag and not real data.

    #define ngx_buf_size(b)                                                      \
        (ngx_buf_in_memory(b) ? (off_t) (b->last - b->pos):                      \
                                (b->file_last - b->file_pos))

Returns the size of the data contained in the buf, whether it is in a file or in memory.
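A short sketch of how these classification macros behave on different kinds of buf. The structure below is a cut-down stand-in (with long long in place of off_t), and the demo_ macros mirror the definitions shown above with their arguments parenthesized; none of this is the Nginx header itself.

```c
/* Cut-down stand-in for ngx_buf_t with only the fields the macros read. */
typedef struct {
    unsigned char *pos;
    unsigned char *last;
    long long      file_pos;    /* off_t in Nginx */
    long long      file_last;   /* off_t in Nginx */

    unsigned       temporary:1;
    unsigned       memory:1;
    unsigned       mmap:1;
    unsigned       in_file:1;
    unsigned       flush:1;
    unsigned       sync:1;
    unsigned       last_buf:1;
} demo_buf_t;

/* content lives in memory */
#define demo_buf_in_memory(b)                                        \
    ((b)->temporary || (b)->memory || (b)->mmap)

/* flags only, no real data */
#define demo_buf_special(b)                                          \
    (((b)->flush || (b)->last_buf || (b)->sync)                      \
     && !demo_buf_in_memory(b) && !(b)->in_file)

/* size of the data, from memory pointers or from file offsets */
#define demo_buf_size(b)                                             \
    (demo_buf_in_memory(b) ? (long long) ((b)->last - (b)->pos)      \
                           : ((b)->file_last - (b)->file_pos))
```

For a buf with memory set and five bytes between pos and last, demo_buf_size yields 5; for a buf with in_file set and file offsets 10 and 30, it yields 20; a buf with only flush set is "special" and carries no data.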

ngx_list_t

As the name implies, ngx_list_t looks like a list data structure. That is only partly right. It does have some characteristics of a list: elements can be added and it grows by itself, so it is not limited by an initial capacity the way an array is, and like the list structures we commonly see, it is implemented internally with a linked list.

So how does it differ from an ordinary linked-list implementation? The difference lies in the nodes: an ordinary list node holds only one element, whereas each node of ngx_list_t is actually a fixed-size array.

At initialization, we set the size of each element and the capacity of each node's array. When an element is added, it goes into the array of the last node; if that array is full, a new node is appended to the list.

By now you should have a basic picture of this list structure. If not, don't worry; let's look at its definition, which is in src/core/ngx_list.h|c.

    typedef struct {
        ngx_list_part_t  *last;
        ngx_list_part_t   part;
        size_t            size;
        ngx_uint_t        nalloc;
        ngx_pool_t       *pool;
    } ngx_list_t;
  • last: Points to the last node of the list.
  • part: The first node of the list, which holds actual elements.
  • size: The amount of memory each element stored in the list occupies.
  • nalloc: The capacity of the fixed-size array contained in each node.
  • pool: The pool that the list uses to allocate memory.

Okay, let's look at the definition of each node.

    typedef struct ngx_list_part_s  ngx_list_part_t;
    struct ngx_list_part_s {
        void             *elts;
        ngx_uint_t        nelts;
        ngx_list_part_t  *next;
    };
  • elts: The start address of the memory in the node where the specific element is stored.

  • nelts: The number of elements already in the node. This value cannot be greater than the nalloc field of ngx_list_t.

  • next: Point to the next node.

Let's take a look at the functions provided for working with it.

    ngx_list_t *ngx_list_create(ngx_pool_t *pool, ngx_uint_t n, size_t size);

The function creates an ngx_list_t object and allocates the memory that holds the elements of the list's first node.

  • pool: The pool used to allocate memory.

  • n: The fixed length of each node (ngx_list_part_t), i.e. the maximum number of elements it can hold.

  • size: The amount of memory each element occupies.

  • Return value: A pointer to the newly created ngx_list_t object on success, or NULL on failure.

    void *ngx_list_push(ngx_list_t *list);

The function appends an element at the end of a given list and returns a pointer to the space where the new element is stored. If the append fails, NULL is returned.
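The push behaviour, including the addition of a new node when the last one is full, can be sketched as follows. The demo_ types copy the shape of the structures above but are stand-ins: malloc replaces the pool, the pool field is omitted, and cleanup is left out for brevity.

```c
#include <stdlib.h>

typedef struct demo_list_part_s  demo_list_part_t;

struct demo_list_part_s {
    void              *elts;    /* array of up to nalloc elements */
    size_t             nelts;   /* elements used in this node */
    demo_list_part_t  *next;
};

typedef struct {
    demo_list_part_t  *last;
    demo_list_part_t   part;    /* first node, embedded in the list */
    size_t             size;    /* size of one element */
    size_t             nalloc;  /* capacity of each node */
} demo_list_t;

/* Prepare an existing list object: allocate the first node's array. */
static int
demo_list_init(demo_list_t *l, size_t n, size_t size)
{
    l->part.elts = malloc(n * size);
    if (l->part.elts == NULL) {
        return -1;
    }

    l->part.nelts = 0;
    l->part.next = NULL;
    l->last = &l->part;
    l->size = size;
    l->nalloc = n;

    return 0;
}

/* Reserve room for one more element and return its address. */
static void *
demo_list_push(demo_list_t *l)
{
    void              *elt;
    demo_list_part_t  *last = l->last;

    if (last->nelts == l->nalloc) {
        /* the last node is full: allocate a new node and link it in */
        last = malloc(sizeof(demo_list_part_t));
        if (last == NULL) {
            return NULL;
        }

        last->elts = malloc(l->nalloc * l->size);
        if (last->elts == NULL) {
            free(last);
            return NULL;
        }

        last->nelts = 0;
        last->next = NULL;

        l->last->next = last;
        l->last = last;
    }

    elt = (char *) last->elts + l->size * last->nelts;
    last->nelts++;

    return elt;
}

/* Walk every node and add up the int elements stored in the list. */
static int
demo_list_sum(const demo_list_t *l)
{
    int                      sum = 0;
    size_t                   i;
    const int               *v;
    const demo_list_part_t  *part;

    for (part = &l->part; part != NULL; part = part->next) {
        v = part->elts;
        for (i = 0; i < part->nelts; i++) {
            sum += v[i];
        }
    }

    return sum;
}
```

With a node capacity of 2, pushing five ints produces three nodes holding 2, 2, and 1 elements. Traversal is a nested loop: the outer one over nodes, the inner one over the elements of each node.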

    static ngx_inline ngx_int_t
    ngx_list_init(ngx_list_t *list, ngx_pool_t *pool, ngx_uint_t n, size_t size);

This function is for an ngx_list_t object that already exists but whose first node has not yet had memory allocated for its elements; calling it allocates that memory for the first node of the list.

So when would an ngx_list_t object already exist while the element memory of its first node has not been allocated? When the variable was not created by calling ngx_list_create. For example, if a member of a structure is of type ngx_list_t, the member comes into existence when an object of that structure type is created, but the memory for the elements of its first node is not allocated.

In short, if an ngx_list_t variable was not created by calling ngx_list_create, you must call this function to initialize it before appending elements to the list; otherwise the program will crash!

ngx_queue_t

ngx_queue_t is the doubly linked list in Nginx, defined in src/core/ngx_queue.h|c. The prototype is as follows:

    typedef struct ngx_queue_s ngx_queue_t;

    struct ngx_queue_s {
        ngx_queue_t  *prev;
        ngx_queue_t  *next;
    };

Unlike the textbook approach of declaring a node's data members inside the list node structure, ngx_queue_t declares only the prev and next pointers. To use it, we first define a sentinel node (the nodes that later hold actual data we call data nodes), for example:

    ngx_queue_t free;

The next step is initialization, which is done with the ngx_queue_init() macro:

    ngx_queue_init(&free);

ngx_queue_init() is defined as follows:

    #define ngx_queue_init(q)     \
        (q)->prev = q;            \
        (q)->next = q

As you can see, both prev and next of the sentinel node initially point to the node itself, so the list is in fact empty. ngx_queue_empty() can be used to determine whether a list is empty, and its implementation is simple:

    #define ngx_queue_empty(h)    \
        (h == (h)->prev)

So how do you declare a list node that carries data? Just add a member of type ngx_queue_t to the corresponding structure. For example, ngx_http_upstream_keepalive_cache_t in ngx_http_upstream_keepalive_module:

    typedef struct {
        ngx_http_upstream_keepalive_srv_conf_t  *conf;

        ngx_queue_t                        queue;
        ngx_connection_t                  *connection;

        socklen_t                          socklen;
        u_char                             sockaddr[NGX_SOCKADDRLEN];
    } ngx_http_upstream_keepalive_cache_t;

Each such data node can be added to the list with ngx_queue_insert_head(), whose first parameter is the sentinel node and whose second is the data node, for example:

    ngx_http_upstream_keepalive_cache_t cache;
    ngx_queue_insert_head(&free, &cache.queue);

The relevant macros are defined as follows:

    #define ngx_queue_insert_head(h, x)                         \
        (x)->next = (h)->next;                                  \
        (x)->next->prev = x;                                    \
        (x)->prev = h;                                          \
        (h)->next = x

    #define ngx_queue_insert_after   ngx_queue_insert_head

    #define ngx_queue_insert_tail(h, x)                          \
        (x)->prev = (h)->prev;                                   \
        (x)->prev->next = x;                                     \
        (x)->next = h;                                           \
        (h)->prev = x

ngx_queue_insert_head() and ngx_queue_insert_after() both add a node at the head, and ngx_queue_insert_tail() adds a node at the tail. From the code, you can see that the prev of the sentinel node points to the tail data node of the list and its next points to the head data node. The two macros ngx_queue_head() and ngx_queue_last() return the head node and the tail node, respectively.
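The pointer manipulation in these macros can be traced with a small sketch. It uses functions instead of macros to keep the example safe to call from any context; the demo_ names are stand-ins that follow the same logic as the definitions above.

```c
#include <stddef.h>

typedef struct demo_queue_s  demo_queue_t;

struct demo_queue_s {
    demo_queue_t  *prev;
    demo_queue_t  *next;
};

/* An empty list: the sentinel points to itself in both directions. */
static void
demo_queue_init(demo_queue_t *q)
{
    q->prev = q;
    q->next = q;
}

static int
demo_queue_empty(const demo_queue_t *h)
{
    return h == h->prev;
}

/* Link x in right after the sentinel h. */
static void
demo_queue_insert_head(demo_queue_t *h, demo_queue_t *x)
{
    x->next = h->next;
    x->next->prev = x;
    x->prev = h;
    h->next = x;
}

/* Link x in right before the sentinel h, i.e. at the tail. */
static void
demo_queue_insert_tail(demo_queue_t *h, demo_queue_t *x)
{
    x->prev = h->prev;
    x->prev->next = x;
    x->next = h;
    h->prev = x;
}
```

Inserting a and then b at the head, and then c at the tail, yields the order b, a, c when the list is walked from the sentinel's next pointer; the sentinel's prev pointer ends up at c.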

So if you now have an ngx_queue_t *q that points to the queue member of a data node of type ngx_http_upstream_keepalive_cache_t in the list, how do you get at the data node itself? Nginx provides the ngx_queue_data() macro to obtain the address of the ngx_http_upstream_keepalive_cache_t, for example:

    ngx_http_upstream_keepalive_cache_t *cache =
        ngx_queue_data(q, ngx_http_upstream_keepalive_cache_t, queue);

As you may have guessed, ngx_queue_data works by subtracting an offset from the address:

    #define ngx_queue_data(q, type, link)                        \
        (type *) ((u_char *) q - offsetof(type, link))
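The address arithmetic can be checked with a small self-contained sketch. demo_item_t is a hypothetical data node invented for the example, and demo_queue_data follows the same offsetof technique as the macro above.

```c
#include <stddef.h>     /* offsetof, NULL */

typedef struct demo_queue_s  demo_queue_t;

struct demo_queue_s {
    demo_queue_t  *prev;
    demo_queue_t  *next;
};

/* Hypothetical data node: the link is not the first member on purpose. */
typedef struct {
    int           id;
    demo_queue_t  queue;
} demo_item_t;

/* Step back from the embedded link to the start of the enclosing struct. */
#define demo_queue_data(q, type, link)                               \
    ((type *) ((unsigned char *) (q) - offsetof(type, link)))

/* Recover the data node from a pointer to its link and read a field. */
static int
demo_item_id(demo_queue_t *q)
{
    demo_item_t  *item = demo_queue_data(q, demo_item_t, queue);

    return item->id;
}
```

Given a demo_item_t whose id is 42, passing the address of its queue member to demo_item_id returns 42, because subtracting the member's offset lands exactly on the start of the item.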

Nginx also provides the ngx_queue_remove() macro to remove a data node from a list, and ngx_queue_add() to append one list to another.