May 23, 2021 Nginx Getting started
1. Analysis of the filter module
3. The response head filter function
4. The response body filter function
The ngx_chain_t structure is very simple; it is a singly linked list:
typedef struct ngx_chain_s ngx_chain_t;
struct ngx_chain_s {
    ngx_buf_t    *buf;
    ngx_chain_t  *next;
};
In the filter module, all output is organized as a singly linked list. This list is designed to match Nginx's streaming output mode: each time Nginx reads a portion of the content, that portion is put on the list and output. The benefit of this design is that it is simple and non-blocking. The drawback is that operating on content that spans several list nodes is cumbersome; when that is needed, the only option is often to cache the contents of the list.
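As a minimal sketch of how a filter walks such a list, the following uses simplified stand-ins for the Nginx types rather than the real headers. The sketch_ type names and the chain_bytes helper are hypothetical and not part of Nginx; only the pos and last pointers of the buffer are kept.

```c
#include <stddef.h>

/* Simplified stand-in for ngx_buf_t: only the content pointers are kept. */
typedef struct sketch_buf_s {
    unsigned char *pos;   /* start of the content */
    unsigned char *last;  /* end of the content */
} sketch_buf_t;

/* Simplified stand-in for ngx_chain_t. */
typedef struct sketch_chain_s sketch_chain_t;
struct sketch_chain_s {
    sketch_buf_t   *buf;
    sketch_chain_t *next;
};

/* Hypothetical helper: total number of content bytes held by a chain. */
size_t
chain_bytes(const sketch_chain_t *in)
{
    size_t                total = 0;
    const sketch_chain_t *cl;

    /* The list is one-way, so it can only be walked from head to tail. */
    for (cl = in; cl != NULL; cl = cl->next) {
        total += (size_t) (cl->buf->last - cl->buf->pos);
    }

    return total;
}
```

Because each node only points forward, anything that needs to look back across nodes has to keep its own copy of the earlier content, which is the caching cost mentioned above.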
The payload of each list node is an ngx_buf_t, which is used very widely. Let's look at the code of the structure:
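struct ngx_buf_s {
    u_char          *pos;        /* start of the actual content in this buffer */
    u_char          *last;       /* end of the actual content in this buffer */
    off_t            file_pos;   /* start of the actual content in the file */
    off_t            file_last;  /* end of the actual content in the file */
    u_char          *start;      /* start of the memory allocated for the buffer */
    u_char          *end;        /* end of the memory allocated for the buffer */
    ngx_buf_tag_t    tag;        /* marks which module the buffer belongs to */
    ngx_file_t      *file;       /* file referenced by the buffer */
    /* refers to the buffer that was replaced, so that this shadow
     * buffer can be released once all buffers have been output
     */
    ngx_buf_t       *shadow;
    /* the buf's content could be changed */
    unsigned         temporary:1;
    /*
     * the buf's content is in a memory cache or in a read only memory
     * and must not be changed
     */
    unsigned         memory:1;
    /* the buf's content is mmap()ed and must not be changed */
    unsigned         mmap:1;
    unsigned         recycled:1; /* the memory can be output and recycled */
    unsigned         in_file:1;  /* the buffer's content is in a file */
    /* output all of the buffer's content immediately; used heavily in the gzip module */
    unsigned         flush:1;
    /* usually carried by the last buffer of a segment of the output chain to
     * indicate that it may be output; some zero-length buffers may also set it
     */
    unsigned         sync:1;
    /* the last buffer across all requests, subrequests included */
    unsigned         last_buf:1;
    /* the last buffer in the current request's output chain */
    unsigned         last_in_chain:1;
    /* the last buffer in the shadow chain; the buffer can now be released */
    unsigned         last_shadow:1;
    /* whether this is a temporary file */
    unsigned         temp_file:1;
    /* for statistics: the number of times used */
    /* STUB */ int   num;
};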
A typical buffer structure represents a piece of memory: start and end mark the beginning and end of the allocated memory, while pos and last mark the actual content. Once content has been processed, pos moves forward; when new content is read in, last moves forward. A buffer can therefore be reused across multiple calls. If last equals end, the memory is used up; if pos equals last, all of the content has been processed. The pointers always satisfy start <= pos <= last <= end.
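The pointer movement described above can be sketched as follows. This is an illustration under stated assumptions, not Nginx code: sketch_buf_t keeps only the four memory pointers of ngx_buf_t, and the buf_init, buf_fill, buf_consume, buf_is_full and buf_is_drained helpers are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in keeping only the four memory pointers of ngx_buf_t. */
typedef struct {
    unsigned char *start;  /* start of the allocated memory */
    unsigned char *pos;    /* start of the unprocessed content */
    unsigned char *last;   /* end of the content */
    unsigned char *end;    /* end of the allocated memory */
} sketch_buf_t;

void
buf_init(sketch_buf_t *b, unsigned char *mem, size_t size)
{
    b->start = b->pos = b->last = mem;
    b->end = mem + size;
}

/* Append up to n bytes; returns the number of bytes actually stored. */
size_t
buf_fill(sketch_buf_t *b, const void *src, size_t n)
{
    size_t room = (size_t) (b->end - b->last);

    if (n > room) {
        n = room;
    }
    memcpy(b->last, src, n);
    b->last += n;            /* newly read content moves last forward */
    return n;
}

/* Mark up to n bytes as processed; returns the number of bytes consumed. */
size_t
buf_consume(sketch_buf_t *b, size_t n)
{
    size_t avail = (size_t) (b->last - b->pos);

    if (n > avail) {
        n = avail;
    }
    b->pos += n;             /* processed content moves pos forward */
    return n;
}

/* last == end: the memory is used up. */
int
buf_is_full(const sketch_buf_t *b)
{
    return b->last == b->end;
}

/* pos == last: all content has been processed. */
int
buf_is_drained(const sketch_buf_t *b)
{
    return b->pos == b->last;
}
```

With an 8-byte block, filling 6 bytes and then 4 more stores only 2 of the second write, at which point the buffer is full; consuming 8 bytes then leaves it drained.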
The main purpose of the response header filter function is to process the HTTP response headers, which can be modified or deleted as needed. The response header filter function runs before the response body filter function and is called only once per request, so it is generally a good place to initialize the filter module.
The response header filter function has a single entry point:
ngx_int_t
ngx_http_send_header(ngx_http_request_t *r)
{
    ...
    return ngx_http_top_header_filter(r);
}
This function is called when a response is sent to the client; the header filters then run in the order described in the previous section. The return value is typically NGX_OK, NGX_ERROR, or NGX_AGAIN, indicating that processing succeeded, failed, or is not yet finished, respectively.
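The way a single top pointer leads into a whole sequence of filters can be sketched without the Nginx headers. Everything prefixed sk_ below is hypothetical; the sketch only mirrors the pattern in which each module saves the current top filter as its "next" and then installs itself as the new top, so the module initialized last runs first.

```c
#include <stddef.h>

#define SK_OK      0
#define SK_ERROR  -1

/* Hypothetical, stripped-down request: a status plus a trace of calls. */
typedef struct {
    int   status;
    char  trace[8];
    int   ntrace;
} sk_request_t;

typedef int (*sk_header_filter_pt)(sk_request_t *r);

/* Terminal filter, standing in for the module that finally writes headers. */
int
sk_write_header(sk_request_t *r)
{
    r->trace[r->ntrace++] = 'W';
    return SK_OK;
}

sk_header_filter_pt  sk_top_header_filter = sk_write_header;

/* Module A: passes every request on to the next filter. */
sk_header_filter_pt  sk_a_next;

int
sk_a_header_filter(sk_request_t *r)
{
    r->trace[r->ntrace++] = 'A';
    return sk_a_next(r);
}

void
sk_a_init(void)
{
    sk_a_next = sk_top_header_filter;       /* remember the old top */
    sk_top_header_filter = sk_a_header_filter;
}

/* Module B: stops the chain with an error when the status is unset. */
sk_header_filter_pt  sk_b_next;

int
sk_b_header_filter(sk_request_t *r)
{
    r->trace[r->ntrace++] = 'B';
    if (r->status == 0) {
        return SK_ERROR;
    }
    return sk_b_next(r);
}

void
sk_b_init(void)
{
    sk_b_next = sk_top_header_filter;
    sk_top_header_filter = sk_b_header_filter;
}

/* Single entry point, mirroring the shape of ngx_http_send_header. */
int
sk_send_header(sk_request_t *r)
{
    return sk_top_header_filter(r);
}
```

If sk_a_init() runs before sk_b_init(), a request passes through B, then A, then the terminal writer. A filter that returns an error instead of calling its next pointer ends processing for that request.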
You can think of the HTTP response headers as being stored in a hash table, so individual headers can easily be found and modified inside Nginx. The ngx_http_header_filter_module filter module then combines all the HTTP headers into one complete buffer, and finally the ngx_http_write_filter_module filter module outputs that buffer.
Following the order of the filter modules given in the previous section, each one is described in turn:
Filter module | Description |
---|---|
ngx_http_not_modified_filter_module | On by default. If the If-Modified-Since value of the request equals the Last-Modified value of the response, the content has not changed: the whole response body is discarded and 304 is returned. |
ngx_http_range_body_filter_module | On by default. Has only a response body filter function. Supports the range feature: if the request contains a Range header, only the requested range of the content is sent. |
ngx_http_copy_filter_module | Always on. Has only a response body filter function. Its main job is to read file contents into memory for processing. |
ngx_http_headers_filter_module | Always on. Can set the Expires and Cache-Control headers and can add headers with arbitrary names. |
ngx_http_userid_filter_module | Off by default. Can add cookies that identify users for statistics. |
ngx_http_charset_filter_module | Off by default. Can add a charset or convert content from one character set to another; multi-byte character sets are not supported. |
ngx_http_ssi_filter_module | Off by default. Filters SSI requests and can issue subrequests to fetch included files. |
ngx_http_postpone_filter_module | Always on. Merges the output chains of subrequests and the main request. |
ngx_http_gzip_filter_module | Off by default. Supports streaming compression of content. |
ngx_http_range_header_filter_module | On by default. Has only a response header filter function. Parses the Range header and produces the headers of the range response. |
ngx_http_chunked_filter_module | On by default. Enabled automatically for HTTP/1.1 responses that lack a Content-Length. |
ngx_http_header_filter_module | Always on. Assembles all headers into a complete HTTP header. |
ngx_http_write_filter_module | Always on. Copies the output chain to r->out and then outputs the content. |
The response body filter function filters the response body. ngx_http_top_body_filter may be executed more than once per request, and its entry point is ngx_http_output_filter:
ngx_int_t
ngx_http_output_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
    ngx_int_t          rc;
    ngx_connection_t  *c;
    c = r->connection;
    rc = ngx_http_top_body_filter(r, in);
    if (rc == NGX_ERROR) {
        /* NGX_ERROR may be returned by any filter */
        c->error = 1;
    }
    return rc;
}
ngx_http_output_filter can be called by an ordinary static-content handler module or from inside an upstream module. For request processing it serves the same purpose in both cases: the response content is filtered and then sent to the client.
The response body filter function of a specific module follows a similar pattern:
static ngx_int_t
ngx_http_example_body_filter(ngx_http_request_t *r, ngx_chain_t *in)
{
    ...
    /* hand the (possibly modified) chain to the next body filter */
    return ngx_http_next_body_filter(r, in);
}
The return value of the function is typically NGX_OK, NGX_ERROR, NGX_AGAIN, which indicate successful, failed, and unfinished processing, respectively.
The response body is stored in the singly linked list in, which is generally not very long; the in parameter may sometimes be NULL. Each node of in holds a buf structure. For static files the buf size defaults to 32K; when Nginx acts as a reverse proxy for an application, it may be 4K or 8K. To keep memory consumption low, Nginx generally does not allocate large amounts of memory. The principle is to send data out as soon as a certain amount has been received. A simple example is Nginx's chunked_filter module: when there is no Content-Length, it streams the output with a length added to each chunk, so the browser can receive and display content as it arrives.
In a response body filter module, pay particular attention to the flag bits of each buf; the full description is in the "Related Structures" section. If a buf carries the last flag, it is the final buf: it can be output directly and the request then ends. If a buf carries the flush flag, it must be output immediately and cannot be cached. If the whole buffer has been processed and holds no data, the sync flag can be set on it to indicate that it is used for synchronization only.
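A minimal sketch of how a filter might act on these flags, using simplified stand-in types. The sk_ names and the threshold parameter are assumptions made for illustration and do not correspond to actual Nginx identifiers.

```c
#include <stddef.h>

/* Simplified stand-in for ngx_buf_t with three of its flag bits. */
typedef struct {
    unsigned char *pos;
    unsigned char *last;
    unsigned       flush:1;     /* must be output immediately */
    unsigned       sync:1;      /* carries no data, synchronization only */
    unsigned       last_buf:1;  /* final buffer of the response */
} sk_buf_t;

typedef struct sk_chain_s sk_chain_t;
struct sk_chain_s {
    sk_buf_t   *buf;
    sk_chain_t *next;
};

/* A zero-length buffer is only meaningful if it carries a control flag. */
int
sk_buf_is_special(const sk_buf_t *b)
{
    return b->pos == b->last && (b->flush || b->sync || b->last_buf);
}

/*
 * Returns 1 if the chain should be output now, 0 if it may be held back.
 * "threshold" is a hypothetical minimum amount of data worth sending.
 */
int
sk_should_send(const sk_chain_t *in, size_t threshold)
{
    size_t            size = 0;
    int               flush = 0, last = 0;
    const sk_chain_t *cl;

    for (cl = in; cl != NULL; cl = cl->next) {
        size += (size_t) (cl->buf->last - cl->buf->pos);
        if (cl->buf->flush) {
            flush = 1;
        }
        if (cl->buf->last_buf) {
            last = 1;
        }
    }

    /* flush and last force output; sync alone does not */
    return last || flush || size >= threshold;
}
```

In this sketch a small chain that ends in a sync-only buffer may be held back, while the same chain ending in a flush or last buffer is sent regardless of its size.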
After all the filter modules have run, the final write_filter module copies the in output chain to the end of the r->out output chain and then calls the sendfile or writev interface to output it. Because Nginx uses non-blocking sockets, a write does not necessarily send everything, and some data may remain in the chain. On the next call, Nginx continues trying to send until all of it has been written.
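The append-then-retry behaviour can be sketched as follows. This is a simulation under stated assumptions: sk_send_some stands in for a non-blocking sendfile or writev call that accepts at most limit bytes, and all sk_ names are hypothetical.

```c
#include <stddef.h>

typedef struct {
    unsigned char *pos;   /* first byte not yet sent */
    unsigned char *last;  /* end of the content */
} sk_buf_t;

typedef struct sk_chain_s sk_chain_t;
struct sk_chain_s {
    sk_buf_t   *buf;
    sk_chain_t *next;
};

/* Append the chain "in" to the end of *out (links are shared, not copied). */
void
sk_chain_append(sk_chain_t **out, sk_chain_t *in)
{
    sk_chain_t **ll = out;

    while (*ll != NULL) {
        ll = &(*ll)->next;
    }
    *ll = in;
}

/*
 * Simulate a non-blocking send that accepts at most "limit" bytes.
 * Advances pos in each buffer and returns the first link that still
 * holds unsent data, or NULL if everything was sent.
 */
sk_chain_t *
sk_send_some(sk_chain_t *out, size_t limit)
{
    while (out != NULL) {
        size_t n = (size_t) (out->buf->last - out->buf->pos);

        if (n > limit) {
            out->buf->pos += limit;   /* partial write: keep the remainder */
            return out;
        }
        out->buf->pos += n;
        limit -= n;
        out = out->next;
    }

    return NULL;
}
```

The caller stores the returned link as the new head of its pending output and calls sk_send_some again when the socket becomes writable, which is the "continue until it succeeds" loop described above.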
One notable feature of Nginx filter modules is that they can issue subrequests: while filtering the response content, a module can send new requests, and Nginx stitches the contents of the multiple responses together into a normal response body in the order in which the subrequests were made. For a simple example, see the addition module.
How does Nginx guarantee the order of parent and child requests? When a subrequest is issued, the ngx_http_subrequest function is called to insert the subrequest into the parent request's r->postponed list. Subrequests are then called in turn once the main request has been executed. A subrequest has the full lifetime and processing flow of any request, and it also passes through the filter modules.
The key piece is the postpone_filter module, which stitches together the response content of the main request and the subrequests. r->postponed is a list that holds the parent and child requests in order: if an earlier request has not completed, the content of the request after it is not output. Once the earlier request has completed and been output, the next request can be output, and when all subrequests have completed, the whole response has been output.
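The ordering rule can be illustrated with a deliberately simplified model. This is not how the postpone filter is implemented; sk_postponed_t and sk_flush_postponed are hypothetical names, and each entry just holds the finished body of one request plus a done flag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry of a postponed list: one request's output. */
typedef struct sk_postponed_s sk_postponed_t;
struct sk_postponed_s {
    const char     *body;  /* response content of this request */
    int             done;  /* nonzero once the request has completed */
    sk_postponed_t *next;
};

/*
 * Append the bodies of completed entries to "out" in list order and stop
 * at the first entry that is not done yet (or that no longer fits in the
 * "cap" bytes of "out"). Returns the new head of the list.
 */
sk_postponed_t *
sk_flush_postponed(sk_postponed_t *head, char *out, size_t cap)
{
    while (head != NULL && head->done) {
        size_t used = strlen(out);
        size_t n = strlen(head->body);

        if (used + n + 1 > cap) {
            break;
        }
        memcpy(out + used, head->body, n + 1);  /* copies the terminator too */
        head = head->next;
    }

    return head;
}
```

With entries A (done), B (not done) and C (done), the first flush emits only A. C is held back even though it is complete, and is emitted after B once B finishes.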
The structures involved in Nginx filter modules, mainly chain and buf, are very simple. Filter modules use these two structures very frequently, so Nginx applies a free-list style of reuse: chain and buf structures that are no longer in use are placed on a fixed free list for later reuse.
For example, released chain links are stored in the pool->chain variable of the shared memory pool structure. buf structures have no free list shared between modules; each module keeps them in its own cached free list. For bufs there is also a busy list, which holds bufs that are currently being output; once a buf's output is complete, it can be released and reused.
Operation | Function name |
---|---|
chain allocation | ngx_alloc_chain_link |
chain release | ngx_free_chain |
buf allocation | ngx_chain_get_free_buf |
buf release | ngx_chain_update_chains |
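The free-list idea behind chain allocation and release can be sketched as follows. This is a simplified model and not the Nginx source: sk_pool_t keeps only a chain free list, malloc stands in for pool allocation, and the sk_ function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct sk_chain_s sk_chain_t;
struct sk_chain_s {
    void       *buf;
    sk_chain_t *next;
};

/* Hypothetical pool that keeps only the free list of released links. */
typedef struct {
    sk_chain_t *chain;
} sk_pool_t;

/*
 * Take a link from the free list if one is available, otherwise allocate.
 * The caller must set buf and next; a reused link holds stale values.
 */
sk_chain_t *
sk_alloc_chain_link(sk_pool_t *pool)
{
    sk_chain_t *cl = pool->chain;

    if (cl != NULL) {
        pool->chain = cl->next;
        return cl;
    }

    return malloc(sizeof(sk_chain_t));
}

/* Releasing a link just pushes it onto the free list for later reuse. */
void
sk_free_chain(sk_pool_t *pool, sk_chain_t *cl)
{
    cl->next = pool->chain;
    pool->chain = cl;
}
```

Allocating, releasing and allocating again returns the same link, so a busy filter cycles through a small set of links instead of allocating a new one for every buffer.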
Because Nginx is designed around a streaming output structure, a filter that needs to process the response as a whole has to cache some of the buf content. Such filter modules are often complex; examples are sub, ssi, and gzip. Their designs are very flexible, so only the general design principles are outlined here:
The input chain in must be copied. After passing through a caching filter module, the output chain is often completely different from the input chain, so the input is copied with the ngx_chain_add_copy function.
The module typically keeps its own free and busy list pools to make buf allocation more efficient.
If large chunks of content need to be allocated, the module typically allocates a fixed-size memory block and sets the recycled flag to indicate that it can be reused.
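The free and busy lists from these principles can be sketched as below. This is an illustration under assumptions, not the Nginx implementation: the sk_ names are hypothetical, malloc stands in for pool allocation, and unlike the real buf allocation helper, sk_get_free_buf here also allocates the fixed-size memory block.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned char *start;      /* start of the fixed-size block */
    unsigned char *pos;        /* first byte not yet output */
    unsigned char *last;       /* end of the content */
    unsigned char *end;        /* end of the fixed-size block */
    unsigned       recycled:1; /* block may be reused after output */
} sk_buf_t;

typedef struct sk_chain_s sk_chain_t;
struct sk_chain_s {
    sk_buf_t   *buf;
    sk_chain_t *next;
};

/* Take a buffer from the free list, or allocate a new fixed-size one. */
sk_chain_t *
sk_get_free_buf(sk_chain_t **free_list, size_t size)
{
    sk_chain_t *cl = *free_list;
    sk_buf_t   *b;

    if (cl != NULL) {                  /* reuse a released buffer */
        *free_list = cl->next;
        cl->next = NULL;
        return cl;
    }

    cl = malloc(sizeof(sk_chain_t));
    b = malloc(sizeof(sk_buf_t));
    if (cl == NULL || b == NULL) {
        free(cl);
        free(b);
        return NULL;
    }

    b->start = malloc(size);
    if (b->start == NULL) {
        free(cl);
        free(b);
        return NULL;
    }

    b->pos = b->last = b->start;
    b->end = b->start + size;
    b->recycled = 1;
    cl->buf = b;
    cl->next = NULL;
    return cl;
}

/*
 * Move fully output buffers from the head of *busy to *free_list and
 * reset their pointers. Stops at the first buffer with unsent data.
 */
void
sk_update_chains(sk_chain_t **free_list, sk_chain_t **busy)
{
    while (*busy != NULL) {
        sk_chain_t *cl = *busy;

        if (cl->buf->pos != cl->buf->last) {
            break;
        }
        cl->buf->pos = cl->buf->last = cl->buf->start;
        *busy = cl->next;
        cl->next = *free_list;
        *free_list = cl;
    }
}
```

A module would call sk_get_free_buf when it needs room for new output, link the result onto its busy list when handing it to the next filter, and call sk_update_chains after each output attempt so that drained blocks return to the free list.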