 Computer Science Atlas
Code Review

C Examples: Radix Tree (Trie) Implementation from nginx

October 29, 2020|Last Updated March 31, 2021

Context

Web servers like nginx allow the admin to block access from certain client IP addresses and/or IP address ranges (e.g., 192.168.0.0/16). On every request, the server checks if the current client's IP address is marked as blocked, so it can decide whether to deny access. The lookup must be quick in order for nginx to maintain fast response time performance, which means it needs to solve the following problem:

How do you efficiently store and lookup IP addresses and ranges?

nginx's choice is to store the bit string representations of IP addresses in a radix tree (a.k.a. trie) data structure. A bit string radix tree (a.k.a. bit string trie) is a trie where each internal node has up to two children. When traversing down the trie, you move:

• left if the next bit in the string is 0; and
• right if the next bit in the string is 1.

This also allows IP address ranges to be stored in the trie, since an IP address range can be expressed as the front part of an IP address.

For example, the IPv4 address 192.168.1.54 is simply a number composed of 4 bytes: the first byte has value 192, the second byte has value 168, the third byte has value 1, and the last byte has value 54. In other words, the dots separate the bytes. Since each byte is 8 bits, an IPv4 address is simply a 32-bit number. 192.168.1.54 can be represented as the following bit string (the spaces are simply included for readability):

Out
 1100 0000 1010 1000 0000 0001 0011 0110

Similarly, an IP address range 192.168.0.0/16 can be represented as a 16-bit number:

Out
 1100 0000 1010 1000

Both bit strings can easily be stored in a bit string radix tree and efficiently accessed on each web request.

Here is an animation of how a bit string is inserted into a bit string radix tree:

Code

nginx has 32-bit radix tree functions for IPv4 addresses and 128-bit functions for IPv6 addresses. We'll examine the 32-bit functions here, but the 128-bit functions are conceptually the same.

Data Structures

C
 1 2 3 4 5 6 7 typedef struct { ngx_radix_node_t *root; ngx_pool_t *pool; ngx_radix_node_t *free; char *start; size_t size; } ngx_radix_tree_t;

C

Search

C
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 uintptr_t ngx_radix32tree_find(ngx_radix_tree_t *tree, uint32_t key) { uint32_t bit; uintptr_t value; ngx_radix_node_t *node; bit = 0x80000000; value = NGX_RADIX_NO_VALUE; node = tree->root; while (node) { if (node->value != NGX_RADIX_NO_VALUE) { value = node->value; } if (key & bit) { node = node->right; } else { node = node->left; } bit >>= 1; } return value; }

Iterating Bit-by-Bit

In the function above, bit is initialized to 0x80000000, which in binary form is just a 1 followed by 31 0s:

Out
 10000000000000000000000000000000

On each iteration of the while loop, line 24 (bit >>= 1) moves the 1 one position to the right. Thus, if we think of bit as a bit array of length 32, the index is at 0 on the first while loop iteration, then at index 1, then at index 2, and so on:

Out
 Iteration bit ------------------------------------------------- 0 10000000000000000000000000000000 1 01000000000000000000000000000000 2 00100000000000000000000000000000 3 00010000000000000000000000000000 4 00001000000000000000000000000000

Thus if the 1 in bit is at index 0, on line 17, key & bit tells us whether key index 0 is a 1 or 0. Then on the next iteration, key & bit tells us whether key index 1 is a 1 or 0. And so on.

To summarize, this is just a way of iterating on each bit of key from left to right.

Insert

C
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 ngx_int_t ngx_radix32tree_insert(ngx_radix_tree_t *tree, uint32_t key, uint32_t mask, uintptr_t value) { uint32_t bit; ngx_radix_node_t *node, *next; bit = 0x80000000; node = tree->root; next = tree->root; while (bit & mask) { if (key & bit) { next = node->right; } else { next = node->left; } if (next == NULL) { break; } bit >>= 1; node = next; } if (next) { if (node->value != NGX_RADIX_NO_VALUE) { return NGX_BUSY; } node->value = value; return NGX_OK; } while (bit & mask) { next = ngx_radix_alloc(tree); if (next == NULL) { return NGX_ERROR; } next->right = NULL; next->left = NULL; next->parent = node; next->value = NGX_RADIX_NO_VALUE; if (key & bit) { node->right = next; } else { node->left = next; } bit >>= 1; node = next; } node->value = value; return NGX_OK; }

Delete

C
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 ngx_int_t ngx_radix32tree_delete(ngx_radix_tree_t *tree, uint32_t key, uint32_t mask) { uint32_t bit; ngx_radix_node_t *node; bit = 0x80000000; node = tree->root; while (node && (bit & mask)) { if (key & bit) { node = node->right; } else { node = node->left; } bit >>= 1; } if (node == NULL) { return NGX_ERROR; } if (node->right || node->left) { if (node->value != NGX_RADIX_NO_VALUE) { node->value = NGX_RADIX_NO_VALUE; return NGX_OK; } return NGX_ERROR; } for ( ;; ) { if (node->parent->right == node) { node->parent->right = NULL; } else { node->parent->left = NULL; } node->right = tree->free; tree->free = node; node = node->parent; if (node->right || node->left) { break; } if (node->value != NGX_RADIX_NO_VALUE) { break; } if (node->parent == NULL) { break; } } return NGX_OK; }

nginx 1.19.3