
Ali Interviewer: Is HashMap familiar? Okay, let's talk about the Redis dictionary!


Jun 01, 2021




This article is reproduced from the WeChat public account Java Geek Technology. Author: Duck Blood Fans

Recently, a friend of A Fen's went out for an interview and came back to complain to A Fen: the interviewer didn't play by the usual script and completely threw off his rhythm.

Here's what happened. Earlier in the interview, the interviewer asked a few Java questions, which the friend answered well, and then the interviewer said: "Your Java fundamentals look solid. Are you familiar with Java HashMap?"

My friend replied: "I use it often at work, and I've read the source code."

He thought to himself: great, I prepared for this one, just ask it.

Who knew the interviewer would follow up with:

"Well, let's talk about the Redis dictionary."

That caught him completely off guard.

Since the friend had never studied the Redis dictionary much, he simply answered that he didn't know.

"Of course, if you really don't know in the interview, then answer don't understand, directly next question, don't mess around."

Still, A Fen thinks this one was a pity: the basic principles of the Redis dictionary are actually quite similar to those of HashMap, so we can lean on that similarity. Even without a perfect answer, we could at least score a passing grade.

If you do run into this question in an interview, you can answer it from the following three aspects:

  • Data structure
  • The process of adding elements
  • Expansion (rehash)

Dictionary data structure

The dictionary may sound unfamiliar, but we all know that Redis provides key-value (KV) queries, and those KV pairs are in fact stored in a dictionary under the hood.

In addition, Redis supports a variety of data types, one of which is the Hash key, which can also store KV data.

When A Fen first learned about this data type, he assumed it was implemented with a dictionary. That is not the case: a newly created Hash key uses another data structure by default, the ziplist (compressed list), which saves memory.

However, once either of the following conditions is met, the underlying data structure of the Hash key is converted to a dictionary to speed up queries (see the sketch after this list):

  • The length of any key or value in the hash is greater than server.hash_max_ziplist_value (default: 64).
  • The number of nodes in the compressed list is greater than server.hash_max_ziplist_entries (default: 512).
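To make the two conditions concrete, here is a minimal sketch of the conversion check in C. The constant and function names are illustrative and simplified, not the actual Redis source:

```c
#include <stddef.h>

/* Simplified sketch of the ziplist -> dict conversion check for a Hash key.
 * Constant and function names are illustrative, not the exact Redis source. */
#define HASH_MAX_ZIPLIST_VALUE   64    /* default server.hash_max_ziplist_value */
#define HASH_MAX_ZIPLIST_ENTRIES 512   /* default server.hash_max_ziplist_entries */

static int hash_should_convert_to_dict(size_t field_len, size_t value_len,
                                        size_t entries) {
    /* Condition 1: a field or value is longer than the threshold. */
    if (field_len > HASH_MAX_ZIPLIST_VALUE || value_len > HASH_MAX_ZIPLIST_VALUE)
        return 1;
    /* Condition 2: the compressed list holds too many entries. */
    if (entries > HASH_MAX_ZIPLIST_ENTRIES)
        return 1;
    return 0;
}
```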

When a Redis dictionary is created, it holds an array of two hash tables by default.

The ht[0] table is allocated the first time a key-value pair is added to the dictionary, while the other table, ht[1], is only allocated during expansion or shrinking (covered below).

[Figure 1]

The hash table inside the dictionary is essentially the Java HashMap we already know: Java implements it with an array plus linked lists (and red-black trees), and the Redis hash table uses a similar structure.

The hash table structure looks like this:

[Figure 2]

The table property is an array whose elements point to dictEntry structures; a dictEntry is just like the Entry type in HashMap and stores one KV pair.

To handle hash collisions, each dictEntry also has a next pointer that points to the next dictEntry, forming a linked list of dictEntry nodes.

[Figure 3]
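To make the layout concrete, here is a trimmed-down sketch of the three structures described above. The field names loosely follow Redis's dict.h, but several fields (the type/privdata pointers, the value union, iterator counters, and so on) are omitted for clarity, so treat this as an illustration rather than the real source:

```c
/* Simplified sketch of the dictionary layout; not the exact dict.h code. */
typedef struct dictEntry {
    void *key;
    void *val;
    struct dictEntry *next;   /* chains entries that collide in the same slot */
} dictEntry;

typedef struct dictht {
    dictEntry **table;        /* the bucket array described above */
    unsigned long size;       /* number of slots in the array */
    unsigned long sizemask;   /* size - 1, used to map a hash to a slot */
    unsigned long used;       /* number of key-value pairs stored */
} dictht;

typedef struct dict {
    dictht ht[2];             /* ht[0] for normal use, ht[1] only during rehash */
    long rehashidx;           /* -1 when no rehash is in progress */
} dict;
```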

Looking back at HashMap in Java, we can see that the data structure is basically the same.

The difference is that, since JDK 1.8, HashMap converts a bucket's linked list into a red-black tree when the list grows too long, so that lookups do not degrade.

Let's start by adding new elements to understand how this works.

The process of adding elements

When we add an element to a brand-new dictionary, space is first allocated for the ht[0] table, whose array size defaults to 4 (DICT_HT_INITIAL_SIZE).

The key of the new element is hashed to determine its slot in the hash table array, and the element is then placed at that slot, as shown in the figure:

[Figure 4]
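As a small sketch of how the slot is derived: because table sizes are powers of two, the hash can be masked with sizemask (size - 1) instead of using a modulo. The function name below is illustrative:

```c
#include <stdint.h>

/* Sketch: mapping a key's hash to a slot index. Redis-style tables mask the
 * hash with sizemask = size - 1, which works because sizes are powers of two. */
static unsigned long slot_for_hash(uint64_t hash, unsigned long table_size) {
    unsigned long sizemask = table_size - 1;    /* e.g. size 4 -> mask 0x3 */
    return (unsigned long)(hash & sizemask);    /* equivalent to hash % table_size */
}
```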

As we keep adding elements, two different keys may produce the same hash value; this is a hash collision.

Suppose we now have three elements in the hash table:

[Figure 5]

Now add another element; if it collides at index 3 of the array, Redis resolves the collision by chaining the entries into a linked list.

[Figure 6]

"Note that the new element will be placed at the head node of the linked list, so that the new element will be accessed again at a high probability, and the head node will be placed to increase the speed of access."

Comparing the addition process, we can see that Redis behaves much like the HashMap of JDK 1.7, which also used head insertion.

As the number of elements grows, hash collisions become more frequent, the chains get longer, and in the extreme case query efficiency degrades from O(1) to O(N).

To avoid this, the dictionary must be expanded, which triggers the dictionary's rehash operation.
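The article does not spell out the exact trigger, but roughly speaking the dictionary expands once the number of stored entries catches up with the table size, and is forced to expand when the load gets several times higher. The thresholds in this sketch are assumptions for illustration, not values taken from the source:

```c
/* Rough sketch of when expansion is triggered for a table with `used` entries
 * and `size` slots. The thresholds (load factor 1, forced ratio 5) are
 * assumptions for illustration; the real dict.c logic may differ by version. */
static int needs_expand(unsigned long used, unsigned long size, int resize_allowed) {
    if (size == 0) return 1;                       /* empty table: allocate the initial size */
    if (used >= size && resize_allowed) return 1;  /* load factor has reached 1 */
    if (size > 0 && used / size > 5) return 1;     /* heavily overloaded: force expansion */
    return 0;
}
```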

Expansion

When Redis performs a rehash expansion, it first allocates a larger space for the previously unused ht[1] table.

[Figure 7]

Voiceover: the size of the ht[1] hash table is the first power of 2 that is greater than or equal to ht[0].used * 2.
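A quick sketch of that sizing rule, using an illustrative function name; the starting size of 4 matches DICT_HT_INITIAL_SIZE mentioned above:

```c
/* Sketch: choose the new table size as the smallest power of two that is
 * greater than or equal to ht[0].used * 2. */
static unsigned long next_table_size(unsigned long used) {
    unsigned long size = 4;              /* DICT_HT_INITIAL_SIZE */
    unsigned long target = used * 2;
    while (size < target) size *= 2;     /* grow in powers of two */
    return size;                          /* e.g. used = 5 -> target 10 -> size 16 */
}
```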

All the key-value pairs in ht[0] are then migrated into ht[1].

[Figure 8]

For simplicity, the pointers to empty slots are omitted in the figure.

Once all nodes have been migrated, the memory occupied by ht[0] is freed and ht[1] is set as the new ht[0].

[Figure 9]

The expansion requires rehashing every key-value pair in ht[0] into ht[1]. If there are a huge number of keys, say a billion key-value pairs, doing the migration in one shot would inevitably make the server stop serving requests for a while.

And if every rehash blocked the current operation, that would be very unfriendly to clients.

To avoid this impact on the server, Redis migrates the data progressively, spreading the migration work across many individual operations.

This relies on a property of the dictionary called rehashidx, an index counter that records which slot of the hash table array will be migrated next; its default value is -1.

Suppose the pre-expansion dictionary is shown in the figure:

[Figure 10]

When the rehash operation starts, rehashidx is set to 0.

During this period, every time an add, delete, find, or update command is received, Redis executes the command as usual and, in addition, rehashes the elements at slot rehashidx of ht[0] into ht[1].

Suppose a query for the key K3 arrives at this point: Redis first performs the query, and then migrates all nodes at index rehashidx of the ht[0] table array into ht[1].

[Figure 11]

When that step completes, rehashidx is incremented by 1.

Finally, once all key-value pairs have been rehashed into ht[1], rehashidx is reset to -1.
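Putting the steps together, here is a sketch of one incremental rehash step, reusing the simplified dict/dictht/dictEntry structs from the earlier sketch. The hashKey parameter stands in for the dictionary's hash function, and the names are illustrative rather than the exact dict.c code:

```c
#include <stdint.h>
#include <stdlib.h>

/* One step of progressive rehashing: move every entry in the bucket at
 * rehashidx from ht[0] to ht[1], advance rehashidx, and finish up by swapping
 * the tables once ht[0] is empty. Illustrative sketch only. */
static void dict_rehash_one_slot(dict *d, uint64_t (*hashKey)(const void *)) {
    if (d->rehashidx == -1) return;                      /* no rehash in progress */

    /* Skip empty buckets until we reach a slot that holds entries. */
    while ((unsigned long)d->rehashidx < d->ht[0].size &&
           d->ht[0].table[d->rehashidx] == NULL)
        d->rehashidx++;

    if ((unsigned long)d->rehashidx < d->ht[0].size) {
        dictEntry *entry = d->ht[0].table[d->rehashidx];
        while (entry) {                                  /* move the whole chain */
            dictEntry *next = entry->next;
            unsigned long idx = hashKey(entry->key) & d->ht[1].sizemask;
            entry->next = d->ht[1].table[idx];           /* head-insert into ht[1] */
            d->ht[1].table[idx] = entry;
            d->ht[0].used--;
            d->ht[1].used++;
            entry = next;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;                                   /* next slot on the next call */
    }

    if (d->ht[0].used == 0) {                             /* migration finished */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];                               /* ht[1] becomes the new ht[0] */
        d->ht[1] = (dictht){0};
        d->rehashidx = -1;                                 /* rehash complete */
    }
}
```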

While progressive rehashing spreads the work out and avoids a long pause, it makes key-value operations more complex.

This is because, while a progressive rehash is in progress, Redis cannot know for sure whether a key is in ht[0] or in ht[1], so it may have to look in both hash tables.

For lookups, Redis first searches ht[0]; if the key is not found there, it then searches ht[1].

Additions are simpler: since ht[0] is being drained and will no longer grow, new key-value pairs are always inserted into ht[1].
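A lookup during a rehash might look like the following sketch, again reusing the simplified structs from the earlier sketches; keyCompare is an assumed helper, and ht[1] is only consulted while a rehash is in progress:

```c
#include <stdint.h>

/* Sketch: looking up a key while a progressive rehash may be in progress.
 * Reuses the simplified dictEntry/dictht/dict structs sketched earlier;
 * keyCompare() is an assumed equality helper, not a real Redis function. */
static dictEntry *dict_find_sketch(dict *d, const void *key, uint64_t hash,
                                   int (*keyCompare)(const void *, const void *)) {
    int tables = (d->rehashidx == -1) ? 1 : 2;   /* check ht[1] only while rehashing */
    for (int t = 0; t < tables; t++) {
        if (d->ht[t].size == 0) continue;
        unsigned long idx = hash & d->ht[t].sizemask;
        for (dictEntry *e = d->ht[t].table[idx]; e != NULL; e = e->next) {
            if (keyCompare(e->key, key)) return e;        /* found */
        }
    }
    return NULL;                                           /* not in either table */
}
```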

Finally, compare this with the Java HashMap resize, which is done in one go: every resize migrates all key-value pairs to the new array at once, so with a large amount of data it can take a long time.

Summary

The Redis dictionary uses hash tables as its underlying implementation; each dictionary contains two hash tables, one for normal use and one used only during rehash operations.

The hash table itself is very similar to Java HashMap: the underlying implementation is an array-plus-linked-list structure.

Finally, when the hash table is expanded, a progressive rehash slowly migrates all key-value pairs to the new hash table.

Once you understand the principles of the Redis dictionary and compare them with Java HashMap, you will find that the two have a great deal in common.

So when learning this kind of material, don't just memorize it; dig into the underlying principles and understand the why behind them.

The above is W3Cschool编程狮's introduction to "Ali Interviewer: Is HashMap familiar? Okay, let's talk about the Redis dictionary!" I hope it helps you.