Jun 01, 2021 Article blog
This article was reproduced from the public number: Java Geek Technology Author: Duck Blood Fans
Recently, a friend of A powder went out for an interview, came back to complain with A powder, the interviewer did not play cards according to the way, directly disrupted his rhythm.
Here's the thing, the previous interview asked a few
Java-related
questions, my friend answered well, and then the interviewer asked: It seems that
the Java
foundation is good,
Java HashMap
you are familiar with?
My friend replied. Work is often used, have seen the source code.
My friend thought, come on, this question is ready before, just ask it.
Who knows, the interviewer has the following sentence:
"Well, let's talk about the Redis dictionary." 」
Directly forced him all over the place.
A powder friend because he has not studied the Redis dictionary much, so this question directly answered do not know.
"Of course, if you really don't know in the interview, then answer don't understand, directly next question, don't mess around." 」
However, this question, A powder feel is still a pity, in fact,
the Redis
dictionary basic principles and
HashMap
is similar, then we can actually use the principle of this, do not ask for a full answer, but how can also get a passing score
The interview process really has to meet this question, we can answer from the following three aspects.
When it comes to dictionaries, they may be unfamiliar, but we all know how
Redis
itself provides
KV
queries,
KV
are actually saved through the dictionary at the bottom.
In addition,
Redis
supports a variety of data types, one of which is the
Hash
key, which can also be used to store
KV
data.
When A powder first learned about this data structure, it was thought that this was implemented using a dictionary.
This is not the case, with the initial creation
Has
h key, which saves memory space by using another data structure by default,
ZIPLIST
(compressed list).
However, once any of the following conditions are met, the data structure of the
Hash
key becomes a dictionary to speed up queries.
server.hash_max_ziplist_value
(the default is
64
server.hash_max_ziplist_entries
(the default is
512
When the Redis dictionary is new, an array of hash tables is created by default, saving two hash tables.
Where
ht[0]
table allocates memory space the first time a key value is added to the dictionary, the other
ht[1]
will be expanded/shrinked below for space allocation.
The hash table in the dictionary is actually equivalent to
Java HashMap
we know that Java uses array plus linked list/ red and black tree implementation, in fact, hash table is also using a similar data structure.
The hash table structure looks like this:
The
table
property is an array where the array elements hold a
dictEntry
structure that is exactly like
Entry
type in
HashMap
which stores a KV key value pair.
At the same time, to solve the problem of hash collisions,
dictEntry
has a next pointer that points to the next
dictEntry
which forms a
dictEntry
list.
Now, when we look back at
HashMap
in
Java,
we can see that the data structure is basically the same.
It's just that
HashMap
uses a red-and-black data structure when there are too many list elements in JDK1.8 to slow down queries in order to resolve the problem of long lists.
Let's start by adding new elements to understand how this works.
When we add elements to a new dictionary, space is allocated by default for
ht[0]
table in the dictionary, which by default has an array size of 4
("DICT_HT_INITIAL_SIZE").
The key value of the newly added element is hashed to determine the location of the hash table array and then added to the corresponding location, as shown in the figure:
Continue to add elements, and if two different keys pass through the hash algorithm to produce the same hash value, a hash collision occurs.
Suppose we now have three elements in the hash table:
Let's add a new element, and if a collision happens at this point in the position of array 3, Redis will resolve the hash collision in the form of a list.
"Note that the new element will be placed at the head node of the linked list, so that the new element will be accessed again at a high probability, and the head node will be placed to increase the speed of access." 」
Here we compare the element addition process and see that the Redis process is actually similar to
HashMap
version of
JDK 1.7
As we increase the number of elements, hash collisions become more frequent, which can lead to long list lengths and, in extreme cases, O(1) query efficiency degradation into O(N) query efficiency.
To do this, the dictionary must be expanded so that the dictionary
rehash
operation is triggered.
When Redis does Rehash expansion, it will first allocate more space for the dictionary that does not use
ht[1]
table.
❝
Voiceover:
ht[1]
Hash table size is the firstht[0].used*2
(n-power of 2) greater than or equal to ht.use❞
All key-value pairs in
ht[0]
are then migrated to
ht[1]
For simplicity, ignore pointing to an empty node
When all nodes are migrated,
ht[0]
footprint is freed up and
ht[1]
set to
ht[0]
The expansion operation requires
ht[0]
all key-value pairs be rehash-to-ht, and if there are too many key values, assuming a billion key-value pairs exist, such a one-time migration will inevitably result in the server stopping service for a period of time.
Rehash
ht[1]
Also, if each
rehash
blocks the current operation, this is very unfriendly for client processing.
To avoid the impact of
rehash
on servers, Redis takes a gradual approach to migration, slowly spreading data migration across multiple operational steps.
This operation relies on a property in the
rehashidx
which is an index location counter that records elements on the next hash table array, with the default value
of -1.
Suppose the pre-expansion dictionary is shown in the figure:
When the rehash operation starts,
rehashidx
will be set to
"0".
Each time the add, delete, find, update commands are received during this period, in addition to the commands that will be executed,
ht[0]
table is subject to the element rehash at the
rehashidx
location to
ht[1]
Assuming that a query operation with the
"K3"
key is received at this point, Redis first performs the query operation, and then Redis migrates all nodes on the
table
array
rehashidx
index on the
ht[0]
table to
ht[1]
When the operation is complete, add the
rehashidx
property value by 1.
Finally, when all key-value pairs are
rehash
to
ht[1]
rehashidx
is reset to -1.
While incremental rehash operations reduce the amount of work, they add complexity to key value operations.
This is because during the progressive
rehash
operation, Redis cannot know exactly whether the key is in
ht[0]
or in
ht[1]
so at this point Redis has to look for two hash tables.
In the case of lookups, Redis first
ht[0]
ht[1]
Adding operations isn't really that cumbersome, because
ht[0]
not going to work, so it's good to add unity to
ht[1]
Finally, let's compare the Java HashMap expansion operation, which is a one-time operation that requires all key-value pairs to be migrated to the new array for each expansion, so if the amount of data is large, it will take a long time.
The Redis dictionary uses hash tables as the underlying implementation, each containing two hash tables, one for normal use and one for
rehash
operations only.
Hash tables in general are really similar to
Java
HashMap
and the underlying implementation is an array plus-link data structure.
Finally, when the hash table is expanded, a progressive
rehash
operation is taken to slowly migrate all key pairs to the new hash table.
In fact, to understand the principles of the Redis dictionary, and then compare
Java HashMap
you can actually find that the two have so many similarities.
So when learning this kind of knowledge, don't just go back, we have to understand its underlying principles, know why.
Above is
W3Cschool编程狮
About
Ali Interviewer: HashMap Familiar? O
kay, let's talk about the Redis dictionary!
Related to the introduction, I hope to help you.