May 31, 2021
The National Day and Mid-Autumn Festival holidays are over. Ready to get back to learning?
If a project uses Redis, it is mostly there as a cache.
And once Redis is used as a cache, you are bound to run into cache penetration, cache breakdown, and cache avalanche.
This article explains what each one is and how to deal with it.
First, let's figure out what cache penetration is. The three terms look so alike that it is easy to mix them up unless the concepts are clear.
The name itself gives a hint: with cache penetration, a request passes straight through the cache and lands on the database.
Normally the data being queried should be in the cache. But you can't stop hackers: if a hacker asks for data that does not exist in the database, it cannot be in the cache either, so the request falls through to the database. That is cache penetration.
Think about it: an attacker won't send just one request. A flood of requests will come in, all using ids that don't exist in the database, and every one of them hits the database directly. The database may well go down under the load.
How do we solve it?
Go back to the scenario that caused the problem. Why do so many requests hit the database? Because the cache has no entry for the key, so the request skips the cache and goes straight to the database.
That points to the fix. The cache has no entry for the key? Fine: if the database doesn't have the key either, I store the key in Redis anyway with a null value. The next request for that key gets null straight from the cache and never touches the database.
Remember to set an expiration time on these entries, usually three to five minutes.
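To make the null-caching idea concrete, here is a minimal sketch. It is only an illustration under stated assumptions: `TTLCache` is a tiny in-memory stand-in for Redis so the example runs on its own, and `NULL_MARKER`, `get_with_null_caching`, and `load_from_db` are hypothetical names, not part of any library.

```python
import time

NULL_MARKER = "__NULL__"   # hypothetical sentinel meaning "the database has no such key"
NULL_TTL_SECONDS = 180     # three minutes, the low end of the range suggested above


class TTLCache:
    """In-memory stand-in for Redis (an assumption): get, and set with an expiry."""

    def __init__(self):
        self._data = {}

    def set(self, key, value, ttl_seconds):
        # Store the value together with the moment it stops being valid.
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]   # expired entries behave like missing ones
            return None
        return value


def get_with_null_caching(key, cache, load_from_db, ttl_seconds=3600):
    cached = cache.get(key)
    if cached == NULL_MARKER:
        return None               # known-missing key: answer without touching the database
    if cached is not None:
        return cached
    value = load_from_db(key)
    if value is None:
        # Remember the miss, but only briefly, so real data added later becomes visible.
        cache.set(key, NULL_MARKER, NULL_TTL_SECONDS)
    else:
        cache.set(key, value, ttl_seconds)
    return value
```

With this in place, a second request for the same missing id is answered from the cache, and `load_from_db` is called only once for that id until the marker expires.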
But the other side is a hacker. Will he really keep using the same key? More likely he sends requests for many different keys in a short period, and if every one of them gets a null entry in Redis, that is a lot of null values.
At that rate, won't Redis end up full of nulls?
So is there a better solution?
Of course there is! The Bloom filter is worth a try.
What is a Bloom filter? It tells you only that a value definitely does not exist or may exist (emmmm, I'm not sure I've made that clear).
So you load the keys from the database into a Bloom filter. When a flood of requests comes in and Redis doesn't have the key, that's fine: check the Bloom filter next. It can tell you whether the key definitely isn't in the database, and those requests never have to hit the database.
Doesn't that take the pressure off the database? Genius, right?
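The lookup order described here (cache first, then the filter, then the database) might be sketched as follows. The names `lookup`, `might_contain`, and `load_from_db` are hypothetical, and the cache is a plain dict standing in for Redis. The filter is passed in as a function so any Bloom filter implementation can be plugged in.

```python
def lookup(key, cache, might_contain, load_from_db):
    """Check the cache, then the filter, and only then the database."""
    if key in cache:
        return cache[key]
    if not might_contain(key):
        return None              # filter says "definitely absent": skip the database
    value = load_from_db(key)    # filter says "may exist": the database has the final word
    if value is not None:
        cache[key] = value
    return value
```

Note that a "may exist" answer can still end in a database miss, which is why the function checks the loaded value before caching it.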
Cache breakdown happens under high concurrency: many requests are querying the same key, and unfortunately that key has just become unavailable for some reason (its expiration time was reached, or the cache server went down), so all of those requests hit the database directly.
If there are enough of these requests, they can take the database down.
Once we know the cause, finding a solution is easy.
The problem is many requests hitting the database, but they all ask for the same key, so a mutex lock does the job. The first request that finds the key missing from the cache takes the lock and goes to the database. The second request, the third request, and so on block on the lock and no longer hit the database, which reduces the concurrent pressure on it.
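Here is a minimal single-process sketch of the mutex idea, assuming a dict as the cache and Python threads as the concurrent requests. `get_with_mutex`, `_lock_for`, and `load_from_db` are hypothetical names. With several application servers you would need a distributed lock instead (for example one built on the Redis `SET` command with the `NX` and `EX` options), which this sketch does not cover.

```python
import threading

cache = {}                       # stand-in for Redis (an assumption)
_locks = {}                      # one lock per key, created on demand
_locks_guard = threading.Lock()  # protects the _locks dict itself


def _lock_for(key):
    with _locks_guard:
        return _locks.setdefault(key, threading.Lock())


def get_with_mutex(key, load_from_db):
    value = cache.get(key)
    if value is not None:
        return value
    with _lock_for(key):
        # Re-check after acquiring the lock: another thread may have
        # rebuilt the entry while this one was waiting.
        value = cache.get(key)
        if value is None:
            value = load_from_db(key)
            cache[key] = value
        return value
```

The second check inside the lock is what keeps the waiting requests away from the database: by the time they get the lock, the first request has already refilled the cache.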
Cache avalanche, as the name suggests, is more serious. Breakdown is about one key failing; avalanche means cache entries fail on a large scale. That can easily happen. For example, my cache server goes down, and the whole cache is gone at once. Or, to save trouble, I gave many keys the same expiration time, and just as they all expire, a lot of requests for different keys arrive.
A lock is not a good fit here, because the requests are for many different keys, not for a single one.
Since the cause is many keys sharing the same expiration time, the fix is to not let them all expire at once: add a random amount to each expiration time so that entries expire at scattered moments, which makes large-scale simultaneous expiry much less likely.
What if my cache server goes down outright? That is also manageable: run a cluster. I'm only naming the solution here; how to implement it is not the focus of this article.
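Randomizing the expiration time can be as small as this. The function name and the one-hour base with up to five minutes of jitter are assumptions picked for illustration, not values from the article.

```python
import random


def ttl_with_jitter(base_seconds, max_jitter_seconds, rng=random):
    """Return the base TTL plus a random extra of 0..max_jitter_seconds (inclusive)."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


# Each key gets its own expiry somewhere between 3600 and 3900 seconds.
ttls = {key: ttl_with_jitter(3600, 300) for key in ("user:1", "user:2", "user:3")}
```

The resulting value is what you would pass as the expiry when writing each key to the cache.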
OK, if you've read this far, the main content of the article is done.
But I feel I didn't explain the Bloom filter clearly enough, so let me take it out and go through it in detail. (I know you're silently praising the author for being so considerate. Good, keeping it to yourself is enough; if you actually say it, I'll get shy.)
The Bloom filter is a probabilistic data structure that tells you "something definitely does not exist or may exist".
You might say: you already said that, and it's quite a tongue twister.
That's because this sentence matters. Once you understand it thoroughly, you understand the Bloom filter.
To make it concrete, let's work through an example. A Bloom filter is a bit vector (bit array) whose bits all start at 0.
Now we need to store the value "AliPay". Roughly, the value is run through several different hash functions to produce several hash values, and the bit at each resulting position is set to 1.
For example, map "AliPay" through three different hash functions and suppose it lands on positions 1, 4, and 6, so those three bits are set to 1.
Next I store another value, "WechatPay", which is mapped the same way, and one of its positions is also 4.
If you look closely, you'll notice that position 4 was first set by "AliPay", and now "WechatPay" maps to it too. Doesn't that overwrite the value?
Well, yes, it does: the two values share that bit.
Next, we query "Ali". Suppose its hashes give positions 0, 1, and 2, and the bit at position 2 is 0. No stored value was ever mapped to that position, so we can be certain that "Ali" is not in the database.
And if I query "AliPay", it will certainly come back with positions 1, 4, and 6, all set to 1. Can we then say "AliPay" is definitely in the database? No. Those bits could each have been set by other values, so all we can say is that "AliPay" may be in the database.
That is exactly what the Bloom filter means by "a value definitely does not exist or may exist."
Dear reader, is it clear now?
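The walkthrough can be turned into a small toy implementation. This is a sketch, not production code: the class and method names are my own, the "different hash functions" are simulated by salting SHA-256 with a seed, and one byte is used per bit for readability. The actual positions will differ from the 1, 4, 6 used in the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: answers "definitely absent" or "may exist"."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)   # all zeros at the start

    def _positions(self, item):
        # Simulate several hash functions by prefixing a different seed each time.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1             # a bit may already be 1 from another item

    def might_contain(self, item):
        # Any 0 bit proves the item was never added; all 1s only means "maybe".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
bf.add("AliPay")
bf.add("WechatPay")
```

After these calls, `bf.might_contain("AliPay")` is guaranteed to be `True`, because adding an item sets every bit it maps to. A `False` answer for any other value is definitive, while a `True` answer only means the value may exist, since its bits could have been set by other items.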
This article comes from the WeChat official account Java Geek Technology. Author: Duck Blood Fans
The above is W3Cschool编程狮's introduction to "Cache Penetration/Cache Breakdown/Cache Avalanche: It's Enough to Read This Article". I hope it helps.