Jun 01, 2021 Article blog
Override equals, and override hashCode too?
It's not just an interview question; it's a question about whether our code is robust and correct.
This article starts from the underlying implementation to explain why the hashCode method needs to be overridden and how to implement it.
Let's first review the implementation of the equals method in Object and briefly summarize the rules for using equals.
public boolean equals(Object obj) {
return (this == obj);
}
From the source code of Object above, we can conclude that if a class does not override the equals method, comparing with equals has the same effect as comparing with ==: it compares object references.
The previous two articles discussed how String and Integer differ in comparisons; the key point is how each of them implements the equals method.
The conclusion to remember for interviews is that, by default, the equals method inherited from the Object class is exactly equivalent to the == operator.
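As a minimal sketch of this default behavior, the following uses a hypothetical class PlainPoint (not from the original article) that does not override equals. Two instances with identical field values are still reported as unequal, because only the references are compared.

```java
class DefaultEqualsDemo {
    public static void main(String[] args) {
        PlainPoint a = new PlainPoint(1, 2);
        PlainPoint b = new PlainPoint(1, 2);
        PlainPoint alias = a;

        System.out.println(a.equals(b));     // false: same field values, different objects
        System.out.println(a == b);          // false
        System.out.println(a.equals(alias)); // true: both names refer to one object
    }
}

// Hypothetical class for illustration; it inherits equals from Object unchanged.
class PlainPoint {
    final int x;
    final int y;

    PlainPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}
```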
But we can override the equals method so that it compares as needed. For example, the String class overrides equals to compare the sequence of characters instead of the memory address.
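A short sketch of the difference for String: the two objects below are created with new, so they are distinct references, yet equals reports them as equal because their characters match.

```java
class StringEqualsDemo {
    public static void main(String[] args) {
        String s1 = new String("ok");
        String s2 = new String("ok");

        System.out.println(s1 == s2);      // false: two separate objects
        System.out.println(s1.equals(s2)); // true: same character sequence
    }
}
```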
So what does the equals method have to do with the hashCode method?
Let's look at a comment on the equals method in Object:
Note that it is generally necessary to override the hashCode method whenever this method is overridden, so as to maintain the general contract for the hashCode method, which states that equal objects must have equal hash codes.
In other words, whenever the equals method is overridden, the hashCode method must be overridden as well, so that the hashCode convention "equal objects must have equal hash codes" is not violated.
This is only a reminder that hashCode needs to be overridden, so what is the design contract of the hashCode method that it mentions?
The relevant content is defined in the comment on the hashCode method.
The comment lists several conventions for the hashCode method. You can read them directly in the source code; here is a summary of the three points:
(1) If the information used by the equals method to compare the object has not been modified, calling hashCode() on the same object multiple times should return the same hash value every time.
(2) If two objects are equal according to the equals method, then the hashCode method of the two objects must return equal values.
(3) If two objects are not equal according to the equals method, the hashCode method of the two objects is not required to return different values.
However, we should know that generating different hash values for different objects can improve performance for hash tables (HashMap, etc.).
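The three points can be checked with String, which overrides both methods. This is only an illustrative sketch; the pair "Aa" and "BB" is a well-known example of two unequal strings that share a hash code.

```java
class HashCodeContractDemo {
    public static void main(String[] args) {
        String a = new String("key");
        String b = new String("key");

        // (1) Repeated calls on an unmodified object return the same value.
        System.out.println(a.hashCode() == a.hashCode()); // true

        // (2) Objects that are equal must have equal hash codes.
        System.out.println(a.equals(b) && a.hashCode() == b.hashCode()); // true

        // (3) Unequal objects are allowed to share a hash code (a collision).
        System.out.println("Aa".equals("BB"));                  // false
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true, both are 2112
    }
}
```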
We have now seen the contract for implementing hashCode, but it is still not clear why implementing the equals method requires overriding the hashCode method.
We can, however, derive one rule: one of the things the hashCode method has to do is return the same hash value for objects that the equals method considers equal.
The contract above also mentions hash tables, which are one of the scenarios where the hashCode method is used and the core reason why we should override it.
If you know the data structure of HashMap, you know that it uses the hash code of the key object: when we call the put or get method on the Map container, the storage location is calculated from the hash code of the key object.
If the hash code is not obtained consistently, we may not get the expected results.
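To make "calculate the storage location from the hash code" concrete, here is a minimal sketch. The helper names spread and indexFor are made up for illustration; the arithmetic mirrors the spreading step and index mask used by OpenJDK 8's HashMap, and the table length 16 is assumed because it is HashMap's default initial capacity.

```java
class BucketIndexDemo {
    // Mixes the high 16 bits of the hash code into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index in a table whose length is a power of two.
    static int indexFor(Object key, int tableLength) {
        return (tableLength - 1) & spread(key);
    }

    public static void main(String[] args) {
        String k1 = new String("key");
        String k2 = new String("key");

        // Equal keys with equal hash codes always land in the same bucket.
        System.out.println(indexFor(k1, 16) == indexFor(k2, 16)); // true
        System.out.println(indexFor("ok", 16));                   // 12
    }
}
```

If two equal keys returned different hash codes, indexFor could send them to different buckets, and a lookup with the second key would not find the entry stored under the first.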
The hash code of an object is obtained through the hashCode method.
If the method is not implemented in a custom class, the hashCode() method in Object is used.
In Object, the method is a native method that returns a hash value of type int.
It may be implemented by converting the object's internal address to an integer, but Java does not require this.
Accounts of the implementation found online differ, ranging from "converted from the internal address" to "the default hashCode in OpenJDK 8 is computed from a random number associated with the current thread plus three fixed values, using Marsaglia's xorshift scheme as the random number algorithm".
Whatever the default implementation is, in most cases it cannot satisfy the requirement that objects considered equal by the equals method also return the same hashCode result.
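The following sketch uses a hypothetical class Plain that overrides neither method. It relies on System.identityHashCode, which the JDK documents as returning the same value as the default hashCode(). The concrete numbers vary between runs, so none are shown.

```java
class IdentityHashDemo {
    // Hypothetical class for illustration; it overrides neither equals nor hashCode.
    static class Plain {
        final String value;

        Plain(String value) {
            this.value = value;
        }
    }

    public static void main(String[] args) {
        Plain p = new Plain("A");
        Plain q = new Plain("A");

        // Without an override, hashCode() is the identity hash code.
        System.out.println(p.hashCode() == System.identityHashCode(p)); // true

        // The value is stable for one object during a run.
        System.out.println(p.hashCode() == p.hashCode()); // true

        // It is unrelated to the field values, so p and q usually differ
        // (not guaranteed, and the numbers change from run to run).
        System.out.println(p.hashCode() + " " + q.hashCode());
    }
}
```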
For example, the following code shows how large the difference is between a class that overrides hashCode and one that does not.
public void test1() {
String s = "ok";
StringBuilder sb = new StringBuilder(s);
System.out.println(s.hashCode() + " " + sb.hashCode());
String t = new String("ok");
StringBuilder tb = new StringBuilder(s);
System.out.println(t.hashCode() + " " + tb.hashCode());
}
The above code prints the results as:
3548 1833638914
3548 1620303253
String overrides the hashCode method and StringBuilder does not, which makes the hashCode values of the StringBuilder objects different even though their contents are the same.
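The value 3548 is not arbitrary. The Javadoc of String.hashCode specifies the formula s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], so the result depends only on the characters. The helper polynomialHash below is a hypothetical re-implementation of that formula for illustration, not the JDK source.

```java
class StringHashSketch {
    // Evaluates the documented polynomial with int arithmetic (Horner's method).
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        // 'o' is 111 and 'k' is 107, so the result is 111 * 31 + 107.
        System.out.println(polynomialHash("ok"));                    // 3548
        System.out.println(polynomialHash("ok") == "ok".hashCode()); // true
    }
}
```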
In the previous example the problem is not very obvious, so let's take HashMap as an example and see what serious consequences follow if the hashCode method is not implemented.
@Test
public void test2() {
String hello = "hello";
Map<String, String> map1 = new HashMap<>();
String s1 = new String("key");
String s2 = new String("key");
map1.put(s1, hello);
System.out.println("s1.equals(s2):" + s1.equals(s2));
System.out.println("map1.get(s1):" + map1.get(s1));
System.out.println("map1.get(s2):" + map1.get(s2));
Map<Key, String> map2 = new HashMap<>();
Key k1 = new Key("A");
Key k2 = new Key("A");
map2.put(k1, hello);
System.out.println("k1.equals(k2):" + k1.equals(k2));
System.out.println("map2.get(k1):" + map2.get(k1));
System.out.println("map2.get(k2):" + map2.get(k2));
}
class Key {
private String k;
public Key(String key) {
this.k = key;
}
@Override
public boolean equals(Object obj) {
if (obj instanceof Key) {
Key key = (Key) obj;
return k.equals(key.k);
}
return false;
}
}
The example defines an inner class Key, which implements the equals method but not the hashCode method.
The values stored in the Map are all the string "hello".
The code is divided into two parts: the first shows what happens when the Map key is a String, and the second shows what happens when the Map key is a Key object that does not implement the hashCode method.
Executing the above code prints the following results:
s1.equals(s2):true
map1.get(s1):hello
map1.get(s2):hello
k1.equals(k2):true
map2.get(k1):hello
map2.get(k2):null
The results show that with String keys, s1 and s2 are equal according to equals, and both retrieve the same value, as expected.
k1 and k2 are also equal according to equals, so why do they produce different results in the Map?
The root cause is that hashCode is not overridden, so the Map obtains inconsistent values when it calls the hashCode method during storage and retrieval.
At this point, add the hashCode method to the Key class:
@Override
public int hashCode(){
return k.hashCode();
}
Run it again and the value is retrieved as expected:
s1.equals(s2):true
map1.get(s1):hello
map1.get(s2):hello
k1.equals(k2):true
map2.get(k1):hello
map2.get(k2):hello
The typical example above demonstrates the potential consequences of not overriding the hashCode method.
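Because default hash codes change from run to run, the failure can also be reproduced deterministically. The sketch below uses a hypothetical class BrokenKey whose hashCode returns a per-instance id, which imitates an identity-based hash while keeping the output predictable.

```java
import java.util.HashMap;
import java.util.Map;

class InconsistentHashDemo {
    // Hypothetical key: equals compares the name, but hashCode returns a
    // per-instance id, so equal keys report different hash codes.
    static class BrokenKey {
        final String name;
        final int id;

        BrokenKey(String name, int id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public boolean equals(Object obj) {
            return obj instanceof BrokenKey && name.equals(((BrokenKey) obj).name);
        }

        @Override
        public int hashCode() {
            return id; // violates the contract on purpose
        }
    }

    public static void main(String[] args) {
        Map<BrokenKey, String> map = new HashMap<>();
        BrokenKey k1 = new BrokenKey("A", 1);
        BrokenKey k2 = new BrokenKey("A", 2);
        map.put(k1, "hello");

        System.out.println(k1.equals(k2)); // true
        System.out.println(map.get(k1));   // hello
        System.out.println(map.get(k2));   // null: hash 2 never matches the stored hash 1
    }
}
```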
Take a quick look at the put method in HashMap:
public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
// Use the hash to find the element p at that index of the underlying array; if the slot is empty, store a new node there
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
// (the two hashes are equal) and (the two keys are the same reference, or equals considers them equal)
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
// If the passed-in key already exists in the underlying array, overwrite the existing value with the new one
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}
In the code above, the put method calls the hashCode method of the key object as soon as it receives the key, through the hash(key) call.
Even without reading the code that follows, it is clear that if you don't override the hashCode method, you cannot guarantee that the hash values of two equal keys are the same, and the subsequent steps will treat them as two different keys.
Now that you understand the importance of overriding the hashCode method and its contract, let's talk about how to override it gracefully.
First, if you're using IDEA, you can generate the methods directly with a shortcut.
The generated result is as follows:
@Override
public boolean equals(Object o) {
if (this == o) {
return true;
}
if (o == null || getClass() != o.getClass()) {
return false;
}
Key key = (Key) o;
return Objects.equals(k, key.k);
}
@Override
public int hashCode() {
return Objects.hash(k);
}
The internal implementation of the generated methods can be modified as needed.
The example above uses the java.util.Objects class. The advantage of its hash method is null safety: a null argument contributes 0 to the result instead of causing a NullPointerException, while a non-null argument contributes the result of its own hashCode method.
The source code of the Objects.hash method is as follows:
public static int hash(Object... values) {
return Arrays.hashCode(values);
}
The source code of the Arrays.hashCode method is as follows:
public static int hashCode(Object a[]) {
if (a == null)
return 0;
int result = 1;
for (Object element : a)
result = 31 * result + (element == null ? 0 : element.hashCode());
return result;
}
Of course, there is only one field here, so you can also use the hashCode method of the Objects class directly:
public static int hashCode(Object o) {
return o != null ? o.hashCode() : 0;
}
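A quick check of how the two helpers treat null, as a sketch. Note that for a single argument the two methods do not return the same number: Objects.hash wraps its arguments in an array, so the result starts from 1 and is multiplied by 31 once.

```java
import java.util.Objects;

class ObjectsHashNullDemo {
    public static void main(String[] args) {
        String k = null;

        System.out.println(Objects.hashCode(k));    // 0
        System.out.println(Objects.hash(k));        // 31: one null element gives 31 * 1 + 0
        System.out.println(Objects.hashCode("ok")); // 3548
        System.out.println(Objects.hash("ok"));     // 3579: 31 * 1 + 3548
    }
}
```

Either choice is valid for a single field, as long as the same computation is used consistently for the class.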
The first method is recommended if more than one property is involved in computing the hash value.
The only thing to note is that when the class structure (its member variables) changes, the arguments passed to the method must be added or removed accordingly.
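As a sketch of the multi-field case, the hypothetical class Person below (not from the original article) uses the same two fields in both equals and hashCode. If a field were added to equals later, it would need to be added to the Objects.hash call as well.

```java
import java.util.Objects;

class MultiFieldHashDemo {
    // Hypothetical two-field class for illustration.
    static class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || getClass() != o.getClass()) {
                return false;
            }
            Person other = (Person) o;
            return age == other.age && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Keep this argument list in sync with the fields compared in equals.
            return Objects.hash(name, age);
        }
    }

    public static void main(String[] args) {
        Person p1 = new Person("a", 1);
        Person p2 = new Person("a", 1);

        System.out.println(p1.equals(p2));                  // true
        System.out.println(p1.hashCode() == p2.hashCode()); // true
        System.out.println(p1.hashCode());                  // 3969: 31 * (31 * 1 + 97) + 1
    }
}
```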
While preparing for interviews, we keep reciting "when you implement the equals method, implement the hashCode method as well", and there is nothing wrong with memorizing such conclusions.
But in our rush to prepare, we should not forget why these interview questions come up so often.
When you dig deeper, you'll find that there are so many points of knowledge behind those boring conclusions, and so many interesting designs and pitfalls.
Source: www.toutiao.com/a6865829963505861132/
That concludes W3Cschool编程狮's introduction to the Java interview question: when you override equals, do you also override hashCode? I hope it helps.