The eternal issue about Object.hashCode() in Java

Have you heard of hashCode() and equals() in Java and the eternal discussions around them? By now you probably know the implications of a messed-up relationship between hashCode() and equals(), so every time you implement a class with an equals() you make sure hashCode() follows suit, right?

So with that in mind you might think just like me: if I implement a Java bean, then whatever fields I involve in equals() I will also involve in hashCode(). The JDK nowadays offers Objects.hash(...) for computing a hash code, so this is easy, right? And as long as I do that with every bean, then if I ever extend one of these bean classes all I have to care about is the fields in the subclass, plus delegating to the superclass, and I will be fine.

At least that’s how I used to think... until I encountered the issue below!

So as I said, we start with a simple Java bean class. Let’s consider a class which has just an int member, which we will involve in both hashCode() and equals():

import java.util.Objects;

public class BaseClass {
 private int intValue; 
 
 @Override
 public boolean equals(Object o) {
  if (this == o) return true;
  if (!(o instanceof BaseClass)) return false;
  BaseClass baseClass = (BaseClass) o;
  return intValue == baseClass.intValue;
 }
 
 @Override
 public int hashCode() {
  return Objects.hash(intValue);
 }
 
 public int getIntValue() {
  return intValue;
 }
 
 public void setIntValue(int intValue) {
  this.intValue = intValue;
 }
}

Anything wrong with that? Nope, all good right?

OK, now for the next step I decide that I need to “specialize” this class, so I extend it and create a SpecializedChild class which adds a String member. Of course, being conscientious about the implications of hashCode/equals, I make sure I use this member in both, and I also factor in the superclass’s hashCode/equals. Something like this:

import java.util.Objects;

public class SpecializedChild extends BaseClass {
 private String stringValue;
 
 @Override
 public boolean equals(Object o) {
  if (this == o) return true;
  if (!(o instanceof SpecializedChild)) return false;
  if (!super.equals(o)) return false;
  SpecializedChild that = (SpecializedChild) o;
  return Objects.equals(stringValue, that.stringValue);
 }
 
 @Override
 public int hashCode() {
  return Objects.hash(super.hashCode(), stringValue);
 }
 
 public String getStringValue() {
  return stringValue;
 }
 
 public void setStringValue(String stringValue) {
  this.stringValue = stringValue;
 }
}

See, this way not only am I checking for equality against the newly introduced stringValue member, but I also let the base class do its own checks and contribute to the hash calculation.
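To make the "contribute to the hash" part concrete, here is a minimal sketch of what Objects.hash(super.hashCode(), stringValue) works out to. Objects.hash is documented to return the same value as Arrays.hashCode over its arguments, which is the familiar 31-multiplier fold. The class HashCombineDemo and the method combineByHand are names made up for this illustration, not part of the original code.

```java
import java.util.Objects;

// Hypothetical demo class, introduced only for illustration.
class HashCombineDemo {
    // Hand-written version of the fold that Objects.hash(a, b) performs:
    // start at 1, then result = 31 * result + hash(element) for each element.
    static int combineByHand(Object a, Object b) {
        int result = 1;
        result = 31 * result + Objects.hashCode(a);
        result = 31 * result + Objects.hashCode(b);
        return result;
    }

    public static void main(String[] args) {
        // What a BaseClass-style hashCode() would return for intValue = 42.
        int superHash = Objects.hash(42);
        String stringValue = "en";

        // The library call and the hand-written fold agree.
        System.out.println(
            Objects.hash(superHash, stringValue) == combineByHand(superHash, stringValue)); // prints true
    }
}
```

The point to take away is that the superclass hash is just one more input to the fold, so whatever super.hashCode() returns flows straight into the subclass's hash.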

And if you want I can take the subclassing one step further and create SubSpecializedChild which adds another member and uses the same principle to check its own member and use it for hash computation as well as passing control to the base class:

import java.util.Locale;
import java.util.Objects;

public class SubSpecializedChild extends SpecializedChild {
 private Locale locale;
 
 @Override
 public boolean equals(Object o) {
  if (this == o) return true;
  if (!(o instanceof SubSpecializedChild)) return false;
  if (!super.equals(o)) return false;
  SubSpecializedChild that = (SubSpecializedChild) o;
  return Objects.equals(locale, that.locale);
 }
 
 @Override
 public int hashCode() {
  return Objects.hash(super.hashCode(), locale);
 }
 
 public Locale getLocale() {
  return locale;
 }
 
 public void setLocale(Locale locale) {
  this.locale = locale;
 }
}

Anything wrong so far with this approach?

I bet you are inclined to say “nope” … just like I did! But here’s the thing: the problem is exactly with subclassing, and specifically with passing control back up to the superclass!

To prove it, imagine I now go through a refactoring exercise and decide that, you know what, that int member in BaseClass is irrelevant: it does not contribute to the object “state”, so it doesn’t make sense to include it in equals() or in the hash computation. Since that is the only member in that class, all I have to do is remove the equals() and hashCode() implementations and rely on whatever the Object class offers, right? So I end up with this:

public class BaseClass {
 private int intValue; 
 
 public int getIntValue() {
  return intValue;
 }
 
 public void setIntValue(int intValue) {
  this.intValue = intValue;
 }
}

OK, you say, all good, it makes sense. But now you start noticing pretty soon that equals() in the SubSpecializedChild class doesn’t work anymore! And the same goes for hashCode(). Even worse, a piece of code like this:

        SubSpecializedChild child1 = new SubSpecializedChild();
        child1.setLocale(new Locale("en"));
 
        SubSpecializedChild child2 = new SubSpecializedChild();
        child2.setLocale(new Locale("en"));
 
        Set<SubSpecializedChild> set = new HashSet<>();
 
        System.out.println("EQUALS : " + child1.equals(child2));
        System.out.println("HASH 1 : " + child1.hashCode());
        System.out.println("HASH 2 : " + child2.hashCode());
 
        set.add(child1);
        set.add(child2);
        System.out.println("SET :");
        set.forEach(entry -> System.out.println("ENTRY : " + entry));

which used to work fine and would produce a Set with a single entry, now prints EQUALS : false and ends up with … 2 entries! Even though both objects store the same data!

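The problem, as I said, is that the calls to super.equals() and super.hashCode() now bubble all the way up to Object. Object.equals() is plain reference equality (this == o), so two distinct instances are never equal, and Object.hashCode() is identity-based, so it will typically differ between two distinct instances no matter what data they hold 🙂 So now, despite all of your subclasses being so diligent about every single data member, it’s the superclass of them all, Object, which screws it all up.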
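Here is a small self-contained sketch of that effect, together with one possible way out: a class sitting directly below a base that does not override equals()/hashCode() simply does not delegate to super. All class names below (PlainBase, DelegatingChild, StandaloneChild, IdentityDemo) are hypothetical stand-ins, not the classes from the listings above.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for the refactored base: inherits equals()/hashCode() from Object.
class PlainBase {
    int intValue;
}

// Still delegates to super, which now means Object's identity-based methods.
class DelegatingChild extends PlainBase {
    final String stringValue;

    DelegatingChild(String stringValue) { this.stringValue = stringValue; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof DelegatingChild)) return false;
        if (!super.equals(o)) return false; // Object.equals: true only when this == o
        return Objects.equals(stringValue, ((DelegatingChild) o).stringValue);
    }

    @Override
    public int hashCode() {
        return Objects.hash(super.hashCode(), stringValue); // mixes in an identity-based hash
    }
}

// Compares and hashes only its own state, because the parent contributes none.
class StandaloneChild extends PlainBase {
    final String stringValue;

    StandaloneChild(String stringValue) { this.stringValue = stringValue; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof StandaloneChild)) return false;
        return Objects.equals(stringValue, ((StandaloneChild) o).stringValue);
    }

    @Override
    public int hashCode() {
        return Objects.hash(stringValue);
    }
}

class IdentityDemo {
    // Adds both objects to a HashSet and reports how many entries survive.
    static int sizeOfSet(Object a, Object b) {
        Set<Object> set = new HashSet<>();
        set.add(a);
        set.add(b);
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeOfSet(new DelegatingChild("en"), new DelegatingChild("en"))); // prints 2
        System.out.println(sizeOfSet(new StandaloneChild("en"), new StandaloneChild("en"))); // prints 1
    }
}
```

The catch with this approach is that it has to be revisited whenever the base class gains or loses its own equals()/hashCode(), which is exactly the kind of change a unit test can flag.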

Moral of the story: in a hierarchy of Java beans be careful with the base class. And as much as it makes me sound like a broken record: write some unit tests for simple things like hashCode() and equals(), especially if you are going to use these objects in hash-based collections such as HashSet and HashMap (which rely on the object hashes).
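As a rough idea of what such a test could check, here is a framework-free sketch; in a real project the same checks would more likely live in a JUnit test. The helper class EqualsHashCheck and its method firstViolation are hypothetical names introduced for this example.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, introduced only for illustration.
class EqualsHashCheck {
    // Takes two objects that are expected to be equal.
    // Returns null if every check passes, otherwise a description of the first failure.
    static String firstViolation(Object a, Object b) {
        if (!a.equals(b)) return "a.equals(b) is false";
        if (!b.equals(a)) return "b.equals(a) is false";
        if (a.hashCode() != b.hashCode()) return "hash codes differ";

        Set<Object> set = new HashSet<>();
        set.add(a);
        set.add(b);
        if (set.size() != 1) return "HashSet kept both objects";
        return null;
    }

    public static void main(String[] args) {
        // Two distinct String instances with the same content honour the contract.
        System.out.println(firstViolation(new String("en"), new String("en"))); // prints null

        // Two plain Objects fall back to identity and fail the very first check.
        System.out.println(firstViolation(new Object(), new Object())); // prints a.equals(b) is false
    }
}
```

Feeding two equal-by-data instances of each bean in the hierarchy through a check like this would have caught the refactoring problem described above.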

More code for this on github here: https://github.com/liviutudor/java-object-hashcode

One Response to “The eternal issue about Object.hashCode() in Java”

  1. Kurt Clash

    Never ever base your hashCode on a mutable field. And: never ever use instanceof in equals – use getClass() == other.getClass(). Using instanceof leads to #equals not being symmetric when subclasses are in use.
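The symmetry point in this comment can be illustrated with a short sketch. All four classes below are hypothetical and exist only for this example. The first pair uses instanceof, the second pair uses getClass().

```java
import java.util.Objects;

// instanceof-based equality (hypothetical classes, for illustration only)
class InstanceofPoint {
    final int x;
    InstanceofPoint(int x) { this.x = x; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof InstanceofPoint)) return false;
        return x == ((InstanceofPoint) o).x;
    }
    @Override public int hashCode() { return Integer.hashCode(x); }
}

class InstanceofLabeledPoint extends InstanceofPoint {
    final String label;
    InstanceofLabeledPoint(int x, String label) { super(x); this.label = label; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof InstanceofLabeledPoint)) return false;
        return super.equals(o) && Objects.equals(label, ((InstanceofLabeledPoint) o).label);
    }
    @Override public int hashCode() { return Objects.hash(super.hashCode(), label); }
}

// getClass()-based equality (hypothetical classes, for illustration only)
class GetClassPoint {
    final int x;
    GetClassPoint(int x) { this.x = x; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        return x == ((GetClassPoint) o).x;
    }
    @Override public int hashCode() { return Integer.hashCode(x); }
}

class GetClassLabeledPoint extends GetClassPoint {
    final String label;
    GetClassLabeledPoint(int x, String label) { super(x); this.label = label; }

    @Override public boolean equals(Object o) {
        // super.equals already rejects null and objects of a different runtime class.
        if (!super.equals(o)) return false;
        return Objects.equals(label, ((GetClassLabeledPoint) o).label);
    }
    @Override public int hashCode() { return Objects.hash(super.hashCode(), label); }
}

class SymmetryDemo {
    public static void main(String[] args) {
        InstanceofPoint p = new InstanceofPoint(1);
        InstanceofPoint lp = new InstanceofLabeledPoint(1, "a");
        System.out.println(p.equals(lp)); // prints true
        System.out.println(lp.equals(p)); // prints false, so equals is not symmetric

        GetClassPoint q = new GetClassPoint(1);
        GetClassPoint lq = new GetClassLabeledPoint(1, "a");
        System.out.println(q.equals(lq)); // prints false
        System.out.println(lq.equals(q)); // prints false, consistent in both directions
    }
}
```

The getClass() variant restores symmetry at the price that a subclass instance can never equal a base-class instance, so whether that trade-off is acceptable depends on the hierarchy.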