Collection Sorting — Java vs Groovy

With the introduction of lambdas in Java (not so) recently, some argue that Groovy lost some of its thunder, as closures are now first class citizens in the JDK. However, as I’m about to show, while lambdas pushed the Java language a great deal forward, Groovy still makes a lot of things incredibly easy (and very succinct to write).

Take for instance something as simple as sorting a list. If the elements are of a type that already implements Comparable (Strings, Integers etc.), you get that out of the box in Java, and a simple call to Collections.sort() solves the problem. What happens, though, if you decide to use your own Java bean?

Well, then you have a couple of options in Java:

  • have your bean implement Comparable
  • use a Comparator implementation

The first one comes in handy when there is only one sensible way to compare your Java beans. For instance, if your class encapsulates only one String, but offers other methods around it, then the only way to sort it is likely based on the encapsulated String, so Comparable makes sense.

If we consider your class to encapsulate just a person’s name, the code could look something like this:

public class Person implements Comparable<Person> {
   private String name;
 
   public String getName() { return name; }
 
   public void setName( String name ) { this.name = name; }
 
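   @Override
   public int compareTo( Person p ) {
      if( p == null ) return 1; // everything greater than null!
      if( name == null ) {
         // a null name sorts before any non-null name
         return ( p.name == null ) ? 0 : -1;
      }
      if( p.name == null ) return 1; // avoids the NullPointerException from name.compareTo( null )
      return name.compareTo( p.name );
   }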
}

Once you have this in place, whenever you need a List<Person> sorted, simply call Collections.sort() and the natural order you have implemented via compareTo will be applied.
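
As a quick usage sketch, here is what that call could look like end to end. The class name NaturalOrderDemo, the helper sortedNames and the sample names are made up for illustration, and the Person in it is a cut-down stand-in that assumes non-null names to keep things short:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class NaturalOrderDemo {

    // Cut-down, illustrative stand-in for the Person bean above.
    static class Person implements Comparable<Person> {
        private final String name;

        Person(String name) { this.name = name; }

        String getName() { return name; }

        @Override
        public int compareTo(Person p) {
            // Assumption: names are never null in this sketch.
            return name.compareTo(p.name);
        }
    }

    // Hypothetical helper: wraps the names in Person objects,
    // sorts them by natural order and returns the sorted names.
    static List<String> sortedNames(String... names) {
        List<Person> people = new ArrayList<>();
        for (String n : names) {
            people.add(new Person(n));
        }
        Collections.sort(people); // no Comparator needed: Person is Comparable
        List<String> result = new ArrayList<>();
        for (Person p : people) {
            result.add(p.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortedNames("Carol", "Alice", "Bob")); // prints [Alice, Bob, Carol]
    }
}
```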

Now if your Java bean encapsulates a few fields, there might be instances where sorting by different fields (or combinations of them) is necessary. At this stage, Comparable doesn’t help, as you can only provide one single implementation for the comparison this way!

Let’s consider the class Person now but with 3 fields: first name, surname and age. You can easily envisage cases where we need to sort by age, as well as cases where we need to sort by surname + first name. We can do this easily by passing different Comparators in the call to Collections.sort(), and this is where lambdas play a crucial role in writing compact code:

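Collections.sort( myList, (p1, p2) -> {
   if( p1 == p2 ) return 0; // same instance, or both null
   if( p1 == null ) return -1;
   if( p2 == null ) return 1;
   return Integer.compare( p1.getAge(), p2.getAge() ); // no overflow, unlike subtraction
});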

The above code sorts the collection by age (in ascending order). You can just as easily implement a comparison based on surname:

Collections.sort( myList, (p1, p2) -> {
   if( p1 == p2 ) return 0; // same instance, or both null
   if( p1 == null ) return -1;
   if( p2 == null ) return 1;
   if( p1.surname == null ) return ( p2.surname == null ) ? 0 : -1;
   if( p2.surname == null ) return 1;
   return p1.surname.compareTo(p2.surname);
});

There are in fact libraries out there that allow you to eliminate a lot of the boilerplate code around checking for nulls and so on. Still, this is a huge step ahead from previous JDK versions, where you had to supply an anonymous inner class (implementing Comparator) or, even worse before that, actually declare a class which implements the Comparator interface and instantiate it.
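
For what it’s worth, since Java 8 the JDK itself ships a few static helpers on Comparator (comparing, nullsFirst, naturalOrder) that can take over most of that null checking. Below is a minimal sketch of the surname comparison written that way. The class NullSafeSortDemo, the demo() helper and the sample surnames are invented for illustration, and I’m assuming Person exposes a getSurname() getter:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class NullSafeSortDemo {

    // Illustrative bean with only the field this sketch needs.
    static class Person {
        private final String surname;

        Person(String surname) { this.surname = surname; }

        String getSurname() { return surname; } // assumed getter

        @Override
        public String toString() { return "Person(" + surname + ")"; }
    }

    // Null surnames sort before any non-null surname.
    static final Comparator<String> NULL_SAFE_STRING =
        Comparator.nullsFirst(Comparator.<String>naturalOrder());

    // Compare people by surname using the null-tolerant String comparator.
    static final Comparator<Person> BY_SURNAME =
        Comparator.comparing(Person::getSurname, NULL_SAFE_STRING);

    // Null Person entries sort before everything else.
    static final Comparator<Person> NULL_SAFE_BY_SURNAME =
        Comparator.nullsFirst(BY_SURNAME);

    // Hypothetical helper: sorts a small sample list and returns it.
    static List<Person> demo() {
        List<Person> people = new ArrayList<>(Arrays.asList(
            new Person("Smith"), null, new Person(null), new Person("Jones")));
        people.sort(NULL_SAFE_BY_SURNAME);
        return people;
    }

    public static void main(String[] args) {
        // prints [null, Person(null), Person(Jones), Person(Smith)]
        System.out.println(demo());
    }
}
```

The three comparators are kept separate here purely for readability; they can be nested into a single expression.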

However, I would argue this is still a loooong way from Groovy! 🙂

The Comparator approach still requires us to pass a closure to Collections.sort() in Groovy. However, the safe dereference operator (?.), together with the spaceship operator (<=>), which tolerates nulls on either side, makes the code nicer to write:

Collections.sort( myList, { p1, p2 ->
   p1?.surname <=> p2?.surname
})

As for Comparable, Groovy offers a nice @Sortable annotation (from groovy.transform), which gives us out-of-the-box sorting based on the order in which we declare the fields in the class:

import groovy.transform.Sortable

@Sortable class Person {
   String firstName
   String surname
   int age
}

The above would always sort the instances on first name first, then on surname, then on age. And if we are in the habit of continuously refactoring our classes and don’t want to rely on the order of fields in the class, the @Sortable annotation allows us to customize this:

@Sortable(includes=['age', 'surname', 'firstName']) class Person {
   String firstName
   String surname
   int age
}

This code now sorts by age first, then surname, then first name.
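
For comparison, here is roughly what the same ordering could look like on the Java side with chained comparators. This is only a sketch: the class MultiFieldSortDemo, the demo() helper and the sample people are made up, the getters are assumed, and nulls are ignored to keep it short:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MultiFieldSortDemo {

    // Illustrative Java counterpart of the three-field Person.
    static class Person {
        private final String firstName;
        private final String surname;
        private final int age;

        Person(String firstName, String surname, int age) {
            this.firstName = firstName;
            this.surname = surname;
            this.age = age;
        }

        String getFirstName() { return firstName; }
        String getSurname() { return surname; }
        int getAge() { return age; }

        @Override
        public String toString() {
            return firstName + " " + surname + " (" + age + ")";
        }
    }

    // Age first, then surname, then first name (assumes no null values).
    static final Comparator<Person> BY_AGE_SURNAME_FIRST_NAME =
        Comparator.comparingInt(Person::getAge)
                  .thenComparing(Person::getSurname)
                  .thenComparing(Person::getFirstName);

    // Hypothetical helper: sorts a small sample list and returns it.
    static List<Person> demo() {
        List<Person> people = new ArrayList<>(Arrays.asList(
            new Person("Ann", "Lee", 30),
            new Person("Bob", "Kay", 25),
            new Person("Cid", "Kay", 30),
            new Person("Al", "Kay", 30)));
        people.sort(BY_AGE_SURNAME_FIRST_NAME);
        return people;
    }

    public static void main(String[] args) {
        // prints [Bob Kay (25), Al Kay (30), Cid Kay (30), Ann Lee (30)]
        System.out.println(demo());
    }
}
```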

You tell me now which one is more elegant — not to mention easier to write! 🙂