Java Strings vs Ropes


Recently, in my news feeds I stumbled across this implementation of Ropes in Java (http://ahmadsoft.org/ropes/), and it captured my attention right away, as it boasts a faster (better?) alternative for dealing with characters in Java. Unfortunately the website doesn’t offer any comparative information on how much faster a Rope implementation is than the standard String implementation, so I set off to write some stupid simple tests to get some basic measurements for this library.

The idea behind my tests was rather simple: as a first pass, I want to find out how much time it takes to create, concatenate to and find in a Rope vs a String. I’ve published the sources for these tests on GitHub and you can see for yourself in this repository: https://github.com/liviutudor/ropesvstrings. It’s once again a maven-ized Java project, and I’ve set up the pom.xml so you can run the test suite with maven; simply execute

mvn exec:java

and it will run the com.liviutudor.App class, which executes the tests for the String and Rope implementations once. Since the String implementation is known not to be the best way to go about concatenating strings in Java, I’ve included a StringBuilder test as well. (I chose StringBuilder since StringBuffer is synchronized, and as such inherently slower than a StringBuilder. We all know that….right? :D) Since one run is not entirely conclusive, I’ve put together a shell script (run.sh) which runs the whole suite 5 times, the idea being to average the numbers over these 5 executions. (As a side note, I’ve used the Perf4J library to get the timings. I haven’t configured any nice output formats, so by default this will dump out all the timers to console, and I’ve decided that for the little tests I was running this is enough. Feel free to configure fancy logging and what-not if you want to perform more in-depth analysis of this.)
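To give an idea of the shape of such a test, here is a minimal sketch that times single-char concatenation on a String versus a StringBuilder. It is not the code from the repository: the class and method names are made up for illustration, it uses plain System.nanoTime() rather than Perf4J, and it leaves the Rope out so that it runs without the extra jar.

```java
import java.util.Arrays;

// Hypothetical sketch, not the code from the ropesvstrings repository.
class ConcatTimingSketch {

    // Builds a 100-char starting value (the size is an assumption for this sketch).
    static String base() {
        char[] chars = new char[100];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    // Total nanoseconds spent appending one char `iterations` times to a String.
    // Every += allocates a brand new String and copies the old contents.
    static long timeStringConcat(int iterations) {
        String s = base();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            s += 'x';
        }
        long elapsed = System.nanoTime() - start;
        if (s.length() != 100 + iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed;
    }

    // Same loop against a StringBuilder, which appends into its own buffer.
    static long timeBuilderConcat(int iterations) {
        StringBuilder sb = new StringBuilder(base());
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sb.append('x');
        }
        long elapsed = System.nanoTime() - start;
        if (sb.length() != 100 + iterations) {
            throw new IllegalStateException("unexpected length");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 10_000;
        System.out.println("String:        " + timeStringConcat(iterations) + " ns total");
        System.out.println("StringBuilder: " + timeBuilderConcat(iterations) + " ns total");
    }
}
```

A single cold run like this is heavily influenced by JIT warm-up, which is one more reason to repeat the suite several times and average.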

One note about the Ropes library though: I couldn’t find it in the standard maven repos, so there are 2 approaches to compiling the code and running it.

1. Using the <systemPath> inside <dependency>

This means specifying the dependency using the system scope and pointing <systemPath> to the lib/ropes.jar that I have included in the github repo:

<dependency>
<groupId>org.ahmadsoft</groupId>
<artifactId>ropes</artifactId>
<version>1.2.5</version>
<scope>system</scope>
<systemPath>${project.basedir}/lib/ropes.jar</systemPath>
</dependency>

This works for compiling the code. However, when using the exec:java mojo in maven, it fails to find the jar, and fixing that requires some ninja-style pom changes which I wasn’t prepared to go through for the purpose of this. (Feel free to though and send me a pull request perhaps? 😉 )

2. Installing the jar in your local repo

This is much easier and it simply involves (as per instructions here) first running a command like the following in the project directory:

mvn install:install-file -Dfile=lib/ropes.jar -DgroupId=org.ahmadsoft -DartifactId=ropes -Dversion=1.2.5 -Dpackaging=jar

and then referencing the jar “normally” (as you can tell, I opted for “org.ahmadsoft” as groupId and “ropes” as artifactId):

<dependency>
<groupId>org.ahmadsoft</groupId>
<artifactId>ropes</artifactId>
<version>1.2.5</version>
</dependency>

The latter approach makes mvn exec:java a breeze, and as such I recommend it.

Getting back to running the tests: I’ve put together an Excel worksheet with the results, which is included in the github repo too, but I’m going to present the results here as well and go through them. If you look at the Excel sheet yourself, bear in mind I’m not that good with it, but I’ve highlighted the final numbers in yellow and converted them to nanoseconds as well, since the value was occasionally too small to be represented in milliseconds. (As a reminder: the values on the left hand side of the Excel sheet are the values for 100,000 iterations summed up. See the code!)
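The conversion from a summed timing to a per-operation figure is simple arithmetic; a small sketch of it follows. The input totals below are made-up placeholder values, not numbers from the spreadsheet, and the class and method names are hypothetical.

```java
// Hypothetical helper: turns summed timings into nanoseconds per operation.
class PerOpSketch {

    // Average of the totals (in milliseconds) reported by several runs.
    static double averageMillis(double[] totalsMillis) {
        double sum = 0;
        for (double t : totalsMillis) {
            sum += t;
        }
        return sum / totalsMillis.length;
    }

    // 1 ms = 1,000,000 ns, and the total covers `iterations` operations.
    static double nanosPerOp(double totalMillis, int iterations) {
        return totalMillis * 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        // Placeholder totals for 5 runs of 100,000 iterations each (not real results).
        double[] runs = {4.0, 5.0, 6.0, 5.0, 5.0};
        double avg = averageMillis(runs);
        System.out.println("average total: " + avg + " ms");
        System.out.println("per operation: " + nanosPerOp(avg, 100_000) + " ns");
    }
}
```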

So, on average, it seems that creation time is comparable between Ropes and Strings (slightly slower for Ropes), while StringBuilder is around twice as slow. However, what StringBuilder and Rope lose in creation time, they more than make up for when it comes to concatenating! Whereas a String takes 21 ns to concatenate a char, a Rope takes only 1.1 ns!! (I’ll come back to why the numbers for StringBuilder char concatenation are not that relevant in this test and why they are constant at 0.06 ns.) The fastest string concatenation still remains StringBuilder at 0.16 ns per operation, with the Rope behind at 24 ns (String is way behind at 3,770 ns, in other words about 3.8 microseconds!). However, this again is not that conclusive, because Ropes are immutable, which means that during the test a new Rope gets created at each concatenation, whereas the StringBuilder only occasionally has to re-allocate and copy its memory. (Note to self: think of a better test for these 2!) Finally, when it comes to finding, Rope pays the price for all the other niceties: 5.69 ns per find operation versus 0.4 ns with a String (0.8 ns in a StringBuilder!).
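For a rough intuition of why a rope can be quick to concatenate and slower to search, here is a toy rope. To be clear, this is not how the ahmadsoft library is implemented (I haven’t reproduced its internals here); it is only the textbook idea, with names invented for this sketch: concatenation creates one small node pointing at the two halves instead of copying characters, while every character lookup has to walk down the tree.

```java
// Toy illustration only: NOT the ahmadsoft Ropes implementation.
interface ToyRope {
    int length();

    char charAt(int index);

    // Immutable concatenation: allocates one node, copies no characters.
    default ToyRope append(ToyRope other) {
        return new Concat(this, other);
    }

    // Naive search: each charAt call descends the tree again.
    default int indexOf(char c) {
        for (int i = 0; i < length(); i++) {
            if (charAt(i) == c) {
                return i;
            }
        }
        return -1;
    }

    static ToyRope of(String s) {
        return new Leaf(s);
    }
}

// Leaf node wrapping a plain String.
final class Leaf implements ToyRope {
    private final String text;

    Leaf(String text) {
        this.text = text;
    }

    public int length() {
        return text.length();
    }

    public char charAt(int index) {
        return text.charAt(index);
    }
}

// Inner node joining two ropes; a real rope would also rebalance the tree.
final class Concat implements ToyRope {
    private final ToyRope left;
    private final ToyRope right;
    private final int length;

    Concat(ToyRope left, ToyRope right) {
        this.left = left;
        this.right = right;
        this.length = left.length() + right.length();
    }

    public int length() {
        return length;
    }

    public char charAt(int index) {
        int leftLength = left.length();
        return index < leftLength ? left.charAt(index) : right.charAt(index - leftLength);
    }
}

class ToyRopeDemo {
    public static void main(String[] args) {
        ToyRope rope = ToyRope.of("abc").append(ToyRope.of("def")).append(ToyRope.of("g"));
        System.out.println(rope.length());     // 7
        System.out.println(rope.charAt(4));    // e
        System.out.println(rope.indexOf('g')); // 6
    }
}
```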

Getting back to the StringBuilder and concatenation of a single char: it occurred to me after I ran the test that, because of the way I construct the initial StringBuilder, it starts with 100 chars in the buffer and an extra capacity of 16 chars. This means the first 16 append() calls fit in the existing buffer, and when the 17th occurs the buffer gets expanded to 116 * 2 + 2 = 234 chars. Since I only append 1 char 100 times, the second expansion never occurs, hence the very low times on this.
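The capacity behaviour is easy to check with StringBuilder.capacity(). The sketch below (class name invented for illustration) records the capacity right after construction from a 100-char String, after 16 appends, and after the 17th. The initial value of length + 16 is documented; the exact size after growth is an implementation detail, so I only print it rather than promise a number, although on the JDKs I’m aware of it comes out as twice the old capacity plus 2.

```java
import java.util.Arrays;

// Hypothetical sketch for inspecting StringBuilder capacity growth.
class BuilderCapacitySketch {

    // Returns {initial capacity, capacity after 16 appends, capacity after 17 appends}.
    static int[] capacities() {
        char[] chars = new char[100];
        Arrays.fill(chars, 'a');
        StringBuilder sb = new StringBuilder(new String(chars));

        int initial = sb.capacity(); // documented as string length + 16

        for (int i = 0; i < 16; i++) {
            sb.append('x'); // these still fit in the spare 16 chars
        }
        int afterSixteen = sb.capacity();

        sb.append('x'); // the 117th char no longer fits, so the buffer grows
        int afterSeventeen = sb.capacity();

        return new int[] {initial, afterSixteen, afterSeventeen};
    }

    public static void main(String[] args) {
        int[] c = capacities();
        System.out.println("initial:          " + c[0]);
        System.out.println("after 16 appends: " + c[1]);
        System.out.println("after 17 appends: " + c[2]);
    }
}
```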

As a final conclusion, this seems indeed to be a very fast implementation, and preferable to String by far! Not convinced yet it’s preferable to StringBuilder, but as I highlighted above some of my tests are flawed. Also, if you are looking for immutable solutions, then StringBuilder is not an option and as such Rope wins hands down. One thing is for sure though: when it comes to searching, converting back to a String instance gave the fastest time in these tests. So prefer StringBuilder.toString().indexOf() over StringBuilder.indexOf()… hmmm… interesting, huh?
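For reference, the two ways of searching mentioned above look like this; both return the same index, the only difference being where the search runs. The class and method names are invented for this sketch. Keep in mind that toString() copies the characters into a new String, so whether the detour pays off presumably depends on how many searches you run per conversion.

```java
// Hypothetical sketch comparing the two search paths on a StringBuilder.
class IndexOfSketch {

    // Searches directly in the builder's buffer.
    static int findViaBuilder(StringBuilder sb, String needle) {
        return sb.indexOf(needle);
    }

    // Copies the contents into a String first, then searches the String.
    static int findViaString(StringBuilder sb, String needle) {
        return sb.toString().indexOf(needle);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("ropes versus strings");
        System.out.println(findViaBuilder(sb, "versus")); // 6
        System.out.println(findViaString(sb, "versus"));  // 6
    }
}
```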