Small Yet So Costly Java Mistake

Posted by & filed under , , .

DukeWithHelmetOK so I felt like I needed to share this code with “the world” as I think it’s a good example of a few things:

  • first of all nobody’s perfect — I know you read that in my website tag line 🙂 but yeah nobody’s perfect… sadly, in this instance, myself included 🙂
  • secondly, it shows that the smallest (really really small as you will see!) mistake can have huuuge repercussions. This ought to be an incentive for most people to get a second pair of eyes on their code every now and then — I’m not talking pair programming and rigorous lengthy code reviews, but just someone else to spend half an hour maybe a week over your code. In this particular case, I was definitely experiencing “code fatigue” — so I was staring at the same piece of code which screams obvious mistake to everyone with a fresh pair of eyes but it did not occur to me until I took a break of a few days then looked back at the code and spotted it right away.
  • thirdly, it should discourage everyone to re-invent the wheel. In my case, simply using something like the Guava library and their ThreadFactoryBuilder would have avoided this from the beginning.
  • and last but not least, to the devs out there looking after plugins such as FindBugs, PMD and the likes — this sort of issue would never get caught by these tools (at the moment at least!) so perhaps might be something to add to these tools.

And now, after this (rather majestic and) long intro, let’s look at what is this small costly mistake I’m talking about.

To put everything in context first, I’m debugging a multi-threaded application; this beast has tons of threads, I’m not kidding. And I’m only interested in one 🙂 In other words, I have a one particular piece of code executing in one of the threads in this application and I want to debug it. The problems is that occasionally with so many threads in my application and so many breakpoints, it’d difficult to identify which thread is running the piece of code I’m interested in. My code uses a thread pool of about 50 threads and submits tasks running this piece of code to the thread pool — so it’s difficult to figure out which thread is which at times.

So to make it easier on myself during my debugging sessions in IntelliJ, I’ve decided to use a ThreadFactory and pass that in the call to Executors.newCachedThreadPool() — the ThreadFactory will simply name my threads some name which would make it easier for me to recognize. (In fact it will use a “stem” and a counter — so names will be something like MyThread.001, MyThread.002, MyThread.003 and so on…)

Easily done, I went ahead and implemented my ThreadFactory then I notice an oddity: none of my breakpoints are executing anymore. I suspect first an issue with IntelliJ, then with my build and then with the libraries. So to reduce the scope of this investigation, I created a simple standalone project — which you can find on GitHub here: https://github.com/liviutudor/SmallJavaMistake

In this code, I have only 3 classes — though it can be easily written in one single file — in order to simplify the setup and help me track down the issue:

  • a DummyRunnable — this is just a simple Runnable implementation which just prints a message to stdout then exits;
  • the main program (Library.java) — this is the driver: it creates a thread pool (in this case a fixed thread pool of 5 threads), it passes in a thread factory (the one mentioned below) and then submits 100 tasks to the thread pool and waits for it for 100 seconds to complete the work. (Since the tasks submitted are all DummyRunnable‘s, 100 seconds should be plenty for most laptops/computers to actually run these but feel free to play with that delay and you’ll see it doesn’t have any impact.)
  • a thread factory (NamedThreadFactory) — this is the thread factory used to name the threads created (so I can identify them easily in my IDE as per above). If you look at the source code, this is nothing fancy: the thread factory accepts a base name (stem) and a format string, it then uses these 2 and a static counter to build up the thread name. This allows us to have an incremental id for each thread created (the static counter) as well as another more human-readable string and then “paste” the 2 together to build a thread name and set it on a newly created thread:
@Override
public Thread newThread(Runnable r) {
   String threadName = buildName();
   System.out.println("Creating new thread with name " + threadName);
   return new Thread(threadName);
}

On top of these, all these classes print out some more debugging messages to stdout so I can easily find out what’s going on when I run the program just looking at the output. Now let’s run this code (you can run it from gradle directly):

lgml-ltudor3QC:SmallJavaMistake > ./gradlew run
:compileJava UP-TO-DATE
:processResources UP-TO-DATE
:classes UP-TO-DATE
:run
Submitting 0
Creating new thread with name LivTest.001
Submitting 1
Creating new thread with name LivTest.002
Submitting 2
Creating new thread with name LivTest.003
Submitting 3
Creating new thread with name LivTest.004
Submitting 4
Creating new thread with name LivTest.005
Submitting 5
Submitting 6
Submitting 7
Submitting 8
Submitting 9
Submitting 10
Submitting 11
Submitting 12
Submitting 13
Submitting 14
Submitting 15
Submitting 16
Submitting 17
Submitting 18
Submitting 19
Submitting 20
Submitting 21
Submitting 22
Submitting 23
Submitting 24
Submitting 25
Submitting 26
Submitting 27
Submitting 28
Submitting 29
Submitting 30
Submitting 31
Submitting 32
Submitting 33
Submitting 34
Submitting 35
Submitting 36
Submitting 37
Submitting 38
Submitting 39
Submitting 40
Submitting 41
Submitting 42
Submitting 43
Submitting 44
Submitting 45
Submitting 46
Submitting 47
Submitting 48
Submitting 49
Submitting 50
Submitting 51
Submitting 52
Submitting 53
Submitting 54
Submitting 55
Submitting 56
Submitting 57
Submitting 58
Submitting 59
Submitting 60
Submitting 61
Submitting 62
Submitting 63
Submitting 64
Submitting 65
Submitting 66
Submitting 67
Submitting 68
Submitting 69
Submitting 70
Submitting 71
Submitting 72
Submitting 73
Submitting 74
Submitting 75
Submitting 76
Submitting 77
Submitting 78
Submitting 79
Submitting 80
Submitting 81
Submitting 82
Submitting 83
Submitting 84
Submitting 85
Submitting 86
Submitting 87
Submitting 88
Submitting 89
Submitting 90
Submitting 91
Submitting 92
Submitting 93
Submitting 94
Submitting 95
Submitting 96
Submitting 97
Submitting 98
Submitting 99
java.util.concurrent.ThreadPoolExecutor@3d4eac69[Running, pool size = 5, active threads = 5, queued tasks = 95, completed tasks = 0]
> Building 75% > :run
java.util.concurrent.ThreadPoolExecutor@3d4eac69[Running, pool size = 5, active threads = 5, queued tasks = 95, completed tasks = 0]
 
BUILD SUCCESSFUL
 
Total time: 1 mins 43.174 secs
lgml-ltudor3QC:SmallJavaMistake >

WTF???? My code submits 5 tasks to the thread pool, the thread pool invokes the thread factory to create 5 threads (which is expected, our thread pool is meant to have 5 threads), but it seems none of them gets executed! The reason for that is that we end up submitting all the 100 tasks and at the end of this our thread pool reports that it has 5 active tasks and 95 in the queue. However, none of these tasks get executed as there’s nothing on stdout! Even more, after the 100 seconds wait, the thread pool reports the same thing: 95 tasks queued and 5 active — so no state change takes place during these 100 seconds! Odd, wouldn’t you say?

I’ve run this at this point on a couple of other laptops to eliminate any oddity with my configuration / laptop / JVM and I end up with the same result. So the issue is not JVM, not the laptop — it’s the code!

Now, where is the issue?

To identify it, I thought I’d plug in the Guava library — so I’ve re-written this line in Library.java:

ExecutorService exec = Executors.newFixedThreadPool(threads, new NamedThreadFactory("LivTest"));

as follows:

ExecutorService exec = Executors.newFixedThreadPool(threads, new ThreadFactoryBuilder().setNameFormat("LivTest").build());

And when you run this — it works!!!

As such, the problem must be with my thread factory 🙁 So I went and looked again — and looked again … and again… That code is minimal what could be wrong with it? Believe it or not I had to take a break away from this code and then when I came back and looked again I spotted it: the problem is on line 64:

return new Thread(threadName);

This thing just builds a thread with a name — and no runnable!!! As such our thread just sits there and never finishes — the thread pool thinks this thread is busy so keeps queuing tasks for it but never gets to actually execute any of the other tasks. Simply writing it like this solves the issue:

Thread t = new Thread(r);
t.setName(threadName);
return t;

Like I said, really small mistake but because it’s one tiny line that makes it occasionally difficult to spot it — and sadly, none of the automated tools I have catch this. Small bug, huge implications — and tons of frustrations on my side 🙁

The learning from this I guess is that every line of code matters — so I need to pay more attention to these details.

One Response to “Small Yet So Costly Java Mistake”

  1. Kai

    Glad you found the culprit 🙂