OK so I felt like I needed to share this code with “the world” as I think it’s a good example of a few things:
- first of all nobody’s perfect — I know you read that in my website tag line 🙂 but yeah nobody’s perfect… sadly, in this instance, myself included 🙂
- secondly, it shows that the smallest (really really small as you will see!) mistake can have huuuge repercussions. This ought to be an incentive for most people to get a second pair of eyes on their code every now and then — I’m not talking pair programming and rigorous lengthy code reviews, but just someone else to spend half an hour maybe a week over your code. In this particular case, I was definitely experiencing “code fatigue” — so I was staring at the same piece of code which screams obvious mistake to everyone with a fresh pair of eyes but it did not occur to me until I took a break of a few days then looked back at the code and spotted it right away.
- thirdly, it should discourage everyone to re-invent the wheel. In my case, simply using something like the Guava library and their ThreadFactoryBuilder would have avoided this from the beginning.
- and last but not least, to the devs out there looking after plugins such as FindBugs, PMD and the likes — this sort of issue would never get caught by these tools (at the moment at least!) so perhaps might be something to add to these tools.
And now, after this (rather majestic and) long intro, let’s look at what is this small costly mistake I’m talking about.
To put everything in context first, I’m debugging a multi-threaded application; this beast has tons of threads, I’m not kidding. And I’m only interested in one 🙂 In other words, I have a one particular piece of code executing in one of the threads in this application and I want to debug it. The problems is that occasionally with so many threads in my application and so many breakpoints, it’d difficult to identify which thread is running the piece of code I’m interested in. My code uses a thread pool of about 50 threads and submits tasks running this piece of code to the thread pool — so it’s difficult to figure out which thread is which at times.
So to make it easier on myself during my debugging sessions in IntelliJ, I’ve decided to use a ThreadFactory
and pass that in the call to Executors.newCachedThreadPool()
— the ThreadFactory
will simply name my threads some name which would make it easier for me to recognize. (In fact it will use a “stem” and a counter — so names will be something like MyThread.001, MyThread.002, MyThread.003 and so on…)
Easily done, I went ahead and implemented my ThreadFactory
then I notice an oddity: none of my breakpoints are executing anymore. I suspect first an issue with IntelliJ, then with my build and then with the libraries. So to reduce the scope of this investigation, I created a simple standalone project — which you can find on GitHub here: https://github.com/liviutudor/SmallJavaMistake
In this code, I have only 3 classes — though it can be easily written in one single file — in order to simplify the setup and help me track down the issue:
- a DummyRunnable — this is just a simple
Runnable
implementation which just prints a message tostdout
then exits; - the main program (Library.java) — this is the driver: it creates a thread pool (in this case a fixed thread pool of 5 threads), it passes in a thread factory (the one mentioned below) and then submits 100 tasks to the thread pool and waits for it for 100 seconds to complete the work. (Since the tasks submitted are all
DummyRunnable
‘s, 100 seconds should be plenty for most laptops/computers to actually run these but feel free to play with that delay and you’ll see it doesn’t have any impact.) - a thread factory (NamedThreadFactory) — this is the thread factory used to name the threads created (so I can identify them easily in my IDE as per above). If you look at the source code, this is nothing fancy: the thread factory accepts a base name (stem) and a format string, it then uses these 2 and a static counter to build up the thread name. This allows us to have an incremental id for each thread created (the static counter) as well as another more human-readable string and then “paste” the 2 together to build a thread name and set it on a newly created thread:
@Override public Thread newThread(Runnable r) { String threadName = buildName(); System.out.println("Creating new thread with name " + threadName); return new Thread(threadName); } |
On top of these, all these classes print out some more debugging messages to stdout
so I can easily find out what’s going on when I run the program just looking at the output. Now let’s run this code (you can run it from gradle directly):
lgml-ltudor3QC:SmallJavaMistake > ./gradlew run :compileJava UP-TO-DATE :processResources UP-TO-DATE :classes UP-TO-DATE :run Submitting 0 Creating new thread with name LivTest.001 Submitting 1 Creating new thread with name LivTest.002 Submitting 2 Creating new thread with name LivTest.003 Submitting 3 Creating new thread with name LivTest.004 Submitting 4 Creating new thread with name LivTest.005 Submitting 5 Submitting 6 Submitting 7 Submitting 8 Submitting 9 Submitting 10 Submitting 11 Submitting 12 Submitting 13 Submitting 14 Submitting 15 Submitting 16 Submitting 17 Submitting 18 Submitting 19 Submitting 20 Submitting 21 Submitting 22 Submitting 23 Submitting 24 Submitting 25 Submitting 26 Submitting 27 Submitting 28 Submitting 29 Submitting 30 Submitting 31 Submitting 32 Submitting 33 Submitting 34 Submitting 35 Submitting 36 Submitting 37 Submitting 38 Submitting 39 Submitting 40 Submitting 41 Submitting 42 Submitting 43 Submitting 44 Submitting 45 Submitting 46 Submitting 47 Submitting 48 Submitting 49 Submitting 50 Submitting 51 Submitting 52 Submitting 53 Submitting 54 Submitting 55 Submitting 56 Submitting 57 Submitting 58 Submitting 59 Submitting 60 Submitting 61 Submitting 62 Submitting 63 Submitting 64 Submitting 65 Submitting 66 Submitting 67 Submitting 68 Submitting 69 Submitting 70 Submitting 71 Submitting 72 Submitting 73 Submitting 74 Submitting 75 Submitting 76 Submitting 77 Submitting 78 Submitting 79 Submitting 80 Submitting 81 Submitting 82 Submitting 83 Submitting 84 Submitting 85 Submitting 86 Submitting 87 Submitting 88 Submitting 89 Submitting 90 Submitting 91 Submitting 92 Submitting 93 Submitting 94 Submitting 95 Submitting 96 Submitting 97 Submitting 98 Submitting 99 java.util.concurrent.ThreadPoolExecutor@3d4eac69[Running, pool size = 5, active threads = 5, queued tasks = 95, completed tasks = 0] > Building 75% > :run java.util.concurrent.ThreadPoolExecutor@3d4eac69[Running, pool size = 5, active threads = 5, queued tasks = 95, completed tasks = 0] BUILD SUCCESSFUL Total time: 1 mins 43.174 secs lgml-ltudor3QC:SmallJavaMistake > |
WTF???? My code submits 5 tasks to the thread pool, the thread pool invokes the thread factory to create 5 threads (which is expected, our thread pool is meant to have 5 threads), but it seems none of them gets executed! The reason for that is that we end up submitting all the 100 tasks and at the end of this our thread pool reports that it has 5 active tasks and 95 in the queue. However, none of these tasks get executed as there’s nothing on stdout
! Even more, after the 100 seconds wait, the thread pool reports the same thing: 95 tasks queued and 5 active — so no state change takes place during these 100 seconds! Odd, wouldn’t you say?
I’ve run this at this point on a couple of other laptops to eliminate any oddity with my configuration / laptop / JVM and I end up with the same result. So the issue is not JVM, not the laptop — it’s the code!
Now, where is the issue?
To identify it, I thought I’d plug in the Guava library — so I’ve re-written this line in Library.java
:
ExecutorService exec = Executors.newFixedThreadPool(threads, new NamedThreadFactory("LivTest")); |
as follows:
ExecutorService exec = Executors.newFixedThreadPool(threads, new ThreadFactoryBuilder().setNameFormat("LivTest").build()); |
And when you run this — it works!!!
As such, the problem must be with my thread factory 🙁 So I went and looked again — and looked again … and again… That code is minimal what could be wrong with it? Believe it or not I had to take a break away from this code and then when I came back and looked again I spotted it: the problem is on line 64:
return new Thread(threadName); |
This thing just builds a thread with a name — and no runnable!!! As such our thread just sits there and never finishes — the thread pool thinks this thread is busy so keeps queuing tasks for it but never gets to actually execute any of the other tasks. Simply writing it like this solves the issue:
Thread t = new Thread(r); t.setName(threadName); return t; |
Like I said, really small mistake but because it’s one tiny line that makes it occasionally difficult to spot it — and sadly, none of the automated tools I have catch this. Small bug, huge implications — and tons of frustrations on my side 🙁
The learning from this I guess is that every line of code matters — so I need to pay more attention to these details.
Glad you found the culprit 🙂