SimpleDateFormat and Multiple Threads

I felt like I had to write this to spread the word a bit more about a little-known danger in one of the JDK classes: SimpleDateFormat. It is used quite widely from what I can tell. Who hasn’t written something like this:

SimpleDateFormat df = new SimpleDateFormat();
df.format( new Date() );

It’s easy and convenient, and pretty easy to understand — everyone reading the above code reads this: “create a date formatter and use it to format the given date”.

There is however a bit of a problem with that code — well, a potential problem, but it can be a pretty big one!

Let’s look at what the JavaDoc for SimpleDateFormat says about this class:

Synchronization

Date formats are not synchronized. It is recommended to create separate format instances for each thread. If multiple threads access a format concurrently, it must be synchronized externally.

So, if you use the above piece of code in your multi-threaded application exactly as it’s shown above, chances are you’re ok, as long as each thread gets its own instance of SimpleDateFormat. However, if you decide to share an instance across all of your threads, you might just as well have a random string generated, since your threads are going to step on each other’s toes in using this instance.

Now let’s construct a simple example which shows what can happen if you misuse this class. Consider a simple Java program which shares one SimpleDateFormat across, say, 10 threads: each thread does 100 iterations, and on each iteration it picks 1 of 2 predefined dates to format using the shared SimpleDateFormat instance, so that at some point at least 2 threads will be formatting 2 different dates at the same time. (Two threads formatting the same Date instance through the shared SimpleDateFormat might still step on each other’s data, but since the dates are identical the corruption would not be obvious: the end result would seem correct.)

To do so, we create a helper class UseDateFormat — this takes 2 Date instances as parameters and a Format object (which will be set to an instance of SimpleDateFormat). This class simply prepares a thread pool Executor which will run 10 threads in parallel, each thread formatting one of the 2 dates 100 times. (Granted, not the best example, but good enough to cause at least 1 data corruption during execution!):

import java.text.Format;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class UseDateFormat {
 /** Number of threads sharing the formatter (10, as per the text). */
 private static final int             N_THREADS = 10;

 /** Number of format() calls per thread (100, as per the text). */
 private static final int             N_STEPS   = 100;

 /** Used for formatting output. */
 private Format                       df;

 /** First date to format. */
 private Date                         date1;

 /** Second date to format. */
 private Date                         date2;

 /**
  * Stores the results of formatting the dates, an entry for each thread
  * iteration.
  */
 private List<String>                 results;

 /** Executor service used to run all the formatting. */
 private ExecutorService              executor;

 /** List of prepared tasks. */
 private List<Callable<List<String>>> tasks;

 /**
  * Constructs an object which will format the 2 dates alternately
  * using the supplied formatter. This constructor prepares a list of
  * tasks to be submitted to the executor so it's all ready for the call
  * to {@link #go()}.
  *
  * @param date1
  *            Date 1 to format
  * @param date2
  *            Date 2 to format
  * @param df
  *            Formatter to use
  */
 public UseDateFormat(Date date1, Date date2, Format df) {
  this.date1 = date1;
  this.date2 = date2;
  this.df = df;
  this.results = new ArrayList<String>();
  this.executor = Executors.newFixedThreadPool(N_THREADS);
  this.tasks = new ArrayList<Callable<List<String>>>();
  for (int i = 0; i < N_THREADS; i++) {
   tasks.add(new Callable<List<String>>() {
    public List<String> call() throws Exception {
     List<String> results = new ArrayList<String>(N_STEPS);
     for (int j = 0; j < N_STEPS; j++) {
      // alternate between the 2 dates on the one shared formatter
      Date d = (j % 2 == 1) ? UseDateFormat.this.date1 : UseDateFormat.this.date2;
      results.add(UseDateFormat.this.df.format(d));
     }
     return results;
    }
   });
  }
 }

 /**
  * Starts the thread pool with the prepared tasks in {@link #tasks}.
  *
  * @throws Exception
  *             If any InterruptedException etc occur
  */
 public void go() throws Exception {
  if (executor.isTerminated()) {
   return;
  }
  List<Future<List<String>>> calls = executor.invokeAll(tasks);
  executor.shutdown();
  executor.awaitTermination(1L, TimeUnit.MINUTES);
  for (int i = 0; i < calls.size(); i++) {
   results.addAll(calls.get(i).get());
  }
  eliminateDuplicates();
 }

 /**
  * Goes through results, sorts it and eliminates duplicates.
  */
 private void eliminateDuplicates() {
  // no duplicates in empty list or list with 1 element
  if (results == null || results.size() < 2) {
   return;
  }
  // sort first so that equal strings end up next to each other
  Collections.sort(results);
  Iterator<String> iRes = results.iterator();
  String prev = iRes.next();
  while (iRes.hasNext()) {
   String cur = iRes.next();
   if (prev.equals(cur)) {
    iRes.remove();
   } else {
    prev = cur;
   }
  }
 }

 /**
  * Getter for {@link #results}.
  *
  * @return List of results gathered in the call to {@link #go()}.
  */
 public List<String> getResults() {
  return results;
 }
}

When go() gets called, the prepared tasks are submitted to the thread pool and the code waits there until they have all finished; each task returns a List containing the result of every single call to SimpleDateFormat.format(). We merge these lists, sort them, eliminate duplicates and print what is left. If all goes well, we should see only 2 strings in the output: the 2 dates we supplied. Anything less or more than that means there was some data corruption! (In fact, we should also see equal numbers of formatted strings for date 1 and date 2, but that is less important; simply seeing anything other than the 2 formatted strings for the 2 dates we fed in proves the point about SimpleDateFormat.)

Now I’ve fed 2 dates to this code: 14/Feb/1975 (my birthday) and 15/Aug/1969 (Woodstock, baby!!!). You’d expect I guess an output like this:

2/14/75 10:51 PM
7/7/07 10:51 PM

(I’m currently in the USA so I’m using the US locale, by the way).

Yet here’s what comes out:

2/14/07 10:51 PM
2/14/75 10:51 PM
2/14/75 11:51 PM
2/7/07 10:51 PM
7/14/07 10:51 PM
7/14/75 10:51 PM
7/7/07 10:51 PM
7/7/07 9:51 PM
7/7/75 10:51 PM

Errr…computer says no. Where did all those dates come from? Clearly the data got corrupted during execution, when 2 or more threads were calling format() with different dates!
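Why do the strings come out as a mix of both dates? SimpleDateFormat keeps the date being formatted in an internal Calendar field (inherited from DateFormat), so a second caller can overwrite it while the first is still reading from it. Here is a rough sketch of that mechanism. ToyFormat and its load()/render() methods are made up for illustration and are not the JDK code, and the unlucky interleaving is replayed on a single thread so that the outcome is repeatable:

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

class SharedStateSketch {
    /** Hypothetical toy formatter: it parks the date in a field, as SimpleDateFormat does. */
    static class ToyFormat {
        private final Calendar calendar = new GregorianCalendar();

        // Step 1 of "formatting": store the date in the shared field.
        void load(Date d) {
            calendar.setTime(d);
        }

        // Step 2 of "formatting": read the fields back to build the text.
        String render() {
            return (calendar.get(Calendar.MONTH) + 1) + "/"
                    + calendar.get(Calendar.DAY_OF_MONTH) + "/"
                    + calendar.get(Calendar.YEAR);
        }
    }

    /** Replays, on one thread, an unlucky interleaving of two callers. */
    static String interleaved(Date a, Date b) {
        ToyFormat shared = new ToyFormat();
        shared.load(a);         // "thread A" starts formatting a
        shared.load(b);         // "thread B" cuts in and overwrites the field
        return shared.render(); // "thread A" carries on, but now renders b
    }

    public static void main(String[] args) {
        Date a = new GregorianCalendar(1975, Calendar.FEBRUARY, 14).getTime();
        Date b = new GregorianCalendar(2007, Calendar.JULY, 7).getTime();
        System.out.println(interleaved(a, b)); // prints 7/7/2007, although "A" asked for a
    }
}
```

With real threads the overwrite can land between reading one field and the next, which is how you end up with half-and-half strings like the ones in the output above.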

Incidentally, if you run FindBugs against this code, it will actually correctly signal STCAL_INVOKE_ON_STATIC_DATE_FORMAT_INSTANCE. So this is a known issue, and hopefully you are already using FindBugs and know of it. But if you don’t, what is there to be done?

One solution is to have a SimpleDateFormat instance per thread — but that can prove costly, from a memory point of view. You can create your own class which wraps up a SimpleDateFormat and synchronizes on each call to format() — but this creates a bottleneck in your app.
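Both of those options fit in a few lines. What follows is only a sketch, assuming Java 8 or later: the class name SafeFormatters, the pattern and the distinctOutputs() helper are made up for illustration. A ThreadLocal hands each thread its own lazily created instance, while the second variant shares one instance and locks around every call:

```java
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SafeFormatters {
    /** Hypothetical pattern, chosen only for this sketch. */
    static final String PATTERN = "yyyy-MM-dd HH:mm";

    // Option 1: one SimpleDateFormat per thread, created on first use.
    private static final ThreadLocal<DateFormat> PER_THREAD =
            ThreadLocal.withInitial(() -> new SimpleDateFormat(PATTERN));

    static String formatPerThread(Date d) {
        return PER_THREAD.get().format(d);
    }

    // Option 2: one shared instance, every call serialized on a lock.
    private static final DateFormat SHARED = new SimpleDateFormat(PATTERN);

    static String formatSynchronized(Date d) {
        synchronized (SHARED) {
            return SHARED.format(d);
        }
    }

    /** Formats 2 dates from 8 threads and returns the distinct strings seen. */
    static Set<String> distinctOutputs(final boolean perThread) {
        final Date d1 = new Date(0L);
        final Date d2 = new Date(365L * 24 * 60 * 60 * 1000);
        final Set<String> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[8];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    Date d = (j % 2 == 0) ? d1 : d2;
                    seen.add(perThread ? formatPerThread(d) : formatSynchronized(d));
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }
}
```

Either way distinctOutputs() should only ever see 2 distinct strings. The trade-offs are the ones mentioned above: the first variant pays in memory for every thread that touches it, the second makes all threads queue up on a single lock.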

There are probably quite a few other approaches, but my preferred one is to use Apache Commons Lang’s FastDateFormat. It is still an instance of Format and it’s pretty much a straight replacement for SimpleDateFormat. The only difference is in how you create it: its constructor is protected, so you go through the getInstance() factory methods instead. These accept your own format pattern, locale etc. (much as SimpleDateFormat’s constructors do), or, if you want the defaults applied, simply use the no-argument version:

FastDateFormat df = FastDateFormat.getInstance();

The rest is absolutely the same but the advantage is that this class is thread-safe — as per the class JavaDoc:

This class can be used as a direct replacement to SimpleDateFormat in most formatting situations. This class is especially useful in multi-threaded server environments. SimpleDateFormat is not thread-safe in any JDK version, nor will it be as Sun have closed the bug/RFE.

And running the above code with an instance of FastDateFormat you will get just the 2 dates we fed in as input — as to be expected!

Now go and look back at some of your old logs where you used a SimpleDateFormat — are those dates correct?

Attached the source code — as usual for you to download: DateFormatUsage.java
