Application Monitoring and Management using Sun’s JMX/HTML Interface


This is something I wanted to write about a while back. While there are articles on the net about using JMX in a Java application to keep an eye on how it ticks, or to manage its running cycle, I think there is still a large number of users out there who are somewhat reluctant to use this part of the JDK in their own Java applications. The reasons, I think, are quite varied. They span from the wide availability of enterprise-wide monitoring tools (so much easier to outsource this part to another company so you can concentrate on building the application) to the fact that the JMX console shipped with the JDK, jconsole, is neither the most scriptable tool (it’s a Swing app after all!) nor the most friendly. In particular, it uses RMI, and configuring a firewall to allow jconsole access to your infrastructure proves a pain, since RMI uses dynamic ports.

Few know, though, that there is a little package from Sun (ahem, Oracle I mean!) which provides an HTML layer for inspecting your MBeans and invoking actions on them. While this might not sound like much (in fact, I am pretty sure Oracle’s JDMK is not the only HTML/JMX interface, so it’s not a big deal after all that Oracle provides one), it can have some (sweet) implications for your app management and monitoring infrastructure:

  • Providing an HTTP/HTML interface on top of JMX means, first of all, that you only have one port to open up in your firewall. Ask any network admin and you will find out that is a piece of cake!
  • Secondly, having an HTML interface means that you don’t actually have to write your own tools (or use a 3rd party one): your browser is enough. Even more, it doesn’t have to be a fancy browser at all; you can even use curl / wget with this!
  • OK, let me say that last bit once more: you can even use curl with this! Doesn’t sound like a big thing, huh? But let me remind you that once you can use curl to access this interface, you can script your management and monitoring via simple bash scripts, and once you can do that, a whole lot of possibilities open up! You don’t need a state-of-the-art monitoring system anymore: a simple cron job which curls a page, checks for a certain value and sends an email accordingly will do! Or even more, since you can use URLs to invoke methods on your MBeans, the same script can check for a value and curl another URL to invoke a method (e.g. to address a certain issue by resetting a counter, deleting a file, etc.).

Now you’re getting it, right? So the purpose of this post is not so much to explain what JMX does (the Oracle docs will tell you that), but rather how to use this powerful little Oracle interface to monitor and manage your application via simple scripts, though obviously I will be touching on dynamic MBeans and a few other things related to JMX.

In particular, I have implemented elements of this in Cognitive Match’s infrastructure and in a few previous jobs, and even nowadays, we find this occasionally a lifesaver.

The Code

So if you look at the code (it’s a maven-ized Eclipse project zipped up, by the way, so if you’re not an Eclipse fan simply import the pom.xml into your IDE of choice) you will notice there are only 2 classes involved (well 3, if you count the unit test!). That doesn’t sound like much, I know, but remember: the point of this is not to go through how to build your MBeans and expose them to JMX (though I might come back to that one in another post), but rather how to use the HTML/JMX interface and how to script the management and monitoring of your apps based on that. As such the Java code might not be the cleanest or most elegant, but bear with me on this until we get to the interesting part. (By the way, the unit test is provided only to show that the FreqMap class DOES work and does what it says on the tin. So for those of you who might be slightly skeptical about the code, feel free to modify this unit test if you want to double-check that there is no “magic” in the JMX/HTML interface.)

Most of the functionality has been packed into the FreqMap class. This class provides a mechanism to count the frequency of integers. Imagine you have a random number generator class somewhere else and you want to make sure the distribution of its numbers is uniform: you simply fire random numbers at this component and it will keep track of how many times each integer was generated. It does this by storing all the numbers and their frequencies in a Map. Every time one of the add() methods is invoked, it either increments the existing frequency (if found) or adds a new entry to the map for the given number. Just to make this a bit more interesting, the class also adds up all the numbers passed in and keeps this running total. At any point, one can query the frequency of a given number or the running total so far.
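To make the description above a bit more concrete, here is a minimal sketch of the counting logic. It is not the FreqMap from the archive: the class name FreqCounterSketch is made up, it uses plain synchronized methods instead of explicit locking, and it assumes that clearing the map also resets the running total.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the counting part of FreqMap (not the archive code).
class FreqCounterSketch {
    private final Map<Integer, Integer> frequencies = new HashMap<>();
    private long total = 0;

    // Bump the frequency of this number (starting at 1 if unseen)
    // and add the number itself to the running total.
    public synchronized void add(int number) {
        frequencies.merge(number, 1, Integer::sum);
        total += number;
    }

    // How many times the given number has been added; 0 if never.
    public synchronized int getFrequency(int number) {
        return frequencies.getOrDefault(number, 0);
    }

    // Number of distinct integers seen so far.
    public synchronized int size() {
        return frequencies.size();
    }

    public synchronized long getTotal() {
        return total;
    }

    // Assumption: clearing empties the map and also resets the total.
    public synchronized void clear() {
        frequencies.clear();
        total = 0;
    }

    public static void main(String[] args) {
        FreqCounterSketch counter = new FreqCounterSketch();
        counter.add(7);
        counter.add(7);
        counter.add(3);
        System.out.println("frequency of 7: " + counter.getFrequency(7));
        System.out.println("distinct numbers: " + counter.size());
        System.out.println("running total: " + counter.getTotal());
    }
}
```

Using synchronized everywhere is the bluntest way to get thread safety; it keeps the sketch short, at the cost of readers blocking each other.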

There is, of course, the added complexity of making this class usable in a multi-threaded environment: the class is thread-safe, so you will see some synchronization-related locking/unlocking code. Other than that, the above is pretty much all the main class does.

You may have noticed that FreqMap is a DynamicMBean, so it implements the methods required by this interface to expose its attributes and methods as an MBean. The class has 2 attributes:

  • Size: returns the size of the internal Map – in other words the number of distinct numbers we have seen so far;
  • Total: returns the running total mentioned above

In terms of operations, it exposes most of the operations available to a caller class: adding numbers to the map, clearing/resetting the map and retrieving the frequency for a given number. If you are familiar with the DynamicMBean interface you can ignore the code after the implementation of the size() method.

The second class, JMXExample, is the “driver” and it hardly does anything in comparison to FreqMap: it simply creates an instance of FreqMap and it registers it with JMX and then generates a bunch of random integer numbers and passes them to FreqMap, sleeps randomly for a while and starts again, in an infinite loop. Point being that it does this forever, thus allowing us to query the MBean registered with JMX.
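If you have never written a DynamicMBean, the two pieces described above (a bean exposing Size and Total, and a driver registering it with JMX) could look roughly like the sketch below. This is not the code from the archive: DynamicCounterSketch is a made-up, trimmed-down class that only exposes the two attributes and a clear operation, and it assumes clear also resets the total. The object name STATS:name=FrequencyMap is simply what the URLs used later in this post decode to.

```java
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import javax.management.Attribute;
import javax.management.AttributeList;
import javax.management.AttributeNotFoundException;
import javax.management.DynamicMBean;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanParameterInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.ReflectionException;

// Hypothetical, trimmed-down DynamicMBean; not the FreqMap from the archive.
class DynamicCounterSketch implements DynamicMBean {
    private final Map<Integer, Integer> frequencies = new HashMap<>();
    private long total = 0;

    public synchronized void add(int number) {
        frequencies.merge(number, 1, Integer::sum);
        total += number;
    }

    @Override
    public synchronized Object getAttribute(String name) throws AttributeNotFoundException {
        switch (name) {
            case "Size":  return frequencies.size(); // distinct numbers seen
            case "Total": return total;              // running total
            default: throw new AttributeNotFoundException(name);
        }
    }

    @Override
    public void setAttribute(Attribute attribute) throws AttributeNotFoundException {
        // Both attributes are read-only in this sketch.
        throw new AttributeNotFoundException("Not writable: " + attribute.getName());
    }

    @Override
    public AttributeList getAttributes(String[] names) {
        AttributeList result = new AttributeList();
        for (String name : names) {
            try {
                result.add(new Attribute(name, getAttribute(name)));
            } catch (AttributeNotFoundException e) {
                // Unknown names are simply left out of the result.
            }
        }
        return result;
    }

    @Override
    public AttributeList setAttributes(AttributeList attributes) {
        return new AttributeList(); // nothing is writable, so nothing was set
    }

    @Override
    public synchronized Object invoke(String action, Object[] params, String[] signature)
            throws ReflectionException {
        if ("clear".equals(action)) {
            frequencies.clear();
            total = 0; // assumption: clearing also resets the running total
            return null;
        }
        throw new ReflectionException(new NoSuchMethodException(action));
    }

    @Override
    public MBeanInfo getMBeanInfo() {
        MBeanAttributeInfo[] attributes = {
            new MBeanAttributeInfo("Size", "int", "Distinct numbers seen", true, false, false),
            new MBeanAttributeInfo("Total", "long", "Running total", true, false, false)
        };
        MBeanOperationInfo[] operations = {
            new MBeanOperationInfo("clear", "Empty the map", new MBeanParameterInfo[0],
                    "void", MBeanOperationInfo.ACTION)
        };
        return new MBeanInfo(getClass().getName(), "Frequency counter sketch",
                attributes, null, operations, null);
    }

    public static void main(String[] args) throws Exception {
        DynamicCounterSketch counter = new DynamicCounterSketch();
        counter.add(7);
        counter.add(7);
        counter.add(3);
        // Register with the platform MBeanServer, then read back through it.
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("STATS:name=FrequencyMap");
        mbs.registerMBean(counter, name);
        System.out.println("Size=" + mbs.getAttribute(name, "Size"));
        System.out.println("Total=" + mbs.getAttribute(name, "Total"));
        mbs.invoke(name, "clear", new Object[0], new String[0]);
        System.out.println("Size after clear=" + mbs.getAttribute(name, "Size"));
    }
}
```

The main method here exits straight away; the real driver keeps feeding numbers in an infinite loop so that there is always something to look at.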

The crucial part (for the purpose of this article at least) is in the second half of the prepareJMX method (lines 78 – 87):

/*
 * This is the crucial part -- starts the JMX / HTTP interface
 */
/* * */
HtmlAdaptorServer adaptor = new HtmlAdaptorServer();
adaptor.setPort(JMX_HTTP_PORT);
ObjectName adapterName = new ObjectName("HTMLAgent:name=htmladapter,port=" + JMX_HTTP_PORT);
mbs.registerMBean(adaptor, adapterName);
adaptor.start();
/* * */

This is the actual bit which starts the HTML/JMX interface – on port 8000 in this case, but you can change the constant JMX_HTTP_PORT to anything you want and you will notice it still works.

In order for this bit of code to work, you will need the package com.sun.jdmk.comm. There is a bit of a problem with this package in the main Maven2 repository, it seems: while you can certainly find it in the repository, a closer inspection shows that only the pom is actually stored there, with no jars to accompany it! This turns out to be due to licensing issues with Sun/Oracle components, so at first go you will notice this probably won’t compile (unless you already have the package in your local repository). Kevin Gorham shows a few ways to solve this problem on his blog: http://developerbits.blogspot.com/2010/07/maven-sun-jar-issues-javamail-jms-jmx.html so I am not going to expand on that, and will assume you have got the code to the point where it compiles and runs.

Getting back to the code, if you comment out those lines, you will notice the code will still run and you will still be able to query the application MBeans via the standard jconsole:

As you can see, our FreqMap bean is exported and can be queried via JMX with no problems. So what is the point of that code, in the light of that? Well, if you uncomment the code again, run it, and point your browser to http://localhost:8000 (or whatever port you decided to use), this is what you get:

In other words, very similar to what you’d get from jconsole – but over HTML! (And from a firewalling perspective, also using 1 single TCP port.) Furthermore, clicking on FrequencyMap in the above screen allows us to manipulate and query our MBean:

So for instance, let’s say we want to find the frequency for 7: simply type in 7 and click on getFrequency and you end up at this URL: http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=getFrequency?action=getFrequency&number%2Bint=7 which shows

Let’s try next to add a number – we’ll end up on this URL: http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=add?action=add&number%2Bint=7 and the screen will display:

Now as I said before, the first advantage of using something like this is firewall penetration: you will find it so much easier to adjust your network security to allow this than to allow something like jconsole through. The other bit which I’ve mentioned already is scripting your application management. Wondering how that can be done?

Let’s have a look again at the output of http://localhost:8000/ViewObjectRes//STATS%3Aname%3DFrequencyMap, which is the main interface for our MBean: it shows us the attributes and allows us to invoke the methods. You will notice the output is structured into 2 tables, one with attributes and one with actions; furthermore, the attributes are always in alphabetical order, so size will always come before total. The values of these attributes are inside table cells (<TD>…</TD>), so a simple regular expression like this, run against the page HTML, can retrieve the 2 attribute values: <TD>[0-9]+</TD> . Now as I said, because the interface is available over HTTP, and also because the URLs have a fixed structure (I swear the guys from Oracle are occasionally quite clever!), we can curl this page and run it through grep with the above expression to get the 2 attributes:

~ $ curl -s "http://localhost:8000/ViewObjectRes//STATS%3Aname%3DFrequencyMap" | grep -E "<TD>[0-9]+</TD>"
<TD>100</TD>
<TD>1070402</TD>
~ $

And with a bit of tidying up via sed, we can get our 2 values:

~ $ curl -s "http://localhost:8000/ViewObjectRes//STATS%3Aname%3DFrequencyMap" | grep -E "<TD>[0-9]+</TD>" | sed -E "s/.*<TD>([0-9]+)<\/TD>.*/\\1/"
100
1378534
~ $

Let’s say we need to send an email now when the size of our map goes over 100 integers. Assume for a second we’re back in the era when 10K of memory meant a lot and as such we need some manual intervention when we’re getting close to running out of memory 🙂 To do so now is trivial: all we have to do is assign the output of the above scriptlet to an array, and we can then check the values at any point, safe in the knowledge that the first element of the array will always be the size while the second will be the running total:

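#!/bin/bash

# Fetch the MBean page and keep only the numeric cells: arr[0] is the size, arr[1] the running total.
arr=($(curl -s "http://localhost:8000/ViewObjectRes//STATS%3Aname%3DFrequencyMap" | grep -E "<TD>[0-9]+</TD>" | sed -E "s/.*<TD>([0-9]+)<\/TD>.*/\\1/"))

# Fall back to 0 if the page could not be fetched, so the comparison never sees an empty value.
if [ "${arr[0]:-0}" -gt 100 ]
then
   echo "alarm"
fi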

Nothing tricky, as you can see: once we have parsed the result of the curl, we assign it to an array and then check the first element; if it’s higher than 100 we print an alert. However, you can imagine that rather than a simple echo we could do all sorts of other things (email the current size? write an entry to a log file? kill the process? etc.).

Now let’s take this one step further. The above suggestions are all good for helping you investigate the problem, but they require manual intervention: if you email the current value of the size to your sysadmin, he/she will have to step in when the message is received and take some action (free up more resources for our application by shutting down other processes, maybe kill and restart the process, etc.). Wouldn’t it be nice, though, if we could have a script that fixes the problem for us? In this instance, let’s carry on with the very small memory model: if our map gets over 100 numbers, we want to empty the map so the memory gets recycled. Our FreqMap has a clear() method, and that was made available via JMX; so when the email about the size reaches the sysadmin, he/she has to use the HTML interface or jconsole to invoke this method… or perhaps we can invoke it?

If we look back at the 2 URLs we used before to invoke methods, we can spot a common pattern:

http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=add?action=add&number%2Bint=7

and

http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=getFrequency?action=getFrequency&number%2Bint=7

As you can see, it follows the format:

http://server:port/InvokeAction//MBean name/action=<method name>?action=<method name>[&parameters…]
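If you would rather build these URLs from Java than by hand, the pattern can be sketched as a small helper. Everything here is made up for illustration (the class InvokeActionUrlSketch and its build method are not part of JDMK), and the parameter encoding, name+type=value, is only inferred from the two URLs above, where number%2Bint decodes to number+int.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: assembles InvokeAction URLs in the format shown above.
class InvokeActionUrlSketch {

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // params come in triples: name, type, value (e.g. "number", "int", "7").
    // Assumption: each parameter is sent as <name>+<type>=<value>, URL-encoded.
    static String build(String host, int port, String mbeanName, String method,
                        String... params) {
        StringBuilder url = new StringBuilder("http://").append(host).append(':').append(port)
                .append("/InvokeAction//").append(encode(mbeanName))
                .append("/action=").append(method)
                .append("?action=").append(method);
        for (int i = 0; i + 2 < params.length; i += 3) {
            url.append('&').append(encode(params[i] + "+" + params[i + 1]))
               .append('=').append(encode(params[i + 2]));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Rebuilds the two URLs we have already seen in the browser.
        System.out.println(build("localhost", 8000, "STATS:name=FrequencyMap",
                "getFrequency", "number", "int", "7"));
        System.out.println(build("localhost", 8000, "STATS:name=FrequencyMap",
                "add", "number", "int", "7"));
    }
}
```

Note that the MBean name is the only part that really needs encoding here (the colon and the equals sign), which is why it shows up as STATS%3Aname%3DFrequencyMap in the URLs.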

So if our method is called clear… would this be the URL?

http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=clear?action=clear

Open that in your browser and yep, that’s it! So we can then simply curl this URL and we’ve emptied our map!

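#!/bin/bash

# Fetch the MBean page and keep only the numeric cells: arr[0] is the size, arr[1] the running total.
arr=($(curl -s "http://localhost:8000/ViewObjectRes//STATS%3Aname%3DFrequencyMap" | grep -E "<TD>[0-9]+</TD>" | sed -E "s/.*<TD>([0-9]+)<\/TD>.*/\\1/"))

# Fall back to 0 if the page could not be fetched, so the comparison never sees an empty value.
if [ "${arr[0]:-0}" -gt 100 ]
then
   # Invoke clear() on the MBean through its InvokeAction URL
   curl "http://localhost:8000/InvokeAction//STATS%3Aname%3DFrequencyMap/action=clear?action=clear"
   echo "alarm"
   # send email to sysadmin?
fi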

So now we have a script which checks our system size and if it goes over a certain threshold it will reset it (without having to kill the process and restart it, without invoking any root rights etc). Put this into a cron job and rest assured your system is safe – as far as size is concerned at least!

Want to monitor a different value? It might come down to just changing the regular expressions above. (In all honesty, one should probably parse the HTML through a DOM parser and extract values in a more structured way, but you will find that for non-complicated value inspection the above does the trick.) Want to invoke a different action when your value reaches a limit? Change the action URL to match your action and parameters.
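And if at some point you outgrow bash, the same extraction is easy to carry over to Java. The sketch below applies the same <TD>[0-9]+</TD> expression with java.util.regex; the class name AttributeScraperSketch is made up, and the sample HTML in main is hand-written around the two numeric lines grep printed earlier, not a copy of the real page. It is still regex scraping rather than proper DOM parsing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical Java counterpart of the grep/sed pipeline used in the scripts.
class AttributeScraperSketch {
    // Same expression as in the bash scripts, with a group around the digits.
    private static final Pattern NUMERIC_CELL = Pattern.compile("<TD>([0-9]+)</TD>");

    // Returns every purely numeric table cell, in page order.
    static List<Long> numericCells(String html) {
        List<Long> values = new ArrayList<>();
        Matcher matcher = NUMERIC_CELL.matcher(html);
        while (matcher.find()) {
            values.add(Long.parseLong(matcher.group(1)));
        }
        return values;
    }

    public static void main(String[] args) {
        // Hand-written sample; the real page has a lot more markup around it.
        String sample = "<TD>Size</TD>\n<TD>100</TD>\n<TD>Total</TD>\n<TD>1070402</TD>\n";
        List<Long> values = numericCells(sample);
        System.out.println("size=" + values.get(0) + " total=" + values.get(1));
    }
}
```

Unlike grep, which works line by line, this picks up every numeric cell even when several sit on the same line, so the "size first, total second" assumption deserves a second look if the page layout differs.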

Bottom line is that once we expose the JMX beans over HTML/HTTP, one can curl URLs to get access to attributes and invoke actions on them. And once that can be done, we can script our management and monitoring very easily. And once that’s done, your infrastructure is easier to maintain (no need to learn a separate monitoring system and its scripting language) and cheaper too!

And here is the code for this: jmxmonitor.tar.bz2