Friday, March 31, 2006

Programming


I've actually been doing some programming at home lately, which seriously cuts into the time in which I might blog. At least I've felt productive. That may have to take a back seat for a few days, as we're moving apartments this weekend.

Brian at work has created a small auditing library, which can be called by developers to log when methods are entered and exited. The cool thing was that he has turned the results into some cool interactive SVG that plots the course of the program.

At this point I remembered that Java 1.5 now includes instrumentation. If this could be used to automate the calling of the auditing methods, then this would remove the burden from the programmer (who, lets face it, could get it wrong, or just be too lazy to bother). So I started learning a little about the new interfaces.

It turns out that instrumentation is quite simple. In fact, it's too simple. Look at the size of the API, and you'll see that there are only a couple of methods. All it does is inform you when a class is being loaded, and provides you with read/write access to the bytecode for that class. That's about it. If you want to take it any further, then you have to get your hands dirty and manipulate the bytecode.

Fortunately the articles I found online did just this. The library generally used for this is ASM. There are several others, but ASM has the advantage of being simple and lightweight.

ASM also has one killer feature that I have yet to see in other bytecode manipulation libraries: an Eclipse plugin with some great features. This plugin not only lets you view the bytecode of your current code, but it also lets you switch to a view where you can see the ASM code required to generate the bytecode for your current code. So all you have to do is type in an example of the code you'd like to generate the bytecode for, and ASM will show you the ASM library calls you make. This almost makes coding with ASM trivial.

So I've been using ASM to instrument code, and generating pretty SVG plots of the call stack over time. I've learnt quite a bit about Java class loading in the process. It's also moved me one step closer to being able to generate classes out of OWL descriptions (yes, I can use UML, but that's not very serializable now, is it?). Probably the biggest trick was working out how to not audit the code doing the auditing (I hit some infinite recursion on my first attempt). However, this ended up being trivial when I realised that I'm just auditing a stack, which by definition is associated with a thread (Brian's library records method entries and exits based on the current thread). All I needed was some thread-local-storage to hold a lock on the auditing process. Then if the auditing code saw that the lock was being held, it skipped the auditing and returned to the original task. The best part is that its all local to the current thread, so there's no synchronisation required.

This kind of bytecode manipulation was amazingly similar to the Windows API interception I was doing a couple of years ago. I enjoyed it then too. It was just as tough to debug as well, since debuggers don't expect to see bytecodes modified from the condition the compiler left them in. The classes drop all line numbers, and tracepoints only start to make sense when the instruction pointer moves to an unaffected class. I've also started running into the following exception when I use instrumentation on code with a logger in it:
java.util.logging.ErrorManager: 5
java.util.MissingResourceException: Can't find bundle for base name sun.text.resources.DateFormatZoneData, locale en_US
I then get shown a deep stack, which would seem to indicate that the method-calling modifications from ASM are all correct. So it looks like the classpath is wrong at this point. Frankly I'm not surprised, as I've seen different classloaders coming in during the startup process of an application (it's odd seeing an exception about a missing class when the code in question is being called from inside the class that can't be found!

Still, if I avoid the logger, then all seems well.

No comments: