Friday, July 16, 2004

Logging for Rules
The logging I was hoping to get going today didn't work quite as planned. I was able to get some time on it, but ended up spending the majority of my time working on other things. At this point I still have no logging explaining what is going on with the problem Tuples. At times like this I'm sometimes tempted to open my own file and start writing to it. Hopefully I won't end up that desperate, but there are occasions when it would be quicker and less wasteful of time.

Other Things
A little while ago I implemented the AnswerPage code to reduce RMI calls when iterating over Answer objects. This worked quite well, but results in a pause whenever the end of a page is reached and the next page has to be fetched. To help with this problem TJ made a few requests today. First, I made the page size configurable with a system property, so that these pauses can optionally be made less frequent (by making the page sizes larger). Second, I introduced a background thread to preload the next page. Of course, it was the latter optimization which was more time consuming.
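As a rough illustration of the first change, reading a page size from a system property might look like the sketch below. The property name `answer.page.size`, the default of 100, and the `PageSizeConfig` class are all made up for this example and are not the names used in the real code.

```java
// Minimal sketch only. The property key, default value and class name are
// hypothetical; they are not taken from the actual AnswerPage code.
final class PageSizeConfig {
    static final String PAGE_SIZE_PROPERTY = "answer.page.size"; // assumed key
    static final int DEFAULT_PAGE_SIZE = 100;                    // assumed default

    /** Reads the page size from a system property, falling back to the default. */
    static int pageSize() {
        // Integer.getInteger returns the default when the property is unset
        // or cannot be parsed as an integer.
        int size = Integer.getInteger(PAGE_SIZE_PROPERTY, DEFAULT_PAGE_SIZE);
        // A zero or negative page size makes no sense, so fall back in that case too.
        return size > 0 ? size : DEFAULT_PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println("page size = " + pageSize());
    }
}
```

With something like this in place, a larger page could be requested at startup with `-Danswer.page.size=500`, trading fewer pauses for bigger transfers.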

The code to do this started out with all sorts of clever locks to make sure that the pages would be loaded at the correct time, and that there could be no races. Of course, that is always a ridiculous thing to do with threaded code, so I spent the rest of my time paring it back, and considering all possible race conditions. I'm quite pleased with the final code, as it is very simple, and uses only one flag that can be shared between threads.

While access to this flag should be quite safe, AN pointed out a few months ago that just because variables get set in a particular order, there is no guarantee that a CPU will in fact execute the code in the order expected. The only way to guarantee that things occur in a desired order is to issue a write barrier, in this case with a synchronized lock.
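For reference, the fully synchronized version of such a flag is tiny. This is only a sketch of the general technique, with a hypothetical `PageLoadedFlag` class that is not part of the real code: because both methods lock the same monitor, a thread that sees `true` from `isLoaded()` is also guaranteed to see everything the other thread wrote before it called `markLoaded()`.

```java
// Sketch of a flag shared between threads using a synchronized lock.
// The class and method names are hypothetical, for illustration only.
final class PageLoadedFlag {
    private boolean loaded; // guarded by this object's monitor

    /** Called by the writing thread once its work is complete. */
    synchronized void markLoaded() {
        loaded = true;
    }

    /** Called by the reading thread; uses the same monitor as markLoaded(). */
    synchronized boolean isLoaded() {
        return loaded;
    }
}
```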

The prefetch code only sets the flag to indicate that a page has been loaded by the reading thread, and only after the prefetch thread has terminated and been joined on. This is a reasonably safe bit of code, so I've avoided putting real synchronization in.

One problem with waiting for an outstanding prefetch thread to finish is that it may take longer than the client is prepared to wait. For this reason I've put a timeout on the Thread.join() call. If the join returns as a result of a timeout, then there is no way to check except by looking at the flag which indicates successful completion. This is the one place I could see a race happening, but it doesn't actually matter. The Answer object that the next page is being prefetched from is a stateful object, and if a prefetching thread fails to retrieve a requested page then the Answer will have been moved onto the next page internally. Since there is no way to roll the Answer back to a previous page, the only option on a timeout is to throw an exception (or else a page would silently go missing while the results were being iterated over). I was initially going to try reissuing a failed page prefetch, but this realisation made my life much easier.
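The shape of the idea can be sketched as follows. This is not the real AnswerPage code: `PageSource` and `PagePrefetcher` are hypothetical names, the page is just a list of strings, and I've marked the flag `volatile` in the sketch to keep it obviously safe, which is a choice made for the example rather than what the real code does.

```java
import java.util.List;

/** Hypothetical stand-in for the stateful Answer: every call moves on a page. */
interface PageSource {
    List<String> nextPage() throws Exception;
}

/** Sketch of a one-page-ahead prefetcher using a timed Thread.join(). */
final class PagePrefetcher {
    private final PageSource source;
    private final long timeoutMillis;
    private Thread worker;
    private List<String> loaded;   // written only by the worker thread
    private volatile boolean done; // the single flag shared between threads

    PagePrefetcher(PageSource source, long timeoutMillis) {
        if (timeoutMillis <= 0) {
            // Thread.join(0) means "wait forever", which defeats the timeout.
            throw new IllegalArgumentException("timeout must be positive");
        }
        this.source = source;
        this.timeoutMillis = timeoutMillis;
    }

    /** Starts loading the next page in the background. */
    void startPrefetch() {
        done = false;
        loaded = null;
        worker = new Thread(() -> {
            try {
                loaded = source.nextPage();
                done = true; // set only after the page has been stored
            } catch (Exception e) {
                // Leave the flag false; the reader treats this as a failure.
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Waits for the outstanding prefetch, failing if it did not finish in time. */
    List<String> awaitPage() {
        if (worker == null) {
            throw new IllegalStateException("no prefetch outstanding");
        }
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for page", e);
        }
        worker = null;
        // join() does not say whether it timed out, so the flag is the only evidence.
        if (!done) {
            // The source has already moved on, so the page cannot be asked for again.
            throw new IllegalStateException("prefetch failed or timed out; page is lost");
        }
        return loaded;
    }
}
```

After a timeout the prefetcher in this sketch should be thrown away along with its source, since the abandoned worker may still finish later.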

It's late, and Luc didn't give me any sleep last night, so I'm feeling too stupid to do any proofreading. There's probably more to say, but it will have to wait for a time when I can actually keep my eyes open.
