Tuesday, June 15, 2004

Friday
Big family commitments meant I didn't get the chance to blog on Friday night, nor at any time over the weekend. We had the "Queen's Birthday" long weekend here, hence no post for Monday either.

TKS Release
Friday was the latest Beta release. AM still needed assistance, so I spent the whole day on that.

After squashing some of Thursday night's bugs I spent much of Friday testing, and generally being available to help with further bugs. AM kept finding problems with various queries, and I ended up staying late fixing them.

DM was also helping out, and I was embarrassed to learn that one of his bugs was caused by me. I had a typo in my Thursday night fix for serialization of the Answer object used as the GIVEN clause for GlobalizedAnswer. The code I wrote was supposed to store the old Answer, copy it into a serializable answer, and then close the original. Only I closed the new one. This made it serializable, but if there was anything in it to be used, then accessing it resulted in a NullPointerException. My own tests needed to serialize this object, but I had no GIVEN clause, so it was never accessed. DM's tests had a GIVEN, and so he saw the exception. Oops.
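In code, the mistake looked roughly like this (all the names here are stand-ins for the real TKS classes, not the actual API):

    // A sketch of the broken method:
    Answer makeSerializable(Answer original) throws Exception {
        Answer copy = new SerializableAnswerImpl(original);  // copy the rows out
        copy.close();   // BUG: this closes the copy instead of the original
        return copy;    // serializes fine, but any later access to the copy
                        // (a beforeFirst() or next()) hits the nulled-out
                        // internal state and throws NullPointerException
    }

The fix was a one-liner: call original.close() instead of copy.close().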

At the end of the day the biggest problem was with unclosed Tuples objects showing up everywhere. Closing things isn't all that common in Java coding, so I should explain what this means. To be fully scalable, Tuples may be backed by disk: either the actual index, or a temporary file. We have a couple of mechanisms to allow us to transparently use either Mapped IO or standard IO, and one of these is the BlockFile class. This lets us ask for "blocks" in a file, and then just access these blocks as a java.nio.ByteBuffer or related class (such as an IntBuffer).
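Usage looks roughly like this (the method names are from memory, so don't take them as the exact signatures):

    // Ask the BlockFile for a block, then view it as a typed buffer. The
    // caller never needs to know whether the file is memory mapped or
    // accessed with standard IO:
    Block block = blockFile.readBlock(blockId);
    java.nio.ByteBuffer bytes = block.getByteBuffer();
    java.nio.IntBuffer ints = bytes.asIntBuffer();
    int first = ints.get(0);    // read the first int in the block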

Early on, we found that allocating new objects each time we wanted to access a block in a file was becoming a significant overhead. We addressed this with object pooling. As a result, whenever a block is finished with, it must be "released" back into the pool. If blocks are held onto, then we end up with excessive memory usage, and we have even seen OutOfMemoryErrors on occasion. So closing Tuples and anything that wraps them (such as Answers) is essential. To help with this we have a finalize method on TuplesImpl which logs an error if the object was not already closed. We used to have a call to close in this method, but it was removed once it was no longer needed.
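Stripped right down, the check looks something like this (the real TuplesImpl does far more, and I'm assuming our usual log4j logger here):

    import org.apache.log4j.Logger;

    public class TuplesImpl /* implements Tuples */ {
      private static final Logger logger = Logger.getLogger(TuplesImpl.class);
      private boolean closed = false;

      public void close() {
        closed = true;
        // ... release the pooled blocks backing this Tuples ...
      }

      protected void finalize() throws Throwable {
        try {
          if (!closed) {
            // This is the message that was showing up everywhere on Friday.
            logger.error("Tuples not closed: " + this);
          }
        } finally {
          super.finalize();
        }
      }
    }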

Unfortunately, SR doesn't always think about the expense of creating objects, and he uses clone extensively. He also tends to create objects inline for temporary use. This is usually fine in Java, but it is a really bad idea when applied to objects which need closing like this.
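For example, a line like this is perfectly reasonable Java, but with Tuples it leaks (the wrapper class is made up, but the pattern is real):

    // Neither the clone nor the temporary wrapper is ever closed, so the
    // pooled blocks behind them are never released:
    long rows = new SomeTuplesWrapper((Tuples) tuples.clone()).getRowCount();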

Back in February I gained some practice finding missing close statements, so I was able to find a number of these problems very quickly. By late Friday night I had found and fixed all but 3. I eventually realised that I wouldn't get these last ones in time, so I changed that error-level log back down to debug and checked it in for TJ (who had already gone home to his family by this time).

Due to time zone differences with the States we sometimes get another crack at a release. When I got in to work this morning TJ said that he could put it out again, and asked that I confirm that all was working well. On further inspection I realised that close had been removed from the TuplesImpl, so I added this back in before handing it on to TJ. The rest of the day was then spent looking for the 3 remaining unclosed Tuples. TJ hopes to do another sub-release soon, in order to catch anything small that we missed.

The tuples in question ended up being wrapped by a Given object which was being used as a constraint expression. Constraints do not have a close method, nor do we want to add one, so I had to hold a reference to the original Tuples and close it when the constraint was finished with - but only when the constraint in question was of concrete type Given. Unfortunately, the constraint was being passed off to an ExecutionPlan which then cloned the GIVEN clause, reordered the clause (which did more cloning) and then did some internal copying of objects. I had to make sure that all of these copies were closed, but again, only when they were of type Given. This all got very messy, very quickly.
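The shape of the fix, repeated at every point a copy gets made, is roughly this (paraphrased, and the accessor on Given is invented):

    // Constraints have no close() method, so test each copy against the
    // one concrete type that owns a Tuples, and close through it:
    java.util.Iterator i = constraintCopies.iterator();
    while (i.hasNext()) {
      Object constraint = i.next();
      if (constraint instanceof Given) {
        ((Given) constraint).getTuples().close();  // hypothetical accessor
      }
    }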

I was tempted on a few occasions to give up on patching and rework the problem properly, but that would have been a big engineering effort, and there wasn't time for it if I wanted to fix things for TJ's next minor release. I finally fixed the two which were involved with the ExecutionPlan, but I still have one left. I've been adding more logic, and I think I will have it within the first hour tomorrow.

Fixing Tuples
On Friday, AM, DM and I had a discussion about these unclosed Tuples. They are a major problem, and all it takes is one small change to cause another leakage. The sensible solution is to automatically register all Tuples in some central place (associated with the current session), and then close them all off at the end of a query or transaction. This will take some time to implement correctly, but it is going to make the system much more reliable, and save us a LOT of time in future.
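Something along these lines is what I have in mind (purely a sketch; none of this exists yet, and all the names are invented):

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    public class TuplesRegistry {
      private final List open = new ArrayList();

      // Every new Tuples registers itself with the session's registry.
      public void register(Tuples tuples) {
        open.add(tuples);
      }

      // Called once at the end of each query or transaction.
      public void closeAll() throws Exception {
        Iterator i = open.iterator();
        while (i.hasNext()) {
          ((Tuples) i.next()).close();  // relies on close() being idempotent
        }
        open.clear();
      }
    }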

At the moment, everyone is starting to adhere to a standard whereby any new object that uses a Tuples will clone it first. This still has problems, but at least it puts any new Tuples in a defined place. The problem is that there are still a lot of classes in the system which don't do this, so the system can be very confusing when you try to find where an unclosed object was allocated.
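In code, the convention amounts to this (TuplesWrapper is just an example name):

    public class TuplesWrapper {
      private final Tuples tuples;

      // Clone on the way in: this object owns the clone and must close it,
      // while the caller keeps responsibility for the original it passed.
      public TuplesWrapper(Tuples source) {
        this.tuples = (Tuples) source.clone();
      }

      public void close() throws Exception {
        tuples.close();
      }
    }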

The more time I spent tracking these bugs down today, the more I realised how essential this change is. Hopefully we'll be allocated the time for it.

Security for Remote Queries
TJ asked that I look into the current remote query code to see if it presents credentials to remote servers. In the last incarnation this definitely happened, but SR may not have had time in his last week. I didn't get as much time as I'd have liked for this, but I couldn't see it anywhere, so this is what I told TJ. It's safest to assume that it's not secure anyway.

We'll have to get the whole "resolver" interface put in soon, and we'll definitely make sure that credentials and authentication are covered adequately.

Meanwhile, I've just spoken to SR on AIM, and he says that credentials are being handed over. He points out a couple of problems when switching between multiple servers, but the foundation is definitely there. I'll have to take more time to look at it in the morning.

Power5 CPUs
While waiting for builds and tests on Friday I spent a little time reading two articles in the latest IEEE Micro magazine (sorry, summaries only: subscription or payment is required for the full articles). They covered the design of the Itanium 2 6M from Intel, and the Power5 from IBM.

Both are impressive chips, and would certainly seem to represent the state of the art at the moment. However, the Power5 seems to me to be a much more elegant, flexible, and scalable design. IBM really do seem to be ahead of Intel on this one.

The Itanium 2 6M is a very hot CPU, though to Intel's credit it is no worse than the previous Itanium 2, coming in at a peak of 130W. This CPU also devotes about 2/3 of its area to 6MB of L3 cache, as opposed to IBM, who have no onboard L3 cache but more processing logic.

The Power5 has put its L3 cache off-chip, but has kept the index onboard, meaning that lookups are not much slower, and cache misses cost no extra. There has been a redesign of the memory architecture, and the memory controller has now moved on-chip (as it has on the Itanium 2 6M). This reduces load on the fabric controller, allowing them to scale up to 64 processors, from the 32 that the Power4+ can manage.

The slightly slower off-chip L3 is mitigated both by the onboard index and by its larger size: the Itanium has 6MB onboard, while the Power5 has 36MB. The Power5 also has much more L2 cache, at 1.875MB versus the Itanium's 256kB. Similarly, IBM have 32kB and 64kB of L1 cache for data and instructions respectively, while Intel have 16kB for each. Finally, registers are similar for both lines, with Intel having 128 fixed-point and 128 floating-point registers, and IBM having 120 of each.

It seems that Intel have put a lot into having L3 cache onboard, but at the expense of having less memory at all 3 cache levels. Intel also miss out on using that space for processing logic.

The Power5 includes 2 processing cores (like the Power4+ did), with each core executing 2 simultaneous threads. This means that each processor can run 4 threads natively. The load balancing features for these threads are really impressive, and make each core able to use each of its components quite effectively. It's like Hyperthreading, only it's done right. ;-) Each core has been designed to run a single thread with high efficiency, but there is still a performance gain to be had from a second thread.

One of the most impressive features (for me) was the set of power saving features on the Power5. The chip uses clock gating to reduce power consumption at a fine-grained level. This means that if the circuitry can guarantee that a module will not do anything in the next cycle, then the local clock for that module will be gated off, and that module will not get a clock pulse. So if a multiplication unit is being written to, but not read from in the next cycle, then the read ports will not get their clock pulse (leveraging their use of CMOS, which doesn't require power to keep its state). This is all automatic, and has no impact on performance. This seemed to be far ahead of Intel, who instead chose to use lower-power gate types in those places where they felt they could afford the different characteristics.

One thing I'd have liked to see in the IBM article was power consumption figures, but I'm sure they will be forthcoming. The fact that the Intel chip has more transistors than any other commercial chip in the world (410 million) seems a fair indication that it will be drawing more power.

Overall, I found myself drooling a little over the Power5 chip. Here's hoping that Apple choose to start putting these little monsters into their PowerMac lineup!
