Saturday, July 21, 2007

Java 1.6

Mulgara currently doesn't work with Java 6 (also called JDK 1.6). I knew I needed to enable this, but have been putting it off in favor of more important features. But this release made it very plain that Mulgara is in an awkward position between two Java releases: JDK 1.4 and JDK 1.6.

The main problem going from Java 1.4 to Java 5 was the change in libraries included in the JRE. Someone had taken advantage of the Apache XML libraries that were in there, but now these had all changed packages, or were no longer available. The other issue was a few incompatibilities in the Unicode implementation, some of which were the reason for the CodePoint class, which was introduced last year and published 8 days ago.

Going to Java 6 is relatively easy in comparison. Sun learnt their lesson about dropping in third-party libraries that users may want to override with more recent versions, so this was not an issue. The only real change has been to the classes in java.sql, where new interfaces and extensions to old interfaces have prevented a few classes from compiling. This is easily fixed with some stub methods to fulfill the interfaces, since we know these methods are not being called internally in Mulgara.
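As a rough illustration of the kind of stub involved: one of the Java 6 additions is the java.sql.Wrapper interface, which interfaces such as Connection, Statement and ResultSet now extend. The sketch below implements only Wrapper so that it stays short. The class name StubbedWrapper is hypothetical, and this is not the actual Mulgara code, just the general shape such stubs might take.

```java
import java.sql.SQLException;
import java.sql.Wrapper;

// Hypothetical class: stands in for any class that implements a java.sql
// interface and therefore has to supply the methods added in Java 6.
class StubbedWrapper implements Wrapper {

  // Stub: nothing is wrapped, so there is nothing to unwrap.
  public <T> T unwrap(Class<T> iface) throws SQLException {
    throw new SQLException("unwrap is not supported: " + iface.getName());
  }

  // Stub: this object never wraps an implementation of another interface.
  public boolean isWrapperFor(Class<?> iface) {
    return false;
  }

  public static void main(String[] args) {
    StubbedWrapper w = new StubbedWrapper();
    System.out.println(w.isWrapperFor(java.sql.Connection.class));
    try {
      w.unwrap(java.sql.Connection.class);
    } catch (SQLException e) {
      System.out.println("unwrap rejected: " + e.getMessage());
    }
  }
}
```

Throwing from unwrap rather than returning null means that any caller who does reach the stub finds out immediately.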

I haven't gone through everything yet (like the failing HTTP tests), but the main problem for Mulgara seems to be in passing the tests rather than in the code itself. The first of these was a query that returned the correct data, but out of order. Any query whose results are to be tested should have an ORDER BY directive, so this failure should not have been allowed to happen. It's easily resolved, but it made me wonder about the change in ordering, until I got to the next test failure.

Initially, I was confused by this failure. The "bad output" contained an exception, which is usually a bad sign. But when I looked at the query which caused the exception I realized that an exception was the correct response. So how could it have passed this test for previous versions of Java? Was it a Schrödinbug?

The first step was to see what the initial committer had expected the result to be. That then led to a "Doh!" moment. The idea of this test was to specifically test that the result would generate an exception, and this was the expected output. Why then, the failure?

Upon careful inspection of the expected and actual outputs, I found the difference in the following line from the Java 6 run:
Caused by: (QueryException) org.mulgara.query.TuplesException: No such variable $k0 in tuples [$v, $p, $s] (class org.mulgara.resolver.AppendAggregateTuples)
Whereas the expected line reads:
Caused by: (QueryException) org.mulgara.query.TuplesException: No such variable $k0 in tuples [$p, $v, $s] (class org.mulgara.resolver.AppendAggregateTuples)
I immediately thought that the variables had been re-ordered due to the use of a hash table (where no ordering can be guaranteed). So I checked the classes which create this message (org.mulgara.resolver.SubqueryAnswer and org.mulgara.store.tuples.AbstractTuples). In both cases, they use a List, but I was still convinced that the list must have been originally populated by a HashSet. In fact, this also ties in with the first so-called "failure" that I saw, where data in a query was returned in a different order. Some queries will use internal structures to maintain their temporary data, and this one must have been using a Set as well.

To test this, I tried the following code in Java 5 and 6:
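import java.util.HashSet;

public class Order {
  public static void main(String[] args) {
    // The set must be typed: a for-each over String does not compile on a raw HashSet.
    HashSet<String> s = new HashSet<String>();
    s.add("p");
    s.add("v");
    s.add("s");
    // Iteration order comes from the hash table layout, not from insertion order.
    for (String x: s) System.out.print(x + " ");
    System.out.println();
  }
}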
In Java 5 the output is: p s v
In Java 6 the output is: v s p

I checked on this, and the hash codes have not changed. So it looks like HashMap (which backs HashSet) has changed in its storage technique.
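A check along these lines can be reproduced with a few lines of code. This is a minimal sketch, not the exact check I ran. The class name HashCodeCheck is made up for the example. String.hashCode() is defined by a formula in its Javadoc, so for a one-character string it is simply the character's value, whatever the Java release.

```java
class HashCodeCheck {

  // Returns the hash code of each string, in the order given.
  static int[] hashCodes(String... values) {
    int[] codes = new int[values.length];
    for (int i = 0; i < values.length; i++) {
      codes[i] = values[i].hashCode();
    }
    return codes;
  }

  public static void main(String[] args) {
    String[] names = { "p", "v", "s" };
    int[] codes = hashCodes(names);
    for (int i = 0; i < names.length; i++) {
      System.out.println(names[i] + " -> " + codes[i]);
    }
    // Prints:
    // p -> 112
    // v -> 118
    // s -> 115
  }
}
```

Since the inputs to the table are identical, any difference in iteration order has to come from how the table maps those hash codes to buckets.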

Fix

I have two ways I can address my problem. The first is to find the map where the data gets reorganized, and either use an ordered collection type, or else use a LinkedHashSet. The latter is still a set, but also guarantees that iteration follows insertion order. However, this is a patch, and a bad one at that.
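For reference, here is a small sketch of what the patch amounts to, using the same three names as the test program above. The class and method names (StableOrder, insertionOrder, sortedOrder) are invented for the example and have nothing to do with the Mulgara source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

class StableOrder {

  // LinkedHashSet removes duplicates but iterates in insertion order.
  static List<String> insertionOrder(String... items) {
    return new ArrayList<String>(new LinkedHashSet<String>(Arrays.asList(items)));
  }

  // TreeSet removes duplicates and iterates in natural (sorted) order.
  static List<String> sortedOrder(String... items) {
    return new ArrayList<String>(new TreeSet<String>(Arrays.asList(items)));
  }

  public static void main(String[] args) {
    System.out.println(insertionOrder("p", "v", "s")); // [p, v, s]
    System.out.println(sortedOrder("p", "v", "s"));    // [p, s, v]
  }
}
```

Either collection gives the same output on every JVM, which is exactly why it only hides the real issue: the tests should not depend on that order in the first place.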

The real solution is to write some more modules for use in JXUnit, to make it more flexible than the equal/not-equal string comparisons it does now. This seems like a distraction from writing actual functionality, but I think it's needed, despite it taking longer than the "hack" solution.
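To give an idea of the sort of comparison I have in mind, here is a rough sketch. It does not use any JXUnit API, and the names UnorderedListCompare, normalize and equalIgnoringListOrder are all hypothetical. It assumes that any bracketed, comma-separated list in the output (like the variable list in the exception message above) can be treated as unordered, and sorts the items before comparing the strings.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnorderedListCompare {

  // Matches a bracketed list such as [$v, $p, $s].
  private static final Pattern LIST = Pattern.compile("\\[([^\\]]*)\\]");

  // Rewrites every bracketed list with its items in sorted order.
  static String normalize(String text) {
    Matcher m = LIST.matcher(text);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String[] items = m.group(1).trim().split("\\s*,\\s*");
      Arrays.sort(items);
      StringBuilder sorted = new StringBuilder("[");
      for (int i = 0; i < items.length; i++) {
        if (i > 0) sorted.append(", ");
        sorted.append(items[i]);
      }
      sorted.append("]");
      // quoteReplacement is needed because '$' is special in replacement strings.
      m.appendReplacement(out, Matcher.quoteReplacement(sorted.toString()));
    }
    m.appendTail(out);
    return out.toString();
  }

  // True when the two strings differ only in the order of items inside lists.
  static boolean equalIgnoringListOrder(String expected, String actual) {
    return normalize(expected).equals(normalize(actual));
  }

  public static void main(String[] args) {
    String expected = "No such variable $k0 in tuples [$p, $v, $s]";
    String actual = "No such variable $k0 in tuples [$v, $p, $s]";
    System.out.println(equalIgnoringListOrder(expected, actual)); // true
  }
}
```

The obvious cost is that a test using this comparison can no longer detect a genuine ordering bug, so it would only be appropriate where the order really is unspecified.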

Speaking of which... DavidW just asked if I could document the existing resolvers in Mulgara 1.1 (especially the Distributed Resolver). He didn't disagree with my reasons for releasing without documentation, but he pointed out that not having it written up soon could result in a backlash. Much as I hate to admit it (since I have other things to do), he's right.
