Thursday, July 15, 2004

ClassCastExceptions
The rule which caused this exception is 4b. Looking at it with TJ this morning made it quite obvious what the problem was. Going back to the original rule we see:

4b.     xxx aaa uuu              :  (nt)

uuu rdf:type rdfs:Resource (t1)

Only this is wrong. The above statement claims that for every subject/predicate/object statement, the object has a type of rdfs:Resource. This is not true for literals, nor is it true for anonymous nodes. The ClassCastException occurred when rule 4b attempted to put a literal into the above statement as a subject.

To fix this I need to do two things. The first is to prevent literals from being inserted. That will eliminate the ClassCastException. The second is to prevent anonymous nodes from having these statements made about them. These currently aren't causing any exceptions and are erroneously (though harmlessly) ending up as inferred statements.

The best solution for this would be a constraint which allowed statements to be constrained by node type. Only constraints usually describe existing statements, and there are no statements telling us about type, as this is exactly what rule 4b is trying to create. So I can't create the statements describing type, because I don't have any statements which describe type.

One possibility is to create a "magic" constraint, a little like trans, which allows selection of nodes according to type. Unfortunately, among other things, this would need full types support in the String Pool. This is planned, but doesn't exist yet.

The other option is to filter statements according to type. This is an undesirable method, as it does not scale well. For instance, on rule 4b, every statement in the system would need to be returned, and those whose objects are legitimate URI references would be kept. That will work, but will not scale to large systems (it will slow down linearly with the number of statements). For the moment though this is all that is available, so I spent the day going this way.
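As a rough sketch of the filtering approach described above: every statement is visited, and only those whose object is a URI reference produce an inferred type statement. The Kind, Node and Statement classes and the applyRule4b method are hypothetical stand-ins made up for illustration, not the store's real classes.

```java
import java.util.ArrayList;
import java.util.List;

class Rule4bFilter {
    // Hypothetical node kinds, for illustration only.
    enum Kind { URI, BLANK, LITERAL }

    static final class Node {
        final Kind kind;
        final String label;
        Node(Kind kind, String label) { this.kind = kind; this.label = label; }
    }

    static final class Statement {
        final Node subject, predicate, object;
        Statement(Node subject, Node predicate, Node object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    static final Node RDF_TYPE = new Node(Kind.URI, "rdf:type");
    static final Node RDFS_RESOURCE = new Node(Kind.URI, "rdfs:Resource");

    // Visits every statement (linear in the number of statements) and
    // infers "uuu rdf:type rdfs:Resource" only when uuu is a URI reference.
    static List<Statement> applyRule4b(List<Statement> statements) {
        List<Statement> inferred = new ArrayList<>();
        for (Statement s : statements) {
            if (s.object.kind == Kind.URI) {
                inferred.add(new Statement(s.object, RDF_TYPE, RDFS_RESOURCE));
            }
            // Literals and anonymous (blank) nodes are skipped.
        }
        return inferred;
    }

    public static void main(String[] args) {
        Node s = new Node(Kind.URI, "ex:s");
        Node p = new Node(Kind.URI, "ex:p");
        List<Statement> base = new ArrayList<>();
        base.add(new Statement(s, p, new Node(Kind.URI, "ex:o")));
        base.add(new Statement(s, p, new Node(Kind.LITERAL, "\"text\"")));
        System.out.println(applyRule4b(base).size());
    }
}
```

The single pass over all statements is what makes this approach scale linearly, which is the cost noted above.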

The simplest filter is to find those instances when a node of type literal is to be inserted as a subject, and to move on to the next node instead. I implemented this by catching the ClassCastExceptions as they came through. This has the added bonus of preventing exceptions like this from being thrown in other situations as well, such as when a literal is used as a predicate.
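The catch-and-skip idea might look something like the following. This is only a sketch under assumptions: the Node, SubjectNode, UriRef and Literal types are invented here to reproduce the failing cast, and the real insertion code is not shown in this post.

```java
import java.util.ArrayList;
import java.util.List;

class SkipBadSubjects {
    // Hypothetical type hierarchy: only some nodes may appear as subjects.
    interface Node {}
    interface SubjectNode extends Node {}

    static final class UriRef implements SubjectNode {
        final String uri;
        UriRef(String uri) { this.uri = uri; }
    }

    static final class Literal implements Node {
        final String value;
        Literal(String value) { this.value = value; }
    }

    // Tries to use each candidate as a subject. A literal fails the cast,
    // so the exception is caught and the loop moves on to the next node.
    static List<SubjectNode> usableAsSubjects(List<Node> candidates) {
        List<SubjectNode> accepted = new ArrayList<>();
        for (Node n : candidates) {
            try {
                accepted.add((SubjectNode) n);
            } catch (ClassCastException e) {
                // Not a legal subject: skip it rather than abort the rule.
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        nodes.add(new UriRef("ex:a"));
        nodes.add(new Literal("text"));
        System.out.println(usableAsSubjects(nodes).size());
    }
}
```

Using the exception as the filter keeps the change small, at the price of hiding any other cause of a ClassCastException in the same block.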

Of course, the anonymous nodes still need to be filtered out, but it doesn't hurt to leave them in for the moment in order to see if all is working as expected. Removing anonymous nodes needs to be done after globalization shows up their type. AN suggested creating a new query type that does the same as the current type, but which also knows to filter out these nodes when globalizing. It would simply create a different type of Globalized Answer Iterator for the task. This should work, and in discussing it we realised that the current "closable" iterators are not being closed, so I'll make sure that gets done when I get to making these changes.

Once inferencing was working with the anonymous nodes still present, I opted to postpone the anonymous node issue. Instead I moved on to allowing the rules to recurse, by creating inferences on inferred statements as well as on the base statements. Unfortunately, this caused immediate problems.

Empty Tuples
Tuples have a fixed number of columns with names, and a set of rows containing data. A short time ago it was decided that if a Tuples object were to contain no rows, then it could be represented by a constant object known as an Empty Tuples. From a mathematical perspective this works, but practically it has caused me grief on a few occasions, with today being the most recent case.

There are a number of places in the code which assert that two Tuples objects which are to be joined have the same number of columns. However, if a Tuples object is empty, then the Empty Tuples will be presented instead, and its number of columns will be zero.

When the Empty Tuples object was first introduced, it should have been made mandatory that all column comparisons were to be made with a method on the class, rather than by comparing the number of columns returned from getVariables().length. This method would always return true if one of the tuples in the comparison were the Empty Tuples. Since this was not done, any assertion which expects the number of columns to be equal can fail whenever one side is empty. Unfortunately, these failures did not occur often enough to be noticed, so the problem was largely overlooked.
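A comparison method of the kind described above could be sketched as follows. Only getVariables() comes from the existing code. The cut-down Tuples interface, the EMPTY constant and the sameColumnCount name are assumptions made for this example.

```java
class TuplesColumns {
    // Cut-down stand-in for the real Tuples interface (assumption).
    interface Tuples {
        String[] getVariables();
    }

    // The shared constant representing any Tuples with no rows (assumption).
    static final Tuples EMPTY = () -> new String[0];

    // Column check that treats the Empty Tuples as compatible with anything,
    // instead of comparing getVariables().length directly.
    static boolean sameColumnCount(Tuples a, Tuples b) {
        if (a == EMPTY || b == EMPTY) {
            return true;
        }
        return a.getVariables().length == b.getVariables().length;
    }

    public static void main(String[] args) {
        Tuples three = () -> new String[] {"s", "p", "o"};
        System.out.println(sameColumnCount(three, EMPTY));
    }
}
```

An assertion written against sameColumnCount rather than against the raw lengths would not trip when a selection returns zero rows.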

I've run into the problem before, and it struck again today when a selection against inferred statements returned zero rows. The frustrating thing about assertions like this is that they are runtime exceptions, and so RMI does not know how to deal with them. The result is simply an RMI exception and no description of the problem at all.

There are too many places in the code to track down every assertion and assumption that the number of columns is equal, so the fixes had to be local to this particular piece of code. A quick fixup didn't address all cases, so with some help from AM I added some logging to tell me what objects were being compared, and what their contents look like. As usual when I try to use logging in this system, the output didn't come up anywhere for me, so I will need to figure out why before I can get the rules working again.

Once the rules are running on inferred data as well as base data, I can go back and remove anonymous nodes from the results. After that, I'll talk with AN about integration, and then move on to OWL.
