It's overdue, but I should write a short post about Friday. The idea was to take this day off to get over the bug, but I spent a bit of time looking at TuplesOperations.join anyway. In particular, I was fortunate enough to have Andrae explain what "unification" meant.
The term comes from Prolog interpreters, and here it describes a way to simplify certain joins. If one of the constraints in a join resolves to just one row, then it can be the subject of unification. In this case, any other constraint which shares variables with the single-row constraint can be modified: those variables are set to the pre-calculated values from that row, and the constraint is looked up again.
As an example, consider two constraints A and B, where A looks like:
(<ns:foo> <ns:bar> $x)
and constraint B looks like:
($x <ns:pred> $y)
Now if A were to resolve to just one row:
<ns:foo> <ns:bar> <ns:obj>
Then B can be modified to look up:
(<ns:obj> <ns:pred> $y)
The result is that A can be dropped, and B can be re-evaluated (trivial), rather than having to go through the more expensive operation of joining A to B.
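To make the substitution concrete, here is a minimal sketch in Python. It is not Kowari's code: the tuple representation of constraints, the in-memory `store`, and the helper names `solve` and `unify` are all assumptions made up for illustration, and the extra statements in the store are invented sample data.

```python
# Hypothetical model: a constraint is a (subject, predicate, object) tuple,
# and any term starting with "$" is a variable.

def is_var(term):
    return term.startswith("$")

def solve(constraint, store):
    """Return one {variable: value} dict per statement that matches."""
    rows = []
    for triple in store:
        binding = {}
        for term, value in zip(constraint, triple):
            if is_var(term):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            rows.append(binding)
    return rows

def unify(a, b, store):
    """If A has exactly one solution, substitute it into B; else leave B alone."""
    rows = solve(a, store)
    if len(rows) != 1:
        return b
    return tuple(rows[0].get(term, term) for term in b)

# Invented sample data for the example in the text.
store = [
    ("<ns:foo>", "<ns:bar>", "<ns:obj>"),
    ("<ns:obj>", "<ns:pred>", "<ns:v1>"),
    ("<ns:obj>", "<ns:pred>", "<ns:v2>"),
    ("<ns:other>", "<ns:pred>", "<ns:v3>"),
]
A = ("<ns:foo>", "<ns:bar>", "$x")
B = ("$x", "<ns:pred>", "$y")
rewritten = unify(A, B, store)
```

Here `rewritten` is the constraint with `$x` replaced by `<ns:obj>`, so resolving it is a single lookup with no join against A.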
The only trick here is that the first column of B must still be labeled "$x". The label is needed for any subsequent joins, and for the select clause to get to the value. The iTQL way of setting a variable to a fixed value is with the magical predicate:
($x <kowari:is> <ns:obj>)
So the equivalent iTQL for this is:
($x <kowari:is> <ns:obj>) AND
($x <ns:pred> $y)
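The labeling requirement can be sketched the same way. The Python below is only an illustration under assumed names (`evaluate_with_binding`, `store`, and the sample statements are all hypothetical), and it scans a list where a real store would use an index. The point is that the pinned variable stays in the result as a column.

```python
def evaluate_with_binding(constraint, fixed, store):
    """Resolve a constraint with some variables pinned, keeping their columns."""
    rows = []
    for triple in store:
        # Start each row from the pinned values so the "$x" column survives.
        row = dict(fixed)
        for term, value in zip(constraint, triple):
            if term.startswith("$"):
                # A pinned or repeated variable must agree with the statement.
                if row.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            rows.append(row)
    return rows

# Invented sample data.
store = [
    ("<ns:foo>", "<ns:bar>", "<ns:obj>"),
    ("<ns:obj>", "<ns:pred>", "<ns:v1>"),
    ("<ns:obj>", "<ns:pred>", "<ns:v2>"),
    ("<ns:other>", "<ns:pred>", "<ns:v3>"),
]
# Plays the role of ($x <kowari:is> <ns:obj>) AND ($x <ns:pred> $y).
rows = evaluate_with_binding(("$x", "<ns:pred>", "$y"), {"$x": "<ns:obj>"}, store)
```

Every row in `rows` carries both `$x` and `$y`, so a later join or the select clause can still refer to `$x` by name.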
This dredges up some issues I have with some of the syntax in iTQL. I might just mention them here, so I have a reference point in future.
The kowari:is syntax is a "neat" way of doing it, in that it fits iTQL trivially, but it is a syntax I dislike. The reason is that it isn't a "constraint" in the usual sense, but actually changes other constraints in the query. I'd have preferred a syntax which makes it easier for a user to understand what they are doing, rather than one which fits in with the pre-existing syntax. The kowari:is syntax is also painful to work with in the query layer, as it gets parsed as a constraint, and gets put into the constraint expression as an individual entity. This means that these magical predicates have to be found, and their effect passed on to all the other constraints that they are relevant for. I don't think this algorithm is particularly elegant.
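A rough sketch of that "find and pass on" step might look like the following. This is not the query layer's actual algorithm: it assumes the constraint expression is a flat list of AND-ed triples (a real expression can nest), and the name `apply_is_constraints` and the third sample constraint are hypothetical.

```python
IS = "<kowari:is>"

def apply_is_constraints(conjunction):
    """Split AND-ed constraints into fixed bindings and rewritten constraints."""
    # First pass: find the magical predicates and record what they pin.
    fixed = {}
    for subject, predicate, obj in conjunction:
        if predicate == IS:
            fixed[subject] = obj
    # Second pass: push those values into every remaining constraint.
    rewritten = [
        tuple(fixed.get(term, term) for term in constraint)
        for constraint in conjunction
        if constraint[1] != IS
    ]
    return fixed, rewritten

query = [
    ("$x", IS, "<ns:obj>"),
    ("$x", "<ns:pred>", "$y"),
    ("$y", "<ns:other>", "$z"),  # invented extra constraint that never mentions $x
]
fixed, rewritten = apply_is_constraints(query)
```

Even in this simplified form the awkwardness shows: one entry in the list is not resolved at all, and exists only to rewrite its neighbours.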
There are several other approaches which could have been tried. The first to come to mind is simply:
($x=<ns:obj> <ns:pred> $y)
This has the advantage of not introducing an AND when no join operations are going to occur. It can be argued that this is not a consideration: after all, the unification optimisation can result in a join being dropped when an AND is present. However, that is an implementation optimisation, and the join operation can occur, depending on the data. The kowari:is predicate, on the other hand, cannot result in a join.
Against that, this syntax has the disadvantage of getting lost among all the other constraints which include the variable $x. It would be redundant to set the variable for each constraint, but if you don't, then this syntax would have one constraint modifying all of the others: a situation I would prefer to avoid.
So this kind of syntax would simplify some queries, but would possibly be confusing for others. It may be possible to create something that is distinct from a constraint and which applies to all of the constraints, but then the kowari:is constraint almost does that anyway (if you're willing to accept that it is not a normal constraint). There are always tradeoffs. I won't be changing it any time soon, but I certainly won't be averse to having a go if someone comes up with a better idea.
Saturday, March 12, 2005