Monday, October 04, 2004

Today was just about the same as Friday. I spent the time putting together example OWL code, writing entailment queries in iTQL, and then documenting the whole thing. Most of what I got done today revolved around owl:equivalentClass and owl:equivalentProperty.

Strangely, the queries I had originally written for this had the wrong OWL constructs in them. I wasn't paying attention during the conversion, but Volz had used several constructs which did not belong to OWL, but rather to DAML. For instance, Volz described sameClassAs and samePropertyAs, when these are from the DAML namespace. Fortunately, this is a trivial problem, as these declarations have a direct mapping into OWL, translating into owl:equivalentClass and owl:equivalentProperty.

Set Operations
While considering how to perform an inverse on a constraint, ala the complement operator except, I realised what my problem has been all along.

I keep making the mistake of describing an inner join as a set intersection. While analogous, this is not quite right. A constraint defines a set of triples, and if this is joined via a variable to another set of triples, then the result can be up to 6 values wide (not that there is a reason to use more than 5, since 6 columns means an unbounded cartesian product).

In fact, the solution is all 6 columns of the cartesian product with a few transformations applied to it. For a start, the sigma operation (select) will define one or more columns to be equal. These repeated columns are redundant and are consequently dropped, though strictly speaking, they do still exist. Similarly, any columns of fixed values are redundant, so they are also dropped. This dropping of columns is a projection of the 6 column result down to just the columns with the useful information in them, ie. the variables.

So the sigma operator defines the elements to keep from the cartesian product, giving us the result, and the final projection just removes unneeded columns. Note that mathematically these columns do exist, they just contain known data so there is no need to keep them.

The point of this is that the inputs of the join are a pair of sets of triples, but the output is a set of 6-tuples, based on a cartesian product. This is most definitely not an intersection operation.

We had been considering a "difference" operation to be an intersection with a complement of a set, but now I've come to realise that this is not the case. Instead, it would be implemented by the normal selection operation:

  A - B = { x: xi ∈ A ∧ xj ∉ B }
Where i and j are the selection columns.

I think that it might also be possible to define a ¬B based on the cartesian product of the entire dataset with itself, but I'm a bit tired to check this properly. The idea is that we need the inverse of B, not in respect to the space that it occurs in, but in the space of the result. Since the result of an inner join is that of the cartesian product, this makes the complement of B also appear in that space.

No comments: