Tuesday, April 01, 2008

Collections

So I'm trying to work out what is necessary in OWL, and what is necessary and sufficient. Actually, I just want "necessary and sufficient", but knowing the difference helps. :-)

Anyway, while working through this blog, I worked it out. But it probably won't hurt to write it down anyway...

I had narrowed my problem down to the following:

If I had a Collection like:
  <rdf:Description rdf:about="http://example.org/basket">
<ex:hasFruit rdf:parseType="Collection">
<rdf:Description rdf:about="ex:banana"/>
<rdf:Description rdf:about="ex:apple"/>
<rdf:Description rdf:about="ex:pear"/>
</ex:hasFruit>
Then this is translated to:
<ex:basket> <ex:hasFruit> _:l1 .
_:l1 <rdf:first> <ex:banana> .
_:l1 <rdf:rest> _:l2 .
_:l2 <rdf:first> <ex:apple> .
_:l2 <rdf:rest> _:l3 .
_:l3 <rdf:first> <ex:pear> .
_:l3 <rdf:rest> <rdf:nil> .
Now is this list open or closed? This is an important question for OWL, since collections are used to construct sets such as intersections.

If it's open, then I could add in another piece of fruit...
<ex:basket> <ex:hasFruit> _:l0 .
_:l0 <rdf:first> <ex:orange> .
_:l0 <rdf:rest> _:l1 .
This would work, but it implies that I can infer that every element of the list can be directly connected to the basket. i.e.
<ex:basket> <ex:hasFruit> _:l0 .
<ex:basket> <ex:hasFruit> _:l1 .
<ex:basket> <ex:hasFruit> _:l2 .
<ex:basket> <ex:hasFruit> _:l3 .
Now this makes sense to me, but I don't recall seeing it anywhere in RDF. For instance, it's not in the semantics document for RDF or RDFS. The section on Collections does say that RDF does not require any well-formedness on the structure of the list (indeed, branched structures are explicitly mentioned), but since only OWL-Full allows arbitrary RDF structures, it isn't generally applicable to what I'm interested in.

I'd come to this question while I was checking that an owl:intersectionOf with "complete" modality was necessary and sufficient. I presumed that it was, but it doesn't hurt to check. After all, I've been caught out in the open world before. :-)

I first went to the abstract syntax for class axioms to find out how "partial" modalities were encoded, vs. "complete". The triples encoding of the abstract syntax shows that "partial" is simply a list of rdfs:subClassOf statements for each element in the intersection, while "complete" uses an RDF collection. Actually, the expression "SEQ" is used, but sequences are then described as being of type rdf:List, and not rdf:Seq (which, incidentally, are extensible, but no OWL aficionado will have anything to do with them, so I knew that wasn't a possibility).

Now to make sure that "complete" really is complete, I needed to ensure that lists couldn't be extended.

There is a hint that lists can't be extended in OWL-DL in the OWL Guide:
"If we wanted to add a new winery in some other ontology and assert that it was disjoint from all of those that have already been defined, we would need to cut and paste the original owl:AllDifferent assertion and add the new maker to the list. There is not a simpler way to extend an owl:AllDifferent collection in OWL DL. In OWL Full, using RDF triples and the rdf:List constructs, other approaches are possible."

That raises the intriguing possibility that in OWL-Full an intersection can never be complete. But since OWL-Full is undecidable anyway, I guess that's not something I need to worry about.

That then brought me back to the description for Set Operators which I haven't read in a while. And in reading this I realized that I was a moron for forgetting it...
The members of the class are completely specified by the set operation.

The text then goes on to describe that an individual that is a member of each element of an intersection is then a member of the intersection. In other words, membership in each element is a necessary and sufficient condition for membership in the intersection. Had lists been open, then membership would have merely been necessary, but not sufficient, since there could be another class in the intersection that has not been asserted (yet).

So complete is indeed "necessary and sufficient". But if I'd just looked at the Guide in the first place I could have saved myself a bit of time. Sometimes I feel like an idiot... and then I go and compound it by writing about my stupidity on my blog.

Oh well, this SPARQL implementation won't write itself. I'm down to OPTIONAL - which I expect to take about an hour, and the algebra integration. I'd better make that transformation clean, as I expect to be doing it again soon for the Sesame algebra.

Somehow I also need to find some time to finish writing that paper about 2 column RDF indexes. Did I mention that I think they're a cool idea? :-)

No comments: