Tuesday, June 10, 2008

Thesis

I've finally started writing my thesis, so don't expect to see me blog much in the near term. I know I haven't been blogging much at all this year, but I'm guessing I'm about to get worse (or who knows? Maybe I'll procrastinate and blog more).

I'm still in the introductory chapters, so I'm reviewing everyone else's work. I have a stack of references from a few years ago, but I need to update some of them, and finally read some of the papers I put off all that time ago.

One of the really startling things is reading about stuff that I had to discover for myself while implementing Mulgara. As a database developer you just do things because they seem pragmatic, and you figure that everyone must do it that way. Then you read a paper where someone formalizes your assumptions and gives a name to it. I can think of several here, but the first that comes to mind is "DL-safe rules".

DL-safe rules are simply rules where the variables in the head must also occur in the body. Well, building rules for OWL that meet this criterion seems obvious to me, but apparently it merited a couple of papers on the topic. For a start, I'm not sure how you'd even do this without making sure your variables in the head all come from the body. Second, the only way this would work (that I know of) is to start introducing blank nodes for existential statements... and that way lies madness.
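To make the condition concrete, here is a minimal sketch in Python of the check as I've described it: every variable in the head must also appear somewhere in the body. The `Atom` and `Rule` types and the `?x` convention for variables are made up for illustration and have nothing to do with Mulgara's rule engine. It also only tests the condition as stated above; the definition in the papers is stricter about which body atoms count.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    predicate: str
    args: tuple  # strings; anything starting with "?" is a variable


class Rule(NamedTuple):
    head: Atom
    body: tuple  # tuple of Atom


def variables(atom):
    """Return the set of variable names used in an atom."""
    return {a for a in atom.args if a.startswith("?")}


def head_vars_bound(rule):
    """True if every head variable also occurs in at least one body atom."""
    body_vars = set()
    for atom in rule.body:
        body_vars |= variables(atom)
    return variables(rule.head) <= body_vars


# ancestor(?x, ?z) :- father(?x, ?y), ancestor(?y, ?z)
safe = Rule(
    Atom("ancestor", ("?x", "?z")),
    (Atom("father", ("?x", "?y")), Atom("ancestor", ("?y", "?z"))),
)

# father(?y, ?x) :- Man(?x)    (?y appears only in the head)
unsafe = Rule(Atom("father", ("?y", "?x")), (Atom("Man", ("?x",)),))
```

Here `head_vars_bound(safe)` is true and `head_vars_bound(unsafe)` is false. The second rule is the kind that can only be satisfied by inventing a value for `?y`.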

For instance, if you define (somewhat informally):
  ∀x ∈ Man → ∃y : Man(y) ∧ father(y,x)

Then simply by saying Man(fred) you have an infinite loop. Incidentally, this is a trivial demonstration of how hard it can be to model the real world. The simple solution is to somehow incorporate a new type, like Men-without-fathers, and put that in your rule (hmmm, doesn't the DL-Handbook mention something like that?). Whether you introduce an entity named adam or somehow model evolution (good luck there) is up to you.
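The loop is easy to see if you apply the rule naively. The following is only a sketch: `chase_fathers` and the `_:bN` blank node labels are invented for this example, and the `max_rounds` cap exists purely because the rule never reaches a fixpoint on its own.

```python
from itertools import count


def chase_fathers(men, max_rounds):
    """Apply 'every Man has a father who is a Man' for at most max_rounds rounds.

    Each round gives every fatherless man a fresh blank node as a father,
    and that blank node is itself a Man, so it needs a father next round.
    """
    men = set(men)
    father_of = {}     # child -> father
    fresh = count(1)   # source of blank node labels
    for _ in range(max_rounds):
        missing = sorted(m for m in men if m not in father_of)
        if not missing:
            break      # fixpoint; this rule never gets here
        for child in missing:
            bnode = f"_:b{next(fresh)}"
            father_of[child] = bnode
            men.add(bnode)
    return men, father_of


men, father_of = chase_fathers({"fred"}, max_rounds=5)
```

Starting from just `fred`, five rounds leave six members of `Man` and a chain of five `father` statements, with `_:b5` still waiting for a father of its own. Raise the cap and the chain simply gets longer.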

Back to the example... Of course, in OWL you can just create a blank node for an unknown father, but if you're going to take it that far then you want to create a blank node for the father of the first blank node, etc. Maybe it's reasonable to simply create that first step, and not reason further on blank nodes, but now you're making a judgment call that:
a) May not prove to be as useful as you'd envisaged.
b) May have implications for your logic.

Besides, what's the point in inferring a new node that you can't perform further inferences on? You'd just have a node there not saying anything except that it's a "father". But if you want to include it in a rule for determining ancestor(x,y), then suddenly it can be inferred on again, and you run the risk of an infinite loop once more.

So DL-safe rules just make sense in OWL (at least, they do to me). It's strange to see people like Boris Motik take them so seriously.

Speaking of Boris, he basically wrote the thesis I was hoping to write (well, sort of - fortunately I have a few ideas of my own). I came to many of the same conclusions that he has, simply by virtue of implementing stuff for Mulgara (though by virtue of having another child, moving countries, interrupting my candidature, and holding down a full-time job, I didn't publish anything in time). The difference between what I would have written and what Boris did write is that he knows the theory way better than I'm ever going to have the time for. I mean, I can follow it all, but it would never have occurred to me to give such algebraic formalism to everything the way he did. It's a little humbling to see someone do something like that so much better than you would have done.

Oh well. I guess I'd better stop procrastinating and write some more.
