Thursday, June 23, 2005

Slow and Steady
I've been plodding along with the rules engine over the last week. Unfortunately, I'd initially planned to be done by now (I hadn't planned on getting sick, etc.), and so I'd already committed myself to some other modeling work. That meant that I had to juggle the two together. However, I really need to finish the rules work quickly. I want to get it working so I can move on to the next part of OWL support (needed for my thesis), and also because I won't get any money for my last couple of months until it's done. I think I'm actually supposed to invoice for time rather than the product, but I made a commitment to complete this phase, so I'll have it working before sending in the invoice. I just hope that the payment won't be too long after! :-)

After this last week, the RDF seems to be configured correctly, the code reads it all as required, and the elements all work properly. But it's tough getting them to work together in the way that I want. I'm half tempted to glue it all together in a way that I've already seen work, but I need the more general solution. At the moment, the problem is RMI. That figures.

When I tell ItqlInterpreter to apply a set of rules, I want to read the rules structure and return a result to the ItqlInterpreter. Then ItqlInterpreter can change its current session to point to the data to be worked on, and run the rules. This all seems to work (it's not fully tested, but what I've done so far works) if I have it all occur in one step on the server, but then the rules can't be separated from the data to be worked on (though I am presuming it is all found on the same server). I need to make it happen in the two steps I've outlined.

Unfortunately, I'm having real trouble getting the rules structure to pass over RMI correctly. By default, getting the rule structure from ItqlInterpreter would serialize the structure and create it in the client space. This has two problems. The first is that it's inefficient. There's no reason to move all that data over to the client, since it will only ever get used at the server. The more important problem is that the client would need to have access to the rule structure classes for de-serialization, and I'm trying to keep the client completely oblivious to all but the interfaces.

The better way to manage this situation is to pass back a remote reference to the class. I struggled for some time getting this right (I forgot how annoying RMI can be in this regard). To simplify things I decided not to ship a reference to the entire object (since the methods on that object should never be called remotely), and instead created a remotable wrapper to hold a local reference. The remote reference to the wrapper can be shipped over RMI, and when it comes back it can be queried for the local object that I want. Only that's not what I'm getting.

It took a LONG time to get a stack trace that was useful (it's being completely hidden in RMI, and was never being thrown from where I thought it was being thrown), but I finally worked out that the problem is coming from the server when it tries to extract the local reference to the rules structure. At this point it tries to serialize the rules structure, which is not legal (intentionally so). I believe that this is because the object doesn't know that it is now back on its machine of origin, and does not know that it can just pass back a local reference.
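RMI marshals any argument or return value that isn't a remote object using ordinary Java serialization, so the same kind of failure can be reproduced without any networking at all. The following is only a minimal sketch: `LocalRules` and `SerializationCheck` are hypothetical names standing in for the real rules structure, which isn't shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

/** Hypothetical stand-in for the rules structure: intentionally NOT Serializable. */
final class LocalRules {
    final String name;

    LocalRules(String name) {
        this.name = name;
    }
}

final class SerializationCheck {
    /** Returns true if Java serialization accepts the object, false if it rejects it. */
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            // The failure that was buried inside the RMI stack trace.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize("a plain string"));       // prints "true"
        System.out.println(canSerialize(new LocalRules("rdfs"))); // prints "false"
    }
}
```

A check like this, run against whatever a remote method returns, would probably have surfaced the problem much sooner than waiting for RMI to report it.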

Perhaps I'm approaching it all wrong. Now that I have the RMI compiling and running correctly (this took me some time) I could possibly drop the indirection of the wrapping object, and pass back the rule structure as a remote reference (like I originally tried). My only concern is that calling run() needs to pass in a DatabaseSession, and I want it to take the local session without trying to serialize it. I'll have a go in the morning anyway, and see what happens. In the worst case I can always keep a local map of the remotable objects to the local rules structures they represent. Then when an object comes in for a "run request" I can get the local rules out of the map.
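The fallback map idea might look something like the sketch below. Note that this is a variation rather than exactly what I described: instead of keying the map on the remotable object itself, it keys on a small serializable token, which avoids any questions about how stubs compare for equality. All of the names (`RuleSet`, `RulesToken`, `RulesRegistry`) are made up for illustration, and the RMI round trip is simulated with plain serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side rules structure; deliberately not Serializable. */
final class RuleSet {
    private final String name;
    RuleSet(String name) { this.name = name; }
    String name() { return name; }
}

/** Hypothetical token handed to the client; cheap and safe to copy over the wire. */
final class RulesToken implements Serializable {
    private static final long serialVersionUID = 1L;
    private final UUID id;
    RulesToken(UUID id) { this.id = id; }
    @Override public boolean equals(Object o) {
        return o instanceof RulesToken && ((RulesToken) o).id.equals(id);
    }
    @Override public int hashCode() { return id.hashCode(); }
}

/** Server-side map from tokens to the local rules they represent. */
final class RulesRegistry {
    private final Map<RulesToken, RuleSet> local = new ConcurrentHashMap<>();

    RulesToken register(RuleSet rules) {
        RulesToken token = new RulesToken(UUID.randomUUID());
        local.put(token, rules);
        return token;
    }

    RuleSet resolve(RulesToken token) {
        RuleSet rules = local.get(token);
        if (rules == null) throw new IllegalArgumentException("Unknown rules token");
        return rules;
    }

    void release(RulesToken token) { local.remove(token); }
}

final class RulesRegistryDemo {
    /** Simulates the copy a token would go through on a client round trip. */
    static RulesToken copyLikeRmi(RulesToken token) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(token);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RulesToken) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        RulesRegistry registry = new RulesRegistry();
        RuleSet rules = new RuleSet("rdfs-entailment");
        RulesToken returned = copyLikeRmi(registry.register(rules));
        // The copied token still resolves to the very same local object.
        System.out.println(registry.resolve(returned) == rules); // prints "true"
    }
}
```

The catch with any map like this is lifetime: something has to call `release` when the client is finished, or the rules structures will never be garbage collected.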

It's late. I'll think about this overnight and have a go in the morning.

OWL
Bob asked that I work with him on looking at extensions to OWL. He knows I've been thinking about this lately, and it turns out that he has some ideas as well.

Yesterday was the first opportunity I've had to really see how he works on these problems, and I have to say that I'm impressed. While he doesn't have a strong background in OWL, he is able to draw on a good formal understanding of category theory, E-R diagrams, and other areas. It allows him to see where OWL is missing functionality, or how certain functionality might be achieved using OWL constructs. There were a few occasions when I could suggest using an OWL or RDF construct to achieve something, and he could quickly show why this would or wouldn't work, and if it wouldn't work then why not.

The most impressive thing is that he really knows the boundaries of his knowledge, and he knows where to go looking when he doesn't know something. Conversely, I don't know how much I don't know. Even when I recognise that I need to learn more about something, I don't even know the name of the field that I need to learn more about. I guess that just comes with experience.

The two big things to come out of our conversation were Cartesian product classes and predicate composition. Cross products allow for several important relationships, but most importantly they would permit the description of a pair of relationships which may be individually repeated, but together must be unique (like composite keys in a database table). Predicate composition ends up covering a couple of the ideas I've already described, such as Euclidean predicate relationships, only it does it in a single construct.
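To make both ideas a little more concrete, here is a rough sketch over plain in-memory data. It has nothing to do with how either construct would actually be expressed in OWL; it just shows the semantics. The class names and the family predicates (`hasParent`, `hasBrother`, `hasUncle`) are invented for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** A binary predicate stored as subject -> set of objects. */
final class Relation {
    private final Map<String, Set<String>> edges = new HashMap<>();

    Relation add(String subject, String object) {
        edges.computeIfAbsent(subject, k -> new TreeSet<>()).add(object);
        return this;
    }

    Set<String> objectsOf(String subject) {
        return edges.getOrDefault(subject, Collections.emptySet());
    }

    boolean holds(String subject, String object) {
        return objectsOf(subject).contains(object);
    }

    /** Composition: x relates to z iff some y has (x this y) and (y next z). */
    Relation compose(Relation next) {
        Relation out = new Relation();
        for (Map.Entry<String, Set<String>> e : edges.entrySet()) {
            for (String mid : e.getValue()) {
                for (String z : next.objectsOf(mid)) {
                    out.add(e.getKey(), z);
                }
            }
        }
        return out;
    }
}

final class CompositeKeyCheck {
    /** True if no (first, second) pair repeats, even though each column may. */
    static boolean pairsUnique(String[][] rows) {
        Set<List<String>> seen = new HashSet<>();
        for (String[] row : rows) {
            if (!seen.add(Arrays.asList(row[0], row[1]))) {
                return false;
            }
        }
        return true;
    }
}

final class OwlIdeasDemo {
    public static void main(String[] args) {
        Relation hasParent = new Relation().add("alice", "bob");
        Relation hasBrother = new Relation().add("bob", "carl");
        Relation hasUncle = hasParent.compose(hasBrother);
        System.out.println(hasUncle.holds("alice", "carl")); // prints "true"

        String[][] rows = { {"a", "1"}, {"a", "2"}, {"b", "1"} };
        System.out.println(CompositeKeyCheck.pairsUnique(rows)); // prints "true"
    }
}
```

In the second check each value appears more than once in its own column, but no pair repeats, which is exactly the composite key situation.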

I know that Ian Horrocks has explained that things like predicate composition can lead to undecidability, which needs to be considered for OWL DL. While that is no problem for OWL Full, a lot of people are more interested in OWL DL for the moment, so I think that it is important that any new constructs have some use in OWL DL. However, decidability is not something that either of us knows a lot about. Fortunately, I've been given a reading list by Guido, plus I have those pointers from Ian. I just need to find some time to sit down and read them! :-)
