So where have I been, and what have I been doing? Why have I not blogged for the past few days? Have I been accomplishing anything? The answer to the final question is, "Yes." The others require explanation...
Over the last few days I've had various evening engagements. Last night was the opening of the French Film Festival, along with a party. I'd provide a link, but the site uses a frameset with navigation via cookies. Very messy, and it can't be consistently deep-linked. Try Palace Cinemas and click on "Festivals and Events" if interested.
Unfortunately, this clashed with the opening night of Flickerfest, which I'd rather have seen (we had tickets with friends for the French film, so we had to forgo Flickerfest). We have Flickerfest passes, so at least I'm getting to see the remainder. Tonight's screening was really enjoyable. I particularly enjoyed "7:35 De La Manana", which was a romantic song and dance number, performed by a suicide bomber and his captives. You'd have to see it to get it. :-)
Other nights this week were just a matter of falling asleep early, due to exhaustion.
This impacts my blog because I write at night. I had no ability to post during the day, due to the bug I described where I log into other people's accounts while on campus. Even if I wrote during the day, it would be difficult to post at night, simply because I haven't been turning the computer on after hours.
After a few days of experimenting I now have a first draft of an OWL description of the rule data. Now if only I had an OWL engine to test the validity of the RDF I write to this specification. :-)
I'm reasonably happy with this draft, but I already want to make some changes.
I started out by building a structure in a ball and stick diagram to represent a simple rule. Unfortunately, I found that this approach has problems. While a ball and stick diagram is effective for illustrating the main elements of a graph, it quickly gets out of control when types and inheritance are included. Consequently, I stuck to the main points, and used RDFS/OWL to flesh it out when I came to write the RDF-XML.
In fact, I realise now that I only really like to use ball and stick diagrams for ABox data. TBox data may look fine in UML, but in RDF it isn't so pretty. The biggest problem is that it gets merged in with the ABox data. After all, RDF is good for merging data in this way. But having all the data in one place makes the diagram unreadable.
The biggest problem I had with this design was how to structure things. By this I mean decisions like where to use blank nodes, where collections are appropriate, and so on.
I had first thought that I would be navigating my way through the RDF programmatically. For this reason I thought that the JRDF interface might be the way to go. However, after fooling myself like this for a while I started realising just how many queries would be required for even the simplest parts of the graph. I felt like a fool, considering that I am using these rules to perform set-at-a-time operations on the very same database. All I need to do is use iTQL to do all the work in one fell swoop.
To help here I can do a couple of things to make the queries easier. The first is to avoid collections where possible. The next is to be careful about the choice of blank nodes.
The only problem I have is that a variable list from a select clause must be ordered. This means that I have to use a sequence, meaning the use of the rdf:_ predicates. I don't believe that DavidM got around to prefix matching in the stringpool, so there is no way to easily query this data.
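To make the ordering problem concrete, here is a minimal sketch of what prefix matching on the membership predicates has to achieve, done client-side in plain Java. It is not Kowari or string pool code: the class `SeqMembers` and its methods are hypothetical names, and the map of predicate URI to value is only a stand-in for whatever a query would return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Kowari or JRDF.
final class SeqMembers {
    // The standard RDF namespace; sequence members use rdf:_1, rdf:_2, ...
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String MEMBER_PREFIX = RDF_NS + "_";

    /** Returns the position encoded in an rdf:_n predicate, or -1 if it is not one. */
    static int ordinal(String predicateUri) {
        if (!predicateUri.startsWith(MEMBER_PREFIX)) {
            return -1;
        }
        String digits = predicateUri.substring(MEMBER_PREFIX.length());
        // Membership properties start at 1 and have no leading zero or sign.
        if (digits.isEmpty() || digits.charAt(0) < '1' || digits.charAt(0) > '9') {
            return -1;
        }
        try {
            return Integer.parseInt(digits);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Returns the values of membership predicates in sequence order, ignoring all others. */
    static List<String> ordered(Map<String, String> predicateToValue) {
        TreeMap<Integer, String> byPosition = new TreeMap<>();
        for (Map.Entry<String, String> e : predicateToValue.entrySet()) {
            int n = ordinal(e.getKey());
            if (n > 0) {
                byPosition.put(n, e.getValue());
            }
        }
        return new ArrayList<>(byPosition.values());
    }
}
```

Note that the ordering is numeric rather than lexical, so rdf:_10 sorts after rdf:_2. A prefix match inside the store would presumably need the same care.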
There are several approaches I can think of here. The most obvious is to use the programmatic approach. This works, but is a horrible way to go. I'll only use it as a last resort.
The next idea is to implement prefix matching myself, including a syntax in iTQL. I know how to go about this, although the syntax would probably become a debating point, like it always does. The main problem is finding the time for this.
The last idea is to build a resolver that can find these values. It may be a decent halfway measure, though it may include some hacks. For instance, it could return tuples of rdf:_1, ... rdf:_100 to match against a sequence of fewer than 100 members. A terrible idea for the general case, but fine when you know that you will never have more than half a dozen values.
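The hack amounts to enumerating a fixed, bounded set of membership predicates up front. A rough sketch of just that enumeration follows. `BoundedMembership` and `memberPredicates` are made-up names for illustration, and nothing here uses the real Kowari resolver interfaces. It starts at rdf:_1, which is the first membership property RDF defines.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; a real resolver would return tuples, not strings.
final class BoundedMembership {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Builds the predicate URIs rdf:_1 .. rdf:_max, in sequence order. */
    static List<String> memberPredicates(int max) {
        if (max < 1) {
            throw new IllegalArgumentException("max must be at least 1");
        }
        List<String> predicates = new ArrayList<>(max);
        for (int i = 1; i <= max; i++) {
            predicates.add(RDF_NS + "_" + i);
        }
        return predicates;
    }
}
```

The obvious weakness is the hard-coded bound: any sequence longer than `max` is silently truncated, which is why this only suits cases where the size is known to be small.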
Before I commit to anything I will look in the string pool and consider what I need to do for prefix matching. I can then work from there.
There are still some corners of the structure where I'm trying to determine programmatic control versus iTQL. The best example of this is the selection of RDF structure versus an iTQL string. Since I allow both, I need code which can handle either. So do I find all rules together and sort them out as I iterate over them, or do I select them separately? This decision has an impact on the structure, so I have to consider it. At this point I think I'll be building them separately.
All in all, I'm relatively happy with the ontology. While it's not "diagrammatic" it still provides me with a solid template to build the RDF rules with. Since every rule is structured uniquely, it gets tempting to fiddle with the structure to suit the occasion, but an ontology like this prevents me from getting off track like that. It will also make me carefully consider the structure any time I decide that a change really is necessary.
One thing I hadn't expected is that I got to rename a few classes. While the RDF will need to map into the Kowari query system, the names I give the RDF nodes need have no bearing on the Kowari class structure (although it tries to reflect it). As a result, I was able to name the constraint classes in a way that I'm happier with.
In particular, I'm using the name Constraint rather than ConstraintOperation. In the same way, I'm using SimpleConstraint instead of [...] ConstraintDisjunction are both subclassed from ConstraintOperation which in turn is a subclass of Constraint. Very similar to the Java code, but just enough different that I feel more comfortable about it.
At the request of a friend, I'm spending a bit of time out of hours learning and working with OCL. I haven't learnt all that much yet, but it makes sense so far. I'll certainly be picking up more UML as I go!
Working with OCL, and considering the MOF, I'm starting to look for these layers in OWL and RDF. Sometimes OWL seems really restrictive since so many meta-layers have been collapsed into one.
I had a weird thought the other day. I imagined a class ontology stored in RDF, including class images stored as literals. A ClassLoader implementation could then create instances of a class directly from the datastore, as opposed to using the file system for the classpath.
The examples I can come up with to use this seem pretty contrived, but it seems useful nonetheless, particularly when bringing in new (trusted) RDF with class definitions at runtime. The nice thing is that the class definitions could come with their own ontologies. (Is there a standard for serialised UML into RDF? Could OWL do the job? Where would OCL fit in?) Does anyone have any ideas on this? Do I just want to do it because I can? (Too much software suffers from this.)
The only problem I have is that I don't think we can store an arbitrary blob in Kowari. The string pool will take it, but I don't know if the interfaces will (I'm pretty sure they don't). Of course, I can always uuencode a binary, but putting the blob in directly would be better. Maybe I need to put a new datatype into the string pool.
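For what it's worth, the ClassLoader side of this idea is small. The sketch below assumes the class images have been stored as text literals, and it uses Base64 from modern Java in place of uuencode. The `Map` is a hypothetical stand-in for a lookup against the datastore, and `StoreClassLoader` is an invented name, not anything in Kowari.

```java
import java.util.Base64;
import java.util.Map;

// Hypothetical sketch: loads classes whose images are held as encoded text literals.
final class StoreClassLoader extends ClassLoader {
    // Stand-in for the datastore: fully qualified class name -> Base64 class image.
    private final Map<String, String> classLiterals;

    StoreClassLoader(Map<String, String> classLiterals, ClassLoader parent) {
        super(parent);
        this.classLiterals = classLiterals;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Only reached after the parent loader has failed to find the class.
        String literal = classLiterals.get(name);
        if (literal == null) {
            throw new ClassNotFoundException(name);
        }
        byte[] image;
        try {
            image = Base64.getDecoder().decode(literal);
        } catch (IllegalArgumentException e) {
            throw new ClassNotFoundException(name, e);
        }
        return defineClass(name, image, 0, image.length);
    }

    /** Encodes a class image as the text that would be stored as a literal. */
    static String encode(byte[] classImage) {
        return Base64.getEncoder().encodeToString(classImage);
    }
}
```

Because `findClass` is only consulted after the parent loader, classes already on the classpath still win. Storing the blob directly would remove the encode and decode steps but leave the rest unchanged.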
Well it's late, and I want to ride in the morning, so I'll leave it here. I plan on working a bit this weekend, so I'll see how far I get.
Aside from work, I also need to start writing the confirmation on my weekends. I was putting this off until I'd finished reading some more papers, but I seem to be constantly accumulating literature, so at some point I'm going to have to stop. Maybe I should draw the line where I am now.
Friday, February 25, 2005