Sunday, June 03, 2007

Conference Discussions

A couple of ideas came out of SemTech which I wanted to write down.

Mulgara Reasoning

The first is the reasoning engine in Mulgara. I've been thinking for a little while that we wanted more than one reasoner, but I've finally worked out what we want and why:
  1. The Krule engine, for running OWL inferencing rules scalably over large TBox and ABox sets after loading. OK, I have a personal attachment to this, but it works very well, for several reasons - all of which will be mentioned in my upcoming thesis. :-)
  2. A Rete engine, for change management. This is required for data that is coming into an existing system. Running the full set of rule calculations in this case is prohibitively expensive, so a system that handles changes efficiently will be important. This will be based on Krule (since Krule uses Rete principles, but leverages the indices instead of using memory). I'm also hoping to use some ideas from Christian's work, if I can. Adding data isn't really an issue, but deleting it efficiently is trickier.
  3. A tableaux reasoner. While most queries are managed quite well by inferred data, there are some questions that cannot be solved without infinite inferences. This happens when there are two or more alternate paths that could satisfy a query, and the ontology says that one of the paths will be taken, but does not provide the details of which one. The classic case is the Oedipus example, described in chapter 2 of the DL handbook. For this reasoner I want to use Pellet.
I noticed that all the professional systems that are using tableaux reasoners are using Racer. This isn't an option for Mulgara, as Racer is a commercial product. However, Pellet has some unique features which make it quite worthwhile. It stays up to date with the standards very well, and the OWL debugger has some really valuable features going for it. Finally, I might be able to find some way to hook it into the indexes to let it do some of the tableaux reasoning on disk. I don't know about that one, but you never know.
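
To make the "alternate paths" problem concrete, here is a small sketch of the Oedipus example. It is not a tableaux algorithm and has nothing to do with how Pellet works internally; it simply brute-forces every possible value for the unknown facts, assuming a closed set of four named individuals. The function names are my own invention for illustration. The query is: does Iokaste have a child who is a patricide, and who in turn has a child who is not a patricide?

```python
from itertools import product

# Facts from the Oedipus example. Nothing is stated about whether
# Polyneikes (or Iokaste) is a patricide.
HAS_CHILD = {
    ("Iokaste", "Oedipus"),
    ("Iokaste", "Polyneikes"),
    ("Oedipus", "Polyneikes"),
    ("Polyneikes", "Thersandros"),
}
KNOWN_PATRICIDE = {"Oedipus": True, "Thersandros": False}
INDIVIDUALS = ["Iokaste", "Oedipus", "Polyneikes", "Thersandros"]


def witnesses(patricide):
    """Children of Iokaste who are patricides with a non-patricide child."""
    return {
        child
        for parent, child in HAS_CHILD
        if parent == "Iokaste"
        and patricide[child]
        and any(p == child and not patricide[gc] for p, gc in HAS_CHILD)
    }


def possible_worlds():
    """Yield every assignment that agrees with the known facts."""
    unknown = [i for i in INDIVIDUALS if i not in KNOWN_PATRICIDE]
    for values in product([True, False], repeat=len(unknown)):
        yield {**KNOWN_PATRICIDE, **dict(zip(unknown, values))}


def entailed():
    """The query is entailed if it has a witness in every possible world."""
    return all(witnesses(world) for world in possible_worlds())


def common_witnesses():
    """Individuals that answer the query in every possible world."""
    return set.intersection(*(witnesses(w) for w in possible_worlds()))
```

The answer to the query is always yes, but the individual that satisfies it changes: Polyneikes if he is a patricide, Oedipus if he is not. No single binding works in every case, which is why materializing inferred statements ahead of time cannot answer this question, and a reasoner that explores the alternatives is needed.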

Mulgara Security

Mulgara has never had security, but the hooks have always existed. In the commercial product, authentication and authorization were just handled with JAAS. Once these were established, an RDF description of the accessible models for the current user could be intersected with the models requested.

Since JAAS is easy to implement, I've been wondering about putting this in again. There may be a bit of missing code (or code that suffered bit-rot) but the idea of performing the intersections is an easy one, so it wouldn't be hard. However, David told me that he'd heard from a few people that the JAAS approach is the wrong one to take.

It seems that security-conscious people don't like security being managed in the database. Fair enough. So where do we put it?

David had the idea that maybe we put permissions into the database, as always, but then access and authentication get done outside of the store, in a gateway interface. Any incoming queries to this interface can be modified to intersect on the models accessible by the currently authenticated party. Any RDF databases that perform efficient model intersections (i.e. Mulgara) would work well, while everyone else would merely work correctly.
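
As a rough sketch of what the gateway's rewriting step might look like: the code below assumes the incoming query has already been split into its graph pattern and its list of requested models, and that the set of models accessible to the authenticated user has already been looked up. Both function names and the example URIs are hypothetical, and a real gateway would need a proper SPARQL parser rather than string assembly.

```python
def restrict_models(requested, accessible):
    """Return the requested models the user may read, in request order."""
    allowed = set(accessible)
    return [model for model in requested if model in allowed]


def rewrite_query(pattern, requested, accessible):
    """Build a SPARQL query whose dataset is limited to permitted models.

    `pattern` is the body of the WHERE clause. The FROM clauses are
    generated only for models in both `requested` and `accessible`.
    """
    models = restrict_models(requested, accessible)
    if not models:
        # Refuse outright rather than fall back to a default dataset.
        raise PermissionError("no requested model is accessible to this user")
    from_clauses = "\n".join(f"FROM <{model}>" for model in models)
    return f"SELECT *\n{from_clauses}\nWHERE {{ {pattern} }}"


# Example: the user asks for two models but may only read one of them.
query = rewrite_query(
    "?s ?p ?o",
    ["http://example.org/private", "http://example.org/public"],
    {"http://example.org/public"},
)
```

Here the intersection happens in the gateway before the query is sent, so it works against any store. A store that can intersect models efficiently could instead be handed the permissions model and left to do that work itself.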

David thinks this sounds like a good open source project. I like the idea so much that I just had to write it down. It would be very cool to provide something like this for every SPARQL-compliant database.
