Thursday, December 02, 2004

Note
I'm too tired to proofread this (or yesterday's). I'll just have to do it in the morning. In the meantime I'd be grateful if you'd skip over typos, and try to infer what I really meant when I write garbage. :-)

Final Tests
TJ checked in a lot of things the previous night, so he was keen for me to do a full set of tests again. I really needed to run them again anyway, as my machine had shut down overnight for some reason.

This latest test showed up a new problem with the filesystem resolver, but this did not look as simple as the one I'd seen yesterday. I showed it to ML, who recognised the problem and assured me that it was not something that I had broken. I didn't think it was, but you never know!

With everything passing, I was finally able to check this code in. I wanted to commit the files individually, so I could annotate the commit for each file appropriately. This meant that I had to choose the order of commits carefully, and try to do it while everyone else was out, just in case I made a mistake in the ordering and they picked up a half-completed set of files.

SOFA
While the tests were running I was able to read a lot more about SOFA. It certainly has a lot going for it. The main shortcomings that I see are that it does not scale very well, and there are a few OWL constructs that it cannot represent.

In the case of the latter, this is not really a problem, as most of the things that it can't do are OWL Full, with only a little OWL DL missing. For instance, there is little scope for writing relations for relations. Restrictions do not seem to cover owl:someValuesFrom or owl:complementOf, and unions on restrictions are not covered at all. However, where SOFA does not permit certain OWL constructs to be represented, often there is an equivalent construct which will suit the same purpose.

The scaling issue is really due to one of SOFA's strengths. SOFA manages to keep all of its data in memory, such that it knows what kind of relationships everything has to everything else. Our RDF store scales much better, but there is no implicit meaning behind any of the statements. As a result, modifying anything in a SOFA ontology results in consistency checks and inferencing checks being done quickly and easily. To do the same in RDF means querying a lot of statements to understand the relationships involved with the data just modified.

So while SOFA won't apply well to a large, existing dataset, it works very well with data that is being modified one statement at a time. It's a nice way of dealing with the change problem that I've avoided up until now. Experience with this should also help to apply similar principles to changing data in the statement store. Similarly, it may be possible to apply some SOFA inferences on our data by using appropriate iTQL, making the SOFA interface more efficient.

One way to make SOFA work with larger data sets is to serialize an ontology out to a file, and then to bring it back in via SOFA, but this is not very efficient. For this reason, the need to write a proper rules engine has not been removed. I had been wondering about this when I discovered that SOFA did some inferencing.

Sub Answers
Today TJ discovered a problem with some answer types running out of memory. This occurs when an answer is serialized for transport over RMI. The problem is that an answer might have only a couple of lines, but those lines contain subanswers which are huge.

When I serialized answers for RMI it occurred to me that I didn't really know how large a subanswer could be. I initially worried that a large enough set of subanswers could be too much to be handled in memory. However, I couldn't see subanswers getting too large, and so I went ahead with what I had.

Never commit code when you think that in some circumstances there could be a problem with it. I know that. Why did I choose to forget it?

After a few minutes of thought I came up with a solution. The RemoteAnswerWrapperAnswer class determines how to package an answer for the network. This decision needs to be replaced with an Answer factory. The factory then makes a choice:

  • If the answer has subanswers as its rows, then return an RMI reference (higher network overhead, but no memory usage).
  • If the answer contains data and is less than a configured number of rows, then serialize the subanswer.
  • If the answer contains data and is larger than a configured number of rows, then use a paged answer object.
This factory needs to be used recursively as subanswers are traversed. This will result in the large chunks of data at the bottom of the answer tree getting serialized effectively, while the tree itself will be traversed with RMI. The result should handle any amount of data. It shouldn't even take too long to implement.
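The factory's three-way choice, and its recursive use down the answer tree, can be sketched roughly as follows. This is only an illustration of the idea and not the Kowari code: `Answer`, `Transport`, `AnswerTransportFactory` and the `maxSerializedRows` setting are all hypothetical names made up for the example, and treating a row count exactly equal to the limit as "paged" is an assumption, since the rule above only covers "less than" and "larger than".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an answer: each row is plain data (a String) or a nested subanswer. */
final class Answer {
    final List<Object> rows = new ArrayList<>();

    Answer add(Object row) {
        rows.add(row);
        return this;
    }

    /** True if any row is itself an Answer (assumption: one subanswer row is enough). */
    boolean hasSubAnswers() {
        for (Object row : rows) {
            if (row instanceof Answer) return true;
        }
        return false;
    }
}

/** The three ways of getting an answer across the network described above. */
enum Transport { RMI_REFERENCE, SERIALIZED, PAGED }

final class AnswerTransportFactory {
    private final int maxSerializedRows; // hypothetical configured row limit

    AnswerTransportFactory(int maxSerializedRows) {
        this.maxSerializedRows = maxSerializedRows;
    }

    /** Chooses how a single answer should be packaged. */
    Transport choose(Answer answer) {
        if (answer.hasSubAnswers()) {
            return Transport.RMI_REFERENCE; // walk the tree remotely, nothing held in memory
        }
        // Assumption: a row count equal to the limit is treated as "large".
        return answer.rows.size() < maxSerializedRows ? Transport.SERIALIZED : Transport.PAGED;
    }

    /** Applies choose() recursively, recording "depth:choice" for every answer in the tree. */
    void plan(Answer answer, int depth, List<String> out) {
        out.add(depth + ":" + choose(answer));
        for (Object row : answer.rows) {
            if (row instanceof Answer) {
                plan((Answer) row, depth + 1, out);
            }
        }
    }
}

final class AnswerTransportDemo {
    public static void main(String[] args) {
        Answer small = new Answer().add("a").add("b");
        Answer large = new Answer().add("a").add("b").add("c").add("d");
        Answer tree = new Answer().add(small).add(large);

        List<String> decisions = new ArrayList<>();
        new AnswerTransportFactory(3).plan(tree, 0, decisions);
        for (String decision : decisions) {
            System.out.println(decision);
        }
    }
}
```

With a limit of 3 rows, the top of this little tree is handed over as an RMI reference, the two-row leaf is serialized, and the four-row leaf is paged, which is the split between "tree traversed with RMI" and "data at the bottom serialized" that I'm after.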

Remote Resolver
One of our "value adds" (I hate that term) for TKS is to support distributed queries. Any query may be sent to a remote server, but only distributed queries can refer to more than one server in a single query.

These distributed queries now get handled by the remote resolver. When I worked on this resolver I kept it in Kowari, but now that it works properly it has to be moved to TKS. As new features come into TKS, they may reduce the "high-level" value of the remote resolver, and hence allow it down into Kowari (presuming someone else hasn't rewritten it already - after all, that's what open source is about). But for the moment, it has to stay a TKS-only feature.

AN had been looking to move this code, but was waiting until I finished with the RmiSessionFactory code. However, he was having a little difficulty with it, so now that I'm done with the looping bug I've been asked to move the remote resolver myself.

Other than changing the package names of the code, the only real differences seem to be in the Ant build scripts. By the end of the day I'd managed enough that TKS will now build the remote resolver, but I have not yet run all the tests on it.

In the meantime I tried the tests on Kowari now that the remote resolver is gone. The first thing that happened was that many of the Jena tests, and all of the JRDF tests, failed. I agonized over this for a while, but then I realised that I'd been doing the TKS build while these tests were running. According to AN, a TKS build can briefly start a server, which would definitely conflict with any set of Kowari tests being run at the same time.

I'm now re-running the Kowari tests, and I have my fingers crossed that when I get there in the morning they will have all passed. Then I just have to see how well (or poorly) the TKS tests run.
