Friday, May 20, 2005

With so many changes coming in recently, I really need to document how they all work. I'm sure that Andrae will be encountering the same problems. However, it's not a trivial thing to do.

When Kowari was still managed by Tucana, Grant was formally writing everything up for us. We'd do the copy, while he would format it, structure the content where appropriate, correct our language, etc, etc. Every so often his files would get run through some proprietary program, producing the HTML files seen on the Kowari web site today.

None of this process is available now.

That leaves me with two choices. First, I can just edit the web pages in place. This is always a bad idea with automatically generated files, but if we're not generating them automatically anymore, then who cares? However, I'd like to think that we WILL be able to get this process going again eventually, and any manual changes to these files will be painful to deal with.

The second option is to find the program that Grant used, and generate everything again. The problems here are:

  • Cost of the program (I have no idea how much it's worth).
  • It's not open source.
  • A bottleneck around whoever gets the job of building the documentation.
  • Steep learning curve for whoever takes on the job.
  • It will need a lot of work to make everything look consistent, get publications out when needed, check files into SourceForge, etc.
Of course, the advantages are that this is how the manuals currently work, and they are really great.

Either way, something has to be done. Does anyone have any ideas?

Who had the great idea of integrating the language parser with the execution framework? Let me explain.

When a client wants to execute a query, an ItqlInterpreter object is used. This object parses the iTQL code, and sends the results to a database session for execution. The problem is that it integrates these two steps.

I've written a lot of classes for representing rules when they are brought in from the data store, but the queries are causing me problems. My code is executing inside the database, so it will never need to go through Java RMI (establishing RMI connections is a big bottleneck for Kowari). This means that I have a local DatabaseSession that I want to use. Unfortunately, ItqlInterpreter uses an internal SessionFactory to get the session that it is going to use, meaning that it will always go across RMI.

ItqlInterpreter creates Query objects, and I'd like to just use them on the Session that I have. This is why the coupling of interpreting and execution is such a problem for me.

The easiest way around this would be to allow a SessionFactory to be supplied to ItqlInterpreter, and to hand out the given DatabaseSession from it. However, I've been loath to make this change without input from other people. I haven't seen much of them lately, so I have steered clear of this approach.
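To make the idea concrete, here is a minimal sketch in Java. Every type name in it (SketchSession, SketchSessionFactory, FixedSessionFactory, SketchInterpreter) is a stand-in I've made up for illustration. None of them are the real Kowari interfaces, and the "parsing" is reduced to trimming the text. It only shows the shape of the change: the interpreter asks whatever factory it was given for a session, so a factory wrapping a local session keeps everything in-process.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point. All types in this file are hypothetical stand-ins,
// NOT the real Kowari classes or their signatures.
class InjectedFactoryDemo {
    public static void main(String[] args) {
        LocalSketchSession local = new LocalSketchSession();
        SketchInterpreter interpreter =
            new SketchInterpreter(new FixedSessionFactory(local));
        String itql =
            "  select $s $p $o from <rmi://localhost/server1#model> where $s $p $o;  ";
        System.out.println(interpreter.executeCommand(itql));
    }
}

// Assumed minimal session contract: run a query, return some result.
interface SketchSession {
    String execute(String query);
}

// Assumed minimal factory contract.
interface SketchSessionFactory {
    SketchSession newSession();
}

// A session that lives inside the database process, so no RMI is involved.
final class LocalSketchSession implements SketchSession {
    final List<String> executed = new ArrayList<>();

    @Override
    public String execute(String query) {
        executed.add(query);
        return "local: " + query;
    }
}

// A factory that always hands out the one session it was constructed with.
final class FixedSessionFactory implements SketchSessionFactory {
    private final SketchSession session;

    FixedSessionFactory(SketchSession session) {
        this.session = session;
    }

    @Override
    public SketchSession newSession() {
        return session;
    }
}

// Simplified interpreter: the factory is supplied by the caller
// instead of being looked up internally.
final class SketchInterpreter {
    private final SketchSessionFactory factory;

    SketchInterpreter(SketchSessionFactory factory) {
        this.factory = factory;
    }

    String executeCommand(String itql) {
        String parsed = itql.trim(); // placeholder for the real parse step
        return factory.newSession().execute(parsed);
    }
}
```

The point of the design is that the caller decides where sessions come from, so code running inside the database can reuse the session it already holds.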

Instead, I've been building up Query objects manually, in the same way that ItqlInterpreter does it. There are numerous problems with this approach:
  • Takes a lot of work to write simple queries.
  • The code is hard to write and read, and is therefore neither extensible nor flexible.
  • Difficult code like this is bug prone (OK so far, but I'm debugging a lot).
  • It takes longer to write.
The advantages are that it means I don't have to change ItqlInterpreter, and it's very fast.

However, given my pace of progress, I'm thinking that I should just rip into ItqlInterpreter and give it a new session factory, even without consulting the other developers.

Top Down
I've started to understand the difficulties in representing hierarchical systems in databases. When building the object structure in memory, it is necessary to build objects from the top, and fill in their properties going down the hierarchy.

It is possible to come bottom up, but this can get arbitrarily complex, particularly when children at the bottom can be of arbitrary types. For instance, if A is a parent of B and C, then to build the object structure from the bottom-up, B and C must be found (actually, all leaf nodes must be found), and then you have to recognise that they share the same parent, and provide both of them when constructing A. This sounds easy enough, but gets awkward in practice, particularly when different branches have different depths.

So it is much easier to build this stuff top down. The problem there is that most of the classes being built expect to have all of their properties pre-built for them before they can be constructed. So that doesn't quite work either.

I'm dealing with it by building a network of objects which represents the structure, and which can be easily converted into the final class instances. It's relatively easy to build this way, and easy to follow, but again, it's more code than I wanted to write. Oh well, I guess that's what this job is all about. :-)
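Here is a minimal Java sketch of that idea. The class names (TreeItem for the final class, ItemSketch for the intermediate one) are invented for this example and are not the actual rule classes. The intermediate objects are mutable and get created top down, while the conversion step builds each final object only after its children exist, which satisfies constructors that want everything pre-built.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demo entry point. TreeItem and ItemSketch are hypothetical names
// used only for this illustration.
class TopDownDemo {
    public static void main(String[] args) {
        // Build top down: the parent exists before its children.
        ItemSketch a = new ItemSketch("A");
        a.addChild("B");
        a.addChild("C").addChild("D"); // branches may have different depths

        TreeItem root = a.toItem();
        System.out.println(root.getName() + " has "
            + root.getChildren().size() + " children");
    }
}

// The "final" class: it needs all of its children at construction time.
final class TreeItem {
    private final String name;
    private final List<TreeItem> children;

    TreeItem(String name, List<TreeItem> children) {
        this.name = name;
        this.children = Collections.unmodifiableList(new ArrayList<>(children));
    }

    String getName() {
        return name;
    }

    List<TreeItem> getChildren() {
        return children;
    }
}

// The intermediate node: mutable, so it can be filled in while walking down.
final class ItemSketch {
    private final String name;
    private final List<ItemSketch> children = new ArrayList<>();

    ItemSketch(String name) {
        this.name = name;
    }

    // Adds a child and returns it, so deeper levels can be attached to it.
    ItemSketch addChild(String childName) {
        ItemSketch child = new ItemSketch(childName);
        children.add(child);
        return child;
    }

    // Depth-first conversion: children are converted before the parent
    // is constructed, so the final constructor gets everything it needs.
    TreeItem toItem() {
        List<TreeItem> built = new ArrayList<>();
        for (ItemSketch child : children) {
            built.add(child.toItem());
        }
        return new TreeItem(name, built);
    }
}
```

The cost is exactly the one I complained about: a second set of classes that mirrors the first.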


Tom Adams said...

I was talking with Andrew about the coupling in the interpreter a couple of weeks ago. I started to create a new way of connecting to Kowari - connections - which I was hoping would be a nicer way to talk to it. I created QueryBuilder and QueryExecutor interfaces that would split the functionality, fronted by a connection. I'd built the iTQL query builder implementation and started on the executor implementation, when I came across this problem.

I think the solution you pose is fine. It was more work than I wanted to take on during my morning train rides, so I haven't started the split yet ;).

Take a look at the org.kowari.connection classes; I'm pretty sure they're all up to date and checked in.

It's my understanding that you can pass the interpreter a session to use (however, it should just parse queries, not execute them). In fact, the bean does just this. I don't have the code in front of me so I'm not sure whether it's via a constructor or a setter.

The parseQuery() method on the interpreter will build you a Query object, but I believe that only covers selects. The other functionality is wired directly into the parser, as you've seen.

The documentation generator Grant used was called AuthorIT. I think they have non-commercial versions. I'll send you Grant's email offline if you're interested.

If you do do the refactor, let me know, as otherwise I was planning on bolting the connector classes onto JRDF. Kowari takes too long to build on my PowerBook!

Quoll said...

I'd love to see a new interface like you propose. Go for it. :-)

The solution I'm proposing is a bit of a hack, as I'm not doing it cleanly. At the moment, ItqlInterpreter uses SessionFactoryFinder to get its SessionFactories. So a clean change would inform the finder that it should be returning a factory that supplies references to the current local session. However, this has problems. It would be tough to tell the finder to start returning these new factories in a perfectly safe manner (so no other thread can accidentally get the wrong type of factory). Also, a lot of work gets done to ask factories about the server they should be returning sessions to, and this should be skipped for local queries.

I've avoided all of this by putting an override on the session. If this override is set then only that session is used, and all of the code querying the SessionFactory is skipped. It's not too pretty, but at least it's efficient.
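Roughly, the override looks like the Java sketch below. The names (QuerySession, CountingLookup, OverridableInterpreter, setSessionOverride) are made up for this comment and are not what is in the Kowari source. CountingLookup just stands in for all the factory-finding work, with a counter so you can see when it gets skipped.

```java
// Demo entry point. Every name in this file is hypothetical and only
// illustrates the override idea; none of it is the real Kowari code.
class SessionOverrideDemo {
    public static void main(String[] args) {
        CountingLookup lookup = new CountingLookup();
        OverridableInterpreter interpreter = new OverridableInterpreter(lookup);

        // No override: the normal lookup is consulted.
        System.out.println(interpreter.execute("rmi://localhost/server1", "query one"));

        // Override set: the lookup is skipped entirely.
        interpreter.setSessionOverride(query -> "local: " + query);
        System.out.println(interpreter.execute("rmi://localhost/server1", "query two"));
        System.out.println("lookups performed: " + lookup.lookups);
    }
}

// Assumed minimal session contract.
interface QuerySession {
    String run(String query);
}

// Stand-in for the factory-finding code; counts how often it is consulted.
final class CountingLookup {
    int lookups = 0;

    QuerySession find(String serverUri) {
        lookups++;
        return query -> "remote(" + serverUri + "): " + query;
    }
}

final class OverridableInterpreter {
    private final CountingLookup lookup;
    private QuerySession override; // null means "use the normal lookup"

    OverridableInterpreter(CountingLookup lookup) {
        this.lookup = lookup;
    }

    void setSessionOverride(QuerySession session) {
        this.override = session;
    }

    String execute(String serverUri, String query) {
        // If an override is set, only that session is used and the
        // lookup code is never touched.
        QuerySession session = (override != null) ? override : lookup.find(serverUri);
        return session.run(query);
    }
}
```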

BTW, until this change, you could not pass a session to an interpreter.

While writing this code I did see the parseQuery() method, but I could not see how this could execute without using sessions anywhere. The SableCC for handling a "SELECT" is the outASelectCommand() method, and this code calls updateSession(). So while it looks like parseQuery() just parses the iTQL into a Query, it actually does more than that.

How fast is that PowerBook? I'm using a PowerBook here, and I'm pretty happy with its speed on Kowari.