Friday, November 19, 2004

Remote Server IDs
I don't want to spend long on this blog tonight, as I'm trying to upgrade Barracuda to JDK 1.5, and I want to get back to it.

I spent a little while talking with DM today about the new bug with remote server names. We are finally storing model names as relative URIs (well, it takes URIs, and if they happen to be URLs, then it accepts relative URLs). This is great, but using relative URLs now means that we can refer to a model using numerous machine names, and the current code expects only the canonical machine name. The result is that if a query is issued against a model using a URL based on a non-canonical machine name, then the server doesn't recognise the model as being local. Instead it connects to the "remote" machine that it finds in the URL, and sends the query on to that machine.

An example might be a query on a model stored on a machine named mycomputer.domain.com. The model's URI could be rmi://mycomputer.domain.com/server1#model. If the model is instead referred to as rmi://mycomputer/server1#model then the query will go to the correct server, but the server will not recognise that the model is local, so it will make a connection to mycomputer and forward the query. This leads to an infinite loop.
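To make the failure concrete, here is a minimal sketch of a name-based locality test. The class and method names are made up for illustration and are not Kowari's actual code; it only assumes that "local" is decided by comparing the host in the model URI against the one canonical hostname.

```java
import java.net.URI;

// Hypothetical illustration only: not the real Kowari check.
final class NaiveLocalCheck {

    // Treats a model as local only when the host part of its URI
    // matches the canonical hostname (ignoring case).
    static boolean isLocal(URI modelUri, String canonicalHost) {
        String host = modelUri.getHost();
        return host != null && host.equalsIgnoreCase(canonicalHost);
    }
}
```

With a canonical name of mycomputer.domain.com, the URI rmi://mycomputer.domain.com/server1#model passes this test, while rmi://mycomputer/server1#model does not, even though both reach the same machine. A server using a test like this would treat the second form as remote and forward the query to itself.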

Unfortunately, it is not possible for a computer to know absolutely every name that it can be known by. For instance, all it takes is a new entry in a DNS for a previously unknown name to come into effect. As a result, a client needs some way of telling that the server that it is connected to is actually the local server. This is easily accomplished by giving each server a unique identifier and checking for it with each new connection. I had considered some kind of UUID, but server URIs are really supposed to be unique, and they can be used to tell the user useful information in the event of a failure.
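The identity check could look something like the sketch below. The ServerIdentity class is hypothetical; the only assumption is that each server knows its own public URI and that a freshly connected peer can be asked for the URI it reports.

```java
import java.net.URI;

// Hypothetical sketch: a server's identity is its own public URI.
final class ServerIdentity {

    private final URI serverUri;

    ServerIdentity(URI serverUri) {
        this.serverUri = serverUri;
    }

    // True when the peer we just connected to reports our own URI,
    // no matter which hostname was used to reach it.
    boolean isSelf(URI reportedByPeer) {
        return serverUri.equals(reportedByPeer);
    }

    // Useful in log or error messages when a check fails.
    URI getServerUri() {
        return serverUri;
    }
}
```

The point of the design is that the comparison uses what the peer says about itself, not the name in the URL that was used to reach it, so a new DNS alias makes no difference.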

I had a chat with DM about the DatabaseSession.createModel() method today, but after thinking about it I realised that it operates a little differently to the other types of operations. For a start, it refers to models which should not exist, but which nevertheless represent valid URIs. It then makes the assumption that the model is local, and does a lot of its own work to create the model, rather than passing the work on to a resolver, like other operations do. However, there is no reason that it can't handle non-local URLs, and pass the createModel request on to the appropriate server. It just needs a little extra checking, and then it can pass the request on to a RemoteResolver instance.

Another thing that makes this operation a little different is that it will probably never get called unless the model being referred to is local. This is because the iTQL client (and ITQLInterpreter class) will only connect to the correct machine to make this request. So you'd think I could just ignore this operation. However, there are other ways to call this method (e.g. JRDF does it) so it would just be asking for trouble if I didn't cover this case.

Finding the ID
In order to find the server URI, I had to start looking in EmbeddedKowariServer. SR has been complaining about this code for some time, comparing it to a post-apocalyptic wasteland where nomads have taken pot shots at various pieces of code as the need arose. After looking at the code I started to see his point.

For instance, there is a static value which holds the hostname, and another instance variable which also holds the hostname. There are two places where the instance hostname can be set, and the static hostname gets set on the following line. One page later appears the line:

EmbeddedKowariServer.setBoundHostname(this.getHost());
Of course, this just sets the static hostname to the instance hostname, which is what it was already set to. Someone wasn't paying attention here.

Now I was looking for the part of the code which would give me the server's URI, and I could do that by checking which parts of the code referred to the hostname. Unfortunately, silly things like I just described make it annoying to track down every part of the code which refers to the hostname.

Anyway, I finally found what I wanted in the constructor for RmiServerMBean (wow, I haven't seen MBeans since I implemented the JMX framework for Enhydra in 2000). However, there is no simple way to get the URI from this object, mostly because the code where I need it (RmiSessionFactory and RemoteSessionFactory) doesn't know anything about the server MBean that launched them. (Not a very clean MBean design either. These things are supposed to provide instrumentation and control for other objects, not actually implement a service.)

I spoke with SR about this problem (among other things - he really hates EmbeddedKowariServer. I mean, who ever heard of a database that contains an HTTP server? It should be the other way around! Kowari would be a lot smaller and more modular then). We eventually agreed that since a server will have only one public URI, then it can be kept statically, and we can just put it in a central location (like EmbeddedKowariServer) to allow anyone to find it.

Exceptions
The other trick is working out where to do the test of the remote server (to check if it is local), and what to do in the event of a failure.

The thing that seems the most logical is to do the test in RmiSessionFactory just after it has obtained a RemoteSessionFactory (I alluded to this above). If it turns out that the RemoteSessionFactory is on the same server, then it should be closed, and an exception thrown. The SessionFactoryFinder.newSessionFactory() code which called the RmiSessionFactory can then catch this exception, log a warning that a non-standard name was used, and then use a local SessionFactory instead.
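The flow described above can be sketched roughly as follows. All of the types here (Factory, SimpleFactory, SameServerException, RemoteConnector, Finder) are stand-ins invented for this sketch; the real RmiSessionFactory and SessionFactoryFinder have different signatures, and the unfinished code may end up looking quite different.

```java
import java.net.URI;
import java.util.logging.Logger;

// Hypothetical exception: the "remote" server turned out to be this one.
class SameServerException extends Exception {
    SameServerException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a session factory that knows its server's URI.
interface Factory {
    URI getServerUri();
    void close();
}

// Trivial implementation so the sketch can be exercised.
final class SimpleFactory implements Factory {
    private final URI serverUri;
    boolean closed = false;

    SimpleFactory(URI serverUri) {
        this.serverUri = serverUri;
    }

    public URI getServerUri() {
        return serverUri;
    }

    public void close() {
        closed = true;
    }
}

// Plays the role of the check made just after the remote factory is obtained.
final class RemoteConnector {
    static Factory connect(Factory remote, URI localServerUri) throws SameServerException {
        if (localServerUri.equals(remote.getServerUri())) {
            remote.close(); // do not keep a connection back to ourselves
            throw new SameServerException("Remote server is the local server: " + localServerUri);
        }
        return remote;
    }
}

// Plays the role of the caller that catches the exception and falls back.
final class Finder {
    private static final Logger LOG = Logger.getLogger(Finder.class.getName());

    static Factory newFactory(Factory remote, Factory local) {
        try {
            return RemoteConnector.connect(remote, local.getServerUri());
        } catch (SameServerException e) {
            LOG.warning("Non-standard server name used, using local factory: " + e.getMessage());
            return local;
        }
    }
}
```

Throwing from the connection step keeps the fallback decision in the caller, which is the only place that has a local factory to hand back.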

I made a start on this code, but it was Friday afternoon, so I wasn't going to pretend to myself that I wanted to finish this before Monday. :-)

Oh, and I looked a little at the documentation of iTQL for OWL predicates, but that was all in the morning.
