Thursday, September 29, 2005

Work
Since I'm working on commercial software now I'll be doing a lot of logging on the company's internal Wiki instead of here. I'll continue to talk about Kowari or study in here, but there are only so many hours in the day!

Remote Servers
As part of what I'm doing this week, I need to talk to a Wordnet RDF set. With my poor little notebook struggling on all the tasks I'm giving it, I figured that it made more sense to put the Kowari server on my desktop machine (named "chaos"). Unfortunately, I immediately hit a problem with RMI that had me stumped for some time today.

Starting Kowari on the desktop box worked fine. Querying it on that box also worked as expected. But as soon as I tried to access the server from my notebook I started getting errors. Here is what I got when I tried to create a model:

create <rmi://chaos/server1#wn>;
Could not create rmi://chaos/server1#wn
(org.kowari.server.rmi.RmiSessionFactory) couldn't create session factory for rmi://chaos/server1
Caused by: (ConnectException) Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused
Caused by: (ConnectException) Connection refused
My first response was confusion at the connection attempt to 127.0.0.1. Trying to be clear on this, I changed the request to talk directly to the IP address:
create <rmi://192.168.0.253/server1#wn>;
Could not create rmi://192.168.0.253/server1#wn
(org.kowari.server.rmi.RmiSessionFactory) couldn't create session factory for rmi://192.168.0.253/server1
Caused by: (ConnectException) Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused
Caused by: (ConnectException) Connection refused
I started to wonder if this was a problem with a recent change to Kowari's code (which was a scary prospect), and started looking more carefully at the code and the logged stack traces.

The clue came from the client trace:
Caused by: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
java.net.ConnectException: Connection refused
at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:567)
at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:185)
at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:101)
at org.kowari.server.rmi.RemoteSessionFactoryImpl_Stub.getDefaultServerURI(Unknown Source)
at org.kowari.server.rmi.RmiSessionFactory.<init>(RmiSessionFactory.java:132)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:274)
at org.kowari.server.driver.SessionFactoryFinder.newSessionFactory(SessionFactoryFinder.java:188)
... 13 more
Caused by: java.net.ConnectException: Connection refused
So the problem appears to be a connection to the local system, which isn't running a Kowari instance, so it fails. The error was occurring in the RmiSessionFactory constructor, but this seemed OK, and the stack above and below it in the stack trace was all Sun code. So what was happening here?

The relevant code in the constructor looked like this:
  Context rmiRegistryContext = new InitialContext(environment);
  // Look up the session factory in the RMI registry
  remoteSessionFactory =
      (RemoteSessionFactory) rmiRegistryContext.lookup(serverURI.getPath().substring(1));
  URI remoteURI = remoteSessionFactory.getDefaultServerURI();
The failure happens on the last line here.

What is the process, and how is this failing? Well, it starts by looking up a name server to get an RMI registry context. The important thing to note here is that this works. Since the RMI registry is running on the server rather than the client, we know that it spoke to the remote machine and didn't try to use 127.0.0.1. So far, so good.

Next, it pulls apart the path from the server URI and looks for a service in the RMI registry with this name. In this case the name is the default "server1", and the service is a RemoteSessionFactory object. This also works.

The problem appears on the last line when it tries to access the object that it got from the registry. For some reason this object does not try to connect to the machine where the service is to be found, but instead tries to access the local machine. So somehow this object got misconfigured with the wrong IP address. How could that happen?

Since nothing had changed in how Kowari manages RMI, I started to look at my own configuration. Once I saw the problem, I realised how obvious it was. Isn't hindsight wonderful? :-)

Nameservers
Once upon a time I ran Linux full time on Chaos. This meant that I could run any kind of service that I wanted, with full time availability. One of those useful services was BIND, allowing me to have DHCP dynamically hand out IP addresses to any machine on my network, and address them all by name. Of course, BIND passed off any names it hadn't heard of to higher authorities.

However, obtuse hardware, Windows only software, and expensive VM software that suddenly stopped working one day (it died after the free support period ended, and no, I can't afford support), slowly took their toll. I finally succumbed and installed that other OS.

Once Chaos started rebooting, I could no longer rely on it for DHCP or BIND. DHCP was easily handled by my Snapgear firewall/router, but I was left without a local nameserver.

New computers to my network are usually visitors wanting to access the net. This doesn't require them to know my local machine names, nor do my other machines need to access them by name. So I figured I could just manually configure all of my local machines to know about each other and I'd be fine. This is where I came unstuck.

The problem was that I had the following line in /etc/hosts on Chaos:
127.0.0.1  localhost chaos
I thought this was OK, since it just said that if the machine saw its own name then it should use the loopback address. I've seen countless other computers also set up this way (back in the day when people still used host files). For anyone who doesn't know, 127.0.0.1 is called the "loopback address", and always refers to the local computer.

This confused RMI though. When a request came in for an object, the name service sent back a stub that was supposed to connect to a remote machine named "Chaos". However, to prevent the stub from looking up the name server every time, it recorded the IP address of the server instead of the server's name. In this case it looked up /etc/hosts and discovered that the IP for that "chaos" was 127.0.0.1. The object stub then got transferred across the network to the client machine. Then when the client tried to use the stub, it attempted a connection to 127.0.0.1 instead of to the server.

The fix was to modify /etc/hosts on Chaos to read:
127.0.0.1  localhost
192.168.0.253 chaos
So now the stub that gets passed to the client will be configured to connect to 192.168.0.253. This worked just fine.
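A quick way to spot this kind of misconfiguration is to ask the JVM on the server what address it resolves for its own host name. The following is only a minimal sketch using the standard java.net.InetAddress class; the class name LoopbackCheck and the method resolvesToLoopback are made up for illustration and are not part of Kowari. It assumes, as described above, that the address embedded in the stub comes from the server's own name lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Kowari: reports whether a host name
// (or IP literal) resolves to a loopback address such as 127.0.0.1.
class LoopbackCheck {

    static boolean resolvesToLoopback(String host) {
        try {
            return InetAddress.getByName(host).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Cannot resolve " + host, e);
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Resolve this machine's own name, which consults /etc/hosts or DNS.
        InetAddress local = InetAddress.getLocalHost();
        System.out.println(local.getHostName() + " -> " + local.getHostAddress());
        if (local.isLoopbackAddress()) {
            System.out.println("Warning: this host's name maps to the loopback address;"
                + " remote clients may be handed an address they cannot reach.");
        }
    }
}
```

Run on Chaos with the original /etc/hosts, a check like this would be expected to print the warning; with the corrected file it should report 192.168.0.253 instead.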

So now I know a little more about RMI. I also know that if I ever get any money, I really want a spare computer so I can boot up Windows and not have to take my Linux server offline to do it.

Tuesday, September 20, 2005

Logic
Work is keeping me busy in Chicago at the moment. That's not to say that I'll write more when I get home, but I hope I will.

Yesterday I was in a meeting being told about the semantic products provided by Exeura. Exeura is a commercialization spin off from the University of Calabria, so many of the products are using new technologies that are not fielded commercially elsewhere yet. In the course of the meeting, one of the products was described as being based on "Disjunctive Logic". When queried on this, the explanation was that Disjunctive Logic is a type of logic used commonly around the world.

Now I know there are a lot of logic families, but I hadn't heard of this one. So I went looking. Funnily enough, most of the useful references I found were publications from the very people at Exeura. That's not to say they invented Disjunctive Logic, but they are some of the co-authors of the DLV project, which is one of the only disjunctive logic processing systems. I'll confess that this made me a little cautious about its readiness for commercialization, but I'll have to reserve judgment until I see it.

I looked up Disjunctive Logic to try to learn what makes this branch of logic special. The first paper I decided to read was co-authored by the same people at Exeura, so at least I knew I'd be looking at the same thing. Almost straight away I saw the following:
Disjunctive logic programs are logic programs where disjunction is allowed in the heads of the rules and negation may occur in the bodies of the rules.

That's when the little light came on for me!

Everything I read over the next few pages was exactly what I expected to read. This is because I've been running into this all the time recently. I've been using cascading equations in ordinary description logic in order to avoid disjunctions in the heads, and here is a tool that is specifically designed to allow for that. I still have to read all the details, but I can see how useful this could be.

Sudoku
The first example of disjunctive logic that I can think of is in a non-trivial game of Sudoku. I first played this 2 weeks ago, discovering quickly that it was just a simple logic puzzle. I expect that most programmers were like me, and immediately started thinking about how to solve the puzzle with a computer program. It seems to come down to 3 simple rules, and I've noticed that the "harder" the puzzle (according to the rating system in the book I have), the more of the rules you have to employ in order to solve it.

I use the name "group" for all the squares in a 3x3 grid, a row, or a column. I'll do that here for clarity.

It's my third rule which is relevant to disjunctive logic. It states:
If there are n squares in a group which contain exactly the same n possible numbers, then those numbers are not possibilities in any other square of that group.

(Actually, there's a corollary: If there are n squares in a group which contain at least the same n possible numbers, and those n numbers appear in no other squares of the group, then any other possible numbers in those n squares may be eliminated)

So what is this saying? For instance, consider a column with several empty squares. For two of these squares we've determined that they could only be the numbers 2 or 3. That means that we know that if the contents of one square is a 2, then the other must be a 3, and vice versa. We don't yet know which way around they go. However, this is enough to tell us that no other square can be a 2 or a 3. If there were another square which had been narrowed down to being a 2 or a 5, then the 2 can be eliminated, letting us put a 5 in there.

So even though we only had partial information of the contents of 2 squares (we knew the numbers, but not which way around to put them), it was still enough to tell us how to fill in another square. This is a case of disjunctive logic, as our result (the head of an equation) was an OR between two possibilities, but this was still enough to solve for the body of the equation.
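The third rule can be sketched in a few lines of Java. This is only an illustration under my own assumptions: a group is modelled as a list of candidate sets, one per square, and the class name SudokuGroupRule and its methods are invented for the example. It covers the rule itself, not the corollary.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the "n squares sharing exactly n candidates" rule.
class SudokuGroupRule {

    // Convenience for building a mutable candidate set.
    static Set<Integer> candidates(Integer... numbers) {
        return new HashSet<>(Arrays.asList(numbers));
    }

    // Applies the rule once to a single group (row, column or 3x3 grid).
    // Each element of the list is the set of possible numbers for one square.
    // Returns true if any possibility was eliminated.
    static boolean eliminate(List<Set<Integer>> group) {
        boolean changed = false;
        for (Set<Integer> shared : group) {
            int n = shared.size();
            // Count the squares holding exactly this candidate set.
            int matches = 0;
            for (Set<Integer> other : group) {
                if (other.equals(shared)) matches++;
            }
            if (matches != n) continue;
            // n squares share exactly these n numbers, so no other
            // square in the group can hold any of them.
            for (Set<Integer> other : group) {
                if (!other.equals(shared)) {
                    changed |= other.removeAll(shared);
                }
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // Two squares are {2,3}; a third is {2,5}, as in the example above.
        List<Set<Integer>> column =
            Arrays.asList(candidates(2, 3), candidates(2, 3), candidates(2, 5));
        eliminate(column);
        System.out.println(column.get(2));
    }
}
```

Note that with n = 1 the same code handles the trivial case of a solved square removing its number from the rest of the group.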

OWL
This also works with OWL, particularly the cardinality questions I was struggling with some time ago.

If a class has a restriction:

  <owl:Class rdf:ID="MyClass">
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:ID="MyOtherClass"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#myProperty"/>
        <owl:maxCardinality>2</owl:maxCardinality>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:Class>
Then objects of type MyClass can refer to at most two distinct objects with the myProperty predicate. So if I have the following:
  <namespace:MyClass rdf:about="#myClassInstance">
    <myProperty rdf:resource="namespace:A"/>
    <myProperty rdf:resource="namespace:B"/>
    <myProperty rdf:resource="namespace:C"/>
  </namespace:MyClass>
Then I know that two of A, B or C must be the same thing. For example, if A and B are the same:
  <rdf:Description rdf:about="namespace:A">
    <owl:sameAs rdf:resource="namespace:B"/>
  </rdf:Description>
Of course, the possibilities are: A=B or A=C or B=C. (where I'm using = like owl:sameAs).

So this is a similar situation to the Sudoku example. We know there are only 2 objects, but we have 3 labels for them. That means that we have partial information, but like the Sudoku example, the partial information is still useful.
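To make the partial information concrete, here is a rough Java sketch that lists the alternatives implied by the restriction. The class name SameAsDisjunction and the method disjuncts are my own invention, and it makes no attempt to be an OWL reasoner: it assumes nothing is yet known about which labels are the same or different, and it only states that at least one of the returned pairs must hold (which follows whenever there are more labels than the maximum cardinality allows).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: enumerate the owl:sameAs alternatives implied by
// an owl:maxCardinality restriction when too many labels are present.
class SameAsDisjunction {

    // Returns the candidate "X=Y" pairs, at least one of which must be true.
    // If the labels fit within maxCardinality nothing is implied, so the
    // result is empty.
    static List<String> disjuncts(List<String> labels, int maxCardinality) {
        List<String> pairs = new ArrayList<>();
        if (labels.size() <= maxCardinality) {
            return pairs;
        }
        for (int i = 0; i < labels.size(); i++) {
            for (int j = i + 1; j < labels.size(); j++) {
                pairs.add(labels.get(i) + "=" + labels.get(j));
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> labels = new ArrayList<>();
        labels.add("A");
        labels.add("B");
        labels.add("C");
        // Three labels with a maximum cardinality of 2.
        System.out.println(String.join(" or ", disjuncts(labels, 2)));
    }
}
```

For the three labels above this lists A=B, A=C and B=C, matching the possibilities given earlier.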

The question is, how do I process this partial information? Until today I had no idea. Now it appears that Disjunctive Logic was specifically designed for this situation. :-)

Of course, this is only relevant if there is useful processing to be done. Unlike Sudoku, OWL can say lots of things with no consequences. For instance, using a predicate more often than specified in an owl:maxCardinality restriction will not create an invalid document unless there are sufficient owl:differentFrom statements to differentiate the objects. It is impossible to violate owl:minCardinality unless the range of the property is too restricted in number (an uninstantiable class, or a class of owl:oneOf). I've talked about this in the past.

So with such an open system, will the extra processing allowed by Disjunctive Logic actually gain me anything? I'm not sure yet. Give me some time to find out.

Monday, September 05, 2005

Work in the USA
There is no technical content here at all, so I hope you're not expecting any. Hey, these are my notes, so I can write about whatever interests me, right? :-)

After my last day with NGC, I still had some things to do before I went home.

Some months ago, Herzum Software from Chicago got in touch with me about doing some work with them. They'd previously been doing business with Tucana, and were interested in Kowari and the inferencing work I've been doing. After several phone calls, etc, these discussions turned into an offer of work, based in Chicago.

I thought about this for a while. There were a lot of reasons to turn them down. Things have been going really well for me this year, and I've been enjoying the opportunity to work for myself, and concentrate on those areas of interest to me. I've also been looking forward to working with Andrae on full time Kowari work. I love the lifestyle in Brisbane, and I've been earning enough that we have been living quite comfortably. Considering this, it seemed like an unnecessary move, particularly with a second baby on the way. In Brisbane we have friends and relatives nearby who can help out.

On the other hand, contracting has its downsides. There is always the concern of finding the next job, and the bank being unwilling to finance the move to a bigger house (no more expensive than what we have, but Australian banks won't talk to someone without a guaranteed income).

I've also wanted the experience of permanent work overseas, and it would be much better to do it while the children are young. The requirement of a visa means that I'd have to work for someone else, no matter how much I like my current independence. So it seemed that it would be worthwhile considering their offer.

There were also a couple of opportunities on the east coast of the US. I was interested in these, partly because of proximity to other large companies involved in semantic web technologies, MINDSWAP, friends like DavidW, and quick trips to Europe. However, Anne liked the idea of an architectural city like Chicago, and everyone we spoke to had great things to say about the place (except for the cold in winter).

After swinging back and forth on the idea, I decided to visit Herzum at the end of my trip to help me work out what I wanted to do.

Chicago
This time I flew out of BWI (I couldn't have handled another trip to Dulles during peak hour). I killed a few hours in the bar there with one of the guys from NGC (thanks Clay!), before catching my flight to Midway. As usual, the plane was delayed through bad weather.

Herzum keep their own apartment very close to their office (both are located in the "Loop" in the city), and they'd offered to put me up there while I visited. This was my first opportunity to meet the CTO, Bill, whom I'd been conversing with for some time. He was every bit as hospitable as our conversations had led me to believe.

I spent the Friday talking with the developers, discussing what it is that they do, while also discussing aspects of RDF and semantics that may be of benefit to them. In general, I was impressed with everything I heard.

There did appear to be a singular focus on the design paradigms set out in the book written by the company founder Peter Herzum (who was travelling at the time, and not available to meet, unfortunately). However, it also made sense that this would be the case.

I got an opportunity to have lunch with everyone there, and a few of them took me out to dinner nearby as well. Afterwards I was taken on a quick walk up to the Tribune building (quite a famous landmark, and the inspiration for many scenes from "Batman" comics), and got to see some of the other architecture on the way back to the apartment. So far I was having fun. :-)

After a slow start on Saturday, I met with one of the guys to visit "Bodyworks" at the museum, though the tickets were sold out. I still enjoyed getting to see a lot of the sights as we drove about, and we finally went back into town to see the John Hancock building. We had a drink at the bar while admiring the view, and then met another developer to have a late lunch. We then caught a movie and finally headed on home.

I'm glossing over all of this, but I had a great time. More importantly, I quite enjoyed the conversation of the people I was with. They all seemed quite intelligent, and demonstrated a great knowledge of their field. I could definitely learn something from each of them. It is important to me for my co-workers to have these qualities, and it is the reason I enjoyed working at Tucana so much. This reason alone was enough for me to give greater consideration to the position.

Wandering Sunday
I got back to the apartment reasonably early on Saturday evening. Bill had left that day to fly home to California, so I was on my own. I was missing my family and didn't want to be on my own, so I decided to go out to find a bar. I did my best to speak with people in the hope that my accent would land me in some interesting company. :-) This worked out OK, and I met up with quite a few people. I even stopped in at a McDonalds late at night for a snack on the way home (I was already gaining weight on this trip, so why fight it?) ;-) That mightn't seem like a big deal, but I don't normally eat at places like that, and Anne would have roused on me. As a point of trivia, I discovered later that this McDonalds was one of the largest in the world (It did sort of seem large at the time!)

Sunday morning I was supposed to meet up with Luigi, the VP at Herzum, who had just arrived back in town. He had to catch up with family, so he suggested that I go up to Lincoln Park to look around at a potential place to stay if we move over (I don't know if we can afford it, but it was worth a look anyway). I walked down to "State and Lake" to catch a train north, and then walked east towards the park.

I tried dropping into a general store for a light snack for lunch, but I was disappointed to discover nothing more nutritious than Twinkies and crisps. I really hope that place wasn't indicative of the general standard of snack food in the States!

I started with the Conservatory at the park, and then moved into the zoo. It was an enjoyable walk, but warmer than I expected Chicago to be. Not as hot as Brisbane gets, but still uncomfortably warm (why did I go into the Conservatory on a day like that? I must have been slightly crazy from walking in the heat!).

Walking along the lake back down to the city I saw just how many people go to the "beach" on a hot day in Chicago. It looked quite inviting, despite the lack of surf. I've never seen anything like the great lakes before, and this more than anything else made Lake Michigan look to me like an "inland sea".

That night I got to meet Luigi for the first time as we went to a nearby tapas bar for dinner.

Home Stretch
Monday was a little quieter. After packing, I went down to Herzum's office where I spent my morning talking with some of the staff and looking at Peter's book. Luigi went through some of the details of the offer, but at this stage I wasn't sure that it looked all that good. Besides, I was tired and missing my family, so I didn't trust myself to make a decision (though any decision to take the job would need agreement from Anne!).

Steve, the CFO, took me to lunch, and then I was off to the airport to get home (thanks to Dan for making sure I caught the right train to get to Midway on time).

I met some nice people on both flights home. On the Midway-LA flight I met an Ironman triathlete (I look up to these guys) who told me about the XTerra series, one of which she had just competed in. And crossing the Pacific I met a Melbourne lady I've now become friends with whose husband is about to take a job in the Triangle area in North Carolina (one of the few places in America that I've been to).

Job
Since getting home I've spent quite a bit of time just trying to get over jet lag and spending time with Luc. This is one of the reasons why my blogging has been so sporadic recently.

After some further talks with Luigi, Anne and I decided that I should take the job. This is a big deal, and has us both a little intimidated!

To start with, I'm a contractor working remotely with Herzum until I can get a visa to become a full time employee. Fortuitously, Australians are now eligible for an E-3 visa, instead of the old H-1B, which should make things a little easier for us (it will let Anne return to work eventually). All the same, the visa application will take a little while, so we have to wait while that comes through.

Part of my agreeing to the position was that we wouldn't have to move until January. This is just because the baby is due on the 3rd of November. For reasons of both cost and family support we definitely want to have it here in Brisbane. We're told that we shouldn't be making a major move with a newborn for at least 6 weeks, which takes us right into Christmas. Even if we wanted to move then (which we don't), it would be insane to try it during the peak travelling season. So we'll spend a couple of weeks seeing each of our families over Christmas (they will want to spend time with the grandchildren before we go overseas) and then move to Chicago in the first week of January. What a wonderful time of year to move to Chicago! :-)

In the meantime, I'm learning as much about Herzum as I can, and also looking into some of the software I may be working with. I'll continue to blog, but since I'll be working with commercial systems I may have to narrate in generalities. I don't expect things to be a real problem, as part of my work will be on open source systems (such as Kowari, or Lucene) so I should still be able to make comment on what I find.

I already have a few things to say about what I've been reading lately, but given the current late hour, I'll leave that for another time.