Tuesday, November 13, 2007


After writing about complex behavior emerging from networks of simple non-linear elements this morning, I read Slashdot this evening to see a story on just that topic. Strange.

Other than that I worked to get the new interpreter system working against the existing test suite. It's mostly there, but there are still a few bugs left.

Ironically, the transaction bug of the day was occurring in a section of code where I was doing a lot of testing to see exactly what command had been issued, and responding accordingly. Fortunately, I have an AST that works well for me, and after staring at the code for 10 minutes I suddenly realized that all the problems would go away if I used the same code for each type of command. Consequently, 12 lines turned into 2, and all the bugs went away. Thank goodness for being able to call into a clean interface design.

This isn't the first time that I've fixed a problem by removing code. Sometimes I wonder if real software engineering is about removing lines of code rather than inserting them. Pretty much destroys the "Lines of Code" metric that some companies like to employ (word to the wise - don't work for these companies).
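The shape of that simplification can be sketched like this (a hypothetical illustration only - the actual interpreter wasn't written in Python, and the command names and `Interpreter` interface here are invented):

```python
# Hypothetical sketch: collapsing per-command branches into one code path.

class Command:
    def __init__(self, kind, ast):
        self.kind = kind  # e.g. "insert", "delete", "query"
        self.ast = ast    # the parsed syntax tree for this command

class Interpreter:
    """Stand-in for the clean interface behind the commands."""
    def run(self, ast):
        return ("executed", ast)

def execute_branchy(cmd, interp):
    # The buggy shape: one near-identical branch per command type,
    # so the branches can silently drift apart as the code evolves.
    if cmd.kind == "insert":
        return interp.run(cmd.ast)
    elif cmd.kind == "delete":
        return interp.run(cmd.ast)
    elif cmd.kind == "query":
        return interp.run(cmd.ast)
    raise ValueError(f"unknown command: {cmd.kind}")

def execute_uniform(cmd, interp):
    # The fix: every command type goes through the same code,
    # delegating the differences to the interface behind it.
    return interp.run(cmd.ast)
```

The point of the sketch is that `execute_uniform` can only exist because the interface underneath is uniform enough to absorb the differences between commands - which is exactly why a clean interface design pays off.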


I'm unable to speak above a hoarse whisper today (a fact that my two year old son is delighting in) due to some kind of virus. I'm over the worst of it, but I'm a little lightheaded, so if this post rambles more than usual you'll know why. :-)

OWL... Again

I'm much further into the 2nd Edition of the Description Logic Handbook (and, to my deep satisfaction, wading through stuff I already know), and can see some interesting stuff coming up in the next few chapters. I'm also learning some interesting points from the discussions that go on in the OWL developers list. And it strikes me once again that something is wrong here.

If it takes professional developers, and even professional academics, this long to get a real handle on how OWL works, then how on earth can we expect the rest of the world to get it right? The idea of the Semantic Web is to link data from lots of different sources, but that implies we need lots of people out there who can structure their data in a way that allows the linking to be consistent (and I'm using the logical meaning of "consistent").

At the same time, in order to create a semantic web, we need precise descriptions of things, and that implies Description Logic. The inventors of OWL were not trying to be obtuse - indeed, I think they desired the opposite effect. However, years of Description Logic research have led to an understanding that seemingly insignificant details in a language can have dramatic effects. So OWL had to be carefully built and constrained to prevent the future semantic web from shooting itself in the foot. But this leads directly to a language of horrible complexity, with subtle rules that occasionally catch even the experts off guard.
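One well-known example of this kind of subtlety: OWL DL only permits cardinality restrictions on "simple" properties - properties that are neither transitive nor have a transitive sub-property - because combining transitivity with number restrictions makes reasoning undecidable. In Description Logic notation (the class and property names here are made up for illustration):

```latex
% Each axiom is harmless on its own:
\mathsf{Trans}(\mathit{hasAncestor})
  \qquad \text{(hasAncestor is transitive)}
\mathit{Royal} \sqsubseteq\; \leq 2\; \mathit{hasAncestor}.\top
  \qquad \text{(at most two ancestors)}

% Together they fall outside OWL DL: number restrictions on
% non-simple (transitive) properties make the logic undecidable.
```

Neither axiom looks dangerous by itself, which is exactly the trap - the restriction only makes sense once you know the undecidability result behind it.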

So what's the solution? Well for the moment, the industry is doing what it always does. It muddles through using what expertise the developer community has, and incrementally drags itself up to greater consistency (hopefully) and complexity (certainly). It's hardly ideal, but then, it's no different to what usually happens with software. This is why Windows used to blue-screen all the time, and why I'm unable to run Windows XP in Parallels without Leopard losing the ability to start new programs or kill off old ones (I'm really hoping Apple fix that one!). It leaves me concerned about the wisdom of this approach.

On the other hand, there seems to be little alternative to this kind of design if we want to design for semantics. OWL is simply a representation of an underlying mathematics that is fundamental to what we are trying to represent. But if it turns out to be too complex to design this stuff as a community (I believe individuals are capable of it, but there aren't enough of them to make a "web" out of the semantics), then that means we can't really design this at all. But we know that semantics are possible, since our brains deal with them, and our brains are little more (ha ha) than enormous networks of simple, non-linear elements. There are general guidelines - functional areas like the prefrontal cortex for higher thought, and the amygdala for initiating emotional responses - but the details can vary dramatically, even between identical twins, and as we grow and learn the network adapts and modifies itself. In other words: build a large enough network of simple constructs, following some general design guidelines, and leave the details to sort themselves out.

Despite the randomness (neural network theory has even demonstrated that randomness is essential), and despite the lack of detailed "design", the brain is the only instrument we currently have that can process semantics. Almost all of its processing capability comes about as an emergent property of simply building up a large enough network of interacting elements. So maybe the idea of the semantic web isn't so far-fetched after all. We just need to get things mostly right at a local level, and when we link it all together something special will emerge. I don't think this is what the proponents of the semantic web had in mind when they first set out, but it might be what we end up with.

We are already seeing emergent properties coming out of networks that hit some critical mass. This is the effect behind Web 2.0 - whatever that means. And that is the point here. The label "Web 2.0" is a recognition of something that has "emerged" from these networks when connected with the right technologies. Because it wasn't explicitly designed, it's hard to pinpoint exactly what it is, but most people in the industry agree that it's there - even if they don't agree on where its boundaries lie.

Having semantics emerge rather than being designed in would seem to be a natural extension of what we're seeing now, especially when we are getting partial semantics in small systems already (courtesy of such technologies as OWL). But is there enough structure, and is it of the correct type for true semantics to finally emerge from the network?

OK, now I'm just going off on a wild tangent. At least I didn't look at the whole OWL problem and give up on it today. Perhaps our partial and not-quite-correct systems will have a part to play in a larger network.