Day 2
Once again, I'm writing a lot of my own thoughts here, rather than just trying to keep track of everything that happened at the conference. If you're interested in that stuff, then I'm sure lots of people wrote about it. For me, the conference got me thinking about a lot of things, which is my main reason for writing.
Random Thoughts
This doesn't mention any sessions at all. I just had a series of thoughts on various topics as they got mentioned at the conference, and thought I'd note them down.
AllegroGraph
My last couple of posts mentioned AllegroGraph and immediately started diving into the Mulgara implementation. One justification for this is that such a well advanced system inspired me to examine where Mulgara is, and where we want to take it. However, on some level I confess to feeling a little jealous. After all, we had the chance to be even more advanced than this, but lost it. However, we still seem to have a few unique features, and have more coming (which I'm sure you'll hear about soon).
From a philosophical point of view, having high quality open source infrastructure will help enable the Semantic Web to progress in the same way that it helped the Web and Web 2.0 to develop. Even when a project demands access to some of the features that only commercial vendors provide, an open source alternative can help bootstrap a project in its early phases, and provides competition for all to benefit from.
So while I may feel a little jealous, I'm hoping this will help inspire me to get Mulgara to do more. It can't just be Andrae and me though. Fortunately, this conference appears to have helped us there. We may be getting some more help from places like Topaz, but I'm also hoping to lift the profile of the software. After all, the more people who use it, the more likely it is that someone will need to scratch an itch and submit some code!
Codex/DLV
The core of the fourth codex suite is the Codex system. This is the new commercial name for OntoDLV [PDF]. This blog isn't an advertisement for my current employer, but it's worth mentioning the system all the same, as I find it interesting.
One of the features of Codex is to translate its ontology language into a program that can be run on DLV, a disjunctive logic reasoner. DLV offers an interesting set of features, including closed world reasoning, true negation along with negation as failure, non-monotonicity, and disjunctions in the head of a rule. Some of these are the opposite of OWL reasoning (closed world, negation as failure, non-monotonicity) while the disjunctive feature is completely orthogonal. Building an ontology language on top of this reasoner creates a very different kind of modeling environment to OWL. Interestingly, the guys in Italy have recently been looking at the consequences of importing OWL queries into the "rules" part of the OntoDLV language. This would make for an interesting complement of features.
OWL is built the way it is because of the interactions with the World Wide Web that it was designed to describe. An open world assumption is vital on the web, as things are being added all the time. Being monotonic may not work perfectly (as different sources may disagree) but is still important, as there is no way to assert priority of one source over another. The lack of a unique name assumption is also important, as it is very common that things are referred to by more than one name. In fact, it occurs to me that having a truly open world assumption means that you must not have a unique name assumption, particularly if there is an equality or equivalence operation in a language. Now that I think about it, this makes it seem strange that many description logics have an open world assumption, but also presume the unique name assumption.
Contrary to the properties of the web, the corporate environment is often quite different. It may be because "the enterprise" environment is a consequence of decades of training to conform to existing systems, such as relational databases, or because the type of modeling needed in this environment is inherently different. Whichever way it is, corporations model, collect, and think about data in a particular way, most of which doesn't work well with OWL. Records are often taken over and over again, with the same structure each time (yes, corporations may have several nearly identical databases with different record structures, and OWL can help here, but bear with me). Entities have been allocated unique identifiers, and do not need to be "declared" to be distinct. Most importantly, if a company does not know about data, this almost invariably means that such data does not exist (e.g. an employee record cannot exist until it has been entered by someone authorized to do so; accountants don't want to know about possible future employees).
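The employee example can be put in code. This is only a minimal sketch of the two assumptions, not how Codex, DLV, or any OWL reasoner is implemented; the record set and function names are hypothetical and exist purely for illustration.

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical toy data: the only employee records the company holds.
KNOWN_EMPLOYEES = frozenset({"alice", "bob"})


def is_employee_closed_world(name, known=KNOWN_EMPLOYEES):
    """Closed world: anything that is not recorded is taken to be false."""
    return Answer.TRUE if name in known else Answer.FALSE


def is_employee_open_world(name, known=KNOWN_EMPLOYEES,
                           known_non_employees=frozenset()):
    """Open world: a missing record proves nothing either way."""
    if name in known:
        return Answer.TRUE
    if name in known_non_employees:
        return Answer.FALSE
    return Answer.UNKNOWN
```

Asking about someone with no record gives FALSE in the closed world version and UNKNOWN in the open world one. The accountants want the first answer; the web needs the second.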
While listening to people at the conference I started to recognize that people want each of these different properties, depending on their environment. It seems that Codex has something to offer many people, especially with its ability to import OWL data.
Reasoning
Another thought that I had while listening at the conference was that OWL reasoning is often done in a semi-closed way. Inferences are usually made in a way that accepts the possibility of unknown facts (open-world), but it can necessarily only reason on the data that it knows. This has a closed world flavor for me, though it is definitely kept consistent with the open world model.
There are lots of things that might be true that we don't consider when reasoning. No one ever declares that all individuals are in fact distinct (when it is known that they are). Classes are never declared to be disjoint, unless there is a specific reason for it (such as sharing a common superclass).
I'm not saying that people make inferences that presume unique names, nor that they do anything which would be invalid were two things revealed to be identical. The math behind valid inferences is too solid for that. But people see reasoning happen on the limited data they provide, and tend to think of that data as being their entire Universe of Discourse (I've wanted to use that phrase again for years). Consequently, people are repeatedly making closed world presumptions, and wonder why inferences and calculations don't come up with the answers they expected.
We also never see the "possible worlds" that a current model allows for. Given the infinite nature of the open world presumption, these possibilities might surprise people. In some cases, the model may prove to be completely inconsistent with reality, indicating that more modeling information is required for the system to be really useful. I know that people like Ian and Bijan know how all this works intimately, but most people are surprised when they see owl:cardinality apparently violated, so this sort of consideration is way beyond where most people are thinking.
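To show why a cardinality restriction can look violated without being so, here is a small sketch of the "at most one value" half of the problem. It is a toy of my own, not the algorithm any real reasoner uses, and the function name, parameters, and status strings are all hypothetical.

```python
from itertools import combinations


def check_at_most_one(values, different_pairs=frozenset(), unique_names=False):
    """Check a property restricted to at most one value.

    values          -- names seen as values of the property for one subject
    different_pairs -- frozenset pairs explicitly declared owl:differentFrom
    unique_names    -- True to assume that distinct names mean distinct things
    Returns (status, inferred_same_as_pairs).
    """
    names = sorted(set(values))
    pairs = list(combinations(names, 2))
    if not pairs:
        # Zero or one name: nothing to reconcile.
        return "consistent", []
    if unique_names:
        # Two names are two things, so the restriction is broken.
        return "violation", []
    if any(frozenset(pair) in different_pairs for pair in pairs):
        # The names must co-refer, yet some are declared different.
        return "inconsistent", []
    # No unique name assumption: the names are inferred to co-refer.
    return "consistent", pairs
```

Without the unique name assumption, two names for the value simply lead to the inference that they denote the same individual. Only an explicit owl:differentFrom turns the situation into an inconsistency. The "at least" half behaves in a similar way: seeing fewer values than required is not an error in an open world, since the missing ones may simply not be known yet.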
I suppose I had this thought partly because almost no one declares every individual to be owl:differentFrom every other distinct owl:Thing. Similarly, classes are rarely declared as owl:disjointWith unrelated classes. Sure, Woman may be declared disjoint from some other class, but I've yet to see classes like BillingCenter get declared as disjoint from one another. This becomes more noticeable when you consider the differences between a closed world reasoner (like Codex) and an open world one. It's not that an open world ontology is any less expressive, it's just that you need more information to encode the same scenario. The funny thing is that people often have this information, but they don't put it in. Consequently, open world reasoners don't come up with as many inferences, not because they are less capable, but because they don't know as much about their system as a closed world reasoner does.
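Part of the reason this information gets left out is sheer volume. As a rough sketch (the function name and the example identifiers are hypothetical), stating pairwise distinctness for n individuals takes n(n-1)/2 statements:

```python
from itertools import combinations


def pairwise_different_from(individuals):
    """Yield one owl:differentFrom triple per unordered pair of individuals."""
    for a, b in combinations(sorted(set(individuals)), 2):
        yield (a, "owl:differentFrom", b)


def count_statements(n):
    """Number of pairwise statements needed for n distinct individuals."""
    return n * (n - 1) // 2
```

Four individuals need six statements, and a thousand need 499,500. OWL does offer owl:AllDifferent as a shorthand for the individual case, but someone still has to remember to assert it.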