OWL and Cardinality
I've been giving thought to OWL cardinality in the open world model. My general rule of thumb for the open world assumption is that a model is valid if it is possible to add new statements which make everything consistent. In other words, I'm assuming that all of those statements could exist, but they just haven't been entered yet. A model can only be invalid if there is an apparent problem, and there are no possible statements which can fix it. For instance, the following model has an inconsistency which cannot be fixed with a new statement:
<ns:A> <owl:differentFrom> <ns:B>
<ns:B> <owl:sameAs> <ns:A>
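As a rough illustration of this kind of unfixable contradiction, here is a minimal Python sketch. It assumes the statements have already been pulled out into plain pairs of names. The function name `find_contradictions` and the pair-based representation are made up for this example and are not part of any OWL library.

```python
def find_contradictions(same_as, different_from):
    """Return the differentFrom pairs whose members are also equal via sameAs."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # owl:sameAs is symmetric and transitive, so merge the names into groups.
    for a, b in same_as:
        parent[find(a)] = find(b)

    # No extra statement can repair a pair that is both the same and different.
    return [(a, b) for a, b in different_from if find(a) == find(b)]


same_as = [("ns:B", "ns:A")]
different_from = [("ns:A", "ns:B")]
print(find_contradictions(same_as, different_from))
```

Because sameAs is handled as an equivalence relation, the check should also catch contradictions that only appear through a chain of sameAs statements.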
This has big ramifications for cardinality in OWL.
owl:minCardinality. This restriction says that there must be a certain minimum number of times a predicate gets used with a subject. However, if there are not enough statements to meet this restriction, then it is always possible to add them later. Hence the open world assumption means that it is generally not possible to check this restriction.
The only way that owl:minCardinality might be violated is if there simply are not enough options available for a new statement to be legitimately added. For instance, the range of a predicate may be a class which was defined using owl:oneOf. However, I can't see how this could cause a problem unless the list of available objects was smaller than the minimum cardinality. That would certainly be inconsistent, but it would be inconsistent TBox information, rather than inconsistent ABox data. This means that the owl:minCardinality would not be violated, but the OWL would be inconsistent. A different error, but one I should still check for.
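The TBox check described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the owl:oneOf members are given as a list of names, and the function name `min_cardinality_unsatisfiable` is hypothetical.

```python
def min_cardinality_unsatisfiable(min_cardinality, one_of_members):
    """True if an enumerated range can never supply enough distinct objects."""
    # The number of distinct names is an upper bound on the number of distinct
    # individuals, since owl:sameAs statements could only shrink it further.
    available = len(set(one_of_members))
    return available < min_cardinality


# An enumerated class with two members cannot satisfy a minimum cardinality of 3.
print(min_cardinality_unsatisfiable(3, ["ns:red", "ns:green"]))
# With three members the restriction can still be met by adding statements later.
print(min_cardinality_unsatisfiable(3, ["ns:red", "ns:green", "ns:blue"]))
```

Note that a result of False says nothing about the instance data. It only means the schema leaves room for the missing statements to be added.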
owl:maxCardinality is a more complex story. At first glance, the following might appear to be inconsistent:
This is only partial RDF, and I haven't validated it, but it should demonstrate a property being used twice, when it has a maximum cardinality of 1.
<owl:onProperty rdf:resource="#myProperty" />
The difficulty in checking this for consistency is that an <owl:sameAs> statement can make it all valid:
<ns:firstPropertyObject> <owl:sameAs> <ns:secondPropertyObject>
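To see why the sameAs statement rescues the model, it may help to count the objects after merging everything that is declared the same. The Python sketch below does this under the assumption that the objects and sameAs statements are available as plain names and pairs. The function name `distinct_upper_bound` is invented for the example.

```python
def distinct_upper_bound(objects, same_as):
    """Count the objects of a property once sameAs groups have been merged."""
    groups = {}
    for node in list(objects) + [n for pair in same_as for n in pair]:
        groups.setdefault(node, {node})

    # Merge the groups of every sameAs pair, which also handles chains
    # that pass through names not used as objects of the property.
    for a, b in same_as:
        if groups[a] is not groups[b]:
            merged = groups[a] | groups[b]
            for member in merged:
                groups[member] = merged

    return len({frozenset(groups[o]) for o in objects})


objects = ["ns:firstPropertyObject", "ns:secondPropertyObject"]
print(distinct_upper_bound(objects, []))
print(distinct_upper_bound(objects, [("ns:firstPropertyObject", "ns:secondPropertyObject")]))
```

The result is only an upper bound on the number of distinct objects. In an open world, a later sameAs statement can always lower it again.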
It would seem that the only statements which can be validly checked for cardinality are those which refer to objects with explicit owl:differentFrom statements to differentiate them. However, it is only possible to work with a set of objects in which every member is differentiated from every other member. To confuse things, there can be more than one such set, with intersections between these sets. Each set has to be considered completely separately, and it is only those sets larger than the cardinality constraint that will count.
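One way to picture these sets is as groups of objects in which every pair has an explicit owl:differentFrom statement. The Python sketch below brute-forces the search for such a group that is larger than the maximum cardinality. It assumes the objects of one subject and predicate have already been collected, and the function name `provably_exceeds` is hypothetical. Brute force is only reasonable for small object lists.

```python
from itertools import combinations


def provably_exceeds(objects, different_from, max_cardinality):
    """Return a mutually different group larger than max_cardinality, or None."""
    # owl:differentFrom is symmetric, so store each pair without direction.
    different = {frozenset(pair) for pair in different_from}

    for group in combinations(sorted(set(objects)), max_cardinality + 1):
        # Every pair inside the group must be explicitly differentiated.
        if all(frozenset(p) in different for p in combinations(group, 2)):
            return group
    return None


objects = ["a", "b", "c"]
# Two overlapping sets, {a, b} and {b, c}, each of size 2.
different_from = [("a", "b"), ("c", "b")]
print(provably_exceeds(objects, different_from, 1))
print(provably_exceeds(objects, different_from, 2))
```

With a maximum cardinality of 2, neither overlapping set is large enough on its own, so nothing is reported even though three objects are named. Without a statement that a and c differ, they could still turn out to be the same.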
I tried to post the iTQL query for this yesterday, but that was mostly to think the syntax through. It isn't all that difficult to find the owl:maxCardinality restrictions, and to conjoin this with the set of nodes found in owl:differentFrom statements. The trick is that the owl:differentFrom statements have to be selected both ways and joined to the restricted predicate in order to be counted. This needs a constraint like:
$subject $predicate $node1
and $node1 <owl:differentFrom> $node2
and $subject $predicate $node2
Careful use of the count() subqueries should let me pull these sets apart, and find how often the predicate was used on that set.
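The shape of that join can be mimicked outside the database. The Python sketch below is not iTQL and makes no claim about how count() subqueries behave. It only shows, on in-memory tuples, what selecting owl:differentFrom both ways and joining it to the predicate would produce. The function name `differentiated_objects` and the tuple layout are assumptions made for the example.

```python
from collections import defaultdict


def differentiated_objects(triples, different_from):
    """Map (subject, predicate) to its objects that take part in a differentFrom pair."""
    # Select the differentFrom statements in both directions.
    symmetric = set(different_from) | {(b, a) for a, b in different_from}

    objects = defaultdict(set)
    for subject, predicate, node in triples:
        objects[(subject, predicate)].add(node)

    result = {}
    for key, nodes in objects.items():
        # Join: both nodes are used with the same subject and predicate.
        joined = {n1 for n1 in nodes for n2 in nodes if (n1, n2) in symmetric}
        if joined:
            result[key] = sorted(joined)
    return result


triples = [("x", "p", "a"), ("x", "p", "b"), ("x", "p", "c"), ("y", "p", "a")]
print(differentiated_objects(triples, [("b", "a")]))
```

Counting the entries for each subject and predicate gives a starting point, though the nodes found this way still need to be split into mutually different sets before comparing against the restriction.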
I can't test this approach just yet, as I'll need to build some good test data first. That will take a little while, as I need to get through my current tasks first.