Monday, July 19, 2004

Safari vs. Mozilla
I'm blogging in Mozilla on my Linux box tonight, and there are quite a few differences! While using Safari on OS X I noticed that there was a new "upload image" button, and a couple of other things seemed a bit different, but not that much. I realised that Safari wasn't showing the bold or italics buttons, so I knew that some things were not being displayed, but until tonight I had no idea just how much was missing! I think I'm going to have to try Firefox on OS X to see if it shows anything that Safari doesn't.

Paged Answers
The pre-loading of pages seems to have gone well, which can be a surprise for threaded code. However, I thought it through quite thoroughly, so I'm cautiously optimistic.

I had to spend a little time today writing documentation on the page pre-loading for GN. This was made a little more awkward by the fact that someone seems to have recently introduced a device that is interfering with my Logitech radio keyboard. It made typing a real chore until I was finally able to find an unaffected frequency.

At the moment the paging ahead is done one page at a time, as this is simple and effective. It should be possible to create a queue of outstanding pages, within the limits of memory, but I'm not sure how to go about finding the appropriate length of the queue. It would be counterproductive to make it too long and get an OOM error. For the time being it seems to be working well. TJ is using it for his tests at the moment, so I'm sure I'll hear about any problems soon enough.
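One way a queue of outstanding pages might look is a bounded blocking queue, where the loading thread simply stalls once the queue is full. This is only a sketch under assumptions: `PagePrefetcher`, `fetchPage`, `pageCount` and `capacity` are hypothetical names, not anything from the actual paging code, and choosing a good capacity is exactly the open question above.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.IntFunction;

/** Hypothetical sketch: loads pages ahead of the reader into a bounded queue. */
final class PagePrefetcher<P> {
    private final BlockingQueue<P> queue;
    private final Thread worker;

    /**
     * @param fetchPage hypothetical loader for page number n (assumed never to return null)
     * @param pageCount total number of pages to load
     * @param capacity  maximum number of pages held in memory ahead of the reader
     */
    PagePrefetcher(IntFunction<P> fetchPage, int pageCount, int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.worker = new Thread(() -> {
            try {
                for (int n = 0; n < pageCount; n++) {
                    // put() blocks while the queue is full, which caps memory use
                    queue.put(fetchPage.apply(n));
                }
            } catch (InterruptedException e) {
                // close() was called: stop loading
                Thread.currentThread().interrupt();
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    /** Returns the next page in order, waiting if it has not been loaded yet. */
    P nextPage() {
        try {
            return queue.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a page", e);
        }
    }

    /** Stops the background loader. */
    void close() {
        worker.interrupt();
    }
}
```

The caller is expected to ask for no more than `pageCount` pages, since `nextPage()` would otherwise wait forever. A capacity of 1 behaves much like the current one-page-ahead scheme.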

Repeated Variables
My logging problem was so simple that I'm now feeling really, really stupid. As the resolver classes have been introduced they have taken over a number of existing classes, and have duplicated them in the resolver packages. This means that we have 2 nearly identical classes, in different packages. My logging was in the wrong one. Unfortunately, the code to attempt the fix was in the correct class, and so it became clear that nothing had been fixed. However, once logging was working it did show the problem.

The query causing the problem was:

select $xxx <rdfs:subPropertyOf> $xxx from <sourcemodel>
where $xxx <rdf:type> <rdf:Property>

During the course of running this query, the AppendAggregateTuples class was considering appending data found in two tuples objects. The first one had 1 variable called $xxx, while the second had 2 variables labeled $xxx and $xxx. The append operation was expecting that each tuples would have the same number of columns, and so it failed at this point.
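The failure can be illustrated with a toy version of the append. This is a sketch only: `SimpleTuples` is a hypothetical stand-in and is not the real tuples interface or the AppendAggregateTuples class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tuples object: column variables plus rows of values. */
final class SimpleTuples {
    final List<String> variables;
    final List<List<String>> rows;

    SimpleTuples(List<String> variables, List<List<String>> rows) {
        this.variables = variables;
        this.rows = rows;
    }

    /** Appends the rows of other, insisting that both operands have the same number of columns. */
    SimpleTuples append(SimpleTuples other) {
        if (variables.size() != other.variables.size()) {
            throw new IllegalArgumentException(
                "Column count mismatch: " + variables + " vs " + other.variables);
        }
        List<List<String>> all = new ArrayList<>(rows);
        all.addAll(other.rows);
        return new SimpleTuples(variables, all);
    }
}
```

Appending a tuples with columns [xxx] to one with columns [xxx, xxx] trips the size check here, which mirrors the one-column versus two-column mismatch described above.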

The problem comes down to an inability to use the same variable twice in the select clause. From a mathematical perspective, this is correct, but it makes it impossible to express the query.

After discussions with AM and TJ, we've decided to allow variables to be aliased. This would allow the above statement to instead be expressed as:
select $xxx <rdfs:subPropertyOf> $yyy from <sourcemodel>
where $xxx <rdf:type> <rdf:Property>
and $yyy <tucana:is> $xxx

This solves most people's problems, but it will make automatic translation of the entailment-rdf-mt-20030123.xml document and its ilk that much harder. The original statement was a direct translation of rule 5b:
<rule name="rdfs5b">

<subject var="xxx"/>
<predicate uri="&rdf;type"/>
<object uri="&rdf;Property"/>
<subject var="xxx"/>
<predicate uri="&rdfs;subPropertyOf"/>
<object var="xxx"/>
<rule name="rdfs2" />
<rule name="rdfs3" />
<rule name="rdfs6" />

Recognising the duplicated var="xxx" and replacing the second with a different variable will be annoying, but apparently necessary.
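The renaming step itself might be sketched like this. Everything here is hypothetical (`VariableAliaser`, the `_2` suffix scheme, and representing the select clause as a list of strings); a real translator would work on the parsed rule, and would need to make sure the generated name does not clash with a variable already in use.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: renames repeated select variables and records the alias constraints. */
final class VariableAliaser {
    /** Rewritten select elements plus the constraints to add to the where clause. */
    static final class Result {
        final List<String> select = new ArrayList<>();
        final List<String> aliasConstraints = new ArrayList<>();
    }

    static Result alias(List<String> selectElements) {
        Result result = new Result();
        Map<String, Integer> seen = new HashMap<>();
        for (String element : selectElements) {
            if (!element.startsWith("$")) {
                // Not a variable (e.g. a URI), so copy it through untouched
                result.select.add(element);
                continue;
            }
            int count = seen.merge(element, 1, Integer::sum);
            if (count == 1) {
                result.select.add(element);
            } else {
                // Repeated variable: invent a new name and tie it back to the original
                String fresh = element + "_" + count;
                result.select.add(fresh);
                result.aliasConstraints.add(fresh + " <tucana:is> " + element);
            }
        }
        return result;
    }
}
```

For the select clause of rule 5b, the second `$xxx` would become `$xxx_2`, with `$xxx_2 <tucana:is> $xxx` as the extra constraint.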

In the meantime, until pairs of variables are permitted in a <tucana:is> statement, I am writing some Java code to do this rule for me. Unfortunately, it means doing the query and performing an insert for each resulting row. This will be extremely inefficient, but it won't have to last long. As soon as the new queries are available, rules 5b and 7b can be updated to take advantage of them.
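In outline, the stopgap for rule 5b is a loop of the following shape. The `TripleStore` interface is invented purely for illustration and is not the real query or insert API.

```java
import java.util.List;

/** Hypothetical sketch of rule 5b done by hand: one insert per row of the query result. */
final class Rule5b {
    /** Invented minimal view of a store; not a real API. */
    interface TripleStore {
        List<String> subjectsOf(String predicate, String object);
        void insert(String subject, String predicate, String object);
    }

    /** Inserts (p rdfs:subPropertyOf p) for every p of type rdf:Property; returns the insert count. */
    static int apply(TripleStore store) {
        int inserted = 0;
        for (String property : store.subjectsOf("rdf:type", "rdf:Property")) {
            // One round trip per row, which is why this is so inefficient
            store.insert(property, "rdfs:subPropertyOf", property);
            inserted++;
        }
        return inserted;
    }
}
```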

Split Infinitives
Yes, I know I'm using them. No, I'm quite comfortable with that. While many claim that it is incorrect, find me a single authority on the subject who claims that it is. Bryson agrees with me here.

Ditto for split compound verbs. :-)
