Friday, April 23, 2004

Again with the debugging of Jena unit tests.

Getting serious with the logging now, and have added copious print statements and stack traces into the source code. It didn't take too long to realise that the hashmap that JRDFFactory uses to map anonymous Jena nodes to anonymous Kowari nodes seems to be cleaned out between one line of the test and the next. Also, the first set of anonymous nodes were based on positive numbers and the second were based on negative numbers.

AN noted that on the first line the nodes are being created, then on the second line the nodes are being queried. This explains the negative nodes: negative numbers are used for anonymous nodes that are created during queries. However, anonymous nodes are only created in a query if the system has not seen them yet, and these nodes should have been created already. Since the hashmap is empty by the second line, the query that uses it doesn't realise that the nodes are already in use, and creates some temporary ones.
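The mapping behaviour described above can be sketched roughly like this. The class and method names are simplified stand-ins, not the actual JRDFFactory code; the point is only the positive-at-insert, negative-at-query ID convention:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the factory that maps Jena anonymous
// node labels to Kowari node IDs. Not the real Kowari code.
class AnonNodeMap {
    private final Map<String, Long> anonMap = new HashMap<>();
    private long nextInsertId = 1;   // positive IDs for inserted nodes
    private long nextQueryId = -1;   // negative IDs for query-time temporaries

    // Called at insert time: allocate (or reuse) a positive node ID.
    long getNodeForInsert(String jenaAnonId) {
        return anonMap.computeIfAbsent(jenaAnonId, k -> nextInsertId++);
    }

    // Called at query time: reuse the mapped node if we have seen it,
    // otherwise hand back a temporary negative node ID.
    long getNodeForQuery(String jenaAnonId) {
        Long node = anonMap.get(jenaAnonId);
        return (node != null) ? node : nextQueryId--;
    }
}
```

If the map is emptied (or a different factory instance is consulted) between the insert and the query, the query path hands back a negative temporary ID even though the node was already created, which is exactly the symptom the logging showed.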

The remaining question then is why the hashmap was cleared out. Some more verbose logging eventually showed that the hashmaps were different between the first line (when the nodes were inserted) and the second line (when they were queried). This meant a new JRDFFactory was being created. Logging a stack trace for each JRDFFactory creation finally showed up the problem. The GraphKowari class is the owner of the JRDFFactory object, and this class creates its JRDFFactory correctly. However, it also gets a "FilteredStatementStore", and this is the cause of the problem. In the method which gets this store for the GraphKowari, a whole new graph is created internally. A FilteredStatementStore is requested from this graph and returned to the GraphKowari object. Unfortunately, the new graph also creates its own JRDFFactory. We can't even get this factory from the graph, as we never see the graph, only the filtered statement store that it supplies. I'm not sure why, but it is this internal graph that gets used when queries are performed, which seems strange, as the reference to the graph gets lost. I suspect that the FilteredStatementStore knows about the graph that it's attached to.

Essentially, we need to make sure that the same JRDFFactory gets used by both the GraphKowari and the graph used by its filtered statement store. Since we can't get the JRDFFactory back from the filtered statement store without a lot of messy restructuring, the logical approach seems to be to create the JRDFFactory first, and pass it on to session.getFilteredStatementStore(). However, I'm going to hold off until tomorrow when I can speak to AN first. This is because DM expressed concern over the use of two (or more) graphs, as their relationship to the session is unclear. It is only after I've confirmed that this structure is correct that I'll go ahead and make this change. However, I think that it should all work to an extent, and regardless of strict correctness, it may have to do until a significant refactor is permitted. Consequently, I think I'm making the change anyway... meaning that I may finally have it working!
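In outline, the proposed fix amounts to something like the following. The signatures here are illustrative only (the real Kowari classes are larger and the actual getFilteredStatementStore() signature may differ); the sketch just shows the single shared factory being created first and passed down:

```java
// Illustrative sketch of the proposed fix; not the real Kowari signatures.
class JRDFFactory { /* maps anonymous Jena nodes to Kowari nodes */ }

interface FilteredStatementStore { }

class Session {
    // Before the fix, this method built a new graph internally, and that
    // graph constructed its own JRDFFactory, so insert-time mappings were
    // invisible at query time. After the fix, the caller's factory is
    // passed in and shared with the internal graph.
    FilteredStatementStore getFilteredStatementStore(JRDFFactory factory) {
        // the internal graph would be built here, reusing 'factory'
        return new FilteredStatementStore() { };
    }
}

class GraphKowari {
    private final JRDFFactory factory = new JRDFFactory();
    private final FilteredStatementStore store;

    GraphKowari(Session session) {
        // share the single factory with the internal graph
        this.store = session.getFilteredStatementStore(factory);
    }
}
```

With only one JRDFFactory in play, the hashmap populated at insert time is the same one consulted at query time, so the temporary negative nodes should no longer appear.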

Also spent some time converting my time log file into the Cuckoo timesheets. I'm making more use of the multiple-day entry, as only 2 entries per day is almost never enough. At least when it's adequate it works quite well, as it means that you don't have to wait 2 minutes or so to move onto the next day. That is Cuckoo's major problem: it takes too long to do anything. This gets in the way of what you need to do, making the whole user experience painful. Consequently I rarely use it. If it were simple to just log in, make a change and be done with it then I'd use it more often, but instead I find that even a single day's entry can be a significant undertaking. I asked Ben about the speed, and he explained that it appears to be due to the box that it runs on being badly underpowered. Perhaps I should ask for an upgrade there. The request may get ignored though, as several other people don't seem to mind the interface, given that they are always diligent about filling it in every day.

Quick question from AM made me look at RMI Answer objects again. He thinks that a finalize() for RemoteAnswerWrapperAnswer would help avoid running out of memory on the server when clients are careless and don't close their Answers. This also made me realise that even closed Answers are holding onto memory on the remote server. This is because the RMI server keeps a reference to the AnswerWrapperRemoteAnswer even after it's been closed. This should be fixable with a call to UnicastRemoteObject.unexportObject(). This is a static method which corresponds to UnicastRemoteObject.exportObject(), and hence I would have thought it should only apply to objects exported in this way, and not to objects which achieve RMI registration by extending UnicastRemoteObject. However, after looking at it more closely it looks like it should work fine. I'm sure there's some documentation out there that explains it clearly. Otherwise I might just give it a go when there's some time available.
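The unexport-on-close idea can be sketched like this. The class below is a stand-in (the real AnswerWrapperRemoteAnswer wraps an Answer and implements a richer interface); the relevant part is that unexportObject() removes the object from the RMI runtime so the server can drop its reference and garbage collect it, and the javadoc indicates it applies to any exported remote object, including those exported by extending UnicastRemoteObject:

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical minimal remote interface; the real one has query methods.
interface RemoteAnswer extends Remote {
    void close() throws RemoteException;
}

// Sketch only: shows the unexport-on-close idea, not the real wrapper.
class AnswerWrapperRemoteAnswer extends UnicastRemoteObject
        implements RemoteAnswer {

    AnswerWrapperRemoteAnswer() throws RemoteException {
        super();  // the superclass constructor exports this object
    }

    public void close() throws RemoteException {
        try {
            // Remove this object from the RMI runtime so the server's
            // internal reference is dropped and the object becomes
            // eligible for garbage collection.
            UnicastRemoteObject.unexportObject(this, true);
        } catch (NoSuchObjectException e) {
            // already unexported; nothing left to do
        }
    }
}
```

A finalize() (or today, a Cleaner) could call the same unexport path as a safety net for clients that never call close(), which was AM's original suggestion.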