Tuesday, May 04, 2010

Web Services Solved

It's a been a long couple of days, and I really want to relax instead of write, but it's been a few days and I've been promising myself that I'd write, so I figured I need to get something written before I can open a beer.

First of all, the web services problem was trivial. I recently added a new feature that allowed ContextHandlers in Jetty to be configured. Currently the only configuration option I've put in there is the one that was requested, and that is the size of a form. Apparently this is 200k by default, but if you're going to load large files then that may not be enough. Anyway, the problem came about when my code tried to read the maximum form size from the configuration. I wasn't careful enough to check if the context was being configured in the first place, so an NPE was thrown if it was missing.

Fortunately, most people would never see the problem, since the default configuration file includes details for contexts, and this ends up in every build by default. The reason I was seeing it is because Topaz replaces the configuration with their own (since it describes their custom resolvers), and this custom configuration file doesn't have the new option in it. Of course, I could just add it to Topaz, but the correct solution is to make sure that a configuration can't throw an NPE – which is exactly what I told the Ehcache guys, so it's fitting that I have to do it myself. :-)


Since I'm on the topic of Topaz, it looks like te OSU/OSL guys and I have both the Topaz and Mulgara servers configured. They wouldn't typically be hosting individual projects (well, they do occasionally), but in this case it's all going in under the umbrella of Duraspace. Of course this has taken some time, and in the case of Topaz I'm still testing that it's all correct, but I think it's there. I'll be changing the DNS for Topaz over soon, and Mulgara was changed last week. Mulgara's DNS has propagated now, so I'm in the process of cutting a long-overdue release.

One thing that changed in the hosting is that I no longer have a Linux host to build the distribution on. Theoretically, that would be OK, since I ought to be able to build on any platform. However, Mulgara is still distributed as a build for Java 1.5 (I've had complaints when I accidentally put out a release that was built for 1.6). This is easy to set up on Linux, since you just change the JAVA_HOME environment variable to make sure you're pointing to a 1.5 JDK. However, every computer I have here is a Mac. Once upon a time that didn't change anything, but now all JDKs point to JDK 1.6. That means I need to configure the compiler to output the correct version. It can be done, but Mulgara wasn't set up for it.

If you read the Ant documentation on compiling you'll see that you can set the target to any JDK version you like. However, that would require editing 58 files (I just had to run a quick command to see that. Wow... I didn't realize it was so bad). I'm sure I'd miss a <javac> somewhere. Fortunately, there is another option, even if the Ant documents discourage it. There's a system parameter called ant.build.java.target which will set the default value globally. I checked to make sure that nothing was going to be missed by this (ie. that nothing was manually setting the target) and when it all looked good I changed the build script to set this to "1.5". I didn't change the corresponding script on Windows, but personally I only want this for distributions. Anyone who needs to set it up on Windows probably has the very JDK they want to run Mulgara on anyway.

Well, that's my story, and I'm sticking to it.

Semantic Universe

What else? Oh yes. I wrote a post for Semantic Universe. It's much more technical that the other posts I've seen there, but I was told that would be OK. I'm curious to know how it will be received.

I was interested in how it was promoted on Twitter. I wrote something that mixes linked data and SPARQL to create a kind of federated query (something I find to be very useful, BTW, and I think more people should be aware of it). However, in the process I mentioned that this shouldn't be necessary, since SPARQL 1.1 will be including a note on federated querying. Despite SPARQL 1.1 only being mentioned a couple of times, the tweet said, that I discussed "how/why SPARQL 1.1 plans to be a bit more dazzling". Well, admittedly SPARQL 1.1 will be more dazzling, but my post didn't discuss that. Perhaps it was a hint to talk about that in a future post.


Speaking of future posts, I realized that I've been indexing RDF backwards, at least for lists. It doesn't affect the maximum complexity of iterating a list, but it does affect the expected complexity. I won't talk about it tonight, but hopefully by mentioning it here I'll prompt myself to write about it soon.

This last weekend was the final weekend that my in-laws were visiting from the other side of the planet, so I didn't get much jSPARQLc done. I hope to fix that tomorrow night. I'm even wondering if the Graph API should be factored out into it's own sister project. It's turning out to be incredibly useful for reading and working with RDF data when you just want access to the structure and you don't need a full query engine. It would even plug directly into almost every query engine out there, so there's a lot of utility to it.

I'm also finally learning Hadoop, since I've had more pressure to consider a clustered RDF store, much as BBN have created. I've read the MapReduce, GFS and BigTable papers, so I went into it thinking I'd be approaching the problem one way, but the more I learn the more I think it would scale better if I went in other directions. So for the moment I'm trying to avoid getting too many preconceived notions of architecture until I've learnt some more and applied my ideas to some simple cases. Of course, Hive tries to do the same thing for relational data, so I think I need to look at the code in that project too. I have a steep learning curve ahead of me there, but I've been avoiding those recently, so it will do me some good.

Other than that, it's been interviews and immigration lawyers. These are horribly time consuming, and way too boring to talk about, so I won't. See you tomorrow.

1 comment:

Neha said...

Amazing! Thanks for sharing
Microstrategy Online Training