Sunday, August 28, 2005

Out of Hours
The rest of my week was almost as busy as the time I spent elucidating Kowari.

On Tuesday DavidW and I went down to the University of Maryland's MIND lab to meet some of the MINDSWAP group. We were shown a very impressive demonstration of ontology debugging in Pellet, using a couple of different methods labelled white-box debugging and black-box debugging. As the names imply, the white-box method carefully follows the reasoning used by Pellet, while the black-box method looks at inputs and outputs. I'm not sure how much of this could be automated, but as a tool for debugging it was really impressive. It was enough to make me consider hooking Kowari into the back end of Pellet.

In fact, hooking Pellet into Kowari has a couple of things going for it. First, it gives me a point of comparison for my own work, both in terms of speed and correctness. Ideally, my own code will be dramatically faster (I can't see why it wouldn't be). However, having followed Pellet for a little while, I'd expect it to provide a more complete solution, in terms of entailment, and particularly with consistency. The second reason is to provide a good set of ontology debugging tools.

Kowari Demo
I was also asked to give a demonstration of the new rules engine in Kowari. I'd been running it for a couple of weeks at this point, and trusted it, but it still made me nervous to show it to a room full of strangers, all of whom understand this stuff. Everyone seemed happy, but it gave me a little more motivation to get back to work on completing the implementation.

Fujitsu has a lab upstairs from MINDSWAP, and a couple of their people had asked to come along to meet me. While we were there, they asked me several questions about how to make Kowari load data quickly. It seemed that they were sending insertions one statement at a time, so we suggested blocking them together to avoid some of the RMI overhead. They also invited me back the following day to see what they'd been doing.

OWL for Dinner
Afterwards, DavidW and I went to dinner with Jim Hendler. Nice guy, and he was quite happy to answer some of my questions about RDFS and OWL. The one thing I remember taking away from that night was a better understanding of the agendas of the data modellers and the logic people participating in OWL. This culminated in the explanation that RDFS is that part of OWL that everyone could easily agree to in the short term, thereby enabling an initial release of a standard for this kind of work. This explains quite a lot.

It wasn't explicitly mentioned, but I sort of inferred that the separation of OWL DL and OWL Full was the compromise arrived at between the ontologists who needed to express complex structures (OWL Full) and the logic experts who insisted on decidability (OWL DL).

Fujitsu
The following night I was back down at the University of Maryland, this time visiting the Fujitsu lab.

The evening started with a follow-up question about bulk loads. There were two problems: the first was that they were running out of memory during insertions, and the second was the speed.

The memory problem turned out to be a result of their insertion method. It seemed that they were using a single iTQL INSERT statement, with all of the statements as part of a single line. Of course, this was being encoded as a Java String which had to be marshalled and unmarshalled for RMI. Yuck. As a quick fix I suggested limiting the size of the string and using several queries (this worked), but I also suggested using N3 as the input format to avoid RMI altogether.

The speed problem was partly due to the RMI overhead (it was marshalling a LOT of text to send over the network!), but mostly because the insertions were not using transactions. I explained this, and showed them how to perform the whole write in a single transaction. The result was a speed improvement of an order of magnitude. I'm sure that made me look very good. :-)
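
For anyone curious, the advice boils down to something like the sketch below. The client interface and batch size are things I've made up for illustration (the real Kowari API is accessed over RMI and looks nothing like this); the only parts taken from the conversation are the batching of statements and the single enclosing transaction.

```java
import java.util.List;

/** Minimal client abstraction, just for illustration; the real Kowari RMI API is different. */
interface TripleStoreClient {
    void setAutoCommit(boolean on) throws Exception;
    void execute(String itql) throws Exception;
    void commit() throws Exception;
    void rollback() throws Exception;
}

public class BulkLoader {
    private static final int BATCH_SIZE = 1000;   // statements per INSERT command (an assumed figure)

    /** Sends the triples in bounded batches, all inside a single transaction. */
    public static void load(TripleStoreClient client, List<String> triples, String model)
            throws Exception {
        client.setAutoCommit(false);              // one transaction for the whole load
        try {
            StringBuilder batch = new StringBuilder();
            int count = 0;
            for (String triple : triples) {
                batch.append(triple).append(' ');
                if (++count % BATCH_SIZE == 0) {
                    client.execute("insert " + batch + " into <" + model + ">;");
                    batch.setLength(0);           // keeps each RMI call to a bounded size
                }
            }
            if (batch.length() > 0) {
                client.execute("insert " + batch + " into <" + model + ">;");
            }
            client.commit();
        } catch (Exception e) {
            client.rollback();
            throw e;
        } finally {
            client.setAutoCommit(true);
        }
    }
}
```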

While there I was also shown a project for a kind of "ubiquitous computing environment". This integrated a whole series of technologies that I was familiar with, but hadn't seen together like this before.

The idea was to take data from any device in the vicinity, and direct it to any other device that was compatible with the data type. Devices were found dynamically on the network (with Zeroconf, IIRC) and then queried for a description of their services. These descriptions were returned in OWL-S, providing enough info to describe the name of the service, the data formats accepted or provided by the service, URLs for pages that control the service, and so on. They even had a GUI configuration tool for graphically describing a work flow by connecting blocks representative of the services.

As I said, there was no new technology in this implementation, but it's the first time I've ever seen anyone put it all together and make it work. The devices they had working like this included PDAs, desktops, intelligent projectors, cameras, displays, databases, file servers and telephones. It looked great. :-)

Friday, August 26, 2005

Final Week
Having made it to the final week, the plan was to cover the remaining layers of the storage code, and to use any remaining time for examples and questions.

There are three main components of functionality in Kowari's storage layer: the node pool, the string pool, and the statement store. Once upon a time these three operated together, but now the statement store is off on its own behind the new Resolver interface, while the node and string pools are accessible to the whole system. However, their combined functionality has not changed.

All three components handle transactions autonomously, managing the phases of all their underlying components. The overall Session class uses a two-phase commit operation to keep the transactions of each component in synch with the others. It is also in these top level components that the files are all managed. The files which are created and manipulated at this level are generally used by the classes at lower levels (for instance, the BlockFiles and IntFiles which are used by FreeList and AVLTree) but there are also other files which handle session locking, and persistence of phase information for the transactions.
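
As a rough illustration of that coordination (this is not the actual Session code, and the interface and method names below are invented), a two-phase commit across the components looks something like this:

```java
/** One participant in the commit: prepare durably, then commit or roll back. Invented names. */
interface TransactionalComponent {
    void prepare() throws Exception;   // write the new phase to disk, keeping the old phase valid
    void commit();                     // make the new phase the current one
    void rollback();                   // discard the new phase
}

public class TwoPhaseCommit {
    /** Prepares every component before committing any of them, so they stay in step. */
    public static void commitAll(TransactionalComponent... components) throws Exception {
        try {
            for (TransactionalComponent c : components) c.prepare();   // phase 1: prepare
        } catch (Exception e) {
            for (TransactionalComponent c : components) c.rollback();  // any failure aborts them all
            throw e;
        }
        for (TransactionalComponent c : components) c.commit();        // phase 2: commit
    }
}
```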

Once I'd reached this level, I had all of the information needed to explain the data formats in each file. It was awkward to explain the structures before this stage, since several important structures (notably, the phase information) contain information from every layer. Trying to describe a single layer leaves large holes in the structure, and has led me into confusing conversations in the past when I try to skip over these holes. But at this point I was finally able to write up the file structures, one byte at a time (the values are typically 64-bit longs, but ints and bytes are occasionally used).

I'd like to properly document each of these components, along with the associated file formats, but for the moment I'll just give an overview.

Node Pool
The idea of the node pool is to allocate a 64-bit number to represent each resource in the RDF data. We call these numbers "graph nodes", or just gNodes. GNodes get re-used if they are freed. The re-use is for several reasons, the most notable being to prevent a numeric overflow (it also helps the string pool if there are few holes in the address space for nodes). However, a resource ID cannot be re-used if there are any old reading phases which still expect the ID to refer to the old data.

These requirements are exactly met by a FreeList, so the node pool is just a FreeList along with all the code required for file management and transactions.

String Pool
The string pool holds all of the URI references and literals in the database. When it was first written, the only literals we stored were strings, and since URIs are also represented with strings, we called the component the "string pool". The string pool now stores lots of other data types as well, but the name has stayed.

The string pool provides a mapping from gNodes to data objects, and from those objects back to their gNodes. It also provides an ordering over the data, so that it is easy to work with ranges of values.

The mapping of a gNode to the data is done with a simple IntFile. Each data element can be represented with a buffer of fixed length (overflows for long data types such as string are stored at a location referred to in this buffer). To find the data buffer for a given gNode, the gNode number is multiplied by the record size of the buffer. This is why the string pool prefers the node pool to re-use gNodes, rather than just incrementing a counter. Given that these records are all the same length, I'm not sure why a BlockFile was not used instead of an IntFile, but the effect is the same.
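
In code, that lookup is nothing more than offset arithmetic. Here's a sketch; the record size and the use of RandomAccessFile are assumptions for illustration, not the real IntFile code:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class GNodeRecordReader {
    // An assumed record size: up to 72 bytes of data plus a pointer for overflow.
    private static final int RECORD_SIZE = 80;

    private final RandomAccessFile file;

    public GNodeRecordReader(RandomAccessFile file) {
        this.file = file;
    }

    /** Finds a gNode's record by simple offset arithmetic: offset = gNode * record size. */
    public ByteBuffer read(long gNode) throws IOException {
        byte[] record = new byte[RECORD_SIZE];
        file.seek(gNode * RECORD_SIZE);   // dense, re-used gNodes keep this file compact
        file.readFully(record);
        return ByteBuffer.wrap(record);
    }
}
```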

The mapping of data back to the gNode is accomplished by putting all data into an AVLTree. The records in this tree are identical to the records in the IntFile, with the addition of the gNode at the end of the record. The tree also provides the ordering for the data. This allows strings to be searched for by prefix, URIs to be searched for by domain, and date or numeric ranges to be found.

One problem with this structure is that it is impossible to search for strings by substring or regex. This is why we have a resolver for creating Lucene models. However, it's been in the back of my mind that I'd love to see if I could build a string pool based on a trie structure. (Maybe one day).

The data structure holds up to 72 bytes in the record. Anything longer than this (typically a string) has the remainder stored in a separate file. We have a series of 20 files to handle the overflow, each storing blocks twice the size of the blocks in the previous file. This lets us have random access to the data, while reducing fragmentation. It also allows us to store data objects of up to several GB, though we don't expect to ever need to handle anything that large.
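
To illustrate the doubling scheme, here's a sketch of choosing which overflow file a long value would go to. The actual block sizes (and the way Kowari lays data out across blocks) are assumptions here; only the idea of 20 files with doubling block sizes comes from the description above.

```java
public class OverflowFiles {
    private static final int IN_RECORD_LIMIT = 72;     // bytes that fit directly in the record
    private static final long FIRST_BLOCK_SIZE = 128;  // assumed block size of the first overflow file
    private static final int FILE_COUNT = 20;

    /** Returns the index of the first overflow file whose block size fits the remainder, or -1. */
    public static int fileFor(long dataLength) {
        long remainder = dataLength - IN_RECORD_LIMIT;
        if (remainder <= 0) return -1;                  // no overflow needed
        long blockSize = FIRST_BLOCK_SIZE;
        for (int i = 0; i < FILE_COUNT; i++) {
            if (remainder <= blockSize) return i;
            blockSize *= 2;                             // each file's blocks are twice the previous size
        }
        throw new IllegalArgumentException("data too large: " + dataLength + " bytes");
    }
}
```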

When the string pool and node pool are combined, they provide a mechanism for storing and retrieving any kind of data and associating each datum with a numeric identifier.

Statement Store
The statement store is the heart of Kowari's approach to storing RDF.

Each RDF statement is stored as a quad of the form subject, predicate, object and model. We originally stored a triple of subject, predicate, object, but quickly realised that we needed the model (or graph) element as well. Again, the interfaces already existed, so the statements are referred to throughout the code as triples rather than quads.

These statements are stored in 6 different AVL trees, with each tree providing a different ordering for the statements. I've already discussed the reason for this at length. Ordering the statements like this allows us to treat the storage as an index.

Of course, the representation of the statements is with the gNode IDs for each resource, rather than the resources themselves. This means that the indexes contain numbers and nothing else.

While simple in principle, the code here is actually quite complex, as it has numerous optimisations for writing to multiple indexes at once. Unfortunately for me, several of these optimisations were introduced after I last had a hand in writing the code, so it took a little effort for me to understand it well enough to explain it to others.

Each of the indexes is handled by a class called TripleAVLFile. This class knows about its required ordering, and manages its own AVLFile. The nodes in this tree actually represent a range of RDF statements, with a minimum, a maximum and a count. By handling blocks of statements like this, the overhead of maintaining the tree is reduced, and searching is sped up by a significant constant factor (constant factors don't show up in a complexity calculation, but this is the real world we're talking about, so they matter). Once the correct node in the tree is found, it contains a block ID for a block in a separate ManagedBlockFile which contains all of the RDF statements represented by the node.
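
Here's a toy model of that two-level lookup. The names and layout are invented, and the real code descends an AVL tree rather than scanning a list, but it shows how one tree node stands in for a whole run of statements:

```java
import java.util.List;

public class RangeNodeLookup {
    /** One tree node: the bounds of a contiguous run of statements plus the block holding them. */
    public record RangeNode(long[] min, long[] max, int count, long blockId) {}

    /** Compares two statements (as arrays of gNodes) under one index ordering. */
    static int compare(long[] a, long[] b) {
        for (int i = 0; i < a.length; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) return c;
        }
        return 0;
    }

    /** Finds the node whose [min, max] range could contain the statement. */
    public static RangeNode findNode(List<RangeNode> orderedNodes, long[] statement) {
        for (RangeNode node : orderedNodes) {   // the real code descends an AVL tree instead
            if (compare(statement, node.min()) >= 0 && compare(statement, node.max()) <= 0) {
                return node;                    // next step: load node.blockId() and scan its statements
            }
        }
        return null;
    }
}
```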

The 6 TripleAVLFiles manage both their trees and the files full of blocks (the AVLFile and the ManagedBlockFile). This is simple enough when reading from the index, but takes some work when performing write operations. Trying to insert into a full block requires that block to be "split" in a similar way to node-splitting in B-trees, but with co-ordination between the AVL tree and the block file. Writes are also performed in a thread owned by the TripleAVLFile, so that multiple modifications to a single location in the index can be serialised rather than being interspersed with writes to the other 5 indexes.

The details of these and other optimisations make this code a complex subject in itself, so I'll leave a full description for when I get around to proper documentation. I should comment that each of these optimisations was only adopted when it was proven to provide a benefit. Complexity can be the bane of performance, but DavidM did his work well here.

Reads and writes are managed by the statement store, with reads being directed to the appropriate index, and writes being sent to all 6 indexes. The other job of the statement store is to manage transactions, keeping all of the indexes reliably in synch.

Integrating
A description of the storage layer is completed by describing how the Node Pool, the String Pool, and the Statement Store are all tied together.

When a query is received by the server, it must first be localized (I'm using the American "z" here, since the method names use this spelling). This operation uses the String Pool to convert all URIs and literals into gNode numbers. If an insert is taking place, then the Node Pool is also used to allocate new gNodes where needed.
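
In outline (with invented interfaces standing in for the real classes), localization looks something like this:

```java
/** Invented interfaces standing in for the real string pool and node pool. */
interface StringPool {
    Long findGNode(String value);          // null if the value has never been stored
    void map(long gNode, String value);
}

interface NodePool {
    long newGNode();
}

public class Localizer {
    private final StringPool strings;
    private final NodePool nodes;

    public Localizer(StringPool strings, NodePool nodes) {
        this.strings = strings;
        this.nodes = nodes;
    }

    /** Returns the gNode for a URI or literal, minting a new one only during an insert. */
    public long localize(String value, boolean inserting) {
        Long existing = strings.findGNode(value);
        if (existing != null) return existing;
        if (!inserting) throw new IllegalArgumentException("unknown value: " + value);
        long gNode = nodes.newGNode();     // the node pool allocates (or re-uses) an ID
        strings.map(gNode, value);         // the string pool records the mapping in both directions
        return gNode;
    }
}
```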

A query's principal components are a FROM clause (which is an expression containing UNIONS and INTERSECTIONS between models) and a WHERE clause (which is an expression of conjunctions and disjunctions of constraints). Each constraint in the WHERE clause may have a model specified; otherwise it operates on the expression in the FROM clause. To evaluate a query, the FROM and WHERE expressions need to be merged. This results in a more complex constraint expression, with each constraint having its model specified. The operations in the FROM clause get transformed into disjunctions and conjunctions in the new constraint expression, hence the increase in complexity for the expression, though the individual constraints are still quite simple.
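
As a tiny example of that merge (with invented classes; the real constraint expression machinery is much richer), a union of two models in the FROM clause turns a single constraint into a disjunction of two model-qualified constraints:

```java
import java.util.List;

/** A constraint with an explicit model; these classes are invented for the example. */
record Constraint(String subject, String predicate, String object, String model) {}

public class FromWhereMerge {
    /** Distributes a union of models over one constraint, producing a disjunction. */
    public static List<Constraint> merge(Constraint c, List<String> unionOfModels) {
        return unionOfModels.stream()
                .map(m -> new Constraint(c.subject(), c.predicate(), c.object(), m))
                .toList();
    }
}
```

Calling merge on a constraint with the models modelA and modelB produces two constraints, one per model, which are then OR'ed together in the new expression.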

The server then iterates over the constraints and works out which resolver should be used to evaluate each one. In most cases, this is the "System Resolver" as defined by the PersistentResolverFactory tag in the kowari-config.xml file. By default, this is set to the Statement Store described above.

Once the resolvers for each constraint are found, the constraints are sent off to be resolved. The Statement Store resolver tests the constraint for the location of variables, and uses this to determine which index is needed. It finds the extents of the solution, and returns an object containing this information and a reference to the index, so that the results can be extracted through lazy evaluation.
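
The choice of index comes down to finding an ordering in which all of the bound positions of a constraint form a prefix, so the matching statements sit in one contiguous range. Here's a sketch; the six orderings I've listed are only illustrative, not a claim about exactly which orderings Kowari uses:

```java
import java.util.List;
import java.util.Set;

public class IndexSelector {
    // Six orderings over Subject, Predicate, Object, Model. These particular six are
    // illustrative; the point is that any combination of bound positions forms a
    // contiguous prefix in at least one of them.
    private static final List<String> ORDERINGS =
            List.of("SPOM", "POSM", "OSPM", "MSPO", "MPOS", "MOSP");

    /** Picks an ordering in which every bound position comes before every unbound one. */
    public static String select(Set<Character> boundPositions) {
        for (String ordering : ORDERINGS) {
            boolean seenUnbound = false;
            boolean usable = true;
            for (char position : ordering.toCharArray()) {
                if (boundPositions.contains(position)) {
                    if (seenUnbound) { usable = false; break; }  // a bound value after a variable: no good
                } else {
                    seenUnbound = true;
                }
            }
            if (usable) return ordering;
        }
        throw new IllegalStateException("no index suits " + boundPositions);
    }
}
```

For example, a constraint with a known predicate and object (and variables elsewhere) would be answered from the predicate-object-subject-model index, where every matching statement sits in a single contiguous range.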

Next, the results are "joined" according to the structure of the constraint expression. These operations are done with lazy evaluation, so it is not quite as simple as it sounds, particularly when optimisation is considered, but the method calls at a higher level are all straightforward (as you'd expect).

The final results are then globalized by using the String Pool to convert all the numbers back into URIs and literals. The globalized answer then gets serialized and sent to the client.

Insertions and deletions use a similar process to what I've just described.

This really glosses over the details quite a bit, but it provides an outline of the process, and explains what I was doing for most of the week. When I get time I hope to document it properly, at which point this post should help by providing me with an outline.

Monday, August 22, 2005

End of the Week
OK, so I've written about a lot of the first week, but there is still a bit to go.

Towards the end of the week I took a step back and tried to present an overview of a query, giving some perspective on how BlockFiles and FreeLists sit at the bottom. Once I'd done that I dived back into the details again, this time going into ManagedBlockFile.

ManagedBlockFile is actually pretty simple, as it just wraps a BlockFile and a FreeList. It uses the FreeList to allocate and release Blocks inside the BlockFile. This is very important at higher levels, where blocks need to be re-used where possible, but should also be consistent from one phase to another.

Sitting on top of ManagedBlockFile come AVLTree and AVLNode. These manage the structure of Kowari's AVL trees, and define the whole phase-tree concept. All of the work is done in AVLNode, while AVLTree stores the root of the tree, and manages the phases. It was while describing the phases at this level that people finally started to "get it". We're planning on abandoning phase trees for the XA2 data store so that we can have multiple writers at once, but phase trees are still pretty cool. Once the concept was explained, the people who'd been frustrated at the single-writer restriction suddenly expressed admiration for the way the system works. It was DavidM's idea to use phase trees (not mine), but I did help write the code, so it was nice to get such positive feedback. :-)

Going through the details of AVLNode started getting quite laborious, so I think everyone was grateful when we got to the end of the week. Fortunately, by this point all of the utility classes had been described, meaning that I could get into the higher level architecture as soon as we got back on the following Monday.

Evenings
I had hoped to get a bit of programming done during my evenings, but somehow it didn't work out that way. One night was spent trying to organise flight changes at the end of my trip and doing laundry. Another evening was spent catching up with DavidW.

One of the most interesting evenings was spent over drinks with a couple of people from Mantech, including Greg who'd been paying me for the previous 6 months. This was my first chance to meet anyone from Mantech, so it was a real shame that the evening could not have gone on a little longer than it did. Working remotely reduces the amount of feedback you get, so I was pleased to hear that Greg was happy with my work to that point. You can run into periods of self doubt when working on your own (a phenomenon I've often heard PhD students complaining about), and it really helped to hear someone appreciating what I'd done.

The other thing I did in my evenings was to use an iSight camera and iChat to talk to Anne and Luc while they had breakfast. H.264 dynamically sacrifices clarity in order to keep the framerate up, which gives a nice, natural feeling interface. It doesn't really bother me (too much) when things go blurry, but systems that give clear pictures with large delays between them don't feel like you're in direct communication with the person at all. So I really like iChat, though there were occasions when the bandwidth into the house in Brisbane was painful. Telstra's broadband network does leave something to be desired on occasion.

I was missing Anne and Luc a lot, but being able to see them helped a little. Luc thought it was great too, though it took a while to convince him that this was not like the TV, and that he could interact with me. We finally managed it when Anne tickled the screen, and I squirmed like she was really tickling me. He thought that was hilarious. :-)

Second Weekend
Anticipating the coming weekend, I drove to Circuit City (Google Maps are great for finding that kind of thing), and bought a new camera. I'd forgotten to bring ours from Australia, and besides, it was starting to show signs of age. My criteria were the ability to take longish videos, a short delay from button press to shutter release, and a small size (so it's not awkward to take around as a tourist). I found what I wanted in the Casio Exilim EX-S500. Fortunately it wasn't too expensive (though I used up the last of my credit card to get it). It was nice enough to elicit a few comments from the people who saw it on Friday morning (I enjoy showing off new toys).

With work over on Friday, I had to drive down to Dulles (and I started right next door to BWI! Talk about frustrating!) to catch an evening flight to Pittsburgh. Despite doing the drive during peak hour traffic and in heavy rain (I was listening to local flash flood warnings coming over the radio), I made it with some time to spare. I then waited a couple of hours for a flight that was delayed due to bad weather. :-)

From Pittsburgh to Salem, OH, by car. From here I enjoyed a weekend of rural America with DavidW and his family. Despite some unforeseen problems, I enjoyed the weekend, and I might even try to make it back for the 200th anniversary parade next year. Then it was a road trip back to Virginia on Sunday, and a drive up to Baltimore on Monday morning for work.

So by Monday morning I was feeling pretty tired, and I had only just made it past the halfway point in the trip!

Tuesday, August 16, 2005

Trip
It's been a couple of weeks now, and the details are fading, but I'll give a brief run down on what happened while I was away...

Nothing Technical (yet)
Of course, the Brisbane to LA trip was the hard part. I woke up at 4:30am Wednesday morning with Luc being sick. I thought that his irritability was just a reaction to the natural stress of preparations for my trip, but he cheered up immensely as soon as he threw up all over me. :-)

I made the mistake of reading and watching movies for the first part of the trip (I got to see "Robots" at last). So by the time I decided to do some "work" on my notebook I was too tired to think clearly. Sure, I can work while I'm tired, but I'm only really effective if I'm already in the middle of something. Picking up the notebook from a cold start just didn't work. I did some Javadoc, but that was about it. After that I went back to books and movies. This continued into my flight from LA to D.C.

The Pacific flight was delayed a little, so I barely had time to clear customs and run across the airport to my next flight. I made the flight, but my luggage didn't. That was unpleasant when I got into Dulles. All I wanted to do was clean up, and get a change of clothes, but I had to wait for several hours first. By the time I landed it was 5:30pm on Wednesday (same day, since I'd crossed the dateline). I'd been in the air for over 18 hours, and I'd been up for 29 hours straight. I'm glad I don't do that trip too often!

The second day was at a relaxed pace, though it was a little depressing hearing about the bombings in London that morning. A couple of years ago I spent a month travelling in the London tube every day, and it's hard to imagine something so catastrophic imposing on such an everyday thing. I checked out of the hotel, went back to the airport and picked up a rental car (they "upgraded" me to an SUV. Yuck). Then I was off to David and Bern's place to use their internet and catch up on news. I also rang my brother Rowan, which ended up having consequences for the weekend.

That afternoon I drove to Baltimore through torrential rain, on roads I'd never seen before, on a side of the road that I'm not used to. Thank goodness for MapQuest (though now I've started converting to Google).

Finally, on Friday I started work. The centre was just a couple of hundred metres away, so I could walk down. The day was just an introduction, and an overview of what was to come over the coming weeks. We went over the directory structure of the project, what can be found at Sourceforge, etc.

Things like the directory structure were tricky to explain, because some of it is unintuitive, and I didn't agree with how it should be structured. It's kind of like debating, where you're given a point of view and you have to sound convincing when you defend it, regardless of whether you agree with it or not. I had to do that a few times while I was there. Sometimes the best argument is just that it was someone else's decision, and the benefits in changing it are not worth the effort required.

When I'd spoken to Rowan he reminded me that it was his 30th birthday that weekend, and asked if I could make it down to Houston to see him. I checked out prices, but it was all looking too expensive. When I rang him back to let him know I couldn't make it, I got my sister-in-law instead, and she was able to organise a cheap stand-by flight for me on Continental (where her Dad works). So my first weekend was spent in Houston. I cleared out of work and went to BWI with one of the guys from Florida (he was commuting every week).

At BWI I was singled out for special testing in a booth that shot me with puffs of air (presumably testing for the presence of nitrogen compounds). It was kind of interesting, and I got through the line just as fast as the people who weren't singled out... only I got to sit down while they looked through my bag. It ended up being less inconvenient for me than if I'd been left in line, since they did all the work searching my bag while I watched. I noticed that most people seemed happier than usual with the delays, probably because of the events in London two days before. As a PR exercise, the TSA seem to be doing a good job. I'll let others comment on their effectiveness.

Houston was fun, and the look on my brother's face when he saw me was priceless (he thought I couldn't come). I enjoy visiting new places, so I had a great time over the rest of the weekend. I should comment though... the quality of Houston's roads is TERRIBLE. I understand why, but you'd think they'd look for a solution.

Sunday night I flew back into BWI for work that week.

Project Structures
Since my task was to describe the storage code at the lowest levels in Kowari, I decided to go with a bottom-up approach. This had advantages and disadvantages, but I hope it worked out OK in the long run. Every so often (particularly when DavidW suggested that he could hear the sound of people's brains frying) I'd step back and try to give an overview, often coming back in with a top-down approach to where we were in the code.

One thing that would have been nice to have was a diagram of the file structures. That would have made it easier to see what the code was trying to accomplish. Unfortunately, I have no idea where the original diagrams would be (they'd be 4 years old by now, and DavidM and I never thought too much of them, as they were easy enough to work out again), and I did not have enough preparation time to write them out again (plus, I'd have missed a few things if I'd tried).

Most of the code was perfectly familiar, but while showing some of it I discovered a few new features I'd never seen before. It would normally take me a few minutes to work out what something was doing, and I probably floundered a little on occasion, but I'd get through it in the end. I comfort myself with the knowledge that I'm one of the only people who knew enough about the system to be capable of working these things out, even if I didn't know them when I started. By the time I finished presenting a section of the code I ended up understanding it better than I did when I started. Ideally I'd have had enough preparation time to be all over it, but that wasn't available, so I can't really kick myself for not knowing it perfectly. They got a few free hours out of me on most evenings as I refreshed myself on code in preparation for the following day.

Having documented this code, and presented it in nauseating detail, I came away with a very clear idea of every layer for the storage code. This allowed me to write up those diagrams on file structure very easily, during the last couple of days. A couple of people said that they would have liked to see them at the beginning (something I'd already lamented), but I think they all appreciated getting to see them in the end.

Now that I'm back in touch with this code, I've been thinking that I should write up an overview of most of it, and also provide the file structure diagrams. I've since explained it to a couple of people here in Brisbane, and I think I've structured my thoughts on it well enough to build some developer documentation on it. However, that will take some time to write up and I'm about to start a new job, so I won't commit to it all straight away. I'll do what I can.

Block Files and Free Lists
The first week was spent entirely in the util-xa package.

Starting at the bottom, this includes the BlockFile interface, along with AbstractBlockFile, MappedBlockFile and IOBlockFile. These handle files with fixed-length records called Blocks, and transparently hide whether the file is being accessed with normal I/O operations or is memory-mapped. Almost all files in Kowari are accessed with these classes, with the only exception being IntFile from the util package.
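
The shape of the abstraction is roughly the following (the method names are mine, for illustration, rather than the actual BlockFile interface):

```java
import java.nio.ByteBuffer;

/** Rough shape of a fixed-length block file; the method names are illustrative only. */
public interface SimpleBlockFile {
    int blockSize();                                // every block in the file has this length

    ByteBuffer readBlock(long blockId);             // the block starts at blockId * blockSize() bytes

    void writeBlock(long blockId, ByteBuffer data);

    long allocateBlock();                           // grow the file by one block and return its ID

    void force();                                   // flush to disk, e.g. before committing a phase
}
```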

On top of this file abstraction is built the FreeList class. This name is a little misleading, as it only describes a single aspect of the class. However, it is there for historical reasons and it's not going to change any time soon.

A FreeList is really a resource manager. Resources in this context are identified by a numerical ID. A FreeList will hand out resources when requested, and will collect resources when they are no longer in use. Resources will be recycled whenever possible (there are a lot of reasons for this, regardless of the resource type, but it always comes down to performance), and so the list of "freed" items has to be maintained in order to hand them out again when it is permissible to do so. This internal list of free items is where the class gets its name. A better name would be something like ResourceManager.

The list of free items is kept in a BlockFile. The blocks in this file form a circular, doubly-linked list. It is in this file that the concept of a Phase first appears, which is the basis of Kowari's transaction support. At this level, phases are handled as pointers to the first and last freed items within a phase, and a set of heuristics that manage which items may and may not be re-allocated, according to the currently open phases.
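
The key rule is the one described above: an item freed in some phase can only be handed out again once no open reading phase could still see it. Here's a much-simplified, in-memory model of that rule (the real FreeList is file-backed and considerably more sophisticated):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class SimpleFreeList {
    private record FreedItem(long id, long phaseFreed) {}

    private final Deque<FreedItem> freed = new ArrayDeque<>();  // stands in for the on-disk linked list
    private long nextNewId = 1;
    private long currentPhase = 0;
    private long oldestOpenPhase = 0;   // oldest phase still held open by a reader

    /** Re-uses an old item only if no open reading phase could still be using it. */
    public long allocate() {
        FreedItem head = freed.peekFirst();
        if (head != null && head.phaseFreed() < oldestOpenPhase) {
            freed.pollFirst();
            return head.id();
        }
        return nextNewId++;             // otherwise mint a fresh ID
    }

    /** Freed items are remembered along with the phase in which they were freed. */
    public void free(long id) {
        freed.addLast(new FreedItem(id, currentPhase));
    }

    public void newPhase() { currentPhase++; }

    public void readerReleasedPhase(long oldestStillOpen) { oldestOpenPhase = oldestStillOpen; }
}
```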

Phases don't make a lot of sense until the higher level phase trees are examined, but these rely on the functionality of FreeLists so much that it becomes difficult to describe phase trees without first describing FreeLists. I've tried to do a top down explanation to a few people in the past, but they invariably question how resources are managed between phases. Hence, my approach of describing phases from the ground up, starting with FreeLists.

I'll describe this all in more detail when I document the FreeList shortly. I'll just say here that this code took the best part of the first full week, and forms the basis of all the storage code in Kowari's XA datastore.

Saturday, August 06, 2005

Paper
Yesterday ended up being busier than I expected. After I helped Anne and Luc get out the door, I went to town to meet a friend (Hi Brad) for a discussion on Resolvers in Kowari. This took some time, as we also discussed several other things, like my recent trip, etc. Once I got home, I went straight to UQ to meet with Bob. Again, I discussed my trip, and then got down to some of the work I should be doing now that I'm back.

Given my upcoming work, I don't expect to have a lot of time for my research degree. So Bob suggested that I should try to get a paper out of the way as soon as possible. He has a point, so I thought I'd aim for the Australasian Ontology Workshop later this year. That gives me about 3 weeks to write something. Anything I write here should be a good starting point for a thesis too, so it will be useful in more ways than one.

My plan is to write about 6-way indexing of quads in RDF, and how this leads to easy constraint resolution. This forms the basis for set-at-a-time rule operations, which can then be used for OWL entailment and consistency checking (by extension of the KAON2 rule work). I'm busy tomorrow, but I'll start trying to write something on Monday.

While visiting the MIND lab, I was really impressed with the ontology debugging work in Swoop. I explained to Bob that while I expect my methods to quickly find an inconsistency, they will never provide enough information to track down complex problems in the same way that Swoop can. I thought that it might be worthwhile to use set-at-a-time processing to work with data on a larger scale, and then drop back to tableaux reasoning (or others) when a problem is discovered, so the source of the problem can be identified.

Funnily enough, Bob thought that this would be enough to write up as an MPhil thesis. I think it's funny, as I always thought Bob wanted me to do a lot more than I've been doing, and now he's suggesting that I need to do less. It's also funny, since I've always been much more interested in entailing new data quickly than in checking consistency. Consistency is important (OK, it's vital), but my purpose for this thesis has always been to entail new data efficiently. I find it ironic to be told that I don't need to do the part that I thought was more important (or more interesting).

Thursday, August 04, 2005

Recent Break
I haven't written in a while because of my recent trip to the US. Once I got there I was jet-lagged, but still had to get straight into work. Even after the jet-lag wore off, I was spending my days working, and my evenings were also kept very busy (meeting people, reading code and sleeping). I got back last week, but I've been so wrecked that I've avoided spending too much time on the computer. I've managed to get a few things done (mostly some Kowari support), but haven't yet taken the time to blog.

I have a lot to write at the moment. I want to:

  • Document some of what I was teaching in the US.
  • Write about what I learnt at MIND lab.
  • Talk about what I learnt about RDF and OWL over dinner with Jim Hendler.
  • Document some of my ideas for storing Java ASTs in RDF (including an API for manipulating the AST, and a dynamic class loader).
  • Talk about some of the questions I've been asked about Kowari, and how certain problems should be addressed.
  • Describe my visit to Chicago and Herzum Software.
  • Outline my future plans for work.
Unfortunately, writing about all of this just seems too intimidating to attempt in one hit. The only way I'll get through any of this material is to take it one piece at a time.

It's late right now, so tonight's entry won't cover anything of significance. I just thought it would be a good idea to write something, simply to make a start. I'm meeting with a couple of people tomorrow, but I should be able to get some time to record something useful.