Tuesday, December 02, 2008

Size

Disk usage is probably the second most common question I get about Mulgara, after speed. So to complement the plots from Monday, I've also plotted the disk usage for these "number" graphs.

The lowest line represents the space being used for URIs and Literals. The upper line is for the statements themselves. For convenience, the top line is the sum of the other two.

This storage mechanism is doing no compression on the data whatsoever. The current code in XA2 is already using an order of magnitude less space, both because of more intelligent storage, and also because many blocks will be gzip compressed in our structures. Andrae's reasoning for that is that while CPUs are getting faster all the time, disks are not. This means that any processing we do on the data is essentially free, since the CPU can usually be done in less than the time it takes to wait for a hard drive to return a result, even a solid state drive.

I should note that these graphs are all on a version of XA1.1 that is not yet released (it's in SVN, but not in the trunk yet). I've been hoping to get this into the next release, but because I'm doing a release by the end of this week, then I'm thinking it will have to be in the release after (before Christmas).

No comments: