Thursday, December 29, 2005

I've always thought that domain squatting was an unethical way to make money, but had only heard stories of it before now.

Anne suggested that it might be nice to pick up the domain, if it was available. After all, I normally take up a lot of the first result page at Google when you type in my surname. (Just looked, and today I don't! I really need to blog more often.) OK, so I'm not a commercial entity, but everyone recognizes .com, while people often look at me funny when I say .org or .net. The .com thing has brand recognition.

So I had a look, and discovered that the domain is already registered, but it is for sale. It turns out that it's available for purchase through a bidding process. The minimum bidding price was way more than I'd have liked to spend, but it's my name, so why not?

A week later I discovered the bid was rejected. I asked Afternic what a decent price is supposed to be (according to their market analysis). Their response was that the current market value is $200. Still far too much, but it gave me confidence to ask the current domain holder how much they would like.

The answer? $2950. Quoting from the email:

Price is very low for a family name.

Huh? Whose family? The Rockefellers?

I didn't care about the domain all that much (it should probably go to a more commercial interest, like something run by Michael Gearon or Tierney Gearon), but registering a name and then charging to give it back to an owner of that name is a practice I find rather offensive. I suppose I should be grateful she was asking for $3000 and not $30,000.

I resolved it by registering for $8.20.

The other day I followed a link over to IBM's developerWorks, and found that they have an RSS feed for their tutorials. I was pleased to find a simple Python tutorial that I'm using to finally introduce myself to that language. But more importantly, I found a tutorial for generating a UIMA annotator.

The UIMA docs are very verbose, and a tutorial like this has been great for cutting through the noise. It's full of stuff I don't need (mostly because I've already learnt it from the official UIMA docs), but it has still been a real help.

My biggest problem at the moment is that UIMA wants all my annotations in character offsets. Unfortunately the library I'm using provides its information in word offsets. That's trivial to convert when words are separated by whitespace, but punctuation leads to all sorts of unexpected things, particularly since the grammar parser treats some punctuation marks as individual words, while others get merged into existing words.

I'm starting to wonder if I need to re-implement the parser so I know what the character offsets of each word will be. Either that, or I'll be doing lots of inefficient string searching. I don't find either prospect enticing. Maybe if I sleep on it I'll come up with something else.
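
For what it's worth, the string-searching option need not be that inefficient if each search starts where the previous word ended. Here is a minimal sketch in Python, assuming the parser hands back its words in order and with exactly the characters that appear in the original text (if it normalizes anything, this breaks). The function name word_spans and the sample sentence are made up for illustration; none of this is part of UIMA or of the parser.

```python
def word_spans(text, tokens):
    """Map each parser token to a (start, end) character span in text.

    Assumes tokens appear in order and verbatim in text.
    """
    spans = []
    cursor = 0  # never search behind the end of the previous token
    for token in tokens:
        start = text.find(token, cursor)
        if start == -1:
            raise ValueError(f"token {token!r} not found after offset {cursor}")
        end = start + len(token)
        spans.append((start, end))
        cursor = end
    return spans


# Hypothetical parser output: some punctuation split off, some merged.
text = "Hello, world! It's a test."
tokens = ["Hello", ",", "world", "!", "It's", "a", "test."]
spans = word_spans(text, tokens)
```

Since the cursor only moves forward, the text is scanned roughly once in total instead of once per word, and a word offset then becomes a simple lookup into spans.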
