Tuesday, August 28, 2012

Tnetstrings

A Forward Reference

Having had a couple of responses to the last post, I couldn't help but revisit the code. There's not much to report, but since I've had some feedback from a few people who are new to Clojure I thought that there were a couple of things that I could mention.

First (and you can see this in the comments to the previous post), I wrote my original code in a REPL and pasted it into the blog. Unfortunately, this caused me to miss a forward reference to the parse-t function. On my first iteration of the code, I wasn't trying to parse all the data types, so the map of parsers didn't need to recurse into the parse-t function. However, when I updated the map in the REPL, the parse-t function had already been fully defined, so the references worked just fine. Loaded from a file in the order I posted it, the same code fails, because the map refers to parse-t before it exists.
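For anyone new to Clojure, the usual way to handle a forward reference is declare. What follows is a minimal sketch rather than the tnetstrings code itself: the names parsers and parse-item, and the toy data they work on, are made up for illustration.

```clojure
;; Hypothetical stand-in for the real parser, not the tnetstrings code.
;; declare creates the var up front so that code appearing earlier in the
;; file can refer to it before it is defined.
(declare parse-item)

;; Without the declare above, compiling this map fails with
;; "Unable to resolve symbol: parse-item in this context".
(def parsers
  {:number (fn [s] (Long/parseLong s))
   :list   (fn [items] (mapv parse-item items))})

;; The real definition arrives later and recurses through the map.
(defn parse-item [x]
  (if (vector? x)
    ((parsers :list) x)
    ((parsers :number) x)))

(parse-item ["1" ["2" "3"]])  ; => [1 [2 3]]
```

In a REPL session the problem stays hidden for exactly the reason described above: once the function has been defined, re-evaluating the map resolves the symbol without complaint.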

Testing

That brings me to my second and third points: testing and Leiningen. As is often the case, I found the issues by writing and running tests. Setting up an environment for tests can be annoying for some systems, particularly for such a simple function. However, using Leiningen makes it very easy. The entire project was built using Leiningen, and was set up with the simple command:
  lein new tnetstrings
I'll get on to Leiningen in a moment, but for now I'll stick with the tests.

Clojure tests are easy to set up and use. They are based on a DSL built out of a set of macros that are defined in clojure.test. The two main macros are deftest and is. deftest is used to define a test in the same way that defn is used to define a function, sans a parameter definition. In fact, a test is a function, and can be called directly (it takes no parameters). This is very useful to run an individual test from a REPL.

The other main macro is called "is" and is simply used to assert the truth of something. This macro is used inside a test.
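As a rough illustration of the two macros, here is a tiny standalone test. The namespace example.test and the test name test-addition are invented for this sketch, and the use clause follows the same style as the one Leiningen generated for me.

```clojure
;; Hypothetical namespace and test name, for illustration only.
(ns example.test
  (:use [clojure.test]))

;; deftest looks like defn without a parameter vector.
(deftest test-addition
  (is (= 4 (+ 2 2)))
  (is (= "ab" (str "a" "b"))))

;; A test is a zero-argument function, so it can be called directly,
;; which is handy for running a single test at the REPL.
(test-addition)

;; run-tests runs every test in the named namespace and prints a summary.
(run-tests 'example.test)
```

Any failing is form is reported with its expected and actual values.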

My tests for tnetstrings are very simple:

(ns tnetstrings.test.core
  (:use [tnetstrings.core :only [parse-t]]
        [clojure.test]))

(deftest test-single
  (is (= ["hello" ""] (parse-t "5:hello,")))
  (is (= [42 ""] (parse-t "2:42#")))
  (is (= [3.14 ""] (parse-t "4:3.14^")))
  (is (= [true ""] (parse-t "4:true!")))
  (is (= [nil ""] (parse-t "0:~"))))

(deftest test-compound
  (is (= [["hello" 42] ""] (parse-t "13:5:hello,2:42#]")))
  (is (= [{"hello" 42} ""] (parse-t "13:5:hello,2:42#}")))
  (is (= [{"pi" 3.14, "hello" 42} ""] (parse-t "25:5:hello,2:42#2:pi,4:3.14^}"))))


Note that I've brought in the tnetstrings.core namespace (my source code), and only referenced the parse-t function. I always try to list the specific functions I want in a use clause, though I'm not usually so particular when writing test code. You'll also see clojure.test. As mentioned, this is necessary for the deftest and is macros. It is worth pointing out that both of these use clauses were automatically generated for me by Leiningen, along with the first deftest.

I could have created a convenience function that just extracted the first element out of the returned tuple, thereby making the tests more concise. However, I intentionally tested the entire tuple, to ensure that nothing was being left at the end. I ought to create a string with some garbage at the end as well, to see that being returned, but the array and map tests have this built in... and I was being lazy.

Something else that caught me out was that when I parsed a floating point number, I did it with java.lang.Float/parseFloat. This worked fine, but by default Clojure uses double values instead, and all floating point literals are read as doubles. Consequently the tests around "4:3.14^" failed with messages like:

expected: (= [3.14 ""] (parse-t "4:3.14^"))
  actual: (not (= [3.14 ""] [3.14 ""]))

What isn't being shown here is that the two values of 3.14 have different types (float vs. double). Since Clojure prefers double, I changed the parser to use java.lang.Double/parseDouble and the problem was fixed.
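The mismatch is easy to reproduce at the REPL without the parser at all. In this small sketch the var names as-float and as-double are just for illustration:

```clojure
;; Illustrative names; the parse calls are the standard Java ones.
(def as-float  (Float/parseFloat "3.14"))
(def as-double (Double/parseDouble "3.14"))

(class as-float)     ; java.lang.Float
(class 3.14)         ; java.lang.Double, since literals are read as doubles

;; Both values print as 3.14, but the float widens to a slightly
;; different double, so the comparison fails.
(= as-float 3.14)    ; => false
(= as-double 3.14)   ; => true
```

This is why the failure message shows two identical-looking vectors: the printed form hides the difference in type and precision.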

Leiningen

For anyone unfamiliar with Leiningen, here is a brief rundown of what it does. By running the new command, Leiningen sets up a directory structure and a number of stub files for a project. By default, two of these directories are src/ and test/. Under src/ you'll find a stub source file (complete with namespace definition) for the main source code, and under test/ you'll find a stub test file, again with the namespace defined, and with clojure.test already brought in for you. In my case, these two files were:

  • src/tnetstrings/core.clj
  • test/tnetstrings/test/core.clj

To get running, all you have to do is put your code into the src/ file, and put your tests into the test/ file. Once this is done, you use the command:
  lein test
to run the tests. Clojure gets compiled as it is run, so any problems in syntax and grammar can be found this way as well.

However, one of the biggest advantages of using this build environment is the ease of bringing in libraries. Using Leiningen can be similar to using Maven, without much of the pain, and indeed, Leiningen even offers a pom command to generate a Maven POM file. It automatically downloads packages from both Clojars and Maven repositories, so this feature alone makes it valuable.

Leiningen is configured with a file called project.clj, which is autogenerated when a project is created. This file is relatively easy to configure for simple things, so rather than delving into it here, I'll let anyone new to the system go to the project page and sample file to learn more about it.

project.clj also works for some not-so-simple setups, but it gets more and more difficult the fancier it gets. It's relatively easy to update the source path, test path, etc, to mimic Maven directory structures, which can be useful, since the Maven structure allows different file types (e.g. Java sources, resources) to be stored in different directories. But since I always want this, it's annoying that I always have to manually configure it.
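As a sketch of the kind of configuration I mean, a Maven-style layout might look something like the following. This assumes the plural key names used by Leiningen 2 (older 1.x releases named these keys differently), and the version numbers and directory names are placeholders rather than a copy of my actual project file.

```clojure
;; Illustrative project.clj, assuming Leiningen 2 key names.
;; Versions and paths are placeholders.
(defproject tnetstrings "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.4.0"]]
  ;; Clojure and Java sources kept in separate Maven-style directories
  :source-paths      ["src/main/clojure"]
  :java-source-paths ["src/main/java"]
  :resource-paths    ["src/main/resources"]
  :test-paths        ["src/test/clojure"])
```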

I'm also in the process of copying Alex Hall's setup for pre-compiling Antlr parser definitions so that I can do the same with Beaver. Again, it's great that I can do this with Leiningen, but it's annoying to do so. I shouldn't be too harsh though, as the way that extensions are done looks like it derives more from the flexibility of Clojure than from Leiningen itself.
