Interfaces
Interface design appears to fall into two categories.
The first category is where a set of functionality is to be provided, or is available in some way, and the interfaces are then built to provide access to all of this functionality. These designs are typically obtuse, and force a developer to jump through obscure hoops for their own internal gratification. 90% of interfaces fall into this category.
The second category is where the designer thinks about the task to be performed, and then thinks about the code they would want to write in order to accomplish this task. Subsequent implementation may force some extra requirements into the design, but it is almost always possible to write implementations that largely fit the initial design.
Of course, the first category is built from the bottom up, and the second from the top down. Sometimes it is necessary to use the bottom up approach, particularly when providing access to a pre-existing system, but the results are almost never pretty. Unfortunately, given the difficulty of many interfaces available today, bottom up design seems to be the norm. I want to name names here, but with the exception of Microsoft (MFC, or COM+ anyone?) it hardly seems fair to the many hard working developers out there. (Actually, MFC doesn't seem to be a bottom up design. It's just full of weird inconsistencies, where 6 similar but slightly different tasks rely on 6 completely unrelated mechanisms.)
As an aside, when I say "interface", I'm not referring to Java interface definitions. I'm referring to the more general concept. These can be defined as Java interfaces, C++ headers, IDL files, XML descriptions, IUnknown querying, and more.
RDFS and OWL
Many interfaces I've seen for interacting with RDFS and OWL have been designed bottom up, often because they are trying to provide all of the functionality available in these languages. However, the result is usually very messy, and difficult to work with. For many of these interfaces, you really need to know OWL in order to use the interface.
However, the task that most developers want to achieve is often much simpler than anything that would require a complete knowledge of OWL. Typically, a developer will want to define two things: a model, and instance data.
- The model will generally involve a taxonomy of classes, each with their own specific fields. It will describe properties on those fields, such as data types, lists, which ones are key fields, which ones are optional, and so on. It will also describe relationships between the classes, and possibly restrictions on those relationships.
- The instance data will simply be a set of objects which each have a type defined in the model.
It should also be noted that OWL isn't the only way to do this. UML has done all of this for a long time now. This should be no surprise, as almost all features from each language (OWL and UML) can be mapped into a representation in the other. The exceptions are rarely employed and can be worked around (one of these exceptions is n-m associations in UML). However, UML is typically used statically at design time, and not dynamically at runtime. This makes sense, since UML is a closed world model, and a runtime system would need to allow temporarily incomplete systems while instance data was being built. OWL has a natural advantage in this regard.
RDFS/OWL Interfaces
When I needed a modeling interface, I made a conscious decision not to expose all of the OWL constructs, and to pick only what I needed. I reasoned that the underlying language would already support any new constructs I required, and trusted that it would be possible to make sensible additions to the interface if new requirements came along. My justification here is my experience that interface changes are usually trivial, whereas modifying an underlying construct is often difficult or impossible.
Once I made that choice, my next step was to work out just what I wanted to do. The list was short, comprising the class definition and object instantiation described above.
So what is the easiest way to describe each of these things? To me, a class definition is the name of the class, along with any inheritances it may have. It also contains a collection of fields. So a class definition should be a constructor which accepts a name, a list of other class definitions (or their names if I wanted to get into referencing classes before their definitions), and a list of fields. Fields would also require a name, a datatype (object or simple type), and some flags to indicate if they represented required data, a key field, and if they represented a list.
So was this approach useful? Well, other than being verbose (having to describe all the fields, and then construct the class definition), it seems easy to use, and has been quite successful in the code we've used it in.
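To make the shape of such a class definition concrete, here is a rough sketch in Java. The names FieldDef, ClassDef and ModelExample, and the use of plain strings for datatypes, are assumptions made for illustration only and are not the actual API described in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** One field of a class definition: a name, a datatype, and three flags. */
final class FieldDef {
    final String name;
    final String datatype;   // a simple type name, or the name of another class
    final boolean required;
    final boolean key;
    final boolean list;

    FieldDef(String name, String datatype, boolean required, boolean key, boolean list) {
        this.name = name;
        this.datatype = datatype;
        this.required = required;
        this.key = key;
        this.list = list;
    }
}

/** A class definition: a name, the definitions it inherits from, and its own fields. */
final class ClassDef {
    final String name;
    final List<ClassDef> parents;
    final List<FieldDef> fields;

    ClassDef(String name, List<ClassDef> parents, List<FieldDef> fields) {
        this.name = name;
        this.parents = Collections.unmodifiableList(new ArrayList<>(parents));
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    /** Inherited fields first, then the fields declared on this class. */
    List<FieldDef> allFields() {
        List<FieldDef> all = new ArrayList<>();
        for (ClassDef parent : parents) {
            all.addAll(parent.allFields());
        }
        all.addAll(fields);
        return all;
    }
}

/** Builds a hypothetical two-class taxonomy (Employee inherits from Person). */
final class ModelExample {
    static ClassDef employee() {
        ClassDef person = new ClassDef("Person",
                Collections.<ClassDef>emptyList(),
                Arrays.asList(
                        new FieldDef("id", "string", true, true, false),
                        new FieldDef("nickname", "string", false, false, true)));
        return new ClassDef("Employee",
                Collections.singletonList(person),
                Collections.singletonList(
                        new FieldDef("employer", "Company", true, false, false)));
    }
}
```

Even in this small example the verbosity is visible: every field needs its own constructor call with five arguments before the class definition itself can be built.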
A more interesting question has been object instantiation.
I decided to take a leaf out of Perl's book here, where objects are just a hashmap, keyed on field name. This works quite well, and provided a cheap way for me to get objects up and running. Wrapping the hashmap in a class that is given a copy of the class definition allows the data to be checked for consistency and completeness, and in some cases inferencing can be performed.
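A minimal sketch of that wrapper might look like the following. The class name ModelObject is hypothetical, and to keep the example self-contained the class definition is reduced to an assumed stand-in: a map from field name to whether the field is required. Inferencing is left out entirely.

```java
import java.util.HashMap;
import java.util.Map;

/** A hashmap-backed object that checks itself against its class definition. */
final class ModelObject {
    private final String typeName;
    // Stand-in for a full class definition: field name -> is the field required?
    private final Map<String, Boolean> requiredByField;
    private final Map<String, Object> values = new HashMap<>();

    ModelObject(String typeName, Map<String, Boolean> requiredByField) {
        this.typeName = typeName;
        this.requiredByField = new HashMap<>(requiredByField);
    }

    /** Consistency check: only fields named in the definition may be set. */
    ModelObject set(String field, Object value) {
        checkKnown(field);
        values.put(field, value);
        return this;
    }

    Object get(String field) {
        checkKnown(field);
        return values.get(field);
    }

    /** Completeness check: every required field must have a value. */
    boolean isComplete() {
        for (Map.Entry<String, Boolean> entry : requiredByField.entrySet()) {
            if (entry.getValue() && values.get(entry.getKey()) == null) {
                return false;
            }
        }
        return true;
    }

    private void checkKnown(String field) {
        if (!requiredByField.containsKey(field)) {
            throw new IllegalArgumentException(typeName + " has no field named " + field);
        }
    }
}
```

An object can be incomplete while it is being populated and is only checked for completeness on request, which fits the need to tolerate temporarily incomplete data at runtime.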
My only regret with this approach has been that it is verbose in a similar way to defining classes. The hashmap has to be created and fully populated, with each field taking its own verbose call for insertion. While easy, it isn't the way I would like to create these objects. Using the objects is similarly verbose, with all access going through
Thinking about it, the most obvious thing that I want to do here is to simply create the object with an inbuilt language constructor. In the case of Java, this means a call to new, with appropriate parameters. This is what UML would have provided, but the UML would have been translated into Java source code before compilation. What I'm looking for here is dynamic creation of the class.
Bytecode Libraries
This is where my interest in bytecode libraries like ASM and BCEL comes in. Using these libraries it is possible to turn a class definition into a Java class, that can be instantiated with a custom class loader.
Make the custom class loader the current class loader, and you could theoretically use the new keyword. However, you'd have to cheat by doing a bait-and-switch, where you let the compiler build against one instance of a class, but have the class loader provide your class instead. OK, so this is an unserviceable hack, but it's fun to know it's possible. Accessing fields isn't so easy though, and reflection is the only effective way.
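To show the general idea without pulling in a dependency, the sketch below does by hand a tiny part of what ASM or BCEL would do: it writes the bytes of a class file containing public fields and a no-argument constructor, loads it through a custom class loader, and then sets and reads a field by reflection. It does not use either library's API. The names TinyClassWriter, DefinitionClassLoader, BytecodeDemo and GeneratedPerson are all hypothetical, and a real implementation would rely on a bytecode library for anything beyond this trivial case.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Writes a minimal class file: public fields plus a public no-argument constructor. */
final class TinyClassWriter {
    /** fields maps a field name to its JVM descriptor, e.g. "I" or "Ljava/lang/String;". */
    static byte[] write(String className, Map<String, String> fields) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(0xCAFEBABE);                 // magic number
            out.writeShort(0);                        // minor version
            out.writeShort(52);                       // major version 52 (Java 8)
            out.writeShort(10 + 2 * fields.size());   // constant pool count (entries + 1)
            utf8(out, className);                     // #1
            out.writeByte(7); out.writeShort(1);      // #2 Class -> #1
            utf8(out, "java/lang/Object");            // #3
            out.writeByte(7); out.writeShort(3);      // #4 Class -> #3
            utf8(out, "<init>");                      // #5
            utf8(out, "()V");                         // #6
            out.writeByte(12); out.writeShort(5); out.writeShort(6); // #7 NameAndType
            out.writeByte(10); out.writeShort(4); out.writeShort(7); // #8 Methodref Object.<init>
            utf8(out, "Code");                        // #9
            for (Map.Entry<String, String> field : fields.entrySet()) {
                utf8(out, field.getKey());            // #10, #12, ... field names
                utf8(out, field.getValue());          // #11, #13, ... field descriptors
            }
            out.writeShort(0x0021);                   // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                        // this_class
            out.writeShort(4);                        // super_class
            out.writeShort(0);                        // no interfaces
            out.writeShort(fields.size());            // field count
            for (int i = 0; i < fields.size(); i++) {
                out.writeShort(0x0001);               // ACC_PUBLIC
                out.writeShort(10 + 2 * i);           // name index
                out.writeShort(11 + 2 * i);           // descriptor index
                out.writeShort(0);                    // no field attributes
            }
            out.writeShort(1);                        // one method: the constructor
            out.writeShort(0x0001);                   // ACC_PUBLIC
            out.writeShort(5);                        // name "<init>"
            out.writeShort(6);                        // descriptor "()V"
            out.writeShort(1);                        // one attribute: Code
            out.writeShort(9);                        // attribute name "Code"
            out.writeInt(17);                         // attribute length in bytes
            out.writeShort(1);                        // max_stack
            out.writeShort(1);                        // max_locals
            out.writeInt(5);                          // code length
            // aload_0; invokespecial #8 (Object.<init>); return
            out.write(new byte[] {0x2A, (byte) 0xB7, 0x00, 0x08, (byte) 0xB1});
            out.writeShort(0);                        // empty exception table
            out.writeShort(0);                        // no attributes on Code
            out.writeShort(0);                        // no class attributes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void utf8(DataOutputStream out, String text) throws IOException {
        out.writeByte(1);    // CONSTANT_Utf8 tag
        out.writeUTF(text);  // two-byte length followed by modified UTF-8
    }
}

/** Custom class loader that turns generated bytes into a loaded class. */
final class DefinitionClassLoader extends ClassLoader {
    Class<?> define(String name, byte[] classFile) {
        return defineClass(name, classFile, 0, classFile.length);
    }
}

final class BytecodeDemo {
    /** Generates a class, instantiates it, and round-trips a value through a field. */
    static Object roundTrip(String value) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("name", "Ljava/lang/String;");
        fields.put("age", "I");
        byte[] classFile = TinyClassWriter.write("GeneratedPerson", fields);
        Class<?> type = new DefinitionClassLoader().define("GeneratedPerson", classFile);
        try {
            Object person = type.getDeclaredConstructor().newInstance();
            type.getField("name").set(person, value);   // reflection is the only way in
            return type.getField("name").get(person);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the compiler has never seen GeneratedPerson, nothing in the calling code can name the class or its fields directly, which is why every access in roundTrip goes through reflection.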
A bigger problem is evolving class definitions over time. I've dynamically built classes in the past, but never tried to update a class after it has already been loaded once. I suppose the simplest way would be for each modification to get a new version ID that becomes a hidden part of the name, but that could lead to problems.
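The version ID idea could be sketched as a small registry that hands out a fresh generated name on every modification. The class name VersionedNames and the "$v" suffix are assumptions chosen for illustration, not an existing convention.

```java
import java.util.HashMap;
import java.util.Map;

/** Maps a logical class name to a generated name carrying a hidden version ID. */
final class VersionedNames {
    private final Map<String, Integer> versions = new HashMap<>();

    /** Call on every modification; returns the name the next generated class should use. */
    String nextName(String logicalName) {
        int version = versions.merge(logicalName, 1, Integer::sum);
        return logicalName + "$v" + version;
    }

    /** Name of the latest generated class, or null if none has been generated yet. */
    String currentName(String logicalName) {
        Integer version = versions.get(logicalName);
        return version == null ? null : logicalName + "$v" + version;
    }
}
```

One of the problems this leaves open is that objects created from an earlier version remain instances of the earlier class, which the JVM treats as unrelated to the newer one, so existing instance data would have to be migrated or left behind.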
A better language to do this in is Ruby. Ruby already lets you define classes at runtime. More importantly, it lets you update them at runtime as well. I'm still a Ruby beginner, and I know nothing about the VM, so most of my ideas are just that, but it seems like a good idea.
I haven't had the chance to work on any of this modeling code for some months now, but I'm hoping I'll get the chance again soon. Depending on what seems most "natural" at the time, I may get to do some interesting things yet.
I should point out that most of my API is based on RDFS, with just a little OWL (InverseFunctionalProperty, Transitive, cardinality). While this sounds very restrictive, these few constructs provide a great deal of functionality. I'm looking forward to applying a few more.