XOM Design Principles


Table of Contents

Design Goals
  Absolutely correct
  Easy to use
  Easy to learn
  Fast enough
  Small enough
  No gotchas
Design Principles
  As simple as it can be and no simpler!
  Use Java idioms where they fit
  There’s exactly one way to do it
  Start small and grow as necessary
Principles of API Design
  APIs are written by experts for non-experts
  It is the class’s responsibility to enforce its invariants
  Verify preconditions
  Do not allow clients to do bad things.
  Hide as much of the implementation as possible.
  Design for subclassing or prohibit it
  Prefer classes to interfaces
XML Principles
  All objects can be written as well-formed XML text
  It is impossible to create malformed documents
  Validity can be enforced by subclasses
  Syntax sugar is not represented
Java Principles
  Not thread safe
  Classes do not implement Serializable; use XML.
  Classes do not implement Cloneable; use copy constructors.
  Lack of generics really hurts in the Collections API. Hence, don’t use it.
  Problems detectable in testing throw runtime exceptions
  Assertions that can be turned off are pointless
  Setter and mutator methods return void
Development Style
  This is a cathedral, not a bazaar
  Unit testing
  Static testing
  Massive samples

I have a very clear vision for XOM based on certain principles for both XML and Java. If you want to request new features or contribute to its development, you should understand these first. They go a long way toward explaining why XOM takes the path it does.

The ultimate goal of course is an API that’s useful for processing XML with Java. However, there are many different ways to design such a thing, and sometimes you have to make trade-offs. The following are the basic goals for this API:

As Donald Knuth wrote in 1974, “premature optimization is the root of all evil.”[1] XOM tries to be fast enough for most common uses. It certainly won’t be as fast as SAX for many uses because it depends on an underlying SAX parser to read the document. XOM deliberately trades speed for convenience. It is focused on programmer productivity, not minimum execution time.

That said, I will of course look for opportunities to significantly optimize XOM provided it affects neither the public API nor the fundamental correctness of XOM’s XML handling.

Clients occasionally request the ability to turn off the checks, either because they feel they don't need them or because they have a special case where they actually need to generate incorrect data. Don't let them!

Clients that think they don't need the checks are almost always wrong. Everyone (library authors included) makes mistakes. Throwing out the safety net is not a good idea.

The usual reason given for turning off the checks is "performance". Interestingly, such requests rarely come accompanied by any actual measurements demonstrating that the checks are a significant part of the performance issues the client is encountering. As Donald Knuth wrote, "premature optimization is the root of all evil in programming". XOM is quite fast, and has been extensively optimized. While more optimization remains to be done, it is already faster than many competing libraries that do not perform as extensive verification. Speed issues should be addressed by profiling and optimization, not by removing functionality and safety.

The second reason given for turning off the checks is that the client wants to process or generate bad data. They may indeed want this, but they shouldn't. Furthermore the library should not let them. The library authors have a responsibility not just to the one client of the library but to all clients of the library. Furthermore, they have a responsibility to all users of the underlying technology not to break it. The library must not pollute the ecosystem on which the library depends. A TCP/IP library should not allow malformed IP packets to be sent to the network. An XML library should not allow malformed XML to be created. The library must encourage correct use of the underlying technology and discourage incorrect use.

Think of it this way: it's like designing a car to run on the public roads. It's not just the driver of the car you have to worry about. You also have a responsibility to other drivers, pedestrians, insurance companies, and indeed everyone who breathes on this planet. Their needs sometimes outweigh the needs or desires of the driver. Now if the driver opens up the hood and manually disables the emission control system and various safety features, there's not much you can do about it, but you're not responsible for it either. But if you ship a car that spews noxious emissions and injures pedestrians as soon as the buyer drives it off the lot, you are responsible for that. If the car only does this when driven 75 mph or faster, you're still responsible for that, even if you put a warning in the owner's manual and the fine print of the sales contract saying, "Don't drive faster than 75 mph." It's much better to design the car so that proper operation doesn't rely on unenforced limits.

A direct result of many of the above principles is that a library such as XOM should be developed using classes rather than interfaces. Interfaces simply cannot fulfill many of the requirements of a useful library.

There are two primary and pretty much unrelated reasons XOM relies on classes rather than interfaces:

Interfaces add an additional layer of indirection in programming. While additional layers of indirection can solve many problems, I've observed that they often lead to much confusion among programmers. A significant chunk of JDOM’s ease-of-use relative to DOM comes from using classes and constructors rather than interfaces. Certainly some programmers are comfortable with this level of indirection, but I think they're a minority. They may well be the smartest and most talented programmers, but I still think they're a minority. Most programmers in my experience are much more comfortable with concrete, direct APIs.

There are also interface problems that affect everyone. The biggest is that it is difficult for interface-based code to determine which class it is actually using. In the ideal world, this shouldn’t matter. Any implementation of the interface should be able to take the place of any other. In practice this simply isn’t true. For instance, almost no DOM implementation is willing to accept or operate on nodes created by a different implementation. In my own work, I repeatedly encounter problems because TrAX loads a different XSLT processor than I was expecting. SAX is a little more stable, but I still often need to choose a particular parser rather than accepting any implementation of XMLReader. The claim of implementation independence is simply not reliable in practice. Like Java itself, programs that process XML and are based on interfaces are very much write once, test everywhere.

The second issue is even more important. Interfaces cannot verify constraints on an object. There is no way to assert in an interface that the name of an element must be a legal XML 1.0 name, or that the text content of an element cannot contain nulls and unmatched halves of surrogate pairs. You must rely on the good faith of implementers not to violate such important preconditions. My experience with DOM has taught me that this is not a sensible bet. In the DOM world, implementations routinely fail to check the constraints DOM requires them to check. Implementations routinely fail to behave as DOM requires them to behave. Sometimes this is out of ignorance. Sometimes it’s a deliberate and knowing choice. Neither case is acceptable to me. If you're using XOM, you're guaranteed well-formedness, even with subclasses. I can only make that guarantee by using concrete classes that include final, non-bypassable code to check all constraints. If you can find a way to make XOM generate a malformed document, even by subclassing, then it’s a bug and I want to know about it so I can fix it.
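To make the contrast concrete, here is a minimal sketch. The names `Named`, `SloppyNamed`, and `CheckedName` are hypothetical and are not part of XOM, and the name test is a deliberately simplified stand-in for the real XML 1.0 name rules.

```java
// Hypothetical illustration; none of these types exist in XOM.
interface Named {
    // The rule "name must be a legal XML name" can only be stated in a comment.
    void setName(String name);
    String getName();
}

// Nothing stops an implementation from ignoring the documented rule.
class SloppyNamed implements Named {
    private String name;
    public void setName(String name) { this.name = name; } // no check at all
    public String getName() { return name; }
}

// A concrete class can enforce the rule in code that subclasses cannot override.
class CheckedName {
    private String name;

    public CheckedName(String name) {
        setName(name);
    }

    public final void setName(String name) {
        if (!isPlausibleName(name)) {
            throw new IllegalArgumentException("Not a legal name: " + name);
        }
        this.name = name;
    }

    public final String getName() {
        return name;
    }

    // Simplified assumption: a letter or underscore, then letters, digits, '_', '-' or '.'.
    private static boolean isPlausibleName(String s) {
        if (s == null || s.isEmpty()) return false;
        char first = s.charAt(0);
        if (!(Character.isLetter(first) || first == '_')) return false;
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean ok = Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
            if (!ok) return false;
        }
        return true;
    }
}
```

With the interface, `new SloppyNamed().setName("not a <name>")` succeeds silently. With the class, `new CheckedName("not a <name>")` throws, and no subclass can change that because `setName` is final.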

Let me also address a non-issue for XOM: different implementations. The classic use-case for interfaces instead of classes is to support different implementations of the same API, multiple SAX parsers for example. In the context of XOM, this would most likely mean using a different storage backend; for instance, a native XML database instead of strings in memory. This is an interesting use-case but it is one I have chosen not to support precisely because I cannot figure out how to reconcile it with the requirement that XOM guarantee well-formedness. If somebody invented a means of plugging in different storage engines without allowing well-formedness checks to be removed or bypassed, I'd consider it; but well-formedness comes first. If it isn’t well-formed it isn’t XML, and XOM is an XML API.

I also think that the proper interchange format between different systems such as a native XML database and a custom application is real XML, not a DOM object, not a XOM object, not an Infoset, but real XML. Different local contexts will have different needs. One API will not suit them all. We can let a thousand incompatible APIs bloom, as long as they all talk to each other by sending and receiving well-formed XML. On the other hand, subsetting or supersetting XML is an interoperability disaster. It is precisely this that XOM’s draconian focus on well-formedness is designed to avoid.

The Cloneable interface is a huge honking mess. It tries to implement the mix-in pattern in Java, but fails. Just because a class implements Cloneable is no guarantee you can actually clone it, or even that it has a publicly accessible clone method. (On this point, see [Effective Java, Joshua Bloch, Addison-Wesley, 2001, ISBN 0-201-31005-8, pp. 45-52.])

Consequently XOM classes do not implement Cloneable. Instead each node class provides a copy constructor and a copy method. For example, the following code clones an Element object e:

Element copy = new Element(e);

If you have a node whose more specific type is unknown, you can use the copy method instead:

Node copy = node.copy();
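How a copy constructor and a polymorphic copy method can work together is sketched below. This is not XOM's implementation: `DemoNode`, `DemoText`, and `DemoElement` are hypothetical stand-ins, and the sketch assumes that copies should be deep.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for a node hierarchy; not XOM's classes.
abstract class DemoNode {
    // Each subclass returns a copy of its own most specific type.
    public abstract DemoNode copy();
}

class DemoText extends DemoNode {
    private final String value;

    public DemoText(String value) { this.value = value; }

    // Copy constructor.
    public DemoText(DemoText other) { this(other.value); }

    public String getValue() { return value; }

    @Override
    public DemoNode copy() { return new DemoText(this); }
}

class DemoElement extends DemoNode {
    private final String name;
    private final List<DemoNode> children = new ArrayList<>();

    public DemoElement(String name) { this.name = name; }

    // Copy constructor: copies each child so the two trees share no nodes.
    public DemoElement(DemoElement other) {
        this.name = other.name;
        for (DemoNode child : other.children) {
            children.add(child.copy());
        }
    }

    public void appendChild(DemoNode child) { children.add(child); }
    public DemoNode getChild(int index) { return children.get(index); }
    public int getChildCount() { return children.size(); }
    public String getName() { return name; }

    // Delegates to the copy constructor, so there is one copying code path.
    @Override
    public DemoNode copy() { return new DemoElement(this); }
}
```

In this sketch the caller picks the copy constructor when the static type is known and `copy()` when it only holds a `DemoNode` reference. Neither path involves `Cloneable` or `Object.clone()`.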

I got the idea of throwing runtime exceptions for problems detectable in testing from Bruce Eckel (Does Java need Checked Exceptions?[2]) and Joshua Bloch ([Effective Java], particularly Item 40, Use checked exceptions for recoverable conditions and runtime exceptions for programming errors[3], and Item 41, Avoid unnecessary use of checked exceptions[4]). Bloch is especially clear that precondition violations should cause runtime exceptions. (Think of IllegalArgumentException.) Most of the time such an exception indicates a programming error. Thus rather than putting in a try-catch block to handle the exception, the programmer should fix the mistake that led to it in the first place. Assuming the programmer has fixed the mistake, there’s no reason to catch the exception because it won’t be thrown. And if the programmer’s wrong, and they haven’t fixed their mistake, then they should learn about it as soon as possible rather than having the exception get lost in an empty catch block added just to make the compiler shut up about an uncaught exception.

Most of the runtime exceptions in XOM occur as a result of precondition violations, for instance passing a string to setName that is not a legal XML name. This will likely happen every time the program is run, or every time the program executes a particular section of code, rather than depending on input or temporary conditions. This isn’t always true. For instance, a GUI might ask a user to type in an element name and then pass that string to setName. Whether or not the exception is thrown would then depend on what the user typed. However, I think far more often than not such a precondition violation is internal to the program, and thus should be caught by testing.

On the other hand, not all problems are like this. For instance, a ParsingException is thrown when an external document is discovered to be malformed. There is no way to predict this in advance because the document is not part of the program itself (unlike the string passed to setName). Whether or not the exception is thrown depends completely on which document you're parsing, and the document may not even exist at compile time. There’s no way to tell if it’s well-formed or not until run time. Hence ParsingException is a checked exception.
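The split between the two kinds of exception can be sketched as follows. Everything here is hypothetical: `BadNameException`, `MalformedInputException`, and the toy parser (which accepts only text of the form `<name/>`) are illustrations and are not XOM classes.

```java
// Unchecked: signals a precondition violation, i.e. a bug in the calling program.
class BadNameException extends RuntimeException {
    BadNameException(String message) { super(message); }
}

// Checked: signals a problem in external input that no amount of testing can rule out.
class MalformedInputException extends Exception {
    MalformedInputException(String message) { super(message); }
}

class ExceptionPolicyDemo {
    // Simplified assumption about what counts as a legal name.
    static boolean isLegal(String name) {
        return name != null && !name.isEmpty() && name.indexOf(' ') < 0;
    }

    // Callers pass names that are part of the program, so a bad one is a bug.
    static void requireLegalName(String name) {
        if (!isLegal(name)) {
            throw new BadNameException("Illegal name: " + name);
        }
    }

    // The document comes from outside the program, so the failure is checked.
    static String parseEmptyElement(String document) throws MalformedInputException {
        if (document == null || document.length() < 4
                || !document.startsWith("<") || !document.endsWith("/>")) {
            throw new MalformedInputException("Malformed document: " + document);
        }
        String name = document.substring(1, document.length() - 2);
        if (!isLegal(name)) {
            throw new MalformedInputException("Malformed element name: " + name);
        }
        return name;
    }

    public static void main(String[] args) {
        // No try-catch: a literal name is either right or a mistake to fix.
        requireLegalName("para");

        // The compiler forces the caller to decide what to do about bad input.
        String input = args.length > 0 ? args[0] : "<br/>";
        try {
            System.out.println(parseEmptyElement(input));
        } catch (MalformedInputException ex) {
            System.err.println(ex.getMessage());
        }
    }
}
```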

In Java 1.4 you can use a command line flag to disable assertion checking. However, if an assertion is violated, it’s still an error, just one you no longer notice. Turning off assertions at runtime is like including airbags in a new car model during design and street testing, then removing them before you begin selling the cars to consumers. No matter how rigorously you test, the users of your library will encounter situations and uncover bugs you did not find in testing. As Rolf Howarth wrote on the java-dev mailing list back in February, 2002:

Consequently, I decided not to rely on Java’s new assertion mechanism for precondition checking in XOM. Instead, each method that sets or changes some value verifies all preconditions explicitly and throws a runtime exception (normally a subclass of XMLException) if it detects a problem.
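The difference between the two approaches is easy to see in a small sketch. The `Counter` class is hypothetical, and `IllegalArgumentException` stands in for the XMLException subclasses mentioned above.

```java
// Hypothetical class contrasting the assert statement with an explicit check.
class Counter {
    private int value;

    // The check runs only when the JVM has assertions enabled (for example
    // with the -ea flag). Otherwise a negative value is stored silently.
    void setWithAssert(int newValue) {
        assert newValue >= 0 : "value must be non-negative";
        this.value = newValue;
    }

    // The check runs on every call, however the JVM was started.
    void setChecked(int newValue) {
        if (newValue < 0) {
            throw new IllegalArgumentException("value must be non-negative: " + newValue);
        }
        this.value = newValue;
    }

    int get() {
        return value;
    }
}
```

Whether `setWithAssert(-1)` fails depends on a launch option chosen by whoever runs the program, while `setChecked(-1)` fails everywhere.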

Furthermore, many methods are declared final to prevent subclasses from turning off this checking. Subclasses can override the various protected check methods to add assertions of their own, but they cannot remove the assertions in the core classes.
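One way to arrange this is a final setter that runs the core check and then calls a protected hook. The sketch below assumes that shape. `BaseElement`, `LowercaseElement`, and `checkName` are hypothetical names, and the core name rule is simplified.

```java
import java.util.Locale;

// Hypothetical base class: the core check is final, the hook is overridable.
class BaseElement {
    private String name;

    public BaseElement(String name) {
        // Calls the hook during construction, so hook implementations
        // should not depend on subclass fields.
        setName(name);
    }

    public final void setName(String name) {
        // Core check: no subclass can remove it.
        if (name == null || name.isEmpty() || name.indexOf(' ') >= 0) {
            throw new IllegalArgumentException("Illegal name: " + name);
        }
        // Subclass hook: may reject more names, cannot accept fewer checks.
        checkName(name);
        this.name = name;
    }

    public final String getName() {
        return name;
    }

    // By default, adds no further restrictions.
    protected void checkName(String name) {
    }
}

// A subclass that adds a stricter rule of its own.
class LowercaseElement extends BaseElement {
    public LowercaseElement(String name) {
        super(name);
    }

    @Override
    protected void checkName(String name) {
        if (!name.equals(name.toLowerCase(Locale.ROOT))) {
            throw new IllegalArgumentException("Name must be lowercase: " + name);
        }
    }
}
```

Here `new BaseElement("Para")` succeeds and `new LowercaseElement("Para")` throws, but `new LowercaseElement("a b")` still fails the core check that the subclass has no means to switch off.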

Simple, clean, easy-to-use APIs eschew side-effects. Each method call should perform one complete, logically unified operation. It should do as much as the operation requires and no more. This knife cuts in both directions. A public method should not do less than one operation (which would leave an object in an inconsistent state), but it should not do more either. Getting, setting, incrementing, and changing an object are logically distinct operations. In most cases, there is no semantic reason for a setter or mutator method to return a value. These methods should return void.

Partially, this is based on my experience with JavaBeans, where returning void is necessary for a method to be recognized as a property setter. However, more importantly, it’s semantically the right thing to do. By invoking a setter method (or any adder or mutator method) you are changing an existing object. You are not creating a new object. You are not getting a reference to some value that you did not have before. Nothing has been created that did not exist before. There is no justification for returning anything.

Some APIs such as JDOM have setter methods that return the object whose property was set. This enables code to use method call chaining. For example,

However, this is logically unjustified. It complicates the API and muddies the semantics of the method in exchange for syntactic convenience. It confuses the levels.

Even syntactically, I don’t like this style of coding. It’s often unclear, even with good indentation, which nodes are being added where. Multilevel hierarchies rely too much on parentheses to specify what goes where. When the statement has executed you often don’t have reference variables pointing to the nodes you need. Plus it's harder to debug when single stepping through the code, because each statement does many separate tasks, and return values are not stored in inspectable intermediate variables.

Dividing this statement up into multiple relatively atomic operations makes the code cleaner, easier to read, easier to understand, and easier to debug. Yes, there are classes in the Java class library that don’t operate this way, most notably StringBuffer. Perhaps this makes sense for StringBuffer, where it’s basically the equivalent of the + operator. Honestly though, this is really just a sop thrown to performance to avoid allocating lots of extra strings. Logically, the plus operator should do what it does for numbers: return a new object that is neither of its operands.
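A sketch of the step-by-step style follows. The `Elem` class is hypothetical, is neither XOM's nor JDOM's API, and skips escaping of text content for brevity. Its mutators return void, so each statement does exactly one thing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal element whose mutators return void.
class Elem {
    private final String name;
    private final List<Object> children = new ArrayList<>();

    Elem(String name) { this.name = name; }

    void appendChild(Elem child) { children.add(child); }

    void appendChild(String text) { children.add(text); }

    // Serializes the subtree; text is written as-is (no escaping in this sketch).
    String toXML() {
        StringBuilder sb = new StringBuilder();
        sb.append('<');
        sb.append(name);
        sb.append('>');
        for (Object child : children) {
            if (child instanceof Elem) {
                sb.append(((Elem) child).toXML());
            } else {
                sb.append(child);
            }
        }
        sb.append("</");
        sb.append(name);
        sb.append('>');
        return sb.toString();
    }
}

class BuildDemo {
    static Elem buildGreeting() {
        // Every node gets a named variable that a debugger can inspect.
        Elem root = new Elem("greeting");

        Elem to = new Elem("to");
        to.appendChild("World");
        root.appendChild(to);

        Elem body = new Elem("body");
        body.appendChild("Hello");
        root.appendChild(body);

        return root;
    }

    public static void main(String[] args) {
        // prints <greeting><to>World</to><body>Hello</body></greeting>
        System.out.println(buildGreeting().toXML());
    }
}
```

After `buildGreeting` has added a node, the variable that refers to it is still in scope, and a breakpoint on any line stops between two distinct operations.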

But even if I liked this style of coding, I'd still think that a setter method that returns the object whose property was set was semantically wrong. There is simply no logical justification for a setter or mutator method returning the object itself. It is a crutch designed to support a particular programming idiom, but that idiom is neither necessary nor helpful. Method call chaining is not an improvement.

Unit testing is essential for finding bugs. XOM currently includes several dozen test classes, containing over 700 test methods, which probably test several thousand different things. A few tests really stretch the boundaries of the notion of unit test. For example, one test runs the entire XInclude test suite, and two others canonicalize all well-formed documents in the XML 1.0 test suite, both with and without comments. This has been essential for finding and fixing bugs, as well as for making changes without breaking things. Tests are present to verify correct behavior, and to make sure the right exceptions are thrown when incorrect behavior is encountered. JUnit is the testing framework of choice.

Code coverage tools have been used to verify that the test suite is indeed exercising the entire code base. Clover and Jester have been particularly helpful in this regard. Both have uncovered numerous bugs in previously untested code. They have also resulted in elimination of substantial chunks of dead, unreachable code.

Unit testing has also been found to make debugging much easier. The most significant contribution is a convenient place to put a small test that exposes any newly reported bug. This allows one to set a break point and begin single stepping through the right section of code with the right conditions when the problem is not immediately obvious.

As part of the design of the XOM API, I reimplemented almost every example from Processing XML with Java in XOM. (A few utility classes that were very specific to a particular API, and didn’t really make sense for XOM, were omitted.) This helped me see what I had left out as well as what I had included that wasn’t actually necessary.



[1] Donald Knuth, “Structured Programming with go to Statements”, Computing Surveys 6 (1974): 261-301

[3] “Use checked exceptions for recoverable conditions and runtime exceptions for programming errors”, Effective Java, Joshua Bloch, Addison-Wesley, 2001, ISBN 0-201-31005-8, pp. 172-173.
[4] “Avoid unnecessary use of checked exceptions”, Effective Java, Joshua Bloch, Addison-Wesley, 2001, ISBN 0-201-31005-8, pp. 174-175.