The Hidden Gems of Jakarta Commons, Part 1
by Timothy M. O'Brien
12/22/2004
If you are not familiar with the Jakarta Commons, you have
likely reinvented a few wheels. Before you write any more generic
frameworks or utilities, grok the Commons. It will save you serious
time. Too many people write a StringUtils class that
duplicates methods available in Commons Lang's
StringUtils, or developers unknowingly recreate the
utilities in Commons
Collections even though commons-collections.jar is
already available in the classpath. Seriously, take a break. Check
out the Commons Collections API and then go back to your task; I
promise you'll find something simple that will save you a week over
the next year. If people just took some time to look at Jakarta
Commons, we would have much less code duplication--we'd start making
good on the real promise of reuse. I've seen it happen; somebody digs
into Commons BeanUtils or Commons Collections and invariably they have
an "Oh, if I had only known about this, I wouldn't have written 10,000
lines of code" moment. There are still parts of Jakarta Commons that
remain a mystery to most; for instance, many have yet to hear of Commons CLI or Commons
Configuration, and most have yet to notice the valuable
functors package in Commons Collections. In this series,
I emphasize some of the less-appreciated tools and utilities in the
Jakarta Commons.
In this first part of the series, I explore XML rule set definitions in
the Commons
Digester, functors available in Commons Collections, and an
interesting application of Commons JXPath:
querying a List of objects. Jakarta Commons
contains utilities that aim to help you solve problems at the lowest
level of programming: iterating over collections, parsing XML, and
selecting objects from a List. I would encourage you to
spend some time focusing on these small utilities, as learning about
the Jakarta Commons will save you a substantial amount of time. It
isn't simply about using Commons Digester to parse XML or using
CollectionUtils to filter a collection with a
Predicate. You will start to see benefits once you
realize how to combine the power of these utilities and how to relate
Commons projects to your own applications; once this happens, you will
come to see commons-lang.jar,
commons-beanutils.jar, and
commons-digester.jar as just as indispensable to any system as
the JVM itself.
Related Reading: Jakarta Commons Cookbook
If you are interested in learning more about the Jakarta Commons, check out the Jakarta Commons Cookbook. This book is full of recipes that will get you hooked on the Commons, and it tells you how to use Jakarta Commons in concert with other small open source components such as Velocity, FreeMarker, Lucene, and Jakarta Slide. In this book, I introduce a wide array of tools from Jakarta Commons, ranging from simple utilities in Commons Lang to a combination of Commons Digester, Commons Collections, and Jakarta Lucene that searches the works of William Shakespeare. I hope this series and the Jakarta Commons Cookbook provide you with some interesting solutions for low-level programming problems.
1. XML-Based Rule Sets for Commons Digester
Commons Digester 1.6 provides one of the easiest ways to turn XML into objects. Digester has already been introduced on the O'Reilly Network in two articles: "Learning and Using Jakarta Digester," by Philipp K. Janert, and "Using the Jakarta Commons, Part 2," by Vikram Goyal. Both articles demonstrate the use of XML rule sets, but the idea of defining rule sets in XML has not caught on. Most Digester code you encounter defines its rule sets programmatically, in compiled code. You should avoid hard-coding Digester rule sets in compiled Java code when you have the opportunity to store such mapping information in an external file or a classpath resource. Externalizing a Digester rule set makes it easier to adapt to an evolving XML document structure or an evolving object model.
To demonstrate the difference between defining rule sets in XML and
defining rule sets in compiled code, consider a system to parse XML to
a Person bean with three properties--id,
name, and age, as defined in the following class:
package org.test;

public class Person {
    // Fields are private; Digester sets them through the setters below
    private String id;
    private String name;
    private int age;

    public Person() {}

    public String getId() { return id; }
    public void setId(String id) {
        this.id = id;
    }

    public String getName() { return name; }
    public void setName(String name) {
        this.name = name;
    }

    public int getAge() { return age; }
    public void setAge(int age) {
        this.age = age;
    }
}
Assume that your application needs to parse an XML file containing
multiple person elements. The following XML file,
data.xml, contains three person elements
that you would like to parse into Person objects:
<people>
  <person id="1">
    <name>Tom Higgins</name>
    <age>25</age>
  </person>
  <person id="2">
    <name>Barney Smith</name>
    <age>75</age>
  </person>
  <person id="3">
    <name>Susan Shields</name>
    <age>53</age>
  </person>
</people>
You expect the structure and content of this XML file to change over
the next few months, and you would prefer not to hard-code the
structure of the XML document in compiled Java code. To do this, you
need to define Digester rules in an XML file that is loaded as a
resource from the classpath. The following XML document,
person-rules.xml, maps the person element to
the Person bean:
<digester-rules>
  <pattern value="people/person">
    <object-create-rule classname="org.test.Person"/>
    <set-next-rule methodname="add"
                   paramtype="java.lang.Object"/>
    <set-properties-rule/>
    <bean-property-setter-rule pattern="name"/>
    <bean-property-setter-rule pattern="age"/>
  </pattern>
</digester-rules>
All this does is instruct the Digester to create a new instance of
Person every time it encounters a person
element, call add() to add this Person to an
ArrayList, set any bean properties that match attributes
on the person element, and set the name and
age properties from the sub-elements name
and age. You've seen the Person class, the
XML document to be parsed, and the Digester rule definitions in XML
form. Now you need to create an instance of Digester with
the rules defined in person-rules.xml. The following
code creates a Digester by passing the URL
of the person-rules.xml resource to the
DigesterLoader. Since the person-rules.xml
file is a classpath resource in the same package as the class parsing
the XML, the URL is obtained with a call to
getClass().getResource(). The
DigesterLoader then parses the rule definitions and adds
these rules to the newly created Digester:
import java.io.FileInputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import org.apache.commons.digester.Digester;
import org.apache.commons.digester.xmlrules.DigesterLoader;

// Configure Digester from XML ruleset; the resource name is
// resolved relative to the package of the current class
URL rules = getClass().getResource("person-rules.xml");
Digester digester =
    DigesterLoader.createDigester(rules);

// Push empty List onto Digester's Stack
List people = new ArrayList();
digester.push( people );

// Parse the XML document
InputStream input = new FileInputStream( "data.xml" );
digester.parse( input );
Once the Digester has parsed the XML in
data.xml, three Person objects should be in
the people ArrayList.
The alternative to defining Digester rules in XML is to add them using
the convenience methods on a Digester instance. Most
articles and examples start with this method, adding rules using the
addObjectCreate() and
addBeanPropertySetter() methods on Digester.
The following code adds the same rules that were defined in
person-rules.xml:
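// Create a Person for each person element and add it to the List
digester.addObjectCreate("people/person",
                         Person.class);
digester.addSetNext("people/person",
                    "add",
                    "java.lang.Object");
// Map attributes (id) to bean properties, as set-properties-rule does
digester.addSetProperties("people/person");
// Map the bodies of the name and age sub-elements to bean properties
digester.addBeanPropertySetter("people/person/name");
digester.addBeanPropertySetter("people/person/age");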
If you have ever found yourself working at an organization with 2,500-line classes to parse a huge XML document with SAX, or a whole collection of classes to work with DOM or JDOM, you understand that XML parsing is, in the majority of cases, more complex than it needs to be. If you are building a highly efficient system with strict speed and memory requirements, you need the speed of a SAX parser. If you need the complexity of DOM Level 3, use a parser like Apache Xerces. But if you are simply trying to parse a few XML documents into objects, take a look at Commons Digester, and define your rule set in an XML file.
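For a sense of what the hand-written alternative looks like, here is a minimal sketch of a SAX handler for the same people document, using only the JDK. The names SaxPerson and PersonSaxHandler are hypothetical and are not part of the original example. SaxPerson is a simplified stand-in for the Person bean, and the code uses generics, so it assumes a newer Java than the other listings in this article. Even for three properties, the element names and the state tracking are hard-coded in Java, which is exactly what a Digester rule set lets you externalize.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, simplified stand-in for the article's Person bean
class SaxPerson {
    String id;
    String name;
    int age;
}

// Hypothetical hand-written SAX handler; element names are hard-coded
class PersonSaxHandler extends DefaultHandler {
    private final List<SaxPerson> people = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    private SaxPerson current;

    List<SaxPerson> getPeople() {
        return people;
    }

    @Override
    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        text.setLength(0); // start collecting text for this element
        if ("person".equals(qName)) {
            current = new SaxPerson();
            current.id = attrs.getValue("id");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so accumulate
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null) {
            return; // not inside a person element
        }
        if ("name".equals(qName)) {
            current.name = text.toString().trim();
        } else if ("age".equals(qName)) {
            current.age = Integer.parseInt(text.toString().trim());
        } else if ("person".equals(qName)) {
            people.add(current);
            current = null;
        }
    }

    // Parses a people document and returns the collected objects
    static List<SaxPerson> parse(InputStream in) throws Exception {
        PersonSaxHandler handler = new PersonSaxHandler();
        SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        return handler.getPeople();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<people><person id=\"1\"><name>Tom Higgins</name>"
                   + "<age>25</age></person></people>";
        InputStream in =
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        for (SaxPerson p : parse(in)) {
            System.out.println(p.id + ": " + p.name + ", " + p.age);
        }
    }
}
```

Adding a new property to this handler means another branch in endElement() and a recompile, whereas the XML rule set needs only one more bean-property-setter-rule line.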
Any time you can move this type of configuration outside of compiled code, you should. I would encourage you to define your Digester rules in an XML file loaded either from the file system or the classpath. Doing so will make it easier to adapt your program to changes in the XML document and changes in your object model. For more information on defining Digester rules in an XML file, see Section 6.2 of the Jakarta Commons Cookbook, "Turning XML Documents into Objects."
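One way to support both locations is sketched below, under the assumption that a file on disk should take precedence over a copy bundled on the classpath. The RuleLocator class and its locate() method are hypothetical helpers, not part of Commons Digester. They only produce the URL that DigesterLoader.createDigester() expects.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper: finds a rule set on the file system or the classpath
class RuleLocator {

    /**
     * Returns a URL for the named rules file. A regular file at that
     * path wins; otherwise the name is looked up as a classpath resource
     * relative to the anchor class. Returns null if neither exists.
     */
    static URL locate(String name, Class<?> anchor)
            throws MalformedURLException {
        File file = new File(name);
        if (file.isFile()) {
            return file.toURI().toURL();
        }
        return anchor.getResource(name);
    }
}
```

With Commons Digester on the classpath, the result could then be passed along as in the earlier listing, for example DigesterLoader.createDigester(RuleLocator.locate("person-rules.xml", getClass())), after checking that the URL is not null.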