Using Hierarchical Data Sets with Aspire and Tomcat
by Satya Komatineni, 03/05/2003
What are Hierarchical Data Sets and Why Do You Care?
Hierarchical Data Sets are not new. They already exist in the form of CICS transactional data, files in directories, and plain Java objects, as well as the obvious XML. In the XML Journal in early 2001, I floated the idea that programmers can benefit from hierarchical data abstractions even though many of their data sources are predominantly relational (such as databases including MySQL, Oracle, SQL Server, DB2, etc.).
The .NET world has a similar idea taking root in the notion of "datasets." Although there are important differences between my proposed Hierarchical Data Sets and the nature of Microsoft's datasets, it is evident that Hierarchical Data Sets enhance relational abstractions with richer detail.
This article examines the structure of, and a Java API for, Hierarchical Data Sets. Unlike the XML Journal reference two years ago, you will now actually have a piece of executable code to use to start taking advantage of Hierarchical Data Sets. Although programmers can code in Java to access various data sources and construct the final Hierarchical Data Set, this article has an implementation that you can readily use to construct these Hierarchical Data Sets declaratively by simply composing pre-built relational adapters. Relational adapters include file readers, SQL readers, Stored Procedure readers, et cetera.
The question you're probably asking is "What good are these Hierarchical Data Sets?" Although they can't rival the salutary effects of large expensive pieces of Carbon on your most certainly deserving companions, Hierarchical Data Sets are quite useful in the programming world. For starters, an entire HTML page worth of data can be satisfied by a single Hierarchical Data Set. In an MVC model, a controller servlet can deliver a Hierarchical Data Set to a JSP page, which will paint it without further ado. For a warmup, it can be converted to XML and directly returned to the caller by the controller servlet. For the appeal, the Hierarchical Data Set can be converted to Excel. For the stylish, the Hierarchical Data Set can be redirected to a reporting engine or a charting engine that supports XML data.
Although the primary focus of the article is the Java programming API for Java programmers, Hierarchical Data Sets can be used by non-Java programmers quite effectively to obtain XML, HTML, or Excel formats directly from relational databases and other data sources by using a J2EE server such as Tomcat. Without further ado, let us investigate the structure of Hierarchical Data Sets and see how these data sets can be obtained declaratively (while relaxing your programming muscles a bit).
Structure of Hierarchical Data
A Hierarchical Data Structure can be conceptually represented as a Java API, or XML, or some other format. It is easiest to visualize as XML.
<AspireDataSet>
<!-- A set of key value pairs at the root level -->
<key1>val1</key1>
<key2>val2</key2>
<!-- A set of named loops -->
<loop name="loop">
</loop>
<loop name="loop2">
</loop>
</AspireDataSet>
At the root, a data set is a set of key/value pairs. A given set of
key/value pairs could yield n independent loops. Each loop is essentially
a table of data, so the term "loop" is synonymous with "table." I
haven't used "table" because people might take "table" to mean
literally only data from a relational table. Having mentioned that a loop
is a collection of rows (RowSet!), let us look closer at the
structure of a loop:
<loop name="loopname">
<row>
<!-- a set of key value pairs -->
<key1>val1</key1>
<key2>val2</key2>
<!-- a set of named loops -->
<loop name="loopname1">
</loop>
<!-- a set of named loops -->
<loop name="loopname2">
</loop>
</row>
<row>
</row>
</loop>
The only odd thing here is the structure of a row. As expected, a row is a
collection of key/value pairs. Here, however, a row includes not only
key/value pairs but also its own set of n independent loops, each with the
same recursive structure. This extension can produce trees of any depth.
(Or should I say, height!)
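To make the recursion concrete, here is a minimal in-memory sketch of the loop/row structure in Java. The class names (SketchRow, SketchLoop, LoopModelDemo) and the sample keys are hypothetical and are not part of Aspire; the sketch only mirrors the XML shape shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a row: key/value pairs plus named child loops.
final class SketchRow {
    final Map<String, String> values = new LinkedHashMap<>();
    final Map<String, SketchLoop> loops = new LinkedHashMap<>();
}

// Hypothetical model of a loop: a named collection of rows.
final class SketchLoop {
    final String name;
    final List<SketchRow> rows = new ArrayList<>();

    SketchLoop(String name) {
        this.name = name;
    }

    SketchRow addRow() {
        SketchRow row = new SketchRow();
        rows.add(row);
        return row;
    }

    // A loop whose rows have no child loops has depth 1.
    int depth() {
        int deepestChild = 0;
        for (SketchRow row : rows) {
            for (SketchLoop child : row.loops.values()) {
                deepestChild = Math.max(deepestChild, child.depth());
            }
        }
        return 1 + deepestChild;
    }
}

final class LoopModelDemo {
    // Builds one "orders" row that owns an "items" loop with two rows.
    static SketchLoop sample() {
        SketchLoop orders = new SketchLoop("orders");
        SketchRow order = orders.addRow();
        order.values.put("orderId", "42");
        SketchLoop items = new SketchLoop("items");
        order.loops.put(items.name, items);
        items.addRow().values.put("sku", "A-1");
        items.addRow().values.put("sku", "B-2");
        return orders;
    }

    public static void main(String[] args) {
        System.out.println(sample().depth()); // prints 2
    }
}
```

Because every row can own further loops, the same two classes describe a tree of any depth without additional types.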
Structure of Hierarchical Data in Java
Now that I have shown the hierarchical data as XML, some readers might take a Hierarchical Data Set to be literally XML and, hence, literally a DOM and, hence, a lot of memory inside the JVM. No need to panic. The Hierarchical Data Set can have its own Java API and need not be represented as a DOM. The majority of the time it is a forward-only, cursor-like, lazily loaded tree. Here is a working Java API for a Hierarchical Data Set:
package com.ai.htmlgen;
import com.ai.data.*;
/**
* Represents a Hierarchical Data Set.
* An hds is a collection of rows.
* You can step through the rows using ILoopForwardIterator
* You can find out about the columns via IMetaData.
* An hds is also a collection of loops originating from the current row.
*/
public interface ihds extends ILoopForwardIterator
{
/**
* Returns the parent if available
* Returns null if there is no parent
*/
public ihds getParent() throws DataException;
/**
* For the current row return a set of
* child loop names. ILoopForwardIterator determines
* what the current row is.
*
* @see ILoopForwardIterator
*/
public IIterator getChildNames() throws DataException;
/**
* Given a child name return the child Java object
* represented by ihds again
*/
public ihds getChild(String childName) throws DataException;
/**
* returns a column that is similar to SUM, AVG etc of a
* set of rows that are children to this row.
*/
public String getAggregateValue(String keyname) throws DataException;
/**
* Returns the column names of this loop or table.
* @see IMetaData
*/
public IMetaData getMetaData() throws DataException;
/**
* Releases any resources that may be held by this loop of data
* or table.
*/
public void close() throws DataException;
}
For brevity, the Java interface ihds represents "Interface to
Hierarchical Data Set." This API allows you to step through your loops
recursively. An implementation has the option to load the loops only when they
are requested. It can also assume either forward-only or random traversal.
Before going further, let me present the two additional interfaces that this API
uses: ILoopForwardIterator and IMetaData.
How to Move Through a Series of Rows in HDS: ILoopForwardIterator
package com.ai.htmlgen;
import com.ai.data.*;
public interface ILoopForwardIterator
{
/**
* getValue from the current row matching the key
*/
public String getValue(final String key);
public void moveToFirst() throws DataException;
public void moveToNext() throws DataException;
public boolean isAtTheEnd() throws DataException;
}
IMetaData: For Reading Column Names
package com.ai.data;
public interface IMetaData
{
public IIterator getIterator();
public int getColumnCount();
public int getIndex(final String attributeName)
throws FieldNameNotFoundException;
}
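Taken together, these interfaces suggest a simple recursive walk: step through the rows, read each column, then descend into each child loop of the current row. The following is a sketch of that idea under stated assumptions, not Aspire's own code. SketchHds is a simplified, hypothetical stand-in for ihds that uses java.util lists in place of IIterator and IMetaData, MemoryHds is a throwaway in-memory implementation, and the walker performs no XML escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified, hypothetical stand-in for ihds (not the real Aspire interface).
interface SketchHds {
    List<String> columnNames();
    String getValue(String key);
    void moveToFirst();
    void moveToNext();
    boolean isAtTheEnd();
    List<String> getChildNames();          // child loops of the current row
    SketchHds getChild(String childName);
}

// In-memory implementation used only to exercise the walker below.
final class MemoryHds implements SketchHds {
    private final List<String> columns;
    private final List<Map<String, String>> rows = new ArrayList<>();
    private final List<Map<String, SketchHds>> children = new ArrayList<>();
    private int cursor = 0;

    MemoryHds(String... columns) {
        this.columns = Arrays.asList(columns);
    }

    MemoryHds addRow(String... vals) {
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            row.put(columns.get(i), vals[i]);
        }
        rows.add(row);
        children.add(new LinkedHashMap<>());
        return this;
    }

    MemoryHds addChildToLastRow(String name, SketchHds child) {
        children.get(children.size() - 1).put(name, child);
        return this;
    }

    public List<String> columnNames() { return columns; }
    public String getValue(String key) { return rows.get(cursor).get(key); }
    public void moveToFirst() { cursor = 0; }
    public void moveToNext() { cursor++; }
    public boolean isAtTheEnd() { return cursor >= rows.size(); }
    public List<String> getChildNames() {
        return new ArrayList<>(children.get(cursor).keySet());
    }
    public SketchHds getChild(String childName) {
        return children.get(cursor).get(childName);
    }
}

final class HdsXmlWalker {
    // Renders a loop as XML: columns first, then the child loops of each row.
    static String toXml(String loopName, SketchHds hds) {
        StringBuilder out = new StringBuilder();
        out.append("<loop name=\"").append(loopName).append("\">");
        for (hds.moveToFirst(); !hds.isAtTheEnd(); hds.moveToNext()) {
            out.append("<row>");
            for (String col : hds.columnNames()) {
                out.append('<').append(col).append('>')
                   .append(hds.getValue(col))
                   .append("</").append(col).append('>');
            }
            for (String child : hds.getChildNames()) {
                out.append(toXml(child, hds.getChild(child)));
            }
            out.append("</row>");
        }
        return out.append("</loop>").toString();
    }

    // Two rows; only the first row owns a child loop.
    static SketchHds sample() {
        MemoryHds child = new MemoryHds("id").addRow("1");
        return new MemoryHds("name")
                .addRow("a").addChildToLastRow("childloop1", child)
                .addRow("b");
    }

    public static void main(String[] args) {
        System.out.println(toXml("works", sample()));
    }
}
```

The walker only ever moves forward and asks for a child when it reaches the owning row, so it would work equally well against a lazily loading implementation.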
How Can You Obtain a Hierarchical Data Set, So You Can Use It?
Now that we know the structure of a Hierarchical Data Set, how do you get hold of one? As I stated earlier, this is easy under Aspire. The steps are as follows:
- Learn the basics of Aspire.
- Create a definition file for your Hierarchical Data Set.
- Call your definition and receive ihds in your Java code.
Each of these steps is explained in some detail below.
Read the Basics on the Usage of the Aspire JAR
Aspire is a small JAR file that can complement your Java programming, particularly when used with an app server such as Tomcat. At the heart of Aspire is a set of configuration files, where you declare your data access mechanisms in terms of Java classes and arguments to those Java classes. Aspire will execute those Java classes and return the resulting objects. Hierarchical Data Sets are no exception.
An earlier O'Reilly article introduced Aspire: "For Tomcat Developers, Aspire Comes in a JAR." This will familiarize you with defining databases and calling SQL and Stored Procedures, as well as configuring and initializing Aspire.
Create a Definition File For your Hierarchical Data Set
A sample definition for a Hierarchical Data Set is as follows:
###################################
# ihdsTest data definition: section1
###################################
request.ihdsTest.className=com.ai.htmlgen.DBHashTableFormHandler1
request.ihdsTest.loopNames=works
#section2
request.ihdsTest.works.class_request.className=com.ai.htmlgen.GenericTableHandler6
request.ihdsTest.works.loopNames=childloop1
request.ihdsTest.works.query_request.className=com.ai.data.RowFileReader
request.ihdsTest.works.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data
#section3
request.childloop1.class_request.classname=com.ai.htmlgen.GenericTableHandler6
request.childloop1.query_request.classname=com.ai.data.RowFileReader
request.childloop1.query_request.filename=aspire:\\samples\\pop-table-tags\\properties\\pop-table.data
This definition has three sections. The data set is named
ihdsTest. The first section tells Aspire that the Java class
com.ai.htmlgen.DBHashTableFormHandler1 is responsible for
returning an object implementing ihds. Unless you code your own
implementation of ihds, you will use this class in every data set
definition. It's the pre-fabricated class that knows how to compose relational
assets into hierarchical assets. Line 2 of section 1 tells
DBHashTableFormHandler1 that this main data set has one loop
called works.
Section2 defines the loop works. A loop structure in Aspire
uses two Java classes: a class request (GenericTableHandler6) and
a Query request (RowFileReader).
RowFileReader reads a set of records from a flat file and makes
them look like a collection of rows and columns.
GenericTableHandler6 takes this collection and applies such
features as aggregate values and row numbers and implements the
ihds interface at the loop level. As with
DBHashTableFormHandler1, GenericTableHandler6 is
present in most definitions. RowFileReader might change, depending
on your data sources. For example, the following parts exist in this
category:
- RowFileReader (for reading flat files).
- DBRequestExecutor2 (for reading SQL).
- StoredProcedureExecutor2 (for reading from Stored Procedures).
- XMLReader (for reading XML files).
- Or, you can write your own reader that implements IDataCollection.
Section2 also indicates that it has a child called childloop1.
GenericTableHandler6 will take this cue and look for section3,
identified by childloop1.
Section3 defines childloop1. The definition is identical to
section2, except that childloop1 has no children. Both section2 and
section3 use RowFileReaders. In practice, they can use any
combination of data reader parts.
Let me call this file ihds-test.properties. Include this file
in Aspire's master aspire.properties as follows:
application.includeFiles=aspire:\\samples\\hello-world\\properties\\hello-world.properties,\
aspire:\\samples\\ihds-test\\ihds-test.properties,\
aspire:\\samples\\xml-reader\\xml-reader.properties
For the sake of completeness, the listing above also shows the entries that precede and follow the new ihds-test.properties line.
Call your Definition and Receive an ihds
Now that we have the definition, how do we call it from Java? Reading that first article will help considerably, but here is the Java code:
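// Argument keys are lowercased for the downstream relational adapters.
Hashtable args = new Hashtable();
args.put("key1".toLowerCase(), "value1");
// Ask Aspire's factory for the object defined under the name ihdsTest.
IFactory factory = AppObjects.getFactory();
ihds hds = (ihds)factory.getObject("ihdsTest",args);
// use ihds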
Aspire has a factory service, represented by the IFactory
interface. This factory interface allows you to call a Java class, identified by
a symbolic name called ihdsTest, with any arguments passed in as a
hashtable. The arguments are expected to be lowercase strings for the
downstream relational adapters.
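Once you hold the data set, a typical use is to step through its rows and then release it with close(). The sketch below shows one way to guarantee the release with try/finally. It is an illustration under assumptions rather than Aspire code: SketchCursor is a hypothetical stand-in for the row-stepping and close() parts of ihds (the checked DataException is omitted), and ListCursor is an in-memory cursor written only for this demonstration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the row-stepping and close() parts of ihds.
interface SketchCursor {
    String getValue(String key);
    void moveToFirst();
    void moveToNext();
    boolean isAtTheEnd();
    void close();
}

// Single-column in-memory cursor, for demonstration only.
final class ListCursor implements SketchCursor {
    private final String column;
    private final List<String> values;
    private int position = 0;
    boolean closed = false;

    ListCursor(String column, String... values) {
        this.column = column;
        this.values = Arrays.asList(values);
    }

    public String getValue(String key) {
        return column.equals(key) ? values.get(position) : null;
    }
    public void moveToFirst() { position = 0; }
    public void moveToNext() { position++; }
    public boolean isAtTheEnd() { return position >= values.size(); }
    public void close() { closed = true; }
}

final class HdsUsageSketch {
    // Reads one column from every row and always closes the cursor,
    // even if reading a row throws.
    static List<String> collectColumn(SketchCursor cursor, String key) {
        List<String> result = new ArrayList<>();
        try {
            for (cursor.moveToFirst(); !cursor.isAtTheEnd(); cursor.moveToNext()) {
                result.add(cursor.getValue(key));
            }
        } finally {
            cursor.close();
        }
        return result;
    }

    public static void main(String[] args) {
        ListCursor cursor = new ListCursor("title", "one", "two");
        System.out.println(collectColumn(cursor, "title")); // prints [one, two]
    }
}
```

With the real ihds, the same loop shape applies, but each call that declares DataException would need to be caught or propagated.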