ONJava.com -- The Independent Source for Enterprise Java

Qualities of a Good Middle-Tier Architecture

Easy to Look Through the Component and Understand What It Does

Even with all of these facilities, it may not be clear what a component does. It becomes necessary to dig into the code and find out what is happening for real. The code of a component should be readable and make sense. For example, stored procedures tend to be (or at least should be) small and less coupled. Because of this, you can understand what a stored procedure does in a single sitting.



This becomes a little challenging in Java, as there tends to be a higher level of coupling with other APIs and modules. There are ways to minimize this coupling in Java. For instance, one can have a set of declarative, reusable parts, as in an orchestrated workflow. A business component then becomes merely an orchestration of calls to these parts. As a result, the code of the business component reads more like a script than like low-level Java code. The parts themselves become well known over time: for example, an FTP part, an XML reader part, a file reader part, a database reader part, and so on. One advantage of parts is that reading a row from a file or a database doesn't have to involve a file read or a JDBC call in the component, just the part invocation.
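As a rough sketch of what such a component might look like, assume a hypothetical registry of named parts, each taking a map of arguments and returning a result. The names PartsSketch, PARTS, fileReader, and databaseWriter are illustrative only and do not come from any real library; the stand-in parts in main do no real I/O.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PartsSketch {
    // Hypothetical registry of reusable parts, keyed by a well-known name.
    public static final Map<String, Function<Map<String, Object>, Object>> PARTS =
            new HashMap<>();

    // Invokes a part by name; the caller never touches file or JDBC APIs.
    public static Object call(String partName, Map<String, Object> args) {
        Function<Map<String, Object>, Object> part = PARTS.get(partName);
        if (part == null) {
            throw new IllegalArgumentException("No such part: " + partName);
        }
        return part.apply(args);
    }

    // The business component reads like a script: it only names parts.
    public static Object importOrders(String path) {
        Map<String, Object> args = new HashMap<>();
        args.put("path", path);
        args.put("rows", call("fileReader", args));
        return call("databaseWriter", args);
    }

    public static void main(String[] argv) {
        // Stand-in parts; real ones would wrap file reads and JDBC calls.
        PARTS.put("fileReader", a -> Arrays.asList("row1", "row2"));
        PARTS.put("databaseWriter", a -> ((List<?>) a.get("rows")).size());
        System.out.println(importOrders("orders.csv")); // prints 2
    }
}
```

The point of the design is that importOrders would not change if the file reader were swapped for an FTP reader; only the registry entry would.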

Ability to Develop Components in Relative Isolation

When you decide to write a new component, the idea of isolation plays a part. Relational databases are a great example of a high degree of isolation (the opposite of coupling). For instance, you can develop a stored procedure that operates on a set of tables with very little regard to the other stored procedures that exist. It is also not uncommon for multiple applications developed in multiple languages to work on the same relational data store. This relative isolation is a great boon for business components and services because they can be developed in parallel by multiple teams.

In a Java application server that is back-ended by relational databases, the same isolation should be possible. For instance, you can declare a series of transactions, each composed of SQL, and then use Java purely as a scripting engine for these transactions. When these scripts become as dynamic as JSP pages, it becomes very easy for developers to write new tasks or services that get dynamically compiled in. Here is an illustration of this approach:

<Tx name="Tx1">
    <type>databaseExec</type>
    <database-name>abc</database-name>
    <statement>insert into a table</statement>
</Tx>


<Tx name="Tx2">
    <type>databaseExec</type>
    <database-name>abc</database-name>
    <statement>Update a table</statement>
</Tx>

<type-parts>
    <databaseExec>com.parts.DatabaseQueryUpdaterPart</databaseExec>
</type-parts>

In the example above, we have two declarative transactions, Tx1 and Tx2. Now you can write a Java component such as the following:

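import java.util.Hashtable;

// ExecutionException is assumed to be the framework's own checked exception.
public abstract class MyUpdateComponent {
    // Runs Tx1 and then Tx2 with the same arguments. Connections and
    // transaction boundaries are left to the framework.
    public void execute(String arg1, String arg2) throws ExecutionException {
        Hashtable<String, Object> ht = new Hashtable<>();
        ht.put("arg1", arg1);
        ht.put("arg2", arg2);
        execute("Tx1", ht);
        execute("Tx2", ht);
    }

    // Supplied by the framework: runs a declared transaction by name.
    protected abstract void execute(String txName, Hashtable<String, Object> args)
            throws ExecutionException;
}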

In essence, the bodies of your components look like a scripting engine where the scripting language is Java. And, in fact, the database server should be able to accept these Java scripting elements and define business components out of these elements.
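To make the dispatch step concrete, here is a minimal sketch of how a framework might resolve a call such as execute("Tx1", ht). It assumes the declarations have already been parsed from the configuration into memory; TxDispatchSketch, TxDef, and TxPart are hypothetical names, and the part registered in main only records what it was asked to run instead of talking to a database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TxDispatchSketch {
    /** A part knows how to run one kind of declared transaction. */
    public interface TxPart {
        void run(String statement, Map<String, Object> args);
    }

    /** One declared transaction: its type and its statement. */
    public static final class TxDef {
        final String type;
        final String statement;

        public TxDef(String type, String statement) {
            this.type = type;
            this.statement = statement;
        }
    }

    private final Map<String, TxDef> definitions = new HashMap<>();
    private final Map<String, TxPart> partsByType = new HashMap<>();

    public void declare(String name, TxDef def) { definitions.put(name, def); }

    public void registerPart(String type, TxPart part) { partsByType.put(type, part); }

    /** Looks up the named transaction and hands it to the part for its type. */
    public void execute(String name, Map<String, Object> args) {
        TxDef def = definitions.get(name);
        if (def == null) {
            throw new IllegalArgumentException("Unknown transaction: " + name);
        }
        TxPart part = partsByType.get(def.type);
        if (part == null) {
            throw new IllegalStateException("No part for type: " + def.type);
        }
        part.run(def.statement, args);
    }

    public static void main(String[] argv) {
        List<String> log = new ArrayList<>();
        TxDispatchSketch framework = new TxDispatchSketch();
        // Stand-in for a database part: records the statement it receives.
        framework.registerPart("databaseExec", (sql, args) -> log.add(sql));
        framework.declare("Tx1", new TxDef("databaseExec", "insert into a table"));
        framework.declare("Tx2", new TxDef("databaseExec", "Update a table"));

        Map<String, Object> ht = new HashMap<>();
        ht.put("arg1", "x");
        framework.execute("Tx1", ht);
        framework.execute("Tx2", ht);
        System.out.println(log);
    }
}
```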

Ability to Debug Through the Component

Stepping through the execution of business components is an important quality, as this allows the testing process to be more streamlined and modern. This is such a common requirement that one would expect all systems to have it. This is not as true as I wish it were.

For instance, debugging stored procedures is not easy. The latest tools are bridging this gap, but the process is very slow, when it is even available. Debugging regular Java programs is not that bad in the Java IDEs, but debugging JSP pages and servlets is still a chore in most of the IDEs.

On the other hand, in single-vendor-controlled environments like ASP.NET, debugging is just great. The primary thing you notice in Visual Studio .NET is how fast the debugging sessions are. Things run almost as fast in the debugging environment as they do in the non-debugging environment. This encourages one to step into debugging with a bit more frequency.

Ability to Ignore Transactions, Connections, Database Details, etc.

This quality is one of the most cited reasons for EJBs. The idea that you can write business components without paying attention to transactions programmatically is quite attractive and productive. Actually, stored procedures do this a lot better, for the most common case. When you are writing a stored procedure, you don't think about connections. You just access your tables and update tables. Similarly, you don't think of transactions in stored procedures, as they are all controlled by the invoker. Why can't we do the same for our business procedures that are written in Java? We absolutely can; it is a mindset change. In the example cited above, MyUpdateComponent is unaware of connections and transactions. Connections and transactions are better controlled externally and declaratively.

One way to recognize a spotty business component design is to see if the business components are explicitly passed connections, transactional contexts, etc. In such an environment, it becomes very hard to script business components into higher-level components.
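A small sketch of the alternative, in which the framework owns the transaction boundary and the component receives neither a connection nor a transactional context. UnitOfWork and Component are hypothetical interfaces made up for this illustration; a real framework would back UnitOfWork with a JDBC connection or a transaction manager.

```java
import java.util.ArrayList;
import java.util.List;

public class TxBoundarySketch {
    /** Hypothetical handle on the resource the framework manages. */
    public interface UnitOfWork {
        void begin();
        void commit();
        void rollback();
    }

    /** A business component: no connection or transaction parameters. */
    public interface Component {
        void execute();
    }

    /** The framework, not the component, decides where the transaction starts and ends. */
    public static void runInTransaction(UnitOfWork unit, Component component) {
        unit.begin();
        try {
            component.execute();
            unit.commit();
        } catch (RuntimeException e) {
            unit.rollback();
            throw e;
        }
    }

    /** Stand-in that records calls, so the boundary can be observed. */
    public static final class RecordingUnitOfWork implements UnitOfWork {
        public final List<String> calls = new ArrayList<>();
        public void begin() { calls.add("begin"); }
        public void commit() { calls.add("commit"); }
        public void rollback() { calls.add("rollback"); }
    }

    public static void main(String[] argv) {
        RecordingUnitOfWork unit = new RecordingUnitOfWork();
        runInTransaction(unit, () -> { /* business logic only */ });
        System.out.println(unit.calls); // prints [begin, commit]
    }
}
```

Because the component takes no transactional arguments, two such components can be composed into a higher-level one and still run inside a single boundary chosen by the caller.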

Independently Call a Component with Sample Data

When we define a component, we usually don't think about sample data that can be used to invoke the component to test it. It would be nice to define the sample data when a component is deployed. This would allow for quick testing of the component without spending a lot of time looking for this sample data.

For cases where this sample data cannot be provided beforehand, an explanation should be provided as to how to obtain this sample data from the environment. This is to allow new members of the team to quickly test and use the reusable middle tier components or services.
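One way this could look, sketched under the assumption that a service is a function from named arguments to a result: the deployment descriptor carries a set of sample arguments alongside the implementation. DeployedService and invokeWithSample are hypothetical names used only for this illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SampleDataSketch {
    /** A deployed service bundled with the sample input supplied at deployment. */
    public static final class DeployedService {
        private final Function<Map<String, Object>, Object> impl;
        private final Map<String, Object> sampleArgs;

        public DeployedService(Function<Map<String, Object>, Object> impl,
                               Map<String, Object> sampleArgs) {
            this.impl = impl;
            // Defensive copy so later changes by the deployer do not leak in.
            this.sampleArgs = Collections.unmodifiableMap(new HashMap<>(sampleArgs));
        }

        public Object invoke(Map<String, Object> args) {
            return impl.apply(args);
        }

        /** Lets a new team member try the service without hunting for inputs. */
        public Object invokeWithSample() {
            return impl.apply(sampleArgs);
        }
    }

    public static void main(String[] argv) {
        Map<String, Object> sample = new HashMap<>();
        sample.put("name", "test-user");
        DeployedService greet =
                new DeployedService(args -> "Hello, " + args.get("name"), sample);
        System.out.println(greet.invokeWithSample()); // prints Hello, test-user
    }
}
```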

Easily Substitute One Component for Another (Example: One Stored Procedure Vs. the Other)

This quality goes to the heart of evolvable systems. Separating interface from implementation is the first step on this road. Then you have to answer how you can evolve both interfaces and implementations over time while keeping client code backward compatible. The implementation of an interface can vary along three dimensions. The first is the vendor: multiple vendors may supply the same functionality. The second is functional variation; for example, sending a message may mean email in one case and voice mail in another. The third is the functionality evolving over time.

In this scenario, tasks and services have an advantage over components. Because components are considered as hosting multiple tasks, replacing one task might mean replacing an entire component, and hence, other tasks with it. But when tasks or services are implemented by single objects with well known methods of execution, the granularity is at the task level. In my mind, this works out better.
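The granularity argument can be illustrated with a short sketch in which each service is a single object bound to a name, so one service can be rebound without disturbing its neighbors. SubstitutionSketch, Service, and the service names below are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SubstitutionSketch {
    /** One object, one service, one well-known method of execution. */
    public interface Service {
        String execute(String input);
    }

    private final Map<String, Service> services = new HashMap<>();

    /** Binds (or rebinds) a single service; other bindings are untouched. */
    public void bind(String name, Service service) {
        services.put(name, service);
    }

    public String call(String name, String input) {
        Service service = services.get(name);
        if (service == null) {
            throw new IllegalArgumentException("Unknown service: " + name);
        }
        return service.execute(input);
    }

    public static void main(String[] argv) {
        SubstitutionSketch tier = new SubstitutionSketch();
        tier.bind("sendMessage", text -> "email: " + text);
        tier.bind("lookupCustomer", id -> "customer " + id);

        // Swap the implementation of one task; the client call does not change.
        tier.bind("sendMessage", text -> "voice mail: " + text);

        System.out.println(tier.call("sendMessage", "hi"));    // prints voice mail: hi
        System.out.println(tier.call("lookupCustomer", "42")); // prints customer 42
    }
}
```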

Should the Middle Tier Deal with Typed Objects?

There are cases where a full-blown object middle tier makes complete sense. But I think certain concessions can be made when the back end is a relational database. Relational databases are fluid, dynamic, and largely typeless. Stored procedures gained many of their advantages because they remained typeless, as well. Their inputs and outputs are fundamental types. This allows them to have very little dependency on a typed model.

A middle tier that is based on services rather than components is essentially a transactional model, similar to stored procedures. But then there is the question of input and output types. If these were to be typed, then those type libraries need to be available at compile time.

It is also difficult to add fields to the API, as this requires re-compilation. As a result, I am tempted to think that a middle tier that follows common types (result sets, XML documents, etc.) might come out better than a typed solution. A typeless approach should increase interoperability and malleability.

Does this mean we need to abandon types altogether? There is a clever trick we can play so that the clients (or even servers) can still use types, if they choose. Imagine a middle tier API returning a result set (most likely, an abstraction of it, so that you can use it for non-relational data sources as well). The client can decide to work with the result set directly, or the client can loose-cast the row of the result set to a value object. By "loose-cast," I mean a dynamic cast where the fields of the value objects are resolved using reflection and then cached for subsequent calls. This late type binding can offer three advantages:

  1. Objects are not instantiated and torn down when they don't need to be (in the middle tier).
  2. If you are doing forward iteration on a result set, you may be able to get away with one value object for n rows.
  3. You can loose-cast two value objects on to the same result set row. This gives middle-tier developers some freedom to add fields.
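The loose-cast described above might be sketched as follows. This is only an illustration under stated assumptions: a row is modeled as a Map from column name to value (standing in for whatever result-set abstraction the middle tier returns), columns are matched to fields by exact name, and LooseCastSketch, Customer, and Contact are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LooseCastSketch {
    // Fields are resolved by reflection once per class, then reused.
    private static final Map<Class<?>, Map<String, Field>> FIELD_CACHE =
            new ConcurrentHashMap<>();

    private static Map<String, Field> fieldsOf(Class<?> type) {
        return FIELD_CACHE.computeIfAbsent(type, t -> {
            Map<String, Field> byName = new HashMap<>();
            for (Field f : t.getDeclaredFields()) {
                if (f.isSynthetic() || Modifier.isStatic(f.getModifiers())) {
                    continue;
                }
                f.setAccessible(true);
                byName.put(f.getName(), f);
            }
            return byName;
        });
    }

    /**
     * Copies the columns that match a field name into the target object.
     * Columns without a matching field are ignored; fields without a
     * matching column keep whatever value they already had.
     */
    public static <T> T looseCast(Map<String, Object> row, T target) {
        try {
            for (Map.Entry<String, Field> e : fieldsOf(target.getClass()).entrySet()) {
                if (row.containsKey(e.getKey())) {
                    e.getValue().set(target, row.get(e.getKey()));
                }
            }
            return target;
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Two value objects that view the same row differently.
    public static final class Customer { public String name; public Integer age; }
    public static final class Contact { public String name; public String email; }

    public static void main(String[] argv) {
        Map<String, Object> row = new HashMap<>();
        row.put("name", "Ann");
        row.put("age", 30);
        row.put("email", "ann@example.com");

        Customer customer = looseCast(row, new Customer());
        Contact contact = looseCast(row, new Contact());
        System.out.println(customer.name + " " + customer.age); // prints Ann 30
        System.out.println(contact.email);                      // prints ann@example.com
    }
}
```

Passing the same target object to looseCast for each row during forward iteration corresponds to the second advantage in the list; the extra email column being ignored by Customer corresponds to the third.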

We can also take it a step further and have the middle-tier API return a hierarchical data set (HDS); basically, an XML format. This generic data can be similarly bound to a typed object on the client side, and thereby give us similar benefits as above. Whether to return an HDS or a ResultSet is very much situation-dependent. Both have their places.

So we have indicated that the client has a choice between objects, result sets, or hierarchical data sets. What about the writer of the service? The writer has similar choices. The writer can create a Java object and the framework can convert it to XML, based on the client, if the client requests it. If the entire service is constructed declaratively, then most likely the entire XML document is constructed on the fly and there won't be any need for objects on the middle tier.

How Do the Existing Middle Tier Components Stack Up?

Let us see how some of the following technologies fare as part of a middle tier strategy. The technologies covered include:

  1. Stored procedures
  2. SOA
  3. EJBs
  4. COM+
  5. SQLJ or Java procedures

Stored Procedures

I have used stored procedures as the benchmark middleware from which to draw middle-tier qualities. Where available, stored procedures are a very good choice for a number of applications. There are some disadvantages, nevertheless: if you want your system to run on multiple databases, they become a hindrance. But do realize what you are giving up in the process; sometimes, with this single excuse, systems are designed without stored procedures, take a lot longer to develop, and never see the light of day.

The other disadvantage of stored procedures is that some databases do not support them. It is possible for these databases to put in place a Java architecture where you write what are called "Java procedures" that parallel the stored procedures.

Even in databases where stored procedures are supported, another drawback is the categorization or classification of these stored procedures into distinct modules. Oracle has packages that can partially solve this problem. The real solution is to have a tool that can walk through the database and create arbitrary Java packages representing all of the stored procedures. This approach has two advantages; the first is that you achieve classification. The second is that you can use JavaDocs to document these stored procedures in an HTML format that the developers can learn about.

SOA

SOA stands for Service-Oriented Architecture. SOAP web services are one example of SOA. SOA has so much in common with stored procedures that you could call it "stored procedures for application servers." Many of the benefits touted for stored procedures should be realizable by SOA as well. Depending on the SOA implementation, these advantages may or may not be available. For example, to discover services, you need UDDI or some sort of directory service. I think such a solution is prohibitive for simple applications. It should also be possible to invoke services from web sites using test data without writing any programs. This is partially true at this time. Hopefully, in the next few releases you will have full support for this. Although SOA can be designed with minimum coupling, when session beans are used as the implementation tier for SOA, we reintroduce the coupling in terms of Java types. Another key piece that SOA has to address is the "in-process invocation" of services. It should be transparent whether a service is invoked locally or remotely. When invoked locally, the plumbing should be removed and the client should be able to work with the native objects.

EJBs

Volumes have been written about EJBs already. Session beans closely resemble SOA and stored procedures: they are stateless and transactionally secure. One difference is that in stored procedures, a service is usually a single procedure, so the services can be replaced individually. A session bean, on the other hand, is a collection of APIs. To replace any single service you have to replace an entire collection of APIs. I believe we can improve upon session beans by introducing a layer called "Java procedures" and another layer where the Java procedures are categorized into a dynamically altered set of session beans to provide better categorization.

Entity beans are a very different approach to the middle tier. The heart of the middle tier that has been discussed so far is its "request/reply" or "services" or "API" nature. The idea is not based on objects. Entity beans are completely object-based. The middle tier is seen as a collection of objects. This makes perfect sense if the back end is an object database. When the back end is a relational database, two things seem to trip up entity beans. These are:

  1. Object-to-relational mapping.
  2. The fluid and changing world of relational databases and the fixed and typed world of objects.

The problems that EJBs are trying to solve are the same problems that object databases are trying to solve. The rigidity of types in OO databases, or the inability to accomplish fluidity in types, might be the reason for the lack of success on the part of object databases. I believe there is a future for object databases, but we may have to take a few more steps before returning to this technology.

COM+

COM+ is an interesting animal, and I like what I see so far in .NET. First of all, there are no entity beans, so in total, COM+ is a lot easier to digest. COM+ is also nudging towards a tierless computing model. For example, writing a component means simply creating an interface and an implementation, and that's it. So it is no different than writing a class. And it is pretty close to being able to invoke this interface either locally or remotely, which means you can write clients and test them in a single-tier environment, and at deployment time, deploy them on multiple servers. I don't mean to say here that one can completely ignore the local and remote issues all the time; my point is that there are cases where you can, and this should be a deployment decision rather than a compile-time one. Another nice thing about COM+ is that it is integrated throughout .NET. This means you can even take advantage of it in console applications.

This goes to show that management of transactions and components is as relevant to standalone applications as it is to application servers. It also means that some of the facilities of EJBs would be well served by being brought down to the standard JDK. COM+ does not have all of the nice features listed here, but it would be a good backbone on which to implement the additional services.

SQLJ or Java Procedures

As I have suggested, the stored procedure model, owing to its simplicity, provides great opportunities to make the life of a client programmer very easy. But stored procedures have some disadvantages, the primary one being portability. I believe we can invent or architect something called "Java procedures" that parallels stored procedures. These Java procedures essentially can auto-manage transactions, connections, composition, etc., while manipulating the database data. The drawbacks that need to be overcome in Java are the following:

  1. Hardcoded references to data sources.
  2. Explicit handling of connections.
  3. Explicit handling of transactions.
  4. Embedded SQL.
  5. Inability to deal with non-database yet relational data sets.

Some of these problems have already been addressed, but the toughest one is embedded SQL. One solution here is the SQLJ standard. But somehow I am not very comfortable with pre-compiled solutions; just one more variable to worry about. One workaround is to use configuration files where the SQL is listed as a transaction. The Java procedure then can invoke that transaction using its name. Under this scenario, the Java procedure looks like a script that executes these transactions. This model has served me well, but one drawback here is that you have to ship your configuration files with the .jar files. One approach is to package these configuration files as part of the .jar. Once databases buy into this approach, they can even embed this approach directly into the database server and provide all of the nice features of stored procedures while using Java as their programming language and plain SQL for database transactions.
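The configuration-file workaround can be sketched with the standard java.util.Properties class, which is one simple way to keep SQL out of Java code. This is an assumption-laden illustration, not how Aspire does it: the catalog class, the properties format, the table names, and the resource name passed to fromClasspath are all hypothetical. Loading through Class.getResourceAsStream is what lets the file travel inside the .jar.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class SqlCatalogSketch {
    private final Properties statements;

    private SqlCatalogSketch(Properties statements) {
        this.statements = statements;
    }

    /** Builds a catalog from text in key=value form, one transaction per line. */
    public static SqlCatalogSketch fromString(String config) {
        try {
            Properties p = new Properties();
            p.load(new StringReader(config));
            return new SqlCatalogSketch(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Builds a catalog from a file packaged on the classpath, e.g. inside the .jar. */
    public static SqlCatalogSketch fromClasspath(String resource) {
        try (InputStream in = SqlCatalogSketch.class.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resource);
            }
            Properties p = new Properties();
            p.load(in);
            return new SqlCatalogSketch(p);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the SQL declared for a transaction name. */
    public String sqlFor(String txName) {
        String sql = statements.getProperty(txName);
        if (sql == null) {
            throw new IllegalArgumentException("No SQL declared for: " + txName);
        }
        return sql;
    }

    public static void main(String[] argv) {
        SqlCatalogSketch catalog = fromString(
                "Tx1=insert into orders (id, item) values (?, ?)\n"
              + "Tx2=update orders set item = ? where id = ?\n");
        // A Java procedure would look up the statement by name and hand it
        // to the framework for execution.
        System.out.println(catalog.sqlFor("Tx1"));
    }
}
```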

Where Am I Heading with the Middle-Tier Architecture?

As part of the architecture for my J2EE product Aspire/J2EE, I have experimented with Java procedures. They seem to work well. There are still some loopholes in this approach that I need to think about, but overall I am quite happy with the idea. Here are some additional things that I have in my middle tier arsenal:

  1. Declarative tasks or parts.
  2. SQL in configuration files and not in Java code.
  3. Transactions controlled by the framework and not by the business components.
  4. Middle tier emits objects, relational data sets, or hierarchical data sets.
  5. Client calls to the middle tier are service-oriented, or "request/reply."
  6. Declarative composition of components.
  7. Parts/Workflow facilities.
  8. One object for one service.
  9. Write for single tier, but deploy in a multi-tiered environment.

What is lacking:

  1. A browser for the available middle-tier APIs.
  2. A categorization tool to discover the middle-tier APIs.
  3. Auto-generated language bindings for the data sets (relational or hierarchical).

Where is the Middle Tier Heading?

I think there are two strong moves afoot. On one side, the middle tier is taking a turn towards the well-tested service-oriented architecture. CICS/DB2 and stored procedures are examples of this same architecture. These are well-proven paradigms. SOA is an amalgamation of the same principles, coupled with XML and objects.

The second enhancement is coming from the object camp. Here the focus is shifting from transport technologies (RMI, web services, HTTP) to the core objects: namely, interfaces and implementations. Meaning, if you write a class with an interface and an implementation, you can distribute it using any technology. This is most evident in COM+, but I expect this to happen in Java as well, utilizing dynamic proxies.
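Java's dynamic proxies (java.lang.reflect.Proxy) already make a version of this possible. The sketch below shows a client coded against an interface that can be given either the local implementation or a proxy that forwards each call through a transport. Greeter, LocalGreeter, and Transport are hypothetical names for this illustration, and the transport in main is a stand-in that never leaves the process. Note that this simple handler also forwards Object methods such as toString to the transport.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxySketch {
    /** The interface the client codes against. */
    public interface Greeter {
        String greet(String name);
    }

    /** The ordinary in-process implementation. */
    public static final class LocalGreeter implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    /** Hypothetical transport: ships a method name and arguments elsewhere. */
    public interface Transport {
        Object send(String methodName, Object[] args) throws Exception;
    }

    /** Returns an object implementing iface whose calls all go through the transport. */
    public static <T> T remote(Class<T> iface, Transport transport) {
        InvocationHandler handler =
                (proxy, method, args) -> transport.send(method.getName(), args);
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }

    public static void main(String[] argv) {
        // Deployment decision: same client code, different wiring.
        Greeter local = new LocalGreeter();
        Greeter viaTransport = remote(Greeter.class,
                (method, args) -> "remote:" + method + ":" + args[0]);

        System.out.println(local.greet("Ann"));        // prints Hello, Ann
        System.out.println(viaTransport.greet("Ann")); // prints remote:greet:Ann
    }
}
```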

There is also an increasing awareness that these components need to be visible. The COM+ component browser is an example. I also believe the REST approach to SOA is simpler than SOAP and will make discoverability better. For example, it is a lot easier to list a collection of URLs on a web site. These URLs can not only give data when invoked, but can also give out metadata when invoked with some additional parameters. For example, a URL can give out, in the web browser, Java class definitions for the data that it returns. The same URL can also give out other language bindings.
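As a sketch of the metadata side, a service that knows the column names and types of the data it returns could generate a Java class definition on request. The method below is hypothetical and deliberately naive: it assumes the column names are valid Java identifiers and that the caller supplies Java type names directly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BindingSketch {
    /**
     * Produces the source of a simple value class with one public field per
     * column. The map goes from column name to Java type name; a
     * LinkedHashMap keeps the fields in column order.
     */
    public static String javaBinding(String className, Map<String, String> columns) {
        StringBuilder source = new StringBuilder();
        source.append("public class ").append(className).append(" {\n");
        for (Map.Entry<String, String> column : columns.entrySet()) {
            source.append("    public ")
                  .append(column.getValue())
                  .append(' ')
                  .append(column.getKey())
                  .append(";\n");
        }
        source.append("}\n");
        return source.toString();
    }

    public static void main(String[] argv) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("name", "String");
        columns.put("age", "Integer");
        System.out.print(javaBinding("Customer", columns));
    }
}
```

The same column metadata could drive generators for other languages, which is the "other language bindings" case.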

Further References

  1. "Improving Your Career with Aspire and Tomcat." (ONJava.com) Outlines how RDBMS developers can effectively play the roles of web application architects by using stored procedures as their middle tier.

  2. "Bringing the J2EE Cathedral to the Bazaar." (ONJava.com) Outlines how J2EE development can be simplified using declarative paradigms as the middle tier.

  3. "Transparent Data Pipelines for JSP." (ONJava.com) An in-depth study of incorporating data access mechanisms for web application development.

  4. "For Tomcat Developers, Aspire Comes in a Jar." (ONJava.com) Outlines how declarative data access could work in Java.

  5. "Using Hierarchical Data Sets with Aspire and Tomcat." (ONJava.com) Outlines how RDBMS developers can effectively use Aspire to retrieve XML declaratively from relational databases. This article makes a case for web sites that are B2C and B2B at the same time using Aspire and Tomcat.

  6. "Moving Beyond JDBC - OSCON 2003, Portland." This link points you to the speakers page at OSCON 2003. This session at OSCON 2003 gives the advantages of hierarchical data sets in web development.

  7. Aspire Knowledge Central Home page. Some of the principles outlined in this article are used in constructing this web site. This is a content management tool for the general public to do web logging, as well as to maintain online documentation for free/open source products.

Satya Komatineni is the CTO at Indent, Inc. and the author of Aspire, an open source web development RAD tool for J2EE/XML.

