Sunday, January 11, 2009

New version of MarkUtils-Web: ZipServlet and CompressFilter

If you're not already familiar with my ZipServlet and CompressFilter, see my previous posting where I introduced these Java web utilities.

As with the previous release, the update is available on ziesemer.dev.java.net. The release folder is directly available here. "com.ziesemer.utils.web-2009.01.11.zip" contains the source code, a compiled .jar, and generated JavaDocs.

New in this release are a number of fixes and enhancements. To report any new bugs or enhancement requests, please use the issue tracker on ziesemer.dev.java.net.

ZipServlet combo mode

ZipServlet now provides a feature almost identical to YUI's Combo Handling. This allows for multiple files to be requested and sent joined together in one response, which can reduce the number of HTTP requests and improve performance, as detailed in this post on yuiblog.com.

This feature works particularly well for JavaScript and CSS files. However, only files of one type should be requested together; otherwise, the returned MIME Content-Type HTTP header wouldn't make any sense. Unlike the current implementation provided by yahooapis.com, ZipServlet enforces this by returning an HTTP 400 error (Bad Request) if multiple files are requested that have different content types.

The combo mode is enabled by default in ZipServlet, but can be disabled by using the "comboEnabled" servlet parameter. The path on which combination requests are answered can also be configured through the "comboPath" servlet parameter. comboPath defaults to "combo", as used by Yahoo! / YUI. Additional details are listed in the JavaDocs for ZipServlet, included in the download.

A notable difference between ZipServlet and the Yahoo! / YUI implementation is that YUI requires the version to be prefixed to each requested file, e.g. "/combo?2.6.0/build/yuiloader/yuiloader-min.js&2.6.0/build/dom/dom-min.js&2.6.0/build/event/event-min.js". This presumably allows for multiple files to be requested across different versions. This doesn't make a lot of sense, and results in a longer URL, as the version is repeated before each requested file. ZipServlet is designed to have each instance associated with one .zip file, and each version of a resource (YUI, etc.) belongs in its own file. For these reasons, and due to implementation details, it made sense to build the "combination" functionality directly into ZipServlet rather than into an additional filter, etc. This restricts combination requests to a given ZipServlet instance, while improving code reuse, performance, the URL length, and other aspects. The equivalent URL to the above when requested from ZipServlet, when using the example configuration in the previous post, including the "yui/" zipPrefix, would be "/combo?/build/yuiloader/yuiloader-min.js&/build/dom/dom-min.js&/build/event/event-min.js". Additionally, the leading slash before each file is optional in the ZipServlet implementation.
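
The request handling described above can be sketched roughly as follows. This is only an illustration and not ZipServlet's actual code: the class name ComboQuerySketch, its method names, and the hard-coded extension-to-type mapping are all assumptions made for the example (a real servlet would more likely ask the servlet container for the MIME type). It shows the optional leading slash being dropped, and a mixed-type request being rejected, which is where an HTTP 400 would be returned.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of combo-style query handling; not ZipServlet's actual code. */
final class ComboQuerySketch {

  /** Splits "a.js&/b.js" into entry paths, dropping the optional leading slash. */
  static List<String> parsePaths(String query) {
    List<String> paths = new ArrayList<>();
    for (String part : query.split("&")) {
      String path = part.startsWith("/") ? part.substring(1) : part;
      if (!path.isEmpty()) {
        paths.add(path);
      }
    }
    return paths;
  }

  /** Assumed, simplified mapping from file extension to content type. */
  static String contentTypeOf(String path) {
    if (path.endsWith(".js")) {
      return "text/javascript";
    }
    if (path.endsWith(".css")) {
      return "text/css";
    }
    return "application/octet-stream";
  }

  /**
   * Returns the single content type shared by all paths.
   * Throws if the request is empty or mixes types (the HTTP 400 case).
   */
  static String commonContentType(List<String> paths) {
    if (paths.isEmpty()) {
      throw new IllegalArgumentException("No files requested");
    }
    String type = contentTypeOf(paths.get(0));
    for (String path : paths) {
      String other = contentTypeOf(path);
      if (!type.equals(other)) {
        throw new IllegalArgumentException(
          "Mixed content types: " + type + " and " + other);
      }
    }
    return type;
  }
}
```

In a servlet, the IllegalArgumentException would presumably be translated into a "400 Bad Request" response rather than propagated.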

Unit Tests

New in this release is a complete suite of JUnit tests. I previously had not included any tests, partially due to the difficulty of finding a unit testing solution for Java EE servlets and filters that met my requirements. I had looked at HttpUnit ServletUnit, but found a number of shortcomings. While I had previously been testing under Apache Tomcat, this required manually starting and stopping the server, and registering the projects. There are methods to automate this, but none that didn't seem overly complex or that were without their own shortcomings.

I finally decided upon using Jetty. Jetty is both free and open-source, and is 100% Java, which makes it very portable. It is very representative of a production servlet engine, as it is one, used by many projects including the JBoss and Apache Geronimo application servers. Jetty fully supports being embedded within a Java application, which makes it very suitable for unit testing. It can be configured either programmatically or by using standard web.xml files, with details and several examples available at http://docs.codehaus.org/display/JETTY/Embedding+Jetty. As an added bonus, Jetty is readily available as an Apache Maven artifact, including the latest release and beta versions. This means that if opening the project with Maven support, such as with m2eclipse, Jetty and any other dependencies will automatically be downloaded and included on the testing classpath.

For this project, I wrote a ServletTester class that is reused by all the test classes. Each instance starts a Jetty instance on a dynamic port on the loopback address (127.0.0.1), rather than requiring that a specific port be used or configured. This class then provides convenience methods for obtaining the server, connector, context, and base URL.
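
To illustrate just the "dynamic port on the loopback address" idea without requiring Jetty on the classpath, here is a minimal sketch that swaps in the JDK's built-in com.sun.net.httpserver.HttpServer. It is not the ServletTester class from the download; the class name LoopbackServerSketch, the "/ping" context, and the method names are assumptions made for the example. Binding to port 0 asks the operating system for any free port, which is then read back to build the base URL.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a test server helper; uses the JDK server, not Jetty. */
final class LoopbackServerSketch implements AutoCloseable {

  private final HttpServer server;

  LoopbackServerSketch() {
    try {
      InetAddress loopback = InetAddress.getByName("127.0.0.1");
      // Port 0 lets the operating system choose any free port.
      server = HttpServer.create(new InetSocketAddress(loopback, 0), 0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    server.createContext("/ping", exchange -> {
      byte[] body = "pong".getBytes(StandardCharsets.UTF_8);
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
  }

  /** Base URL including the port that was actually assigned. */
  String getBaseUrl() {
    return "http://127.0.0.1:" + server.getAddress().getPort();
  }

  /** Performs a simple GET against the server and returns the body as text. */
  String get(String path) {
    try {
      HttpURLConnection conn = (HttpURLConnection)
        URI.create(getBaseUrl() + path).toURL().openConnection(Proxy.NO_PROXY);
      try (InputStream in = conn.getInputStream()) {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
      } finally {
        conn.disconnect();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public void close() {
    server.stop(0);
  }
}
```

A test would typically create one instance in a try-with-resources block (or hold it in a static field, as described for the standard tests) and issue requests against getBaseUrl().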

There are 2 types of tests that I provided for each component (ZipServlet and CompressFilter) - standard tests and configuration tests. The standard tests initialize and hold on to a Jetty instance within a ServletTester as a static field, which is reused by all the test methods. This is mainly for performance reasons, so that a new server isn't required for every test. The configuration tests deal specifically with testing the configuration options available for each component. While these configuration tests still reuse the same server instance, the server's context is restarted with a new ServletHandler for every test.

For the details of this methodology, please feel free to download the code and see for yourself. These tests also demonstrate the features and usage of the components.

XML and XSLT Tips and Tricks for Java

1. Get the latest versions

Starting with Java 1.4, the Java runtime has included a default XML parser and transformer implementation as part of the Java API for XML Processing (JAXP).

However, the included versions aren't up-to-date - not even to the latest versions available when each Java version was released.

As of this writing, the latest Apache Xerces2-J version is 2.9.1 (2.10.0 as of June 2010), and the latest Apache Xalan-J version is 2.7.1. I strongly recommend using the latest versions, as the versions built into Java are both somewhat limited and buggy. Xalan's FAQ page gives some strict notes concerning using a newer version under Java 1.4, due to the Endorsed Standards Override Mechanism. However, these instructions make it rather difficult - if not impossible - to use an updated library for a particular application on a shared JRE. Fortunately, these steps appear to be no longer required starting with Java 1.5 / 5.0. In these later versions, Sun has repackaged the Apache libraries into rt.jar as com.sun.org.apache.*, and the runtime properly loads any desired implementation based on the "META-INF/services/javax.xml.*" files found on the classpath. Implementations that include these files, such as Apache Xerces2-J and Xalan-J, will automatically be used by default if included on the classpath.
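
A quick way to check which implementations were actually picked up is to print the class names of the factories that JAXP returns. The following is a minimal sketch; the class name JaxpImplementationSketch and its describe() method are made up for the example. With no additional libraries on the classpath, the names would be expected to fall under com.sun.org.apache.*, and with Xerces2-J and Xalan-J present they should change to the org.apache.* classes.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;

/** Hypothetical helper that reports which JAXP implementations are in use. */
final class JaxpImplementationSketch {

  /** Returns one line per factory, naming the concrete implementation class. */
  static String describe() {
    return "DocumentBuilderFactory: "
        + DocumentBuilderFactory.newInstance().getClass().getName() + "\n"
      + "SAXParserFactory: "
        + SAXParserFactory.newInstance().getClass().getName() + "\n"
      + "TransformerFactory: "
        + TransformerFactory.newInstance().getClass().getName();
  }

  public static void main(String[] args) {
    System.out.println(describe());
  }
}
```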

2. Use Templates to reuse Transformations

Most of the JAXP interfaces are not thread-safe, including the factories and the instances obtained from them. That is, instances of neither DocumentBuilderFactory nor DocumentBuilder should be stored statically or in any other way where they could be accessed by multiple threads.

The same applies to a Transformer. While it can be used repeatedly within a given thread, it is not thread-safe for use across multiple threads.

The solution is to use a Templates object, which can be thought of as a compiled-form of a stylesheet. Per the JavaDoc, "Templates must be threadsafe for a given instance over multiple threads running concurrently, and may be used multiple times in a given session." Additionally, use of Templates for repeated transformations will probably provide a performance improvement, as the transformation source (usually an XSLT) doesn't need to be re-read, re-parsed, and re-compiled.

Here is some simple, typical code for performing a transformation without a Templates object:

TransformerFactory tf = TransformerFactory.newInstance();
StreamSource myStylesheetSrc = new StreamSource(
  getClass().getResourceAsStream("MyStylesheet.xslt"));
Transformer t = tf.newTransformer(myStylesheetSrc);
t.transform(new StreamSource(System.in), new StreamResult(System.out));

Here is the improved code, which makes use of a reusable Templates object:

TransformerFactory tf = TransformerFactory.newInstance();
if(!tf.getFeature(SAXTransformerFactory.FEATURE)){
  throw new RuntimeException(
    "Did not find a SAX-compatible TransformerFactory.");
}
SAXTransformerFactory stf = (SAXTransformerFactory)tf;
StreamSource myStylesheetSrc = new StreamSource(
  getClass().getResourceAsStream("MyStylesheet.xslt"));
Templates templates = stf.newTemplates(myStylesheetSrc);

// templates can now be stored and re-used from practically anywhere.

Transformer t = templates.newTransformer();
t.transform(new StreamSource(System.in), new StreamResult(System.out));
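
To make the "stored and re-used from practically anywhere" point concrete, here is a small self-contained sketch of one way to share a Templates object between threads. The class name SharedTemplatesSketch, the inline stylesheet, and the thread pool size are all assumptions for illustration. The Templates object is compiled once and kept in a static field, while each call obtains its own Transformer, which is never shared.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical example of sharing one compiled stylesheet across threads. */
final class SharedTemplatesSketch {

  private static final String XSLT =
    "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/'>Hello, <xsl:value-of select='/name'/>!</xsl:template>"
    + "</xsl:stylesheet>";

  // Compiled once; Templates is documented as safe for concurrent use.
  private static final Templates TEMPLATES = compile();

  private static Templates compile() {
    try {
      return TransformerFactory.newInstance()
        .newTemplates(new StreamSource(new StringReader(XSLT)));
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  /** Each call gets its own Transformer; only the Templates object is shared. */
  static String greet(String name) {
    try {
      Transformer t = TEMPLATES.newTransformer();
      StringWriter out = new StringWriter();
      // The name is assumed to contain no XML markup in this simple demo.
      t.transform(
        new StreamSource(new StringReader("<name>" + name + "</name>")),
        new StreamResult(out));
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  /** Runs the transformations on a small thread pool, preserving input order. */
  static List<String> greetConcurrently(List<String> names) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<Future<String>> futures = new ArrayList<>();
      for (String name : names) {
        Callable<String> task = () -> greet(name);
        futures.add(pool.submit(task));
      }
      List<String> results = new ArrayList<>();
      for (Future<String> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
  }
}
```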

3. Chaining Transformations

When multiple, successive transformations are required to the same XML document, be sure to avoid unnecessary parsing operations. I frequently run into code that transforms a String to another String, then transforms that String to yet another String. Not only is this slow, but it can consume a significant amount of memory as well, especially if the intermediate Strings aren't allowed to be garbage collected.

Most transformations are based on a series of SAX events. A SAX parser will typically parse an InputStream or another InputSource into SAX events, which can then be fed to a Transformer. Rather than having the Transformer output to a File, String, or another such Result, a SAXResult can be used instead. A SAXResult accepts a ContentHandler, which can pass these SAX events directly to another Transformer, etc.

Here is one approach, and the one I usually prefer as it provides more flexibility for various input and output sources. It also makes it fairly easy to create a transformation chain dynamically and with a variable number of transformations.

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

// These templates objects could be reused and obtained from elsewhere.
Templates templates1 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet2.xslt")));

TransformerHandler th1 = stf.newTransformerHandler(templates1);
TransformerHandler th2 = stf.newTransformerHandler(templates2);

th1.setResult(new SAXResult(th2));
th2.setResult(new StreamResult(System.out));

Transformer t = stf.newTransformer();
t.transform(new StreamSource(System.in), new SAXResult(th1));

// th1 feeds th2, which in turn feeds System.out.
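
As mentioned, this style extends naturally to a chain with a variable number of transformations. The following is a minimal sketch of that idea; the class name TransformChainSketch, the runChain method, and the two tiny inline stylesheets are assumptions made for the example. The chain is built from back to front, so that each TransformerHandler already knows the Result it feeds.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.transform.Result;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper that chains any number of stylesheets via SAX events. */
final class TransformChainSketch {

  private static final String HEAD =
    "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

  // Stage 1 renames the root element from "a" to "b".
  static final String STAGE1 = HEAD
    + "<xsl:template match='/a'><b><xsl:value-of select='.'/></b></xsl:template>"
    + "</xsl:stylesheet>";

  // Stage 2 only matches "b", so its output proves that stage 1 ran first.
  static final String STAGE2 = HEAD
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/b'>[<xsl:value-of select='.'/>]</xsl:template>"
    + "</xsl:stylesheet>";

  /** Applies the stylesheets in list order and returns the final output. */
  static String runChain(String inputXml, List<String> stylesheets) {
    try {
      SAXTransformerFactory stf =
        (SAXTransformerFactory) TransformerFactory.newInstance();
      StringWriter out = new StringWriter();

      // Build back to front: the last stage writes to the final result.
      Result next = new StreamResult(out);
      for (int i = stylesheets.size() - 1; i >= 0; i--) {
        Templates templates = stf.newTemplates(
          new StreamSource(new StringReader(stylesheets.get(i))));
        TransformerHandler handler = stf.newTransformerHandler(templates);
        handler.setResult(next);
        next = new SAXResult(handler);
      }

      // An identity Transformer parses the input and feeds the first stage.
      stf.newTransformer().transform(
        new StreamSource(new StringReader(inputXml)), next);
      return out.toString();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }

  static String demo() {
    return runChain("<a>hi</a>", Arrays.asList(STAGE1, STAGE2));
  }
}
```

In real use, the Templates objects would be compiled once and reused, rather than compiled on every call as in this sketch.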

Here is another approach, which makes use of XMLFilters. This approach is also documented in Sun's J2EE 1.4 Tutorial.

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();

// These templates objects could be reused and obtained from elsewhere.
Templates templates1 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet1.xslt")));
Templates templates2 = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("MyStylesheet2.xslt")));

SAXParserFactory spf = SAXParserFactory.newInstance();
SAXParser parser = spf.newSAXParser();
XMLReader reader = parser.getXMLReader();

XMLFilter filter1 = stf.newXMLFilter(templates1);
XMLFilter filter2 = stf.newXMLFilter(templates2);

filter1.setParent(reader);
filter2.setParent(filter1);

Transformer t = stf.newTransformer();
t.transform(
  new SAXSource(filter2, new InputSource(System.in)),
  new StreamResult(System.out));

Note how in this latter approach, the filter is applied at the source instead of the result.

4. Input Validation

Prior to Java 1.5 / 5.0, the only way to control validation through the JAXP API was to set custom attributes. This is described quite well in Sun's J2EE 1.4 Tutorial in "Validating with XML Schema", so I won't repeat it all here. However, do pay attention to the end of the page, which explains that schemas can be loaded from several different sources, including InputStreams and other InputSources - not just local Files or URLs, which many developers seem to overlook.

Starting with Java 1.5 / 5.0, the function of setValidating on DocumentBuilderFactory seems to have changed slightly. It now essentially controls only DTD validation, not modern schema validation such as W3C XML Schema or RELAX NG. Instead, a setSchema method is available, which accepts a compiled Schema object. Like the Templates object above, this is one of the few JAXP classes that is thread-safe and is meant for reuse.

One advantage with the DocumentBuilderFactory's setSchema method is that a document can be checked not only for well-formedness and for validity against a schema, but also for validity against a particular, pre-defined schema. Additionally, by default, the parsing process will follow URLs out to the Internet to resolve schemas, etc. Passing in a Schema object built from locally-kept files can improve performance, and eliminate the need for accessing the Internet. However, if there are additional references to be resolved, further attempts may still be made. These can be intercepted by registering an EntityResolver to the DocumentBuilder.
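
Here is a minimal sketch of validating against a pre-defined, locally kept schema through setSchema. The class name SchemaValidationSketch and the one-element inline schema are assumptions for the example; a real schema would more likely be loaded from a file or classpath resource. Note the ErrorHandler: without one, validation errors are typically only reported and parsing continues, so the sketch throws on any error.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical example of validating against a pre-compiled Schema. */
final class SchemaValidationSketch {

  private static final String XSD =
    "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
    + "<xs:element name='count' type='xs:int'/>"
    + "</xs:schema>";

  // Schema objects are thread-safe, so one compiled instance can be shared.
  private static final Schema SCHEMA = compile();

  private static Schema compile() {
    try {
      SchemaFactory sf =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      return sf.newSchema(new StreamSource(new StringReader(XSD)));
    } catch (SAXException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Returns true only if the XML is well-formed and valid per the schema. */
  static boolean isValid(String xml) {
    try {
      // Factories and builders are not thread-safe, so create them per call.
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setNamespaceAware(true);
      dbf.setSchema(SCHEMA);
      DocumentBuilder db = dbf.newDocumentBuilder();
      db.setErrorHandler(new ErrorHandler() {
        @Override
        public void warning(SAXParseException e) {
          // Warnings are ignored in this sketch.
        }
        @Override
        public void error(SAXParseException e) throws SAXException {
          throw e; // validation errors arrive here
        }
        @Override
        public void fatalError(SAXParseException e) throws SAXException {
          throw e; // well-formedness errors arrive here
        }
      });
      db.parse(new InputSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false;
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```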

For ensuring that a particular DTD is used, use the extended EntityResolver2. I've found that if the DOCTYPE is missing, getExternalSubset is called. To use a particular DOCTYPE by default, this method could call and return the result from resolveEntity, after passing in the desired publicId and/or systemId. If the XML to be validated already includes a DOCTYPE, then resolveEntity will be called directly. This can be written to either throw an exception or silently return the desired entity when an unexpected entity is received.

5. XML Creation using XSLT

XSLT is a well-known method for transforming XML, but it can also be used for XML generation. The easiest way is to use XSLT as a transformation, similar to the above methods, but with an empty input source. This is additionally noted in the Transformer.transform(…) JavaDoc.

Using XSLT for XML generation works particularly well when the XML is rather static, or when the XSLT can be used as a template. The Transformer's setParameter(…) can be used to pass parameters into the transformation which can then be used as variables. To avoid possible naming collisions, especially when using larger or 3rd-party XSLTs, I strongly recommend using the namespace prefixes.

Below is a sample XSLT with namespaced parameters, then populated by Java code. It generates a valid XHTML document, with the document title and a message in the body passed-in as parameters:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://namespaces.ziesemer.com/example"
    exclude-result-prefixes="z">
    
  <xsl:output
    method="html"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
  
  <xsl:param name="z:title"/>
  <xsl:param name="z:message"/>
  
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
      <head>
        <title><xsl:value-of select="$z:title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="$z:title"/></h1>
        <p><xsl:value-of select="$z:message"/></p>
      </body>
    </html>
  </xsl:template>
  
</xsl:stylesheet>

final String NAMESPACE_PREFIX = "{http://namespaces.ziesemer.com/example}";

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();
Templates templates = stf.newTemplates(new StreamSource(
  getClass().getResourceAsStream("XHTMLMessage.xslt")));

// templates can now be stored and re-used from practically anywhere.

Transformer t = templates.newTransformer();
t.setParameter(NAMESPACE_PREFIX + "title",
  "Example Title");
t.setParameter(NAMESPACE_PREFIX + "message",
  "Example Message");

t.transform(new DOMSource(), new StreamResult(System.out));

This approach has a number of advantages. It is fairly easy to see what is happening, and it is easy to make changes or additions to the output. It guarantees valid XML output, as an exception will be thrown if the XSLT is invalid. It also performs quite well.

6. XSLT Inheritance

Just as common functionality can be factored out of Java classes into shared parent classes, XSLT can also make similar use of inheritance by using Stylesheet Imports. Here is an example split into a parent and child:

XHTMLTemplate.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://namespaces.ziesemer.com/example"
    exclude-result-prefixes="z">
    
  <xsl:output
    method="html"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
  
  <xsl:param name="z:title"/>
  
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
      <head>
        <title><xsl:value-of select="$z:title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="$z:title"/></h1>
        <xsl:call-template name="z:Message"/>
      </body>
    </html>
  </xsl:template>
  
  <!-- This should be overridden by child stylesheets. -->
  <xsl:template name="z:Message"/>
  
</xsl:stylesheet>

XHTMLMessage.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://namespaces.ziesemer.com/example"
    exclude-result-prefixes="z">
  
  <xsl:import href="XHTMLTemplate.xslt"/>
  
  <xsl:param name="z:message"/>
  
  <xsl:template name="z:Message">
    <p xmlns="http://www.w3.org/1999/xhtml">
      <xsl:value-of select="$z:message"/>
    </p>
  </xsl:template>
  
</xsl:stylesheet>

Depending upon the type of location of the dependent files, a custom URIResolver will probably be needed to properly resolve the resources. In my example, I'm reading from the Java classpath. Other possibilities could include the local file system, or HTTP URLs. Only the necessary changes to the above Java code are shown below:

URIResolver resolver = new URIResolver(){
  @Override
  public Source resolve(String href, String base) throws TransformerException{
    return new StreamSource(getClass().getResourceAsStream(href));
  }};

SAXTransformerFactory stf = (SAXTransformerFactory)TransformerFactory.newInstance();
stf.setURIResolver(resolver);
Templates templates = stf.newTemplates(resolver.resolve("XHTMLMessage.xslt", null));

7. XSLT Extensions

Using parameters is a start, but the limitations are quickly visible. However, when combined with extension mechanisms, XSLT generation should be able to solve almost any requirement. Reading http://xml.apache.org/xalan-j/extensions.html is an excellent starting point. When properly used, extensions can feed into the transformation process and keep the memory footprint to a minimum.

Following is an example that uses an XSLT extension to output a variable number of messages. Additionally, this method allows for the properties to be calculated dynamically on each iteration, rather than pre-processing and storing the formatted messages - which can save memory.

XHTMLMessage.xslt:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://namespaces.ziesemer.com/example"
    xmlns:zMessageGenExt="com.ziesemer.example.MessageGenerator"
    xmlns:zMessageExt="com.ziesemer.example.Message"
    extension-element-prefixes="zMessageGenExt zMessageExt"
    exclude-result-prefixes="z zMessageGenExt zMessageExt">
    
  <xsl:output
    method="html"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
  
  <xsl:param name="z:title"/>
  <xsl:param name="z:ext"/>
  
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
      <head>
        <title><xsl:value-of select="$z:title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="$z:title"/></h1>
        <xsl:call-template name="z:Messages"/>
      </body>
    </html>
  </xsl:template>
  
  <xsl:template name="z:Messages">
    <xsl:variable name="z:message" select="zMessageGenExt:getNextMessage($z:ext)"/>
    <xsl:if test="string($z:message)">
      <p xmlns="http://www.w3.org/1999/xhtml">
        <b>
          <xsl:value-of select="zMessageExt:getTitle($z:message)"/>
        </b><xsl:text>: </xsl:text>
        <xsl:value-of select="zMessageExt:getDescription($z:message)"/>
      </p>
      <xsl:call-template name="z:Messages"/>
    </xsl:if>
  </xsl:template>
  
</xsl:stylesheet>

XHTMLExample.java:

t.setParameter(NAMESPACE_PREFIX + "ext",
  new MessageGenerator());

IMessage.java:

package com.ziesemer.example;

public interface IMessage{
  String getTitle();
  String getDescription();
}

MessageGenerator.java:

package com.ziesemer.example;

public class MessageGenerator{
  
  protected int index = 0;
  protected int size = 5;
  
  public IMessage getNextMessage(){
    if(index < size){
      IMessage msg = new TestMessage();
      index++;
      return msg;
    }
    return null;
  }
  
  protected class TestMessage implements IMessage{
    @Override
    public String getTitle(){
      return String.format("This is title %d.", index);
    }
    
    @Override
    public String getDescription(){
      return String.format("This is description %d.", index);
    }
  }
}

Output:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
<head>
<title>Example Title</title>
</head>
<body>
<h1>Example Title</h1>
<p>
<b>This is title 1.</b>: This is description 1.</p>
<p>
<b>This is title 2.</b>: This is description 2.</p>
<p>
<b>This is title 3.</b>: This is description 3.</p>
<p>
<b>This is title 4.</b>: This is description 4.</p>
<p>
<b>This is title 5.</b>: This is description 5.</p>
</body>
</html>

XSLT is more of a functional language than a procedural one, and the demonstrated use of recursion is really the only way to implement a loop. Unfortunately, this can lead to stack overflow errors within the Java implementation. This can be mitigated by increasing the stack size. While this can be set globally for the JVM, it is certainly not the best option. There is a Thread constructor that allows for the thread's stack size to be specified, but it is platform-dependent, and is still only a mitigation. There is a good article on IBM developerWorks, "Use recursion effectively in XSL" (Jared Jackson, 2002-10-01), that specifically addresses this stack overflow issue with XSL recursion. Unfortunately, the examples provided either require a pre-known list size, or only calculate within the loop rather than producing output.
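
For reference, the Thread constructor in question is Thread(ThreadGroup, Runnable, String, long), where the last argument is the requested stack size in bytes. Below is a minimal sketch of running a deeply recursive task on such a thread; the class name LargeStackSketch, the callWithStack helper, and the sample depth method are assumptions for the example. As noted, the JVM treats the stack size only as a hint, and some platforms may ignore it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper that runs a task on a thread with a requested stack size. */
final class LargeStackSketch {

  /** Runs the task on a new thread and waits for its result. */
  static <T> T callWithStack(Callable<T> task, long stackSizeBytes) {
    AtomicReference<T> result = new AtomicReference<>();
    AtomicReference<Throwable> failure = new AtomicReference<>();

    Runnable body = () -> {
      try {
        result.set(task.call());
      } catch (Throwable t) {
        // Includes StackOverflowError, should the stack still be too small.
        failure.set(t);
      }
    };

    // The stack size is a platform-dependent hint, not a guarantee.
    Thread worker = new Thread(null, body, "large-stack-worker", stackSizeBytes);
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    if (failure.get() != null) {
      throw new IllegalStateException(failure.get());
    }
    return result.get();
  }

  /** Deliberately recursive stand-in for a recursion-heavy transformation. */
  static int depth(int n) {
    return n == 0 ? 0 : 1 + depth(n - 1);
  }
}
```

A transformation could be wrapped the same way, e.g. by passing a Callable that calls t.transform(…) and returns the result.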

Here are some modifications to my above XSLT that recurse down a tree, splitting into 2 children at each level, which makes the maximum necessary depth log2(n). (A traditional divide & conquer algorithm.) However, when supporting a loop of an unknown size, the depth cannot be calculated to the appropriate minimum level in advance. In my approach, a pre-defined depth of a "sufficient size" is used. I chose 32, as 2^32 = 4,294,967,296, which would be the size of the range of Java's Integer if it were unsigned. (As Integers are signed in Java, the maximum value of an int is 2,147,483,647.) Also, my trials have shown that the default stack size supports over 1,000 recursions before overflowing, so 32 should be a more than safe value. The tree will be filled "depth-first", and the tree will then continue to grow in "width". Some additional work is done to test if each call still resulted in output. If not, an xsl:if prevents further recursion, otherwise the entire tree would still be built and traversed. With more than 4 billion potential nodes in the tree, completing the recursion would require an unacceptable amount of time.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:z="http://namespaces.ziesemer.com/example"
    xmlns:zMessageGenExt="com.ziesemer.example.MessageGenerator"
    xmlns:zMessageExt="com.ziesemer.example.Message"
    extension-element-prefixes="zMessageGenExt zMessageExt"
    exclude-result-prefixes="z zMessageGenExt zMessageExt">
    
  <xsl:output
    method="html"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>
  
  <xsl:param name="z:title"/>
  <xsl:param name="z:ext"/>
  
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml" lang="en">
      <head>
        <title><xsl:value-of select="$z:title"/></title>
      </head>
      <body>
        <h1><xsl:value-of select="$z:title"/></h1>
        <xsl:call-template name="z:MessagesRecursive">
          <xsl:with-param name="z:depth" select="0"/>
        </xsl:call-template>
      </body>
    </html>
  </xsl:template>
  
  <xsl:template name="z:MessagesRecursive">
    <xsl:param name="z:depth"/>
    <!-- Check the depth before fetching, so that a message is never
      consumed from the generator and then discarded at the depth limit. -->
    <xsl:if test="$z:depth &lt; 32">
      <xsl:variable name="x">
        <xsl:call-template name="z:Messages"/>
      </xsl:variable>
      <xsl:if test="string($x)">
        <xsl:copy-of select="$x"/>
        <xsl:call-template name="z:MessagesRecursive">
          <xsl:with-param name="z:depth" select="$z:depth + 1"/>
        </xsl:call-template>
        <xsl:call-template name="z:MessagesRecursive">
          <xsl:with-param name="z:depth" select="$z:depth + 1"/>
        </xsl:call-template>
      </xsl:if>
    </xsl:if>
  </xsl:template>
  
  <xsl:template name="z:Messages">
    <xsl:variable name="z:message" select="zMessageGenExt:getNextMessage($z:ext)"/>
    <xsl:if test="string($z:message)">
      <p xmlns="http://www.w3.org/1999/xhtml">
        <b>
          <xsl:value-of select="zMessageExt:getTitle($z:message)"/>
        </b><xsl:text>: </xsl:text>
        <xsl:value-of select="zMessageExt:getDescription($z:message)"/>
      </p>
    </xsl:if>
  </xsl:template>
  
</xsl:stylesheet>
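
The shape of this recursion may be easier to follow in Java. The following sketch mimics the same binary-tree traversal over an Iterator, where the Iterator stands in for the message generator; the class name BoundedRecursionSketch and its members are assumptions for the example. It checks the depth limit before pulling the next item, so that nothing is consumed and then dropped, and it records the deepest level reached to show that the stack depth stays bounded regardless of how many items are produced.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical Java analogue of the depth-limited, two-way XSLT recursion. */
final class BoundedRecursionSketch {

  private final Iterator<String> source;
  private final int maxDepth;

  /** Items in the order they were emitted. */
  final List<String> output = new ArrayList<>();

  /** Deepest recursion level that emitted an item, or -1 if none did. */
  int deepestLevel = -1;

  BoundedRecursionSketch(Iterator<String> source, int maxDepth) {
    this.source = source;
    this.maxDepth = maxDepth;
  }

  void run() {
    visit(0);
  }

  private void visit(int depth) {
    // Stop at the depth limit, or as soon as the source is exhausted.
    if (depth >= maxDepth || !source.hasNext()) {
      return;
    }
    deepestLevel = Math.max(deepestLevel, depth);
    output.add(source.next());
    // Two children per node: the tree fills depth-first, then widens.
    visit(depth + 1);
    visit(depth + 1);
  }
}
```

With a limit of 32, the traversal can emit up to 2^32 - 1 items while never recursing more than 32 levels deep, and the items come out in their original order.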

8. Beware of classloaders

One frustrating issue I recently dealt with was related to multiple classloaders, where extension classes and methods were being reported as "not found", as mentioned in the Xalan-J FAQ. Fortunately, the Xalan implementation appears to handle multiple classloaders in quite a robust fashion, by using the context ClassLoader. In the environment where I was working, the Xalan classes were in a parent classloader of the one containing the extension classes; however, this had never proved to be a problem for me previously. The actual error in my particular case was that the servlet engine was old and buggy, and was not setting the context ClassLoader on new threads. I worked around this by calling setContextClassLoader(…) in my servlet's service(…) method, before calling super.service(…).
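
The general pattern behind this workaround can be sketched without the servlet API. In the following example, the class name ContextClassLoaderSketch and the runWithContextClassLoader helper are assumptions, and restoring the previous loader afterwards is an addition of the sketch rather than something the workaround above strictly required. In a servlet, the loader passed in would typically be the one that loaded the servlet class, and the task would be the call to super.service(…).

```java
/** Hypothetical helper that runs a task under a given context ClassLoader. */
final class ContextClassLoaderSketch {

  /** Sets the context ClassLoader for the task, then restores the old one. */
  static void runWithContextClassLoader(ClassLoader loader, Runnable task) {
    Thread current = Thread.currentThread();
    ClassLoader previous = current.getContextClassLoader();
    current.setContextClassLoader(loader);
    try {
      task.run();
    } finally {
      // Restore even if the task throws, so pooled threads are left unchanged.
      current.setContextClassLoader(previous);
    }
  }
}
```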

9. Using DocumentFragments

This is really the first use I found of DocumentFragment where it seemed appropriate. Even when using the XSLT approach, there may be instances where it is necessary or easier to build and include a section of XML from within Java rather than XSLT. If used excessively, fragments are counter-productive to the advantages of using XSLT. Understand that while XSLT can stream the content in a pipelined fashion as it is produced, each DocumentFragment must be completely built and returned before it can be streamed, which will increase memory requirements with the size of the fragments.

As documented on the Xalan Extensions page, DocumentFragments are a valid return type from an extension. They are also far easier to produce than the other Node-Set types. Here is the best method I found to make use of this functionality:

Java extension method:

public DocumentFragment fill(Node n){
  Document doc = (Document)n;
  DocumentFragment df = doc.createDocumentFragment();
  
  // Append any number of children and/or sub-children...
  Element e = doc.createElement("Example");
  df.appendChild(e);
  
  return df;
}

XSLT:

<xsl:copy-of select="extensionPrefix:fill($instanceVariable, .)"/>

10. XSLT vs. JAXB and JibX, Castor, etc.

While several people I know are big fans of XML data binding frameworks, I try my best to avoid them. For almost any use that I've seen of these frameworks, I would contend that XSLT and/or one of the XML generation techniques I previously described would be a better fit. In general, these frameworks introduce additional complexities and dependencies, along with usually artificial limitations. I've seen several performance comparisons and presentations between such frameworks, but none that dare to include XSLT and the other direct approaches.