Saturday, March 14, 2009

XML Property Expansion with MarkUtils-XML

XML is an increasingly popular format for configuration files. Unfortunately, XML doesn't have a built-in standard for variable substitution or property expansion. This addition to my MarkUtils-XML package is one solution. It is included as of version 2009.03.14, and is available for download at http://code.google.com/p/ziesemer/downloads/list?q=label:Featured.

Background

Several existing practices make use of replacement string patterns, e.g. "${…}" described and supported by Apache Commons Digester's MultiVariableExpander. Apache Maven uses the same pattern in its pom.xml files, but doesn't appear to use Digester to implement this. Instead, Maven uses implementations of org.codehaus.plexus.component.configurator.expression.

I see a few issues with this overall approach of using replacement string patterns:

  • While Maven supports resolving properties from 5 different sources, including environment variables and Java system properties, the implementation doesn't eliminate the possibility of naming collisions between those sources.
  • Neither Digester nor Maven appears to allow for escaping these patterns. While not common, this can cause issues if "${" needs to be used and passed as-is.
  • Neither implementation makes it clear what happens if replacement string patterns are present but not expanded due to the variable not existing. (Is an exception thrown, the replacement string pattern returned as part of the result, or the replacement string pattern expanded to nothing ("")?)
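
To make the last two points concrete, here is a minimal sketch of a naive "${…}" expander. It is purely illustrative and is not how Digester or Maven are implemented; the class name, the Unresolved enum, and the regular expression are all assumptions of mine. Note that the caller has to pick one of the three unresolved-variable behaviors, and that there is no way to pass a literal "${" through when the name happens to be defined.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Digester's or Maven's implementation. */
final class NaivePlaceholderExpander {

  /** The three possible behaviors for a placeholder whose variable is undefined. */
  enum Unresolved { THROW, KEEP, EMPTY }

  // Matches "${name}" and captures "name". No escape syntax is supported.
  private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]*)\\}");

  static String expand(String input, Map<String, String> vars, Unresolved policy) {
    Matcher m = PLACEHOLDER.matcher(input);
    StringBuffer out = new StringBuffer();
    while (m.find()) {
      String value = vars.get(m.group(1));
      if (value == null) {
        switch (policy) {
          case THROW:
            throw new IllegalArgumentException("Undefined variable: " + m.group(1));
          case KEEP:
            value = m.group(); // leave the pattern in the result as-is
            break;
          default:
            value = ""; // expand to nothing
            break;
        }
      }
      // quoteReplacement keeps "$" and "\" in the value from being interpreted.
      m.appendReplacement(out, Matcher.quoteReplacement(value));
    }
    m.appendTail(out);
    return out.toString();
  }
}
```

All three policies are defensible, which is exactly why a configuration format should state which one it uses.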

Granted, these few issues could all easily be fixed through the use of escape characters, etc. However, more similar issues will probably continue to appear. Instead, let's use XML as I think it is intended.

Consider the use of XSLT. For inserting values from referenced elements into the output, <xsl:value-of select="…"/> is used, rather than a "${…}"-style syntax. However, this isn't the best example, as the select is an XPath expression, which then makes use of the "$" prefix as part of a VariableReference.

Solution

The solution provided in MarkUtils-XML is in 2 parts: a schema that declares the property elements, and an XmlPropertyExpander class that performs the expansions.

Schema

The schema is namespaced to avoid naming collisions, as "http://namespaces.ziesemer.com/utils.xml/propertyExpansion". It is meant for easy inclusion into other schemas through the use of the <import …/> element. The following property elements are defined, all extending "BasePropertyType", with one required "name" attribute of type NCName:

  • <SystemProperty/> - Resolves Java system properties from System.getProperty(…).
  • <EnvironmentProperty/> - Resolves environment variables from System.getenv(…).
  • <InstanceProperty/> - Resolves properties from those set on the current instance of the property expander.
  • <LocalProperty/> - Resolves properties from <LocalPropertyDef/> elements registered with the current instance of the property expander, typically from the same XML file.
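
Because each source has its own element, the same name can exist in several sources without colliding. The dispatch can be sketched roughly as follows. This is a simplified stand-in, not the actual XmlPropertyExpander code: the class and method names are hypothetical, and <LocalProperty/> is left out because it needs the <LocalPropertyDef/> elements.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; only the element names come from the schema. */
final class PropertySourceSketch {

  private final Map<String, String> instanceMap = new HashMap<>();

  void setInstanceProperty(String name, String value) {
    instanceMap.put(name, value);
  }

  /** Returns the resolved value, or null when the property is not defined. */
  String resolve(String elementName, String propName) {
    switch (elementName) {
      case "SystemProperty":
        return System.getProperty(propName);
      case "EnvironmentProperty":
        return System.getenv(propName);
      case "InstanceProperty":
        return instanceMap.get(propName);
      default:
        // LocalProperty and external-namespace elements are omitted here.
        throw new IllegalArgumentException("Unsupported element: " + elementName);
    }
  }
}
```

With this shape, an instance property named "java.home" and the system property "java.home" stay distinct, since the element name selects the source.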

A "PropertyTypeGroup" group exists that allows the above 4 property element types, as well as any elements from any other namespace - allowing for extensions or use-cases not covered by the above pre-defined elements. A "SubstitutionVariableType" group exists that allows for any elements from this group, as well as other mixed content.

For defining <LocalProperty/> values, use one or more <LocalPropertyDef/> elements. They are of type "LocalPropertyDefType", which extends "BasePropertyType". This allows references to other properties, including other <LocalProperty/> elements, to be used as part of the definition for a <LocalProperty/>. Just be sure to avoid creating circular references, which would normally result in a StackOverflowError but are instead caught earlier (after 32 recursions by default) and thrown as a RuntimeException. A default "Properties" element exists to act as a container for one or more <LocalPropertyDef/> elements, and includes an XSD <key/> element to provide a constraint against multiple local property elements with the same name.
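
The recursion limit can be pictured with a small sketch. This is not the library's code: the class, the define(…) helper, and the "@name" notation for a reference to another local property are assumptions made only to keep the example free of DOM. Only the limit of 32 and the RuntimeException come from the description above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a depth guard against circular local properties. */
final class LocalPropertyDepthGuard {

  static final int MAX_DEPTH = 32; // default limit mentioned in the text

  // A definition is a list of parts. A part starting with '@' refers to
  // another local property (illustrative notation only).
  private final Map<String, List<String>> defs = new HashMap<>();

  void define(String name, String... parts) {
    defs.put(name, Arrays.asList(parts));
  }

  String resolve(String name) {
    return resolve(name, 0);
  }

  private String resolve(String name, int depth) {
    if (depth > MAX_DEPTH) {
      // Fail early instead of recursing until a StackOverflowError.
      throw new RuntimeException("Possible circular reference at: " + name);
    }
    List<String> parts = defs.get(name);
    if (parts == null) {
      return ""; // unresolved: contribute nothing
    }
    StringBuilder sb = new StringBuilder();
    for (String part : parts) {
      sb.append(part.startsWith("@") ? resolve(part.substring(1), depth + 1) : part);
    }
    return sb.toString();
  }
}
```

A depth counter is cheaper than tracking the set of visited names, at the cost of also rejecting legitimate chains that are nested deeper than the limit.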

Shown below is an example schema that allows for storing a list of file system directories. It imports the "PropertyExpansion" schema, allowing for a <Properties/> element, and the inclusion of one or more "PropertyTypeGroup" elements for property expansion. This example is included in the distribution as "DirectoriesExample.xsd" in the JUnit source:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:pe="http://namespaces.ziesemer.com/utils.xml/propertyExpansion"
    elementFormDefault="qualified">
  
  <xs:import namespace="http://namespaces.ziesemer.com/utils.xml/propertyExpansion"/>
  
  <xs:element name="Directories">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="pe:Properties" minOccurs="0"/>
        <xs:element ref="Directory" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    
    <xs:keyref name="PropertyKeyRef" refer="pe:PropertyKey">
      <xs:selector xpath=".//pe:LocalProperty"/>
      <xs:field xpath="@name"/>
    </xs:keyref>
  </xs:element>
    
  <xs:element name="Directory" type="pe:SubstitutionVariableType"/>
  
</xs:schema>

Now, a simple example XML document that follows the above schema. It is also included in the distribution as "DirectoriesExample.xml" in the JUnit source:

<?xml version="1.0" encoding="UTF-8"?>
<Directories
    xmlns:pe="http://namespaces.ziesemer.com/utils.xml/propertyExpansion">
  
  <pe:Properties>
    <pe:LocalPropertyDef name="java.lib">
      <pe:SystemProperty name="java.home"/>/lib
    </pe:LocalPropertyDef>
  </pe:Properties>
  
  <Directory><pe:SystemProperty name="java.home"/>/bin</Directory>
  <Directory><pe:SystemProperty name="java.home"/>/lib</Directory>
  
  <Directory><pe:LocalProperty name="java.lib"/></Directory>
  <Directory><pe:LocalProperty name="java.lib"/>/ext</Directory>
  <Directory><pe:LocalProperty name="java.lib"/>/management</Directory>
  <Directory><pe:LocalProperty name="java.lib"/>/security</Directory>
  
</Directories>

By default, leading and trailing whitespace is trimmed from each text node during expansion (whitespace in the interior of a text node is kept). This allows for new lines or other "pretty-printing" to be used within the elements to be expanded without having the whitespace included in the expanded result. If whitespace must be maintained, use CDATA sections.
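
The difference between plain text and CDATA can be shown with the standard DOM API. The following is a sketch under my own assumptions, with hypothetical class and method names; it only mirrors the behavior described above and is not taken from the library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: trim plain text nodes, keep CDATA sections verbatim. */
final class TrimSketch {

  static String textOf(Node parent) {
    StringBuilder sb = new StringBuilder();
    NodeList children = parent.getChildNodes();
    for (int i = 0; i < children.getLength(); i++) {
      Node child = children.item(i);
      if (child.getNodeType() == Node.CDATA_SECTION_NODE) {
        sb.append(child.getNodeValue()); // whitespace preserved
      } else if (child.getNodeType() == Node.TEXT_NODE) {
        sb.append(child.getNodeValue().trim()); // only the ends are trimmed
      }
    }
    return sb.toString();
  }

  /** Parses a small document; the default factory is non-coalescing, so CDATA stays separate. */
  static Node parseRoot(String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

For example, a pretty-printed element containing "/usr/local lib" on its own line yields "/usr/local lib", with the inner space intact, while the same text wrapped in a CDATA section keeps its surrounding whitespace.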

Java class

Everything needed to expand these properties from Java is contained within one class, com.ziesemer.utils.xml.XmlPropertyExpander. It is best-suited for use with XML DOM, providing one core method for expanding elements containing child properties into a String:

public String expand(Node parent);

If no <LocalPropertyDef/> elements are to be supported, XmlPropertyExpander can be instantiated using the 0-argument constructor. Otherwise, the constructor accepts one or more Nodes that contain child <LocalPropertyDef/> elements to resolve against, either as a List or as varargs.
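
For readers who want a feel for what a DOM-based expand(Node) has to do, here is a simplified stand-in that uses only the JDK. It handles <SystemProperty/> children and text nodes and nothing else; the class name and the parse(…) helper are hypothetical, and the real XmlPropertyExpander covers far more (local properties, instance properties, external namespaces, logging).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical, reduced stand-in for a DOM-based expander. */
final class ExpandSketch {

  static final String PE_NS =
      "http://namespaces.ziesemer.com/utils.xml/propertyExpansion";

  static String expand(Node parent) {
    StringBuilder sb = new StringBuilder();
    for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
      if (n.getNodeType() == Node.TEXT_NODE) {
        sb.append(n.getNodeValue().trim());
      } else if (n.getNodeType() == Node.ELEMENT_NODE
          && PE_NS.equals(n.getNamespaceURI())
          && "SystemProperty".equals(n.getLocalName())) {
        String value = System.getProperty(((Element) n).getAttribute("name"));
        if (value != null) {
          sb.append(value); // unresolved properties contribute nothing
        }
      }
    }
    return sb.toString();
  }

  /** Namespace awareness is required for getNamespaceURI() and getLocalName(). */
  static Element parse(String xml) {
    try {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      f.setNamespaceAware(true);
      return f.newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    } catch (Exception e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Applied to the first <Directory/> element of the example document, this returns the value of the "java.home" system property followed by "/bin".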

While best-suited for use with DOM, supporting methods are publicly accessible for other use, such as SAX parsing:

public String resolveProperty(String nodeName, String propName);
public Element findLocalProperty(String name);

This assumes that <LocalPropertyDef/> elements will still be available through DOM, which should normally not be an issue as the number of local properties defined should typically be a small fraction of the number of elements that refer to these properties. (Potentially parse the <Properties/> element using DOM, then the rest of a larger document with SAX.) If you would like additional SAX support, please open an enhancement request on the issue tracker at ziesemer.dev.java.net.

While not shown in the above example, <InstanceProperty/> elements are resolved through a Map set on the instance:

public void setInstanceMap(Map<String, String> instanceMap);

By default, if a standard property fails to resolve (resolves to null), nothing is output for the property - as if the property element didn't exist. A warning with the unresolved property type and name will be output through java.util.logging. (While I prefer and support SLF4J, this avoids an additional compile- and run-time dependency. java.util.logging can be forwarded to SLF4J through jul-to-slf4j.jar.) The same applies if elements from an external namespace are found during the expansion. To change this default functionality, including returning a resolved String, sub-class and override either or both of these methods:

protected String unresolvedProperty(String localName, String propName);
protected String externalNamespaceResolveProperties(Element e);
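
The override pattern looks roughly like this. To keep the sketch runnable without the library on the classpath, LenientResolver below is a hypothetical stand-in for XmlPropertyExpander that mirrors only the unresolvedProperty(…) signature shown above. In real code the sub-class would extend XmlPropertyExpander instead.

```java
import java.util.logging.Logger;

/** Hypothetical stand-in for the expander's default, lenient behavior. */
class LenientResolver {

  private static final Logger LOG =
      Logger.getLogger(LenientResolver.class.getName());

  String resolve(String localName, String propName) {
    String value = "SystemProperty".equals(localName)
        ? System.getProperty(propName) : null;
    return value != null ? value : unresolvedProperty(localName, propName);
  }

  /** Default: log a warning and contribute nothing to the result. */
  protected String unresolvedProperty(String localName, String propName) {
    LOG.warning("Unresolved " + localName + ": " + propName);
    return null;
  }
}

/** Sub-class that treats any unresolved property as a configuration error. */
class StrictResolver extends LenientResolver {

  @Override
  protected String unresolvedProperty(String localName, String propName) {
    throw new IllegalStateException("Unresolved " + localName + ": " + propName);
  }
}
```

Returning a non-null String from the override, such as a visible marker, is the third option and fits the same hook.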

Finally, a static method is available for returning a StreamSource for the property expansion XSD schema, which is useful for building Schema instances to validate instance documents:

public static StreamSource getSchemaStreamSource();
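
A StreamSource like this plugs directly into the standard javax.xml.validation API. The sketch below shows the general shape with a tiny inline schema standing in for the real one, so that it runs without the library; the class name, the inline XSD, and the isValid(…) helper are assumptions for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

/** Hypothetical sketch of validating a document against a StreamSource schema. */
final class SchemaValidationSketch {

  // Stand-in schema. In real use, the StreamSource would come from
  // XmlPropertyExpander.getSchemaStreamSource() plus your own schema.
  static final String XSD =
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='Directory' type='xs:string'/>"
      + "</xs:schema>";

  static boolean isValid(String xml) {
    try {
      SchemaFactory factory =
          SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
      Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
      // With no ErrorHandler set, validation errors are thrown as SAXException.
      schema.newValidator().validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException e) {
      return false; // invalid document (or invalid schema)
    } catch (IOException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

When several schemas are needed, as with the directories example that imports the property expansion schema, SchemaFactory.newSchema(Source[]) accepts them together.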

See the distribution's Javadocs or source code for details on any of the above methods. For more examples, including more features and complex use-cases, see the JUnit source.
