Sunday, September 20, 2009

Xerces-J: XML Schema 1.1 assertions enhancements

The XML Schema 1.1 language defines an assertions facility (xs:assert and xs:assertion), which constrains XML Schema simple and complex types.

Apache Xerces-J implements XML Schema 1.1 assertions. As described in the XSD 1.1 spec, an assertion has the following XML representation:
  <assert
    id = ID
    test = an XPath expression
    xpathDefaultNamespace = (anyURI | (##defaultNamespace | ##targetNamespace | ##local))
    {any attributes with non-schema namespace . . .}>
    Content: (annotation?)
  </assert>

For XML Schema simple type facets, the assertion element is named xs:assertion (as opposed to the xs:assert instruction for complex types); the rest of the assertion's content is the same.

Some background on XML Schema 1.1 assertions and their implementation in Xerces-J can be found in a blog post which I wrote some time ago.

During the last week, we enhanced Xerces-J assertions to support the 'xpathDefaultNamespace' attribute. I contributed this patch to Xerces-J, and it's now available in the Xerces-J SVN repository.

The XML Schema 1.1 specification describes how the 'xpathDefaultNamespace' attribute on assertions works (please see the section "XML Mapping Summary for XPath Expression Property Record {default namespace}").
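As a rough illustration of that mapping, here is a minimal Java sketch of how the attribute value could be turned into a default namespace for the XPath expression. It follows my reading of the spec section mentioned above: the attribute on the assertion wins, otherwise the one on xs:schema is used, and if neither is present the value behaves like ##local. The class and method names are hypothetical and are not Xerces-J API; null stands for "no namespace".

```java
class XPathDefaultNamespaceSketch {

    /**
     * Hypothetical helper, not part of Xerces-J.
     *
     * @param onAssertion      xpathDefaultNamespace on xs:assert / xs:assertion, or null if absent
     * @param onSchema         xpathDefaultNamespace on xs:schema, or null if absent
     * @param inScopeDefaultNs namespace bound by xmlns="..." in scope at the assertion, or null
     * @param targetNamespace  targetNamespace of the schema document, or null
     * @return the default element namespace for the XPath expression, or null for no namespace
     */
    static String resolve(String onAssertion, String onSchema,
                          String inScopeDefaultNs, String targetNamespace) {
        // The assertion's own attribute takes precedence over the schema-level one.
        String value = onAssertion != null ? onAssertion
                     : onSchema != null ? onSchema
                     : "##local";
        switch (value) {
            case "##defaultNamespace":
                return inScopeDefaultNs;   // default namespace declared in the schema document
            case "##targetNamespace":
                return targetNamespace;    // target namespace of the schema document
            case "##local":
                return null;               // unprefixed names are in no namespace
            default:
                return value;              // an explicit anyURI value
        }
    }

    public static void main(String[] args) {
        // Prints: http://xyz
        System.out.println(resolve("##targetNamespace", null, null, "http://xyz"));
        // Prints: null (no attribute anywhere, so names stay in no namespace)
        System.out.println(resolve(null, null, null, "http://xyz"));
    }
}
```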

Here's a simple example of how 'xpathDefaultNamespace' functions in XML Schema 1.1 assertions:

XML document [1]:
  <X xmlns="http://xyz">
    <message>hello</message>
  </X>

XSD 1.1 document [2]:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://xyz"
             elementFormDefault="qualified">

     <xs:element name="X">
       <xs:complexType>
          <xs:sequence>
             <xs:element name="message" type="xs:string" />
          </xs:sequence>
          <xs:assert test="message = 'hello'" xpathDefaultNamespace="##targetNamespace" />
       </xs:complexType>
     </xs:element>

  </xs:schema>

In the XML document [1] above, the element "message" belongs to the namespace "http://xyz" (by virtue of the default namespace declaration xmlns="http://xyz" on element "X"). The XPath (2.0) expression message = 'hello' on the xs:assert instruction therefore returns the boolean value "true" only if the element reference "message" in the XPath expression also resolves to the namespace "http://xyz". This namespace information is provided to the XPath engine via the 'xpathDefaultNamespace' attribute on the xs:assert instruction.

If, for the XML instance document [1], the 'xpathDefaultNamespace' attribute is not provided on the xs:assert instruction (and none is inherited from the xs:schema element), then the XPath expression message = 'hello' returns false, because the XPath engine treats the element name "message" as being in no namespace. The element instance is then invalid according to such an XSD 1.1 schema.
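The same namespace matching can be observed outside of schema validation. The sketch below uses only the JDK's JAXP XPath API and does not involve Xerces-J assertions or PsychoPath. JAXP implements XPath 1.0, which has no default element namespace, so a prefix "p" (my own choice, bound to "http://xyz") stands in for what 'xpathDefaultNamespace' achieves for unprefixed names in XPath 2.0. The class and helper names are hypothetical.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class AssertNamespaceSketch {

    // XML document [1] from the post.
    static final String XML = "<X xmlns=\"http://xyz\"><message>hello</message></X>";

    // Evaluates expr as a boolean, with the root element as the context node
    // (similar to how an assertion on the type of "X" sees the element).
    static boolean evaluate(String xml, String expr, NamespaceContext ns) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Element root = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            XPath xpath = XPathFactory.newInstance().newXPath();
            if (ns != null) {
                xpath.setNamespaceContext(ns);
            }
            return (Boolean) xpath.evaluate(expr, root, XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Binds a single prefix to a single namespace URI.
    static NamespaceContext bind(final String prefix, final String uri) {
        return new NamespaceContext() {
            @Override public String getNamespaceURI(String p) {
                return prefix.equals(p) ? uri : XMLConstants.NULL_NS_URI;
            }
            @Override public String getPrefix(String namespaceURI) {
                return uri.equals(namespaceURI) ? prefix : null;
            }
            @Override public Iterator<String> getPrefixes(String namespaceURI) {
                return uri.equals(namespaceURI)
                        ? Collections.singletonList(prefix).iterator()
                        : Collections.<String>emptyList().iterator();
            }
        };
    }

    public static void main(String[] args) {
        // "message" is looked up in no namespace, nothing matches: prints false
        System.out.println(evaluate(XML, "message = 'hello'", null));
        // "p:message" is looked up in http://xyz: prints true
        System.out.println(evaluate(XML, "p:message = 'hello'", bind("p", "http://xyz")));
    }
}
```

The first call corresponds to an assertion without 'xpathDefaultNamespace', and the second to one where the name "message" is resolved in the target namespace.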

Supporting the 'xpathDefaultNamespace' attribute on XML Schema 1.1 assertions further increases their usefulness, because the XPath expressions on assertions can now be XML namespace aware.

Implementing the 'xpathDefaultNamespace' attribute on assertions required enhancing the PsychoPath XPath 2.0 engine as well. The updated PsychoPath library JAR has also been copied to the Xerces-J SVN repository.
