Saturday, April 17, 2010

XSD 1.1: xs:precisionDecimal, assertions and Xerces-J updates

Section 1
Recently I studied the XSD primitive datatype xs:precisionDecimal (newly introduced in XSD 1.1) in some detail, and tried to simulate it with XSD 1.1 assertions, partly to satisfy my curiosity and partly to explore assertions further. The simulation is a user-defined XSD simple type, derived by restriction from xs:decimal. I still believe a native implementation of xs:precisionDecimal should exist in an XSD 1.1 implementation, or in language systems that use the XSD type system (for example, a stand-alone XPath 2.x implementation).

Here's an XSD 1.1 schema example illustrating the idea:
[1]
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <xs:element name="example" type="myPrecisionDecimal" />

      <!-- xs:decimal restricted to approximate xs:precisionDecimal -->
      <xs:simpleType name="myPrecisionDecimal">
        <xs:restriction base="xs:decimal" xmlns:xerces="http://xerces.apache.org">
           <xs:totalDigits value="6" />
           <xs:fractionDigits value="4" />
           <!-- simulated minScale: at least 2 digits after the decimal point -->
           <xs:assertion test="string-length(substring-after(string($value), '.')) ge 2"
                  xerces:message="minScale of this decimal number should be 2" />
        </xs:restriction>
      </xs:simpleType>

   </xs:schema>

The XSD type "myPrecisionDecimal" defined above corresponds to the type xs:precisionDecimal as follows:
a) The xs:totalDigits facet in "myPrecisionDecimal" is equivalent to the totalDigits facet of xs:precisionDecimal.
b) The xs:fractionDigits facet in "myPrecisionDecimal" is equivalent to the "maxScale" facet of xs:precisionDecimal.
c) The assertion facet in "myPrecisionDecimal" is equivalent to the "minScale" facet of xs:precisionDecimal (a user-defined attempt at an equivalent!).

When the above schema document [1] is used to validate the following XML instance:
<example>44.4</example>
The following error message is produced:
[Error] test.xml:1:24: cvc-assertion.failure: Assertion failure. minScale of this decimal number should be 2.
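
The arithmetic behind the three checks can also be sketched outside the schema. The following Java fragment is only an illustration, not how Xerces evaluates the facets: the class name MinScaleSketch and its check method are hypothetical, and java.math.BigDecimal stands in for xs:decimal. It works on the number as written in the instance; whether string($value) in the assertion keeps trailing zeros (as in 44.40) depends on how the XPath engine serializes the decimal, so that case is an assumption here.

```java
import java.math.BigDecimal;

class MinScaleSketch {
    static final int TOTAL_DIGITS = 6;    // xs:totalDigits in schema [1]
    static final int FRACTION_DIGITS = 4; // xs:fractionDigits, playing the role of maxScale
    static final int MIN_SCALE = 2;       // enforced by the assertion in schema [1]

    /** Returns "valid" or the name of the first constraint that fails. */
    static String check(String lexical) {
        // Note: BigDecimal also accepts exponent notation, which xs:decimal does not.
        BigDecimal written = new BigDecimal(lexical.trim());

        // totalDigits and fractionDigits are checked on the value,
        // so trailing zeros are dropped first.
        BigDecimal value = written.stripTrailingZeros();
        if (value.scale() < 0) {
            value = value.setScale(0); // e.g. 1.2E+3 becomes 1200
        }
        if (value.precision() > TOTAL_DIGITS) return "totalDigits";
        if (value.scale() > FRACTION_DIGITS) return "fractionDigits";

        // minScale: count the fraction digits as they were written.
        if (written.scale() < MIN_SCALE) return "minScale";
        return "valid";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"44.4", "44.40", "4.12345", "1234567"}) {
            System.out.println(s + " -> " + check(s));
        }
    }
}
```

In this sketch 44.4 fails only the minScale check, which matches the error message reported above.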

It's also worth noting that the user-defined type "myPrecisionDecimal" cannot be considered a true equivalent of the XSD type xs:precisionDecimal as defined in the XSD 1.1 spec. xs:precisionDecimal also includes values for positive and negative infinity and for "not a number", and it differentiates between "positive zero" and "negative zero"; none of these are defined for xs:decimal. The "myPrecisionDecimal" example only demonstrates simulating the "minScale" facet of xs:precisionDecimal, which is not available on xs:decimal.
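
As a loose analogy only, the same gap can be seen in Java: java.math.BigDecimal behaves like xs:decimal in having no NaN, no infinities and a single zero, while IEEE floating point has all of these. The class name SpecialValuesSketch and its parses helper are hypothetical, and Java's binary double is used merely to show the special values, not as a model of xs:precisionDecimal itself.

```java
import java.math.BigDecimal;

class SpecialValuesSketch {
    /** True if BigDecimal can parse the given text. */
    static boolean parses(String text) {
        try {
            new BigDecimal(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain decimal type has no "not a number" and no infinities.
        System.out.println("NaN parses as BigDecimal: " + parses("NaN"));
        System.out.println("INF parses as BigDecimal: " + parses("INF"));

        // It also has a single zero: the sign of "-0" is lost.
        System.out.println("-0 equals 0 as BigDecimal: "
                + new BigDecimal("-0").equals(BigDecimal.ZERO));

        // IEEE floating point keeps the two zeros apart and has infinities.
        System.out.println("double tells 0.0 from -0.0: "
                + (Double.compare(0.0, -0.0) != 0));
        System.out.println("1.0 / -0.0 is negative infinity: "
                + (1.0 / -0.0 == Double.NEGATIVE_INFINITY));
    }
}
```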

Section 2
(Xerces-J, assertions implementation update)

Xerces-J recently implemented an extension attribute, "message" (in the namespace http://xerces.apache.org, specific to the Xerces-J XSD 1.1 implementation), on XSD 1.1 assertion instructions. The value of this attribute is the error message that the XSD 1.1 engine reports when the assertion fails.

The schema document above [1] shows an example of this.

In the absence of the "message" attribute on assertions (or if it's present, but it doesn't contain any significant non-whitespace characters), the following default error message is produced by Xerces:
[Error] test.xml:1:24: cvc-assertion.3.13.4.1: Assertion evaluation ('string-length(substring-after(string($value), '.')) ge 2') for element 'example' with type 'myPrecisionDecimal' did not succeed.
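
To see either message from Java code, something along the following lines may work. It is a sketch under assumptions: it needs a Xerces-J build with XSD 1.1 support on the classpath, the schema language URI shown is the one Xerces-J uses for XSD 1.1 through JAXP, and the class name AssertionMessageSketch and its format helper are hypothetical. The helper only imitates the "[Error] file:line:column: message" layout seen in the messages above.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class AssertionMessageSketch {
    // Assumption: schema language URI for XSD 1.1 in Xerces-J.
    static final String XSD11 = "http://www.w3.org/XML/XMLSchema/v1.1";

    /** Formats a problem as "[Level] file:line:column: message". */
    static String format(String level, SAXParseException e) {
        String id = e.getSystemId();
        String file = (id == null) ? "" : id.substring(id.lastIndexOf('/') + 1);
        return "[" + level + "] " + file + ":" + e.getLineNumber() + ":"
                + e.getColumnNumber() + ": " + e.getMessage();
    }

    /** Validates xml against xsd and prints every reported problem. */
    static void validate(File xsd, File xml) throws Exception {
        // Throws IllegalArgumentException if no XSD 1.1 capable factory is found.
        SchemaFactory factory = SchemaFactory.newInstance(XSD11);
        Schema schema = factory.newSchema(xsd);
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) {
                System.out.println(format("Warning", e));
            }
            public void error(SAXParseException e) {
                // Assertion failures are reported here; keep going to collect them all.
                System.out.println(format("Error", e));
            }
            public void fatalError(SAXParseException e) throws SAXParseException {
                System.out.println(format("Fatal Error", e));
                throw e;
            }
        });
        validator.validate(new StreamSource(xml));
    }

    public static void main(String[] args) throws Exception {
        // Usage: java AssertionMessageSketch schema.xsd test.xml
        validate(new File(args[0]), new File(args[1]));
    }
}
```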


In my opinion, the "message" attribute on assertions has the following benefits:
a) For complex (and particularly lengthy) XPath expressions in assertions, the default error messages produced by Xerces can be quite verbose, and users may find them inconvenient to read and debug. The experience gets worse when an XML document triggers many assertion evaluations: imagine assertions applying to elements declared with maxOccurs="unbounded", or a schema with more than ten different assertions.
b) We can specify domain-specific error messages with the "message" attribute.

The default assertion error messages produced by Xerces do have one advantage, though: they tell the user the name of the XSD type and the element or attribute involved in a particular assertion failure.

PS: An issue was recently raised with the XSD WG proposing the addition of a "message" attribute on assertions in the XSD 1.1 language itself. The Xerces implementation of the "message" attribute may change in future, depending on the XSD WG's recommendation on this.

I hope that this post is useful.

3 comments:

Matthew said...

I can't determine exactly what release of Xerces I need to have to support XSD 1.1 assertions. Is it in the latest release?

Please send me an e-mail and let me know - mmellon@ecrsoft.com.

On a side note, IBM makes a different recommendation for error message customization here:

http://www.ibm.com/developerworks/library/x-xml11pt2/

(scroll to "Error message customization for assertions")

It's a real shame that XSD 1.1 doesn't have a standard method of error message customization. I almost can't believe it was overlooked.

Mukul Gandhi said...

The next Xerces-J release (2.10.0) will support XSD 1.1, including assertions. This version has not yet been officially released at Apache, but we're hoping to see it released sometime soon :) If you're feeling adventurous, you could try building the Xerces JARs from SVN (at https://svn.apache.org/repos/asf/xerces/java/branches/xml-schema-1.1-dev/).

Regarding "Error message customization" for assertions, the current support in Xerces requires us to use a "message" attribute (in the namespace http://xerces.apache.org) on the "assert/assertion" element. I agree that the idea described on the IBM developerWorks site, to use "xs:annotation" for assertion error messages, is a good one.

There's also an issue pending with the XSD WG on this very topic. We'll need to wait for the XSD WG's final recommendation regarding assertion "error message" handling, and the Xerces implementation might change accordingly.

Javin Paul said...

Nice article, you have indeed covered the topic in great detail. I have also blogged about my experience with comparator and comparable in java, with an example. Let me know how you find it.