Sunday, September 5, 2010

XSD 1.1: Xerces-J implementation updates

Over the past month or two, there have been a few interesting changes to the Xerces-J XML Schema 1.1 implementation. I'd like to share these enhancements with the XML Schema community, and also with folks at Eclipse WTP, where we enhanced a few "schema aware" components of the PsychoPath XPath 2.0 engine to support these recent Xerces changes. In particular, I think we improved the design of typed values of XML element and attribute XDM nodes in PsychoPath, for the case where the XDM node has a type annotation that is an XML Schema simpleType with variety list or union.

Here's a summary of the XML Schema 1.1 implementation changes recently completed in Xerces (available in the Xerces SVN repository as of now). They are planned to be part of the Xerces-J 2.11.0 release, which is expected to take place around November 2010.

1. Xerces-J now has a complete implementation of XML Schema 1.1 conditional inclusion. The Xerces-J 2.10.0 release implemented the conditional inclusion attributes vc:minVersion and vc:maxVersion. Xerces-J now supports all of the "conditional inclusion" attributes specified by the XML Schema 1.1 spec; the newly supported ones are vc:typeAvailable, vc:typeUnavailable, vc:facetAvailable and vc:facetUnavailable. All XML Schema 1.1 built-in types and facets are recognized by Xerces-J when it evaluates these attributes.
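
To make the newly supported attributes concrete, here is a minimal sketch of a schema that uses vc:typeAvailable and vc:typeUnavailable. The element name "Elapsed" and the choice of xs:dayTimeDuration are illustrative assumptions, not taken from the Xerces test suite. The vc prefix is bound to the versioning namespace defined by the XML Schema 1.1 spec.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

   <!-- Retained only if the processor knows the type xs:dayTimeDuration -->
   <xs:element name="Elapsed" type="xs:dayTimeDuration"
               vc:typeAvailable="xs:dayTimeDuration" />

   <!-- Hypothetical fallback: retained only if xs:dayTimeDuration is NOT known -->
   <xs:element name="Elapsed" type="xs:duration"
               vc:typeUnavailable="xs:dayTimeDuration" />

</xs:schema>
```

An XSD 1.1 processor that knows xs:dayTimeDuration should keep the first declaration and ignore the second during conditional inclusion pre-processing, so only one "Elapsed" declaration remains. vc:facetAvailable and vc:facetUnavailable work the same way, but name facets (for example xs:assertion) instead of types.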

2. There have also been a few interesting changes to the Xerces-J XML Schema 1.1 assertions implementation, which are likewise planned for the Xerces-J 2.11.0 release. Xerces now has improved assertion evaluation on XML Schema 1.1 simple types with varieties 'list' and 'union'.

2.1 Enhancements to assertions evaluation on simpleType -> list:

Here's an example of XML Schema 1.1 assertions on an xs:list schema component:
[XML Schema 1]
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

      <xs:element name="Example" type="EXAMPLE_LIST" />
   
      <xs:simpleType name="EXAMPLE_LIST">
         <xs:list>
            <xs:simpleType>
               <xs:restriction base="xs:integer">
                  <xs:assertion test="$value mod 2 = 0" />
               </xs:restriction>
            </xs:simpleType>
         </xs:list>
      </xs:simpleType>
   
   </xs:schema> 

If an XML instance document looks like the following:
[XML 1]
<Example>1 2 3</Example>

And if this XML instance document ([XML 1]) is validated against the above XML schema ([XML Schema 1]), Xerces-J reports error messages like the following (assuming the XML document is named test.xml):
[Error] test.xml:1:25: cvc-assertion.3.13.4.1: Assertion evaluation ('$value mod 2 = 0') for element 'Example' with type '#anonymous' did not succeed. Assertion failed for an xs:list member value '1'.
[Error] test.xml:1:25: cvc-assertion.3.13.4.1: Assertion evaluation ('$value mod 2 = 0') for element 'Example' with type '#anonymous' did not succeed. Assertion failed for an xs:list member value '3'.


An assertion defined on the itemType of an xs:list must be evaluated on every list item in the XML instance document. Xerces now does this, and reports an error message for each item whose assertion fails.
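
For contrast, here is an instance in which every list item is an even integer. This is an illustrative document written for this post, not output from a Xerces run. Since the assertion '$value mod 2 = 0' holds for each of the items 2, 4 and 6, validating it against [XML Schema 1] should produce no assertion errors.

```xml
<Example>2 4 6</Example>
```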

2.2 Enhancements to assertions evaluation on simpleType -> union:

Here's an example of XML Schema 1.1 assertions on an xs:union schema component:
[XML Schema 2]
   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
      <xs:element name="Example">
         <xs:simpleType>
            <xs:union memberTypes="MYDATE xs:integer" />
         </xs:simpleType>
      </xs:element>
   
      <xs:simpleType name="MYDATE">
         <xs:restriction base="xs:date">
            <xs:assertion test="$value lt current-date()" />
         </xs:restriction>
      </xs:simpleType>

   </xs:schema>

If an XML instance document looks like the following:
[XML 2]
<Example>2010-12-05</Example>

And if this instance document is validated against [XML Schema 2], Xerces displays the following error message (here the document was named temp.xml):
[Error] temp.xml:1:30: cvc-assertion.union.3.13.4.1: Element 'Example' with value '2010-12-05' is not locally valid. One or more of the assertion facets on an element's schema type, with variety union, have failed.

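Xerces tried to validate the atomic value '2010-12-05' against both member types of the union, MYDATE and xs:integer. The value is not a valid xs:integer, and although it is a valid xs:date, it fails the assertion on MYDATE because (at the time of writing) 2010-12-05 is not earlier than the current date. Since neither member type could validate the value, and an assertion failed along the way, Xerces reported the relevant assertion failure.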

If [XML Schema 2] is used to validate the following XML instance document:
<Example>10</Example>

no validation failures are reported, since the atomic value '10' conforms to the schema type xs:integer, which makes the value valid overall against the 'union' schema type.
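
The union can also be satisfied through its other member type. As an illustrative instance (an assumption for this post, not taken from a Xerces run), a date that already lies in the past is a valid xs:date and satisfies the assertion '$value lt current-date()' on MYDATE, so it should validate against [XML Schema 2] without errors:

```xml
<Example>2009-12-05</Example>
```

Here the value is accepted by MYDATE, so it does not matter that it is not a valid xs:integer. A value only has to be valid against one member type of the union.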

I'm ending this blog post now. Stay tuned for more news here :)

I hope this post was useful.