Wednesday, February 24, 2010

XSD 1.1: some more assertions fun

Here are some more XSD 1.1 assertion examples (interesting ones, I think) that I tried running with the Xerces-J XSD 1.1 implementation. They all run fine with Xerces.

Example 1 [1]:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="test" type="X" />
   
    <xs:complexType name="X">
      <xs:group ref="List1" />
      <xs:assert test="a and b and d" />
    </xs:complexType>
   
    <xs:complexType name="Y">
      <xs:group ref="List1" />
      <xs:assert test="a and b and c and d" />
    </xs:complexType>
   
    <xs:group name="List1">
       <xs:sequence>
         <xs:element name="a" type="xs:string" minOccurs="0"/>
         <xs:element name="b" type="xs:string" minOccurs="0"/>
         <xs:element name="c" type="xs:string" minOccurs="0"/>
         <xs:element name="d" type="xs:string" minOccurs="0"/>
       </xs:sequence>
    </xs:group>
           
</xs:schema>

The corresponding XML instance document is:
<test>
    <a>hello</a>
    <b>world</b>
    <!--<c>hello..</c>-->
    <d>world..</d>
</test>

Here's the goal that motivated me to write this XSD sample:
I wanted to define a pair of XSD complex types (like X and Y above) such that one type could reuse the element particles of the other. I first attempted this with XSD type derivation. There, I wanted exactly one element to become optional in the derived type (element "c" in this example, i.e. minOccurs = 0 and maxOccurs = 1), while the other elements kept the mandatory occurrence indicator they have in the base type (minOccurs = maxOccurs = 1).

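Interestingly, this problem cannot be solved with XSD type derivation, by either complex type extension or restriction: a restriction may only narrow an occurrence range, so it cannot turn a mandatory element into an optional one, and an extension can only append particles to the base type's content model.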

For this use case I came up with the XSD sample above [1], which meets my goal of reusing the element particles across XSD types. Schema [1] defines a global group containing a sequence of element declarations, all of which are marked as optional. Within the complex types (X and Y), the cardinality of each element (0-1 or 1-1) is enforced with XSD assertions. Declaring every element in the group as optional lets us reuse the list easily in different XSD types, because each type can then constrain the elements in its own context with assertions, for example by controlling their cardinality or even the contents of elements and attributes.

With schema [1], therefore, if one wants an XSD type in which element "c" is optional, one uses type "X". If one wants an XSD type in which all elements are mandatory, one uses type "Y".
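To make the difference between the two types concrete, here is a small sketch. Schema [1] only binds an element to type "X", so the element name "test2" below is a hypothetical addition of mine, not part of the original schema. The fragments are illustrative pieces (one declaration plus two separate instance documents), not a single well-formed file.

```xml
<!-- Hypothetical addition to schema [1]: a second global element bound to type Y -->
<xs:element name="test2" type="Y" />

<!-- Instance 1: expected to be INVALID, because "c" is absent and the
     assertion "a and b and c and d" on type Y evaluates to false -->
<test2>
    <a>hello</a>
    <b>world</b>
    <d>world..</d>
</test2>

<!-- Instance 2: expected to be valid, since all four elements are present -->
<test2>
    <a>hello</a>
    <b>world</b>
    <c>hello..</c>
    <d>world..</d>
</test2>
```

The same content that is rejected as "test2" is accepted as "test", since type "X" asserts only "a and b and d".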

Having solved the use case I had in mind (explained above), I wrote a second schema with some more assertions, just for fun:

Example 2 [2]:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

     <xs:element name="test" type="X" />
   
     <xs:complexType name="X">
       <xs:group ref="List1" />
       <xs:assert test="a and b and d" />
     </xs:complexType>
   
     <xs:complexType name="Y">
       <xs:group ref="List1" />
       <xs:assert test="a and b and c and d" />
     </xs:complexType>
   
     <xs:group name="List1">
        <xs:sequence>
           <xs:element name="a" minOccurs="0">
             <xs:complexType>
               <xs:sequence>
                 <xs:element name="a1" type="xs:string" maxOccurs="unbounded" />
               </xs:sequence>
               <xs:attribute name="aCount" type="xs:nonNegativeInteger" />
               <xs:assert test="count(a1) eq @aCount" />
             </xs:complexType>
           </xs:element>
           <xs:element name="b" type="xs:string" minOccurs="0"/>
           <xs:element name="c" type="xs:string" minOccurs="0"/>
           <xs:element name="d" type="xs:string" minOccurs="0"/>
        </xs:sequence>
     </xs:group>
           
</xs:schema>

Schema [2] is conceptually similar to schema [1]. The only difference is that in schema [2] element "a" has a complex type (a sequence of "a1" children plus an "aCount" attribute), whereas in schema [1] it has the simple type xs:string. The complex type of element "a" carries a further assertion, which requires the value of attribute "aCount" to equal the number of "a1" children of "a". I added this assertion mainly to make the definition of element "a" visibly more complex. Of course, it also makes "a", and with it the contents of the global group in the 2nd schema, functionally more complex.
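The post gives no instance document for schema [2], so here is a minimal sketch of one that should satisfy both assertions. The text values and the count of 2 are my own illustrative choices.

```xml
<test>
    <!-- aCount must equal the number of a1 children: count(a1) eq @aCount -->
    <a aCount="2">
        <a1>hello</a1>
        <a1>hi</a1>
    </a>
    <b>world</b>
    <!-- "c" may be omitted, because type X asserts only "a and b and d" -->
    <d>world..</d>
</test>
```

Changing aCount to "3" while keeping two "a1" children should make the inner assertion fail. As far as I can tell, omitting aCount altogether fails too: the comparison then yields an empty sequence, whose effective boolean value is false, so the assertion in effect makes the attribute required even though it is declared optional.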

The 2nd schema illustrates that the more functionally complex a list of particles is (a, b, c and d here), the more it benefits from the schema component re-use technique shown in this post (an XSD group combined with assertions).

I hope that this post is useful.
