Sunday, February 5, 2012

"castable as" vs "instance of" XPath 2.0 expressions for XSD 1.1 assertions

This post continues the thoughts from my previous blog post (ref, http://mukulgandhi.blogspot.in/2012/01/using-xsd-11-assertions-on-complextype.html). That post used the XPath 2.0 "castable as" expression to check the 'untyped' data in a complexType's mixed content (essentially, testing whether a string/untyped value in an XML instance document is a lexical representation of an xs:integer value).

This post discusses XPath 2.0 "instance of" vs "castable as" expressions in the context of XSD 1.1 assertions -- in particular, when each of these expressions is needed.

The "castable as" use case was covered in my earlier blog post, so here I focus on the "instance of" expression when used with XSD 1.1 assertions.

Let's assume that there is an XML instance document like the following (XML1):

<X>
   <elem>
     <a>20</a>
     <b>30</b>
   </elem>
   <elem>
     <a>10</a>
     <b>2005-10-07</b>
   </elem>
</X>

The XSD schema should express the following constraints with respect to the above XML instance document (XML1):
1. The elements "a" and "b" can each be typed as an xs:integer or an xs:date (therefore we'll express this with an XSD simpleType with variety 'union').
2. If both the elements "a" and "b" are of type xs:integer (this is allowed by the simpleType definition described in point 1 above), then the numeric value of element "a" should be less than the numeric value of element "b".
3. If one of the elements "a" or "b" is an xs:integer and the other one is an xs:date, then we would like to express the following constraints:
   - the numeric XML instance value of the xs:integer typed element should be less than 100
   - the xs:date XML instance value should be less than the current date

The following XSD (1.1) schema document describes all of the above validation constraints for a sample XML instance document (XML1) provided above:

[XS1]

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
     <xs:element name="X">
        <xs:complexType>
           <xs:sequence>
              <xs:element name="elem" maxOccurs="unbounded">
                 <xs:complexType>
                    <xs:sequence>
                       <xs:element name="a" type="union_of_date_and_integer"/>
                       <xs:element name="b" type="union_of_date_and_integer"/>
                    </xs:sequence>
                    <xs:assert test="if ((data(a) instance of xs:integer) and (data(b) instance of xs:integer))
                                              then (data(a) lt data(b))
                                           else if ((data(a) instance of xs:integer) or (data(b) instance of xs:integer))
                                              then (*[data(.) instance of xs:integer]/data(.) lt 100
                                                         and
                                                      *[data(.) instance of xs:date]/data(.) lt current-date())
                                              else true()"/>
                 </xs:complexType>
              </xs:element>
           </xs:sequence>
        </xs:complexType>
     </xs:element>
   
     <xs:simpleType name="union_of_date_and_integer">
        <xs:union memberTypes="xs:date xs:integer"/>
     </xs:simpleType>
   
</xs:schema>

I think it may be interesting for readers to know why I wrote the assertion the way I did. Here are a few thoughts:
1. Since the XML elements "a" and "b" are typed with a simpleType of variety 'union', an assertion that wants to access the atomic values validated by that simpleType needs to use the XPath 2.0 "data" function on the relevant XDM nodes (elements "a" and "b" in this case). To then determine whether such an atomic value is typed as xs:integer, we need the "instance of" expression -- "castable as" is not needed in this case, since the instance document's data is already typed.
2. The rest of the assertion implements what is mentioned in the requirements above.
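
To make the requirements concrete, here is a hypothetical instance document (the values are my own illustration, not part of XML1) in which each "elem" should be rejected by the assertion in XS1. I have not run it through a particular validator, so treat it as a sketch of the expected behaviour:

```xml
<X>
   <!-- both values are xs:integer, but 30 is not less than 20 -->
   <elem>
     <a>30</a>
     <b>20</b>
   </elem>
   <!-- one xs:integer and one xs:date, but the integer is not less than 100 -->
   <elem>
     <a>150</a>
     <b>2005-10-07</b>
   </elem>
   <!-- one xs:date and one xs:integer, but the date is not earlier than the current date -->
   <elem>
     <a>2999-01-01</a>
     <b>10</b>
   </elem>
</X>
```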

If you would like the assertion above to be easier to read or better structured, feel free to break its rules into two or more assertions.
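
As a rough sketch of that idea, the schema below splits the rules of XS1 into two xs:assert elements, one per requirement. The way the rules are divided (and the explicit type checks in the second assertion) is my own assumption about a reasonable split, not the only possible one:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="elem" maxOccurs="unbounded">
               <xs:complexType>
                  <xs:sequence>
                     <xs:element name="a" type="union_of_date_and_integer"/>
                     <xs:element name="b" type="union_of_date_and_integer"/>
                  </xs:sequence>
                  <!-- requirement 2: when both values are integers, "a" must be less than "b" -->
                  <xs:assert test="if ((data(a) instance of xs:integer) and (data(b) instance of xs:integer))
                                      then (data(a) lt data(b))
                                      else true()"/>
                  <!-- requirement 3: one integer and one date, in either order -->
                  <xs:assert test="if ((data(a) instance of xs:integer) and (data(b) instance of xs:date))
                                      then (data(a) lt 100 and data(b) lt current-date())
                                   else if ((data(a) instance of xs:date) and (data(b) instance of xs:integer))
                                      then (data(b) lt 100 and data(a) lt current-date())
                                      else true()"/>
               </xs:complexType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <xs:simpleType name="union_of_date_and_integer">
      <xs:union memberTypes="xs:date xs:integer"/>
   </xs:simpleType>

</xs:schema>
```

All assertions on a complexType must hold for the element to be valid, so the two assertions together should express the same rules as a single combined one, while a validator can report a failure against the specific rule that was violated.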

I would also like to show another XSD 1.1 assertions example, one which uses neither the XPath 2.0 "castable as" nor the "instance of" expression. It demonstrates that if the XDM nodes seen by an assertion are already typed, the "castable as" expression is usually unnecessary (since "castable as" is essentially useful for checking the type of string/untyped values), and an "instance of" expression is needed only in some cases (such as the 'union' typed elements above).

Following is a slightly modified variant of the XML instance document specified above (XML1):

[XML2]

<X>
   <elem>
     <a>2</a>
     <b>2012-02-04</b>
   </elem>
   <elem>
     <a>10</a>
     <b>2005-10-07</b>
   </elem>
</X>

The XSD schema should express the following constraints with respect to the above XML instance document (XML2):
1. The element "a" is typed as an xs:nonNegativeInteger value, and element "b" is typed as xs:date.
2. Adding the number of days given by the numeric value of element "a" to the xs:date value of element "b" should result in an xs:date value which is less than the current date.

The following XSD (1.1) schema document describes all of the above validation constraints for a sample XML instance document (XML2) provided above:

[XS2]

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
     <xs:element name="X">
        <xs:complexType>
           <xs:sequence>
              <xs:element name="elem" maxOccurs="unbounded">
                 <xs:complexType>
                    <xs:sequence>
                       <xs:element name="a" type="xs:nonNegativeInteger"/>
                       <xs:element name="b" type="xs:date"/>
                    </xs:sequence>
                    <xs:assert test="(b + xs:dayTimeDuration(concat('P', a, 'D'))) lt current-date()"/>
                 </xs:complexType>
              </xs:element>
           </xs:sequence>
        </xs:complexType>
     </xs:element>
   
</xs:schema>
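
The assertion in XS2 builds the duration by string concatenation (for a = 2 the string is 'P2D'). As an alternative sketch, the duration can also be computed arithmetically, since XPath 2.0 defines multiplication of an xs:dayTimeDuration by a number. The variant below is my own illustration and I have not checked it against a particular validator; I'd expect it to behave the same as XS2:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="elem" maxOccurs="unbounded">
               <xs:complexType>
                  <xs:sequence>
                     <xs:element name="a" type="xs:nonNegativeInteger"/>
                     <xs:element name="b" type="xs:date"/>
                  </xs:sequence>
                  <!-- "a" days, computed as a multiple of a one-day duration, are added to the date "b" -->
                  <xs:assert test="(b + (a * xs:dayTimeDuration('P1D'))) lt current-date()"/>
               </xs:complexType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>

</xs:schema>
```

In both variants the elements "a" and "b" are used directly, without "data", "castable as" or "instance of", because their typed values (xs:nonNegativeInteger and xs:date) are already known to the assertion.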

That's all I had to say today.

I hope this post was useful.
