Sunday, December 13, 2009

XSD 1.1: a few more assertions and CTA use cases

In my quest to test Xerces-J's XSD 1.1 implementation, I've come up with another example using XSD 1.1 assertions and CTA (conditional type assignment, i.e. type alternatives), which I'd like to share here.

Here's a fictitious use case. The discussion and analysis of the XSD options for solving it follow later in this post, after the listings.

XML document [1]:
  <shapes>
    <polygon kind="square">
      <a>10</a>  
      <b>10</b>
      <c>10</c>
      <d>10</d>
    </polygon>
    <polygon kind="rectangle">
      <a>10</a>  
      <b>8</b>
      <c>10</c>
      <d>8</d>
    </polygon>
    <polygon kind="triangle">
      <a>5</a>  
      <b>10</b>
      <c>15</c>
    </polygon>
  </shapes>

XML document [2]:
  <shapes xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <polygon kind="square" xsi:type="Quadrilateral">
      <a>10</a>  
      <b>10</b>
      <c>10</c>
      <d>10</d>
    </polygon>
    <polygon kind="rectangle" xsi:type="Quadrilateral">
      <a>10</a>  
      <b>8</b>
      <c>10</c>
      <d>8</d>
    </polygon>
    <polygon kind="triangle">
      <a>5</a>  
      <b>10</b>
      <c>15</c>
    </polygon>
  </shapes>

XSD 1.1 Schema [3]:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="shapes">
       <xs:complexType>
         <xs:sequence>
           <xs:element name="polygon" type="Triangular" maxOccurs="unbounded">
             <xs:alternative test="@kind = ('square', 'rectangle')" type="Quadrilateral" />
           </xs:element>
         </xs:sequence>   
       </xs:complexType>    
    </xs:element>

    <xs:complexType name="Triangular">
       <xs:sequence>
         <xs:element name="a" type="xs:positiveInteger" />
         <xs:element name="b" type="xs:positiveInteger" />
         <xs:element name="c" type="xs:positiveInteger" />
       </xs:sequence>
       <xs:attribute name="kind" type="xs:string" use="required" />    
    </xs:complexType>

    <xs:complexType name="Quadrilateral">
       <xs:complexContent>
         <xs:extension base="Triangular">
           <xs:sequence>
             <xs:element name="d" type="xs:positiveInteger" />
           </xs:sequence>
           <xs:assert test="if (@kind = 'square') then (a = b and b = c and c = d) else true()" />
           <xs:assert test="if (@kind = 'rectangle') then (a = c and b = d) else true()" />
         </xs:extension>
       </xs:complexContent>
    </xs:complexType>

  </xs:schema>

XSD 1.1 Schema [4]:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="shapes">
       <xs:complexType>
         <xs:sequence>
           <xs:element name="polygon" type="Polygon" maxOccurs="unbounded" />
         </xs:sequence>   
       </xs:complexType>    
    </xs:element>

    <xs:complexType name="Polygon">
       <xs:sequence>
         <xs:element name="a" type="xs:positiveInteger" />
         <xs:element name="b" type="xs:positiveInteger" />
         <xs:element name="c" type="xs:positiveInteger" />
         <xs:element name="d" type="xs:positiveInteger" minOccurs="0" />
       </xs:sequence>
       <xs:attribute name="kind" type="xs:string" use="required" />
       <xs:assert test="if (@kind = 'triangle') then not(d) else true()" />
       <xs:assert test="if (@kind = 'square') then (a = b and b = c and c = d) else true()" />
       <xs:assert test="if (@kind = 'rectangle') then (a = c and b = d) else true()" />    
    </xs:complexType>

  </xs:schema>

XSD 1.1 Schema [5]:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="shapes">
       <xs:complexType>
         <xs:sequence>
           <xs:element name="polygon" type="Polygon" maxOccurs="unbounded" />
         </xs:sequence>   
       </xs:complexType>
    </xs:element>

    <xs:complexType name="Polygon">
       <xs:sequence>
         <xs:element name="a" type="xs:positiveInteger" />
         <xs:element name="b" type="xs:positiveInteger" />
         <xs:element name="c" type="xs:positiveInteger" />
         <xs:element name="d" type="xs:positiveInteger" minOccurs="0" />
       </xs:sequence>
       <xs:attribute name="kind" use="required">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="square" />
              <xs:enumeration value="rectangle" />
              <xs:enumeration value="triangle" />
            </xs:restriction>
          </xs:simpleType>
       </xs:attribute>
    </xs:complexType>

  </xs:schema>

XSD 1.1 Schema [6]:
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="shapes">
      <xs:complexType>
         <xs:sequence>
           <xs:element name="polygon" type="Polygon" maxOccurs="unbounded" />
         </xs:sequence>   
       </xs:complexType>    
    </xs:element>

    <xs:complexType name="Polygon">
       <xs:sequence>
         <xs:element name="a" type="xs:positiveInteger" />
         <xs:element name="b" type="xs:positiveInteger" />
         <xs:element name="c" type="xs:positiveInteger" />
       </xs:sequence>
       <xs:attribute name="kind" use="required">
         <xs:simpleType>
           <xs:restriction base="xs:string">
             <xs:enumeration value="square" />
             <xs:enumeration value="rectangle" />
             <xs:enumeration value="triangle" />
           </xs:restriction>
         </xs:simpleType>
        </xs:attribute>    
    </xs:complexType>
 
    <xs:complexType name="Quadrilateral">
       <xs:complexContent>
         <xs:extension base="Polygon">
           <xs:sequence>
             <xs:element name="d" type="xs:positiveInteger" />
           </xs:sequence>
         </xs:extension>
       </xs:complexContent>
    </xs:complexType>

  </xs:schema>

The goal of this use case is the following:
To define an XSD content model for the XML document [1].

Solution of the use case, and analysis:
The XSD 1.1 way of solving this would be Schema [3] or Schema [4]. (These are the solutions that come to my mind; there could be others too.)

Schema [3] uses both CTA and assertions, whereas Schema [4] uses only assertions. To solve this particular use case, I would likely prefer Schema [4], because its content model is simpler and smaller: it defines fewer schema types (only the 'Polygon' type), and it meets the remaining validation objectives with assertions inside that type.
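
To make the role of the assertions in Schema [4] concrete, here is a hypothetical test instance (not part of the original example set). Each polygon is structurally fine for the 'Polygon' content model, but is meant to break exactly one of the three assertions, so an XSD 1.1 processor should report it as invalid:

```xml
<!-- Hypothetical instance: every polygon below should FAIL validation against Schema [4] -->
<shapes>
  <!-- breaks the 'square' assertion: d differs from a, b and c -->
  <polygon kind="square">
    <a>10</a>
    <b>10</b>
    <c>10</c>
    <d>9</d>
  </polygon>
  <!-- breaks the 'rectangle' assertion: b and d are not equal -->
  <polygon kind="rectangle">
    <a>10</a>
    <b>8</b>
    <c>10</c>
    <d>7</d>
  </polygon>
  <!-- breaks the 'triangle' assertion: a fourth side d is present -->
  <polygon kind="triangle">
    <a>5</a>
    <b>10</b>
    <c>15</c>
    <d>20</d>
  </polygon>
</shapes>
```

Note that a square or rectangle with the d element left out should also be rejected by Schema [4]: comparing c or b with a missing d yields false, so the corresponding assertion fails.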

That said, Schema [3] is also a useful solution to this problem. In my opinion it shows better XSD type modularity, and it offers better possibilities for reusing the types defined here in other contexts and use cases.
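
As a sketch of what such reuse might look like, the fragment below could be added as another global element inside the xs:schema element of Schema [3]. The element names 'floorplan' and 'room' are invented purely for illustration:

```xml
<!-- Hypothetical addition to Schema [3]; 'floorplan' and 'room' are made-up names -->
<xs:element name="floorplan">
  <xs:complexType>
    <xs:sequence>
      <!-- Reuses the named Quadrilateral type directly, without any type alternative.
           The two assertions defined on Quadrilateral travel with the type. -->
      <xs:element name="room" type="Quadrilateral" maxOccurs="unbounded" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```

A 'room' element would still need the required 'kind' attribute, and the square and rectangle rules would apply to it exactly as they do to 'polygon'.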

But my gut feeling is to go for Schema [4] for this use case :)

Next, I tried to think about how to solve this use case the XSD 1.0 way. Here are the options that come to my mind (with some of my analysis):
1. Write an XSD schema like number [5] above. This is close to the desired solution of the use case described in this post. But this schema doesn't solve the problem completely, because it doesn't strictly enforce the properties of a triangle (has 3 sides), a square (has 4 sides, and all sides are equal) or a rectangle (has 4 sides, and opposite sides are equal). This is where XSD assertions are really needed, if we want to specify XML validation entirely in the XSD layer. I think specifying much of the XML validation in the XSD layer is good from an application design point of view: constraints specified with assertions are entirely declarative, and can be specified and modified by the people responsible for maintaining business rules, without having to write procedural code for these kinds of validations in imperative/OO languages like Java.
2. Modify the XML instance to something like [2], i.e. make use of the XSD 1.0 construct xsi:type (which needs to be specified in the XML instance document), and validate it with a schema like [6]. This solution again doesn't enforce the properties of the different kinds of polygons described in point 1 (and I think with XSD 1.0 we cannot do so). It also makes the XML document XSD-specific, because it contains the xsi:type attribute from the XML Schema instance namespace. That makes it inconvenient to use such an XML document in environments where XSD is not available, or where XSD processing is not needed.
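
To illustrate the gap described in point 1, here is a hypothetical instance (not one of the numbered examples above) that should be accepted by Schema [5], even though neither polygon has the properties its 'kind' attribute claims:

```xml
<!-- Hypothetical instance: expected to be VALID against Schema [5], although geometrically wrong -->
<shapes>
  <!-- a "square" whose sides are not all equal -->
  <polygon kind="square">
    <a>10</a>
    <b>10</b>
    <c>10</c>
    <d>9</d>
  </polygon>
  <!-- a "triangle" with four sides; d is merely optional in Schema [5] -->
  <polygon kind="triangle">
    <a>5</a>
    <b>10</b>
    <c>15</c>
    <d>20</d>
  </polygon>
</shapes>
```

The same kind of gap applies to Schema [6]: as far as I can tell, any four-sided polygon carrying xsi:type="Quadrilateral" would be accepted there, whatever its 'kind' attribute says.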

The solutions presented in this post are some of the possible ways in which the problem described here might be solved. I can imagine that there are a few other possibilities too (from an XSD syntax point of view) for solving such a use case.

That's all I wanted to write for the moment :)

I hope that this post was useful!
