Saturday, October 23, 2010

XSD: schema type definition for empty XML content models

This short post suggests a correction (perhaps a better schema design) to an XML schema document from my earlier blog post, http://mukulgandhi.blogspot.com/2010/07/xsd-11-xml-schema-design-approaches.html [1].

In that post [1], I suggested the following XML schema type definition for empty content models (assuming the element carries no attributes):
  <xs:complexType name="EMPTY"> 
     <xs:complexContent> 
        <xs:restriction base="xs:anyType" /> 
     </xs:complexContent> 
  </xs:complexType>

Instead of the above type definition, I now find the following, simpler definition [2] (written here as an element declaration) to be better:
  <xs:element name="X">
    <xs:complexType/>
  </xs:element>

The element declaration [2] above is intended to validate an XML fragment like the following:
<X/>

With the above declaration, I intend to say that element "X" must have no child elements or text, and no attributes. Interestingly (nothing new really for people who know the XML Schema language :), XML Schema only constrains XML element and attribute nodes (optionally namespace-aware) and their text content; it doesn't bother about other XML infoset components like comments and processing instructions (which are present in the XPath data model, for example) [A]. This means that comments and processing instructions are ignored by the XML Schema language and by a compliant XML schema validator, so an "X" element containing only a comment would still be valid. This nature [A] of the XML Schema language is OK, as I've learnt (there have been some nice discussions about all of this on the XML-DEV list in the recent past).
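The behaviour described above can be tried out with the JAXP validation API that ships with the JDK. This is only a minimal sketch: the class name `EmptyContentCheck` and the helper names are made up for illustration, and the element declaration [2] is wrapped in a complete schema document so that it can be compiled.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class EmptyContentCheck {

    // Element declaration [2] from the post, wrapped in a schema document.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='X'><xs:complexType/></xs:element>"
      + "</xs:schema>";

    // Compiles the schema; a schema that does not compile is a programming error here.
    static Schema compile(String xsd) {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
    }

    // Returns true if the instance validates, false on any validation
    // (or well-formedness) error reported by the validator.
    static boolean isValid(String instance) {
        try {
            compile(XSD).newValidator()
                    .validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<X/>"));                      // empty element
        System.out.println(isValid("<X><!-- note --><?pi d?></X>")); // only comment and PI
        System.out.println(isValid("<X>text</X>"));               // character content
        System.out.println(isValid("<X a='1'/>"));                // undeclared attribute
        System.out.println(isValid("<X><Y/></X>"));               // child element
    }
}
```

The first two instances should be accepted and the last three rejected, which matches the point that comments and processing instructions are outside what the schema constrains.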

2010-10-26: Here's another variant of the definition for empty XML content models:
  <xs:simpleType name="EMPTY">
     <xs:restriction base="xs:string">
        <xs:maxLength value="0"/>
     </xs:restriction>
  </xs:simpleType>

This defines an XML schema 'simpleType' and enforces content emptiness with the schema 'maxLength' facet on type xs:string, instead of using a complex type as in the previous example. I'm more inclined to define element emptiness with a simpleType like the above, since a simple type can never declare XML attributes, whereas a complexType can.
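To compare the two variants, here is a small sketch using the JAXP validation API from the JDK. The class name `EmptyVariants`, the helper `isValid`, and the element declaration `<xs:element name='X' type='EMPTY'/>` are assumptions added for illustration; the post itself only shows the type definitions.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class EmptyVariants {

    // Variant with an empty complex type.
    static final String COMPLEX_VARIANT =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='X'><xs:complexType/></xs:element>"
      + "</xs:schema>";

    // Variant with a simple type restricted to zero-length strings.
    static final String SIMPLE_VARIANT =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='X' type='EMPTY'/>"
      + "  <xs:simpleType name='EMPTY'>"
      + "    <xs:restriction base='xs:string'><xs:maxLength value='0'/></xs:restriction>"
      + "  </xs:simpleType>"
      + "</xs:schema>";

    // Returns true if the instance validates against the given schema text.
    static boolean isValid(String xsd, String instance) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            return false; // validation or well-formedness error
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] samples = { "<X/>", "<X></X>", "<X>abc</X>", "<X a='1'/>", "<X><Y/></X>" };
        for (String s : samples) {
            System.out.println(s + "  complex=" + isValid(COMPLEX_VARIANT, s)
                                 + "  simple=" + isValid(SIMPLE_VARIANT, s));
        }
    }
}
```

For these sample instances both variants should give the same verdicts, so the choice between them is mostly about which definition states the intent more clearly.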

I hope the corrections I've shared in this post are appreciated by folks who've read my earlier post cited above [1].

Sunday, October 10, 2010

XSD 1.1: XML schema design approaches contd... PART 4

In this blog post I'm trying to describe a few more XML Schema use cases (I find the subject matter interesting enough to have a new blog post, and I'm trying to cook up XSD 1.1 examples :). They largely use XSD 1.1 assertions, for example to constrain the cardinality of XML Schema xs:list items together with conditions on the items themselves, as described below. In my view, such combined constraints could not be expressed as directly with XML Schema 1.0.

I hope the XML Schema community finds a few of the things here interesting.

This post can be considered PART 4 of the XML Schema 1.1 design series that I started a couple of weeks ago. The previous parts of this series are available here:

1) PART 1
2) PART 2
3) PART 3

I'm using the latest XML Schema 1.1 code base from the Xerces-J SVN repository.

Use-case: (A)
The examples in this post illustrate how we can constrain the cardinality of XML Schema 1.1 xs:list instance members, and optionally constrain a few aspects of the list members (for example, that list items need to be even integers). The latter is mainly to verify for myself how XSD 1.1 assertions behave in various combinations.

Here's an XML instance document (a simple list of integers encapsulated in an XML element "X"), which I'll use for illustration in this post:

[XML 1] (named temp.xml)
  <X>2 4 6 5 10 3</X>

Below are a few XML Schema 1.1 examples, followed by explanations from my point of view:

[XML Schema 1]
  <?xml version='1.0'?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
    <xs:element name="X">
       <xs:complexType>
         <xs:simpleContent>
            <xs:restriction base="INT_LIST">
              <xs:assertion test="count($value) le 5" />
            </xs:restriction>
         </xs:simpleContent>
       </xs:complexType>
    </xs:element>
   
    <xs:complexType name="INT_LIST">
       <xs:simpleContent>
         <xs:restriction base="xs:anyType">
            <xs:simpleType>
               <xs:list itemType="xs:integer" />          
            </xs:simpleType>
            <xs:assert test="every $x in $value satisfies ($x mod 2 = 0)" />
         </xs:restriction>
       </xs:simpleContent> 
    </xs:complexType>

  </xs:schema>

[XML Schema 2]
  <?xml version='1.0'?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
     <xs:element name="X">
        <xs:simpleType>
          <xs:restriction base="INT_LIST">
             <xs:assertion test="$value mod 2 = 0" />
          </xs:restriction>
        </xs:simpleType>
     </xs:element>
   
     <xs:simpleType name="INT_LIST">
       <xs:list itemType="xs:integer" />
     </xs:simpleType>

  </xs:schema>

[XML Schema 3]
  <?xml version='1.0'?>
  <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   
    <xs:element name="X">
      <xs:complexType>
        <xs:simpleContent>
          <xs:extension base="INT_LIST">
             <xs:assert test="count($value) le 5" />
          </xs:extension>
        </xs:simpleContent>
      </xs:complexType>
    </xs:element>
   
    <xs:simpleType name="INT_LIST">
       <xs:list itemType="xs:integer" />
    </xs:simpleType>

  </xs:schema>

Here are the results of validating the XML instance document [XML 1] with the schemas above:

1. When the XML document [XML 1] is validated by the schema [XML Schema 1], we get the following validation outcomes with Xerces:
[Error] temp.xml:1:20: cvc-assertion.3.13.4.1: Assertion evaluation ('every $x in $value satisfies ($x mod 2 = 0)') for element 'X' with type 'INT_LIST' did not succeed.
[Error] temp.xml:1:20: cvc-assertion.3.13.4.1: Assertion evaluation ('count($value) le 5') for element 'X' with type '#anonymous' did not succeed.


2. When the XML document [XML 1] is validated by the schema [XML Schema 2], we get the following validation outcomes with Xerces:
[Error] temp.xml:1:20: cvc-assertion.3.13.4.1: Assertion evaluation ('$value mod 2 = 0') for element 'X' with type '#anonymous' did not succeed. Assertion failed for an xs:list member value '5'.
[Error] temp.xml:1:20: cvc-assertion.3.13.4.1: Assertion evaluation ('$value mod 2 = 0') for element 'X' with type '#anonymous' did not succeed. Assertion failed for an xs:list member value '3'.


3. When the XML document [XML 1] is validated by the schema [XML Schema 3], we get the following validation outcomes with Xerces:
[Error] temp.xml:1:20: cvc-assertion.3.13.4.1: Assertion evaluation ('count($value) le 5') for element 'X' with type '#anonymous' did not succeed.
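The outcomes above can be checked by hand against the content of [XML 1]. The sketch below restates the two assertion conditions in plain Java; it doesn't evaluate XPath or call Xerces, and the class and method names are made up for illustration. It also assumes the list items fit in a Java int, which xs:integer doesn't require.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ListAssertionsByHand {

    // Splits the xs:list content on whitespace and parses each item.
    static List<Integer> parseList(String content) {
        return Arrays.stream(content.trim().split("\\s+"))
                     .map(Integer::valueOf)
                     .collect(Collectors.toList());
    }

    // Plain-Java counterpart of: count($value) le 5
    static boolean atMostFive(List<Integer> items) {
        return items.size() <= 5;
    }

    // Plain-Java counterpart of: every $x in $value satisfies ($x mod 2 = 0)
    static boolean allEven(List<Integer> items) {
        return items.stream().allMatch(x -> x % 2 == 0);
    }

    // The individual items that fail the evenness test.
    static List<Integer> oddItems(List<Integer> items) {
        return items.stream().filter(x -> x % 2 != 0).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = parseList("2 4 6 5 10 3"); // content of element X in [XML 1]
        System.out.println("count le 5: " + atMostFive(items));
        System.out.println("all even:   " + allEven(items));
        System.out.println("odd items:  " + oddItems(items));
    }
}
```

The list has six items, so the count condition fails, and the two odd items, 5 and 3, are the ones named in the per-item messages for [XML Schema 2].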

Here's a quick analysis, from my point of view, of what I wanted to achieve with use case (A):

In XML Schema 1.1 assertions, the XPath 2.0 context variable "$value" has the type annotation xs:anyAtomicType*.

1. The first validation result (1. above) illustrates that every item of the xs:list needs to be an even integer, and that the number of list items is constrained to a maximum of "5" (this is a sample "max" limit on the number of list items).

2. I intended to use validation results 2. and 3. in combination, taking a boolean "AND" of them, essentially to achieve the same XML instance validation objective as in case 1. The boolean "AND" of two schema validations can be achieved with, for example, the Java JAXP validation API. I wrote the schema document [XML Schema 2] so that the XML Schema validator reports each individual list item that fails the test of mathematical evenness. This was not entirely achieved with schema document [XML Schema 1], where the validator detected an evenness failure for the whole list instance but didn't report every individual list item that failed the test.
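The boolean "AND" of several schema validations mentioned in point 2 could be sketched with JAXP roughly as follows. The class name `ValidateAgainstAll` and its helpers are hypothetical. The two schemas are plain XSD 1.0 stand-ins (a list of integers and a list of non-negative integers) so that the sketch runs on a stock JDK, whose built-in validator handles XSD 1.0 only. For [XML Schema 2] and [XML Schema 3] an XSD 1.1 capable factory would be needed. With a Xerces-J XSD 1.1 build this is, as far as I know, obtained by passing the schema language URI http://www.w3.org/XML/XMLSchema/v1.1 to SchemaFactory.newInstance; that part is an assumption and isn't exercised here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ValidateAgainstAll {

    // XSD 1.0 stand-in: X is a list of integers.
    static final String INT_LIST =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='X'><xs:simpleType>"
      + "    <xs:list itemType='xs:integer'/>"
      + "  </xs:simpleType></xs:element>"
      + "</xs:schema>";

    // XSD 1.0 stand-in: X is a list of non-negative integers.
    static final String NON_NEGATIVE_LIST =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "  <xs:element name='X'><xs:simpleType>"
      + "    <xs:list itemType='xs:nonNegativeInteger'/>"
      + "  </xs:simpleType></xs:element>"
      + "</xs:schema>";

    // Validates one instance against one schema text.
    static boolean isValid(String xsd, String instance) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            throw new IllegalStateException("schema did not compile", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(instance)));
            return true;
        } catch (SAXException e) {
            System.err.println("[Error] " + e.getMessage()); // report, then carry on
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Boolean AND over all schemas. The non-short-circuit '&=' makes sure every
    // schema is run, so each one gets the chance to report its own errors.
    static boolean validAgainstAll(String instance, String... xsds) {
        boolean ok = true;
        for (String xsd : xsds) {
            ok &= isValid(xsd, instance);
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println(validAgainstAll("<X>2 4 6 5 10 3</X>", INT_LIST, NON_NEGATIVE_LIST));
        System.out.println(validAgainstAll("<X>2 -4 6</X>", INT_LIST, NON_NEGATIVE_LIST));
    }
}
```

Running every schema rather than stopping at the first failure keeps the combined result equivalent to case 1 while still collecting the more detailed per-schema messages.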

I hope the intent of the use case described here, and the solutions offered, are explained clearly enough for the XML Schema audience.

Thanks for reading, and as usual I hope that this blog post was interesting!