
Monday, May 30, 2022

XML Schema : identity constraints essentials and best practices

In this blog post, I'll describe best practices for using XML Schema (XSD) identity constraints. I'll compare the XSD identity constraint elements xs:unique and xs:key, and describe when to use each of them.

XSD xs:key serves the same purpose within XML as a primary key does in an RDBMS, whereas xs:unique is a more general way to enforce unique values within a set of XML data values. xs:key enforces unique values within a set of XML data values too. Unlike xs:key, xs:unique permits a value to be absent for some members of the set (logically speaking, a null value).

Please consider the following XML Schema validation example.

XML Schema document:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="catalog" type="CatalogType">

      <xs:unique name="prodNumKey">

         <xs:selector xpath="*/product"/>

         <xs:field xpath="number"/>

      </xs:unique>

   </xs:element>

   <xs:complexType name="CatalogType">

      <xs:sequence>

         <xs:element name="department" maxOccurs="unbounded">

            <xs:complexType>

               <xs:sequence>

                  <xs:element name="product" maxOccurs="unbounded">

                     <xs:complexType>

                        <xs:sequence>

                           <xs:element name="number" type="xs:positiveInteger" minOccurs="0"/>

                           <xs:element name="name" type="xs:string"/>

                           <xs:element name="price">

                              <xs:complexType>

                                 <xs:simpleContent>

                                    <xs:extension base="xs:decimal">

                                       <xs:attribute name="currency" type="xs:string"/>

                                    </xs:extension>

                                 </xs:simpleContent>

                              </xs:complexType>

                           </xs:element>

                        </xs:sequence>

                     </xs:complexType>   

                  </xs:element>

               </xs:sequence>

               <xs:attribute name="number" type="xs:positiveInteger"/>

            </xs:complexType>

         </xs:element>

      </xs:sequence>

   </xs:complexType>

</xs:schema>

One valid XML instance document for the above XSD schema is the following:

<catalog>

  <department number="021">

    <product>

      <number>557</number>

      <name>Short-Sleeved Linen Blouse</name>

      <price currency="USD">29.99</price>

    </product>

    <product>

      <name>Ten-Gallon Hat</name>

      <price currency="USD">69.99</price>

    </product>

    <product>

      <number>443</number>

      <name>Deluxe Golf Umbrella</name>

      <price currency="USD">49.99</price>

    </product>

  </department>

</catalog>

Please note the following about the above XML Schema validation example:

1) Within the XML Schema document, the "number" child of "product" is declared as follows,

<xs:element name="number" type="xs:positiveInteger" minOccurs="0"/>

(i.e., with minOccurs="0", meaning that this element is optional within the corresponding XML instance document)

2) The "catalog" element has the following xs:unique definition bound to it,

<xs:unique name="prodNumKey">

     <xs:selector xpath="*/product"/>

     <xs:field xpath="number"/>

</xs:unique>

Taken together, these facts mean that the "number" element is not intended to function as the primary key of the "product" data set (a primary key value has to be present in every record of the data set). However, for those "number" elements that are present within the XML instance document (the "number" element may be absent from some "product" elements, as the XML Schema document above allows), the values have to be unique.
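
To see the constraint reject a document, here is a hypothetical variation of the instance above (not taken from the original example). The "product" without a "number" is still fine under xs:unique, but the repeated value 557 is expected to make the document invalid.

```xml
<catalog>
  <department number="021">
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product>
      <!-- no <number> child: allowed by xs:unique -->
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
    <product>
      <!-- repeats 557: violates the "prodNumKey" uniqueness constraint -->
      <number>557</number>
      <name>Deluxe Golf Umbrella</name>
      <price currency="USD">49.99</price>
    </product>
  </department>
</catalog>
```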

We've discussed the role of xs:unique in the paragraphs above.


Now, as raised earlier in this blog post, how do we enforce primary-key behavior within an XML Schema document?

In the context of the above example, this can be done simply by changing the "number" element declaration to the following (i.e., we must not write minOccurs="0" in the element declaration),

<xs:element name="number" type="xs:positiveInteger"/>

And by writing the "catalog" element declaration as follows (i.e., we now use xs:key instead of xs:unique),

<xs:element name="catalog" type="CatalogType">

      <xs:key name="prodNumKey">

         <xs:selector xpath="*/product"/>

         <xs:field xpath="number"/>

      </xs:key>

</xs:element>

The above changes to the XML Schema document mean that:

All "product" elements must have a "number" child, and all the "number" values within the XML instance document have to be unique (these characteristics make the "number" element a primary key for the "product" data set).
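
As a hypothetical counter-example (not from the original example), the instance below would be rejected under the revised schema. The second "product" has no "number" child, so it fails the content model; xs:key would reject it in any case, because every element selected by a key must supply a value for each field.

```xml
<catalog>
  <department number="021">
    <product>
      <number>557</number>
      <name>Short-Sleeved Linen Blouse</name>
      <price currency="USD">29.99</price>
    </product>
    <product>
      <!-- missing <number>: not allowed once "number" is the key field -->
      <name>Ten-Gallon Hat</name>
      <price currency="USD">69.99</price>
    </product>
  </department>
</catalog>
```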


The XML Schema features related to xs:unique and xs:key that are described in this blog post are supported by both the 1.0 and 1.1 versions of the XML Schema language.


Acknowledgements: The XML Schema validation example in this blog post is borrowed from Priscilla Walmsley's excellent book "Definitive XML Schema, 2nd edition".



Thursday, February 24, 2022

XML Schema 1.1 : <assertion> facet with attribute "fixed"

I've come up with an XML Schema 1.1 example involving the XSD <assertion> facet and the XSD attribute "fixed" that I thought would be interesting to write about.

Please consider the following two XML instance documents,

XML document 1:
<?xml version="1.0"?>
<Test>
    <A>a</A>
    <country>USA</country>
    <C>c</C>
</Test>

XML document 2:
<?xml version="1.0"?>
<Test>
    <A>a</A>
    <country>U S A</country>
    <C>c</C>
</Test>

In "XML document 1" above, the element "country" needs to have the fixed value "USA". In "XML document 2", the element "country" needs to have the value USA, but with any amount of whitespace allowed anywhere within the string value.

The XSD 1.1 schema for "XML document 1" is the following (the schema below is a valid XSD 1.0 schema as well),

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    
    <xs:element name="Test" type="TestType" />
   
    <xs:complexType name="TestType">
        <xs:sequence>
            <xs:element name="A" type="xs:string"/>
            <xs:element name="country" type="xs:string" fixed="USA"/>            
            <xs:element name="C" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>

</xs:schema>

Whereas the XSD 1.1 schema for "XML document 2" is the following,

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    
    <xs:element name="Test" type="TestType" />
   
    <xs:complexType name="TestType">
        <xs:sequence>
            <xs:element name="A" type="xs:string"/>
            <xs:element name="country">
               <xs:simpleType>
                  <xs:restriction base="xs:string">
                    <xs:assertion test="replace($value, '\s', '') = 'USA'"/>
                  </xs:restriction>
               </xs:simpleType>
            </xs:element>
            <xs:element name="C" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>

</xs:schema>

As the latter schema shows, the XSD 1.1 <assertion> facet lets us achieve a special notion of a fixed value.
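
A few hypothetical "country" values (not from the original example) illustrate what the assertion is expected to accept and reject. This assumes the type shown above, whose xs:string base preserves whitespace, so $value still contains it when replace(..) runs.

```xml
<!-- accepted: removing every whitespace character leaves "USA" -->
<country>USA</country>
<country> U  S A </country>

<!-- rejected: the remaining characters are not exactly "USA" -->
<country>U.S.A</country>
<country>usa</country>
```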

Saturday, January 29, 2022

XML Schema 1.1 : conditional inclusion

I've been wanting to write something about the XML Schema (XSD) 1.1 conditional inclusion feature. This feature is described here: https://www.w3.org/TR/xmlschema11-1/#cip. I'm copying some relevant text about this feature from the XML Schema 1.1 specification below,

<quote>
Whenever a conforming XSD processor reads a ·schema document· in order to include the components defined in it in a schema, it first performs on the schema document the pre-processing described in this section.

Every element in the ·schema document· is examined to see whether any of the attributes vc:minVersion, vc:maxVersion, vc:typeAvailable, vc:typeUnavailable, vc:facetAvailable, or vc:facetUnavailable appear among its [attributes].

Where they appear, the attributes vc:minVersion and vc:maxVersion are treated as if declared with type xs:decimal, and their ·actual values· are compared to a decimal value representing the version of XSD supported by the processor (here represented as a variable V). For processors conforming to this version of this specification, the value of V is 1.1.

If V is less than the value of vc:minVersion, or if V is greater than or equal to the value of vc:maxVersion, then the element on which the attribute appears is to be ignored, along with all its attributes and descendants. The effect is that portions of the schema document marked with vc:minVersion and/or vc:maxVersion are retained if vc:minVersion ≤ V < vc:maxVersion.
</quote>

Below is a small XML Schema validation example (tested with the Apache Xerces XML Schema 1.1 processor) that uses XSD 1.1 conditional inclusion.

Following is an XML instance document that'll be validated by an XML Schema document,

<val>5</val>

One of the validations that we want to do is that the integer value of element "val" must be an even number.

Following is an XML Schema document that'll validate the above XML instance document,

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

  <xs:element name="val" type="Integer"/>
  
  <xs:simpleType name="Integer" vc:minVersion="1" vc:maxVersion="1.05">
      <xs:restriction base="xs:integer"/>
  </xs:simpleType>
  
  <xs:simpleType name="Integer" vc:minVersion="1.1">
      <xs:restriction base="xs:integer">
         <xs:assertion test="$value mod 2 = 0"/>
      </xs:restriction>
  </xs:simpleType>

</xs:schema>

Within the above schema document, there's an element declaration for the XML element "val", which is of schema type "Integer". There are two variants of the schema type "Integer" defined in this schema. One "Integer" type simply says that the value should be an xs:integer (the type with attributes vc:minVersion="1" vc:maxVersion="1.05"). The other "Integer" type says that the value should be an even integer (the type with attribute vc:minVersion="1.1").

When we perform the above XML schema validation using an XSD 1.1 processor in XML Schema 1.0 mode, a valid outcome is reported (because, taking V as 1.0, the simpleType with attributes vc:minVersion="1" vc:maxVersion="1.05" is selected, and the other simpleType definition is filtered out during conditional inclusion pre-processing).

Whereas, when we perform the same validation using an XSD 1.1 processor in XML Schema 1.1 mode, an invalid outcome is reported (because, with V equal to 1.1, the simpleType with attribute vc:minVersion="1.1" is selected, and the other simpleType definition is filtered out during conditional inclusion pre-processing; the value 5 then fails the even-number assertion).

Please note that when the above XML schema validation is done with a pure XML Schema 1.0 processor (one is bundled with Apache Xerces-J as well) that was written for the XML Schema 1.0 specification https://www.w3.org/TR/xmlschema-1/, the above XSD document won't compile successfully (because, with a pure XSD 1.0 processor, a schema document cannot contain two global type definitions with the same name, "Integer" in the above schema document).
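
Because the upper bound in the quoted rule is exclusive (a portion is retained if vc:minVersion ≤ V < vc:maxVersion), the value 1.05 above is just some decimal between 1.0 and 1.1. As an untested sketch, the two type definitions could equally be split exactly at 1.1. The fragment assumes the same enclosing xs:schema element and "vc" namespace declaration as the schema above.

```xml
<!-- retained only when V < 1.1, e.g. by a processor running in 1.0 mode -->
<xs:simpleType name="Integer" vc:maxVersion="1.1">
    <xs:restriction base="xs:integer"/>
</xs:simpleType>

<!-- retained only when V >= 1.1 -->
<xs:simpleType name="Integer" vc:minVersion="1.1">
    <xs:restriction base="xs:integer">
       <xs:assertion test="$value mod 2 = 0"/>
    </xs:restriction>
</xs:simpleType>
```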

Tuesday, January 18, 2022

XML Schema 1.1 : using regex

I've been thinking about this for a while, and thought of writing a blog post about it.

Consider the following XML document instance,

<?xml version="1.0"?>
<temp>ABCABD</temp>

And the following XML Schema (XSD) 1.1 document (that'll validate the above XML document instance),

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  
  <xs:element name="temp">
      <xs:simpleType>
         <xs:restriction base="xs:string">
            <xs:pattern value="(ABC)+"/>
            <xs:assertion test="matches($value, '(ABC)+')"/>
         </xs:restriction>
      </xs:simpleType>
  </xs:element>
  
</xs:schema>

At first thought, it might seem that both the <xs:pattern> and the <xs:assertion> in the above XSD 1.1 document would fail validation for the XML document instance value "ABCABD" (both regexes describe the string "ABC" repeated one or more times).

But in reality, and according to the XSD 1.1 specification, the XML document instance value "ABCABD" is invalid for the <xs:pattern> but valid for the <xs:assertion>. That's because the XPath 2.0 "matches(..)" function returns true when any substring matches the regex, unless the regex is anchored with the ^ and $ characters.

Therefore, for the above XSD 1.1 example, the following are equivalent XSD validation checks,
<xs:pattern value="(ABC)+"/>
<xs:assertion test="matches($value, '^(ABC)+$')"/>

For <xs:pattern>, there's no explicit regex anchoring with ^ and $ available (it's always implied), i.e., with <xs:pattern> it's always the entire input string that is checked against the pattern regex.
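
Going the other way, the unanchored matches(..) check can be approximated with <xs:pattern> by allowing arbitrary text on both sides of the regex. The fragment below is a sketch rather than something taken from the XSD specification; "ContainsABC" is a hypothetical type name, and it assumes the value contains no line breaks, since "." in XSD regexes does not match newline or carriage return characters.

```xml
<xs:simpleType name="ContainsABC">
   <xs:restriction base="xs:string">
      <!-- accepts "ABCABD", like matches($value, '(ABC)+') does -->
      <xs:pattern value=".*(ABC)+.*"/>
   </xs:restriction>
</xs:simpleType>
```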

Wednesday, June 9, 2021

XML Schema xsi:type and xs:alternative

After studying XML Schema's xsi:type attribute and the xs:alternative element (introduced in XML Schema 1.1) a little more deeply, I've come to the conclusion that there are a lot of functional similarities between xsi:type and xs:alternative, and of course differences as well. To illustrate these points, I've come up with the following XML Schema and XML document instance examples (which I'll also attempt to explain in this blog post).


XML Schema document 1 (conforming to XSD 1.1)

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType"/>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Following are three XML document instances that are valid with the above XML Schema document:

XML document instance 1

<note>

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 2

<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType2">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 3

<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType3">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

"XML document instance 1" is an XML document that is valid according to the XSD element declaration and the XSD type definition "NoteType".

"XML document instance 2" asserts that the type of the XML instance element "note" must be "NoteType2".

"XML document instance 3" asserts that the type of the XML instance element "note" must be "NoteType3".

Note that, as per the XML Schema language, the XSD type named as the value of the xsi:type attribute must be validly substitutable for the declared type of the XML element (i.e., the type associated with it in the XML schema). According to the XML Schema language, a type S is validly substitutable for a type T if S is derived from T.
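
To see where "NoteType2" and "NoteType3" differ, consider a hypothetical instance (not one of the examples above) whose "to" value lacks an "@". With xsi:type="NoteType3" the first xs:assert is expected to fail, because the value is not castable as "emailAddress"; with xsi:type="NoteType2" the same content would be accepted, since that type carries no assertions.

```xml
<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType3">
   <!-- "abc.pqr" does not match the emailAddress pattern, so the assert on "to" fails -->
   <to>abc.pqr</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```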


Now consider another XML Schema document, as follows,

XML Schema document 2 (conforming to XSD 1.1)

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType">

       <xs:alternative test="@noteType2 = true()" type="NoteType2"/>

       <xs:alternative test="@noteType3 = true()" type="NoteType3"/>

    </xs:element>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType2" type="xs:boolean" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType3" type="xs:boolean" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Following are two XML document instances that are valid with the above XML Schema document:

XML document instance 4

<note isConfidential="true" noteType2="true">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 5

<note isConfidential="true" noteType3="true">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>


I think that XML Schema documents 1 and 2, as illustrated in the examples above, solve the same XML document validation problem in two different ways. With the XSD element xs:alternative, we need to introduce new physical XML attributes like "noteType2" and "noteType3", whereas with the other solution we achieve the same effect using the xsi:type attribute.


Following is another XML Schema 1.1 document, which is a slight variation of "XML Schema document 2" above,

XML Schema document 3

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType">

       <xs:alternative test="@noteType = 2" type="NoteType2"/>

       <xs:alternative test="@noteType = 3" type="NoteType3"/>

    </xs:element>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType" type="NoteTypeVal" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType" type="NoteTypeVal" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

    <xs:simpleType name="NoteTypeVal"> 

       <xs:restriction base="xs:positiveInteger"> 

          <xs:minInclusive value="2"/>

          <xs:maxInclusive value="3"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Two XML instance documents that are valid with the above XML Schema document are the following,

<note isConfidential="true" noteType="2">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

<note isConfidential="true" noteType="3">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>


With "XML Schema document 3" above, we've defined an attribute "noteType" for both the types "NoteType2" and "NoteType3". The value of the attribute "noteType" within the XML instance document determines the XSD type with which the "note" element is validated.

Also note that, as per the XML Schema 1.1 specification for type alternatives (i.e., when xs:alternative elements are present within XSD documents), the following must hold,

For each type T named by sibling xs:alternative elements within an XSD document, T must be validly derived from the element's declared (default) type definition (a constraint similar to the one for xsi:type), or T can be the type xs:error.
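
As a sketch of the xs:error case, the "note" declaration of "XML Schema document 3" could be extended with a third alternative (this addition is hypothetical and untested). Alternatives are tried in document order and the first one whose test is true is used, so a "noteType" value other than 2 or 3 would reach the last alternative and be assigned xs:error, against which no element is valid. A "note" without any "noteType" attribute matches none of the tests and keeps the declared type "NoteType".

```xml
<xs:element name="note" type="NoteType">
   <xs:alternative test="@noteType = 2" type="NoteType2"/>
   <xs:alternative test="@noteType = 3" type="NoteType3"/>
   <!-- hypothetical addition: any other "noteType" attribute is rejected outright -->
   <xs:alternative test="@noteType" type="xs:error"/>
</xs:element>
```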

Sunday, May 3, 2020

Online XML Schema validation service

In some of my spare time, I've developed and deployed an 'online XML Schema validation service' that uses Apache Xerces-J as the XML Schema (XSD) processor at the back end. This service is located at http://www.softwarebytes.org/xmlvalidation/. The HTTPS version is available here: https://www.softwarebytes.org/xmlvalidation/.

The service also provides REST APIs that can be invoked from any program that can issue HTTP POST requests, along with downloadable examples written in Python and C# that use these REST APIs. The REST API responses can be in the following formats: XML, JSON, or plain text (the response format can be set when issuing the HTTP request).

Interestingly, I've discovered that these REST APIs can be invoked directly with a tool like curl, using its platform binary. With modern operating systems (e.g., Windows 10), curl comes pre-installed. Following are the responses on the command line for a few curl requests that I issued to these REST APIs,

curl --form xmlFile=@two_inp_files/x1_valid_1.xml --form xsdFile1=@two_inp_files/x1.xsd --form ver=1.1 --form xsd11CtaFullXPath=no --form responseType=xml https://www.softwarebytes.org/xmlvalidation/api/xsValidationHandler

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<validationReport>
   <xsdVer>1.1</xsdVer>
   <success>
      <message>XML document is assessed as valid with the XSD document(s) that were provided.</message>
   </success>
</validationReport>

curl --form xmlFile=@two_inp_files/x1_invalid_1.xml --form xsdFile1=@two_inp_files/x1.xsd --form ver=1.1 --form xsd11CtaFullXPath=no --form responseType=xml https://www.softwarebytes.org/xmlvalidation/api/xsValidationHandler

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<validationReport>
   <xsdVer>1.1</xsdVer>
   <failure>
      <message>XML document is assessed as invalid with the XSD document(s) that were provided.</message>
      <details>
         <detail_1>[Error] x1_invalid_1.xml:3:5:cvc-assertion: Assertion evaluation ('if (@isB = true()) then b else not(b)') for element 'X' on schema type '#AnonType_X' did not succeed.</detail_1>
      </details>
   </failure>
</validationReport>

curl --form xmlFile=@two_inp_files/x1_valid_1.xml --form xsdFile1=@two_inp_files/x1.xsd --form ver=1.1 --form xsd11CtaFullXPath=no --form responseType=json https://www.softwarebytes.org/xmlvalidation/api/xsValidationHandler

{
    "xsdVer": "1.1",
    "success": {"message": "XML document is assessed as valid with the XSD document(s) that were provided."}
}

curl --form xmlFile=@two_inp_files/x1_invalid_1.xml --form xsdFile1=@two_inp_files/x1.xsd --form ver=1.1 --form xsd11CtaFullXPath=no --form responseType=json https://www.softwarebytes.org/xmlvalidation/api/xsValidationHandler

{
    "xsdVer": "1.1",
    "failure": {
        "details": ["[Error] x1_invalid_1.xml:3:5:cvc-assertion: Assertion evaluation ('if (@isB = true()) then b else not(b)') for element 'X' on schema type '#AnonType_X' did not succeed."],
        "message": "XML document is assessed as invalid with the XSD document(s) that were provided."
    }
}

curl --form xmlFile=@input_small.xml --form xsdFile1=@assert_2.xsd --form ver=1.1 --form xsd11CtaFullXPath=no https://www.softwarebytes.org/xmlvalidation/api/xsValidationHandler

You selected XSD 1.1 validation.
XML document is assessed as valid with the XSD document(s) you have provided.

(Please note that since the last curl request above doesn't specify the 'responseType' form field, a response formatted as plain text is received from the server API, i.e., plain text is the default response format for this API.)

This 'online XML Schema validation service' supports both the 1.0 and 1.1 versions of the XML Schema language.

Saturday, March 21, 2020

Using XML Schema 1.1 <alternative> with Xerces-J

I wish to share a little information here about Apache Xerces-J's implementation of XML Schema (XSD) 1.1 'type alternatives'.

The XSD 1.1 specification defines a particular subset of the XPath 2.0 language that can be used as the value of the 'test' attribute of the XSD 1.1 <alternative> element. This XPath 2.0 subset is much smaller than the whole XPath 2.0 language. The specification of this smaller CTA XPath subset can be read at https://www.w3.org/TR/xmlschema11-1/#coss-ta (specifically, the section mentioning '2.1 It conforms to the following extended BNF', which has the grammar for the CTA XPath subset).

In fact, the XSD 1.1 specification allows XSD validators implementing the <alternative> element to support a bigger set of XPath 2.0 features (commonly the full XPath 2.0 language) than what is defined by the XSD 1.1 CTA (conditional type assignment) XPath subset.

For XSD 1.1 CTAs, Xerces-J lets the user select, via an option, either:

1) The smaller XPath subset (the default for Xerces-J), or

2) Full XPath 2.0.

How to select between the XPath subset and the full XPath 2.0 language for Xerces-J's CTA implementation is described here: https://xerces.apache.org/xerces2-j/faq-xs.html#faq-3.

I've analyzed the nature of the XSD 1.1 CTA XPath subset language a bit. Following are essentially the main CTA XPath subset patterns that may be used within XSD 1.1 schemas when using the XSD <alternative> element,

1) Using comparators (like >, <, =, !=, <=, >=):

Example CTA XPath expressions are the following,
@x = @y,
@x = 3,
@x != 3,
@x > @y

2) Using comparators with logical operators:

Example CTA XPath expressions are the following,
(@x = @y) or (@p = @q),
((1 = 2) or (5 = 6)) and (5 = 7),
(1 and 2) or (5 and 7)

3) Using the XPath 2.0 'not' function:

An example XPath expression is the following,
(@x = @y) and not(@p)

Interestingly, the XSD 1.1 CTA XPath subset language allows using only the XPath 2.0 fn:not function and no other XPath 2.0 built-in functions. Constructor functions for all built-in XSD types (e.g., xs:integer(..), xs:boolean(..)) may also be used in CTA XPath subset expressions.

As per the XSD 1.1 specification, during XSD 1.1 CTA evaluation the XML element and attribute nodes are untyped (i.e., the XML nodes do not carry any type annotation coming from an XML schema). Therefore, in many cases, CTA XPath subset expressions used with Xerces-J need explicit casts (e.g., <xs:alternative test="(xs:integer(@x) = xs:integer(@y)) and fn:not(xs:boolean(@p))"> with the namespace prefix 'fn' bound to the URI 'http://www.w3.org/2005/xpath-functions'). For both the CTA XPath subset language and the full XPath 2.0 language for CTAs, the "fn" prefix on XPath built-in functions is optional. Typically, XML schema authors would not use the "fn" prefix for XPath built-in functions.
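
To put the cast expression above in context, here is a minimal, untested schema sketch built around it. The names "E", "BaseType", "DerivedType" and the "note" attribute are hypothetical; only the 'test' expression and the 'fn' namespace binding come from the text above.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:fn="http://www.w3.org/2005/xpath-functions">

    <!-- "DerivedType" is selected only when @x equals @y as integers and @p is not true -->
    <xs:element name="E" type="BaseType">
       <xs:alternative test="(xs:integer(@x) = xs:integer(@y)) and fn:not(xs:boolean(@p))"
                       type="DerivedType"/>
    </xs:element>

    <xs:complexType name="BaseType">
       <xs:attribute name="x" type="xs:integer"/>
       <xs:attribute name="y" type="xs:integer"/>
       <xs:attribute name="p" type="xs:boolean"/>
    </xs:complexType>

    <!-- derived from "BaseType", as required for a type named by xs:alternative -->
    <xs:complexType name="DerivedType">
       <xs:complexContent>
          <xs:extension base="BaseType">
             <xs:attribute name="note" type="xs:string" use="required"/>
          </xs:extension>
       </xs:complexContent>
    </xs:complexType>

</xs:schema>
```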

Tuesday, March 10, 2020

XML Schema 1.1 <assert> continued ...

This blog post is related to the XML Schema (XSD) use case that I've discussed within my previous two blog posts. Consider the following XML Schema 1.1 document, which has an XSD <assert> element,

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:sequence>
              <xs:element name="isSeqTwo" type="xs:boolean"/>
              <xs:choice>
                 <xs:sequence>
                    <xs:element name="a" type="xs:string"/>
                    <xs:element name="b" type="xs:string"/>
                 </xs:sequence>
                 <xs:sequence>
                    <xs:element name="p" type="xs:string"/>
                    <xs:element name="q" type="xs:string"/>
                 </xs:sequence>
                 <xs:sequence>
                    <xs:element name="x" type="xs:string"/>
                    <xs:element name="y" type="xs:string"/>
                 </xs:sequence>
               </xs:choice>
           </xs:sequence>       
           <xs:assert test="if (isSeqTwo = true()) then p else not(p)"/>
       </xs:complexType>
    </xs:element>

</xs:schema>

The above schema document differs from the schema documents presented in my previous two blog posts in the following way:
The XML child content model of element "X" is a sequence of an element followed by a choice.

In the earlier two blog posts, the XML child content model of element "X" depends on the value of an attribute on element "X", which could be enforced using either an XSD 1.1 <assert> or an <alternative>.

A few XML instance documents that are valid or invalid according to the above XSD schema document are the following:

Valid,

<X>
    <isSeqTwo>0</isSeqTwo>
    <x>string1</x>
    <y>string2</y>
</X>

Valid,

<X>
    <isSeqTwo>1</isSeqTwo>
    <p>string1</p>
    <q>string2</q>
</X>

Invalid,

<X>
    <isSeqTwo>1</isSeqTwo>
    <x>string1</x>
    <y>string2</y>
</X>

The XSD use case illustrated above, is useful and could only be accomplished using an XSD 1.1 <assert> element.
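
One more pair of hypothetical instances (not from the examples above) shows the other branch of the assertion: when "isSeqTwo" is false, the "p"/"q" sequence is the one that is expected to be rejected, while "a"/"b" is accepted.

```xml
<!-- valid: isSeqTwo is false and there is no "p" child -->
<X>
    <isSeqTwo>0</isSeqTwo>
    <a>string1</a>
    <b>string2</b>
</X>

<!-- invalid: isSeqTwo is false, so the assertion requires not(p) -->
<X>
    <isSeqTwo>0</isSeqTwo>
    <p>string1</p>
    <q>string2</q>
</X>
```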

As a side discussion, to reaffirm this, I would like to cite the following rules from the XML Schema 1.1 Structures specification, section 3.4.4.2 Element Locally Valid (Complex Type):
For an element information item E to be locally ·valid· with respect to a complex type definition T all of the following must be true:
1
2
3
...
6 E is ·valid· with respect to each of the assertions in T.{assertions} as per Assertion Satisfied (§3.13.4.1).

We can infer from the above rules of the XSD 1.1 spec that an XML instance element is valid according to an XSD complex type definition only if, in addition to satisfying the other complex type validation rules, it is valid with respect to each of the assertions present on that complex type.

Sunday, March 1, 2020

XML Schema 1.1 <alternative> use cases with <choice> and <attribute>

When using XML Schema (XSD) 1.1, many use cases that can be solved with XSD 1.1 <assert> can also be solved with XSD 1.1 <alternative> (and vice versa). This is usually the case when the XML child content model of an element depends on the values of that element's attributes. The first example of my previous blog post illustrates this. Given the same XML input examples as in that first example, the following XML Schema 1.1 document using <alternative> is also a possible solution:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:alternative test="xs:boolean(@isB) eq true()">
          <xs:complexType>
             <xs:sequence>
                <xs:element name="b" type="xs:string"/>
             </xs:sequence>
             <xs:attribute name="isB" type="xs:boolean" use="required"/>
          </xs:complexType>
       </xs:alternative>
       <xs:alternative>
          <xs:complexType>
             <xs:choice>
                <xs:element name="a" type="xs:string"/>
                <xs:element name="c" type="xs:string"/>
             </xs:choice>
             <xs:attribute name="isB" type="xs:boolean" use="required"/>
          </xs:complexType>
       </xs:alternative>
    </xs:element>

</xs:schema>
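
To illustrate how the cast in the first <alternative> behaves, here is an assumed instance (an added sketch, not from the original examples). Since xs:boolean(..) accepts the lexical forms "true" and "1" alike, this document should select the first alternative and be valid.

```xml
<!-- Expected to be valid: xs:boolean('true') eq true(), so the first alternative applies -->
<X isB="true">
    <b>some string</b>
</X>
```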

The question then arises: for these use cases, should we use XSD 1.1 <assert> or <alternative>? Below are the pros and cons, in my view:
1) An XSD 1.1 solution using <assert> has fewer lines of code than the one using <alternative>, which many would consider a benefit.
2) I personally prefer the XPath expression '@isB = true()' (within 'if (@isB = true()) then b else not(b)') of an <assert> over 'xs:boolean(@isB) eq true()' in an <alternative>. In the <alternative> example, the attribute node 'isB' has a type annotation of xs:untypedAtomic, which requires an explicit cast with xs:boolean(..). I tend to prefer XPath expressions that don't use explicit casts (since such XPath expressions look more schema aware).
3) One benefit I see in the solution using XSD 1.1 <alternative> over <assert> is better error diagnostics in case of XML validation errors.

Saturday, February 15, 2020

XML Schema 1.1 <assert> use cases with <choice> and <attribute>

I've been thinking about what the useful use cases of the XML Schema (XSD) 1.1 <assert> construct could be.

According to the XSD 1.1 structures specification, "assertion components constrain the existence and values of related XML elements and attributes".

One useful use case for XSD 1.1 <assert> is to constrain the standard behavior of the XSD 1.0 / 1.1 <choice> construct. I'll attempt to write something about this in this blog post.

Below is an XSD schema example using the <choice> construct, which is correct for both the 1.0 and 1.1 versions of the XSD language:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:choice>
             <xs:element name="a" type="xs:string"/>
             <xs:element name="b" type="xs:string"/>
             <xs:element name="c" type="xs:string"/>
          </xs:choice>
       </xs:complexType>
    </xs:element>

</xs:schema>

The above schema document ensures that the following XML instance documents are valid:

<X>
    <a>some string</a>
</X>

,

<X>
    <b>some string</b>
</X>

,

<X>
    <c>some string</c>
</X>

(essentially showing that element 'X' can have exactly one of the elements 'a', 'b' or 'c' as a child element)

Let's see how the above XSD example can be modified using the XSD elements <attribute> and <assert>. Below is such a modified XSD document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:choice>
             <xs:element name="a" type="xs:string"/>
             <xs:element name="b" type="xs:string"/>
             <xs:element name="c" type="xs:string"/>
          </xs:choice>
          <xs:attribute name="isB" type="xs:boolean" use="required"/>
          <xs:assert test="if (@isB = true()) then b else not(b)"/>
       </xs:complexType>
    </xs:element>

</xs:schema>

The complete meaning of the above XSD document is as follows:
1) The <choice> with three <element> declarations below it expresses essentially the same constraints as the earlier XSD document.
2) This schema additionally specifies a mandatory boolean-typed attribute named 'isB'.
3) The <assert> specifies that if the value of attribute 'isB' is true, then element 'b' must be present as a child of element 'X'. If the value of attribute 'isB' is false, then element 'X' cannot have element 'b' as its child, but one of the elements 'a' or 'c' would be a valid child of element 'X'.

The following XML instance documents are valid according to the above XSD document:

<X isB="1">
  <b>some string</b>
</X>

,

<X isB="0">
  <a>some string</a>
</X>

,

<X isB="0">
  <c>some string</c>
</X>

And the following XML instance documents are invalid according to the same XSD document:

<X isB="0">
  <b>some string</b>
</X>

,

<X isB="1">
  <a>some string</a>
</X>

,

<X isB="0">
  <d>some string</d>
</X>

Now let's consider another XSD example, where the schema document specifies a choice between three sequences. Below is such a schema document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="X">
       <xs:complexType>
          <xs:choice>
             <xs:sequence>
                <xs:element name="a" type="xs:string"/>
                <xs:element name="b" type="xs:string"/>
             </xs:sequence>
             <xs:sequence>
                <xs:element name="p" type="xs:string"/>
                <xs:element name="q" type="xs:string"/>
             </xs:sequence>
             <xs:sequence>
                <xs:element name="x" type="xs:string"/>
                <xs:element name="y" type="xs:string"/>
             </xs:sequence>
          </xs:choice>
          <xs:attribute name="isSeqTwo" type="xs:boolean" use="required"/>
          <xs:assert test="if (@isSeqTwo = true()) then p else not(p)"/>
       </xs:complexType>
    </xs:element>

</xs:schema>

The complete meaning of the above XSD document is as follows:
1) A <choice> is specified between three <sequence> elements. Therefore, element 'X' can have one of the following sequences as its children: {a, b}, {p, q} or {x, y}.
2) This schema additionally specifies a mandatory boolean-typed attribute named 'isSeqTwo'.
3) The <assert> specifies that if the value of attribute 'isSeqTwo' is true, then the sequence {p, q} must be present as the children of element 'X'. If the value of attribute 'isSeqTwo' is false, then element 'X' cannot have the sequence {p, q} as its children, but one of the sequences {a, b} or {x, y} would be valid.
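
The same condition can also be phrased without "if". The fragment below is a sketch of an assumed equivalent <assert>, relying on the XPath 2.0 function exists(); it states that the attribute is true exactly when "p" is present.

```xml
<!-- Assumed equivalent of: if (@isSeqTwo = true()) then p else not(p) -->
<xs:assert test="(@isSeqTwo = true()) = exists(p)"/>
```

It would take the place of the <xs:assert> inside the complex type above; which form reads better is largely a matter of taste.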

The following XML instance documents are valid according to the above XSD document:

<X isSeqTwo="1">
  <p>string1</p>
  <q>string2</q>
</X>

,

<X isSeqTwo="0">
  <a>string1</a>
  <b>string2</b>
</X>

,

<X isSeqTwo="0">
  <x>string1</x>
  <y>string2</y>
</X>

And the following XML instance documents are invalid according to the same XSD document:

<X isSeqTwo="0">
  <p>string1</p>
  <q>string2</q>
</X>

,

<X isSeqTwo="1">
  <a>string1</a>
  <b>string2</b>
</X>

,

<X isSeqTwo="0">
  <i>string1</i>
  <j>string2</j>
</X>


All the above examples, and any other XSD 1.0/1.1 constructs, may be used with any standards-compliant XSD validator that supports the corresponding version of the language (the <assert> examples require XSD 1.1).

That's about all I wanted to say about this topic.

Sunday, July 29, 2018

Co-occurrence constraints and Conditional Type Assignment, with XML Schema 1.1

With respect to the following article that I wrote for XML.com: https://www.xml.com/articles/2018/05/29/co-occurrence-cta-xsd/ , I wish to add a few more points via this blog post, as mentioned below.

1) Using "if" control expressions in an XSD <assert>

The XPath 2.0 "if" expression is a useful facility for an <assert>. I'll explain this via an XML Schema validation example, shown below.

XML Schema 1.1 document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="X">
     <xs:complexType>
        <xs:sequence>
           <xs:choice>
              <xs:element name="b" type="xs:integer"/>
              <xs:element name="c" type="xs:integer"/>
           </xs:choice>
           <xs:element name="a" type="xs:integer"/>
        </xs:sequence>
        <xs:attribute name="el" use="required">
           <xs:simpleType>
              <xs:restriction base="xs:string">
                 <xs:enumeration value="b"/>
                 <xs:enumeration value="c"/>
              </xs:restriction>
           </xs:simpleType>
        </xs:attribute>
        <xs:assert test="if (@el = 'b') then b else c"/>
     </xs:complexType>
  </xs:element>

</xs:schema>

(The XSD <choice> specifies that either element "b" or element "c" should occur. This is further controlled by an <assert> constraint, which specifies that if the value of attribute "el" is 'b' then element "b" should occur; otherwise element "c" should occur for the <choice>.)

Below are a few XML instance documents that can be validated with the above XSD document:

<X el="b">
  <b>100</b>
  <a>200</a>
</X>

(valid document)

<X el="b">
  <c>100</c>
  <a>200</a>
</X>

(invalid document: since attribute "el" is 'b', the <assert> requires the preceding sibling of element "a" to be element "b")

<X el="p">
  <b>100</b>
  <a>200</a>
</X>

(invalid document, since the value of attribute "el" does not conform to the simpleType definition of attribute "el". The <assert> would also fail.)
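
For completeness, here is a sketch of the remaining valid case (an added instance, not one of the original examples). With "el" set to 'c', the <assert> takes its "else" branch and requires element "c".

```xml
<!-- Expected to be valid: @el is 'c', and element "c" precedes element "a" -->
<X el="c">
  <c>100</c>
  <a>200</a>
</X>
```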

2) Using XSD <alternative> instead of <assert>

The XSD example in point 1) above can easily be converted to an XSD document that uses <alternative> to solve the same use case. Below is such a modified XSD 1.1 document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="X">
     <xs:alternative test="@el = 'b'" type="B_type"/>
     <xs:alternative test="@el = 'c'" type="C_type"/>
     <xs:alternative type="xs:error"/>
  </xs:element>
  
  <xs:complexType name="B_type">
     <xs:sequence>
        <xs:element name="b" type="xs:integer"/>
        <xs:element name="a" type="xs:integer"/>
     </xs:sequence>
     <xs:attribute name="el" type="xs:string" use="required"/>
  </xs:complexType>
  
  <xs:complexType name="C_type">
     <xs:sequence>
        <xs:element name="c" type="xs:integer"/>
        <xs:element name="a" type="xs:integer"/>
     </xs:sequence>
     <xs:attribute name="el" type="xs:string" use="required"/>
  </xs:complexType>

</xs:schema>
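
In this version, an unexpected attribute value is rejected by the third <alternative> rather than by an enumeration. As an illustrative sketch (an added instance), the document below matches neither test, so "X" is assigned the type xs:error, which no element can satisfy.

```xml
<!-- Expected to be invalid: @el is neither 'b' nor 'c', so xs:error is selected -->
<X el="p">
  <b>100</b>
  <a>200</a>
</X>
```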


Wednesday, February 28, 2018

XML Schema 1.1 enhancements for simpleType with variety list

In this blog post, I wish to express my thoughts about the new features that XML Schema 1.1 introduces for defining XSD simpleType lists. I'd like to present a few XML Schema validation examples here illustrating them.

Example 1: Using the <xs:assertion> facet to enforce sorted order on the list data.

Here's the XSD 1.1 document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" vc:minVersion="1.1"
                    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="a" type="xs:integer"/>
            <xs:element name="b">
               <xs:simpleType>
                  <xs:restriction base="IntegerList">                   
                     <xs:assertion test="every $val in (for $x in 1 to count($value)-1 return ($value[$x] le $value[$x+1])) 
                                         satisfies ($val eq true())">
                        <xs:annotation>
                           <xs:documentation>
                              Assertion facet checking that, items in the list are in ascending sorted order.
                           </xs:documentation>
                        </xs:annotation>
                     </xs:assertion>
                   </xs:restriction>               
               </xs:simpleType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <xs:simpleType name="IntegerList">
      <xs:list itemType="xs:integer"/>
   </xs:simpleType>

</xs:schema>

A valid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>-20 1 2 3</b>
</X>

(the integer list in element "b", is sorted)

An invalid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>-20 1 2 3 1</b>
</X>

(the integer list in element "b", is not in a sorted order)
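
The sortedness test can also be written more compactly. The fragment below is a sketch of an assumed equivalent <xs:assertion>, which quantifies directly over the item positions instead of first building a sequence of booleans; for a list with a single item the range is empty and the test is true.

```xml
<!-- Assumed simplification: compare each item directly with its successor -->
<xs:assertion test="every $i in 1 to count($value) - 1
                    satisfies $value[$i] le $value[$i + 1]"/>
```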

Example 2: Using the <xs:assertion> facet to enforce the size of the list using relational operators other than equality (XSD 1.0 already supports equality through the xs:length facet, and inclusive lower and upper bounds through xs:minLength and xs:maxLength).

Here's the XSD 1.1 document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" vc:minVersion="1.1"
                    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="a" type="xs:integer"/>
            <xs:element name="b">
               <xs:simpleType>
                  <xs:restriction base="IntegerList">                   
                     <xs:assertion test="count($value) lt 10">
                        <xs:annotation>
                           <xs:documentation>
                              Assertion facet checking that, cardinality/size of list should be less than 10.
                           </xs:documentation>
                        </xs:annotation>
                     </xs:assertion>
                   </xs:restriction>               
               </xs:simpleType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>

   <xs:simpleType name="IntegerList">
      <xs:list itemType="xs:integer"/>
   </xs:simpleType>

</xs:schema>

A valid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>-20 1 2 3</b>
</X>

(the integer list in element "b", has less than 10 items)

An invalid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>-20 1 2 3 1 1 1 1 1 1 1</b>
</X>

(the integer list in element "b" has 11 items, which is not less than 10)
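
The boundary case is worth spelling out. In the following sketch (an added instance), the list has exactly 10 items, so count($value) lt 10 is false and the document would be expected to be invalid as well.

```xml
<?xml version="1.0"?>
<!-- Expected to be invalid: exactly 10 items, and 10 is not less than 10 -->
<X>
   <a>100</a>
   <b>1 2 3 4 5 6 7 8 9 10</b>
</X>
```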

Example 3: Using the <xs:assertion> facet, to enforce that each item of list must be an even number.

Here's the XSD 1.1 document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" vc:minVersion="1.1"
                    xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning">

   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="a" type="xs:integer"/>
            <xs:element name="b">
               <xs:simpleType>              
                  <xs:list>
                     <xs:annotation>
                        <xs:documentation>
                           The simpleType definition below, is itemType of this list.
                        </xs:documentation>
                     </xs:annotation>
                     <xs:simpleType>
                        <xs:annotation>
                           <xs:documentation>
                              Every item of list must be an even number.
                           </xs:documentation>
                        </xs:annotation>
                        <xs:restriction base="xs:integer">
                           <xs:assertion test="$value mod 2 = 0"/>
                        </xs:restriction>
                     </xs:simpleType>
                  </xs:list>
               </xs:simpleType>
            </xs:element>
         </xs:sequence>
      </xs:complexType>
   </xs:element>

</xs:schema>

A valid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>2 4 6</b>
</X>

(the integer list in element "b", has each item as even number)

An invalid XML document, when validated by the above schema document:

<?xml version="1.0"?>
<X>
   <a>100</a>
   <b>2 1 6</b>
</X>

(the integer list in element "b" has at least one item that is not even)
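
Because the assertion here sits on the item type, it is checked once per item and says nothing about order or sign. As a sketch (an added instance), the following unsorted list containing zero and a negative number would be expected to be valid, since each item satisfies $value mod 2 = 0.

```xml
<?xml version="1.0"?>
<!-- Expected to be valid: 0, -4 and 10 are all even; order is not constrained -->
<X>
   <a>100</a>
   <b>0 -4 10</b>
</X>
```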

As illustrated by the above examples, some new XML Schema validation scenarios become possible with the introduction of the <xs:assertion> facet on XSD simple types in version 1.1 of the XML Schema language.

2018-04-11: I've been thinking about this post for a while, and wish to say something about sorting. I think sorting, as a programming problem, is more meaningful to solve when we have to *do* the sorting, and not when we merely have to determine whether some list is sorted. What my XSD Example 1 above shows is *not doing* sorting, but the latter. Nevertheless, I'm happy to discover what my XSD shows.

Monday, August 7, 2017

Mathematical table data with XML Schema 1.1

Here's a simple example that uses XML Schema 1.1 <assert> to validate elementary school mathematical (multiplication) tables.

XML document:
<?xml version="1.0"?>
<table id="2">
  <x>2</x>
  <x>4</x>
  <x>6</x>
  <x>8</x>
  <x>10</x>
  <x>12</x>
  <x>14</x>
  <x>16</x>
  <x>18</x>
  <x>20</x>
</table>

XSD 1.1 document:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 
   <xs:element name="table">
     <xs:complexType>
        <xs:sequence>
           <xs:element name="x" minOccurs="10" maxOccurs="10"/>
        </xs:sequence>
        <xs:attribute name="id" type="xs:positiveInteger" use="required">
           <xs:annotation>
             <xs:documentation>Mathematical table of @id is represented.</xs:documentation>
           </xs:annotation>
        </xs:attribute>
        <xs:assert test="x[1] = @id"/>
        <xs:assert test="every $x in x[position() gt 1] satisfies $x = $x/preceding-sibling::x[1] + @id">
           <xs:annotation>
              <xs:documentation>An XPath 2.0 expression validating the depicted mathematical table.    
              </xs:documentation>
           </xs:annotation>
        </xs:assert>
     </xs:complexType>
   </xs:element>
 
</xs:schema>
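
The first <assert> pins the first "x" to the value of @id, and the second requires every later "x" to equal its preceding sibling plus @id. As a sketch of a failing case (an added instance, not from the original post), the table below has a wrong eighth entry, so the second <assert> would be expected to fail.

```xml
<?xml version="1.0"?>
<!-- Expected to be invalid: 25 is not 21 + 3 (and 27 is not 25 + 3) -->
<table id="3">
  <x>3</x>
  <x>6</x>
  <x>9</x>
  <x>12</x>
  <x>15</x>
  <x>18</x>
  <x>21</x>
  <x>25</x>
  <x>27</x>
  <x>30</x>
</table>
```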

Tuesday, August 1, 2017

Great write up on XML Schema 1.1

On this page, http://www.xfront.com/xml-schema-1-1/ , Roger L. Costello has posted a wonderful write-up on XML Schema 1.1 technology. Enthusiasts are encouraged to read it.

Roger's language is very simple, and the write-up covers almost everything from the perspective of an XML Schema 1.1 user's needs.

Thursday, July 13, 2017

XML Schema 1.1 inheritable attributes

I've been meaning to write a small and complete example clarifying the role of XML Schema 1.1 inheritable attributes (please see inheritable = boolean within the definition "XML Representation Summary: attribute Element Information Item" in the XSD 1.1 specification).

XSD 1.1 inheritable attributes are primarily useful when implementing XSD Type Alternatives.

Below is an XSD 1.1 example and two corresponding valid XML instance documents. The example as a whole uses inheritable attributes and a few other areas of XSD 1.1.

XSD document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 
   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="Y">
               <xs:alternative test="@y = 'A'">
                  <xs:complexType>
                     <xs:sequence>
                        <xs:element name="A">
                           <xs:complexType>
                              <xs:simpleContent>
                                 <xs:extension base="xs:string">
                                    <xs:attribute name="attr" type="xs:integer"/>
                                 </xs:extension>
                              </xs:simpleContent>                            
                           </xs:complexType>
                        </xs:element>
                     </xs:sequence>
                  </xs:complexType>
               </xs:alternative>
               <xs:alternative test="@y = 'B'">
                  <xs:complexType>
                     <xs:sequence>
                        <xs:element name="B">
                           <xs:complexType>
                              <xs:simpleContent>
                                 <xs:extension base="xs:string">
                                    <xs:attribute name="attr" type="xs:integer"/>
                                 </xs:extension>
                              </xs:simpleContent>                            
                           </xs:complexType>
                        </xs:element>
                     </xs:sequence>
                  </xs:complexType>            
               </xs:alternative>
            </xs:element>
         </xs:sequence>
         <xs:attribute name="y" inheritable="true">
            <xs:simpleType>
               <xs:restriction base="xs:string">
                  <xs:enumeration value="A"/>
                  <xs:enumeration value="B"/>
               </xs:restriction>
            </xs:simpleType>
         </xs:attribute>
      </xs:complexType>
   </xs:element>
 
</xs:schema>


XML document 1:

<?xml version="1.0"?>
<X y="A">
  <Y>
    <A attr="1">hello</A>
  </Y>
</X>

XML document 2:

<?xml version="1.0"?>
<X y="B">
  <Y>
    <B attr="1">hello</B>
  </Y>
</X>

The XML instance documents shown above are both valid according to the XSD document provided.

Following is a brief explanation of the semantics of this example:
An inheritable attribute "y" is defined on element "X". The XSD type of element "Y" is chosen at runtime (i.e. at validation time), depending on the value of this attribute. Please notice how declaring the attribute as inheritable on element "X" makes it accessible to the <alternative> tests of the descendant element "Y" in the schema document, even though "Y" itself carries no such attribute.
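
To see the inherited attribute at work in a failing case, consider this sketch (an added instance, not from the original post). The inherited value 'A' selects the first alternative for "Y", whose content model requires a child "A", so a child "B" would be expected to make the document invalid.

```xml
<?xml version="1.0"?>
<!-- Expected to be invalid: y="A" selects the type that requires child "A", not "B" -->
<X y="A">
  <Y>
    <B attr="1">hello</B>
  </Y>
</X>
```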

I hope that this blog post is useful.

Thursday, June 1, 2017

XPath 2.0 atomization with XML Schema 1.1 validation

XPath 2.0 atomization, as it applies to XML Schema 1.1 validation, is worth looking at. I'll attempt to write something about this topic in this blog post.

Let's look at the following XML Schema 1.1 validation example, which we'll use to discuss this topic.

XSD 1.1 document:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 
   <xs:element name="X">
      <xs:complexType>
         <xs:sequence>
            <xs:element name="a" type="xs:integer"/>
            <xs:element name="b" type="xs:integer"/>
         </xs:sequence>
         <xs:assert test="a gt b"/>
      </xs:complexType>
   </xs:element>
 
</xs:schema>

XML instance document that is validated with above mentioned XSD document:

<?xml version="1.0"?>
<X>
  <a>4</a>
  <b>7</b>
</X>

Upon XML Schema validation, the above XML instance document is reported as invalid, because the numeric value of "a" is less than that of "b". Now, what is the XPath 2.0 atomization in this example that I wish to talk about?

Since the XML document has been validated with the given XSD document, the nodes of the XPath data model tree built to evaluate the <assert> are annotated with the XSD types declared in the schema. Therefore, when the <assert> XPath 2.0 expression "a gt b" is evaluated, the XSD types of elements "a" and "b" are available at runtime on the corresponding tree nodes. In XPath 2.0 terms, the expression "a gt b" is evaluated on the values that result from atomizing the nodes for XML elements "a" and "b". We can't test a greater-than or less-than relation on XML nodes themselves, but we can on numbers, for example, and the conversion of XML nodes to atomic values such as numbers is what XPath 2.0 atomization achieves.
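
The role of the type annotations becomes visible with values whose numeric and string orderings differ. In the sketch below (an added instance), atomization yields the xs:integer values 10 and 7, so "a gt b" is true and the document would be expected to be valid. Had the nodes been untyped, the "gt" value comparison would have treated the operands as strings, and "10" does not sort after "7".

```xml
<?xml version="1.0"?>
<!-- Expected to be valid: compared as xs:integer values, 10 gt 7 is true -->
<X>
  <a>10</a>
  <b>7</b>
</X>
```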

I've used Apache Xerces as an XSD 1.1 validator, for testing examples for this blog post.

Sunday, December 18, 2016

XML Schema 1.1 type alternative patterns


Please note that XML Schema 1.1 type alternatives allow selection of an XML Schema type based on conditions over an element's attributes.

I think the use of XML Schema 1.1 type alternatives falls into the following two broad patterns:

1) When an attribute is mandatory

The following XML Schema document illustrates this use:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
      <xs:alternative test="@a = 'val1'" type="XType1"/>
      <xs:alternative test="@a = 'val2'" type="XType2"/>
      <xs:alternative type="XType3"/>
   </xs:element>
  
   <xs:complexType name="XType1">
      <xs:sequence>
        <xs:element name="A" type="xs:integer"/>
        <xs:element name="B" type="xs:integer"/>
      </xs:sequence>
      <xs:attribute name="a" type="xs:string" use="required"/>
   </xs:complexType>
  
   <xs:complexType name="XType2">
      <xs:sequence>
         <xs:element name="C" type="xs:integer"/>
         <xs:element name="D" type="xs:integer"/>
      </xs:sequence>
      <xs:attribute name="a" type="xs:string" use="required"/>
   </xs:complexType>
  
   <xs:complexType name="XType3">
      <xs:sequence>
         <xs:element name="P" type="xs:integer"/>
         <xs:element name="Q" type="xs:integer"/>
      </xs:sequence>
      <xs:attribute name="a" type="xs:string" use="required"/>
   </xs:complexType>
 
</xs:schema>

Following are some XML instance documents that are valid according to the above schema document.

<X a="val1">
  <A>1</A>
  <B>2</B>
</X>

i.e. when the attribute's value is "val1", the complex type XType1 is assigned to X.

<X a="val2">
  <C>1</C>
  <D>2</D>
</X>

i.e. when the attribute's value is "val2", the complex type XType2 is assigned to X.

and

<X a="something else">
  <P>1</P>
  <Q>2</Q>
</X>

i.e. when the attribute's value is anything other than "val1" or "val2", the complex type XType3 is assigned to X.
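
As a sketch of a failing case for this pattern (an added instance), the document below selects XType1 through a="val1" but supplies the children of XType2, so it would be expected to be invalid.

```xml
<!-- Expected to be invalid: a="val1" selects XType1, which requires A and B -->
<X a="val1">
  <C>1</C>
  <D>2</D>
</X>
```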

2) When an attribute is optional

The following XML Schema document illustrates this use:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
      <xs:alternative test="@a" type="XType1"/>
      <xs:alternative test="not(@a)" type="XType2"/>
   </xs:element>
  
   <xs:complexType name="XType1">
      <xs:sequence>
        <xs:element name="A" type="xs:integer"/>
        <xs:element name="B" type="xs:integer"/>
      </xs:sequence>
      <xs:attribute name="a" type="xs:string" use="optional"/>
   </xs:complexType>
  
   <xs:complexType name="XType2">
      <xs:sequence>
         <xs:element name="C" type="xs:integer"/>
         <xs:element name="D" type="xs:integer"/>
      </xs:sequence>
      <xs:attribute name="a" type="xs:string" use="optional"/>
   </xs:complexType>
 
</xs:schema>

Following are some XML instance documents that are valid according to the above schema document.

<X a="something1">
  <A>1</A>
  <B>2</B>
</X>

i.e. when the attribute is present, the complex type XType1 is assigned to X.

and

<X>
  <C>1</C>
  <D>2</D>
</X>

i.e. when the attribute is absent, the complex type XType2 is assigned to X.
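
A corresponding failing case for this pattern can be sketched as follows (an added instance). Without the attribute, the test not(@a) selects XType2, so the children of XType1 would be expected to make the document invalid.

```xml
<!-- Expected to be invalid: @a is absent, so XType2 (C, D) is selected -->
<X>
  <A>1</A>
  <B>2</B>
</X>
```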

I hope that this post is useful.

Sunday, November 13, 2016

XML Schema : <assert> helps us process wild-cards and attributes

Please let me illustrate my point with the following XML Schema (1.1) example:

XML Schema document:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="X">
     <xs:complexType>
        <xs:sequence>
           <xs:any processContents="skip" minOccurs="3" maxOccurs="3"/>
        </xs:sequence>
        <xs:attribute name="y1" type="xs:string"/>
        <xs:attribute name="y2" type="xs:string"/>
        <xs:attribute name="y3" type="xs:string"/>
        <xs:assert test="deep-equal(for $el in * return name($el), for $at in @* return name($at))"/>
     </xs:complexType>
   </xs:element>
 
</xs:schema>

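This schema document says the following:
1) A wild-card requires exactly 3 element nodes.
2) There are 3 attribute declarations, matching the number of elements.
3) The <assert> says that the names of the elements validated by the wild-card must be the same as the names of the attributes present, in the same order (the comparison therefore relies on the XPath processor returning the attributes in the order written, which XPath leaves implementation-dependent).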

Here's a valid XML document for the above schema document:
<?xml version="1.0" encoding="UTF-8"?>
<X y1="A" y2="B" y3="C">
  <y1>A</y1>
  <y2>B</y2>
  <y3>C</y3>
</X>
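
For contrast, here is a sketch of a document that the <assert> would be expected to reject (an added instance). The wild-card accepts any three elements, but the third element name has no counterpart among the attribute names, so deep-equal() returns false.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Expected to be invalid: element "z3" has no attribute of the same name -->
<X y1="A" y2="B" y3="C">
  <y1>A</y1>
  <y2>B</y2>
  <z3>C</z3>
</X>
```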

Reference to XPath 2.0 language (for <assert> path expressions) : https://www.w3.org/TR/xpath20/.

I hope this example is useful.

Friday, October 28, 2016

XML Schema 1.1 : assertion refines enumeration

In this post, I'll try to explain how the XML Schema 1.1 <assertion> facet refines the XML Schema <enumeration> facet in useful ways, and also helps us build a useful domain vocabulary written in the XML Schema language.

Consider the following XML Schema 1.1 document:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

   <xs:element name="day" type="Day"/>
  
   <xs:element name="workday" type="WorkDay"/>
  
   <xs:element name="holiday" type="Holiday"/>
  
   <xs:simpleType name="WorkDay">
      <xs:restriction base="Day">
         <xs:assertion test="$value != 'saturday' and $value != 'sunday'"/>
      </xs:restriction>
   </xs:simpleType>
  
   <xs:simpleType name="Holiday">
     <xs:restriction base="Day">
        <xs:assertion test="$value = 'saturday' or $value = 'sunday'"/>
     </xs:restriction>
   </xs:simpleType> 
  
   <xs:simpleType name="Day">
      <xs:restriction base="xs:string">
         <xs:enumeration value="monday"/>
         <xs:enumeration value="tuesday"/>
         <xs:enumeration value="wednesday"/>
         <xs:enumeration value="thursday"/>
         <xs:enumeration value="friday"/>
         <xs:enumeration value="saturday"/>
         <xs:enumeration value="sunday"/>
      </xs:restriction>
   </xs:simpleType>
 
</xs:schema>

This is a very simple XSD document, and the XML documents validated by this XSD document will also be very simple.

The following is one invalid document for the given schema (the value is a valid "Day", but it fails the <assertion> of type "Holiday"):
<holiday>wednesday</holiday>
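
By contrast, a value that passes both the <enumeration> of "Day" and the <assertion> of "Holiday" would be expected to be valid, as in this sketch (an added instance):

```xml
<!-- Expected to be valid: 'sunday' is an enumerated Day and satisfies the Holiday assertion -->
<holiday>sunday</holiday>
```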

In my opinion, this example illustrates how the <assertion> facet refines the <enumeration> facet during simple type derivation, and also helps us clearly build a fine domain vocabulary. The types "WorkDay" and "Holiday" are verbally linked to the type "Day" in the XML Schema document, as this example shows.