Wednesday, June 9, 2021

XML Schema xsi:type and xs:alternative

After studying XML Schema's xsi:type attribute and the xs:alternative element (introduced in XML Schema 1.1) in some depth, I've come to the conclusion that there are a lot of functional similarities between xsi:type and xs:alternative, and of course differences as well. To illustrate these points, I've come up with the following XML Schema and XML instance document examples, which I shall also attempt to explain in this blog post.


XML Schema document 1 (conforming to XSD 1.1)

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType"/>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Following are three XML document instances that are valid against the XML Schema document above:

XML document instance 1

<note>

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 2

<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType2">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 3

<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType3">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

The "XML document instance 1" carries no xsi:type attribute, so its "note" element is validated against the XSD element declaration and its declared type "NoteType".

The "XML document instance 2" asserts, via the xsi:type attribute, that the "note" element must be validated against the type "NoteType2".

The "XML document instance 3" asserts, via the xsi:type attribute, that the "note" element must be validated against the type "NoteType3", which additionally requires the "to" and "from" values to be email addresses.
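
To see the effect of the assertions in "NoteType3", here is a small illustrative instance (my own sketch, not one of the numbered examples above). I'd expect an XSD 1.1 validator to reject it against "XML Schema document 1", since the value of "to" doesn't match the "emailAddress" pattern and the first xs:assert therefore evaluates to false:

```xml
<!-- Hypothetical invalid instance: the value of "to" is not an email address -->
<note isConfidential="true" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="NoteType3">
   <to>abc pqr</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```

The same instance with xsi:type="NoteType2" would be valid, because "NoteType2" carries no assertions.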

Note that, as per the XML Schema language, the XSD type named as the value of the xsi:type attribute must be validly substitutable for the declared type of the XML element (i.e., the type associated with the element within the XML schema). Broadly, a type S is validly substitutable for a type T if S is the same type as T or is derived from T, and that derivation has not been blocked within the schema.
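
As a sketch of this rule, the following instance (again a hypothetical one of mine) names "emailAddress" as the xsi:type. That type is defined in "XML Schema document 1", but it is a simple type derived from xs:string and not from "NoteType". It is therefore not validly substitutable for the declared type, and I'd expect the instance to be reported as invalid:

```xml
<!-- Hypothetical invalid instance: emailAddress is not derived from NoteType -->
<note xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="emailAddress">
   <to>abc.pqr@gmail.com</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```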


Now consider another XML Schema document, as follows:

XML Schema document 2 (conforming to XSD 1.1)

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType">

       <xs:alternative test="@noteType2 = true()" type="NoteType2"/>

       <xs:alternative test="@noteType3 = true()" type="NoteType3"/>

    </xs:element>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType2" type="xs:boolean" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType3" type="xs:boolean" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Following are two XML document instances that are valid against the XML Schema document above:

XML document instance 4

<note isConfidential="true" noteType2="true">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

XML document instance 5

<note isConfidential="true" noteType3="true">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>


I think that XML Schema documents 1 and 2, as illustrated in the examples above, solve the same XML document validation problem, but in two different ways. With the XSD element xs:alternative, we need to introduce a new physical XML attribute such as "noteType2" or "noteType3" within the XML instance document, and the schema selects the type from that attribute's value. With xsi:type, the XML instance document itself names the type, and no such additional attribute is needed.
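
One consequence worth spelling out (the two instances below are my own illustrations, not part of the numbered examples): with "XML Schema document 2", when none of the xs:alternative tests is true, the element's declared type "NoteType" is used. So I'd expect the following instance, which has neither attribute, to be valid:

```xml
<!-- No noteType2/noteType3 attribute: validated with the declared type NoteType -->
<note>
   <to>abc.pqr@gmail.com</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```

And I'd expect this one to be invalid, because noteType2="false" selects no alternative, and "NoteType" declares no attributes at all:

```xml
<!-- Hypothetical invalid instance: falls back to NoteType, which allows no attributes -->
<note isConfidential="true" noteType2="false">
   <to>abc.pqr@gmail.com</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```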


Following is another XML Schema 1.1 document, which is a small variation of "XML Schema document 2" shown earlier:

XML Schema document 3

<?xml version="1.0"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:element name="note" type="NoteType">

       <xs:alternative test="@noteType = 2" type="NoteType2"/>

       <xs:alternative test="@noteType = 3" type="NoteType3"/>

    </xs:element>

    <xs:complexType name="NoteType">

       <xs:sequence>

          <xs:element name="to" type="xs:string"/>

          <xs:element name="from" type="xs:string"/>

          <xs:element name="heading" type="xs:string"/>

          <xs:element name="body" type="xs:string"/>

       </xs:sequence>

    </xs:complexType>

    <xs:complexType name="NoteType2">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType" type="NoteTypeVal" use="required"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:complexType name="NoteType3">

       <xs:complexContent>

          <xs:extension base="NoteType">

             <xs:attribute name="isConfidential" type="xs:boolean" use="required"/>

             <xs:attribute name="noteType" type="NoteTypeVal" use="required"/>

             <xs:assert test="to castable as emailAddress"/>

             <xs:assert test="from castable as emailAddress"/>

          </xs:extension>

       </xs:complexContent>

    </xs:complexType>

    <xs:simpleType name="emailAddress"> 

       <xs:restriction base="xs:string"> 

         <xs:pattern value="[^@]+@[^@\.]+(\.[^@\.]+)+"/>

       </xs:restriction> 

    </xs:simpleType>

    <xs:simpleType name="NoteTypeVal"> 

       <xs:restriction base="xs:positiveInteger"> 

          <xs:minInclusive value="2"/>

          <xs:maxInclusive value="3"/>

       </xs:restriction> 

    </xs:simpleType>

</xs:schema>


Following are two XML instance documents that are valid against the XML Schema document above:

<note isConfidential="true" noteType="2">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>

<note isConfidential="true" noteType="3">

   <to>abc.pqr@gmail.com</to>

   <from>no-reply@gmail.com</from>

   <heading>hi</heading>

   <body>this is test</body>

</note>


With "XML Schema document 3" specified above, we've defined a single attribute "noteType" on both the types "NoteType2" and "NoteType3". The value of the "noteType" attribute within the XML instance document (2 or 3) determines which XSD type the "note" element is validated against.
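
A side effect of this design is shown in the hypothetical instance below. A "noteType" value other than 2 or 3 matches neither xs:alternative, so the "note" element falls back to the declared type "NoteType". I'd expect the instance to be invalid, though not because of the range facets of "NoteTypeVal". Rather, "NoteType" doesn't declare the attributes "isConfidential" and "noteType" at all:

```xml
<!-- Hypothetical invalid instance: noteType="4" selects no alternative -->
<note isConfidential="true" noteType="4">
   <to>abc.pqr@gmail.com</to>
   <from>no-reply@gmail.com</from>
   <heading>hi</heading>
   <body>this is test</body>
</note>
```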

Also note that the XML Schema 1.1 specification places the following constraint on type alternatives (i.e., when xs:alternative elements are present within an element declaration):

Each type T named by the sibling xs:alternative elements must be validly substitutable for the element's declared type definition (in practice, the declared type itself or a type derived from it, which is a constraint similar to the one for xsi:type), or T must be the type xs:error.
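
As a sketch of the xs:error case, the element declaration of "XML Schema document 3" could be extended with a final xs:alternative that has no test attribute and therefore acts as the default. This is my assumption about how one might tighten the schema, not something the examples above require. The fragment is meant to sit inside the xs:schema element, in place of the original "note" declaration:

```xml
<xs:element name="note" type="NoteType">
   <xs:alternative test="@noteType = 2" type="NoteType2"/>
   <xs:alternative test="@noteType = 3" type="NoteType3"/>
   <!-- default alternative (no test attribute): everything else gets xs:error -->
   <xs:alternative type="xs:error"/>
</xs:element>
```

With this variation, any "note" whose "noteType" is neither 2 nor 3 (including a "note" without that attribute) is assigned the type xs:error and is therefore invalid, instead of silently falling back to "NoteType".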