Saturday, April 30, 2011

XML Schema: facets constraining the cardinality of simpleType->list

I thought I should write a little clarification of a point I mentioned in my blog post, http://mukulgandhi.blogspot.com/2010/10/xsd-11-xml-schema-design-approaches.html.

In that post I seem to have suggested that XML Schema 1.1 assertions are probably necessary to restrict the cardinality of an XML Schema simpleType list instance. That turns out not to be true, as I realized while reading the XML Schema spec lately. The spec allows the following constraining facets on XML Schema simpleTypes with variety list:
[1]
<xs:length ../>
<xs:minLength ../>
<xs:maxLength ../>

(ref, http://www.w3.org/TR/xmlschema11-2/#defn-coss which says, "If {variety} is list, then the applicable facets are assertions, length, minLength, maxLength, pattern, enumeration, and whiteSpace")

These constraining facets [1] on simpleTypes with variety list were available in XML Schema 1.0 too.
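As a minimal sketch of how these facets could be used (the names IntList, BoundedIntList and numbers are made up for illustration, and the bounds 2 and 5 are arbitrary), a list of integers might be limited to between two and five items like this:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical list type: whitespace-separated integers -->
  <xs:simpleType name="IntList">
    <xs:list itemType="xs:integer"/>
  </xs:simpleType>

  <!-- For a list type, minLength and maxLength count list items, not characters -->
  <xs:simpleType name="BoundedIntList">
    <xs:restriction base="IntList">
      <xs:minLength value="2"/>
      <xs:maxLength value="5"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="numbers" type="BoundedIntList"/>

</xs:schema>
```

With such a schema, an instance like <numbers>1 2 3</numbers> should be valid, while <numbers>1</numbers> should be rejected for having too few items. No XSD 1.1 features are involved here, so an XSD 1.0 processor should handle it as well.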

These facets [1] may serve the design purpose I had mentioned in the above cited post. They should probably be even more efficient than assertions, since assertions require compiling the XPath expressions in their "test" attributes and building quite a bit of context information for XPath expression evaluation.

I should also mention that an assertion facet on a simpleType with variety list can still be useful for other purposes (i.e. assertions are not without purpose!), for example as follows:

<xs:assertion test="count($value) mod 2 = 0"/>

(the list instance must have an even number of items)
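For context, here is a sketch of where such an assertion facet could sit in a complete type definition. The names EvenLengthIntList and pairs are hypothetical, and an XSD 1.1 processor is assumed, since xs:assertion is not available in XSD 1.0:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:simpleType name="EvenLengthIntList">
    <xs:restriction>
      <!-- Anonymous base type: a whitespace-separated list of integers -->
      <xs:simpleType>
        <xs:list itemType="xs:integer"/>
      </xs:simpleType>
      <!-- For a list type, $value is the sequence of list items -->
      <xs:assertion test="count($value) mod 2 = 0"/>
    </xs:restriction>
  </xs:simpleType>

  <xs:element name="pairs" type="EvenLengthIntList"/>

</xs:schema>
```

Under this sketch, <pairs>1 2 3 4</pairs> should pass and <pairs>1 2 3</pairs> should fail. A constraint like this cannot be expressed with length, minLength or maxLength alone.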

Thanks for reading this post!

4 comments:

ud said...

Hey Mukul, I am really excited about using XSD 1.1 assertions with Xerces. I was using Schematron, but adoption is very low. One thing that would be nice is, instead of having the xerces:message="" attribute be a static string, to have it contain a message key and a list of parameters that are XPath expressions. This would allow for internationalization of the messages, and for passing values back to the message. For example: xerces:messagekey="messages.endDateMustBeAfterStartDate" xerces:messageXpathParam="/startDate", with the English translation of the message being "The end date must be after {1}", and with messageXpathParam resolving to the actual date value of the XPath expression and replacing {1} in the final message.

Mukul Gandhi said...

Thanks for your comment.

Something like what you've proposed is already available in Xerces. But it's only for the xs:assertion facet; nothing like what you're suggesting is available for xs:assert.

Here's a summary of the currently available support:

This XSD document,

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xerces="http://xerces.apache.org">

  <xs:element name="X">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:assertion test="$value mod 2 = 0" xerces:message="The value {$value} is not divisible by 2"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>

is intended to validate the following XML document,

<X>3</X>

Since the above XML document is invalid as per the schema given, the following error message is produced by Xerces:

Assertion failed for schema type '#AnonType_X'. The value 3 is not divisible by 2.

The instance document value 3 replaces the parameter {$value} (a built-in Xerces keyword for this case) in the xs:assertion error string. This facility is currently available only for the xs:assertion facet, and has been tested only for atomic values.

If we don't specify xerces:message in the above xs:assertion, the error message would look like,

Assertion evaluation ('$value mod 2 = 0') for element 'X' on schema type '#AnonType_X' did not succeed.

This has the full XPath expression in the error message.

The xs:assert element behaves similarly, but it can only have a static custom error message, or, when the xerces:message attribute is not used, an error message containing the full XPath expression. I believe the problem with XPath expressions in assert error messages is that they can sometimes be lengthy and can make the error messages hard to interpret.

I think we have to make a trade-off for assert error messages. For small XPath expressions, having them in error messages should be fine. For lengthy assert XPath expressions, one could use the xerces:message facility, if working with Xerces, to have shorter user-defined assert failure messages. But since xerces:message is a proprietary facility, and the XSD 1.1 spec doesn't define what an XSD processor may put in error strings, I believe the error information currently available for assertions with Xerces serves the assert error reporting purpose to a large extent.

But I do acknowledge that your suggestion to improve assert error messages is good. I'll take it up on the Xerces forum, and we may try to implement this feature sometime in the future.

ud said...

Thanks for the reply Mukul! One more question: is there a possibility of adding some sort of error level element for assertions? This would be similar to the Schematron "flag" attribute. Having all assertions equate to an error is not always desired; often it is useful for an assertion failure to result in a warning, something like: "this password is not secure enough" or "this element should be set to a value". This way an XML document could be considered valid enough for additional processing, but with additional warnings/information that would be useful to the user.

Mukul Gandhi said...

The XSD 1.1 assertion implementation within Xerces follows the existing Xerces framework of error reporting. Since an assertion is part of an XSD 1.1 type definition, an assertion failure makes the XML element or attribute invalid, and an (XSD) invalid element or attribute would normally raise a validity error. Given this nature of XSD processing, I believe it would be architecturally wrong for an XSD processor to introduce special processing for XSD 1.1 assertions with respect to error levels.

Perhaps you might try using an xs:annotation element within xs:assert to specify that a warning level is desired upon an assertion failure, i.e. something like below,

<xs:assert test="xpath-expr">
  <xs:annotation>
    <xs:appinfo>
      <my:warning xmlns:my="http://myerr-info">this password is not secure enough</my:warning>
    </xs:appinfo>
  </xs:annotation>
</xs:assert>

Of course, even with this technique, an XML document for which the above xs:assert evaluates to false would still be invalid. But your application now has some extra information with which to take a "warning" path upon assertion failure.