Thursday, April 10, 2008

Xalan-J serializer

I thought there was a problem with the Xalan-J serializer, so I posted the following question on the xalan-dev mailing list:

I think there is scope for improvement in the Xalan-J 2.7.1 serializer.

I tried this sample XSLT stylesheet with Xalan-J 2.7.1.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:output method="xml" indent="yes" />

<xsl:template match="/">
  <x>
    <y/>
  </x>
</xsl:template>

</xsl:stylesheet>

The output produced by Xalan is:
<?xml version="1.0" encoding="UTF-8"?><x>
<y/>
</x>

Please note that the topmost element tag, <x>, is not indented properly: it starts on the same line as the XML declaration.

I would expect the output in this case to be:

<?xml version="1.0" encoding="UTF-8"?>
<x>
  <y/>
</x>

This problem seems to happen with any XML output.

Henry Zongaro gave a good explanation of why this is so:

The problem here is that the serializer considers that the result document might be used as an external general parsed entity. So, suppose the result is named result.xml. If it's referenced inside a document such as the following, inserting whitespace before the x element in result.xml would affect the text content of its parent element, doc.

<!DOCTYPE doc [
<!ENTITY ref SYSTEM "result.xml">
]>
<doc>Some non-whitespace text &ref; Some more non-whitespace text</doc>
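
To see this effect for myself, the following is a minimal Java sketch using the standard JAXP DOM parser. It is not part of Henry's reply. The class name EntityWhitespaceDemo and the helper textBeforeX are hypothetical names of mine, the wrapper text is shortened to "before" and "after", and an EntityResolver stands in for the file result.xml so that no files are needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class EntityWhitespaceDemo {

    // Wrapper document modelled on the example above, with shorter text
    // (an assumption made only to keep the result easy to read).
    static final String WRAPPER =
        "<!DOCTYPE doc [\n"
      + "<!ENTITY ref SYSTEM \"result.xml\">\n"
      + "]>\n"
      + "<doc>before&ref;after</doc>";

    // Hypothetical helper: parses WRAPPER, serving resultXml in place of the
    // file result.xml, and returns the text of <doc> that precedes <x>.
    static String textBeforeX(String resultXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Expand &ref; in place instead of keeping entity reference nodes.
            factory.setExpandEntityReferences(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Stand-in for reading result.xml from disk.
            builder.setEntityResolver((publicId, systemId) ->
                new InputSource(new StringReader(resultXml)));
            Document doc = builder.parse(new InputSource(new StringReader(WRAPPER)));
            Element root = doc.getDocumentElement();
            // Merge adjacent text nodes so the first child holds all text before <x>.
            root.normalize();
            return root.getFirstChild().getNodeValue();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String decl = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";
        String body = "<x>\n<y/>\n</x>";
        // Serializer output as Xalan produces it: no newline after the declaration.
        System.out.println("[" + textBeforeX(decl + body) + "]");
        // The output I had wished for: a newline after the declaration.
        System.out.println("[" + textBeforeX(decl + "\n" + body) + "]");
    }
}
```

With the first form, the text before <x> should be just "before". With the second form, the newline after the declaration becomes part of the content of doc, which is exactly the change Henry describes.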
