Friday, March 29, 2019

XSLT 1.0 transformations for large XML input documents

I thought this topic could be of interest to the XML community.

I've discovered an interesting aspect of the JAXP API (Java API for XML Processing) that seems related to the streaming we talk about with XSLT 3.0. Please see the following document, and the example given in its section 4.12 (which explains JAXP's StAX API and how to use it with JAXP's transformation APIs).


Using the JAXP code cited in the above document, one can transform very large XML input documents using XSLT 1.0 (I've tried an XML input document of about 700 MB, and that worked). The JDK's built-in JAXP implementation can do this; I've tried it with JDK 1.8, which works fine. This approach handles certain kinds of XSLT 1.0 transformations on very large XML input documents very well. When doing the same transformations with XSLT 2.0, or with XSLT 3.0 in a non-streaming way, we would usually get the error 'java.lang.OutOfMemoryError: Java heap space'.
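To make the approach above concrete, here is a minimal sketch of feeding a StAX reader into a JAXP transformation. It is not the code from the cited document or from my examples repository; the class name StaxXsltSketch and the method name transform are hypothetical, chosen only for illustration. It uses only standard JAXP classes (XMLInputFactory, StAXSource, TransformerFactory). Whether memory use actually stays low will depend on the XSLT processor in use and on what the stylesheet does, so treat it as a starting point for experiments rather than a guarantee.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class: name and structure are assumptions for illustration.
class StaxXsltSketch {

    // Reads the XML input through a StAX pull parser and hands it to
    // whichever XSLT processor TransformerFactory.newInstance() returns.
    static void transform(InputStream xml, InputStream xslt, OutputStream out) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                Transformer transformer = TransformerFactory.newInstance()
                        .newTransformer(new StreamSource(xslt));
                // StAXSource wraps the pull parser as a JAXP Source.
                transformer.transform(new StAXSource(reader), new StreamResult(out));
            } finally {
                reader.close();
            }
        } catch (XMLStreamException | TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    // Usage: java StaxXsltSketch input.xml stylesheet.xsl output.txt
    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: StaxXsltSketch <input.xml> <stylesheet.xsl> <output>");
            return;
        }
        try (InputStream xml = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])));
             InputStream xslt = Files.newInputStream(Paths.get(args[1]));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(Paths.get(args[2])))) {
            transform(xml, xslt, out);
        }
    }
}
```

The input is passed as byte streams rather than character readers so that the XML parser can honour the encoding declared in each document. The file names on the command line are placeholders.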

I've written a few complete examples for this topic here: https://github.com/mukulga/largexml_xslt10.

Note: My intention in writing this blog post is not to suggest in any way that XSLT 1.0 is better than XSLT 2.0/3.0 in every respect. XSLT 2.0/3.0 have various new language features compared to XSLT 1.0 that raise the productivity of XSLT developers and make it easier to develop stylesheets that are much more complex in functionality than is practical with XSLT 1.0.