Sunday, January 4, 2009

XSLT: sorting data by duration

I have the following input text file (test.txt):

A started at 03:12:10
A ended at 03:20:20
B started at 03:20:25
B ended at 03:22:21
C started at 03:22:23
C ended at 03:22:55
D started at 03:22:57
D ended at 03:23:21
E started at 03:23:25
E ended at 03:24:40

Here A, B, C, etc. are events, each with a start time and an end time.

I need to produce output like the following:

D : 0-0-24
C : 0-0-32
E : 0-1-15
B : 0-1-56
A : 0-8-10

That is, the events are sorted in ascending order of duration (the time each one took). The duration format in the output is hr-min-sec.

The following XSLT 2.0 stylesheet worked well for this problem:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="2.0">

<xsl:output method="text" />

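<!-- Read the file as a sequence of lines; the [normalize-space()] filter drops blank lines,
     such as the empty token produced by a trailing newline, which would otherwise break the sort -->
<xsl:variable name="time-data" select="tokenize(unparsed-text('test.txt', 'UTF-8'), '\r?\n')[normalize-space()]" />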

<xsl:template match="/">
  <xsl:variable name="temp-data">
    <xsl:for-each select="$time-data">
      <xsl:variable name="val" select="normalize-space(.)" />
      <xsl:variable name="pos" select="position()" />
      <xsl:if test="position() mod 2 = 1">
        <data key="{tokenize($val, '\s')[1]}">
          <xsl:value-of select="xs:time(tokenize($time-data[$pos + 1], '\s')[last()]) - xs:time(tokenize($val, '\s')[last()])" />
        </data>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>
  <xsl:for-each select="$temp-data/*">
    <xsl:sort select="xs:dayTimeDuration(.)" />
    <xsl:variable name="hr" select="hours-from-duration(.)" />
    <xsl:variable name="min" select="minutes-from-duration(.)" />
    <xsl:variable name="sec" select="seconds-from-duration(.)" />
    <xsl:value-of select="@key" /> : <xsl:value-of select="concat($hr, '-', $min, '-', $sec)" /> <xsl:text>
</xsl:text>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
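
The core step in the stylesheet above is the subtraction of two xs:time values, which yields an xs:dayTimeDuration. As a minimal sketch of that step in isolation, using the two times for event A from the input: the named template "main" is a hypothetical entry point (not part of the stylesheet above) so that the sketch can be run without a source document, for example through Saxon's -it option.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-result-prefixes="xs"
                version="2.0">

<xsl:output method="text" />

<!-- "main" is a hypothetical name for the initial template -->
<xsl:template name="main">
  <!-- xs:time minus xs:time yields an xs:dayTimeDuration -->
  <xsl:variable name="d" select="xs:time('03:20:20') - xs:time('03:12:10')" />

  <!-- The duration serializes in ISO 8601 form: PT8M10S -->
  <xsl:value-of select="$d" />
  <xsl:text>&#10;</xsl:text>

  <!-- The component accessors give the hr-min-sec form: 0-8-10 -->
  <xsl:value-of select="concat(hours-from-duration($d), '-',
                               minutes-from-duration($d), '-',
                               seconds-from-duration($d))" />
  <xsl:text>&#10;</xsl:text>
</xsl:template>

</xsl:stylesheet>
```

Because the intermediate value is a typed duration rather than a string, it can be compared and sorted directly, which is what the xsl:sort in the full stylesheet relies on.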

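The following XSLT sort instruction is what makes this work. It sorts by the typed duration value, and the default order is ascending, which gives the output shown above:
<xsl:sort select="xs:dayTimeDuration(.)" />

To list the longest event first instead, add order="descending":
<xsl:sort select="xs:dayTimeDuration(.)" order="descending" />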

I used Saxon to run this stylesheet.

If anybody comes across this post and can think of a better solution (particularly one written in a more functional style, and avoiding the temporary tree), I would be very glad to hear about it.
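
One possible direction, offered only as a sketch and not as a definitive answer: iterate over the odd line positions and let xsl:sort compute the duration through a stylesheet function, so that no temporary tree is built. The f namespace URI, the functions f:time and f:duration, and the named template "main" are all hypothetical names introduced for this illustration. The sketch assumes the file consists of well-formed started/ended pairs on consecutive lines, as in test.txt above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="urn:example:local"
                exclude-result-prefixes="xs f"
                version="2.0">

<xsl:output method="text" />

<!-- All non-blank lines of the input file -->
<xsl:variable name="lines" as="xs:string*"
              select="tokenize(unparsed-text('test.txt', 'UTF-8'), '\r?\n')[normalize-space()]" />

<!-- Hypothetical helper: the time is the last whitespace-separated token of a line -->
<xsl:function name="f:time" as="xs:time">
  <xsl:param name="line" as="xs:string" />
  <xsl:sequence select="xs:time(tokenize(normalize-space($line), '\s')[last()])" />
</xsl:function>

<!-- Hypothetical helper: duration of the event whose "started" line is at position $i -->
<xsl:function name="f:duration" as="xs:dayTimeDuration">
  <xsl:param name="i" as="xs:integer" />
  <xsl:sequence select="f:time($lines[$i + 1]) - f:time($lines[$i])" />
</xsl:function>

<!-- "main" is a hypothetical name for the initial template -->
<xsl:template name="main">
  <!-- Odd positions hold the "started" lines -->
  <xsl:for-each select="(1 to count($lines))[. mod 2 = 1]">
    <xsl:sort select="f:duration(.)" />
    <xsl:variable name="i" select="." />
    <xsl:variable name="d" select="f:duration($i)" />
    <xsl:value-of select="concat(tokenize(normalize-space($lines[$i]), '\s')[1], ' : ',
                                 hours-from-duration($d), '-',
                                 minutes-from-duration($d), '-',
                                 seconds-from-duration($d), '&#10;')" />
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The trade-off is that each duration is computed twice, once for the sort key and once for the output, in exchange for dropping the intermediate data elements.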
