Casting the Dice with FXSL: Random Number Generation Functions
in XSLT
Dimitre Novatchev, March - April 2002
This article is a follow-up from two recent publications [1], [2] on functional
programming in XSLT. Described is the Random module
of the XSLT functional programming library FXSL [3], which implements a linear congruential
method of generating a pseudo-random sequence of natural numbers
starting with a seed and then getting the next element of the
sequence, as described by Simon Thompson [4].
Such a sequence can then be scaled to natural or real numbers in
any given interval. Numbers can be also generated, denoting a given
event outcome with a specified probability (distribution).
Implemented is a simplified Monte Carlo integration to test the
"randomness" of the functions' results. The code of the
implementation demonstrates the use of such powerful, functional
programming design patterns as partial application and creation of
functions dynamically.
The source code for this article is part of the FXSL library version 1.1
and can be downloaded here:
Using MSXML? Get the MSXML source code!
Using SAXON? Get the SAXON source code!
Using XALAN? Get the XALAN source code as adapted by Knut Wannheden !
As the XSLT language and its usage evolve and mature over time,
there have been repeated questions about the availability of random
numbers generation in the language or how to implement it. The
prevailing answer has been that one is to write their own extension
functions, reflecting the lack of any standard random generation
functions in XPath 1.0 [5]. Such a feature is
also missing in the published "XQuery 1.0 and XPath 2.0 Functions
and Operators" Working Draft [6].
This article has two goals:
- To provide a set of functions for a majority of random numbers
related tasks as part of the FXSL XSLT functional programming (FP)
library.
- To demonstrate how the implementation is taking advantage of
some very powerful FP design patterns, such as partial application
and creation of functions dynamically.
In this article we follow the approach to random numbers
generation described by Simon Thompson in [4]. While it may not be
realistic to aim at producing a truly random number sequence, we'll
implement generation of a pseudo-random sequence of natural
numbers, using a linear congruential method. This method
works in the following way: we start with seed and then get
the next element of the sequence from the previous one using the
following definition:
nextRand n = (n*multiplier + increment) `mod` modulus (1)
In our implementation we'll be using the following
constants:
seed = 17489
multiplier = 25173
increment = 13849
modulus = 65536
From (1) and using the Haskell function iterate, the following
will produce an infinite sequence of random numbers ranging from 0
to modulus - 1:
randomSequence :: Int -> [Int]
randomSequence = iterate nextRand
where the function iterate is defined in Haskell's Prelude in
the following way:
iterate
:: (a -> a) -> a -> [a]
iterate f x = x : iterate f
(f x)
so it creates an infinite list with head the argument x and each
consecutive element is the result of applying the function f to the
previous element.
XSLT is a strict language (does not support lazy evaluation),
therefore we need a finite analog of the iterate function:
scanIter :: (Ord a, Num a) => a -> (b -> b) -> b
-> [b]
scanIter n f x
| n == 0 = [x]
| n > 0 = result ++ [f(last
result)]
where result = scanIter (n-1) f x
As defined above, scanIter takes a number n, a function f and an
argument (for f) x, and produces a list of
n+1 elements, the first
of which is x, and each next element is the result of applying f on
the current element.
Now, we can produce a sequence of n + 1 random numbers in the
following way:
randomSequence :: Int -> [Int]
randomSequence = scanIter n nextRand
If we want the numbers to come in the integer range s to t
inclusive, we need to scale the sequence:
scaleSequence :: Int -> Int -> [Int] -> [Int]
scaleSequence s t
= map scale
where
scale n = n `div` denom + s
denom = modulus `div`
range
range = t - s + 1
Note that we are using partial application in the definition of
both randomSequence and scaleSequence. In our XSLT implementation
of these functions partial application is produced by using the
curry function, described in detail in [2].
Bellow is the XSLT implementation of nextRand, scanIter,
scaleSequence and randomSequence.
randNext
<!--
===========================================================
Stylesheet: random.xsl Randomization Functions
Version: 1.0 (2002-03-16)
Based on: Random number algorithms in Simon Thompson's
"Haskell, the craft of functional programming"
===========================================================-->
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:vendor="urn:schemas-microsoft-com:xslt"
xmlns:randScale="f:randScale"
xmlns:myRandNext="f:myRandNext"
xmlns:mySingleRandDistFun="f:mySingleRandDistFun"
xmlns:x="f:fxslRandom.xsl"
exclude-result-prefixes="xsl vendor randScale
myRandNext mySingleRandDistFun x"
>
<!--
========================================================
Imported files:
========================================================-->
<xsl:import href="map.xsl"/>
<xsl:import href="curry.xsl"/>
<xsl:import href="iter.xsl"/>
<!--
========================================================
Global Randomizing constants:
========================================================-->
<xsl:variable name="seed" select="17489"/>
<xsl:variable name="multiplier" select="25173"/>
<xsl:variable name="increment" select="13849"/>
<xsl:variable name="modulus" select="65536"/>
<!--
This is a linear congruental method of generating
a pseudo-random sequence of natural numbers
starting with a seed and then getting the next element
of the sequence from the previous value like this:
nextRand :: Int -> Int
nextRand n = (multiplier * n + increment) `mod` modulus
-->
<!--
Template: randNext
Purpose : Return the value of the next random number
from the current
Parameters:
$arg1 - the current random number,
from which to produce the next
======================================================== -->
<xsl:template name="randNext" match="myRandNext:*">
<xsl:param name="arg1"/>
<xsl:value-of select="($multiplier * $arg1 + $increment)
mod
$modulus"/>
</xsl:template
> |
scanIter:
<!--
Template: scanIter
Purpose: Iterate (compose a function with itself)
N times and produce a list, whose elements
are the results from the partial iterations
Parameters:-
$arg1 - the number of times to iterate
$arg2 - a template reference to the function
that's to be iterated
$arg3 - an initial argument to the function
$arg4 - [optional] name for the elements
in the result node-set
=========================================================-->
<xsl:template name="scanIter">
<xsl:param name="arg1"/> <!-- n -->
<xsl:param name="arg2" select="/.."/> <!-- f -->
<xsl:param name="arg3"/> <!-- x -->
<xsl:param name="arg4" select="'el'"/> <!-- elName -->
<xsl:choose>
<xsl:when test="$arg1 < 0">
<xsl:message>
[scanIter]Error: Negative value for n.
</xsl:message>
</xsl:when>
<xsl:when test="$arg1 = 0">
<xsl:element name="{$arg4}">
<xsl:value-of select="$arg3"/>
</xsl:element>
</xsl:when>
<xsl:otherwise>
<xsl:variable name="vrtfIterMinus">
<xsl:call-template name="scanIter">
<xsl:with-param name="arg1" select="$arg1 - 1"/>
<xsl:with-param name="arg2" select="$arg2"/>
<xsl:with-param name="arg3" select="$arg3"/>
<xsl:with-param name="arg4" select="$arg4"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vIterMinus"
select="vendor:node-set($vrtfIterMinus)/*"/>
<xsl:variable name="vrtfLastIter">
<xsl:element name="{$arg4}">
<xsl:apply-templates select="$arg2">
<xsl:with-param name="arg1"
select="$vIterMinus[last()]"/>
</xsl:apply-templates>
</xsl:element>
</xsl:variable>
<xsl:copy-of select="$vIterMinus"/>
<xsl:copy-of
select="vendor:node-set($vrtfLastIter)"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
|
scaleSequence, randomSequence, randScale:
<!--
Template: scaleSequence
Purpose : Return a scaled version of a sequence
from a given sequence
Parameters:
$pStart - [optional] the start of the interval,
in which randoms
are to be produced
$pEnd - [optional] the end of the interval,
in which randoms
are to be produced
$pList - the list of numbers that are to be scaled
=========================================================-->
<xsl:template name="scaleSequence">
<xsl:param name="pStart" select="0"/>
<xsl:param name="pEnd" select="1"/>
<xsl:param name="pList" select="/.."/>
<xsl:variable name="vScaleFun"
select="$x:st/randScale:*[1]"/>
<xsl:variable name="vRange" select="$pEnd - $pStart + 1"/>
<xsl:variable name="vrtfCurriedScale">
<xsl:call-template name="curry">
<xsl:with-param name="pFun" select="$vScaleFun"/>
<xsl:with-param name="pNargs" select="3"/>
<xsl:with-param name="arg2" select="$pStart"/>
<xsl:with-param name="arg3" select="$vRange"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vFixedScaleFun"
select="vendor:node-set($vrtfCurriedScale)/*"/>
<xsl:call-template name="map">
<xsl:with-param name="pFun" select="$vFixedScaleFun"/>
<xsl:with-param name="pList1" select="$pList"/>
</xsl:call-template>
</xsl:template>
<!--
Template: randomSequence
Purpose : Return a sequence of random numbers
starting from a seed
Parameters:
$pLength - [optional] the length of the sequence
of randoms that must be produced
$pSeed - [optional] the seed for the randomization
$pStart - [optional] the start of the interval,
in which randoms are to be produced
$pEnd - [optional] the end of the interval,
in which randoms are to be produced
=========================================================-->
<xsl:template name="randomSequence">
<xsl:param name="pLength" select="1"/>
<xsl:param name="pSeed" select="$seed"/>
<xsl:param name="pStart" select="0"/>
<xsl:param name="pEnd" select="$modulus - 1"/>
<xsl:variable name="vFunRandNext"
select="$x:st/myRandNext:*[1]"/>
<xsl:variable name="vrtfUnscaledSeq">
<xsl:call-template name="scanIter">
<xsl:with-param name="arg1"
select="$pLength - 1"/><!-- n -->
<xsl:with-param name="arg2"
select="$vFunRandNext"/><!-- f -->
<xsl:with-param name="arg3"
select="$pSeed"/> <!-- x -->
</xsl:call-template>
</xsl:variable>
<xsl:choose>
<xsl:when test="$pStart = 0
and $pEnd = $modulus - 1">
<xsl:copy-of
select="vendor:node-set($vrtfUnscaledSeq)/*"/>
</xsl:when>
<xsl:otherwise>
<xsl:call-template name="scaleSequence">
<xsl:with-param name="pStart" select="$pStart"/>
<xsl:with-param name="pEnd" select="$pEnd"/>
<xsl:with-param name="pList"
select="vendor:node-set($vrtfUnscaledSeq)/*"/>
</xsl:call-template>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- The following scales a single number n
from the interval [0, modulus - 1]
to the interval [s, t]
scale n = n `div` denom + s
range = t - s + 1
denom = modulus `div` range
-->
<xsl:template match="randScale:*">
<xsl:param name="arg1"/> <!-- n -->
<xsl:param name="arg2"/> <!-- s -->
<xsl:param name="arg3"/> <!-- range -->
<xsl:variable name="vDenom"
select="floor($modulus div $arg3)"/>
<xsl:value-of select="floor($arg1 div $vDenom)
+ $arg2"/>
</xsl:template>
|
Let's test these functions now:
test1-random.xsl
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:vendor="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="xsl vendor"
>
<xsl:import href="random.xsl"/>
<!-- To be run on any input xml file -->
<xsl:output indent="yes" omit-xml-declaration="yes"/>
<xsl:variable name="NL" select="'
'"/>
<xsl:template match="/">
<xsl:variable name="vrtfUnscaledRandoms">
<xsl:call-template name="randomSequence">
<xsl:with-param name="pLength" select="100"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vUnscaledRandoms"
select="vendor:node-set($vrtfUnscaledRandoms)/*"/>
100 Randoms in (0, <xsl:value-of select="$modulus - 1"/>):
<xsl:for-each select="$vUnscaledRandoms
[position() mod 8 = 1]">
<xsl:value-of select="$NL"/>
<xsl:for-each select=".
| following::*[position() < 8]">
<xsl:value-of select="concat(., ', ')"/>
</xsl:for-each>
</xsl:for-each>
<xsl:variable name="vStart" select="1"/>
<xsl:variable name="vEnd" select="10"/>
<xsl:value-of
select="concat($NL,$NL,
'100 Randoms in (',
$vStart, ', ',
$vEnd, '):',
$NL
)"/>
<xsl:variable name="vrtfScaledRandoms">
<xsl:call-template name="randomSequence">
<xsl:with-param name="pLength"
select="100"/>
<xsl:with-param name="pStart"
select="$vStart"/>
<xsl:with-param name="pEnd"
select="$vEnd"/>
</xsl:call-template>
</xsl:variable>
<xsl:variable name="vScaledRandoms"
select="vendor:node-set($vrtfScaledRandoms)/*"/>
<xsl:for-each select="$vScaledRandoms
[position() mod 10 = 1]">
<xsl:value-of select="$NL"/>
<xsl:for-each select=".
| following::*[position() < 10]">
<xsl:value-of select="concat(., ', ')"/>
</xsl:for-each>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
|
When this transformation is performed (ignoring the source xml
document) the result is:
|
100 Randoms in (0, 65535):
17489, 59134, 9327, 52468, 43805, 8378, 18395,
59344,
52777, 23478, 21895, 18924, 6517, 29682, 22899,
61256,
14593, 34158, 40863, 5092, 6349, 60458, 46091,
13248,
58585, 17446, 25271, 2780, 2341, 26978, 47011,
38200,
12721, 30686, 65231, 3796, 19069, 52122, 50235,
62384,
32649, 150, 54247, 65484, 15573, 62162, 14803,
12072,
11873, 48718, 16895, 48580, 16429, 48906, 30827,
10144,
40505, 37126, 43287, 10428, 46213, 4162, 57347,
48408,
12049, 22718, 26927, 8372, 63965, 50810, 53403,
53136,
16617, 62838, 57927, 34220, 28725, 49586, 43571,
16136,
13249, 18222, 29791, 14244, 30605, 57834, 52427,
60288,
26521, 11750, 32631, 5788, 28645, 1826, 39011,
46328,
15473, 35230, 25487, 660,
100 Randoms in (1, 10):
3, 10, 2, 9, 7, 2, 3, 10, 9, 4,
4, 3, 1, 5, 4, 10, 3, 6, 7, 1,
1, 10, 8, 3, 9, 3, 4, 1, 1, 5,
8, 6, 2, 5, 10, 1, 3, 8, 8, 10,
5, 1, 9, 10, 3, 10, 3, 2, 2, 8,
3, 8, 3, 8, 5, 2, 7, 6, 7, 2,
8, 1, 9, 8, 2, 4, 5, 2, 10, 8,
9, 9, 3, 10, 9, 6, 5, 8, 7, 3,
3, 3, 5, 3, 5, 9, 9, 10, 5, 2,
5, 1, 5, 1, 6, 8, 3, 6, 4, 1,
|
Suppose we want to model an event, which has N distinct
outcomes. Sometimes we want that each outcome must have a
predefined probability. First we define a Distribution
type, which is a list of pairs of outcome object and outcome
object's probability. In order for this to make sense, the objects
in the list must all be distinct, and their probabilities must sum
up to
1.
type Distribution a = [(a,
Float)]
Suppose we need to model random outcomes 1 to 6 each with its
given probability:
dist1 :: Distribution
Int
dist1 = [(1,0.2), (2,0.25), (3,0.25), (4,0.15), (5,0.1), (6,
0.05)]
We will use a function, which given a distribution produces
another function, that maps the interval [0,65535] into the set
of distinct objects according to their probability. The idea is to
divide the interval [0,65535] into N intervals
(where N = length(dist) is the number of distinct objects), but
in such a way, that the lengths of these intervals are proportional
to the probabilities of the corresponding
objects.
makeFunction :: Distribution a -> (Float -> a)
makeFunction dist = makeFun dist
0.0
makeFun ((ob, p) : dist) nLast rand
| nNext >= rand && rand > nLast
= ob
| otherwise
= makeFun dist nNext
rand
where
nNext = p*fromInt modulus + nLast
Then the transformation of a list of random numbers into a list
of (weighted) random numbers according to a distribution dist is
given by the
expression:
map (makeFunction dist)
Here's the XSLT implementation.
dist-randomSequence:
<!--
Template: dist-randomSequence
Purpose : Return the value of the next random number
from the current
Parameters:
$pLength - [optional] the length of the sequence
of randoms that must be produced
$pDist - a list of distributions
(outcome,probability pairs)
e.g.:
<myDistribution:myDistribution>
<o>1</o><p>0.2</p>
<o>2</o><p>0.25</p>
<o>3</o><p>0.25</p>
<o>4</o><p>0.15</p>
<o>5</o><p>0.1</p>
<o>6</o><p>0.05</p>
</myDistribution:myDistribution>
$pSeed - [optional] the seed for the randomization
=========================================================-->
<xsl:template name="dist-randomSequence">
<xsl:param name="pLength" select="1"/>
<xsl:param name="pDist" select="/.."/>
<xsl:param name="pSeed" select="$seed"/>
<xsl:variable name="vrtfNormalRandomSeq">
<xsl:call-template name="randomSequence">
<xsl:with-param name="pLength" | |