Fast construction of good lattice rules

Introduction
Motivating example
Lattice rules
Fast constructions
Contributions
Dirk Nuyens
[email protected]
PhD defense
Arenberg Castle, April 20th, 2007.
Department of Computer Science, K.U.Leuven
Outline
1. Introduction
   Integration over $[0,1)^s$
   Point sets to choose from
2. Motivating example
3. Lattice rules
4. Fast constructions
5. Contributions
Integration over $[0,1)^s$
Multivariate integration
Approximate the s-dimensional integral
\[ I(f) := \int_{[0,1)^s} f(x) \, dx \]
by a simple cubature rule
\[ Q(f; P_n) := \sum_{k=0}^{n-1} w_k \, f(x_k) \]
using n points $x_k \in P_n$ and associated weights $w_k$.
→ What rule to use? There are many choices...
→ It is probably important to have an open method that allows us to add more points if needed...
What is multivariate integration?
(Figure: plot with horizontal axis from 0 to 1 and vertical axis from 0 to 0.7.)
1. A one-dimensional integral:
\[ \int_0^1 f(x) \, dx. \]
2. A two-dimensional integral:
\[ \int_0^1 \int_0^1 f(x_1, x_2) \, dx_1 \, dx_2. \]
3. An s-dimensional integral:
\[ \int_{[0,1)^s} f(x) \, dx = \int_0^1 \int_0^1 \cdots \int_0^1 f(x_1, x_2, \ldots, x_s) \, dx_1 \, dx_2 \cdots dx_s. \]
Classical product rules look like
\[ \int_{[0,1)^s} f(x) \, dx \approx \sum_{k_1=0}^{m-1} w^{(1)}_{k_1} \cdots \sum_{k_s=0}^{m-1} w^{(s)}_{k_s} \, f\bigl(x^{(1)}_{k_1}, \ldots, x^{(s)}_{k_s}\bigr). \]
By construction the total number of points is exponential in the number of dimensions, e.g., $n = m^s$.
The weights slightly complicate matters.
A possible solution is Monte Carlo type integration:
\[ \int_{[0,1)^s} f(x) \, dx \approx \frac{1}{n} \sum_{k=0}^{n-1} f(x_k) \]
with the sample points $x_k \in P_n$ uniformly distributed over $[0,1)^s$ and all weights equal to $1/n$.
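The exponential point count of a product rule is easy to see in a few lines. The sketch below assumes the simplest possible product rule, the left-rectangle rule with nodes k/m and equal weights 1/m in every dimension; the function name `product_left_rectangle` is made up for this illustration.

```python
from itertools import product

def product_left_rectangle(f, s, m):
    """Product left-rectangle rule with m nodes k/m per dimension.

    All one-dimensional weights equal 1/m, so the product weights
    equal 1/m**s and the rule is a plain mean over the grid.
    """
    nodes = [k / m for k in range(m)]
    total, count = 0.0, 0
    for x in product(nodes, repeat=s):   # visits all m**s grid points
        total += f(x)
        count += 1
    return total / count, count

# Example: f(x) = x_1 + x_2 + x_3 on a 4 x 4 x 4 grid.
estimate, n_points = product_left_rectangle(lambda x: sum(x), 3, 4)
```

Already for s = 3 and m = 4 the loop visits 64 points; the loop length grows as m**s, which is what makes grids unusable in high dimensions.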
Point sets to choose from
Figure: Three point sets, each with 64 samples in the unit square.
1. The product left-rectangle rule. → Classical product rule
   Note: grids don’t work for high dimensions since, e.g., taking 2 points per dimension in 100 dimensions requires a total of $2^{100} = 1267650600228229401496703205376$ points...
2. Pseudo-random numbers. → Monte Carlo
   (Fig: mt19937, the Mersenne Twister, with default initial state.)
3. Low-discrepancy points. → Quasi-Monte Carlo
   (Fig: A good lattice sequence in base 3: example technology from the thesis.)
Using uniformly distributed samples
Monte Carlo type integration is a very simple method.
Take the mean over a set of samples:
\[ Q(f; P_n) := \frac{1}{n} \sum_{k=1}^{n} f(x_k). \]
The $x_k \in P_n$ are n uniformly distributed sample points.
How to choose these uniformly distributed sample points?
Random points: the “Monte Carlo” method.
Low-discrepancy points: the “quasi-Monte Carlo” method.
For low-dimensional integrals more advanced methods are to be preferred.
Nevertheless we will demonstrate the theory by a two-dimensional example.
Outline
1. Introduction
2. Motivating example
   Using Monte Carlo
   Using quasi-Monte Carlo
3. Lattice rules
4. Fast constructions
5. Contributions
Using Monte Carlo
A motivating example
Suppose we want to measure the water volume of a creek in a 1 m × 1 m area.
First we use the “Monte Carlo” method: taking random samples.
What do random points look like?
For “Monte Carlo” we take uniformly distributed random samples.
Random points in a square look like falling raindrops...
© Andrew Mobbs
(http://www.chiark.greenend.org.uk/~andrewm/)
Sampling with random points
n     Q_n
1     0.0628
2     0.1860
4     0.2472
9     0.1991
14    0.2085
25    0.2421
30    0.2439
45    0.2555
50    0.2509
55    0.2565
Mean depth for 55 samples is 0.2565 m.
Did we sample enough deep pieces? Or maybe too many?
In practice we don’t know what the function looks like.
⇒ We should sample as uniformly as possible... But more on that in a minute.
After a while...
If we wait long enough, the approximation will converge to the true value.
But: the convergence of Monte Carlo is only
\[ O\!\left(\frac{1}{\sqrt{n}}\right). \]
This means that if we want one extra correct decimal digit we will have to take 100 times more samples.
Using quasi-Monte Carlo
A look at quasi-Monte Carlo
The idea: sample “more uniformly”.
(Figure: three panels: 64 random points (bad), 64 lattice sequence points, 53 lattice points.)
→ How to find such points? This thesis is about finding good lattices. FAST!
The quasi-Monte Carlo result
Mean depth for 53 samples is 0.2496 m.
Monte Carlo got an estimate of 0.2565 m for 55 samples.
(The exact value up to 4 digits is 0.2499 m.)
The performance of quasi-Monte Carlo
For quasi-Monte Carlo it is possible to reach a convergence of
\[ O\!\left(\frac{1}{n}\right). \]
This means that if we want one extra correct decimal digit we will have to take 10 times more samples. (Against 100 for Monte Carlo.)
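The factors 10 and 100 follow from a one-line calculation. As a rough sketch, assume the error behaves exactly like $c\,n^{-r}$ for some constant $c$ (an idealisation; the $O(\cdot)$ statements only give upper bounds) and ask for the number of samples $n'$ that reduces the error by a factor of 10:

```latex
\[
  \frac{c\,(n')^{-r}}{c\,n^{-r}} = \frac{1}{10}
  \quad\Longleftrightarrow\quad
  n' = 10^{1/r}\, n ,
  \qquad\text{so}\qquad
  n' = \begin{cases}
    100\,n & \text{for } r = \tfrac12 \text{ (Monte Carlo)},\\[2pt]
    10\,n  & \text{for } r = 1 \text{ (quasi-Monte Carlo)}.
  \end{cases}
\]
```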
A less attractive property of lattice rules: we had to use all the points of the lattice rule at once.
→ Later we show how to construct good lattice rules which can be used point by point, i.e., “good lattice sequences”.
(Joint work with Frances Kuo.)
Outline
1. Introduction
2. Motivating example
3. Lattice rules
   Good lattice rules
   The worst-case error in a function space
   Component-by-component construction
   A matrix-vector formulation (part of Ch. 3)
4. Fast constructions
5. Contributions
Good lattice rules
Rank-1 lattice rules
We will look at the construction of rank-1 lattice rules.
Given a number of points n and a “generating vector” z, the approximation is given by
\[ \frac{1}{n} \sum_{k=1}^{n} f\!\left(\left\{\frac{k z}{n}\right\}\right). \]
Here {·} denotes the fractional part: what falls out on one side of the cube flips back in on the opposite side.
The generating vector consists of s integers
\[ z = (z_1, z_2, \ldots, z_s). \]
→ How to choose these s integers is what we will look at next.
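To make the definition concrete, here is a minimal sketch that generates the points of a rank-1 lattice rule and applies the equal-weight rule. The function names are made up for this illustration, and the points are indexed k = 0, ..., n−1, which gives the same point set as k = 1, ..., n because the point for k = n coincides with the one for k = 0.

```python
from fractions import Fraction

def rank1_lattice_points(n, z):
    """Return the n points {k z / n}, k = 0..n-1, as tuples of exact fractions."""
    return [tuple(Fraction((k * zj) % n, n) for zj in z) for k in range(n)]

def lattice_rule(f, n, z):
    """Equal-weight rule: the mean of f over the rank-1 lattice points."""
    points = rank1_lattice_points(n, z)
    return sum(f(x) for x in points) / n

# Example with n = 5 points and the (arbitrary) generating vector z = (1, 2).
points = rank1_lattice_points(5, (1, 2))
value = lattice_rule(lambda x: x[0] + x[1], 5, (1, 2))
```

Reducing k·z_j modulo n before dividing by n is exactly the "fractional part" operation; exact fractions are used here only to keep the example free of rounding effects.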
Good and bad lattice rules
Given a number of points n, the choice of the generating vector z
determines if the lattice rule is any good.
E.g., for n = 17 points in s = 2 dimensions:
A bad choice: z = (1, 2).
A good choice: z = (1, 5).
Mathematically we will measure this by the worst-case error.
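The two generating vectors can be compared numerically. The sketch below evaluates a squared worst-case error of the product form −1 + (1/n) Σ_k Π_j (1 + γ_j ω(x_j^(k))). The choice ω(x) = 2π²(x² − x + 1/6) (a Korobov-type kernel with smoothness 2) and the weights γ_j = 1 are assumptions made for this example only; the slides do not fix them at this point.

```python
import math

def omega(x):
    # Assumption: Korobov-type kernel with alpha = 2, i.e. 2*pi^2*B_2(x).
    return 2.0 * math.pi ** 2 * (x * x - x + 1.0 / 6.0)

def squared_worst_case_error(n, z, gamma=None):
    """e^2 = -1 + (1/n) sum_k prod_j (1 + gamma_j * omega({k z_j / n}))."""
    gamma = gamma or [1.0] * len(z)
    total = 0.0
    for k in range(n):
        term = 1.0
        for zj, gj in zip(z, gamma):
            term *= 1.0 + gj * omega(((k * zj) % n) / n)
        total += term
    return -1.0 + total / n

e2_bad = squared_worst_case_error(17, (1, 2))    # the "bad" choice
e2_good = squared_worst_case_error(17, (1, 5))   # the "good" choice
```

Under these assumptions z = (1, 5) gives a clearly smaller value than z = (1, 2), in line with the classification on the slide.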
What do we know?
Suppose we have 7 sample values of an unknown function f.
What does this function look like?
(Figure: 7 sample values of f(x) plotted against x.)
→ We will have to assume some smoothness of the function.
⇒ Function classes.
The worst-case error in a function space
The worst-case error
A function class F can be defined in terms of the allowed variation of the function between two sample points.
The worst-case error can then be defined as
\[ e(Q, F) := \sup_{\substack{f \in F \\ \|f\|_F \le 1}} |I(f) - Q(f)|. \]
Given a set of sample points, the worst-case error can be calculated as the error for the worst-case function, i.e., a function with the worst possible variation in between these sample points.
Reproducing kernel Hilbert spaces
Elegant technology: reproducing kernel Hilbert spaces.
In a reproducing kernel Hilbert space H(K) the reproducing kernel K(x, y) is able to reproduce f ∈ H(K):
\[ f(x) = \langle f(\cdot), K(\cdot, x) \rangle_{H(K)}. \]
The squared worst-case error in H(K) has the very nice form
\begin{align*}
e^2(Q, K) &= \langle I(K) - Q(K),\, I(K) - Q(K) \rangle_{H(K)} \\
          &= I(I(K)) - 2\,Q(I(K)) + Q(Q(K)) \\
          &= \int_{[0,1)^{2s}} K(x, y) \, dx \, dy
             - \frac{2}{n} \sum_{k=0}^{n-1} \int_{[0,1)^s} K(x_k, y) \, dy
             + \frac{1}{n^2} \sum_{k,\ell=0}^{n-1} K(x_k, x_\ell).
\end{align*}
Computationally it is interesting when
1. the kernel K is shift-invariant:
   \[ K(x, y) = K(\{x - y\}, 0) = K(\{x - y\}), \]
2. the points form an abelian group (e.g., Q is a lattice rule):
   \[ \{x - y\} \in P_n \quad \forall x, y \in P_n, \]
3. (for simplicity) the operator norm of integration is 1:
   \[ \|I\|^2 = I(I(K)) = 1, \]
4. (for simplicity) the s-dimensional space is a tensor-product space:
   \[ K(x) = \prod_{j=1}^{s} \eta_j(x_j), \]
then the squared worst-case error simplifies to
\[ e^2(Q, K) = -1 + \frac{1}{n} \sum_{k=0}^{n-1} \prod_{j=1}^{s} \eta_j\bigl(x_j^{(k)}\bigr). \]
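The simplification can be traced term by term through the three parts of the general expression for the squared worst-case error. The following is a sketch that uses only the four conditions listed above, with $x_k$ denoting the points of $P_n$:

```latex
\begin{align*}
I(I(K)) &= 1
  && \text{(condition 3)},\\
Q(I(K)) &= \frac{1}{n} \sum_{k=0}^{n-1} \int_{[0,1)^s} K(\{x_k - y\}) \, dy
         = \int_{[0,1)^s} K(t) \, dt = I(I(K)) = 1
  && \text{(condition 1)},\\
Q(Q(K)) &= \frac{1}{n^2} \sum_{k,\ell=0}^{n-1} K(\{x_k - x_\ell\})
         = \frac{1}{n} \sum_{k=0}^{n-1} K(x_k)
  && \text{(condition 2)},\\
e^2(Q, K) &= 1 - 2 + \frac{1}{n} \sum_{k=0}^{n-1} K(x_k)
           = -1 + \frac{1}{n} \sum_{k=0}^{n-1} \prod_{j=1}^{s} \eta_j\bigl(x_j^{(k)}\bigr)
  && \text{(condition 4)}.
\end{align*}
```

The step for $Q(Q(K))$ uses that, for every fixed $\ell$, the differences $\{x_k - x_\ell\}$ run through all of $P_n$ exactly once as $k$ does, so the double sum collapses to $n$ copies of a single sum.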
Weighted function spaces
Tractability of multivariate numerical integration can be defined using the worst-case error.
Almost all standard function classes turn out to be intractable, i.e., one needs an exponential number of points $n = e^s$ to be able to reduce the worst-case error.
Weighted function spaces were introduced to overcome this problem under certain conditions on the weights.
(Sloan, Woźniakowski, Hickernell, Dick, Wang)
The weights model our idea of the importance of certain sets of variables.
Two important weight settings are:
(assuming a shift-invariant kernel, Q a lattice rule and $\|I\| = 1$)
1. Product weights: $\gamma_u = \prod_{j \in u} \gamma_{\{j\}}$ with squared worst-case error:
\[ e^2(Q, K) = -1 + \frac{1}{n} \sum_{k=0}^{n-1} \prod_{j=1}^{s} \bigl(1 + \gamma_{\{j\}} \, \omega(x_j^{(k)})\bigr) \]
2. Order-dependent weights: $\gamma_u = \Gamma_{|u|}$ with squared worst-case error:
\[ e^2(Q, K) = \frac{1}{n} \sum_{k=0}^{n-1} \sum_{\ell=1}^{s} \Gamma_\ell \sum_{\substack{u \subseteq D_s \\ |u| = \ell}} \prod_{j \in u} \omega(x_j^{(k)}). \]
These weights are said to be of finite order $q^*$ if $\Gamma_\ell = 0$ for $\ell > q^*$.
Component-by-component construction
The component-by-component algorithm
Component-by-component algorithm (Sloan, Joe, Kuo, ...):
  for s = 1 to s_max do
    for all z_s ∈ U_n do
      calculate e_s^2(z_1, ..., z_{s−1}, z_s)
    end for
    z_s = argmin_{z ∈ U_n} e_s^2(z_1, ..., z_{s−1}, z)
  end for
The choices of $z_j$ are taken relatively prime to n, i.e., from the set
\[ U_n := \{v \in \mathbb{Z}_n : \gcd(v, n) = 1\} = \mathbb{Z}_n^{\times}, \qquad |U_n| = \varphi(n). \]
Achieves rules with optimal rate of convergence in weighted Korobov space, i.e., $O(n^{-\alpha/2+\delta})$, $\delta > 0$ (Kuo; Dick).
Achieves rules with optimal rate of convergence in weighted Sobolev space, i.e., $O(n^{-1+\delta})$, $\delta > 0$ (Kuo; Dick).
Calculating the worst-case error
The worst-case error can be calculated iteratively.
E.g., for product weights:
\begin{align*}
e_s^2(z_s) &= -1 + \frac{1}{n} \sum_{k=0}^{n-1} \prod_{j=1}^{s} \bigl(1 + \gamma_j \, \omega(x_j^{(k)})\bigr) \\
           &= -1 + \frac{1}{n} \sum_{k=0}^{n-1} \bigl(1 + \gamma_s \, \omega(x_s^{(k)})\bigr) \prod_{j=1}^{s-1} \bigl(1 + \gamma_j \, \omega(x_j^{(k)})\bigr) \\
           &= -1 + \frac{1}{n} \sum_{k=0}^{n-1} \bigl(1 + \gamma_s \, \omega(x_s^{(k)})\bigr) \, p_{s-1}(k)
\end{align*}
costing O(n) and using O(n) memory for a specific choice of $z_s$.
The cost of the component-by-component algorithm is then $O(s n^2)$.
(Calculate this for every choice of $z_s \in U_n$ and each s.)
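The iterative evaluation translates almost directly into code. The following is a sketch of the plain $O(s n^2)$ search, not the fast algorithm of the thesis. The kernel function ω(x) = 2π²(x² − x + 1/6) and the name `cbc` are assumptions made for this example; the vector `p` plays the role of $p_{s-1}(k)$.

```python
import math

def omega(x):
    # Assumption: Korobov-type kernel with alpha = 2, i.e. 2*pi^2*B_2(x).
    return 2.0 * math.pi ** 2 * (x * x - x + 1.0 / 6.0)

def cbc(n, s_max, gamma):
    """Plain O(s n^2) component-by-component search for product weights."""
    units = [v for v in range(1, n) if math.gcd(v, n) == 1]   # the set U_n
    w = [omega(k / n) for k in range(n)]                      # omega at k/n
    p = [1.0] * n                                             # p_0(k) = 1
    z, errors = [], []
    for s in range(s_max):
        best_z, best_e2 = None, float("inf")
        for cand in units:                                    # O(n) candidates
            e2 = -1.0 + sum((1.0 + gamma[s] * w[(k * cand) % n]) * p[k]
                            for k in range(n)) / n            # O(n) each
            if e2 < best_e2:
                best_z, best_e2 = cand, e2
        z.append(best_z)
        errors.append(best_e2)
        # Update the running products for the chosen component.
        p = [(1.0 + gamma[s] * w[(k * best_z) % n]) * p[k] for k in range(n)]
    return z, errors

z, errors = cbc(17, 2, [1.0, 1.0])
```

The two nested loops over `units` and `k` are where the $n^2$ comes from; the vector `p` is the only state carried from one dimension to the next.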
A matrix-vector formulation (part of Ch. 3)
Our first step: matrix-vector form
Let’s look back at the formula
\[ e_s^2(z_s) = -1 + \frac{1}{n} \sum_{k=0}^{n-1} \left(1 + \gamma_{\{s\}} \, \omega\!\left(\frac{k \cdot z_s \bmod n}{n}\right)\right) p_{s-1}(k). \]
The interesting part is
\[ y(z_s) = \sum_{k=0}^{n-1} \omega\!\left(\frac{k \cdot z_s \bmod n}{n}\right) p_{s-1}(k) \]
which needs to be calculated for all $z_s \in U_n$.
⇒ This is a matrix-vector product:
\[ y = \Omega_n \, p_{s-1}. \]
Define the matrix
\[ \Omega_n := \left[ \omega\!\left(\left\{\frac{k z}{n}\right\}\right) \right]_{\substack{z \in U_n \\ k \in \mathbb{Z}_n}} = \left[ \omega\!\left(\frac{k \cdot z \bmod n}{n}\right) \right]_{\substack{z \in U_n \\ k \in \mathbb{Z}_n}} \]
which looks like the image on the left (for n = 61):
(The image on the right is the matrix for n = 122, but z taken from just $\mathbb{Z}_n \setminus \{0\}$ instead of $U_n$.)
Outline
1. Introduction
2. Motivating example
3. Lattice rules
4. Fast constructions
   The prime case (Ch. 3)
   The general case (Ch. 4)
   Lattice sequences (Ch. 5)
   Even more fast constructions (Ch. 6 & 7)
5. Contributions
The prime case (Ch. 3)
Prime number of points
Remember
\[ U_n := \{v \in \mathbb{Z}_n : \gcd(v, n) = 1\} = \mathbb{Z}_n^{\times}, \qquad |U_n| = \varphi(n). \]
When n is prime we can find a generator g for $U_n$ such that
\[ \langle g \rangle = \{g^k \bmod n : k \in \mathbb{Z}\} = \{g^0, g^1, \ldots, g^{\varphi(n)-1}\} \bmod n. \]
And we could of course also take negative powers of g:
\[ g^0, g^{-1}, g^{-2}, \ldots, g^{1} \equiv g^0, g^{\varphi(n)-1}, g^{\varphi(n)-2}, \ldots, g^1 \pmod{n}. \]
When n is prime, $\mathbb{Z}_n = U_n \cup \{0\}$, and $\Omega_n$ essentially has the structure of the multiplication table of the group $U_n$.
E.g., for n = 7:

  · | 1 2 3 4 5 6
  --+------------
  1 | 1 2 3 4 5 6
  2 | 2 4 6 1 3 5
  3 | 3 6 2 5 1 4
  4 | 4 1 5 2 6 3
  5 | 5 3 1 6 4 2
  6 | 6 5 4 3 2 1

↦

  · | 1 3 2 6 4 5
  --+------------
  1 | 1 3 2 6 4 5
  5 | 5 1 3 2 6 4
  4 | 4 5 1 3 2 6
  6 | 6 4 5 1 3 2
  2 | 2 6 4 5 1 3
  3 | 3 2 6 4 5 1

Multiplication modulo n can be much easier using a representation in powers of a generator g:
\[ g^{\alpha} \cdot g^{\beta} \equiv g^{\alpha + \beta \pmod{\varphi(n)}} \pmod{n}. \]
⇒ The multiplication table now takes the form of a circulant matrix.
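The permuted table for n = 7 can be reproduced with a few lines of code. This is a minimal sketch with made-up function names: it finds the smallest generator by brute force, orders the columns by positive powers and the rows by negative powers of that generator, and rebuilds the multiplication table in that order. It assumes an odd prime p.

```python
def primitive_root(p):
    """Smallest generator of U_p for an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, k, p) for k in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found")

def circulant_table(p):
    """Multiplication table of U_p with rows g^0, g^-1, ... and columns g^0, g^1, ..."""
    g = primitive_root(p)
    m = p - 1
    cols = [pow(g, k, p) for k in range(m)]                 # g^0, g^1, g^2, ...
    rows = [pow(g, (m - k) % m, p) for k in range(m)]       # g^0, g^-1, g^-2, ...
    table = [[(r * c) % p for c in cols] for r in rows]
    return rows, cols, table

rows, cols, table = circulant_table(7)
# Every row is the previous row shifted one place to the right.
is_circulant = all(table[i][j] == table[0][(j - i) % 6]
                   for i in range(6) for j in range(6))
```

For p = 7 the generator found is 3, which gives the row order 1, 5, 4, 6, 2, 3 and the column order 1, 3, 2, 6, 4, 5 of the second table.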
This is just a permutation on the rows and columns:
(Figure: the matrix before and after the permutation.)
The CBC algorithm multiplies with this matrix in each step.
Matrix-vector multiplication with a circulant matrix $C_m$ of size m × m takes O(m log(m)) (instead of $O(m^2)$):
  C_m x = IFFT(FFT(c) .* FFT(x)).
⇒ Fast construction of a good lattice rule in time O(s n log(n)).
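The identity behind the fast product can be checked on a small case. The sketch below takes c to be the first column of the circulant matrix and, to stay within the standard library, uses a naive $O(m^2)$ discrete Fourier transform as a stand-in for a real FFT. It therefore illustrates the identity only, not the speed-up; all function names are made up for this example.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(m^2) discrete Fourier transform; a stand-in for a real FFT."""
    m = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * t * k / m) for t in range(m))
           for k in range(m)]
    return [v / m for v in out] if inverse else out

def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by x via the DFT."""
    fc, fx = dft(c), dft(x)
    y = dft([a * b for a, b in zip(fc, fx)], inverse=True)
    return [v.real for v in y]

def circulant_matvec_direct(c, x):
    """Reference O(m^2) product: entry (i, j) of the matrix is c[(i - j) mod m]."""
    m = len(c)
    return [sum(c[(i - j) % m] * x[j] for j in range(m)) for i in range(m)]

c = [1, 5, 4, 6, 2, 3]          # first column of the permuted table for n = 7
x = [1, 2, 3, 4, 5, 6]
fast = circulant_matvec(c, x)
direct = circulant_matvec_direct(c, x)
```

Replacing `dft` by a genuine FFT routine is what brings the cost of one product down to O(m log(m)).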
(Figure: log-log plot of the total time for s = 20, in seconds, against n from $10^2$ to $10^8$, for the implementations fastrank1, spmvrank1, rank1 and slowrank1; reference levels range from 1 min to 120 years.)
Figure: Timings generated on a P4 2.4GHz ht, 2GB RAM for 20 dimensions
Notes:
This is known for prime number FFTs as Rader factorization.
Normal FFT tricks do not work here since in general $\omega(ab) \ne \omega(a)\,\omega(b)$.
Also works for polynomial lattice rules modulo an irreducible polynomial (≃ prime number → field).
The general case (Ch. 4)
Non-prime number of points
The technique for prime n makes use of a generator for $U_n$, so...
The multiplicative group $U_n = \{z \in \mathbb{Z}_n : \gcd(z, n) = 1\} = \mathbb{Z}_n^{\times}$, $|U_n| = \varphi(n)$, is cyclic whenever
\[ n = 2,\ 4,\ p^k \text{ or } 2p^k, \]
with p an odd prime. A generator for the cyclic group $U_n$ is called a primitive root modulo n.
Every group $U_n$, $n = n_1 n_2 \cdots n_r$, all relatively prime, can be written as
\[ U_n \simeq \begin{cases}
  U_{n_1} \oplus U_{n_2} \oplus \cdots \oplus U_{n_r} & \text{if } 8 \nmid n, \\
  (-\langle 5 \rangle_{2^k} \oplus \langle 5 \rangle_{2^k}) \oplus U_{n_2} \oplus \cdots \oplus U_{n_r} & \text{if } 8 \mid n,\ n_1 = 2^k,\ k \ge 3,
\end{cases} \]
where every subgroup is a cyclic multiplicative group.
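The cyclicity statement is easy to check by brute force for small n. This is a minimal sketch with a made-up function name; it assumes n ≥ 2 and simply tries every unit as a candidate generator.

```python
from math import gcd

def find_generator(n):
    """Return the smallest generator of U_n, or None if U_n is not cyclic.

    Brute force: compute the multiplicative order of every unit (n >= 2).
    """
    units = [v for v in range(1, n) if gcd(v, n) == 1]
    for g in units:
        x, order = g, 1
        while x != 1:
            x = (x * g) % n
            order += 1
        if order == len(units):      # g reaches every element of U_n
            return g
    return None

# n of the form 2, 4, p^k, 2p^k versus other n.
cyclic_cases = {n: find_generator(n) for n in (2, 4, 7, 9, 18)}
non_cyclic_cases = {n: find_generator(n) for n in (8, 15, 90)}
```

For n = 90, the example used on the slides, no single generator exists, which is why the general case needs one generator per cyclic factor.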
A “generator” for $U_n$ is now obtained by using the Chinese remainder theorem to resolve the isomorphism:
\[ v \bmod n \simeq (g_1^{\alpha_1} \bmod n_1,\ g_2^{\alpha_2} \bmod n_2,\ \ldots,\ g_r^{\alpha_r} \bmod n_r). \]
Now remember:
\[ \Omega_n = \left[ \omega\!\left(\frac{k \cdot z \bmod n}{n}\right) \right]_{\substack{z \in U_n \\ k \in \mathbb{Z}_n}} \]
So a permutation on the columns to group the $v \in \mathbb{Z}_n$ that are also in $U_n$ gives us a first “nested block circulant” block of size $\varphi(n) \times \varphi(n)$ (using a permutation based on the $g_1, g_2, \ldots, g_r$).
What to do with the rest of $\mathbb{Z}_n$?
Using the partition
\[ \mathbb{Z}_n = \bigcup_{d \mid n} d \, U_n, \]
we can do the trick from the previous slide for every divisor of n.
It turns out that the complete matrix $\Omega_n$ can be permuted into nested block circulant form for each divisor block with the same permutation.
An example for $n = 90 = 2 \times 5 \times 3^2$ is on the next slide...
A fast matrix-vector product is possible in O(n log(n)) (although this is a little bit complicated).
⇒ Fast construction of a good lattice rule with n composite.
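The partition of $\mathbb{Z}_n$ into divisor blocks can be verified directly for n = 90. In this sketch $d\,U_n$ is read as the set {d·v mod n : v ∈ U_n}, and the function name is made up for the illustration.

```python
from math import gcd

def divisor_blocks(n):
    """Map each divisor d of n to the set d*U_n = {d*v mod n : v in U_n}."""
    units = [v for v in range(1, n) if gcd(v, n) == 1]
    return {d: {(d * v) % n for v in units}
            for d in range(1, n + 1) if n % d == 0}

blocks = divisor_blocks(90)
union = set().union(*blocks.values())
total_size = sum(len(b) for b in blocks.values())
```

Each block collects exactly the residues k with gcd(k, n) = d, so the blocks are disjoint and together cover all of $\mathbb{Z}_n$; the block for d = 1 is $U_n$ itself and the block for d = n contains only 0.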
(Figure: the matrix $\Omega_n$ for n = 90 in nested block circulant form, with the blocks labelled by the divisors 1, 2, 3, 5, 6, 9, 10, 15, 18, 30, 45 and 90.)
Lattice sequences (Ch. 5)
Prime powers and sequences
Now take $n = p^{m_2}$, a prime power, and set
\[ P_{p^m} = \left\{ \frac{k \cdot z \bmod p^m}{p^m} : k \in \mathbb{Z}_{p^m} \right\}, \qquad 0 \le m \le m_2, \]
then the point sets are embedded
\[ P_0 \subset P_p \subset P_{p^2} \subset \cdots \subset P_{p^{m_2}}. \]
For a sequence we need:
1. each embedded point set, $P_{p^m}$, to be a good lattice rule;
2. every subsequence anchored at and spanning a power of the base,
   \[ x_{k p^m},\ x_{k p^m + 1},\ \ldots,\ x_{(k+1) p^m - 1}, \]
   to be a good lattice.
For property 1 (every embedded rule is a good rule) we just need to alter the CBC algorithm to minimize
\[ X_{m_1, m_2, s}(z) := \max_{m_1 \le m \le m_2} \frac{e_{p^m, s}(z)}{e_{p^m, s}(z^{(m)})}, \]
with $z^{(m)}$ the optimal generating vector for $n = p^m$ according to the CBC algorithm.
Suppose r is a generator for $U_p$, then
\[ g = \begin{cases} r + p & \text{if } r^{p-1} \equiv 1 \pmod{p^2}, \\ r & \text{otherwise,} \end{cases} \]
is a generator for all $U_{p^\ell}$.
If looking for the smallest generator, case 1 happens for the first time at p = 40487.
→ $\Omega_{p^{m_2}}$ is also embedded: calculating $e_{p^{m_2}, s}$ reveals all $e_{p^m, s}$, $m \le m_2$.
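The rule for lifting a generator from $U_p$ to all $U_{p^\ell}$ can be tried out on small cases. This is a sketch with made-up function names, assuming p is an odd prime; the brute-force check is only meant for small moduli. The pair p = 29, r = 14 is used because 14 generates $U_{29}$ but satisfies $14^{28} \equiv 1 \pmod{29^2}$, so the first case of the rule applies.

```python
from math import gcd

def is_generator(g, n):
    """Brute-force check that the powers of g cover all of U_n."""
    phi = sum(1 for v in range(1, n) if gcd(v, n) == 1)
    x, seen = 1, set()
    for _ in range(phi):
        seen.add(x)
        x = (x * g) % n
    return gcd(g, n) == 1 and len(seen) == phi

def lift_generator(r, p):
    """Given a generator r of U_p (p an odd prime), apply the rule from the slide."""
    return r + p if pow(r, p - 1, p * p) == 1 else r

g7 = lift_generator(3, 7)      # 3 is kept: 3^6 is not 1 modulo 49
g29 = lift_generator(14, 29)   # 14 is replaced by 14 + 29 = 43
```

Note that 14 is not the smallest generator modulo 29, so this example does not conflict with the remark about p = 40487.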
For property 2 (every subsequence is a good rule), which in effect means you can use the sequence point by point, we use
\[ x_k = \left\{ \frac{\varphi_p(k) \, z}{p^{m_2}} \right\} \quad \text{instead of} \quad x_k = \left\{ \frac{k \, z}{p^{m_2}} \right\}. \]
The $\varphi_p$ is a permutation which keeps the embedding and has the effect that the sequence consists of smaller shifted rules.
→ These are good lattice rules if they have size $\ge p^{m_1}$: by construction and since we use a shift-invariant worst-case error.
These permutations are well known for (t, s)-sequences, e.g., radical inverse and Gray ordering.
⇒ Fast construction of good lattice sequences.
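The effect of the permutation can be illustrated with one concrete choice. The sketch below assumes that $\varphi_p$ is the base-p digit reversal on $m_2$ digits (the radical inverse scaled by $p^{m_2}$); the slides only name this as one of the well-known options. The generating vector z = (1, 19) is an arbitrary example, not a vector from the thesis.

```python
from fractions import Fraction

def digit_reversal(k, p, m2):
    """Reverse the m2 base-p digits of k (assumed form of the permutation phi_p)."""
    r = 0
    for _ in range(m2):
        k, d = divmod(k, p)
        r = r * p + d
    return r

def lattice_sequence(z, p, m2):
    """Points {phi_p(k) z / p^m2} for k = 0 .. p^m2 - 1, as exact fractions."""
    n = p ** m2
    return [tuple(Fraction((digit_reversal(k, p, m2) * zj) % n, n) for zj in z)
            for k in range(n)]

z = (1, 19)                       # hypothetical generating vector, gcd(z_j, 3) = 1
seq = lattice_sequence(z, 3, 4)   # n = 3^4 = 81 points
first9 = set(seq[:9])
rule9 = {tuple(Fraction((k * zj) % 9, 9) for zj in z) for k in range(9)}
```

With this ordering the first 9 points of the 81-point sequence are exactly the points of the embedded lattice rule with 9 points, and the same holds for the first 27 points, which is the embedding property the slide refers to.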
(Figure: the permuted matrices, with the blocks labelled by the divisors 1, 3, 9, 27 and 81.)
Examples for a normal prime power case with $n = 3^4 = 81$ and the “odd” even prime power case with $n = 2^7 = 128$.
1. A lattice sequence in base 3 with 9, 27 and 81 points:
2. A (t, s)-sequence (here Faure) in base 3 with 9, 27 and 81 points:
Stopping at any point. . .
Base 3 lattice sequence with 9, 64 (not a power of 3) and 81 points:
Even more fast constructions (Ch. 6 & 7)
Other fast constructions
We have also derived fast constructions...
1. ... for polynomial lattice rules modulo an irreducible polynomial.
   Similar theory but more involved. The algebraic structure of polynomial fields allows us to find a generator, and the polynomial matrix can thus also be brought into a circulant form.
2. ... for copy rules.
   Based on Fourier analysis of the kernel it turns out that the worst-case error for copy rules has a form similar to that for a rank-1 rule.
3. ... for polynomial copy rules.
   This is a combination of the two previous constructions. Instead of an analysis in terms of a Fourier basis, the analysis is done in terms of a Walsh basis.
Outline
1. Introduction
2. Motivating example
3. Lattice rules
4. Fast constructions
5. Contributions
Contributions
Fast construction...
1. ... for a prime number of points.
2. ... for a composite number of points.
3. ... of lattice sequences.
4. ... of polynomial lattice rules modulo an irreducible polynomial.
5. ... of copy rules and polynomial copy rules.
Further we have shown:
1. Usage of a good lattice sequence as a low-discrepancy sequence.
2. Good results with the order-2 lattice sequence we published.
3. Usage of any good lattice rule as a sequence.
4. Sample Matlab/Octave code for almost all fast constructions.