Haar Wavelets

Motorcycle data, again:

[Figure: plot of accel versus times for the motorcycle data]
It is nice to have N = 2^K to fit wavelets, so since N = 133, I’ll just drop the last
five observations so that N = 128. Also, as before, I’ll take equally-spaced xi’s: x =
(1, 2, . . . , 128).
library(MASS)   # for the mcycle data
x <- 1:128
y <- mcycle[1:128,2]
The Haar wavelet X matrix is then 128 × 128. It is a simple matrix, but there are easier
and harder ways to get it. I’ll use Kronecker products, where the Kronecker product of
matrices A and B is


$$
A \otimes B = \begin{pmatrix}
a_{11}B & a_{12}B & \cdots & a_{1q}B \\
a_{21}B & a_{22}B & \cdots & a_{2q}B \\
\vdots & \vdots & \ddots & \vdots \\
a_{p1}B & a_{p2}B & \cdots & a_{pq}B
\end{pmatrix} \tag{1}
$$
Mainly we have A = Iq and B = b, a vector, so that

$$
I_q \otimes b = \begin{pmatrix}
b & 0 & \cdots & 0 \\
0 & b & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & b
\end{pmatrix} \tag{2}
$$
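As a quick illustration (this snippet is mine, not part of the construction), R’s built-in kronecker() reproduces exactly this block-diagonal pattern:

```r
# I_2 (x) b with b = c(1, -1): b repeats down the block diagonal
b <- c(1, -1)
m <- kronecker(diag(2), b)
m
#      [,1] [,2]
# [1,]    1    0
# [2,]   -1    0
# [3,]    0    1
# [4,]    0   -1
```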
Then in R, to get the W matrix:
h <- cbind(1,rep(c(1,-1),64*c(1,1)))
That gives the first two unnormalized columns: the column of 1’s, and the column with 64 +1’s
followed by 64 −1’s. Then to add the other columns:
h <- cbind(h,kronecker(diag(2),rep(c(1,-1),32*c(1,1))))
h <- cbind(h,kronecker(diag(4),rep(c(1,-1),16*c(1,1))))
h <- cbind(h,kronecker(diag(8),rep(c(1,-1),8*c(1,1))))
h <- cbind(h,kronecker(diag(16),rep(c(1,-1),4*c(1,1))))
h <- cbind(h,kronecker(diag(32),rep(c(1,-1),2*c(1,1))))
h <- cbind(h,kronecker(diag(64),rep(c(1,-1),c(1,1))))
(You could easily make a for loop, of course.) Next, normalize the columns so that they
each have length 1:
h2 <- sqrt(apply(h^2,2,sum))
w <- sweep(h,2,h2,"/")
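After normalization the columns of w are orthonormal, so w is an orthogonal matrix. A miniature (N = 4) version of the same construction makes this easy to check; the snippet below is my own sanity check, not part of the notes:

```r
# N = 4 Haar matrix built the same way: constant column, one split, two details
h4 <- cbind(1, rep(c(1, -1), c(2, 2)))
h4 <- cbind(h4, kronecker(diag(2), c(1, -1)))
w4 <- sweep(h4, 2, sqrt(colSums(h4^2)), "/")
# t(w4) %*% w4 should be the identity, up to round-off
max(abs(t(w4) %*% w4 - diag(4)))
```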
The least squares estimates θi∗’s of the θi’s and their sums of squares are then
theta <- t(w)%*%y
ss <- theta^2
plot(1:128,ss)
[Figure: plot of the sums of squares ss against index 1:128]
For a given λ, the threshold estimates (lasso) of the θi are given by

$$
\hat\theta_i = \mathrm{Sign}(\theta_i^*)\left(|\theta_i^*| - \tfrac{\lambda}{2}\right)_+ .
$$
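The same formula in code, on a few made-up coefficients (soft is my name for it; the notes just write it inline):

```r
# Soft-threshold at lambda/2: coefficients within lambda/2 of zero are killed,
# the rest are pulled toward zero by lambda/2
soft <- function(theta, lambda) sign(theta) * pmax(0, abs(theta) - lambda/2)
soft(c(-5, -1, 0, 3), lambda = 4)
# [1] -3  0  0  1
```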
We start with λ = 2σe √(2 log(N)), but we need an estimate of σe². Some possibilities: the
mean of the sums of squares corresponding to the highest-level wavelets; the pairwise
variances, as before (successive differences yi+1 − yi have variance about 2σe² when the
underlying signal is locally smooth, hence the divisor 127 × 2); or the mean of the smallest
100 sums of squares:
mean(ss[65:128])
[1] 493.3084
sum((y[-1]-y[-128])^2)/(127*2)
[1] 549.0531
mean(sort(ss)[1:100])
[1] 225.6335
So, e.g.,
lambda <- 2*sqrt(493.3084*2*log(128))
thetahat <- sign(theta)*pmax(0,abs(theta)-lambda/2)
plot(thetahat)
plot(x,y)
lines(x,w%*%thetahat)
(In R, z+ is pmax(0,z).)

[Figure: y versus x with the thresholded wavelet fit overlaid]
It looks like the fit is reasonable, except that it does not go down far enough in the
middle. So we’ll unshrink the nonzero θi∗’s:
i <- (1:128)[thetahat!=0]
plot(x,y)
lines(x,w[,i]%*%theta[i])
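The column-selection step above is the same as zeroing out the shrunk coefficients and using the full basis. A tiny self-contained check with a 2-point Haar basis (my own toy example, not the mcycle data):

```r
# With any orthogonal basis W: W[, keep] %*% theta[keep] == W %*% (theta * keep)
W <- cbind(c(1, 1), c(1, -1)) / sqrt(2)   # 2-point Haar basis
theta <- c(3, -2)
keep <- c(TRUE, FALSE)                    # pretend the 2nd term was shrunk to 0
W[, keep, drop = FALSE] %*% theta[keep] - W %*% (theta * keep)  # zero
```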
[Figure: y versus x with the unshrunk wavelet fit overlaid]
Better? Try a smoother one...
lambda <- 2*sqrt(549.0531*2*log(128))
thetahat <- sign(theta)*pmax(0,abs(theta)-lambda/2)
plot(x,y)
lines(x,w%*%thetahat)
i <- (1:128)[thetahat!=0]
plot(x,y)
lines(x,w[,i]%*%theta[i])
[Figure: y versus x with the thresholded fit for the larger λ]

[Figure: y versus x with the corresponding unshrunk fit]
And a rougher one...
lambda <- 2*sqrt(225.6335*2*log(128))
thetahat <- sign(theta)*pmax(0,abs(theta)-lambda/2)
plot(x,y)
lines(x,w%*%thetahat)
i <- (1:128)[thetahat!=0]
plot(x,y)
lines(x,w[,i]%*%theta[i])
[Figure: y versus x with the thresholded fit for the smaller λ]

[Figure: y versus x with the corresponding unshrunk fit]
Whether you like a smoother or rougher one, at least here it looks better not to shrink
the nonzero parameter estimates.
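The whole recipe can be pulled into one function. This is a sketch under the notes’ setup; haar_fit, its arguments, and the power-of-2 assumption on length(y) are mine:

```r
# Haar-wavelet fit with soft thresholding and optional unshrinking.
# Assumes length(y) = 2^K; sigma2 is the chosen error-variance estimate.
haar_fit <- function(y, sigma2, unshrink = TRUE) {
  N <- length(y)
  h <- matrix(1, N, 1)                       # constant column
  k <- 1
  while (k < N) {                            # add the level-k detail columns
    h <- cbind(h, kronecker(diag(k), rep(c(1, -1), c(N/(2*k), N/(2*k)))))
    k <- 2 * k
  }
  w <- sweep(h, 2, sqrt(colSums(h^2)), "/")  # orthonormalize columns
  theta <- t(w) %*% y                        # least squares coefficients
  lambda <- 2 * sqrt(sigma2 * 2 * log(N))
  thetahat <- sign(theta) * pmax(0, abs(theta) - lambda/2)
  if (unshrink) thetahat <- theta * (thetahat != 0)
  drop(w %*% thetahat)                       # fitted values
}
```

With sigma2 = 0 nothing is thresholded and the fit interpolates the data exactly, which makes a handy correctness check.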