Universality of random matrices: Local semicircle law

Universality for Random Matrices: The local semicircle law and the Green function comparison theorem
László Erdős
Ludwig-Maximilians-Universität, Munich, Germany
AIM, Palo Alto, December, 2010
Joint with B. Schlein, H.T. Yau, and J. Yin
GENERALIZED WIGNER ENSEMBLES
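$H = (h_{kj})_{1 \le k,j \le N}$, $\bar h_{ji} = h_{ij}$ independent,
$$\mathbb{E} h_{ij} = 0, \qquad \mathbb{E}|h_{ij}|^2 = \sigma_{ij}^2$$
$$\sum_i \sigma_{ij}^2 = 1, \qquad \frac{c}{N} \le \sigma_{ij}^2 \le \frac{C}{N} \qquad \text{(A)}$$
sub-exponential decay:
$$\mathbb{E}\, e^{|\sqrt{N} h_{ij}|^{c}} < \infty \qquad \text{(B)}$$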
If the $h_{ij}$ are i.i.d., then $H$ is called a Wigner ensemble.
No explicit formula for the eigenvalues (the ensemble is not invariant).
Universality conjecture (Dyson, Wigner, Mehta, etc.): If the $h_{ij}$ are independent, then the local eigenvalue statistics are the same as for the Gaussian ensembles.
Theorem [E-Schlein-Yau-Yin, 2009-2010] The bulk universality holds
for generalized Wigner ensembles satisfying (A) and (B), i.e., for
$-2 < E < 2$, $b = N^{-c}$, $c > 0$,
$$\lim_{N\to\infty} \int_{E-b}^{E+b} \frac{dE'}{2b}\, \Big( p^{(k)}_{F,N} - p^{(k)}_{\mu,N} \Big)\Big( E' + \frac{b_1}{N}, \dots, E' + \frac{b_k}{N} \Big) = 0 \qquad \text{weakly}$$
F                                     µ
generalized symmetric matrices        GOE
generalized hermitian                 GUE
generalized self-dual quaternion      GSE
real covariance                       real Gaussian Wishart
complex covariance                    complex Gaussian Wishart
Variances can vary in this theorem.
KEY STEPS IN OUR PROOF
Step 1. Good local semicircle law including a control near the edge.
Method: System of self-consistent equations for the Green function,
control the error by large deviation methods.
Step 2. Universality for Wigner matrices with a small ($\sim N^{-\varepsilon}$) Gaussian component.
Method: Modify DBM to speed up its local relaxation, then show
that the modification is irrelevant for statistics involving differences
of eigenvalues.
Step 3. Universality for arbitrary Wigner matrices.
Method: Remove the small Gaussian component in Step 2 by resolvent perturbation theory and moment matching.
SEMICIRCLE LAW (WIGNER 1955)
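$\mathcal{N}[I]$ := number of eigenvalues in the interval $I$
$$\varrho_{sc}(E) := \frac{1}{2\pi}\sqrt{(4 - E^2)_+}$$
For any $\delta > 0$ and $|E| \le 2$
$$\lim_{\eta\to 0}\,\lim_{N\to\infty} \mathbb{P}\left\{ \left| \frac{\mathcal{N}[E - \frac{\eta}{2}, E + \frac{\eta}{2}]}{N\eta} - \varrho_{sc}(E) \right| \ge \delta \right\} = 0$$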
Remark 1: Semicircle is independent of the distribution of entries
Remark 2: Wigner’s result concerns the macroscopic density, i.e.
density in intervals containing order N eigenvalues.
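The macroscopic statement can be checked numerically. The following is a minimal sketch, assuming Gaussian entries (an illustrative choice; the semicircle law does not depend on the entry distribution) and hypothetical helper names `wigner_matrix`, `rho_sc` and `empirical_density`. It compares the empirical density on a window of fixed size with $\varrho_{sc}(E)$.

```python
import numpy as np

def wigner_matrix(n, rng):
    """Real symmetric matrix with centered Gaussian entries of variance 1/n off the diagonal."""
    a = rng.standard_normal((n, n))
    return (a + a.T) / np.sqrt(2 * n)

def rho_sc(x):
    """Semicircle density (1 / (2 pi)) * sqrt((4 - x^2)_+)."""
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.maximum(4.0 - x ** 2, 0.0)) / (2.0 * np.pi)

def empirical_density(eigs, energy, eta):
    """N[E - eta/2, E + eta/2] / (N * eta) for a given spectrum."""
    count = np.count_nonzero(np.abs(eigs - energy) <= eta / 2)
    return count / (len(eigs) * eta)

rng = np.random.default_rng(0)
n = 1000
eigs = np.linalg.eigvalsh(wigner_matrix(n, rng))

eta = 0.5  # macroscopic window: it contains order N eigenvalues
for energy in (-1.0, 0.0, 1.0):
    print(energy, empirical_density(eigs, energy, eta), float(rho_sc(energy)))
```

Shrinking `eta` towards $1/N$ in this sketch is exactly the regime addressed by the local semicircle law.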
HISTORICAL DETOUR
We subsequently improved the local semicircle law in several papers, focusing on shrinking the window. The following prototype result is optimal away from the edge:
Local Semicircle Law. [E-Schlein-Yau, 2007-08]
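Let $\mathcal{N}[I] := \#\{\lambda_n \in I\}$ be the number of eigenvalues in $I \subset \mathbb{R}$. Then
$$\mathbb{P}\left\{ \left| \frac{\mathcal{N}[E - \frac{K}{2N}, E + \frac{K}{2N}]}{K} - \varrho_{sc}(E) \right| \ge \delta \right\} \le C e^{-c\delta\sqrt{K}}$$
for all $|E| < 2$ and $K > 0$, uniformly in $N \ge N_0(E, \delta)$, and the eigenvectors are completely delocalized.
Previous results only down to scale $|I| \gg N^{-1/2}$.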
Method: Green function + Large deviation
Local semicircle law is always the starting point of any analysis.
Step 1: LOCAL SEMICIRCLE LAW
Green function:
$$G_{ij} = \frac{1}{H - z}(i,j), \qquad m(z) = \frac{1}{N}\operatorname{Tr} G = \frac{1}{N}\sum_i G_{ii}$$
Let $m_{sc}$ be the Stieltjes transform of the semicircle measure, i.e.,
$$m_{sc}(z) = \int \frac{\varrho_{sc}(x)\,dx}{x - z}, \qquad \varrho_{sc}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}$$
Theorem [E-Yau-Yin, 2010] Suppose assumptions (A) and (B) hold. For any $z = E + i\eta$ with $\eta \gtrsim N^{-1}$ the following holds with exponentially high probability:
$$\max_i |G_{ii}(z) - m_{sc}(z)| + \max_{i \ne j} |G_{ij}(z)| \lesssim \frac{1}{\sqrt{N\eta}}$$
$$|m(z) - m_{sc}(z)| \lesssim \frac{1}{N\eta}$$
where $\lesssim$ means up to factors of some power of $\log N$. Estimates are optimal.
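A numerical illustration of the two error scales, as a minimal sketch: it assumes a GOE-type matrix (Gaussian entries are an assumption made here for convenience) and uses the hypothetical helpers `m_sc` and `local_law_errors`. The diagonal Green function entries are obtained from the spectral decomposition $G_{ii} = \sum_\alpha |v_\alpha(i)|^2 / (\lambda_\alpha - z)$.

```python
import numpy as np

def m_sc(z):
    """Stieltjes transform of the semicircle law: root of m^2 + z*m + 1 = 0 with Im m > 0."""
    root = np.sqrt(z * z - 4 + 0j)
    m = (-z + root) / 2
    return m if m.imag > 0 else (-z - root) / 2

rng = np.random.default_rng(1)
n = 1000
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)      # variance 1/n off the diagonal
lam, vecs = np.linalg.eigh(h)

def local_law_errors(energy, eta):
    """Return (|m - m_sc|, max_i |G_ii - m_sc|) at z = energy + i*eta."""
    z = energy + 1j * eta
    g_diag = (vecs ** 2) @ (1.0 / (lam - z))   # G_ii from eigenvalues and eigenvectors
    return abs(g_diag.mean() - m_sc(z)), np.max(np.abs(g_diag - m_sc(z)))

for eta in (1.0, 0.1, 0.01):
    err_avg, err_diag = local_law_errors(0.5, eta)
    print(f"eta={eta:5.2f}  N*eta={n * eta:7.1f}  "
          f"|m-m_sc|={err_avg:.4f}  max|G_ii-m_sc|={err_diag:.4f}")
```

In such runs the averaged error is expected to be visibly smaller than the entrywise one, in line with the $1/(N\eta)$ versus $1/\sqrt{N\eta}$ scaling; a single sample is of course only indicative.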
Corollary 1. [Complete eigenvector delocalization]
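With very high probability, the $L^2$-normalized eigenvectors satisfy
$$\sup_\alpha \|v_\alpha\|_\infty^2 \lesssim \frac{1}{N}.$$
No concept of absolutely continuous spectrum, unlike for random Schrödinger operators, but this is a signature of the delocalized regime.
Proof: Taking $E = \lambda_\alpha$ and keeping only one term of the sum,
$$\operatorname{Im} G_{ii}(z) = \sum_\beta \frac{\eta\, |v_\beta(i)|^2}{(E - \lambda_\beta)^2 + \eta^2} \gtrsim \eta^{-1} |v_\alpha(i)|^2,$$
i.e.
$$|v_\alpha(i)|^2 \lesssim \eta \sim \frac{1}{N}$$
if $G_{ii}$ is controlled on scale $\eta$.
In fact, the eigenvector estimate follows from $|m(z)| \lesssim O(1)$ as well (our earlier proof, E-Schlein-Yau [2008]).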
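Delocalization is easy to observe numerically. A minimal sketch, assuming a Gaussian (GOE-type) sample, which is an illustrative choice: it computes $N \cdot \sup_\alpha \|v_\alpha\|_\infty^2$, a quantity that would be of order $N$ for a localized eigenvector and stays of logarithmic size here.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)

_, vecs = np.linalg.eigh(h)          # columns are L2-normalized eigenvectors v_alpha
sup_norm_sq = np.max(vecs ** 2)      # sup over alpha of ||v_alpha||_infinity^2

# The scaled quantity is at least 1 and equals N for a vector supported on one site.
scaled = n * sup_norm_sq
print(f"N * sup_alpha ||v_alpha||_inf^2 = {scaled:.1f}   (log N = {np.log(n):.1f})")
```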
Corollary 2. [Rigidity of Eigenvalues] Let $\gamma_j$ be the classical location of the $j$-th eigenvalue, i.e. $\int_{-\infty}^{\gamma_j} \varrho_{sc}(x)\,dx = \frac{j}{N}$. Then
$$\forall\, j \le N/2: \quad |\lambda_j - \gamma_j| \lesssim j^{-1/3} N^{-2/3} \qquad (1)$$
with a very high probability. Optimal up to $(\log N)$-factors.
Previous results:
• For GUE: $\lambda_j - \gamma_j \sim \frac{\sqrt{\log N}}{N}$ [Gustavsson]
• For Wigner matrices with vanishing third moment, [Tao-Vu] proved (1) in the bulk ($\varepsilon N \le j \le (1 - \varepsilon)N$) and also proved
$$\left[ \mathbb{E}\, |\lambda_j - \gamma_j|^2 \right]^{1/2} \le j^{-1/3} N^{-1/6 - \varepsilon_0}$$
with some small positive $\varepsilon_0$.
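The rigidity bound (1) can be explored numerically. The sketch below makes some assumptions of its own: a Gaussian sample, the closed-form semicircle distribution function, and an inverse obtained by interpolation on a grid (the helper names `semicircle_cdf` and `classical_locations` are hypothetical). For indices above $N/2$ the distance to the upper edge is used, by symmetry.

```python
import numpy as np

def semicircle_cdf(x):
    """Distribution function of the semicircle law on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x ** 2) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

def classical_locations(n, grid_size=200001):
    """gamma_j with CDF(gamma_j) = j/n, by interpolating the inverse CDF on a fine grid."""
    grid = np.linspace(-2.0, 2.0, grid_size)
    return np.interp(np.arange(1, n + 1) / n, semicircle_cdf(grid), grid)

rng = np.random.default_rng(3)
n = 1000
a = rng.standard_normal((n, n))
lam = np.linalg.eigvalsh((a + a.T) / np.sqrt(2 * n))   # sorted increasingly
gamma = classical_locations(n)

j = np.arange(1, n + 1)
j_hat = np.minimum(j, n - j + 1)                       # index distance from the nearer edge
rigidity_scale = j_hat ** (-1.0 / 3.0) * n ** (-2.0 / 3.0)
ratio = np.abs(lam - gamma) / rigidity_scale

print("max |lambda_j - gamma_j|       :", np.max(np.abs(lam - gamma)))
print("max ratio to j^(-1/3) N^(-2/3) :", np.max(ratio))
```

If rigidity holds, the printed ratio should stay of moderate size (logarithmic factors) rather than grow like a power of $N$.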
From Stieltjes transform to counting function
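Let $\varrho(\lambda)\,d\lambda$ be a signed measure on a compact interval in $\mathbb{R}$, say $[-3, 3]$, and suppose its Stieltjes transform satisfies
$$|\operatorname{Im} m(x + iy)| \le \frac{U}{Ny}, \qquad y > 0, \quad |x| + y \le 10$$
Then for any $E_1, E_2 \in [-3, 3]$ we have
$$\left| \int f_{E_1,E_2,\eta}(\lambda)\, \varrho(\lambda)\, d\lambda \right| \le \frac{CU\, |\log \eta|}{N}$$
where $f_{E_1,E_2,\eta}$ is a smoothed characteristic function of $[E_1, E_2]$ on scale $\eta$.
Helffer-Sjöstrand formula: Let $f \in C^1(\mathbb{R})$. Let $\chi(y)$ be a cutoff function in $[-1, 1]$. Define the quasianalytic extension of $f$ as
$$\tilde f(x + iy) = (f(x) + iy f'(x))\chi(y),$$
then
$$f(\lambda) = \frac{1}{2\pi}\int_{\mathbb{R}^2} \frac{\partial_{\bar z} \tilde f(x + iy)}{\lambda - x - iy}\,dx\,dy = \frac{1}{2\pi}\int_{\mathbb{R}^2} \frac{iy f''(x)\chi(y) + i(f(x) + iy f'(x))\chi'(y)}{\lambda - x - iy}\,dx\,dy$$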
Detour: band matrices [E-Yau-Yin, 2010]
Consider a band matrix with band-width $W$, i.e.
$$\mathbb{E} h_{ij} = 0, \qquad \mathbb{E}|h_{ij}|^2 = \sigma_{ij}^2 = W^{-1} f(|i - j|/W)$$
Assuming subexponential decay, we have ($z = E + i\eta$)
$$|G_{ij}| + |G_{ii}(z) - m_{sc}(z)| \lesssim \frac{1}{\sqrt{W\eta}}$$
and
$$|m(z) - m_{sc}(z)| \lesssim \frac{1}{W\eta}$$
away from the edge.
I.e. our control holds down to the scale $\eta \sim W^{-1}$.
The localization length of the eigenfunctions is at least of order $W$.
Proof of the local semicircle law
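$H^{(i)}$ denotes the $(N-1)\times(N-1)$ minor of $H$ after removing the $i$-th column/row, and similarly for $H^{(ij)}$ etc. Let $a_i$ be the $i$-th column.
The Green functions are:
$$G = \frac{1}{H - z}, \qquad G^{(i)} = \frac{1}{H^{(i)} - z}$$
Then from Schur decomposition we have
$$G_{ii} = \left( \frac{1}{H - z} \right)_{ii} = \frac{1}{h_{ii} - z - a_i \cdot G^{(i)} a_i}$$
Schur Decomposition of an $(n + m) \times (n + m)$-matrix
$$D := \begin{pmatrix} A & B^* \\ B & C \end{pmatrix}, \quad \widehat{D} := A - B^* C^{-1} B \quad \Longrightarrow \quad (D^{-1})_{ij} = (\widehat{D}^{-1})_{ij}, \quad \forall\, 1 \le i, j \le n$$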
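The Schur formula for $G_{ii}$ can be verified directly on a small sample. This is a minimal sketch under assumptions made here: a real symmetric Gaussian matrix, an arbitrary index `i`, and $a_i$ taken as the $i$-th column with the diagonal entry $h_{ii}$ removed.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)
z = 0.3 + 0.05j

g = np.linalg.inv(h - z * np.eye(n))                    # full Green function G

i = 7                                                   # arbitrary index
keep = np.delete(np.arange(n), i)
h_minor = h[np.ix_(keep, keep)]                         # H^(i): row and column i removed
g_minor = np.linalg.inv(h_minor - z * np.eye(n - 1))    # G^(i)
a_i = h[keep, i]                                        # i-th column without h_ii

schur = 1.0 / (h[i, i] - z - a_i @ g_minor @ a_i)
print(abs(g[i, i] - schur))                             # agreement up to rounding errors
```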
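With the notation $K_{ij} = h_{ij} - z\delta_{ij} - a_i \cdot G^{(ij)} a_j$ we also have
$$G_{ij} = -G_{jj}\, G^{(j)}_{ii}\, K_{ij}$$
Moreover, we have a different set of formulas ($i, j, k$ different)
$$G_{ii} = G^{(j)}_{ii} + \frac{G_{ij} G_{ji}}{G_{jj}}, \qquad G_{ij} = G^{(k)}_{ij} + \frac{G_{ik} G_{kj}}{G_{kk}}$$
To see it for $2 \times 2$:
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad G = M^{-1} = \frac{1}{\Delta}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad \text{with } \Delta = ad - bc,$$
so the first identity ($i = 1$, $j = 2$) is
$$\frac{d}{\Delta} = \frac{1}{a} + \frac{\frac{c}{\Delta}\,\frac{b}{\Delta}}{\frac{a}{\Delta}}.$$
Rule of thumb: $G_{ii} \sim O(1)$, $G_{ij}$ is small.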
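The three resolvent identities above can be checked numerically beyond the $2 \times 2$ case. A minimal sketch, assuming a real symmetric Gaussian sample; the helper `resolvent` is hypothetical and pads the Green function of a minor with zeros so that the original indices can be used.

```python
import numpy as np

def resolvent(h, z, removed=()):
    """Green function of the minor of h with the rows/columns in `removed` deleted.

    Returned as an n x n array, zero at the removed indices, so that entries
    can be addressed with the indices of the full matrix."""
    n = h.shape[0]
    keep = np.array([k for k in range(n) if k not in removed])
    g = np.zeros((n, n), dtype=complex)
    g[np.ix_(keep, keep)] = np.linalg.inv(h[np.ix_(keep, keep)] - z * np.eye(len(keep)))
    return g

rng = np.random.default_rng(5)
n = 30
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)
z = 0.4 + 0.1j
i, j, k = 2, 5, 9                                   # three different indices

g = resolvent(h, z)
g_j = resolvent(h, z, (j,))
g_k = resolvent(h, z, (k,))
g_ij = resolvent(h, z, (i, j))

# K_ij = h_ij - a_i . G^(ij) a_j for i != j (the zero padding removes rows i, j).
k_ij = h[i, j] - h[:, i] @ g_ij @ h[:, j]

err_offdiag = abs(g[i, j] + g[j, j] * g_j[i, i] * k_ij)
err_diag = abs(g[i, i] - (g_j[i, i] + g[i, j] * g[j, i] / g[j, j]))
err_minor = abs(g[i, j] - (g_k[i, j] + g[i, k] * g[k, j] / g[k, k]))
print(err_offdiag, err_diag, err_minor)
```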
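Crudest local semicircle law (for $\sigma_{ij}^2 = \frac{1}{N}$)
$$G_{ii} = \frac{1}{h_{ii} - z - \mathbb{E}_i\, a_i \cdot G^{(i)} a_i - Z_i}$$
with
$$Z_i := a_i \cdot G^{(i)} a_i - \mathbb{E}_i\, a_i \cdot G^{(i)} a_i$$
By a simple computation
$$\mathbb{E}_i\, a_i \cdot G^{(i)} a_i = \frac{N - 1}{N}\, m^{(i)},$$
and by the interlacing property of the eigenvalues of $H$ and $H^{(i)}$,
$$|m(z) - m^{(i)}(z)| \le \frac{1}{N\eta}, \qquad \eta = \operatorname{Im} z$$
Putting all together
$$G_{ii} = \frac{1}{-z - m(z) + \Omega_i}, \qquad \Omega_i = h_{ii} - Z_i + O\left( \frac{1}{N\eta} \right)$$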
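Expand and sum up, with $\Omega = \max_i |\Omega_i| \ll 1$:
$$m(z) = -\frac{1}{z + m(z)} + O(\Omega)$$
By the stability analysis of this equation, and $m_{sc} = -\frac{1}{z + m_{sc}}$,
$$|m - m_{sc}| \le \frac{C\Omega}{\sqrt{\Omega + \kappa}}, \qquad \kappa := \big| |E| - 2 \big|$$
By the large deviation bound on $Z_i$, we have $\Omega \lesssim (N\eta)^{-1/2}$, thus
$$|m - m_{sc}| \lesssim \min\left\{ \frac{1}{\sqrt{N\eta\kappa}},\ \frac{1}{(N\eta)^{1/4}} \right\}$$
It can be improved in various directions (removing the edge dependence and getting the correct $(N\eta)$-power).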
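The stability statement can be illustrated on the scalar equation itself. The sketch below is an illustration under its own assumptions, not the stability argument of the proof: it solves $m = -1/(z + m)$ by fixed-point iteration, then perturbs the equation by a constant $\Omega$ (a stand-in for the random error term) and compares the response in the bulk, where it is of order $\Omega$, with the response at the edge, where it is of order $\sqrt{\Omega}$. The helper names are hypothetical.

```python
import numpy as np

def m_sc(z):
    """Root of m^2 + z*m + 1 = 0 with positive imaginary part."""
    root = np.sqrt(z * z - 4 + 0j)
    m = (-z + root) / 2
    return m if m.imag > 0 else (-z - root) / 2

def iterate_self_consistent(z, steps=200):
    """Fixed-point iteration m -> -1/(z + m), started in the upper half-plane."""
    m = 1j
    for _ in range(steps):
        m = -1.0 / (z + m)
    return m

def perturbed_solution(z, omega):
    """Root of m + 1/(z + m) = omega that lies closest to m_sc(z).

    Multiplying out gives the quadratic m^2 + (z - omega) m + (1 - omega z) = 0."""
    roots = np.roots([1.0, z - omega, 1.0 - omega * z])
    return roots[np.argmin(np.abs(roots - m_sc(z)))]

z_test = 0.5 + 0.5j
fixed_point_error = abs(iterate_self_consistent(z_test) - m_sc(z_test))

omega = 1e-4
z_bulk, z_edge = 0.0 + 0.01j, 2.0 + 1e-6j
bulk_dev = abs(perturbed_solution(z_bulk, omega) - m_sc(z_bulk))   # expected ~ omega
edge_dev = abs(perturbed_solution(z_edge, omega) - m_sc(z_edge))   # expected ~ sqrt(omega)
print(fixed_point_error, bulk_dev, edge_dev)
```

The contrast between the two deviations is the content of the factor $1/\sqrt{\Omega + \kappa}$ in the bound above.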
Some large deviation results
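Let $a_i$ ($1 \le i \le N$) be independent complex random variables with mean zero, variance $\sigma^2$ and having a uniform subexponential decay
$$\mathbb{P}(|a_i| \ge x\sigma) \le \vartheta^{-1} \exp\left( -x^{\vartheta} \right), \qquad \forall\, x \ge 1,$$
with some $\vartheta > 0$. Let $A_i, B_{ij} \in \mathbb{C}$ ($1 \le i, j \le N$). Then there exists a $0 < \varphi < 1$, depending on $\vartheta$, such that for any $\xi > 0$
$$\mathbb{P}\left\{ \left| \sum_{i=1}^N a_i A_i \right| \ge (\log N)^{\xi}\, \sigma \sqrt{\sum_i |A_i|^2} \right\} \le C \exp\left[ -(\log N)^{\xi\varphi} \right],$$
$$\mathbb{P}\left\{ \left| \sum_{i=1}^N \bar a_i B_{ii} a_i - \sum_{i=1}^N \sigma^2 B_{ii} \right| \ge (\log N)^{\xi}\, \sigma^2 \sqrt{\sum_i |B_{ii}|^2} \right\} \le C \exp\left[ -(\log N)^{\xi\varphi} \right],$$
$$\mathbb{P}\left\{ \left| \sum_{i \ne j} \bar a_i B_{ij} a_j \right| \ge (\log N)^{\xi}\, \sigma^2 \sqrt{\sum_{i \ne j} |B_{ij}|^2} \right\} \le C \exp\left[ -(\log N)^{\xi\varphi} \right].$$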
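A small Monte Carlo experiment illustrates the scales in these bounds. It is only a sketch under assumptions chosen here: real Laplace-distributed $a_i$ (one example of subexponential decay), arbitrary Gaussian coefficients $A_i$ and $B_{ij}$, and the value $\xi = 1.5$. It measures how often the linear and the off-diagonal quadratic forms exceed $(\log N)^{\xi}$ times their natural size.

```python
import numpy as np

rng = np.random.default_rng(6)
n, samples, xi = 200, 5000, 1.5
sigma = 1.0

coeff = rng.standard_normal(n)                 # fixed coefficients A_i (arbitrary choice)
b = rng.standard_normal((n, n))
np.fill_diagonal(b, 0.0)                       # keep only the off-diagonal B_ij, i != j

# Laplace variables: mean zero, variance sigma^2, subexponential tails.
a = rng.laplace(scale=sigma / np.sqrt(2.0), size=(samples, n))

linear = a @ coeff                             # sum_i a_i A_i, one value per sample
quadratic = np.sum((a @ b) * a, axis=1)        # sum_{i != j} a_i B_ij a_j

threshold = np.log(n) ** xi
linear_ratio = np.abs(linear) / (sigma * np.linalg.norm(coeff))
quadratic_ratio = np.abs(quadratic) / (sigma ** 2 * np.linalg.norm(b))   # Frobenius norm

print("threshold (log N)^xi         :", threshold)
print("largest linear ratio         :", linear_ratio.max())
print("largest quadratic ratio      :", quadratic_ratio.max())
print("fraction above the threshold :",
      np.mean(linear_ratio > threshold), np.mean(quadratic_ratio > threshold))
```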
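For example, for $Z_i := a_i \cdot G^{(i)} a_i - \mathbb{E}_i\, a_i \cdot G^{(i)} a_i$, we have
$$|Z_i| \lesssim \sqrt{\sum_{j,k \ne i} \left| \sigma_{ij} G^{(i)}_{jk} \sigma_{ki} \right|^2} \lesssim \frac{1}{\sqrt{N\eta}}$$
since
$$\sum_{j,k \ne i} \left| \sigma_{ij} G^{(i)}_{jk} \sigma_{ki} \right|^2 \le \frac{1}{N^2}\sum_j \left( |G^{(i)}|^2 \right)_{jj} = \frac{1}{N\eta}\,\frac{1}{N}\sum_j \operatorname{Im} G^{(i)}_{jj} \le \frac{C}{N\eta}$$
using
$$\sum_k |G_{jk}|^2 = \sum_k G_{jk} G^*_{kj} = (GG^*)_{jj} = \frac{1}{\eta} \operatorname{Im} G_{jj}$$
(Ward identity), and similarly
$$Z_{ij} = a_i \cdot G^{(ij)} a_j \lesssim \frac{1}{\sqrt{N\eta}}$$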
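The Ward identity is an exact algebraic fact and can be confirmed on any sample. A minimal sketch, assuming a real symmetric Gaussian matrix and an arbitrary spectral parameter:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)

eta = 0.05
z = 0.3 + 1j * eta
g = np.linalg.inv(h - z * np.eye(n))

lhs = np.sum(np.abs(g) ** 2, axis=1)     # sum_k |G_jk|^2 for every j
rhs = g.diagonal().imag / eta            # Im G_jj / eta
ward_error = np.max(np.abs(lhs - rhs))
print(ward_error)                        # zero up to rounding errors
```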
Improved local semicircle law. I: Bound on matrix elements
Self-consistent equation for the diagonal elements:
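$$G_{ii} = \frac{1}{-z - \sum_j \sigma_{ij}^2 G_{jj} + Y_i}$$
$$Y_i := h_{ii} + A_i - Z_i$$
$$A_i := \sigma_{ii}^2 G_{ii} + \sum_{j \ne i} \sigma_{ij}^2\, \frac{G_{ji} G_{ij}}{G_{ii}}$$
$$Z_i := a_i \cdot G^{(i)} a_i - \mathbb{E}_i\, a_i \cdot G^{(i)} a_i$$
Neglect $h_{ii}$ and $A_i$, focus on $Y_i \sim Z_i$.
Further key quantities
$$\Lambda_d := \max_k |G_{kk} - m_{sc}|, \qquad \Lambda_o := \max_{j \ne k} |G_{jk}|, \qquad \Lambda := |m - m_{sc}|$$
(they all depend on $z = E + i\eta$)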
Dichotomy lemma
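Away from the spectral edge (for simplicity), we have that
IF
$$\Lambda_o + \Lambda_d \le \frac{1}{(\log N)^2},$$
THEN
$$\Lambda_o + \Lambda_d \lesssim \frac{1}{\sqrt{N\eta}}$$
with high probability.
Proof. For the offdiagonal part, use
$$|G_{ij}| = |G_{ii}|\, |G^{(i)}_{jj}|\, |K_{ij}| \le C(|h_{ij}| + |Z_{ij}|) \lesssim \frac{1}{\sqrt{N\eta}}$$
For the diagonal part, we will use the self-consistent equation and
$$|Y_i| \lesssim |Z_i| \lesssim \frac{1}{\sqrt{N\eta}}$$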
Expansion of the self-consistent equation
With the notation $v_i := G_{ii} - m_{sc}$, we have
$$v_i = \frac{1}{-z - m_{sc} - \left( \sum_j \sigma_{ij}^2 v_j - Y_i \right)} - m_{sc}$$
We know that $|z + m_{sc}| \ge c$ and by the assumption $(\dots) \ll 1$, so we can expand
$$v_i = m_{sc}^2 \left( \sum_j \sigma_{ij}^2 v_j - Y_i \right) + O\left( \sum_j \sigma_{ij}^2 v_j - Y_i \right)^2$$
using $m_{sc} + \frac{1}{z + m_{sc}} = 0$. (A more precise expansion is possible; this is the key to further improvements.) Express $v_i$ as
$$v_i = -\sum_j \left( \frac{1}{1 - m_{sc}^2 \Sigma} \right)_{ij} m_{sc}^2\, Y_j + O(\Lambda_d^2)$$
Use that the $\Sigma = (\sigma_{ij}^2)$ matrix has a gap in the spectrum (the trivial eigendirection is directly estimated).
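The expansion step can be written out explicitly. The following worked computation is a sketch that introduces the shorthand $X_i := \sum_j \sigma_{ij}^2 v_j - Y_i$ (a name used only here) and relies on the identity $-z - m_{sc} = 1/m_{sc}$, which is the equation $m_{sc} + \frac{1}{z + m_{sc}} = 0$ rearranged.

```latex
\begin{align*}
v_i &= \frac{1}{-z - m_{sc} - X_i} - m_{sc}
     = \frac{1}{\frac{1}{m_{sc}} - X_i} - m_{sc}
     = \frac{m_{sc}}{1 - m_{sc} X_i} - m_{sc} \\
    &= m_{sc}^2 X_i + m_{sc}^3 X_i^2 + \dots
     = m_{sc}^2 X_i + O(X_i^2), \qquad |m_{sc} X_i| \ll 1 . \\
\intertext{Inserting $X_i = \sum_j \sigma_{ij}^2 v_j - Y_i$ and collecting the terms linear in $v$ gives, in vector form,}
\big( 1 - m_{sc}^2 \Sigma \big) v &= - m_{sc}^2\, Y + O(X^2),
\qquad \Sigma = (\sigma_{ij}^2),
\end{align*}
```

Inverting $1 - m_{sc}^2 \Sigma$ then leads to the expression for $v_i$ in terms of $Y_j$.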
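$$\Longrightarrow \quad \Lambda_d \le C(\Lambda_d^2 + Y)(\log N), \qquad Y = \max_j |Y_j|$$
which yields the quadratic dichotomy:
If $\Lambda_d \ll (\log N)^{-1}$, then $\Lambda_d \le C(\log N)\, Y \lesssim \frac{1}{\sqrt{N\eta}}$.
Use a continuity argument and a trivial expansion for $\eta = 10$ to get $\Lambda_d \lesssim \frac{1}{\sqrt{N\eta}}$ for $\eta \gtrsim 1/N$.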
Improved local semicircle law II.
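The key to improving the previous argument for the average of $G_{ii}$ is to exploit a CLT-like cancellation in
$$\frac{1}{N}\sum_i Y_i, \qquad \text{i.e., mainly in} \qquad \frac{1}{N}\sum_i Z_i, \qquad Z_i = a_i \cdot G^{(i)} a_i - \mathbb{E}_i\, a_i \cdot G^{(i)} a_i$$
Here $Z_i$ and $Z_j$ are not independent, but "almost".
Lemma
$$\mathbb{E}\left| \frac{1}{N}\sum_i Z_i \right|^p \le p^{Cp}\, \mathbb{E}\left[ \Lambda_o^{2p} + N^{-p} \right]$$
Using $\Lambda_o \le (N\eta)^{-1/2}$, $p \sim \log N$, we get
$$\mathbb{E}\left| \frac{1}{N}\sum_i Z_i \right|^p \le p^{Cp} (N\eta)^{-p} \quad \Longrightarrow \quad \left| \frac{1}{N}\sum_i Z_i \right| \lesssim (N\eta)^{-1}$$
The estimate improved from $(N\eta)^{-1/2}$, giving $|m - m_{sc}| \lesssim (N\eta)^{-1}$.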
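The cancellation in the average of the $Z_i$ can be observed on a single sample. This sketch assumes a Gaussian matrix with off-diagonal variance $1/N$ and avoids inverting every minor by using two identities from the earlier slides: $a_i \cdot G^{(i)} a_i = h_{ii} - z - 1/G_{ii}$ (Schur formula) and $\operatorname{Tr} G^{(i)} = \operatorname{Tr} G - (G^2)_{ii}/G_{ii}$ (summing $G^{(i)}_{jj} = G_{jj} - G_{ji}G_{ij}/G_{ii}$ over $j$). The parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 1000
a = rng.standard_normal((n, n))
h = (a + a.T) / np.sqrt(2 * n)
z = 0.3 + 0.1j                                  # N * eta = 100

g = np.linalg.inv(h - z * np.eye(n))
g_diag = g.diagonal()
g2_diag = np.einsum("ij,ji->i", g, g)           # (G^2)_ii

quad = h.diagonal() - z - 1.0 / g_diag          # a_i . G^(i) a_i via the Schur formula
trace_minor = g_diag.sum() - g2_diag / g_diag   # Tr G^(i) for every i
z_i = quad - trace_minor / n                    # Z_i for sigma_ij^2 = 1/n

typical = np.mean(np.abs(z_i))                  # individual size, expected ~ (N eta)^(-1/2)
averaged = abs(np.mean(z_i))                    # size of the average, expected ~ (N eta)^(-1)
print(f"mean |Z_i| = {typical:.4f}   |mean Z_i| = {averaged:.4f}")
```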
Sketch of the sketch of the proof: p = 2
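$$\mathbb{E}\left| \frac{1}{N}\sum_i Z_i \right|^2 = \frac{1}{N^2}\sum_{i \ne j} \mathbb{E} Z_i Z_j + \frac{1}{N^2}\sum_i \mathbb{E}|Z_i|^2$$
We look only at the first term; by symmetry it reduces to $\mathbb{E} Z_1 Z_2$:
$$\mathbb{E} Z_1 Z_2 = \mathbb{E}\left[ a_1 \cdot G^{(1)} a_1 - \mathbb{E}_1\, a_1 \cdot G^{(1)} a_1 \right]\left[ a_2 \cdot G^{(2)} a_2 - \mathbb{E}_2\, a_2 \cdot G^{(2)} a_2 \right]$$
Make the main terms independent by
$$G^{(1)}_{k\ell} = G^{(12)}_{k\ell} + \frac{G^{(1)}_{k2}\, G^{(1)}_{2\ell}}{G^{(1)}_{22}} =: P^{(12)}_{k\ell} + P^{(1)}_{k\ell}$$
(superscripts indicate independence of the given rows/columns)
Idea: $P^{(12)} \sim \Lambda_o$ but more independent, $P^{(1)} \sim \Lambda_o^2$ is one order better.
$$\mathbb{E} Z_1 Z_2 = \mathbb{E}\left[ \widehat{\mathbb{E}}_1 a_1 P^{(12)} a_1 + \widehat{\mathbb{E}}_1 a_1 P^{(1)} a_1 \right]\left[ \widehat{\mathbb{E}}_2 a_2 P^{(12)} a_2 + \widehat{\mathbb{E}}_2 a_2 P^{(2)} a_2 \right]$$
where $\widehat{\mathbb{E}}_i := I - \mathbb{E}_i$, i.e. acting on a random variable $X$, we have $\widehat{\mathbb{E}}_i X = X - \mathbb{E}_i X$.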
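Claim: only one term (the small one) survives:
$$\mathbb{E} Z_1 Z_2 = \mathbb{E}\left[ \widehat{\mathbb{E}}_1 a_1 P^{(1)} a_1 \right]\left[ \widehat{\mathbb{E}}_2 a_2 P^{(2)} a_2 \right] \qquad (*)$$
and, after Schwarz,
$$\mathbb{E}\left[ a_1 P^{(1)} a_1 \right]^2 \sim \sum_{k,\ell \ne 1} |P^{(1)}_{k\ell}|^2\, \sigma_{1k}^2 \sigma_{1\ell}^2 \sim \Lambda_o^4.$$
To prove $(*)$, note that if $A^{(i)}$ is independent of the $i$-th row/column and $Y$ is arbitrary, then
$$\widehat{\mathbb{E}}_i \left[ Y A^{(i)} \right] = A^{(i)}\, \widehat{\mathbb{E}}_i Y$$
thus
$$\mathbb{E}\left[ \widehat{\mathbb{E}}_1 X^{(2)} \right]\left[ \widehat{\mathbb{E}}_2 X^{(1)} \right] = \mathbb{E}\, \widehat{\mathbb{E}}_1 \left[ X^{(2)}\, \widehat{\mathbb{E}}_2 X^{(1)} \right] = 0$$
(with $Y := X^{(2)}$, $A^{(1)} := \widehat{\mathbb{E}}_2 X^{(1)}$), since $\mathbb{E}\, \widehat{\mathbb{E}}_i = 0$. Thus, e.g.
$$\mathbb{E}\left[ \widehat{\mathbb{E}}_1 a_1 P^{(12)} a_1 \right]\left[ \widehat{\mathbb{E}}_2 a_2 P^{(12)} a_2 \right] = 0$$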
Step 3: Green function comparison theorem
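Theorem: Consider two generalized $N \times N$ Wigner ensembles, with matrix elements $N^{-1/2} v_{ij}$ and $N^{-1/2} w_{ij}$, respectively, with $v_{ij}$ and $w_{ij}$ satisfying the uniform subexponential decay condition.
Assume moment matching
$$\mathbb{E} v_{ij}^k = \mathbb{E} w_{ij}^k, \quad k = 1, 2, 3, \qquad \left| \mathbb{E} v_{ij}^4 - \mathbb{E} w_{ij}^4 \right| \le N^{-\delta}$$
Then
$$\left| (\mathbb{E}_v - \mathbb{E}_w) \left[ \frac{1}{N}\operatorname{Tr}\frac{1}{H - z_1} \cdot \frac{1}{N}\operatorname{Tr}\frac{1}{H - z_2} \cdots \frac{1}{N}\operatorname{Tr}\frac{1}{H - z_k} \right] \right| \le N^{-c_0}$$
with $z_j = E_j \pm i\eta$, $|E_j| < 2$, $\eta \ge N^{-1-\varepsilon}$.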
Similar statement holds for matrix elements (instead of traces) and
for more general functions (not only polynomials)
Moral of the statement: if four moments (almost) match, local
statistics are the same.
The four moment condition first appeared in Tao-Vu's Four Moment Theorem, stating that observables of the form
$$\sum_{i_1 < i_2 < \dots < i_k} F\left( N\lambda_{i_1}, N\lambda_{i_2}, \dots, N\lambda_{i_k} \right)$$
have the same expectation under the two ensembles if the first four moments of the single entry distribution match. Their proof also used the local semicircle law, and it was quite involved due to resonances [they used level repulsion].
Our proof is a simple resolvent perturbation, exploiting the stability of the resolvent: $\|G(z)\| \le \eta^{-1}$ (it never blows up).
Sketch of the proof of the comparison theorem
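Replace the matrix elements one by one (Lindeberg strategy, used recently in this context by Chatterjee and Tao-Vu). When replacing the $\gamma = (i, j)$ element, we compare
$$H_{\gamma-1} = Q + V, \qquad H_\gamma = Q + W,$$
$$V := \frac{1}{\sqrt{N}} \begin{pmatrix} 0 & \cdots & 0 & \cdots & 0 \\ 0 & \cdots & 0 & v_{ij} & 0 \\ 0 & \cdots & 0 & \cdots & 0 \\ 0 & \bar v_{ij} & 0 & \cdots & 0 \\ 0 & \cdots & 0 & \cdots & 0 \end{pmatrix}, \qquad W := V \text{ with } v_{ij} \to w_{ij}, \qquad Q_{ij} = Q_{ji} = 0$$
(only the $(i, j)$ and $(j, i)$ entries of $V$ and $W$ are nonzero). Telescoping over all $\gamma(N)$ replacements, with $H_0 = H^{(v)}$ and $H_{\gamma(N)} = H^{(w)}$:
$$\mathbb{E}\, \frac{1}{N}\operatorname{Tr}\frac{1}{H^{(w)} - z} - \mathbb{E}\, \frac{1}{N}\operatorname{Tr}\frac{1}{H^{(v)} - z} = \sum_{\gamma=1}^{\gamma(N)} \left[ \mathbb{E}\left( \frac{1}{N}\operatorname{Tr}\frac{1}{H_\gamma - z} \right) - \mathbb{E}\left( \frac{1}{N}\operatorname{Tr}\frac{1}{H_{\gamma-1} - z} \right) \right].$$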
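By resolvent expansion
$$\frac{1}{Q + V - z} = \frac{1}{Q - z} - \frac{1}{Q - z} V \frac{1}{Q - z} + \frac{1}{Q - z} V \frac{1}{Q - z} V \frac{1}{Q - z} - \dots$$
up to fifth order. Note that $V \sim N^{-1/2}$ and $Q$ is independent of $V$.
Same expansion for $Q + W$:
$$\frac{1}{Q + W - z} = \frac{1}{Q - z} - \frac{1}{Q - z} W \frac{1}{Q - z} + \frac{1}{Q - z} W \frac{1}{Q - z} W \frac{1}{Q - z} - \dots$$
The $\mathbb{E}_v$ and $\mathbb{E}_w$ expectations of the fully expanded terms coincide.
Green function estimates on scale $\eta \ge N^{-1}$ imply that
$$\left| \left( \frac{1}{Q - z} \right)_{ij} \right| \le N^{\varepsilon}, \qquad \operatorname{Im} z = N^{-1-\varepsilon},$$
so the fifth-order error term is $\sim N^{-5/2 + 5\varepsilon} \ll N^{-2}$.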
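The resolvent expansion with its alternating signs can be tested numerically. A minimal sketch with assumptions made here: a real symmetric Gaussian $Q$, a single replaced pair of entries with the illustrative value $v_{ij} = 1$, and a spectral parameter with $\eta = 0.5$ so that $\|V\|\,\|(Q - z)^{-1}\| < 1$ and the series converges. It measures the operator-norm error of the truncation at orders 0 to 5.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 200
a = rng.standard_normal((n, n))
q = (a + a.T) / np.sqrt(2 * n)
i, j = 3, 17
q[i, j] = q[j, i] = 0.0                          # Q has zero (i, j) and (j, i) entries

v = np.zeros((n, n))
v[i, j] = v[j, i] = 1.0 / np.sqrt(n)             # V = N^(-1/2) v_ij with v_ij = 1

z = 0.2 + 0.5j
s = np.linalg.inv(q - z * np.eye(n))             # 1 / (Q - z)
exact = np.linalg.inv(q + v - z * np.eye(n))     # 1 / (Q + V - z)

def truncated_expansion(order):
    """S - S V S + S V S V S - ... including all terms up to `order` in V."""
    term, total = s.copy(), s.copy()
    for _ in range(order):
        term = -term @ v @ s
        total = total + term
    return total

errors = [np.linalg.norm(exact - truncated_expansion(k), 2) for k in range(6)]
for k, err in enumerate(errors):
    print(f"order {k}: error {err:.3e}")
```

Each additional order gains a factor of at most $\|V\|\,\|(Q - z)^{-1}\| \le N^{-1/2}\eta^{-1}$ in this setup, which mirrors the power counting $V \sim N^{-1/2}$ used on the slide.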
SUMMARY
• All results for general Wigner matrices, no explicit formula used
• Local semicircle law for the DOS on the optimal scale ∼ 1/N
• Eigenvectors are fully delocalized away from the spectral edges.
• Universality of local eigenvalue statistics for general Wigner matrices with subexponentially decaying matrix elements.
OPEN: Universality for band matrices and random Schrödinger.