### Embedding into ultrametric with scaling distortion.

```CS-67720
Metric Embedding Theory and Its Algorithmic Applications
Lecture 9
Lecturers: Yair Bartal, Nova Fandina
Fall 2015/16

9.1 Partial and Scaling Distortion Embeddings
In this section we focus on non-contractive embeddings (contraction ≤ 1). Note that we can
always assume w.l.o.g. that the contraction equals 1 (by scaling the distances $d_Y$). Let
$f : X \to Y$ be a non-contractive embedding. Recall the following definitions:
Definition 9.1. For all $x \neq y \in X$,
$$\mathrm{dist}_f(x, y) = \max\left\{ \frac{d_Y(f(x), f(y))}{d_X(x, y)},\ \frac{d_X(x, y)}{d_Y(f(x), f(y))} \right\},$$
and $\mathrm{dist}(f) = \sup_{x \neq y}\{\mathrm{dist}_f(x, y)\}$.
Note that for a non-contractive embedding $\mathrm{dist}_f(x, y) = \frac{d_Y(f(x), f(y))}{d_X(x, y)}$.
Definition 9.2 (Partial Embedding). Let $(X, d_X)$, $(Y, d_Y)$ be any metric spaces with $|X| = n$,
and let $G \subseteq \binom{X}{2}$. We say that $(f, G)$ is a partial embedding with distortion
$\alpha \ge 1$ if for all $(x, y) \in G$ it holds that $\mathrm{dist}_f(x, y) \le \alpha$. A partial
embedding $(f, G)$ is called $(1 - \varepsilon)$-partial if $|G| \ge (1 - \varepsilon)\binom{n}{2}$.
Definition 9.3 (Scaling Distortion). Let $\alpha : [0, 1] \to \mathbb{R}^+$ be a non-increasing
function. We say that an embedding $f : X \to Y$ has $\alpha$-scaling distortion if for all
$0 \le \varepsilon \le 1$ there exists $G_\varepsilon \subseteq \binom{X}{2}$ such that
$(f, G_\varepsilon)$ is a $(1 - \varepsilon)$-partial embedding with distortion $\alpha(\varepsilon)$.
Note that for $\varepsilon < 1/\binom{n}{2}$ the above definition captures the notion of the
worst case distortion of a non-contractive embedding $f$.
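Scaling with Average Distortion. We discuss the strong relation between scaling and $\ell_q$
distortions. Recall the definition of $\ell_q$-distortion:
$$\ell_q\text{-dist}(f) = \left( \frac{\sum_{x \neq y \in X} (\mathrm{dist}_f(x, y))^q}{\binom{|X|}{2}} \right)^{1/q}, \quad \forall q \ge 1.$$
$$\ell_\infty\text{-dist}(f) = \max_{x \neq y \in X}\{\mathrm{dist}_f(x, y)\} = \mathrm{dist}(f) \quad \text{(for a non-contractive embedding)}.$$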
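Claim 9.1 (Exercise). If $f$ has $\alpha$-scaling distortion then for all $1 \le q < \infty$ it holds that:
$$\ell_q\text{-dist}(f) \le \left( 2 \int_{\frac{1}{2\binom{n}{2}}}^{1} (\alpha(x))^q \, dx \right)^{1/q}.$$
And the discrete formula is given by:
$$\ell_q\text{-dist}(f) \le \left( \frac{1}{\binom{n}{2}} \left[ \sum_{i=1}^{\binom{n}{2} - 1} \alpha\!\left( \frac{i}{\binom{n}{2}} \right)^{q} + \alpha\!\left( \frac{1}{2\binom{n}{2}} \right)^{q} \right] \right)^{1/q}.$$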
If we have an estimate of $\ell_q\text{-dist}(f)$, then we can bound the number of pairs that have a
“bad” distortion (that is, most pairs are preserved well).
Claim 9.2. Let $1 \le q < \infty$, $\alpha_q \ge 1$, and let $f$ be an embedding. If
$\ell_q\text{-dist}(f) \le \alpha_q$ then for $\varepsilon = \frac{1}{2^q}$ it holds that $f$ is a
$(1 - \varepsilon)$-partial embedding with distortion at most $2\alpha_q$.
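Proof. We have to show that there exists a set $G \subseteq \binom{X}{2}$ such that
$|G| \ge (1 - \varepsilon)\binom{n}{2}$ and the distortion of $f$ on that set is at most $2\alpha_q$.
Namely, we have to show that there are at most $\varepsilon\binom{n}{2}$ pairs that can be distorted
by more than $2\alpha_q$.
Assume by contradiction that there is $S \subseteq \binom{X}{2}$ with $|S| > \varepsilon\binom{n}{2}$
such that every pair from $S$ is distorted by more than $2\alpha_q$. Therefore,
$$(\ell_q\text{-dist}(f))^q > \frac{\varepsilon\binom{n}{2}(2\alpha_q)^q}{\binom{n}{2}} = \varepsilon (2\alpha_q)^q = \alpha_q^q,$$
where the last equality uses $\varepsilon = 2^{-q}$. This contradicts $\ell_q\text{-dist}(f) \le \alpha_q$.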
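The same argument shows that for all $\gamma > 1$, $f$ is a $(1 - 1/\gamma^q)$-partial embedding with
distortion $\gamma\alpha_q$. In other words, for any $0 < \varepsilon < 1$ (taking
$\varepsilon = 1/\gamma^q$), $f$ is a $(1 - \varepsilon)$-partial embedding with distortion
$\alpha_q/\varepsilon^{1/q}$. Namely, $f$ has $\ell_q\text{-dist}(f)/\varepsilon^{1/q}$ scaling distortion.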
9.1.1 Embedding into Trees with Scaling Distortion
Theorem 9.3. The following holds.
1. Any finite metric space is embeddable into ultra-metrics with scaling distortion
   $O\left(\frac{1}{\sqrt{\varepsilon}}\right)$.
2. Any weighted graph contains a spanning tree with scaling distortion
   $O\left(\frac{1}{\sqrt{\varepsilon}}\right)$.
3. For any $0 < \rho < 1$, any weighted graph contains a spanning tree with scaling distortion
   $\tilde{O}\left(\sqrt{1/\varepsilon}\right)/\rho$ and lightness $1 + \rho$. This result is tight
   with respect to $\rho$.
For the first two items we conclude:
  for all $1 \le q < 2$: $\ell_q\text{-dist}(f) = O(1)$;
  for $q = 2$: $\ell_2\text{-dist}(f) = O(\sqrt{\log n})$;
  for all $q > 2$: $\ell_q\text{-dist}(f) = O(n^{1 - 2/q})$.
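A worked calculation may help to see where these three regimes come from. It is only a
sketch, under the assumption that the scaling distortion is exactly
$\alpha(\varepsilon) = C/\sqrt{\varepsilon}$ for some constant $C$ (the symbols $C$ and
$N = \binom{n}{2}$ are introduced here for illustration). Plugging into the integral bound of
Claim 9.1:
$$(\ell_q\text{-dist}(f))^q \le 2C^q \int_{1/(2N)}^{1} x^{-q/2} \, dx.$$
For $1 \le q < 2$ the integrand is integrable at $0$:
$$\int_{1/(2N)}^{1} x^{-q/2} \, dx \le \int_{0}^{1} x^{-q/2} \, dx = \frac{2}{2 - q}, \quad\text{so}\quad \ell_q\text{-dist}(f) \le C\left(\frac{4}{2 - q}\right)^{1/q} = O(1).$$
For $q = 2$, using $2N = n(n-1) \le n^2$:
$$\int_{1/(2N)}^{1} x^{-1} \, dx = \ln(2N) \le 2\ln n, \quad\text{so}\quad \ell_2\text{-dist}(f) \le 2C\sqrt{\ln n} = O(\sqrt{\log n}).$$
For $q > 2$:
$$\int_{1/(2N)}^{1} x^{-q/2} \, dx \le \frac{2}{q - 2}(2N)^{q/2 - 1} \le \frac{2}{q - 2}\, n^{q - 2}, \quad\text{so}\quad \ell_q\text{-dist}(f) \le C\left(\frac{4}{q - 2}\right)^{1/q} n^{1 - 2/q} = O(n^{1 - 2/q}).$$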
It is interesting to compare the average quality of the embedding of Har-Peled and Mendel (H-PM)
with the embedding of Abraham, Bartal and Neiman (ABN). What is the average distortion (or
$\ell_q$-distortion) of the H-PM embedding? Is it constant? Does it resemble the parameters of ABN?
[Exercise].
We present the proof of the first item of the theorem.
Proof of item (1). We build the embedding by induction on $n$. Let $(X, d)$ be an $n$-point
metric space. The idea is to partition (in a smart way) $X = (X_1, X_2)$, and to build the
ultra-metric tree for $X$ by combining the trees obtained by the inductive steps on $X_1$ and $X_2$.
Let $f_1 : X_1 \to U_1$ and $f_2 : X_2 \to U_2$ be the embeddings obtained by induction. The
ultra-metric tree for $X$ is obtained by attaching $U_1$ and $U_2$ to a new root $r$ with label
$\Delta(r) = \mathrm{diam}(X) = \Delta$. Thus, we have to show how to decompose $X$. Next we discuss
which properties such a decomposition should satisfy.
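We have to show that there exists $c > 0$ such that for any $0 < \varepsilon < 1$ there exists
$G_\varepsilon \subseteq \binom{X}{2}$ with $|G_\varepsilon| \ge (1 - \varepsilon)\binom{|X|}{2}$
such that for all $(x, y) \in G_\varepsilon$ it holds that
$\mathrm{dist}_f(x, y) \le \frac{1}{c\sqrt{\varepsilon}}$. Or, in other words, we have to show that
there exist at most $\varepsilon\binom{|X|}{2}$ pairs of elements of $X$ with distortion greater
than $\frac{1}{c\sqrt{\varepsilon}}$.
By the induction hypothesis there exist at most $\varepsilon\binom{|X_1|}{2}$ pairs of $X_1$ and at
most $\varepsilon\binom{|X_2|}{2}$ pairs of $X_2$ with distortion greater than
$\frac{1}{c\sqrt{\varepsilon}}$. In addition, if $x \in X_1$ and $y \in X_2$ are such that
$d(x, y) \ge c\sqrt{\varepsilon}\Delta$, then the distortion of this pair is
$\mathrm{dist}_f(x, y) \le \frac{1}{c\sqrt{\varepsilon}}$.
Denote $B_\varepsilon = \{(x, y) \mid x \in X_1,\ y \in X_2,\ d(x, y) < c\sqrt{\varepsilon}\Delta\}$.
Thus we have to show that
$$\varepsilon\binom{|X_1|}{2} + \varepsilon\binom{|X_2|}{2} + |B_\varepsilon| \le \varepsilon\binom{|X|}{2}.$$
Namely, we have to show that $|B_\varepsilon| \le \varepsilon|X_1| \cdot |X_2|$. Thus we have to
show how to partition $X$ such that the above inequality holds for any $\varepsilon$.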
Let $u \in X$ be a point and $r > 0$ a radius (we will show later how to choose $r$). We define
$X_1^{(r)} = \mathring{B}(u, r)$ and $X_2^{(r)} = X \setminus X_1^{(r)}$ (note that $X_1$ and $X_2$
depend on $r$). We choose a point $u \in X$ such that $|\mathring{B}(u, \frac{\Delta}{2})| \le \frac{n}{2}$.
Note that such a point exists: if $x, y \in X$ are such that $\Delta = d(x, y)$, then the open
balls of radius $\frac{\Delta}{2}$ around $x$ and $y$ are disjoint, and at least one of them
contains at most $\frac{n}{2}$ points.
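Next we show how to choose $r$. Define
$$S_1^{(r,\varepsilon)} = \{w \in X_1^{(r)} \mid d(w, u) > r - c\sqrt{\varepsilon}\Delta\},$$
$$S_2^{(r,\varepsilon)} = \{w \in X_2^{(r)} \mid d(w, u) < r + c\sqrt{\varepsilon}\Delta\}.$$
Then $B_\varepsilon \subseteq S_1^{(r,\varepsilon)} \times S_2^{(r,\varepsilon)}$. Namely,
$|B_\varepsilon| \le |S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}|$. Thus it is enough to
prove that there exists $r$ such that for all $0 < \varepsilon < 1$,
$|S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}| \le \varepsilon|X_1^{(r)}| \cdot |X_2^{(r)}|$.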
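Denote $\bar\varepsilon = \max\{\varepsilon \mid |B(u, \frac{\sqrt{\varepsilon}\Delta}{4})| \ge \varepsilon n\}$.
Note that this set is not empty, as at least $\varepsilon = \frac{1}{n}$ belongs to it. Also note
that $\bar\varepsilon \le 1/2$. Thus for any $\varepsilon > \bar\varepsilon$ it holds that
$|B(u, \frac{\sqrt{\varepsilon}\Delta}{4})| < \varepsilon n$. We choose
$r \in \left[\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right]$.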
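Lemma 9.4. If $\varepsilon > 32\bar\varepsilon$ then (every $r$ is good) for all
$r \in \left[\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right]$
it holds that
$|S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}| \le \varepsilon|X_1^{(r)}| \cdot |X_2^{(r)}|$.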
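Proof. Let $r$ be any value from
$\left[\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right]$ and let
$\varepsilon > 32\bar\varepsilon$. Then
$$|S_1^{(r,\varepsilon)}| \le |X_1^{(r)}|, \qquad |S_2^{(r,\varepsilon)}| \le |B(u, r + c\sqrt{\varepsilon}\Delta)|.$$
Since $r \le \frac{\sqrt{\bar\varepsilon}\Delta}{2}$ and $\bar\varepsilon < \frac{\varepsilon}{32}$,
$$r + c\sqrt{\varepsilon}\Delta \le \frac{\sqrt{\bar\varepsilon}\Delta}{2} + c\sqrt{\varepsilon}\Delta \le \sqrt{\varepsilon}\Delta\left(\frac{1}{2\sqrt{32}} + c\right) \le \sqrt{\frac{\varepsilon}{2}} \cdot \frac{\Delta}{4},$$
where the last inequality holds if we choose $c = \frac{1}{150}$, which is good for all inductive
steps. Therefore, since $\frac{\varepsilon}{2} > \bar\varepsilon$,
$$|S_2^{(r,\varepsilon)}| \le |B(u, r + c\sqrt{\varepsilon}\Delta)| \le \left|B\left(u, \sqrt{\tfrac{\varepsilon}{2}} \cdot \tfrac{\Delta}{4}\right)\right| \le \frac{\varepsilon}{2} n.$$
Namely, using $|X_2^{(r)}| \ge \frac{n}{2}$,
$$|S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}| \le |X_1^{(r)}| \cdot \frac{\varepsilon}{2} n \le \varepsilon|X_1^{(r)}| \cdot |X_2^{(r)}|.$$
Lemma 9.5. There exists
$r \in \left[\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right] = I$
such that for all $\varepsilon \le 32\bar\varepsilon$ it holds that
$|S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}| \le \varepsilon|X_1^{(r)}| \cdot |X_2^{(r)}|$.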
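We shall prove a small lemma first. Let $0 \le r_1 \le r_2$ be any real numbers, and denote by
$A(r_1, r_2)$ the size of the strip $B(u, r_2) \setminus B(u, r_1)$.
Lemma 9.6. $A\left(\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right) \le 4\bar\varepsilon n$.
Proof. Since $4\bar\varepsilon > \bar\varepsilon$,
$$A\left(\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right) \le \left|B\left(u, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right)\right| = \left|B\left(u, \frac{\sqrt{4\bar\varepsilon}\Delta}{4}\right)\right| \le 4\bar\varepsilon n.$$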
Proof of Lemma 9.5. We say that $r$ is a “bad” radius for some $\varepsilon \le 32\bar\varepsilon$ if
$|S_1^{(r,\varepsilon)}| \cdot |S_2^{(r,\varepsilon)}| > \varepsilon|X_1^{(r)}| \cdot |X_2^{(r)}|$.
Denote by $J$ the set of all bad values of $r$ from $I$. We will show that $|J| < |I|$.
We build $J$ iteratively. At the beginning $J = \emptyset$. At each step of the construction
consider all the values of $r \in I \setminus J$ and all the values of
$\varepsilon \le 32\bar\varepsilon$ such that the pair $(r, \varepsilon)$ is a “bad” pair: $r$ is
bad for $\varepsilon$. From all these pairs we choose the one with the maximum $\varepsilon$
(actually it may happen that we should take the supremum of the set of such epsilons, which does
exist since we consider $\varepsilon \in [1/n, 32\bar\varepsilon]$). Denote this pair by
$(\hat r, \hat\varepsilon)$. We add to $J$ the segment of length $2c\sqrt{\hat\varepsilon}\Delta$
centered at $\hat r$: $[\hat r - c\sqrt{\hat\varepsilon}\Delta,\ \hat r + c\sqrt{\hat\varepsilon}\Delta]$.
Note that the length of the segment we add does not increase from step to step. We have to prove
that $|J| < |I|$ after the algorithm terminates.
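Consider some step of the algorithm and the chosen pair $(\hat r, \hat\varepsilon)$. Then,
$$A(\hat r - c\sqrt{\hat\varepsilon}\Delta,\ \hat r + c\sqrt{\hat\varepsilon}\Delta) \ge |S_1^{(\hat r,\hat\varepsilon)} \cup S_2^{(\hat r,\hat\varepsilon)}| = |S_1^{(\hat r,\hat\varepsilon)}| + |S_2^{(\hat r,\hat\varepsilon)}|.$$
Note that, since $\hat r \in I$,
$|X_1^{(\hat r)}| \ge |B(u, \frac{\sqrt{\bar\varepsilon}\Delta}{4})| \ge \bar\varepsilon n$.
Therefore, since $(\hat r, \hat\varepsilon)$ is bad,
$$|S_1^{(\hat r,\hat\varepsilon)}| \cdot |S_2^{(\hat r,\hat\varepsilon)}| > \hat\varepsilon|X_1^{(\hat r)}| \cdot |X_2^{(\hat r)}| \ge \frac{\hat\varepsilon\bar\varepsilon n^2}{2}.$$
Therefore by the inequality of arithmetic and geometric means:
$$A(\hat r - c\sqrt{\hat\varepsilon}\Delta,\ \hat r + c\sqrt{\hat\varepsilon}\Delta) > 2\sqrt{\frac{\hat\varepsilon\bar\varepsilon}{2}}\, n.$$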
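$$|J| \le \sum_{i=1}^{t} (\text{lengths of the segments } J_i) = \sum_i 2c\sqrt{\hat\varepsilon_i}\Delta = 2c\Delta\sum_i \sqrt{\hat\varepsilon_i} \overset{\text{(have to prove)}}{<} |I| = \frac{\sqrt{\bar\varepsilon}\Delta}{4}.$$
Recall that we have chosen $c = \frac{1}{150}$. Therefore we have to show that
$$\sum_i \sqrt{\hat\varepsilon_i} < (150/8)\sqrt{\bar\varepsilon} = 18.75\sqrt{\bar\varepsilon}.$$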
Note that each point of I belongs to at most 2 segments of J. This is because the radius is
always chosen outside the segments of J and the lengths of the segments do not increase from
step to step.
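Since each point belongs to at most 2 segments, and by Lemma 9.6,
$$\sum_i 2\sqrt{\frac{\hat\varepsilon_i\bar\varepsilon}{2}}\, n < \sum_i A(\hat r_i - c\sqrt{\hat\varepsilon_i}\Delta,\ \hat r_i + c\sqrt{\hat\varepsilon_i}\Delta) \le 2A\left(\frac{\sqrt{\bar\varepsilon}\Delta}{4}, \frac{\sqrt{\bar\varepsilon}\Delta}{2}\right) \le 2 \cdot 4 \cdot \bar\varepsilon n.$$
Therefore $\sum_i \sqrt{\hat\varepsilon_i} < 4\sqrt{2}\sqrt{\bar\varepsilon} < 18.75\sqrt{\bar\varepsilon}$,
as required.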
References
Ittai Abraham, Yair Bartal, and Ofer Neiman. Embedding metrics into ultrametrics and graphs into
spanning trees with constant average distortion. SIAM J. Comput., 44(1):160–192, 2015.
```
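The quantities from Section 9.1 can be computed directly for a small, concrete embedding. The Python sketch below is illustrative only and is not part of the lecture. The function names (`pair_distortions`, `lq_distortion`, `partial_distortion`) and the four-point example are assumptions made here. The ultrametric in the example is built by hand (root label equal to the diameter, one split) and is not the ball-based partition from the proof of Theorem 9.3.

```python
from itertools import combinations
from math import floor, inf


def pair_distortions(d_x, d_y):
    """Return dist_f(x, y) = max(d_Y/d_X, d_X/d_Y) for every unordered pair.

    d_x and d_y are symmetric distance matrices (lists of lists); d_y[i][j]
    is the distance between the images of points i and j.
    """
    n = len(d_x)
    return [
        max(d_y[i][j] / d_x[i][j], d_x[i][j] / d_y[i][j])
        for i, j in combinations(range(n), 2)
    ]


def lq_distortion(dists, q):
    """l_q-distortion: the q-th power mean of the pairwise distortions."""
    if q == inf:
        return max(dists)
    return (sum(d ** q for d in dists) / len(dists)) ** (1.0 / q)


def partial_distortion(dists, eps):
    """Smallest alpha such that at most eps * N pairs have distortion > alpha.

    In the notation of Definition 9.2, the embedding is (1 - eps)-partial
    with this distortion. If every pair may be discarded, return 1.0.
    """
    ranked = sorted(dists, reverse=True)
    k = floor(eps * len(ranked))  # number of pairs we are allowed to ignore
    return ranked[k] if k < len(ranked) else 1.0


# Hypothetical example: points 0, 1, 2, 3 on a line, so d_X(i, j) = |i - j|.
d_x = [[abs(i - j) for j in range(4)] for i in range(4)]

# Hand-built ultrametric: root label 3 (the diameter) separates {0, 1}
# from {2, 3}; each of the two clusters has label 1.
d_y = [
    [0, 1, 3, 3],
    [1, 0, 3, 3],
    [3, 3, 0, 1],
    [3, 3, 1, 0],
]

dists = pair_distortions(d_x, d_y)

for q in (1, 2, inf):
    print("q =", q, "l_q-dist =", lq_distortion(dists, q))

for eps in (0.0, 0.2, 0.5):
    print("eps =", eps, "partial distortion =", partial_distortion(dists, eps))
```

On this example the worst pair is (1, 2), which is adjacent on the line but separated at the root, while half of the pairs are preserved exactly. This is the kind of gap between worst-case and average distortion that scaling distortion is meant to capture.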