Balanced BST
Balanced BSTs guarantee O(logN)
performance at all times
the heights of the left and right sub-trees
are about the same
simple BSTs are O(N) in the worst case
Categories of BSTs
AVL, SPLAY trees: dynamic environment
optimal trees: static environment
E.G.M. Petrakis
Trees in Main Memory
AVL Trees
AVL (Adelson-Velskii and Landis): the heights
of the left and right subtrees differ by at
most 1
the same holds for every subtree
Number of comparisons for membership
operations
best case (completely balanced tree): log(N+1)
worst case: 1.44 log(N+2)
expected case: logN + 0.25
AVL Trees
“—” : completely balanced sub-tree
“/” : the left sub-tree is 1 higher
“\” : the right sub-tree is 1 higher
AVL Trees (cont.)
[Figure: examples of AVL trees]
Non AVL Trees
“//” : the left sub-tree is 2 higher
“\\” : the right sub-tree is 2 higher
[Figure: non-AVL trees with the critical node marked]
Single Right Rotations
Insertions or deletions may result in non AVL
trees => apply rotations to balance the tree
[Figure: single right rotation on nodes A and B; T1, T2, T3 are sub-trees]
Single Left Rotations
[Figure: single left rotation on nodes A and B; T1, T2, T3 are sub-trees]
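The two single rotations can be sketched in a few lines of Python. This is only an illustration: the `Node` class and the function names `rotate_right` and `rotate_left` are assumptions made for this example, not part of the slides, and balance factors are left out so that only the pointer changes are visible.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type used only for this sketch.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(a: Node) -> Node:
    """Single right rotation at a: its left child b becomes the sub-tree root."""
    b = a.left
    a.left = b.right   # the middle sub-tree moves from b to a
    b.right = a
    return b


def rotate_left(a: Node) -> Node:
    """Single left rotation at a: its right child b becomes the sub-tree root."""
    b = a.right
    a.right = b.left   # the middle sub-tree moves from b to a
    b.left = a
    return b


def inorder(node: Optional[Node]) -> list:
    """Keys in sorted order; a rotation must never change this sequence."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


# Left-leaning chain 4 -> 2 -> 1, repaired by one right rotation at 4.
r = rotate_right(Node(4, left=Node(2, left=Node(1))))
# Right-leaning chain 4 -> 8 -> 9, repaired by one left rotation at 4.
l = rotate_left(Node(4, right=Node(8, right=Node(9))))
```

Both rotations touch a constant number of pointers and keep the in-order sequence of keys intact, which is why they can be used to restore balance without breaking the search-tree property.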
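[Figure: inserting 1 into the tree with root 4 and left child 2 unbalances node 4; a single right rotation makes 2 the root with children 1 and 4. Inserting 9 into the tree with root 4 and right child 8 unbalances node 4; a single left rotation makes 8 the root with children 4 and 9]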
Double Left Rotation
Composition of two single rotations (one
right and one left rotation)
[Figure: double left rotation on nodes A, B and C; T1, T2, T3, T4 are sub-trees]
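Example of Double Left Rotation
[Figure: inserting 6 into the tree 4 (2, 8 (7, 9)) makes 4 the critical node (“\\”), with 8 and 7 both marked “/”; a double left rotation makes 7 the root, with left sub-tree 4 (children 2 and 6) and right sub-tree 8 (right child 9)]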
Double Right Rotation
Composition of two single rotations (one
left and one right rotation)
[Figure: double right rotation on nodes A, B and C; T1, T2, T3, T4 are sub-trees]
Insertion (deletion) in AVL
1. Follow the search path to verify
that the key is not already there
2. Insert (delete) the key
3. Retreat along the search path and
check the balance factor
4. Rebalance if necessary (see next)
Rebalancing
For every node reached coming up from its
left sub-tree after an insertion, readjust the
balance factor
‘=’ becomes ‘/’ => no rotation; the sub-tree grew, so continue upwards
‘\’ becomes ‘=’ => no rotation; the height is unchanged, so stop
‘/’ becomes ‘//’ => must be rebalanced!!
The “//” node becomes a “critical node”
Only the path from the critical node to the
leaf has to be rebalanced!!
Rotation is applied only at the critical node!
Rebalancing (cont.)
The balance factor of the critical node
determines what rotation is to take place
single or double
If the child and the grandchild (the inserted
node) of the critical node lie in the same
direction (both “/”) => single rotation
Else => double rotation
Rebalance similarly if coming up from the
right sub-tree (opposite signs)
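The insertion and rebalancing steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the slides' own implementation: it stores a height in every node instead of the “/”, “=”, “\” marks (so a difference of +2 plays the role of “//” and -2 the role of “\\”), and the names `AVLNode`, `insert` and `balance` are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AVLNode:
    # Hypothetical node type: height replaces the slides' balance marks.
    key: int
    left: Optional["AVLNode"] = None
    right: Optional["AVLNode"] = None
    height: int = 1


def height(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def balance(n):
    # +2 corresponds to "//", -2 to "\\" in the slides' notation.
    return height(n.left) - height(n.right)


def rotate_right(a):
    b = a.left
    a.left, b.right = b.right, a
    update(a)
    update(b)
    return b


def rotate_left(a):
    b = a.right
    a.right, b.left = b.left, a
    update(a)
    update(b)
    return b


def insert(node, key):
    """Insert key, then rebalance while retreating along the search path."""
    if node is None:
        return AVLNode(key)
    if key == node.key:
        return node                      # key already present
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    update(node)
    bf = balance(node)
    if bf == 2:                          # critical node, left side too high
        if balance(node.left) < 0:       # child leans the other way: double rotation
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if bf == -2:                         # critical node, right side too high
        if balance(node.right) > 0:
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node


root = None
for k in [4, 2, 8, 7, 9, 6]:
    root = insert(root, k)
```

Inserting 6 last makes node 4 critical with a right child that leans left, so the sketch applies a right rotation at 8 followed by a left rotation at 4, leaving 7 at the root.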
Performance
Performance of membership operations on
AVL trees:
the worst case is easy to analyze!
An AVL tree will never be more than 45%
higher than its perfectly balanced
counterpart (Adelson-Velskii and Landis):
log(N+1) <= hb(N) <=
1.4404 log(N+2) - 0.328
Worst case AVL
Sparse tree => each sub-tree has
minimum number of nodes
Fibonacci Trees
Th: tree of height h
Th has two sub-trees, one with height h-1
and one with height h-2
else it wouldn’t have minimum number of nodes
T0 is the empty sub-tree (also Fibonacci)
T1 is the sub-tree with 1 node (also
Fibonacci)
Fibonacci Trees (cont.)
Worst-case height
Nh: number of nodes of Th
Nh = Nh-1 + Nh-2 + 1
N0 = 0 (T0 is empty)
N1 = 1 (T1 has 1 node)
$$N_h + 1 = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^{h+2} - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^{h+2}$$
From which h <= 1.44 log(N+1)
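The recurrence and the height bound can be checked numerically. The sketch below assumes N0 = 0 and N1 = 1 (T0 empty, T1 a single node) and base-2 logarithms; the function names `min_nodes` and `fib` are chosen only for this example.

```python
import math


def min_nodes(h: int) -> int:
    """N_h from N_h = N_{h-1} + N_{h-2} + 1, assuming N_0 = 0 and N_1 = 1."""
    a, b = 0, 1                # N_0, N_1
    for _ in range(h):
        a, b = b, a + b + 1
    return a


def fib(k: int) -> int:
    """k-th Fibonacci number with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(k):
        a, b = b, a + b
    return a


checked = []
for h in range(1, 30):
    n = min_nodes(h)
    # The sparsest AVL tree of height h has one node fewer than a Fibonacci number.
    assert n + 1 == fib(h + 2)
    # Height bound for the sparsest tree, hence for every AVL tree with n nodes.
    assert h <= 1.4404 * math.log2(n + 1)
    checked.append((h, n))
```

Because Th is the tree with the fewest nodes for its height, any AVL tree with the same number of nodes is at most this high, which is where the 45% figure comes from.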
More Examples
single rotation
[Figure: insert 9 into the tree with root 7 and right child 8; a single left rotation makes 8 the root with children 7 and 9]
Examples (cont.)
double rotation
[Figure: insert 7 into the tree with root 6 and right child 8; a double rotation makes 7 the root with children 6 and 8]
Examples (cont.)
double rotation
[Figure: insert 7 into the tree with root 8 and left child 6; a double rotation makes 7 the root with children 6 and 8]
Examples (cont.)
single rotation
[Figure: delete 6 from the tree 7 (6, 8 (-, 9)); a single left rotation makes 8 the root with children 7 and 9]
Examples (cont.)
single rotation
[Figure: delete 5 from the tree 6 (5, 8 (7, 9)); a single left rotation makes 8 the root, with left sub-tree 6 (right child 7) and right child 9]
Examples (cont.)
double rotation
[Figure: delete 5 from the tree 6 (5, 8 (7, -)); a double rotation makes 7 the root with children 6 and 8]
General Deletions
[Figure: deletions from an AVL tree holding the keys 1 to 11: delete 4, then delete 8]
General Deletions (cont.)
[Figure: the deletions continue: delete 8, delete 6, delete 5]
General Deletions (cont.)
[Figure: the tree after delete 5]
Self Organizing Search
Splay Trees: adapt to query patterns
move to root (front): whenever a node is
accessed move it to the root using
rotations
equivalent to move-to-front in arrays
current node
insertions: the inserted node
deletions: the father of the deleted node, or null
if the deleted node is the root
membership: the last
accessed node
[Figure: Search(10) in a tree with keys 8, 10, 13, 14, 15, 20, 30; node 10 is moved to the root by a first, second and third rotation]
Splay Cases
If the current node q has no
grandfather but has a father p =>
only one single rotation
– two symmetric cases: L, R
[Figure: single rotation of q over its father p; a, b, c are sub-trees]
If q also has a grandfather gp => 4 cases: LL, RR, LR, RL
LL symmetric of RR
RL symmetric of LR
[Figure: the LL and RL cases for q, its father p and its grandfather gp; a, b, c, d are sub-trees]
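The case analysis can be sketched as a recursive move-to-root routine. This is an illustration under assumptions, not the slides' code: `Node`, `splay`, `bst_insert` and the example tree are made up here, and in the LL and RR cases the sketch follows the common splay convention of rotating at the grandfather first and then at the father, which may differ from the order drawn in the figures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type used only for this sketch.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def rotate_right(p):
    q = p.left
    p.left, q.right = q.right, p
    return q


def rotate_left(p):
    q = p.right
    p.right, q.left = q.left, p
    return q


def splay(root, key):
    """Bring key (or the last node accessed while searching for it) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # LL case
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # LR case
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        # Final single rotation (the L case when there is no grandfather).
        return root if root.left is None else rotate_right(root)
    if root.right is None:
        return root
    if key > root.right.key:                         # RR case
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # RL case
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return root if root.right is None else rotate_left(root)


def bst_insert(root, key):
    """Plain (unbalanced) BST insertion, used only to build the example tree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = bst_insert(root.left, key)
    else:
        root.right = bst_insert(root.right, key)
    return root


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in [20, 15, 30, 8, 13, 10, 14]:    # an assumed example tree
    root = bst_insert(root, k)
root = splay(root, 10)
```

After the call, 10 sits at the root and the in-order sequence of keys is unchanged, since the routine only ever applies rotations.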
[Figures: worked splay examples showing the LL, RR, RL and LR cases applied to the current node q; numbers label nodes]
a, b, c … are sub-trees
Splay Performance
Splay trees adapt to unknown or changing
probability distributions
Splay trees do not guarantee logarithmic
cost for each access
AVL trees do!
asymptotic cost close to the cost of the optimal
BST for unknown probability distributions
It can be shown that the “cost of m
operations on an initially empty splay tree,
of which n are insertions, is O(m log n) in the
worst case”
Optimal BST
Static environment: no insertions or
deletions
Keys are accessed with various
frequencies
Have the most frequently accessed
keys near the root
Application: a symbol table in main
memory
Searching
Given symbols a1 < a2 < … < an and their
probabilities p1, p2, … pn, minimize the
cost
Successful search: $cost = \sum_{i=1}^{n} p_i \cdot level(a_i)$
Transform unsuccessful to successful:
consider new symbols E0, E1, … En
E0 = (-∞, a1)
Ei = (ai, ai+1)
En = (an, ∞)
[Figure: the intervals E0, E1, …, En lie between -∞, a1, a2, …, an and ∞ on the key line]
Unsuccessful Search
unsuccessful search for
all values in Ei
terminates on the same
failure node (in fact, one
node higher)
[Figure: a BST containing an, an-1, an-2, an-3 with the failure node for Ei marked]
Example
(a1, a2, a3) = (do, if, read)
pi = qi = 1/7
[Figure: the five possible BSTs on (do, if, read). The balanced tree with root if is the optimal BST with cost = 13/7; each of the other four trees has cost = 15/7]
Search Cost
If pi is the probability to search for
ai and qi is the probability to search in
Ei then
$$\sum_{i=1}^{n} p_i + \sum_{i=0}^{n} q_i = 1$$
$$cost = \sum_{i=1}^{n} p_i \cdot level(a_i) + \sum_{i=0}^{n} q_i \cdot (level(E_i) - 1)$$
The first sum is the cost of successful search, the second the cost of unsuccessful search
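The cost formula can be evaluated directly on a small tree. The sketch below is only an illustration: trees are written as nested tuples `(left, key, right)` with `None` standing for a failure node, the root is at level 1, and the name `search_cost` is an assumption of this example.

```python
from fractions import Fraction


def search_cost(tree, p, q):
    """Expected cost: sum of p_i * level(a_i) plus sum of q_i * (level(E_i) - 1).

    tree: nested tuples (left, key, right), None marks a failure node.
    p: dict mapping each key to its probability.
    q: list of failure probabilities q_0 .. q_n in key order.
    """
    failures = iter(q)   # an in-order walk meets the failure nodes as E0, E1, ..., En

    def walk(node, level):
        if node is None:
            return next(failures) * (level - 1)
        left, key, right = node
        # Operands are evaluated left to right, so the walk is in-order.
        return walk(left, level + 1) + p[key] * level + walk(right, level + 1)

    return walk(tree, 1)


seventh = Fraction(1, 7)
p = {"do": seventh, "if": seventh, "read": seventh}
q = [seventh] * 4

balanced = ((None, "do", None), "if", (None, "read", None))
chain = (None, "do", (None, "if", (None, "read", None)))

balanced_cost = search_cost(balanced, p, q)
chain_cost = search_cost(chain, p, q)
```

With equal probabilities the balanced tree comes out cheaper than the chain, matching the costs 13/7 and 15/7 quoted for the (do, if, read) example.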
Observation 1
In a BST, a subtree has nodes
that are consecutive in a sorted
sequence of keys (e.g. [5,26])
[Figure: a BST with keys 5, 10, 12, 13, 14, 20, 24, 25, 26]
Observation 2
If Tij is a sub-tree of an optimal
BST holding keys from i to j then
Tij must be optimal among all
possible BSTs that store the same
keys
optimality lemma: all sub-trees of
an optimal BST are also optimal
Optimal BST Construction
1) Construct all possible trees:
exponential cost, there are $\frac{1}{n+1}\binom{2n}{n}$ trees!!
2) Dynamic programming solves the
problem in polynomial time O(n^3)
at each step, the algorithm finds and
stores the optimal tree in each range
of key values
increases the size of the range at each
step until the whole range is obtained
Example (successful search only):
keys            1     10    20    40
probabilities   0.3   0.2   0.1   0.4
1) BSTs with 1 node
range 1: cost = 0.3
range 10: cost = 0.2
range 20: cost = 0.1
range 40: cost = 0.4
2) BSTs with 2 nodes
range 1-10
root 1: cost = 0.3·1 + 0.2·2 = 0.7 (optimal)
root 10: cost = 0.2·1 + 0.3·2 = 0.8
range 10-20
root 10: cost = 0.2 + 0.1·2 = 0.4 (optimal)
root 20: cost = 0.1 + 0.2·2 = 0.5
range 20-40
root 20: cost = 0.1 + 0.8 = 0.9
root 40: cost = 0.4 + 0.2 = 0.6 (optimal)
3) BSTs with 3 nodes
range 1-20
root 1, then 10, then 20: cost = 0.3 + 2·0.2 + 3·0.1 = 1 (optimal, tied)
root 10, children 1 and 20: cost = 0.2 + 2(0.3 + 0.1) = 1 (optimal, tied)
root 20, then 1, then 10: cost = 0.1 + 2·0.3 + 3·0.2 = 1.3
range 10-40
root 10, then 40, then 20: cost = 0.2 + 2·0.4 + 3·0.1 = 1.3
root 20, children 10 and 40: cost = 0.1 + 2(0.2 + 0.4) = 1.3
root 40, then 10, then 20: cost = 0.4 + 2·0.2 + 3·0.1 = 1.1 (optimal)
4) BSTs with 4 nodes
range 1-40
root 1, then 40, then 10, then 20: cost = 0.3 + 2·0.4 + 3·0.2 + 4·0.1 = 2.1
root 10, children 1 and 40, 20 below 40: cost = 0.2 + 2(0.3 + 0.4) + 3·0.1 = 1.9 (OPTIMAL BST)
root 20, children 1 and 40, 10 below 1: cost = 0.1 + 2(0.3 + 0.4) + 3·0.2 = 2.1
root 40, then 1, then 10, then 20: cost = 0.4 + 2·0.3 + 3·0.2 + 4·0.1 = 2
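The range-by-range construction used in this example can be sketched as a small dynamic program. This is an illustration for the successful-search-only cost, under assumptions: the function name `optimal_bst` and the half-open index convention are choices made here, and exact fractions are used so that ties are not disturbed by rounding.

```python
from fractions import Fraction


def optimal_bst(probs):
    """Return (cost, root) tables for an optimal BST, successful search only.

    cost[i][j] is the minimum cost of a BST on keys i .. j-1 (half-open range),
    root[i][j] is the index of the key chosen as its root.
    """
    n = len(probs)
    cost = [[0] * (n + 1) for _ in range(n + 1)]      # empty ranges cost 0
    root = [[None] * (n + 1) for _ in range(n + 1)]
    prefix = [0]
    for pr in probs:
        prefix.append(prefix[-1] + pr)

    for size in range(1, n + 1):                      # grow the range step by step
        for i in range(n - size + 1):
            j = i + size
            weight = prefix[j] - prefix[i]            # every key in the range sinks one level
            best, best_r = min(
                (cost[i][r] + cost[r + 1][j], r) for r in range(i, j)
            )
            cost[i][j] = best + weight
            root[i][j] = best_r
    return cost, root


keys = [1, 10, 20, 40]
probs = [Fraction(3, 10), Fraction(2, 10), Fraction(1, 10), Fraction(4, 10)]
cost, root = optimal_bst(probs)
best_cost = cost[0][len(keys)]
best_root = keys[root[0][len(keys)]]
```

Each range tries every key as its root and reuses the stored optimal costs of the two smaller ranges, so the three nested loops give the O(n^3) running time.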
Complexity
Compute all optimal BSTs for all Cij, i,j=1,2..n
Let m=j-i: number of keys in range Cij
n-m-1 Cij’s must be computed
The one with the minimum cost must be
found, this takes O(m(n-m-1)) time
For all Cij’s it takes $\sum_{1 \le m \le n} (nm - m^2) = O(n^3)$
There is a better O(n^2) algorithm by Knuth
There is also an O(n) greedy algorithm
Optimal BSTs
High probability keys should be near
the root
But, the value of a key is also a factor
It may not be desirable to put the
smallest or largest key close to the
root => this may result in skinny trees
(e.g., lists)