Lecture 11

Representing Sets (2.3.3)
Huffman Encoding Trees (2.3.4)
Extended Introduction - Lecture 11
The Abstract Data Type Set
A set is a collection of distinct items. The following are
allowable set operations.
(empty-set? set)            Set → Boolean
(make-empty-set)            → Set
(element-of-set? x set)     Item × Set → Boolean
(adjoin-set x set)          Item × Set → Set
(union-set s1 s2)           Set × Set → Set
(intersection-set s1 s2)    Set × Set → Set
Contract: the operations have the usual
meaning of set operations.
Implementing Sets
• Must decide on a representation:
  how a set is represented.
• Then must write an implementation:
  the code of the operations.
  Each operation can be written separately.
We will look at several alternative
representations/implementations.
Version 1: Unordered List
Representation: a set is represented as a list.
No duplicates are allowed.
Implementation:
(define (empty-set? set) (null? set))
(define (make-empty-set) '())
Version 1: element-of-set?
(define (element-of-set? x set)
  (cond ((null? set) #f)
        ((equal? x (car set)) #t)
        (else (element-of-set? x (cdr set)))))
equal? behaves like eq? on symbols, but it also compares
the contents of lists and trees, and it can be applied to
numbers and strings.
(eq? (list 'a 'b) (list 'a 'b))     => #f
(equal? (list 'a 'b) (list 'a 'b))  => #t
Version 1: Adjoin-set
(define (adjoin-set x set)
  (if (element-of-set? x set)
      set
      (cons x set)))
Version 1: Intersection
(define (intersection-set set1 set2)
  (cond ((or (null? set1) (null? set2)) '())
        ((element-of-set? (car set1) set2)
         (cons (car set1)
               (intersection-set (cdr set1) set2)))
        (else (intersection-set (cdr set1) set2))))
Or is this better?
(define (intersection-set set1 set2)
  (cond ((or (null? set1) (null? set2)) '())
        ((element-of-set? (car set1) set2)
         (adjoin-set (car set1)
                     (intersection-set (cdr set1) set2)))
        (else (intersection-set (cdr set1) set2))))
Version 1: Union
(define (union-set set1 set2)
  (cond ((null? set1) set2)
        ((not (element-of-set? (car set1) set2))
         (cons (car set1)
               (union-set (cdr set1) set2)))
        (else (union-set (cdr set1) set2))))
Is the alternative better this time?
(define (union-set set1 set2)
  (cond ((null? set1) set2)
        (else
         (adjoin-set (car set1)
                     (union-set (cdr set1) set2)))))
Version 1: Time Complexity
Suppose a set contains n elements.
element-of-set?    Θ(n)
adjoin-set         Θ(n)
intersection-set   Θ(n²)
union-set          Θ(n²)
Version 2: Ordered List
Representation: a set (of numbers) is
represented as an ordered list without
duplicates.
(define (element-of-set? x set)
  (cond ((null? set) #f)
        ((= x (car set)) #t)
        ((< x (car set)) #f)
        (else (element-of-set? x (cdr set)))))
Time complexity: Θ(n)
You will implement adjoin-set yourself.
empty-set? and make-empty-set are the same as in Version 1.
Version 2: Intersection
intersection-set from version 1 will work here.
(define (intersection-set set1 set2)
  (cond ((or (null? set1) (null? set2)) '())
        ((element-of-set? (car set1) set2)
         (cons (car set1)
               (intersection-set (cdr set1) set2)))
        (else (intersection-set (cdr set1) set2))))
But its complexity is Θ(n²).
Can we do better?
Version 2: Better Intersection
(define (intersection-set set1 set2)
  (if (or (null? set1) (null? set2))
      '()
      (let ((x1 (car set1)) (x2 (car set2)))
        (cond ((= x1 x2)
               (cons x1
                     (intersection-set (cdr set1)
                                       (cdr set2))))
              ((< x1 x2)
               (intersection-set (cdr set1) set2))
              (else
               (intersection-set set1 (cdr set2)))))))
Version 2: Intersection Example
set1        set2        intersection
(1 3 7 9)   (1 4 6 7)   (1
(3 7 9)     (4 6 7)     (1
(7 9)       (4 6 7)     (1
(7 9)       (6 7)       (1
(7 9)       (7)         (1
(9)         ()          (1 7)
Time and space: Θ(n)
Union: similar.
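The slide only says that union is similar. One possible sketch, assuming the same ordered-list representation (numbers, increasing order, no duplicates), walks both lists in step just as the better intersection-set does. It is an illustration rather than the course's official solution.

```scheme
;; Union of two ordered lists of numbers without duplicates.
;; At each step the smaller head is emitted; equal heads are emitted once.
(define (union-set set1 set2)
  (cond ((null? set1) set2)
        ((null? set2) set1)
        (else
         (let ((x1 (car set1)) (x2 (car set2)))
           (cond ((= x1 x2)
                  (cons x1 (union-set (cdr set1) (cdr set2))))
                 ((< x1 x2)
                  (cons x1 (union-set (cdr set1) set2)))
                 (else
                  (cons x2 (union-set set1 (cdr set2)))))))))

(union-set '(1 3 7 9) '(1 4 6 7)) ; => (1 3 4 6 7 9)
```

Each recursive call drops at least one element from one of the lists, so the number of steps is at most the sum of the two lengths.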
Complexity
                   unordered   ordered
empty-set?         Θ(1)        Θ(1)
make-empty-set     Θ(1)        Θ(1)
element-of-set?    Θ(n)        Θ(n)
adjoin-set         Θ(n)        Θ(n)
intersection-set   Θ(n²)       Θ(n)
union-set          Θ(n²)       Θ(n)
Version 3: Binary Trees
Binary Search:
Lion in the desert
Version 3: Binary Trees
Store the elements in the nodes of a binary tree.
The values in the left subtree of a node v are all
smaller than the value stored at v.
The values in the right subtree of a node v are all
larger than the value stored at v.
A possible representation
of the set {1,3,5,7,9,12}:
        7
      /   \
     3     9
    / \     \
   1   5     12
Version 3: Binary Trees
A set has many representations:
[Figure: two binary trees storing the same elements 1, 3, 5, 7, 9, 12, labeled "Balanced Tree" (height Θ(log n)) and "Unbalanced Tree".]
Version 3: Representation
(define (make-tree entry left right)
  (list entry left right))
(define (entry tree)
  (car tree))
(define (left-branch tree)
  (cadr tree))
(define (right-branch tree)
  (caddr tree))
[Figure: the example tree with root 7 and nodes 3, 9, 1, 5, 12.]
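To see what this representation looks like as list structure, the sketch below builds the example tree for {1,3,5,7,9,12} by hand. The helper make-leaf-node and the name example-tree are hypothetical, introduced only for this illustration.

```scheme
(define (make-tree entry left right)
  (list entry left right))
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))

;; Hypothetical helper (not in the slides): a node with two empty subtrees.
(define (make-leaf-node x) (make-tree x '() '()))

;; Root 7; left subtree 3 with children 1 and 5; right subtree 9 with
;; only a right child, 12.
(define example-tree
  (make-tree 7
             (make-tree 3 (make-leaf-node 1) (make-leaf-node 5))
             (make-tree 9 '() (make-leaf-node 12))))

example-tree
;; => (7 (3 (1 () ()) (5 () ())) (9 () (12 () ())))
(entry (right-branch (right-branch example-tree))) ; => 12
```

Every node is a three-element list, and an empty subtree is the empty list.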
Version 3: Element-of-set
(define (element-of-set? x set)
  (cond ((null? set) #f)
        ((= x (entry set)) #t)
        ((< x (entry set))
         (element-of-set? x (left-branch set)))
        (else
         (element-of-set? x (right-branch set)))))
Complexity: Θ(h), where h is the height of the tree.
If the tree is balanced, then h ≈ log(n).
In the worst case, h ≈ n.
Version 3: Adjoin-set
(define (adjoin-set x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin-set x (left-branch set))
                    (right-branch set)))
        (else
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin-set x (right-branch set))))))
Complexity: Θ(h), where h is the height of the tree.
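Because adjoin-set never rebalances, the height h depends on the order in which elements arrive. The sketch below inserts the same seven numbers in two different orders and compares heights. The helper list->tree-set is hypothetical; height is defined as on the later "Random trees" slide.

```scheme
(define (make-tree entry left right) (list entry left right))
(define (entry tree) (car tree))
(define (left-branch tree) (cadr tree))
(define (right-branch tree) (caddr tree))

(define (adjoin-set x set)
  (cond ((null? set) (make-tree x '() '()))
        ((= x (entry set)) set)
        ((< x (entry set))
         (make-tree (entry set)
                    (adjoin-set x (left-branch set))
                    (right-branch set)))
        (else
         (make-tree (entry set)
                    (left-branch set)
                    (adjoin-set x (right-branch set))))))

;; Number of nodes on the longest path from the root down to a leaf.
(define (height tree)
  (if (null? tree)
      0
      (+ 1 (max (height (left-branch tree))
                (height (right-branch tree))))))

;; Hypothetical helper: adjoin the items one by one, left to right.
(define (list->tree-set items)
  (define (loop items tree)
    (if (null? items)
        tree
        (loop (cdr items) (adjoin-set (car items) tree))))
  (loop items '()))

(height (list->tree-set '(4 2 6 1 3 5 7))) ; => 3 (balanced)
(height (list->tree-set '(1 2 3 4 5 6 7))) ; => 7 (a chain)
```

Inserting in increasing order produces a tree in which every node has only a right child, which is the worst case h ≈ n.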
Complexity
We omit the trivial operations.
                   unordered   ordered   trees
element-of-set?    Θ(n)        Θ(n)      Θ(h)
adjoin-set         Θ(n)        Θ(n)      Θ(h)
intersection-set   Θ(n²)       Θ(n)      Θ(n)
union-set          Θ(n²)       Θ(n)      Θ(n)
If a tree is roughly balanced, then h ≈ log(n).
Main challenge: Keep the trees roughly balanced.
(More on this next term in “Data Structures”.)
Random trees are fairly balanced
(define (rand-tree n range)
  (if (= n 0)
      '()
      (adjoin-set (random range)
                  (rand-tree (- n 1) range))))
(define (height tree)
  (if (null? tree)
      0
      (+ 1 (max (height (left-branch tree))
                (height (right-branch tree))))))
(height (rand-tree 1000 1000000))   => ~22
(height (rand-tree 10000 1000000))  => ~31
• Averages over several runs.
• Height of a balanced tree: ~10 and ~13, respectively.
Huffman encoding trees
Data Transmission
[Figure: Alice sends the message “sos” to Bob.]
We wish to send information
efficiently from Alice to Bob.
Morse code is not necessarily the most
efficient code you could think of.
Fixed Length Codes
Represent data as a sequence of 0’s and 1’s.
Sequence: BACADAEAFABBAAAGAH
A fixed-length code (like ASCII, but with 3 bits per symbol):
A 000   B 001   C 010   D 011
E 100   F 101   G 110   H 111
Encoding of the sequence:
001000010000011000100000101000001001000000000110000111
The encoding is 18 × 3 = 54 bits long.
Can we make the encoding shorter?
Variable Length Code
Make use of frequencies: give frequent symbols short codewords.
Relative frequencies: A = 8, B = 3, all others 1.
A 0      B 100    C 1010   D 1011
E 1100   F 1101   G 1110   H 1111
Example: BACADAEAFABBAAAGAH
100010100101101100011010100100000111001111
42 bits (about 22% shorter than 54)
But how do we decode?
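The 42-bit figure can be checked mechanically. The sketch below stores the variable-length code as an association list and encodes the example sequence symbol by symbol. The names code-table, encode-symbol, encode-message, and sample-message are hypothetical, chosen for this illustration. In this particular message A occurs 9 times, B 3 times, and each other symbol once, so the length is 9·1 + 3·3 + 6·4 = 42.

```scheme
;; Each entry is (symbol bit ...), following the table on this slide.
(define code-table
  '((A 0) (B 1 0 0) (C 1 0 1 0) (D 1 0 1 1)
    (E 1 1 0 0) (F 1 1 0 1) (G 1 1 1 0) (H 1 1 1 1)))

;; Look up the list of bits for one symbol.
(define (encode-symbol sym)
  (let ((entry (assq sym code-table)))
    (if entry
        (cdr entry)
        (error "unknown symbol -- ENCODE-SYMBOL" sym))))

;; Concatenate the codewords of all symbols in the message.
(define (encode-message message)
  (if (null? message)
      '()
      (append (encode-symbol (car message))
              (encode-message (cdr message)))))

(define sample-message '(B A C A D A E A F A B B A A A G A H))

(encode-message '(B A C))                 ; => (1 0 0 0 1 0 1 0)
(length (encode-message sample-message))  ; => 42
```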
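Prefix code ↔ Binary tree
Prefix code: no codeword is a prefix of any other codeword.
A 0      B 100    C 1010   D 1011
E 1100   F 1101   G 1110   H 1111
[Figure: the binary tree for this code. Edges are labeled 0 and 1, the leaves are the symbols A through H, and the labels on the path from the root to a leaf spell out that symbol's codeword.]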
Decoding Example
Bits to decode: 10001010
100          → B
100 0        → BA
100 0 1010   → BAC
[Figure: the code tree from the previous slide; each codeword is read by following its bits from the root down to a leaf.]
Abstract representation of code trees
Constructors:
  make-leaf        - construct a leaf
  make-code-tree   - construct a code tree
Predicates:
  leaf?            - is the object a leaf?
Selectors:
  left-branch      - select the left branch
  right-branch     - select the right branch
  symbol-leaf      - the symbol attached to a leaf
Decoding a Message
(define (decode bits tree)
  (define (decode-one bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch
               (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-one (cdr bits) tree))
              (decode-one (cdr bits) next-branch)))))
  (decode-one bits tree))
(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit -- CHOOSE-BRANCH" bit))))
Huffman Tree = Optimal Length Code
[Figure: the Huffman tree for the weights A 8, B 3, and 1 for each of C, D, E, F, G, H, with edges labeled 0 and 1.]
Optimal: no prefix code has a smaller weighted average codeword length.
Representation
Each node is shown with its set of symbols and its weight:
{A,B,C,D,E,F,G,H} 17
  A 8
  {B,C,D,E,F,G,H} 9
    {B,C,D} 5
      B 3
      {C,D} 2
        C 1
        D 1
    {E,F,G,H} 4
      {E,F} 2
        E 1
        F 1
      {G,H} 2
        G 1
        H 1
Representation (Cont.)
(define (make-leaf symbol weight)
  (list 'leaf symbol weight))
(define (leaf? object)
  (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
Representation (Cont.)
(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
Representation (Cont.)
(define (symbols tree)
  (if (leaf? tree)
      (list (symbol-leaf tree))
      (caddr tree)))
(define (weight tree)
  (if (leaf? tree)
      (weight-leaf tree)
      (cadddr tree)))
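With the constructors and selectors in place, the decode procedure from the "Decoding a Message" slide can be tried on a small hand-built tree. The three-symbol tree sample-tree below is a hypothetical example, not one from the lecture; its weights are arbitrary and do not affect decoding.

```scheme
;; Representation, as defined on the slides.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; Decoding, as on the "Decoding a Message" slide.
(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit -- CHOOSE-BRANCH" bit))))
(define (decode bits tree)
  (define (decode-one bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch
               (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-one (cdr bits) tree))
              (decode-one (cdr bits) next-branch)))))
  (decode-one bits tree))

;; Hypothetical tree with codewords a = 0, b = 10, c = 11.
(define sample-tree
  (make-code-tree (make-leaf 'a 4)
                  (make-code-tree (make-leaf 'b 2)
                                  (make-leaf 'c 1))))

(decode '(1 0 0 1 1 0) sample-tree) ; => (b a c a)
```

Whenever a leaf is reached, its symbol is emitted and decoding restarts from the root, which is why the inner procedure keeps a reference to the whole tree.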
Huffman’s Algorithm
Build the tree bottom-up, so that the lowest-weight leaves are
farthest from the root.
Repeatedly, until a single tree remains:
  Find the two trees of lowest weight.
  Merge them to form a new tree whose
  weight is the sum of their weights.
Construction of Huffman tree
[Figure: the tree produced by the construction, the same tree as on the “Representation” slide: leaves A 8, B 3, and C through H with weight 1 each, merged bottom-up into the root {A,B,C,D,E,F,G,H} with weight 17.]
Construction of Huffman Tree
Initial leaves
{(A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1)}
Merge
{(A 8) (B 3) ({C D} 2) (E 1) (F 1) (G 1) (H 1)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F} 2) (G 1) (H 1)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F} 2) ({G H} 2)}
Merge
{(A 8) (B 3) ({C D} 2) ({E F G H} 4)}
Merge
{(A 8) ({B C D} 5) ({E F G H} 4)}
Merge
{(A 8) ({B C D E F G H} 9)}
Final merge
{({A B C D E F G H} 17)}
Construction of Huffman tree
(generate-huffman-tree '((A 8) (B 3) (C 1) (D 1)
                         (E 1) (F 1) (H 1) (G 1)))
=>
((leaf a 8)
 ((((leaf g 1) (leaf h 1) (g h) 2)
   ((leaf f 1) (leaf e 1) (f e) 2)
   (g h f e) 4)
  (((leaf d 1) (leaf c 1) (d c) 2)
   (leaf b 3)
   (d c b) 5)
  (g h f e d c b) 9)
 (a g h f e d c b) 17)
Each internal node has the form (left-branch right-branch symbols weight).
Construction of Huffman tree
(define (generate-huffman-tree pairs)
  (successive-merge (make-leaf-srt-lst pairs)))
;; Sort pairs: build a list of leaves ordered by increasing weight.
(define (make-leaf-srt-lst pairs)
  (if (null? pairs)
      '()
      (let ((pair (car pairs)))
        (insert-srt (make-leaf (car pair)   ; ordered insert
                               (cadr pair))
                    (make-leaf-srt-lst (cdr pairs))))))
Construction of Huffman tree
(define (insert-srt x s-lst)
  (cond ((null? s-lst) (list x))
        ((< (weight x) (weight (car s-lst))) (cons x s-lst))
        (else (cons (car s-lst)
                    (insert-srt x (cdr s-lst))))))
(define (successive-merge trees)
  (if (null? (cdr trees))
      (car trees)
      (let ((smallest (car trees))
            (second-smallest (cadr trees))
            (rest (cddr trees)))
        (successive-merge
         (insert-srt
          (make-code-tree smallest second-smallest)
          rest)))))
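Putting the pieces together, the sketch below builds the tree for the lecture's weights and computes its weighted path length, that is, the number of bits needed when each symbol occurs exactly as often as its weight. The helper encoded-length and the name sample-tree are hypothetical. The variable 2smallest is spelled second-smallest here on the assumption that some Scheme readers reject identifiers starting with a digit.

```scheme
;; Representation, as defined on the slides.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))
(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))

;; Construction, as on the slides.
(define (insert-srt x s-lst)
  (cond ((null? s-lst) (list x))
        ((< (weight x) (weight (car s-lst))) (cons x s-lst))
        (else (cons (car s-lst)
                    (insert-srt x (cdr s-lst))))))
(define (make-leaf-srt-lst pairs)
  (if (null? pairs)
      '()
      (let ((pair (car pairs)))
        (insert-srt (make-leaf (car pair) (cadr pair))
                    (make-leaf-srt-lst (cdr pairs))))))
(define (successive-merge trees)
  (if (null? (cdr trees))
      (car trees)
      (let ((smallest (car trees))
            (second-smallest (cadr trees))
            (rest (cddr trees)))
        (successive-merge
         (insert-srt (make-code-tree smallest second-smallest)
                     rest)))))
(define (generate-huffman-tree pairs)
  (successive-merge (make-leaf-srt-lst pairs)))

;; Hypothetical helper: sum over all leaves of weight * depth.
(define (encoded-length tree)
  (define (walk node depth)
    (if (leaf? node)
        (* (weight-leaf node) depth)
        (+ (walk (left-branch node) (+ depth 1))
           (walk (right-branch node) (+ depth 1)))))
  (walk tree 0))

(define sample-tree
  (generate-huffman-tree
   '((A 8) (B 3) (C 1) (D 1) (E 1) (F 1) (G 1) (H 1))))

(weight sample-tree)          ; => 17
(encoded-length sample-tree)  ; => 41
```

For these weights the total is 8·1 + 3·3 + 6·4 = 41 bits, against 17 × 3 = 51 bits for the fixed-length code.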
Summary
We saw today that even a seemingly simple
abstract concept like a set, when implemented on a
computer, can give rise to many implementations,
each with a different complexity.