IERG6120
Lecture 14 – Introduction to Regenerating Codes
Kenneth Shum
Nov 2016
Outline
• Single-loss Recovery
– Information flow graph and cut-set bound
– An explicit code construction
• Cooperative Recovery
– Further reduction in the repair traffic
– An explicit code construction
Recall from lecture 0
• Locality
– a.k.a. repair degree
– The number of nodes contacted by the new node.
• Repair bandwidth
– The amount of data downloaded from the contacted nodes.
Three examples
                     Repetition scheme    Reed-Solomon Codes   Regenerating codes
Storage efficiency   1/2                  1/2                  1/2
Reliability          Tolerate one         Tolerate any two     Tolerate any two
                     disk failure         disk failures        disk failures
Repair bandwidth     1G                   2G                   1.5G
Locality             1                    2                    3
2x Repetition scheme
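• Divide the data file into 2 parts, A and B.
• Four nodes store A, B, A, B (1G per node); a data collector reads one copy of A and one copy of B.
• Cannot tolerate double disk failures (for example, the loss of both copies of A).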
Repair for repetition-based system
[Figure: the new node replaces a failed node that stored A and downloads A (1G) from the surviving copy.]
• Locality = 1
• Repair bandwidth = 1G
Reed-Solomon Code
• Divide the file into 2 parts, A and B.
• Four nodes store A, B, A+B and A+2B.
• A data collector can decode from the surviving nodes, so the code can tolerate double disk failures.
Repair requires essentially decoding the whole file
[Figure: the new node that replaces the node storing A downloads 1G from each of two surviving nodes and recomputes A.]
• Locality = 2
• Repair bandwidth = 2G
Vector-linear code
(Wu and Dimakis, Int. Symp. Inform. Theory, 2009)
• The file consists of the packets A1, A2, B1, B2.
• Four nodes store two packets each:
– A1, A2
– B1, B2
– C1 = A1 + B1, C2 = 2 A2 + B2
– D1 = 2 A1 + B1, D2 = A2 + B2
Repair with "network coding"
[Figure: the new node regenerates A1, A2 by downloading from the three surviving nodes, which store (B1, B2), (C1, C2) and (D1, D2).]
• Locality = 3
• Repair bandwidth = 1.5G
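The repair with locality 3 and bandwidth 1.5G can be checked numerically. The sketch below is only an illustration under stated assumptions: the slides do not name the field, so the prime field GF(5) is assumed, and the choice that every helper sends the sum of its two packets is a hypothetical one that happens to align the B1 + B2 interference. The function names are not from the lecture.

```python
from itertools import product

P = 5  # assumed prime field size; the slides do not specify the field


def encode(a1, a2, b1, b2):
    """Return the four nodes of the vector-linear code, two packets each."""
    return [
        (a1, a2),
        (b1, b2),
        ((a1 + b1) % P, (2 * a2 + b2) % P),   # C1, C2
        ((2 * a1 + b1) % P, (a2 + b2) % P),   # D1, D2
    ]


def repair_node1(nodes):
    """Regenerate (A1, A2) from one packet per surviving node."""
    # Hypothetical helper rule: each helper sends the sum of its two packets.
    s2 = sum(nodes[1]) % P          # B1 + B2
    s3 = sum(nodes[2]) % P          # A1 + 2*A2 + (B1 + B2)
    s4 = sum(nodes[3]) % P          # 2*A1 + A2 + (B1 + B2)
    x = (s3 - s2) % P               # A1 + 2*A2
    y = (s4 - s2) % P               # 2*A1 + A2
    inv3 = pow(3, -1, P)            # 3 is invertible because P != 3
    a1 = (2 * y - x) * inv3 % P     # 2y - x = 3*A1
    a2 = (2 * x - y) * inv3 % P     # 2x - y = 3*A2
    return a1, a2


# Exhaustive check over all files (A1, A2, B1, B2) in GF(5)^4.
all_repairs_ok = all(
    repair_node1(encode(*data)) == data[:2]
    for data in product(range(P), repeat=4)
)
print(all_repairs_ok)
```

Three packets are downloaded instead of the four needed to decode the whole file, which is where the 1.5G versus 2G saving comes from. The same rule would fail over GF(3), because the 2 x 2 system has determinant -3.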
Singleton vs Gopalan et al.
Singleton
Gopalan et al.
If locality r is strictly less than k,
then the code cannot be MDS.
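The two bounds being compared can be written out as follows. This is a sketch in standard notation that is assumed here rather than taken from the slide: d_min is the minimum distance of an (n, k) code (not the repair degree d used later), and r is the locality.

```latex
% Singleton bound
d_{\min} \le n - k + 1
% Bound of Gopalan et al. for locality r
d_{\min} \le n - k - \left\lceil \frac{k}{r} \right\rceil + 2
% If r < k, then \lceil k/r \rceil \ge 2, so
d_{\min} \le n - k < n - k + 1
```

The last line shows why a code with locality r < k cannot meet the Singleton bound with equality.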
Three examples
                     Repetition scheme    Reed-Solomon Codes   Regenerating codes
Storage efficiency   1/2                  1/2                  1/2
Reliability          Tolerate one         Tolerate any two     Tolerate any two
                     disk failure         disk failures        disk failures
Repair bandwidth     1G                   2G                   1.5G
Locality             1                    2                    3
MDS property         Not MDS              MDS                  MDS
SINGLE NODE FAILURE
Information flow graph
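[Figure: information flow graph. A source holding the file of size B feeds the storage nodes; storage node i is drawn as a pair of vertices In_i, Out_i (i = 1, …, 4) joined by an edge of capacity α, and a data collector connects to the Out vertices.]
(Dimakis et al., INFOCOM, May 2007)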
Information flow graph (cont’d)
[Figure: the same graph after a node failure. A newcomer, drawn as In5 and Out5, downloads β from each surviving node it contacts, and the data collector may connect to Out5.]
Back to the opening example
[Figure: information flow graph of the opening example, with B = 4, α = 2 and a newcomer (In5, Out5).]
• Cut-set condition: 2 + 2β ≥ 4
• Hence β ≥ 1
Different β (three cuts)
[Figure: information flow graph with B = 4 and α = 2; the newcomer (In6, Out6) downloads β₁, β₂ and β₃ from three surviving nodes.]
• First cut: 2 + β₁ + β₂ ≥ 4
• Second cut: 2 + β₁ + β₃ ≥ 4
• Third cut: 2 + β₂ + β₃ ≥ 4
• Adding the three inequalities: β₁ + β₂ + β₃ ≥ 3
Min-cut bound
For d ≥ k
• B: total file size
• α: storage per node
• dβ: total repair bandwidth
• k: no. of connections from a data collector
[Figure: a data collector (DC) connected to k nodes; a newcomer connected to d nodes.]
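The cut-set bound of Dimakis et al. is usually stated as B ≤ Σ_{i=0}^{k-1} min{α, (d−i)β}. The sketch below assumes that form; the function name `min_cut_capacity` is illustrative and not from the lecture. It evaluates the right-hand side and checks it against the opening example (B = 4, α = 2, d = 3, k = 2).

```python
from fractions import Fraction


def min_cut_capacity(alpha, beta, d, k):
    """Right-hand side of the assumed bound: sum of min(alpha, (d - i) * beta)."""
    return sum(min(alpha, (d - i) * beta) for i in range(k))


B, alpha, d, k = 4, 2, 3, 2

# beta = 1 is just enough to support a file of size B = 4.
enough = min_cut_capacity(alpha, Fraction(1), d, k)

# A slightly smaller beta falls short of B.
short = min_cut_capacity(alpha, Fraction(9, 10), d, k)

print(enough, short, enough >= B, short >= B)
```

With β = 1 the newcomer downloads dβ = 3 packets, which matches the 1.5G repair bandwidth of the opening example if a packet is 0.5G.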
Waterfilling interpretation
• Given B, d, k, β
• Find the minimum storage α*
[Figure: k columns, indexed 1, 2, …, k, of heights dβ, (d–1)β, …, (d–k+1)β; the columns are filled up to the level α* so that the filled area equals B.]
Tradeoff between storage and bandwidth
[Plot: storage per node α versus repair bandwidth per failed node dβ, for B = 1, n > 16, k = 4, d = 16.]
Waterfilling when β is large
• α* = B/k
• Any combination of k nodes will contain B bits.
[Figure: waterfilling diagram over columns 1, 2, …, k of heights dβ, (d–1)β, …, (d–k+1)β, with water level α* = B/k and Area = B.]
Minimum-storage regeneration (MSR)
[Plot: storage per node α versus repair bandwidth dβ for B = 1, n > 16, k = 4, d = 16, with the minimum-storage point marked; inset: waterfilling diagram over columns 1, 2, …, k with Area = B.]
Mincut
[Figure: a minimum cut between the source S and the data collector DC in the information flow graph on nodes 1, …, n.]
Decreasing the repair bandwidth
[Plot: the same tradeoff curve and waterfilling inset (B = 1, n > 16, k = 4, d = 16, Area = B), shown for two successively smaller values of the repair bandwidth dβ.]
Minimum-bandwidth regeneration (MBR)
[Plot: the same tradeoff curve and waterfilling inset (B = 1, n > 16, k = 4, d = 16, Area = B), with the minimum-bandwidth point marked.]
Storage-bandwidth tradeoff curve
[Plot: minimum storage per node α versus repair bandwidth dβ for B = 1, n > 12, k = 10, d = 12; the two end points of the curve are labeled Min-Bandwidth regeneration (MBR) and Min-Storage regeneration (MSR).]
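The two end points of the tradeoff curve have closed forms in Dimakis et al.: at MSR, α = B/k and dβ = Bd/(k(d−k+1)); at MBR, α = dβ = 2Bd/(k(2d−k+1)). The sketch below assumes these formulas; the function names and the symbol gamma for dβ are illustrative. It evaluates them for the parameter sets that appear on the plots.

```python
from fractions import Fraction


def msr_point(B, k, d):
    """(alpha, gamma) at the minimum-storage point, with gamma = d * beta."""
    alpha = Fraction(B, k)
    gamma = Fraction(B * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(B, k, d):
    """(alpha, gamma) at the minimum-bandwidth point; storage equals bandwidth."""
    gamma = Fraction(2 * B * d, k * (2 * d - k + 1))
    return gamma, gamma


for B, k, d in [(1, 4, 16), (1, 10, 12)]:
    a_msr, g_msr = msr_point(B, k, d)
    a_mbr, g_mbr = mbr_point(B, k, d)
    print(f"k={k}, d={d}: MSR=({float(a_msr):.4f}, {float(g_msr):.4f}) "
          f"MBR=({float(a_mbr):.4f}, {float(g_mbr):.4f})")
```

For B = 1, k = 4, d = 16 this gives an MSR point near (0.25, 0.308) and an MBR point near (0.276, 0.276), consistent with the axis ranges of the plots above.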
AN EXPLICIT CONSTRUCTION FOR EXACT-REPAIR MBR CODE
An encoding scheme by Rashmi, Shah, Kumar and Ramchandran
• 9 information symbols: A, B, C, D, E, F, G, H, I
• Parity-check symbol P = A+B+C+D+E+F+G+H+I
• Any 9 symbols among A, B, C, D, E, F, G, H, I and P can recover the data.
• Place the 10 symbols on the edges of the complete graph on 5 vertices.
[Figure: complete graph on vertices 1 to 5 with edge labels A = {1,2}, B = {1,3}, C = {1,4}, D = {1,5}, E = {2,3}, F = {2,4}, G = {2,5}, H = {3,4}, I = {3,5}, P = {4,5}.]
An encoding scheme by Rashmi, Shah, Kumar and Ramchandran
• Distribute the 10 encoded symbols to five nodes.
• Each node is identified with a vertex in the graph.
• A node stores the symbols on the edges incident with it.
• Each node stores α = 4 symbols.
• Each symbol is replicated twice.
– Node 1: A, B, C, D
– Node 2: A, E, F, G
– Node 3: B, E, H, I
– Node 4: C, F, H, P
– Node 5: D, G, I, P
Repair
• If a node fails, request the lost symbols from the adjacent nodes.
• Repair bandwidth γ = 4
• Example: if node 5 fails, it recovers D, G, I, P from nodes 1, 2, 3, 4 respectively.
• Example: if node 3 fails, it recovers B, E, H, I from nodes 1, 2, 4, 5 respectively.
Decode
• Any two nodes share one common symbol.
• Any two nodes contain 7 distinct symbols.
• Any three nodes contain 9 distinct symbols.
• For example, nodes 1, 4, 5 contain A, B, C, D, F, G, H, I, P. Compute E by A+B+C+D+F+G+H+I+P.
It attains the minimum-bandwidth regeneration point
• File size B = 9.
• Any k = 3 nodes can decode the original data.
• A failed node downloads from d = 4 nodes.
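The complete-graph construction, its repair rule and its decoding rule can be sketched in a few lines. The sketch assumes that symbols are integers added by bitwise XOR (the slides write "+" without naming the field; XOR is consistent with computing E as the sum of the other nine symbols). The function names are illustrative.

```python
from functools import reduce
from itertools import combinations
from operator import xor

EDGES = list(combinations(range(1, 6), 2))  # the 10 edges of K5, in order A..I, P


def build_nodes(info):
    """Map 9 information symbols plus their parity onto the edges of K5."""
    symbols = list(info) + [reduce(xor, info)]       # append parity P
    edge_symbol = dict(zip(EDGES, symbols))
    return {v: {e: s for e, s in edge_symbol.items() if v in e}
            for v in range(1, 6)}


def repair(nodes, failed):
    """Each surviving node sends the single symbol on its edge to `failed`."""
    return {e: s for v, store in nodes.items() if v != failed
            for e, s in store.items() if failed in e}


def decode(nodes, chosen):
    """Recover the 9 information symbols from any 3 nodes."""
    known = {}
    for v in chosen:
        known.update(nodes[v])
    missing = [e for e in EDGES if e not in known]
    assert len(missing) == 1                 # three vertices of K5 miss one edge
    known[missing[0]] = reduce(xor, known.values())  # XOR of all 10 symbols is 0
    return [known[e] for e in EDGES[:9]]


info = [3, 1, 4, 1, 5, 9, 2, 6, 5]
nodes = build_nodes(info)
repairs_ok = all(repair(nodes, f) == nodes[f] for f in nodes)
decodes_ok = all(decode(nodes, c) == info for c in combinations(nodes, 3))
print(repairs_ok, decodes_ok)
```

Each node stores 4 symbols and a repair downloads 4 symbols, one from each of the d = 4 helpers, so storage equals repair bandwidth, as expected at the minimum-bandwidth point.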
MULTIPLE NODE FAILURES
Multiple node failures
• Large-scale storage system
– Google data center
– 800,000 servers, failure rate = 4% per year
– Repair in 2 days
– Mean number of failed servers in 2 days = 175.
• The lazy-repair policy in TotalRecall
– A repair process is triggered only after the number of failed nodes has reached a certain threshold.
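The figure of 175 failed servers follows from simple arithmetic, sketched below under the assumption that failures are spread uniformly over a 365-day year. The variable names are illustrative.

```python
servers = 800_000
annual_failure_rate = 0.04      # 4% of servers fail per year
repair_window_days = 2

# Expected number of servers that fail within one repair window.
expected_failures = servers * annual_failure_rate * repair_window_days / 365
print(round(expected_failures))  # 175
```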
Jointly repair multiple failures
[Figure: several newcomers download from the surviving storage nodes and then exchange data among themselves.]
• Can we further reduce the repair bandwidth?
(Hu et al., JSAC, Feb 2010)
Multiple failures, separate repair
• 8 packets in total
• 4 packets per newcomer
[Figure: two nodes of the vector-linear code (A1, A2), (B1, B2), (A1+B1, 2 A2+B2), (2 A1+B1, A2+B2) have failed and are repaired separately.]
Multiple failures, cooperative repair (II)
• 6 packets in total
• 3 packets per newcomer
[Figure: two failed nodes of the vector-linear code are repaired cooperatively; the newcomers download packets from the surviving nodes and exchange packets with each other.]
Information flow graph
[Figure: information flow graph for the cooperative repair of nodes 6 and 7. Each newcomer is drawn as three vertices In, Mid and Out; the edge labels 1 and 2 are capacities in packets; a data collector connects to the Out vertices.]
Is this regenerating code optimal?
• 6 packets in total
• 3 packets per newcomer
[Figure: the cooperative repair example for the vector-linear code.]
First cut
[Figure: information flow graph in which nodes 6 and 7 are repaired cooperatively (vertices In, Mid, Out for each newcomer), with the first cut marked.]
• B ≤ 4β₁
Second cut
[Figure: information flow graph with vertices In, Mid and Out for nodes 1 to 4 and a data collector, with the second cut marked.]
• B ≤ 2 + β₁ + β₂
A linear programming problem
• Minimize 2β₁ + β₂ (repair bandwidth)
• Subject to
– 4 ≤ 4β₁
– 4 ≤ 2 + β₁ + β₂
– β₁, β₂ ≥ 0
• ⇒ β₁ ≥ 1 and β₁ + β₂ ≥ 2
• ⇒ At least 3 packets
[Figure: feasible region in the (β₁, β₂) plane.]
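The small linear program can be solved by enumerating the vertices of the feasible region. The sketch below assumes the constraints read 4 ≤ 4β₁, 4 ≤ 2 + β₁ + β₂ and β₁, β₂ ≥ 0 with objective 2β₁ + β₂; the variable names b1, b2 and the helper functions are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, c), meaning a1*b1 + a2*b2 >= c.
CONSTRAINTS = [
    (4, 0, 4),  # first cut:  4*b1 >= 4
    (1, 1, 2),  # second cut: b1 + b2 >= 2 (after subtracting alpha = 2)
    (1, 0, 0),  # b1 >= 0
    (0, 1, 0),  # b2 >= 0
]


def feasible(b1, b2):
    return all(a1 * b1 + a2 * b2 >= c for a1, a2, c in CONSTRAINTS)


def vertices():
    """Intersect every pair of constraint boundaries; keep the feasible points."""
    for (a1, a2, c), (p1, p2, q) in combinations(CONSTRAINTS, 2):
        det = a1 * p2 - a2 * p1
        if det == 0:
            continue                          # parallel boundaries
        b1 = Fraction(c * p2 - a2 * q, det)   # Cramer's rule
        b2 = Fraction(a1 * q - c * p1, det)
        if feasible(b1, b2):
            yield b1, b2


def cost(point):
    b1, b2 = point
    return 2 * b1 + b2


# The objective has positive coefficients and the variables are nonnegative,
# so the minimum is attained at a vertex.
best = min(vertices(), key=cost)
print(best, cost(best))
```

The minimum is attained at β₁ = β₂ = 1 with cost 3, which matches the 3 packets per newcomer used by the cooperative repair example.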
MBCR and MSCR
[Plot: storage per node versus repair bandwidth per failed node, with one curve for one-by-one repair and one for cooperative repair; the end points of the cooperative curve are labeled Minimum bandwidth cooperative repair (MBCR) and Minimum storage cooperative repair (MSCR).]
MBCR and MSCR (cont’d)
• Repair t nodes cooperatively
• Minimum storage cooperative regeneration
• Minimum bandwidth cooperative regeneration
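The two operating points have closed forms in Shum and Hu's paper on cooperative regenerating codes. The sketch below assumes the following expressions, where gamma is the repair bandwidth per newcomer: MSCR has α = B/k and γ = B(d+t−1)/(k(d−k+t)); MBCR has α = γ = B(2d+t−1)/(k(2d−k+t)). The function names are illustrative. With t = 1 both reduce to the single-failure MSR and MBR points.

```python
from fractions import Fraction


def mscr_point(B, k, d, t):
    """(alpha, gamma) at minimum storage when t nodes are repaired jointly."""
    return Fraction(B, k), Fraction(B * (d + t - 1), k * (d - k + t))


def mbcr_point(B, k, d, t):
    """(alpha, gamma) at minimum bandwidth when t nodes are repaired jointly."""
    gamma = Fraction(B * (2 * d + t - 1), k * (2 * d - k + t))
    return gamma, gamma


# Parameters taken from the comparison plots: file size 2275, d = 30, k = 5.
for t in (1, 10):
    print(t, mscr_point(2275, 5, 30, t), mbcr_point(2275, 5, 30, t))
```

For these parameters, repairing 10 newcomers jointly lowers the MSCR bandwidth from 525 to 507 and the MBCR bandwidth from 487.5 to 483.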
How much can we improve?
[Plot: storage per node α versus repair bandwidth per failed node, for file size = 2275, d = 30, k = 5; one curve for one-by-one repair and one for repairing 10 newcomers jointly.]
• When d is large, joint repair does not have a significant advantage over one-by-one repair.
How much can we improve?
[Plot: storage per node α versus repair bandwidth per failed node, for file size = 616, d = 8, k = 4; one curve for one-by-one repair and one for repairing 10 newcomers jointly.]
• Repair-bandwidth reduction is more prominent when d is not so large.
AN EXPLICIT CONSTRUCTION FOR MINIMUM-BANDWIDTH COOPERATIVE REPAIR
Existing constructions of cooperative regenerating codes
Type   Code parameters                     Reference
MBCR   n = d+t, d = k, t ≥ 1               Shum and Hu (2011)
MBCR   n = d+t, d ≥ k, t ≥ 1               Jiekak and Le Scouarnec (2012)
MBCR   n ≥ d+t, d ≥ k, t ≥ 1               Wang and Zhang (2013)
MSCR   n = d+2, k = t = 2                  Le Scouarnec (2012)
MSCR   n = 2k, d = n-t, k ≥ 2, t = 2       Chen and Shum (2013)
MSCR   n = 2k, d = n-t, k ≥ t ≥ 2          Chen and Shum (2013)
       (repair of systematic nodes only)
• n: no. of storage nodes
• k: k-out-of-n property
• d: no. of surviving nodes contacted by a newcomer
• t: no. of new nodes repaired cooperatively
• MBCR: minimum bandwidth cooperative regeneration
• MSCR: minimum storage cooperative regeneration
An explicit construction for MBCR
Require d = k, t = n – d
• B = 8 information packets
• n = 4 nodes
• Each node stores 5 packets.
• Repair t = 2 failures simultaneously
• No. of connections for each DC = k = 2
• No. of helpers for each failed node = d = 2
• Minimum repair bandwidth
• Storage per node
(Shum and Hu, ISIT 2011)
Min-Bandwidth point
[Plot: storage per node versus repair bandwidth per failed node when repairing 2 new nodes cooperatively, with the minimum-bandwidth point marked.]
Data Distribution
• 8 data packets: A, B, C, D, E, F, G, H
• The four nodes store
– A, B, C, D, F+G
– C, D, E, F, H+A
– E, F, G, H, B+C
– G, H, A, B, D+E
• Each node stores 5 packets: 4 systematic, 1 parity-check (+ denotes XOR)
Data collection
[Figure: a data collector connects to two nodes and recovers A, B, C, D, E, F, G, H.]
• Example: from the nodes storing (A, B, C, D, F+G) and (C, D, E, F, H+A), the collector obtains A, B, C, D, E, F, F+G, H+A and solves for G and H.
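The data-collection property (any k = 2 of the 4 nodes suffice) can be checked exhaustively. The sketch below assumes packets are integers combined by XOR, as the "XOR" label on the data-distribution slide suggests. The peeling decoder and all names are illustrative, not the lecture's method.

```python
from itertools import combinations

# Packet layout per node; a two-letter entry is the XOR of the two packets.
NODES = [
    ["A", "B", "C", "D", "FG"],
    ["C", "D", "E", "F", "HA"],
    ["E", "F", "G", "H", "BC"],
    ["G", "H", "A", "B", "DE"],
]


def store(data, layout):
    """Return the (names, value) pairs held by one node."""
    packets = []
    for names in layout:
        value = 0
        for name in names:
            value ^= data[name]
        packets.append((frozenset(names), value))
    return packets


def collect(packets):
    """Peeling decoder: resolve any packet with exactly one unknown name."""
    known = {}
    pending = list(packets)
    while pending:
        rest = []
        for names, value in pending:
            unknown = [x for x in names if x not in known]
            if len(unknown) == 1:
                for x in names:
                    if x in known:
                        value ^= known[x]    # strip the known part
                known[unknown[0]] = value
            elif len(unknown) > 1:
                rest.append((names, value))  # try again in the next pass
        if len(rest) == len(pending):
            break                            # no progress, decoding is stuck
        pending = rest
    return known


data = dict(zip("ABCDEFGH", [17, 42, 7, 99, 3, 250, 128, 64]))
any_two_decode = all(
    collect(store(data, NODES[i]) + store(data, NODES[j])) == data
    for i, j in combinations(range(4), 2)
)
print(any_two_decode)
```

Opposite pairs of nodes hold all eight systematic packets directly; adjacent pairs need their two parity packets to recover the two missing ones.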
Exact Repair
[Figure: two examples of repairing two failed nodes jointly. In the first, the nodes storing (A, B, C, D, F+G) and (E, F, G, H, B+C) are regenerated with the help of the nodes storing (C, D, E, F, H+A) and (G, H, A, B, D+E); the packets B+C and F+G appear on the links of the figure.]
• Total repair bandwidth = 10
References
• A. G. Dimakis, P. B. Godfrey, M. J. Wainwright and K. Ramchandran, Network coding for distributed storage systems, INFOCOM, May 2007.
• Y. Wu and A. G. Dimakis, Reducing repair traffic for erasure coding-based storage via interference alignment, ISIT, Jul 2009.
• Y. Hu, Y. Xu, X. Wang, C. Zhan and P. Li, Cooperative recovery of distributed storage systems from multiple losses with network coding, J. Sel. Area Comm., vol. 28, no. 2, Feb 2010.
• A.-M. Kermarrec, N. Le Scouarnec and G. Straub, Repairing multiple failures with coordinated and adaptive regenerating codes, NetCod, Jul 2011.
• K. W. Shum and Y. Hu, Cooperative regenerating codes, Trans. Information Theory, 2013.
Summary
• Regenerating codes are a class of erasure codes with
– (n,k)-recovery property
– Tradeoff between repair bandwidth and storage
• Cooperative regenerating codes repair multiple failures jointly
– Few explicit constructions are known for cooperative regenerating codes