
Soft decoding, dual BCH codes, & better ε-biased list decodable codes
Venkat Guruswami (U. of Washington)
Atri Rudra (U. at Buffalo)

Error-correcting codes
[Figure: a message x is encoded as C(x) and transmitted; the decoder sees a received word y with a ρ fraction of errors and either recovers x or gives up.]
- Mapping C : Σ^k → Σ^n
  - Dimension k, block length n
  - n ≥ k, so Rate = k/n ≤ 1
- Decoding complexity: efficient means polynomial in n
- ρ : fraction of errors

Codes for Complexity
- Binary codes
- Correct ρ = ½ - ε fraction of worst-case errors
  - ε → 0 (ε can depend on n)
- Unique decoding cannot correct more than a ¼ fraction of errors

The list decoding problem
- Given a code and an error parameter ρ
- For any received word w, output all codewords c such that c and w disagree in at most a ρ fraction of places
- (ρ, L)-list decodable code: every received word has at most L such codewords (brute-force illustration below)
- (½-ε, poly(n))-list decodable codes exist!
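
As a concrete (if naive) illustration of the definition above, the sketch below checks list decodability by brute force; the function names and the toy code are illustrative only and not part of the talk.

```python
def relative_distance(u, v):
    """Fraction of positions in which two equal-length words disagree."""
    return sum(a != b for a, b in zip(u, v)) / len(u)

def list_decode(codewords, w, rho):
    """Return all codewords that disagree with the received word w in at most
    a rho fraction of places.  The code is given as an explicit list of
    codewords, so this is a definition checker, not an efficient decoder."""
    return [c for c in codewords if relative_distance(c, w) <= rho]

# Toy example: a 4-bit code with three codewords, rho = 1/4.
code = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 1)]
print(list_decode(code, (0, 0, 0, 1), 0.25))
# -> [(0, 0, 0, 0), (0, 1, 0, 1)]  (a list of two close codewords)
```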

Applications of list decoding
- Hardcore predicates from one-way functions [Goldreich-Levin 89; Impagliazzo 97; Ta-Shma-Zuckerman 01]
- Worst-case vs. average-case hardness [Cai-Pavan-Sivakumar 99; Goldreich-Ron-Sudan 99; Sudan-Trevisan-Vadhan 99; Impagliazzo-Jaiswal-Kabanets 06]
- Pseudorandomness [Trevisan 99; Shaltiel-Umans 01; Ta-Shma-Zuckerman 01; Guruswami-Umans-Vadhan 07]
- Membership-comparable sets [Sivakumar 99]
- Approximating NP-witnesses [Gal-Halevi-Lipton-Petrank 99; Kumar-Sivakumar 99]

Approximating NP-witness
- L is NP-complete
  - x ∈ L ⟺ ∃ y such that R(x,y) holds
  - R is a polytime computable relation
  - It is NP-hard to compute y given x
- Approximating the certificate?
  - Compute w such that y and w differ in ≤ (½-ε)|y| positions
  - [Kumar-Sivakumar 99] NP-hard to approximate

Kumar-Sivakumar reduction
- Recall: x ∈ L ⟺ ∃ y such that R(x,y) holds
- NP-hard to do the approximation for a new relation R'
  - Main idea: replace y with C(y)
  - R'(x,z) holds iff ∃ y s.t. R(x,y) holds and z = C(y)
  - x ∈ L ⟺ ∃ z such that R'(x,z) holds
  - R' is also polytime computable
- Keep track of the required properties of C:
  - C is list decodable from a ½-ε fraction of errors
  - C can be decoded from zero errors in poly time

Kumar-Sivakumar reduction
- Recall: x ∈ L ⟺ ∃ y s.t. R(x,y) holds and z = C(y), i.e., R'(x,z) holds
- Assume we can compute, in poly time, a w s.t. w and z differ in ≤ (½-ε)|z| positions
- Run the list decoder for C on w
  - C is (½-ε, poly(|z|))-list decodable and can be list decoded in polytime
  - Returns a list y_1, …, y_m
- Check if R(x, y_i) holds for any i
- Then we can check if x ∈ L in poly time! (A sketch of this procedure follows below.)
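
A minimal sketch of this decision procedure, assuming black boxes R (the poly-time witness relation) and list_decode_C (a poly-time list decoder for C from a ½-ε fraction of errors); both names are placeholders, not anything from the talk.

```python
def decide_L(x, w, R, list_decode_C):
    """Decide whether x is in L, given a word w that (supposedly) differs from
    some z = C(y) with R(x, y) true in at most a (1/2 - eps) fraction of positions.

    list_decode_C(w) is assumed to return every message y whose encoding C(y)
    is within relative distance 1/2 - eps of w; by (1/2-eps, poly)-list
    decodability this list has polynomial size."""
    for y in list_decode_C(w):   # poly-size candidate list, computed in poly time
        if R(x, y):              # each check is poly time since R is
            return True          #   a poly-time computable relation
    return False                 # no candidate on the list is a genuine witness
```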


Some possible concerns…
- R' is not the same as R!
- [Gal-Halevi-Lipton-Petrank 99] considers hardness of approximation for the same R
  - Crucially uses properties of R and L

Required properties of C
- C : {0,1}^k → {0,1}^n
- n = poly(k, 1/ε)
- C can be list decoded from (½-ε)n errors in poly(n) time
- C can be constructed in poly(n) time
  - Not necessary in the NP-witness application
- Given these constraints, how small can n be?

Binary list decodable code
- Say n is O(k^b/ε^a)
  - a and b are constants
  - For the NP-witness application, the smallest achievable ε is ≈ 1/n^(1/a)
- What is the smallest a, given that b is some constant?
- (½-ε, poly(n))-list decodable codes with n in O(k/ε^2) exist
  - Existence proved via a random coding argument
  - a cannot be < 2

Efficient list decodable binary codes
- a = 4, b = 2; i.e., n in O(k^2/ε^4)  [Guruswami, Sudan 00]
  - The code is explicit: Hadamard "concatenated" with a Reed-Solomon code
  - ε-biased code: all non-zero codewords have a [½-ε, ½+ε] fraction of 1s
    - Important pseudorandom object
- a = 3, b = 1; i.e., n in O(k/ε^3)  [Guruswami, R. 06+07]
  - Construction and decoding time > n^(1/ε), so only useful for constant ε

Our main result
- a = 3+γ, b = 3; i.e., n in O(k^3/ε^(3+γ)) for any γ > 0
  - More or less recovers the a = 3 result [Guruswami, R. 06+07]
- The code is explicit: dual BCH "concatenated" with "folded" Reed-Solomon
- ε-biased code
- All algorithms run in time poly(k, 1/ε)

(Folded) Reed-Solomon codes
- View the message as f(X) = m_0 + m_1·X + … + m_{k-1}·X^{k-1}
- Evaluation points: F* = {1, g, g^2, …, g^(N-1)}, where g generates F* and q = N+1 = |F|
- Reed-Solomon (RS) codeword:  f(1), f(g), f(g^2), …, f(g^(N-1))
  - Optimal list decodability over large alphabets
- s-order folded RS codeword (N = ms): bundle s consecutive RS symbols into one folded symbol,
  ( f(1), …, f(g^(s-1)) ), ( f(g^s), …, f(g^(2s-1)) ), …, ( f(g^((m-1)s)), …, f(g^(ms-1)) )
  (a sketch of both encodings follows below)
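
A small sketch of the two encodings above. For simplicity it works over a prime field GF(p) with plain modular arithmetic (the actual construction lives over a large field F with q = N+1); the function names and the toy parameters are illustrative only.

```python
def rs_encode(message, p, g):
    """Reed-Solomon over GF(p): evaluate f(X) = m_0 + m_1*X + ... + m_{k-1}*X^{k-1}
    at the N = p - 1 points 1, g, g^2, ..., g^(N-1), where g generates GF(p)*."""
    N = p - 1
    return [sum(m * pow(g, i * j, p) for j, m in enumerate(message)) % p
            for i in range(N)]

def fold(codeword, s):
    """s-order folding (N = m*s): bundle s consecutive RS symbols into a single
    symbol of the folded code, so the length drops to m and the alphabet grows
    from GF(p) to GF(p)^s."""
    assert len(codeword) % s == 0
    return [tuple(codeword[i:i + s]) for i in range(0, len(codeword), s)]

# Toy example over GF(7), where g = 3 generates GF(7)* = {1, 3, 2, 6, 4, 5}.
rs = rs_encode([1, 2], p=7, g=3)   # f(X) = 1 + 2X
print(rs)           # RS codeword of length N = 6
print(fold(rs, 2))  # folded RS codeword: 3 symbols over GF(7)^2
```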

How do we get binary codes?
- Concatenation of codes [Forney 66]
  - C1 : (GF(2^k))^K → (GF(2^k))^N  ("outer" code)
  - C2 : (GF(2))^k → (GF(2))^n  ("inner" code)
  - C1∘C2 : (GF(2))^(kK) → (GF(2))^(nN)
- Typically k = O(log N)
  - Allows brute-force decoding of the inner code
[Figure: the message m = (m_1, …, m_K) is encoded by C1 into (w_1, …, w_N); each outer symbol w_i is then encoded by C2, and (C2(w_1), …, C2(w_N)) is the codeword C1∘C2(m). A structural sketch follows below.]
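
The sketch below shows the structure of concatenation with a Hadamard inner code; the outer encoder is left as an assumed black box (in the talk it would be RS or folded RS with symbols written as k-bit vectors), and the toy outer code at the end is only a stand-in so the example runs.

```python
def hadamard_encode(sym):
    """Inner code C2 (Hadamard): encode a k-bit symbol as the list of its
    inner products (mod 2) with all 2^k binary vectors, giving n = 2^k bits."""
    k = len(sym)
    return [sum(b & ((a >> i) & 1) for i, b in enumerate(sym)) % 2
            for a in range(2 ** k)]

def concatenate(outer_encode, inner_encode, message):
    """C1 o C2: outer-encode the message into N symbols (each a k-bit tuple),
    then inner-encode every outer symbol and lay the binary blocks side by side."""
    return [bit for sym in outer_encode(message) for bit in inner_encode(sym)]

# Toy stand-in outer code: a length-2 repetition code over GF(2^2).
toy_outer = lambda msg: [tuple(msg), tuple(msg)]
print(concatenate(toy_outer, hadamard_encode, (1, 0)))
# -> [0, 1, 0, 1, 0, 1, 0, 1]
```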

List Decoding concatenated code
- One natural decoding algorithm:
  - Divide up the received word into blocks of length n
  - Find the closest C2 codeword for each block
  - Run the list decoding algorithm for C1
- This loses information!

List Decoding C2
[Figure: each received block y_i ∈ GF(2)^n is list decoded with C2, producing a list T_i ⊆ GF(2)^k of candidate outer symbols.]
- How do we "list decode" C1 from these lists?

The list recovery problem
- Given a code and an error parameter ρ
- For any set of lists T_1, …, T_N such that |T_i| ≤ t, for every i
- Output all codewords c such that c_i ∈ T_i for at least a 1-ρ fraction of the i's (brute-force illustration below)
- List decoding is the special case with t = 1
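
A brute-force sketch of this definition (it ranges over an explicit codeword list, so it only illustrates the notion, not an efficient algorithm):

```python
def list_recover(codewords, lists, rho):
    """Return every codeword c with c_i in T_i for at least a (1 - rho)
    fraction of positions i.  Taking every |T_i| = 1 recovers ordinary
    list decoding."""
    N = len(lists)
    return [c for c in codewords
            if sum(c[i] in lists[i] for i in range(N)) >= (1 - rho) * N]

# Toy example: codewords over the alphabet {0, 1, 2}, N = 3, rho = 1/3.
code = [(0, 1, 2), (2, 1, 0)]
lists = [{0, 2}, {1}, {1, 2}]
print(list_recover(code, lists, 1/3))   # -> [(0, 1, 2), (2, 1, 0)]
```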

List Decoding C1∘C2
[Figure: the received blocks y_1, …, y_N are each list decoded with C2, and the resulting lists T_1, …, T_N are fed to a list recovering algorithm for C1; a sketch of this pipeline follows below.]
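
Putting the previous pieces together, here is a hedged sketch of the pipeline in this figure: brute-force list decode each inner block to form the lists T_i, then hand them to an assumed list-recovery routine for C1 (e.g. the folded RS algorithm); all names here are placeholders.

```python
def decode_concatenated(received, inner_codewords, n, eps, outer_list_recover):
    """List decode C1 o C2.  `inner_codewords` maps each inner message to its
    length-n codeword; `outer_list_recover` is an assumed black box that list
    recovers C1 from the lists T_1, ..., T_N."""
    blocks = [received[i:i + n] for i in range(0, len(received), n)]
    lists = []
    for y in blocks:
        # Keep *every* inner codeword within relative distance 1/2 - eps of the
        # block, rather than only the closest one (which would lose information).
        T = {msg for msg, cw in inner_codewords.items()
             if sum(a != b for a, b in zip(cw, y)) <= (0.5 - eps) * n}
        lists.append(T)
    return outer_list_recover(lists)
```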

[Guruswami, R. 06] Result
- Pick C1 to be a folded RS code of rate ε
  - Has optimal list recoverability
- Pick C2 to be a "suitably" chosen binary code of rate ε^2
- C1∘C2 is list decodable from a ½-ε fraction of errors

A Closer look…
[Figure: each block y_j is list decoded with C2 from a ½-ε fraction of errors, and the resulting lists T_j = {a_1, …, a_t} are passed to the list recovering algorithm for C1.]
- Every a_i placed in T_1 satisfies Dist(C2(a_i), y_1) ≤ ½-ε
- But what if Dist(C2(a_1), y_1) << Dist(C2(a_2), y_1)? The lists treat both candidates identically, discarding that information.

Weighted List Recovery
[Figure: for every position j and every possible inner message a_i ∈ {a_1, …, a_q}, the block y_j is assigned a weight w_{i,j}; the whole weight table is passed to the list recovering algorithm for C1.]
- Soft decoding: output all codewords with small weighted distance

List Recovery as a special case
[Figure: list recovery corresponds to a 0/1 step weight function — w_{i,j} = 1 when Dist(C2(a_i), y_j) ≤ ½-ε (i.e., a_i lands in T_j when list decoding C2), and w_{i,j} = 0 otherwise.]

A natural weight function
- Give more weight to "closer" symbols (see the sketch below)
- List decode from a ½-ε fraction of errors, with C1 an RS code and C2 the Hadamard code
  - Hadamard codewords are evaluations of linear functions
[Figure: w_{i,j} decreases linearly with Dist(C2(a_i), y_j), from 1 at distance 0 down to 0 at relative distance ½.]
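
The soft information on this slide can be tabulated directly; the sketch below uses the linear weight max(0, 1 - 2·dist) as an assumption (1 on an exact match, 0 once a block is at relative distance ½ from the inner codeword) — the exact normalization in the talk may differ.

```python
def soft_weights(blocks, inner_codewords):
    """For every received block y_j and every inner message a_i, compute a
    weight w[j][a_i] that grows as C2(a_i) gets closer to y_j."""
    weights = []
    for y in blocks:
        n = len(y)
        row = {}
        for msg, cw in inner_codewords.items():
            dist = sum(a != b for a, b in zip(cw, y)) / n   # relative distance
            row[msg] = max(0.0, 1.0 - 2.0 * dist)           # closer => heavier
        weights.append(row)
    return weights
```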

RS concat Hadamard
- Need to show the following for every j:  Σ_i w_{i,j}^2 ≤ O(1)
  - Follows from Parseval's identity (uses properties of the Hadamard code); see the sketch below
- N ≤ O(K/ε^2): RS code of rate ε^2
- n = q: Hadamard code
- Final length = nN = N^2 ≤ O(K^2/ε^4)  [GS00]
[Figure: the same linear weight function w_{i,j} vs. Dist(C2(a_i), y_j) as before.]
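
One standard way to see the Σ_i w_{i,j}^2 ≤ O(1) claim, sketched in LaTeX below; the ±1 normalization is an assumption made here for concreteness and may differ from the talk (and clipping negative weights to 0 only decreases the sum).

```latex
% View the received block y_j as a function y_j : GF(2)^m -> {+1,-1} (bit b
% becomes (-1)^b), and recall that the Hadamard codeword for a message a is
% the linear function x -> <a,x>.  Then the weight 1 - 2*Dist is a Fourier
% coefficient of y_j:
\[
  w_{i,j} \;=\; 1 - 2\,\mathrm{Dist}\bigl(C_2(a_i), y_j\bigr)
          \;=\; \mathop{\mathbb{E}}_{x}\Bigl[\, y_j(x)\,(-1)^{\langle a_i, x\rangle} \Bigr]
          \;=\; \widehat{y_j}(a_i).
\]
% Parseval's identity now gives the required bound:
\[
  \sum_{i} w_{i,j}^{2} \;=\; \sum_{a} \widehat{y_j}(a)^{2}
  \;=\; \mathop{\mathbb{E}}_{x}\bigl[\, y_j(x)^{2} \,\bigr] \;=\; 1 .
\]
```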

Using order s folded RS as outer code
- Need to show the following for every j:  Σ_i w_{i,j}^(s+1) ≤ O(1)  ….. (*), for s ≥ 1
  - Folded RS of rate ε^(1+1/s)
- Hadamard as inner code:
  - (*) is true
  - But n = q^s = N^s, since the folded RS alphabet size is q^s: too large ☹
[Figure: the same weight function w_{i,j} vs. Dist(C2(a_i), y_j) as before.]

Using dual BCH as inner code
- Need to show the following for every j:  Σ_i w_{i,j}^(s+1) ≤ O(1)  ….. (*), for s ≥ 1
  - Folded RS of rate ε^(1+1/s)
- Dual BCH as inner code:
  - (*) follows from [Kaufman-Litsyn 05]
  - n ≈ N^2 suffices
  - Final length = N^3 ≤ O(k^3/ε^(3+3/s))
[Figure: the same weight function w_{i,j} vs. Dist(C2(a_i), y_j) as before.]

Some comments
- The final codes are ε-biased
  - Non-zero dual BCH codewords have weight close to ½ (Weil-Carlitz-Uchiyama bound)
- Dual BCH is a generalization of Hadamard
- Our result needed to generalize both:
  - Outer code (RS → folded RS)
  - Inner code (Hadamard → dual BCH)
- Can replace folded RS with Parvaresh-Vardy codes

Open questions
- Improve on the cubic dependence on ε
  - Needs new algorithmic ideas
  - Use information about the outer codes while decoding the inner codes
  - Rate ε^2 using concatenated codes is possible [Guruswami, R. 08]
- Any application that also uses ε-biasedness?