Sub Graph Approach In Iterative Sum-Product Algorithm

Turbo – Coding – 2006 · April 3–7, 2006, Munich
Muhammet Fatih Bayramoğlu, Ali Özgür Yılmaz, Buyurman Baykal
Dept. of Electrical and Electronics Eng., Middle East Technical University
Abstract
This paper presents a new scheduling algorithm for the iterative sum-product algorithm, called sub-graph scheduling. The proposed schedule has a higher convergence rate than the standard iterative sum-product algorithm, while keeping the complexity of one iteration the same and without degrading the performance. Our method also explains why turbo decoders converge faster than LDPC decoders.
1 Introduction
The sum-product algorithm is a very efficient tool for solving the marginalization problem of a global function that can be factored into local functions, each of which has a smaller number of arguments [1]. The algorithm is usually used for obtaining marginal probability density functions (pdf) from a given joint pdf. It is in fact a generalization of several well-known algorithms from different areas of science, such as communications and artificial intelligence [1]. The BCJR algorithm [3], Pearl's belief propagation algorithm [4], Gallager's probabilistic decoding algorithm for LDPC codes [5], and the turbo decoding algorithm [6] are all instances of the sum-product algorithm [1].
The sum-product algorithm is a message passing algorithm that operates on factor graphs. A factor graph depicts the factorization of the global function into local functions that have fewer arguments. The classical sum-product algorithm determines the schedule according to which the nodes of the factor graph send their messages. Message passing terminates once a message has passed along every edge in each direction, at which point the sum-product algorithm provides the exact marginal functions.
However, if the factor graph contains loops, a message calculation schedule cannot be obtained according to the classical sum-product algorithm. In this case an iterative version of the sum-product algorithm is used. If the iterative sum-product algorithm requires a large number of iterations to converge, its complexity may be prohibitive for power-critical applications. Modifying the standard message update schedule of the iterative sum-product algorithm yields different convergence characteristics. In [11] a shuffled version of the iterative sum-product algorithm is proposed for decoding LDPC and turbo codes. In this algorithm a parity check node calculates its message more than once during an iteration. Although this algorithm improves the convergence rate, it also increases the complexity of one iteration.
The aim of sub-graph scheduling is to improve the convergence rate of the iterative sum-product algorithm. The main idea is to apply the divide-and-conquer strategy to the iterative sum-product algorithm. In sub-graph scheduling a loopy factor graph is divided into loop-free sub-graphs. The classical sum-product algorithm, which supplies the exact marginal functions very efficiently, then runs on these loop-free sub-graphs. In this way each sub-graph generates a portion of the total result. Since these partial results are calculated using only a portion of the total factor graph, they are not very accurate. To improve accuracy, each sub-graph is run again iteratively, using the results of the other sub-graphs as a priori information. The expectation is that the algorithm converges in fewer iterations than the iterative sum-product algorithm. An important property of the sub-graph sum-product algorithm is that the complexity of one of its iterations is exactly equal to that of one iteration of the standard iterative sum-product algorithm.
In [12] an implementation-oriented method for decoding LDPC codes is presented, which also results in a higher convergence rate. The message update schedule used in [12] can be obtained directly as an instance of the sub-graph sum-product algorithm. The sub-graph sum-product algorithm can also readily provide schedules with better convergence.
This paper is organized as follows. Section 2 gives a brief introduction to factor graphs and the sum-product algorithm. The sub-graph approach is explained in Section 3. Section 4 presents some numerical results, and Section 5 concludes the paper.
2 Factor Graphs and the Sum-Product Algorithm
This section summarizes factor graphs and the sum-product algorithm. It is intended as an introduction rather than a detailed explanation of these concepts; [1] and [2] are authoritative references for further information.
2.1 Factor Graphs
Factor graphs represent the factorization of a global function into local functions that have fewer arguments [1], [2]. They can be defined formally as follows:
Definition 2.1: Factor Graph: Let $X$ be a set of random variables $x_1, x_2, \ldots, x_n$. Let a function of these variables, $f(X)$, be factored into functions $g_i(X_i)$, where the $X_i$'s are subsets of $X$. In other words:
$$f(X) = \prod_{i} g_i(X_i). \tag{1}$$
Then a factor graph is defined as a bipartite graph consisting of a factor node for each factor function $g_i(X_i)$ and a variable node for each variable $x_k$. There is an edge between the $i$th factor node and the $k$th variable node if and only if $x_k \in X_i$.
Note that in this definition, a priori information is
also represented with a factor node.
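As an illustration of Definition 2.1, the following minimal sketch builds the bipartite edge set of a factor graph from the argument sets of its factors. The function name `build_factor_graph` and the three-factor example are assumptions made for illustration only; they do not come from the paper.

```python
from collections import defaultdict

def build_factor_graph(factor_scopes):
    """Build the bipartite structure of Definition 2.1.

    factor_scopes maps a factor name to the tuple of variable names it
    depends on (the subset X_i). Returns the edge set and both adjacency maps.
    """
    edges = set()
    factor_to_vars = {}
    var_to_factors = defaultdict(list)
    for factor, scope in factor_scopes.items():
        factor_to_vars[factor] = list(scope)
        for var in scope:
            # There is an edge iff the variable is an argument of the factor.
            edges.add((factor, var))
            var_to_factors[var].append(factor)
    return edges, factor_to_vars, dict(var_to_factors)

# Hypothetical example: f(x1, x2, x3) = g1(x1, x2) g2(x2, x3) g3(x3),
# where g3 plays the role of an a priori factor attached to x3 only.
scopes = {"g1": ("x1", "x2"), "g2": ("x2", "x3"), "g3": ("x3",)}
edges, f2v, v2f = build_factor_graph(scopes)
print(sorted(edges))
```

The single-argument factor `g3` shows how a priori information fits the same representation: it is simply a factor node with one edge.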
2.2 The Sum-Product Algorithm
The sum-product algorithm provides the exact marginal functions of a global function if it is represented by a loop-free factor graph. The algorithm formally determines both the message calculation schedule and the content of the messages. In this section we summarize the important aspects of the sum-product algorithm. For a complete explanation refer to Section II.C of [1].
One of the most important properties of the messages of the sum-product algorithm is that the message from a node p to a node r is independent of the message that p receives from r. This fact also determines the scheduling of the message calculation: a message is calculated as soon as it can be calculated. This means that a node p waits until all of its neighbors except one, q, have sent their messages. At this instant, node p can calculate and send its message to q. After p sends this message, it waits for a message from q, which is required for calculating the messages to all of its other neighbors. The schedule ends once a message has passed along every edge in each direction.
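The schedule described above can be sketched for a small loop-free factor graph over binary variables. The sketch below is an illustration under stated assumptions, not the formulation of [1]: the factor values are hypothetical, and the "wait until all other neighbors have sent" rule is realized by memoized recursion, so each directed message is computed once. The result is compared with brute-force marginalization.

```python
from functools import lru_cache
from itertools import product

DOMAIN = (0, 1)
# Hypothetical loop-free example: f(x1, x2, x3) = g1(x1, x2) g2(x2, x3) g3(x3)
FACTORS = {
    "g1": (("x1", "x2"), lambda a, b: 0.9 if a == b else 0.1),
    "g2": (("x2", "x3"), lambda b, c: 0.8 if b == c else 0.2),
    "g3": (("x3",), lambda c: 0.7 if c == 1 else 0.3),
}
VARIABLES = sorted({v for scope, _ in FACTORS.values() for v in scope})

def factors_of(var):
    return [name for name, (scope, _) in FACTORS.items() if var in scope]

@lru_cache(maxsize=None)
def msg_var_to_factor(var, factor):
    # Product of the messages from all neighboring factors except the target.
    out = []
    for value in DOMAIN:
        p = 1.0
        for other in factors_of(var):
            if other != factor:
                p *= msg_factor_to_var(other, var)[value]
        out.append(p)
    return tuple(out)

@lru_cache(maxsize=None)
def msg_factor_to_var(factor, var):
    # Sum over the other arguments of the factor, weighted by their messages.
    scope, func = FACTORS[factor]
    out = [0.0 for _ in DOMAIN]
    for assignment in product(DOMAIN, repeat=len(scope)):
        term = func(*assignment)
        for name, value in zip(scope, assignment):
            if name != var:
                term *= msg_var_to_factor(name, factor)[value]
        out[assignment[scope.index(var)]] += term
    return tuple(out)

def marginal(var):
    unnorm = [1.0 for _ in DOMAIN]
    for factor in factors_of(var):
        message = msg_factor_to_var(factor, var)
        unnorm = [u * m for u, m in zip(unnorm, message)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def brute_force_marginal(var):
    unnorm = [0.0 for _ in DOMAIN]
    for assignment in product(DOMAIN, repeat=len(VARIABLES)):
        env = dict(zip(VARIABLES, assignment))
        value = 1.0
        for scope, func in FACTORS.values():
            value *= func(*(env[name] for name in scope))
        unnorm[env[var]] += value
    total = sum(unnorm)
    return [u / total for u in unnorm]

for v in VARIABLES:
    print(v, marginal(v), brute_force_marginal(v))
```

On a graph with a loop the recursion would never terminate, which is the code-level counterpart of the statement that no such schedule exists for loopy graphs.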
2.2.1 The Iterative Sum-Product Algorithm
The schedule explained in Section 2.2 fails if there is a loop in the factor graph. This problem is solved by making the algorithm iterative. In the iterative sum-product algorithm, initially all of the variable nodes send empty messages. These initial messages enable all of the factor nodes to send their messages. Then the variable nodes send their messages again, and this process goes on iteratively until a stop condition is satisfied.
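A minimal sketch of this iterative schedule on a small loopy graph is given below. The graph (three pairwise factors forming a loop plus one a priori factor), the factor values, the use of all-ones vectors as "empty" messages, and the stop condition (largest message change below a tolerance) are all assumptions chosen for illustration.

```python
from itertools import product

DOMAIN = (0, 1)

def same(a, b):
    return 0.6 if a == b else 0.4

# Hypothetical loopy graph: x1 - x2 - x3 - x1, plus an a priori factor on x1.
FACTORS = {
    "g12": (("x1", "x2"), same),
    "g23": (("x2", "x3"), same),
    "g13": (("x1", "x3"), same),
    "p1": (("x1",), lambda a: 0.7 if a == 1 else 0.3),
}
EDGES = [(f, v) for f, (scope, _) in FACTORS.items() for v in scope]

def normalize(msg):
    total = sum(msg)
    return [m / total for m in msg]

def factor_update(var_to_fac):
    new = {}
    for fac, (scope, func) in FACTORS.items():
        for target in scope:
            out = [0.0, 0.0]
            for assignment in product(DOMAIN, repeat=len(scope)):
                term = func(*assignment)
                for name, value in zip(scope, assignment):
                    if name != target:
                        term *= var_to_fac[(name, fac)][value]
                out[assignment[scope.index(target)]] += term
            new[(fac, target)] = normalize(out)
    return new

def variable_update(fac_to_var):
    new = {}
    for fac, var in EDGES:
        out = [1.0, 1.0]
        for other, v in EDGES:
            if v == var and other != fac:
                out = [o * m for o, m in zip(out, fac_to_var[(other, var)])]
        new[(var, fac)] = normalize(out)
    return new

def iterative_sum_product(max_iters=100, tol=1e-9):
    # "Empty" initial messages from every variable node (all-ones vectors).
    var_to_fac = {(v, f): [1.0, 1.0] for f, v in EDGES}
    for iteration in range(1, max_iters + 1):
        fac_to_var = factor_update(var_to_fac)
        updated = variable_update(fac_to_var)
        change = max(abs(a - b)
                     for key in updated
                     for a, b in zip(updated[key], var_to_fac[key]))
        var_to_fac = updated
        if change < tol:  # stop condition (an assumption of this sketch)
            break
    beliefs = {}
    for var in sorted({v for _, v in EDGES}):
        b = [1.0, 1.0]
        for fac, v in EDGES:
            if v == var:
                b = [x * m for x, m in zip(b, fac_to_var[(fac, var)])]
        beliefs[var] = normalize(b)
    return beliefs, iteration

beliefs, iterations = iterative_sum_product()
print(iterations, beliefs)
```

Because the graph contains a loop, the resulting beliefs are approximations of the marginals rather than exact values.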
3 Sub-Graph Approach
The sub-graph approach determines a schedule for executing the factor nodes of a loopy factor graph by applying the divide-and-conquer strategy. Sub-graph scheduling may improve the convergence rate and/or the accuracy of the iterative sum-product algorithm. In this section we explain the details of the sub-graph approach.
The divide-and-conquer strategy basically consists of three steps. The first step is dividing the problem into smaller pieces that are expected to be solved easily. The second step is solving these smaller problems. The final step is combining the results obtained from the smaller pieces to form the result of the whole problem.
The factorization equation should be reconsidered for dividing the marginalization problem into smaller pieces. Assume that the global function $f(X)$ is factored into $n$ functions, $g_1(X_1), g_2(X_2), \ldots, g_n(X_n)$, and that $R_1, R_2, \ldots, R_k$ are mutually exclusive subsets of the set of numbers from 1 to $n$ such that $\bigcup_{i} R_i = \{1, 2, \ldots, n\}$. Then we can rewrite Equation (1) as below:
$$\begin{aligned} f(X) &= \prod_{i=1}^{n} g_i(X_i) \\ &= \prod_{i \in R_1} g_i(X_i) \prod_{i \in R_2} g_i(X_i) \cdots \prod_{i \in R_k} g_i(X_i) \\ &= \prod_{i=1}^{k} r_i(\hat{X}_i) \end{aligned} \tag{2}$$
where $\hat{X}_i = \bigcup_{l \in R_i} X_l$ and
$$r_i(\hat{X}_i) = \prod_{l \in R_i} g_l(X_l). \tag{3}$$
Equation (3) shows that the $r_i(\hat{X}_i)$'s are functions that can themselves be factored into functions with fewer arguments. Thus, the factorization of each $r_i(\hat{X}_i)$ can be represented with a factor graph. The original factor graph, which represents the marginalization problem of $f(X)$, is the superposition of these newly formed small factor graphs, which will be called "sub-graphs" from now on. In other words, the problem has been divided into pieces of the same kind.
The second step of the divide-and-conquer strategy is solving the problem for these smaller pieces. If the $R_i$'s are picked in such a way that the resulting $r_i(\hat{X}_i)$'s can be represented with loop-free sub-graphs, then solving the marginalization problem for the small sub-graphs is easy: the classical sum-product algorithm can calculate the marginal densities efficiently for these loop-free sub-graphs.
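The regrouping in Equations (2) and (3) can be checked numerically with a short sketch. The four factors and the index sets `PARTITION` below are hypothetical; the whole graph contains a loop through g1, g2 and g3, whereas each group taken by itself is loop-free.

```python
from itertools import product
from math import isclose, prod

# Hypothetical factorization over binary variables:
# f = g1(x1, x2) g2(x2, x3) g3(x1, x3) g4(x3)
FACTORS = {
    1: (("x1", "x2"), lambda a, b: 0.9 if a == b else 0.1),
    2: (("x2", "x3"), lambda a, b: 0.8 if a == b else 0.2),
    3: (("x1", "x3"), lambda a, b: 0.6 if a == b else 0.4),
    4: (("x3",), lambda a: 0.7 if a == 1 else 0.3),
}
VARIABLES = ("x1", "x2", "x3")

def g(i, env):
    scope, func = FACTORS[i]
    return func(*(env[name] for name in scope))

def f(env):
    # Global function: product of all local functions.
    return prod(g(i, env) for i in FACTORS)

def r(index_set, env):
    # r_i of Equation (3): product of the factors whose indices are in R_i.
    return prod(g(l, env) for l in index_set)

def scope_of(index_set):
    # X-hat_i: union of the argument sets X_l for l in R_i.
    return set().union(*(FACTORS[l][0] for l in index_set))

PARTITION = [{1, 4}, {2}, {3}]  # R_1, R_2, R_3 (mutually exclusive, covering 1..4)

for assignment in product((0, 1), repeat=len(VARIABLES)):
    env = dict(zip(VARIABLES, assignment))
    assert isclose(f(env), prod(r(R, env) for R in PARTITION))
print([sorted(scope_of(R)) for R in PARTITION])
```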
Fig. 1. Pictorial demonstration of an iteration in the sub-graph sum-product algorithm. (Figure labels: A Sub-Graph, Other Sub-Graphs.)
Fig. 2. A simple flow graph of sub-graphs. In this figure the $S_i$'s ($S_1$ to $S_5$) represent sub-graphs and D represents a delay element.
We propose an iterative method for the final step of the divide-and-conquer strategy, which is depicted in Figure 1. Initially one sub-graph operates and its result is supplied to the other sub-graphs. The other sub-graphs use this result as a priori information and generate results of their own. In the next run the first sub-graph uses the information generated by the other sub-graphs as a priori information and is expected to generate a more accurate result, which is in turn used by the other sub-graphs as a priori information. This more accurate a priori information improves the accuracy of the output of the other sub-graphs. The process goes on iteratively until a stop condition is satisfied. We call this iterative algorithm the "sub-graph sum-product algorithm".
In the light of these explanations a sub-graph can be defined formally as follows.
Definition 3.1: Sub-graph: A sub-graph, represented by S, is a region of a factor graph that does not contain any loop and in which, if a factor node p is included in S, all of the variable nodes connected to p are also included in S.
A given factor graph can be partitioned into sub-graphs in many different ways. Therefore, we should also define what kind of partitioning is proper.
Definition 3.2: Proper partitioning of a factor graph: A partitioning of a factor graph is proper if and only if every factor node in the factor graph is contained in one and only one sub-graph.
A question that may arise is whether a proper partitioning exists for every factor graph. The answer is positive: there always exists a trivial partitioning, in which every sub-graph is composed of just a single factor node and all of the variable nodes connected to it. In other words, every subset is $R_i = \{i\}$. Depending on the factor graph, other partitioning schemes may exist. Different partitioning methods result in different convergence and accuracy characteristics.
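Definitions 3.1 and 3.2 can be turned into a simple check. The sketch below tests loop-freeness with a union-find cycle test (a standard technique chosen here, not one named in the paper) and verifies that every factor node lies in exactly one sub-graph. The function names and the three-factor loopy graph are hypothetical.

```python
def is_loop_free(factor_scopes, factors):
    """Union-find cycle test on the edges of the sub-graph holding `factors`."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for fac in factors:
        for var in factor_scopes[fac]:
            a, b = find(("f", fac)), find(("v", var))
            if a == b:
                return False  # this edge would close a loop
            parent[a] = b
    return True

def is_proper(factor_scopes, partition):
    """Every factor node in exactly one sub-graph, and each sub-graph loop-free."""
    seen = [fac for part in partition for fac in part]
    covers_once = sorted(seen) == sorted(factor_scopes)
    return covers_once and all(is_loop_free(factor_scopes, part) for part in partition)

def trivial_partition(factor_scopes):
    # One sub-graph per factor node, i.e. R_i = {i}.
    return [{fac} for fac in factor_scopes]

# Hypothetical loopy factor graph: p1(v1, v2), p2(v2, v3), p3(v1, v3)
scopes = {"p1": ("v1", "v2"), "p2": ("v2", "v3"), "p3": ("v1", "v3")}
print(is_loop_free(scopes, scopes))                      # whole graph
print(is_proper(scopes, trivial_partition(scopes)))      # trivial partitioning
print(is_proper(scopes, [{"p1", "p2"}, {"p3"}]))         # a coarser partitioning
```

Since each sub-graph carries all variable nodes connected to its factor nodes, listing the factor nodes of a sub-graph is enough to determine it.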
3.1 Block Diagram Representation of Sub-Graphs
Block diagrams usually provide simple representations for complex problems in different areas of science. Therefore, we employ block diagrams for explaining the sub-graph sum-product algorithm. The blocks are connected in cascade, in parallel, or in a mixture of the two to form a directed flow graph. Since our algorithm is iterative, the iterations must also be represented. An iteration is represented by feeding the output of the flow graph back to its input through a delay element. Figure 2 shows a simple flow graph of sub-graphs.
In order to simplify the notation we require every variable node to be included in every sub-graph. Note that this requirement does not conflict with Definitions 3.1 and 3.2.
The signals flowing through the edges of the flow graph are vectors of density functions. Since every variable node in the factor graph is included in every sub-graph, the vectors contain the density functions of all variable nodes in the whole factor graph. We represent these vectors with boldface capital letters with a tilde on top. For instance, $(\tilde{\mathbf{I}})_j(x_j)$ is a density function of variable $x_j$, and it is the $j$th component of the vector $\tilde{\mathbf{I}}$.
We define two operators on these vectors in order to simplify the representation while explaining the details of the sub-graph sum-product algorithm. The first of these operators is the separation operator. It is represented by $\oslash$ and defined as follows:
$$(\tilde{\mathbf{A}} \oslash \tilde{\mathbf{B}})_j(x_j) = (\tilde{\mathbf{A}})_j(x_j) / (\tilde{\mathbf{B}})_j(x_j). \tag{4}$$
The second of these operators is the combination operator, represented by $\otimes$. The definition of the combination operator is as follows:
$$(\tilde{\mathbf{A}} \otimes \tilde{\mathbf{B}})_j(x_j) = (\tilde{\mathbf{A}})_j(x_j)\,(\tilde{\mathbf{B}})_j(x_j). \tag{5}$$
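For discrete variables the two operators reduce to component-wise division and multiplication. The sketch below assumes each vector is stored as a list with one probability list per variable and that all entries of the divisor are strictly positive; the names `separate` and `combine` are chosen here for illustration.

```python
from fractions import Fraction as F

def separate(A, B):
    """Separation operator, Equation (4): component-wise division."""
    return [[a / b for a, b in zip(Aj, Bj)] for Aj, Bj in zip(A, B)]

def combine(A, B):
    """Combination operator, Equation (5): component-wise multiplication."""
    return [[a * b for a, b in zip(Aj, Bj)] for Aj, Bj in zip(A, B)]

# Hypothetical vectors over two binary variables (exact arithmetic via Fraction).
A = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]
B = [[F(1, 3), F(2, 3)], [F(1, 2), F(1, 2)]]

# Separation undoes combination: (A combined with B) separated by B gives A back.
print(separate(combine(A, B), B) == A)
```

Exact fractions are used only to make the round trip an exact equality; a practical decoder would more likely work with floating-point values or log-likelihood ratios.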
3.2 Structure and Operation of a Sub-Graph Block
When a sub-graph block is executed, its basic job is to distribute the received information to the variable nodes as a priori information, then run the classical sum-product algorithm, and finally combine the results obtained at the variable nodes to form the output vector. As shown in [14], the input received by a sub-graph from the other sub-graphs also contains information generated by the sub-graph itself during the previous iteration. This part of the information should be separated from the input. Therefore, a sub-graph should have a state, in which it stores the information it has generated for use in the next iteration. If we represent the input vector of a sub-graph with $\tilde{\mathbf{I}}$ and the state of that sub-graph with $\tilde{\mathbf{X}}$, then the information generated by the current sub-graph during the previous iteration can be removed with the separation operator $\oslash$ as in Equation (6):
$$\tilde{\mathbf{R}} = \tilde{\mathbf{I}} \oslash D(\tilde{\mathbf{X}}) \tag{6}$$
where $D(\cdot)$ is the delay operator. In turbo decoding terminology the vector $\tilde{\mathbf{R}}$ is the extrinsic information generated by all other sub-graphs.
Fig. 3. Structure of a sub-graph block
The components of $\tilde{\mathbf{R}}$ should be used as a priori information by the current sub-graph. As mentioned in Section 2.1, a priori information about a variable node can be represented by a factor node that is connected to just that variable node. Therefore, we add artificial factor nodes, one for each variable node, to handle the a priori information received by the sub-graph. Since each of these artificial factor nodes is connected to just one variable node, the sub-graph is still loop-free.
The next task of a sub-graph block is running the classical sum-product algorithm on its loop-free factor graph. After the sum-product algorithm finishes, the outputs generated at the variable nodes are combined to form the output vector of the sub-graph block, which is represented with $\tilde{\mathbf{O}}$.
The vector $\tilde{\mathbf{O}}$ is a combination of the information generated by the current sub-graph and the information received by the current sub-graph. The information generated by the current sub-graph should be stored in the state of the sub-graph block for use in the next iteration. This state can be obtained by the following equation:
$$\tilde{\mathbf{X}} = \tilde{\mathbf{O}} \oslash \tilde{\mathbf{R}} \tag{7}$$
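The block operation of Equations (6) and (7) can be sketched for the simplest case, a sub-graph that holds a single factor node over binary variables (the trivial partitioning). The class name `SubGraphBlock`, the all-ones initial state, and the parity-check example are assumptions of this sketch; a block for a larger loop-free sub-graph would run the full classical sum-product algorithm in place of the single marginalization loop.

```python
from itertools import product

class SubGraphBlock:
    """Block for a sub-graph that holds one factor node (trivial partitioning)."""

    def __init__(self, scope, func, num_vars):
        self.scope = scope    # indices of the variables the factor is connected to
        self.func = func      # local function of the factor node
        # State X: information generated in the previous run.
        # Starting from all ones (no information) is an assumption of this sketch.
        self.state = [[1.0, 1.0] for _ in range(num_vars)]

    def run(self, I):
        # Equation (6): remove what this block generated in the previous iteration.
        R = [[i / x for i, x in zip(Ij, Xj)] for Ij, Xj in zip(I, self.state)]
        # Sum-product on the single-factor sub-graph with R as a priori information.
        O = [list(Rj) for Rj in R]  # variables outside the scope pass through
        for pos, j in enumerate(self.scope):
            out = [0.0, 0.0]
            for assignment in product((0, 1), repeat=len(self.scope)):
                term = self.func(*assignment)
                for k, value in zip(self.scope, assignment):
                    term *= R[k][value]
                out[assignment[pos]] += term
            O[j] = out
        # Equation (7): store only the information generated by this block.
        self.state = [[o / r for o, r in zip(Oj, Rj)] for Oj, Rj in zip(O, R)]
        return O

# Hypothetical example: an even-parity factor on variables 0 and 1 out of three.
block = SubGraphBlock(scope=(0, 1), func=lambda a, b: 1.0 if a == b else 0.0, num_vars=3)
I = [[0.8, 0.2], [0.6, 0.4], [0.5, 0.5]]
O = block.run(I)
print(O, block.state)
O_again = block.run(O)  # feeding the output back changes nothing for a lone block
```

In this example the stored state for variable 0 equals the a priori vector of variable 1 and vice versa, which is what a degree-two parity check is expected to pass along. Feeding the block its own output reproduces the same output, because Equation (6) strips the block's own contribution before it is counted a second time.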
The operation of a sub-graph block is demonstrated in Figure 3.
3.3 Connection Types
The sub-graph blocks, whose operation is explained in the previous section, are connected to each other in two basic ways: cascade and parallel. Mixed forms can be obtained by using these two connection types together.
In a cascade connection the output of a sub-graph block is applied as the input of the next sub-graph block, as shown in Figure 4. In this figure $\tilde{\mathbf{I}}_2 = \tilde{\mathbf{O}}_1$.
Fig. 4. Cascaded connection of two sub-graphs
The parallel connection of sub-graph blocks is slightly more complicated. Figure 5 depicts the parallel connection of two sub-graphs. In this case the input vectors of both sub-graphs are equal to the input of the parallel branch. In other words,
$$\tilde{\mathbf{I}}_1 = \tilde{\mathbf{I}}_2 = \tilde{\mathbf{I}}. \tag{8}$$
Then the output of the parallel branch is given by the equation
$$\tilde{\mathbf{O}} = \tilde{\mathbf{O}}_1 \otimes \tilde{\mathbf{O}}_2 \oslash \tilde{\mathbf{I}}. \tag{9}$$
It is shown in [14] that the parallel connection of sub-graphs is associative. This associativity property allows us to connect any number of sub-graphs in parallel.
Fig. 5. Parallel connection of two sub-graphs
3.4 Partitioning Types
A loopy factor graph may be properly partitioned into sub-graphs in many different ways. One of the criteria in determining the partitioning of a factor graph into sub-graphs is the accessibility property of the sub-graphs. By accessibility we mean how many different factor nodes can be reached from a factor node within a sub-graph.
The two extreme types of sub-graphs in the sense of accessibility are all-accessible and non-accessible sub-graphs. In an all-accessible sub-graph there exists a path between any two factor nodes. On the contrary, in a non-accessible sub-graph there are no paths between any two factor nodes. Figure 6 shows the all-accessible and the non-accessible partitionings together with the trivial partitioning on a small factor graph.
3.5 Complexity Issues
The complexity of the classical sum-product algorithm is determined by the number of messages calculated, since message calculation is the most processing-intensive part of the algorithm. Equivalently, the complexity is proportional to the number of edges, since two messages pass through each edge during the algorithm.
Similarly, the complexity of the iterative sum-product algorithm is also proportional to the number of edges. However, since two messages are calculated per edge during every iteration, the complexity of a single iteration of the iterative sum-product algorithm is equivalent to the total complexity of the classical sum-product algorithm.
The complexity of the sub-graph sum-product algorithm is proportional to the total number of edges in the sub-graphs. Definition 3.1 ensures that all of the variable nodes connected to a factor node are included in the same sub-graph as that factor node. Hence, all of the edges connected to a factor node are guaranteed to be included in the same sub-graph as the factor node. Definition 3.2 states that a factor node is included in one and only one sub-graph. Combining these two facts, an edge of the original factor graph is included in exactly one sub-graph. Therefore, the number of edges in the original factor graph is equal to the total number of edges in all of the sub-graphs. Hence, the complexity of one iteration of the sub-graph sum-product algorithm is equivalent to that of one iteration of the iterative sum-product algorithm. This means that if the sub-graph sum-product algorithm requires fewer iterations than the iterative sum-product algorithm, then it has lower complexity.
Fig. 6. Different choices of partitioning a factor graph into sub-graphs: (a) the original factor graph, (b) the trivial partitioning, (c) all-accessible partitioning, (d) non-accessible partitioning.
Fig. 7. Modified turbo decoding schematic
3.6 Turbo and LDPC Decoding Algorithms as Instances of the Sub-Graph Sum-Product Algorithm
The two capacity-approaching code families, turbo and LDPC codes, use the sum-product algorithm in their decoders [1], [7]. It can be shown that both decoders apply sub-graph scheduling.
The classical structure of a turbo decoder, first presented in [6], can be represented equivalently as in Figure 7. The component decoders shown in Figure 7 resemble the sub-graph block structure depicted in Figure 3. The component decoders employ the BCJR algorithm [6], [3], which is a sum-product algorithm running on a loop-free factor graph. Therefore, we may deduce that the component decoders of the turbo decoding algorithm are sub-graphs. Together with its peripherals, each component decoder forms a sub-graph block. We may thus conclude that the turbo decoding algorithm employs sub-graph scheduling with two sub-graphs connected in cascade.
The factor graph or Tanner graph representation of LDPC codes is very popular, and the most widely used LDPC decoding algorithm is the iterative sum-product algorithm that runs on the factor graph representing the LDPC code. This algorithm can also be derived as an instance of the sub-graph sum-product algorithm: the standard LDPC decoding algorithm is a parallel connection of non-accessible sub-graphs.
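The two connection rules behind this comparison, the cascade connection attributed here to turbo decoding and the parallel connection of Equations (8) and (9) attributed to standard LDPC decoding, can be illustrated with a minimal sketch. The blocks below are hypothetical stand-ins that always generate the same information regardless of their input; real sub-graph blocks have state and input-dependent outputs, so the sketch shows only how the vectors are wired together.

```python
from fractions import Fraction as F

def combine(A, B):
    return [[a * b for a, b in zip(Aj, Bj)] for Aj, Bj in zip(A, B)]

def separate(A, B):
    return [[a / b for a, b in zip(Aj, Bj)] for Aj, Bj in zip(A, B)]

def make_block(X):
    """Hypothetical stateless block: output = input combined with a fixed X."""
    return lambda I: combine(I, X)

def cascade(blocks, I):
    for block in blocks:
        I = block(I)  # the output of one block is the input of the next
    return I

def parallel(blocks, I):
    outputs = [block(I) for block in blocks]  # Equation (8): same input for all
    result = outputs[0]
    for O in outputs[1:]:
        # Equation (9): combine the outputs, then remove the doubly counted input.
        result = separate(combine(result, O), I)
    return result

I = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]
b1 = make_block([[F(2, 3), F(1, 3)], [F(1), F(1)]])
b2 = make_block([[F(1), F(1)], [F(3, 5), F(2, 5)]])
b3 = make_block([[F(1, 5), F(4, 5)], [F(1), F(1)]])

left = parallel([lambda v: parallel([b1, b2], v), b3], I)
right = parallel([b1, lambda v: parallel([b2, b3], v)], I)
print(left == right == parallel([b1, b2, b3], I))
```

With these input-independent toy blocks the cascade and parallel connections give the same output. With actual sub-graph blocks they would generally differ, because in a cascade the second block already sees the information generated by the first within the same iteration.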
4 Numerical Results
Although our algorithm is applicable to any problem that can be represented with a loopy factor graph, we have simulated the sub-graph sum-product algorithm for LDPC decoding. We have used four differently structured forms of the sub-graph sum-product algorithm in the simulations. The first is the cascade connection of non-accessible (CAS-NON) sub-graphs, which is equivalent to the method proposed in [12]. The second is the parallel connection of non-accessible (PAR-NON) sub-graphs. Note that this form is equivalent to the standard LDPC decoding algorithm. The third and fourth forms are the cascade and parallel connections of all-accessible sub-graphs (CAS-ALL and PAR-ALL), respectively.
The code we have used is a randomly generated regular LDPC code of block length 2000 and rate R = 1/2, with a column weight of 3. We have modulated the codewords with binary phase shift keying and passed the modulated signals through an additive white Gaussian noise channel.
Figure 8 presents the frame error rate (FER) vs. signal-to-noise ratio (SNR) characteristics of the four decoders. This result shows that all four decoders have almost equivalent accuracy, as expected.
Fig. 8. FER vs. SNR characteristics of all of the four decoders. (Axes: frame error rate (FER) versus SNR: $10\log(E_b/N_0)$; curves: CAS-NON, PAR-NON, CAS-ALL, PAR-ALL.)
In Figure 9 we have plotted the average number of iterations required to converge to the correct solution. The codewords that did not converge are not taken into account in this result. The first observation is that cascade-connected decoders require 50% fewer iterations to converge than their parallel-connected counterparts. This observation also explains why the turbo decoding algorithm converges faster than the LDPC decoding algorithm, recalling that the turbo decoding algorithm is a cascade-connected algorithm whereas the LDPC decoding algorithm with standard scheduling is a parallel-connected algorithm. This explanation closes the gap between LDPC and turbo codes one step further.
Another observation is that the decoders formed by all-accessible sub-graphs converge in around 10% fewer iterations than the decoders formed by non-accessible sub-graphs. The decoder abbreviated as "CAS-ALL" converges 55% faster than the standard LDPC decoder while maintaining the accuracy. In other words, complexity is decreased without sacrificing FER vs. SNR performance.
Fig. 9. Average number of iterations done by decoders at different SNRs. (Axes: number of iterations versus SNR: $10\log(E_b/N_0)$; curves: CAS-NON, PAR-NON, CAS-ALL, PAR-ALL.)
5 Conclusion
In this paper we have presented a new scheduling algorithm for the sum-product algorithm on loopy factor graphs. Experimental results demonstrated that our algorithm may increase the convergence rate of the sum-product algorithm while keeping the complexity of one iteration constant. Our algorithm also gives an answer to the question of why turbo codes converge faster than LDPC codes. With better partitioning algorithms and different connection forms, the performance of the sub-graph sum-product algorithm can be increased even further.
References
[1] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, "Factor Graphs and the Sum-Product Algorithm", IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498-519, Feb. 2001.
[2] H.-A. Loeliger, "An Introduction to Factor Graphs", IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 28-41, Jan. 2004.
[3] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, "Optimal decoding of linear codes for minimizing symbol error rate", IEEE Transactions on Information Theory, vol. IT-20, pp. 284-287, Mar. 1974.
[4] J. Pearl, "Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference", Morgan Kaufmann Publishers, San Mateo, CA, 1998.
[5] R. G. Gallager, "Low-Density Parity-Check Codes", M.I.T. Press, 1963.
[6] C. Berrou, A. Glavieux, and P. Thitimajshima, "Near Shannon Limit error correcting coding and decoding: Turbo Codes", in Proc. 1993 IEEE Int. Conf. Communications, Geneva, Switzerland, May 1993, pp. 1064-1070.
[7] R. McEliece, D. MacKay, and J. Cheng, "Turbo decoding as an instance of Pearl's 'belief propagation' algorithm", IEEE J. Select. Areas Commun., vol. 16, pp. 140-152, Feb. 1998.
[8] D. J. C. MacKay, "Information Theory, Inference, and Learning Algorithms", Cambridge Univ. Press, Cambridge, UK, 2003.
[9] B. J. Frey and D. J. C. MacKay, "A Revolution: Belief Propagation in Graphs With Cycles", in Advances in Neural Information Processing Systems 10, MIT Press, Cambridge, MA, 1998.
[10] F. R. Kschischang and B. J. Frey, "Iterative decoding of compound codes by probability propagation in graphical models", IEEE J. Sel. Areas Comm., vol. 16, pp. 219-230, Feb. 1998.
[11] J. Zhang and M. P. C. Fossorier, "Shuffled Iterative Decoding", IEEE Trans. on Comm., vol. 53, no. 2, pp. 209-213, Feb. 2005.
[12] M. M. Mansour and N. R. Shanbhag, "Turbo Decoder Architectures for Low-Density Parity-Check Codes", in Proc. IEEE GLOBECOM, pp. 1383-1388, Nov. 2002.
[13] N. Wiberg, "Codes and Decoding on General Graphs", Ph.D. Thesis, Department of Electrical Eng., Linköping University, Linköping, Sweden, 1996.
[14] M. F. Bayramoğlu, "Sub-Graph Approach in Iterative Sum-Product Algorithm", M.Sc. Thesis, Department of Electrical and Electronics Eng., Middle East Technical University, Ankara, Turkey, Sept. 2005.