IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-23, NO. 1, JANUARY 1977
General Broadcast Channels with Degraded Message Sets
JÁNOS KÖRNER AND KATALIN MARTON

Abstract-A broadcast channel with one sender and two receivers is considered. Three independent messages are to be transmitted over this channel: one common message which is meant for both receivers, and one private message for each of them. The coding theorem and strong converse for this communication situation are proved for the case when one of the private messages has rate zero.

I. INTRODUCTION

WE CONSIDER a two-receiver broadcast channel defined by T. M. Cover [1] as a pair of discrete memoryless channels $(V, W)$ with common input alphabet $\mathcal{Y}$ and respective output alphabets $\mathcal{X}$ and $\mathcal{Z}$. (We use the same symbol for discrete memoryless channels and for their transition probability matrices, and we suppose that all alphabets are finite.) The $n$th memoryless extension of this broadcast channel is defined by the pair $(V^n, W^n)$, where, e.g.,

$$V^n(x^n \mid y^n) = \prod_{i=1}^{n} V(x_i \mid y_i)$$
for $y^n = y_1 y_2 \cdots y_n \in \mathcal{Y}^n$, $x^n = x_1 x_2 \cdots x_n \in \mathcal{X}^n$.

An $(n,\varepsilon)$-code for this channel is given by codewords $y_{jkl}^n \in \mathcal{Y}^n$ ($1 \le j \le M_1$, $1 \le k \le M_2$, $1 \le l \le M_0$) and corresponding decoding sets $\mathcal{A}_{jl} \subset \mathcal{X}^n$, $\mathcal{C}_{kl} \subset \mathcal{Z}^n$ such that both $\{\mathcal{A}_{jl}\}$ and $\{\mathcal{C}_{kl}\}$ are disjoint families, and

$$V^n(\mathcal{A}_{jl} \mid y_{jkl}^n) \ge 1 - \varepsilon, \qquad W^n(\mathcal{C}_{kl} \mid y_{jkl}^n) \ge 1 - \varepsilon$$

for all $j, k, l$.

A triple of nonnegative numbers $(R_1, R_2, R_0)$ is called an $\varepsilon$-achievable rate triple for this channel if, for any $\delta > 0$ and large enough $n$, there exists an $(n,\varepsilon)$-code $\{(y_{jkl}^n, \mathcal{A}_{jl}, \mathcal{C}_{kl});\ 1 \le j \le M_1,\ 1 \le k \le M_2,\ 1 \le l \le M_0\}$ satisfying

$$n^{-1} \log M_i \ge R_i - \delta, \qquad i = 1, 2, 0.$$

The determination of all the $\varepsilon$-achievable rate triples for a general two-receiver broadcast channel is still an open problem. A number of partial results are available, including the complete solution of the so-called "degraded" case. (See [2]-[8]. References [5] and [8] contain essentially the same results; however, only [5] contains complete proofs. Therefore, we shall refer to [5] in the sequel.)

In this paper we describe all the pairs $(R_1, R_0)$ such that $(R_1, 0, R_0)$ is an $\varepsilon$-achievable rate triple. We denote by $\mathcal{R}(\varepsilon)$ the set of all such pairs, and let $\mathcal{R} \triangleq \bigcap_{\varepsilon > 0} \mathcal{R}(\varepsilon)$.

Notice that, if channel $W$ is a degraded version of $V$, then a rate triple $(R_1, R_2, R_0)$ is $\varepsilon$-achievable (for the general broadcast problem) if and only if $(R_1, R_0 + R_2) \in \mathcal{R}(\varepsilon)$. (See also [7].)

We now formulate the problem and state the main result.

Definition: (Code for the general broadcast channel with degraded messages.) An $(n,\varepsilon)$-code for the general discrete memoryless broadcast channel with degraded messages is given by codewords $y_{jl}^n \in \mathcal{Y}^n$ ($1 \le j \le M_1$, $1 \le l \le M_0$) and corresponding decoding sets $\mathcal{A}_{jl} \subset \mathcal{X}^n$, $\mathcal{C}_l \subset \mathcal{Z}^n$ such that both $\{\mathcal{A}_{jl}\}$ and $\{\mathcal{C}_l\}$ are disjoint families, and

$$V^n(\mathcal{A}_{jl} \mid y_{jl}^n) \ge 1 - \varepsilon, \qquad W^n(\mathcal{C}_l \mid y_{jl}^n) \ge 1 - \varepsilon, \tag{1}$$

for all $j, l$.

A pair of nonnegative numbers $(R_1, R_0)$ is called an $\varepsilon$-achievable rate pair for the broadcast channel with degraded messages if, for any $\delta > 0$ and large enough $n$, there exists an $(n,\varepsilon)$-code $\{(y_{jl}^n, \mathcal{A}_{jl}, \mathcal{C}_l);\ 1 \le j \le M_1,\ 1 \le l \le M_0\}$ satisfying

$$n^{-1} \log M_i \ge R_i - \delta, \qquad i = 1, 0.$$

Denote by $\mathcal{R}(\varepsilon)$ the set of all the $\varepsilon$-achievable rate pairs. (It is easily seen that this coincides with the previous definition of $\mathcal{R}(\varepsilon)$.)

A pair of nonnegative numbers $(R_1, R_0)$ is called an achievable rate pair if it is $\varepsilon$-achievable for every $\varepsilon > 0$. We write

$$\mathcal{R} = \bigcap_{\varepsilon > 0} \mathcal{R}(\varepsilon).$$
Manuscript received December 29, 1975; revised April 29, 1976.
The authors are with the Mathematical Institute of the Hungarian Academy of Sciences, 1053 Budapest, Reáltanoda u. 13-15, Hungary.
For a quadruple of random variables (r.v.'s) $(U, Y, X, Z)$, we write $(U, Y, X, Z) \in \mathcal{P}$ if $U$, $Y$ and the pair $(X, Z)$ form a Markov chain in this order, and if the conditional distributions of $X$ given $Y$ and of $Z$ given $Y$ are defined by $V$ and $W$, respectively. Here and hereafter, all r.v.'s have finite range. For an r.v. $U$, $\|U\|$ denotes the cardinality of the range of $U$.

Theorem: Let

$$\mathcal{D} \triangleq \{(R_1, R_0) :\ R_1 \le I(Y \wedge X \mid U),\ R_0 \le I(U \wedge Z),\ R_1 + R_0 \le I(Y \wedge X),\ (U, Y, X, Z) \in \mathcal{P},\ \|U\| \le \|Y\| + 4\};$$

then, for every $\varepsilon > 0$,

$$\mathcal{R}(\varepsilon) = \mathcal{R} = \mathcal{D}.$$
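As an illustration of the three bounds in the Theorem, the following minimal sketch evaluates $I(Y \wedge X \mid U)$, $I(U \wedge Z)$ and $I(Y \wedge X)$ for one fixed quadruple $(U, Y, X, Z) \in \mathcal{P}$. The function names (`mutual_information`, `region_bounds`, `bsc`) and the binary symmetric channels in the demo are assumptions made for this example only; they do not come from the paper. The sketch does not compute the full region $\mathcal{D}$, which requires a union over all admissible $U$.

```python
from math import log2


def mutual_information(joint):
    """I(A ^ B) in bits for a joint pmf given as {(a, b): probability}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)


def region_bounds(p_u, f_y_given_u, V, W):
    """Return (I(Y^X|U), I(U^Z), I(Y^X)) for a Markov chain U -> Y -> (X, Z).

    p_u[u] is dist U, f_y_given_u[u][y] is F_{Y|U}, and V[y][x], W[y][z]
    are the two component channels of the broadcast channel.
    """
    uy = {(u, y): p_u[u] * f_y_given_u[u][y]
          for u in p_u for y in f_y_given_u[u]}
    yx, ux, uz = {}, {}, {}
    for (u, y), p in uy.items():
        for x, v in V[y].items():
            yx[(y, x)] = yx.get((y, x), 0.0) + p * v
            ux[(u, x)] = ux.get((u, x), 0.0) + p * v
        for z, w in W[y].items():
            uz[(u, z)] = uz.get((u, z), 0.0) + p * w
    i_yx = mutual_information(yx)
    # Under the Markov condition, I(Y^X) = I(U^X) + I(Y^X|U).
    i_yx_given_u = i_yx - mutual_information(ux)
    return i_yx_given_u, mutual_information(uz), i_yx


def bsc(p):
    """Binary symmetric channel with crossover probability p (illustrative)."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}


if __name__ == "__main__":
    # Hypothetical choice: U uniform, F_{Y|U} = BSC(0.25), V = BSC(0.1), W = BSC(0.2).
    r1_max, r0_max, sum_max = region_bounds({0: 0.5, 1: 0.5},
                                            bsc(0.25), bsc(0.1), bsc(0.2))
    print(r1_max, r0_max, sum_max)
```

Every pair $(R_1, R_0)$ with $R_1 \le$ `r1_max`, $R_0 \le$ `r0_max` and $R_1 + R_0 \le$ `sum_max` then lies in $\mathcal{D}$ for the chosen channels.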
Remarks: 1) The interesting part of this theorem is the converse (i.e., that no rate pairs outside $\mathcal{D}$ are achievable for any $\varepsilon > 0$). This will be deduced from [9, Theorem 2]. The direct part is trivial for those familiar with Bergmans' proof of the coding theorem for degraded broadcast channels [4].
2) We shall show in the Appendix that the rates found by Cover [5] and van der Meulen [8] for this problem exhaust $\mathcal{D}$.
3) It is clear that $\mathcal{D}$ is always contained in, and may coincide with, the triangle with vertices $(0,0)$, $(C_1,0)$, $(0,C_1)$, where $C_1$ is the capacity of the channel $V$. The point $(C_1,0)$ always belongs to $\mathcal{D}$. In general, the upper boundary of $\mathcal{D}$ contains a line segment of slope $-1$ going through the point $(C_1,0)$, but this line segment may reduce to the point $(C_1,0)$. Fig. 1 shows the three possible shapes of $\mathcal{D}$.

Fig. 1. Typical shapes of the rate region $\mathcal{D}$.

II. PRELIMINARIES

We use the notation introduced in [9] with the few modifications listed below. We recall that

$U - Y - (X,Z)$ means that $U$, $Y$ and the pair $(X,Z)$ form a Markov chain in this order;

$F_{Y|U}$ denotes the matrix defined by the conditional distributions of the r.v. $Y$ given $U$;

$\operatorname{dist} U$ is the distribution of the r.v. $U$;

$\mathcal{P}(Q)$ denotes the family of quadruples of r.v.'s $(U,Y,X,Z) \in \mathcal{P}$ such that $\operatorname{dist} Y = Q$;

$\mathcal{T}_n(Q)$ denotes the set of $n$-length $Q$-typical sequences in $\mathcal{Y}^n$, i.e., the set of all sequences $y^n \in \mathcal{Y}^n$ such that, for any $b \in \mathcal{Y}$, $|N(b \mid y^n) - n \cdot Q(b)| < r_n$, where $N(b \mid y^n)$ is the number of occurrences of $b$ in $y^n$ and $\{r_n\}_{n=1}^{\infty}$ is a fixed sequence of positive numbers with $r_n/\sqrt{n} \to \infty$, $r_n/n \to 0$;

$\mathcal{T}_n(y^n, V)$ denotes the $n$-sequences in $\mathcal{X}^n$ generated by the fixed sequence $y^n$, i.e., the set of all sequences $x^n \in \mathcal{X}^n$ such that, for any $a \in \mathcal{X}$ and $b \in \mathcal{Y}$, $|N(b,a \mid y^n,x^n) - N(b \mid y^n) \cdot V(a \mid b)| < s_n$, where $\{s_n\}_{n=1}^{\infty}$ is a fixed sequence of numbers satisfying $s_n/\sqrt{n} \to \infty$, $s_n/n \to 0$.

We assume that all r.v.'s throughout the paper have finite range.

For $\mathcal{B} \subset \mathcal{Y}^n$ ($n = 1, 2, \ldots$), $0 < \eta < 1$ and a distribution $Q$ on $\mathcal{Y}$, we write $G_{V,Q}(\mathcal{B},\eta)$ and $G_{W,Q}(\mathcal{B},\eta)$ for what was denoted by $G_V(\mathcal{B},\eta)$ and $G_W(\mathcal{B},\eta)$, respectively, in [9]. We recall the definitions:

$$G_{V,Q}(\mathcal{B},\eta) = \min\{P^n(\mathcal{A}) :\ \mathcal{A} \subset \mathcal{X}^n,\ V^n(\mathcal{A} \mid y^n) \ge \eta \text{ for } y^n \in \mathcal{B}\},$$

where $P$ is the output distribution on $\mathcal{X}$ corresponding to the input distribution $Q$ via channel $V$. $G_{W,Q}(\mathcal{B},\eta)$ is defined analogously via channel $W$. Similarly, we indicate the dependence on $Q$ in some other definitions from [9]. Thus we put

$$S_{n,Q}(t,\eta) = n^{-1} \log \min\{G_{W,Q}(\mathcal{B},\eta) :\ \mathcal{B} \subset \mathcal{T}_n(Q),\ G_{V,Q}(\mathcal{B},\eta) \ge \exp(nt)\}.$$

Further, $T_Q(t)$ ($t \le 0$) is defined as follows: write

$$T_{1,Q}(t) = \inf\{-I(U \wedge Z) :\ -I(U \wedge X) \ge t,\ (U,Y,X,Z) \in \mathcal{P}(Q)\};$$

define $t_{0,Q}$ as the minimal value of $t$ such that the slope of $T_{1,Q}(\cdot)$ at $t$ is greater than or equal to 1 ($t_{0,Q} = 0$ if no such $t$ exists); and set

$$T_Q(t) = \begin{cases} T_{1,Q}(t), & \text{if } t \le t_{0,Q}, \\ t + T_{1,Q}(t_{0,Q}) - t_{0,Q}, & \text{if } t_{0,Q} < t \le 0 \end{cases}$$

(i.e., we define $T_Q(t)$ by replacing $T_{1,Q}(t)$ with its tangent of slope 1 for $t > t_{0,Q}$).

For $n = 1, 2, \ldots$, a distribution $Q$ on $\mathcal{Y}$ is called an empirical distribution of order $n$ if $Q(b) \cdot n$ is an integer for all $b \in \mathcal{Y}$.

We shall need the following slightly modified version of [9, Theorem 2].

Proposition: For any $0 < \eta < 1$,

$$\max_{Q_n} \sup_{t \le 0} \left| S_{n,Q_n}(t,\eta) - T_{Q_n}(t) \right| \to 0 \qquad (n \to \infty),$$

where $Q_n$ ranges over the empirical distributions of order $n$.

Proof: Checking the proof of [9, Theorem 2], we see that for any $t \le 0$ and $0 < \eta < 1$,

$$\max_{Q_n} \left| S_{n,Q_n}(t,\eta) - T_{Q_n}(t) \right| \to 0,$$

where the maximum is taken over the empirical distributions of order $n$. (The uniformity of the above convergence relies on the uniformity of elementary estimates of the type

$$\max_{Q_n} \max_{y^n \in \mathcal{T}_n(Q_n)} \left| n^{-1} \log Q_n^n(y^n) + H(Q_n) \right| \to 0.)$$

Now the proposition follows immediately from the monotonicity of the functions $S_{n,Q}(\cdot,\eta)$, $T_Q(\cdot)$, and the equicontinuity of the functions $T_Q(\cdot)$.
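The combinatorial notions of this section (empirical distributions of order $n$ and membership in the typical set $\mathcal{T}_n(Q)$) can be sketched as follows. This is a minimal illustration, not part of the paper: the function names are hypothetical, and the threshold `r_n` is passed in by the caller, who is assumed to choose it so that $r_n/\sqrt{n} \to \infty$ and $r_n/n \to 0$ (for instance $r_n = n^{2/3}$).

```python
from collections import Counter
from fractions import Fraction
from math import comb


def empirical_distribution(seq, alphabet):
    """Return Q with Q(b) = N(b|seq)/n, so n*Q(b) is an integer for every b."""
    n = len(seq)
    counts = Counter(seq)
    return {b: Fraction(counts[b], n) for b in alphabet}


def is_typical(seq, Q, r_n):
    """Check |N(b|seq) - n*Q(b)| < r_n for every letter b in the support of Q."""
    n = len(seq)
    counts = Counter(seq)
    return all(abs(counts[b] - n * Q[b]) < r_n for b in Q)


def number_of_types(n, alphabet_size):
    """Exact number of empirical distributions of order n (stars and bars)."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)
```

The exact count returned by `number_of_types` never exceeds the polynomial bound $(n+1)^{\|Y\|}$, which is the only property of the number of empirical distributions that the converse proof uses.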
III. PROOF OF THE THEOREM

Note first that

$$\mathcal{D} = \{(R_1,R_0) :\ R_1 \le I(Y \wedge X \mid U),\ R_0 \le I(U \wedge Z),\ R_1 + R_0 \le I(Y \wedge X),\ (U,Y,X,Z) \in \mathcal{P}\} \tag{2}$$

as follows from [3, Lemma 3] in the usual way. Now we prove the direct part.

Noticing that

$$\mathcal{P} = \bigcup_{Q} \mathcal{P}(Q),$$

we see that it is sufficient to prove, for any fixed $Q$, that all the rate pairs in

$$\mathcal{D}(Q) \triangleq \{(R_1,R_0) :\ R_1 \le I(Y \wedge X \mid U),\ R_0 \le I(U \wedge Z),\ R_1 + R_0 \le I(Y \wedge X),\ (U,Y,X,Z) \in \mathcal{P}(Q)\}$$

are achievable. Consider the plane regions

$$\mathcal{D}_1(Q) \triangleq \{(R_1,R_0) :\ R_1 \le I(Y \wedge X \mid U),\ R_0 \le I(U \wedge Z),\ (U,Y,X,Z) \in \mathcal{P}(Q)\}$$

and

$$\mathcal{L}(Q) \triangleq \{(R_1,R_0) :\ R_1 + R_0 \le I(Y \wedge X),\ \operatorname{dist} Y = Q,\ F_{X|Y} = V\}.$$

If $\mathcal{L}(Q) \subseteq \mathcal{D}_1(Q)$, i.e., $\mathcal{D}(Q) = \mathcal{D}_1(Q) \cap \mathcal{L}(Q) = \mathcal{L}(Q)$, then it is easily seen that $I(Y \wedge X) \le I(Y \wedge Z)$, and it is obvious that $\mathcal{D}(Q) \subseteq \mathcal{R}$ in this case. Suppose therefore that $\mathcal{L}(Q) \not\subseteq \mathcal{D}_1(Q)$. Define

$$\mathcal{D}_1^*(Q) \triangleq \{(R_1,R_0) :\ R_1 = I(Y \wedge X \mid U),\ R_0 = I(U \wedge Z),\ (U,Y,X,Z) \in \mathcal{P}(Q)\}.$$

We claim that

$$u(\mathcal{D}_1(Q)) = u(\mathcal{D}_1^*(Q)), \tag{3}$$

where for any plane region $\mathcal{D}$, $u(\mathcal{D})$ denotes the upper boundary of $\mathcal{D}$. In fact, it can be shown that $\mathcal{D}_1^*(Q)$ is convex (cf. the proof of [9, Lemma 3]). Moreover, $\mathcal{D}_1^*(Q)$ contains the points $(I(Y \wedge X), 0)$, $(0, I(Y \wedge Z))$, and is contained in the rectangle defined by the points $(0,0)$, $(I(Y \wedge X), 0)$, $(0, I(Y \wedge Z))$, and $(I(Y \wedge X), I(Y \wedge Z))$. Hence $u(\mathcal{D}_1^*(Q))$ is a monotonically decreasing convex ($\cap$) curve. This proves (3) and hence also the convexity of $\mathcal{D}_1(Q)$.

Now if $\mathcal{L}(Q) \not\subseteq \mathcal{D}_1(Q)$, then (3) and the convexity of $\mathcal{D}_1(Q)$ imply $u(\mathcal{D}(Q)) = u(\mathcal{D}_1(Q) \cap \mathcal{L}(Q)) = u(\mathcal{D}_1^*(Q) \cap \mathcal{L}(Q))$. This means that for any point $(R_1,R_0) \in u(\mathcal{D}(Q))$, there exists a quadruple $(U,Y,X,Z) \in \mathcal{P}(Q)$ such that

$$R_1 = I(Y \wedge X \mid U), \qquad R_0 = I(U \wedge Z), \tag{4}$$

$$I(Y \wedge X \mid U) + I(U \wedge Z) \le I(Y \wedge X). \tag{5}$$

From (5) it follows that

$$I(U \wedge Z) \le I(U \wedge X). \tag{6}$$

Now it should be clear for those familiar with Bergmans' proof of the coding theorem for the degraded broadcast channel that rate pairs $(R_1,R_0)$ satisfying (4) and (6) are $\varepsilon$-achievable for any $\varepsilon > 0$. For the sake of completeness, we give an outline of the proof.

Fix a $\delta > 0$ and an $\varepsilon > 0$. It is easily shown that because of (6), for every $\varepsilon > 0$ and large enough $n$, an $\varepsilon$-code $\{u_1^n, u_2^n, \ldots, u_{M_0}^n\} \subset \mathcal{T}_n(\operatorname{dist} U)$ with $n^{-1} \log M_0 > I(U \wedge Z) - \delta$ can be constructed for simultaneous use on the channels $F_{X|U}$ and $F_{Z|U}$. Denote by $\mathcal{A}_l$ and $\mathcal{C}_l$ the decoding sets corresponding to $u_l^n$ in $\mathcal{X}^n$ and $\mathcal{Z}^n$, respectively. Applying a reverse Markov inequality [10], we see that for each $l$ there is a subset $\mathcal{B}_l \subset \mathcal{Y}^n$ such that

$$V^n(\mathcal{A}_l \mid y^n) \ge 1 - \sqrt{\varepsilon}, \qquad W^n(\mathcal{C}_l \mid y^n) \ge 1 - \sqrt{\varepsilon} \qquad \text{for } y^n \in \mathcal{B}_l \tag{7}$$

and $F_{Y|U}^n(\mathcal{B}_l \mid u_l^n) \ge 1 - 3\sqrt{\varepsilon}$. By the Corollary to [9, Theorem 1], each $\mathcal{B}_l$ contains an $\varepsilon$-code $\{(y_{jl}^n, \tilde{\mathcal{A}}_{jl})\}_{j=1}^{M_1}$ for the channel $V$ such that $n^{-1} \log M_1 > I(Y \wedge X \mid U) - \delta$. Set $\mathcal{A}_{jl} \triangleq \tilde{\mathcal{A}}_{jl} \cap \mathcal{A}_l$; then by (7) we have

$$V^n(\mathcal{A}_{jl} \mid y_{jl}^n) \ge 1 - 2\sqrt{\varepsilon}.$$

We see that our system is a $2\sqrt{\varepsilon}$-code of approximate rate pair (4) for the broadcast channel $(V, W)$. This completes the proof of the direct part.

We proceed to the proof of the converse part.

For a positive integer $n$, let $\{(y_{jl}^n, \mathcal{A}_{jl}, \mathcal{C}_l);\ 1 \le j \le M_1,\ 1 \le l \le M_0\}$ be an $\varepsilon$-code for our broadcast channel. Then, in particular,

$$W^n(\mathcal{C}_l \mid y_{jl}^n) \ge 1 - \varepsilon, \qquad 1 \le j \le M_1,\ 1 \le l \le M_0. \tag{8}$$

Define

$$\mathcal{B}_l \triangleq \{y_{jl}^n;\ 1 \le j \le M_1\}.$$

By our assumption, for any $l$, $\mathcal{B}_l$ is an $\varepsilon$-code for the channel $V$.

For $1 \le l \le M_0$ and any empirical distribution $Q$, let

$$\mathcal{B}_l(Q) \triangleq \{y_{jl}^n;\ y_{jl}^n \in \mathcal{B}_l,\ n^{-1} N(b \mid y_{jl}^n) = Q(b) \text{ for all } b \in \mathcal{Y}\}.$$

We have

$$\mathcal{B}_l = \bigcup_{Q} \mathcal{B}_l(Q),$$

where the union is taken over all empirical distributions. Fix a $\delta > 0$. Since there are at most $(n+1)^{\|Y\|}$ empirical distributions of order $n$, it follows that for large enough $n$ each $\mathcal{B}_l$ contains a subset $\mathcal{B}_l(Q_l)$ such that

$$n^{-1} \log M_1 = n^{-1} \log \|\mathcal{B}_l\| < n^{-1} \log \|\mathcal{B}_l(Q_l)\| + \delta. \tag{9}$$

For an empirical distribution $Q$ denote by $M_0(Q)$ the number of those $l$ for which $Q_l = Q$. Clearly,

$$M_0 = \sum_{Q} M_0(Q),$$

where again $Q$ is running over all empirical distributions. Since there are at most $(n+1)^{\|Y\|}$ empirical distributions of order $n$, it follows that for large enough $n$ there exists an empirical distribution $Q$ such that

$$n^{-1} \log M_0 < n^{-1} \log M_0(Q) + \delta. \tag{10}$$

Fixing this $Q$, we consider those sets $\mathcal{B}_l(Q_l)$ for which $Q_l = Q$, and denote by $P$ and $R$ the output distributions corresponding to the input distribution $Q$ via channels $V$ and $W$, respectively. Since the sets $\mathcal{C}_l$ are disjoint, we have

$$M_0(Q) \cdot \min_{l:\, Q_l = Q} R^n(\mathcal{C}_l) \le 1. \tag{11}$$
Let $l_0$ achieve the minimum in (11), and let $\mathcal{B} \triangleq \mathcal{B}_{l_0}(Q_{l_0}) = \mathcal{B}_{l_0}(Q)$ and $\mathcal{C} \triangleq \mathcal{C}_{l_0}$. Then (8) implies that

$$R^n(\mathcal{C}) \ge G_{W,Q}(\mathcal{B}, 1 - \varepsilon).$$

Comparing this with (10) and (11), we see that

$$n^{-1} \log M_0 < -n^{-1} \log G_{W,Q}(\mathcal{B}, 1 - \varepsilon) + \delta. \tag{12}$$

On the other hand, since $\mathcal{B} \subseteq \mathcal{T}_n(Q)$, and since $\mathcal{B}$ is an $\varepsilon$-code for the channel $V$, we have by the converse part of the maximal code lemma [9]

$$n^{-1} \log \|\mathcal{B}\| \le I(Y \wedge X) + n^{-1} \log G_{V,Q}(\mathcal{B}, 1 - \varepsilon) + \delta$$

for large enough $n$, where $\operatorname{dist} Y = Q$ and $F_{X|Y} = V$. By the definition of the set $\mathcal{B}$ and by (9), this means that

$$n^{-1} \log M_1 < I(Y \wedge X) + n^{-1} \log G_{V,Q}(\mathcal{B}, 1 - \varepsilon) + 2\delta. \tag{13}$$

Now, letting $t = n^{-1} \log G_{V,Q}(\mathcal{B}, 1 - \varepsilon)$, the Proposition of Section II implies that, for large $n$,

$$n^{-1} \log G_{W,Q}(\mathcal{B}, 1 - \varepsilon) > T_Q(t) - \delta.$$

With this notation, (12) and (13) become

$$n^{-1} \log M_0 < -T_Q(t) + 2\delta, \tag{14}$$

$$n^{-1} \log M_1 < I(Y \wedge X) + t + 2\delta. \tag{15}$$

Now we shall exploit the fact that $\{(y_{jl}^n, \mathcal{A}_{jl})\}$ is an $\varepsilon$-code for the channel $V$. Consider the set

$$\mathcal{B}^* \triangleq \bigcup_{l:\, Q_l = Q} \mathcal{B}_l(Q).$$

Clearly, $\mathcal{B}^* \subset \mathcal{T}_n(Q)$, and $\mathcal{B}^*$ is a subset of the set $\{y_{jl}^n\}$; thus it is an $\varepsilon$-code for the channel $V$. Hence

$$n^{-1} \log \|\mathcal{B}^*\| < I(Y \wedge X) + \delta, \tag{16}$$

if $n$ is large enough. On the other hand,

$$\|\mathcal{B}^*\| \ge M_0(Q) \cdot \min_{l:\, Q_l = Q} \|\mathcal{B}_l(Q_l)\|. \tag{17}$$

Comparing (9), (10), (17) and (16), we see that

$$n^{-1} \log M_1 + n^{-1} \log M_0 < n^{-1} \min_{l:\, Q_l = Q} \log \|\mathcal{B}_l(Q)\| + n^{-1} \log M_0(Q) + 2\delta \le n^{-1} \log \|\mathcal{B}^*\| + 2\delta < I(Y \wedge X) + 3\delta. \tag{18}$$

Rearranging inequalities (14) and (15), and using the monotonicity of the function $T_Q(\cdot)$, we see that the point

$$(\tilde{R}_1, \tilde{R}_0) \triangleq (n^{-1} \log M_1 - I(Y \wedge X) - 2\delta,\ -n^{-1} \log M_0 + 2\delta) \tag{19}$$

lies above the curve $T_Q(\cdot)$. Further, from (18) we conclude that

$$\tilde{R}_0 \ge \tilde{R}_1.$$

This and the fact that $(\tilde{R}_1, \tilde{R}_0)$ lies above the curve $T_Q(\cdot)$ imply, by the definition of the function $T_Q(\cdot)$, that the point $(\tilde{R}_1, \tilde{R}_0)$ lies above the curve $T_{1,Q}(\cdot)$. This means by definition that an r.v. $U$ satisfying $(U,Y,X,Z) \in \mathcal{P}(Q)$ exists, for which $\tilde{R}_1 \le -I(U \wedge X)$ and $\tilde{R}_0 \ge -I(U \wedge Z)$. Hence, the definition (19) of $(\tilde{R}_1, \tilde{R}_0)$ implies

$$n^{-1} \log M_1 \le I(Y \wedge X \mid U) + 2\delta,$$

$$n^{-1} \log M_0 \le I(U \wedge Z) + 2\delta.$$

This and (18) prove the theorem by (2).

APPENDIX

In order to relate our Theorem to Cover's result [5], we have to modify our notation. In this Appendix, we shall denote by $R$ that r.v. which was denoted by $U$ in the body of the paper.

Let $U$ and $R$ be two independent r.v.'s and let $\tilde{Y}$ be a deterministic function of them. We shall say that the point $(R_1,R_0)$ belongs to the plane region $\mathcal{C}$ if, for $((U,R),\tilde{Y},\tilde{X},\tilde{Z}) \in \mathcal{P}$ satisfying the preceding conditions, the following inequalities hold:

$$R_1 \le I(U \wedge R,\tilde{X}) = I(U \wedge \tilde{X} \mid R), \tag{20}$$

$$R_0 \le I(R \wedge \tilde{Z}), \tag{21}$$

$$R_0 \le I(R \wedge U,\tilde{X}) = I(R \wedge \tilde{X} \mid U), \tag{22}$$

$$R_1 + R_0 \le I(U,R \wedge \tilde{X}). \tag{23}$$

Cover [5] proved that points belonging to $\mathcal{C}$ are achievable rate pairs for our coding scheme, i.e., $\mathcal{C} \subseteq \mathcal{D}$, which also follows immediately from the definitions of $\mathcal{D}$ and $\mathcal{C}$. Now we shall prove that $\mathcal{C} \supseteq \mathcal{D}$. (This and Cover's theorem [5] give another proof of the direct part of our Theorem.)

Proposition: $\mathcal{D} = \mathcal{C}$.

Proof: It is sufficient to prove that $\mathcal{C} \supseteq \mathcal{D}(Q)$ for any fixed $Q$. For $\mathcal{D}(Q) = \mathcal{L}(Q)$, this is trivial. Now suppose the contrary. Consider an arbitrary $(R_1,R_0) \in u(\mathcal{D}(Q))$. We have shown that there exists a quadruple of r.v.'s $(R,Y,X,Z) \in \mathcal{P}(Q)$ such that

$$R_1 = I(Y \wedge X \mid R), \tag{24}$$

$$R_0 = I(R \wedge Z), \tag{25}$$

$$R_1 + R_0 \le I(Y \wedge X). \tag{26}$$

Consider the stochastic matrix $F_{Y|R}$. It is well known (and easily proved by induction) that $F_{Y|R}$ can be represented as a convex combination

$$F_{Y|R} = \sum_{k} \lambda_k F_k, \tag{27}$$

where $\sum_k \lambda_k = 1$, $\lambda_k \ge 0$ and $F_k$ is a stochastic 0-1 matrix of order $\|R\| \times \|Y\|$. Each $F_k$ defines a function $f_k(r)$ on the range of $R$ such that $f_k(r) = b$ if and only if $F_k(b \mid r) = 1$. Let $U$ be an r.v. independent of $R$ and such that $\Pr(U = k) = \lambda_k$, and define

$$\tilde{Y} \triangleq f_U(R). \tag{28}$$

It easily follows from (27) and the independence of $U$ and $R$ that $F_{\tilde{Y}|R} = F_{Y|R}$, i.e., that $\operatorname{dist}(R,\tilde{Y}) = \operatorname{dist}(R,Y)$. Finally, define the r.v.'s $\tilde{X}, \tilde{Z}$ so as to satisfy $((U,R),\tilde{Y},\tilde{X},\tilde{Z}) \in \mathcal{P}(Q)$. It is clear that $\operatorname{dist}(R,\tilde{Y},\tilde{X},\tilde{Z}) = \operatorname{dist}(R,Y,X,Z)$.
Now we prove (20)-(23). Using (24) and (28), we see that

$$R_1 = I(Y \wedge X \mid R) = I(\tilde{Y} \wedge \tilde{X} \mid R) = I(f_U(R) \wedge \tilde{X} \mid R) \le I(U,R \wedge \tilde{X} \mid R) = I(U \wedge \tilde{X} \mid R).$$

Thus (20) is proved. (21) follows at once from (25). Using (26) and (20), we get

$$R_0 \le I(Y \wedge X) - I(Y \wedge X \mid R) = I(R \wedge X) = I(R \wedge \tilde{X}) \le I(R \wedge U,\tilde{X}),$$

which is (22). Finally, by (26) and (28), we have

$$R_1 + R_0 \le I(Y \wedge X) = I(\tilde{Y} \wedge \tilde{X}) \le I(U,R \wedge \tilde{X}),$$

proving (23) and completing the proof of our proposition.
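The convex decomposition (27) used in the Appendix can be made concrete with a small greedy procedure. The paper only states that such a representation exists and is easily proved by induction; the routine below is one possible construction offered as a sketch, and the names `decompose` and `recombine` are hypothetical. Entries are assumed to be exact `Fraction` values so that the loop terminates without floating-point tolerances.

```python
from fractions import Fraction


def decompose(F):
    """Write a row-stochastic matrix F as sum_k lam_k * F_k with 0-1 matrices F_k.

    F is a list of rows with exact (Fraction) entries. Returns a list of
    pairs (lam_k, f_k), where f_k[r] is the column b with F_k(b|r) = 1.
    """
    residual = [list(row) for row in F]
    terms = []
    remaining = Fraction(1)  # mass still left in every row of the residual
    while remaining > 0:
        # In each row pick a column carrying the largest residual mass.
        f = [max(range(len(row)), key=row.__getitem__) for row in residual]
        # The smallest picked entry is the weight of this 0-1 matrix.
        lam = min(residual[r][b] for r, b in enumerate(f))
        for r, b in enumerate(f):
            residual[r][b] -= lam
        terms.append((lam, f))
        remaining -= lam
    return terms


def recombine(terms, n_cols):
    """Rebuild sum_k lam_k * F_k from the output of decompose."""
    n_rows = len(terms[0][1])
    G = [[Fraction(0)] * n_cols for _ in range(n_rows)]
    for lam, f in terms:
        for r, b in enumerate(f):
            G[r][b] += lam
    return G
```

Each pass sets at least one residual entry to zero while keeping all row sums equal, so the loop ends after finitely many steps. Drawing $k$ with probability `lam_k` and outputting `f_k[r]` then reproduces the conditional distribution in row `r`, which is the role played by $U$ and $f_U(R)$ in (28).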
REFERENCES
[1] T. M. Cover, "Broadcast channels," IEEE Trans. Inform. Theory, vol. IT-18, pp. 2-14, Jan. 1972.
[2] R. Ahlswede, et al., "Bounds on conditional probabilities with applications in multi-user communication," Zeitschr. f. Wahrscheinl. verw. Geb., vol. 34, pp. 157-179, 1976.
[3] R. Ahlswede and J. Körner, "Source coding with side information and a converse for degraded broadcast channels," IEEE Trans. Inform. Theory, vol. IT-21, pp. 629-637, Nov. 1975.
[4] P. P. Bergmans, "Coding theorem for broadcast channels with degraded components," IEEE Trans. Inform. Theory, vol. IT-19, pp. 197-207, Mar. 1973.
[5] T. M. Cover, "An achievable rate region for the broadcast channel," IEEE Trans. Inform. Theory, vol. IT-21, pp. 399-404.
[6] R. G. Gallager, "Capacity and coding for degraded broadcast channels," Probl. Peredaci Informacii, vol. 10, pp. 3-14, July-Sept. 1974.
[7] J. Körner and K. Marton, "The comparison of two noisy channels," Keszthely Colloquium on Inform. Theory, Hungary, 1975.
[8] E. C. van der Meulen, "Random coding theorems for the general discrete memoryless broadcast channel," IEEE Trans. Inform. Theory, vol. IT-21, pp. 180-190, Mar. 1975.
[9] J. Körner and K. Marton, "Images of a set via two channels and their role in multi-user communication," IEEE Trans. Inform. Theory, vol. IT-23, pp. 000-000, Jan. 1977.
[10] M. Loève, Probability Theory. New York: Van Nostrand, pp. 157 and 28-42, 1955.
[11] A. D. Wyner and J. Ziv, "A theorem on the entropy of certain binary sequences and applications," IEEE Trans. Inform. Theory, vol. IT-19, pp. 769-777, Nov. 1975.