On the structure of optimal entropy

416
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
On the Structure of Optimal Entropy-Constrained
Scalar Quantizers
András György, Student Member, IEEE, and Tamás Linder, Senior Member, IEEE
Abstract—The nearest neighbor condition implies that when
searching for a mean-square optimal fixed-rate quantizer it is
enough to consider the class of regular quantizers, i.e., quantizers
having convex cells and codepoints which lie inside the associated
cells. In contrast, quantizer regularity can preclude optimality in
entropy-constrained quantization. This can be seen by exhibiting
a simple discrete scalar source for which the mean-square optimal
entropy-constrained scalar quantizer (ECSQ) has disconnected
(and hence nonconvex) cells at certain rates. In this work, new
results concerning the structure and existence of optimal ECSQs
are presented. One main result shows that for continuous sources
)= (
), where
and distortion measures of the form (
is a nondecreasing convex function, any finite-level ECSQ can
be “regularized” so that the resulting regular quantizer has the
same entropy and equal or less distortion. Regarding the existence
of optimal ECSQs, we prove that under rather general conditions
there exists an “almost regular” optimal ECSQ for any entropy
constraint. For the squared error distortion measure and sources
with piecewise-monotone and continuous densities, the existence
of a regular optimal ECSQ is shown.
Index Terms—Convex distortion measures, entropy coding, optimal quantization, regular quantizers.
I. INTRODUCTION
T
HE main objective of quantizer design is to find a collection of codepoints (the codebook) and associated quantization cells providing minimum distortion subject to a rate constraint. In fixed-rate quantization, where the quantizer’s rate is
measured by the log-cardinality of the codebook, the rate constraint means that the number of codepoints is fixed. In this
case, efforts to design optimal quantizers have lead to the wellknown necessary conditions of quantizer optimality (namely,
the nearest neighbor and centroid conditions), first for scalar
quantizers and the squared error distortion measure [1], [26],
[2], and subsequently for vector quantizers and more general
distortion measures [3], [4]. An important consequence of the
nearest neighbor condition is that an optimal fixed-rate quanManuscript received December 24, 2000; revised September 6, 2001. This
work was supported in part by the Natural Sciences and Engineering Research
Council (NSERC) of Canada. The work of A. György was also supported by the
Soros Foundation. The material in this paper was presented in part at the IEEE
International Symposium on Information Theory, Washington, DC, June 2001.
A. György is with the Department of Computer Science and Information
Theory, Budapest University of Technology and Economics, H-1521 Budapest,
Hungary (e-mail: [email protected]).
T. Linder is with the Department of Mathematics and Statistics, Queen’s University, Kingston, ON K7L 3N6, Canada (e-mail: [email protected]).
Communicated by P. A. Chou, Associate Editor for Source Coding.
Publisher Item Identifier S 0018-9448(02)00314-0.
tizer is essentially determined by its codepoints since its cells
are the Voronoi regions (with respect to the source distribution)
associated with the codepoints. For the squared error distortion measure this implies that an optimal quantizer is regular,
i.e., each of its cells is a convex set and the associated codepoint lies inside the cell. The cells of a regular scalar quantizer are intervals, and the cells of a regular vector quantizer
with a finite number of codepoints are convex polytopes. In
this sense, the structure of optimal fixed-rate quantizers for the
squared error distortion measure (and to a certain extent for
more general norm-based distortion measures [5]) is relatively
well understood. Moreover, for reasonable distortion measures
and source distributions, the distortion of a quantizer satisfying
the nearest neighbor condition is a continuous function of its
codepoints, and so the existence of optimal fixed-rate quantizers
can be deduced using standard continuity-compactness arguments [6]–[8].
The average rate of a quantizer can further be reduced if
a variable-rate lossless code (entropy code) is applied to its
output. In this case, the rate is usually defined as the entropy of
the output of the quantizer [9] in order not to tie the performance
of such a scheme to a particular entropy code, and the resulting
scheme is called an entropy-constrained quantizer. The objective of the design is then to minimize the quantizer’s distortion
for a given entropy constraint, and a quantizer achieving this
minimum distortion is called an optimal entropy-constrained
quantizer. Since the nearest neighbor condition is no longer necessary for the optimality of an entropy-constrained quantizer,
existence and structural problems concerning optimal quantization appear to be harder in this case. In contrast to fixed-rate
quantization, where particular attention has been payed to structural and existence issues, works on entropy-constrained quantization have focused more on design issues. For scalar sources,
Berger [10] and Farvardin and Modestino [11] found necessary
conditions for the optimality of a regular entropy-constrained
scalar quantizer (ECSQ) with a fixed number of output points.
These conditions give rise to practical algorithms for designing
locally optimal ECSQs with a fixed number of codepoints [10],
[12], [11], [13]. Chou et al. [14] gave an effective iterative descent algorithm using a Lagrangian formulation for the design
of locally optimal entropy-constrained vector quantizers. Sufficient conditions for the existence of an optimal quantizer among
all regular ECSQs with a fixed number of codepoints were given
for sources with log-concave densities by Kieffer et al. [13].
It appears, however, that no general result concerning the existence of optimal entropy-constrained quantizers is known.
The assumption of quantizer regularity seems to be ubiquitous in the literature on entropy-constrained quantization. A reg-
0018–9448/02$17.00 © 2002 IEEE
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
ular quantizer with a finite number of codepoints can be described using a finite number of parameters, while a more general quantizer structure may not be described this way. Thus, a
fundamental question is whether it is sufficient to consider only
regular quantizers when searching for a (mean-square) optimal
entropy-constrained quantizer. In fact, this question can be answered in the negative by a simple example; there exists a discrete scalar source distribution and an interval of entropy constraints for which no quantizer with interval cells is optimal (see
Example 1 in Section III). One main contribution of this paper is
to show that such a pathological example cannot exist when the
source distribution is continuous; for such a source any “good”
quantizer is essentially regular.
In a recent work, Chou and Betts [15] showed that if an
entropy-constrained quantizer is optimal and achieves the
, the lowest possible distortion of
lower convex hull of
any quantizer with entropy not greater than , then it satisfies
a modified version of the nearest neighbor condition. For the
squared error distortion measure, all quantizers satisfying
this modified nearest neighbor condition can be shown to be
regular. This result is very general in that it is valid for an
arbitrary source distribution and quantizer dimension. On the
other hand, it does not cover optimal quantizers that lie above
, which can happen if
is
the lower convex hull of
is not convex for a uniform
not convex. (For example,
scalar source and the squared error distortion measure [16].)
Moreover, the result already presumes the optimality of the
is an open issue.
quantizer, but the achievability of
Our purpose in this paper is to give new results on the structural properties and the existence of optimal ECSQs. The paper
is organized as follows. In Section II, notation and definitions
are introduced. In Section III, regularity type properties of
ECSQs are investigated. Throughout, our basic assumption is
that the source has a nonatomic distribution (i.e., its distribution
function is continuous) and the distortion measure is of the
, where is a nondecreasing convex
form
function. Theorem 1 shows that for any finite-point quantizer
there is a regular quantizer with the same entropy and equal
or less distortion. As a consequence, Corollary 2 shows that
regular finite-point quantizers can perform arbitrarily close to
. Theorem 2 exthe operational distortion-rate curve
tends Theorem 1 to infinite-point quantizers; it shows that any
quantizer can be replaced with an “almost regular” quantizer,
that is, with a quantizer which has interval cells but may be
undefined on a set of probability zero.
In Section IV, existence results concerning optimal ECSQs
are given. Theorem 3 proves that an optimal ECSQ achieving
always exists, and that such an optimal quantizer can
be assumed to be almost regular. Theorem 4 shows that a regular optimal ECSQ exists among all quantizers having a fixed
number of codepoints. In Section V, for the squared error distortion measure and sources with densities, the almost regularity
of an optimal ECSQ is strengthened to regularity. Theorem 5
shows the existence of regular optimal ECSQs for a wide class
of source densities which contains all univariate densities commonly used as parametric source models. Concluding remarks
are given in Section VI.
417
II. PRELIMINARIES
An -point (or -level) scalar quantizer is a (Borel) measurable mapping of the real line into a finite or countably incalled the
finite set of distinct reals
codebook of . In case the codebook is not finite, we formally
and call an infinite-point quantizer. The
define
are called the codepoints and the sets
are called the cells (or decision regions) of . The codebook
and the collection of cells
completely
characterize since is a partition of and
for
The distortion of in quantizing a real random variable
with distribution is measured by the expectation
where the distortion measure
is a nonnegative measurable function of two real variables. When the distribution is
will be used.
clear from the context, the short notation
The partial distortion of the th cell of is defined by
(1)
. In case is an arbitrary finite
so that
and the
are still well defined by the
measure,
corresponding integrals.
The entropy-constrained rate of is the entropy of the discrete random variable
where
denotes base logarithm. A scalar quantizer whose
is called an entropy-constrained
rate is measured by
scalar quantizer (ECSQ).
Unless otherwise stated, we always assume that the cell prob,
, are all positive. One can
abilities
always redefine on a set of probability zero (by possibly reducing the number of cells) to satisfy this requirement.
let
denote the lowest possible distorFor any
tion of any quantizer with output entropy not greater than .
This function is formally defined by
where the infimum is taken over all finite or infinite-point scalar
quantizers whose entropy is less than or equal to . If there is
, then we
no with finite distortion and entropy
. The existence of a “reference
formally define
such that
is a sufficient (but
letter”
to be finite for all
.
not necessary) condition for
in the sense that
and
Any that achieves
is called an optimal ECSQ.
A scalar quantizer is called regular if i) its cells are subintervals of the real line; ii) each of its codepoints lies inside the
associated cell; iii) the collection of cells is locally finite in
the sense that the number of cells in intersecting any bounded
subset of the real line is finite. This definition reduces to the
418
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
usual definition of regularity [9] if
has a finite number of
codepoints since in this case iii) is automatically satisfied.
be a regular finite-point scalar quantizer with codeLet
. Then the correpoints indexed so that
satisfy
sponding interval cells
where, for
,
and
. Defining
boundaries
means that
and
for all
, the interval
, satisfy
there is a
(the “generalized centroid” of ) minimizing
the distortion over , i.e.,
(4)
Since is nondecreasing, if is an interval, then there is a
minimizing that lies inside . If is strictly convex then
is unique for any
such that
[4]. For example,
for the squared error distortion measure.
III. REGULARIZING ECSQS
Consider a scalar quantizer
with a finite codebook
. For any source random variable
and
if
The points ,
, called the subdivision points or
.
thresholds of , may belong to either or
Similarly, if an infinite-point quantizer is regular, then its
can be linearly ordered so that
if
,
codepoints
where the index set is either the positive integers (if there is a
smallest codepoint), or the negative integers (if there is a largest
codepoint), or the set of all integers (if there are no smallest
are intervals with
and largest codepoints), and the cells of
and such that
. Throughout
endpoints
has a
the paper we assume that the source random variable
for all
.1
nonatomic distribution, i.e.,
Thus, each subdivision point can be mapped arbitrarily to
without changing the distortion and entropy
either or
of . Hence we adopt the convention that the bounded cells
. For
of a regular quantizer are of the form
the sake of unifying the notation, if has a leftmost cell, we
and
sometimes formally extend the domain of to
instead of
.
write
In what follows, we often assume that is a difference distortion measure of the form
(2)
where :
the limit
is a nondecreasing function. Then
exists, and is either finite or
formally extend the domain of
ingly define
to
. It will be convenient to
and accord(3)
.
for all
It is easy to show that if is also lower semicontinuous, then
such that
for any Borel set
X
1Note that if the distribution of
is “continuous” in the sense that it has a
probability density function, then it is necessarily nonatomic. On the other hand,
there exist nonatomic distributions (such as the Cantor measure; see, e.g., [17])
that do not have densities.
Thus, if is a quantizer that maps any input
to a code,
, then it has minpoint in minimizing
imum distortion among all quantizers with codebook . This is
the well-known nearest neighbor condition for fixed-rate quanof
are not
tizers. Although the cells
satisfies
uniquely defined, each
For the squared error distortion measure
,
, where is nonor more generally for
is an interval and
; hence is regdecreasing, each
ular. This means that any finite point quantizer can be “regularized” to obtain a regular quantizer with the same number of
codepoints and equal or less distortion, and that it is enough to
consider the much smaller, parametric class of -point regular
quantizers when searching for an optimal quantizer in the nonparametric family of all -point quantizers.
Although the nearest neighbor condition is no longer valid
for entropy-constrained quantizers, it was recently shown in [15]
that a modified version of it still holds in the entropy-constrained
and cells
setting.2 If a quantizer with codebook
achieves the lower convex hull of
for a source
with
(the negative slope of a line
distribution , then there is a
of support to the lower convex hull at ) such that for all and
-almost all
(5)
For the squared error distortion (and more generally for
th-power distortions),
if
. In this case, the
infimum in (5) is achieved by some index , and (5) defines a
quantizer with convex cells such that only finitely many cells
intersect any bounded set [15]. Thus, for the squared error
distortion measure, any achieving the lower convex hull of
can be assumed to be regular. Moreover, it was shown
in [15] that has a finite number of codepoints if the source
distribution has sufficiently light tail. Condition (5) works
for arbitrary source distribution and quantizer dimension, but
it does not cover optimal quantizers that lie above the lower
2A similar necessary condition for the optimality of variable-rate quantizers
also appeared in [14].
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
419
convex hull of
, which can happen if
is not
for a uniform scalar source and
convex. For example,
the squared error distortion coincides with its lower convex
for
[16]. In
hull only at rates
, and
general, little is known about the properties of
so the achievability of the lower convex hull is very difficult
to check. Thus, other methods are needed to find for each
rate a tractable, sufficiently small family of quantizers that
still contains an optimal entropy-constrained quantizer. In
the scalar case, such a family will turn out to be the class of
regular quantizers if the source distribution is nonatomic. For
discrete sources, however, it may happen that no regular ECSQ
achieves the minimum mean-squared distortion among all
ECSQs satisfying a given entropy constraint.
Proof of Lemma 1: Since the distribution function of is
such that
.
continuous, there is an
,
,
Let be the quantizer with cells
. We show that
. If
and codepoints
is not finite, we are done; so assume
. The key
observation is that the function
, and let be a discrete
with the following
Since is nondecreasing, is clearly nondecreasing in the in, and the convexity of readily implies that is
terval
and
.
also nondecreasing in
and
Since
Example 1: Let
random variable taking values in
distribution:
and
for is in effect defined by a partition of the set
(the definition of for other values is immaterial)
and by the corresponding codepoints (the optimal codepoints for
each partition can easily be found). Checking the five possible
partitions, it turns out that for all such that
is nondecreasing. To show this, rewrite
in the following form:
if
if
if
.
A quantizer
where
has two codepoints
cells must satisfy
and
, an optimal ECSQ
, and the corresponding
and
Thus, if
is an interval, then
is a union of two disjoint intervals, each with positive probability. It follows that no regular
quantizer can be optimal in this case.
The first result of this section shows that no such pathological example can exist if has a nonatomic distribution. More
has a nonatomic distribution,
specifically, we show that if
then, for nondecreasing convex difference distortion measures,
every finite-point ECSQ can be “regularized” so that the resulting regular quantizer has the same cell probabilities (and
hence the same entropy) and equal or less distortion. The key
to this result is the following lemma.
has a
Lemma 1: Assume that the random variable
, where
nonatomic distribution , and let
:
is convex and nondecreasing. Let
be
and coran arbitrary two-point quantizer with cells
such that
. Then there
responding codepoints
with interval cells
exists a two-point quantizer
and corresponding codepoints
such that
,
,
, and
.
has interval cells but it is not necessarily regular
Remark:
is not guaranteed by the construction). However, since
(
is nondecreasing, the centroid rule (4) guarantees that each
can be replaced by a
such that the distortion is not
increased.
(It is easy to see that
and the monotonicity of
imply that both integrals are finite.) Adding
to both sides and rearranging terms yields
.
We showed that convex and nondecreasing distortion measures allow regularizing any two-point quantizer if the source
distribution is nonatomic. We call a distortion measure possessing this property two-point regular. More generally, given a
positive integer we say that a distortion measure is -point
regular if for any nonatomic distribution and any -point
quantizer there is a regular -point quantizer such that
and and have the same cell probabilities
(i.e., there is a one-to-one mapping between the cells of and
satisfying
for any cell of ). Finally, is
called finitely regular if it is -point regular for every positive
integer .
Lemma 1 showed that nondecreasing and convex difference
distortion measures are two-point regular. The next theorem
shows that if is two-point regular, then it is also finitely regular. The proof can be found in Appendix A.
Theorem 1: Every two-point regular distortion measure is
has a nonatomic distribution ,
finitely regular. That is, if
and is two-point regular, then for any finite-point quantizer
with cells
there exists a regular quantizer
with interval cells
such that
,
, and
.
Remarks: i) Note that not only do and have equal enand
are encoded using the
tropies, but if the outputs of
same variable-length code (e.g., a Huffman code [18]), then the
resulting average code lengths will be equal. ii) By Lemma 1, the
theorem holds for any difference distortion measure
with a nondecreasing and convex . Although we
420
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
have not yet found other examples for two-point regular distortion measures, we conjecture that this class is actually larger
than the class of nondecreasing and convex difference distortion
measures.
with a nondecreasing and convex
For
, the construction of Lemma 1 does not change the codepoints
of the quantizer, and it orders the cells according to the corresponding codepoints. It is easy to see that this order preserving
construction is maintained in Theorem 1 if Lemma 1 is used in
each step of the induction argument given in the proof. Thus,
we obtain the following corollary.
Corollary 1: Assume that the real random variable has a
, where
nonatomic distribution , and let
is nondecreasing and convex. Then for any
with cells
and correfinite-point quantizer
there exists a regular quansponding codepoints
with interval cells
such that
tizer
for all
,
, and if
,
.
then
Remark: It follows from Lemma 1 and the construction in
the proof of Theorem 1 that can also preserve the codepoints
of , not only their order. In this case, has interval cells, but
is not guaranteed).
it is not necessarily regular (
Another immediate consequence of Theorem 1 is that
can be arbitrarily well approximated by regular finite-point
ECSQs. This fact can be very useful in analyzing the high-rate
[19].
asymptotic behavior of
Corollary 2: Assume has a nonatomic distribution , and
let be a two-point regular distortion measure. If there is a
such that
, then for any
and
,
such that
there is a regular finite-point quantizer
and
Remark: The conditions of the corollary are satisfied for the
.
squared error distortion measure if
Proof of Corollary 2: First we show that for any quantizer
with finite distortion, there is a sequence of regular finitesuch that
for all and
point quantizers
(6)
If is a finite point quantizer, then (6) follows by regularizing it
using Theorem 1. Otherwise, assume has cells
and corresponding codepoints
, and for
deto have cells
fine the -point quantizer
and codepoints
. Moreover,
and so
and
. It is clear that
are identical on
,
since
,
increases to , and
.
is an -point quantizer, Theorem 1 can be
Since each
such that
used to obtain a regular -point quantizer
, and
. Hence (6) is
choose a quantizer
such
proved. Now for any
and
. Then, by
that
such that
(6), there is a regular finite-point quantizer
and
, which in turn
.
satisfies
Unexpected problems may arise if one wants to extend Theorem 1 to infinite-point quantizers. To illustrate the problem assume that the order-preserving construction of Corollary 1 carries over to the regularization of infinite-point quantizers. In this
case, if and are two codepoints of the initial infinite-point
, then the corresponding cells
quantizer such that
and
of the regularized quantizer satisfy
. As a result, can have quite an unusual structure if the initial quantizer
is arbitrary. For example, assume that is an infinite-point
such that if
, then
quantizer with codepoints
. Then the collection of
there exists a such that
of the regularized will have the propinterval cells
, there is an
such that
erty that for any and with
. It can be seen that such collection of intervals
cannot form a partition of the real line.3
To deal with such cases, we introduce the notion of almost
regular quantizers. Given a finite measure , a quantizer is
called -almost regular (or almost regular if is clear from the
with
such that is
context) if there exists an
, and every cell of is an interval containing
defined on
the associated codepoint. Thus, an almost regular quantizer has
interval cells but it may not be defined on a set of -measure
in an arbitrary manner for all
zero. (We can define
without changing the entropy and distortion of .) As an exbe the Cantor ternary set [17], and let
ample, let
be absolutely continuous with respect to the Lebesgue measure
. Then
is a union of countably many
with
open intervals which form the cells of a -almost regular quantizer. As this example shows, an almost regular quantizer can
have infinitely many cells in a bounded interval. Also note that
the quantizer of this example cannot be (re)defined on a set of
probability zero to obtain a quantizer with interval cells.
This definition can be extended similarly to higher dimensions. A -dimensional vector quantizer is called -almost
such that
, is defined
regular if there is a set
, and has convex cells containing the corresponding
on
codepoints. To exhibit an example for a -almost regular vector
quantizer, we can use a corollary of the Vitali covering theorem
[20]. The corollary states that there exists a countable collection
of disjoint closed balls in the unit cube
of
such that the set
has Lebesgue
measure zero. Now let be absolutely continuous with respect
.
to the -dimensional Lebesgue measure with
form the cells of an almost regular -diThen
mensional quantizer.
3Let S
^ denote the interior of S
^ . Then n (
S^ ) is nonempty and perfect
(since S^ and S^ cannot have a common endpoint if i 6= j ), and thus uncountable. Therefore, n ( S^ ) is nonempty and uncountable.
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
Remark: If
is a -almost regular scalar quantizer with
such that only finitely many of the
intersect any
cells
bounded interval, then can be redefined on a set of -measure
zero to obtain a regular quantizer with the same distortion and
entropy. This follows since in this case there is an indexing of
for all , and now the set of points
the cells such that
and
where is pos(of measure zero) lying between
to obtain interval
sibly not defined can be assigned to (say)
cells whose union covers the real line.
The last result of this section shows that for two-point regular difference distortion measures, any infinite-point quantizer
can be replaced by an almost regular quantizer with the same
entropy and equal or less distortion. The theorem is proved in
Appendix A.
has a
Theorem 2: Assume that the random variable
be a
nonatomic distribution , and let
two-point regular distortion measure, where
is nondecreasing and left-continuous. Then for any infinite-point quantizer there exists an almost regular quantizer
with the same cell probabilities such that
.
Then
421
, where
for all
Now for any
, let be a positive-integer,
valued random variable with distribution
. Then by a result of Wyner [21, Theorem 1] we have
Thus, for any
, Markov’s inequality implies
Hence, the family of distributions corresponding to is tight.
has a pointTherefore, by Prokhorov’s theorem [17],
, which
wise-convergent subsequence, denoted also by
.
converges to some probability distribution
, and by Fatou’s lemma
Clearly,
IV. THE EXISTENCE OF OPTIMAL ECSQs
The regularization result of Theorem 2 makes it possible to
show the existence of an optimal ECSQ for any source with
a nonatomic distribution. Theorem 2 also implies that such an
optimal ECSQ can be assumed to be almost regular.
have a nonatomic distribution and let
be a two-point regular distortion measure,
is nondecreasing and left-continwhere
there exists a -almost regular quanuous. Then for any
and
.
tizer such that
Theorem 3: Let
Remarks: i) The conditions on are satisfied when is
convex and nondecreasing; hence, the result holds for the
squared error distortion measure. ii) This theorem would imply
the existence of a regular optimal ECSQ if one could directly
prove that an optimal ECSQ cannot have infinitely many cells
in a bounded interval. While this is a reasonable conjecture
under general conditions, we have only been able to prove
it for the squared error distortion measure and sources with
well-behaved densities (see Section V).
and assume
is
Proof of Theorem 3: Fix
finite; otherwise, the statement is trivial. Consider a sequence
such that
for all and
of quantizers
By Theorem 2, we can assume that each
is almost regular.
denote the
For positive integers and let
cell of
with the th largest probability, let
denote the
. In case of
corresponding codepoint, and let
has
ties, any ordering of the equiprobable cells suffices. If
cells, then
is formally defined to be empty for all
.
For every , let
(thus,
, implying that is compact under pointwise concorrevergence). Now for the subsequence of quantizers
, form the vectors
sponding to
If is empty, define
. By Cantor’s diagonalconverging pointwise to
ization method, a subsequence
can be chosen (cona vector
or
is also allowed). For this vector, we can
vergence to
construct a quantizer with codepoints and corresponding
. If
(in this case also
by concells
struction), then is empty by definition. Since for every fixed
the intervals
,
are pair,
wise disjoint, it is easy to see that the intervals
are also pairwise disjoint. Since is nonatomic
for all , and hence
. Thus, if is defined to have
and codepoints
(note that just as in the proof of
cells
,
Theorem 2, some of the may not be finite), then since
we have
On the other hand, Lemma 4 in Appendix B shows that
,
, and
imply
Now we can use the centroid rule (4) to replace any nonfinite
by a finite one that lies inside its associated cell. The modified
.
quantizer is -almost regular and achieves
422
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
The following result shows that for any finite
there is an
optimal quantizer among all ECSQs with no more than codepoints.
has a
Theorem 4: Assume that the random variable
be a
nonatomic distribution , and let
two-point regular distortion measure, where
is nondecreasing and left-continuous. Then for any
and
positive integer , there exists a regular quantizer with at most
codepoints achieving
Example 2: Let denote the distribution of in Example 1,
, be defined by the density such that
and let ,
if
or
,
if
, and is zero otherwise. Then
. Let
if
the two-point quantizer be defined by
and
if
. Consider the squared error distortion
for all , implying
measure. Then
Also, it is easy to see that
has at most
codepoints
Remark: Note that the quantizer achieving
may
codepoints. For the squared error distortion
have less than
case and for sources with log-concave densities, Kieffer et al.
is achieved by
[13] provided conditions under which
a quantizer having exactly codepoints.
Proof of Theorem 4: The proof of Theorem 3 is used with
a slight modification. By Theorem 1, there exists a sequence of
, each having no more than
coderegular quantizers
for all and
points, such that
Then the construction in the proof of Theorem 3 yields an almost
regular quantizer with at most codepoints. Since
and
, can be redefined to obtain a
.
regular quantizer which achieves
We conclude this section with a short discussion on the (lack
. For two probability distributions and
of) stability of
with
and
define
where the infimum is over all joint distributions of the pairs of
such that
has distribution and
random variables
has distribution . Then
is a metric on probability distributions with finite second moments which has been widely
used in fixed-rate quantization (see, e.g., [22], [6], [5]). For the
squared error distortion measure, optimal fixed-rate quantizer
performance is a continuous function of the source distribution
denote the minin this metric [6], that is, letting
imum mean-squared distortion for a source with distribution
of any quantizer with
codepoints, one has
for any sequence of source distributions
with
. The convergence in is easy to characterize
if and only if
and
(
weakly [6]), and the continuity of
in is an
important tool in proving consistency and convergence rate results for empirical quantizer design. In particular, if a quantizer
is optimal for , and
and are close in , then
is
nearly optimal for .
One can ask whether an analogous stability result holds for
, the minimum ECSQ distortion (here we have made
on ). The following simple
explicit the dependence of
is not conexample uses this fact to demonstrate that
tinuous in .
Since
from Example 1, we obtain
V. OPTIMALITY
AND REGULARITY
WITH DENSITIES
FOR
SOURCES
In the previous section, we showed the existence of almost
regular optimal ECSQs for nonatomic source distributions and
a wide class of difference distortion measures. We can obtain
the stronger result that a regular optimal ECSQ exists if we restrict our attention to the squared error distortion measure and
sources with well-behaved densities. For convenience, we assume in this section that the source density is supported in a
of the real line , where we allow the possisubinterval
and
. Accordingly, quantizers need
bility that
.
only be defined on
is called piecewise monotone if
A function
can be partitioned into countably many intervals such
that any bounded set intersects only a finite number of these
intervals and is monotone in each of these intervals. Piecewise continuity is similarly defined with continuity in place of
monotonicity. All continuous unimodal densities are, of course,
piecewise monotone and piecewise continuous, and all densities commonly used in source modeling belong to this class, including the generalized Gaussian, Cauchy, and beta densities
[11].
The following theorem resolves the problem of the regularity
of optimal ECSQs in the special, but important case of the
squared error distortion measure and piecewise-monotone and
piecewise-continuous source densities.
be a random variable with a density
Theorem 5: Let
which is piecewise monotone and piecewise continuous in
. Assume that
. Then for any entropy
there is an optimal ECSQ which is regular.
constraint
To prove this result, we need two useful technical lemmas.
The proofs of these are deferred to Appendix C. The first lemma
is an extension of a result of Kieffer et al. [13, Lemma A.5].
be a random variable with a density
Lemma 2: Let
function which is continuous, positive, and nonincreasing
and
. Assume
(resp., nondecreasing) on
. Then for any scalar quantizer
there
exists a quantizer satisfying the following:
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
a)
and
have the same cell probabilities and
;
has interval cells
such that
(resp.,
b)
);
are nonincreasing as inc) the cell probabilities
creases.
The next lemma is a simple consequence of a necessary condition for the optimality of a finite-point regular ECSQ due to
Farvardin and Modestino [11], who generalized a similar result
of Berger [10] from the squared error distortion to more general
distortion measures.
have a density which is positive and
and let
, where
is nondecreasing and continuous. Assume
is an ECSQ with finite distortion that is optimal for some
. If has codepoints
and interval
entropy constraint
, where
for all , then there is
cells
such that for all
a
Lemma 3: Let
continuous in
(7)
where
.
by
Proof of Theorem 5: Denote the distribution of
and assume
; otherwise, the result trivially holds.
there is a -almost
By Theorem 3, for any rate constraint
regular optimal quantizer . If only finitely many cells of
intersect any bounded interval in
, then can be redefined
on a set of -measure zero to obtain a regular quantizer (see the
remark preceding Theorem 2), and the theorem holds.
Assume now that has an infinite number of cells in some
bounded interval. Since is piecewise monotone and piecewise
continuous, it is easy to see that there is an interval partition
of
such that only finitely many of the intersect
, and is continuous and monoany bounded subset of
tone in the interior of each . Then there is a bounded interval
such that
for some
, contains infinitely many
, of , and the intersection of with any cell of
cells, say
not contained in has -measure zero. Thus, we can define a
with the corresponding
new quantizer on to have cells
codepoints. If denotes the conditional density of on , and
is the corresponding distribution, then is -almost regular,
and is positive, continuous, and monotone on . Assume
is nonincreasing; the argument is similar for nondecreasing .
with
Then, by Lemma 2, there is a
such that and have the same cell probabilities, has insuch
terval cells with subdivision points
are nonincreasing as inthat the cell probabilities
, the left endpoint of . Note that the condicreases, and
required in Lemma 2 is automatically
tion
satisfied since is bounded. Since was optimal, must be
. Denoting by
optimal for and the entropy constraint
the codepoint for the cell
, by Lemma 3, there is a
such that for all
we have
(8)
. If
where
to show that only finitely many
, we follow an idea in [15]
can be nonzero, which con-
423
tradicts the assumption that has an infinite number of cells in
. Notice that (8) implies that for any
where denotes the length of and the last inequality holds
is convex and nondecreasing. Therebecause
, then
is bounded from above; hence, there
fore, if
such that
for all . But then
,
is an
. Then (8) implies that the cells
which is impossible. Thus,
satisfy the standard nearest neighbor condition. Since
, the optimal codepoint corresponding to
is the centroid
, and since is
. By the nearest neighbor
nonincreasing on ,
, and so
condition,
for all . That is, the cell lengths increase with . Therefore,
, contradicting the fact the is bounded and
for all
. Thus, only finitely many cells of can intersect
any bounded interval.
VI. CONCLUSION
In this paper, we presented new results on the structure and
existence of optimal ECSQs. First, we considered the problem
of regularization. For nonatomic source distributions and a wide
class of difference distortion measures we showed that for any
finite-point ECSQ there is a regular ECSQ with the same entropy and equal or less distortion. As a consequence, we showed
(under the standard assumption that there is a reference point
with finite expected distortion) that regular finite-point ECSQs
can perform arbitrarily close to the operational distortion-rate
. We introduced the notion of an almost regular
curve
quantizer (such quantizers have convex cells but may be undefined for a set of input points of probability zero), and we
showed that any infinite-point quantizer can be replaced with
an almost regular quantizer that has the same entropy and equal
or less distortion.
Using the technique of regularization, we gave results concerning the existence of optimal ECSQs. We proved that there
exists an almost regular ECSQ that achieves the operational
. This result basically settles the
distortion-rate function
problem of existence of optimal entropy-constrained quantizers
in the scalar case. We also showed that a regular optimal ECSQ
exists among all ECSQs having no more than a given finite
number of codepoints. For the squared error distortion measure and a wide class of sources with well-behaved densities
can be achieved by a regular ECSQ.
we showed that
It is of interest to extend these results to entropy-constrained
vector quantizers (ECVQs). The techniques used in this paper
rely rather heavily on the fact that the set of reals can be linearly ordered in a natural way, and the proofs do not easily carry
over to the vector case. For example, the existence of optimal
ECVQs is an open problem. Another interesting (and perhaps
easier) problem is to show that for the squared error distortion
424
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
measure, the th-order entropy-constrained operational distorcan be arbitrarily well approached
tion-rate function
by regular -dimensional ECVQs. Such a result would help
tie up some loose ends in asymptotic (high-rate) quantization
theory [23], [24].
and
if
if
.
Thus, (10) and (11) imply that
APPENDIX A
is finite; otherwise, the
Proof of Theorem 1: Assume
statement is trivial. The proof uses induction on . A one-point
quantizer is always regular; hence is always one-point regular.
Also, is two-point regular by assumption. Now assume that
is -point regular for all
, where
. We show
that then is also -point regular.
Assume without loss of generality that the indexing of the
cells of is such that
Step 2: Let
be the restriction of to
, and
have cells
and
let the two-point quantizer
. Then, since is assumed to be twocodepoints
point regular, there is a regular quantizer with interval cells
and codepoints
such that
and and have the same cell measures according
to . We assume (as we may without loss of generality) that
and
, where
(9)
, let denote the codepoint corresponding
and for
to .
The main idea in constructing is the following. First, we
and, using the induction hypothesis, redefine the cells
fix
such that the new cells are “intervals” in
.
and
.
That is, the new cells satisfy
and
such that the quantizer thus
Then we redefine
obtained have a leftmost cell which is a proper interval. Finally,
cells are replaced by intervals using the induction
the other
hypothesis.
be the restriction of to
, that is,
Step 1: Let
for any Borel set
. Further-point quantizer be defined to have cells
more, let the
and codepoints
. Then
Since either
(9) we have
or
. Thus
, by
, because
, and so
Therefore, we can define the quantizer
points
with cells and codeif
if
if
and
if
(10)
is nonatomic, the induction hy(recall definition (1)). Since
pothesis implies that there is a regular quantizer with interval
and codepoints
cells
such that
(11)
and the cells of
. Now define
points
and have the same measures according to
to have cells
and codesuch that
if
if
if
.
Then, as in Step 1, it is easy to see that
and
and
have the same cell probabilities according to .
is an interval, using the reStep 3: Since
, by the induction hypothesis we can restriction of to
with interval cells
place the cells
and corresponding codepoints to obtain a regular quantizer
(where
and
) such that has the same cell
according to , and
probabilities as
if
Consequently,
same cell probabilities.
and
if
if
.
it follows that
From the construction of
same cell probabilities according to
and
have the
(12)
, and
and
have the
Proof of Theorem 2: Assume
; otherwise, the
is an infinite-point quanstatement is trivial. Suppose that
and codepoints
. The
tizer with cells
proof consists of two parts. First, for every nonatomic finite
measure and for every positive integer we construct an inwith
having cells
finite-point quantizer
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
such that
for all , and eior
if
,
.
ther
converges, in a sense, to an almost regThen we show that
ular quantizer which has the same cell probabilities as and
.
satisfies
The construction is a simple application of Theorem 1.
let
denote the restriction of
to
,
For
and the -point quantizer with
and apply Theorem 1 to
and codepoints
.
cells
has codepoints, say,
The resulting regular quantizer
and interval cells
such that
construction of
if
425
. Also by construction, we have that for all
. Hence, for all
Since
for all , this implies
for
Now define
for all
and since
we obtain
to have cells
for
and
for
for
or
, with corresponding codepoints
, and
for
. Then either
for all
, and
is nonatomic and
,
,
. Thus, the quantizer with cells
and codepoints ,
satisfies (13).
or
).
(Note that it may happen that
, then
To show (14), observe that if
for all large enough. Since the sequence
increases to
by construction, and either
or
for
,
, this implies that if
then
for all large enough. In other words,
denote the indicator function of a set
letting
if
otherwise.
Also, for all
In the second part of the proof, we show that there is a -alwith (interval) cells
most regular quantizer
such that
for all
, we have
since
and is nondecreasing and left-continuous (this also holds if is not finite; recall (3)). Thus, since
, we obtain
(13)
and
(14)
, this completes the proof.
Since
satisfying (13) and (14), let
To show the existence of
and
and form the vector
(recall that
is the codepoint associated with
, and
that
for
by construction since
was a regular quantizer). By Cantor’s diagonalization
, for convenience
method, there is a subsequence of
, converging componentwise to a vector
denoted also by
(convergence to
or
is also allowed). Denote the corresponding subsequence of
. Then
for all , and
quantizers also by
,
are pairwise disjoint
the intervals
,
, are pairwise disjoint by the
since
where the first inequality follows by applying Fatou’s lemma
[17] twice. Finally, we can use the centroid rule (4) to replace
any nonfinite by a finite one that lies inside its associated cell.
The modified quantizer is -almost regular and satisfies (13)
and (14).
APPENDIX B
has a nonatomic distribution and
, where
is nondebe a sequence of inficreasing and left-continuous. Let
Lemma 4: Assume
426
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 48, NO. 2, FEBRUARY 2002
nite-point almost-regular quantizers, each with cells
and codepoints
such that
, and
for all , where ,
are not necessarily finite. If the intervals
, then for the quantizer with cells
isfy
, we have
codepoints
Proof: Since
,
, and
satand
for each fixed and all large enough,
. Consequently, for such and
and so
,
and so letting
is nondecreasing and left-continuous, we
have
for -almost all (recall definition (3) if is not finite). Then,
by applying Fatou’s lemma twice, we obtain
we have
for all large enough. Therefore, Lemma 4
can be applied to show that the infinite-point quantizer with
and codepoints
subdivision points
,
satisfies
Since clearly satisfies the other requirements of the lemma,
the proof is complete for nonincreasing . A similar argument
can be used when is nondecreasing.
APPENDIX C
Proof of Lemma 2: Assume that is nonincreasing, and deby
. By Theorem 2, we can
note the cells of
to be almost regular. If
has a finite number of
redefine
codepoints, then it can be redefined to be a regular quantizer.
For a finite-point regular quantizer the statement of the lemma
reduces to [13, Lemma A.5].
Now assume that has infinitely many codepoints. Denote
by
and let us index the cells of
the distribution of
in such a way that
. The proof of
of regular
Corollary 2 shows that there is a sequence
-point quantizers such that each
has cell probabilities
(and so
),
.
and
to obtain a unique -point regular
Apply the lemma to
with
, having subdivision points
quantizer
and codepoints
Proof of Lemma 3: Consider first the case where
is a
finite-point regular quantizer with ordered subdivision points
and codepoints
. In this case,
.
[11, eq. (13)] shows that (7) holds for all
(This result is an immediate consequence of the Kuhn–Tucker
conditions of constrained optimization [25] applied to the
as functions of the vector
distortion and the entropy of
. Although explicit conditions on the source
density and the distortion measure were not stated in [11], it is
easy to check that the positivity and continuity of the source
density and the continuity of are sufficient.)
If is constant for all , then (7) does not depend on . Oth, and then is uniquely
erwise, there is an such that
given by
(15)
Notice that the numerator in the above expression is independent
of the distribution of , and the denominator depends only on
the ratio of the probabilities of adjacent cells. Therefore, if is
for some , and the
an infinite-point quantizer, then
on
application of (15) to the conditional density of
for all
proves (7) for all (clearly, the finite-point
must also be opquantizer obtained by restricting to
timal for the conditional density and the corresponding rate).
REFERENCES
such that
for
where
is obtained by ordering the probabilities
in a nonincreasing manner.
an infinite-point quantizer by defining
Let us formally make
for all
. Denote the distribution function of by
, and let
be the inverse of in the interval
. Clearly,
[1] S. P. Lloyd, “Least squared quantization in PCM,” unpublished memo.,
Bell Labs., 1957.
[2] J. Max, “Quantizing for minimum distortion,” IEEE Trans. Inform.
Theory, vol. IT-6, pp. 7–12, Mar. 1960.
[3] Y. Linde, A. Buzo, and R. M. Gray, “An algorithm for vector quantizer
design,” IEEE Trans. Commun., vol. COM-28, pp. 84–95, Jan. 1980.
[4] R. M. Gray, J. C. Kieffer, and Y. Linde, “Locally optimum block quantizer design,” Inform. Contr., vol. 45, pp. 178–198, 1980.
[5] S. Graf and H. Luschgy, Foundations of Quantization for Probability
Distributions. Berlin, Germany: Springer-Verlag, 2000.
[6] D. Pollard, “Quantization and the method of k -means,” IEEE Trans.
Inform. Theory, vol. IT-28, pp. 199–205, Mar. 1982.
GYÖRGY AND LINDER: ON THE STRUCTURE OF OPTIMAL ENTROPY-CONSTRAINED SCALAR QUANTIZERS
[7] E. A. Abaya and G. L. Wise, “Convergence of vector quantizers with
applications to optimal quantization,” SIAM J. Appl. Math., vol. 44, pp.
183–189, 1984.
[8] M. J. Sabin, “Global convergence and empirical consistency of the generalized Lloyd algorithm,” Ph.D. dissertation, Stanford Univ., Stanford,
CA, 1984.
[9] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1992.
[10] T. Berger, “Optimum quantizers and permutation codes,” IEEE Trans.
Inform. Theory, vol. IT-18, pp. 759–765, Nov. 1972.
[11] N. Farvardin and J. W. Modestino, “Optimum quantizer performance
for a class of non-Gaussian memoryless sources,” IEEE Trans. Inform.
Theory, vol. IT-30, pp. 485–497, May 1984.
[12] A. N. Netravali and R. Saigal, “Optimum quantizer design using a fixedpoint algorithm,” Bell Syst. Tech. J, vol. 55, pp. 1423–1435, Nov. 1976.
[13] J. C. Kieffer, T. M. Jahns, and V. A. Obuljen, “New results on optimal
entropy-constrained quantization,” IEEE Trans. Inform. Theory, vol. 34,
pp. 1250–1258, Sept. 1988.
[14] P. A. Chou, T. Lookabaugh, and R. M. Gray, “Entropy-constrained
vector quantization,” IEEE Trans. Acoust. Speech, Signal Processing,
vol. 37, pp. 31–42, Jan. 1989.
[15] P. A. Chou and B. J. Betts, “When optimal entropy-constrained quantizers have only a finite number of codewords,” in Proc. IEEE Int. Symp.
Information Theory, Cambridge, MA, Aug. 16–21, 1998, p. 97.
427
[16] A. György and T. Linder, “Optimal entropy-constrained scalar quantization of a uniform source,” IEEE Trans. Inform. Theory, vol. 46, pp.
2704–2711, Nov. 2000.
[17] R. B. Ash, Real Analysis and Probability. New York: Academic, 1972.
[18] T. Cover and J. A. Thomas, Elements of Information Theory. New
York: Wiley, 1991.
[19] T. Linder and R. Zamir, “Causal source coding of stationary sources with
high resolution,” in Proc. 2001 IEEE Int. Symp. Information Theory,
Washington, DC, June 2001, p. 28.
[20] F. Jones, Lebesgue Integration on Euclidean Space. London, U.K.:
Jones and Bartlett, 1993.
[21] A. D. Wyner, “An upper bound on entropy series,” Inform. Contr., pp.
176–181, 1972.
[22] R. M. Gray and L. D. Davisson, “Quantizer mismatch,” IEEE Trans.
Commun., vol. COM-23, pp. 439–443, 1975.
[23] H. Gish and J. N. Pierce, “Asymptotically efficient quantizing,” IEEE
Trans. Inform. Theory, vol. IT-14, pp. 676–683, Sept. 1968.
[24] P. Zador, “Asymptotic quantization error of continuous signals and the
quantization dimension,” IEEE Trans. Inform. Theory, vol. IT-28, pp.
139–149, Mar. 1982.
[25] D. G. Luenberger, Linear and Nonlinear Programming, 2nd
ed. Reading, MA: Addison-Wesley, 1984.
[26] S. P. Lloyd, “Least squared quantization in PCM,” IEEE Trans. Inform.
Theory, vol. IT-28, pp. 129–137, Mar. 1982.