Fast gradient descent method for mean-CVaR optimization
Garud Iyengar∗
Alfred Ka Chun Ma†
February 27, 2009
Abstract
We propose an iterative gradient descent procedure for computing approximate solutions for the
scenario-based mean-CVaR portfolio selection problem. This procedure is based on an algorithm proposed
by Nesterov [13] for solving non-smooth convex optimization problems. Our procedure does not require
any linear programming solver and in many cases the iterative steps can be solved in closed form. We
show that this method is significantly superior to the linear programming approach as the number of
scenarios becomes large.
1 Introduction
The goal of portfolio selection is to distribute a fixed amount of capital over a given set of investment
opportunities to maximize “return” while managing the “risk”. Although the benefits of diversifying were
well-known, the first mathematical model for portfolio selection was proposed by Markowitz [10]. In the
Markowitz model, the “return” of a portfolio is given by the expected return of the portfolio and the
“risk” of the portfolio is measured by the variance of the return of the portfolio. The variance is a good
measure of risk only if the returns are symmetric. The returns on equity, at least for short time horizons,
can be approximated by a Normal random variable; consequently, the variance is an adequate measure for
the risk in the portfolio. However, when the distribution of the returns of the underlying assets is not
symmetric, variance is not an adequate risk measure. Recently, Conditional Value-at-Risk (CVaR) [15] has
been proposed as a risk measure for asset classes that have asymmetric return distributions. CVaR has
many nice properties: it is a coherent risk measure [4]; Rockafellar and Uryasev [14] show that the CVaR of
a portfolio can be computed from scenarios by solving a linear program (LP); using LP duality, CVaR upper
bound constraints can be formulated as linear constraints; and empirical studies suggest that the mean-CVaR
approach, where the portfolio return is given by its expected return and the portfolio risk is given by the
CVaR of the portfolio, is more appropriate than the mean-variance approach if the risk-return relation is
nonlinear [1].
∗ Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027. Email: [email protected]
† Department of Industrial Engineering and Operations Research, Columbia University, New York, NY 10027. Email: [email protected]
From the results in Rockafellar and Uryasev [14], it follows that the mean-CVaR portfolio selection
problem reduces to an LP. However, the resulting LP is very ill-conditioned and solving such an LP, particularly
when the number of scenarios is large, is very difficult in practice [2]. We adapt a gradient descent method proposed
by Nesterov [13] to solve the mean-CVaR optimization problem. The method we propose does not require
solving an LP and can therefore potentially handle a very large number of scenarios. In addition,
the method can be easily implemented. These features imply that a portfolio manager can use our method
without installing any third-party LP solvers. We also show how to incorporate analysts’ views into the
mean-CVaR portfolio selection problem [5, 6].
2 Mean-CVaR optimization
Suppose there are $n$ assets in the market. Let $R \in \mathbb{R}^n$ denote the random returns on the $n$ assets. Let
$w \in \mathbb{R}^n$ denote the portfolio of the investor, i.e., $\mathbf{1}^T w = \sum_{i=1}^n w_i = 1$. We write $Rw$ as shorthand for the random portfolio return $R^T w$. The $\mathrm{CVaR}_{1-\beta}(-Rw)$ at the
probability $\beta \in (0, 1)$ of the portfolio $w$ is defined as
$$\mathrm{CVaR}_{1-\beta}(-Rw) = \mathbb{E}_P\big[-Rw \,\big|\, Rw \le F_{Rw}^{-1}(\beta)\big],$$
where $-Rw$ denotes the loss on the portfolio $w$, and $F_{Rw}$ denotes the cumulative distribution function (CDF) of
the random variable $Rw$. Thus, the CVaR is the conditional expectation of the loss given that the portfolio
return lies at or below its $\beta$-quantile.
The mean-CVaR portfolio selection problem we consider is as follows:
$$\min_{w \in W} \ \mathrm{CVaR}_{1-\beta}(-Rw), \tag{1}$$
where the set $W$ is the set of all feasible portfolios $w$. For example, by setting
$$W = \big\{ w : \mathbb{E}_P[R]^T w = r,\ \mathbf{1}^T w = 1 \big\},$$
where $\mathbb{E}_P[R]$ denotes the expected returns on the assets, one recovers the canonical mean-CVaR portfolio
selection problem where the goal is to select the minimum CVaR portfolio that has a target return $r$.
Rockafellar and Uryasev [14, 15] show that
$$\mathrm{CVaR}_{1-\beta}(-Rw) = \min_{\tau} \Big\{ \tau + \frac{1}{\beta}\, \mathbb{E}_P(-Rw - \tau)^+ \Big\}, \tag{2}$$
where $1 - \beta$ is the confidence level and the function $(x)^+ = \max(x, 0)$. It is typically very hard to explicitly
characterize the distribution of the returns $R$, and therefore, in practice, $\mathbb{E}_P(-Rw - \tau)^+$ is approximated by
using return vectors $R$ generated by some scenario generator [8]. Let $\{R_i : i = 1, \dots, N\}$ denote $N$ scenarios
and let $p_i$, $i = 1, \dots, N$, denote the probability of the $i$-th scenario. Then the expectation in (2) can be
approximated as follows:
$$\mathbb{E}_P(-Rw - \tau)^+ \approx \sum_{i=1}^N p_i\, (-R_i^T w - \tau)^+.$$
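As an illustration of the scenario approximation of (2), the following minimal Python sketch evaluates the scenario CVaR of a fixed portfolio by minimizing over $\tau$. It assumes that the scenario portfolio returns $R_i^T w$ have already been computed; the function name and the numbers are hypothetical and are not taken from the experiments in this paper. Since the objective is convex and piecewise linear in $\tau$ with kinks at the scenario losses, it suffices to evaluate it at those kinks.

```python
def scenario_cvar(portfolio_returns, probs, beta):
    """Scenario CVaR via (2): min over tau of tau + (1/beta) * sum_i p_i * (loss_i - tau)^+."""
    losses = [-r for r in portfolio_returns]  # loss_i = -R_i^T w

    def objective(tau):
        excess = sum(p * max(loss - tau, 0.0) for loss, p in zip(losses, probs))
        return tau + excess / beta

    # The objective is convex and piecewise linear in tau with kinks at the
    # scenario losses, so a minimizer can be found among those kinks.
    return min(objective(tau) for tau in losses)


# Hypothetical example: five equally likely scenarios, beta = 0.4.
returns = [0.02, -0.01, 0.03, -0.04, 0.01]
probs = [0.2] * 5
cvar = scenario_cvar(returns, probs, beta=0.4)
print(cvar)
```

In this example the two worst scenarios carry the entire tail mass $\beta = 0.4$, so the value is the average of their losses, $(0.04 + 0.01)/2 = 0.025$, up to floating-point rounding.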
By introducing new variables $a_i \ge (-R_i^T w - \tau)^+$, $i = 1, \dots, N$, the optimization problem (1) can be
reformulated into the linear program (LP)
$$\begin{array}{rl} \min & \tau + \dfrac{1}{\beta} \displaystyle\sum_{i=1}^N p_i a_i \\ \text{s.t.} & a_i \ge -R_i^T w - \tau, \quad i = 1, \dots, N, \\ & Aw = b, \\ & a \ge 0, \end{array} \tag{3}$$
for $W$ of the form $W = \{ w : Aw = b \}$. The LP (3) is large: it has $O(N)$ constraints and is often
very ill-conditioned [2]. Thus, solving the LP (3) becomes very hard as the number of samples $N$ grows.
See Section 4 for further evidence of the numerical instability of the LP formulation.
Our solution method for the optimization problem (1) is based on the following variational
characterization of CVaR [4, 16, 9]:
$$\mathrm{CVaR}_{1-\beta}(-Rw) = \max_{Q \in \mathcal{Q}} \mathbb{E}_Q(-Rw), \tag{4}$$
where $Q$ denotes a probability measure on the returns $R$ and the set of measures
$$\mathcal{Q} = \Big\{ Q : 0 \le \frac{\partial Q}{\partial P} \le \frac{1}{\beta} \Big\}.$$
Thus, the mean-CVaR portfolio selection problem (1) can be formulated as the following min-max problem:
$$\min_{w \in W} \max_{Q \in \mathcal{Q}} \mathbb{E}_Q(-Rw). \tag{5}$$
This formulation can be thought of as a game played between nature and the portfolio manager. It is then
natural to consider iterative methods to solve the mean-CVaR portfolio selection problem.
When the distribution $P$ is approximated by $N$ scenarios, the set of measures $\mathcal{Q}$ is given by
$$\mathcal{Q}_N = \Big\{ q \in \mathbb{R}^N : \mathbf{1}^T q = 1,\ 0 \le q \le \frac{1}{\beta}\, p \Big\}, \tag{6}$$
where $p = (p_1, \dots, p_N)^T$ and the inequalities are interpreted as component-wise inequalities. From now on,
we let $R^T = [R_1, \dots, R_N] \in \mathbb{R}^{n \times N}$ denote the matrix whose $i$-th column is the asset return in the $i$-th
scenario, $i = 1, \dots, N$. Thus, the scenario-based mean-CVaR problem reduces to the saddle-point problem
$$\min_{w \in W} \max_{q \in \mathcal{Q}_N} \big\{ -q^T R w \big\}. \tag{7}$$
3 An iterative algorithm
We solve the minimax problem (7) using a gradient-based procedure proposed by Nesterov [13]. This
procedure requires that the admissible set of portfolios $W$ be bounded. In practice, there is always a margin
requirement on the short positions in the portfolio. Such a margin requirement can be modeled as follows:
$$(1 + M) \sum_i (-w_i)^+ \le \sum_i w_i^+, \tag{8}$$
for some $M > 0$. Since the portfolio weights sum to one, we have
$$1 = \sum_{j=1}^n w_j^+ - \sum_{j=1}^n (-w_j)^+ \ge (1 + M) \sum_{j=1}^n (-w_j)^+ - \sum_{j=1}^n (-w_j)^+ = M \sum_{j=1}^n (-w_j)^+. \tag{9}$$
Therefore, we have
$$\|w\|_1 = \sum_{j=1}^n \big( w_j^+ + (-w_j)^+ \big) = 1 + 2 \sum_{j=1}^n (-w_j)^+ \le 1 + \frac{2}{M}. \tag{10}$$
In order to keep the portfolios $w$ bounded, we will impose constraints of the form $\|w\|_1 \le 1 + 2/M$ or,
since $\|w\|_2 \le \|w\|_1$, of the form $\|w\|_2 \le 1 + 2/M$.
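The bound (10) can be checked numerically on a small example. The sketch below uses hypothetical helper names and a made-up portfolio whose weights sum to one and which satisfies the margin requirement (8) with equality; it is only meant to illustrate the inequality, not to prescribe how the constraint should be imposed.

```python
def l1_norm(w):
    """Sum of absolute portfolio weights."""
    return sum(abs(x) for x in w)


def satisfies_margin(w, M, tol=1e-12):
    """Margin requirement (8): (1 + M) * total short position <= total long position."""
    short = sum(max(-x, 0.0) for x in w)
    long_ = sum(max(x, 0.0) for x in w)
    return (1.0 + M) * short <= long_ + tol


# Hypothetical portfolio: weights sum to one, total short 2.0, total long 3.0.
M = 0.5
w = [2.0, 1.0, -1.5, -0.5]
margin_ok = satisfies_margin(w, M)
bound = 1.0 + 2.0 / M
print(margin_ok, l1_norm(w), bound)
```

For this portfolio the margin requirement is tight, and so is the bound: $\|w\|_1 = 5 = 1 + 2/M$.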
A naive approach to solving the modified minimax problem (7) would involve generating iterates $\{(w^{(k)}, q^{(k)})\}$,
where $w^{(k)}$ is the best response to nature's move $q^{(k-1)}$, i.e.,
$$w^{(k)} = \operatorname*{argmin}_{w \in W} \big\{ -(q^{(k-1)})^T R w \big\},$$
and $q^{(k)}$ is the best response to the investor's move $w^{(k-1)}$, i.e.,
$$q^{(k)} = \operatorname*{argmax}_{q \in \mathcal{Q}_N} \big\{ -q^T R w^{(k-1)} \big\}.$$
The function $\max_{q \in \mathcal{Q}_N} \{ -q^T R w \}$ is not smooth in $w$; consequently, this iterative scheme converges very slowly.
Nesterov [13] devised a procedure that is able to escape this convergence bottleneck.
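For a fixed portfolio, nature's unsmoothed best response over $\mathcal{Q}_N$ in (6) is a linear program with a simple greedy solution: place as much weight as the cap $p_i/\beta$ allows on the scenarios with the largest losses. The sketch below is a hypothetical helper built on that observation; it takes precomputed scenario portfolio returns $R_i^T w$ as input, and its optimal value is the scenario CVaR of the portfolio.

```python
def nature_best_response(portfolio_returns, probs, beta):
    """Maximize sum_i q_i * (-R_i^T w) over {q : sum(q) = 1, 0 <= q_i <= p_i / beta}."""
    losses = [-r for r in portfolio_returns]
    q = [0.0] * len(losses)
    remaining = 1.0
    # Visit scenarios from largest to smallest loss, saturating each cap in turn.
    for i in sorted(range(len(losses)), key=lambda j: losses[j], reverse=True):
        q[i] = min(probs[i] / beta, remaining)
        remaining -= q[i]
        if remaining <= 0.0:
            break
    value = sum(qi * loss for qi, loss in zip(q, losses))
    return q, value


# Hypothetical example: five equally likely scenarios, beta = 0.4.
q_star, worst_case = nature_best_response([0.02, -0.01, 0.03, -0.04, 0.01], [0.2] * 5, 0.4)
print(q_star, worst_case)
```

Here the weight concentrates on the two scenarios with the lowest returns, which illustrates why the unsmoothed response changes abruptly as the ordering of scenario losses changes.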
The Nesterov procedure consists of two steps. The first step is “smoothing” the optimization in $q$. Let
$w^{(k)}$ denote the $k$-th iterate. Then the smoothed best response of nature is given by
$$q^{(k)} = \operatorname*{argmax}_{q \in \mathcal{Q}_N} \big\{ -q^T R w^{(k)} - \mu\, d_2(q) \big\}, \tag{11}$$
where $\mu > 0$ and $d_2(q)$ is any strongly convex function. We choose
$$d_2(q) = \sum_{i=1}^N \Big[ q_i \log q_i + (p_i/\beta - q_i) \log(p_i/\beta - q_i) \Big]. \tag{12}$$
In Appendix A, we show that $d_2(q)$ is strongly convex with parameter $\sigma_2 = \frac{1}{1-\beta}$
with respect to the $\ell_1$-norm.
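The Lagrangian function $L$ for the optimization in $q$ is given by
$$L(q) = -q^T R w^{(k)} - \mu\, d_2(q) - \alpha (\mathbf{1}^T q - 1) + \sum_{i=1}^N \mu_i q_i - \sum_{i=1}^N \nu_i (q_i - p_i/\beta).$$
Setting $\nabla_q L = 0$, we have that $q^{(k)}$ must satisfy
$$-R_i^T w^{(k)} - \mu \ln\!\Big( \frac{q_i^{(k)}}{p_i/\beta - q_i^{(k)}} \Big) - \alpha + \mu_i - \nu_i = 0,$$
i.e.,
$$\frac{q_i^{(k)}}{p_i/\beta - q_i^{(k)}} = \exp\!\Big( \frac{-R_i^T w^{(k)} - \alpha + \mu_i - \nu_i}{\mu} \Big).$$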
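Thus, it follows that for all values of the multipliers $(\alpha, \mu_i, \nu_i)$, we have that $0 < q^{(k)} < p/\beta$. Therefore, complementary
slackness implies that $\mu_i = \nu_i = 0$ for all $i$, and
$$q_i^{(k)} = \frac{\beta^{-1} p_i}{1 + \exp\!\big( \frac{1}{\mu} (R_i^T w^{(k)} + \alpha) \big)}, \quad i = 1, \dots, N, \tag{13}$$
where $\alpha$ is the solution of the equation
$$\sum_i \frac{p_i \beta^{-1}}{1 + \exp\!\big( \frac{1}{\mu} (R_i^T w^{(k)} + \alpha) \big)} = 1. \tag{14}$$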
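The left-hand side of (14) is strictly decreasing in $\alpha$, tends to $1/\beta > 1$ as $\alpha \to -\infty$ and to $0$ as $\alpha \to \infty$, so $\alpha$ can be found by a one-dimensional root search. The following sketch uses plain bisection; this is one possible choice, not necessarily the one used in our implementation, and the function names and example values are hypothetical. The input is the vector of scenario portfolio returns $R_i^T w^{(k)}$.

```python
import math


def inv_one_plus_exp(x):
    """Numerically stable evaluation of 1 / (1 + exp(x))."""
    if x >= 0.0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def smoothed_best_response(portfolio_returns, probs, beta, mu, iterations=200):
    """Evaluate (13), with alpha chosen by bisection so that (14) holds."""
    def q_of(alpha):
        return [p / beta * inv_one_plus_exp((r + alpha) / mu)
                for r, p in zip(portfolio_returns, probs)]

    # Bracket the root: the total mass is decreasing in alpha.
    lo = -(max(abs(r) for r in portfolio_returns) + 1.0)
    hi = -lo
    while sum(q_of(lo)) < 1.0:
        lo *= 2.0
    while sum(q_of(hi)) > 1.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(q_of(mid)) > 1.0:
            lo = mid
        else:
            hi = mid
    return q_of(0.5 * (lo + hi))


# Hypothetical example: five equally likely scenarios, beta = 0.4, mu = 0.01.
q_smooth = smoothed_best_response([0.02, -0.01, 0.03, -0.04, 0.01], [0.2] * 5, 0.4, 0.01)
print(q_smooth)
```

The resulting weights lie strictly inside the box $0 < q_i < p_i/\beta$ and are largest on the scenarios with the lowest portfolio returns; as $\mu$ decreases they approach the unsmoothed best response.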
The second step in the Nesterov procedure is to compute the update $w^{(k)}$ using a convex combination of
two updates $z^{(k)}$ and $y^{(k)}$ defined as follows:
$$y^{(k)} = \operatorname*{argmin}_{y \in W} \Big\{ -(q^{(k-1)})^T R y + \frac{\Omega}{2 \mu \sigma_2} \|y - w^{(k-1)}\|_2^2 \Big\}, \tag{15}$$
$$z^{(k)} = \operatorname*{argmin}_{z \in W} \Big\{ -\sum_{t=0}^{k-1} \frac{t+1}{2}\, (q^{(t)})^T R z + \frac{\Omega}{2 \mu \sigma_2} \|z\|_2^2 \Big\}, \tag{16}$$
$$w^{(k)} = \Big( \frac{2}{k+3} \Big) z^{(k)} + \Big( \frac{k+1}{k+3} \Big) y^{(k)}, \tag{17}$$
where
$$\Omega = \max_{\|q\|_1 \le 1}\ \max_{\|w\|_2 \le 1} \big( q^T R w \big)^2 = \max_i \|R_i\|_2^2$$
and $\sigma_2$ is the convexity parameter for the strongly convex function $d_2(q)$. The iterate $y^{(k)}$ is a modified
best response where one penalizes large movements from the last response $w^{(k-1)}$. The iterate $z^{(k)}$ in (16)
considers all the previous responses $\{ q^{(t)} : t = 0, \dots, k-1 \}$ to compute the response. The weight on $y^{(k)}$
increases as the iteration count $k$ increases.
When the set $W$ is described by linear equalities, i.e., $W = \{ w : Aw = b \}$, we add the additional
constraint $\|w\|_2 \le 1 + 2/M$, and in this case it is easy to show that (15) and (16) can be solved in closed
form. When the set $W$ is described by linear inequality constraints, we impose the constraint $\|w\|_1 \le 1 + 2/M$.
Then (15) and (16) are quadratic programs that can, in practice, be solved very efficiently using active set
methods. Note that each quadratic problem encountered in the course of our proposed iterative procedure
has $n$ variables and $O(m)$ constraints, where $m$ denotes the number of components in $b$.
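To indicate why the equality-constrained case admits a closed form, the sketch below solves a simplified version of (15) for $W = \{ w : \mathbb{E}_P[R]^T w = r,\ \mathbf{1}^T w = 1 \}$. As a loudly stated assumption, it omits the additional norm constraint $\|w\|_2 \le 1 + 2/M$, so it is only the affine part of the update: the minimizer is the orthogonal projection of the unconstrained minimizer onto the affine set. All names are hypothetical; `g` stands for $R^T q^{(k-1)}$ and `L` for $\Omega / (\mu \sigma_2)$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def equality_constrained_step(g, w_prev, L, mean_returns, target):
    """Minimize -g^T y + (L/2) * ||y - w_prev||^2 s.t. mean_returns^T y = target, 1^T y = 1.

    Assumption: the norm-ball constraint on y is ignored in this sketch.
    """
    ones = [1.0] * len(w_prev)
    # Unconstrained minimizer of the quadratic objective.
    y0 = [w + gi / L for w, gi in zip(w_prev, g)]
    # Constraint residuals at y0.
    r1 = dot(mean_returns, y0) - target
    r2 = dot(ones, y0) - 1.0
    # Solve the 2x2 system (A A^T) lam = residual by Cramer's rule.
    g11, g12, g22 = dot(mean_returns, mean_returns), dot(mean_returns, ones), dot(ones, ones)
    det = g11 * g22 - g12 * g12  # nonzero unless all mean returns are equal
    lam1 = (g22 * r1 - g12 * r2) / det
    lam2 = (g11 * r2 - g12 * r1) / det
    # Orthogonal projection of y0 onto the affine set {y : A y = b}.
    return [y - lam1 * a - lam2 for y, a in zip(y0, mean_returns)]


# Hypothetical three-asset example.
mean = [0.01, 0.02, 0.03]
y_new = equality_constrained_step([0.1, -0.2, 0.3], [1.0 / 3] * 3, 10.0, mean, 0.02)
print(y_new)
```

The same projection applies to (16) with `w_prev` set to zero and `g` replaced by the weighted sum of past gradients.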
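Nesterov [13] proves that after $K$ steps the output $(\bar{w}, \bar{q})$ of the algorithm displayed in Figure 1 satisfies
$$\max_{q \in \mathcal{Q}_N} \big\{ -q^T R \bar{w} \big\} - \min_{w \in W} \big\{ -\bar{q}^T R w \big\} < \delta_N \triangleq \Big( \frac{D_1 D_2 \Omega}{\sigma_2} \Big)^{1/2} \cdot \frac{1}{K}, \tag{18}$$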
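Nesterov Procedure
  $D_1 \leftarrow \frac{1}{2} \big( 1 + \frac{2}{M} \big)^2$,  $D_2 \leftarrow \frac{1}{\beta} \big( -\beta \ln(\beta) - (1 - \beta) \ln(1 - \beta) \big)$,
  $\Omega \leftarrow \max_i \|R_i\|_2^2$,  $\sigma_2 \leftarrow \frac{1}{1-\beta}$,
  $K \leftarrow \frac{1}{\varepsilon} \sqrt{\Omega D_1 D_2 / \sigma_2}$,  $\mu \leftarrow \frac{\varepsilon}{2 D_2}$,
  $w^{(0)} \leftarrow \frac{1}{n} \mathbf{1}$
  for $k \leftarrow 0$ to $K$ do
    $q^{(k)} \leftarrow \operatorname{argmax}_{q \in \mathcal{Q}_N} \big\{ -q^T R w^{(k)} - \mu\, d_2(q) \big\}$
    $y^{(k+1)} \leftarrow \operatorname{argmin}_{y \in W} \big\{ -(q^{(k)})^T R y + \frac{\Omega}{2 \mu \sigma_2} \|y - w^{(k)}\|_2^2 \big\}$
    $z^{(k+1)} \leftarrow \operatorname{argmin}_{z \in W} \big\{ -\sum_{t=0}^{k} \frac{t+1}{2} (q^{(t)})^T R z + \frac{\Omega}{2 \mu \sigma_2} \|z\|_2^2 \big\}$
    $w^{(k+1)} \leftarrow \big( \frac{2}{k+3} \big) z^{(k+1)} + \big( \frac{k+1}{k+3} \big) y^{(k+1)}$
  return $\bar{w} = y^{(K)}$,  $\bar{q} = \sum_{k=0}^{K} \frac{2(k+1)}{(K+1)(K+2)}\, q^{(k)}$.
Figure 1: Nesterov Procedure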
i.e., after $K$ iterations the algorithm produces a pair $(\bar{w}, \bar{q})$ of $\delta_N$-optimal policies for the investor and
nature. One can, therefore, terminate the algorithm once one is satisfied with the quality of the portfolio.
Moreover, choosing $K \ge \big( \frac{D_1 D_2 \Omega}{\sigma_2} \big)^{1/2} \cdot \frac{1}{\varepsilon}$ ensures that the output of the algorithm is $\varepsilon$-optimal. In our
numerical calculations we found that using the gap in (18) terminates the algorithm much quicker than using
the upper bound. The main features of this algorithm are as follows.
(a) The modified best-response y(k) and z(k) of the investor are computed by solving a separable quadratic
optimization problem that is similar to the mean-variance portfolio selection with uncorrelated assets.
This implies that the technology for mean-variance optimization can be used to solve the mean-CVaR
problem.
(b) The iterates (w̄, q̄) are at least δN -optimal, and often, the quality of the solution is significantly superior
to that implied by the bound. Thus, one can terminate the algorithm at any stage where one obtains a
solution of sufficient quality.
(c) In Section 4, we show that this algorithm converges to a reasonably accurate solution with the error
$\varepsilon = 10^{-3}$ very quickly even when the number of scenarios $N = 10^6$. Since the scenario-based mean-CVaR
problem is itself an approximation to the original problem, solving the scenario-based CVaR problem very
accurately does not serve any purpose.
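The parameter choices in the first lines of Figure 1 are straightforward to compute. The sketch below is a hypothetical helper that mirrors those formulas; rounding $K$ up to an integer is our assumption here, and the scenario data in the example are made up.

```python
import math


def nesterov_parameters(scenarios, beta, M, eps):
    """Constants of the procedure in Figure 1 for scenario rows R_i, tail level beta, margin M, accuracy eps."""
    D1 = 0.5 * (1.0 + 2.0 / M) ** 2
    D2 = -(beta * math.log(beta) + (1.0 - beta) * math.log(1.0 - beta)) / beta
    omega = max(sum(x * x for x in row) for row in scenarios)  # max_i ||R_i||_2^2
    sigma2 = 1.0 / (1.0 - beta)
    K = math.ceil(math.sqrt(omega * D1 * D2 / sigma2) / eps)   # assumption: round up
    mu = eps / (2.0 * D2)
    return {"D1": D1, "D2": D2, "omega": omega, "sigma2": sigma2, "K": K, "mu": mu}


# Hypothetical example: two scenarios of two asset returns.
params = nesterov_parameters([[0.01, -0.02], [0.03, 0.04]], beta=0.5, M=1.0, eps=1e-3)
print(params)
```

Since $K$ scales with $\sqrt{\Omega}/\varepsilon$, scenario returns of small magnitude lead to a small iteration bound for moderate $\varepsilon$, while the bound grows like $1/\varepsilon$ as the accuracy requirement is tightened.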
4 Numerical results
We tested our procedure on the example in [12]. Our asset universe consisted of Treasury bonds with 2,
5, 10, and 30 years to maturity. As in the example in [12], we approximated the returns on the assets using a
Delta-Gamma approximation, with the yields on bonds with 6 months, 2 years, 5 years, 10 years, 20 years,
and 30 years to maturity as the risk factors. We simulated N scenarios for the risk factors and then used
the Delta-Gamma approximation to compute N return scenarios. We refer the reader to [11] for a detailed
discussion of the simulation procedure.
In Table 1, we display the optimal solution to the LP formulation (3) of the mean-CVaR problem
with $\beta = 0.05$ and $N = 10^6$. We use MOSEK [3] to solve these LPs. Table 2 shows the optimal portfolio
computed by our proposed algorithm with the error tolerance $\varepsilon = 10^{-3}$. The portfolios produced by our
algorithm and the LP formulation (3) are quite different, although the CVaR values are close. These
results only imply that the LP approach and our proposed iterative approach are consistent, i.e., both
approaches are able to solve the mean-CVaR problem; these results are not able to differentiate between the
two approaches.
The most important results of this section are in Tables 3 and 4. In Table 3 we display the CPU time
for solving the LP formulation using ILOG CPLEX [7] and MOSEK, and the CPU time for computing an
ε = 0.001 optimal solution using our algorithm as a function of the number of scenarios N . It is clear that the
industry-leading LP solver CPLEX performs very poorly on this problem. MOSEK performs much better but
the run times for this commercial solver are an order of magnitude higher than those of our MATLAB-based
code. Table 4 displays the run times and the number of iterations required by our algorithm as a function
of the accuracy ε. The performance of our algorithm degrades very quickly as ε decreases. Therefore, this
algorithm is only suited for applications where one wants to compute a reasonably accurate solution very
quickly. An example of such an application is high-frequency trading. The data in high-frequency trading
is typically very noisy; therefore, it is pointless to compute a very accurate solution. Note that the LP
approach does not allow any flexibility in setting the accuracy level.
Next, we show how to use analysts’ “views” to bias the sample probability mass function $p$. We restrict
ourselves to “views” of the form
$$\nu^T R \sim g,$$
where $\nu \in \mathbb{R}^n$ is a vector that determines the particular linear combination of the return vector $R$, and $g$ is
a probability density on $\mathbb{R}$. We convert this view on the distribution of the random return $R$ to a view on the
distribution of the $N$ sample returns $R_i$, $i = 1, \dots, N$, by defining a view probability vector
$$\bar{p}_i = \frac{g(\nu^T R_i)}{\sum_{k=1}^N g(\nu^T R_k)}, \quad i = 1, \dots, N.$$
Suppose we have $m$ different “views”, i.e., there are $m$ different view probability vectors $\bar{p}^{(j)}$, $j = 1, \dots, m$.
We combine these vectors into a single sample probability vector $p$ as follows:
$$p = \sum_{j=1}^m u_j\, \bar{p}^{(j)} + u_0\, p^{(0)}, \tag{19}$$
where $p^{(0)} = \frac{1}{N} \mathbf{1}$ denotes the empirical measure, and $u_j$ denotes the confidence weight on view $\bar{p}^{(j)}$. Since
$p$ is a probability vector, we require that $\sum_{j=0}^m u_j = 1$. Next, we solve the mean-CVaR problem with
scenario probability vector $p$. Our algorithm also works with other techniques for combining views; see, for
example, [5, 6, 12].
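The construction of the view probability vectors and their combination in (19) can be sketched as follows. The helper names, the two-asset scenarios, the view vector and the uniform density are all hypothetical and serve only to illustrate the reweighting; they are not the views used in our experiments.

```python
def view_probabilities(scenarios, nu, density):
    """View probability vector: g(nu^T R_i) normalized over the N scenarios."""
    weights = [density(sum(v * x for v, x in zip(nu, R))) for R in scenarios]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("no scenario lies in the support of the view density")
    return [wt / total for wt in weights]


def combine_views(view_probs, confidences, u0):
    """Equation (19): p = sum_j u_j * pbar^(j) + u0 * p^(0), with p^(0) uniform."""
    N = len(view_probs[0])
    p = [u0 / N] * N
    for u, pbar in zip(confidences, view_probs):
        p = [pi + u * qi for pi, qi in zip(p, pbar)]
    return p


def uniform_density(a, b):
    """Density of unif[a, b] as a function of x."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0


# Hypothetical example: four scenarios of two asset returns and a single view.
scenarios = [[0.01, 0.02], [0.03, -0.01], [-0.02, 0.00], [0.00, 0.01]]
pbar = view_probabilities(scenarios, nu=[-1.0, 1.0], density=uniform_density(0.0, 0.015))
p = combine_views([pbar], confidences=[0.1], u0=0.9)
print(pbar, p)
```

In this example only the first and last scenarios are compatible with the view, so their probabilities are raised above the empirical value $1/N$ while the others are lowered; the combined vector still sums to one because the confidence weights do.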
For our numerical experiments, we set m = 2. The two views were chosen to be
ν
(1)
ν (2)
/
= 0
/
= 0
−1
−0.5
0 0
1
1
−0.5
0T
0
0
g (1) = unif[0, 0.001],
,
0
0T
,
g (2) = unif[0, 0.0005].
The weight vector was set such that u0 = 0.9, u1 = u2 = 0.05, i.e., we assumed that we had 90% confidence
in the empirical distribution and 5% confidence in each of the two views.
Table 5 shows the optimal portfolio computed using the LP formulation (3). As in the previous case, the
LP was solved using MOSEK. Table 6 shows the results computed using our algorithm with ε = 0.001.
5 Conclusion
In this paper, we propose an efficient algorithm for solving the mean-CVaR portfolio selection problem without
using an LP solver. As shown in the numerical experiments, the algorithm is a useful alternative to the LP
approach when one wants a very fast solver that guarantees an accuracy of $\varepsilon \approx 10^{-3}$. This
technique can also be extended to solve many other types of portfolio selection problems.
bond / target return (r)     0.0020     0.0035     0.0045     0.0050
2y                           0.1856    −0.7919    −1.4413    −1.7684
5y                           0.9591     2.1602     2.9548     3.3541
10y                         −0.1857    −0.4601    −0.6379    −0.7244
30y                          0.0409     0.0917     0.1244     0.1387
CVaR                         0.00994    0.01597    0.02003    0.02206
Table 1: Optimal portfolio and CVaR for the mean-CVaR problem solved by the LP approach.
bond / target return (r)     0.0020     0.0035     0.0045     0.0050
2y                           0.4739     0.2455     0.0932     0.0171
5y                           0.3282     0.2484     0.1953     0.1687
10y                          0.1888     0.2512     0.2928     0.3136
30y                          0.0090     0.2548     0.4187     0.5006
CVaR                         0.0101     0.0171     0.0218     0.0244
Error                        0.0001     0.0011     0.0017     0.0024
Table 2: Optimal portfolio and CVaR for the mean-CVaR problem solved by our algorithm, and the absolute
error compared with the LP approach.
References
[1] V. Agarwal and N.Y. Naik. Risks and portfolio decisions involving hedge funds. Review of Financial
Studies, 17(1):63–98, Spring 2004.
[2] S. Alexander, T.F. Coleman, and Y. Li. Minimizing CVaR and VaR for a portfolio of derivatives.
Journal of Banking and Finance, 30(2):583–605, February 2006.
[3] E. D. Andersen and K. D. Andersen. The MOSEK optimization toolbox for MATLAB manual Version
4.0. http://www.mosek.com/products/4 0/tools/help/index.html, 2006.
[4] P. Artzner, F. Delbaen, J.M. Eber, and D. Heath. Coherent measures of risk. Mathematical Finance,
9(3):203–228, July 1999.
[5] F. Black and R. Litterman. Asset allocation: combining investor views with market equilibrium. Goldman Sachs Fixed Income Research, 1990.
[6] F. Black and R. Litterman. Asset allocation: combining investor views with market expectations.
Journal of Fixed Income, 1(1):7–18, September 1991.
[7] ILOG. ILOG CPLEX 11.1. http://www.ilog.com/products/cplex/, 2008.
[8] Y.K. Koskosidis and A.M. Duarte Jr. A scenario-based approach to active asset allocation. Journal of
Portfolio Management, 23:74–85, Winter 1997.
N          CPLEX       MOSEK    Our algorithm (Iterations)
10000      1.42        0.76     0.34 (1)
50000      39.25       3.00     0.56 (1)
100000     155.71      4.41     1.06 (1)
500000     6633.09     25.08    2.76 (1)
1000000    44439.79    50.62    5.54 (1)
Table 3: CPU time in seconds for the LP solvers and for our algorithm, and the number of iterations required by our algorithm.
ε         η        CPU time    Iterations    CVaR      Error
0.001     0.01     4.06        1             0.0244    0.0023
0.0005    0.005    4.77        1             0.0244    0.0023
0.0002    0.002    3331.4      703           0.0240    0.0019
0.0001    0.001    9457.8      2192          0.0230    0.00094
Table 4: CPU time and iteration counts for our algorithm.
[9] H. Lüthi and J. Doege. Convex risk measures for portfolio optimization and concepts of flexibility.
Mathematical Programming, 104(2):541–559, November 2005.
[10] H.M. Markowitz. Portfolio selection. Journal of Finance, 7(1):77–91, March 1952.
[11] A. Meucci. Risk and asset allocation. Springer, 2005.
[12] A. Meucci. Beyond Black-Litterman: Views on non-normal markets. Risk Magazine, 19:87–92, 2006.
[13] Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127–
152, May 2005.
[14] R.T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of Risk, 2(3):21–41,
2000.
[15] R.T. Rockafellar and S. Uryasev. Conditional value-at-risk for general loss distributions. Journal
of Banking and Finance, 26(7):1443–1471, July 2002.
[16] R.T. Rockafellar, S. Uryasev, and M. Zabarankin. Deviation measures in risk analysis and optimization.
Technical report, Department of Industrial and System Engineering, University of Florida, 2002.
Appendix A Details of the parameters in the Nesterov algorithm
The Hessian $\nabla^2 d_2(q)$ of the smoothing function $d_2(q) = \sum_{i=1}^N \big[ q_i \log q_i + (\beta^{-1} p_i - q_i) \log(\beta^{-1} p_i - q_i) \big]$ is given
by
$$\nabla^2 d_2(q) = \mathrm{diag}\big( [q_1^{-1}, \dots, q_N^{-1}] \big) + \mathrm{diag}\big( [(\beta^{-1} p_1 - q_1)^{-1}, \dots, (\beta^{-1} p_N - q_N)^{-1}] \big).$$
r               0.0020                 0.0035                 0.0045                 0.0050
Bond        New      Change        New      Change        New      Change        New      Change
2y        0.0328    −0.1528     −1.0998    −0.3079     −1.8520    −0.4107     −2.2297    −0.4613
5y        1.1395     0.1804      2.5328     0.3726      3.4553     0.5005      3.9186     0.5645
10y      −0.1923    −0.0066     −0.4862    −0.0261     −0.6785    −0.0406     −0.7745    −0.0501
30y       0.0199    −0.021       0.0533    −0.0384      0.0753    −0.0491      0.0855    −0.0532
CVaR      0.0110     0.00106     0.0180     0.00203     0.0228     0.00277     0.0252     0.00314
Table 5: Optimal portfolio and CVaR for the mean-CVaR problem with weights on views
u0 = 0.9, u1 = u2 = 0.05, solved by the LP approach. “Change” is the difference from the corresponding entry in Table 1.
r               0.0020                 0.0035                 0.0045                 0.0050
Bond        New      Change        New      Change        New      Change        New      Change
2y        0.4457    −0.0282      0.1860    −0.0595      0.0128    −0.0804     −0.0737    −0.0901
5y        0.3175    −0.0107      0.2280    −0.0204      0.1683    −0.027       0.1384    −0.0303
10y       0.1960     0.0072      0.2677     0.0162      0.3155     0.0227      0.3394     0.0258
30y       0.0408     0.0318      0.3184     0.0636      0.5034     0.0847      0.5959     0.0953
CVaR      0.0113     0.0012      0.0196     0.0025      0.0254     0.0036      0.0283     0.0039
Error     0.0003                 0.0016                 0.0026                 0.0031
Table 6: Optimal portfolio and CVaR for the mean-CVaR problem with weights on views
u0 = 0.9, u1 = u2 = 0.05, solved by our algorithm. “Change” is the difference from the corresponding entry in Table 2.
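Therefore,
$$\begin{aligned}
h^T \nabla^2 d_2(q)\, h &= \sum_{i=1}^N \frac{h_i^2}{q_i} + \sum_{i=1}^N \frac{h_i^2}{\beta^{-1} p_i - q_i} \\
&= \Big( \sum_{i=1}^N \frac{h_i^2}{q_i} \Big) \cdot \Big( \frac{\sum_{i=1}^N q_i}{1} \Big) + \Big( \sum_{i=1}^N \frac{h_i^2}{\beta^{-1} p_i - q_i} \Big) \cdot \Big( \frac{\sum_{i=1}^N (\beta^{-1} p_i - q_i)}{\beta^{-1} - 1} \Big) && (20) \\
&\ge \Big( \sum_{i=1}^N \frac{|h_i|}{\sqrt{q_i}} \cdot \sqrt{q_i} \Big)^2 + \frac{1}{\beta^{-1} - 1} \Big( \sum_{i=1}^N \frac{|h_i|}{\sqrt{\beta^{-1} p_i - q_i}} \cdot \sqrt{\beta^{-1} p_i - q_i} \Big)^2 && (21) \\
&= \frac{1}{1 - \beta}\, \|h\|_1^2,
\end{aligned}$$
where (20) follows from the fact that $\sum_i q_i = 1$ and $\sum_i (\beta^{-1} p_i - q_i) = \beta^{-1} - 1$, and (21) follows from the
Cauchy-Schwarz inequality.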
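The strong convexity bound can be checked numerically. The sketch below evaluates the quadratic form of the Hessian at a feasible interior point and compares it with $\|h\|_1^2 / (1 - \beta)$ over random directions $h$; the point $q$, the probabilities $p$ and the helper names are hypothetical choices made for illustration.

```python
import random


def hessian_form(q, p, beta, h):
    """h^T Hessian(d_2)(q) h for the diagonal Hessian given above."""
    return sum(hi * hi / qi + hi * hi / (pi / beta - qi) for qi, pi, hi in zip(q, p, h))


def min_ratio(q, p, beta, trials=1000, seed=0):
    """Smallest observed value of h^T H h / ||h||_1^2 over random directions h."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(trials):
        h = [rng.uniform(-1.0, 1.0) for _ in q]
        l1 = sum(abs(x) for x in h)
        best = min(best, hessian_form(q, p, beta, h) / (l1 * l1))
    return best


# Hypothetical example: q sums to one and satisfies 0 < q_i < p_i / beta.
beta = 0.3
p = [0.1, 0.2, 0.3, 0.4]
q = [0.15, 0.15, 0.35, 0.35]
ratio = min_ratio(q, p, beta)
print(ratio, 1.0 / (1.0 - beta))
```

The bound is attained at $q = p$ in the direction $h = p$, where the quadratic form equals $1 + \beta/(1-\beta) = 1/(1-\beta)$ and $\|h\|_1 = 1$.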
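By setting $w^{(k)} = 0$ in (11), it follows that $q^{\min} = \operatorname{argmin}_{q \in \mathcal{Q}_N} \{ d_2(q) \}$ satisfies
$$q_i^{\min} = \frac{\beta^{-1} p_i}{1 + e^{\alpha/\mu}}, \quad i = 1, \dots, N,$$
where $\alpha$ is chosen to ensure that $\mathbf{1}^T q^{\min} = 1$. Therefore, it follows that $q^{\min} = p$, and
$$\min_{q \in \mathcal{Q}_N} d_2(q) = \sum_i p_i \log p_i + \sum_i p_i (\beta^{-1} - 1) \big( \log p_i + \log(\beta^{-1} - 1) \big).$$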
Since $d_2(q)$ is a convex function, $\max_{q \in \mathcal{Q}_N} d_2(q)$ occurs at extreme points of the polytope $\mathcal{Q}_N$. The extreme
points of the polytope $\mathcal{Q}_N$ are of the form
$$q_i = \begin{cases} \beta^{-1} p_i, & i \in \{ \pi(1), \dots, \pi(k) \}, \\ 0, & i \in \{ \pi(k+2), \dots, \pi(N) \}, \end{cases}$$
where $\pi$ is a permutation of the set $\{1, \dots, N\}$ and $q_{\pi(k+1)} \in [0, \beta^{-1} p_{\pi(k+1)}]$ is chosen to ensure that
$\sum_{i=1}^N q_i = 1$. The value
$$\begin{aligned}
d_2(q) &= \sum_{i \ne k+1} \beta^{-1} p_{\pi(i)} \ln(\beta^{-1} p_{\pi(i)}) + q_{\pi(k+1)} \ln(q_{\pi(k+1)}) + (\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \ln(\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \\
&\le \sum_i \beta^{-1} p_{\pi(i)} \ln(\beta^{-1} p_{\pi(i)}),
\end{aligned}$$
where the last inequality follows from
$$q_{\pi(k+1)} \ln(q_{\pi(k+1)}) + (\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \ln(\beta^{-1} p_{\pi(k+1)} - q_{\pi(k+1)}) \le (\beta^{-1} p_{\pi(k+1)}) \ln(\beta^{-1} p_{\pi(k+1)}).$$
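Thus,
$$\begin{aligned}
D_2 &= \max_{q \in \mathcal{Q}_N} d_2(q) - \min_{q \in \mathcal{Q}_N} d_2(q) \\
&\le \sum_i \beta^{-1} p_i \log(\beta^{-1} p_i) - \sum_i p_i \log p_i - \sum_i p_i (\beta^{-1} - 1) \big( \log p_i + \log(\beta^{-1} - 1) \big) \\
&= -\beta^{-1} \big( \beta \log \beta + (1 - \beta) \log(1 - \beta) \big).
\end{aligned} \tag{22}$$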
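The bound (22) can be illustrated on a small example. In the sketch below, the helper names and the uniform four-scenario example are hypothetical, and the convention $0 \log 0 = 0$ is used at the boundary of $\mathcal{Q}_N$. The example is chosen so that an extreme point of $\mathcal{Q}_N$ has no fractional coordinate, in which case the bound is attained.

```python
import math


def xlogx(x):
    """x * log(x) with the convention 0 * log(0) = 0."""
    return x * math.log(x) if x > 0.0 else 0.0


def d2(q, p, beta):
    """Smoothing function d_2 from (12)."""
    return sum(xlogx(qi) + xlogx(pi / beta - qi) for qi, pi in zip(q, p))


def d2_range_bound(beta):
    """Right-hand side of (22)."""
    return -(beta * math.log(beta) + (1.0 - beta) * math.log(1.0 - beta)) / beta


# Hypothetical example: four equally likely scenarios, beta = 0.5.
beta = 0.5
p = [0.25] * 4
q_extreme = [0.5, 0.5, 0.0, 0.0]    # an extreme point of Q_N
q_interior = [0.3, 0.3, 0.2, 0.2]   # a feasible non-extreme point
gap_extreme = d2(q_extreme, p, beta) - d2(p, p, beta)
gap_interior = d2(q_interior, p, beta) - d2(p, p, beta)
print(gap_extreme, gap_interior, d2_range_bound(beta))
```

Here the gap at the extreme point equals $2 \ln 2$, which coincides with the right-hand side of (22) for $\beta = 0.5$, while the interior point gives a much smaller gap.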