Newsvendor Problems with Sequentially Revealed Demand Information
Jing-Sheng Song (1,2) and Paul H. Zipkin (2)
(1) Department of Management Science, School of Management, Fudan University, Shanghai 200433, China
(2) The Fuqua School of Business, Duke University, Durham, North Carolina 27708
Received 13 August 2012; revised 29 August 2012; accepted 30 August 2012
DOI 10.1002/nav.21509
Published online 8 October 2012 in Wiley Online Library (wileyonlinelibrary.com).
Abstract: This article analyzes a capacity/inventory planning problem with a one-time uncertain demand. There is a long procurement leadtime, but as some partial demand information is revealed, the firm is allowed to cancel some of the original capacity
reservation at a certain fee or sell off some inventory at a lower price. The problem can be viewed as a generalization of the classic
newsvendor problem and can be found in many applications. One key observation of the analysis is that the dynamic programming
formulation of the problem is closely related to a recursion that arises in the study of a far more complex system, a series inventory
system with stochastic demand over an infinite horizon. Using this equivalence, we characterize the optimal policy and assess the
value of the additional demand information. We also extend the analysis to a richer model of information. Here, demand is driven by
an underlying Markov process, representing economic conditions, weather, market competition, and other environmental factors.
Interestingly, under this more general model, the connection to the series inventory system is different. © 2012 Wiley Periodicals,
Inc. Naval Research Logistics 59: 601–612, 2012
Keywords: newsvendor problem; partially revealed demand information; dynamic program; series inventory system; Markov
modulated demand
1. INTRODUCTION
This article is motivated by the following capacity/inventory planning problem for a seasonal product with
a long procurement leadtime. Consider a new toy or a new
fashion style whose demand is highly uncertain. The item is
manufactured overseas and requires a long production and
distribution leadtime. The retailer needs to reserve capacity long before the selling season. However, the overseas
manufacturer has many clients, so it offers the retailer a downward flexibility contract: at a finite set of time points before
the selling season, the retailer can choose to cancel some
of the original capacity reservation at a certain penalty fee.
Those time points may include epochs right before production, or right before certain subsequent production stages,
such as packaging. Correspondingly, the retailer can provide
incentives to its customers to induce early demand information. This way, even before production, the retailer may
obtain some demand information. Using this information,
the retailer can decide to release some of the capacity. Later,
after production, having more information, the retailer can
sell some items locally, before they are loaded on the ship
to cross the ocean. Still later, after the voyage, the retailer
can sell some items near the port, before they are loaded on
trucks to travel to a warehouse, and again before they leave
the warehouse to reach the retail stores. This kind of scenario has become more prevalent in global supply chains,
as more things are produced overseas, and advanced information technology provides instant access to newly revealed
market information.
Correspondence to: Jing-Sheng Song ([email protected])
Assume the capacity cancelation fee is increasing in time,
that is, it is more expensive to cancel a unit of capacity as
we get closer to the final selling point. This is reasonable,
because a shorter notice of capacity cancelation makes it
harder for the supplier to re-market the under-utilized capacity. Also, any unfilled demand incurs a penalty cost. To induce
customers to reveal demand earlier, we assume that earlier
revealed demands incur higher penalties (see Section 2 for
more detail). Our goal is to characterize the optimal procurement and cancelation policy for the retailer. We also want to
assess the value of the early demand information.
Many other applications possess the same features. The
development of e-commerce provides new opportunities to
obtain such early demand information. Suppose a popular
writer has announced a new book (a new volume of Harry
Potter, for example). Both e-tailers such as Amazon.com and
clicks-and-mortar sellers such as Barnes & Noble often email
some of their existing customers (such as members or people
who purchased similar books in the past) to encourage them
to purchase the book in advance. They may or may not offer
financial incentives. With this information, the retailers can
make better-informed order decisions.
Here are more examples: Consider a batch production
process, such as gasoline blending. At various stages during the process, the batch size can be reduced by selling off
intermediate products, for example, the various petrochemicals that make up gasoline. Or, a farmer who plants a field
of corn in the spring can pull up some of the corn before it
ripens, based on market conditions, and switch the land to
faster growing crops. Or, a conference organizer reserves a
block of hotel rooms 6 months in advance. As time goes on
and registrations come in, he can cancel some of the rooms
for a fee, but cannot add new ones. Similarly, shipping, construction, contract manufacturing, and utility companies may
require clients to reserve capacity in advance; as information
arrives over time, a client can cancel or sell off some of the
reserved capacity at a penalty. The essence of all these problems is the adjustment of an initial decision in response to
new information, but only in one direction.
Most readers, we trust, are acquainted with the newsvendor model. Therefore, viewing the problem under study as
a generalized newsvendor problem is of pedagogic value.
Imagine that the newsvendor buys newspapers in one place
and sells them elsewhere. After the newspapers are purchased, they are transported to the market. The shipping takes
time. Although all demand occurs at one time and place, once
the newspapers are acquired, the newsvendor gets information about the demand during the shipping time. Along the
way, using this information, the newsvendor can choose to
discard some of the newspapers or sell them cheaply. Thus,
he now faces multiple, sequential decisions. We investigate
whether it is worthwhile for the newsvendor to acquire and
use this gradually revealed demand information, to discard
some of the originally procured newspapers before the final
sales moment.
Although the problem appears to be new, it is closely
related to several models in the literature. Inventory models with disposal options have been studied for years (e.g.,
Fukuda [9] and Morton [14]). There, however, one can augment the stock by ordering as well as reduce it, and demand
occurs at several points in time, not just once. Models of
dam control (e.g., Cohen and Rubinovitch [7]) have similar
features.
Sobel [17] studies a serial production system similar to
the batch system mentioned above, but with random yields
at each stage and no partial demand information. The formulation and results are similar to ours, although the details
differ. Chen and Wu [4] consider a system like ours, but the
newsvendor must select a single purchase time, subject to a
changing purchase price, and there are no subsequent opportunities to unload. Tang et al. [19] consider a moon-cake
producer (Moon-cakes are enjoyed throughout Asia during
the Moon Festival in early fall). Before the selling season, the
manufacturer offers advanced booking discounts to induce
early orders. The problem is how to use the realized-demand
information to plan the (one-time) production volume.
Many authors have considered inventory problems with
actual demands providing information about future demands,
for example, Scarf [15], Azoury [1], and Burnetas and
Gilbert [2]. Our problem is different in several ways. First, in
those models there are multiple procurement opportunities,
whereas in ours there is only one. Second, there demand is
filled as it occurs over time, but here demand is filled only
once at the end. Finally, the form of demand information
here is very special; we observe part of the final demand but
nothing more (until Section 4).
In a broader sense, our model framework is related to that
of revenue management (see the review by McGill and van
Ryzin [12]). Think of the hotel reservation problem mentioned above from the hotel manager’s point of view. Here, we
take prices as fixed, and there are other differences in detail.
Unlike the majority of works on revenue management, we
do not use dynamic pricing to change the customers’ behavior. Conversely, there the initial capacity level is exogenous,
whereas here it is a decision variable.
One of the key findings of our research is that the dynamics,
constraints, and costs of this problem are nearly identical to
those of a different, more complex one—a multistage series
inventory system over an infinite time horizon (see Chen and
Zheng [5], Clark and Scarf [6], and Zipkin [20]). Accordingly, the dynamic-program formulation here has nearly the
same form as the recursion for optimizing the series system.
This connection enables us to use methods and results for
that model in our context.
We also consider a more general model with a richer representation of demand information. Here, demand depends
on an underlying Markov process, referred to as “the world”
representing economic conditions, weather, market competition, and other environmental factors. The optimal policy
has the same form as before, but its parameters depend on
the current state of the world. Interestingly, the optimal policy no longer coincides with that of the corresponding serial
inventory system.
We formulate the problem as a dynamic program (Section
2) and characterize the optimal policy (Section 3). We obtain
simple bounds on the optimal-policy parameters and optimal cost; these provide useful heuristics and measure the
value of the dynamic demand information. Section 4 presents
the world-driven demand model. Section 5 concludes the
article.
2. FORMULATION
Let j index geographic points, j = 0, 1, ..., J . The supply
source is point 0, and the customers arrive at point J . At time
τ0 = 0, we make the initial procurement decision. The other
points 0 < j < J are places where new information arrives
and the stock can be reduced. Let τj denote the time required
to travel from point 0 to point j; then τ0 < τ1 < · · · < τJ
(so, j can refer to both a point in space and a point in time).
The total travel time is T = τJ .
The total demand is D(T ), but it is revealed gradually,
according to the stochastic process D(τj ), starting with
D(0) = 0. Let Dj = D(τj ) − D(τj −1 ), j > 0. These increments are nonnegative and independent. They can be discrete
or continuous.
The supply available initially is the constant x̂(0), which
can represent the supplier’s capacity limit. The initial procurement amount is a decision variable ŷ(0), with ŷ(0) ≤
x̂(0). For 0 < j < J , x̂(τj ) denotes the stock upon arrival at
point j . With the available information D(τj ) at that time, we
make the decision ŷ(τj ), the remaining stock after unloading there, where ŷ(τj ) ≤ x̂(τj ). Thus, x̂(τj +1 ) = ŷ(τj ). We
arrive at J with stock x̂(T ). There is no decision to make at
that point; we simply fill as much demand as possible and
discard any leftovers.
At point j , we pay a cost ĥj per unit to discard goods (or
cancel reserved capacity, etc.). Assume that ĥj is increasing
in j (otherwise, if ĥj +1 ≤ ĥj , we would never choose to
unload at point j, so we can eliminate it). This includes point
0: we pay ĥ0 [x̂(0) − ŷ(0)] to set the initial stock at ŷ(0). (We
write the term this way just for symmetry. Most often,
ĥ0 will be negative, in which case −ĥ0 is the unit cost of
acquiring the initial stock. The constant ĥ0 x̂(0) has no particular meaning.) At point J, we pay ĥJ per unit to discard
leftovers. Also, we pay a penalty cost b̂J per unit of unfilled
demand.
In addition, the model includes some kinds of incentives
for customers to reveal their demands earlier rather than later.
Such incentives can take many forms in practice. The model
includes some of these but not all. Specifically, we promise
to fill demands in the order they are received. So, customers
who order early have a better chance of actually receiving
the goods they order. Also, we promise to pay b̂j per unit of
unfilled demand Dj , j = 1, ..., J , and the b̂j are nonincreasing in j . That is, if a customer orders some items in (τj −1 , τj ),
then she receives b̂j per unit of her order that cannot be satisfied, where this penalty factor is larger for earlier orders. The
basic penalty cost b̂J is the smallest of these factors. Even so,
we assume b̂J > ĥJ > 0. The lowest penalty cost is more
than the highest disposal cost. (Unfortunately, this construct
cannot represent perhaps the most natural form of incentive,
namely, a price discount for all units ordered early.)
This formulation assumes that actual orders arrive over
time regardless of our stock position. For orders that arrive
when total demand exceeds available inventory, we may
immediately tell customers that we cannot fill their demands
instead of waiting until the end. The quantity b̂j describes
the corresponding penalty cost, whenever customers learn
about it.
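The total shortage cost for unfilled demand is
\[
\begin{aligned}
&\hat b_J [\hat x(T) - D(\tau_J)]^- + (\hat b_{J-1} - \hat b_J)[\hat x(T) - D(\tau_{J-1})]^- \\
&\quad + (\hat b_{J-2} - \hat b_{J-1})[\hat x(T) - D(\tau_{J-2})]^- + \cdots + (\hat b_1 - \hat b_2)[\hat x(T) - D(\tau_1)]^- \\
&\quad = \sum_{j=1}^{J} b_j [\hat x(T) - D(\tau_j)]^- ,
\end{aligned}
\]
where $b_J = \hat b_J$ and $b_j = \hat b_j - \hat b_{j+1}$, $j < J$. The total cost is
thus
\[
\sum_{j=0}^{J-1} \hat h_j [\hat x(\tau_j) - \hat y(\tau_j)] + \hat h_J [\hat x(T) - D(T)]^+ + \sum_{j=1}^{J} b_j [\hat x(T) - D(\tau_j)]^- .
\]
For later convenience we add the constant cost
\[
\sum_{j=0}^{J-1} \hat h_j D_{j+1} - \hat h_0 \hat x(0) .
\]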
Suppose we reach time τ1 , and we find that D(τ1 ) >
x̂(τ1 ). Because b̂1 ≥ b̂J > ĥJ ≥ ĥj for all j , we never
discard after this point, and consequently x̂(T ) = x̂(τ1 ).
On the other hand, if D(τ1 ) ≤ x̂(τ1 ), then we may discard now and in the future, but we certainly keep enough
to meet demand D(τ1 ) so also D(τ1 ) ≤ x̂(T ), implying
[x̂(T) − D(τ1)]− = [x̂(τ1) − D(τ1)]− = 0. The same argument
holds for all j. Thus, the total shortage cost becomes
\[
\sum_{j=1}^{J} b_j [\hat x(\tau_j) - D(\tau_j)]^- .
\]
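The telescoping step behind the shortage-cost formula can be checked numerically. The sketch below is only an illustration: the function names and the sample numbers are hypothetical, and it assumes demands are filled in the order received against a final stock x̂(T). It compares the tier-by-tier payment (b̂j per unfilled unit of Dj) with the sum of bj [x̂(T) − D(τj)]−.

```python
def tiered_shortage_cost(b_hat, increments, stock):
    """Pay b_hat[j] per unfilled unit of the j-th demand increment,
    filling demand in arrival order against the final stock."""
    cost, cum_prev = 0.0, 0.0
    for penalty, d in zip(b_hat, increments):
        cum = cum_prev + d
        # Units of this increment that arrive after the stock is exhausted.
        unfilled = max(0.0, cum - max(stock, cum_prev))
        cost += penalty * unfilled
        cum_prev = cum
    return cost


def telescoped_shortage_cost(b_hat, increments, stock):
    """Sum of b_j * [stock - D(tau_j)]^- with b_j = b_hat_j - b_hat_{j+1}."""
    J = len(b_hat)
    b = [b_hat[j] - (b_hat[j + 1] if j + 1 < J else 0.0) for j in range(J)]
    cost, cum = 0.0, 0.0
    for b_j, d in zip(b, increments):
        cum += d
        cost += b_j * max(0.0, cum - stock)
    return cost


if __name__ == "__main__":
    b_hat = [18, 14, 11, 9]      # nonincreasing penalties (hypothetical data)
    increments = [4, 5, 3, 6]    # D_1, ..., D_4
    print(tiered_shortage_cost(b_hat, increments, 10),
          telescoped_shortage_cost(b_hat, increments, 10))
```

Both functions return the same value for any stock level, which is the identity the text relies on.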
We could now formulate a model with the pair of state
variables x̂(τj ) and D(τj ). It is clear, however, that the information in those variables is captured in the single quantity
x(τj ) = x̂(τj ) − D(τj ), which we call the net load. The net
load measures the current stock minus the demand observed
so far. Likewise, we can express the decision at j by the
remaining net load y(τj ) = ŷ(τj ) − D(τj ), constrained by
y(τj ) ≤ x(τj ). The dynamics are given by
x(τj +1 ) = y(τj ) − Dj +1 .
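A minimal sketch of the net-load bookkeeping, assuming a hypothetical list of target net loads (one per decision point) and one realized demand increment per leg. The function name and the inputs are illustrative only; the sketch enforces y(τj) ≤ x(τj) and applies the dynamics x(τj+1) = y(τj) − Dj+1.

```python
def simulate_net_load(x0, targets, increments):
    """Trace net loads along one sample path.

    x0         -- initial net load x(0) (the available supply, since D(0) = 0)
    targets    -- desired net load after unloading at points 0, ..., J-1
    increments -- realized demand increments D_1, ..., D_J
    Returns (x, y): net loads on arrival at points 0..J and after unloading
    at points 0..J-1.
    """
    x, y = [x0], []
    for target, d in zip(targets, increments):
        y_j = min(target, x[-1])   # stock can only be reduced: y <= x
        y.append(y_j)
        x.append(y_j - d)          # x(tau_{j+1}) = y(tau_j) - D_{j+1}
    return x, y


if __name__ == "__main__":
    x, y = simulate_net_load(10, [8, 5, 3], [2, 4, 1])
    print(x, y)
```

In this path the third target (3) is not binding, because the net load on arrival (1) is already below it.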
In these terms the total cost is
\[
\begin{aligned}
&\sum_{j=0}^{J-1} \hat h_j [x(\tau_j) - y(\tau_j)] + \hat h_J [x(T)]^+ + \sum_{j=1}^{J} b_j [x(\tau_j)]^- + \Big[ \sum_{j=0}^{J-1} \hat h_j D_{j+1} - \hat h_0 x(0) \Big] \\
&\quad = -\hat h_0 x(0) + \sum_{j=0}^{J-1} \hat h_j [x(\tau_j) - x(\tau_{j+1})] + \hat h_J x(T) + (b_J + \hat h_J)[x(T)]^- + \sum_{j=1}^{J-1} b_j [x(\tau_j)]^- \\
&\quad = \sum_{j=1}^{J} h_j x(\tau_j) + (b_J + \hat h_J)[x(T)]^- + \sum_{j=1}^{J-1} b_j [x(\tau_j)]^- ,
\end{aligned}
\]
where
\[
h_j = \hat h_j - \hat h_{j-1}, \quad j > 0.
\]
The expected total cost is thus
\[
E\Big[ \sum_{j=1}^{J} h_j x(\tau_j) + (b_J + \hat h_J)[x(T)]^- + \sum_{j=1}^{J-1} b_j [x(\tau_j)]^- \Big]. \tag{1}
\]
The first two terms here (all but the last sum) have the
same form as the total average cost of a series inventory system (see, e.g., Zipkin [20], Chapter 8). In that context, J is
the number of stages. Stage J is the nearest to the customer,
and stage 1 is nearest to the outside supplier (represented by
index 0). The transportation time between stage j and stage
j + 1 is τj+1 − τj, and Dj is the leadtime demand at stage j.
The state variable x(τj) is the echelon net inventory at stage
j, hj is the echelon holding cost, ĥJ is the total holding cost
at stage J, and bJ + ĥ0 is the unit backorder cost. Also, y(τj)
is the echelon net inventory position at stage j + 1. In that
context too, y(τj) ≤ x(τj), and x(τj+1) = y(τj) − Dj+1.
What is different here is the last sum, which reflects the additional backorder cost bj for each unit of unfilled demand that
is revealed between τj−1 and τj, j < J.
Let us now formulate a dynamic program to minimize (1):
\[
\begin{aligned}
\bar C_{J+1}(x) &= \hat h_J [x]^- \\
\hat C_j(x) &= h_j x + b_j [x]^- + \bar C_{j+1}(x) \\
C_j(y) &= E[\hat C_j(y - D_j)] \\
y_j^*(x) &= \arg\min\{ C_j(y) : y \le x \} \\
\bar C_j(x) &= C_j(y_j^*(x)), \quad 0 < j \le J.
\end{aligned} \tag{2}
\]
The argument y in $C_j(y)$ represents y(τj−1), and $\bar C_j(x)$ is
the minimal expected cost from point j − 1 onwards (with
costs assigned to points as in (1)), given x(τj−1) = x.
In particular, recall that τ0 = 0 and D(τ0) = D(0) = 0,
we have
\[
y(\tau_0) = \hat y(\tau_0) - D(\tau_0) = \hat y(\tau_0).
\]
So $y_1^*(x)$ is the optimal initial procurement amount at time 0
or at source point 0, given the supplier’s capacity is x̂(τ0) = x.
3. OPTIMAL POLICY
Recall that bJ > ĥJ > 0, so bJ + ĥJ > 0. By induction
on j, following Lemma 8.3.1 in [20], we have
THEOREM 1: The functions $C_j(y)$ and $\bar C_j(x)$ are convex,
and the optimal policy has the form $y_j^*(x) = y_j^* \wedge x$, where
the constant $y_j^*$ minimizes $C_j(y)$ over y.
So, $y_1^* \wedge x(0)$ is the optimal initial procurement quantity,
and $\bar C_1(x(0))$ is the optimal total cost. At each point j − 1,
j ≥ 2, there is a single critical net-load level $y_j^*$; the optimal
policy is to reduce the net load down to that level. We can
thus rewrite (2) as follows:
\[
\begin{aligned}
\bar C_{J+1}(x) &= \hat h_J [x]^- \\
\hat C_j(x) &= h_j x + b_j [x]^- + \bar C_{j+1}(x) \\
C_j(y) &= E[\hat C_j(y - D_j)] \\
y_j^* &= \arg\min\{ C_j(y) \} \\
\bar C_j(x) &= C_j(y_j^* \wedge x), \quad 0 < j \le J.
\end{aligned} \tag{3}
\]
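The backward recursion can be sketched in a few lines for integer-valued demand. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the demand increments are taken to be Poisson and truncated at a finite support, the search for each critical level is restricted to 0..y_max, and ties are broken toward the smallest minimizer.

```python
from functools import lru_cache
from math import exp


def poisson_pmf(mean, k_max):
    """Poisson pmf on 0..k_max, renormalized after truncation (an approximation)."""
    pmf = [exp(-mean)]
    for k in range(1, k_max + 1):
        pmf.append(pmf[-1] * mean / k)
    total = sum(pmf)
    return [p / total for p in pmf]


def solve_recursion(h, b, h_hat_J, pmfs, y_max):
    """Backward recursion for the critical net-load levels.

    h, b, pmfs hold the data for j = 1..J at list index j-1.
    Returns (y_star, c_bar_1): a dict j -> y_j^* and the function C-bar_1.
    """
    J = len(h)

    def neg_part(x):
        return max(-x, 0)

    def terminal(x):                      # C-bar_{J+1}(x) = h_hat_J [x]^-
        return h_hat_J * neg_part(x)

    def make_C(j, c_bar_next):
        @lru_cache(maxsize=None)
        def C_j(y):                       # C_j(y) = E[C-hat_j(y - D_j)]
            total = 0.0
            for d, p in enumerate(pmfs[j - 1]):
                x = y - d
                total += p * (h[j - 1] * x + b[j - 1] * neg_part(x) + c_bar_next(x))
            return total
        return C_j

    def make_C_bar(C_j, y_j):             # C-bar_j(x) = C_j(min(y_j^*, x))
        return lambda x: C_j(min(y_j, x))

    y_star, c_bar_next = {}, terminal
    for j in range(J, 0, -1):
        C_j = make_C(j, c_bar_next)
        y_star[j] = min(range(y_max + 1), key=C_j)
        c_bar_next = make_C_bar(C_j, y_star[j])
    return y_star, c_bar_next


if __name__ == "__main__":
    pmf = poisson_pmf(4.0, 40)
    y_star, c_bar_1 = solve_recursion([0.25] * 4, [3, 3, 3, 9], 1.0, [pmf] * 4, 60)
    print(y_star)
```

The usage lines take ĥ0 = 0, so that ĥJ is the sum of the hj; that choice is an assumption of the sketch. The memoized closures keep the state space implicit, so no explicit grid for negative net loads is needed.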
This recursion is identical to the one used to optimize the
serial system, except for the additional term bj [x]− in the
function Ĉj (x), j < J . In that context the optimal policy is
called an echelon base-stock policy. When
\[
b_j = 0, \quad j < J, \tag{4}
\]
that is, the shortage costs are identical for all demands, then
the two recursions ((3) and the one for a serial system) are
exactly the same. We shall assume (4) when necessary.
Analogous to the theory of series inventory systems, the
$y_j^*$ need not be monotonic. If it happens that $y_j^* < y_{j+1}^*$,
then we never unload goods at point j. In this case we can
replace $y_{j+1}^*$ by $y_j^*$ – the resulting policy is equivalent. After
all such replacements, the $y_j^*$ are nonincreasing in j. For this
and other qualitative properties, and numerical solutions of
various cases (assuming (4)), see Gallego and Zipkin [10]
and Zipkin [20].
Table 1. Policies: optimal, bounds, and heuristic.

b              h1    h2    h3    h4    y_1^∞  y_1^*/y_1^a  y_1^0  y_2^∞  y_2^*/y_2^a  y_2^0  y_3^∞  y_3^*/y_3^a  y_3^0  y_4^*
(3,3,3,9)      0.25  0.25  0.25  0.25  21     22/22        24     17     18/18        19     13     13/14        14     8
(3,3,3,9)      0.25  0.25  0.25  2.5   19     22/22        25     14     17/17        20     10     12/12        14     6
(4,3,2,9)      0.25  0.25  0.25  2.5   18     22/22        25     14     17/17        20     10     12/12        14     6
(4,3,2,9)      0.25  2.5   2.5   0.25  17     20/21        25     13     13/14        15     10     10/11        11     9
(20,20,20,99)  0.25  0.25  0.25  0.25  26     27/27        28     21     22/22        23     16     17/17        17     11
(20,20,20,99)  0.25  0.25  0.25  2.5   24     27/26        28     19     22/21        23     14     16/16        17     8
(30,20,10,99)  0.25  0.25  0.25  2.5   24     27/26        28     19     22/21        23     14     16/16        17     8
(30,20,10,99)  0.25  2.5   2.5   0.25  23     26/26        28     18     18/19        19     14     14/14        14     11
3.1. Bounds
Shang and Song [16] show how to construct simple bounds
on the optimal policy parameters yj∗ and the optimal cost
for the series system. These results of course apply to our
system, assuming (4). One can apply the same approach
to develop bounds for the general case, not assuming (4).
Because the techniques are similar to Shang and Song’s, we
merely summarize the main results here.
Let
\[
h_0 = \hat h_0, \qquad
h[i,j] = \begin{cases} \sum_{k=i}^{j} h_k, & i \le j, \\ 0, & i > j, \end{cases}
\]
\[
D[i,j] = \sum_{k=i}^{j} D_k = D(\tau_j) - D(\tau_{i-1}), \quad j \ge i, \qquad
\breve D_j = D[j,J].
\]
Using this notation, $\hat h_j = h[0,j]$ for all j. Define
\[
C_j^\infty(y) = E\big\{ h[j,J](y - \breve D_j) + (b_J + \hat h_J)[y - \breve D_j]^- \big\}
+ \sum_{i=j}^{J-1} h_i E[\breve D_{i+1}] + \sum_{i=j}^{J-1} b_i E\big( [y - D[j,i]]^- \big). \tag{5}
\]
This is the solution to (3) with $\bar C_i$ fixed to $C_i$, or in other
words, $y_i^*$ fixed to ∞, i > j. Clearly $C_j^\infty(y)$ is a convex
function. Let $y_j^\infty = \arg\min\{C_j^\infty(y)\}$.
Also, denote
\[
L_j^\infty(y) = h[j,J] - (\hat b_j + \hat h_J) P(D_j > y), \qquad
\ell_j^\infty = \min\{ y : L_j^\infty(y) \ge 0 \}.
\]
Here is the main result:
THEOREM 2: For all j and y, we have: (a) $C_j^\infty(y) \ge C_j(y)$.
In particular, $C_1^\infty(y_1^\infty \wedge x(0))$ is an upper bound on
the optimal total cost. (b) $\ell_j^\infty \le y_j^\infty \le y_j^*$; the inequalities
become equalities for j = J.
Notice that, under (4), the last sum in $C_j^\infty(y)$ vanishes,
and so the function becomes the cost of a simple, one-period
newsvendor problem. (This is Shang and Song’s result.) In
this case, it is easy to compute $y_j^\infty$, and there is no need for
$\ell_j^\infty$. In general, the last sum makes $C_j^\infty(y)$ a bit more complicated. Here, $y_j^\infty$ is a stronger bound than $\ell_j^\infty$, but $\ell_j^\infty$ is easier
to compute.
Replacing h[j, J] by $h_j$ in (5) and denoting by $C_j^0(y)$ the
resulting function, and letting $y_j^0 = \arg\min\{C_j^0(y)\}$, we can
also show:
THEOREM 3: For all j and y, (a) $C_j^0(y) \le C_j(y)$, and (b)
$y_j^0 \ge y_j^*$; the inequality becomes equality for j = J.
These bounds may be useful computationally. To compute
$y_j^\infty$ requires only the minimization of an explicit function,
not a recursion. Even simpler is the calculation of $\ell_j^\infty$. This
is the solution to a newsvendor problem. Similar to Shang
and Song [16], we can use these bounds to approximate the
optimal policy $y_j^*$, such as the simple average of the bounds
$y_j^a = (y_j^0 + y_j^\infty)/2$ (rounded to the nearest integer).
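As a rough illustration of how cheap the two lower bounds are to evaluate, the sketch below computes the newsvendor bound and the minimizer of the explicit no-adjustment cost for independent Poisson increments. The helper names are hypothetical, ĥ0 = 0 is assumed (so ĥJ is the sum of the hj), and the minimizer is found from the forward difference of the convex cost in (5) rather than by evaluating the cost itself.

```python
from math import exp


def poisson_cdf(mean, y):
    """P(D <= y) for D ~ Poisson(mean); zero for negative y."""
    if y < 0:
        return 0.0
    term = exp(-mean)
    total = term
    for k in range(1, y + 1):
        term *= mean / k
        total += term
    return total


def lower_bounds(j, h, b, means, y_cap=500):
    """Return (ell_j_inf, y_j_inf) for stage j (1-based).

    h, b  -- lists of h_1..h_J and b_1..b_J
    means -- Poisson means of the increments D_1..D_J
    """
    J = len(h)
    h_hat_J = sum(h)            # assumes h_hat_0 = 0
    h_tail = sum(h[j - 1:])     # h[j, J]
    b_hat_j = sum(b[j - 1:])    # b_hat_j = b_j + ... + b_J

    def tail(mean, y):          # P(D > y)
        return 1.0 - poisson_cdf(mean, y)

    def L(y):                   # newsvendor first-order function with demand D_j
        return h_tail - (b_hat_j + h_hat_J) * tail(means[j - 1], y)

    def forward_diff(y):        # C_j^inf(y + 1) - C_j^inf(y)
        value = h_tail - (b[J - 1] + h_hat_J) * tail(sum(means[j - 1:]), y)
        for i in range(j, J):   # stages i = j, ..., J-1 use demand D[j, i]
            value -= b[i - 1] * tail(sum(means[j - 1:i]), y)
        return value

    ell = next(y for y in range(y_cap) if L(y) >= 0)
    y_inf = next(y for y in range(y_cap) if forward_diff(y) >= 0)
    return ell, y_inf


if __name__ == "__main__":
    h, b, means = [0.25] * 4, [3, 3, 3, 9], [4.0] * 4
    for j in range(1, 5):
        print(j, lower_bounds(j, h, b, means))
```

With four Poisson increments of mean 4, all hj = 0.25 and b = (3, 3, 3, 9), the first-stage minimizer is 21, matching the corresponding entry of Table 1. The newsvendor bound is much looser at stage 1 because it uses only the single-leg demand.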
We conducted a numerical study to examine the performance of the bounds $y_j^0$ and $y_j^\infty$ and the heuristic in a
system with J = 4, T = 1, and τj − τj−1 = 0.25,
j = 1, 2, 3, 4. The demand process is Poisson with rate
16. The base holding cost vector is h = (h1, h2, h3, h4) =
(0.25, 0.25, 0.25, 0.25), and we vary this vector by
increasing any one (resp., two, three, or four) of the components to 2.5 while keeping the others fixed, resulting in a total
of 16 holding cost vectors. For any given h, we test six penalty
cost vectors b = (b1, b2, b3, b4) = (4, 3, 2, 9), (3, 3, 3, 9),
(2, 3, 4, 9), (30, 20, 10, 99), (20, 20, 20, 99), (10, 20, 30, 99).
The first three of these penalty cost vectors correspond to (b̂1, b̂2, b̂3, b̂4) = (18, 14, 11, 9), (18, 15, 12, 9),
(18, 16, 13, 9), respectively, which represent convex, linear, and
concave shapes of the penalty cost b̂j as a function of j,
j = 1, 2, 3, 4. The same applies to the other three vectors.
Table 1 provides some examples of the lower bound $y^\infty$,
upper bound $y^0$, and the heuristic policy $y^a$, in comparison
with the optimal policy $y^*$.
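The correspondence between the increment vectors b and the per-period penalties b̂ used in the study is a reverse cumulative sum. A small sketch, with hypothetical function names, that reproduces the three conversions quoted above:

```python
def increments_to_penalties(b):
    """b_hat_j = b_j + b_{j+1} + ... + b_J (reverse cumulative sum)."""
    b_hat, running = [], 0
    for b_j in reversed(b):
        running += b_j
        b_hat.append(running)
    return b_hat[::-1]


def penalties_to_increments(b_hat):
    """b_J = b_hat_J and b_j = b_hat_j - b_hat_{j+1} for j < J."""
    J = len(b_hat)
    return [b_hat[j] - (b_hat[j + 1] if j + 1 < J else 0) for j in range(J)]


if __name__ == "__main__":
    for b in [(4, 3, 2, 9), (3, 3, 3, 9), (2, 3, 4, 9)]:
        print(b, "->", increments_to_penalties(list(b)))
```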
Figure 1. Value of information.
In this experiment of 96 test cases, the heuristic policy performs quite well, especially for larger penalty costs. Define
the percentage error of the heuristic as
\[
\% \text{ error} = \frac{C_1(y_1^a) - C_1(y_1^*)}{C_1(y_1^*)} \times 100\%.
\]
The average (maximum) percentage error is 0.54% (4.93%).
For b4 = 9, the average (maximum) percentage error is
0.98% (4.93%). For b4 = 99, the average (maximum) percentage error is 0.10% (0.54%). Also, 37.5% of the time
(38 out of the 96 cases) the heuristic produces the true optimal
solution.
With the good performance of the heuristic policy, we
can use the bounds to gain some qualitative insights into the
model behavior. For example, consider a family of problems
with one parameter σ , which acts as a scale factor. All the
Dj are scaled by this quantity, so in particular, the mean
and standard deviation of each Dj are both proportional to
σ . Otherwise, the problems are identical. Now, for a simple
newsvendor problem scaled in this way, we know that the
optimal solution and the optimal cost are linear in σ . It is not
hard to show that this is true also of the function Cj∞ (y) and
its minimum yj∞ . We can conclude that, for all j , the optimal
solution yj∗ is bounded below by a function linear in σ . The
upper bounds on the yj∗ have the same form. We can thus say
that the optimal yj∗ are roughly linear in σ . The same logic
applies to the optimal total cost.
Likewise, because all these bounds increase in b̂j and
decrease in hj , we can expect the yj∗ to behave similarly.
For the numerical tests described above, we indeed find that
yj∗ increases in b̂j and decreases in hj .
3.2. Value of Information
Note that, under (4), the function C1∞ (y) is precisely the
cost of starting with stock y and allowing no adjustments
thereafter, that is, constraining y(τj ) = x(τj ). The difference C1∞ (y1∞ ) − C1 (y1∗ ) is thus the value of the information
obtained by observing the process D(·) over time, along with
the flexibility to use the information.
Figure 1 illustrates this difference for three specific cases.
Here, J = 64, T = 1, D(T ) is normal with mean λ = 16
and standard deviation σ = 4, τj = j /64, and ĥj = j /64,
j = 0, ..., 64. The figure shows the true optimal cost (lower
curve) and the cost of the best solution with no information
(upper curve) for several different values of b. For these data,
the differences are fairly significant.
We can also explore the value of information through the
effect of J , keeping T fixed. In the context of conference
reservation, for example, the conference organizers may be
interested in how often they should observe the registration level and make adjustments in the hotel reservations.
If J = 1, then there are no adjustments, J = 2 allows
one adjustment, and so forth. Because of the equivalence
between our model [assuming (4)] and the serial inventory
model, we can use the observations by Gallego and Zipkin
[10] on the latter to answer this question. Their Figures 7 and
8 indicate that increasing J lowers cost, but with diminishing
returns. So, starting with no adjustments (J = 1), checking
the conference registration level once before the conference
starts (J = 2) reaps considerable informational benefit.
Additional checks (J > 2) may help too, but less. Indeed,
in practice, we often do observe managers collecting information once, midway in the planning period before the true
selling season. This is the case with the ski-wear company
Sport Obermeyer; see Fisher et al. [8]. This is also true for
moon-cake planning; see Ref. 19.
Now, suppose we choose J = 2, so there is one adjustment
point τ1 . The next question is, where should τ1 be located in
(0, T )? In the conference example, this means, given that we
will only change the hotel reservation level once, when is the
best time to do so? The answer depends on the cost structure. Again, the observations of Gallego and Zipkin can be
applied here. The best time is just before a large increase in
ĥj . For example, the hotel may tell us that any cancellations
after two months before the conference date will incur higher
costs than earlier ones. In this case, we should try to make
the cancellations just before that time.
4. WORLD-DEPENDENT DEMAND
We now assume that the demand is driven by an exogenous Markov process W = {W(t), t ≥ 0} with state space S,
representing the state of the world, such as the weather or the
economy or the market condition for our product. At time
t, given W(t) = w, the distribution of remaining demand
D(t, T] depends on w. Otherwise, the evolution of W is independent of the demand. For example, consider a new style of
snow boots to be produced in Asia. A production order has
to be made in July, but the real shipments to the retail stores
will take place only in November. According to the weather
forecasts in September and October, it may appear that it will
be a warmer than usual winter, and demand for snow boots
will likely be lower than usual, so the retailers’ orders from
a distributor too should be lower. Or, the demand for a new
video game or game machine may depend on competitors’
offerings. When a competitor announces a new feature in a
similar product or a lower price, that will affect the demand
for our product.
We observe the state of W at each decision point τj. Given
W(τj) = wj, let $D_{j+1,w_j}$ denote the demand until the next
decision epoch τj+1. Also, let $D_{j,w,w'}$ be the cumulative demand in (τj−1, τj], given that W(τj−1) = w and
W(τj) = w′. Clearly, we can focus on the discrete-time
Markov process W(τj), and we shall use the same notation
W for it.
For any given sample path of {W(τj), j = 0, ..., J − 1} =
(w0, w1, ..., wJ−1) = w, the shortage cost is
\[
\begin{aligned}
&\hat b_J \Big[\hat x(T) - \sum_{k=1}^{J} D_{k,w_{k-1}}\Big]^- + (\hat b_{J-1} - \hat b_J)\Big[\hat x(T) - \sum_{k=1}^{J-1} D_{k,w_{k-1}}\Big]^- \\
&\quad + (\hat b_{J-2} - \hat b_{J-1})\Big[\hat x(T) - \sum_{k=1}^{J-2} D_{k,w_{k-1}}\Big]^- + \cdots + (\hat b_1 - \hat b_2)[\hat x(T) - D_{1,w_0}]^- \\
&\quad = \sum_{j=1}^{J} b_j \Big[\hat x(T) - \sum_{k=1}^{j} D_{k,w_{k-1}}\Big]^- \\
&\quad = \sum_{j=1}^{J} b_j \Big[\hat x(\tau_j) - \sum_{k=1}^{j} D_{k,w_{k-1}}\Big]^- ,
\end{aligned}
\]
where the last equality follows from an argument similar to that in
Section 2. The total cost on this sample path is thus
\[
\sum_{j=0}^{J-1} \hat h_j [\hat x(\tau_j) - \hat y(\tau_j)] + \hat h_J \Big[\hat x(T) - \sum_{k=1}^{J} D_{k,w_{k-1}}\Big]^+ + \sum_{j=1}^{J} b_j \Big[\hat x(\tau_j) - \sum_{k=1}^{j} D_{k,w_{k-1}}\Big]^- .
\]
The state of the system at τj can be characterized by two
variables: the net load x(τj ) and the state of the world W (τj ).
The decision variable y(τj ) (net load after unloading) again
must satisfy y(τj ) ≤ x(τj ). Given W (τj ) = wj , the system
dynamics are x(τj +1 ) = y(τj ) − Dj +1,wj .
Using these variables, following a derivation similar to that in
Section 2, the expected total cost is
\[
E\Big[ \sum_{j=1}^{J} h_j x(\tau_j) + (b_J + \hat h_J)[x(T)]^- + \sum_{j=1}^{J-1} b_j [x(\tau_j)]^- \Big], \tag{6}
\]
where the expectation is over all possible sample paths of W
and the corresponding demand distributions, assuming the
initial state w0 is chosen from the stationary distribution of W,
denoted by π = (πw, w ∈ S). The objective is to minimize
(6).
To keep the exposition simple, we focus on the case of (4);
the extension to the general case is straightforward. Under
this assumption, the expected total cost given W (0) = w is
\[
G_w = E\Big[ \sum_{j=1}^{J} h_j x(\tau_j) + (b_J + \hat h_J)[x(T)]^- \,\Big|\, W(0) = w \Big]. \tag{7}
\]
The objective is to minimize (7) for every w. All proofs in
this section are in the Appendix.
4.1. Optimal Policy
Let us now formulate a dynamic program to minimize
(7). Let $\tilde C_j(w, x)$ be the minimal expected cost from point
j − 1 onwards [with costs assigned to points as in (7)], given
W(τj−1) = w and x(τj−1) = x. For y = y(τj−1),
\[
\begin{aligned}
\tilde C_{J+1}(w, x) &= \hat h_J [x]^+ + b_J [x]^- , \\
\tilde C_j(w, x) &= \min_{y \le x} \big\{ \hat h_{j-1}(x - y) + E[\tilde C_{j+1}(w', y - D_{j,w,w'})] \big\}, \quad 0 < j \le J.
\end{aligned} \tag{8}
\]
The expectation is over [W(τj) = w′ | W(τj−1) = w] as well
as $D_{j,w,w'}$. Letting $\bar C_j(w, x) = \tilde C_j(w, x) - \hat h_{j-1} x$, the above
equations can be rewritten as follows:
\[
\begin{aligned}
\bar C_{J+1}(w, x) &= (b_J + \hat h_J)[x]^- , \\
\bar C_j(w, x) &= \min_{y \le x} \big\{ h_j (y - E[D_{j,w}]) + E[\bar C_{j+1}(w', y - D_{j,w,w'})] - \hat h_{j-1} E[D_{j,w}] \big\}, \quad 0 < j \le J.
\end{aligned} \tag{9}
\]
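The passage from (8) to (9) is a substitution. A short worked version follows, using $h_j = \hat h_j - \hat h_{j-1}$ from Section 2 and assuming that $E[D_{j,w}]$ denotes the mean of $D_{j,w,w'}$ averaged over the next state $w'$:

```latex
\begin{align*}
\bar C_{J+1}(w,x) &= \hat h_J [x]^+ + b_J [x]^- - \hat h_J x
   = (b_J + \hat h_J)[x]^- , \qquad \text{since } x = [x]^+ - [x]^- ,\\[4pt]
\bar C_j(w,x) &= \tilde C_j(w,x) - \hat h_{j-1} x \\
 &= \min_{y \le x} \Big\{ -\hat h_{j-1}\, y
      + E\big[ \bar C_{j+1}(w', y - D_{j,w,w'}) + \hat h_j (y - D_{j,w,w'}) \big] \Big\} \\
 &= \min_{y \le x} \Big\{ (\hat h_j - \hat h_{j-1})\, y - \hat h_j E[D_{j,w}]
      + E\big[ \bar C_{j+1}(w', y - D_{j,w,w'}) \big] \Big\} \\
 &= \min_{y \le x} \Big\{ h_j \big( y - E[D_{j,w}] \big)
      + E\big[ \bar C_{j+1}(w', y - D_{j,w,w'}) \big] - \hat h_{j-1} E[D_{j,w}] \Big\} .
\end{align*}
```

The last line uses $\hat h_j E[D_{j,w}] = h_j E[D_{j,w}] + \hat h_{j-1} E[D_{j,w}]$; the trailing term does not depend on $y$.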
Ignoring the constants $\hat h_{j-1} E[D_{j,w}]$, we obtain
\[
\begin{aligned}
\bar C_{J+1}(w, x) &= (b_J + \hat h_J)[x]^- , \\
C_j(w, y) &= h_j (y - E[D_{j,w}]) + E[\bar C_{j+1}(w', y - D_{j,w,w'})], \\
y_j^*(w, x) &= \arg\min\{ C_j(w, y) : y \le x \}, \\
\bar C_j(w, x) &= C_j(w, y_j^*(w, x)), \quad 0 < j \le J.
\end{aligned} \tag{10}
\]
Using induction, one can show the following.
THEOREM 4: For any given w and j, the functions
$C_j(w, y)$ and $\bar C_j(w, x)$ are convex, and the optimal policy
has the form $y_j^*(w, x) = y_j^*(w) \wedge x$, where $y_j^*(w)$ minimizes
$C_j(w, y)$ over y.
So, at the beginning, there is a state-dependent order-up-to level $y_1^*(w)$. The optimal initial loading level equals that
value, if the state of the world is w. At each point j − 1,
j ≥ 2, there is a state-dependent critical net-load level $y_j^*(w)$.
The optimal policy is to reduce the net load down to that
level, if the state of the world is w. We can thus rewrite (10)
as follows:
\[
\begin{aligned}
\bar C_{J+1}(w, x) &= (b_J + \hat h_J)[x]^- , \\
C_j(w, y) &= h_j (y - E[D_{j,w}]) + E[\bar C_{j+1}(w', y - D_{j,w,w'})], \\
y_j^*(w) &= \arg\min\{ C_j(w, y) \}, \\
\bar C_j(w, x) &= C_j(w, y_j^*(w) \wedge x), \quad 0 < j \le J.
\end{aligned} \tag{11}
\]
The connection between this model and the corresponding
inventory system is now more subtle. Expression (7) has the
same form as the expected single-period cost of the series
inventory system with Markov modulated demand studied
by Chen and Song [3]. This serial system is identical to
that described in Section 2 under assumption (4), except that
the demand is now driven by W. They show that a state-dependent echelon base-stock policy is optimal, which has
the same form as in Theorem 4. However, the two policies
are not identical. The policy in Theorem 4 is a “myopic policy” for the serial system. It minimizes the one-period cost,
ignoring future periods. Let $\{s_j^*(w)\}$ denote the true optimal
echelon base-stock level. Then, similarly to Song and Zipkin
[18], one can show that $s_j^*(w) \le y_j^*(w)$.
4.2. Monotonicity
In this subsection, we show that, when the problem data
induce a certain ordering of the states w, the optimal critical
net-load numbers $y_i^*(w)$ are ordered in the same way.
Assume there is some partial order among the states,
denoted ⪯. We say that W is stochastically partial-monotone
if, for any w and v with w ⪯ v, there is a way to construct
the process so that
\[
[W(t) \mid W(0) = w] \preceq [W(t) \mid W(0) = v] \quad \text{for all } t \ge 0, \text{ w.p. } 1. \tag{12}
\]
(see Ref. 13 for an elaboration of this concept.)
A real-valued function f on S is said to be nondecreasing
if w v implies f (w) ≤ f (v). If W is stochastically partial
monotone, then for all nondecreasing functions f
E[f (W (t))|W (0) = w] ≤ E[f (W (t))|W (0) = v]
w v, t ≥ 0.
for all
(13)
Condition 1
a. W is stochastically partial-monotone.
b. Dw (0, t) = [D(0, t)|W (0) = w] is stochastically
nondecreasing in w (in the partial-order sense above)
for any t ≥ 0.
The simplest case is a complete ordering of the states:
Suppose that the state space S is some set of integers. Then
the concept of stochastic partial-monotonicity coincides with
that of ordinary stochastic monotonicity, and (12) is equivalent to (13) (see Ref. 11). This is a natural assumption
when W represents some scalar variable, whose effect on the
demand rate is monotonic. Stochastic monotonicity means,
roughly, that a larger initial demand rate leads (stochastically)
to larger future demand rates. For example, colder weather
may increase sales of snow boots.
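For a completely ordered state space, ordinary stochastic monotonicity of a transition matrix can be checked numerically: every tail sum of a row must be nondecreasing in the row index (the characterization usually attributed to Ref. 11). The following Python sketch illustrates that check. The function names, the tolerance, and the two-state example with unit switching rate are assumptions made here for illustration and do not come from the paper.

```python
import math


def is_stochastically_monotone(P, tol=1e-12):
    """True if, for every k, sum_{m >= k} P[i][m] is nondecreasing in i."""
    n = len(P)
    for k in range(n):
        tails = [sum(row[k:]) for row in P]
        if any(tails[i] > tails[i + 1] + tol for i in range(n - 1)):
            return False
    return True


def two_state_transition(t, rate=1.0):
    """Closed-form P(t) = exp(Qt) for Q = [[-rate, rate], [rate, -rate]]."""
    stay = 0.5 * (1.0 + math.exp(-2.0 * rate * t))
    return [[stay, 1.0 - stay], [1.0 - stay, stay]]


# Transition matrix over an interval of length 0.25 (illustrative choice).
P = two_state_transition(0.25)
print(is_stochastically_monotone(P))

# A chain that tends to swap states each step is not monotone.
print(is_stochastically_monotone([[0.1, 0.9], [0.9, 0.1]]))
```

The check only covers the scalar ordering 0 < 1 < ...; a general partial order would need the coupling construction in (12) instead.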
More generally, suppose that W is a vector of independent
Markov chains, each of which is stochastically monotone in
the scalar sense above. Also, suppose that the demand rate is
increasing in each of the components of w, holding the others
fixed. Then, we can interpret $\preceq$ as ordinary (component-wise) vector inequality, and it is clear that Condition 1 holds. This model represents situations in which the demand rate is determined by several independent factors, such as the temperature and competitors' offerings.
The following result is immediate:
LEMMA 5: Under Condition 1, the embedded chain W on $\{\tau_j, 0 < j \le J\}$ is stochastically partial-monotone. Thus, $w \preceq v$ implies that $E[f(W(\tau_{j+1})) \mid W(\tau_j) = w] \le E[f(W(\tau_{j+1})) \mid W(\tau_j) = v]$ for any nondecreasing function $f$ and for all $j$.
Using a coupling method (similar to that of Ref. 18), one
can show the following result:
LEMMA 6: Under Condition 1, $D_{j,w}$ is stochastically increasing in $w$ for all $j$. That is, $w \preceq v$ implies $D_{j,w} \le_{st} D_{j,v}$.
For convenience, in the remainder of this section, whenever we compare the policy parameters, we assume the demand is continuous and the cost functions are differentiable, so the proofs involve the partial derivatives. For discrete demands, we can replace the derivatives by differences without affecting the results. With this caveat, we have:
THEOREM 7: Assume Condition 1 holds. For all $1 \le j \le J$, and fixed $x$:
a. $\frac{\partial}{\partial x} \bar C_{j+1}(w, x)$ is nonincreasing in $w$.
b. $\frac{\partial}{\partial x} C_j(w, x)$ is nonincreasing in $w$.
c. $y_j^*(w)$ is nondecreasing in $w$.
The results of Theorem 7 confirm our intuition: If the current information indicates higher future demand, then the critical net-load level in the current period is higher.
4.3. Bounds
In this subsection, we show that simple bounds on the $y_j^*(w)$ can be obtained by minimizing independent newsvendor-type cost functions. These closed-form bounds allow us to see the dependence of the optimal policy on the system parameters.
For any $w \in S$, let $\breve D_{j,w}$ be the cumulative demand in $(\tau_{j-1}, T]$, given $W(\tau_{j-1}) = w$. Similarly, let $\breve D_{j,w,w'}$ be the cumulative demand in $(\tau_{j-1}, T]$, given that $W(\tau_{j-1}) = w$ and $W(\tau_j) = w'$. Also, let $p_{ww'}(\tau_{j-1}, \tau_i)$ denote the transition probabilities of W. Define
$$C_{J+1}^\infty(w, x) = (b_J + \hat h_J)[x]^-,$$
and for $j = 1, \ldots, J$
$$C_j^\infty(w, y) = E\left[h[j, J](y - \breve D_{j,w}) + (b_J + \hat h_J)[y - \breve D_{j,w}]^-\right] + \sum_{i=j}^{J-1} h_i \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_i) E[\breve D_{i+1,w'}]. \tag{14}$$
LEMMA 8: $C_j^\infty(w, y)$ is the solution to (11) with $\bar C_i$ fixed to $C_i$, or in other words, $y_i^*(w)$ fixed at $\infty$, $i > j$.
The function $C_j^\infty(w, y)$ is the cost of the (original) newsvendor problem with demand $\breve D_{j,w}$, overage cost $h[j, J]$ and underage cost $b_J + h[0, j-1] = b_J + \hat h_{j-1}$. Let $y_j^\infty(w) = \arg\min\{C_j^\infty(w, y)\}$.
THEOREM 9: For all $j = 1, \ldots, J$, $w \in S$, and any real $y$,
a. $C_j^\infty(w, y) \ge C_j(w, y)$.
b. $\frac{\partial}{\partial y} C_j^\infty(w, y) \ge \frac{\partial}{\partial y} C_j(w, y)$.
c. $y_j^\infty(w) \le y_j^*(w)$.
Again, replacing $h[j, J]$ by $h_j$ in (14) one can also obtain a lower bound on $C_j(w, y)$ and an upper bound on $y_j^*(w)$ of similar form. More specifically, define
$$C_j^0(w, y) = E\left[h_j(y - \breve D_{j,w}) + (b_J + \hat h_j)[y - \breve D_{j,w}]^-\right] + \sum_{i=j}^{J-1} h_i \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_i) E[\breve D_{i+1,w'}].$$
Let $y_j^0(w) = \arg\min\{C_j^0(w, y)\}$. We can show
THEOREM 10: For all $j = 1, \ldots, J$, $w \in S$, and any real $y$,
a. $C_j^0(w, y) \le C_j(w, y)$, $y \ge 0$.
b. $\frac{\partial}{\partial y} C_j^0(w, y) \le \frac{\partial}{\partial y} C_j(w, y)$.
c. $y_j^0(w) \ge y_j^*(w)$.
As in Section 3, these bounds are all solutions to newsvendor problems, and so they exhibit the same behavior with respect to the parameters as those simple models. Thus, for instance, if each demand depends on a common scale factor, then each $y_j^*(w)$ is bounded below by a function linear in that scale factor. The following corollary shows that these bounds too are monotonic under Condition 1. The proof is similar to those above, so we omit the details.
Table 2. Performance of world-dependent heuristic policy.

State   % Error    b4 = 9   b4 = 99   Overall
0       Average     1.74     0.25      0.99
0       Maximum    13.53     0.74     13.33
1       Average     1.03     0.33      0.68
1       Maximum     2.57     1.60      2.57
COROLLARY 11: Under Condition 1, $\breve D_{j,w}$ is stochastically increasing in $w$ for all $j$. That is, $w \preceq v$ implies $\breve D_{j,w} \le_{st} \breve D_{j,v}$. Consequently, $y_j^0(w) \le y_j^0(v)$ and $y_j^\infty(w) \le y_j^\infty(v)$ for
all j .
Similar to Section 3, we can use these bounds to approximate the optimal policy $y_j^*(w)$, for example by the simple average of the bounds, $y_j^a(w) = (y_j^0(w) + y_j^\infty(w))/2$ (rounded to the nearest integer). We conducted a numerical study to examine the performance of the bounds $y_j^0(w)$ and $y_j^\infty(w)$ and the heuristic in a system with $J = 4$, $T = 1$, and $\tau_j - \tau_{j-1} = 0.25$, $j = 1, 2, 3, 4$. There are two states of the world, that is, the state space is $S = \{0, 1\}$. The generator of the world W is
$$Q = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}.$$
The demand process is a Markov-modulated Poisson process. The demand rates are 16 and 4 when the states of the world are 0 and 1, respectively. The base holding cost vector is $h = (h_1, h_2, h_3, h_4) = (0.25, 0.25, 0.25, 0.25)$, and we vary this vector by increasing any one (resp., two, three, or four) of the components to 2.5 while keeping the others fixed, resulting in a total of 16 holding cost vectors. For any given $h$, we test two penalty cost vectors $b = (b_1, b_2, b_3, b_4) = (0, 0, 0, 9), (0, 0, 0, 99)$. From these 32
test cases, we observe that the heuristic policy works quite
well, especially when the penalty cost for unsatisfied demand
is high; see Table 2 for a summary. In this set of examples,
the worst case (with the maximum percentage error 13.53%)
corresponds to the holding cost vector (0.25,0.25,0.25,2.5)
and b4 = 9. Under this particular holding cost distribution
with a very high holding cost at stage J , Cj0 (w, y), by its
construction, severely underestimates the echelon holding
costs, resulting in much higher upstream stock levels than
the optimal.
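The optimal levels against which such a heuristic is compared solve recursion (11). As a rough illustration only, and not the authors' implementation, the Python sketch below evaluates (11) by memoized backward recursion on integer net loads. The function name `solve_levels`, the kernel layout, the assumption that the same one-stage kernel applies at every stage, and the search range $0, \ldots, J \cdot d_{\max}$ for the minimizer are choices made here for illustration; the toy inputs are not the paper's test cases.

```python
from functools import lru_cache


def solve_levels(kernel, h_hat, b_J):
    """Critical net-load levels y_j^*(w), j = 1..J, from recursion (11).

    kernel[w][w2][d] = P(W(tau_j) = w2, D_j = d | W(tau_{j-1}) = w),
    assumed identical for every stage j (equal-length intervals).
    h_hat[i] stands for hat-h_i, i = 0..J, so h_j = h_hat[j] - h_hat[j-1].
    """
    J = len(h_hat) - 1
    states = range(len(kernel))
    dmax = len(kernel[0][0]) - 1
    # E[D_{j,w}]: mean one-stage demand given the starting state w.
    mean = [sum(d * p for row in kernel[w] for d, p in enumerate(row))
            for w in states]

    @lru_cache(maxsize=None)
    def C(j, w, y):  # C_j(w, y) in (11)
        h_j = h_hat[j] - h_hat[j - 1]
        future = sum(p * C_bar(j + 1, w2, y - d)
                     for w2 in states
                     for d, p in enumerate(kernel[w][w2]) if p > 0)
        return h_j * (y - mean[w]) + future

    @lru_cache(maxsize=None)
    def y_star(j, w):  # smallest minimizer of C_j(w, .) on the search range
        return min(range(J * dmax + 1), key=lambda y: C(j, w, y))

    def C_bar(j, w, x):  # bar-C_j(w, x) = C_j(w, min(y_j^*(w), x))
        if j == J + 1:
            return (b_J + h_hat[J]) * max(-x, 0)
        return C(j, w, min(x, y_star(j, w)))

    return [[y_star(j, w) for w in states] for j in range(1, J + 1)]


# Sanity check: one state, two stages, demand exactly 2 per stage.
print(solve_levels([[[0.0, 0.0, 1.0]]], [0, 1, 2], 3))  # [[4], [2]]

# Two-state toy example: demand depends only on the starting state
# (an assumption); state 1 has stochastically larger demand.
P = [[0.8, 0.2], [0.2, 0.8]]
pmf = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
kernel = [[[P[w][w2] * q for q in pmf[w]] for w2 in range(2)]
          for w in range(2)]
print(solve_levels(kernel, [0, 0.25, 0.5, 0.75], 9))
```

In the deterministic check the initial load equals total demand (4) and the net load after the first stage equals the remaining demand (2), as one would expect when there is no uncertainty.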
5. CONCLUDING REMARKS
In this article, we have studied a generalization of the classic newsvendor problem, allowing for adjustments to an initial decision as information is revealed. We observed a close connection to a more complex system, a series inventory system. We found that the optimal policy is structured in a way that simplifies the problem considerably. Also, we derived relatively simple bounds on the optimal policy variables. These allow us to see, roughly, how the optimal policy depends on the system parameters. Also, the results allowed us to assess the value of information obtained by observing realized demand, along with the flexibility to use the information. They enabled us to shed light on when and how frequently such information and flexibility are most valuable.
Finally, we extended the analysis to a richer model of information, based on an underlying Markov process. Although the resulting model is more complex, the solution shares many qualitative features with that of the simpler version.
There are now many and growing opportunities to acquire dynamic information about widely dispersed elements of supply chains. The question then arises how to use that information productively. This research represents a step towards an answer.
ACKNOWLEDGMENTS
The authors would like to thank Lu Huang and Lei Xie for their helpful assistance in the numerical examples. This research was supported in part by Awards No. 70328001 and No. 70731003 from the National Natural Science Foundation of China.
APPENDIX
Proof of Theorem 7
Note that Part c is implied immediately by Part b. We now show that Part a implies Part b. Note that from Lemma 6, for any $w \preceq v$, we can say without loss of generality that $D_{j,w} \le D_{j,v}$ for any sample path. Recall that $\bar C_{j+1}(\cdot, y)$ is convex in $y$, so $\frac{\partial}{\partial y} \bar C_{j+1}(\cdot, y)$ is nondecreasing in $y$. Thus, for any fixed $w'$ and $y$, and any sample path, we have
$$\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,w,w'}) \ge \frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,v,w'}).$$
This implies that
$$E\left[\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,w,w'})\right] \ge E\left[\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,v,w'})\right], \quad w \preceq v.$$
Applying Lemma 5, we then obtain
$$\frac{\partial}{\partial y} C_j(w, y) = h_j + E\left[\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,w,w'})\right] \ge h_j + E\left[\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,v,w'})\right] = \frac{\partial}{\partial y} C_j(v, y), \quad w \preceq v,$$
which is Part b.
It remains to show Part a. We shall do so by induction. For $j = J$, because $\bar C_{J+1}(w, x)$ is independent of $w$, the result is trivial. Now suppose that Part a holds for some $j$ and $w \preceq v$. Then according to the above argument, Parts b and c hold for $j$. Note that
$$\frac{\partial}{\partial x} \bar C_j(w, x) = \frac{\partial}{\partial x} C_j(w, x)\, 1[x < y_j^*(w)]$$
and
$$\frac{\partial}{\partial x} \bar C_j(v, x) = \frac{\partial}{\partial x} C_j(v, x)\, 1[x < y_j^*(v)],$$
where $1_A$ is the indicator function of event $A$. Moreover, $y_j^*(w) \le y_j^*(v)$ (Part c). We now compare the above two derivatives in different ranges of $x$. First, for $x < y_j^*(w)$, according to Part b, $\frac{\partial}{\partial x} \bar C_j(w, x) = \frac{\partial}{\partial x} C_j(w, x) \ge \frac{\partial}{\partial x} C_j(v, x) = \frac{\partial}{\partial x} \bar C_j(v, x)$. Second, for $y_j^*(w) \le x < y_j^*(v)$, we have $\frac{\partial}{\partial x} \bar C_j(w, x) = 0 \ge \frac{\partial}{\partial x} \bar C_j(v, x)$. Finally, for $x \ge y_j^*(v)$, $\frac{\partial}{\partial x} \bar C_j(w, x) = 0 = \frac{\partial}{\partial x} \bar C_j(v, x)$. Combining all three cases we obtain $\frac{\partial}{\partial x} \bar C_j(w, x) \ge \frac{\partial}{\partial x} \bar C_j(v, x)$, which is Part a for $j - 1$, completing the proof.
Proof of Lemma 8
We use induction. First, consider $j = J$. Substituting $C_{J+1}^\infty(w, x) = \bar C_{J+1}(w, x) = (b_J + \hat h_J)[x]^-$ into $C_J$ in (11) and recalling $\tau_J = T$, we obtain
$$h_J(y - E[\breve D_{J,w}]) + E[(b_J + \hat h_J)[y - \breve D_{J,w}]^-] = E\left[h[J, J](y - \breve D_{J,w}) + (b_J + \hat h_J)[y - \breve D_{J,w}]^-\right] = C_J^\infty(w, y).$$
Now, suppose the assertion is true for some $j$. Replacing $\bar C_j$ by $C_j^\infty$ in the expression for $C_{j-1}$ in (11) yields
$$h_{j-1}(y - E[D_{j-1,w}]) + E[C_j^\infty(w', y - D_{j-1,w})]$$
$$= h_{j-1}(y - E[D_{j-1,w}]) + \sum_{w' \in S} p_{ww'}(\tau_{j-2}, \tau_{j-1}) \left\{ E\left[h[j, J](y - D_{j-1,w,w'} - \breve D_{j,w'}) + (b_J + \hat h_J)[y - D_{j-1,w,w'} - \breve D_{j,w'}]^-\right] + \sum_{i=j}^{J-1} h_i \sum_{v \in S} p_{w'v}(\tau_{j-1}, \tau_i) E[\breve D_{i+1,v}] \right\}$$
$$= E\left[h[j-1, J](y - \breve D_{j-1,w}) + (b_J + \hat h_J)[y - \breve D_{j-1,w}]^-\right] + h_{j-1} \sum_{w' \in S} p_{ww'}(\tau_{j-2}, \tau_{j-1}) E[\breve D_{j,w'}] + \sum_{i=j}^{J-1} h_i \sum_{v \in S} p_{wv}(\tau_{j-2}, \tau_i) E[\breve D_{i+1,v}]$$
$$= E\left[h[j-1, J](y - \breve D_{j-1,w}) + (b_J + \hat h_J)[y - \breve D_{j-1,w}]^-\right] + \sum_{i=j-1}^{J-1} h_i \sum_{v \in S} p_{wv}(\tau_{j-2}, \tau_i) E[\breve D_{i+1,v}]$$
$$= C_{j-1}^\infty(w, y).$$
This completes the induction proof.
Proof of Theorem 9
Note Part c is implied immediately by Part b. We now show Parts a and b by induction. From Lemma 8, $C_J^\infty(w, y) = C_J(w, y)$, so Parts a and b hold for $j = J$. Suppose they are true for some $j + 1$. Then from Lemma 8,
$$C_j^\infty(w, y) = h_j(y - E[D_{j,w}]) + E[C_{j+1}^\infty(w', y - D_{j,w})] \tag{15}$$
$$\ge h_j(y - E[D_{j,w}]) + E[C_{j+1}(w', y - D_{j,w})]$$
$$\ge h_j(y - E[D_{j,w}]) + E[\bar C_{j+1}(w', y - D_{j,w})]$$
$$= C_j(w, y),$$
where the first inequality follows from the induction assumption for Part a, and the second from $C_{j+1} \ge \bar C_{j+1}$ by construction. Thus, we obtain Part a for $j$. Next, by differentiating both sides of (15), we obtain
$$\frac{\partial}{\partial y} C_j^\infty(w, y) = h_j + E\left[\frac{\partial}{\partial y} C_{j+1}^\infty(w', y - D_{j,w})\right]$$
$$\ge h_j + E\left[\frac{\partial}{\partial y} C_{j+1}(w', y - D_{j,w})\right]$$
$$\ge h_j + E\left[\frac{\partial}{\partial y} \bar C_{j+1}(w', y - D_{j,w})\right]$$
$$= \frac{\partial}{\partial y} C_j(w, y),$$
which is Part b for $j$. Here, the first inequality follows from the induction assumption for Part b. The second follows from $\frac{\partial}{\partial y} C_{j+1}(w, y) \ge \frac{\partial}{\partial y} \bar C_{j+1}(w, y)$. To see this, note that $C_{j+1}(w, y) = \bar C_{j+1}(w, y)$ for $y \le y_{j+1}^*(w)$, so in this range $\frac{\partial}{\partial y} C_{j+1}(w, y) = \frac{\partial}{\partial y} \bar C_{j+1}(w, y)$. For $y > y_{j+1}^*(w)$, we have $\frac{\partial}{\partial y} C_{j+1}(w, y) \ge 0 = \frac{\partial}{\partial y} \bar C_{j+1}(w, y)$. The proof is thus completed.
Proof of Theorem 10
Note part c is implied immediately by part b. We now show parts a and b by induction. By construction, $C_J^0(w, y) = C_J(w, y)$, so parts a and b hold for $j = J$. Now suppose they are true for some $j + 1$. Recalling that $\hat h_j = h[0, j]$,
$$\frac{\partial}{\partial y} C_j^0(w, y) = h_j - (b_J + h[0, j]) P(\breve D_{j+1,w} > y).$$
Also,
$$\frac{\partial}{\partial y} C_j(w, y) = h_j + \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_j) E\left[\frac{\partial}{\partial y} C_{j+1}(w', (y - D_{j,w}) \wedge y_{j+1}^*(w'))\right]. \tag{16}$$
We now analyze the derivative term on the right-hand side of (16) by conditioning on $D_{j,w} = d$. Suppose $y - d \ge y_{j+1}^*(w')$, then
$$\frac{\partial}{\partial y} C_{j+1}(w', (y - d) \wedge y_{j+1}^*(w')) = \frac{\partial}{\partial y} C_{j+1}(w', y_{j+1}^*(w')) = 0 \ge -(b_J + h[0, j]) P(\breve D_{j+1,w} > y).$$
If, conversely, $y - d < y_{j+1}^*(w')$, then, using the induction assumption we have
$$\frac{\partial}{\partial y} C_{j+1}(w', (y - d) \wedge y_{j+1}^*(w')) = \frac{\partial}{\partial y} C_{j+1}(w', y - d)$$
$$\ge \frac{\partial}{\partial y} C_{j+1}^0(w', y - d)$$
$$= h_{j+1} - (b_J + h[0, j+1]) P(\breve D_{j+2,w'} > y - d)$$
$$= -(b_J + h[0, j]) P(\breve D_{j+2,w'} + d > y) + h_{j+1}\left(1 - P(\breve D_{j+2,w'} > y - d)\right)$$
$$\ge -(b_J + h[0, j]) P(\breve D_{j+2,w'} + d > y).$$
By deconditioning and returning to (16), we obtain
$$\frac{\partial}{\partial y} C_j(w, y) \ge h_j + \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_j)\left[-(b_J + h[0, j]) P(\breve D_{j+1,w} > y)\right]$$
$$= h_j - (b_J + h[0, j]) P(\breve D_{j+1,w} > y)$$
$$= \frac{\partial}{\partial y} C_j^0(w, y),$$
proving part b.
To prove part a, first note that $\breve D_{j,w} \ge 0$, so $y_j^\infty(w) \ge 0$, which implies $y_j^*(w) \ge 0$ for all $w$ and $j$. Thus, by the induction hypothesis, for $y = 0$, we have
$$C_j(w, 0) = -h_j E[D_{j,w}] + E[C_{j+1}(w', -D_{j,w})]$$
$$\ge -h_j E[D_{j,w}] + E[C_{j+1}^0(w', -D_{j,w})]$$
$$= -h_j E[D_{j,w}] + \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_j) \times E\left[h_{j+1}(-D_{j,w,w'} - \breve D_{j+1,w'}) + (b_J + \hat h_{j+1})(D_{j,w,w'} + \breve D_{j+1,w'}) + \sum_{i=j+1}^{J-1} h_i \sum_{v \in S} p_{w'v}(\tau_j, \tau_i) E[\breve D_{i+1,v}]\right]$$
$$= -h_j E[\breve D_{j,w}] + E[(b_J + \hat h_j)(\breve D_{j,w})] + h_j \sum_{w' \in S} p_{ww'}(\tau_{j-1}, \tau_j) E[\breve D_{j+1,w'}] + \sum_{i=j+1}^{J-1} h_i \sum_{v \in S} p_{wv}(\tau_{j-1}, \tau_i) E[\breve D_{i+1,v}]$$
$$= E[h_j(-\breve D_{j,w}) + (b_J + \hat h_j)(\breve D_{j,w})] + \sum_{i=j}^{J-1} h_i \sum_{v \in S} p_{wv}(\tau_{j-1}, \tau_i) E[\breve D_{i+1,v}]$$
$$= C_j^0(w, 0).$$
Consequently, for $y > 0$, we have
$$C_j(w, y) = C_j(w, 0) + \int_0^y \frac{\partial}{\partial x} C_j(w, x)\, dx \ge C_j^0(w, 0) + \int_0^y \frac{\partial}{\partial x} C_j^0(w, x)\, dx = C_j^0(w, y).$$
Thus part a holds for $j$.
REFERENCES
[1] K. Azoury, Bayes solution to dynamic inventory models
under unknown demand distribution, Manage Sci 31 (1985),
1150–1160.
[2] A. Burnetas and S. Gilbert, Future capacity procurements
under unknown demand and increasing costs, Manage Sci 47
(2001), 979–992.
[3] F. Chen and J. Song, Optimal policies for multiechelon inventory problems with Markov-modulated demand, Oper Res 49 (2001), 226–234.
[4] H. Chen and O. Wu, Newsvendor problem with continuous information revealing, Working paper, University of British Columbia, 2003.
[5] F. Chen and Y. Zheng, Lower bounds for multi-echelon stochastic inventory systems, Manage Sci 40 (1994), 1426–1443.
[6] A. Clark and H. Scarf, Optimal policies for a multi-echelon
inventory problem, Manage Sci 6 (1960), 475–490.
[7] J. Cohen and M. Rubinovitch, On level crossings and cycles
in dam processes, Math Oper Res 2 (1977), 297–310.
[8] M. Fisher, J. Hammond, W. Obermeyer, and A. Raman,
Making supply meet demand in an uncertain world, Harvard
Business Review, May/June 1994.
[9] Y. Fukuda, Optimal disposal policies, Nav Res Logist Q 8
(1961), 221–227.
[10] G. Gallego and P. Zipkin, Stock positioning and performance estimation for serial production-transportation systems,
Manufact & Serv Operat Manage 1 (1999), 77–88.
[11] J. Keilson and A. Kester, Monotone matrices and monotone
Markov processes, Stoch Proc Appl 5 (1977), 231–241.
[12] J. McGill and G. van Ryzin, Revenue management: Research
overview and prospects, Transportation Science 33 (1999),
233–256.
[13] W. Massey, Stochastic ordering for Markov processes on
partial ordered space, Math of Operat Res 12 (1987), 350–367.
[14] T. Morton, The nonstationary infinite horizon inventory problem, Manage Sci 24 (1978), 1474–1482.
[15] H. Scarf, Bayes solutions of the statistical inventory problem,
Ann Math Stat 30 (1959), 490–508.
[16] K. Shang and J. Song, Newsvendor bounds and heuristics for
optimal policies in serial supply chains, Manage Sci 49 (2003),
618–638.
[17] M. Sobel, Lot sizes in serial manufacturing with random yields,
Probab Eng Inf Sci 9 (1995), 151–157.
[18] J. Song and P. Zipkin, Inventory control in a fluctuating demand
environment, Operat Res 43 (1993), 351–370.
[19] C.S. Tang, K. Rajaram, A. Alptekinoglu, and J. Ou, The
benefits of advance booking discount programs: models and
analysis, Manage Sci 50 (2004), 465–478.
[20] P. Zipkin, Foundations of inventory management, McGraw-Hill, New York, 2000.