Dynamic Pricing for Non-Perishable
Products with Demand Learning
Victor F. Araman
René A. Caldentey
Stern School of Business
New York University
DIMACS Workshop on Yield Management and Dynamic Pricing
Rutgers University
August, 2005
Motivation
[Figure: two panels showing price (as % of the initial price) and inventory (as % of the initial inventory) over a 40-week horizon for Product 1 and Product 2.]
Motivation
[Diagram: a New Product with initial inventory N0 replacing Product 1 with remaining inventory N'0; R denotes the opportunity cost (terminal value) of the shelf space.]
Motivation
◦ For many retail operations “capacity” is measured by store/shelf space.
◦ A key performance measure in the industry is Average Sales per Square Foot per Unit Time.
◦ Trade-off between short-term benefits and the opportunity cost of assets: Margin vs. Rotation.
◦ As opposed to the airline or hospitality industries, selling horizons are flexible.
◦ In general, most profitable/unprofitable products are new items for which there is little demand information.
Outline
◦ Model Formulation.
◦ Perfect Demand Information.
◦ Incomplete Demand Information.
  - Inventory Clearance
  - Optimal Stopping (“outlet option”)
◦ Conclusion.
Model Formulation
I) Stochastic Setting:
- A probability space (Ω, F , P).
- A standard Poisson process D(t) under P and its filtration Ft = σ(D(s) : 0 ≤ s ≤ t).
- A collection {Pα : α > 0} such that D(t) is a Poisson process with intensity α under Pα.
- For a process ft, we define If(t) := ∫₀ᵗ fs ds.
II) Demand Process:
- Pricing strategy: a nonnegative (adapted) process pt.
- A price-sensitive unscaled demand intensity λt := λ(pt) ⟺ pt = p(λt).
- A (possibly unknown) demand scale factor θ > 0, giving the scaled intensity θλ(p).
- Cumulative demand process D(Iλ(t)) under Pθ.
- Select λ ∈ A, the set of admissible (adapted) policies λt : R+ → [0, Λ].
[Figure: Exponential Demand Model — demand intensity θλ(p) = θΛ exp(−α p) as a function of the price p, for increasing values of θ. A simulation sketch follows below.]
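A minimal simulation sketch (not from the slides) of this demand process, assuming the exponential form λ(p) = Λ exp(−α p) and, for simplicity, a constant price: sales then arrive according to a Poisson process with constant scaled intensity θλ(p).

```python
import numpy as np

def demand_intensity(p, Lambda=10.0, alpha=1.0):
    """Unscaled exponential demand intensity lambda(p) = Lambda * exp(-alpha * p)."""
    return Lambda * np.exp(-alpha * p)

def simulate_sales(p, theta, horizon, Lambda=10.0, alpha=1.0, seed=0):
    """Cumulative sales D(I_lambda(t)) over [0, horizon] under a constant price p.

    With a constant price the scaled intensity theta * lambda(p) is constant,
    so inter-arrival times are i.i.d. exponential random variables.
    """
    rng = np.random.default_rng(seed)
    rate = theta * demand_intensity(p, Lambda, alpha)
    t, sales = 0.0, 0
    while True:
        t += rng.exponential(1.0 / rate)
        if t > horizon:
            break
        sales += 1
    return sales

# Example: a high-demand (theta = 1.2) vs. a low-demand (theta = 0.8) product.
for theta in (1.2, 0.8):
    print(theta, simulate_sales(p=1.0, theta=theta, horizon=20.0))
```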
Model Formulation
III) Revenues:
- Unscaled revenue rate c(λ) := λ p(λ), with λ∗ := argmax_{λ∈[0,Λ]} c(λ) and c∗ := c(λ∗).
- Terminal value (opportunity cost): R.
- Discount factor: r.
- Normalization: c∗ = r R (a numerical check follows below).
IV) Selling Horizon:
- Inventory position: Nt = N0 − D(Iλ(t)).
- τ0 = inf{t ≥ 0 : Nt = 0},
T := {Ft − stopping times τ such that τ ≤ τ0}
V) Retailer’s Problem:

    max_{λ∈A, τ∈T}  Eθ[ ∫₀^τ e^{−r t} p(λt) dD(Iλ(t)) + e^{−r τ} R ]

    subject to  Nt = N0 − D(Iλ(t)).
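As a quick numerical check (my own, not part of the slides): for the exponential model the inverse demand is p(λ) = ln(Λ/λ)/α, so c(λ) = λ ln(Λ/λ)/α is maximized at λ∗ = Λ/e with c∗ = Λ/(α e), and the normalization gives R = c∗/r, reproducing the value R ≈ 3.68 used in the numerical examples later in the deck.

```python
import numpy as np

def revenue_rate(lam, Lambda=10.0, alpha=1.0):
    """Unscaled revenue rate c(lambda) = lambda * p(lambda) for exponential demand,
    where the inverse demand is p(lambda) = ln(Lambda / lambda) / alpha."""
    return lam * np.log(Lambda / lam) / alpha

Lambda, alpha, r = 10.0, 1.0, 1.0
grid = np.linspace(1e-6, Lambda, 100_000)
c = revenue_rate(grid, Lambda, alpha)
lam_star = grid[np.argmax(c)]   # numerically ~ Lambda / e
c_star = c.max()                # numerically ~ Lambda / (alpha * e)
R = c_star / r                  # normalization c* = r R
print(lam_star, c_star, R)      # all ~ 3.68 for these parameters
```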
Full Information
Suppose θ > 0 is known at t = 0 and an inventory clearance strategy is used, i.e., τ = τ0.
Define the value function
    W(n; θ) = max_{λ∈A}  Eθ[ ∫₀^{τ0} e^{−r t} p(λt) dD(Iλ(t)) + e^{−r τ0} R ]

    subject to  Nt = n − D(Iλ(t)) and τ0 = inf{t ≥ 0 : Nt = 0}.
The solution satisfies the recursion

    (r/θ) W(n; θ) = Ψ(W(n−1; θ) − W(n; θ))  and  W(0; θ) = R,

where Ψ(z) := max_{0≤λ≤Λ} {λ z + c(λ)}. (A numerical sketch of this recursion follows below.)
Proposition.
For every θ > 0 and R ≥ 0 there is a unique solution {W (n) : n ∈ N}.
◦ If θ ≥ 1 then the value function W is increasing and concave as a function of n.
◦ If θ ≤ 1 then the value function W is decreasing and convex as a function of n.
◦ limn→∞ W (n) = θR.
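A minimal numerical sketch (my own, not from the slides) of this recursion for the exponential demand model: Ψ is evaluated by grid search and each W(n; θ) is obtained by root finding on the fixed-point equation (r/θ)W(n) = Ψ(W(n−1) − W(n)), which has a unique root since the left side increases and the right side decreases in W(n).

```python
import numpy as np
from scipy.optimize import brentq

Lambda, alpha, r = 10.0, 1.0, 1.0
R = Lambda * np.exp(-1.0) / (alpha * r)      # normalization c* = r R

lam_grid = np.linspace(1e-8, Lambda, 20_000)
c_grid = lam_grid * np.log(Lambda / lam_grid) / alpha   # c(lambda) = lambda * p(lambda)

def Psi(z):
    """Psi(z) = max_{0 <= lam <= Lambda} {lam * z + c(lam)}, by grid search."""
    return np.max(lam_grid * z + c_grid)

def value_function(n_max, theta):
    """W(n; theta), n = 0..n_max, from (r/theta) W(n) = Psi(W(n-1) - W(n)), W(0) = R."""
    W = [R]
    for _ in range(n_max):
        prev = W[-1]
        W.append(brentq(lambda w: (r / theta) * w - Psi(prev - w), 0.0, 1e3))
    return np.array(W)

W_hi = value_function(20, theta=1.2)   # increasing in n, toward theta * R ~ 4.41
W_lo = value_function(20, theta=0.8)   # decreasing in n, toward theta * R ~ 2.94
print(W_hi[:5], W_lo[:5])
```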
Full Information
[Figure: W(n; θ) as a function of the inventory level n (0 to 20) for θ1 = 1.2 > 1, θ = 1, and θ2 = 0.8 < 1: W(n; θ1) increases toward θ1R, W(n; θ2) decreases toward θ2R, and W(n; 1) = R.]
Value function for two values of θ and an exponential demand rate λ(p) = Λ exp(−α p).
The data used is Λ = 10, α = 1, r = 1, θ1 = 1.2, θ2 = 0.8, R = Λ exp(−1)/(α r) ≈ 3.68.
Full Information
Corollary.
Suppose c(λ) is strictly concave. The optimal sales intensity satisfies

    λ∗(n; θ) = argmax_{0≤λ≤Λ} {λ (W(n−1; θ) − W(n; θ)) + c(λ)}.

- If θ ≥ 1 then λ∗(n; θ) ↑ n.
- If θ ≤ 1 then λ∗(n; θ) ↓ n.
- λ∗(n; θ) ↓ θ.
- limn→∞ λ∗(n, θ) = λ∗.

(A numerical sketch follows at the end of this slide.)

[Figure: optimal demand intensity λ∗(n) vs. inventory level n for θ1 = 1.2 > 1 and θ2 = 0.8 < 1, both converging to λ∗. Exponential demand λ(p) = Λ exp(−α p); Λ = 10, α = r = 1, θ1 = 1.2, θ2 = 0.8, R = 3.68.]
What about inventory turns (rotation)?

Proposition.
Let s(n, θ) := θ λ∗(n, θ) be the optimal sales rate for a given θ and n.
If d/dλ (λ p′(λ)) ≤ 0, then s(n, θ) ↑ θ.
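Continuing the numerical sketch above (reusing lam_grid, c_grid, W_hi, and W_lo from the recursion block), the optimal intensity can be read off the same grid once the marginal value W(n−1; θ) − W(n; θ) is known:

```python
def optimal_intensity(W, n):
    """lambda*(n; theta) = argmax_{0<=lam<=Lambda} {lam * (W(n-1) - W(n)) + c(lam)},
    reusing lam_grid and c_grid from the previous sketch."""
    dW = W[n - 1] - W[n]
    return lam_grid[np.argmax(lam_grid * dW + c_grid)]

for n in (1, 5, 10, 20):
    print(n, optimal_intensity(W_hi, n), optimal_intensity(W_lo, n))
# lambda*(n; theta) increases in n for theta = 1.2 and decreases in n for theta = 0.8,
# both approaching lambda* = Lambda / e as n grows.
```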
Full Information
Summary:
◦ A tractable dynamic pricing formulation for the inventory clearance model.
◦ W (n; θ) satisfies a simple recursion based on the Fenchel-Legendre transform
of c(λ).
◦ With full information, products are divided into two groups:
– High Demand Products with θ ≥ 1: W (n, θ) and λ∗(n) increase with n.
– Low Demand Products with θ ≤ 1: W (n, θ) and λ∗(n) decrease with n.
◦ High Demand products are sold at a higher price and have a higher selling rate.
◦ If the retailer can stop selling the product at any time at no cost then:
– If θ < 1 stop immediately (τ = 0).
– If θ > 1 never stop (τ = τ0).
◦ In practice, a retailer rarely knows the value of θ at t = 0!
Incomplete Information: Inventory Clearance
Setting:
- The retailer does not know θ at t = 0 but knows θ ∈ {θL, θH } with θL ≤ 1 ≤ θH .
- The retailer has a prior belief q ∈ (0, 1) that θ = θL.
- We introduce the probability measure Pq = q PθL + (1 − q) PθH .
- We assume an inventory clearance model, i.e., τ = τ0.
Retailer’s Beliefs:
Define the belief process qt := Pq[θ = θL | Ft].

Proposition. qt is a Pq-martingale that satisfies the SDE

    dqt = −η(qt−) [dDt − λt θ̄(qt−) dt],

where

    θ̄(q) := θL q + θH (1 − q)  and  η(q) := q (1 − q) (θH − θL) / (θL q + θH (1 − q)).

[Figure: a sample path of the belief process qt over time t ∈ [0, 400]; the vertical axis shows qt between 0.5 and 1.]

(A simulation sketch of this belief dynamic follows below.)
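A small simulation sketch (not from the slides) of this filtering equation under a constant intensity λ: between sales the belief drifts up at rate η(qt)λθ̄(qt), and each sale knocks it down by η(qt−). The true θ only enters through the arrival rate θλ. A simple Euler scheme with Bernoulli thinning for the jumps:

```python
import numpy as np

def theta_bar(q, thL=0.8, thH=1.2):
    return thL * q + thH * (1.0 - q)

def eta(q, thL=0.8, thH=1.2):
    return q * (1.0 - q) * (thH - thL) / (thL * q + thH * (1.0 - q))

def simulate_belief(q0, lam, true_theta, horizon=400.0, dt=0.01, seed=1):
    """Euler scheme for dq = -eta(q-) [dD - lam * theta_bar(q-) dt], arrivals at rate true_theta * lam."""
    rng = np.random.default_rng(seed)
    q, path = q0, [q0]
    for _ in range(int(horizon / dt)):
        jump = 1.0 if rng.random() < true_theta * lam * dt else 0.0   # sale in this small step?
        q += -eta(q) * (jump - lam * theta_bar(q) * dt)
        q = min(max(q, 0.0), 1.0)
        path.append(q)
    return np.array(path)

path = simulate_belief(q0=0.5, lam=1.0, true_theta=1.2)
print(path[-1])   # when the true theta is theta_H, the belief that theta = theta_L tends toward 0
```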
Incomplete Information: Inventory Clearance
Retailer’s Optimization:

    V(N0, q) = sup_{λ∈A}  Eq[ ∫₀^{τ0} e^{−r t} p(λt) dD(Iλ(t)) + e^{−r τ0} R ]

    subject to  Nt = N0 − ∫₀ᵗ dD(Iλ(s)),
                dqt = −η(qt−) [dDt − λt θ̄(qt−) dt],  q0 = q,
                τ0 = inf{t ≥ 0 : Nt = 0}.
HJB Equation:

    r V(n, q) = max_{0≤λ≤Λ} { λ θ̄(q) [V(n−1, q−η(q)) − V(n, q) + η(q) Vq(n, q)] + θ̄(q) c(λ) },

with boundary conditions V(0, q) = R, V(n, 0) = W(n; θH), and V(n, 1) = W(n; θL).
Recursive Solution:

    V(0, q) = R,
    V(n, q) + Φ( r V(n, q) / θ̄(q) ) − η(q) Vq(n, q) = V(n−1, q − η(q)).
Incomplete Information: Inventory Clearance
Proposition.
-) The value function V(n, q) is
   a) monotonically decreasing and convex in q,
   b) bounded by W(n; θL) ≤ V(n, q) ≤ W(n; θH), and
   c) uniformly convergent as n ↑ ∞: V(n, q) → R θ̄(q), uniformly in q.
-) The optimal demand intensity satisfies limn→∞ λ∗(n, q) = λ∗.

Conjecture: The optimal sales rate θ̄(q) λ∗(n, q) ↓ q.

[Figure: V(n, q) as a function of the belief q for n = 1, 5, 10, 20 and the limit n = ∞, where V(∞, q) = θ̄(q)R = [θL q + θH (1 − q)]R.]
Incomplete Information: Inventory Clearance
Asymptotic Approximation: Since

    limn→∞ V(n, q) = R θ̄(q) = limn→∞ {q W(n, θL) + (1 − q) W(n, θH)},

we propose the following approximation for V(n, q):

    Ṽ(n, q) := q W(n, θL) + (1 − q) W(n, θH).
Some Properties of Ṽ(n, q):
- Linear approximation, easy to compute (a numerical sketch follows below).
- Asymptotically optimal as n → ∞.
- Asymptotically optimal as q → 0+ or q → 1−.
- Ṽ(n, q) = Eq[W(n, θ)] ≠ W(n, Eq[θ]) =: V^CE(n, q), the Certainty Equivalent.
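A short sketch (my own, reusing value_function and R from the full-information block above) that evaluates both the asymptotic approximation and the certainty-equivalent heuristic, here for the illustrative case θL = 0.8, θH = 1.2:

```python
thL, thH = 0.8, 1.2
n_max = 20
W_L = value_function(n_max, thL)
W_H = value_function(n_max, thH)

def V_tilde(n, q):
    """Asymptotic approximation: the expected full-information value E_q[W(n, theta)]."""
    return q * W_L[n] + (1.0 - q) * W_H[n]

def V_certainty_equivalent(n, q):
    """Certainty equivalent: the full-information value at the average theta, W(n, E_q[theta])."""
    return value_function(n, thL * q + thH * (1.0 - q))[n]

for q in (0.2, 0.5, 0.8):
    print(q, V_tilde(5, q), V_certainty_equivalent(5, q))
```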
Incomplete Information: Inventory Clearance
Relative Error (%) := [V^approx(n, q) − V(n, q)] / V(n, q) × 100%.
[Figure: left panel, the value function vs. the belief q for n = 5, comparing the optimal V(n,q), the asymptotic approximation Ṽ(n,q), and the certainty equivalent V^CE(n,q); right panel, the corresponding relative errors (%) of Ṽ(n,q) and V^CE(n,q).]
Exponential demand λ(p) = Λ exp(−α p); Inventory = 5, Λ = 10, α = r = 1, θH = 5.0, θL = 0.5.
Incomplete Information: Inventory Clearance
For any approximation V^approx(n, q), define the corresponding demand intensity using the HJB:

    λ^approx(n, q) := argmax_{0≤λ≤Λ} [ λ θ̄(q) [V^approx(n−1, q−η(q)) − V^approx(n, q)] + λ κ(q) Vq^approx(n, q) + θ̄(q) c(λ) ].

    Relative Price Error (%) := [p(λ^approx) − p(λ∗)] / p(λ∗) × 100%.
Asymptotic Approximation (%):

        Inventory (n)
    q       1      5     10     25    100
    0.0   0.0    0.0    0.0    0.0    0.0
    0.2   2.7   −0.2   −0.3   −0.6   −0.5
    0.4   6.9    0.8   −0.6   −0.9   −0.7
    0.6  12.5    2.4   −0.2   −0.7   −1.0
    0.8  19.4    3.3    0.1   −0.4   −0.6
    1.0   0.0    0.0    0.0    0.0    0.0

Certainty Equivalent (%):

        Inventory (n)
    q       1      5     10     25    100
    0.0   0.0    0.0    0.0    0.0    0.0
    0.2   5.3    2.6    2.7    2.4   −0.4
    0.4  14.4   11.6   12.0   10.1   −0.5
    0.6  29.9   28.2   28.0   17.6   −1.0
    0.8  54.6   46.2   37.4   11.1   −0.7
    1.0   0.0    0.0    0.0    0.0    0.0

Relative price error for the exponential demand model λ(p) = Λ exp(−α p), with Λ = 20 and α = 1.
Incomplete Information: Inventory Clearance
When should the retailer engage in selling a given product?
When V (n, q) ≥ R.
Using the asymptotic approximation Ṽ(n, q), this is equivalent to

    q ≤ q̃(n) := [W(n; θH) − R] / ([W(n; θH) − R] + [R − W(n; θL)]),

and

    q̃(n) → q̃∞ := (θH − 1)/(θH − θL), as n → ∞.

[Figure: the threshold q̃(n) vs. the initial inventory n (1 to 30), increasing toward q̃∞ and separating Profitable Products (q below the curve) from Non-Profitable Products (q above the curve).]
Exponential demand rate λ(p) = Λ exp(−α p). Data: Λ = 10, α = 1, r = 1, θH = 1.2, θL = 0.8.

(A numerical sketch of this threshold follows below.)
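A short sketch (my own, reusing value_function and R from the full-information block above) computing the engagement threshold q̃(n) and its limit for the parameters of this slide:

```python
thL, thH = 0.8, 1.2
n_max = 30
W_L = value_function(n_max, thL)
W_H = value_function(n_max, thH)

def q_threshold(n):
    """q~(n) = (W(n; thH) - R) / ((W(n; thH) - R) + (R - W(n; thL)))."""
    return (W_H[n] - R) / ((W_H[n] - R) + (R - W_L[n]))

print([round(q_threshold(n), 3) for n in (1, 5, 10, 30)])   # increasing in n ...
print((thH - 1.0) / (thH - thL))                            # ... toward q~_inf = 0.5
```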
Incomplete Information: Inventory Clearance
Summary:
◦ Uncertainty in market size (θ) is captured by a new state variable qt (a jump process).
◦ V (n, q) can be computed using a recursive sequence of ODEs.
◦ The asymptotic approximation Ṽ(n, q) := Eq[W(n, θ)] performs quite well:
  – Linear approximation, easy to compute.
  – Value function: V(n, q) ≈ Ṽ(n, q).
  – Pricing strategy: p∗(n, q) ≈ p̃(n, q).
◦ Products are divided into two groups as a function of (n, q):
  – Profitable Products with q < q̃(n), and
  – Non-Profitable Products with q > q̃(n).
◦ The threshold q̃(n) increases with n, that is, the retailer is willing to take more risk for larger orders.
Incomplete Information: Optimal Stopping
Setting:
- Retailer does not know θ at t = 0 but knows θ ∈ {θL, θH } with θL ≤ 1 ≤ θH .
- Retailer has the option of removing the product at any time, “Outlet Option”.
Retailer’s Optimization:

    U(N0, q) = max_{λ∈A, τ∈T}  Eq[ ∫₀^τ e^{−r t} p(λt) dD(Iλ(t)) + e^{−r τ} R ]

    subject to  Nt = N0 − D(Iλ(t)),
                dqt = −η(qt−) [dD(Iλ(t)) − λt θ̄(qt−) dt],  q0 = q.
Optimality Conditions:

    U(n, q) + Φ( r U(n, q) / θ̄(q) ) − η(q) Uq(n, q) = U(n−1, q − η(q))   if U ≥ R,
    U(n, q) + Φ( r U(n, q) / θ̄(q) ) − η(q) Uq(n, q) ≤ U(n−1, q − η(q))   if U = R.
Incomplete Information: Optimal Stopping
Proposition.
a) There is a unique continuously differentiable solution U(n, ·) defined on [0, 1] so that
U(n, q) > R on [0, qn∗) and U(n, q) = R on [qn∗, 1], where qn∗ is the unique solution of

    R + Φ( r R / θ̄(q) ) = U(n−1, q − η(q)).

b) qn∗ is increasing in n and satisfies

    (θH − 1)/(θH − θL) ≤ qn∗  and, as n → ∞,  qn∗ → q∞∗ ≤ Root{ Φ( r R / θ̄(q) ) / η(q) = (θH − 1) R / q } < 1.
c) The value function U(n, q)
– is decreasing and convex in q on [0, 1];
– increases in n for all q ∈ [0, 1] and satisfies

    max{R, V(n, q)} ≤ U(n, q) ≤ max{R, m(q)}   for all q ∈ [0, 1],

  where m(q) := W(n, θH) − [(W(n, θH) − R)/qn∗] q;
– converges uniformly (in q) to a continuously differentiable function U∞(q).
Incomplete Information: Optimal Stopping
[Figure: left panel, the relative error (U(n,q) − V(n,q))/V(n,q) (%) as a function of the belief q for n = 1, 10, 40, bounded above by (1 − θL)/θL; right panel, the critical threshold qn∗ as a function of the inventory n (5 to 30), separating three regions: Always Profitable (low q), Profitable Under Stopping but Non-Profitable Under Inventory Clearance (intermediate q), and Always Non-Profitable (high q).]
Exponential demand rate λ(p) = Λ exp(−α p). Data: Λ = 10, α = 1, r = 1, θH = 1.2, θL = 0.8.
Incomplete Information: Optimal Stopping
Approximation:

    Ũ(n, q) := max{ R, W(n, θH) − [(W(n, θH) − R)/q̃n] q },

where q̃n is the unique solution of R + Φ( r R / θ̄(q) ) = Ũ(n−1, q − η(q)).
[Figure: left panel, the value function vs. the belief q for n = 10, comparing the approximation Ũ(n,q) with the optimal U(n,q) and the terminal value R; right panel, the relative error (Ũ(n,q) − U(n,q))/U(n,q) × 100% vs. the belief q for n = 1, 5, 10, 40.]
Exponential demand rate λ(p) = Λ exp(−α p). Data: Λ = 10, α = 1, r = 1, θH = 1.2, θL = 0.8.
Incomplete Information: Optimal Stopping
Summary:
◦ U (n, q) can be computed using a recursive sequence of ODEs with free-boundary conditions.
◦ For every n there is a critical belief qn∗ above which it is optimal to stop.
◦ Again, the sequence qn∗ is increasing with n, that is, the retailer is willing to take more risk for
larger orders.
◦ The sequence qn∗ is bounded by

    (θH − 1)/(θH − θL) ≤ qn∗ ≤ q̂ := Root{ Φ( r R / θ̄(q) ) / η(q) = (θH − 1) R / q }.

◦ The “outlet option” significantly increases the expected profits and the range of products (n, q) that are profitable:

    0 ≤ U(n, q) − V(n, q) ≤ (1 − θL) R.

◦ A simple piecewise-linear approximation works well:

    Ũ(n, q) := max{ R, W(n, θH) − [(W(n, θH) − R)/q̃n] q }.
Concluding Remarks
◦ A simple dynamic pricing model for a retailer selling non-perishable products.
◦ Captures two common sources of uncertainty:
– Market size measured by θ ∈ {θH , θL}.
– Stochastic arrival process of price sensitive customers.
◦ Analysis gets simpler using the Fenchel-Legendre transform of c(λ) and its
properties.
◦ We propose a simple approximation (linear and piecewise linear) for the value
function and corresponding pricing policy.
◦ Some properties of the optimal solution are:
– Value functions V (n, q) and U (n, q) are decreasing and convex in q .
– The retailer is willing to take more risk (↑ q ) for higher orders (↑ n).
– The optimal demand intensity λ∗(n, q) ↑ q and the optimal sales rate θ̄(q) λ∗(n, q) ↓ q .
◦ Extension: R(n) = R + ν n − K 𝟙(n > 0).