
Dynamic Optimization: An Introduction
The remainder of the course covers topics that involve the optimal rates
of mineral extraction, harvesting of fish or trees and other problems that are inherently dynamic in nature. The Tietenberg text deals with dynamic problems
in one of two ways. Either he examines these problems in a simple two-period
fashion or he creatively makes the problems static (single period) problems.
Neither method of dealing with these problems is completely satisfactory. We
will begin with the two-period formulation, but then develop a set of fully dynamic optimization tools that can be used to solve multi-period problems.
The goal of this lecture is to develop those tools. We’ll spend the rest of the
semester applying the tools so it is worth investing the time now to understand
what is going on. The math is NOT hard, but it involves notation that, if
unfamiliar, may appear hard. Try not to get intimidated: notation is just
letters, and even Greek letters aren’t that scary. We also will not
be going into why these methods work. If you like dynamic problems and want
more of the nitty-gritty behind how dynamic optimization works, I encourage
you to take ENV-252 with Professor Smith next spring. For this class, you
need to understand how to apply these tools, but you don’t need to know why
the tools work.
1 A Simple Two-Period Model
1.1 Setup of the Model
We are going to work our way up to a very general method for solving dynamic
problems. But rather than jump straight to the most general case, let’s start
with a very simple problem. This will give us our bearings and we can build
more complicated models from there. In this simple model, there are two
periods—today and tomorrow. After tomorrow the world comes to an end.
Let’s assume we have a fixed stock of some resource (oil or coal or something
similar). To be specific, assume we have 20 units of the resource. We are
trying to decide how much of the resource to extract in each period in order to
maximize the present value of net benefits. Let total benefits in each period be
given as:
\[ B(q_t) = 8q_t - 0.2q_t^2 \]
Let total costs in each period be equal to:
\[ C(q_t) = 2q_t \]
Net benefits in each period are then:
\[ NB(q_t) = B(q_t) - C(q_t) = 6q_t - 0.2q_t^2 \]
We want to maximize the present value of net benefits over two periods. We
can write that as:
\[ \max_{q_1, q_2} \; 6q_1 - 0.2q_1^2 + \frac{6q_2 - 0.2q_2^2}{1+r} \]
where r is the discount rate. Let’s set r at 10% (0.10).
Now what we need to deal with is the fact that our stock of the resource is
finite. One way to deal with this is to assume that we use up all the resource
in the two periods. Why might this be a reasonable assumption?
Then our constraint for this problem can be written as:
\[ q_1 + q_2 = S \]
\[ q_1 + q_2 = 20 \]
1.2 Solving the Model
This is beginning to look like a problem we know how to solve. We want to
maximize some objective function subject to a constraint. We can use the
Lagrangian to do this!
\[ L = 6q_1 - 0.2q_1^2 + \frac{6q_2 - 0.2q_2^2}{1.10} + \lambda (20 - q_1 - q_2) \]
The three conditions for a constrained optimum are then:
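\[ \frac{\partial L}{\partial q_1} = 0 = 6 - 0.4q_1 - \lambda \qquad (1) \]
\[ \frac{\partial L}{\partial q_2} = 0 = \frac{6 - 0.4q_2}{1.10} - \lambda \qquad (2) \]
\[ \frac{\partial L}{\partial \lambda} = 0 = 20 - q_1 - q_2 \qquad (3) \]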
Using (1) and re-writing yields:
\[ 6 - 0.4q_1 = \lambda \]
Using (2) and re-writing yields:
\[ \frac{6 - 0.4q_2}{1.10} = \lambda \]
Combining these two implies:
\[ 6 - 0.4q_1 = \frac{6 - 0.4q_2}{1.10} \]
Using (3) and re-writing yields:
\[ q_1 = 20 - q_2 \]
Substituting into the formula above yields:
\[
\begin{aligned}
6 - 0.4(20 - q_2) &= \frac{6 - 0.4q_2}{1.10} \\
6 - 8 + 0.4q_2 &= \frac{6 - 0.4q_2}{1.10} \\
1.10(-2 + 0.4q_2) &= 6 - 0.4q_2 \\
-2.20 + 0.44q_2 &= 6 - 0.4q_2 \\
0.84q_2 &= 8.2 \\
q_2 &= 9.762 \\
q_1 &= 20 - q_2 = 10.238 \\
\lambda &= 6 - 0.4q_1 = 1.905
\end{aligned}
\]
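If you want to check the algebra, here is a minimal Python sketch that solves conditions (1) through (3) in closed form. The function name `solve_two_period` and its parameter names are illustrative labels of my own, not part of the model. It assumes net benefits of the form a*q - b*q^2 and an interior solution in which both quantities are positive.

```python
def solve_two_period(a=6.0, b=0.2, stock=20.0, r=0.10):
    """Solve max a*q1 - b*q1**2 + (a*q2 - b*q2**2)/(1+r) s.t. q1 + q2 = stock.

    Assumes the whole stock is used and both quantities are positive.
    """
    # Combine (1) and (2): a - 2b*q1 = (a - 2b*q2) / (1 + r).
    # Substitute q1 = stock - q2 from (3) and solve the linear equation for q2.
    q2 = (a - (1 + r) * (a - 2 * b * stock)) / (2 * b * (2 + r))
    q1 = stock - q2
    lam = a - 2 * b * q1  # condition (1): marginal net benefit today
    return q1, q2, lam


q1, q2, lam = solve_two_period()
print(round(q1, 3), round(q2, 3), round(lam, 3))
```

With the default values this reproduces the numbers derived by hand. Changing `r` or `stock` shows how the split between the two periods responds.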
1.3 Features of the Solution
First, recall that in a single period competitive market equilibrium, price equals
marginal cost. This is NOT true for the two period problem. To see this,
we need to calculate the price (marginal willingness-to-pay) at the equilibrium
quantities and compare that to the marginal cost.
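\[
\begin{aligned}
MB_1 &= \frac{\partial B}{\partial q_1} = P_1 = 8 - 0.4q_1 = 8 - 0.4(10.238) = 3.905 \\
MB_2 &= \frac{\partial B}{\partial q_2} = P_2 = 8 - 0.4q_2 = 8 - 0.4(9.762) = 4.095 \\
MC_1 &= \frac{\partial C}{\partial q_1} = 2 \\
MC_2 &= \frac{\partial C}{\partial q_2} = 2 \\
P_1 - MC_1 &= 3.905 - 2 = 1.905 \\
P_2 - MC_2 &= 4.095 - 2 = 2.095
\end{aligned}
\]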
Not only is price not equal to marginal cost, but the present value of the
difference between price and marginal cost has a particular interpretation. Let’s
first calculate the present value of price minus marginal cost in each period.
\[ PV(P_1 - MC_1) = 1.905 \]
\[ PV(P_2 - MC_2) = \frac{2.095}{1+r} = \frac{2.095}{1.10} = 1.905 \]
Notice the present value of the difference between price and marginal cost is
constant. It is also equal to the value of the Lagrange multiplier. We seem to
be on to something here.
The Lagrange multiplier can ALWAYS be interpreted as the shadow value
on the constraint: how much more of the objective function (in this case, the
present value of net benefits) we would get if we relaxed the constraint a little
bit (had a bit more oil in the ground). Here the present value of the market
price (P) minus marginal cost equals the Lagrange multiplier. We call the difference between
the market price and marginal cost the marginal user cost or sometimes the
scarcity rent. The marginal user cost is the opportunity cost (in terms of
future consumption possibilities) of consuming another unit of oil today. Or
reversing the logic, the marginal user cost tells you how much better off you
would be (in terms of future consumption possibilities) if you had one more
unit of oil in the ground.
We also know that the Lagrange multiplier ALWAYS has a price interpretation. What is that price interpretation here? The marginal user cost can be
thought of as the in-situ value (price) of the resource. It tells you how
much an additional unit of oil in the ground (hence, in-situ) is worth.
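The shadow-value interpretation can be checked numerically. The sketch below re-solves the two-period problem with a slightly larger stock and compares the gain in the present value of net benefits with the multiplier. The function name `max_present_value` and the step size `h` are assumptions made for illustration.

```python
def max_present_value(stock, a=6.0, b=0.2, r=0.10):
    """Return (maximized present value, multiplier) for the two-period model."""
    # Closed-form solution of the first-order conditions (interior solution assumed).
    q2 = (a - (1 + r) * (a - 2 * b * stock)) / (2 * b * (2 + r))
    q1 = stock - q2
    pv = (a * q1 - b * q1 ** 2) + (a * q2 - b * q2 ** 2) / (1 + r)
    lam = a - 2 * b * q1
    return pv, lam


h = 0.01  # a small amount of extra oil in the ground
pv_base, lam = max_present_value(20.0)
pv_more, _ = max_present_value(20.0 + h)

# Extra present value per extra unit of stock: this should be close to lambda.
shadow_value = (pv_more - pv_base) / h
print(round(shadow_value, 3), round(lam, 3))
```

The two numbers agree up to the error from using a finite step, which is what "value of one more unit in the ground" means.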
There is one more interesting feature of the solution. Notice that the present
value of the marginal user cost (MUC) is constant. However, the current value
of the MUC is growing.
\[ MUC_1 = P_1 - MC_1 = 1.905 \]
\[ MUC_2 = P_2 - MC_2 = 2.095 \]
Moreover:
\[ \frac{MUC_2 - MUC_1}{MUC_1} = \frac{2.095 - 1.905}{1.905} = \frac{0.19}{1.905} \approx 0.10 = r \]
The growth rate (percentage change) in the marginal user cost equals the
interest rate. This may seem like a weird coincidence, but it isn’t! In fact,
this is a general property of non-renewable resource problems and it is called
the Hotelling Rule.
To understand the Hotelling Rule, think about a situation where you have
two choices. Choice number 1 is to extract one unit of the resource today and
sell it at today’s price. You then put the revenue you earned in the bank and
earn interest on it at a rate r. Next period you’ll have $(P_1 - MC_1)(1 + r)$.
Alternatively you can leave the unit of oil in the ground and extract it next
period. You’ll then sell it at next period’s price. If you choose this option
you’ll have $(P_2 - MC_2)$. These two options MUST yield equal values for the
market to be in equilibrium. So the growth rate of the value of oil in the ground (MUC)
must be equal to the interest rate. If the growth rate of the MUC is greater
than the interest rate, then you extracted too much. The value of the asset
"oil in the ground" is growing faster than the value of the asset "money in the
bank." You should have left the oil in the ground. If the MUC is growing
slower than the interest rate you have not extracted enough. The value of the
asset "money in the bank" is growing faster than the value of the asset "oil in
the ground." You should convert some oil to money. In equilibrium the value
of the two assets will be growing at the same rate. This is often referred to as
a no-arbitrage rule.
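The no-arbitrage comparison can be written out with the numbers from the two-period solution. This is only an arithmetic sketch; the variable names are illustrative, and the inputs are the rounded values from the text.

```python
r = 0.10
muc_today = 1.905     # P1 - MC1 from the two-period solution (rounded)
muc_tomorrow = 2.095  # P2 - MC2 from the two-period solution (rounded)

# Choice 1: extract today, bank the proceeds, and hold them for one period.
extract_now_and_bank = muc_today * (1 + r)

# Choice 2: leave the unit in the ground and extract it next period.
leave_in_ground = muc_tomorrow

# Growth rate of the value of oil in the ground.
growth_rate = (muc_tomorrow - muc_today) / muc_today

print(round(extract_now_and_bank, 3), round(leave_in_ground, 3), round(growth_rate, 3))
```

The two choices pay the same next period (up to rounding), and the growth rate of the MUC matches r.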
1.4 What’s wrong with the 2-period model or Why do we need more math
In an undergraduate environmental economics course, analysis of dynamic problems generally stops with the two-period model. We are going to generalize our
analysis of dynamic problems, but I think it is important for you to understand
why. It’s not just to make it harder or because I love math. Rather, there are
some insights that cannot be had if we limit ourselves to the two period model.
Here are just a few.
1. Is it ever optimal to not use up all the resource? In the simple two-period
model we assume that we consume all the oil (or other resource) in the
two periods. This isn’t always true in multi-period models. We’ll see
that under some circumstances we might want to leave some resource in
the ground. But we can’t see why using the two-period model.
2. What if the resource isn’t fixed? We did the two-period model for oil
or some other non-renewable resource. But we might be interested in
resources that regenerate over time (like fisheries). The two-period model
can’t really address that issue very well. Static models of fisheries that
are presented in Tietenberg and in undergraduate courses miss a lot of
real world insights about fisheries. A fuller dynamic model will do much
better.
3. What if we have an infinite horizon problem? Obviously a two-period
model covers just two periods. We could expand it to three periods or
four, etc. But the solution gets harder and harder to arrive at the more
periods we allow for. And we can never address a general infinite horizon
problem.
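To get a feel for what "expanding to more periods" involves, here is a hedged sketch that extends the Lagrangian approach to T periods for the same quadratic net benefit function. It keeps two assumptions from the two-period model: the whole stock is used, and extraction is positive in every period. The function name `solve_t_periods` is hypothetical. When those assumptions fail, the sketch simply raises an error rather than solving the problem.

```python
def solve_t_periods(T, a=6.0, b=0.2, stock=20.0, r=0.10):
    """Extraction path for T periods, assuming exhaustion and q_t > 0 for all t."""
    # First-order conditions: a - 2b*q_t = lam * (1+r)**(t-1) for t = 1..T.
    growth = [(1 + r) ** t for t in range(T)]
    # Summing the conditions and imposing sum(q_t) = stock pins down lam.
    lam = (T * a - 2 * b * stock) / sum(growth)
    path = [(a - lam * g) / (2 * b) for g in growth]
    if lam < 0 or min(path) < 0:
        raise ValueError("exhaustion with positive extraction in every period is not feasible")
    return path, lam


path2, lam2 = solve_t_periods(2)
path3, lam3 = solve_t_periods(3)
print([round(q, 3) for q in path2], round(lam2, 3))
print([round(q, 3) for q in path3], round(lam3, 3))
```

With T = 2 this reproduces the earlier solution. With a long enough horizon the assumptions break down for these parameter values, which is one symptom of the limits listed above.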
2 Dynamic Problems: A General Solution Method
2.1 Vocabulary
Dynamic problems have several defining characteristics. First, these problems
occur over time. So there will be notation in the problems that indicates time.
In this class we will use the subscript t to indicate time-specific variables. For
example, $X_t$ denotes the value of the variable X in time period t. So, $X_2$
denotes the value of X in period 2. Problems also have a time horizon, which
we will denote using the capital T. Some problems may have infinite horizons
so that $T = \infty$.
The second characteristic of dynamic problems is that they have a variable or
set of variables that describe the state of the world. These variables are called
state variables. State variables in economic problems generally describe the
stock of capital. In environmental economics this will be the stock of oil in the
ground, the stock of fish in the sea, etc. You cannot choose state variables (you
cannot choose the level of oil in the ground or the number of fish in the sea).
Rather the state variable evolves over time as a function of other choices you
do make. The capital stock evolves based on choices about investment. The
stock of oil evolves based on choices about extraction. The stock of fish evolves
based on choices about harvesting. Variables over which there is direct control
are called choice variables. So investment, extraction, and harvest rates are
all choice variables.
All dynamic problems have some equation which describes how the state
variable evolves over time as a function of the choice variables. This is called
the state equation.
Finally, in all dynamic problems there is something that we are trying to
maximize or minimize. This is called the objective function. Because the
problem occurs over time, we will generally be trying to maximize or minimize
the present value of something. So we will need to worry about the discount
rate. If you aren’t comfortable with both discrete and continuous time
discounting you will want to go back and review those notes now!
Example 1 Imagine you own a gold mine. The mine contains a fixed amount
of gold, S. The gold industry is perfectly competitive so there exists some
market price, $p_t$, which is allowed to vary over time (hence the t subscript).
The marginal cost of getting the gold out of your mine, c, is constant (it does not
increase with the amount extracted). Your profits in each period are then
given by $(p_t - c) q_t$ where $q_t$ is the amount of gold you extract in time t. You
want to determine how much gold to extract in each period, $q_t$, in order to
maximize your profits over a 10 year period.
For this example, what are the time horizon, state variable, choice variable,
objective function, and state equation?
Example 2 Imagine you own a fishing pond. The pond contains an initial
amount of fish, S. In addition, the fish reproduce at some rate which is a
function of how many fish there are, $F(S)$. Your profits from harvesting fish in
each period are given by $p_t q_t - C(q_t)$ where $q_t$ is the amount of fish extracted in
each period and $C(q_t)$ is the total cost function. You want to determine how
many fish to consume in each period in order to maximize the present value of
your pond over an infinite time horizon.
For this example, what are the time horizon, state variable, choice variable,
objective function, and state equation?
2.2 Setting Up the Problem
Now that we know the vocabulary let’s start to write down dynamic problems.
Let’s start with the gold mine example. The objective is to maximize the
present value of profits. This can be written as
\[ \max_{q_1, \ldots, q_{10}} \int_{t=0}^{10} (p_t - c) q_t e^{-rt} \, dt \]
Recall that the integral is just like a big summation sign. It just says add up
$(p_t - c) q_t e^{-rt}$ over time (which is now continuous rather than discrete). But
what does all that stuff mean? Well, $(p_t - c) q_t$ is profits in each period. To
see this, multiply through and you’ll get $p_t q_t - c q_t$. Price times quantity is
total revenue and marginal cost times quantity is total cost. So this expression
is total revenue minus total cost, which is the definition of profits. The $e^{-rt}$ is
the continuous version of the discount factor. It is the continuous-time
counterpart of $\frac{1}{(1+r)^t}$ in the
discrete problem. So taken as a whole we have an objective function that says:
choose extraction levels in each period to maximize the present value of profits
over a ten year period.
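To see how the two discount factors compare, here is a small sketch. The function names are illustrative. Note that for the same numerical r the two factors are close but not identical; they coincide exactly when the continuous rate is set to ln(1 + r).

```python
import math


def discrete_factor(r, t):
    """Discount factor 1 / (1 + r)**t used in discrete-time problems."""
    return 1.0 / (1.0 + r) ** t


def continuous_factor(r, t):
    """Discount factor e**(-r*t) used in continuous-time problems."""
    return math.exp(-r * t)


r = 0.10
for t in (1, 5, 10):
    same_r = continuous_factor(r, t)
    matched = continuous_factor(math.log(1 + r), t)  # rate chosen to match exactly
    print(t, round(discrete_factor(r, t), 4), round(same_r, 4), round(matched, 4))
```

Continuous discounting at rate r is slightly heavier than discrete discounting at the same r, because interest compounds continuously rather than once per period.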
We still have to worry about the fact that we have a fixed amount of gold.
We do this by writing down the state equation. How does the state (the stock
of gold) change over time as a function of extraction? Each period the stock is
decreased by the amount of extraction. In discrete time we might write that
as:
\[ S_{t+1} - S_t = -q_t \]
In continuous time we write:
\[ \frac{\partial S}{\partial t} = \dot{S} = -q_t \]
The other piece of the problem that we need is to state what the initial stock
level is. This is often written in generic notation as:
\[ S_0 = S \]
If the total amount of gold were equal to 100 tons then this would be written as
\[ S_0 = 100 \]
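The discrete-time state equation together with the initial condition is enough to trace out the stock over time for any extraction plan. A minimal sketch, assuming a hypothetical plan of 10 tons per year (the function name `stock_path` is also just an illustration):

```python
def stock_path(initial_stock, extraction):
    """Apply S_{t+1} - S_t = -q_t period by period, starting from S_0."""
    stocks = [initial_stock]
    for q in extraction:
        stocks.append(stocks[-1] - q)
    return stocks


# Hypothetical plan: extract 10 tons in each of 10 years from a 100 ton mine.
path = stock_path(100.0, [10.0] * 10)
print(path)
```

Notice that the stock is never chosen directly. It only moves because of the extraction choices, which is what makes it a state variable.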
The whole problem can be written as:
\[ \max_{q_1, \ldots, q_{10}} \int_{t=0}^{10} (p_t - c) q_t e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = -q_t, \qquad S_0 = S \]
I’m going to assert that the fish pond problem can be written as:
\[ \max_{q_t} \int_{t=0}^{\infty} [p_t q_t - C(q_t)] e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = -q_t + F(S_t), \qquad S_0 = S \]
Can you explain this in words?
We can write down a very general dynamic problem as follows:
\[ \max_{q_t} \int_{t=0}^{\infty} G(q_t, S_t) e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = F(q_t, S_t), \qquad S_0 = S \]
where F () and G () are just arbitrary functions.
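One way to make the general problem concrete is to evaluate it for a given plan: step through time, add up discounted values of G, and update the stock using F. The sketch below does this with a simple Riemann sum over a finite horizon. It evaluates a plan and does not find the best one. The name `evaluate_plan`, the step count, and the example numbers (a constant price of 5, cost of 2, r of 0.05) are all assumptions made for illustration.

```python
import math


def evaluate_plan(G, F, initial_stock, q_of_t, r, horizon, steps=10000):
    """Approximate the discounted objective and the final stock for a given plan."""
    dt = horizon / steps
    stock = initial_stock
    present_value = 0.0
    for i in range(steps):
        t = i * dt
        q = q_of_t(t)
        present_value += G(q, stock) * math.exp(-r * t) * dt  # objective
        stock += F(q, stock) * dt                             # state equation
    return present_value, stock


# Gold mine with hypothetical numbers: constant price 5, marginal cost 2.
pv, final_stock = evaluate_plan(
    G=lambda q, S: (5.0 - 2.0) * q,
    F=lambda q, S: -q,
    initial_stock=100.0,
    q_of_t=lambda t: 10.0,  # extract 10 tons per year
    r=0.05,
    horizon=10.0,
)
print(round(pv, 2), round(final_stock, 6))
```

Swapping in a different G and F (for example, the fish pond's profit and growth functions) reuses the same loop, which is the point of writing the problem in the general form.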
2.3 Solving Dynamic Problems: the Hamiltonian
2.3.1 Writing Down the Hamiltonian
The Hamiltonian is the Lagrangian’s big brother. Just like you memorized
the cookbook recipe for solving constrained optimization problems with the
Lagrangian, you can easily memorize the cookbook recipe for solving dynamic
optimization problems using the Hamiltonian. You’ll see that in many ways
they are quite similar, but the Hamiltonian rules are a little different.
The first step is to write down the Hamiltonian. Here are the rules. First
write down the letter H and an equals sign:
\[ H = \]
Then the first element of the Hamiltonian is the objective function WITHOUT the discount factor:
\[ H = G(q_t, S_t) \]
Then you add the "Hamiltonian multiplier" which instead we call the co-state
variable. It’s got a new name, but it plays the same role as the Lagrange
multiplier and will have the same interpretation!
\[ H = G(q_t, S_t) + \lambda_t \]
You’ll notice that the co-state variable is indexed by t. This variable will have a
different value in different periods. If you look back at the 2-period model you’ll
see that marginal user cost was different in the two periods and the intuition
for why that is true will carry over to this more complicated model.
The next step is to add in the constraint. With the Lagrangian we wrote
the constraint "right hand side minus left hand side". Here we have a bit of
a problem in that the left hand side of our constraint has a dot in it! What
do we do about that? The good news is we ignore it. We simply write the
"right hand side" of the constraint in the Hamiltonian and ignore the part of
the constraint with the dot.
\[ H = G(q_t, S_t) + \lambda_t F(q_t, S_t) \]
That’s it! Not so bad. Let’s do our two examples.
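Example 3
\[ \max_{q_1, \ldots, q_{10}} \int_{t=0}^{10} (p_t - c) q_t e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = -q_t, \qquad S_0 = S \]
\[ H = (p_t - c) q_t + \lambda_t (-q_t) \]
Example 4
\[ \max_{q_t} \int_{t=0}^{\infty} [p_t q_t - C(q_t)] e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = -q_t + F(S_t), \qquad S_0 = S \]
\[ H = [p_t q_t - C(q_t)] + \lambda_t (-q_t + F(S_t)) \]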
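The recipe for writing down the Hamiltonian is mechanical enough to express as a tiny function: take G, add the co-state variable times F. In the sketch below, the helper name `hamiltonian`, the price and cost numbers, the quadratic cost function, and the logistic growth function for the fish are all hypothetical choices made only so the example can be evaluated.

```python
def hamiltonian(G, F):
    """Build H(q, S, lam) = G(q, S) + lam * F(q, S) from the problem's two pieces."""
    def H(q, S, lam):
        return G(q, S) + lam * F(q, S)
    return H


p, c = 5.0, 2.0  # hypothetical price and marginal cost

# Gold mine: G = (p - c) q and F = -q.
H_gold = hamiltonian(lambda q, S: (p - c) * q, lambda q, S: -q)

# Fish pond: G = p q - C(q) with C(q) = q**2, F = -q + growth(S),
# where growth(S) is an assumed logistic function.
growth = lambda S: 0.5 * S * (1 - S / 100.0)
H_fish = hamiltonian(lambda q, S: p * q - q ** 2, lambda q, S: -q + growth(S))

print(H_gold(2.0, 50.0, 1.0))
print(H_fish(2.0, 50.0, 1.0))
```

Note that neither the discount factor nor the "dot" side of the constraint appears anywhere in the construction, just as in the rules above.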
2.3.2 The Necessary Conditions for the Hamiltonian
After we wrote down the Lagrangian we took the partial derivatives with respect
to all the choice variables and the Lagrange multiplier and set them equal to
zero and solved. We will do something similar for the Hamiltonian. The only
change is that we won’t set all of the partial derivatives equal to zero. This is
the cookbook part. You should just memorize these rules!
\[ H = G(q_t, S_t) + \lambda_t F(q_t, S_t) \]
1. Take the partial derivative with respect to the choice variable(s) and set
it equal to 0.
\[ \frac{\partial H}{\partial q_t} = 0 = \frac{\partial G}{\partial q_t} + \lambda_t \frac{\partial F}{\partial q_t} \]
2. Take the partial derivative with respect to the state variable(s) and set it
equal to $-\dot{\lambda} + r\lambda$. This is the part you need to memorize.
\[ \frac{\partial H}{\partial S_t} = -\dot{\lambda}_t + r\lambda_t = \frac{\partial G}{\partial S_t} + \lambda_t \frac{\partial F}{\partial S_t} \]
3. The final rule is called the Transversality Condition. It says that at the
end of the time horizon the value of the resource in the ground (or in the
sea in the case of fish) must be zero. For a finite problem that can be
written as:
\[ \lambda_{T+1} = 0 \]
and for an infinite horizon problem we write:
\[ \lim_{t \to \infty} \lambda_t = 0 \]
Let’s do our first example (the gold) only. We’ll save the fish for later in
the course.
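Example 5
\[ \max_{q_1, \ldots, q_{10}} \int_{t=0}^{10} (p_t - c) q_t e^{-rt} \, dt \]
\[ \text{s.t.} \quad \dot{S} = -q_t, \qquad S_0 = S \]
\[ H = (p_t - c) q_t + \lambda_t (-q_t) \]
\[ \frac{\partial H}{\partial q_t} = 0 = (p_t - c) - \lambda_t \qquad (1) \]
\[ \frac{\partial H}{\partial S_t} = -\dot{\lambda}_t + r\lambda_t = 0 \qquad (2) \]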
Notice that our state variable, S, does not appear in the Hamiltonian. Therefore
the partial derivative with respect to S is just like taking the derivative of a
constant. And the derivative of a constant is zero.
\[ \lambda_{11} = 0 \qquad (3) \]
Re-writing (1) yields
\[ (p_t - c) = \lambda_t \]
Re-writing (2) yields
\[ \frac{\dot{\lambda}_t}{\lambda_t} = r \]
2.3.3 Features of the Solution
Let’s look at this last equation. Whenever we have a variable with a dot divided by
the same variable without a dot, what we have is a growth rate. Think about
calculating the percentage change in price over two periods. To do that you
take the difference in price in the two periods and divide by the initial price.
What the $\dot{\lambda}_t$ represents is the change in the shadow price (just like taking the
difference between two shadow prices) and the $\lambda_t$ in the denominator is like
dividing by the original price. So what this last equation tells us is that the
growth rate of the shadow price equals the interest rate. This is the Hotelling Rule!
This is the same rule we got from the
two-period model, but now it falls right out of the math.
The equation (pt − c) = λt tells us what the shadow price is. Namely the
shadow price is the difference between the price and marginal costs. This is
the marginal user cost.
So from the Hamiltonian we get the result that the marginal user cost (or
the value of an additional unit of oil left in the ground) is growing at the interest
rate.
What we don’t know is how much gold to extract in each period. Sadly,
for this problem, the Hamiltonian does not tell us directly about the extraction
path $(q_1, \ldots, q_{10})$. Sometimes it will, but not in this case. All we know is
that marginal user cost must grow at the rate of interest. But there are many
different price paths $p_t$ that would correspond to a growth in the MUC equal
to the interest rate. In all of the three graphs below the MUC is growing at
the interest rate, but there are very different price levels associated with these
graphs.
[Figure: three panels, each plotting Price, MEC, and MUC in dollars ($) against Time up to year 10, with a different price level in each panel.]
To solve for the optimal extraction path we’ll need to use two pieces of
information. The first is the transversality condition and the second is the
demand function for gold. The transversality condition tells us that in year 11
the shadow price is zero. This is either because there is no gold left, or because
there is gold left but it has no value. For this problem the transversality
condition will hold because there is no gold left. Right now you are taking this
on faith, but we’ll do more problems where instead the resource has no value
and the difference will become clearer.
Given a downward sloping demand curve for gold, each of these price paths
will lead to different quantity demanded paths. The higher the initial price
the lower the quantity demanded. The lower the price, the higher the quantity demanded.
So only one of these price paths satisfies the Hotelling Rule AND results in just
enough gold demanded over the 10 periods to exhaust the mine. This is depicted in the graph below.
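[Figure: two panels against Time up to year 10. The first plots Price, MEC, and MUC in dollars ($). The second plots Quantity Consumed, where the total quantity consumed (area under the curve) equals the stock.]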
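The search for the one price path that both obeys the Hotelling Rule and exhausts the mine can be sketched numerically. The sketch below assumes a linear demand curve, p = a - b*q, which is NOT given in the notes, along with hypothetical numbers for a, b, c, r, and the stock. It guesses the initial MUC, lets it grow at rate r, adds up the quantity demanded over 10 years, and adjusts the guess by bisection until total extraction equals the stock.

```python
import math


def total_extraction(lam0, a, b, c, r, horizon, steps=2000):
    """Total quantity demanded over the horizon when MUC starts at lam0."""
    dt = horizon / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt                      # midpoint of each time step
        price = c + lam0 * math.exp(r * t)      # Hotelling Rule: MUC grows at rate r
        total += max(0.0, (a - price) / b) * dt  # assumed linear demand, never negative
    return total


def find_initial_muc(stock, a, b, c, r, horizon):
    """Bisection on the initial MUC so that total extraction equals the stock."""
    lo, hi = 0.0, a - c  # at lam0 = a - c nothing is ever demanded
    for _ in range(100):
        mid = (lo + hi) / 2
        if total_extraction(mid, a, b, c, r, horizon) > stock:
            lo = mid  # too much extracted: the price path must start higher
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical parameters: demand p = 10 - q, cost 2, r = 5%, stock of 50, 10 years.
lam0 = find_initial_muc(stock=50.0, a=10.0, b=1.0, c=2.0, r=0.05, horizon=10.0)
print(round(lam0, 4), round(total_extraction(lam0, 10.0, 1.0, 2.0, 0.05, 10.0), 4))
```

A higher starting MUC means a higher price path and less gold demanded in total, so exactly one starting value uses up the stock. That is the same logic as picking the one graph whose area under the quantity curve equals the stock.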