Introductory maths for MA Economics/IBE

1. Functional relationships

We frequently need to represent economic quantities as a mathematical function of one or more variables, e.g.:
• A firm's costs as a function of output
• Market demand for a good in terms of price and other factors
• A consumer's utility in terms of quantities of different goods purchased.

1.1 A function is a rule that takes one or more numbers as inputs (arguments) and gives another number as output, e.g.

f(x) = x^2

Here, "f" is the name given to the function, x represents the input, and x^2 tells us how to calculate the answer in terms of the input. E.g. f(2) = 2^2 = 2 × 2 = 4, f(5) = 5^2 = 5 × 5 = 25, etc. A function can be seen as a 'black box' for converting numbers into other numbers:

[Diagram: a box taking the input 2 to the output 4, and the input 5 to the output 25.]

The variable or number in brackets is referred to as the argument of the function. If we want to write another variable as a function of x, we may write y = f(x), or just (in this case) y = x^2. E.g. we could express a demand function by

Q = 20 − 5P

where Q is quantity demanded and P is price. Thus quantity is expressed as a function of price.

1.2 Functions of more than one variable

We may have functions of two or more variables. For example,

F(x, y) = xy + 2x

This function requires two input numbers to produce an answer. For example, F(4, 5) = 4×5 + 2×4 = 28. Here x is 4 and y is 5, and the function has two arguments. In economics, for example, a firm's output may depend on its inputs of labour (L) and capital (K), as in a Cobb-Douglas type production function:

Q = 100L^0.5 K^0.5

where Q is units of output, L is the number of workers and K is, perhaps, the number of machines. For example, if L = 9 and K = 16, then Q = 100 × 9^0.5 × 16^0.5 = 100 × 3 × 4 = 1,200 units.

1.3 Graphs of functions

Functions of one variable can easily be represented on a graph. E.g. F(X) = X^2 has a graph something like:

[Graph: the parabola F(X) = X^2 plotted against X.]

The simplest sorts of functions are linear. These are of the form

Y = a + bX

where a and b are constants, e.g. Y = 5 − 2X. The graphs of these functions are straight lines. For example, if we have the demand function Q = 20 − 5P then, because we always put price on the vertical axis, we first need to get P onto the left-hand side of the equation. Thus:

Q + 5P = 20
5P = 20 − Q
P = (20/5) − (Q/5)
P = 4 − Q/5

[Graph: a downward-sloping straight demand line, with intercept P = 4 on the vertical axis and Q = 20 on the horizontal axis.]

With functions of two variables, the graph will in fact be a three-dimensional surface. For example, the function F(X, Y) = X^2 + Y^2 will be a dome shape, which it may be possible to sketch. Another way of graphically representing functions of two variables is by a contour map, which draws curves showing combinations of values of X and Y that give the same value of F(X, Y). For example, an indifference curve map is simply the contour map of a utility function. To give another example, let F(X, Y) = X^2 + Y^2. Then the contour map for F(X, Y) consists of a series of concentric circles around the origin, with circles further out representing higher values of F(X, Y):

[Contour map: concentric circles around the origin labelled F(X,Y) = 1, F(X,Y) = 4 and F(X,Y) = 9.]

Of course, it is not really possible to draw graphs of functions of more than two variables.

2. Finite and infinite sums

Finite sums: notation

∑_{i=1}^{n} a_i means a_1 + a_2 + … + a_n

Very often, a_i will be defined by some formula or function of i. E.g.

∑_{i=1}^{10} i^2 means 1^2 + 2^2 + … + 10^2. (In this case, a_i = i^2.)

Here i is called the index variable, and a_i the summand, that is, the thing being summed.

∑_{x∈A} a_x, where A is a set, means we add up a_x over all values of x belonging to the set A.
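This summation notation translates directly into code. A minimal Python sketch (the summand a below is an arbitrary illustrative choice, not from the notes):

```python
# Finite sum with a formula for the summand: sum_{i=1}^{10} i^2
print(sum(i**2 for i in range(1, 11)))   # 385

# Sum over a set A: sum_{x in A} a_x
A = {1, 4, 7}
a = lambda x: 2 * x + 1                  # an arbitrary summand a_x
print(sum(a(x) for x in A))              # 3 + 9 + 15 = 27

# The Cobb-Douglas example from section 1.2
q = lambda l, k: 100 * l**0.5 * k**0.5
print(q(9, 16))                          # 1200.0
```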
We may also sum over all values of x satisfying a certain condition. E.g.

∑_{x=1,…,10; x odd} x^2 means 1^2 + 3^2 + … + 9^2.

Or, suppose we have a set A of people, and we wish to add up their (say) incomes. We could write

∑_{x∈A} Y(x), where Y(x) is the income of individual x.

Infinite sums: notation

∑_{n=1}^{∞} F(n) means the infinite sum F(1) + F(2) + F(3) + …

This may or may not sum to a finite total.

Some tricks for dealing with summations

A particularly common type of sum (finite or infinite) is a geometric progression, where the ratio between each term and the next is a constant. That is, a series of the form

∑_{k=0}^{n} ar^k or ∑_{k=0}^{∞} ar^k

(the first a finite sum, the second an infinite one). Let's deal with the finite sum first. Let

S = ∑_{k=0}^{n} ar^k = a + ar + ar^2 + … + ar^n

Then rS = ar + ar^2 + … + ar^n + ar^(n+1)

So S − rS = a − ar^(n+1)

So S(1 − r) = a(1 − r^(n+1))

Hence S = a(1 − r^(n+1))/(1 − r) (unless r = 1, in which case of course the sum is simply a(n+1)).

We can see that, if |r| < 1, that is if −1 < r < 1, then the bracketed term in the numerator will get closer and closer to 1 as n gets larger. On the other hand, if |r| > 1, then the bracketed term will get larger and larger in magnitude as n increases. This suggests when and how we can calculate the infinite sum. Let

S = a + ar + ar^2 + ar^3 + … (infinite sum)

So rS = ar + ar^2 + ar^3 + …

Hence S(1 − r) = a, whence S = a/(1 − r).

However, this sum will only be valid in the case |r| < 1.

Formal definition

Let ∑_{k=1}^{∞} a_k be an infinite sum. We define the n-th partial sum as

S_n = a_1 + … + a_n

that is, the sum of the first n terms of the infinite series. We say that the infinite sum converges if the sequence of partial sums S_1, S_2, S_3, … tends to some limit S. This limit S is then defined to be the sum of the infinite series. Otherwise, we say the infinite sum diverges, and has no value. In other words, if, when we add on successive terms of the series, we get closer and closer to some limiting value (I will not define precisely what is meant by this), then this limiting value is taken to be the sum of the series.

Example

Let a_n = 0.5^n. Consider the infinite series ∑_{n=0}^{∞} a_n, that is,

1 + (1/2) + (1/4) + (1/8) + (1/16) + …

In our formula for geometric progressions, we have a = 1 and r = 0.5. Since |r| < 1, we can say that the infinite sum is equal to 1/(1 − 0.5) = 2. Now consider the partial sums. We have S_0 = 1, S_1 = 1 + 1/2 = 1.5, S_2 = 1 + 1/2 + 1/4 = 1.75, S_3 = 1 + 1/2 + 1/4 + 1/8 = 1.875, etc. It is intuitively easy to see (and can be proven) that this series gets closer and closer to 2 as n increases (though never quite reaching 2). Hence, we say that ∑_{n=0}^{∞} a_n = 2 as an infinite sum.

On the other hand, if we had r = 2, so that our infinite sum was 1 + 2 + 4 + 8 + 16 + …, then the partial sums would go 1, 3, 7, 15, 31, etc. These partial sums are clearly not converging to any finite total, but are just increasing towards infinity. Hence this infinite series diverges, and we cannot give a value to the infinite sum.
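Convergence and divergence of the partial sums can be watched directly. A small Python sketch (the function name is ours):

```python
def partial_sum(a, r, n):
    """S_n = a + ar + ... + ar^n, summed term by term."""
    return sum(a * r**k for k in range(n + 1))

# Convergent case: a = 1, r = 0.5; the closed form gives 1/(1 - 0.5) = 2
for n in (0, 1, 2, 3, 10, 30):
    print(n, partial_sum(1, 0.5, n))   # 1, 1.5, 1.75, 1.875, ... -> 2

# Divergent case: r = 2; partial sums 1, 3, 7, 15, 31, ... grow without bound
print([partial_sum(1, 2, n) for n in range(5)])
```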
3. Differentials, slopes, and rates of change

Economics is frequently concerned with marginal effects: marginal cost, marginal utility, marginal revenue, etc. When relationships between variables are expressed in functional terms, the marginal effect is the rate of change of the function with respect to the argument. So if costs are C = C(Q), where Q is output, then marginal cost is the rate of change of the function as Q changes. This is the same as the slope of the function on a graph. E.g. if we have a function F(X) = 3 + 2X, with the graph:

[Graph: the straight line F(X) = 3 + 2X, with intercept 3 and slope 2.]

Each increase of 1 in X leads to an increase of 2 in F(X). The slope of the line is 2. We could say that the marginal increase in F for a change in X is 2. Straight lines are easy, as the slope is always the same: the slope of the above graph is 2 for all values of X.

When we have a non-linear function, for example F(X) = X^2, the slope of the graph, and therefore the rate of change of F(X), varies depending on the value of X. We can measure the rate of change, or the marginal effect, in two different ways. First, and most easily, by looking at the change in X and the change in F(X) between two points: for example, between X = 2 and X = 3, F(X) goes from 4 to 9, so the rate of change is (9 − 4)/(3 − 2) = 5. But it is more precise to measure the rate of change at a particular point. We do this by looking at the slope of the tangent line to the curve at the point we're interested in. We can also see that, if we are taking the slope between two points, the nearer these points are together, the closer the slope is to the slope of the tangent line.

The rate of change at a particular point X, that is, the slope of the tangent line, is also known as the differential of the function F(X) at X. We write the differential at a point X as F'(X). (For example, at X = 1, we write the slope as F'(1).) In general, the differential changes as X changes, and so is itself a function of X, written F'(X). If we have Y = F(X), then we also write the differential as dY/dX.

Formally, the differential of F(X) at the point X = a is given by

F'(a) = Lim_{X→a} [F(X) − F(a)]/(X − a)

where Lim here means "limit".

3.2 Rules for differentiation

There are some fairly simple rules for differentiating all the basic functions you are likely to meet in this course.

1) Constant functions: F(X) = a, where a is a constant, e.g. F(X) = 3. These are flat, so they have slope 0: F'(X) = 0.
2) Linear functions: if F(X) = a + bX, then F'(X) = b. (Linear functions have a constant slope.)
3) If F(X) = X^n, where n is any number (positive or negative, not necessarily an integer (whole number)), then F'(X) = nX^(n−1).
4) If F(X) = Ln(X), where Ln is the natural logarithm, then F'(X) = 1/X.
5) If F(X) = e^X (the exponential function), then F'(X) = e^X.
6) If F(X) = sin(X), then F'(X) = cos(X). If F(X) = cos(X), then F'(X) = −sin(X).

For example (case 3), if Y = X^2, then dY/dX = 2X; in other words, the slope increases as X increases, as we can see from the graph.

3.3 Rules for combining functions

1) Addition of functions: if F(X) = G(X) + H(X), then F'(X) = G'(X) + H'(X).
2) Multiplication by a constant: if a is a constant, then the differential of aF(X) is aF'(X). (E.g. the differential of 2X^2 is 2×2X = 4X.)
3) Multiplication of functions: if F(X) = G(X)H(X), then F'(X) = G(X)H'(X) + G'(X)H(X).
4) Division of functions: if F(X) = G(X)/H(X), then

F'(X) = [H(X)G'(X) − G(X)H'(X)] / [H(X)]^2

For example, if Y = (X + 3)(3 − 2X), we let G(X) = X + 3 and H(X) = 3 − 2X. Then G'(X) = 1 and H'(X) = −2. Thus, dY/dX = (X + 3)×(−2) + 1×(3 − 2X) = −2X − 6 + 3 − 2X = −4X − 3.

5) Function of a function: if F(X) = G(H(X)), then F'(X) = G'(H(X))H'(X).

For example, if F(X) = e^(X^2), we let G(.) = e^(.) and H(X) = X^2. Now G'(.) = e^(.), so G'(H(X)) = e^(H(X)) = e^(X^2). Also H'(X) = 2X, so F'(X) = G'(H(X))H'(X) = e^(X^2) × 2X.
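These rules can be spot-checked numerically with a finite-difference approximation to the limit definition above. A minimal Python sketch (the helper name deriv is ours):

```python
import math

def deriv(f, x, h=1e-6):
    # central-difference approximation to F'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

# Product rule example: Y = (X+3)(3-2X) has dY/dX = -4X - 3
f = lambda x: (x + 3) * (3 - 2 * x)
print(deriv(f, 2.0), -4 * 2.0 - 3)        # both approximately -11

# Chain rule example: F(X) = e^(X^2) has F'(X) = e^(X^2) * 2X
g = lambda x: math.exp(x**2)
print(deriv(g, 1.0), math.exp(1.0) * 2)   # both approximately 5.4366
```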
4. Optimisation in one variable

We are frequently interested in maximising or minimising a quantity, e.g. maximising profits or utility, or minimising costs. This can be done using differentiation. A function is at its maximum or minimum value when it stops rising and starts falling, or vice versa. When a function moves from rising to falling (or vice versa), there will be a momentary stationary point where it is not changing. That is, at a local maximum or local minimum of a function, the differential F'(X) will be equal to 0 (i.e. the tangent line is flat).

We say a local maximum or minimum, because it may not be the globally highest or lowest point. Stationary points can also be points of inflexion, where the function flattens out and then continues in the same direction.

Example

Suppose a firm faces a demand curve given by

Q = 20 − 3P

where Q is quantity and P is price. How can the firm maximise revenue? Well, revenue is price × quantity, PQ, which is equal to P(20 − 3P) = 20P − 3P^2. So, let

F(P) = 20P − 3P^2

Then F'(P) = 20 − 6P. A stationary point will come when F'(P) = 0, i.e. when 20 − 6P = 0. Therefore 20 = 6P, so P = 20/6 = 3.333. At this value, Q = 20 − 3P = 10, so revenue = 10 × 3.333 = 33 and a third.

4.2 Classifying stationary points

How can we be sure (apart from the graph) that this is a maximum and not a minimum or a point of inflexion? We do this by looking at the second differential, that is, the differential of the differential, which we write F''(X) (or d²Y/dX²). E.g. if F(X) = X^3, then F'(X) = 3X^2, so F''(X) = 3×2X = 6X. This is the rate of change of the rate of change.

Now at a maximum, the rate of change starts positive, goes to zero, then goes negative; so the rate of change is going down, and so the rate of change of the rate of change is negative. In other words:

If F''(X) < 0 at a stationary point, then the point is a local maximum.

The opposite holds at a minimum, so:

If F''(X) > 0 at a stationary point, the point is a local minimum.

Now consider the case of our company, where the revenue function was F(P) = 20P − 3P^2, with F'(P) = 20 − 6P and a stationary point at P = 3.333. Here F''(P) = −6. This is negative at the stationary point (indeed at all values of P), and so the point is a local maximum.

If F''(X) = 0 at a stationary point, the point could be a maximum, minimum or point of inflexion. Specifically, look at successive differentials (F'''(X), F⁽⁴⁾(X), etc.):

• If the first non-zero differential at the stationary point is of odd order (e.g. the 3rd or 5th differential), then the stationary point is a point of inflexion.
• If the first non-zero differential at the stationary point is of even order and negative, then the stationary point is a local maximum.
• If the first non-zero differential at the stationary point is of even order and positive, then the stationary point is a local minimum.

Note that the conditions for a minimum (whether in functions of one or more variables) will always be a mirror image of the conditions for a maximum. This can easily be seen, since minimising the function F(X) is the same as maximising the function −F(X). The same holds true for functions of more than one variable.
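The classification test can be illustrated numerically on the revenue example. A sketch, using our own finite-difference helpers (not part of the course material):

```python
def revenue(p):
    return 20 * p - 3 * p**2

def d1(f, x, h=1e-6):
    # first derivative, central difference
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x, h=1e-4):
    # second derivative, central difference
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

p_star = 20 / 6
print(d1(revenue, p_star))   # ~ 0: a stationary point
print(d2(revenue, p_star))   # ~ -6 < 0: a local maximum
print(revenue(p_star))       # ~ 33.33, the maximum revenue
```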
4.3 Distinguishing a global maximum or minimum

In general, the global maximum or minimum can occur at any of the local maxima and minima, or at a corner solution: the lowest or highest possible value of the variable. (For example, a company's profits may be highest when output is zero.) It may be necessary to look at all maxima/minima and all possible corner solutions to find the best.

However, there are certain cases where we can be sure a local maximum/minimum is the global maximum/minimum:

If F''(X) < 0 for the full range of values a function can take, then any local maximum is the global maximum. (We say such a function is concave.)

If F''(X) > 0 for the full range of values a function can take, then any local minimum is the global minimum. (We say such a function is convex.)

In the case we considered, we found F''(P) = −6, which is < 0 for all possible values (0 to infinity), so the local maximum we found must be a global maximum.

4.4 Non-negativity constraints

In actual economic problems, we will frequently require that our variables should not take negative values. For example, it would not be of much use to a company to work out that its optimal number of workers is negative. Suppose therefore that we are maximising the function F(X) subject to the condition X ≥ 0. The (global) maximum value of F(X) must occur either where dF/dX = 0 and d²F/dX² < 0, or where X = 0 and dF/dX ≤ 0. Similarly, the minimum value must occur either where dF/dX = 0 and d²F/dX² > 0, or where X = 0 and dF/dX ≥ 0. We can see the reasons for this on the graph below:

[Graph: a function of X with a local maximum at the boundary X = 0, where dF/dX < 0, and an interior local maximum where dF/dX = 0.]

A maximum at X = 0 is known as a boundary solution; one where X > 0 is an interior solution. The concavity/convexity condition that guarantees that a local maximum/minimum will be a global maximum/minimum remains, for either type of local optimum.

Example: marginal costs and marginal revenue

We know that a company maximises profits when marginal cost equals marginal revenue (MC = MR). This can be analysed in terms of calculus. Suppose a company has a revenue function R(Q), where Q is output, and a cost function C(Q). Then the profit function Π(Q) can be written

Π(Q) = R(Q) − C(Q)

Differentiating, Π'(Q) = R'(Q) − C'(Q). This will have a stationary point where Π'(Q) = 0, so R'(Q) − C'(Q) = 0, and hence:

R'(Q) = C'(Q)

But R'(Q) is the rate of change of revenue as output increases, in other words the marginal revenue, and C'(Q) is the rate of change of costs, in other words the marginal cost. Hence the equation we have tells us that MC = MR.

5. Optimising functions of more than one variable

5.1 Partial differentiation

When we have a function of more than one variable, we can use partial differentiation to find the rate of change of the function with respect to any of the variables. Let F(X, Y) be a function of two variables. Then the partial differential of F with respect to X, written ∂F/∂X, is obtained simply by differentiating F(X, Y) with respect to X, holding Y constant, i.e. treating the function as if it were a function only of X, with Y a constant parameter. Similarly, the partial differential of F with respect to Y, ∂F/∂Y, is obtained by differentiating F with respect to Y, treating X as a constant. Formally, the partial differentials at the point (a, b) are given by:

∂F/∂X(a, b) = Lim_{X→a} [F(X, b) − F(a, b)]/(X − a)
∂F/∂Y(a, b) = Lim_{Y→b} [F(a, Y) − F(a, b)]/(Y − b)

Example

Consider a Cobb-Douglas production function, given by Q = aK^α L^β, where Q is output, K is capital and L is labour, and α and β are constants. Then

∂Q/∂K = aαK^(α−1) L^β and ∂Q/∂L = aβK^α L^(β−1)

We may of course have functions of any number of variables, for example F(X, Y, Z), a function of 3 variables. We may take partial differentials with respect to any of the variables.
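Partial differentials can be checked the same way as ordinary ones: vary one argument, hold the other fixed. A small Python sketch of the Cobb-Douglas example (parameter values a = 100, α = β = 0.5 are taken from section 1.2):

```python
def q(k, l, a=100.0, alpha=0.5, beta=0.5):
    # Cobb-Douglas production function Q = a * K^alpha * L^beta
    return a * k**alpha * l**beta

def dq_dk(k, l, h=1e-6):
    # partial differential wrt K: vary K, hold L fixed
    return (q(k + h, l) - q(k - h, l)) / (2 * h)

k0, l0 = 16.0, 9.0
print(dq_dk(k0, l0))                          # ~ 37.5
print(100 * 0.5 * k0**(0.5 - 1) * l0**0.5)    # analytic value: 37.5
```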
5.2 Stationary points of functions of two variables

Simple optimisation in two variables is quite similar to the one-variable case. A stationary point occurs when all partial differentials are equal to zero; in other words, where the function is momentarily flat with respect to changes in any variable. This can be a local maximum, a local minimum, or a saddle point. To find out the nature of the stationary points, we need to look at the second partial derivatives at the stationary point.

Second partial differentials

If F(X, Y) is a function of two variables, we may define the second partial derivatives as follows:

The second partial derivative of F with respect to X is ∂²F/∂X² = ∂/∂X(∂F/∂X); that is, we differentiate ∂F/∂X with respect to X.

The second partial derivative of F with respect to Y is ∂²F/∂Y² = ∂/∂Y(∂F/∂Y); that is, we differentiate ∂F/∂Y with respect to Y.

The cross-partial derivative of F with respect to X and Y is

∂²F/∂X∂Y = ∂²F/∂Y∂X = ∂/∂X(∂F/∂Y) = ∂/∂Y(∂F/∂X)

That is, we can either differentiate ∂F/∂X with respect to Y, or differentiate ∂F/∂Y with respect to X; the result is always the same.

Example

Continuing with the Cobb-Douglas production function Q = aK^α L^β, we had ∂Q/∂K = aαK^(α−1)L^β and ∂Q/∂L = aβK^α L^(β−1). Then

∂²Q/∂K² = aα(α − 1)K^(α−2)L^β, ∂²Q/∂L² = aβ(β − 1)K^α L^(β−2), and ∂²Q/∂K∂L = aαβK^(α−1)L^(β−1)

Note that the last result is the same whichever order we perform the two differentiations in.

5.3 Classifying stationary points of functions of two variables

The nature of a stationary point of a function of two variables depends, unfortunately, on all the second partial derivatives. Suppose F(X, Y) has a stationary point at (a, b). Let

A = ∂²F/∂X²(a, b), B = ∂²F/∂Y²(a, b) and C = ∂²F/∂X∂Y(a, b)

Then (a, b) is a local maximum if A < 0 and AB − C² > 0, a local minimum if A > 0 and AB − C² > 0, and a saddle point if AB − C² < 0. (The test is indeterminate if AB − C² = 0.) A saddle point will appear to be a local maximum from some directions, and a local minimum from others, like a saddle.

Example

Let F(X, Y) = X² − 2Y² + 6XY − 4X + 3Y. Then

∂F/∂X = 2X + 6Y − 4 and ∂F/∂Y = −4Y + 6X + 3

Setting these both to zero to find the stationary points gives

X + 3Y = 2
6X − 4Y = −3

whence Y = 15/22 and X = −1/22. Now ∂²F/∂X² = 2, ∂²F/∂Y² = −4 and ∂²F/∂X∂Y = 6, so

AB − C² = 2×(−4) − 6² = −44 < 0 (for all values of X and Y),

which means that the stationary point is a saddle point.

5.4 Convex and concave functions

As in the single-variable case, the problem of finding a global maximum or minimum can be more difficult than finding a local optimum. Global optima can occur either at one of the local optima, or at a corner solution. However, the picture is again clearer for convex and concave functions.

A function F(X, Y) is said to be convex over a range of values A of X and Y if at all points (x, y) in A we have

(∂²F/∂X²)(∂²F/∂Y²) − (∂²F/∂X∂Y)² ≥ 0 and ∂²F/∂X² > 0,

and concave if at all points (x, y) in A we have

(∂²F/∂X²)(∂²F/∂Y²) − (∂²F/∂X∂Y)² ≥ 0 and ∂²F/∂X² < 0,

with all the second partials evaluated at (x, y). These definitions lead to the following results: if a function F(X, Y) is convex over a region (range of values) A, then any local minimum in A is a global minimum for that region. If F(X, Y) is concave on A, then any local maximum is a global maximum on A.
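The saddle-point example can be checked mechanically. A sketch using the sympy symbolic algebra library (assumed available; any computer algebra system would do):

```python
import sympy as sp

x, y = sp.symbols('x y')
F = x**2 - 2*y**2 + 6*x*y - 4*x + 3*y

# first-order conditions: both partial differentials equal to zero
stationary = sp.solve([sp.diff(F, x), sp.diff(F, y)], [x, y])
print(stationary)            # {x: -1/22, y: 15/22}

# second-order test: A*B - C^2
A = sp.diff(F, x, 2)         # 2
B = sp.diff(F, y, 2)         # -4
C = sp.diff(F, x, y)         # 6
print(A * B - C**2)          # -44 < 0, so a saddle point
```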
5.5 Non-negativity constraints

If we are seeking to maximise F(X, Y) subject to the conditions X ≥ 0 and Y ≥ 0, conditions analogous to the one-variable case apply: at the maximum value of F(X, Y), we must have ∂F/∂X ≤ 0, with ∂F/∂X = 0 if X > 0, and ∂F/∂Y ≤ 0, with ∂F/∂Y = 0 if Y > 0.

Note that these are not sufficient conditions for a local maximum (we could have a local minimum or a saddle), and certainly not for a global maximum, so in general we may have to check a number of different possibilities. While we can check whether we have a local maximum, minimum or saddle using second derivatives for an interior solution (where X and Y are both greater than 0), this is not so straightforward where one variable is equal to zero. The conditions for the minimum value are analogous, remembering that minimising F(X, Y) is the same as maximising −F(X, Y).

A solution to an optimisation problem with non-negativity constraints where one of the variables is equal to zero is again known as a boundary solution. A solution with all variables strictly greater than zero is an interior solution.

5.6 Functions of several variables

We shall look briefly at the question of finding and classifying stationary points of functions of more than two variables. The process is entirely analogous, but requires the machinery of matrix algebra, which we are not covering here. Let F(X_1, …, X_n) be a function of n variables. A stationary point of F will occur where

∂F/∂X_1 = … = ∂F/∂X_n = 0

To decide what type of stationary point we have, we need to look at the Hessian matrix of second partial derivatives. This is an n by n array or matrix whose (i, j) entry is ∂²F/∂X_i∂X_j:

HF(X_1, …, X_n) =
[ ∂²F/∂X_1²       ∂²F/∂X_1∂X_2   …   ∂²F/∂X_1∂X_n ]
[ ∂²F/∂X_1∂X_2    ∂²F/∂X_2²      …   ∂²F/∂X_2∂X_n ]
[ …               …              …   …            ]
[ ∂²F/∂X_1∂X_n    ∂²F/∂X_2∂X_n   …   ∂²F/∂X_n²    ]

The type of stationary point will depend on the properties of the Hessian matrix at that point, but the details are beyond the scope of this course.

6. Constrained optimisation

Problems in economics typically involve maximising some quantity, such as utility or profit, subject to a constraint, for example income. We shall therefore need techniques for solving such constrained optimisation problems. Typically, we will have an objective function F(X_1, X_2, …, X_n), where X_1, …, X_n are the choice variables, and one or more constraint functions G_1(X_1, X_2, …, X_n), …, G_k(X_1, X_2, …, X_n). The problem is typically formulated as:

Maximise/minimise F(X_1, X_2, …, X_n)
subject to G_1(X_1, X_2, …, X_n) ≤ 0, G_2(X_1, X_2, …, X_n) ≤ 0, …, G_k(X_1, X_2, …, X_n) ≤ 0.

In this section, we will consider techniques for solving problems of this type.

6.1 Constrained optimisation in one variable

We will start by considering constrained optimisation problems in one variable. For example, consider the problem:

Maximise F(x) = 4 + 3x − x²
subject to the condition x ≤ 2

We can rewrite the constraint as G(x) = x − 2 ≤ 0, to get it into the form described above. We can easily solve this problem using differentiation, and see the solution graphically:

[Graph: the concave curve F(x) = 4 + 3x − x², with its peak of 6.25 at x = 1.5 and the constraint boundary at x = 2.]

We have dF/dx = 3 − 2x. Setting this to 0 gives x = 1.5, F(x) = 6.25, and consideration of the second differential shows this is a local maximum. The second differential is equal to −2, so the function is concave for all real values, and so this is a global maximum. Finally, the resulting value of x is within the constraint, so this is the solution to the constrained optimisation problem as well as to the unconstrained problem. In this case, the constraint x ≤ 2 is non-binding or slack.

Suppose that instead we had imposed the constraint G(x) = x − 1 ≤ 0, i.e. x ≤ 1.

[Graph: the same curve with the constraint boundary at x = 1, to the left of the unconstrained peak at x = 1.5.]

We can now see from the graph that the optimal solution is x* = 1, giving F(x) = 6. This time the constraint is binding.
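Because this F is concave, the constrained optimum is either the unconstrained peak or the boundary, whichever is smaller. A minimal Python sketch of that logic (the function name is ours):

```python
def F(x):
    return 4 + 3 * x - x**2

def maximise_subject_to_cap(cap):
    # F is concave (F'' = -2 < 0), so the unconstrained optimum from
    # dF/dx = 3 - 2x = 0 is x = 1.5; if the constraint x <= cap cuts
    # this off, the constrained optimum sits on the boundary.
    x_star = min(1.5, cap)
    return x_star, F(x_star)

print(maximise_subject_to_cap(2))   # (1.5, 6.25): constraint slack
print(maximise_subject_to_cap(1))   # (1, 6): constraint binding
```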
Although it is easy to see what is happening in this case, in general we need to be able to distinguish between binding and non-binding constraints.

6.2 Constrained optimisation in more than one variable: the method of Lagrange multipliers

The most important method for solving constrained optimisation problems in more than one variable is the method of Lagrange multipliers. Consider the problem of a consumer seeking to maximise their utility subject to a budget constraint. They must divide their income M between food (F) and clothes (C), with prices PF and PC, so as to maximise the following 'Stone-Geary' utility function:

U(F, C) = αLn(F − F0) + (1 − α)Ln(C − C0)

Their budget constraint can be written

G(F, C) = PF·F + PC·C − M = 0

Our problem is to maximise U(F, C), subject to the constraint G(F, C) = 0. To solve this, we introduce an auxiliary variable λ, the Lagrange multiplier, and form the Lagrangian function

L(F, C, λ) = U(F, C) − λG(F, C)

To maximise U(F, C) subject to our constraint, we instead solve the unconstrained maximisation problem for L(F, C, λ). To do this, we must set all three partial derivatives of L to zero. Thus:

1) ∂L/∂F = α/(F − F0) − λPF = 0
2) ∂L/∂C = (1 − α)/(C − C0) − λPC = 0
3) ∂L/∂λ = −(PF·F + PC·C − M) = 0

The third condition is of course simply the original constraint. It is worth taking a moment to look at the economic significance of this approach. We can rewrite equations 1) and 2) to say that ∂U/∂F = λPF and ∂U/∂C = λPC, whereupon, eliminating λ, we get:

(∂U/∂F)/(∂U/∂C) = PF/PC

In other words, the ratio of marginal utility to price must be the same for both goods. This is a familiar result from elementary consumer choice theory, and illustrative of a general economic principle: an economic quantity (utility, output, profits, etc.) is optimised where the ratio of marginal benefits of different uses of resources is equal to the ratio of marginal costs. Solving, we obtain:

F = F0 + α(M − PC·C0 − PF·F0)/PF
C = C0 + (1 − α)(M − PC·C0 − PF·F0)/PC

This says that, after the minimum quantities F0 and C0 have been bought, remaining spending is allocated in the proportions α : (1 − α) between food and clothing. This is of course a particular property of this utility function, rather than any general law. We also obtain

λ = 1/(M − PF·F0 − PC·C0)

What does λ signify? Well, if we feed our solutions for F and C back into the utility function, we find that

U* = αLn(α(M − PC·C0 − PF·F0)/PF) + (1 − α)Ln((1 − α)(M − PC·C0 − PF·F0)/PC)

which can be rearranged to give

U* = α(Ln(α) − Ln(PF)) + (1 − α)(Ln(1 − α) − Ln(PC)) + Ln(M − PF·F0 − PC·C0)

whereupon ∂U*/∂M = 1/(M − PF·F0 − PC·C0) = λ. Thus, λ gives the marginal utility from extra income. More generally, the Lagrange multiplier λ gives the marginal increase in the objective function from a unit relaxation of the constraint.
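The demand functions just derived are easy to verify numerically. A Python sketch at illustrative, made-up parameter values (α, F0, C0, PF, PC and M below are ours, chosen only for the check):

```python
# Illustrative parameters (made up for this check)
alpha, F0, C0 = 0.4, 1.0, 2.0
PF, PC, M = 2.0, 5.0, 40.0

supernumerary = M - PF * F0 - PC * C0       # income left after the minima
F = F0 + alpha * supernumerary / PF
C = C0 + (1 - alpha) * supernumerary / PC

# The budget constraint should hold exactly
print(PF * F + PC * C)                      # 40.0

# The ratio of marginal utilities should equal the price ratio
mu_F = alpha / (F - F0)
mu_C = (1 - alpha) / (C - C0)
print(mu_F / mu_C, PF / PC)                 # both 0.4
```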
6.3 Lagrange multipliers: a formal treatment

We now extend the treatment of Lagrange multipliers to functions of several variables, and allow for both non-negativity constraints and non-binding constraints. Thus, we consider the following problem:

Maximise F(X_1, …, X_n)
subject to G_1(X_1, …, X_n) ≤ 0, …, G_k(X_1, …, X_n) ≤ 0
and X_i ≥ 0 for each i = 1, …, n.

Thus we have n variables and k constraints, each of which may be binding or non-binding. We also have n non-negativity constraints. We form the Lagrangian:

L(X_1, …, X_n, λ_1, …, λ_k) = F(X_1, …, X_n) − λ_1·G_1(X_1, …, X_n) − … − λ_k·G_k(X_1, …, X_n)

Note there are now k Lagrange multipliers, one for each constraint. The Kuhn-Tucker theorem states that, at the optimal solution (X_1*, …, X_n*) where F takes its maximum value, there exist values λ_1*, …, λ_k* for λ_1, …, λ_k such that:

1) For each X_i, ∂L/∂X_i ≤ 0, with equality if X_i > 0.
2) For each j = 1, …, k: λ_j* ≥ 0, G_j(X_1*, …, X_n*) ≤ 0, and either λ_j* = 0 or G_j(X_1*, …, X_n*) = 0.

The second condition is worth looking at more closely. It says, first of all, that the Lagrange multiplier must always take a non-negative value (this is natural if we consider the role of the multiplier as the marginal benefit from relaxing the constraint, which cannot be negative); secondly, that the constraint must be satisfied; and thirdly, that either the constraint must be exactly satisfied (a binding constraint), or the value of the multiplier must be zero, in which case we have a slack constraint. Again this is natural, since if the constraint is slack, then there is no marginal benefit from relaxing it.

Note that these are necessary conditions for the existence of a local maximum. It is possible to state sufficient conditions that specify cases where we can guarantee that a point satisfying conditions 1) and 2) will be a global maximum, but these conditions are quite complex, and beyond the scope of this course. In general, it may be necessary to look at all the different possible combinations of binding and slack constraints, and of boundary and interior solutions.

Exact constraints

If one of the constraints is exact, that is, requiring G(X_1, …, X_n) = 0, then condition 2) for this constraint does not apply; instead it is required, of course, that the constraint is satisfied.

Non-negativity conditions

We have framed the problem on the assumption that all the variables must be non-negative. If a particular variable X_i does not have to be non-negative, then condition 1) for that variable simply becomes ∂L/∂X_i = 0.

Constrained minimisation

We have formulated the Kuhn-Tucker theorem in terms of maximising a function. Of course, it is possible to minimise a function F(X_1, …, X_n) by maximising −F(X_1, …, X_n). More usually, however, we solve a minimisation problem by forming the Lagrangian as

L(X_1, …, X_n, λ_1, …, λ_k) = F(X_1, …, X_n) + λ_1·G_1(X_1, …, X_n) + … + λ_k·G_k(X_1, …, X_n),

and proceeding as above.

Example

A manufacturing firm produces two models of widget, A and B. Let X and Y denote the quantities of models A and B produced in a week, respectively. Model A requires 2 hours of machine time per item, while model B requires 1.5 hours of machine time. Each hour of machine time costs £2, whether for type A or type B. The total labour and material cost of producing X units of type A is 4X − 0.1X² + 0.02X³, while for Y units of type B the cost is 4.5Y − 0.1Y² + 0.02Y³. The two are strong substitutes, so the demand curves for types A and B are given by X = 80 − 0.5PA + 0.3PB and Y = 70 + 0.25PA − 0.4PB, where PA and PB are the prices in pounds of A and B respectively. There are two constraints on (short-term) production: first, there is a maximum of 80 hours of available machine time per week (the rest being required for maintenance); and second, the firm is under contractual obligations to produce a total of at least 40 widgets per week. What are the optimal quantities of types A and B for the firm to produce to maximise profits?
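Before working through the algebra below, the feasibility side of the problem can be transcribed directly into code. A minimal Python sketch (the tolerance constant is ours, to absorb floating-point rounding; the candidate values quoted are derived in the working that follows):

```python
TOL = 1e-9

def g_machine(x, y):
    # machine time: G1(X, Y) = 2X + 1.5Y - 80 <= 0
    return 2 * x + 1.5 * y - 80

def g_contract(x, y):
    # contractual output: G2(X, Y) = 40 - X - Y <= 0
    return 40 - x - y

def feasible(x, y):
    return (x >= -TOL and y >= -TOL
            and g_machine(x, y) <= TOL and g_contract(x, y) <= TOL)

print(feasible(26.365, 18.18))   # True: the candidate derived below
print(feasible(22.37, 26.22))    # False: breaches the machine-time limit
```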
First of all, we solve the demand functions to express prices in terms of X and Y, giving PB = 220 − X − 2Y and PA = 292 − 2.6X − 1.2Y. Thus, total revenue is equal to 292X − 2.6X² + 220Y − 2Y² − 3.2XY, and total costs (machining, labour and materials) come to 8X − 0.1X² + 0.02X³ + 7.5Y − 0.1Y² + 0.02Y³. Hence, we can write the profit function as:

Π(X, Y) = 284X − 2.5X² − 0.02X³ + 212.5Y − 1.9Y² − 0.02Y³ − 3.2XY

The constraints on machine time and production give (putting them in the required form):

G1(X, Y) = 2X + 1.5Y − 80 ≤ 0
G2(X, Y) = 40 − X − Y ≤ 0

We also have the non-negativity constraints X ≥ 0 and Y ≥ 0, as we cannot have negative production. We form the Lagrangian

L(X, Y, λ, µ) = 284X − 2.5X² − 0.02X³ + 212.5Y − 1.9Y² − 0.02Y³ − 3.2XY − λ(2X + 1.5Y − 80) − µ(40 − X − Y)

We thus have the conditions:

1) ∂L/∂X = 284 − 5X − 0.06X² − 3.2Y − 2λ + µ ≤ 0, with equality if X > 0
2) ∂L/∂Y = 212.5 − 3.8Y − 0.06Y² − 3.2X − 1.5λ + µ ≤ 0, with equality if Y > 0
3) λ ≥ 0, G1(X, Y) ≤ 0, and either λ = 0 or G1(X, Y) = 0
4) µ ≥ 0, G2(X, Y) ≤ 0, and either µ = 0 or G2(X, Y) = 0

Let us start by looking for interior solutions, so that X, Y > 0, and begin with the case where both constraints are slack, that is, where λ = µ = 0. Solving the rather ugly equations from conditions 1) and 2) gives X = 22.37 and Y = 26.22 (ignoring for now the fact that you cannot produce non-integer quantities of widgets). However, this does not satisfy the constraint on machine time, so it is impossible.

Let us now consider solutions where the first constraint is slack, so λ = 0, but the second is binding, so X + Y = 40 and µ ≥ 0. Conditions 1) and 2) now become

284 − 5X − 0.06X² − 3.2(40 − X) + µ = 0, that is, 156 − 1.8X − 0.06X² + µ = 0

and

212.5 − 3.8(40 − X) − 0.06(40 − X)² − 3.2X + µ = 0, that is, −35.5 + 5.4X − 0.06X² + µ = 0

Subtracting one from the other gives 191.5 − 7.2X = 0, so X = 26.6, whereupon Y = 13.4. This satisfies the constraint on machine time, and also the non-negativity conditions. We must check that it gives a non-negative value for µ. With these values, µ = 35.5 − 5.4×26.6 + 0.06×26.6² ≈ −65.7 < 0. Hence this violates the condition that the multiplier be non-negative, so it is not a possible solution.

We can now consider the possibility that the machine-time constraint is binding, so that 2X + 1.5Y = 80, but that the production constraint is slack, so that µ = 0 and X + Y ≥ 40. We now have

1) 284 − 5X − 0.06X² − 3.2Y − 2λ = 0 and
2) 212.5 − 3.8Y − 0.06Y² − 3.2X − 1.5λ = 0

Substituting using 2X + 1.5Y = 80, so X = 40 − 0.75Y, gives

2λ = −0.03375Y² + 4.15Y − 12 and 1.5λ = −0.06Y² − 1.4Y + 84.5

which together give

0.04625Y² + 6.01667Y − 124.6667 = 0

One solution is negative; the other gives Y = 18.18, whence X = 26.365. We need to confirm that this gives a non-negative value of λ. These values give λ ≈ 26.1, which is acceptable. Hence, (X, Y) = (26.365, 18.18) is a possible solution to our optimisation problem.

We may now suppose both constraints are binding, so that X + Y = 40 and 2X + 1.5Y = 80. This is only possible with X = 40 and Y = 0. Then conditions 1) and 2) become

284 − 200 − 96 − 2λ + µ = 0, so −12 − 2λ + µ = 0, and
212.5 − 128 − 1.5λ + µ = 0, so 84.5 − 1.5λ + µ = 0.

Hence 0.5λ + 96.5 = 0, giving a negative value for λ, which is impossible. (Strictly, with Y = 0 this is a boundary rather than an interior case, but it is ruled out either way.)

We have thus exhausted all possibilities for interior solutions, the only candidate being (X, Y) = (26.365, 18.18). We may now try for boundary solutions. We may first try X = Y = 0, which gives the conditions:

1) 284 − 2λ + µ ≤ 0
2) 212.5 − 1.5λ + µ ≤ 0

In fact X = Y = 0 violates the contractual production constraint (G2(0, 0) = 40 > 0), so it is infeasible in any case. Even setting that aside, with both constraints non-binding, so that λ = µ = 0, conditions 1) and 2) are clearly not satisfied; and if either constraint were binding, then X or Y would have to be strictly positive, contradicting our assumption.
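The quadratic from the machine-time-binding case above can be checked numerically. A sketch using the numpy library (assumed available):

```python
import numpy as np

# 0.04625*Y^2 + 6.01667*Y - 124.6667 = 0, from the case where the
# machine-time constraint binds and the production constraint is slack
roots = np.roots([0.04625, 6.01667, -124.6667])
y = max(roots.real)                  # discard the negative root
x = 40 - 0.75 * y                    # back out X from 2X + 1.5Y = 80
lam = (-0.06 * y**2 - 1.4 * y + 84.5) / 1.5
print(x, y, lam)                     # ~ 26.37, 18.18, 26.1 (lambda >= 0)
```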
What if X = 0, but Y > 0? In that case, the conditions become

1) 284 − 3.2Y − 2λ + µ ≤ 0
2) 212.5 − 3.8Y − 0.06Y² − 1.5λ + µ = 0

Let us try both constraints non-binding, so that λ = µ = 0. This gives Y = 35.75 as the non-negative solution to 2), but that fails to satisfy 1). If the machine constraint is binding but the production constraint is slack, then 2) gives a negative value for λ, which is impossible. If the production constraint is binding but the machine constraint is slack (so Y = 40 and λ = 0), then 2) gives µ = 35.5, but this fails to satisfy 1). Finally, we cannot have both constraints binding, as that would require X to be positive. Hence, there is no solution with X = 0.

Finally, we may look for solutions where X > 0 and Y = 0. Our first two conditions now become

1) 284 − 5X − 0.06X² − 2λ + µ = 0
2) 212.5 − 3.2X − 1.5λ + µ ≤ 0

With Y = 0, the machine-time constraint requires X ≤ 40, while the production constraint requires X ≥ 40; hence both constraints must bind, with X = 40. (Both constraints slack would require X < 40 and X > 40 simultaneously, which is impossible, so there is no other case to check.) The conditions then become

−12 − 2λ + µ = 0
84.5 − 1.5λ + µ ≤ 0

Substituting µ = 12 + 2λ into the second gives 96.5 + 0.5λ ≤ 0, which makes λ negative: impossible.

Thus, we have ruled out all possible boundary solutions, leaving only the interior solution (X, Y) = (26.365, 18.18), where the machine-time constraint is binding and the production constraint is slack. As this is the only possibility, and as there must logically be some profit-maximising combination of outputs subject to the constraints (since infinite profits are clearly impossible), this must in fact be the global maximum solution.

This has been a rather cumbersome process of checking all the possibilities. In fact, consideration of the properties of the function would enable us to rule out many of the possible solutions very easily, but that would take rather more theoretical machinery to demonstrate. You are not likely to meet such awkward cases in your MA programme, but this example illustrates how the process can be carried out if necessary.
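As a final sanity check, the whole problem can be handed to a numerical solver. A sketch using scipy's SLSQP method (assumed available; for SLSQP, 'ineq' constraints are of the form fun(v) ≥ 0, so the G(X, Y) ≤ 0 constraints are entered with their signs flipped):

```python
import numpy as np
from scipy.optimize import minimize

def neg_profit(v):
    # maximising profit = minimising its negative
    x, y = v
    return -(284*x - 2.5*x**2 - 0.02*x**3
             + 212.5*y - 1.9*y**2 - 0.02*y**3 - 3.2*x*y)

constraints = [
    {'type': 'ineq', 'fun': lambda v: 80 - 2*v[0] - 1.5*v[1]},  # machine time
    {'type': 'ineq', 'fun': lambda v: v[0] + v[1] - 40},        # contract
]

res = minimize(neg_profit, x0=np.array([20.0, 20.0]), method='SLSQP',
               bounds=[(0, None), (0, None)], constraints=constraints)
print(res.x)   # ~ [26.37, 18.18], matching the hand-derived optimum
```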