1 The Lagrange Multiplier Method
• The method of Lagrange multipliers is a mathematical method for performing constrained optimization of differentiable functions.
• Recall unconstrained optimization of a differentiable function $f$, in which we want to find the extreme values (maximum or minimum values) of $f$.
• In other words, we want to find the domain points of $f$ that yield the function's maximum or minimum values (extrema).
• We determine the extrema of $f$ by first finding the function's critical domain points: points where the gradient (i.e., each of the partial derivatives) is zero.
• These points may yield (local) maxima, (local) minima, or saddle points of $f$.
• We then check the properties of the second derivatives, or simply inspect the function values, to determine the function's extreme values.

2 Constrained Optimization
• In constrained optimization of differentiable functions, we still have a differentiable function $f$ that we want to maximize or minimize.
• But we have restrictions on the domain points that we can consider.
• The set of allowed points is called the feasible region, and it is typically given by a constraint function $g$, formulated as $g(x, y) = 0$.
• Example: consider an inverted paraboloid as the function to maximize, constrained to the set of points defined by a line in the x-y plane. We consider only the domain points that lie on the constraint curve, and examine the function's value at each of those points.

3 Example: Constrained Maximization
• Suppose you want to maximize $f(x, y) = x + y$, subject to the constraint $g(x, y) = x^2 + y^2 - 1 = 0$.
  1. The feasible set consists of the points on the unit circle, plotted in the x-y plane.
  2. [Figure: the feasible set, drawn dropped by 6 units below the surface for visibility.]
  3. [Figure: the function's value at each point of the feasible set.]
  4. The function's maximum constrained value is $f = \sqrt{2}$, attained at $(x, y) = (\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$.

4 Solution by Substitution
• We can solve this constrained optimization problem using the method of substitution.
• First, we write the formal expression for the constrained optimization (maximization):
  maximize $f(x, y) = x + y$, subject to $g(x, y) = x^2 + y^2 - 1 = 0$.
• Solution by substitution: from the constraint, $y = \sqrt{1 - x^2}$ (taking the upper half of the circle), so we maximize $h(x) = x + \sqrt{1 - x^2}$.
• Now, set the 1st derivative to zero, and find the critical points:
  $h'(x) = 1 - \frac{x}{\sqrt{1 - x^2}} = 0 \;\Rightarrow\; x = \frac{\sqrt{2}}{2}$.
• Substituting the critical point $(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$, determined by inspection to be the maximum, into $f$: $f = \sqrt{2}$.

5 Lagrange Multiplier Method Outline
• We now consider an alternative way to perform constrained optimization of differentiable functions, called the Lagrange multiplier method, or just the Lagrangian.
• We will give the basic formulation of the problem, and a procedure for solving it using the basic Lagrange multiplier method.
• This will be followed by an intuitive derivation of the basic Lagrange equations.
• In addition, an intuitive derivation of the generalized Lagrange multiplier method will be given.
• Finally, the primal and dual forms of the Lagrange multiplier method will be given. The primal and dual forms offer equivalent methods for performing constrained optimization.

6 (Basic) Lagrange Multiplier Method
• Consider a basic constrained optimization (maximize or minimize) problem:
  optimize $f(x, y)$, subject to $g(x, y) = 0$.
• The formulation of the basic Lagrange multiplier method for this constrained optimization problem is:
  $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$.
• We set the partial derivatives of the Lagrangian to zero, and then find the optimal values of the variables $x^*, y^*, \lambda^*$ that maximize (or minimize) the function.
• The $\lambda$ is called the Lagrange multiplier.

7 Why Does the Lagrangian Include the Constraint Term?
• Why does the basic Lagrange multiplier equation include the term $-\lambda\, g(x, y)$?
• We know from basic calculus that setting the function's derivative to zero yields the function's critical points, and then, from the critical points, we can determine the extrema of the function.
• The constraint $g(x, y) = 0$ must also be satisfied, and setting the partial derivative $\frac{\partial L}{\partial \lambda} = -g(x, y)$ to zero gives the constraint in the basic Lagrange equation
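The substitution solution above can be verified numerically. The following is a minimal Python sketch (the deck's own code is MATLAB; this stand-alone check is not from the slides):

```python
import math

# Substitution check for: maximize f(x, y) = x + y s.t. x^2 + y^2 = 1.
# On the upper half-circle, y = sqrt(1 - x^2), so we maximize
# h(x) = x + sqrt(1 - x^2).
def h(x):
    return x + math.sqrt(1.0 - x * x)

x_star = math.sqrt(2) / 2   # critical point from h'(x) = 1 - x/sqrt(1-x^2) = 0

# h'(x_star) should vanish (central finite difference)
eps = 1e-6
deriv = (h(x_star + eps) - h(x_star - eps)) / (2 * eps)

print(deriv)        # ~0
print(h(x_star))    # ~sqrt(2) = 1.4142...
```

The derivative vanishes at $x = \sqrt{2}/2$ and the maximum value agrees with the slides' $\sqrt{2}$.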
, and so there is some intuition for including this term.

8 More Motivation for Including the Constraint Term
• To obtain an intuitive appreciation for why the Lagrangian is formulated as $L = f - \lambda g$, consider a contour plot of the previous optimization example.
  [Figure: surface plot of $f(x, y) = x + y$ and its contour plot; the contour lines running from $(-2, 2, 0)$ to $(2, -2, 0)$ are "level curves" of $f$.]

9 Gradient of $f$
• Note that the gradient $\nabla f = (1, 1)$ is perpendicular to the level curves of $f$ and points in the direction of the maximum rate of change of the function.
• Note that this direction is given in the x-y plane: the gradient is parallel with the diagonal of the x-y axes.

10 Contour of $f$ and the Constraint Level Curve
• Now consider the contour of $f$ together with the constraint level curve $g(x, y) = x^2 + y^2 - 1 = 0$.
  [Figure: contour plot of $f$ with the constraint level curve $x^2 + y^2 = 1$ overlaid.]

11 Gradient of $g$
• The constraint level curve $x^2 + y^2 = 1$ is a level curve of the paraboloid $z = x^2 + y^2$, but plotted in the x-y plane.
• The gradient $\nabla g = (2x, 2y)$ is perpendicular to the level curve and points in the outward direction.

12 At Solution Points
• Informally, notice that the slope of the tangent line of the contour of $f$ (note: for a linear $f$, the tangent line is the contour line itself) is equal to the slope of the tangent line of the constraint level curve at the critical points.
• Also, informally, note that at the intersection point of any other contour line of $f$ and the level curve $g = 0$, the slopes of their tangent lines appear to be different.
  [Figure: a contour line $f = c$ touching the constraint level curve $x^2 + y^2 = 1$ at a point where their slopes appear to be the same, and another contour line crossing it.]
13 MATLAB Code for the Figure
% Plot the surface and contours of f(x,y) = x + y, the unit-circle
% constraint, and the constrained maximum at (sqrt(2)/2, sqrt(2)/2).
close all;
d = linspace(-2,2,1000);
[x,y] = meshgrid(d,d);
figure; hold all; grid on;
contour(x,y,x+y,50);
surf(x,y,x+y), shading interp;
theta = linspace(0,2*pi);
[x1,y1] = pol2cart(theta,1.0);      % unit circle: the feasible set
plot3(x1,y1,x1+y1,'LineWidth',2,'Color','k');
% Constrained maximum; note sqrt (scalar square root), rather than the
% matrix square root sqrtm used in the original listing.
plot3(sqrt(2)/2.0,sqrt(2)/2.0,sqrt(2),'bo','markerfacecolor','b','markersize',6);
set(gca,'PlotBoxAspectRatio',[0.9221 1.0000 0.7518]);
set(gca,'GridAlpha',0.5,'GridLineStyle','--','FontSize',22);
set(gca,'ZLim',[-6 4],'XLim',[-2 2],'YLim',[-2 2]);
set(gca,'XTick',-2:1:2,'YTick',-2:1:2,'ZTick',-6:2.0:4);
colormap 'jet';
view(49,12)   % adjust view so it is easy to see

14 Aside: Derivative Interpretation
"The slope of the tangent line of a curve gives the direction we should travel to stay on the curve."
• Recall the definition of the derivative evaluated at a point $x$ (rise over run):
  $f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}$.
• In the limit as $\Delta x \to 0$, the function must become linear at the point $x$; otherwise the function would not be differentiable. This is because the only way the function could fail to become linear at $x$ is if it had a corner at $x$, and, if so, the derivative would not be well-defined at $x$, and the function would not be differentiable there.
• Starting from the point at which the derivative is taken, the slope of the tangent line gives the rise and the run we should take to get to the next infinitesimally close point of the function, and hence stay on the function.

15 Moving along the Tangent Line
• Imagine you are at one of the marked points of the constraint curve $g(x, y) = 0$, and you make an infinitesimally small step along the constraint curve, i.e., along the tangent line of $g = 0$ at that point.
• Your movement will keep you on the constraint curve, since "the slope of the tangent line of a curve gives the direction we should travel to stay on the curve."
  [Figure: the constraint level curve $x^2 + y^2 = 1$ with two marked points.]
16 Two Possible Outcomes
• Your infinitesimally small movement will cause one of two possible outcomes:
  – You move parallel with, and along, a level curve of $f$ (for example, at a touching point you move along the level curve $f = c$), or
  – You cross over a level curve (for example, at a crossing point you move across $f = c$).

17 Consider an Intersection Point Where the Tangents are Different
• In this case, a movement to the right will cross the level curve $f = c$, and will touch another level curve $f = c'$, which represents an increase in $f$, i.e., $c' > c$.

18 A Crossing Point Cannot be an Extremum
• Since your movement touches another level curve $f = c'$ while staying on the constraint curve (i.e., your movement takes you to a valid point in the feasible region), the point under consideration cannot be a maximum point, because the function's value at the new point of intersection is greater.
• In other words, starting from the crossing point and moving along the feasible region, one finds another point in the feasible region that has a greater function value. Therefore, the crossing point is not an extreme point.

19 Tangent Lines are Different at a Crossing Point
• Notice that the tangent line of the level curve $f = c$ is different than the tangent line of the constraint level curve at a crossing point.
• Also, note that if you were to move along the tangent line of the level curve $f = c$, your movement would take you off the constraint level curve, and hence take you out of the feasible region.
  [Figure: the tangent of $f = c$ and the tangent of $g = 0$ at the crossing point.]

20 Conclusion Regarding Intersection Points
• In general, a point cannot be a critical point if the slope of the tangent line of the constraint level curve is different than the slope of the tangent line of the objective level curve at that intersection point.
• If the slopes are different at an intersection point, then that point cannot be an extremum.

21 Considering a Touching Point Where the Tangents are Equal
• Previously, we considered a point where the tangent of the constraint level curve and the tangent of a level curve of $f$ were different.
• Next, consider where your movement takes you in the case of a touching point, where the tangents are equal.
  [Figure: the tangent of $f = c$ and the tangent of $g = 0$, coinciding at the touching point.]

22 Considering a Touching Point (Tangents are Equal)
• Once again, consider moving along the tangent line of the constraint curve, but now in more detail.
• Say you took an infinitesimally small step $(dx, dy)$ along the tangent line of the constraint curve $g(x, y) = 0$, at the touching point.

23 Considering a Touching Point (Tangents are Equal)
• This infinitesimally small step along the tangent line keeps you on the constraint curve $g(x, y) = 0$.
• Also, since the tangents are equal, you can move along the objective level curve using the same step. This keeps you on the objective level curve, and does not change the value of the function $f$.

24 A Touching Point is a Solution (Tangents are Equal)
• Therefore, a local extremum must occur where the constraint level curve is tangent to a level curve of $f$.
• In general, if the slopes of the tangents at a touching point of the constraint level curve and an objective level curve are equal, then the touching point is a critical point of the constrained extremum problem.
• In other words:
  – In a constrained maximization or minimization problem, we are constrained to finding an extremum of $f$ considering only those points that satisfy the constraint $g(x, y) = 0$.
  – The extreme value occurs at a point where the objective level curve touches, but does not cross, the constraint level curve.
  – At that point, the tangent of the constraint level curve is equal to the tangent of the objective level curve: the curves touch but do not cross.
  – At this point the extreme value of the function is $f = c$, the value of the touching level curve.

25 From Tangency to the Lagrange Equation
• Next, we use the fact that at a solution point, the slope of the tangent line of the constraint level curve is equal to the slope of the tangent line of the objective level curve, to derive the Lagrange multiplier constrained optimization equation.
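The tangency argument can also be written out analytically. A short derivation (consistent with the slides' conclusion, using implicit differentiation along each curve):

```latex
\begin{align*}
\text{Along } f(x,y) = c: \quad & f_x + f_y \frac{dy}{dx} = 0
  \;\Rightarrow\; \frac{dy}{dx} = -\frac{f_x}{f_y}, \\
\text{Along } g(x,y) = 0: \quad & g_x + g_y \frac{dy}{dx} = 0
  \;\Rightarrow\; \frac{dy}{dx} = -\frac{g_x}{g_y}. \\
\text{Equal slopes at the touching point:} \quad
  & \frac{f_x}{f_y} = \frac{g_x}{g_y}
  \;\Rightarrow\; f_x\, g_y - f_y\, g_x = 0
  \;\Rightarrow\; \nabla f = \lambda\, \nabla g \ \text{for some scalar } \lambda .
\end{align*}
```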
• We will show that the gradient of the objective function $f$ can be written as a scalar multiple of the gradient of the constraint function $g$, i.e., $\nabla f = \lambda\, \nabla g$.

26 Gradients at the Solution Point
• Known: the slope of the tangent line of the constraint level curve is equal to the slope of the tangent line of the objective level curve at the critical point. Therefore their normal vectors are parallel.
• Known: the gradient $\nabla f$ of the objective function is perpendicular to the tangent line of the objective level curve at the critical point. Therefore, the gradient $\nabla f$ is the normal vector of the objective level curve at the critical point.
• Known: the gradient $\nabla g$ of the constraint function is perpendicular to the tangent line of the constraint level curve at the critical point. Therefore, the gradient $\nabla g$ is the normal vector of the constraint level curve at the critical point.
• This means the gradient of the objective function is related to the gradient of the constraint function through a scalar multiple $\lambda$.
• Note: $\lambda$ can be either positive or negative — the gradients are parallel for positive $\lambda$ and anti-parallel for negative $\lambda$.
• Therefore, we can write that: $\nabla f = \lambda\, \nabla g$.

27 Summary of the Geometric Argument
• At a non-solution point, the tangent of the constraint curve is not parallel to the tangent of the objective level curve.
• At a solution point, the tangent of the constraint curve is parallel to the tangent of the objective level curve.
• At a solution point, since the tangents are parallel, the normal of the constraint level curve and the normal of the objective level curve are also parallel.
• This means that at the solution point, the gradient of the objective function is either parallel or anti-parallel to the gradient of the constraint level curve.
• This means $\nabla f$ is related to $\nabla g$ through a scalar multiple $\lambda$, which can be either positive or negative.
• Therefore, we can write that: $\nabla f = \lambda\, \nabla g$.
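For the running example, the gradients at the solution point can be computed explicitly, confirming the parallelism:

```latex
\begin{align*}
\nabla f &= (1,\,1), \qquad \nabla g = (2x,\,2y),\\
\nabla g \Big|_{\left(\frac{\sqrt{2}}{2},\,\frac{\sqrt{2}}{2}\right)}
  &= \left(\sqrt{2},\,\sqrt{2}\right) = \sqrt{2}\,(1,\,1),\\
\text{so } \nabla f &= \lambda\,\nabla g \quad\text{with}\quad \lambda = \tfrac{1}{\sqrt{2}} > 0 .
\end{align*}
```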
28 Thought Experiment: Exhaustive Search for Extrema
• Do the following for every point on the constraint level curve, i.e., for every point in the feasible region:
  – Imagine you are at a point $P_1$ of the constraint level curve $g(x, y) = 0$.
  – You take note of the value of the objective function at $P_1$.
  – You take an infinitesimally small step along the curve to get to a point $P_2$, and so you stay on the curve, i.e., you stay in the feasible region.
  – You take note of the value of the objective function at $P_2$.
• If the value of the objective function at $P_2$ is different than the value at $P_1$, then $P_1$ cannot be an extremum. You will note that the slopes of the tangent lines of the two curves ($f$'s level curve and $g = 0$) at $P_1$ are different.
• If the value of the objective function at $P_2$ is the same as the value at $P_1$, then $P_1$ is an extremum. You will note that the slopes of the tangent lines of the two curves at $P_1$ are the same.

29 Lagrange Optimization Equation
• The above tangency condition can be written as: $\nabla f(x, y) - \lambda\, \nabla g(x, y) = 0$, together with $g(x, y) = 0$.
• Undoing the differentiation and removing the "setting to zero" procedure, and since $\lambda$ can be either positive or negative, we can now write the Lagrangian:
  $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$.
• The $\lambda$ is called the Lagrange multiplier.

30 Lagrange Minimization/Maximization Procedure
• We set the partial derivatives of the Lagrangian to zero, and then find the optimal values of the variables that maximize (or minimize) the function:
  $\frac{\partial L}{\partial x} = 0, \quad \frac{\partial L}{\partial y} = 0, \quad \frac{\partial L}{\partial \lambda} = 0.$
• Notice that the first two equations comprise the gradient equation derived earlier: $\nabla f = \lambda\, \nabla g$.
• And the last equation extracts the constraint: $g(x, y) = 0$.

31 (Basic) Lagrange Multiplier Method
• Consider a basic constrained optimization (maximization or minimization) problem: optimize $f(x, y)$, subject to $g(x, y) = 0$.
• The formulation of the basic Lagrange constrained optimization problem is:
  $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$.
• We set the partial derivatives of the Lagrangian to zero, and then find the optimal values of the variables $x^*, y^*, \lambda^*$ that maximize (or minimize) the function.
• The $\lambda$ is called the Lagrange multiplier.
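The procedure above can be checked numerically. A minimal Python sketch (the deck's own code is MATLAB; this stand-alone check is an assumption of this write-up) verifies that $(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}, \frac{1}{\sqrt{2}})$ is a stationary point of $L(x, y, \lambda) = x + y - \lambda(x^2 + y^2 - 1)$:

```python
import math

# Lagrangian for f(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0
def L(x, y, lam):
    return (x + y) - lam * (x**2 + y**2 - 1.0)

# Candidate solution from the slides: x = y = sqrt(2)/2, lambda = 1/sqrt(2)
x_s, y_s, lam_s = math.sqrt(2) / 2, math.sqrt(2) / 2, 1 / math.sqrt(2)

# Central finite differences approximate the partial derivatives of L
h = 1e-6
dLdx = (L(x_s + h, y_s, lam_s) - L(x_s - h, y_s, lam_s)) / (2 * h)
dLdy = (L(x_s, y_s + h, lam_s) - L(x_s, y_s - h, lam_s)) / (2 * h)
dLdlam = (L(x_s, y_s, lam_s + h) - L(x_s, y_s, lam_s - h)) / (2 * h)

print(dLdx, dLdy, dLdlam)   # all three should be ~0
print(x_s + y_s)            # constrained maximum value, ~sqrt(2)
```

All three partial derivatives vanish, and the constrained maximum value matches the substitution result.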
32 Example Lagrange Computation
• Maximize $f(x, y) = x + y$, subject to $g(x, y) = x^2 + y^2 - 1 = 0$.
• The formulation of the basic Lagrange constrained optimization problem is:
  $L(x, y, \lambda) = x + y - \lambda\,(x^2 + y^2 - 1)$.
• Now, setting the partial derivatives to zero, and finding the critical points:
  $\frac{\partial L}{\partial x} = 1 - 2\lambda x = 0$, $\frac{\partial L}{\partial y} = 1 - 2\lambda y = 0$, $\frac{\partial L}{\partial \lambda} = -(x^2 + y^2 - 1) = 0$,
  which give $x = y = \pm\frac{\sqrt{2}}{2}$, $\lambda = \pm\frac{1}{\sqrt{2}}$.
• Substituting the critical points into $f$, we find: $f_{\max} = \sqrt{2}$ at $(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2})$, and $f_{\min} = -\sqrt{2}$ at $(-\frac{\sqrt{2}}{2}, -\frac{\sqrt{2}}{2})$.

33 Lagrange with Multiple Constraints
• Consider an optimization problem with multiple constraints $g_i(x, y) = 0$, $i = 1, \ldots, m$.
• The set of points at which all the constraints intersect forms the feasible region.
• In other words, the feasible region is the set of intersection points of the constraint functions.
• At a solution point, $\nabla f$ may not be parallel (or anti-parallel) to any single $\nabla g_i$; instead, it corresponds to a linear combination of the constraint gradients.
• The gradient of a sum is equal to the sum of the gradients, so the Lagrangian becomes:
  $L = f - \sum_i \lambda_i\, g_i$, with stationarity condition $\nabla f = \sum_i \lambda_i\, \nabla g_i$.

34 Multiple Constraints Example
  [Figure: a multiple-constraints example.]

35 Example: Multiple Constraints
  [Figure: the feasible region (a plane), $f$'s values in the feasible region (a plane), and the maximum constrained value at the critical point.]

36 Lagrange with Inequality Constraints
• Consider an optimization problem with an inequality constraint: maximize $f(x, y)$, subject to $g(x, y) \le 0$.
• The Lagrange multiplier formulation with an inequality constraint can be written as:
  $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$, with $\lambda \ge 0$.
• Notice that the equation appears very similar to the Lagrange multiplier method with equality constraints, except that the $\lambda$ is constrained to be $\ge 0$.
• Why should the multiplier with inequality constraints be limited to $\lambda \ge 0$?

37 Lagrange with Inequality Constraints
• To show intuitively why this must be the case, first consider the possibilities:
  1. No solution exists, and the lines of constant $f$ and the feasible region do not touch or intersect.
  2. A solution exists at the boundary of the feasible region, $g(x, y) = 0$.
  3. A solution exists inside the feasible region, $g(x, y) < 0$.
• In the interesting case where a solution exists, we will show there are two cases: a solution on the boundary, and a solution inside the feasible region.
• Consider a maximization example with two variables and one inequality constraint.
• To maximize $f$ subject to the inequality, let us first look at the boundary of the region allowed by the inequality, i.e., $g(x, y) = 0$.
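Returning briefly to the multiple-constraints formulation of slide 33: the deck's plotted two-plane example is not recoverable here, but a hypothetical two-constraint instance can be checked in a few lines of Python (stationarity $\nabla f = \lambda_1 \nabla g_1 + \lambda_2 \nabla g_2$; the functions and multiplier values below are assumptions, not from the slides):

```python
import math

# Hypothetical two-constraint instance: maximize f(x,y,z) = x + y + z
# subject to g1 = x^2 + y^2 + z^2 - 1 = 0 and g2 = z = 0.
# The feasible region is the unit circle in the plane z = 0, so the
# maximum should again be sqrt(2), at (sqrt(2)/2, sqrt(2)/2, 0).
x, y, z = math.sqrt(2) / 2, math.sqrt(2) / 2, 0.0
lam1, lam2 = 1 / math.sqrt(2), 1.0

grad_f = (1.0, 1.0, 1.0)
grad_g1 = (2 * x, 2 * y, 2 * z)
grad_g2 = (0.0, 0.0, 1.0)

# Stationarity: grad f = lam1 * grad g1 + lam2 * grad g2
residual = [grad_f[i] - lam1 * grad_g1[i] - lam2 * grad_g2[i] for i in range(3)]
print(residual)                       # ~[0, 0, 0]
print(x**2 + y**2 + z**2 - 1, z)      # both constraints satisfied (~0, 0)
print(x + y + z)                      # ~sqrt(2)
```

Note that $\nabla f = (1,1,1)$ is parallel to neither $\nabla g_1$ nor $\nabla g_2$ alone, only to their weighted sum, which is exactly the point of slide 33.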
38 Lagrange with Inequality Constraints: Solution on the Boundary
• Consider a sketch of the boundary level curve $g(x, y) = 0$ and the level curve $f = c$ where it touches $g = 0$.
• We assume a solution exists at the touching point on the boundary.
• Then $\nabla f$ and $\nabla g$ must be parallel (not anti-parallel), and this point will give a maximum (not a minimum) of $f$ for the feasible region $g(x, y) \le 0$, because of the following argument.
  [Figure: the feasible region with the touching level curve.]

39 Solution on the Boundary: The Gradient Points Away From the Feasible Region
• At the point where $f = c$ and $g = 0$ touch, the gradient $\nabla f$ would be pointing away from the feasible region, since:
  – The gradient always points in the direction of maximum increase of a function, and
  – The function $f$ decreases as you move towards the inside of the feasible region.

40 Solution on the Boundary: $\nabla f$ and $\nabla g$ Point in the Same Direction
• If $\nabla f$ and $\nabla g$ were pointing in opposite directions, then $\nabla f$ would be pointing inwards towards the feasible region, meaning $f$ would have greater values inside the feasible region.
• If we were to find another point where a level curve of $f$ touches inside the feasible region, we would find an increase in $f$; since we are still in the feasible region, we must conclude that the point on the boundary cannot be a critical point (not a solution).
• This contradicts our initial assumption that the boundary point is a solution.

41 Solution on the Boundary: $\nabla f$ and $\nabla g$ Point in the Same Direction
• Therefore, $\nabla f$ and $\nabla g$ must be parallel and pointing in the same direction (i.e., not anti-parallel).

42 Solution on the Boundary Gives a Maximum of $f$
• The gradient indicates that the function $f$ increases away from the feasible region.
• If a solution exists on the boundary, then this solution must be a maximum.
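This boundary argument can be sanity-checked by brute force for the running example: since $\nabla f = (1,1)$ is never zero, $f = x + y$ has no interior critical point on the disk, so its maximum must sit on the boundary circle. A Python sketch (not from the deck, which uses MATLAB):

```python
import math

# Maximize f(x, y) = x + y over the disk x^2 + y^2 <= 1 by brute force.
# grad f = (1, 1) is never zero, so there is no interior critical point:
# the maximum must lie on the boundary (the constraint is binding).
best_val, best_pt = -float("inf"), None
n = 400
for i in range(n + 1):
    for j in range(n + 1):
        x = -1.0 + 2.0 * i / n
        y = -1.0 + 2.0 * j / n
        if x * x + y * y <= 1.0:      # stay in the feasible region
            if x + y > best_val:
                best_val, best_pt = x + y, (x, y)

print(best_val, math.sqrt(2))              # brute-force max vs exact value
print(best_pt[0] ** 2 + best_pt[1] ** 2)   # ~1: the constraint is active
```

The grid maximum lands on the boundary circle with value close to $\sqrt{2}$, consistent with the binding-constraint case.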
43 Solution on the Boundary: Effective and Binding Constraint
• In the case where a critical point exists on the boundary $g(x, y) = 0$, the inequality constraint is said to be effective and is called a binding constraint, and $\lambda > 0$.

44 Summary for a Solution on the Boundary
• Consider a sketch of the boundary level curve $g(x, y) = 0$ and the level curve $f = c$ where it touches $g = 0$.
• We assume a solution exists at the touching point. Then $\nabla f$ and $\nabla g$ must be parallel (not anti-parallel), and this point will give a maximum (not a minimum) of $f$ for the region $g(x, y) \le 0$, because of the following argument:
  – At the point where $f = c$ and $g = 0$ touch, the gradient $\nabla f$ would be pointing away from the feasible region, since $f$ decreases as you move towards the inside of the feasible region.
  – If $\nabla f$ and $\nabla g$ were pointing in opposite directions, then $\nabla f$ would be pointing inwards towards the feasible region, meaning $f$ would have greater values inside the feasible region.
  – If we were to find another point where a level curve of $f$ touches inside the feasible region, we would find an increase in $f$; since we are still in the feasible region, we must conclude that the point on the boundary cannot be a critical point (not a solution). This contradicts our initial assumption.
  – Therefore, $\nabla f$ and $\nabla g$ must be parallel and pointing in the same direction (i.e., not anti-parallel). This implies that $\lambda > 0$ at the critical point, with $\nabla f = \lambda\, \nabla g$ and $g = 0$ on the boundary.
• In the case where a critical point exists on the boundary, the inequality constraint is said to be effective and is called a binding constraint, and $\lambda > 0$.

45 Lagrange with Inequality Constraints: Solution Within the Boundary
• In the case where a critical point exists inside the feasible region, i.e., $g(x, y) < 0$, then we can consider any point within the feasible region to determine the extrema of $f$.
• I.e., the problem is unconstrained, if we assume a solution exists within the feasible region.
• In other words, the problem becomes an unconstrained optimization problem (i.e., optimization with no constraints).
• In this case we say the constraint is not binding, or the constraint is ineffective.
• The maximum is then found by looking for the unconstrained maximum of $f$, assuming that we look only inside the feasible region.
• In this case: $\lambda = 0$, and the Lagrangian reduces to $L = f$.

46 Lagrange with Inequality Constraints: Summary
• The Lagrange multiplier method with an inequality constraint $g(x, y) \le 0$ can be written as:
  $L(x, y, \lambda) = f(x, y) - \lambda\, g(x, y)$, with $\lambda \ge 0$.
• If the extremum occurs at the boundary of the constraint (the constraint is binding and effective): $g = 0$ and $\lambda > 0$.
• If the constraint is not binding and is ineffective, then $\lambda = 0$, and the above reduces to $L = f$, i.e., unconstrained optimization.
• Argument: at this point we know there is a solution, and that the solution does not exist at the boundary. Therefore, any critical point the optimizer finds will be inside the feasible region. Finally, the optimizer is free to choose any point (i.e., it is unconstrained) to find a solution.

47 Primal and Dual Forms
• The Lagrange optimization method has two related forms, one called the primal optimization method and the other called the dual optimization method.
• In some applications it is more suitable to use the dual optimization method, as it leads to a simpler and quicker solution, while in other applications the primal method is better.
• In the following we show that under certain conditions, the primal and dual optimization methods are equivalent and lead to the exact same solution to an optimization problem.
• As an example use of the primal and dual optimization methods being equivalent, we can show that the condition on the multiplier's sign is also true for a minimization problem.
  – Note that our intuitive verification for that condition was based on the assumption of a maximization problem.

48 Lagrange Optimization: Basic Formulation
• Consider an optimization problem of the following form:
  minimize $f(x)$, subject to $h_j(x) = 0$, $j = 1, \ldots, l$.
• The basic Lagrange formulation (Lagrangian) for this problem is:
  $L(x, \beta) = f(x) + \sum_{j=1}^{l} \beta_j\, h_j(x)$.
• The $\beta_j$ are called the Lagrange multipliers for the equality constraints.
• We would then find $L$'s partial derivatives and set them to zero.
• Finally, solve for $x$ and $\beta$, and then locate the minima.

49 Lagrange Optimization: Generalized Formulation
• Consider the following, which is called the primal optimization problem:
  minimize $f(x)$, subject to $g_i(x) \le 0$, $i = 1, \ldots, k$, and $h_j(x) = 0$, $j = 1, \ldots, l$.
• The generalized Lagrangian is given by:
  $L(x, \alpha, \beta) = f(x) + \sum_{i=1}^{k} \alpha_i\, g_i(x) + \sum_{j=1}^{l} \beta_j\, h_j(x)$.
• The $\alpha_i$ and $\beta_j$ are called the Lagrange multipliers.

50 Deriving an Alternate Expression for the Primal Optimization Problem
• We will now derive an alternative expression for the primal optimization problem.
• We call this the "min max" expression for the primal optimization problem.

51 Min Max Expression for the Primal Optimization Problem
• Consider the following quantity:
  $\theta_P(x) = \max_{\alpha, \beta \,:\, \alpha_i \ge 0} L(x, \alpha, \beta)$.
• If the choice of $x$ violates any of the primal constraints (below), then $\theta_P(x) = \infty$:
  – For instance, if $g_i(x) > 0$ for some $i$, then $\alpha_i$ can be chosen as $\infty$ to maximize $L$, and therefore $\theta_P(x) = \infty$.
  – In addition, if $h_j(x) \ne 0$ for some $j$, then $\beta_j$ can be chosen as $\pm\infty$ (matching the sign of $h_j(x)$) to maximize $L$, and therefore $\theta_P(x) = \infty$.

52 Min Max Expression for the Primal Optimization Problem
• Now, if the choice of $x$ satisfies the primal constraints, then $\theta_P(x) = f(x)$:
  – For instance, if $h_j(x) = 0$, then the value of $\beta_j$ is irrelevant, since $\beta_j\, h_j(x) = 0$ irrespective of $\beta_j$.
  – In addition, if $g_i(x) \le 0$, then $\alpha_i$ will be chosen as $0$ to maximize $L$, since with $\alpha_i \ge 0$ the term $\alpha_i\, g_i(x)$ is at most $0$.
• Taken together: $\theta_P(x) = f(x)$ whenever $x$ satisfies the constraints.

53 Min Max Expression for the Primal Optimization Problem
• Note also that if $\alpha_i$ were allowed to be negative, then, for a feasible $x$ with $g_i(x) < 0$, $\alpha_i$ would be chosen as $-\infty$ to maximize $L$, in which case $\theta_P(x) = \infty$, and so we would not have a solution, even for good (feasible) $x$.
• This provides further reason for requiring that $\alpha_i \ge 0$.

54 Min Max Expression for the Primal Optimization Problem
• Therefore:
  $\theta_P(x) = f(x)$ if $x$ satisfies the primal constraints, and $\theta_P(x) = \infty$ otherwise.
• Next, consider the minimization problem: $\min_x \theta_P(x) = \min_x \max_{\alpha, \beta \,:\, \alpha_i \ge 0} L(x, \alpha, \beta)$.
• This means that, after performing the maximization of $L$ over $\alpha$ and $\beta$, which we found to be $f(x)$ given that the constraints are satisfied, we then minimize the resulting function by finding the optimal value of $x$.
• I.e., given that the constraints are satisfied: $\min_x \theta_P(x) = \min_x f(x)$.

55 Min Max Expression for the Primal Optimization Problem
• In other words, $\min_x \max_{\alpha, \beta \,:\, \alpha_i \ge 0} L(x, \alpha, \beta)$ is the same as our original primal optimization problem.
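The behavior of $\theta_P$ described above can be demonstrated numerically on a small hypothetical problem (not from the slides): $f(x) = x^2$ with the single inequality constraint $g(x) = 1 - x \le 0$. A Python sketch, using a finite grid of $\alpha$ values as a stand-in for $[0, \infty)$:

```python
# theta_P(x) = max_{alpha >= 0} [ f(x) + alpha * g(x) ] for the hypothetical
# problem f(x) = x^2 with one inequality constraint g(x) = 1 - x <= 0.
def theta_P(x, alpha_grid):
    return max(x * x + a * (1.0 - x) for a in alpha_grid)

alphas = [10.0 * i for i in range(101)]   # alpha in [0, 1000], stand-in for [0, inf)

feasible = theta_P(2.0, alphas)     # g(2) = -1 <= 0: the max is at alpha = 0
infeasible = theta_P(0.5, alphas)   # g(0.5) = 0.5 > 0: grows without bound in alpha

print(feasible)      # 4.0 = f(2): theta_P recovers f on feasible points
print(infeasible)    # 500.25: diverges toward infinity on infeasible points
```

On the feasible point, the maximizing $\alpha$ is $0$ and $\theta_P$ returns $f(x)$; on the infeasible point, $\theta_P$ grows with the largest available $\alpha$, illustrating the $\infty$ case.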
• Min max representation and original primal optimization problem:
  $\min_x \max_{\alpha, \beta \,:\, \alpha_i \ge 0} L(x, \alpha, \beta) \;=\; \min_x f(x)$ subject to the primal constraints.
• Finally, define the optimal value of the min max expression as $p^*$:
  $p^* = \min_x \theta_P(x)$.
• We call this the value of the primal optimization problem.

56 Deriving a Dual Expression for the Primal Optimization Problem
• We will now derive the dual expression of the generalized Lagrange optimization formulation.
• We call the dual expression the "max min" expression.
• We will relate the dual expression to the primal expression, and hence show that the dual expression can also be used to express the generalized Lagrange optimization formulation.
• Finally, we will show that under certain conditions, the dual expression is equivalent to the primal expression, and thus either of them can be used to solve an optimization problem posed as a generalized Lagrange optimization formulation.

57 The Dual "Max Min" Expression
• Consider the quantity:
  $\theta_D(\alpha, \beta) = \min_x L(x, \alpha, \beta)$.
• Note that, whereas in the definition of $\theta_P$ we were optimizing (maximizing) with respect to $\alpha$ and $\beta$, here we are minimizing with respect to $x$.

58 The Dual "Max Min" Expression
• Now, add a maximization term:
  $\max_{\alpha, \beta \,:\, \alpha_i \ge 0} \theta_D(\alpha, \beta) = \max_{\alpha, \beta \,:\, \alpha_i \ge 0} \min_x L(x, \alpha, \beta)$.
• This is exactly the same as our primal problem, except that the order of the "max" and the "min" are now exchanged.
• Finally, define the optimal value of the max min expression as $d^*$:
  $d^* = \max_{\alpha, \beta \,:\, \alpha_i \ge 0} \theta_D(\alpha, \beta)$.
• We call this the value of the dual optimization problem.

59 Primal and Dual Relationship
• It can be shown that (see the next two slides): $d^* \le p^*$.
• Furthermore, it can be shown that, under certain conditions: $d^* = p^*$.
• This means that under certain conditions, we can solve a given optimization problem by using either the primal or the dual method, and we'll pick the most suitable one.

60 2-D Case
• PROOF (two-variable case of max min $\le$ min max): for any $x$ and $y$,
  $\min_{x'} F(x', y) \;\le\; F(x, y) \;\le\; \max_{y'} F(x, y')$.
• Since we can choose any $y$ on the LHS, this implies:
  $\max_y \min_{x'} F(x', y) \;\le\; \max_{y'} F(x, y')$ for every $x$.
• On the RHS, since the bound holds for every $x$, we can choose the $x$ that minimizes the RHS:
  $\max_y \min_{x'} F(x', y) \;\le\; \min_x \max_{y'} F(x, y')$.

61 Interpreting the Min
• What does $\min_x F(x, y)$ mean?
• For each $y$, we find an $x$ that minimizes the function $F$.
• This will generate one answer (i.e., one minimum of $F$ over $x$) for each value of $y$.
• Does $\max_y \min_x F(x, y)$ make sense then?
• Yes, because for each value of $y$ you choose, the term $\min_x F(x, y)$ represents the minimum value of the function over all $x$.

62 Our Case
• PROOF: for any $x$ and any $\alpha, \beta$ with $\alpha_i \ge 0$:
  $\theta_D(\alpha, \beta) = \min_{x'} L(x', \alpha, \beta) \;\le\; L(x, \alpha, \beta) \;\le\; \max_{\alpha', \beta' \,:\, \alpha'_i \ge 0} L(x, \alpha', \beta') = \theta_P(x)$.
• Since this holds on both sides, even with the restriction $\alpha_i \ge 0$, we can choose the $(\alpha, \beta)$ combination that maximizes the LHS and the $x$ that minimizes the RHS:
  $d^* = \max_{\alpha, \beta \,:\, \alpha_i \ge 0} \theta_D(\alpha, \beta) \;\le\; \min_x \theta_P(x) = p^*$.

63–64 Recall: Lagrange Optimization Generalized Formulation
• Consider the following, which is called the primal optimization problem:
  minimize $f(x)$, subject to $g_i(x) \le 0$ and $h_j(x) = 0$.
• The generalized Lagrangian is given by:
  $L(x, \alpha, \beta) = f(x) + \sum_{i=1}^{k} \alpha_i\, g_i(x) + \sum_{j=1}^{l} \beta_j\, h_j(x)$.
• The $\alpha_i$ and $\beta_j$ are called the Lagrange multipliers.

65 Conditions for Equality (Given without Proof)
• Let $x^*$ be the optimal domain point that optimizes the objective function.
• Under certain assumptions, there must exist $\alpha^*, \beta^*$ so that: $p^* = d^* = L(x^*, \alpha^*, \beta^*)$.
• Once determined, $x^*, \alpha^*, \beta^*$ will satisfy the Karush-Kuhn-Tucker (KKT) conditions, which are as follows:
  1. $\frac{\partial}{\partial x_i} L(x^*, \alpha^*, \beta^*) = 0$
  2. $\frac{\partial}{\partial \beta_j} L(x^*, \alpha^*, \beta^*) = 0$
  3. $\alpha_i^*\, g_i(x^*) = 0$
  4. $g_i(x^*) \le 0$
  5. $\alpha_i^* \ge 0$

66 KKT: The First Two Conditions
• The first two conditions (below) follow from the Lagrange optimization procedure.
  – We set the partial derivatives of $L$ to zero, and solve for the variables.
  – Of course, when we solve for the variables and find the optimal values $x^*, \alpha^*, \beta^*$, they should satisfy these two KKT conditions.

67 KKT: The Last Two Conditions
• The last two conditions (below) follow from the initial constraints of the problem.
  – The $g_i(x^*) \le 0$ is the initial inequality constraint.
  – Of course, when we solve for the variables and find the optimal values, they should satisfy $g_i(x^*) \le 0$ and $\alpha_i^* \ge 0$.

68 KKT: The Third Condition
• The third condition, $\alpha_i^*\, g_i(x^*) = 0$, follows from the analysis of the primal optimization problem.
• Recall that in the derivation of the primal optimization problem we wanted to perform the maximization $\max_{\alpha, \beta \,:\, \alpha_i \ge 0} L(x, \alpha, \beta)$.
• For that term to be maximal, and subject to $\alpha_i \ge 0$, it is required that $\alpha_i\, g_i(x) = 0$ for every $i$.
• We note that when the constraint is active and a solution exists at the boundary, i.e., $g_i(x^*) = 0$, the condition $\alpha_i^*\, g_i(x^*) = 0$ is satisfied and $\alpha_i^*$ is free to vary ($\alpha_i^* \ge 0$).
• When the constraint $g_i$ is inactive, i.e., $g_i(x^*) < 0$, then $\alpha_i^* = 0$.

69 References
[1] J. Kitchin, "Matlab in Chemical Engineering at CMU." [Online]. Available: http://matlab.cheme.cmu.edu/2011/12/24/using-lagrange-multipliers-in-optimization/. [Accessed 20 February 2015].
[2] C. A. Jones. [Online].
Available: http://www1.maths.leeds.ac.uk/~cajones/math2640/notes4.pdf. [Accessed 23 February 2015].
[3] D. Klein, "Dan Klein's Homepage." [Online]. Available: www.cs.berkeley.edu/~klein/papers/lagrange-multipliers.pdf. [Accessed 23 February 2015].
[4] J. Beggs, "Introduction to the Lagrange Multiplier." [Online]. Available: https://www.youtube.com/watch?v=RZz7c1oeHm4. [Accessed 2 March 2015].
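To close, the primal/dual machinery and the KKT conditions from the last slides can be verified end-to-end on a small hypothetical problem (not from the slides): minimize $f(x) = x^2$ subject to $g(x) = 1 - x \le 0$, i.e., $x \ge 1$. A Python sketch:

```python
import math

# Hypothetical problem: minimize f(x) = x^2 subject to g(x) = 1 - x <= 0.
# Lagrangian: L(x, alpha) = x^2 + alpha * (1 - x), with alpha >= 0.

# Primal value p*: minimize f over the feasible set x >= 1.
p_star = min(x * x for x in [1.0 + 0.001 * i for i in range(2001)])

# Dual function: the inner min_x L is attained at x = alpha/2, giving
# theta_D(alpha) = alpha - alpha^2 / 4; maximize it over alpha >= 0.
d_star = max(a - a * a / 4.0 for a in [0.001 * i for i in range(4001)])

# KKT check at the solution x* = 1, alpha* = 2:
x_s, a_s = 1.0, 2.0
stationarity = 2 * x_s - a_s        # dL/dx = 2x - alpha = 0
slackness = a_s * (1.0 - x_s)       # alpha * g(x) = 0

print(p_star, d_star)               # 1.0 1.0  (p* = d*: equality holds here)
print(stationarity, slackness)      # 0.0 0.0  (KKT conditions hold)
```

Here $p^* = d^* = 1$ and the KKT conditions (stationarity, complementary slackness, $g \le 0$, $\alpha \ge 0$) all hold at $x^* = 1$, $\alpha^* = 2$, with the constraint active and binding.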