Lagrange multipliers are used to solve extremum problems for a function defined on a
level set of another function. For example, suppose you want to maximize xy given
that x + y = 4. This is not too hard to do using methods developed earlier.
Solve for one of the variables, say y, in the constraint equation x + y = 4 to
find y = 4 − x. Then the function to maximize is f(x) = x(4 − x) and the
answer is clearly x = 2. Thus the two numbers are x = y = 2. This was easy
because you could easily solve the constraint equation for one of the variables
in terms of the other. Now what if you wanted to maximize f(x,y,z) = xyz
subject to the constraint that x^{2} + y^{2} + z^{2} = 4? It is still possible to do this using
similar techniques. Solve for one of the variables in the constraint equation, say z,
substitute it into f, and then find where the partial derivatives equal zero to find
candidates for the extremum. However, it seems you might encounter many cases and
it does look a little fussy. Moreover, sometimes you can’t solve the constraint
equation for one variable in terms of the others. Also, what if you had many
constraints? What if you wanted to maximize f(x,y,z) subject to the constraints
x^{2} + y^{2} = 4 and z = 2x + 3y^{2}? Things are clearly getting more involved and messy.
It turns out that at an extremum, there is a simple relationship between the
gradient of the function to be maximized and the gradient of the constraint
function.
This relation can be seen geometrically in the following picture.
[Figure: a piece of the level surface g(x,y,z) = 0, with ∇f and ∇g both normal to the surface at the constrained extremum.]
In the picture, the surface represents a piece of the level surface of g(x,y,z) = 0 and
f(x,y,z) is the function of three variables which is being maximized or minimized on the
level surface. Suppose the extremum of f occurs at the point (x_{0},y_{0},z_{0}). As shown
above, ∇g(x_{0},y_{0},z_{0}) is perpendicular to the surface, or more precisely to the tangent
plane. However, if x(t) = (x(t),y(t),z(t)) is a point on a smooth curve which passes
through (x_{0},y_{0},z_{0}) when t = t_{0}, then the function h(t) = f(x(t),y(t),z(t)) must have
either a maximum or a minimum at the point t = t_{0}. Therefore, h^{′}(t_{0}) = 0. By the chain
rule this says ∇f(x_{0},y_{0},z_{0}) ⋅ x^{′}(t_{0}) = 0, and since this holds for any such smooth curve,
∇f(x_{0},y_{0},z_{0}) is also perpendicular to the surface. This picture represents a situation in
three dimensions and you can see that it is intuitively clear that this implies ∇f(x_{0},y_{0},z_{0})
is some scalar multiple of ∇g(x_{0},y_{0},z_{0}). Thus

∇f(x_{0},y_{0},z_{0}) = λ∇g(x_{0},y_{0},z_{0})
This λ is called a Lagrange multiplier after Lagrange who considered such problems in
the 1700’s.
Of course the above argument is at best only heuristic. It does not deal with the
question of existence of smooth curves lying in the constraint surface passing through
(x0,y0,z0)
. Nor does it consider all cases, being essentially confined to three dimensions.
In addition to this, it fails to consider the situation in which there are many constraints.
However, I think it is likely a geometric notion like that presented above which led
Lagrange to formulate the method.
Example 22.5.1 Maximize xyz subject to x^{2} + y^{2} + z^{2} = 27.
Here f(x,y,z) = xyz while g(x,y,z) = x^{2} + y^{2} + z^{2} − 27. Then ∇g(x,y,z) =
(2x,2y,2z) and ∇f(x,y,z) = (yz,xz,xy). Then at the point which maximizes this
function^{1}, (yz,xz,xy) = λ(2x,2y,2z). Multiplying the first equation by x, the second by y,
and the third by z shows that each of 2λx^{2}, 2λy^{2}, 2λz^{2} equals xyz. It follows
that at any point which maximizes xyz, |x| = |y| = |z|. Therefore, the only candidates
for the point where the maximum occurs are (3,3,3), (−3,−3,3), (−3,3,3),
etc. The maximum occurs at (3,3,3), which can be verified by plugging in to the function
which is being maximized.
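If you want to check this numerically, here is a short Python sketch (Python is not part of the text's method, and the function names are my own). It samples random points on the sphere x^{2} + y^{2} + z^{2} = 27, confirms none beats the candidate (3,3,3), and checks the multiplier relation there with λ = 3∕2.

```python
import math
import random

def f(x, y, z):
    # the function being maximized
    return x * y * z

best = f(3, 3, 3)  # candidate value, 27, at the point (3, 3, 3)

# Sample random points on the sphere x^2 + y^2 + z^2 = 27; none should
# exceed the candidate value.
random.seed(0)
for _ in range(50_000):
    v = [random.gauss(0, 1) for _ in range(3)]
    r = math.sqrt(sum(c * c for c in v))
    x, y, z = (c * math.sqrt(27) / r for c in v)
    assert f(x, y, z) <= best + 1e-9

# Multiplier relation (yz, xz, xy) = lam * (2x, 2y, 2z) at (3, 3, 3):
lam = 3 / 2
assert all(abs(a - lam * b) < 1e-12
           for a, b in zip((9, 9, 9), (6, 6, 6)))
```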
The method of Lagrange multipliers allows you to consider maximization of functions
defined on closed and bounded sets. Recall that any continuous function defined on a
closed and bounded set has a maximum and a minimum on the set. Candidates for the
extremum on the interior of the set can be located by setting the gradient equal to zero.
The consideration of the boundary can then sometimes be handled with the method of
Lagrange multipliers.
Example 22.5.2 Maximize f(x,y) = xy + y subject to the constraint x^{2} + y^{2} ≤ 1.
Here I know there is a maximum because the set is the closed disk, a closed and
bounded set. Therefore, it is just a matter of finding it. Look for singular points on the
interior of the circle: ∇f(x,y) = (y,x + 1) = (0,0). There are no points on the interior
of the circle where the gradient equals zero. Therefore, the maximum occurs on the
boundary of the circle. That is, the problem reduces to maximizing xy + y subject to
x^{2} + y^{2} = 1. From the above, (y,x + 1) = λ(2x,2y) at the maximizing point.
Multiplying the first equation by y and the second by x shows y^{2} = 2λxy = x^{2} + x,
and combining this with x^{2} + y^{2} = 1 gives 2x^{2} + x − 1 = 0, so x = 1∕2 or x = −1.
The case x = −1 forces y = 0 and f = 0, while x = 1∕2 gives y = ±√3∕2; the maximum
is f(1∕2,√3∕2) = 3√3∕4.
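The boundary maximization can be finished by hand, but a quick numerical scan also locates it. The Python sketch below (standard library only; illustrative, not the text's method) parametrizes the boundary circle as (cos t, sin t) and finds the maximum of xy + y, which is 3√3∕4 ≈ 1.299 at approximately (1∕2, √3∕2).

```python
import math

def f(x, y):
    return x * y + y

# The interior has no critical points, so the maximum lies on the circle
# x^2 + y^2 = 1.  Scan the parametrization (cos t, sin t) finely.
n = 200_000
best_val = -math.inf
best_pt = None
for k in range(n):
    t = 2 * math.pi * k / n
    x, y = math.cos(t), math.sin(t)
    if f(x, y) > best_val:
        best_val, best_pt = f(x, y), (x, y)
```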
Example 22.5.3 Find candidates for the maximum and minimum values of the function
f(x,y) = xy − x^{2} on the set {(x,y) : x^{2} + 2xy + y^{2} ≤ 4}.
First, the only point where ∇f equals zero is (x,y) = (0,0) and this is in
the desired set. In fact it is an interior point of this set. This takes care of the
interior points. What about those on the boundary x^{2} + 2xy + y^{2} = 4? The
problem is to maximize xy − x^{2} subject to the constraint x^{2} + 2xy + y^{2} = 4. The
Lagrangian is xy − x^{2} − λ(x^{2} + 2xy + y^{2} − 4). Setting the partial derivatives with
respect to x and y equal to zero gives y − 2x − λ(2x + 2y) = 0 and x − λ(2x + 2y) = 0.
Subtracting the second from the first yields y = 3x, and substituting this into the
constraint gives 16x^{2} = 4, so the candidates on the boundary are (1∕2,3∕2) and
(−1∕2,−3∕2), each of which gives f = 1∕2. Since f(0,0) = 0, it appears that (1∕2,3∕2)
yields a
maximum. However, this is a little misleading. How do you even know a maximum or a
minimum exists? The set x^{2} + 2xy + y^{2} ≤ 4 is an unbounded set which lies
between the two lines x + y = 2 and x + y = −2. In fact there is no minimum. For
example, take x = 100, y = −98. Then xy − x^{2} = x(y − x) = 100(−98 − 100) = −19800,
a large negative number much less than 0, the value at the point (0,0).
There are no magic bullets here. It was still required to solve a system of
nonlinear equations to get the answer. However, it does often help to do it this
way.
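The non-existence of a minimum is also easy to see computationally. Since x^{2} + 2xy + y^{2} = (x + y)^{2}, every point on the line x + y = 2 lies in the set, and marching out along that line drives f down without bound. A small Python check (standard library only, illustrative):

```python
def f(x, y):
    return x * y - x * x

# Points (x, 2 - x) satisfy (x + y)^2 = 4, hence lie in the constraint set,
# yet f(x, 2 - x) = 2x - 2x^2 tends to -infinity as x grows.
values = [f(x, 2 - x) for x in (10, 100, 1000)]

# The boundary points (1/2, 3/2) and (-1/2, -3/2) both give f = 1/2:
assert abs(f(0.5, 1.5) - 0.5) < 1e-12
assert abs(f(-0.5, -1.5) - 0.5) < 1e-12
```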
A nice observation in the case that the function f, which you are trying to maximize,
and the function g, which defines the constraint, are functions of two or three variables is
the following.
At points of interest,
∇f × ∇g = 0
This follows from the above because at these points,
∇f = λ∇g
so the angle between the two vectors ∇f and ∇g is either 0 or π. Therefore, the sine of
this angle equals 0. By the geometric description of the cross product, this implies the
cross product equals 0. Here is an example.
Example 22.5.4 Minimize f(x,y) = xy − x^{2} on the set {(x,y) : x^{2} + 2xy + y^{2} = 4}.
Using the observation about the cross product, and regarding f and g as functions of
three variables whose gradients have zero third component,

(y − 2x, x, 0) × (2x + 2y, 2x + 2y, 0) = (0, 0, (y − 2x)(2x + 2y) − x(2x + 2y)).

Setting the third component equal to zero and simplifying gives 4xy − 2y^{2} + 6x^{2} = 0.
Thus there are two equations, x^{2} + 2xy + y^{2} = 4 and 4xy − 2y^{2} + 6x^{2} = 0. Solving these
two yields the points of interest (−1∕2,−3∕2) and (1∕2,3∕2). Both give the same value for
f, namely 1∕2, a maximum; as the previous example showed, there is no minimum on this
set.
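The cross-product condition can be verified directly at the two points. The Python sketch below (standard library only; the helper names are my own) computes the z-component of ∇f × ∇g, treating both gradients as vectors in the plane z = 0.

```python
def grad_f(x, y):
    # gradient of f(x, y) = xy - x^2
    return (y - 2 * x, x)

def grad_g(x, y):
    # gradient of g(x, y) = x^2 + 2xy + y^2 - 4
    return (2 * x + 2 * y, 2 * x + 2 * y)

def cross_z(u, v):
    # z-component of (u1, u2, 0) x (v1, v2, 0)
    return u[0] * v[1] - u[1] * v[0]

points = [(0.5, 1.5), (-0.5, -1.5)]
for x, y in points:
    assert abs(x ** 2 + 2 * x * y + y ** 2 - 4) < 1e-12      # constraint holds
    assert abs(cross_z(grad_f(x, y), grad_g(x, y))) < 1e-12  # gradients parallel

values = [x * y - x * x for x, y in points]  # both equal 1/2
```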
The above generalizes to a general procedure which is described in the following
major Theorem. All correct proofs of this theorem will involve some appeal to the
implicit function theorem or to fundamental existence theorems from differential
equations. A complete proof is very fascinating but it will not come cheap. Good
advanced calculus books will usually give a correct proof. If you are interested, there is a
complete proof later. First here is a simple definition explaining one of the terms in the
statement of this theorem.
Definition 22.5.5 Let A be an m × n matrix. A submatrix is any matrix which can be obtained from A by deleting some rows and some columns.
Theorem 22.5.6 Let U be an open subset of ℝ^{n} and let f : U → ℝ be a C^{1} function. Then if x_{0} ∈ U has the property that

g_{i}(x_{0}) = 0, i = 1,···,m, g_{i} a C^{1} function, (22.2)

and x_{0} is either a local maximum or local minimum of f on the intersection of the level sets just described, and if some m × m submatrix of the m × n matrix whose i^{th} row is ∇g_{i}(x_{0}) has nonzero determinant, then there exist scalars λ_{1},···,λ_{m} such that

∇f(x_{0}) = λ_{1}∇g_{1}(x_{0}) + ··· + λ_{m}∇g_{m}(x_{0}) (22.3)
To help remember how to use 22.3, do the following. First write the Lagrangian,
L = f(x) − ∑_{i=1}^{m} λ_{i}g_{i}(x)
and then proceed to take derivatives with respect to each of the components of x and also
derivatives with respect to each λ_{i} and set all of these equations equal to 0. The formula
22.3 is what results from taking the derivatives of L with respect to the components of x.
When you take the derivatives with respect to the Lagrange multipliers, and set what
results equal to 0, you just pick up the constraint equations. This yields n + m
equations for the n + m unknowns x_{1},···,x_{n},λ_{1},···,λ_{m}. Then you proceed to
look for solutions to these equations. Of course these might be impossible to
find using methods of algebra, but you just do your best and hope it will work
out.
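When the algebra does not work out, the n + m equations lend themselves to numerical root-finding instead. The Python sketch below (standard library only; `newton` and `F` are hypothetical names, not from the text) applies Newton's method with a finite-difference Jacobian to the system from Example 22.5.1, maximizing xyz on x^{2} + y^{2} + z^{2} = 27, and converges to (3,3,3) with λ = 3∕2.

```python
def newton(F, guess, steps=40, h=1e-7):
    # Newton's method for F(x) = 0 with a finite-difference Jacobian.
    x = list(guess)
    n = len(x)
    for _ in range(steps):
        Fx = F(x)
        # finite-difference Jacobian J[i][j] = dF_i/dx_j
        J = [[0.0] * n for _ in range(n)]
        for j in range(n):
            xp = x[:]
            xp[j] += h
            Fp = F(xp)
            for i in range(n):
                J[i][j] = (Fp[i] - Fx[i]) / h
        # solve J d = -Fx by Gaussian elimination with partial pivoting
        A = [J[i][:] + [-Fx[i]] for i in range(n)]
        for c in range(n):
            p = max(range(c, n), key=lambda r: abs(A[r][c]))
            A[c], A[p] = A[p], A[c]
            for r in range(c + 1, n):
                m = A[r][c] / A[c][c]
                for k in range(c, n + 1):
                    A[r][k] -= m * A[c][k]
        d = [0.0] * n
        for r in range(n - 1, -1, -1):
            s = sum(A[r][k] * d[k] for k in range(r + 1, n))
            d[r] = (A[r][n] - s) / A[r][r]
        x = [xi + di for xi, di in zip(x, d)]
    return x

def F(v):
    # dL/dx, dL/dy, dL/dz and the constraint, for
    # L = xyz - lam * (x^2 + y^2 + z^2 - 27)
    x, y, z, lam = v
    return [y * z - 2 * lam * x,
            x * z - 2 * lam * y,
            x * y - 2 * lam * z,
            x * x + y * y + z * z - 27]

sol = newton(F, [2.0, 2.0, 2.0, 1.0])
```

Of course Newton's method only finds the root nearest its starting guess, so in practice you would try several starting points, just as the algebraic approach produces several candidates.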
Example 22.5.7 Minimize xyz subject to the constraints x^{2} + y^{2} + z^{2} = 4 and x − 2y = 0.
Form the Lagrangian,
L = xyz − λ(x^{2} + y^{2} + z^{2} − 4) − μ(x − 2y)
and proceed to take derivatives with respect to every possible variable, leading to the
following system of equations.

yz − 2λx − μ = 0
xz − 2λy + 2μ = 0
xy − 2λz = 0
x^{2} + y^{2} + z^{2} = 4
x − 2y = 0

Now you have to find the solutions to this system of equations. In general, this could be
very hard or even impossible. If λ = 0, then from the third equation, either x or y must
equal 0. Therefore, from the first two equations, μ = 0 also. If μ = 0 and λ≠0, then from
the first two equations, xyz = 2λx^{2} and xyz = 2λy^{2} and so either x = y or x = −y, which
requires that both x and y equal zero thanks to the last equation. But then from the
fourth equation, z = ±2 and now this contradicts the third equation. Thus μ and λ are
either both equal to zero or neither one is, and in the former case the expression xyz
equals zero. However, I know this is not the best value for a minimizer because I can
take x = 2√(3∕5), y = √(3∕5), and z = −1. This satisfies the constraints and the
product of these numbers equals a negative number. Therefore, both μ and λ must
be nonzero. Now use the last equation to eliminate x and write the following
system.
5y^{2} + z^{2} = 4
y^{2} − λz = 0
yz − λy + μ = 0
yz − 4λy − μ = 0
From the last equation, μ = yz − 4λy. Substitute this into the third and get

5y^{2} + z^{2} = 4
y^{2} − λz = 0
yz − λy + yz − 4λy = 0
y = 0 will not yield the minimum value from the above example. Therefore, divide the
last equation by y and solve for λ to get λ = (2∕5)z. Now put this in the second equation
to conclude

5y^{2} + z^{2} = 4
y^{2} − (2∕5)z^{2} = 0
a system which is easy to solve. Thus y^{2} = 8∕15 and z^{2} = 4∕3. Therefore, candidates for
minima are (2√(8∕15), √(8∕15), ±√(4∕3)) and (−2√(8∕15), −√(8∕15), ±√(4∕3)), a choice of 4 points to
check. Clearly the one which gives the smallest value is (2√(8∕15), √(8∕15), −√(4∕3))
or (−2√(8∕15), −√(8∕15), −√(4∕3)), and the minimum value of the function subject to the
constraints is 2(8∕15)(−√(4∕3)) = −(32∕45)√3.
You should rework this problem, first solving the second, easy constraint for
x and then producing a simpler problem involving only the variables y and z.
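Finally, the candidates can be double-checked numerically. Since x = 2y, the constraint set is the circle 5y^{2} + z^{2} = 4 in the plane x = 2y; the Python sketch below (standard library only, illustrative) verifies both constraints at the four candidate points and confirms that no point on the circle does better than the claimed minimum −(32∕45)√3 ≈ −1.2317.

```python
import math

def f(x, y, z):
    return x * y * z

y0 = math.sqrt(8 / 15)  # from y^2 = 8/15
z0 = math.sqrt(4 / 3)   # from z^2 = 4/3
candidates = [(2 * s * y0, s * y0, t * z0)
              for s in (1, -1) for t in (1, -1)]

for x, y, z in candidates:
    assert abs(x * x + y * y + z * z - 4) < 1e-12  # sphere constraint
    assert abs(x - 2 * y) < 1e-12                  # plane constraint

min_val = min(f(*p) for p in candidates)

# Parametrize the whole constraint circle: y = (2/sqrt(5)) cos t,
# z = 2 sin t, x = 2y, and confirm nothing on it beats min_val.
n = 100_000
scan = min(f(4 / math.sqrt(5) * math.cos(2 * math.pi * k / n),
             2 / math.sqrt(5) * math.cos(2 * math.pi * k / n),
             2 * math.sin(2 * math.pi * k / n))
           for k in range(n))
assert scan >= min_val - 1e-6
```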