Differential Properties of Functions Defined on $\mathbb{R}^n$
Prof. Dan A. Simovici
UMB
Partial Derivatives
Let $S$ be an open subset of $\mathbb{R}^n$ and let $f : S \longrightarrow \mathbb{R}$ be a function.
The partial derivatives of $f$ are denoted by $\frac{\partial f}{\partial x_i}$ for $1 \leqslant i \leqslant n$.
The second order partial derivatives are denoted by $\frac{\partial^2 f}{\partial x_i \partial x_j}$.
Definition
The linear form
\[ f'(\mathbf{x}, \mathbf{h}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i} h_i \]
is the first differential of $f$ at $\mathbf{x}$.
It is also the derivative at $t = 0$ of the function $g(t) = f(\mathbf{x} + t\mathbf{h})$.
$f'(\mathbf{x}, \mathbf{h})$ can be interpreted as the derivative of $f$ at $\mathbf{x}$ in the direction $\mathbf{h}$.
Partial Derivatives
Definition
The gradient of $f$ at $\mathbf{x}$ is the vector
\[ \nabla f(\mathbf{x}) = \begin{pmatrix} \frac{\partial f}{\partial x_1}(\mathbf{x}) \\ \vdots \\ \frac{\partial f}{\partial x_n}(\mathbf{x}) \end{pmatrix}. \]
We have $f'(\mathbf{x}, \mathbf{h}) = \nabla f(\mathbf{x})' \mathbf{h}$.
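As a numerical illustration, the following Python sketch (using NumPy; the function $f$ and the vectors $\mathbf{x}$, $\mathbf{h}$ are assumptions chosen for the example) approximates the gradient by central differences and checks that $f'(\mathbf{x}, \mathbf{h}) = \nabla f(\mathbf{x})' \mathbf{h}$ agrees with the derivative of $g(t) = f(\mathbf{x} + t\mathbf{h})$ at $t = 0$.

import numpy as np

def f(x):
    # illustrative function f(x1, x2) = x1^2 x2 + sin(x2)
    return x[0] ** 2 * x[1] + np.sin(x[1])

def gradient(f, x, eps=1e-6):
    # central-difference approximation of the gradient of f at x
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

x = np.array([1.0, 2.0])
h = np.array([0.5, -1.0])

# the first differential f'(x, h) = grad f(x)' h ...
first_diff = gradient(f, x) @ h

# ... equals the derivative of g(t) = f(x + t h) at t = 0
eps = 1e-6
dg0 = (f(x + eps * h) - f(x - eps * h)) / (2 * eps)
print(first_diff, dg0)   # the two values agree up to discretization error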
Partial Derivatives
The quadratic form
\[ f''(\mathbf{x}, \mathbf{h}) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial^2 f}{\partial x_i \partial x_j} h_i h_j \]
is the second order differential of $f$ at $\mathbf{x}$.
It is also the second order derivative of the function $g(t) = f(\mathbf{x} + t\mathbf{h})$; accordingly, it is the second order derivative of $f$ at $\mathbf{x}$ in the direction $\mathbf{h}$.
The matrix of second derivatives of $f$,
\[ f''(\mathbf{x}) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right), \]
is called the Hessian of $f$ at $\mathbf{x}$.
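A similar numerical sketch (again with an illustrative function; the step sizes are assumptions) approximates the Hessian entrywise and checks that the quadratic form $\mathbf{h}' f''(\mathbf{x}) \mathbf{h}$ agrees with the second derivative of $g(t) = f(\mathbf{x} + t\mathbf{h})$ at $t = 0$.

import numpy as np

def f(x):
    # illustrative function f(x1, x2) = x1^2 x2 + sin(x2)
    return x[0] ** 2 * x[1] + np.sin(x[1])

def hessian(f, x, eps=1e-4):
    # entrywise central-difference approximation of the Hessian of f at x
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

x = np.array([1.0, 2.0])
h = np.array([0.5, -1.0])

second_diff = h @ hessian(f, x) @ h      # h' H_f(x) h

# second derivative of g(t) = f(x + t h) at t = 0
eps = 1e-4
d2g0 = (f(x + eps * h) - 2 * f(x) + f(x - eps * h)) / eps ** 2
print(second_diff, d2g0)   # both approximate the same quantity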
The Taylor Formula
Taylor’s Theorem for One-Argument Functions
Theorem
(Taylor’s Theorem for One-Argument Functions) Let $f : [a, b] \longrightarrow \mathbb{R}$ be a function such that $f$ together with its first $n - 1$ derivatives $f^{(1)}, \ldots, f^{(n-1)}$ are continuous on the interval $[a, b]$. If $f^{(n)}$ exists on $(a, b)$, then there exists $c \in (a, b)$ such that
\[ f(b) = f(a) + f^{(1)}(a)(b - a) + \frac{f^{(2)}(a)}{2!}(b - a)^2 + \cdots + \frac{f^{(n-1)}(a)}{(n - 1)!}(b - a)^{n-1} + \frac{f^{(n)}(c)}{n!}(b - a)^n. \]
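The statement can be checked numerically; a minimal sketch, assuming $f = \exp$ on $[a, b] = [0, 1]$ (so that every derivative of $f$ is again $\exp$), recovers the intermediate point $c$ from the remainder and verifies that it lies in $(a, b)$.

import math

a, b, n = 0.0, 1.0, 5

# Taylor polynomial of f = exp around a, with terms up to order n - 1
p = sum(math.exp(a) * (b - a) ** k / math.factorial(k) for k in range(n))

# the theorem predicts exp(b) - p = exp(c) / n! * (b - a)**n for some c in (a, b)
remainder = math.exp(b) - p
c = math.log(remainder * math.factorial(n) / (b - a) ** n)

print(remainder)     # about 0.00995
print(a < c < b)     # True: the intermediate point lies in (a, b)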
The Taylor Formula
Proof
Let $\varphi : [a, b] \longrightarrow \mathbb{R}$ be the function defined by
\[ \varphi(x) = f(b) - f(x) - f^{(1)}(x)(b - x) - \frac{f^{(2)}(x)}{2!}(b - x)^2 - \cdots - \frac{f^{(n-1)}(x)}{(n - 1)!}(b - x)^{n-1}. \]
The derivative of $\varphi$ exists for any $x \in (a, b)$ and is easily seen to be
\[ \varphi'(x) = -\frac{f^{(n)}(x)}{(n - 1)!}(b - x)^{n-1}. \]
Define the function $g : [a, b] \longrightarrow \mathbb{R}$ as
\[ g(x) = \varphi(x) - \left( \frac{b - x}{b - a} \right)^n \varphi(a). \]
The Taylor Formula
Proof (cont’d)
Since $g(a) = g(b) = 0$, by Rolle’s Theorem from elementary analysis, there exists $c \in (a, b)$ such that $g'(c) = 0$. Note that
\[ g'(x) = \varphi'(x) + n \frac{(b - x)^{n-1}}{(b - a)^n} \varphi(a), \]
so
\[ \varphi'(c) + n \frac{(b - c)^{n-1}}{(b - a)^n} \varphi(a) = 0. \]
The Taylor Formula
Proof (cont’d)
Since $\varphi'(c) = -\frac{f^{(n)}(c)}{(n - 1)!}(b - c)^{n-1}$, it follows that
\[ \varphi(a) = \frac{(b - a)^n}{n!} f^{(n)}(c), \]
which implies the desired equality
\[ f(b) - f(a) - f^{(1)}(a)(b - a) - \frac{f^{(2)}(a)}{2!}(b - a)^2 - \cdots - \frac{f^{(n-1)}(a)}{(n - 1)!}(b - a)^{n-1} = \frac{(b - a)^n}{n!} f^{(n)}(c). \]
The Taylor Formula
Notations
Let $f : S \longrightarrow \mathbb{R}$ be a function, where $S \subseteq \mathbb{R}^n$. If all partial derivatives $\frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}}$ exist for $1 \leqslant k \leqslant n$ and $\mathbf{h} \in \mathbb{R}^n$, define the expression
\[ f^{[k]}(\mathbf{x}; \mathbf{h}) = \sum_{i_1=1}^{n} \cdots \sum_{i_k=1}^{n} \frac{\partial^k f}{\partial x_{i_1} \cdots \partial x_{i_k}}(\mathbf{x}) \, h_{i_1} \cdots h_{i_k}. \]
This notation is needed for formulating Taylor’s Theorem for functions of $n$ variables. Note that
\[ f^{[k]}(\mathbf{x}; \lambda\mathbf{h}) = \lambda^k f^{[k]}(\mathbf{x}; \mathbf{h}). \]
The function $f^{[1]}(\mathbf{x}; \mathbf{h})$ is $f'(\mathbf{x}, \mathbf{h})$, the differential of the function $f$.
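The expression $f^{[k]}$ can be computed directly from its definition; the sketch below (an illustration, with a sample polynomial $f$ and arbitrary choices of $\mathbf{x}$, $\mathbf{h}$, $\lambda$) sums over all multi-indices using nested central differences and confirms the homogeneity property $f^{[k]}(\mathbf{x}; \lambda\mathbf{h}) = \lambda^k f^{[k]}(\mathbf{x}; \mathbf{h})$.

import itertools
import numpy as np

def partial(f, i, eps=1e-3):
    # return a function approximating the partial derivative of f w.r.t. x_i
    def df(x):
        e = np.zeros_like(x)
        e[i] = eps
        return (f(x + e) - f(x - e)) / (2 * eps)
    return df

def f_bracket(f, k, x, h):
    # f^[k](x; h): sum over all multi-indices (i_1, ..., i_k)
    total = 0.0
    for idx in itertools.product(range(len(x)), repeat=k):
        d = f
        for i in idx:            # build the k-th order partial derivative
            d = partial(d, i)
        total += d(x) * np.prod([h[i] for i in idx])
    return total

def f(x):
    # illustrative polynomial f(x1, x2) = x1^3 + x1 x2^2
    return x[0] ** 3 + x[0] * x[1] ** 2

x = np.array([1.0, 2.0])
h = np.array([0.3, -0.7])
lam, k = 2.0, 2

print(f_bracket(f, k, x, lam * h))          # f^[k](x; lam h) ...
print(lam ** k * f_bracket(f, k, x, h))     # ... equals lam^k f^[k](x; h)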
The Taylor Formula
Notations
For $k = 2$, we have
\[ f^{[2]}(\mathbf{x}; \mathbf{h}) = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \frac{\partial^2 f}{\partial x_{i_1} \partial x_{i_2}}(\mathbf{x}) \, h_{i_1} h_{i_2} = \mathbf{h}' H_f(\mathbf{x}) \mathbf{h}, \]
where $H_f(\mathbf{x}) \in \mathbb{R}^{n \times n}$, the matrix defined by
\[ H_f(\mathbf{x}) = \left( \frac{\partial^2 f}{\partial x_i \partial x_j} \right), \]
is the Hessian matrix of the function $f$ at $\mathbf{x}$.
$f^{[2]}(\mathbf{x}; \mathbf{h})$ is $f''(\mathbf{x}, \mathbf{h})$, the second order differential of $f$ at $\mathbf{x}$.
The Taylor Formula
Taylor’s Theorem for Functions of Several Arguments
Theorem
Let $f : S \longrightarrow \mathbb{R}$ be a function, where $S \subseteq \mathbb{R}^n$ is an open set. If $f$ and all its partial derivatives of order less than or equal to $m$ are differentiable on $S$, and $\mathbf{a}, \mathbf{b} \in S$ are such that $[\mathbf{a}, \mathbf{b}] \subseteq S$, then there exists a point $\mathbf{c} \in [\mathbf{a}, \mathbf{b}]$ such that
\[ f(\mathbf{b}) = f(\mathbf{a}) + \sum_{k=1}^{m-1} \frac{1}{k!} f^{[k]}(\mathbf{a}, \mathbf{b} - \mathbf{a}) + \frac{1}{m!} f^{[m]}(\mathbf{c}, \mathbf{b} - \mathbf{a}). \]
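For a concrete function the intermediate point $\mathbf{c}$ can be located numerically; the sketch below (the cubic $f$, the endpoints $\mathbf{a}$, $\mathbf{b}$, and the tolerance are assumptions made for the example) scans the segment $[\mathbf{a}, \mathbf{b}]$ for a point at which the $m = 2$ instance of the formula holds.

import numpy as np

def f(x):
    # illustrative function with an easy Hessian: f(x1, x2) = x1^3 + 3 x2^2
    return x[0] ** 3 + 3 * x[1] ** 2

def grad(x):
    return np.array([3 * x[0] ** 2, 6 * x[1]])

def hess(x):
    return np.array([[6 * x[0], 0.0], [0.0, 6.0]])

a = np.array([0.0, 0.0])
b = np.array([1.0, 1.0])
d = b - a

# search [a, b] for c with f(b) = f(a) + grad f(a)' d + (1/2) d' H_f(c) d
for t in np.linspace(0.0, 1.0, 100001):
    c = a + t * d
    rhs = f(a) + grad(a) @ d + 0.5 * d @ hess(c) @ d
    if abs(rhs - f(b)) < 1e-4:
        print(t, c)   # t is about 1/3, so c is about (1/3, 1/3)
        break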
The Taylor Formula
Proof
Let $g : \mathbb{R} \longrightarrow \mathbb{R}$ be the function defined by $g(t) = f(\mathbf{p}(t))$, where $\mathbf{p}(t) = \mathbf{a} + t(\mathbf{b} - \mathbf{a})$ for $t \in [0, 1]$. We have $g(0) = f(\mathbf{a})$ and $g(1) = f(\mathbf{b})$. Taylor’s formula applied to $g$ yields the existence of $c \in (0, 1)$ such that
\[ g(1) = g(0) + \sum_{k=1}^{m-1} \frac{1}{k!} g^{(k)}(0) + \frac{1}{m!} g^{(m)}(c). \]
We claim that
\[ g^{(m)}(t) = f^{[m]}(\mathbf{p}(t), \mathbf{b} - \mathbf{a}). \]
Indeed, for $m = 1$, by applying the chain rule we have
\[ g'(t) = \sum_{j=1}^{n} \frac{\partial f}{\partial x_j}(\mathbf{p}(t))(b_j - a_j) = f^{[1]}(\mathbf{p}(t); \mathbf{b} - \mathbf{a}). \]
The Taylor Formula
Proof (cont’d)
Suppose that the equality holds for $m$. Then
\begin{align*}
g^{(m+1)}(t) &= \left( f^{[m]}(\mathbf{p}(t), \mathbf{b} - \mathbf{a}) \right)' \\
&= \left( \sum_{i_1=1}^{n} \cdots \sum_{i_m=1}^{n} \frac{\partial^m f}{\partial x_{i_1} \cdots \partial x_{i_m}}(\mathbf{p}(t))(b_{i_1} - a_{i_1}) \cdots (b_{i_m} - a_{i_m}) \right)' \\
&= \sum_{i_1=1}^{n} \cdots \sum_{i_m=1}^{n} \sum_{i_{m+1}=1}^{n} \frac{\partial^{m+1} f}{\partial x_{i_1} \cdots \partial x_{i_{m+1}}}(\mathbf{p}(t))(b_{i_1} - a_{i_1}) \cdots (b_{i_m} - a_{i_m})(b_{i_{m+1}} - a_{i_{m+1}}) \\
&= f^{[m+1]}(\mathbf{p}(t), \mathbf{b} - \mathbf{a}).
\end{align*}
When the values of the derivatives of $g$ are substituted we obtain the equality of the theorem.
The Taylor Formula
Example
For $m = 2$, Taylor’s Theorem yields the existence of $\mathbf{c} \in [\mathbf{a}, \mathbf{b}]$ such that
\begin{align*}
f(\mathbf{b}) &= f(\mathbf{a}) + f^{[1]}(\mathbf{a}, \mathbf{b} - \mathbf{a}) + \frac{1}{2}(\mathbf{b} - \mathbf{a})' H_f(\mathbf{c})(\mathbf{b} - \mathbf{a}) \\
&= f(\mathbf{a}) + (\nabla f)_{\mathbf{a}}' (\mathbf{b} - \mathbf{a}) + \frac{1}{2}(\mathbf{b} - \mathbf{a})' H_f(\mathbf{c})(\mathbf{b} - \mathbf{a}).
\end{align*}
The Taylor Formula
Example
Let $f : \mathbb{R}^3 \longrightarrow \mathbb{R}$ be the function given by $f(\mathbf{x}) = \| \mathbf{x} \|$ for $\mathbf{x} \in \mathbb{R}^3$. We have
\[ \frac{\partial f}{\partial x_1} = \frac{x_1}{\sqrt{x_1^2 + x_2^2 + x_3^2}} = \frac{x_1}{\| \mathbf{x} \|}, \]
and similar expressions hold for $\frac{\partial f}{\partial x_2}$ and $\frac{\partial f}{\partial x_3}$, so we have
\[ (\nabla f)_{\mathbf{x}_0} = \frac{1}{\| \mathbf{x}_0 \|} \mathbf{x}_0 \]
if $\mathbf{x}_0 \neq \mathbf{0}_3$. The gradient $(\nabla f)_{\mathbf{x}_0}$ is a unit vector for every $\mathbf{x}_0 \in \mathbb{R}^3 - \{\mathbf{0}_3\}$.
The Taylor Formula
Example (cont’d)
The function $R(\mathbf{x}_0, \mathbf{x})$ is given by
\begin{align*}
R(\mathbf{x}_0, \mathbf{x}) &= \frac{f(\mathbf{x}) - f(\mathbf{x}_0) - (\nabla f)_{\mathbf{x}_0}' (\mathbf{x} - \mathbf{x}_0)}{\| \mathbf{x} - \mathbf{x}_0 \|} \\
&= \frac{\| \mathbf{x} \| - \| \mathbf{x}_0 \| - \frac{1}{\| \mathbf{x}_0 \|} \mathbf{x}_0' (\mathbf{x} - \mathbf{x}_0)}{\| \mathbf{x} - \mathbf{x}_0 \|},
\end{align*}
and $R(\mathbf{x}_0, \mathbf{x}) \longrightarrow 0$ as $\mathbf{x} \longrightarrow \mathbf{x}_0$. Therefore, the function $f$ is differentiable in $\mathbf{x}_0$ if $\mathbf{x}_0 \neq \mathbf{0}_3$.
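A quick numerical check of this example (the point $\mathbf{x}_0$ and the displacement direction are arbitrary choices) confirms that the gradient of the norm is a unit vector and that the remainder $R(\mathbf{x}_0, \mathbf{x})$ shrinks as $\mathbf{x}$ approaches $\mathbf{x}_0$.

import numpy as np

f = np.linalg.norm                       # f(x) = ||x||

x0 = np.array([1.0, 2.0, 2.0])
g = x0 / np.linalg.norm(x0)              # analytic gradient at x0 != 0_3
print(np.linalg.norm(g))                 # 1.0: the gradient is a unit vector

# the remainder R(x0, x) tends to 0 as x approaches x0
for t in [1e-1, 1e-2, 1e-3, 1e-4]:
    x = x0 + t * np.array([0.3, -0.5, 0.8])
    R = (f(x) - f(x0) - g @ (x - x0)) / np.linalg.norm(x - x0)
    print(R)                             # shrinks toward 0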
The Taylor Formula
Example
Let $f : \mathbb{R}^3 \longrightarrow \mathbb{R}$ be the function given by $f(\mathbf{x}) = \| \mathbf{x} \|^2$ for $\mathbf{x} \in \mathbb{R}^3$. We have
\[ \frac{\partial f}{\partial x_1} = 2x_1, \quad \frac{\partial f}{\partial x_2} = 2x_2, \quad \frac{\partial f}{\partial x_3} = 2x_3, \]
so $(\nabla f)_{\mathbf{x}} = 2\mathbf{x}$ and this function is differentiable for all $\mathbf{x} \in \mathbb{R}^3$.
Differentiability and Optimization
Definition
Let $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function.
A point $\mathbf{x}_0 \in \mathbb{R}^n$ is a local minimum for $f$ if there exists a closed sphere $B(\mathbf{x}_0, \epsilon)$ such that $f(\mathbf{x}_0) \leqslant f(\mathbf{x})$ for every $\mathbf{x} \in B(\mathbf{x}_0, \epsilon)$. If we have $f(\mathbf{x}_0) < f(\mathbf{x})$ for every $\mathbf{x} \in B(\mathbf{x}_0, \epsilon) - \{\mathbf{x}_0\}$, then $\mathbf{x}_0$ is a strict local minimum.
A global minimum for $f$ is a point $\mathbf{x}_0$ such that $f(\mathbf{x}_0) \leqslant f(\mathbf{x})$ for $\mathbf{x} \in \mathbb{R}^n$; $\mathbf{x}_0$ is a strict global minimum if $f(\mathbf{x}_0) < f(\mathbf{x})$ for $\mathbf{x} \in \mathbb{R}^n - \{\mathbf{x}_0\}$.
Differentiability and Optimization
Similar definitions can be formulated for local maxima, strict local maxima, global maxima, and strict global maxima:
$\mathbf{x}_0 \in \mathbb{R}^n$ is a local maximum if there exists a closed sphere $B(\mathbf{x}_0, \epsilon)$ such that $f(\mathbf{x}_0) \geqslant f(\mathbf{x})$ for every $\mathbf{x} \in B(\mathbf{x}_0, \epsilon)$. If $f(\mathbf{x}_0) > f(\mathbf{x})$ for every $\mathbf{x} \in B(\mathbf{x}_0, \epsilon) - \{\mathbf{x}_0\}$, then $\mathbf{x}_0$ is a strict local maximum.
A global maximum for $f$ is a point $\mathbf{x}_0$ such that $f(\mathbf{x}_0) \geqslant f(\mathbf{x})$ for $\mathbf{x} \in \mathbb{R}^n$; $\mathbf{x}_0$ is a strict global maximum if $f(\mathbf{x}_0) > f(\mathbf{x})$ for $\mathbf{x} \in \mathbb{R}^n - \{\mathbf{x}_0\}$.
A local minimum or maximum of $f$ is said to be a local extremum.
Differentiability and Optimization
An unconstrained optimization problem consists in finding a local minimum or a local maximum of a function $f : \mathbb{R}^n \longrightarrow \mathbb{R}$, when such an extremum exists. The function $f$ is referred to as the objective function.
Finding a local minimum of a function $f$ is equivalent to finding a local maximum for the function $-f$.
Definition
Let $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function.
An ascent direction for $f$ in $\mathbf{x}_0$ is a vector $\mathbf{h} \in \mathbb{R}^n$ such that there exists a positive number $\epsilon$ for which $0 < t < \epsilon$ implies $f(\mathbf{x}_0 + t\mathbf{h}) > f(\mathbf{x}_0)$.
A descent direction for $f$ in $\mathbf{x}_0$ is a vector $\mathbf{h} \in \mathbb{R}^n$ such that there exists a positive number $\epsilon$ for which $0 < t < \epsilon$ implies $f(\mathbf{x}_0 + t\mathbf{h}) \leqslant f(\mathbf{x}_0)$.
Differentiability and Optimization
If $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ is a differentiable function, the existence of the gradient provides new instruments for computing extrema. The vector $(\nabla f)_{\mathbf{x}}$ points in the direction of increasing values of the function $f$.
[Figure: at a point $\mathbf{x}_0$, the gradient $(\nabla f)_{\mathbf{x}_0}$, an ascent direction $\mathbf{u}$, a descent direction $\mathbf{v}$, and the direction of increase of $f(\mathbf{x})$.]
An ascent direction $\mathbf{u}$ in $\mathbf{x}_0$ makes an acute angle with the vector $(\nabla f)_{\mathbf{x}_0}$, while a descent direction $\mathbf{v}$ makes an obtuse angle with the same vector, as we see in the next statement.
Differentiability and Optimization
Theorem
Let $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function differentiable at $\mathbf{x}_0$.
If $(\nabla f)_{\mathbf{x}_0}' \mathbf{w} < 0$, then $\mathbf{w}$ is a descent direction for $f$ in $\mathbf{x}_0$.
If $(\nabla f)_{\mathbf{x}_0}' \mathbf{w} > 0$, then $\mathbf{w}$ is an ascent direction for $f$ in $\mathbf{x}_0$.
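The theorem is easy to observe numerically; in the sketch below (the quadratic objective and the starting point are illustrative assumptions), $\mathbf{w} = -(\nabla f)_{\mathbf{x}_0}$ satisfies $(\nabla f)_{\mathbf{x}_0}' \mathbf{w} < 0$, and $f$ indeed decreases along $\mathbf{x}_0 + t\mathbf{w}$ for small $t > 0$.

import numpy as np

def f(x):
    # illustrative objective: f(x1, x2) = (x1 - 1)^2 + 2 (x2 + 2)^2
    return (x[0] - 1) ** 2 + 2 * (x[1] + 2) ** 2

def grad(x):
    return np.array([2 * (x[0] - 1), 4 * (x[1] + 2)])

x0 = np.array([3.0, 1.0])
w = -grad(x0)                  # negative gradient: grad f(x0)' w < 0

print(grad(x0) @ w)            # negative, so w should be a descent direction
for t in [0.01, 0.05, 0.1]:
    print(f(x0 + t * w) < f(x0))   # True: f decreases for each small step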
Differentiability and Optimization
Proof
Since $f$ is differentiable in $\mathbf{x}_0$ we can write
\[ f(\mathbf{x}_0 + t\mathbf{w}) = f(\mathbf{x}_0) + t (\nabla f)_{\mathbf{x}_0}' \mathbf{w} + t \| \mathbf{w} \| R(\mathbf{x}_0, \mathbf{x}_0 + t\mathbf{w}) \]
for $t > 0$, where $\lim_{t \to 0} R(\mathbf{x}_0, \mathbf{x}_0 + t\mathbf{w}) = 0$. This implies
\[ \frac{f(\mathbf{x}_0 + t\mathbf{w}) - f(\mathbf{x}_0)}{t} = (\nabla f)_{\mathbf{x}_0}' \mathbf{w} + \| \mathbf{w} \| R(\mathbf{x}_0, \mathbf{x}_0 + t\mathbf{w}). \]
Since $(\nabla f)_{\mathbf{x}_0}' \mathbf{w} < 0$ and $\lim_{t \to 0} R(\mathbf{x}_0, \mathbf{x}_0 + t\mathbf{w}) = 0$, there exists $\epsilon > 0$ such that $0 < t < \epsilon$ implies $f(\mathbf{x}_0 + t\mathbf{w}) - f(\mathbf{x}_0) \leqslant 0$, so $\mathbf{w}$ is a descent direction.
The argument for the second part of the theorem is similar.
Differentiability and Optimization
Theorem
Let $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function differentiable at $\mathbf{x}_0$. If $\mathbf{x}_0$ is a local minimum or a local maximum, then $(\nabla f)_{\mathbf{x}_0} = \mathbf{0}_n$.
Differentiability and Optimization
Proof
Let $\mathbf{x}_0$ be a local minimum of $f$. Suppose that $(\nabla f)_{\mathbf{x}_0} \neq \mathbf{0}_n$ and let $\mathbf{w} = -(\nabla f)_{\mathbf{x}_0}$. We have $(\nabla f)_{\mathbf{x}_0}' \mathbf{w} < 0$, so, by the previous theorem, $\mathbf{w}$ is a descent direction; in fact, since the difference quotient $\frac{f(\mathbf{x}_0 + t\mathbf{w}) - f(\mathbf{x}_0)}{t}$ tends to the negative number $(\nabla f)_{\mathbf{x}_0}' \mathbf{w}$, there exists a positive number $\epsilon$ such that $0 < t < \epsilon$ implies $f(\mathbf{x}_0 + t\mathbf{w}) < f(\mathbf{x}_0)$, which contradicts the initial assumption concerning $\mathbf{x}_0$. Therefore, $(\nabla f)_{\mathbf{x}_0} = \mathbf{0}_n$.
The case when $\mathbf{x}_0$ is a local maximum can be treated similarly.
Differentiability and Optimization
Definition
A stationary point of a differentiable function $f : S \longrightarrow \mathbb{R}$ (where $S \subseteq \mathbb{R}^n$) is a point $\mathbf{x} \in S$ such that $(\nabla f)_{\mathbf{x}} = \mathbf{0}_n$.
Observe that a stationary point of a function is not necessarily a local extremum of the function, as the next example shows.
Differentiability and Optimization
Example
Consider the function $f : \mathbb{R}^2 \longrightarrow \mathbb{R}$ defined by $f(x_1, x_2) = x_1 x_2$ for $(x_1, x_2) \in \mathbb{R}^2$. We have
\[ (\nabla f)_{\mathbf{x}} = \begin{pmatrix} x_2 \\ x_1 \end{pmatrix}, \]
which shows that $\mathbf{0}_2$ is a stationary point of $f$. Note that in any sphere $B(\mathbf{0}_2, \epsilon)$ the function $f$ takes both positive and negative values. Since $f(\mathbf{0}_2) = 0$, it is clear that although $\mathbf{0}_2$ is a stationary point of $f$, $\mathbf{0}_2$ is not a local extremum.
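This behavior is immediate to verify numerically: within any sphere $B(\mathbf{0}_2, \epsilon)$, however small, $f(x_1, x_2) = x_1 x_2$ takes values of both signs (the radius below is an arbitrary choice).

import numpy as np

def f(x):
    return x[0] * x[1]      # the function of the example

eps = 1e-3                  # radius of a small sphere around 0_2
u = np.array([eps / 2, eps / 2])      # point of B(0_2, eps) with f(u) > 0
v = np.array([eps / 2, -eps / 2])     # point of B(0_2, eps) with f(v) < 0

# f takes both signs arbitrarily close to 0_2, and f(0_2) = 0,
# so the stationary point 0_2 is not a local extremum
print(f(u) > 0 > f(v))      # True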
Differentiability and Optimization
For functions that are twice differentiable it is possible to give
characterizations of local extrema.
The next theorem gives sufficient conditions for the existence of a local
minimum.
Theorem
Let $f : \mathbb{R}^n \longrightarrow \mathbb{R}$ be a function twice differentiable at $\mathbf{x}_0$.
If $(\nabla f)_{\mathbf{x}_0} = \mathbf{0}_n$ and the Hessian matrix $H_f(\mathbf{x}_0)$ is positive definite, then $\mathbf{x}_0$ is a local minimum of $f$.
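In practice the hypothesis can be checked by inspecting the eigenvalues of the Hessian; a minimal sketch, assuming the illustrative quadratic $f(x_1, x_2) = x_1^2 + x_1 x_2 + 2x_2^2$ whose stationary point is $\mathbf{0}_2$:

import numpy as np

# gradient of f at the stationary point 0_2: (2 x1 + x2, x1 + 4 x2) = (0, 0)
grad0 = np.array([0.0, 0.0])

# constant Hessian of f
H = np.array([[2.0, 1.0],
              [1.0, 4.0]])

# positive definiteness: all eigenvalues of the symmetric matrix H are positive
eigenvalues = np.linalg.eigvalsh(H)
print(eigenvalues)                          # both positive
if np.all(grad0 == 0.0) and np.all(eigenvalues > 0.0):
    print("0_2 is a local minimum of f")    # the theorem applies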
Differentiability and Optimization
Proof
By Taylor’s Theorem, taking into account that $(\nabla f)_{\mathbf{x}_0} = \mathbf{0}_n$, we have
\[ f(\mathbf{x}) = f(\mathbf{x}_0) + \frac{1}{2} (\mathbf{x} - \mathbf{x}_0)' H_f(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0) + \| \mathbf{x} - \mathbf{x}_0 \|^2 R(\mathbf{x}_0, \mathbf{x} - \mathbf{x}_0), \]
where $\lim_{\mathbf{x} \to \mathbf{x}_0} R(\mathbf{x}_0, \mathbf{x} - \mathbf{x}_0) = 0$.
Suppose that $\mathbf{x}_0$ is not a minimum. Then, there exists a sequence $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n, \ldots$ with $\lim_{n \to \infty} \mathbf{x}_n = \mathbf{x}_0$ such that $f(\mathbf{x}_n) < f(\mathbf{x}_0)$ for each $n \geqslant 1$. Let $\mathbf{r}_n$ be the unit vector $\mathbf{r}_n = \frac{1}{\| \mathbf{x}_n - \mathbf{x}_0 \|}(\mathbf{x}_n - \mathbf{x}_0)$.
Differentiability and Optimization
Proof (cont’d)
We have
\[ f(\mathbf{x}_n) = f(\mathbf{x}_0) + \frac{1}{2} \| \mathbf{x}_n - \mathbf{x}_0 \|^2 \, \mathbf{r}_n' H_f(\mathbf{x}_0) \mathbf{r}_n + \| \mathbf{x}_n - \mathbf{x}_0 \|^2 R(\mathbf{x}_0, \mathbf{x}_n - \mathbf{x}_0) < f(\mathbf{x}_0), \]
which implies
\[ \frac{1}{2} \mathbf{r}_n' H_f(\mathbf{x}_0) \mathbf{r}_n + R(\mathbf{x}_0, \mathbf{x}_n - \mathbf{x}_0) < 0. \]
The sequence $(\mathbf{r}_1, \ldots, \mathbf{r}_n, \ldots)$ is bounded since it consists of unit vectors and, therefore, it contains a convergent subsequence $(\mathbf{r}_{i_1}, \ldots, \mathbf{r}_{i_m}, \ldots)$ such that $\lim_{m \to \infty} \mathbf{r}_{i_m} = \mathbf{r}$ and $\| \mathbf{r} \| = 1$. This implies $\mathbf{r}' H_f(\mathbf{x}_0) \mathbf{r} \leqslant 0$, which contradicts the fact that $H_f(\mathbf{x}_0)$ is positive definite. Therefore, $\mathbf{x}_0$ is a local minimum.