712 CHAPTER 22. THE DERIVATIVE

Lemma 22.2.3 Let f be differentiable at x. Then f is continuous at x and in fact, there exists K > 0 such that whenever ||v|| is small enough,

||f(x+v)− f(x)|| ≤ K ||v||

Also if f is differentiable at x, then

o(∥f(x+v)− f(x)∥) = o(v)

Proof: From the definition of the derivative,

f(x+v)− f(x) = Df(x)v+o(v) .

Let ||v|| be small enough that ||o(v)|| / ||v|| < 1, so that ||o(v)|| ≤ ||v||. Then for such v,

||f(x+v)− f(x)|| ≤ ||Df(x)v||+ ||v||≤ (||Df(x)||+1) ||v||

This proves the estimate with K = ||Df(x)|| + 1, and continuity at x follows because the right side tends to 0 as v → 0. Recall the operator norm discussed in Definition 17.1.5.

The last assertion is implied by the first as follows. Define

h(v) ≡ o(∥f(x+v)− f(x)∥) / ∥f(x+v)− f(x)∥ if ∥f(x+v)− f(x)∥ ≠ 0,

h(v) ≡ 0 if ∥f(x+v)− f(x)∥ = 0.

Then lim_{∥v∥→0} h(v) = 0 from continuity of f at x, which is implied by the first part. Also from the above estimate,

∥o(∥f(x+v)− f(x)∥)∥ / ∥v∥ = ∥h(v)∥ ∥f(x+v)− f(x)∥ / ∥v∥ ≤ ∥h(v)∥ (||Df(x)||+1)

This establishes the second claim. Here ||Df(x)|| is the operator norm of the linear transformation Df(x).
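The estimate in the lemma can be checked numerically on a concrete example. The following is only a sketch under stated assumptions: the map f : R^2 → R^2, its hand-computed derivative matrix, and the helper names (operator_norm, bound_holds) are hypothetical choices made for illustration and do not come from the text. It samples vectors v on a circle of a given radius about x and tests whether ||f(x+v)− f(x)|| ≤ (||Df(x)||+1) ||v||.

```python
import math

def f(x, y):
    # Hypothetical example map R^2 -> R^2, chosen only for illustration.
    return (x * y, math.sin(x) + y * y)

def jacobian(x, y):
    # Matrix of Df at (x, y), computed by hand for the f above.
    return ((y, x), (math.cos(x), 2.0 * y))

def operator_norm(m):
    # Operator (spectral) norm of a 2x2 matrix M:
    # the square root of the largest eigenvalue of M^T M.
    (a, b), (c, d) = m
    p = a * a + c * c  # entry (0, 0) of M^T M
    q = a * b + c * d  # entry (0, 1) of M^T M
    r = b * b + d * d  # entry (1, 1) of M^T M
    largest = (p + r) / 2.0 + math.sqrt(((p - r) / 2.0) ** 2 + q * q)
    return math.sqrt(largest)

def norm(v):
    # Euclidean norm on R^2.
    return math.hypot(v[0], v[1])

def bound_holds(x, y, radius, samples=360):
    # Test ||f(x+v) - f(x)|| <= K ||v|| for sampled v with ||v|| = radius,
    # where K = ||Df(x)|| + 1 as in the lemma.
    K = operator_norm(jacobian(x, y)) + 1.0
    fx = f(x, y)
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        v = (radius * math.cos(t), radius * math.sin(t))
        fv = f(x + v[0], y + v[1])
        diff = (fv[0] - fx[0], fv[1] - fx[1])
        if norm(diff) > K * norm(v):
            return False
    return True
```

For this particular f at the point (1, 2), the inequality holds on the sampled circles of radius 0.1 and 0.001 but fails for radius 10, which illustrates why the lemma asserts the estimate only when ||v|| is small enough.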

22.3 The Chain Rule

With the above lemma, it is easy to prove the chain rule.

Theorem 22.3.1 (The chain rule) Let U and V be open sets, U ⊆ X and V ⊆ Y. Suppose f : U → V is differentiable at x ∈ U and suppose g : V → F^q is differentiable at f(x) ∈ V. Then g◦f is differentiable at x and

D(g◦ f)(x) = Dg(f(x))Df(x) .
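The formula can be sanity-checked numerically before reading the proof. The sketch below is an illustration under assumptions, not part of the text's argument: the maps f and g on R^2, their hand-computed derivative matrices, and the helper names (matmul, numerical_jacobian, chain_rule_gap) are all hypothetical. It compares the matrix product Dg(f(x))Df(x) with a central-difference approximation of the derivative of g◦f.

```python
import math

def f(x, y):
    # Hypothetical inner map R^2 -> R^2.
    return (x * y, math.sin(x) + y * y)

def Df(x, y):
    # Matrix of Df at (x, y), computed by hand.
    return ((y, x), (math.cos(x), 2.0 * y))

def g(u, w):
    # Hypothetical outer map R^2 -> R^2.
    return (math.exp(u) + w, u * w)

def Dg(u, w):
    # Matrix of Dg at (u, w), computed by hand.
    return ((math.exp(u), 1.0), (w, u))

def matmul(a, b):
    # Product of two 2x2 matrices stored as nested tuples.
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def numerical_jacobian(func, x, y, h=1e-6):
    # Central differences; column j holds the partial derivative in variable j.
    fxp, fxm = func(x + h, y), func(x - h, y)
    fyp, fym = func(x, y + h), func(x, y - h)
    return tuple(
        ((fxp[i] - fxm[i]) / (2 * h), (fyp[i] - fym[i]) / (2 * h))
        for i in range(2)
    )

def compose(x, y):
    # The composition g o f.
    return g(*f(x, y))

def chain_rule_gap(x, y):
    # Largest entrywise difference between Dg(f(x)) Df(x) and the
    # finite-difference approximation of D(g o f)(x).
    exact = matmul(Dg(*f(x, y)), Df(x, y))
    approx = numerical_jacobian(compose, x, y)
    return max(
        abs(exact[i][j] - approx[i][j]) for i in range(2) for j in range(2)
    )
```

Calling chain_rule_gap(0.5, 0.3) should return a value near the finite-difference error level. Note that the order of the product matters: Dg is evaluated at f(x), and it multiplies Df(x) from the left.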

Proof: This follows from a computation. Let B(x,r) ⊆ U and let r also be small enough that for ||v|| ≤ r, it follows that f(x+v) ∈ V. Such an r exists because f is continuous at x. For ||v|| < r, the definition of differentiability of g and f implies

g(f(x+v))−g(f(x)) =