17.3. THE DERIVATIVE OF FUNCTIONS OF MANY VARIABLES 301

If you delete the $o(x - x_0)$ term and consider the function of $x$ given by what is left, the result is called the linear approximation to the function at the point $x_0$. In the case where $x \in \mathbb{R}^2$ and $f$ has values in $\mathbb{R}$, one can draw a picture to illustrate this.
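The linear approximation can also be checked numerically. The following is a minimal Python sketch, assuming the illustrative function $f(x, y) = x^2 y$ and the point $x_0 = (1, 2)$, neither of which comes from the text. The names `f`, `linear_approx`, `grad_at_x0`, and `error_ratio` are hypothetical. It measures the gap between $f$ and its linear approximation at points approaching $x_0$ and divides by the distance to $x_0$.

```python
import math

def f(x, y):
    # Illustrative function from R^2 to R (an assumption, not from the text).
    return x * x * y

# Base point x0 and the partial derivatives of f there, computed by hand:
# df/dx = 2xy = 4 and df/dy = x^2 = 1 at (1, 2).
x0 = (1.0, 2.0)
grad_at_x0 = (4.0, 1.0)

def linear_approx(x, y):
    # f(x0) + T(x - x0), where T has the 1 x 2 matrix (4  1).
    dx, dy = x - x0[0], y - x0[1]
    return f(*x0) + grad_at_x0[0] * dx + grad_at_x0[1] * dy

def error_ratio(s):
    # Step away from x0 along the direction (1, 1), scaled by s.
    x, y = x0[0] + s, x0[1] + s
    dist = math.hypot(s, s)
    return abs(f(x, y) - linear_approx(x, y)) / dist

# Ratios for s = 0.1, 0.01, 0.001, 0.0001.
ratios = [error_ratio(10.0 ** (-k)) for k in range(1, 5)]
for r in ratios:
    print(r)
```

The printed ratios should shrink toward 0 as the step shrinks, which is what the $o(x - x_0)$ term expresses. A finite list of steps only suggests the limit; it does not prove it.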

Of course the first and most obvious question is whether the linear transformation is unique. Otherwise, the definition of the derivative $Df(x)$ would not be well defined.

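Theorem 17.3.3 Suppose $f$ is differentiable, as given above in (17.5). Then $T$ is uniquely determined. Furthermore, the matrix of $T$ is the following $p \times n$ matrix

$$\begin{pmatrix} \dfrac{\partial f(x)}{\partial x_1} & \cdots & \dfrac{\partial f(x)}{\partial x_n} \end{pmatrix}$$

where

$$\frac{\partial f}{\partial x_i}(x) \equiv \lim_{t \to 0} \frac{f(x + t e_i) - f(x)}{t},$$

the $i$th partial derivative of $f$.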
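As a rough numerical illustration of the matrix in the theorem, the sketch below builds it column by column from difference quotients. It assumes an illustrative map $f(x_1, x_2) = (x_1 x_2,\ x_1 + x_2^2,\ 3x_1)$ from $\mathbb{R}^2$ to $\mathbb{R}^3$, which is not from the text, and the helper names `difference_quotient` and `approx_matrix` are hypothetical. A fixed small $t$ only approximates the limit in the definition.

```python
def f(x):
    # Illustrative map from R^2 to R^3 (an assumption, not from the text).
    x1, x2 = x
    return [x1 * x2, x1 + x2 * x2, 3.0 * x1]

def difference_quotient(func, x, i, t):
    # (f(x + t e_i) - f(x)) / t, returned as a list of p components.
    shifted = list(x)
    shifted[i] += t
    fx, fs = func(x), func(shifted)
    return [(a - b) / t for a, b in zip(fs, fx)]

def approx_matrix(func, x, t=1e-6):
    # Column i approximates the partial derivative with respect to x_i.
    n = len(x)
    columns = [difference_quotient(func, x, i, t) for i in range(n)]
    p = len(columns[0])
    # Transpose so the result is a list of p rows, each of length n.
    return [[columns[i][k] for i in range(n)] for k in range(p)]

J = approx_matrix(f, [2.0, 3.0])
for row in J:
    print(row)
```

At the point $(2, 3)$ the partial derivatives computed by hand are $(3, 1, 3)$ and $(2, 6, 0)$, so the printed $3 \times 2$ matrix should be close to those two columns.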

Proof: Suppose $T_1$ is another linear transformation which works. Thus, letting $t$ be a small positive real number,

$$f(x + th) = f(x) + T(th) + o(th)$$

$$f(x + th) = f(x) + T_1(th) + o(th)$$

Now $o(th) = o(t)$ because $h$ is fixed, and so subtracting these yields

$$T(th) - T_1(th) = o(t)$$

Divide both sides by $t$ and use the linearity of $T$ and $T_1$ to obtain

$$Th - T_1 h = \frac{o(t)}{t}$$

It follows on letting $t \to 0$ that $Th = T_1 h$. Since $h$ is arbitrary, this shows that $T = T_1$. Thus the derivative is well defined. So what is the matrix of this linear transformation? From Theorem 8.3.2, this is the matrix whose $i$th column is $Te_i$. However, from the definition of $T$, letting $t \neq 0$,

$$\frac{f(x + t e_i) - f(x)}{t} = \frac{1}{t}\bigl(T(t e_i) + o(t e_i)\bigr) = T(e_i) + \frac{o(t e_i)}{t} = T(e_i) + \frac{o(t)}{t}$$

Then letting t→ 0, it follows that

$$Te_i = \frac{\partial f}{\partial x_i}(x)$$