
and this converges to 0 as $(u,v) \to (0,0)$. This assertion follows from the inequality $|uv| \le \frac{1}{2}(u^2+v^2)$, which you can verify from $(u-v)^2 \ge 0$. Similar considerations apply in higher dimensions also. In general, this is a hard question because it involves a limit of a function of many variables. Furthermore, there is really no substitute for answering this question, because its resolution involves the definition of whether a function is differentiable. That may be why we spend most of our time on one dimensional considerations which involve taking the partial derivatives. The following exercises should help give you an idea of how to determine whether something is o.
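The inequality $|uv| \le \frac{1}{2}(u^2+v^2)$ used above can be checked in two lines. The text cites only $(u-v)^2 \ge 0$; the companion step with $(u+v)^2 \ge 0$ is added here as one way to handle the case where $uv$ is negative.

```latex
\begin{align*}
0 \le (u-v)^2 = u^2 - 2uv + v^2 &\implies uv \le \tfrac{1}{2}\,(u^2+v^2),\\
0 \le (u+v)^2 = u^2 + 2uv + v^2 &\implies -uv \le \tfrac{1}{2}\,(u^2+v^2),
\end{align*}
and since $|uv|$ equals either $uv$ or $-uv$, it follows that
\[
|uv| \le \tfrac{1}{2}\,(u^2+v^2).
\]
```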

22.2 The Derivative

The way of thinking about the derivative in Theorem 22.1.3 is exactly what is needed to define the derivative of a function of $n$ variables. One can argue that it is also the right way to define the derivative of a function of one variable in order to reduce confusion later on.

As observed by Dieudonné, “...In the classical teaching of Calculus, this idea (that the derivative is a linear transformation) is immediately obscured by the accidental fact that, on a one-dimensional vector space, there is a one-to-one correspondence between linear forms and numbers, and therefore the derivative at a point is defined as a number instead of a linear form. This slavish subservience to the shibboleth¹ of numerical interpretation at any cost becomes much worse when dealing with functions of several variables...”

In fact, the derivative is a linear transformation and it is useless to pretend otherwise. This is the main reason for including the introductory material on linear algebra in this book.

Recall the following definition.

Definition 22.2.1 A function $T$ which maps $\mathbb{R}^n$ to $\mathbb{R}^p$ is called a linear transformation if for every pair of scalars $a, b$ and vectors $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, it follows that $T(a\mathbf{x}+b\mathbf{y}) = aT(\mathbf{x}) + bT(\mathbf{y})$.

Recall that from the properties of matrix multiplication, if $A$ is a $p \times n$ matrix, and if $\mathbf{x}, \mathbf{y}$ are vectors in $\mathbb{R}^n$, then $A(a\mathbf{x}+b\mathbf{y}) = aA(\mathbf{x}) + bA(\mathbf{y})$. Thus you can define a linear transformation by multiplying by a matrix. Of course the simplest example is that of a $1 \times 1$ matrix or number. You can think of the number 3 as a linear transformation $T$ mapping $\mathbb{R}$ to $\mathbb{R}$ according to the rule $Tx = 3x$. It satisfies the properties needed for a linear transformation because $3(ax+by) = a3x + b3y = aTx + bTy$. The case of the derivative of a scalar valued function of one variable is of this sort. You get a number for the derivative. However, you can think of this number as a linear transformation and this is the way you must think of it for a function of $n$ variables. First there is a useful lemma.
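The linearity property just described can be spot-checked numerically. The following is a minimal sketch in plain Python; the matrix `A`, the vectors `x`, `y`, the scalars `a`, `b`, and the helper names `apply_matrix` and `combine` are all illustrative choices, not taken from the text. Integer entries are used so that both sides can be compared exactly.

```python
def apply_matrix(A, x):
    """Return the matrix-vector product A x, with A given as a list of rows."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def combine(a, x, b, y):
    """Return the linear combination a*x + b*y of two vectors."""
    return [a * x_i + b * y_i for x_i, y_i in zip(x, y)]


# A hypothetical 2 x 3 matrix, so T(x) = A x maps R^3 to R^2.
A = [[1, 2, 0],
     [0, -1, 3]]
x = [1, 0, 2]
y = [-1, 4, 1]
a, b = 3, -2

# Left side: T(a x + b y).  Right side: a T(x) + b T(y).
lhs = apply_matrix(A, combine(a, x, b, y))
rhs = combine(a, apply_matrix(A, x), b, apply_matrix(A, y))
print(lhs, rhs)

# The 1 x 1 case from the text: the number 3 viewed as the map T x = 3 x.
T = lambda t: 3 * t
scalar_lhs = T(a * 5 + b * 7)
scalar_rhs = a * T(5) + b * T(7)
print(scalar_lhs, scalar_rhs)
```

A check on one pair of vectors is only an illustration, not a proof; the identity itself follows from the properties of matrix multiplication recalled above.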

Lemma 22.2.2 Let $T \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$. Then there is a constant $C$ such that $|T\mathbf{v}| \le C|\mathbf{v}|$ for all $\mathbf{v} \in \mathbb{R}^n$.
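To make the statement of the lemma concrete, here is a small numerical sketch. The lemma does not say which constant to use; the code assumes one candidate that the Cauchy-Schwarz inequality is known to give, namely the square root of the sum of the squared entries of the matrix (its Frobenius norm). The matrix `A` and the helper names are hypothetical, and random sampling only illustrates the bound rather than proving it.

```python
import math
import random


def apply_matrix(A, v):
    """Return the matrix-vector product A v, with A given as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def norm(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(t * t for t in v))


def frobenius_norm(A):
    """Square root of the sum of squared entries of A (assumed choice of C)."""
    return math.sqrt(sum(a * a for row in A for a in row))


# A hypothetical 2 x 3 matrix representing T: R^3 -> R^2.
A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]
C = frobenius_norm(A)  # sqrt(1 + 4 + 1 + 9) = sqrt(15)

random.seed(0)
worst_ratio = 0.0
for _ in range(1000):
    v = [random.uniform(-10.0, 10.0) for _ in range(3)]
    length = norm(v)
    if length == 0.0:
        continue  # the bound is trivial for v = 0
    # Track the largest observed value of |Tv| / |v|.
    worst_ratio = max(worst_ratio, norm(apply_matrix(A, v)) / length)

print(worst_ratio <= C)
```

The largest observed ratio stays below `C`, which is what the lemma asserts for every vector, not just the sampled ones.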

Proof: Let $A$ be the matrix of $T$. Then, using the Cauchy-Schwarz inequality Lemma

¹ In the Bible, there was a battle between Ephraimites and Gileadites during the time of Jephthah, the judge who sacrificed his daughter to Jehovah, one of several instances of human sacrifice in the Bible. The cause of this battle was very strange. However, the Ephraimites lost and when they tried to cross a river to get back home, they had to say shibboleth. If they said “sibboleth” they were killed because their inability to pronounce the “sh” sound identified them as Ephraimites. They usually don’t tell this story in Sunday school. The word has come to signify something which is arbitrary and no longer important.