The Fréchet and Gateaux Derivatives
This brief note explains the precise relationship between the two principal notions of derivatives on normed vector spaces, namely, the Fréchet and Gateaux derivatives.
Introduction
The fundamental idea of differential calculus is to study continuous phenomena by means of linear phenomena. Continuous phenomena are naturally expressed in metric spaces and linear phenomena are naturally expressed in vector spaces. Thus, the natural setting for the study of differential calculus is within spaces that are both metric spaces and vector spaces, namely, normed vector spaces.
I previously explored some concepts of differential calculus on normed spaces in another post.
There are two very similar notions of derivative on normed vector spaces: the Gateaux derivative and the Fréchet derivative. Both are continuous linear maps that send each vector to the corresponding directional derivative.
Directional Derivatives
Recall the definition of the directional derivative.
Definition (Directional Derivative). The directional derivative of a function \(f : V \rightarrow W\) between real normed vector spaces \(V\) and \(W\) at a point \(p \in V\) in the direction \(v \in V\) is defined to be the following limit (whenever it exists):
\[D_pf(v) = \lim_{t \to 0} \frac{f(p + t \cdot v) - f(p)}{t}.\]
The directional derivative indicates how a change in the input to a function affects its output: it is the limiting ratio of the change in output along the parameterized linear path \(p + t \cdot v\), which originates at the point \(p\) and proceeds in the direction \(v\), to the value of the parameter \(t \in \mathbb{R}\).
Some authors require the direction \(v\) to be a unit vector. However, if arbitrary directions are permitted, the operation \(D_pf(-)\) is homogeneous, and it is linear whenever \(f\) is sufficiently well-behaved at \(p\) (for instance, whenever the Gateaux derivative defined below exists). In that case the operators \(D_p(-)(v)\) comprise a vector space which is isomorphic to the domain \(V\), which is extremely useful (e.g., for defining tangent spaces on manifolds).
If we write \(u = v / \lVert v \rVert\) for the unit vector corresponding to a nonzero vector \(v\), then \(t \cdot v\) is equal to \(t \cdot \lVert v \rVert \cdot u\), and the corresponding normalized directional derivative is
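As a small numerical illustration of the definition, the following sketch evaluates the difference quotient for shrinking values of \(t\). The function `f`, the point `p`, and the direction `v` are hypothetical choices made only for this example; they do not come from the discussion above.

```python
def difference_quotient(f, p, v, t):
    """Return (f(p + t*v) - f(p)) / t for a scalar-valued f on tuples."""
    shifted = tuple(p_i + t * v_i for p_i, v_i in zip(p, v))
    return (f(shifted) - f(p)) / t


def f(x):
    # Hypothetical example function on R^2: f(x, y) = x^2 + 3xy.
    return x[0] ** 2 + 3 * x[0] * x[1]


p = (1.0, 2.0)  # base point (an assumption for illustration)
v = (1.0, 1.0)  # direction, deliberately not a unit vector

# For this f, p and v the exact quotient is 11 + 4t,
# so the printed values approach 11 as t shrinks.
for t in (1e-1, 1e-2, 1e-3, 1e-4):
    print(t, difference_quotient(f, p, v, t))
```

The quotient is evaluated directly rather than with a central difference so that the code mirrors the limit in the definition term by term.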
\[\lim_{t \to 0} \frac{f(p + t \cdot v) - f(p)}{t \cdot \lVert v \rVert}\]
which is equal to
\[\lim_{t \to 0} \frac{f(p + t \cdot \lVert v \rVert \cdot u) - f(p)}{t \cdot \lVert v \rVert}\]
and hence the parameters in the numerator and denominator match (they are both \(t \cdot \lVert v \rVert\)).
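As a brief check of this normalization, assume \(v \ne 0\) and that the limit exists, and substitute \(s = t \cdot \lVert v \rVert\) (the symbol \(s\) is introduced here only for this computation). Since \(s \to 0\) exactly when \(t \to 0\),

\[\lim_{t \to 0} \frac{f(p + t \cdot \lVert v \rVert \cdot u) - f(p)}{t \cdot \lVert v \rVert} = \lim_{s \to 0} \frac{f(p + s \cdot u) - f(p)}{s} = D_pf(u).\]

Multiplying both sides by \(\lVert v \rVert\) gives \(D_pf(v) = \lVert v \rVert \cdot D_pf(u)\), so the two conventions differ only by the scale factor \(\lVert v \rVert\).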
The Gateaux Derivative
The Gateaux derivative (a.k.a. the weak derivative) maps each vector to its respective directional derivative and additionally requires the mapping to be a bounded linear map. A linear map between normed vector spaces is bounded if and only if it is continuous, so this requires the Gateaux derivative to be continuous.
Definition (Gateaux Derivative). The Gateaux derivative of a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) at a point \(p \in V\) is a bounded linear map \(D_pf(-) : V \rightarrow W\) which evaluates to the directional derivative \(D_pf(v)\) for each vector \(v \in V\).
Note that some authors do not require the Gateaux derivative to be a bounded linear map (and hence permit discontinuous derivatives).
The Fréchet Derivative
The Fréchet derivative (a.k.a. the strong derivative) is defined as follows.
Definition (Fréchet Derivative). The Fréchet derivative of a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) at a point \(p \in V\) is a bounded linear map \(df_p(-) : V \rightarrow W\) such that
\[\lim_{v \to 0}\frac{\lVert f(p + v) - f(p) - df_p(v) \rVert_W}{\lVert v \rVert_V} = 0.\]
It happens that, if the Fréchet derivative exists at a point \(p\), then its value for every vector \(v\) is precisely the respective directional derivative, i.e., \(df_p(v) = D_pf(v)\).
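A short derivation of this fact, sketched under the assumption that the Fréchet derivative \(df_p\) exists: fix \(v \ne 0\) and restrict the limit in the definition to vectors of the form \(t \cdot v\), which tend to \(0\) as \(t \to 0\). Using the linearity of \(df_p\) and the identity \(\lVert t \cdot v \rVert = \lvert t \rvert \cdot \lVert v \rVert\),

\[\bigg\lVert \frac{f(p + t \cdot v) - f(p)}{t} - df_p(v) \bigg\rVert_W = \lVert v \rVert_V \cdot \frac{\lVert f(p + t \cdot v) - f(p) - df_p(t \cdot v) \rVert_W}{\lVert t \cdot v \rVert_V} \to 0 \quad \text{as } t \to 0.\]

Hence the difference quotient converges to \(df_p(v)\), i.e., \(D_pf(v) = df_p(v)\). The case \(v = 0\) is immediate, since both sides are \(0\).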
The Relationship
Both notions are very similar. They are both bounded linear maps that map vectors to their respective directional derivatives. However, the two notions are not equivalent. What, then, is the precise difference between the two notions?
There is an alternative definition of the Fréchet derivative which makes the difference precise: the Fréchet derivative requires a uniform limit whereas the Gateaux derivative does not. This can be seen in the definitions: each directional derivative in the Gateaux derivative is an independent limit, whereas the Fréchet derivative expresses a single limit involving the entire mapping \(df_p(-)\).
In this context, the uniform limit of interest is defined as follows.
Definition (Uniform Directional Derivatives). The directional derivatives \(D_pf(-)\) for a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) and a point \(p \in V\) are uniform if, for all \(\varepsilon > 0\) there exists a \(\delta > 0\) such that, for all \(t \in \mathbb{R}\) satisfying \(0 < \lvert t \rvert < \delta\) and for all \(v \in V\) satisfying \(\lVert v \rVert = 1\),
\[\bigg\lVert\frac{f(p + t \cdot v) - f(p)}{t} - D_pf(v)\bigg\rVert < \varepsilon.\]
This definition is similar to the definition of the ordinary directional derivative limit, except that it additionally requires the difference quotients for all unit vectors (i.e., all "pure" directions in the unit sphere \(S_V = \{v \in V : \lVert v \rVert = 1\}\)) to come within \(\varepsilon\) of their respective limits within the same neighborhood (i.e., whenever the parameter \(t\) is within distance \(\delta\) of \(0\)). Note that, with the Gateaux derivative, this neighborhood can vary for each vector \(v\) since each limit is independent.
We can re-write this expression as
\[\bigg\lVert\frac{f(p + t \cdot v) - f(p) - t \cdot D_pf(v)}{t} - 0\bigg\rVert < \varepsilon,\]
which then makes the requirement equivalent to the following condition, holding uniformly over all unit vectors \(v\):
\[\lim_{t \to 0}\bigg\lVert\frac{f(p + t \cdot v) - f(p) - t \cdot D_pf(v)}{t}\bigg\rVert = 0.\]
This is equivalent, by properties of the norm, to
\[\lim_{t \to 0}\frac{\lVert f(p + t \cdot v) - f(p) - t \cdot D_pf(v) \rVert}{\lvert t \rvert} = 0.\]
Since this condition is only required for unit vectors, we can re-express it for arbitrary vectors according to the normalization factor indicated previously:
\[\lim_{t \to 0}\frac{\lVert f(p + t \cdot v) - f(p) - t \cdot D_pf(v) \rVert}{\lvert t \rvert \cdot \lVert v \rVert} = 0\]
which, by the properties of the norm and the linearity of \(D_pf(-)\), is the same as
\[\lim_{t \to 0}\frac{\lVert f(p + t \cdot v) - f(p) - D_pf(t \cdot v) \rVert}{\lVert t \cdot v \rVert} = 0.\]
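By a theorem on limits in normed vector spaces (proved in this post), \(\lim_{t \to 0}g(t \cdot v) = \lim_{v \to 0}g(v)\) for any function \(g\) whenever \(v \ne 0\) and \(\lim_{v \to 0}g(v)\) exists. Conversely, because the limit above holds uniformly over all unit vectors, and every nonzero vector can be written as \(t \cdot u\) for a unit vector \(u\) with \(\lvert t \rvert = \lVert t \cdot u \rVert\), the limits along the individual directions combine into a single limit as \(v \to 0\). Thus this is equivalent to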
\[\lim_{v \to 0}\frac{\lVert f(p + v) - f(p) - D_pf(v) \rVert}{\lVert v \rVert} = 0,\]
which recovers the standard definition of the Fréchet derivative.
Thus, the essential difference is that the Fréchet derivative expresses a uniform limit whereas the Gateaux derivative expresses independent limits. The additional requirement of uniformity is what makes the Fréchet derivative a "strong" derivative.
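To see the failure of uniformity concretely, the sketch below uses a standard textbook counterexample that is not taken from the discussion above: \(f(x, y) = x^3 y / (x^6 + y^2)\) on \(\mathbb{R}^2\), with \(f(0, 0) = 0\). At the origin every directional derivative is \(0\), so the zero map serves as the Gateaux derivative, yet \(f\) equals \(1/2\) along the curve \(y = x^3\) and so cannot be Fréchet differentiable there. The helper names are hypothetical.

```python
import math


def f(x, y):
    """Counterexample: Gateaux but not Fréchet differentiable at (0, 0)."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 * y / (x**6 + y**2)


def quotient(t, v):
    """Difference quotient (f(t*v) - f(0)) / t at the origin."""
    return f(t * v[0], t * v[1]) / t


def unit(theta):
    """Unit vector in R^2 at angle theta."""
    return (math.cos(theta), math.sin(theta))


def curve_quotient(x):
    """Quotient for the unit direction pointing at (x, x**3), with t its norm."""
    t = math.hypot(x, x**3)
    v = (x / t, x**3 / t)
    return t, quotient(t, v)


# Fixed direction: the quotient shrinks toward the directional derivative 0.
for t in (1e-1, 1e-2, 1e-3):
    print("fixed direction", t, quotient(t, unit(math.pi / 4)))

# Direction chosen per t along y = x^3: the quotient is about 1/(2t) and grows.
for x in (1e-1, 1e-2, 1e-3):
    print("along curve", *curve_quotient(x))
```

For each fixed direction the quotient tends to \(0\), but no single \(\delta\) works for all unit vectors at once: for every small \(t\) there is a unit direction whose quotient is roughly \(1/(2t)\).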