The Fréchet and Gâteaux Derivatives
This brief note explains the precise relationship between the two principal notions of derivatives on normed vector spaces, namely, the Fréchet and Gâteaux derivatives.
Introduction
The fundamental idea of differential calculus is to study continuous phenomena by means of linear phenomena. Continuous phenomena are naturally expressed in metric spaces and linear phenomena are naturally expressed in vector spaces. Thus, the natural setting for the study of differential calculus is within spaces that are both metric spaces and vector spaces, namely, normed vector spaces.
I previously explored some concepts of differential calculus on normed spaces in another post.
There are two notions of derivatives on normed vector spaces which are very similar. Both notions are continuous linear maps that map vectors to their corresponding directional derivatives.
Directional Derivatives
Recall the definition of the directional derivative.
Definition (Directional Derivative). The directional derivative of a function \(f : V \rightarrow W\) between real normed vector spaces \(V\) and \(W\) at a point \(p \in V\) in the direction \(v \in V\) is defined to be the following limit (whenever it exists):
\[D_pf(v) = \lim_{t \to 0} \frac{f(p + t \cdot v) - f(p)}{t}.\]
The directional derivative indicates how a change in the input to a function affects its output: it is the limiting ratio of the change in the output along the parameterized linear path \(p + t \cdot v\), which originates at the point \(p\) and proceeds in the direction \(v\), to the value of the parameter \(t \in \mathbb{R}\).
Some authors require the direction \(v\) to be a unit vector. However, if arbitrary directions are permitted, the operation \(D_pf(-)\) is homogeneous (that is, \(D_pf(c \cdot v) = c \cdot D_pf(v)\) for every scalar \(c\)) and, when \(f\) is differentiable at \(p\), linear. In that case the operators \(D_p(-)(v)\) comprise a vector space which is isomorphic to the domain \(V\), which is extremely useful (e.g., for defining tangent spaces on manifolds).
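As a numerical illustration of the definition, the following minimal sketch evaluates the difference quotient for shrinking values of \(t\). The function \(f(x, y) = x^2 + 3xy\), the point \(p = (1, 2)\), and the direction \(v = (0.5, -1)\) are hypothetical choices made only for this example; the exact value is computed from the gradient of this particular \(f\).

```python
def f(x, y):
    # Hypothetical smooth example: f(x, y) = x^2 + 3xy
    return x * x + 3.0 * x * y


def directional_quotient(func, p, v, t):
    """Difference quotient (func(p + t*v) - func(p)) / t on R^2."""
    return (func(p[0] + t * v[0], p[1] + t * v[1]) - func(p[0], p[1])) / t


p = (1.0, 2.0)
v = (0.5, -1.0)

# Exact directional derivative via the gradient (2x + 3y, 3x) of this f at p
exact = (2.0 * p[0] + 3.0 * p[1]) * v[0] + (3.0 * p[0]) * v[1]

for t in (1e-1, 1e-3, 1e-5):
    # The quotient should approach the exact value as t shrinks
    print(t, directional_quotient(f, p, v, t), exact)

# Homogeneity check: doubling the direction should double the derivative
v2 = (2.0 * v[0], 2.0 * v[1])
print(directional_quotient(f, p, v2, 1e-5), 2.0 * exact)
```

The quotient is only an approximation for any fixed \(t\); the directional derivative is the limit, which the shrinking values of \(t\) merely suggest.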
The Gâteaux Derivative
The Gâteaux derivative (a.k.a. weak derivative) maps each vector to its respective directional derivative and additionally requires this mapping to be a bounded linear map. All bounded linear maps between normed vector spaces are continuous, so this requires the Gâteaux derivative to be continuous.
Definition (Gâteaux Derivative). The Gâteaux derivative of a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) at a point \(p \in V\) is a bounded linear map \(D_pf(-) : V \rightarrow W\) which evaluates to the directional derivative \(D_pf(v)\) for each vector \(v \in V\).
Note that some authors do not require the Gâteaux derivative to be a bounded linear map (and hence permit discontinuous derivatives).
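The linearity requirement is a genuine restriction: a function can have a directional derivative in every direction without those derivatives fitting together linearly. The sketch below uses a standard textbook example that is not taken from this note, \(f(x, y) = x^2 y / (x^2 + y^2)\) with \(f(0, 0) = 0\), and compares the difference quotients at the origin along \((1, 0)\), \((0, 1)\), and their sum \((1, 1)\).

```python
def f(x, y):
    # Illustrative example: f(x, y) = x^2 y / (x^2 + y^2), with f(0, 0) = 0
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * x * y / (x * x + y * y)


def quotient_at_origin(v, t):
    """Difference quotient (f(t*v) - f(0)) / t at the origin."""
    return (f(t * v[0], t * v[1]) - f(0.0, 0.0)) / t


t = 1e-6
d10 = quotient_at_origin((1.0, 0.0), t)
d01 = quotient_at_origin((0.0, 1.0), t)
d11 = quotient_at_origin((1.0, 1.0), t)

# For this f the quotient does not depend on t, so these values are the
# directional derivatives themselves (up to rounding).
print("D(1,0) + D(0,1) =", d10 + d01)
print("D(1,1)          =", d11)
```

Since the derivative along \((1, 1)\) differs from the sum of the derivatives along \((1, 0)\) and \((0, 1)\), the map \(v \mapsto D_0f(v)\) is not additive, so this function has no Gâteaux derivative at the origin in the sense defined above.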
The Fréchet Derivative
The Fréchet derivative (a.k.a. strong derivative) is defined as follows.
Definition (Fréchet Derivative). The Fréchet derivative of a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) at a point \(p \in V\) is a bounded linear map \(df_p(-) : V \rightarrow W\) such that
\[\lim_{v \to 0}\frac{\lVert f(p + v) - f(p) - df_p(v) \rVert_W}{\lVert v \rVert_V} = 0.\]
It happens that, if the Fréchet derivative exists at a point \(p\), then its value for every vector \(v\) is precisely the respective directional derivative, i.e., \(df_p(v) = D_pf(v)\).
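The defining limit can be probed numerically. The following sketch assumes a hypothetical smooth map \(f(x, y) = (x^2 + y, xy)\) on \(\mathbb{R}^2\) with the Euclidean norm, whose derivative at \(p\) is given by its Jacobian matrix, and evaluates the ratio from the definition over sampled directions at several radii. Sampling finitely many directions is only evidence for the limit, not a proof of it.

```python
import math


def f(x, y):
    # Hypothetical smooth map R^2 -> R^2
    return (x * x + y, x * y)


def df(p, v):
    # Jacobian of this f at p = (x, y) is [[2x, 1], [y, x]], applied to v
    x, y = p
    return (2.0 * x * v[0] + v[1], y * v[0] + x * v[1])


def frechet_ratio(p, v):
    """||f(p + v) - f(p) - df_p(v)|| / ||v|| in the Euclidean norm."""
    fp = f(p[0], p[1])
    fpv = f(p[0] + v[0], p[1] + v[1])
    lin = df(p, v)
    remainder = math.hypot(fpv[0] - fp[0] - lin[0], fpv[1] - fp[1] - lin[1])
    return remainder / math.hypot(v[0], v[1])


def worst_ratio(p, r, n=360):
    """Largest ratio over n evenly spaced directions on the circle of radius r."""
    angles = (2.0 * math.pi * k / n for k in range(n))
    return max(frechet_ratio(p, (r * math.cos(a), r * math.sin(a))) for a in angles)


p = (1.0, 2.0)
for r in (1e-1, 1e-2, 1e-3):
    # The worst sampled ratio should shrink together with the radius
    print(r, worst_ratio(p, r))
```

For this particular map the remainder is \((v_1^2, v_1 v_2)\), so the ratio equals \(\lvert v_1 \rvert\) and is bounded by \(\lVert v \rVert\), which is why the worst case over all directions shrinks with the radius.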
The Relationship
Both notions are very similar. They are both bounded linear maps that map vectors to their respective directional derivatives. However, the two notions are not equivalent. What, then, is the precise difference between the two notions?
There is an alternative definition of the Fréchet derivative which makes the difference precise: the Fréchet derivative requires a uniform limit whereas the Gâteaux derivative does not. This can be seen in the definitions: each directional derivative in the Gâteaux derivative is an independent limit, whereas the Fréchet derivative expresses a single limit involving the entire mapping \(df_p(-)\).
In this context, the uniform limit of interest is defined as follows.
Definition (Uniform Directional Derivatives). The directional derivatives \(D_pf(-)\) for a function \(f : V \rightarrow W\) between normed vector spaces \(V\) and \(W\) and a point \(p \in V\) are uniform if, for all \(\varepsilon > 0\) there exists a \(\delta > 0\) such that, for all \(t \in \mathbb{R}\) satisfying \(0 < \lvert t \rvert < \delta\) and for all \(u\in V\) satisfying \(\lVert u \rVert = 1\),
\[\bigg\lVert\frac{f(p + t \cdot u) - f(p)}{t} - D_pf(u)\bigg\rVert < \varepsilon.\]
This definition is similar to the definition of the ordinary directional derivative limit, except that it additionally requires the difference quotients for all unit vectors (i.e., all "pure" directions in the unit sphere \(S_V = \{u \in V : \lVert u \rVert = 1\}\)) to come within \(\varepsilon\) of their respective limits within the same neighborhood (i.e., whenever the parameter \(t\) is within distance \(\delta\) of \(0\)). Note that, with the Gâteaux derivative, this neighborhood can vary for each vector \(u\) since each limit is independent.
This requirement is equivalent to the following condition:
\[\bigg\lVert\frac{f(p + t \cdot u) - f(p) - t \cdot D_pf(u)}{t}\bigg\rVert < \varepsilon.\]
This is equivalent, by properties of the norm, to
\[\frac{\lVert f(p + t \cdot u) - f(p) - t \cdot D_pf(u) \rVert}{\lvert t \rvert} < \varepsilon.\]
By a theorem on limits in normed vector spaces (proved in this post), \(\lim_{t \to 0}g(t \cdot v) = \lim_{v \to 0}g(v)\) for any function \(g\) whenever \(v \ne 0\) and \(\lim_{v \to 0}g(v)\) exists. We will repeat the proof here.
Theorem. For any function \(f : V \rightarrow W\) between normed vector spaces,
\[\lim_{t \to 0}f(th) = \lim_{h \to 0}f(h)\]
whenever \(h \ne 0\) and \(\lim_{h \to 0}f(h)\) exists.
Proof. Suppose \(\lim_{h \to 0}f(h) = L\) and \(\varepsilon \gt 0\). Then, there exists a \(\delta \gt 0\) such that \(\lVert f(h) - L \rVert_W \lt \varepsilon\) whenever \(\lVert h - 0 \rVert_V = \lVert h \rVert_V \lt \delta\). Suppose for some \(h \ne 0\) that \(\lVert t -0\rVert_{\mathbb{R}} = \lvert t \rvert \lt \delta / \lVert h \rVert_V\) (which is well-defined since \(h \ne 0\) and hence \(\lVert h \rVert_V \gt 0\)). It then follows that \(\lvert t \rvert \lVert h \rVert_V = \lVert th \rVert_V \lt \delta\), and hence \(\lVert f(th) - L\rVert_W \lt \varepsilon\). Thus, \(\lim_{t \to 0}f(th) = L\). \(\square\)
Thus, if
\[\lim_{v \to 0}\frac{\lVert f(p + v) - f(p) - D_pf(v) \rVert}{\lVert v \rVert} = 0,\]
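then, for every \(\varepsilon > 0\), there exists a \(\delta > 0\) such that, for all unit vectors \(u\) and all \(t \in \mathbb{R}\) with \(0 < \lvert t \rvert < \delta\) (so that \(\lVert t \cdot u \rVert = \lvert t \rvert < \delta\)), using the linearity \(D_pf(t \cdot u) = t \cdot D_pf(u)\),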
\[\frac{\lVert f(p + t \cdot u) - f(p) - t \cdot D_pf(u) \rVert}{\lVert t \cdot u \rVert} \lt \varepsilon\]
which, since \(\lVert u \rVert = 1\), is the same as
\[\frac{\lVert f(p + t \cdot u) - f(p) - t \cdot D_pf(u) \rVert}{\lvert t \rvert} \lt \varepsilon.\]
Thus, the Fréchet derivative implies uniform convergence.
The converse is also true, as expressed in the following theorem.
Theorem. Let \(f : V \rightarrow W\) be a map between normed vector spaces \(V\) and \(W\), and let \(L \in W\). Suppose that for all \(\varepsilon > 0\) there exists a \(\delta > 0\) such that, for all \(t \in \mathbb{R}\) with \(0 < \lvert t \rvert < \delta\) and all \( u \in V\) with \(\lVert u \rVert = 1\), \(\lVert f(tu) - L \rVert \lt \varepsilon\) (that is, the limits \(L_u = \lim_{t \to 0}f(tu)\) exist uniformly in \(u\) and share the common value \(L\)). Then \(\lim_{v \to 0}f(v) = L\).
Proof. Assume the hypothesis. Let \(\varepsilon > 0\). Then, by hypothesis, there exists a \(\delta > 0\) such that, for all \(t \in \mathbb{R}\) with \(0 < \lvert t \rvert < \delta\) and all \( u \in V\) with \(\lVert u \rVert = 1\), \(\lVert f(tu) - L \rVert \lt \varepsilon\). Let \(v \in V\) and suppose that \(0 < \lVert v \rVert < \delta\). Define \(u = v / \lVert v \rVert\) and \(t = \lVert v \rVert\). Then, since \(\lVert u \rVert = 1\) and \(0 < t < \delta\), it follows that \(\lVert f(t \cdot u) - L \rVert < \varepsilon\) and hence, since \(t \cdot u = v\), that \(\lVert f(v) - L \rVert < \varepsilon\). Thus, \(\lim_{v \to 0}f(v) = L\). \(\square\)
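Thus, applying this theorem with the function
\[g(v) = \frac{\lVert f(p + v) - f(p) - D_pf(v) \rVert}{\lVert v \rVert},\]
whose limit along every unit direction \(u\) is \(\lim_{t \to 0}g(tu) = 0\), uniformly in \(u\) (this is precisely the uniformity condition above, since \(D_pf(t \cdot u) = t \cdot D_pf(u)\) and \(\lVert t \cdot u \rVert = \lvert t \rvert\)), it follows that \(\lim_{v \to 0}g(v) = 0\). In other words,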
\[\lim_{v \to 0}\frac{\lVert f(p + v) - f(p) - D_pf(v) \rVert}{\lVert v \rVert} = 0.\]
Thus, the essential difference is that the Fréchet derivative expresses a uniform limit whereas the Gâteaux derivative expresses independent limits. The additional requirement of uniformity is what makes the Fréchet derivative a "strong" derivative.
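To see the distinction concretely, the following sketch uses a standard counterexample that does not appear in this note: \(f(x, y) = x^3 y / (x^6 + y^2)\) with \(f(0, 0) = 0\). Every directional derivative at the origin is \(0\), so the Gâteaux derivative there is the zero map, yet \(f\) equals \(1/2\) along the curve \(y = x^3\) and is therefore not even continuous at the origin. The helper `bad_direction` is a hypothetical name for a unit vector chosen, for each \(t\), so that \(t \cdot u\) lands near that curve.

```python
import math


def f(x, y):
    # Standard counterexample: Gateaux differentiable at 0, not Frechet
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 6 + y * y)


def quotient(u, t):
    """(f(t*u) - f(0)) / t - D_0 f(u), where D_0 f(u) = 0 for every u."""
    return f(t * u[0], t * u[1]) / t


def bad_direction(t):
    """Unit vector proportional to (1, t^2), so that t*u lies close to y = x^3."""
    n = math.hypot(1.0, t * t)
    return (1.0 / n, t * t / n)


fixed = (math.cos(1.0), math.sin(1.0))  # an arbitrary fixed unit direction

for t in (1e-1, 1e-2, 1e-3):
    # Fixed direction: the quotient shrinks. Direction chosen per t: it grows.
    print(t, abs(quotient(fixed, t)), abs(quotient(bad_direction(t), t)))
```

Along any single fixed direction the quotient tends to \(0\), but no single \(\delta\) works for all unit vectors at once: for each small \(t\) there is a direction whose quotient is roughly \(1 / (2t)\). The directional derivatives are therefore not uniform, and the Fréchet derivative does not exist at the origin.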