Definition 7.3.1.
Suppose that $F(\mathbf{x})$ is a scalar field on $\mathbb{R}^n$. Let $\mathbf{x}_0$ be a stationary point of $F$, that is, $\nabla F(\mathbf{x}_0) = \mathbf{0}$. Then

1. $\mathbf{x}_0$ is a local minimum of $F$ if there is a neighborhood $N$ of $\mathbf{x}_0$ such that $F(\mathbf{x}) \geq F(\mathbf{x}_0)$ for all $\mathbf{x} \in N$;
2. $\mathbf{x}_0$ is a local maximum of $F$ if there is a neighborhood $N$ of $\mathbf{x}_0$ such that $F(\mathbf{x}) \leq F(\mathbf{x}_0)$ for all $\mathbf{x} \in N$.
To illustrate the application of these definitions, let us begin with a simple one-parameter function $f(t)$ (which may arise, for example, in the point-curve distance problem). Suppose $t_0$ is a stationary point of $f$, i.e. $f'(t_0) = 0$. Then a Taylor series expansion tells us that
$$ f(t_0 + h) - f(t_0) = \frac{h^2}{2!} f''(t_0) + \frac{h^3}{3!} f'''(t_0) + \cdots . $$
For sufficiently small $h$, the sign of the right-hand side is dominated by the term $\frac{h^2}{2!} f''(t_0)$, so $t_0$ is a local minimum if $f''(t_0) > 0$ and a local maximum if $f''(t_0) < 0$.
Unfortunately, the problem becomes more complicated if $f''(t_0) = 0$. In this case we cannot say anything about the sign of $f(t_0 + h) - f(t_0)$ and must therefore expand the Taylor series to higher derivatives. For example, if the second derivative is zero but the third derivative is nonzero, then we will have neither a maximum nor a minimum but a point of inflection. Consider the function $f(t) = t^3$; in any neighborhood of the stationary point $t_0 = 0$, the function takes on both positive and negative values, and thus $t_0$ is neither a maximum nor a minimum. If the third derivative is also zero, we have to look at the fourth derivative, and so on. Fortunately, these exceptions are rare, but a robust point classification algorithm should be able to handle such cases. In general, if $f^{(n)}(t_0)$ is the first derivative that does not vanish, then the function has [152]

1. a local minimum at $t_0$ if $n$ is even and $f^{(n)}(t_0) > 0$;
2. a local maximum at $t_0$ if $n$ is even and $f^{(n)}(t_0) < 0$;
3. a point of inflection at $t_0$ if $n$ is odd.
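To make this rule concrete, here is a minimal sketch in Python (an illustration, not taken from the text; the function name `classify_1d` and the tolerance `tol` are assumptions) that scans a list of derivative values at the stationary point for the first one that does not vanish and applies the classification above.

```python
def classify_1d(derivs, tol=1e-12):
    """Classify a stationary point t0 of a one-parameter function f(t).

    `derivs` holds the higher derivatives [f''(t0), f'''(t0), ...] evaluated
    at t0, with f'(t0) = 0 assumed.  Returns 'minimum', 'maximum',
    'inflection', or 'unclassified' if every supplied derivative vanishes.
    """
    for n, d in enumerate(derivs, start=2):   # n = order of the derivative
        if abs(d) > tol:                      # first non-vanishing derivative
            if n % 2 == 1:                    # odd order: point of inflection
                return "inflection"
            return "minimum" if d > 0 else "maximum"
    return "unclassified"

# f(t) = t**3 at t0 = 0: f'' = 0, f''' = 6  -> inflection
print(classify_1d([0.0, 6.0]))
# f(t) = t**4 at t0 = 0: f'' = f''' = 0, f'''' = 24  -> minimum
print(classify_1d([0.0, 0.0, 24.0]))
```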
Similar analysis can be performed with functions of more than one variable. Consider a function $f(u, v)$ of two variables, which might arise, for example, in a curve-curve distance problem. A Taylor expansion around the stationary point $(u_0, v_0)$ to second order leads to
$$ f(u_0 + h, v_0 + k) - f(u_0, v_0) = \frac{1}{2}\left( h^2 f_{uu} + 2hk\, f_{uv} + k^2 f_{vv} \right) + \cdots , $$
where the partial derivatives are evaluated at $(u_0, v_0)$. Hence $(u_0, v_0)$ is

1. a local minimum if $f_{uu} f_{vv} - f_{uv}^2 > 0$ and $f_{uu} > 0$;
2. a local maximum if $f_{uu} f_{vv} - f_{uv}^2 > 0$ and $f_{uu} < 0$;
3. neither a minimum nor a maximum if $f_{uu} f_{vv} - f_{uv}^2 < 0$.

Notice that the third condition above applies even if $f_{uu} = f_{vv} = 0$. In this case, the second order term reduces to $hk\, f_{uv}$. As long as $f_{uv} \neq 0$, specific choices of $h$ and $k$ within any arbitrarily small neighborhood of the stationary point can make this term either positive or negative, and hence $(u_0, v_0)$ is neither a maximum nor a minimum.
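The two-variable conditions can likewise be checked mechanically. The sketch below (again an illustrative Python fragment rather than the text's algorithm; `classify_2d` and `tol` are made-up names) evaluates the discriminant $f_{uu} f_{vv} - f_{uv}^2$ and reports which of the three cases applies, returning "inconclusive" when the discriminant is numerically zero.

```python
def classify_2d(fuu, fuv, fvv, tol=1e-12):
    """Second-order test at a stationary point (u0, v0) of f(u, v).

    fuu, fuv, fvv are the second partial derivatives evaluated at (u0, v0).
    """
    disc = fuu * fvv - fuv * fuv          # discriminant f_uu f_vv - f_uv^2
    if disc > tol:
        return "minimum" if fuu > 0 else "maximum"
    if disc < -tol:
        return "saddle"                   # condition 3: neither min nor max
    return "inconclusive"                 # discriminant numerically zero

# f(u, v) = u*v at the origin: f_uu = f_vv = 0, f_uv = 1  -> saddle
print(classify_2d(0.0, 1.0, 0.0))
```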
It is possible to complete the square for functions of more than two variables; however, in order to develop general conditions for minima and maxima of such functions, we will employ the powerful theory of quadratic forms [410].
Definition 7.3.2.
A quadratic form is an expression of the form $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$, where $\mathbf{x}$ is an $n$-dimensional column vector of unknowns and $A$ is an $n \times n$ matrix of real coefficients. A quadratic form $Q(\mathbf{x})$ is said to be

1. positive definite if $Q(\mathbf{x}) > 0$ for all $\mathbf{x} \neq \mathbf{0}$;
2. positive semidefinite if $Q(\mathbf{x}) \geq 0$ for all $\mathbf{x}$;
3. negative definite if $Q(\mathbf{x}) < 0$ for all $\mathbf{x} \neq \mathbf{0}$;
4. negative semidefinite if $Q(\mathbf{x}) \leq 0$ for all $\mathbf{x}$;
5. indefinite if $Q(\mathbf{x})$ takes on both positive and negative values.
To see how these quadratic forms are used, let us take a Taylor expansion of a function of $n$ variables $F(\mathbf{x})$ around the stationary point $\mathbf{x}_0$. Since $\nabla F(\mathbf{x}_0) = \mathbf{0}$, the first order term vanishes and we are left with
$$ F(\mathbf{x}_0 + \mathbf{h}) - F(\mathbf{x}_0) = \frac{1}{2}\, \mathbf{h}^T H \mathbf{h} + O(\|\mathbf{h}\|^3) , \qquad (7.21) $$
where $H$ is the Hessian matrix of second partial derivatives of $F$ evaluated at $\mathbf{x}_0$. For sufficiently small $\mathbf{h}$, the sign of the difference is therefore governed by the quadratic form $\mathbf{h}^T H \mathbf{h}$.
At this point, we can use a familiar theorem of linear algebra whose proof is given in [410]:
Theorem 7.3.2.
Let $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ be a quadratic form, let $A$ be a real symmetric matrix, and for $i = 1, 2, \ldots, n$, let $d_i$ be the determinant of the upper left $i \times i$ submatrix of $A$ (which we will denote $A_i$). Then the quadratic form is positive definite if and only if $d_i > 0$ for all $i = 1, 2, \ldots, n$.
If the condition $d_i > 0$ in the above theorem is relaxed to $d_i \geq 0$, a corresponding theorem applies to positive semidefinite forms (that is, a form is positive semidefinite if and only if $d_i \geq 0$ for all $i = 1, 2, \ldots, n$).
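As a sketch of how Theorem 7.3.2 might be applied numerically (illustrative Python using NumPy; the helper names `leading_minors` and `is_positive_definite` and the tolerance `tol` are not from the text):

```python
import numpy as np

def leading_minors(A):
    """Determinants d_1, ..., d_n of the upper left i x i submatrices A_i."""
    n = A.shape[0]
    return [np.linalg.det(A[:i, :i]) for i in range(1, n + 1)]

def is_positive_definite(A, tol=1e-12):
    """Theorem 7.3.2: real symmetric A is positive definite iff d_i > 0 for all i."""
    return all(d > tol for d in leading_minors(A))

A = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])
print(leading_minors(A))          # approximately [2.0, 3.0, 4.0]
print(is_positive_definite(A))    # True
```

In floating point arithmetic, attempting a Cholesky factorization (e.g. `np.linalg.cholesky`) is usually a cheaper and more stable test of positive definiteness than computing the minors explicitly; the version above simply mirrors the statement of the theorem.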
Theorem 7.3.2 immediately gives us a necessary and sufficient condition for negative definiteness as well. For, if we let $B = -A$, the quadratic form $\mathbf{x}^T A \mathbf{x}$ is negative definite if and only if $\mathbf{x}^T B \mathbf{x} = -\mathbf{x}^T A \mathbf{x}$ is positive definite, which holds if and only if $\det(B_i) > 0$ for all $i = 1, 2, \ldots, n$. But
$$ \det(B_i) = \det(-A_i) = (-1)^i \det(A_i) = (-1)^i d_i , $$
and so we have the following corollary:
Corollary 7.3.1.
Let $Q$, $A$, and $d_i$ be as before. Then the quadratic form $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ is negative definite if and only if $(-1)^i d_i > 0$ for all $i = 1, 2, \ldots, n$.
Simply applying these conditions gives us the following theorem:
Theorem 7.3.3.
Let $F(\mathbf{x})$ be as before, and let $H$ be its Hessian matrix, evaluated at the stationary point $\mathbf{x}_0$. Let $d_i$ be the determinant of the upper left $i \times i$ submatrix $H_i$ of $H$. Then

1. $\mathbf{x}_0$ is a local minimum of $F$ if $d_i > 0$ for all $i = 1, 2, \ldots, n$;
2. $\mathbf{x}_0$ is a local maximum of $F$ if $(-1)^i d_i > 0$ for all $i = 1, 2, \ldots, n$.
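Putting Theorem 7.3.3 and Corollary 7.3.1 together gives the core of a classification routine. The Python sketch below is only an illustration of the theorem (the name `classify_stationary_point` and the tolerance are assumptions); distinguishing the indefinite case from the semidefinite one is deliberately left to Theorem 7.3.4 and the discussion that follows.

```python
import numpy as np

def classify_stationary_point(H, tol=1e-12):
    """Apply Theorem 7.3.3 to the Hessian H evaluated at a stationary point x0.

    All leading minors d_i > 0        -> local minimum,
    alternating signs (-1)^i d_i > 0  -> local maximum,
    otherwise the minors test is inconclusive here (indefinite or semidefinite).
    """
    n = H.shape[0]
    d = [np.linalg.det(H[:i, :i]) for i in range(1, n + 1)]
    if all(di > tol for di in d):
        return "local minimum"
    if all((-1) ** i * di > tol for i, di in enumerate(d, start=1)):
        return "local maximum"
    return "indefinite or semidefinite"

print(classify_stationary_point(np.array([[2.0, 0.0], [0.0,  2.0]])))  # local minimum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -2.0]])))  # indefinite or semidefinite
```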
If, on the other hand, the form $\mathbf{h}^T H \mathbf{h}$ is indefinite, we can conclude that $\mathbf{x}_0$ is neither a minimum nor a maximum (in two dimensions, such points are usually called saddle points), as the following theorem shows:
Theorem 7.3.4.
Let $F$, $H$, and $d_i$ be as before. Suppose that the $d_i$ do not satisfy the conditions for definiteness or semidefiniteness. Then $\mathbf{x}_0$ is neither a local minimum nor a local maximum.
Proof: Because the quadratic form $\mathbf{h}^T H \mathbf{h}$ is indefinite, there exist vectors $\mathbf{h}_1$ and $\mathbf{h}_2$ such that $\mathbf{h}_1^T H \mathbf{h}_1 > 0$ and $\mathbf{h}_2^T H \mathbf{h}_2 < 0$. Pick an arbitrary neighborhood $N$ of $\mathbf{x}_0$. Then for any $\epsilon$ smaller than some positive number $\delta$, $\mathbf{x}_0 + \epsilon \mathbf{h}_1 \in N$ and $\mathbf{x}_0 + \epsilon \mathbf{h}_2 \in N$. Using (7.21), we obtain
$$ F(\mathbf{x}_0 + \epsilon \mathbf{h}_1) - F(\mathbf{x}_0) = \frac{\epsilon^2}{2}\, \mathbf{h}_1^T H \mathbf{h}_1 + O(\epsilon^3) > 0 , $$
$$ F(\mathbf{x}_0 + \epsilon \mathbf{h}_2) - F(\mathbf{x}_0) = \frac{\epsilon^2}{2}\, \mathbf{h}_2^T H \mathbf{h}_2 + O(\epsilon^3) < 0 $$
for sufficiently small $\epsilon$. Since $N$ was an arbitrary neighborhood of $\mathbf{x}_0$, there is no neighborhood of $\mathbf{x}_0$ in which $F(\mathbf{x}) \geq F(\mathbf{x}_0)$ for all $\mathbf{x}$ or in which $F(\mathbf{x}) \leq F(\mathbf{x}_0)$ for all $\mathbf{x}$. Therefore $\mathbf{x}_0$ is neither a local minimum nor a local maximum.
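The argument can be mirrored numerically on a concrete example. The snippet below (an illustration, not part of the text) uses $F(u, v) = u^2 - v^2$, whose Hessian at the origin is indefinite, and evaluates the increment along the two directions $\mathbf{h}_1 = (1, 0)$ and $\mathbf{h}_2 = (0, 1)$ for a few shrinking values of $\epsilon$.

```python
def F(u, v):
    """Test function with an indefinite Hessian at the origin (a saddle)."""
    return u ** 2 - v ** 2

# Along h1 = (1, 0) the increment in (7.21) is positive, along h2 = (0, 1) it
# is negative, for every small epsilon -- so the origin is neither a local
# minimum nor a local maximum.
for eps in (1e-1, 1e-2, 1e-3):
    inc1 = F(eps, 0.0) - F(0.0, 0.0)   # direction h1: positive
    inc2 = F(0.0, eps) - F(0.0, 0.0)   # direction h2: negative
    print(eps, inc1, inc2)
```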
Unfortunately, trouble can occur if the Hessian is only semidefinite and not definite. In this case we need to look at higher order derivatives or use some other method to classify the point precisely. Although this is not too difficult for one-parameter functions, it becomes progressively more involved as the number of variables increases (see, for example, [323, 129]).