Definition 7.3.1. Suppose that $F$ is a scalar field on $\mathbb{R}^n$. Let $\mathbf{a}$ be a stationary point of $F$, that is, $\nabla F(\mathbf{a}) = \mathbf{0}$. Then
1. $\mathbf{a}$ is a local minimum of $F$ if $F(\mathbf{x}) \geq F(\mathbf{a})$ for all $\mathbf{x}$ in some neighborhood of $\mathbf{a}$;
2. $\mathbf{a}$ is a local maximum of $F$ if $F(\mathbf{x}) \leq F(\mathbf{a})$ for all $\mathbf{x}$ in some neighborhood of $\mathbf{a}$.
To illustrate the application of these definitions, let us begin with a simple one-parameter function $F(t)$ (which may arise, for example, in the point-curve distance problem). Suppose $t_0$ is a stationary point of $F$, i.e. $F'(t_0) = 0$. Then a Taylor series expansion about $t_0$,
$$F(t_0 + h) - F(t_0) = \frac{F''(t_0)}{2!}h^2 + \frac{F'''(t_0)}{3!}h^3 + \cdots,$$
tells us that for sufficiently small $|h|$ the sign of $F(t_0 + h) - F(t_0)$ agrees with the sign of $F''(t_0)$. Hence, if $F''(t_0) > 0$, then $t_0$ is a local minimum, and if $F''(t_0) < 0$, it is a local maximum.
Unfortunately, the problem becomes more complicated if $F''(t_0) = 0$. In this case we cannot say anything about the sign of $F(t_0 + h) - F(t_0)$ and must therefore examine higher derivatives in the Taylor series. For example, if the second derivative is zero but the third derivative is nonzero, then we have neither a maximum nor a minimum but a point of inflection. Consider the function $F(t) = t^3$; in any neighborhood of the stationary point $t_0 = 0$, the function takes on both positive and negative values, and thus $t_0$ is neither a maximum nor a minimum. If the third derivative is also zero, we have to look at the fourth derivative, and so on. Fortunately, these exceptions are rare, but a robust point classification algorithm should be able to handle such cases. In general, if $F^{(n)}(t_0)$ is the first derivative that does not vanish at $t_0$, then the function has [152]
1. a local minimum at $t_0$ if $n$ is even and $F^{(n)}(t_0) > 0$;
2. a local maximum at $t_0$ if $n$ is even and $F^{(n)}(t_0) < 0$;
3. a point of inflection at $t_0$ if $n$ is odd.
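This rule translates directly into a small test. The following sketch is an illustrative Python fragment (the function name and tolerance are arbitrary choices, not from the text); it assumes the derivative values $F''(t_0), F'''(t_0), \ldots$ have already been computed and classifies the point from the first one that does not vanish.

    def classify_stationary_point_1d(derivs, tol=1e-12):
        """Classify a stationary point t0 of F(t) from its higher derivatives.

        derivs lists F''(t0), F'''(t0), F''''(t0), ... in order; F'(t0) is
        assumed to be zero.  The first entry larger than tol in magnitude
        decides the classification.
        """
        for n, d in enumerate(derivs, start=2):   # n = order of the derivative
            if abs(d) > tol:
                if n % 2 == 1:
                    return "point of inflection"
                return "local minimum" if d > 0 else "local maximum"
        return "undetermined: all supplied derivatives vanish"

    # F(t) = t**4 at t0 = 0: F'' = F''' = 0, F'''' = 24 > 0, so a local minimum.
    print(classify_stationary_point_1d([0.0, 0.0, 24.0]))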
Similar analysis can be performed with functions of more than one variable. Consider a function $F(u, v)$ of two variables, which might arise, for example, in a curve-curve distance problem. A Taylor expansion around the stationary point $(u_0, v_0)$ to second order leads to
$$F(u_0 + h, v_0 + k) - F(u_0, v_0) \approx \frac{1}{2}\left(F_{uu}h^2 + 2F_{uv}hk + F_{vv}k^2\right),$$
where the partial derivatives are evaluated at $(u_0, v_0)$. Provided $F_{uu} \neq 0$, completing the square gives
$$\frac{1}{2}\left(F_{uu}h^2 + 2F_{uv}hk + F_{vv}k^2\right) = \frac{1}{2F_{uu}}\left[\left(F_{uu}h + F_{uv}k\right)^2 + \left(F_{uu}F_{vv} - F_{uv}^2\right)k^2\right],$$
from which we conclude:
1. if $F_{uu}F_{vv} - F_{uv}^2 > 0$ and $F_{uu} > 0$, then $(u_0, v_0)$ is a local minimum;
2. if $F_{uu}F_{vv} - F_{uv}^2 > 0$ and $F_{uu} < 0$, then $(u_0, v_0)$ is a local maximum;
3. if $F_{uu}F_{vv} - F_{uv}^2 < 0$, then $(u_0, v_0)$ is neither a local minimum nor a local maximum.
Notice that the third condition above applies even if $F_{uu} = 0$. In this case, the second order term reduces to $F_{uv}hk + \frac{1}{2}F_{vv}k^2$. As long as $F_{uv} \neq 0$, specific choices of $h$ and $k$ within any arbitrarily small neighborhood of the stationary point can make this term either positive or negative, and hence $(u_0, v_0)$ is neither a maximum nor a minimum.
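These conditions can be checked mechanically once the three second partials at the stationary point are known. The following sketch is an illustrative Python fragment (the function name and tolerance are arbitrary, not from the text):

    def classify_stationary_point_2d(Fuu, Fuv, Fvv, tol=1e-12):
        """Second-derivative test at a stationary point (u0, v0) of F(u, v).

        Fuu, Fuv, Fvv are the second partial derivatives evaluated at (u0, v0).
        """
        disc = Fuu * Fvv - Fuv**2            # discriminant of the quadratic term
        if disc > tol:
            return "local minimum" if Fuu > 0 else "local maximum"
        if disc < -tol:
            return "neither (saddle point)"
        return "inconclusive: semidefinite case, examine higher-order terms"

    # F(u, v) = u*v at (0, 0): Fuu = Fvv = 0, Fuv = 1, so the origin is a saddle.
    print(classify_stationary_point_2d(0.0, 1.0, 0.0))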
It is possible to complete the square for functions of more than two variables; however, in order to develop general conditions for minima and maxima of such functions, we will employ the powerful theory of quadratic forms [410].
Definition 7.3.2. A quadratic form is an expression of the form $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$, where $\mathbf{x}$ is an $n$-dimensional column vector of unknowns and $A$ is an $n \times n$ matrix. A quadratic form is said to be
1. positive definite if $\mathbf{x}^T A \mathbf{x} > 0$ for all $\mathbf{x} \neq \mathbf{0}$;
2. positive semidefinite if $\mathbf{x}^T A \mathbf{x} \geq 0$ for all $\mathbf{x}$;
3. negative definite if $\mathbf{x}^T A \mathbf{x} < 0$ for all $\mathbf{x} \neq \mathbf{0}$;
4. negative semidefinite if $\mathbf{x}^T A \mathbf{x} \leq 0$ for all $\mathbf{x}$;
5. indefinite if it takes both positive and negative values.
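For instance, with $A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}$ we have $\mathbf{x}^T A \mathbf{x} = 2x_1^2 + 2x_1x_2 + 3x_2^2 = x_1^2 + (x_1 + x_2)^2 + 2x_2^2$, which is positive for every $\mathbf{x} \neq \mathbf{0}$, so this form is positive definite; replacing $A$ by $-A$ gives a negative definite form, while $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ gives an indefinite one.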
To see how these quadratic forms are used, let us take a Taylor expansion of a function $F$ of $n$ variables around the stationary point $\mathbf{a}$:
$$F(\mathbf{a} + \mathbf{h}) - F(\mathbf{a}) \approx \frac{1}{2}\,\mathbf{h}^T H \mathbf{h}, \qquad (7.21)$$
where $H$ is the Hessian matrix of $F$, the symmetric $n \times n$ matrix of second partial derivatives $H_{ij} = \partial^2 F / \partial x_i \partial x_j$, evaluated at $\mathbf{a}$. For small $\mathbf{h}$, the sign of $F(\mathbf{a} + \mathbf{h}) - F(\mathbf{a})$ is therefore governed by the quadratic form $\mathbf{h}^T H \mathbf{h}$.
At this point, we can use a familiar theorem of linear algebra whose proof is given in [410]:
Theorem 7.3.2. Let $Q(\mathbf{x}) = \mathbf{x}^T A \mathbf{x}$ be a quadratic form, let $A$ be a real symmetric matrix, and for $i = 1, \ldots, n$, let $d_i$ be the determinant of the upper left $i \times i$ submatrix of $A$ (which we will denote $A_i$). Then the quadratic form is positive definite if and only if $d_i > 0$ for all $i = 1, \ldots, n$.
If the strict inequalities in the above theorem are relaxed to $d_i \geq 0$, we obtain a necessary condition for positive semidefiniteness. Note, however, that nonnegativity of the leading minors $d_i$ alone is not sufficient; a form is positive semidefinite if and only if all principal minors of $A$, not just the upper left ones, are nonnegative.
Theorem 7.3.2 immediately gives us a necessary and sufficient condition for negative definiteness as well. For, if we let $B = -A$, the quadratic form $\mathbf{x}^T A \mathbf{x}$ is negative definite if and only if $\mathbf{x}^T B \mathbf{x}$ is positive definite, which by Theorem 7.3.2 holds if and only if $\det(B_i) > 0$ for all $i$. But $\det(B_i) = \det(-A_i) = (-1)^i \det(A_i) = (-1)^i d_i$, and so we have the following corollary:
Corollary 7.3.1. Let $A$ and $d_i$ be as before. Then the quadratic form $\mathbf{x}^T A \mathbf{x}$ is negative definite if and only if $(-1)^i d_i > 0$ for all $i = 1, \ldots, n$, that is, if and only if the $d_i$ alternate in sign starting with $d_1 < 0$.
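These sign tests are easy to carry out numerically. The following sketch is an illustrative Python fragment using NumPy (the function names and tolerance are arbitrary, not from the text); it classifies a real symmetric matrix by the sign pattern of its leading minors.

    import numpy as np

    def leading_minors(A):
        """Determinants d_1, ..., d_n of the upper left i x i submatrices of A."""
        A = np.asarray(A, dtype=float)
        return [np.linalg.det(A[:i, :i]) for i in range(1, A.shape[0] + 1)]

    def definiteness(A, tol=1e-12):
        """Apply Theorem 7.3.2 and Corollary 7.3.1 to a real symmetric matrix."""
        d = leading_minors(A)
        if all(di > tol for di in d):
            return "positive definite"
        if all((-1) ** i * di > tol for i, di in enumerate(d, start=1)):
            return "negative definite"
        return "not definite (semidefinite or indefinite)"

    # Example: the matrix diag(2, 6) is positive definite.
    print(definiteness([[2.0, 0.0], [0.0, 6.0]]))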
Simply applying these conditions gives us the following theorem:
Theorem 7.3.3. Let $F$ be as before, and let $H$ be its Hessian matrix, evaluated at the stationary point $\mathbf{a}$. Let $d_i$ be the determinant of the upper left $i \times i$ submatrix of $H$. Then
1. if $d_i > 0$ for all $i = 1, \ldots, n$ (so that $H$ is positive definite), $\mathbf{a}$ is a local minimum;
2. if $(-1)^i d_i > 0$ for all $i = 1, \ldots, n$ (so that $H$ is negative definite), $\mathbf{a}$ is a local maximum.
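As a concrete illustration of the theorem, the sketch below (illustrative Python using NumPy; the central-difference Hessian, step size, and function names are assumptions of this example, not part of the text) classifies a stationary point of an $n$-variable function from the leading minors of a numerically estimated Hessian.

    import numpy as np

    def numerical_hessian(F, a, h=1e-5):
        """Central-difference estimate of the Hessian of F at the point a."""
        a = np.asarray(a, dtype=float)
        n = a.size
        H = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                ei = np.zeros(n); ei[i] = h
                ej = np.zeros(n); ej[j] = h
                H[i, j] = (F(a + ei + ej) - F(a + ei - ej)
                           - F(a - ei + ej) + F(a - ei - ej)) / (4.0 * h * h)
        return H

    def classify(F, a):
        """Classify the stationary point a of F via Theorem 7.3.3."""
        H = numerical_hessian(F, a)
        d = [np.linalg.det(H[:i, :i]) for i in range(1, H.shape[0] + 1)]
        if all(di > 0 for di in d):
            return "local minimum"
        if all((-1) ** i * di > 0 for i, di in enumerate(d, start=1)):
            return "local maximum"
        return "not settled by definiteness (semidefinite or indefinite Hessian)"

    # F(u, v) = u**2 + 3*v**2 has a stationary point at the origin.
    print(classify(lambda x: x[0] ** 2 + 3.0 * x[1] ** 2, [0.0, 0.0]))  # local minimum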
If, on the other hand, the form is indefinite, we can conclude that $\mathbf{a}$ is neither a minimum nor a maximum (in two dimensions, such points are usually called saddle points), as the following theorem shows:
Theorem 7.3.4. Let $F$ and $H$ be as before. Suppose that the $d_i$ do not satisfy the conditions for definiteness or semidefiniteness. Then $\mathbf{a}$ is neither a local minimum nor a local maximum.
Proof: Because $H$ is indefinite, there exist vectors $\mathbf{x}$ and $\mathbf{y}$ such that $\mathbf{x}^T H \mathbf{x} > 0$ and $\mathbf{y}^T H \mathbf{y} < 0$. Pick an arbitrary neighborhood $N$ of $\mathbf{a}$. Then for any $\epsilon$ smaller than some positive number $\delta$, both $\mathbf{a} + \epsilon\mathbf{x}$ and $\mathbf{a} + \epsilon\mathbf{y}$ lie in $N$.
Using (7.21), we obtain, for sufficiently small $\epsilon > 0$,
$$F(\mathbf{a} + \epsilon\mathbf{x}) - F(\mathbf{a}) \approx \frac{\epsilon^2}{2}\,\mathbf{x}^T H \mathbf{x} > 0, \qquad F(\mathbf{a} + \epsilon\mathbf{y}) - F(\mathbf{a}) \approx \frac{\epsilon^2}{2}\,\mathbf{y}^T H \mathbf{y} < 0.$$
Since $N$ was an arbitrary neighborhood of $\mathbf{a}$, there is no neighborhood of $\mathbf{a}$ on which $F(\mathbf{x}) \geq F(\mathbf{a})$ for all $\mathbf{x}$ or on which $F(\mathbf{x}) \leq F(\mathbf{a})$ for all $\mathbf{x}$. Therefore $\mathbf{a}$ is neither a local minimum nor a local maximum.
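To make the theorem concrete, consider $F(u, v) = u^2 - v^2$ at the stationary point $(0, 0)$. The Hessian there is $H = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$, with $d_1 = 2 > 0$ and $d_2 = -4 < 0$, so neither sign pattern of Theorem 7.3.3 holds; taking $\mathbf{x} = (1, 0)^T$ and $\mathbf{y} = (0, 1)^T$ gives $\mathbf{x}^T H \mathbf{x} = 2 > 0$ and $\mathbf{y}^T H \mathbf{y} = -2 < 0$, and indeed the origin is a saddle point.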
Unfortunately, trouble can occur if the Hessian is only semidefinite and not definite. In this case we need to look at higher-order derivatives or use some other method to classify the point precisely. Although this is not too difficult for one-parameter functions, it becomes progressively more involved as the number of variables increases (see, for example, [323,129]).
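A simple pair of examples shows why the semidefinite case cannot be settled by second derivatives alone: $F(u, v) = u^2 + v^4$ and $F(u, v) = u^2 - v^4$ both have the positive semidefinite Hessian $\begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$ at the origin, yet the first has a local minimum there, while the second takes negative values arbitrarily close to the origin (along $u = 0$) and so has neither a minimum nor a maximum.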