10.10 Introduction to Chemical Engineering

NMM Chapter 6: Root-Finding

There are many equations f(x)=0 where one cannot solve explicitly for the special x=x_root that solves the equation exactly; sometimes these are called “transcendental” equations. Often x_root is an irrational number, so a computer could not return the exact value even if we had an explicit expression.

However, if you plot y=f(x), it is usually not hard to see approximately where the curve crosses the x axis, this gives an approximate value of x_root. If you would like to know the value of x_root more precisely, and to make the computer do the work rather than relying on your eyeballs looking at a graph, you can use one of the root-finding algorithms presented in NMM Chapter 6.

Bracketing and Bisection

You should thoroughly read Sections 6.1 and 6.3, which are essential. The basic idea is to “bracket” the root, i.e. find an x_min and x_max so that you know that

x_min < x_root < x_max

If the function is continuous, this will certainly be true if f(x_min) and f(x_max) have opposite signs (because somewhere between x_min and x_max the curve y=f(x) must cross the x axis.) There are several algorithms where the computer guesses a number x_guess somewhere between x_min and x_max, it evaluates f(x_guess), and based on its sign it establishes a tighter bracketing interval (either [x_min, x_guess] or [x_guess, x_max]). Then the computer makes a new guess inside the new interval, and this process continues until the bracket is smaller than the user-specified tolerance on x_root. The simplest way to choose x_guess is to take the midpoint of the interval; this algorithm is called “bisection” (see NMM 6.3).

Here is a graphical demo of how bisection works in detail for NMM Example 6.6:

Click Here to see the demo in a seperate window

The o’s show previously evaluated points. The * indicates the most recently evaluated point (at the midpoint of the interval defined by the two closest o’s).

Convergence Tolerances

The computer seldom returns an exact answer; instead it returns a number that is “close” to the real answer. Often, the user has to specify a tolerance on the answer, in effect to tell the computer “how close is close enough”. For example, in root-finding, the true answer is x_root, but the computer returns x_answer, usually guaranteeing only that

|x_answer – x_root| < tolerance

Usually an engineer wants the answer to N significant figures, i.e. he or she wants

|x_answer – x_root| < 10^-N |x_answer|

i.e.

tolerance = 10^-N |x_answer|

This usually works, but if you are unfortunate x_answer might be very close to zero, and then the computer might have a tough time satisfying this condition. A more conservative tolerance (with an escape clause for very small x’s) would be:

tolerance = max (ATOL, RTOL*|x_answer|)

or (almost the same):

tolerance = ATOL + RTOL*|x_answer|

where RTOL = 10^-N (this is what you really want to control the tolerance) and ATOL is some tiny positive number (this is the escape clause). Note that ATOL has the same units as x, and should be many orders of magnitude smaller than your guess at the value of x_root.

Sometimes you may want to specify a different type of tolerance; for example in root-finding you could specify a tolerance on |f(x_answer)|, see NMM 6.3.2

Passing function names as inputs to a Matlab function

The demoBisect.m function in NMM 6.3 (http://web.mit.edu/10.10/www/Examples/demoBisect.m) is not very general: it only works for the particular f(x)=x – x^1/3 – 2, and it runs for a fixed number of iterations (i.e. the user cannot adjust the tolerance directly). NMM 6.3.3 gives a more general implementation of the bisection algorithm called bisect.m, which works for almost any f(x), and which allows the user to adjust the convergence tolerance. The first input argument expected by bisect.m is the name of a Matlab function that takes x as an input and returns f(x) as an output; to change the function you are solving, you only need to change this name in the input list. (The function name must be given in single quotes, or, better, preceded by an @ sign.) For example, copy bisect.m (http://web.mit.edu/10.10/www/Examples/bisect.m) into your directory, and then try

>> X_answer=bisect(@sin, [3 4], 1e-6)

which should return an X_answer very close to pi.

You can write your own general Matlab functions which take function names as inputs; the trick is to use feval. ‘feval’ is also discussed in NMM Section 3.6.3

Newton’s Method

Section 6.4 is a good introduction to Sir Isaac Newton’s root-finding method, which converges the fastest of any known method, and which can be generalized to solve systems of equations. (However, unlike bisection, Newton’s algorithm is not guaranteed to converge at all if you give it an unfortunate initial guess at the root, see NMM Fig. 6.12.) If time permits, we will use Newton’s method to solve systems of nonlinear equations towards the end of the semester.

We will not have time to discuss the other root-finding methods presented in Chapter 6 in 10.10. The built-in Matlab root-finding function fzero is discussed in Section 6.6; since it is built in, you may find it convenient to use fzero. This is an example of a hybrid method, which combines the reliability of bisection with the speedy convergence of Newton-like methods. One of the methods fzero uses is the Secant Method, described in Section 6.5. Section 6.2 gives an abstract, but very general mathematical view of root-finding which is applicable to a large number of computational methods which iteratively refine an initial guess. Finally, section 6.7 describes specialized methods for finding the root of polynomials.

Next Chapter: Linear Algebra
Back to Study Guide Index
Back to 10.10 home page
Last modified: September 19, 2002