# NMM Chapter 12: Solving Ordinary Differential Equations (ODE’s)

### Introduction

This whole course is focused on the fundamental material and energy conservation equations.

We have presented a couple of cases in class where we could solve these equations analytically, and you could learn how to solve a few more in 18.03. But in general, there is no analytical solution, and we will have to solve them numerically.

Like many physical laws, our conservation equations are written as differential equations. In our case, all the derivatives are with respect to time, so these are Ordinary Differential Equations (ODE). In many cases, we know everything about the system at one point in time t=to, and we want to know what happens next. These are called Initial Value Problems (IVP). We will always write our ODE-IVP’s this way:

dY/dt = F(t,Y)    Y(to)=Yo

where you are given F and Yo, and you want to compute Y(t).

There are other sorts of differential equations. For example, if the derivatives are with respect to several different coordinates, they are called Partial Differential Equations (PDE), and if you do not know everything about the system at one point, but instead partial information about the solution at several different points they are called Boundary Value Problems (BVP). You saw some PDE-BVP’s in electromagnetism, and you will see them in advanced Chemical Engineering classes, when you try to deal with systems that are not well-mixed (i.e. spatially inhomogeneous).

### Euler Method

For ODE IVP’s, the simplest solution method is to remember how the derivative is defined:

dY/dt = lim(h→0) [Y(t+h) – Y(t)] / h = F(t,Y)

and just use a small finite h instead of going to the limit.  Rearranging, this gives

Y(t+h) = Y(t) + h*F(t,Y)

You can repeat this over and over, to go from t=to to t=to+h, and then to t=to+2h, etc. This is called Euler’s Explicit Method. See NMM 12.0-12.2.

It is worthwhile to do NMM Example 12.3 out longhand:
dy/dt = t – 2y ,   yo = 1
analytic solution: y = 0.25*(2t – 1 + 5*exp(-2t))
To confirm this works, just plug y into the ODE and check that it satisfies the equality (and the initial condition y(0)=1).

It is easy to write a little Matlab program to do this for us:

Y=zeros(nmax,1);           % for case where Y is a scalar
h=deltaT/(nmax-1);         % nmax points: t=to, to+h, ..., to+deltaT
Y(1)=Yo;                   % initial condition Y(to)=Yo
for n=2:nmax
    t=to+(n-2)*h;          % the time where Y(n-1) lives
    Y(n)=Y(n-1) + h*F(t,Y(n-1));
end
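The same loop can be sketched in Python as a quick check (a sketch, not NMM code; `euler_solve` and the step counts are our own choices), using Example 12.3 to confirm the first-order convergence:

```python
import math

def euler_solve(F, to, Yo, tf, nsteps):
    """Euler's Explicit Method: march from to to tf in nsteps equal steps."""
    h = (tf - to) / nsteps
    t, y = to, Yo
    for _ in range(nsteps):
        y = y + h * F(t, y)   # Y(t+h) = Y(t) + h*F(t,Y)
        t = t + h
    return y

# NMM Example 12.3: dy/dt = t - 2y, y(0) = 1
F = lambda t, y: t - 2.0 * y
y_exact = 0.25 * (2.0 * 1.0 - 1.0 + 5.0 * math.exp(-2.0 * 1.0))  # analytic y(1)

errors = [abs(euler_solve(F, 0.0, 1.0, 1.0, n) - y_exact) for n in (10, 100, 1000)]
print(errors)  # each 10x increase in nsteps cuts the error roughly 10x
```

The shrinking error illustrates the O(h) behavior derived below.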

### Equivalence with Numerical Integration

For the special case where F(t,Y) does not depend on Y, i.e. F(t,Y) = f(t), the Euler procedure yields:

Y(tf) = Y(to) + Σn h*f(to+n*h)

In the limit as h goes to zero, the sum is the definition of the integral of f(t):

∫ f(t) dt = lim(h→0) Σn h*f(to+n*h)        (For finite h, this is the Rectangle Rule integral.)

This is a manifestation of the Fundamental Theorem of Calculus:
if dY/dt = f(t), then Y(tf) – Y(to) = ∫ f(t) dt, with the integral running from to to tf

You can solve most one-dimensional integrals easily using ODE solvers (though there are better ways that take advantage of the fact that f does not depend on Y).
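As a sketch (in Python; the function name `rectangle_rule` and the test integrand are our own choices), the special case above is just the Rectangle Rule:

```python
import math

def rectangle_rule(f, a, b, n):
    """Left-endpoint Rectangle Rule: exactly the Euler sum when F(t,Y) = f(t)."""
    h = (b - a) / n
    return sum(h * f(a + k * h) for k in range(n))

I_approx = rectangle_rule(math.cos, 0.0, math.pi / 2.0, 10000)
print(I_approx)  # approaches sin(pi/2) - sin(0) = 1 as h -> 0
```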

### Estimating How the Error in the Solution Scales with h

Solving a general ODE IVP where F(t,Y) depends on Y is harder than numerical integration, since you cannot exactly compute the right hand side unless you already know Y(t). In Euler’s Explicit method, we are using F evaluated at ‘t’ to get a slope, which we are using over the whole interval [t,t+h]. But in reality, the slope should be changing with time. We believe that for small h, the Taylor series expansion should be valid:

Y(t+h) = Y(t) + Y’(t) h + ½ Y''(t) h² + …
= Y(t) + F(t,Y(t)) h + ½ Y''(t) h² + …
= (Euler Formula) + ½ Y''(t) h² + …

But in Euler’s Explicit algorithm, we only kept the first two terms.  So our computed Y(t+h) is erroneous.  For small h the 3rd term is usually the biggest contributor to the error, so we say the “Local Truncation Error” (LTE) is O(h²), and it is biggest when |Y''(t)| is big.  Note that you can cut this error significantly by reducing h.  There is a very similar problem in numerical integration using the rectangle rule.

Also, the Local Truncation Error for each little interval is O(h²), and if we are unlucky and all of these local errors add up (instead of canceling), the total (global) error in the value of the integral could be
Total error = Σi (local error)i ~ Σ O(h²) ~ Nrectangles * O(h²) ~ O(h).

There is always a factor of N ~ 1/h between the error in each little interval and the total errors; in the textbook they define a “Local Discretization Error” (LDE) = LTE/h, so it scales like the total error.

CPU time is cheap, so you might be tempted to improve your accuracy by just cutting the size of h: if you use 1000x more function evaluations, you cut h by 1000, and add three more significant figures of accuracy to your calculated value of the integral. The problem with this is round-off. The way we evaluate sums is:

Sum=0;
for n=1:nmax
    Sum=Sum+h*f(to+n*h);
end

Each new term is roughly 1/nmax times as large as the accumulated Sum, i.e. we are adding a big number and a small number. When you do this, you lose the last log10(nmax) significant figures in the small number. So if you can calculate f to 10 significant figures, and nmax=1e6, you only get about 4 significant figures in the integral. If nmax is too small, you lose because each term has an error O(1/nmax); if nmax is too large you lose because of round-off.
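A quick Python sketch of this round-off effect (the million-term example is our own illustration; `math.fsum` plays the role of a "perfect" sum):

```python
import math

n = 1_000_000
terms = [0.1] * n

naive = 0.0
for x in terms:
    naive += x               # the running Sum dwarfs each new term: big + small

accurate = math.fsum(terms)  # compensated summation, ~no round-off loss
print(abs(naive - accurate)) # the digits lost to round-off in the naive loop
```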

The solution is to find a better approximation to f(t) in the interval [t,t+h], e.g. we can draw piecewise straight line interpolations (NMM chapter 10) between f(t) and f(t+h), this gives us the trapezoidal rule, which is much more accurate.  You can do even better with more complicated interpolations, e.g. if you use second-order polynomials to interpolate, you can end up with Simpson’s Rule.  These methods all suffer from the round-off, but if the error scales as a high power of h, you don’t need a very small h to get a good result.  In the text, Example 11.8 shows how you only need 65 function evaluations to get the integral of x*exp(-x) accurate to six significant figures, but you would need 1000 function evaluations using the trapezoid rule to get the same accuracy, and on many computers it would be impossible to achieve this accuracy with the rectangle rule: round-off would kill the accuracy before the O(h) error was small enough.
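A Python sketch comparing the two rules on the same integrand as Example 11.8 (the helper names and the point count n=64 are our own illustrative choices, not the textbook's):

```python
import math

def trapezoid(f, a, b, n):
    """Composite Trapezoid Rule, error O(h^2)."""
    h = (b - a) / n
    return h * (0.5 * (f(a) + f(b)) + sum(f(a + k * h) for k in range(1, n)))

def simpson(f, a, b, n):
    """Composite Simpson's Rule (n even), error O(h^4)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return h * s / 3.0

f = lambda x: x * math.exp(-x)
I_exact = 1.0 - 2.0 * math.exp(-1.0)       # integral of x*exp(-x) over [0,1]
err_trap = abs(trapezoid(f, 0.0, 1.0, 64) - I_exact)
err_simp = abs(simpson(f, 0.0, 1.0, 64) - I_exact)
print(err_trap, err_simp)  # Simpson wins by several orders of magnitude
```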

### Higher-Order Explicit Methods for ODE solvers. 1. Midpoint Method

We want to do the same thing with our ODE solver: make a better estimate of F(t,Y(t)) over the interval [t,t+h]. However, it is trickier, because we only know Y(t) at ‘t’ and points to the left of ‘t’. So we cannot interpolate (at least not easily), we must instead extrapolate to estimate Y(t’) that we will plug in to estimate F(t’,Y(t’)).

(In fact, because of all the errors, we won’t know any Y(t) precisely except Y(to)=Yo.  So all of our estimates of F(t,Y) are going to have errors in them except F(to,Yo).)

The Midpoint Method (NMM 12.3.1) uses a simple extrapolation idea:

1) use the slope=F(t,y(t)) at the starting point ‘t’ to extrapolate y(t+h/2)
2) compute the slope at t+h/2, using our estimate of y(t+h/2)
3) use this estimated slope at the midpoint of the interval to extrapolate to y(t+h):
y(t+h) = y(t) + h*(slope from midpoint)

In Matlab, we write this:

Ymid   = Y(n-1,:) + (0.5*h)*F(t(n-1), Y(n-1,:));
Y(n,:) = Y(n-1,:) + h*F(t(n-1)+0.5*h, Ymid);

All “explicit” methods use some explicit formula to compute Y(t+h):
Y(n,:) = ….

Later we will discuss more complicated methods, called “implicit” methods, where it is not possible to write a simple algebraic formula Y(n,:)=…., but instead one has to do some complicated iterative calculation each time one wants to compute Y(t+h).
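The three steps above can be sketched in Python (a sketch; the names are ours), and the convergence order checked numerically on Example 12.3:

```python
import math

def midpoint_step(F, t, y, h):
    ymid = y + 0.5 * h * F(t, y)         # steps 1-2: extrapolate to the midpoint
    return y + h * F(t + 0.5 * h, ymid)  # step 3: midpoint slope over the whole step

def midpoint_solve(F, to, yo, tf, n):
    h, t, y = (tf - to) / n, to, yo
    for _ in range(n):
        y = midpoint_step(F, t, y, h)
        t += h
    return y

F = lambda t, y: t - 2.0 * y                         # NMM Example 12.3 again
y_exact = 0.25 * (2.0 - 1.0 + 5.0 * math.exp(-2.0))  # analytic y(1)
errs = [abs(midpoint_solve(F, 0.0, 1.0, 1.0, n) - y_exact) for n in (50, 100, 200)]
print(errs)  # halving h cuts the error ~4x: second-order convergence
```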

### The Discretization Error Associated with the Midpoint Method

The reason we are using the Midpoint Method is to try to reduce the discretization error associated with Euler’s method. Let’s compute how the error of the Midpoint Method scales with h:

First, expand a Taylor series around the midpoint (t+h/2, Y(t+h/2)):

Y(t’) = Y(t+h/2) + (t’–t–h/2) * Y’(t+h/2) + ½ (t’–t–h/2)² * Y''(t+h/2) + O(Δt³)

Use this to compute the difference between Y(t) and Y(t+h) (set t’ = t+h and t’ = t, and subtract):

Y(t+h) – Y(t) = h * Y’(t+h/2) + O(h³)

Note the O(h²) terms cancel each other out!

But we have a problem: in the Midpoint Method, we used Ymid instead of the true Y(t+h/2) which we do not know. Let’s write a Taylor expansion about (t,Y(t)) to see how wrong our Ymid could be:

Y(t+h/2) = Y(t) + (h/2) * Y’(t) + ½ (h/2)² * Y''(t) + O(h³)
Y(t+h/2) = Ymid + (1/8) h² Y''(t) + O(h³)
(using Ymid = Y(t) + (h/2)*F(t,Y(t)) = Y(t) + (h/2)*Y’(t))

So how wrong can our F(t+h/2, Ymid) be?

F(t+h/2, Ymid) = F(t+h/2, Y(t+h/2)) + (∂F/∂Y) * (Ymid – Y(t+h/2)) + O((Ymid – Y(t+h/2))²)
= F(t+h/2, Y(t+h/2)) + (∂F/∂Y) * (–(1/8) Y''(t) h² + O(h³)) + O(h⁴)
h*F(t+h/2, Ymid) = h*F(t+h/2, Y(t+h/2)) + O(h³)

So the LTE in the Midpoint Method is O(h³), and the LDE and the overall error will be O(h²).

The Midpoint procedure decreases the LTE from O(h²) to O(h³), and so the LDE and the total error drop from O(h) to O(h²). This is a huge improvement for small h! See Example 12.5 in the text to see how much better the Midpoint Method is than the Euler method.

So if Euler’s Method is the Rectangle Rule (error ~ O(h)), the Midpoint Method is similar to the Trapezoidal Rule (error ~ O(h²)). What is analogous to Simpson’s Rule (error ~ O(h⁴))?

### The Fourth Order Runge-Kutta Method (see NMM 12.3.3)

The popular O(h⁴) method is called “fourth-order Runge-Kutta”, or RK-4. The recipe is

Ystart = Y(t)
Ymid1 = Y(t) + (h/2)*F(t, Ystart)
Ymid2 = Y(t) + (h/2)*F(t+h/2, Ymid1)
Yend = Y(t) + h*F(t+h/2, Ymid2)

weighted avg slope = ( F(t,Ystart) + 2*F(t+h/2,Ymid1) + 2*F(t+h/2,Ymid2) + F(t+h,Yend) ) / 6
(This averaging pattern is very similar to Simpson’s Rule, and causes the O(h³) terms to cancel in a similar way.)

Y(t+h) = Y(t) + h*( weighted avg slope )

The equations are written out carefully in 12.3.3.
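A minimal Python sketch of the RK-4 recipe (a sketch, not the textbook code; the step counts are illustrative), again checked on Example 12.3:

```python
import math

def rk4_step(F, t, y, h):
    k1 = F(t, y)                           # slope at Ystart
    k2 = F(t + 0.5 * h, y + 0.5 * h * k1)  # slope at Ymid1
    k3 = F(t + 0.5 * h, y + 0.5 * h * k2)  # slope at Ymid2
    k4 = F(t + h, y + h * k3)              # slope at Yend
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def rk4_solve(F, to, yo, tf, n):
    h, t, y = (tf - to) / n, to, yo
    for _ in range(n):
        y = rk4_step(F, t, y, h)
        t += h
    return y

F = lambda t, y: t - 2.0 * y                         # NMM Example 12.3
y_exact = 0.25 * (2.0 - 1.0 + 5.0 * math.exp(-2.0))  # analytic y(1)
ratio = abs(rk4_solve(F, 0.0, 1.0, 1.0, 10) - y_exact) / \
        abs(rk4_solve(F, 0.0, 1.0, 1.0, 20) - y_exact)
print(ratio)  # ~16: halving h cuts the error by ~2^4
```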

At the end of section 12.3 there is a comparison of the three methods. Just as in numerical integration, Simpson’s Rule is much more accurate than the Trapezoid Rule, even with a much larger h.  The same thing happens with ODE solvers: RK-4 handily beats Midpoint. Rectangle and Euler are both pathetic, and often run into round-off error before you get many significant figures.

### Practical Programming Issues for ODE solvers

What do we want as inputs? Outputs?

function [tvec, Ymatrix] = ODEsolver(F,to,Yo,tf,tols,params)

If we have enough memory to store it, it is convenient to get a whole bunch of ti,Y(ti) out, so we can make plots.  As I showed yesterday, a convenient format for the output is a big matrix:

Y1(t1) Y2(t1) Y3(t1) …
Y1(t2) Y2(t2) Y3(t2) …
Y1(t3) Y2(t3) Y3(t3) …
…

You also must store a list of all the ti’s, i.e. tvec.
Then after you run the ODE solver, the command plot(tvec,Ymatrix(:,4)) draws a picture of Y4(t).

Often, we have some parameters in F that we are unsure of, or which could vary in different versions of a problem.  For example, you probably heard about the water rocket project we did in this class Spring 2002; in that case we did a lot of runs with different amounts of water in the rocket, so the mass of water in the rocket was a parameter.  We also didn’t know the air friction coefficient for the rocket very precisely, so we could leave that as a parameter (or set it equal to zero, to make sure that we got the correct zero-friction result.)

So we might want to use this form of F:

function dY_dt = water_rocket_diff_eqs(t,Y,params)

and we would call the ODEsolver this way:

[t, Y]=ODEsolver(@water_rocket_diff_eqs,0,Yo,10,[1e-6 1e-4],params)

Inside ODEsolver.m, every time we need to evaluate F, we do it this way:

dY_dt = feval(F,t,Y,params)

See the Matlab codes in Chapter 12 for more examples of the correct use of feval.
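In Python the same pattern is just passing a function object plus a params argument; here is a sketch (`ode_solver` and `my_diff_eq` are hypothetical stand-ins, not course code):

```python
import math

def ode_solver(F, to, Yo, tf, nsteps, params):
    """Bare-bones fixed-step Euler driver that forwards params to F."""
    h = (tf - to) / nsteps
    t, y = to, Yo
    for _ in range(nsteps):
        y = y + h * F(t, y, params)   # the analog of feval(F,t,Y,params)
        t += h
    return y

def my_diff_eq(t, y, params):         # hypothetical stand-in for a rate function
    return -params["k"] * y           # dY/dt = -k*Y, with k an adjustable parameter

y_end = ode_solver(my_diff_eq, 0.0, 1.0, 1.0, 1000, {"k": 2.0})
print(y_end)  # close to exp(-2) = 0.1353...
```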

We don’t really want to input ‘h’; instead we would prefer to specify tolerances on the accuracy of the output.

### How do you select ‘h’? How do you know if your answer is accurate?

We know that our answer should be accurate to O(h4) or whatever, but that just tells us how the accuracy scales with h, not the absolute error. In order to be sure we have an accurate answer, we really need to run the calculation with more than one choice for h; if the answers computed using two different h’s are equal to within our tolerances, we can be pretty sure that the h we are using is small enough.

More precisely:
Suppose you evaluate an integral numerically, with Simpson’s rule (or RK-4), O(h4), using h=0.2.  The computer returns an approximate answer, e.g.
Integral(h=0.2) = 23.201650.
Then you repeat the calculation using h=0.1. This time you get the approximate answer
Integral(h=0.1) = 23.200150.

Q. How many significant figures have we converged to?
A. Because of the rapid convergence of Simpson’s rule with h, the second answer is expected to be much more accurate than the first answer (16x smaller error). So for the purposes of counting significant figures, you can assume that the second answer is exact. This approach would say that the first answer is accurate to 4 significant figures. Then the second answer is expected to be 16x more accurate, i.e. about one more significant figure more accurate, so the second answer is probably good to 5 significant figures.

A. If you plot the computed I vs. h⁴, you expect approximately a straight line. You can extrapolate that straight line to get a best guess at the true answer. It would be best to do a couple more calculations to check that the plot of I vs. h⁴ really is linear over the range of h of interest. If so, you can very reliably extrapolate to h=0, often gaining at least one more significant figure of accuracy.
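A Python sketch of this extrapolation idea (Richardson extrapolation) for an O(h⁴) method; the integrand and step counts are our own illustrative choices:

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's Rule (n even), error O(h^4)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if k % 2 else 2) * f(a + k * h) for k in range(1, n))
    return h * s / 3.0

I_exact = 1.0 - math.cos(1.0)            # integral of sin(x) over [0,1]
I_h = simpson(math.sin, 0.0, 1.0, 8)     # step size h
I_h2 = simpson(math.sin, 0.0, 1.0, 16)   # step size h/2, error ~16x smaller
I_extrap = I_h2 + (I_h2 - I_h) / 15.0    # extrapolate the O(h^4) error to h = 0
print(abs(I_h2 - I_exact), abs(I_extrap - I_exact))
```

Dividing by 15 follows from assuming the error is C·h⁴: then I_exact ≈ I(h/2) + [I(h/2) – I(h)]/(2⁴ – 1).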

Cutting the size of h works best if you allow the program to divide some intervals more finely than others.  For example, if you are using the trapezoid rule, your error is big where |f”(t)| is large, and small if f(t) is approximately a straight line over a particular interval.  So you want to use very small h where |f”(t)| is large, but you can save CPU time (and reduce round-off error) by using a bigger h where |f”(t)| is small.  This is called “Adaptive Step Sizes” or “Adaptive Timestepping”, and is used by almost all modern ODE solvers and many numerical integration routines.

These same ideas are used in good ODE solvers, usually to ensure that each component of Y(ti) individually is accurate to within the user-specified tolerances.  If Y(ti) calculated two different ways (e.g. using two different h’s, or two different methods as in ode45 (NMM 12.4)) does not agree to the desired accuracy, one typically cuts h by a factor of 2 and tries the calculation again.  If Y is accurate, h is allowed to grow by some factor.  In this way, the computer spends most of the CPU time in regions where the error is a problem, and much less CPU time in areas where the error is under control.  This also works for numerical integration.  See 12.4, and example 11.15.
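A toy Python sketch of the halve-or-grow policy, using step doubling with Euler (real solvers like ode45 use embedded Runge-Kutta pairs; the growth factor 1.5 and initial h are arbitrary illustrative choices):

```python
import math

def adaptive_euler(F, to, yo, tf, tol):
    """Step doubling: compare one Euler step of h with two steps of h/2."""
    t, y, h = to, yo, (tf - to) / 10.0
    while t < tf:
        h = min(h, tf - t)                           # never step past tf
        one = y + h * F(t, y)                        # one full step
        half = y + 0.5 * h * F(t, y)
        two = half + 0.5 * h * F(t + 0.5 * h, half)  # two half steps
        if abs(one - two) > tol:
            h *= 0.5              # error too big: cut h, retry this step
        else:
            t, y = t + h, two     # accept the more accurate estimate
            h *= 1.5              # error under control: let h grow
    return y

y_end = adaptive_euler(lambda t, y: -2.0 * y, 0.0, 1.0, 1.0, 1e-6)
print(abs(y_end - math.exp(-2.0)))  # small h is used only where it is needed
```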

### Reducing Higher Order ODE’s to standard form

One frequently encounters second-order ODE-IVP’s, such as

d2x/dt2 = F(x)/m    (Newton’s law of motion F=ma)

The “order” of an ODE is the order of the highest derivative that appears (i.e. the largest power of dt appearing in the denominator). These can easily be converted into the standard (i.e. first-order) ODE-IVP form dY/dt = G(t,Y):

Define Y = [x(t); v(t)]  where v(t)=velocity=dx/dt

Then define

G(t,Y) = [Y(2); F(Y(1))/m ] = [v(t); F(x)/m]

It is easy to verify that dY/dt indeed equals G(t,Y). The same trick (defining derivatives to be components of Y, e.g. here we defined Y(2)=dx/dt) works for higher-order ODE’s as well. An ODE of any order can be replaced by an equivalent system of first-order ODE’s.
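A Python sketch of the reduction (we pick the concrete force F(x)/m = –x, i.e. a harmonic oscillator, so the answer can be checked against cos(t); the function names are ours):

```python
import math

def G(t, Y):
    """First-order system for d2x/dt2 = F(x)/m, with the concrete choice F(x)/m = -x."""
    x, v = Y
    return [v, -x]                 # [dx/dt, dv/dt] = [v, F(x)/m]

def rk4_system(G, to, Yo, tf, n):
    """RK-4 for a vector Y, so higher-order ODEs fit the standard form."""
    h, t, y = (tf - to) / n, to, list(Yo)
    for _ in range(n):
        k1 = G(t, y)
        k2 = G(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = G(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = G(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y

x_end, v_end = rk4_system(G, 0.0, [1.0, 0.0], 1.0, 100)
print(x_end, v_end)  # x(t) = cos(t), v(t) = -sin(t) for x(0)=1, v(0)=0
```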
