Spring 2006

Class Handout on Recoverable Operations

In class on April 10, 2006, Professor Liskov presented implementations of recoverable_put and recoverable_get. This note describes these implementations and presents the reasoning about why they are correct.

### The algorithm

1. the specifcations: recoverable_put is called because the higher level has some new data to write, e.g., the new balance of the account.

The goal is:

• recoverable_put is all-or-nothing (recoverable_get will see either the old balance or the new one).
• if recoverable_put completes, recoverable_get will see the change (e.g., the new account balance)

2. assumptions: If careful_put returns, the data has been correctly written to the disk sector in questions. Thus the only error we will look at is a failure during the execution of careful_put. You can assume no disk failures: no decay, no loss of a sector.

3. the strategy: for a particular "disk address" x, two sectors, x.D0 and x.D1.

4. the plan: write carefully so that you can guarantee that at most one of x.D0 and x.D1 is corrupted.

5. the code:

```recoverable_put(x,data):
flag = careful_get(x.D0,buffer);
if flag = ok {
careful_put(x.D1, data);
careful_put(x.D0, data);
}
else {
careful_put(x.D0, data);
careful_put(x.D1, data);
}

recoverable_get(x, data)
flag = careful_get(x.D0, data);
if flag = ok return;
careful_get(x.D1, data);
```

6. the reasoning to show that we always preserve the following predicate P: at least one copy of (x.D0, x.D1) is uncorrupted. This is an inductive argument.

Basis step: initially both copies are ok by assumption, and therefore P holds.

Inductive step: assume P holds when recoverable_put is called.

There are the following cases:

• a) both copies are good: In this case it doesn't matter what order we write them. If the first write finishes, that copy is good; if it doesn't the other copy is good. Therefore P holds.
• b) one copy is bad: This is the key step. In this case we must write to the bad copy first. If this write doesn't finish, we still have one good copy. If it finishes, we now have 2 good copies, so we will satisfy P even if the second write fails.

More specifically:

• D0 is good when recoverable_put starts: In this case, we write to D1 first. If that write doesn't finish we haven't done anything to D0 and therefore it is still good. If that write finishes, D1 is now good, and therefore if doesn't matter whether the write to D0 fails. In either case we are sure that at least one of D0 and D1 is good.
• D0 is bad when recoverable_put starts: This implies that D1 is good, by our inductive assumption. we write to D0 first. If that write doesn't finish, we haven't done anything to D1 and therefore it is still good. If the write to D0 finishes, D0 is now good, and therefore it doesn't matter if the write to D1 fails. In either case we are sure that at least one of D0 and D1 is good.

7. showing that we satisfy our goal:

P implies directly that recoverable_put is all or nothing.

If recoverable_put finishes, it has succeeded in writing the new state to both D0 and D1. Therefore recoverable_get will observe the new state.

8. The commit point in various executions: The new state will be visible as soon as the careful_put to D0 finishes. It will also be visible in the case where the careful_put to D1 happens first and the careful_put to D0 starts.

There are the following cases:

1. D0 was good when recoverable_put started
• (a) the careful_put to D1 failed.
• (b) the careful_put to D1 succeeded and then the machine failed.
• (c) the careful_put to D1 succeeded, but the machine failed during the write to D0.
2. D0 was bad when recoverable_put started
• (a) the careful_put to D0 failed.
• (b) the careful_put to D0 succeeded and then the machine failed.
• (c) the careful_put to D0 succeeded and the machine failed during the write to D1.

The new state will be visible in cases 1c, 2b, and 2c. (Of course as indicated in point (7) above, it will also be visible if recoverable_put finishes.)

Questions or comments regarding 6.033? Send e-mail to the 6.033 staff at or to the 6.033 TAs at

Top // 6.033 home // \$Id: recoverable_ops.html,v 1.3 2006/04/12 00:16:55 mwalfish Exp \$