6.033 2011 Lecture 14: Time & Ordering

There's several ways in which time might be used in computer systems:
  Time intervals: how long did some operation take?
  Calendar time: what time/date did some event happen at?
  Ordering of events: in what order did some events happen?

Measuring time intervals
  Need physical device to produce events ("ticks") with a well-known frequency.
  [ slide with physical oscillators ]
  RC circuits in certain configurations.
  Crystal (quartz) oscillators.
  Cesium atom oscillators.
    Attractive because 1 second is defined in terms of #cesium oscillations.
  Keep a counter of the number of oscillations.
  If the frequency is known, time period (in seconds) = counter / frequency.
  E.g., common frequency for quartz oscillators is 32768 Hz.
    Cheap to manufacture, low power, easy to divide by 2^15.

Keeping track of calendar time
  Typically, calendar time is represented using a counter from some fixed epoch.
  For example, Unix time is #seconds since midnight UTC at start of Jan 1, 1970.
  A program might convert this counter value into human-readable date/time,
    and vice-versa.
  Conversion requires two more inputs: time zone, data on leap seconds.
  E.g., start of lecture is 1301335500 seconds after the Unix epoch.

What happens when you turn off your computer?
  No need to track time intervals, but still want to keep calendar time.
  Separate "Real-Time Clock" (RTC) chip remains powered, with battery/capacitor.
  OS reads the time from RTC on bootup, writes it back periodically.

Properties / quality of a clock
  Resolution, range: what kinds of times can be represented?
    A property of our counter representation.
    E.g. 32-bit Unix time can represent calendar time until 2038 (2^31 seconds),
         resolution 1 sec.
    A 32-bit unsigned counter with a 32768 Hz crystal can represent time up to
         ~1.5 days, but with much better resolution.
    128-bit counter (64-bit # seconds, 64-bit fraction of a second) would be
         sufficient for almost any purpose (both resolution and range), but
         most systems use less general representations.
    RTC chip typically tracks seconds (when computer powered off).
    When CPU is running, can measure w/ resolution of CPU's clock (e.g., 2GHz).
  Precision: how close is the diff between two counter values to elapsed time?
    How precisely do we know oscillator's freq?
    How precisely do we measure oscillations?
    E.g., underlying oscillator might change frequency with temperature.
  Accuracy: how close is the calendar time to an official time, at some instant?
    Property of how well our counter is synchronized with official time.
    Accuracy can degrade over time with poor precision.

Synchronizing with official time
  TAI ("International Atomic Time") is a commonly-agreed-upon master time.
  [ slide with USNO reference clock ]
  Defined as average of many clocks, kept by different labs.
    Most of these are cesium-based atomic clocks (due to defn of a second).
  Time distribution protocol used to synchronize clocks.
    E.g., WWVB radio station in Ft. Collins, CO broadcasts "tick" every second.
      [ slide with WWVB antenna ]
      For high accuracy, need to know location to cancel out propagation delay.
      (Also rx circuitry, but that's a fixed property of receiver.)
    GPS protocol also used to synchronize time.
      Helps receiver determine location: can cancel out propagation delay.

Synchronizing a clock over the internet: NTP
  Query server's time, adjust local time accordingly.
  [ slide with simple time sync protocol ]
  Need to take into account network latency.
    Simple estimate: rtt/2.
    When does this fail to work well?
      Asymmetric routes, with different latency in each direction.
      Queueing delay, unlikely to be symmetric even for symmetric routes.
      Busy server might take a long time to process client's request.

Sync part 1: improving accuracy
  Naive approach: local_time = local_time - offset
  Problem: time would be discontinuous, or even repeat itself (if jump back).
    E.g., program that was timing something would see large (maybe neg!) value.
    Worse yet, the Unix "make" command might malfunction, if time goes backward.
  [ slide with slew time ]
  Solution: "slew" the clock by slowing down / speeding up local clock.
  Typically cannot change oscillator (fixed in hardware).
  Can change assumed frequency (i.e., how much to advance time on each tick).
    Works if counter resolution is much higher than oscillator frequency.
    E.g., increase assumed frequency
        => time incremented by less on each tick
        => time will appear to run a little slower.

Sync part 2: improving precision
  If we only adjust our time once, an inaccurate clock will lose accuracy.
  Need to also improve precision, so we don't need to slew as often.
  Assumption: poor precision caused by poor estimate of oscillator frequency.
  [ slide with adjusting frequency estimate ]
  Can measure difference between local and remote clock "speeds" over time.
  Adjust local frequency estimate based on that information.
  In practice, may want more stable feedback loop (PLL): look at control theory.

Choosing peers in NTP
  In practice, there are many servers running NTP; how to choose which one?
  NTP prefers lower rtt: less error due to network latency.
  Can use other algorithms to sync with average, remove outliers, ..
  One problem: how to avoid sync loops between two close-by machines?
    NTP assigns a "stratum number" to each machine.
    Stratum 1 means machine directly attached to (non-NTP) ref time source.
    E.g., GPS receiver, radio receiver, cesium clock.
    Server reports its stratum number; client sets stratum number to 1 higher.
    Maximum stratum number prevents infinite loops (ignore higher stratums).

Assumption in NTP: everyone is trying to synchronize with some ground truth.
  What happens if there is no master clock?  E.g., completely offline system.
  One possibility: all nodes vote for a coordinator.
  Coordinator then polls all nodes, removes outliers, computes average time.

Application: file reconciliation
  One use of time in a computer system: determining what's more recent.
  Suppose you have three computers, each has a copy of one file to start with.
  Goal: want to propagate recent changes to that file to other computers.
  [ slide with file reconciliation setup ]
  Here, user changes file from 'abc' to 'def' on computer A.
  Then reconciles B->A.
  Then reconciles A->B, A->C.
  Then converts file to upper-case on computer C.
  Then reconciles C->B.

Key problem: how do we know if one version of a file is fresher than another?
  E.g., in first reconciliation step (B->A), "abc" should NOT overwrite "def",
        but in later steps, new versions should overwrite older versions.
  One approach: assume perfectly synchronized clocks on all machines.
  [ slide with timestamp reconciliation algorithm ]
  Keep track of the time when each file was last modified.
  When reconciling, take newer version of file over an older version.
  Seems to work OK in this example.
  [ slide with timestamp reconciliation diagram ]

Concurrent updates
  Suppose the reconciliation A->C never happened.
  [ slide with concurrent change lost problem ]
  User still converts file to uppercase on C, and reconciles C->B.
  On B's computer, it looks like C's file (T=5) is newer, so it accepts.
  Not quite right, because we lost the 'def' change.

Goal: no lost updates
  Should only be OK to overwrite version 1 with version 2,
    if version 2 contains all updates that version 1 contained.
  In this case, reconciliation C->B overwrites one change with another,
    because C's version does not contain A's change.
  System should flag a "conflict" in this case, require user input to resolve.

How to detect such conflicts (i.e., concurrent updates)?
  Associate a vector of timestamps with each file, instead of just 1 timestamp.
  [ slide with vector timestamp algorithm ]
  Entry in vector keeps track of the last time that computer wrote the file.
  Thus, vector has to be as long as the number of computers in question.
  V1 is newer than V2 if all of V1's timestamps are >= V2's.
  V1 is older than V2 if all of V1's timestamps are <= V2's.
  Otherwise, versions V1 and V2 were modified concurrently, so conflict.
  [ slide with vector timestamp conflict ]
  If two vectors are incomparable, we can deduce that one computer modified
    file without seeing the latest version from another computer.
  On the other hand, if vectors are ordered, everything is OK as before.
  [ slide with vector timestamp OK ]

Cool property of version vectors:
  [ slide with local comparisons ]
  A node's timestamps are only compared to other timestamps from the same node.
  Time synchronization not necessary for reconciliation w/ vector timestamps.

Summary:
  Several different uses of time in systems: intervals, calendar time, ordering.
  Often good idea to decide what your system/app is using time for.
  Time synchronization can improve both accuracy and precision (within limits).
  In distributed system, new kind of event ordering: concurrent.
    (In some ways similar to a new kind of outcome for RPC: unknown.)
  Vector timestamps can detect concurrent actions, avoid need for perfect sync.

---

For next time:
  principle: always compare with your own local time
  principle: time never goes backwards
  more uses of time: TTL, caching, expiration, ..

---

Related links:
  http://cr.yp.to/proto/utctai.html
  ftp://elsie.nci.nih.gov/pub/