6.033--Computer System Engineering
Suggestions for classroom discussion of:
S[teven] P. Miller, B. C[lifford] Neuman, J[effrey] I. Schiller, and
J[erome] H. Saltzer.
Kerberos authentication and authorization system.
Section E.2.1 of Athena Technical Plan, M. I. T. Project Athena, October 27, 1988.
by J. H. Saltzer, April 4, 1996, updated March 31, 1998 and April 6, 1999
Kerberos is a good case study of the wide range of considerations that go
into designing something that is intended to be secure. It is unlikely
that most students will have picked up all, or even a majority, of the
issues, partly because this paper only sketches out the main design
considerations. It also assumes that the reader comes in knowing why you
need such a system.
Because of this consideration, it is unlikely that asking a class to
derive Kerberos from its specifications will get very far. It is
probably more illuminating to ask "why does Kerberos do X?"
As a case study, it is also a good illustration of an older approach
to developing security protocols. Rather than modularly separating
authentication and confidentiality and using distinct procedures for
each, it achieves authentication by clever use of the same encryption
that provides confidentiality. The primary consequence of this difference
is that one must study the design very carefully to conclude that it
does both functions in a correct, consistent way; there is not enough
information in this document to come to a conclusion, and one must
actually look at the code to be sure. (Certain fields need
to be in the packet in a specific order for the authentication
to be foolproof.) A secondary consequence of the difference is that
Kerberos provides only a single key for the inquiring parties to use, rather
than providing one key for confidentiality encryption and a second
key for authentication, and tying the two keys together to ensure that
they belong to the same session.
Redesigning Kerberos to use modular authentication and confidentiality
primitives is an interesting exercise, but may be too much to try
to undertake in a 6.033 recitation.
Some bugs in the paper.
As it says at the top of page nine, the descriptions of packet
contents in this section are simplified, and full detail is in
the appendix. Unfortunately, (1) the appendix isn't included in
the version handed out to the class, and (2) in the process of
simplifying the scenario descriptions some fields got omitted.
1. All tickets are identical, whether for the ticket-granting
service or a Kerberos-mediated service. They always contain both
the service name and the workstation identifier.
2. All responses from the TGS are identical; they always
contain a ticket lifetime.
The specific bugs are that the service ticket shown in numbered item
2 of scenario III
on page 11 omits the service name and the workstation identifier,
and the KTGSr --> WS message omits the lifetime field.
- Why a key distribution center (KDC) at all? What is the
alternative, assuming one is designing a symmetric-key system? (Pair-wise
symmetric keys. For Alice to talk to Bob, each must have a copy of
Kab. For Alice to talk to any of 10,000 other people at M.I.T.,
she must have a list of 9,999 such keys. Alice would probably keep a much
shorter key list, of just regular correspondents, so a
name discovery and key exchange mechanism must also be available. The fundamental
problem with pair-wise keys is making that mechanism secure. If Alice
meets Charles in the hall, they can exchange principal
identifiers, but this is not an opportune instant to generate and exchange
a randomly-chosen key. And if Alice discovers the need to send a message
to Charles by reading one of Charles's papers and seeing an e-mail address
at the bottom, it is even harder to arrange to create and exchange a new
key.)
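A quick back-of-the-envelope calculation makes the scaling problem vivid
(the 10,000-principal figure is from the discussion above; the rest is
just arithmetic):

    # Pairwise symmetric keys: every principal holds n-1 keys, and the
    # community as a whole must somehow create and distribute n*(n-1)/2.
    n = 10_000
    print(n - 1)             # 9,999 keys on Alice's key list
    print(n * (n - 1) // 2)  # 49,995,000 distinct keys community-wide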
- Why not use trans-encryption? That is, we create a central agent,
which maintains a single list of all principals and their associated keys.
If Alice wants to send a message to Charles without any prearrangements,
she encrypts the message in her own key and sends it to the central agent.
The agent decrypts it using its copy of Alice's key, then encrypts it using
Charles's key and sends it along to Charles. (Every byte of private
information must flow through the central agent, so it is a high-volume
production operation and at some level of traffic a potential bottleneck.
In addition, Alice's message to Charles is for a moment exposed in
cleartext. It is probably a bad idea to try to achieve high
security--needed for all the keys as well as for the cleartext user data as
it goes through the agent--in a high-volume production server.)
- Why is that bad? (Here is a good example of paranoid reasoning
in action: Because a high-volume production server would be expected to
require quite a bit of wizardly attention, to maintain its performance, to
upgrade it as rapidly as new hardware or operating systems become
available, to adjust its configuration to meet changing demand, etc. In
contrast, managing a secure service should be done very conservatively,
dragging one's feet on installing new system releases, checking and
double-checking every proposed change, under supervision of a small,
trusted, paranoid staff.)
[Before proceeding, it might be a good idea to sketch the packet flow of
a general key distribution service on the board: a packet goes from client
Charles to the KDC asking to talk to data warehouse service W. KDC
fabricates and sends back two copies of a temporary session key
Tcw, the first enciphered in the key of Charles and the second
enciphered in the key of W. Charles then can send a message to W
enciphered in Tcw, along with the copy of Tcw that is
enciphered with W's key. This scheme as described so far acts as an
introduction service--it gets a nonce key privately to
previously out-of-touch correspondents--but it lacks authentication and
replay protection. Kerberos can then be described as a beefed-up version of
this basic key-distribution protocol, with authentication and replay protection added.]
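If the blackboard sketch isn't enough, the same flow can be acted out in a
few lines of Python. This is only a toy: it uses the Fernet construction
from the third-party "cryptography" package as a stand-in for whatever
symmetric cipher a real KDC would use, and the names (kdc_request,
principal_keys, and so on) are hypothetical, not Kerberos code.

    import json
    from cryptography.fernet import Fernet

    # The KDC's private table of principal keys, established out of band.
    principal_keys = {
        "charles": Fernet.generate_key(),
        "warehouse": Fernet.generate_key(),
    }

    def kdc_request(client, service):
        """Return two sealed copies of a fresh session key Tcw: one for the
        client, and one (the 'ticket') that only the service can open."""
        tcw = Fernet.generate_key()
        for_client = Fernet(principal_keys[client]).encrypt(
            json.dumps({"session_key": tcw.decode(), "service": service}).encode())
        ticket = Fernet(principal_keys[service]).encrypt(
            json.dumps({"session_key": tcw.decode(), "client": client}).encode())
        return for_client, ticket

    # Charles's side (he knows only his own key, read here from the table
    # for brevity).
    for_client, ticket = kdc_request("charles", "warehouse")
    reply = json.loads(Fernet(principal_keys["charles"]).decrypt(for_client))
    tcw = reply["session_key"].encode()
    request_to_w = Fernet(tcw).encrypt(b"please run this query")

    # W's side: open the ticket with its own key, recover Tcw, read the request.
    opened = json.loads(Fernet(principal_keys["warehouse"]).decrypt(ticket))
    print(Fernet(opened["session_key"].encode()).decrypt(request_to_w))

    # Note what is missing: nothing here authenticates Charles to W or
    # protects against replay--exactly the gaps Kerberos fills.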
- Why run the user's password through a one-way encryption algorithm to
create the user's key? Why not just use the password as the key? (The
user chooses his or her own password, and may have chosen the same
password for use on Prodigy or a bank ATM. Applying a one-way
transformation to the password helps assure that if the KDC is accidentally
compromised, you don't have to also change your Prodigy password.)
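The idea can be illustrated with the Python standard library; this is the
shape of the transformation, not the actual Kerberos string-to-key
algorithm, and the salt value here is made up:

    import hashlib

    def password_to_key(password, salt=b"ATHENA.MIT.EDU"):
        # A salted, deliberately slow one-way derivation: recovering the
        # password from the stored key requires a brute-force search.
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    user_key = password_to_key("correct horse battery staple")
    # The KDC stores only user_key; the password the user also typed into
    # Prodigy or an ATM never appears in the KDC's database.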
- Why does Kerberos include the principal identifier of the service
inside its encrypted response? Doesn't the client know what service it
asked for? (By explicitly including the service identifier inside the
encrypted packet, that identifier is securely associated with the ticket,
so the client does not have to assume that this ticket goes with this
service. This inclusion is an example of the principle "be explicit".)
- How could one exploit a protocol that didn't explicitly associate
the service name with the ticket? (Lucifer could attack as follows:
intercept Charles's original request, modify it to be a request for a
ticket for Lucifer's service, then send it along to the KDC. The response
would go back directly to Charles, who would be unaware that the ticket in
the response is for Lucifer's service. If Lucifer can intercept further
packets from Charles, Lucifer can pretend to be the data warehouse
service.)
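Continuing the toy Fernet sketch from above, the client-side check that the
explicit service name makes possible might look like this (the field and
function names are again hypothetical):

    import json
    from cryptography.fernet import Fernet

    def open_kdc_reply(sealed_reply, my_key, service_i_asked_for):
        """Decrypt the KDC's reply and confirm it names the requested service."""
        reply = json.loads(Fernet(my_key).decrypt(sealed_reply))
        if reply["service"] != service_i_asked_for:
            # Lucifer rewrote the request into one for his own service; the
            # explicit name inside the encrypted reply exposes the switch.
            raise ValueError("reply names %r, not %r"
                             % (reply["service"], service_i_asked_for))
        return reply["session_key"].encode()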
- Where else does Kerberos use explicitness? (Every place that
two or more fields are enciphered together. For example, the ticket
contains the client's name, so that the server knows for sure who the
session key is associated with.)
- Is it safe to send the ticket for W back to Charles? Wouldn't it
be safer to send it directly to the server? (Given that the ticket has to
cross the network to get to W, we can regard Charles as just one more
network forwarding agent; the ticket is safely enciphered and it isn't any
less secure for this extra hop. And by sending the ticket to Charles, he
can package it with the request to W, thereby eliminating the need for W to
figure out which ticket goes with which request.)
- This thing seems more complicated than necessary. Why is there a
Ticket-Granting Service (TGS)? Why not just ask Kerberos for the ticket
you want in the first place? (With that design, each time you get another
ticket, you would have to use your password to decipher the response
that carries it. If one
could predict all the tickets that one would need during a session at the
beginning, that would be OK. But that is unrealistic, so a dilemma arises:
either the workstation stores a copy of your password for the duration of
the session, or it stops and asks you to type it in each time you invoke a
different network service. So the fundamental reason for introducing the
TGS is to allow the workstation to acquire your password exactly once and
then destroy it as soon as possible.)
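A sketch of the resulting login flow, with get_tgt and get_service_ticket
standing in for the two network exchanges (they and the other names are
hypothetical placeholders, not real library calls):

    import hashlib

    def get_tgt(username, password_key):
        # Placeholder for the exchange with Kerberos; the real reply would
        # be decrypted with password_key.
        return "TGT-for-" + username, b"tgs-session-key"

    def get_service_ticket(service, tgt, tgs_session_key):
        # Placeholder for the exchange with the TGS; no password involved.
        return "ticket-for-" + service, b"service-session-key"

    def login(username, password):
        password_key = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                           b"realm-salt", 100_000)
        tgt, tgs_session_key = get_tgt(username, password_key)
        del password, password_key    # the workstation forgets the secret now
        return tgt, tgs_session_key

    tgt, tgs_key = login("charles", "correct horse battery staple")
    # Every later service needs only the TGT; the password is never used again.
    print(get_service_ticket("warehouse", tgt, tgs_key))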
- Let's try to simplify another aspect of Kerberos. Why not omit
the authenticator that is included with a ticket? Why not send just the
ticket to the service? (This point is critical. Rather than using the
procedure described in lecture of sending a string of data plus a separate
key-driven authenticator, Kerberos uses an ad hoc
self-authenticating technique, in which the encrypted packet is assumed
authentic if it is internally self-consistent. The authenticator does two
distinct things, both of which may be essential, depending on the
requirements of the service:
- it carries a timestamp. This assures that this request is
current, not a replay from yesterday.
- it repeats the client's principal identifier--this repetition
allows the service to
verify that the session key found in the ticket is valid. Without
this check, there is no way to distinguish a valid ticket from a
random bit string.)
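In the same toy Fernet notation, constructing and checking an authenticator
might look like this (the field names and the five-minute window come from
the paper; the code itself is only an illustration):

    import json, time
    from cryptography.fernet import Fernet

    TDELTA = 5 * 60   # seconds

    def make_authenticator(client, session_key):
        # Sealed with the session key, so only someone who could read the
        # KDC's reply (the real client) could have produced it.
        return Fernet(session_key).encrypt(
            json.dumps({"client": client, "t_current": time.time()}).encode())

    def check_authenticator(ticket_fields, authenticator):
        session_key = ticket_fields["session_key"].encode()
        auth = json.loads(Fernet(session_key).decrypt(authenticator))
        # The repeated client name ties the session key in the ticket to the
        # party presenting it; a forged ticket would fail this check.
        if auth["client"] != ticket_fields["client"]:
            raise ValueError("ticket and authenticator disagree about the client")
        # The timestamp shows the request is current, not yesterday's replay.
        if abs(time.time() - auth["t_current"]) > TDELTA:
            raise ValueError("authenticator is not sufficiently recent")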
- Aren't both of those things always essential? (Services that
don't maintain dynamically changeable state sometimes don't require
protection against request replays, so the timestamp may not be important.
A domain name service is an example of one that could get along without
this part of the authenticator. However, it would probably still include a
time-stamp in its responses, because its clients may be concerned about
replays of those replies.)
- What is the essential difference between this scheme of authentication
and the key-driven authenticator described in lecture? (The scheme
described in lecture systematically protects against yet another attack,
called slicing. In a slicing attack, Lucifer intercepts encrypted packets
from different sessions, cuts a segment out of one packet and pastes it
into another. Depending on the particular encryption scheme being used,
it is entirely possible that the recipient of this doctored packet may
accept the result as authentic. Kerberos prevents slicing attacks by
carefully arranging the individual data fields of the packet so that they
cross boundaries between encryption blocks. The intent is to make it
difficult for Lucifer to find a place where cut-and-paste won't be
detected. The paper doesn't mention this defense at all, which has led
some people to assume that Kerberos is susceptible to slicing. No one has
identified a workable slicing attack on Kerberos, but it is a challenge
to verify that it can't be done.)
- But services that do maintain state need replay protection. In
scenario II it says that the authenticator must be "sufficiently recent".
But without precise clock synchronization, won't there be a window during
which Lucifer can replay a request and get the server to accept it? (Yes;
the service has to provide further protection. It works as follows:
- As part of constructing an authenticator, a client inserts a new
value for Tcurrent.
- The service requires that that value of Tcurrent be within
Tdelta of its own clock. Kerberos sets Tdelta at five minutes.
- The service maintains a history of depth 2*Tdelta of all the
authenticators it has received during that time. For each newly
received authenticator it searches back through its history (it
has to search only as far back as 2*Tdelta; it can garbage
collect space devoted to any older history entries) to verify
that this is not a replay.
The primary replay protection that the Kerberos authenticator provides
is to limit the depth of the history that the server needs to keep.
Alternatively, if dynamic replay attacks are unlikely, the server can just
ignore the problem and still be fairly well protected.)
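The bookkeeping described in the list above fits in a few lines. The names
here are hypothetical, and this is the algorithm from the text rather than
actual Kerberos library code:

    import time

    TDELTA = 5 * 60   # seconds

    class ReplayCache:
        def __init__(self):
            self.seen = {}   # authenticator bytes -> time of arrival

        def accept(self, authenticator, t_current):
            now = time.time()
            # Garbage-collect history older than 2*Tdelta; the freshness
            # test alone rejects any replay of those entries.
            self.seen = {a: t for a, t in self.seen.items()
                         if now - t <= 2 * TDELTA}
            if abs(now - t_current) > TDELTA:
                return False             # not sufficiently recent
            if authenticator in self.seen:
                return False             # a replay within the window
            self.seen[authenticator] = now
            return True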
- How was the value of five minutes chosen for Tdelta? (The choice
of the authenticator expiration period is a tradeoff; if it were longer
servers would need to keep a deeper history to avoid replay attacks. If
it were shorter the clocks would have to be more carefully
synchronized. With Tdelta set to five minutes, there is a ten-minute
window, which is large enough to allow clocks to be set by hand, yet
clock-setting errors big enough to cause trouble are still likely to be
noticed.)
- Why not use a clock synchronizing protocol to make the window much
smaller? I have heard that good protocols can adjust clocks to within
a few dozen milliseconds, even across the Internet. (That is certainly
a possibility. But one would then need a certifiably authenticated
clock-setting protocol, which really escalates the problem. And since
the window can't be made negligibly small, the application would still
have to maintain a list--albeit perhaps much shorter--of recent requests.)
- I have heard that Kerberos version five somehow eliminates replays
completely. (Kerberos comes with a library program that an application
can use to analyze arriving tickets. The library procedure that comes
with version five of Kerberos automatically maintains a history of
authenticators received during the last 2*Tdelta minutes and it rejects
any duplicates as replays. Placing this function in the library reduces
the chance that an application programmer will foul things up. This is
an example in which the end-to-end argument has been dominated by
concerns about protecting application programmers from their own
mistakes.)
- The paper says that the Kerberos design assumes that server and
client clocks are loosely synchronized. How does security fail if this
assumption is not met? (If you can play with the clocks, there is a
fairly simple attack scenario:
1. 6.033 student makes a recording of an encrypted session
between Kaashoek and a print server some afternoon when Kaashoek
is printing 6.033 Quiz 2.
2. The student comes back late that night, spoofs the network
time protocol, and convinces the print server that it is 2 p.m.
again.
3. The student plays back Kaashoek's part of the encrypted
session, and causes the printer to print another copy of the
quiz, for later study.)
- Isn't this a pretty severe flaw in Kerberos? (It is definitely
a problem, but it is misleading to characterize it as a "flaw" in Kerberos.
Every security component comes with a set of assumptions. Explicitness
tells us that those assumptions should be clearly stated, so that when that
component is used in a larger system, the system designer knows what the
assumptions are. If you, as the designer, violate an assumption, you
should expect that security can be compromised. If you don't like some
assumption, you can choose a different component that doesn't make that
assumption. But it is misleading to characterize a component as flawed if
you don't like its assumptions. The term flawed should be
reserved for cases where, even after granting the assumptions, the
component is still not secure.
If the assumptions are unrealistic, perhaps the component is
worthless, but that is a different objection. In this case, Kerberos
specifies that all of the participants must have clocks that are
synchronized to within a certain maximum skew. On this (and a few other
assumptions, such as that the server is under physical lock and key) various
security features rest.
At the time that Kerberos was designed, this assumption was in some sense
more realistic than it is today, because clocks were set by hand, and to
change a clock required physical capture of the server. Nowadays, many
servers have switched over to setting their clocks with an insecure network
protocol, NTP. This change violates one of those explicit
assumptions. If a secure clock-setting protocol were to be introduced
(perhaps switching to secure GPS, as suggested by Dave Mazieres), then the
assumption made by Kerberos would once again be sound, and the attack would
fail.
All of this reinforces Ross Anderson's observation that you can too easily
construct an insecure system out of secure components.
)
- When a client first approaches Kerberos to ask for a ticket it
includes a timestamp in the request packet, and Kerberos echos the
timestamp in the response, apparently to ensure freshness--that the
response is a current one. Why does freshness matter in this particular
interchange? (Without this protection, Lucifer could intercept the
original request packet and send you back the response packet that Kerberos
sent you six hours ago, on another occasion when you used this same
service. The main difference between the new response and the replayed
one is that the new one has a new session key; Lucifer is forcing you to
reuse an old session key, one that he may have by now found a way to
compromise, either by cryptanalysis or by dumpster diving. Even without
this protection you would probably detect replays of packets more than 8
hours old because the enclosed tickets will have expired. The interesting
observation is that Kerberos is engineered to provide protection not just
against likely attacks, but also against some quite unlikely ones.)
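A sketch of the client-side freshness check (send_to_kdc stands in for the
network round trip; all names are hypothetical):

    import json, time
    from cryptography.fernet import Fernet

    def request_ticket(send_to_kdc, my_key, service):
        t_sent = time.time()
        sealed_reply = send_to_kdc({"service": service, "timestamp": t_sent})
        reply = json.loads(Fernet(my_key).decrypt(sealed_reply))
        if reply["timestamp"] != t_sent:
            # A recording of the reply Kerberos sent six hours ago carries
            # the old timestamp, so Lucifer cannot force reuse of its
            # (possibly compromised) session key.
            raise ValueError("stale reply: not an answer to this request")
        return reply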
- But shouldn't services protect or destroy old session keys? (Yes,
but it is not safe to depend on them doing so correctly. In addition,
a session key may be used just to establish authenticity, not for
secrecy, in which case the participants in the transaction may not see
any need to protect it once they have finished using it. Having
introduced the concept of a single-use session key, users will begin to
assume that the concept is meaningful, so it is important to make it
so.)
- How is key distribution done in public key systems? (One still
needs a key distribution method, and it is almost as hard as with a
symmetric key system. The reason is that if Alice is going to use Bob's
public key, she depends on its authenticity, so she must be absolutely
confident in the method by which she obtained that key. Although it
doesn't have to be kept confidential, one can't handle it so casually
that Lucifer might be able to tamper with it.)
- Tickets contain a field "WS", the workstation IP address.
What good does that do? (WS is the one thing that is probably
superfluous in this protocol. It increases the work factor of
certain replay attacks slightly (because an attacker has to
resort to IP spoofing to make the WS field verify correctly) but
that adds little to overall security, while it can be a real
nuisance. For example, if your home
computer dials up to check for mail every thirty minutes, and the
dialup scheme assigns a new IP address on each dialup, then a
ticket used on one dialup is worthless for the next one, and you
are forced to retype your password every half-hour.)
- Kerberos allows a dictionary attack on your password. How can
Lucifer mount such an attack? (Lucifer can send a message to the KDC
claiming to be you and asking for a Ticket-Granting Ticket. The KDC will
be happy to send back such a ticket, along with a current timestamp and,
for explicitness, the name of the ticket-granting service, all encrypted in
your key. These last two things are intended to allow you to authenticate
the response. But they can be used by Lucifer to verify that he has
guessed your password: Lucifer sends this encrypted response over to his
256-way parallel processor and starts trying to decrypt, using every word
in the dictionary as a potential password. If your password is in his
dictionary, then when he tries it the decrypted timestamp and TGS name
will be recognizable, and he knows he
has found your password. The timestamp and the identity of the TGS,
because they can be recognized, are called "verifiable plaintext". Even on
a Pentium, it takes only a few hours to try all the entries in a good-sized
dictionary. The beauty of this attack is that it requires only a single,
apparently legitimate, interaction with the KDC. All of the trial
encryptions can be done in private where no one is watching or logging
traffic or access failures. This situation should be compared with that of
an ATM, where in order to try a candidate PIN an attacker has to send a
message to the bank, which after three failures will confiscate the ATM
card.)
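The attack itself is only a loop. In this sketch the victim's key is
derived from a password, and Fernet's built-in integrity check plays the
role of the verifiable plaintext; against the real protocol Lucifer would
instead look for a recognizable TGS name and timestamp in the decrypted
bytes. Everything here is a made-up illustration:

    import base64, hashlib, json, time
    from cryptography.fernet import Fernet, InvalidToken

    def password_to_fernet_key(password):
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(),
                                     b"realm-salt", 10_000)
        return base64.urlsafe_b64encode(digest)

    # The reply Lucifer obtained with his single, legitimate-looking request.
    victim_key = password_to_fernet_key("trombone")        # a weak password
    captured_reply = Fernet(victim_key).encrypt(
        json.dumps({"service": "krbtgt", "timestamp": time.time()}).encode())

    # Entirely off line, with nothing watching or logging failures:
    for guess in ["aardvark", "password", "trombone", "zebra"]:
        try:
            plaintext = Fernet(password_to_fernet_key(guess)).decrypt(captured_reply)
        except InvalidToken:
            continue                                        # wrong guess
        print("password found:", guess, json.loads(plaintext))
        break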
- Can this problem be fixed? (Yes. The trick is to redesign the
Kerberos protocol to ensure that there is no verifiable plaintext in
that returned packet. Eliminating verifiable plaintext while
maintaining freshness is a bit challenging; read the following paper
for details of one such design. Li Gong, Mark A. Lomas, Roger M.
Needham, and Jerome H. Saltzer. Protecting poorly chosen secrets from
guessing attacks. IEEE Journal on Selected Areas in Communications 11,
5 (June 1993), pages 648-656.)
Comments and suggestions: Saltzer@mit.edu