Why Bitkeeper Isn't Right For Free Software
by Greg Hudson
-------------------------------------------

Much has been written about the use of Bitkeeper in free software,
particularly in relation to the Linux kernel.  Most of the discussion
has focused on the restrictive licensing terms of the free-beer
version (no source, can't be used by Bitmover competitors, termination
clauses).  But licensing-related arguments against Bitkeeper aren't
very compelling because non-Bitkeeper users haven't been penalized: as
long as Bitkeeper use isn't required for Linux kernel development, the
free software world is hardly "in crisis" as some have argued, any
more than it would be if many Linux developers preferred to use fancy
proprietary text editors.  (However, this article does assume that you
have a certain level of background distaste for proprietary software,
and would only adopt Bitkeeper if you believe you really need it.)

There is another reason why Bitkeeper is wrong for the free software
world: it targets the wrong development model.  From the beginning,
Bitkeeper has been aimed at making it easier to do development the way
Linux kernel development is done.  As Larry McVoy put it: "BK makes it
really easy to do what Linus is doing." [1]

What is this development model?  Unlike just about any other large
free software project, the Linux kernel relies on a single integrator
for each line of development.  As of this writing, the integrators are
Linus Torvalds for the mainline, Alan Cox for the bleeding-edge
development branch, and Marcelo Tosatti for the stable 2.4 branch.
Outside contributions are generally filtered through area-specific
maintainers (e.g. Remy Card for the ext2 filesystem), forming a
two-level hierarchy.

Because of this model, the area maintainers--and particularly the
branch maintainers--need tools which can handle very high integration
throughput.  CVS, the mainstay of other free software projects, does
not perform well for this purpose: its tagging and status operations
are slow, it does not have atomic changesets, and its command set is
not streamlined for change integration.

To meet the Linux kernel developers' needs, Bitkeeper focuses on
decentralized development and communication of changesets between
developers.  Bitkeeper supports a hierarchy of repositories, with
changes propagated from lower-level repositories to upper-level
repositories through a variety of channels.
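
The idea of changesets flowing up a hierarchy of repositories can be
sketched in a few lines of Python.  This is only a toy model under
stated assumptions, not Bitkeeper's actual commands or data model: the
Repository and Changeset classes, the push() method, and the
repository names are all hypothetical, and the file paths are merely
illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Changeset:
    """A group of file edits recorded and transferred as one unit."""
    ident: str
    files: tuple


@dataclass
class Repository:
    """Hypothetical repository that knows only its parent, if any."""
    name: str
    parent: Optional["Repository"] = None
    changesets: list = field(default_factory=list)

    def commit(self, ident, files):
        """Record a changeset locally; the parent is not contacted."""
        cs = Changeset(ident, tuple(files))
        self.changesets.append(cs)
        return cs

    def push(self):
        """Send the parent every changeset it lacks; return the count."""
        if self.parent is None:
            raise ValueError(f"{self.name} has no parent repository")
        known = {cs.ident for cs in self.parent.changesets}
        missing = [cs for cs in self.changesets if cs.ident not in known]
        self.parent.changesets.extend(missing)
        return len(missing)


# Assumed layout: contributor -> area maintainer -> branch integrator.
mainline = Repository("mainline")
ext2 = Repository("ext2-maintainer", parent=mainline)
laptop = Repository("contributor-laptop", parent=ext2)

# Work happens offline in the lowest-level repository.
laptop.commit("cs1", ["fs/ext2/inode.c", "fs/ext2/super.c"])
laptop.commit("cs2", ["fs/ext2/dir.c"])

laptop.push()  # the area maintainer now holds both changesets
ext2.push()    # the integrator receives them from the maintainer
```

Note that in this sketch every change bound for the mainline passes
through exactly one repository at each level, which is the property
the rest of this article is concerned with.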

Although a changeset-oriented source control tool is useful in many
contexts (offline development on a laptop and private branches of a
project, to name two), the pyramid development model which motivates
it is a fundamentally poor way to run a project.  It is difficult to
argue with Linux's numerical success, but the pyramid model has
important limitations:

  * Limited development speed: even with the best tools, a single
    integrator can only achieve a certain level of throughput.

  * Single point of failure: if the single integrator of a branch
    suffers an accident, goes on vacation, or simply burns out,
    development is disrupted until a new integrator can be selected
    and comes up to speed.

  * Opinionated maintainers: it is a rare individual who is always
    right.  For instance, the mainline Linux kernel does not contain a
    kernel debugger because Linus won't allow it.  In his words: "I
    don't think kernel development should be 'easy'.  I do not condone
    single-stepping through code to find the bug." [2]

  * Limited filtering: work done by (or approved by) an area
    maintainer is subject to review only by the branch integrator, and
    such review may be cursory or nonexistent.  Of course, anyone can
    review all the changes that go into a branch, but only two people
    are in a position to say "no, this change does not meet our
    standards; it cannot go in" and make it stick.

For Linux, the consequences of these limitations have been slow and
unpredictable release schedules, poor stability of release branches,
and a lack of important standards (for instance, no consistent kernel
module ABI or even API within a release branch).

Other large free software projects (the *BSDs, Mozilla, Apache) have a
pool of committers with write access to the development mainline.  How
committers are chosen and how conflicts are handled vary from
project to project [3], but the fundamental organization is constant.
These projects are able to use CVS productively because no single
individual is required to integrate a large volume of outside changes.
To be fair, not all of these projects have achieved perfectly
predictable or stable releases or adherence to important standards,
but they have performed better in those respects than Linux has, on
balance.

Of course, Bitkeeper can still work with centralized project
management, and carries some advantages over CVS in that context, such
as easy file renaming and offline development.  But free software
developers are unlikely to consider Bitkeeper unless they are
genuinely unable to muddle through with CVS--that is, unless they are
using pyramid management.

In conclusion: Bitkeeper is wrong for free software because it
encourages a development model with bad results.  Developers would be
better off managing their projects in a centralized manner with
multiple committers.


[1] linux-kernel, 2002-10-05 23:28:52 GMT
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0210.0/1804.html
    Also see the end of
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0303.1/0315.html

[2] linux-kernel, 2000-09-06 19:52:29 GMT
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0009.0/1148.html

[3] For example, see http://httpd.apache.org/dev/guidelines.html


Thanks to Carl Alexander and Joseph Sokol-Margolis for valuable
editing assistance.