Why Bitkeeper Isn't Right For Free Software
by Greg Hudson
-------------------------------------------

Much has been written about the use of Bitkeeper in free software,
particularly in relation to the Linux kernel. Most of the discussion
has focused on the restrictive licensing terms of the free-beer version
(no source, can't be used by Bitmover competitors, termination
clauses). But licensing-related arguments against Bitkeeper aren't
very compelling because non-Bitkeeper users haven't been penalized: as
long as Bitkeeper use isn't required for Linux kernel development, the
free software world is hardly "in crisis" as some have argued, any more
than it would be if many Linux developers preferred to use fancy
proprietary text editors. (However, this article does assume that you
have a certain level of background distaste for proprietary software,
and would only adopt Bitkeeper if you believe you really need it.)

There is another reason why Bitkeeper is wrong for the free software
world: it targets the wrong development model.

From the beginning, Bitkeeper has been aimed at making it easier to do
development the way Linux kernel development is done. As Larry McVoy
put it: "BK makes it really easy to do what Linus is doing." [1]

What is this development model? Unlike just about any other large free
software project, the Linux kernel relies on a single integrator for
each line of development. As of this writing, the integrators are
Linus Torvalds for the mainline, Alan Cox for the bleeding-edge
development branch, and Marcelo Tosatti for the stable 2.4 branch.
Outside contributions are generally filtered through area-specific
maintainers (e.g. Remy Card for the ext2 filesystem), forming a
two-level hierarchy.

Because of this model, the area maintainers--and particularly the
branch maintainers--need tools which can handle very high integration
throughput.
CVS, the mainstay of other free software projects, does not perform
well for this purpose: its tagging and status operations are slow; it
does not have atomic changesets; its command set is not streamlined for
change integration.

To meet the Linux kernel developers' needs, Bitkeeper focuses on
decentralized development and communication of changesets between
developers. Bitkeeper supports a hierarchy of repositories, with
changes propagated from lower-level repositories to upper-level
repositories through a variety of channels.

Although a changeset-oriented source control tool is useful in many
contexts (offline development on a laptop and private branches of a
project, to name two), the pyramid development model which motivates it
is a fundamentally poor way to run a project. Although it is difficult
to argue with Linux's numerical success, the pyramid model has
important limitations:

* Limited development speed: even with the best tools, a single
  integrator can only achieve a certain level of throughput.

* Single point of failure: if the single integrator of a branch
  suffers an accident, goes on vacation, or simply burns out,
  development is disrupted until a new integrator can be selected and
  comes up to speed.

* Opinionated maintainers: it is a rare individual who is always
  right. For instance, the mainline Linux kernel does not contain a
  kernel debugger because Linus won't allow it. "I don't think kernel
  development should be 'easy'. I do not condone single-stepping
  through code to find the bug." [2]

* Limited filtering: work done by (or approved by) an area maintainer
  is only subject to review by the branch integrator, and such review
  may be cursory or nonexistent. Of course, anyone can review all the
  changes that go into a branch, but only two people are in a position
  to say "no, this change does not meet our standards; it cannot go
  in" and make it stick.

For Linux, the consequences of these limitations have been slow and
unpredictable release schedules, poor stability of release branches,
and a lack of important standards (for instance, no consistent kernel
module ABI or even API within a release branch).

Other large free software projects (the *BSDs, Mozilla, Apache) have a
pool of committers with write access to the development mainline. How
committers are chosen and how conflicts are handled vary from project
to project [3], but the fundamental organization is constant. These
projects are able to use CVS productively because no single individual
is required to integrate a large volume of outside changes. To be
fair, not all of these projects have achieved perfectly predictable or
stable releases or adherence to important standards, but they have
performed better in those respects than Linux has, on balance.

Of course, Bitkeeper can still work with centralized project
management, and carries some advantages over CVS in that context, such
as easy file renaming and offline development. But free software
developers are unlikely to consider Bitkeeper unless they are genuinely
unable to muddle through with CVS--that is, unless they are using
pyramid management.

In conclusion: Bitkeeper is wrong for free software because it
encourages a development model with bad results. Developers would be
better off managing their projects in a centralized manner with
multiple committers.

[1] linux-kernel, 2002-10-05 23:28:52 GMT
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0210.0/1804.html
    Also see the end of
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0303.1/0315.html

[2] linux-kernel, 2000-09-06 19:52:29 GMT
    http://www.uwsg.iu.edu/hypermail/linux/kernel/0009.0/1148.html

[3] For example, see http://httpd.apache.org/dev/guidelines.html

Thanks to Carl Alexander and Joseph Sokol-Margolis for valuable editing
assistance.