I’m not sure we talk about Linux enough. Those of us who grew up in open source, as well as those new to it, all owe a huge debt of gratitude to the pioneering work of the Linux community. Linux, after all, was the early poster child for what open source meant, and what it could mean for individuals, businesses, and governments.
But the Linux community, including its gravitational center, Linus Torvalds, has also shaped the way open source works today. From Git to organizational structure (maintainers, committers, etc.), Linux has either directly established or indirectly influenced how open source communities operate. It’s therefore worth reviewing some of the ways Linux paved the way for open source, generally, as detailed in The Linux Foundation’s 2020 Linux Kernel History Report.
Getting along at scale
There’s nothing small about Linux (well, except for all those zillions of IoT devices happily running embedded Linux distributions). The project has been running for 29 years now, fueled by over 20,000 contributors (now roughly 4,000 per year), adding up to an astounding one million commits (as of August 2020), or more than 75,000 commits per year on average over the last few years.
But, of course, it didn’t start that way.
Linux started as the solo project of Torvalds, but by 1996 he was joined by two others, Alan Cox and Jon Naylor, with each of the three dubbed “maintainers.” Other projects like the Apache web server took organizational shape in this same general time period, but to the best of my knowledge none organized as early (and formally) around a maintainer hierarchy.
Such hierarchy was critical because, as the end of the first MAINTAINERS file (for kernel v1.3.68) revealed, project communication was a problem:
P: Linus Torvalds
S: Buried alive in email
That first MAINTAINERS file in 1996 was just 107 lines long. Today (v5.8), the file is 19,033 lines long and has 1,501 maintainers listed. Those maintainers (among others) exchange an incredible number of messages on the Linux Kernel Mailing List (LKML).
Even at this scale, however, the organizational setup of Linux has allowed Torvalds (and the 1,501 others) to collaborate without becoming “buried alive in email.” Unfortunately, with Linux’s success Torvalds instead has to learn to swim through reporters, as the end of the MAINTAINERS file for v5.8 suggests:
M: Linus Torvalds <email@example.com>
S: Buried alive in reporters
T: git git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
First world open source problems.
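The MAINTAINERS excerpts above follow a simple "tag: value" layout, which is part of what lets a 19,033-line file stay machine-readable. As a rough illustration only, here is a minimal Python sketch that reads such entries. The function name `parse_entry` and the human-readable labels in `TAG_NAMES` are my own assumptions for this example, not anything defined by the kernel tooling, and only the four tags quoted in this article are handled.

```python
from collections import defaultdict

# Labels for the single-letter tags quoted in the excerpts above.
# These readable names are assumptions made for this sketch.
TAG_NAMES = {
    "P": "person",      # used in the v1.3.68 excerpt
    "M": "maintainer",  # used in the v5.8 excerpt
    "S": "status",
    "T": "tree",
}


def parse_entry(lines):
    """Turn 'X: value' lines into a dict mapping a label to a list of values."""
    entry = defaultdict(list)
    for line in lines:
        # Split on the first colon only, so URLs in the value stay intact.
        tag, sep, value = line.partition(":")
        tag = tag.strip()
        if not sep or tag not in TAG_NAMES:
            continue  # ignore anything that is not one of the known tag lines
        entry[TAG_NAMES[tag]].append(value.strip())
    return dict(entry)


v1_3_68_excerpt = [
    "P: Linus Torvalds",
    "S: Buried alive in email",
]

v5_8_excerpt = [
    "M: Linus Torvalds <email@example.com>",
    "S: Buried alive in reporters",
    "T: git git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git",
]

old_entry = parse_entry(v1_3_68_excerpt)
new_entry = parse_entry(v5_8_excerpt)

print(old_entry["status"], "->", new_entry["status"])
```

Values are kept as lists because a real entry can repeat a tag (several maintainers, for instance); the two excerpts here happen to have one value per tag.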
Talking in code
As important as that hierarchy has been, as well as the various communication media (like LKML) used to stitch development teams together, something more was needed to help tackle the incredible growth in the Linux kernel’s code. That something is Git.
I’ve highlighted before the importance of Git, the version control system Torvalds established to help tame the growing code base. It was so important, in fact, that Torvalds actually took a break from kernel development in 2005 to create it. We now more commonly think of Git in the context of GitHub or GitLab, but the basis of them both is, of course, Git. As such, Tobie Langel is correct in arguing that Git (and companies like GitHub that made it even easier to use) “gave open source visibility and lowered the playing field for collaboration by an order of magnitude.”
The switch from BitKeeper to Git in 2005 allowed the Linux kernel community to swell without the process for collaborating on code becoming unwieldy. As the folks at Atlassian have written:
Unlike centralized version control systems, Git branches are cheap and easy to merge…. Feature branches provide an isolated environment for every change to your codebase. When a developer wants to start working on something — no matter how big or small — they create a new branch. This ensures that the master branch always contains production-quality code.
In [a centralized version control system like] SVN, each developer gets a working copy that points back to a single central repository. Git, however, is a distributed version control system. Instead of a working copy, each developer gets their own local repository, complete with a full history of commits. Having a full local history makes Git fast, since it means you don’t need a network connection to create commits, inspect previous versions of a file, or perform diffs between commits.
Distributed development also makes it easier to scale your engineering team. If someone breaks the production branch in SVN, other developers can’t check in their changes until it’s fixed. With Git, this kind of blocking doesn’t exist. Everybody can continue going about their business in their own local repositories.
All of this is easy to take for granted, because Git has become the standard for software development over the past 15 years. But it was the Linux kernel community that first piloted Git-based development, making it standard for the rest of us today.
Git has allowed Linux kernel development to accelerate dramatically. When the first kernel report was published in 2008, there were just two commits per hour for the 2.6.12 release. In 2019 the average jumped to 9.4 commits per hour, and in 2020 (5.8 kernel) the average is 10.7 commits per hour. More. Faster. Better.
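To put those hourly rates in more familiar terms, here is a small back-of-the-envelope sketch in Python. It uses only the three figures quoted above, treats "two commits per hour" as exactly 2.0 (an assumption), and assumes the rates hold around the clock; the names `rates` and `per_day` are made up for the example.

```python
HOURS_PER_DAY = 24

# Commits per hour as quoted in the text; 2.0 is a rounding assumption.
rates = {
    "2.6.12": 2.0,
    "2019": 9.4,
    "5.8 (2020)": 10.7,
}


def per_day(commits_per_hour):
    """Convert an hourly commit rate into a daily one."""
    return commits_per_hour * HOURS_PER_DAY


baseline = rates["2.6.12"]
for label, rate in rates.items():
    growth = rate / baseline
    print(f"{label}: {per_day(rate):.1f} commits/day, {growth:.2f}x the 2.6.12 rate")
```

Under those assumptions, the 5.8 kernel's pace works out to roughly 257 commits per day, a bit more than five times the 2.6.12 rate.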
Making open source pay
There are a number of other technical advances that have arisen from the Linux community, not to mention process improvements that bleed out into other open source projects. But let’s end on one other important “innovation” to emerge from Linux: money. Linux, more than any other project, has demonstrated that there’s significant financial upside in contributing time and money toward free software.
In part I’m referring to companies like Red Hat that have built billion-dollar businesses selling services around Linux. But I think it’s even more important to look at the individuals involved in Linux kernel development.
To hear kernel developers like Jens Axboe tell it, kernel development can be “something you do for fun and because you find it exciting.” Given Linux’s central importance to so many organizations, however, most Linux kernel developers (74.2 percent) get paid to contribute. In turn, it has become standard practice throughout the industry for contributors to open source projects to get paid for this work. This is a good thing because it helps to ensure the long-term sustainability of a wide variety of projects.
Including Linux. Linux wasn’t the first open source project, but it’s been the gold standard in open source since its launch in 1991. Most open source developers will never work on the Linux kernel, but all derive considerable benefit from the practices, technologies, and culture that it has created.