History of Linux

The very first release of Linux was announced by Linus Torvalds on August 25, 1991, on the Usenet group comp.os.minix:

Hello everybody out there using minix -

I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones. This has been brewing since april, and is starting to get ready. I'd like any feedback on things people like/dislike in minix, as my OS resembles it somewhat (same physical layout of the file-system (due to practical reasons) among other things).

I've currently ported bash(1.08) and gcc(1.40), and things seem to work. This implies that I'll get something practical within a few months, and I'd like to know what features most people would want. Any suggestions are welcome, but I won't promise I'll implement them :-)

Linus (torvalds@kruuna.helsinki.fi)

PS. Yes - it's free of any minix code, and it has a multi-threaded fs. It is NOT portable (uses 386 task switching etc), and it probably never will support anything other than AT-harddisks, as that's all I have :-(.

— Linus Torvalds

[Note that in this announcement Torvalds's OS has no name yet.] This first version of Linux had about 10,000 lines of code. It supported a hard-disk filesystem (but not floppies), BSD network sockets, linking and loading, memory management and protection (but not paging to disk), and process scheduling.

In October 1991 Torvalds wrote, in another post,

I can (well, almost) hear you asking yourselves "why?" [why use Linux -- pld]. Hurd will be out in a year (or two, or next month, who knows),

As of 2019, Hurd exists only in experimental, 32-bit releases.

Earlier, in 1985, Intel released the 386 chip, the first Intel chip with support for virtual memory. Prior single-user Unix systems used other chips, like the Motorola 68000 (also used by the 1984 Macintosh).

Following the release of the 386 processor, various Unix-like commercial operating systems for the 386 began to appear. In 1987, the Santa Cruz Operation (SCO) ported Microsoft's Xenix OS to the 386 (previous x86 versions either used no memory management or used an external MMU chip). In that same year, Microsoft sold Xenix to SCO in exchange for a 25% stake in SCO's stock. In 1989 SCO released SCO Unix. Later SCO acquired Novell UnixWare and sold that as well.

In 1986, Maurice Bach published The Design of the Unix Operating System, describing the internals of AT&T Unix. It made Unix internals comprehensible, and was quite influential.

In 1987, Andrew Tanenbaum released his Minix (for "mini-Unix") operating system, intended for students and hobbyists. It was Unix-like, but at the time had a 16-bit address space. Minix was developed as a companion to Tanenbaum's OS textbook Operating Systems: Design and Implementation. The source code was distributed (on a floppy disk), but modifications were not accepted and redistribution was not allowed. (This changed in 2000, when Minix was released under the Berkeley license). Also, one had to buy Tanenbaum's book to get the code (this was a requirement of the publisher).

Torvalds was clearly very influenced by Minix: he developed Linux using it, and announced Linux to the Minix user community. However, there were deep design differences. Minix used (and, to an extent, introduced) the "microkernel" approach, in which the kernel consisted of a collection of discrete modules. Linux, by comparison, used a so-called "monolithic kernel". (Later, loadable device drivers restored a considerable degree of microkernel-style modularity.)

Torvalds had intended to name his OS "Freax", for free/unix, with maybe some "freak" thrown in. But when he uploaded his code to the Finnish University and Research Network, an administrator changed the name to "Linux".

In 1992, Orest Zborowski ported the X Window System to Linux, giving Linux a GUI.

Also in 1992, Andrew Tanenbaum published his famous criticism of Linux, mostly regarding its design as a monolithic kernel. He also stated, "Writing a new operating system that is closely tied to any particular piece of hardware, especially a weird one like the Intel line, is basically wrong." [In 1992 there was considerable belief that, by 2000, RISC architectures would surpass the 386 in performance and Intel would be dethroned.] Tanenbaum thought that the soon-to-be-released GNU Hurd was a better bet.

In December 1992 Linux was first released under the GNU GPL. An informal previous license had limited Linux to "noncommercial" use.

In the early 1990s, the term "Linux" referred only to the kernel. But soon the word was being used to refer to Torvalds's kernel plus all the GNU utilities. Richard Stallman briefly proposed "lignux", but GNU/Linux won out. (Sort of. Most people just call it Linux.)

In 1993 the Slackware and Debian distributions appeared.

In 1996 Linux v2.0 was released, supporting Symmetric Multi-Processing (SMP), now more commonly thought of as multi-core support. This was a huge step forward in the server world.

Throughout the later 1990s there were multiple competing commercial versions of Unix. Linux was seen by many in the industry as an opportunity to create a standard platform out of, well, chaos. This led to a surge in commercial contributions to Linux.

In 1997 Eric Raymond published his essay The Cathedral and the Bazaar, contrasting GNU (and commercial) software development (the "cathedral") with Linux (the "bazaar"). In the former, source code is made available only via releases; there is no public discussion between releases. In the latter, there is. Raymond's idea was that, with so many people looking at the code, bugs would be found much more readily. This is sometimes summarized as "given enough eyeballs, all bugs are shallow". The essay was later expanded into a book.

In 1998 some of the Microsoft "Halloween documents" (named for the date they were released by Eric Raymond, to whom they were leaked) became available; these documents outlined Microsoft's strategy in competing with Linux. In public, Microsoft was dismissive, and suggested Linux was less reliable. In private, they sowed "fear, uncertainty and doubt". This included announcement of products that did not yet exist, and the stirring up of fears of future incompatibility. The Halloween 1 document suggests that plain FUD was unlikely to succeed.

The first Halloween document, by Vinod Valloppillil, is an analysis by Microsoft of the overall threat from Open Source. Some points:

  1. OSS projects have achieved "commercial quality"
  2. OSS projects have become large-scale and complex
    OSS teams are undertaking projects whose size & complexity had heretofore been the exclusive domain of commercial, economically-organized/motivated development teams.

The second Halloween document was leaked a few days later; it makes it clear that Microsoft understood Linux was a high-quality system.

Also in 1998, IBM began to contribute to Linux. In 2000 IBM announced its official commitment to Linux, founded its own Linux Technology Center, and its source contributions ramped up accordingly. This led to much more robust filesystem code (the Journaling File System, or JFS). IBM may have been the source of Linux SMP support as well.

IBM's embrace of Linux also had a large impact on the business world generally, which began to see Linux as a legitimate option for serious business computing.

In 2000 the Open Source Development Lab was founded; Torvalds then worked there. The OSDL merged with the Free Standards Group in 2007 to form the Linux Foundation.

For some discussion of the finer points of the GNU license as applied to Linux, and specifically whether it applies to kernel modules, see licenses.html#modules.


Get the Facts

In 2004 Microsoft began its "Get the Facts" advertising campaign against Linux. Microsoft questioned Linux's reliability and Total Cost of Ownership (TCO).

In July 2009 Microsoft contributed 22K lines of code to Linux. This signaled the end of the Windows-Linux wars, but was probably not just done for public benefit. Microsoft may have violated the GPL and been forced to release the code.


SCO

AT&T assigned ownership of Unix to their own Unix System Laboratories. At one point USL sued Berkeley for distributing BSD Unix; the BSD group countersued arguing that some of their code had been incorporated into Unix, without copyright attribution. 

In 1993 Novell bought USL, and thus all Unix rights, except that Sun Microsystems had independently acquired their own rights. Two years later Novell sold to SCO the rights to Novell UnixWare and also the Unix licensing rights, but explicitly retained the copyrights.

In 2003 SCO launched their lawsuit against IBM. Various legal theories were proposed at various times, but the most persistent one was that IBM had licensed Unix code from SCO, and that the terms of this license forbade IBM from contributing to Linux, even if the code in question had not come from SCO. Initially, though, SCO did indeed claim that Linux contained Unix code. IBM's Linux contributions included SMP, JFS, and a few other features important for large-scale Linux operation. Throughout the lawsuit, SCO never spelled out exactly what Unix code IBM was accused of having added to Linux.

SCO also announced its intent to sue all Linux users. This probably was intended to apply only to corporate users. SCO was never able to formulate a legal theory under which this could be done, though, and eventually gave up on this part. It was this and similar claims that led companies to be more hesitant about adopting Linux. SCO did sue DaimlerChrysler and AutoZone for their Linux use.

Recall that SCO was 25% owned by Microsoft. This may explain something. It was during this period that Microsoft's anti-Linux "Get the Facts" advertising campaign was launched.

In 2007 a federal District Court ruled that Novell owned the copyrights to Unix.

In 2009, the Tenth Circuit partially reversed.

In 2010 a federal jury decided Novell owned the copyrights to Unix, settling that portion of the case.



Stats

The Linux Foundation's 2017 report is available at linuxfoundation.org/publications/2017-state-of-linux-kernel-development. Some stats:

The report also indicates that, as of 2017, 85% of development is by those who are paid for their work, most often by their companies. This is up from 80% in 2015. It is worth keeping in mind that (a) to some extent these are estimates; it is not always known when a contributor's Linux work is actually done at the request of the employer versus as a side project tolerated by the employer; (b) Linux kernel development requires a significant degree of commitment, and (c) kernel devs who start out as "amateurs" soon get job offers.


Linux Features

In late 1991, Linux 0.11 was released. It supported code sharing between unrelated processes, a reasonably wide range of video (e.g. VGA), and mkfs/fsck/fdisk. It still did not have an init process, for startup. Instead, bash ran at the console prompt on bootup. This was the first version that did not need Minix to install.

In January 1992, Linux 0.12 was released. This had job control and paging to disk (virtual memory). It was the first "popular" system, in the sense that users who were not kernel hackers began installing it.

January 1994: Linux 1.0

June 1996: Linux 2.0 with SMP support (but not very efficient yet)

Now we jump a few years.

2.5.2, January 2002: per-process filesystem namespaces (per-process mount points), USB support

2.5.3, January 2002: new driver system

2.5.6, March 2002: Unicode support, Journaling File System (JFS) contributed by IBM

2.5.11, April 2002: new video framebuffer design, new NTFS driver

2.5.18, May 2002: software suspend -- great for laptops!

2.5.25, July 2002: clock ticks up to 1/1000 sec supported, versus 1/100 sec

2.5.33, August 2002: TCP segmentation offload, SCTP

2.5.42, October 2002: block devices can now be 16 TB on 32-bit systems (2 TB before), 8EB on 64-bit systems, NFS v4.0, improved ext2/ext3 filesystem support

2.5.45, Halloween 2002: kconfig for kernel configuration, IPSEC

2.6.0 (final release December 2003)

2.6.15, Jan 2006: "shared-subtree" code, a precursor to more namespaces

2.6.19, Nov 2006: The ext4 filesystem, considerable work on namespaces

2.6.20, Feb 2007: fault injection, to allow simulated kernel errors (to check error-handling)

2.6.22, July 2007: new Wi-Fi stack; new drivers appear over the course of the next half-dozen releases

2.6.23, October 2007: user namespaces

2.6.24, Dec 2007: process and network namespaces are here! (Well, there were continuing improvements). These namespaces allow for containerization, as in Docker and Kubernetes.

2.6.28, Dec 2008: ext4 is considered stable

2.6.29, March 2009: wireless access point support

2.6.31, Sept 2009: USB 3.0 support

2.6.36, October 2010: AppArmor gets kernel integration (AppArmor is a security framework along the lines of SELinux; the two are alternatives), OOM Killer (killer of out-of-memory processes) is updated

3.0 (July 2011)

3.1, Oct 2011: Improved RAID (bad-block management), NFC & Wiimote support

3.6, Sept 2012: TCP small queues

4.0, April 2015

4.15, January 2018: retpoline ("return trampoline") for Spectre vulnerability

4.17: in-kernel support for TLS connections

See

Containerization

A recent popular theme in OS operations has been containerization: the creation of "virtual OS spaces" in which software modules can run without conflicts with other software in other containers. Docker is the best-known Linux container-management system.

One application of containers is that a container can be delivered with all dependencies (other necessary software) in place. If the host system already has some of these dependencies installed, or has conflicting versions installed, it does not matter, because the container is isolated.

Containerization can be thought of as a "virtual OS copy" running within the existing OS. The virtual OS is the same OS, but appears to be a clean copy.

Linux has taken the lead in the past 15 years in containerization support.

A crucial concept of containerization is namespace isolation.

Modern containers typically, though not necessarily, have their own init process, responsible for managing all other processes in the container. A container can in effect be "rebooted" by restarting its init.

The first step towards containerization was the Unix v7 chroot() system call, introduced in 1979, twelve years before Linux. Once executed by a process, with a specific directory as parameter, that process and its children could only see the parts of the filesystem within that directory. The goal at the time was greater security: a "chrooted" process would simply have no access to the password file or other sensitive files. The same would apply to its children. (Typically, copies of essential libraries would have to be made within the chroot subtree if they were needed by the process or its children.) The chrooted filesystem did not have its own "name", so in that sense it wasn't a "namespace".

One security vulnerability with chroot() was that the chrooted process could use mknod() ("make node", which creates device files) to create a device file (like /dev/sda) representing the entire disk, and then mount that device file. This would allow the chrooted process to break out to the entire filesystem. Later containerization implementations have addressed this, but security remains a concern. For example, access to the mknod() call is often disallowed in containers, but sometimes it is needed.

In 2000, a small hosting provider came up with "FreeBSD jails". Each customer sharing a physical machine would be in their own "jail" (nowadays this would often be achieved with virtual machines). Each separate jail would typically have its own IP address, its own hostname, and its own root user.

FreeBSD jails are still in existence. They can be started at boot time.

The jail concept made it to Linux the following year, 2001, with "VServer". This never made it into the mainstream kernel, however; to enable VServer one had to install kernel patches. [Kernel patches are a huge sink of support time, and they introduce a considerable degree of uncertainty as to whether your system will boot tomorrow.] [Note: there is an entirely separate project known as Linux Virtual Server, which does something entirely different: load balancing. Multiple machines act as one, though they do not necessarily share all state.]

Solaris containers were introduced in 2004.

In 2005 the Linux world saw the arrival of OpenVZ, an update on VServer. OpenVZ was, however, still distributed only via patches; it never made it into the mainstream kernel.

Google introduced process containers in 2006; these made it into the Linux kernel in version 2.6.24, under the name cgroups (control groups). On a basic system, the processes form a tree structure. Parents must wait() for their children, and processes with sufficient privileges can attach debuggers or tracers to other processes, or even kill() other processes.

A process container is essentially a subtree of the entire process tree that is isolated from the rest of the process tree. Processes in the process container cannot see processes outside the container in any way. They cannot send signals. (Sometimes inter-process message passing is allowed.)

At the level of the host system, processes are now kept track of by something like the following:

struct upid {
  int nr;                     // the containerized PID value
  struct pid_namespace *ns;   // namespace where this PID is relevant
  // ...
};

Around the same time, Linux began introducing other namespace-isolation features. For example, chroot() was replaced with mount namespaces. A process (or process group) could get its own set of mount points, and could mount its own version of the root filesystem.

In 2008 Linux got true containers, under the name LXC (LinuX Containers). IBM contributed heavily to this project.

The Warden container-management system was introduced in 2011, initially based on LXC. LMCTFY (Let Me Contain That For You) was an open-source version of Google's container system, released in 2013.

Docker was also introduced in 2013. Kubernetes (from the Greek κυβερνήτης, "ship pilot") came along a couple of years later, as a management layer. Docker itself now supports it.

Linux management

There are three main forks of BSD Unix: OpenBSD, NetBSD and FreeBSD. How did Linux remain free of forks?

One theory is that Linus Torvalds has held it all together by pure force of will. But Torvalds has a reputation as a rather difficult manager, berating contributors over technical issues. See newyorker.com/science/elements/after-years-of-abusive-e-mails-the-creator-of-linux-steps-aside, or else just browse the kernel mailing list.

Here's one example, unedited, from here (in 2012):

On Sun, Dec 23, 2012 at 6:08 AM, Mauro Carvalho Chehab
<mchehab@redhat.com> wrote:
>
> Are you saying that pulseaudio is entering on some weird loop if the
> returned value is not -EINVAL? That seems a bug at pulseaudio.

Mauro, SHUT THE FUCK UP!

It's a bug alright - in the kernel. How long have you been a
maintainer? And you *still* haven't learnt the first rule of kernel
maintenance?

If a change results in user programs breaking, it's a bug in the
kernel. We never EVER blame the user programs. How hard can this be to
understand?

To make matters worse, commit f0ed2ce840b3 is clearly total and utter
CRAP even if it didn't break applications. ENOENT is not a valid error
return from an ioctl. Never has been, never will be. ENOENT means "No
such file and directory", and is for path operations. ioctl's are done
on files that have already been opened, there's no way in hell that
ENOENT would ever be valid.

Aside from Torvalds' use of profanity, note that he makes a very serious point: any kernel change that results in the failure of user programs (that used to work) represents a kernel failure. You can only add new features, not change (or "fix") the behavior of old features.

Another point has to do with the error code ENOENT, which, Torvalds argues, should not be used for ioctl() calls.

Torvalds announced in Sept 2018 that he was taking a break, after five years of increasingly pointed criticism that his approach was discouraging to lots of contributors and particularly to women.

This week people in our community confronted me about my lifetime of
not understanding emotions.  My flippant attacks in emails have been
both unprofessional and uncalled for.  Especially at times when I made
it personal.  In my quest for a better patch, this made sense to me.
I know now this was not OK and I am truly sorry.

Torvalds returned a month later (October 2018) and is back at the helm. The Linux Foundation has adopted a Code of Conduct.

Which leaves the question: could Linux continue to thrive without Torvalds?

One thing Torvalds has done, exceptionally well, is to allow almost all useful contributions to the Linux kernel, but as kernel variants and optional loadable kernel modules: things you can install with a patch, or a module, or a git branch. To get into the mainline kernel, though (and eventually into stable and then longterm), your feature has to have withstood some test of time.

One consequence is that some code languishes for quite a while in the "specialized feature" desert. But the authors know that, to get their code mainlined, it's not Torvalds they have to convince.

Sometimes, modules are proprietary. These are marked as such by the kernel at load time. In 2004, Linuxant tried to set the MODULE_LICENSE string as follows; note the embedded null byte ('\0'), which normally terminates a C string:

MODULE_LICENSE("GPL\0for files in the \"GPL\" directory; for others, only LICENSE file applies");

The goal was to make the kernel think the code was covered by the GPL; GPL-licensed modules do not "taint" the kernel and may use GPL-only kernel symbols, which would make support easier.