History of Linux

The very first release of Linux was announced by Linus Torvalds on August 25, 1991, on the Usenet group comp.os.minix:

Hello everybody out there using minix -

I'm doing a (free) operating system (just a hobby, won't be big and professional like gnu) for 386(486) AT clones. This has been brewing since april, and is starting to get ready. I'd like any feedback on things people like/dislike in minix, as my OS resembles it somewhat (same physical layout of the file-system (due to practical reasons) among other things).

I've currently ported bash(1.08) and gcc(1.40), and things seem to work. This implies that I'll get something practical within a few months, and I'd like to know what features most people would want. Any suggestions are welcome, but I won't promise I'll implement them :-)

Linus (torvalds@kruuna.helsinki.fi)

PS. Yes - it's free of any minix code, and it has a multi-threaded fs. It is NOT portable (uses 386 task switching etc), and it probably never will support anything other than AT-harddisks, as that's all I have :-(.

— Linus Torvalds

[Note that in this announcement Torvalds's OS has no name yet.] This first version of Linux had about 10,000 lines of code. It supported a hard-disk filesystem (but not floppies), BSD network sockets, linking and loading, memory management and protection (but not paging to disk), and process scheduling.

In October 1991 Torvalds wrote, in another post,

I can (well, almost) hear you asking yourselves "why?" [why use Linux -- pld]. Hurd will be out in a year (or two, or next month, who knows),

As of 2018, Hurd is still not out.

Earlier, in 1985, Intel released the 386 chip, the first Intel chip with support for virtual memory. Prior single-user Unix systems used other chips, like the Motorola 68000 (also used by the 1984 Macintosh).

Following the release of the 386 processor, various Unix-like commercial operating systems for the 386 began to appear. In 1987, the Santa Cruz Operation (SCO) ported Microsoft's Xenix OS to the 386 (previous x86 versions either used no memory management or used an external MMU chip). In that same year, Microsoft sold Xenix to SCO in exchange for a 25% stake in SCO's stock. In 1989 SCO released SCO Unix. Later SCO acquired Novell UnixWare and sold that as well.

In 1986, Maurice Bach published The Design of the Unix Operating System, describing the internals of AT&T Unix. It made Unix internals comprehensible, and was quite influential.

In 1987, Andrew Tanenbaum released his Minix (for "mini-Unix") operating system, intended for students and hobbyists. It was Unix-like, but at the time had a 16-bit address space. Minix was developed as a companion to Tanenbaum's OS textbook Operating Systems: Design and Implementation. The source code was distributed (on a floppy disk), but modifications were not accepted and redistribution was not allowed. (This changed in 2000, when Minix was released under the Berkeley license.) Also, one had to buy Tanenbaum's book to get the code (this was a requirement of the publisher).

Torvalds was clearly very influenced by Minix: he developed Linux using it, and announced Linux to the Minix user community. However, there were deep design differences. Minix used (and, to an extent, introduced) the "microkernel" approach, in which the kernel consisted of a collection of discrete modules. Linux, by comparison, used a so-called "monolithic kernel". (Later, loadable device drivers restored some considerable degree of microkernelness.)

Torvalds had intended to name his OS "Freax", for free/unix, with maybe some "freak" thrown in. But when he uploaded his code to the Finnish University and Research Network, an administrator changed the name to "Linux".

In 1992, Orest Zborowski ported the X Window System to Linux, giving Linux a GUI.

Also in 1992, Andrew Tanenbaum published his famous criticism of Linux, mostly regarding its design as a monolithic kernel. He also stated, "Writing a new operating system that is closely tied to any particular piece of hardware, especially a weird one like the Intel line, is basically wrong." [In 1992 there was considerable belief that, by 2000, RISC architectures would surpass the 386 in performance and Intel would be dethroned.] Tanenbaum thought that the soon-to-be-released GNU Hurd was a better bet.

In December 1992 Linux was first released under the GNU GPL. An informal previous license had limited Linux to "noncommercial" use.

In the early 1990s, the term "Linux" referred only to the kernel. But soon the word was being used to refer to Torvalds's kernel plus all the GNU utilities. Richard Stallman proposed "lignux" briefly, but GNU/Linux won out.

In 1993 the Slackware and Debian distributions appeared.

In 1996 Linux v2.0 was released, supporting Symmetric Multi-Processing (SMP), now more commonly thought of as multi-core support. This was a huge step forward in the server world.

Throughout the later 1990s there were multiple competing commercial versions of Unix. Linux was seen by many in the industry as an opportunity to create a standard platform out of, well, chaos. This led to a surge in commercial contributions to Linux.

In 1997 Eric Raymond published his essay The Cathedral and the Bazaar, contrasting GNU (and commercial) software development (the "cathedral") with Linux (the "bazaar"). In the former, source code is made available only via releases; there is no public discussion between releases. In the latter, there is. Raymond's idea was that, with so many people looking at the code, bugs would be found much more readily. This is sometimes summarized as "given enough eyeballs, all bugs are shallow". The essay was later expanded into a book.

In 1998 some of the Microsoft "Halloween documents" (named for the date they were released by Eric Raymond, to whom they were leaked) became available; these documents outlined Microsoft's strategy in competing with Linux. In public, Microsoft was dismissive, and suggested Linux was less reliable. In private, they sowed "fear, uncertainty and doubt". This included announcement of products that did not yet exist, and the stirring up of fears of future incompatibility. The Halloween 1 document suggests that plain FUD was unlikely to succeed.

The first Halloween document, by Vinod Valloppillil, is an analysis by Microsoft of the overall threat from Open Source. Some points:

  1. OSS projects have achieved "commercial quality"
  2. OSS projects have become large-scale and complex
    OSS teams are undertaking projects whose size & complexity had heretofore been the exclusive domain of commercial, economically-organized/motivated development teams.

The second Halloween document was leaked a few days later; it makes it clear that Microsoft understood Linux was a high-quality system.

Also in 1998, IBM began to contribute to Linux. In 2000 IBM announced its official commitment to Linux and founded its own Linux Technology Center, and its source contributions ramped up accordingly. This led to much more robust filesystem code (the Journaling File System, or JFS). IBM may have been the source of Linux SMP support as well.

IBM's embrace of Linux also had a large impact on the business world generally, which began to see Linux as a legitimate option for serious business computing.

In 2000 the Open Source Development Lab was founded; Torvalds then worked there. The OSDL merged with the Free Standards Group in 2007 to form the Linux Foundation.


Get the Facts

In 2004 Microsoft began its "Get the Facts" advertising campaign against Linux. Microsoft questioned Linux's reliability and Total Cost of Ownership (TCO).

In July 2009 Microsoft contributed 22K lines of code to Linux. This signaled the end of the Windows-Linux wars, but was probably not just done for public benefit. Microsoft may have violated the GPL and been forced to release the code.


SCO

AT&T assigned ownership of Unix to their own Unix System Laboratories. At one point USL sued Berkeley for distributing BSD Unix; the BSD group countersued arguing that some of their code had been incorporated into Unix, without copyright attribution. 

In 1993 Novell bought USL, and thus all Unix rights, except that Sun Microsystems had independently acquired their own rights. Two years later Novell sold to SCO the rights to Novell UnixWare and also the Unix licensing rights, but explicitly retained the copyrights.

In 2003 SCO launched their lawsuit against IBM. Various legal theories were proposed at various times, but the most persistent one was that IBM had licensed Unix code from SCO, and that the terms of this license forbade IBM from contributing to Linux, even if the code in question had not come from SCO. Initially, though, SCO did indeed claim that Linux contained Unix code. IBM's Linux contributions included SMP, JFS, and a few more important features for large-scale Linux operation. Throughout the lawsuit, SCO never spelled out exactly what Unix code IBM was accused of having added to Linux.

SCO also announced its intent to sue all Linux users. This probably was intended to apply only to corporate users. SCO was never able to formulate a legal theory under which this could be done, though, and eventually gave up on this part. It was this and similar claims that led companies to be more hesitant about adopting Linux. SCO did sue DaimlerChrysler and AutoZone for their Linux use.

Recall that SCO was 25% owned by Microsoft. This may explain something. It was during this period that Microsoft's anti-Linux "Get the Facts" advertising campaign was launched.

In 2007 a federal District Court ruled that Novell owned the copyrights to Unix.

In 2009, the Tenth Circuit partially reversed.

In 2010 a federal jury decided Novell owned the copyrights to Unix, settling that portion of the case.



Stats

The Linux Foundation's 2017 report is available at linuxfoundation.org/publications/2017-state-of-linux-kernel-development. Some stats:

The report also indicates that, as of 2017, 85% of development is by those who are paid for their work, most often by their companies. This is up from 80% in 2015. It is worth keeping in mind that (a) to some extent these are estimates; it is not always known when a contributor's Linux work is actually done at the request of the employer versus as a side project tolerated by the employer; (b) Linux kernel development requires a significant degree of commitment, and (c) kernel devs who start out as "amateurs" soon get job offers.


Linux Features

In late 1991, Linux 0.11 was released. It supported code sharing between unrelated processes, a reasonably wide range of video hardware (e.g. VGA), and mkfs/fsck/fdisk. It still did not have an init process for startup. Instead, bash ran at the console prompt on bootup. This was the first version that did not need Minix to install.

In January 1992, Linux 0.12 was released. This had job control and paging to disk (virtual memory). It was the first "popular" system, in the sense that users who were not kernel hackers began installing it.

January 1994: Linux 1.0

January 1996: Linux 2.0 with SMP support (but not very efficient yet)

Now we jump a few years.

2.5.2, January 2002: per-process filesystem namespaces (per-process mount points), USB support

2.5.3, January 2002: new driver system

2.5.6, March 2002: Unicode support, Journaling File System (JFS) contributed by IBM

2.5.11, April 2002: new video framebuffer design, new NTFS driver

2.5.18, May 2002: software suspend -- great for laptops!

2.5.25, July 2002: clock ticks up to 1/1000 sec supported, versus 1/100 sec

2.5.33, August 2002: TCP segmentation offload, SCTP

2.5.42, October 2002: block devices can now be 16 TB on 32-bit systems (2 TB before), 8EB on 64-bit systems, NFS v4.0, improved ext2/ext3 filesystem support

2.5.45, Halloween 2002: kconfig for kernel configuration, IPSEC

2.6.0 (final release December 2003):

2.6.15, Jan 2006: "shared-subtree" code, a precursor to more namespaces

2.6.19, Nov 2006: The ext4 filesystem, considerable work on namespaces

2.6.20, Feb 2007: fault injection, to allow simulated kernel errors (to check error-handling)

2.6.22, July 2007: new Wi-Fi stack; new drivers appear over the course of the next half-dozen releases

2.6.23, October 2007: user namespaces

2.6.24, Dec 2007: process and network namespaces are here! (Well, there were continuing improvements). These namespaces allow for containerization, as in Docker and Kubernetes.

2.6.28, Dec 2008: ext4 is considered stable

2.6.29, March 2009: wireless access point support

2.6.31, Sept 2009: USB 3.0 support

2.6.36, October 2010: AppArmor gets kernel integration (AppArmor is a Linux Security Module, and an alternative to SELinux), OOM Killer (which kills processes when the system runs out of memory) is updated

3.0 (July 2011)

3.1, Oct 2011: Improved RAID (bad-block management), NFC & Wiimote support

3.6, Sept 2012: TCP small queues

4.0, April 2015

4.15, January 2018: retpoline ("return trampoline") for Spectre vulnerability

4.17: in-kernel support for TLS connections


Containerization

A recent popular theme in OS operations has been containerization: the creation of "virtual OS spaces" in which software modules can run without conflicts with other software in other containers. Docker is the best-known Linux container-management system.

One application of containers is that a container can be delivered with all dependencies (other necessary software) in place. If the host system already has some of these dependencies installed, or has conflicting versions installed, it does not matter, because the container is isolated.

Containerization can be thought of as a "virtual OS copy" running within the existing OS. The virtual OS is the same OS, but appears to be a clean copy.

Linux has taken the lead in the past 15 years in containerization support.

A crucial concept of containerization is namespace isolation:

Modern containers typically, though not necessarily, have their own init() process, responsible for management of all other processes. A container can in effect be "rebooted" by restarting init().

The first step towards containerization was the Unix v7 chroot() system call, introduced in 1979 (12 BLE). Once executed by a process, with a specific directory as parameter, that process and its children could only see the parts of the filesystem within that directory. The goal at the time was greater security: a "chrooted" process would simply have no access to the password file or other sensitive files. The same would apply to its children. (Typically, copies of essential libraries would have to be made within the chroot subtree if they were needed by the process or its children.) The chrooted filesystem did not have its own "name", so in that sense it wasn't a "namespace".
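As a rough sketch of how a process might confine itself this way, consider the C fragment below. The function name enter_chroot is made up for illustration; chroot() and chdir() are the standard Unix calls. The sketch assumes a Unix-like system, and the call will normally fail unless the process is running as root.

```c
#define _DEFAULT_SOURCE   /* exposes chroot() in <unistd.h> on glibc */
#include <assert.h>
#include <unistd.h>

/*
 * Confine the calling process (and its future children) to the directory
 * tree rooted at dir. Returns 0 on success, -1 on failure, with errno set
 * by the failing call. enter_chroot is a hypothetical helper name.
 */
int enter_chroot(const char *dir)
{
    assert(dir != NULL);

    /* Change the meaning of "/" for this process. */
    if (chroot(dir) != 0)
        return -1;

    /* chroot() leaves the working directory alone, so move inside the
       new root; otherwise "." could still refer to the old tree. */
    if (chdir("/") != 0)
        return -1;

    return 0;
}
```

A server would typically call something like this early in startup, and then give up root privileges before handling any requests.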

One security vulnerability with chroot() was that the chrooted process could use mknod() ("make node", the call that creates device files) to create a device file (like /dev/sda) representing the entire disk, and then mount that device file. This would allow the chrooted process to break out to the entire filesystem. This has been addressed in later containerization implementations, but security remains a concern. For example, access to the mknod() call is often disallowed in containers, but sometimes it is needed.

In 2000, a small hosting provider came up with "FreeBSD jails". Each customer sharing a physical machine would be in their own "jail" (nowadays this would often be achieved with virtual machines). Each separate jail would typically have its own

FreeBSD jails are still in existence. They can be started at boot time.

The jail concept made it to Linux the following year, 2001, with "VServer". This never made it into the mainstream kernel, however; to enable VServer one had to install kernel patches. [Kernel patches are a huge sink of support time, and they introduce a considerable degree of uncertainty as to whether your system will boot tomorrow.] [Note: there is an entirely separate project known as Linux Virtual Server, which does something entirely different: load balancing. Multiple machines act as one, though they do not necessarily share all state.]

Solaris containers were introduced in 2004.

In 2005 the Linux world saw the arrival of "OpenVZ", an update on VServer. OpenVZ was, however, still distributed only via patches; it never made it into the mainstream kernel.

Google introduced process containers in 2006; these made it into the Linux kernel in version 2.6.24, under the name cgroups (control groups). On a basic system, the processes form a tree structure. Parents must wait() for their children, and processes with sufficient privileges can attach debuggers or tracers to other processes, or even kill() other processes.

A process container is essentially a subtree of the entire process tree that is isolated from the rest of the process tree. Processes in the process container cannot see processes outside the container in any way. They cannot send signals. (Sometimes inter-process message passing is allowed.)

At the level of the host system, processes are now kept track of by something like the following:

struct upid {
  int nr;                     // the containerized PID value
  struct pid_namespace *ns;   // namespace where this PID is relevant
  // ...
};
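To make the role of struct upid a little more concrete, here is a toy userspace model of the idea, not the kernel's actual code. It assumes that each process carries one (PID, namespace) pair for every namespace level from the host down to its own container; the names MAX_NS_DEPTH, struct pid as laid out here, and pid_nr_in_ns are chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NS_DEPTH 4                  /* arbitrary limit for this toy model */

struct pid_namespace {
    int level;                          /* 0 = host namespace */
    struct pid_namespace *parent;       /* NULL for the host namespace */
};

struct upid {
    int nr;                             /* PID value as seen inside ns */
    struct pid_namespace *ns;           /* namespace where nr is valid */
};

/* One process: one upid per level, from the host down to its own namespace. */
struct pid {
    int level;                          /* depth of the process's own namespace */
    struct upid numbers[MAX_NS_DEPTH];  /* numbers[i] is the PID at level i */
};

/* Return the PID of p as seen from ns, or 0 if p is invisible from ns. */
int pid_nr_in_ns(const struct pid *p, const struct pid_namespace *ns)
{
    assert(p != NULL && ns != NULL);
    if (ns->level > p->level)
        return 0;                       /* ns is nested deeper than p */
    const struct upid *u = &p->numbers[ns->level];
    return (u->ns == ns) ? u->nr : 0;   /* 0 if p lives in a sibling namespace */
}
```

In this model a container's init process might be PID 1 inside the container and some ordinary large PID on the host, while a process on the host simply has no PID at all from the container's point of view.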

Around the same time, Linux began introducing other namespace-isolation features. For example, chroot() was replaced with mount namespaces. A process (or process group) could get its own set of mount points, and could mount its own version of the root filesystem.
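A minimal C sketch of a process moving into its own mount namespace might look like the following. It is Linux-specific and assumes /proc is mounted; the helper names mount_ns_inode and enter_private_mount_ns are made up here, while unshare(), mount() and stat() are the real calls. Without sufficient privileges the unshare() call is expected to fail, in which case the function simply reports -1.

```c
#define _GNU_SOURCE           /* exposes unshare() and CLONE_NEWNS on glibc */
#include <assert.h>
#include <sched.h>
#include <sys/mount.h>
#include <sys/stat.h>

/* Inode number identifying the caller's mount namespace,
   or 0 if it cannot be determined (for example, no /proc). */
unsigned long mount_ns_inode(void)
{
    struct stat st;
    if (stat("/proc/self/ns/mnt", &st) != 0)
        return 0;
    return (unsigned long)st.st_ino;
}

/*
 * Give the calling process its own copy of the mount table.
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
int enter_private_mount_ns(void)
{
    unsigned long before = mount_ns_inode();

    if (unshare(CLONE_NEWNS) != 0)
        return -1;

    /* Mark every mount private so that later mount and unmount
       operations do not propagate back to the original namespace. */
    if (mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL) != 0)
        return -1;

    /* Sanity check: the namespace identity should have changed. */
    assert(before == 0 || mount_ns_inode() != before);
    return 0;
}
```

After a successful call, anything the process mounts (including a different root filesystem) is visible only to it and its descendants.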

In 2008 Linux got true containers, under the name LXC (LinuX Containers). IBM contributed heavily to this project.

The Warden container-management system was introduced in 2011, initially based on LXC. LMCTFY (Let Me Contain That For You) was an open-source version of Google's container system, released in 2013.

Docker was also introduced in 2013. Kubernetes (from the Greek κυβερνήτης for "ship pilot") came along a couple years later, as a management layer. It is now supported by Docker.