Open Source Computing

Week 3

Developers spend most of their time figuring the system out

From blog.feenk.com/developers-spend-most-of-their-time-figuri-7aj1ocjhe765vvlln8qqbuhto. They have a nice diagram. What most open-source contributors really do is "maintenance", and maintenance is hard. The Feenk blog suggests it represents two-thirds of the work.

Teams and Reports

I have now sent out emails to each team.

I would like weekly progress reports, generally to be emailed to me on Fridays. Initially your reports will be about selecting a project. I do not need a lot of detail, but I do need to make sure everyone is keeping up. If your report just says "we got nothing done this week", that's fine.

Pick one person to send these. The report should be cc:ed to all team members.

If you have a team issue, let me know.


Elastic

In Week 1 I talked about this: elastic.co/blog/why-license-change-AWS

The next day was this announcement: aws.amazon.com/blogs/opensource/stepping-up-for-a-truly-open-source-elasticsearch.

It's still riling up the Open Source community. See the following position by crate.io: crate.io/a/cratedb-doubling-down-on-permissive-licensing-and-the-elasticsearch-lockdown. Crate is really annoyed about this switch from the Apache license to GPL-style copyleft (the SSPL is built on the text of the AGPL):

We would never have chosen Elasticsearch in the first place, had it been licensed under the GPL as some of our customers ... banned GPL licensed software from their application stacks for legal risks.

These legal "risks" are sometimes overblown, but there is an issue: if there's an off-the-wall court decision, or a mistake on the part of an employee, a part of a company's codebase might have to be made public. There is not likely to be an option simply to stop distributing the offending code: the damage is done.

A more serious potential issue here is that Elasticsearch plugins were added to CrateDB. If you build a plugin using a GPL package, and add that plugin to something else, is that second project brought under the GPL? This is somewhat related to the Linux kernel-module issue, as kernel modules are a kind of plugin.

The GPL is not really clear about plugins. At www.gnu.org/licenses/gpl-faq.en.html#GPLPlugins they suggest the dispositive issue, assuming the plugin runs as a separate process, is how the plugin communicates with the main program. "Complex" communication, or "shipping complex data structures back and forth", would bring the plugin under the scope of the GPL. And if the plugin is simply linked to the main program, the GPL FAQ says this:

If the main program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single combined program, which must be treated as an extension of both the main program and the plug-ins. If the main program dynamically links plug-ins, but the communication between them is limited to invoking the ‘main’ function of the plug-in with some options and waiting for it to return, that is a borderline case.
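To make the two ends of that spectrum concrete, here is a minimal Python sketch of the same toy "plugin" integrated two ways. Everything in it (the word-count task and the names plugin_tokenize, run_in_process, run_as_separate_process and PLUGIN_SOURCE) is hypothetical and invented for illustration; it is not taken from Elasticsearch, CrateDB, or the GPL FAQ. It only shows the technical difference between sharing data structures in one process and invoking a separate program with options. Where the legal line falls between them is exactly what the FAQ leaves open.

```python
import json
import subprocess
import sys


# Style 1: in-process plugin (hypothetical). The plugin function receives the
# host's own dictionary and mutates it directly, so host and plugin share a
# data structure and call each other inside one address space.
def plugin_tokenize(index: dict, text: str) -> None:
    for word in text.lower().split():
        index[word] = index.get(word, 0) + 1


def run_in_process(text: str) -> dict:
    index: dict = {}              # host-owned data structure
    plugin_tokenize(index, text)  # direct function call into the plugin
    return index


# Style 2: the same logic packaged as a standalone program (hypothetical).
# It reads one command-line argument and prints its result to stdout.
PLUGIN_SOURCE = """
import json, sys
counts = {}
for word in sys.argv[1].lower().split():
    counts[word] = counts.get(word, 0) + 1
print(json.dumps(counts))
"""


def run_as_separate_process(text: str) -> dict:
    # The host starts the plugin with some options and waits for it to
    # return; nothing is shared in memory between the two programs.
    result = subprocess.run(
        [sys.executable, "-c", PLUGIN_SOURCE, text],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    sample = "open source Open"
    print(run_in_process(sample))
    print(run_as_separate_process(sample))
```

Both calls compute the same word counts. The first looks like the FAQ's "function calls to each other and share data structures" case; the second is closer to the "invoke with some options and wait for it to return" case.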

The usual question is what happens when the main program is covered by the GPL and the plugin is not; that is, can you build proprietary plugins for GPL programs? The GNU position certainly suggests not. But Crate is more worried about the reverse direction: can you write GPL plugins for use with a proprietary package without bringing that package under the GPL? A GPL claim on the main program is a little less likely here, on the theory that the main program is independent of the plugin and can't have its license altered by an optional component. But that theory is somewhat undermined if Crate makes both the proprietary product (CrateDB) and the Elasticsearch plugin.

(Another perspective on the plugin-for-a-GPL-package issue is whether, in writing the plugin, you needed any access to the source code of the main package. If all you need is the interface specification, then perhaps not.)

The Crate blog also says this:

Also, raising venture capital (and a lot is needed to build a database from scratch) is very difficult with copyleft code, since it simply reduces possible future opportunities.

The GPL strategy certainly worked for MySQL. But MySQL was started in the last century.

The bottom line is that Crate will be working with a fork of Elasticsearch in the future.

Clang

Why did Apple create the Clang compiler, and switch from gcc?

In 1989 NeXT added support for Objective-C to gcc, and apparently planned to distribute only the binaries; the FSF maintained that the GPL did not allow this, and NeXT ended up releasing the source. But this isn't the whole story: Clang is a front end that is part of the LLVM open-source compiler project, which is licensed under the Apache 2.0 license (with LLVM exceptions).

Even that isn't the whole story: while at UIUC, Chris Lattner did major development work on the LLVM compiler collection, and wrote his PhD thesis about it. After he got his PhD, Apple hired him to turn LLVM from a research compiler to a robust production compiler. And the source is still open.

One issue is that, back when gcc was first developed, compilers were strictly black boxes that converted your source code to object code. But this is no longer really true: most IDEs have extensive hooks into their compiler. This way they can show compiler error messages tied to line numbers, and show syntax errors before compilation (because the parser runs on your source as you type). Clang also supports code-analysis plugins; under the Apache license, such plugins can remain proprietary. Could the plugin issue be the real reason for Clang? Clang also has internal structural features that make it easier to tie late-compilation and even run-time issues back to a specific source location.
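As a rough illustration of what "the parser runs on your source as you type" means, here is a small Python sketch that hands an editor buffer to a parser as a library call and gets back a source location for the first syntax error. This uses Python's own standard-library parser (the ast module) as a stand-in; it is not Clang's interface, and the function name check_syntax is made up for this example.

```python
import ast


def check_syntax(source: str):
    """Return (line, column, message) for the first syntax error, or None.

    The source is only parsed, never compiled to a file or executed, which
    is roughly what an editor does when it underlines errors while you type.
    """
    try:
        ast.parse(source)
    except SyntaxError as err:
        return (err.lineno, err.offset, err.msg)
    return None


if __name__ == "__main__":
    good_buffer = "a = 1\nb = 2\n"
    bad_buffer = "a = 1\nb = = 2\n"   # stray '=' on line 2
    print(check_syntax(good_buffer))
    print(check_syntax(bad_buffer))
```

The point is only that the parser is callable as a component and reports positions, rather than being sealed inside a black-box compiler. The exact wording of the error message varies between Python versions.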


More on Licensing