Social and Communication Issues

Managing Open-Source Projects

Reading: Producing Open-Source Software (local copy), by Karl Fogel. Also at producingoss.com/en/producingoss-letter.pdf.

Chapter 4: Social and Political Infrastructure
Chapter 6: Communications

See also social.pdf, communication.pdf and toxic.pdf, from Matt Butcher

Managing volunteer labor, as in open-source projects, is a little different from managing employees, though many of the same ideas still apply. An employee manager can use salary to motivate workers, and can point to the company's bottom line as the ultimate indicator of success. That said, a manager of employees must still resolve disputes, must still determine project goals, and must still find ways to motivate workers in between annual salary reviews.

A 2019 study at Baylor University by Quade et al. concluded that motivating workers through the company's bottom line may backfire: "Supervisors driven by profits could actually be hurting their coveted bottom lines by losing the respect of their employees, who counter by withholding performance". So even for-profit managers often have to use the same motivational strategies as one might use with volunteer devs.

Here is a good summary of a large number of potential open-source management issues: github.com/PayDevs/awful-oss-incidents. Some of these relate also to open-source security or to monetization.

Who will be in charge of the open-source project? Who will decide what pull requests to commit? Whoever it is does not get to decide the agenda (unless there are paid employees involved); that is set by the contributors. If no one wants to work on feature A, then it doesn't get done (unless the manager can convince people that feature A, despite initial appearances, really is important).

As an example, suppose the Free Software Foundation created a browser (besides wget). If someone contributed code to allow the playing of DRM-protected content within that browser, the FSF would almost certainly reject it. But who would actually make that decision?

For many projects, there is a top-level "board" of sorts; a group of "founders" who make decisions like this. Governance is usually a meritocracy: maybe only contributors get to vote?

For other projects, one person is appointed benevolent dictator. A BD is more like an arbitrator, or judge, than a decision-maker; generally, lower-level participants have worked out the pros and cons of each side, before the BD steps in. A good BD need not be a great programmer, but must have a strong vision for the project.

Social.pdf has several slides on the BD role.

An individual or group that sets up an open-source project can set up governance however they wish. However, if contributors are dissatisfied, they are likely to fork. Forking is bad: it gives outsiders the impression of dissension, and it forces outsiders to choose which version to install. A large number of them will choose neither, because it's simpler.

Lessig's book "Code" contains a fair amount of material on project governance.

Some BDs:

rms (some people dispute the B)
Linus
Guido van Rossum, developer of Python

What's with Python 2 v Python 3?

The IETF and consensus:
(Ok, the IETF isn't exactly about Open Source, but the governance is similar.)

Dave Clark:

We reject: kings, presidents and voting.

We believe in: rough consensus and running code.

Jon Postel:

Be liberal in what you accept, and conservative in what you send.

Postel and crypto: is there a real conflict here?

What if consensus cannot be reached? See ietf64.
Voting! But who gets a vote?

Communications

POSS chapter 6, Butcher communication.pdf

Asynchronous v synchronous
IRC vs Slack

Archived email is ubiquitous within the IETF. One common problem is that some contributors may not know where the archive is. Slack is another approach.

Most open-source projects show a strong preference for asynchronous communication: no video chat, for example.

That said, the IETF actually has regular f2f meetings, at which IETF workgroups may meet as well. One advantage of f2f meetings is that they require considerable investment to attend; this eliminates dilettantes. But that kind of money may not be an option in an open-source project.

Rules for email (communication.pdf)

Avoid html (why?)
Pay serious attention to punctuation and grammar
Avoid ambiguity. Consider supplying a second, alternative wording.
Use good visual formatting: indentation and blank lines
Be aware of the 80-character limit
Quote when responding. Inline responses are best. Delete unnecessary parts of the quote!
Put your response after the quoted material you are responding to.
The subject: line will become a thread key. Change it when appropriate, but not otherwise. See POSS, Productive vs Unproductive Threads
Pay attention to where you're sending the message. Sending to the list alias is usually correct. Sometimes also cc: those you're replying to.

This is actually a hard problem. See The Great Reply-to: Debate on POSS p 48. Should the mailing list set the reply-to: header to be the email address of the list? Unfortunately, many people set that themselves, and won't get email otherwise. Also, it breaks the idea that reply-to-author sends only to the author. The IETF likes to have the list be a cc: address.

Generally include only one major point per email

Avoid hyperbole (but don't be elliptical either)
Think before you send!
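The Reply-to: question in the list above can be made concrete with a small sketch. It uses Python's standard email package; the list address dev@example.org, the sender addresses, and all helper names are hypothetical, and real mailing-list software differs in the details. The point is only to show what is lost when the list overwrites an author's own Reply-to:, compared with replying to the author and adding the list as a cc: address.

```python
from email.message import EmailMessage

LIST_ADDR = "dev@example.org"  # hypothetical mailing-list address

def make_post(author, reply_to=None):
    """Build a message as it arrives at the list (all addresses made up)."""
    msg = EmailMessage()
    msg["From"] = author
    msg["To"] = LIST_ADDR
    msg["Subject"] = "IPv6 prefix lengths"
    if reply_to is not None:
        msg["Reply-To"] = reply_to  # set by the author, e.g. a home address
    msg.set_content("I'm in favor of longer prefixes.")
    return msg

def munge_reply_to(msg):
    """Policy 1: the list overwrites Reply-To with its own address."""
    del msg["Reply-To"]  # no error if the header is absent
    msg["Reply-To"] = LIST_ADDR
    return msg

def plain_reply_target(msg):
    """Where an ordinary 'reply' goes: Reply-To if present, else From."""
    return str(msg["Reply-To"] or msg["From"])

def reply_with_list_cc(msg, replier):
    """Policy 2: reply to the author, with the list as a cc: address."""
    reply = EmailMessage()
    reply["From"] = replier
    reply["To"] = plain_reply_target(msg)
    reply["Cc"] = LIST_ADDR
    reply["Subject"] = "Re: " + str(msg["Subject"])
    reply.set_content("...")
    return reply

post = make_post("alice@work.example", reply_to="alice@home.example")
before = plain_reply_target(post)              # the author's own choice
cc_reply = reply_with_list_cc(post, "bob@example.net")
after = plain_reply_target(munge_reply_to(post))  # the author's choice is gone
print(before, "->", after)
```

Under the first policy a plain reply reaches the list but the author's preferred address disappears; under the second, both the author and the list receive the reply.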

Here's another discussion of email rules: useplaintext.email. The title refers to plaintext email; that is, not html (see the first item above), but there's a good discussion under "Etiquette recommendations for plaintext emails" about the drawbacks of "top posting"; that is, putting your reply at the top of the email, and including the quoted previous email below that. This is very irritating on mailing lists, as it's hard to follow what is being said. The right way to quote is "follow posting"; that is, to have your answer below the quoted part of the previous message that you are responding to. Like this:

> I'm in favor of allowing IPv6 prefixes to be arbitrarily long.
> Fixing the limit at 64 bits will bring about address-space exhaustion.

Lots of software out there already assumes a 64-bit prefix. There are 2^64 of those;
we won't run out of address space anytime soon. And one risk of allowing longer address
prefixes is that it will break mechanisms for changing the low-order host bits at
regular intervals in order to enhance privacy.

The site puts it this way:

A: Because it reverses the logical flow of conversation.
Q: Why is top posting frowned upon?
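Mail clients normally do the quoting and wrapping for you, but the conventions above are simple enough to sketch. The following is a minimal illustration using Python's standard textwrap module; the function names and the 72-column default (chosen to stay inside the 80-character limit once a reply is itself quoted) are assumptions for this example, not part of any mail standard.

```python
import textwrap

def quote(text):
    """Prefix every line of the material being answered with '> '."""
    return "\n".join("> " + line if line else ">" for line in text.splitlines())

def bottom_post(quoted, reply, width=72):
    """Put the trimmed quote first and the wrapped reply after it."""
    wrapped = textwrap.fill(reply, width=width)
    return quote(quoted) + "\n\n" + wrapped + "\n"

original = "I'm in favor of allowing IPv6 prefixes to be arbitrarily long."
answer = (
    "Lots of software out there already assumes a 64-bit prefix. "
    "There are 2^64 of those; we won't run out of address space anytime soon."
)

message = bottom_post(original, answer)
print(message)
```

Only the part of the earlier message that is actually being answered should be passed in as the quote; trimming the rest is still the writer's job.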

Fogel strongly recommends that all discussion take place on the mailing list; that is, avoid private email exchanges "to work things out". The problem with taking a conversation private is that some other participants, who might have been very interested despite not having said anything earlier, are now cut out. And they may not buy into the new compromise. In fact, other participants might have been deliberately avoiding replying to a "stupid" comment, out of courtesy; such people may not be happy seeing that commentator's suggestion incorporated into a compromise. You can never tell who is interested in a given thread!

That said, private discussions are often essential for dealing with security vulnerabilities; if you tell the list, then you've told the world. And if the patch isn't ready yet, that's a problem.

Most sites make the email archive available via a web interface. This is an excellent idea.

There's also IRC and Slack. The problem with Slack is that it is not open-source, and the name is supposedly an acronym for "Searchable Log of All Conversation and Knowledge". In other words, a tool for spying on you. On the other hand, IRC never quite grew up.

Messages to the mailing list should be clearly written. Generally one major point per email is appropriate; if you have two points, consider two emails, with two different Subjects. All posts should have a purpose, and a project-related one at that. Emails should also be reasonably well formatted. To put it another way,

You are what you write.

The project leader, or leaders, should work to set an appropriate tone. It has become popular for open-source projects to have an explicit Code of Conduct addressing how people should behave. An example is at documentfoundation.org/foundation/code-of-conduct.

Rudeness can be a very serious problem! Nip Rudeness in the bud (POSS p 28)

Draw attention to rudeness early, but move on promptly
Never retaliate with more rudeness, or one-upmanship
Be alert for techniques:

filibusters
obstructionism
the shape of the table (during the Viet Nam war, the peace negotiators spent a lot of time on this)
Anything in svn.cacert.org/CAcert/CAcert_Inc/Board/oss/oss_sabotage.html

Communication.pdf (Butcher) has some examples.

See also ietf64.

And don't forget lkml.org.

Speaking of which, you might check out lkml.org/lkml/2017/10/26/511, from Oct 26, 2017, written by Linus himself:

Stop this f*cking idiocy already!

As far as the kernel is concerned, a regressions is THE KERNEL NOT
GIVING THE SAME END RESULT WITH THE SAME USER SPACE.

The regression was in the kernel. You trying to shift the regressions
somewhere  else is bogus SHIT.

And seriously, it's the kind of garbage that makes me think your
opinion and your code cannot be relied on.

If you are not willing to admit that your commit 651e28c5537a
("apparmor: add base infastructure for socket mediation") caused a
regression, then honestly, I don't want to get commits from you.

This is pretty much exactly the wrong tone to take as project leader. That said, this developer was trying to weasel out of blame by claiming that the change did not lead to a "kernel" regression.

A year later (September 2018), Torvalds took some time off as leader in order to try to get a handle on communications that were often seen as abusive. "I need to change some of my behavior, and I want to apologize to the people that my personal behavior hurt and possibly drove away from kernel development", he wrote. He returned to his role as Linux BD a month later. The Linux Foundation added an official code of conduct: linuxfoundation.org/code-of-conduct.

Avoid lengthy (and silly) disclaimers, such as the one on p 122 of POSS. If your workplace requires a disclaimer, consider using a private email account. If someone posts to the list and uses a long disclaimer, privately suggest that it isn't necessary.

Be aware that sometimes a small minority can be very noisy on the email list. That does not mean they are right.

Avoid bashing competing open-source projects. If there's a fork, your project may become one of them.

Avoid bikeshedding, that is, spending way too much discussion time on the small, unimportant details (though this is not limited to open source). See POSS, The Smaller the Topic, the Longer the Debate.

Remember: in Open Source, you cannot just cut off a debate by managerial fiat.

Avoid Holy Wars. Typically these start about

Computer languages
Licenses
Expert v novice UI
Little-endian v big-endian

The best way to damp these down is to have the project leader call for peace, and point out that the discussion is getting into issues that are either irrelevant or settled.

Dealing with difficult people and "toxic contributors":

toxic.pdf (notes from Matt Butcher)

Work out ahead of time a policy on the release of security vulnerabilities. Once you release a patch, the cat is out of the bag, because the vulnerability will be evident from the code change. You may need to have a private discussion with trusted contributors in order to prepare a fix. This is one time private discussions are reasonable.

Finally, and related to Torvalds' language above, sometimes devs put profanity in code (generally in the comments, and perhaps especially in test-suite code). Sometimes this really bothers other people. Should you have a rule?

One theory is that profanity in code reflects the stress that the developers are under, and so should be grudgingly tolerated.

Jan Strehmel, in a thesis submitted to the Karlsruhe Institute of Technology, found a modest but statistically significant link between the presence of profanity and higher code quality. Strehmel used the SoftWipe package to measure code adherence to quality standards:

SoftWipe is an open source tool and benchmark to assess, rate, and review scientific software written in C or C++ with respect to coding standard adherence. The coding standard adherence is assessed using a set of static and dynamic code analysers such as Lizard (https://github.com/terryyin/lizard) or the Clang address sanitiser (https://clang.llvm.org/). It returns a score between 0 (low adherence) and 10 (good adherence).

The thesis is at cme.h-its.org/exelixis/pubs/JanThesis.pdf.

Basics and Background

Producing OSS, by Karl Fogel

1. Intro

You may not get any devs, just complaints
Presentation is important!
Open-source projects need careful management
Cultural issues can be fatal

BSD: free software without ideology?

2. Appearances

Choose a good name
Own the name, eg as a domain name
Create a decent website
Have an appropriate mission statement (see the hadoop statement as an example)
Pick a license and make it clear
Supply a feature list
Supply a download mechanism
Supply a bug tracker
Supply a way for devs to communicate
Create developer guidelines
Make sure there is documentation

And do not forget Code Review.

Political issues

POSS chapter 4

The bus factor.
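One rough way to put a number on the bus factor is to ask how few contributors account for more than half of the commits. The sketch below assumes that simplified, commit-count definition (other definitions exist, for example ones based on who knows which files), and the author names are invented. In practice the list of authors might come from the project's version-control log.

```python
from collections import Counter

def bus_factor(commit_authors, threshold=0.5):
    """Smallest number of authors who together made more than
    `threshold` of all commits (a simplified heuristic)."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk authors from most to least active, accumulating their commits.
    for rank, (_author, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered > threshold * total:
            return rank
    return len(counts)

# Hypothetical histories: one dominated by a single dev, one evenly spread.
lopsided = ["ann"] * 8 + ["bo"] + ["cy"]
even = ["ann", "bo", "cy", "di"]

print(bus_factor(lopsided), bus_factor(even))
```

A result of 1 means a single departure could leave most of the project's history without its author, which is the situation the bus factor is meant to flag.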

Forking: what happens if you cannot manage disagreement.

Benevolent Dictators, and their own ability to fork the project. Due to forking, a BD is king of a land where everyone is free to leave at the drop of a hat.

Version control helps here: you can have people working on different forks within the same project.

Forking is why consensus is more important than voting or democracy.

When trying to achieve consensus, it is very helpful to have honest brokers: people who understand both sides of the issue and are willing to explain, on the email list, just what the consequences are. Who is an honest broker in today's political discussion?

When is voting necessary?

Who gets to vote? One theory is everyone. Another theory is just contributors.

Stages

Here's an essay Seven Stages of Open Software, by Matthew Rocklin. These stages apply most directly if a company is opening up their own project. Here are the stages:

1. Making your source code public
2. Licensing it so users can make and redistribute changes
3. Accepting contributions from users outside the company
4. Moving development (and dev communications) to outside the company
(no more internal Slack!)
5. Moving project decision-making to outside the company
6. Getting support from other institutions outside the company
7. Quietly retiring, so the project is now completely separated from the founding company

MySQL is only at step 2! In fact, many corporate-owned projects get stuck at step 2, though step 3 is often a major goal.

Once a project makes it to step 5, it's no longer directly controlled by the founding company. That said, step 6 is a major one in terms of external buy-in as an open-source project (versus just as users).