Class 4 Notes

Comp 141 Class 4 notes

June 13

Shotts: Some material is from Chapter 7: Seeing the World as the Shell Sees It

Chapter 8: Keyboard Tricks

Chapter 9: File Permissions

Chapter 10: Processes

Chapter 11: The Environment, officially this time (maybe)

Videos:

8. Keyboard Tricks

9. Permissions

10. Processes

Review pipelines

Guacamole is broken (but maybe not for you)

You've been using it to log in to your Loyola virtual machines. But our installation is broken: at least for me, when using Ubuntu, it won't transmit control characters. It does work for me when using a Mac. I have no idea about Windows. You'll need control characters soon enough.

To test if these work, log in and type Ctrl-C. You should see ^C and get a new line. If you just get the letter c, it's broken.

In which case here are the options:

1. Use the Loyola VPN to connect to your Loyola Virtual Machine using ssh. I've signed everyone up for that. Instructions at www.luc.edu/its/informationsecurity/resources/loyolasecureaccess.
2. Create a Linux Virtual Machine on your laptop. You need 20-30 GB of free disk space.
3. Use the Visual Studio codespace

See the week 1 notes for more information on 2 and 3.

I am trying to get our installation fixed, but it may take a while.

Homework 2

How did demo.sh figure out who you were? All the machines are identical clones of one another, and I sent the same files to each machine.

How the shell sees things

The echo command is handy because it simply displays its expanded command line. Thus, echo * prints all files in the current directory, if there are any, and the * alone if there are not (Demo)

The ls * command does the same, but only because * matches filenames, and ls echoes back any filenames it is given. This is subtly different from echo, which echoes back everything.

Filename expansion works for more than individual files; for example, try echo */*, or echo [a-d]*/*, or echo /usr/*/share
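Here is a minimal sketch of the echo * behavior described above, assuming bash with default settings. The scratch directory and the file names (alpha.txt, beta.txt, docs/notes.txt) are made up for illustration.

```shell
# Work in a throwaway directory so we know exactly what is there.
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# No files yet: the pattern matches nothing, so echo receives a literal *
empty_result=$(echo *)
echo "empty directory: $empty_result"

# Create two files and one subdirectory with a file in it.
touch alpha.txt beta.txt
mkdir docs
touch docs/notes.txt

# Now the shell replaces the pattern with matching names before echo runs.
full_result=$(echo *)
nested_result=$(echo */*)
echo "after touch:     $full_result"
echo "one level down:  $nested_result"

cd / && rm -rf "$scratch"
```

The point is that echo never sees the *; the shell has already replaced it (or left it alone when nothing matched).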

Filename expansion does not include "hidden" files, or "dot" files. We can use ls -a or ls -A (the latter omits . and ..). Some people use the pattern .[^.]* [Shotts writes this with ! instead of ^, which is exactly the same thing]. Here's what it means:

. Matches the . character and the . character only
[^.] Matches any character other than ".". But there actually has to be some character
* Matches zero or more characters

The second rule blocks "." (because there isn't any second character) and ".." because the second character can't be a "."

This pattern still does not match filenames like "..foo". There are stronger matching patterns in bash, but they are disabled by default, and we will ignore them
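To see the dot-file pattern in action, here is a small sketch assuming bash. The four file names are invented for the demonstration.

```shell
#!/usr/bin/env bash
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch visible .bashrc .profile ..foo

plain=$(echo *)          # hidden files are skipped
dotted=$(echo .[^.]*)    # a dot, then one non-dot character, then anything
bang=$(echo .[!.]*)      # the same pattern spelled with ! instead of ^

echo "*      -> $plain"
echo ".[^.]* -> $dotted"
echo ".[!.]* -> $bang"

cd / && rm -rf "$scratch"
```

Neither pattern picks up ..foo, and neither picks up . or .. themselves.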

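Patterns like *, foo.? and .[^.]* are technically called glob patterns, or wildcards. They are a simpler relative of regular expressions, which use a different syntax and have many uses for pattern matching aside from listing files. There is a famous quote about regular expressions from Jamie Zawinski, though, from 1997: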

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

There is also tilde expansion. A ~ alone matches your home directory; ~foo matches the home directory of user foo, if there is such a user.

There is also arithmetic expansion. Try echo $((2+2)). But why do you need two parentheses on each side? For Shotts' example on page 71, note that

echo $(((5**2)*3))

is enough; you don't need the inner $
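A quick sketch of both forms, assuming bash (the ** exponent operator is a bash feature). The variable names are only for illustration.

```shell
#!/usr/bin/env bash
nested=$(( $((5**2)) * 3 ))   # nested form, with an inner $(( ))
flat=$(( (5**2) * 3 ))        # plain parentheses are enough for grouping
echo "nested: $nested  flat: $flat"

# With only one pair of parentheses, $( ) means command substitution,
# so the shell would try to run a command named 2+2 instead.
sum=$((2+2))
echo "2+2 = $sum"
```

Inside $(( )), ordinary parentheses just group, which is why the inner $ can be dropped.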

Then there is brace expansion. This is a handy way of generating a sequence of names like file01.text, file02.text, file03.text, .... One writes

echo file{01..10}.text

That goes to file10.text.

I have never used this.
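For anyone who does want to try it, here is a small sketch assuming bash 4 or later (older versions do not zero-pad ranges). The names report and week are hypothetical.

```shell
#!/usr/bin/env bash
padded=$(echo file{01..05}.text)      # zero-padded numeric range
listed=$(echo report.{txt,pdf,bak})   # comma-separated alternatives
echo "$padded"
echo "$listed"

# One possible use: make a batch of directories in a single command.
scratch=$(mktemp -d)
mkdir "$scratch"/week{1..3}
made=$(ls "$scratch")
echo "$made"
rm -rf "$scratch"
```

Unlike filename expansion, brace expansion does not care whether the names exist; it just generates strings.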

Parameter Expansion

Shotts describes this on page 73. It's about expansion of shell variables, like $PATH.

Note that mistyped variable names don't trigger an error. echo $PAHT displays nothing. Why?
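The sketch below shows what the shell does with the misspelled name, assuming bash is installed for the child-shell check. The ${name:-default} form and the set -u option are standard shell features that the notes have not covered yet, so treat them as a preview.

```shell
unset PAHT                     # make sure the misspelled name really is unset
typo="[$PAHT]"                 # an unset variable expands to the empty string
echo "misspelled: $typo"

fallback="${PAHT:-not set}"    # supplies a value when the variable is unset or empty
echo "with default: $fallback"

# set -u turns the silent empty string into an error. Run it in a
# child shell so that the error does not stop this script.
if bash -c 'set -u; echo "$PAHT"' 2>/dev/null; then
    strict="no error"
else
    strict="error"
fi
echo "under set -u: $strict"
```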

Command substitution

This is incredibly useful for writing shell scripts. It lets you turn the stdout of a command into a string, to be used in another command. You simply include the command in $( ). Here are a few examples:

echo $(ls)
ls -l $(which ls)
ls -l $(ls -d /usr/bin/* | grep zip) (same as ls -ld /usr/bin/*zip*)
file $(ls -d /usr/bin/* | grep zip)
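The same ideas can be tried without depending on what happens to be in /usr/bin. This is a minimal sketch using a scratch directory; the file names a.zip, b.zip and notes.txt are made up.

```shell
scratch=$(mktemp -d)
touch "$scratch"/a.zip "$scratch"/b.zip "$scratch"/notes.txt

# Capture a command's stdout in a variable.
count=$(ls "$scratch" | grep -c '\.zip$')
echo "names ending in .zip: $count"

# Use one command's output as another command's arguments.
# (This relies on word splitting, so it breaks on names containing spaces.)
listing=$(ls -l $(ls -d "$scratch"/* | grep '\.zip$'))
echo "$listing"
lines=$(echo "$listing" | wc -l)

rm -rf "$scratch"
```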

Quoting

The shell allows both strings in single quotes ('hello') and double quotes ("hello"). They do different things.

Both preserve spaces: compare echo this    is  a   test (unquoted, so the extra spaces are lost) with echo "this    is  a   test"

Both allow files with spaces: less "stupid report file.text"

Both suppress pathname expansion. Try echo '*' and echo "*".

However, variable expansion is treated differently: compare echo '$PATH' and echo "$PATH"

You can also escape characters: less stupid\ report\ file.text The \ prevents the shell from recognizing the following space character as a separator; it is instead treated as part of the string. There are specific \ escape sequences for, say, newline, or tab, but note that echo ignores these unless you give it the -e option.
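Here is a short sketch comparing the quoting styles side by side. The variable name is arbitrary, and printf is my substitution for echo -e: it is a standard command that interprets \t and \n in its format string.

```shell
name="world"

single='hello $name'    # single quotes: everything is literal
double="hello $name"    # double quotes: $name is still expanded
star="*"                # no pathname expansion inside either kind of quote
escaped=hello\ $name    # the backslash makes the space part of the word

echo "$single"
echo "$double"
echo "$star"
echo "$escaped"

# printf handles \t and \n itself, with no option needed.
printf 'col1\tcol2\n'
tabbed=$(printf 'col1\tcol2')
```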

Advanced Keyboard Tricks

You can type clear to clear the screen.

Typing an up-arrow gives you your previous command. You can edit it by arrowing left and right. [There are lots of other editing tricks, but we're going to ignore them]

Typing history shows you the list of all your prior commands. Multiple up-arrows steps through this list.

Command-line completion

Press TAB to have the system autocomplete the command name. I seldom use it. Try it with echo and ls (it doesn't really work for ls)

Typing TAB twice gives a list of possible completions.

Where it's most useful is in examining shell variables. For example

echo $LD_LIBRARY_PATH

is tedious to type. But the TAB after LD is great.

Your bash history is kept in a file (.bash_history), and is displayed by the command history. To re-enter line 729 of the output of history, use !729.

Permissions

Each process -- including shell sessions -- has a user id associated with it. The things a process can do depend on what permissions this userid has.

The root user -- numeric userid 0 -- can do everything. You don't want to do work logged in as root, however.

Numeric userids are connected to user names in /etc/passwd.

Users also have a group id defined in /etc/passwd. The file /etc/group allows users to be in multiple groups. The command groups shows you what groups you belong to.

Processes also have a group id.

The command whoami gives your user name; the command id gives your numeric userid.
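As a rough sketch of how names and numbers connect, the following looks up entries in /etc/passwd with awk. It assumes local accounts are listed in that file; on systems that use a network directory, your own account may not appear there.

```shell
my_uid=$(id -u)    # numeric user id of this shell process
my_gid=$(id -g)    # numeric primary group id
echo "uid=$my_uid gid=$my_gid"

# /etc/passwd is colon-separated: name:password:uid:gid:comment:home:shell
# Find the name that goes with uid 0.
root_name=$(awk -F: '$3 == 0 { print $1; exit }' /etc/passwd)
echo "uid 0 is named: $root_name"

# Find the name (if any) that goes with our own uid.
my_name=$(awk -F: -v uid="$my_uid" '$3 == uid { print $1; exit }' /etc/passwd)
echo "uid $my_uid is named: ${my_name:-(not listed in /etc/passwd)}"
```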

Changing someone's numeric userid in /etc/passwd is usually a nightmare.

File permissions

The original Unix systems were big multiuser systems. You could set permissions for

yourself
your group
everyone else

In Linux, these are known as user, group and other permissions. For files, they can be permission to read a file, to write to a file, and to execute the file as a program (one of the strongest reasons for not granting execute permission to everything you can read is that most files are not in fact executable). These permissions are abbreviated r, w, and x.

The first columns of ls -l's output show the rwx permissions for the user, the group, and others. If you're allowing other members of your workgroup to read your document, but not others, its permissions would be set to rw- r-- ---. If everyone in your group can write to it as well, and others can read it, the bits would be rw- rw- r--.

In a classic multiuser system, your group might correspond to your department, or to a workgroup within the department. In a single-user system, groups don't necessarily make traditional sense, but they can be used to give the primary user various fine-grained permissions. The most common group strategy on single-user Linux systems is to create a group for the user, named the same as their username. In this case, user permissions and group permissions are typically the same.

These rules are baked into the typical Unix filesystem: files have nine permission bits, and belong to one owner and to one group. This is a limitation on true multiuser shared systems. Some possible alternatives:

Files could belong to multiple groups, each with group permission (users can be in multiple groups already)
Files could have multiple owners
Files could have access control lists (ACLs) allowing precise access to a list of specified users
Files could have different permissions for write and for append
Directories could have different permissions for writing to a file and creating a file

The Windows NTFS filesystem supports ACLs, as does Windows itself.

But one problem with more elaborate access control is that it's hard to audit. Is there a simple command on Windows to list everyone who can read your file, for example? It's not easy.

chmod

The chmod command is used to change file/directory permissions; think of it as change mode. You specify the permission rules, and then the files. Permission-change rules are of the form:

u+x
u+x,o-w
gu+rx
o-rw
go=rx
+x
+r

Demo: examples
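One way to run such a demo yourself: the sketch below applies a few of the rules to a scratch file and prints the permission column after each step. The file name demo.sh and the helper function mode are made up for this example.

```shell
scratch=$(mktemp -d)
f="$scratch/demo.sh"
touch "$f"

# Print just the ten type-and-permission characters from ls -l.
mode() { ls -l "$1" | cut -c1-10; }

chmod u=rw,go=r "$f"; start=$(mode "$f")    # a known starting point
chmod u+x "$f";       step1=$(mode "$f")    # owner may now execute
chmod go=rx "$f";     step2=$(mode "$f")    # group and others get exactly r-x
chmod o-rx "$f";      step3=$(mode "$f")    # others lose read and execute

echo "$start"
echo "$step1"
echo "$step2"
echo "$step3"
rm -rf "$scratch"
```

Note that + and - change only the named bits, while = replaces all three bits for the classes given.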

Permission bits can also be set as octal numbers, since 3 bits is one octal digit. 4 is r--, and 6 is rw-. For example

chmod 644 foo
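A small sketch of how the octal digits map to the rwx letters, using a scratch file. The helper function mode is hypothetical, defined here only to trim the ls -l output.

```shell
scratch=$(mktemp -d)
f="$scratch/foo"
touch "$f"
mode() { ls -l "$1" | cut -c1-10; }

# Each digit is r=4 + w=2 + x=1, one digit each for user, group, other.
chmod 644 "$f"; m644=$(mode "$f")    # 6=rw-  4=r--  4=r--
chmod 750 "$f"; m750=$(mode "$f")    # 7=rwx  5=r-x  0=---
chmod 600 "$f"; m600=$(mode "$f")    # 6=rw-  0=---  0=---

echo "644 -> $m644"
echo "750 -> $m750"
echo "600 -> $m600"
rm -rf "$scratch"
```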

I'd avoid octal unless you're really comfortable with it, except for the following. The command umask shows your "new file permission mask". This is an octal number representing the bits that are turned off by default; that is, new files are created with mode 0777 AND NOT umask (or 0666 AND NOT umask for files that are not meant to be executable). My umask is 002, so files are created by default with permissions 664, or rw- rw- r--. If I change it using umask 022, then my files will be created with mode 644, or rw- r-- r--.
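The effect of umask can be checked directly. This sketch creates files under two different masks, each inside a subshell so the change does not leak out, and then repeats the calculation with shell arithmetic: the mask bits are cleared from the starting mode, that is, a bitwise AND with the complement of the mask. The helper function mode is made up for the example.

```shell
scratch=$(mktemp -d)
mode() { ls -ld "$1" | cut -c1-10; }

# The umask set inside $( ... ) applies only within that subshell.
file_002=$( umask 002; touch "$scratch/a"; mode "$scratch/a" )
file_022=$( umask 022; touch "$scratch/b"; mode "$scratch/b" )
dir_022=$(  umask 022; mkdir "$scratch/d"; mode "$scratch/d" )

echo "umask 002, new file: $file_002"    # 666 with the 002 bit cleared
echo "umask 022, new file: $file_022"    # 666 with the 022 bits cleared
echo "umask 022, new dir:  $dir_022"     # 777 with the 022 bits cleared

# The same arithmetic in the shell: a leading 0 means octal.
calc=$(printf '%o' $(( 0666 & ~0022 )))
echo "0666 & ~0022 = $calc"
rm -rf "$scratch"
```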

chown changes the owner. There's a -R option for recursively changing the ownership of every file in a subdirectory.

Directory x permission

You need x permission to access files inside a directory. All the parent directories need x permission too.

Example: create a directory, cd to it, and then turn off x permission on it for yourself
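Here is a scripted variation on that example, as a sketch. Instead of cd, it tries to read a file inside the directory after x is removed. The names box and note are made up. If you run it as root the read will still succeed, because root bypasses permission checks.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/box"
echo "hello" > "$scratch/box/note"
chmod u=rwx,go= "$scratch/box"

chmod u-x "$scratch/box"                   # turn off our own x bit
box_mode=$(ls -ld "$scratch/box" | cut -c1-10)
echo "box is now $box_mode"

# Without x we cannot reach anything inside, even though r is still set.
if cat "$scratch/box/note" >/dev/null 2>&1; then
    reach="readable"
else
    reach="blocked"
fi
echo "box/note is $reach"

chmod u+x "$scratch/box"                   # restore x so cleanup can work
rm -rf "$scratch"
```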

First character of ls -l output

The permission flags are characters 2-10 of the output of ls -l. The first character is an indicator of something more general:

- for regular files
d for directories
l for symlinks
c for character-oriented device files
b for block-oriented device files (such as disk devices)
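A quick way to see several of these indicators is to create one of each kind and cut out the first character. This is a sketch; the names plain, dir and link and the helper function kind are invented, and I am assuming /dev/null is a character device, as it is on typical Linux systems.

```shell
scratch=$(mktemp -d)
touch "$scratch/plain"
mkdir "$scratch/dir"
ln -s plain "$scratch/link"

# First character of the ls -l line for a given path.
kind() { ls -ld "$1" | cut -c1; }

k_file=$(kind "$scratch/plain")
k_dir=$(kind "$scratch/dir")
k_link=$(kind "$scratch/link")
k_null=$(kind /dev/null)

echo "plain file: $k_file"
echo "directory:  $k_dir"
echo "symlink:    $k_link"
echo "/dev/null:  $k_null"
rm -rf "$scratch"
```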

The setuid bit

In addition to rwx, the user also has a setuid bit. This originally had meaning only for executable files. If set, it means that if someone runs that file, they have the permissions of the owner of the file. We can find the setuid commands in /usr/bin with ls -l /usr/bin | grep '^...s'. (Note that a single '.' matches one character in grep, versus a '?' for filename matching. The '^' character matches the start of the line.)

There is also a setgid bit. For an executable, if the setgid bit is set then someone running it has group identity that of the group in question. On directories, if this bit is set then files created there have the group id of the group in question, not that of the creating user.

There is a third bit. It might have been called the setotherid bit, but that makes no sense. It is known as the "sticky bit" when applied to directories; it allows deletion of a file only if you own the file, or the directory (or are root). For example, the /tmp directory has 777 permissions plus the sticky bit (octal mode 1777); everyone can create files there, but the sticky bit prevents others from deleting your files.
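These extra bits show up in the x positions of the ls -l output. The sketch below sets the sticky bit on a scratch directory (the name shared is made up) and then counts setuid entries in /usr/bin using the grep pattern from above; that count varies from system to system.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/shared"

chmod 777 "$scratch/shared"
without=$(ls -ld "$scratch/shared" | cut -c1-10)

chmod 1777 "$scratch/shared"      # the leading 1 is the sticky bit
with=$(ls -ld "$scratch/shared" | cut -c1-10)

echo "777 only:        $without"
echo "with sticky bit: $with"

# The setuid bit appears as an s in the user x position.
suid_count=$(ls -l /usr/bin | grep -c '^...s')
echo "setuid programs in /usr/bin: $suid_count"
rm -rf "$scratch"
```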

Changing your userid

You can change to a different userid. Sometimes this is useful for specific non-root userids, but is most often done to become the root user.

The su command is used for becoming root. You need to know the root password, though. (Also, it subtly messes things up in a GUI session: windowed applications that you start after su will probably fail to open.)

The sudo command allows you to run individual commands with root privileges. It is configured so you only need your own password. (That would be a terrible way to configure it on multiuser systems, generally.) You can check the sudo configuration with sudo -l.

sudo ls -l / # kind of useless

sudo chmod o+r /var/log/syslog # give everyone, including yourself, read access to /var/log/syslog.

There's also the passwd command, to change your password.

Bill and Karen's Excellent Adventure

Suppose Bill and Karen want to share a directory, say /usr/local/share/Music. We start by creating a group "music", and changing the group ownership of this directory to "music". We also add bill and karen to this group, and make the directory group-writeable. Finally, we need to set the setgid bit on the directory. That forces any newly added file to be switched to group "music", versus, say, the bill group or the karen group.

Users bill and karen must also set their umask to 0002, not 0022. They must make sure any files they create are group-writeable!
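The steps might look roughly like the following on a typical Linux system. This is a dry-run sketch: the run function is a made-up helper that only prints each command, because the real commands need root and change system accounts. groupadd and usermod -aG are the usual Linux tools for this, but other systems may use different ones.

```shell
# Dry run: print each command instead of executing it.
# To do it for real, change the function body to:  sudo "$@"
run() { echo "+ $*"; }

share=/usr/local/share/Music

plan=$(
    run groupadd music               # create the group
    run usermod -aG music bill       # add each user to it (-a appends)
    run usermod -aG music karen
    run mkdir -p "$share"
    run chgrp music "$share"         # directory now belongs to group music
    run chmod 2775 "$share"          # rwxrwsr-x: group-writable plus setgid
)
echo "$plan"

# Each user then needs this in a startup file such as ~/.bashrc:
#   umask 0002
```

The leading 2 in 2775 is the setgid bit, so new files in the directory pick up group music.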

Processes

A process is something that gets scheduled on the CPU. It has its own virtual memory space, its own stack, its own heap, and its own code segment (often called "text segment" for historical reasons). When a process is suspended, the address of the next instruction to be executed is saved, as are the registers and the virtual-memory mapping. When the process is scheduled, the VM map is restored, the registers are restored, and control branches to the previously saved address of the next instruction.

One process can have multiple threads, but we'll ignore that here. Two processes can use shared memory; that is, a set of pages of physical memory that are mapped into the virtual-address spaces of both processes. The processes can talk to one another that way.

Processes can create other processes, e.g. with the fork() system call.

Every process has a numeric Process IDentifier, or pid. You can see these with the ps command (ps -ef or ps ax). Actually, the top command is simplest, but note that it goes into full-screen mode; type q to quit.

It is common to pipe the output of ps into grep to look for specific processes; e.g. ps -ef | grep -i firefox or ps ax | grep -i cups. Of course, that is likely to find the grep process as well.

The command ps alone, without the -ef or ax, just shows processes that are still "attached" to your terminal. That pretty much means processes you explicitly started.

Shotts has a table of process run states on page 112. R means it's either running or is waiting to run. S means it is sleeping, which usually means in turn that it is waiting for input. D means uninterruptible sleep, usually waiting on I/O such as a disk; it used to be the bane of my existence. Z means it's become a zombie.

The top command is a bit better at showing how much memory and cpu the process is using.

Starting a process

Any command we type will start a process (unless it's a shell built-in command). Here are some gui-display commands:

xlogo
xclock
gedit

What does the shell do when you start one of these? It waits. You can tell it not to wait by typing an '&' after it.

We can send signals to processes. Typing Ctrl-C in the shell window where we started the process sends it the Interrupt (INTR or INT) signal. The process normally exits (although processes can catch this signal). Other signals:

HUP (brings back memories of dialup terminals)
KILL
TERM (what is this for?)

Signals can be sent manually to processes using the kill command (which, despite the name, by default sends the TERM signal; other signals can be named, as in kill -KILL or kill -HUP).

To be able to send a signal to a process, you must either be root or else own the process (that is, your userid matches the userid of the process)
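Here is a minimal sketch of sending signals from a script, using sleep as the victim so that no GUI is needed. The shell reports a process killed by a signal as exit status 128 plus the signal number (TERM is 15, KILL is 9).

```shell
sleep 60 &               # start a long-running process in the background
pid=$!                   # $! holds the pid of the most recent background job

kill -0 "$pid" && echo "process $pid is running"   # signal 0 only checks

kill "$pid"              # plain kill sends TERM
term_status=0
wait "$pid" || term_status=$?     # collect the exit status
echo "after TERM: exit status $term_status"

sleep 60 &
pid=$!
kill -KILL "$pid"        # KILL cannot be caught or ignored
kill_status=0
wait "$pid" || kill_status=$?
echo "after KILL: exit status $kill_status"
```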

Finally, let's do this:

start a process, say xlogo (with &)
verify with ps -ef what the process PID is, and also the PPID. Verify the PPID is that of the shell that started xlogo.
Now kill the parent shell. What is the PPID of xlogo now?

If you want to reproduce this from the command line and can't run xlogo, try sleep 120 &. This sleeps for 120 seconds.
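The experiment can also be scripted, as in the sketch below. A short-lived child shell starts sleep and exits at once, which has the same effect as killing the parent shell. On many systems the new parent turns out to be pid 1, though some hand orphans to a per-session manager process instead, so check what yours reports.

```shell
# Start sleep from a short-lived child shell. The child prints the
# pid of sleep and exits right away, leaving sleep without its parent.
orphan=$(sh -c 'sleep 120 >/dev/null 2>&1 & echo $!')
echo "sleep is running as pid $orphan"

sleep 1    # give the child shell a moment to finish exiting

# Ask ps for the parent pid (ps may be missing on very minimal systems).
if command -v ps >/dev/null 2>&1; then
    new_ppid=$(ps -o ppid= -p "$orphan" | tr -d ' ')
    echo "its parent is now pid $new_ppid"
fi

kill "$orphan"    # clean up rather than wait two minutes
```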