Fourth of July, part 2
Shotts:
This is the second part of the July 4 Alternative Lecture, and we'll cover a variety of things:
Many of the bash scripts used in this lecture are here.
Let's start with this script, lcasecopy, that reads a file (the filename is provided as the single argument), converts the entire file to lowercase, and writes the result to stdout. It has no options itself, but we'll use it in something else that does. The script works by reading the input file one line at a time, using the tr command to convert that line to lowercase, and then writing it to stdout.
#!/bin/bash
set -eu
set -o pipefail
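# exactly one argument, the filename, is required
if [ $# -ne 1 ]
then
    echo "usage: lcasecopy filename" >&2
    exit 1
fi
FILENAME=$1
while read -r THELINE
do
    # convert THELINE to lowercase, and write it to stdout
    echo "$THELINE" | tr A-Z a-z
done < "$FILENAME"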
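As an aside, tr reads its whole standard input, so the same conversion can be sketched without the loop. This is only an illustrative alternative, not the script used in the rest of the lecture; the function name lcase_file and the temporary test file are made up for the example.

```shell
#!/bin/bash
# lcase_file is a hypothetical name: write a lowercase copy of the file named in $1 to stdout
lcase_file() {
    tr A-Z a-z < "$1"       # tr converts everything it reads on stdin
}

TMPFILE=$(mktemp)           # scratch file used only for this demonstration
printf 'Hello World\nFOURTH of July\n' > "$TMPFILE"
lcase_file "$TMPFILE"
rm -f "$TMPFILE"
```

The loop version is still the better teaching example, because it shows the while read ... done < file pattern that comes up again later.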
Now we want to embed this in another script, lcbunch, that converts all the regular files listed on the command line as file arguments. lcbunch also has the following options, supplied as command-line "option arguments":
-v              print the name of each file converted on stdout
-e extension    add the supplied extension to each new file created
-d dirname      put the converted files in the given directory (which must already exist)
The user must supply either -e or -d. Both can be supplied. All options must come before any files. As an example,
lcbunch -v -d mydir *.text
converts all the .text files and puts the converted files into mydir.
We will do this two ways, first using bare-bones bash, and the shift command, and then with getopts, which automates this kind of option processing.
Here is the bare-bash version:
#!/bin/bash
#
# -v              print the name of each file converted on stdout
# -e extension    add the supplied extension to each new file created
# -d dirname      put the converted files in the given directory (which must already exist)
set -eu
set -o pipefail
OPTDONE=0
VERBOSE=0
OUTDIR="."
EXTENSION=""
USAGE="usage: lcbunch [-v] [-e extension] [-d directory] files"
# options processing is done in this loop
while test $OPTDONE -eq 0 -a $# -ge 1
do
    case $1 in
        -v)
            # echo "setting VERBOSE"
            VERBOSE=1
            shift       # for the -v
            ;;
        -e)
            EXTENSION=$2
            # echo "setting EXTENSION to $EXTENSION"
            shift       # for the -e
            shift       # for the option value following -e
            ;;
        -d)
            OUTDIR=$2
            shift       # for the -d
            shift       # for the option value following -d
            ;;
        *)
            OPTDONE=1
            ;;
    esac
done
if test -n "$EXTENSION"
then
    EXTENSION=".$EXTENSION"     # add the dot
fi
# check if EXTENSION or OUTDIR was supplied
if test -z "$EXTENSION" -a "$OUTDIR" = "."      # neither changed
then
    echo "Must supply -e or -d"
    echo "$USAGE"
    exit 2
fi
# At this point, $* just consists of files to be converted
while test $# -gt 0
do
    INFILE=$1
    if test -f "$INFILE"        # make sure it's a regular file
    then
        # curly braces keep the variable names from running into one another
        OUTFILE=$OUTDIR/${INFILE}${EXTENSION}
        if test $VERBOSE -eq 1
        then
            echo "converting $INFILE to $OUTFILE"
        fi
        lcasecopy "$INFILE" > "$OUTFILE"        # copy this file
    fi
    shift       # next file
done
The first while loop does the options processing. We work through the $* string of all arguments (with shift), after setting shell variables for all the option defaults (VERBOSE, EXTENSION and OUTDIR). If the first argument is -v, we set VERBOSE, and shift. Now $1 is the next argument, and we go around the while loop again.
Likewise, if the first argument is -e, we set EXTENSION to $2. We then shift twice, to pop both those values off of $*. Similarly for -d.
As long as we have options, either -v alone or an -e or a -d with something following, we keep going. Each time we start the while loop, if there are any more options then $1 must be -v, -e or -d. When we are done with the options, and get to the files, the case *) matches, and we set OPTDONE=1. This leads to our exiting from the loop, and moving on to the file-processing section.
After the while loop, we add a "." to EXTENSION, if necessary, and check that at least one of EXTENSION and OUTDIR has been changed from its default (otherwise we'd be writing each file on top of itself, which is bad).
At this point we've popped all the options from $*, and all that's left is the files to convert. We handle them one at a time, shifting after each one. We're done when the argument count, $#, reaches zero.
Note that we redirect the stdout of lcasecopy into the file named by $OUTFILE.
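To see what shift does to $# and $1 in isolation, here is a minimal sketch. The function show_args is a hypothetical helper used only for this illustration; inside a function, $#, $1 and shift refer to the function's own arguments, which behave the same way as a script's.

```shell
#!/bin/bash
# show_args is a hypothetical helper: print the argument count and the
# first argument, then shift, until no arguments are left
show_args() {
    while test $# -gt 0
    do
        echo "count=$# first=$1"
        shift       # drop $1; the old $2 becomes the new $1
    done
}

show_args -v -e bak file.text
```

Each pass through the loop sees one fewer argument, which is exactly how both loops in lcbunch make progress.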
In the getopts version, the only thing that changes is the first loop. It is now
while getopts "ve:d:" opt
do
    case $opt in
        v)
            echo "setting VERBOSE"
            VERBOSE=1
            ;;
        e)
            EXTENSION=$OPTARG       # OPTARG is set by getopts
            echo "setting EXTENSION to $EXTENSION"
            ;;
        d)
            OUTDIR=$OPTARG
            ;;
        *)
            echo "$USAGE"
            exit 1
            ;;
    esac
done
shift $(($OPTIND-1))
getopts is built into bash. The first string is the list of option letters, and whether something must follow them. "ve:d:" means that the options are -v, -e and -d, and the latter two have an argument following them (as indicated by the colon). The opt is a shell variable name of our choosing.
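getopts also sets the two special variables OPTARG and OPTIND. OPTARG is the option argument, if one is expected; it plays the role of $2 in the first version. OPTIND is the index, in $*, of the next argument to be processed. It starts at 1, so when the loop ends, OPTIND-1 arguments have been consumed as options.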
We do not do a shift after each argument. Instead we do it all at the end, with shift $(($OPTIND-1)).
We also omit the hyphen before the option letters: they are v, e and d, not -v, -e and -d.
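Here is a small self-contained sketch of the same pattern that can be run on its own. The function parse_opts and its reduced option string "ve:" are made up for the illustration; OPTIND is declared local and reset to 1 so that the function can be called more than once.

```shell
#!/bin/bash
# parse_opts is a hypothetical helper, not part of lcbunch
parse_opts() {
    local opt verbose=0 extension="" OPTIND=1   # reset OPTIND for each call
    while getopts "ve:" opt
    do
        case $opt in
            v) verbose=1 ;;
            e) extension=$OPTARG ;;     # OPTARG is set by getopts
            *) return 1 ;;              # unknown option or missing value
        esac
    done
    shift $(( OPTIND - 1 ))     # discard everything getopts consumed
    echo "verbose=$verbose extension=$extension files=$*"
}

parse_opts -v -e bak one.text two.text
```

One thing getopts handles that the hand-written loop does not is combined options: parse_opts -ve bak one.text gives the same settings as the call above.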
The first and most basic debugging technique is to print out key variables at selected points, with
echo $myvar
or, better yet,
echo "filename is $myvar"
or something else that will help you understand what you're looking at.
It is helpful to have these print out only if some variable, e.g. DEBUG, is set to 1:
DEBUG=1
if test $DEBUG -eq 1; then echo "filename is $myvar"; fi
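If there are many such messages, the test can be wrapped in a function. The following is a sketch under a few assumptions: the helper name debug is made up, DEBUG defaults to 0 when it is not set in the environment, and the messages go to stderr so they do not get mixed into the script's real output.

```shell
#!/bin/bash
DEBUG=${DEBUG:-0}       # default to 0 unless DEBUG is already set

# debug is a hypothetical helper: print its arguments on stderr, but only when DEBUG is 1
debug() {
    if test "$DEBUG" -eq 1
    then
        echo "DEBUG: $*" >&2
    fi
}

myvar="notes.text"
debug "filename is $myvar"
```

With this in place, debugging can be switched on for a single run with DEBUG=1 ./myscript.sh, without editing the script.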
There is also debug mode, enabled by running your script as
bash -x myscript.sh
Debug mode prints out each command, after expansion, just before it is executed, with a leading +.
As an example let's look at numsum:
#!/bin/bash
# adds up the integers from 1 to $1
NUM=$1
SUM=0
while test $NUM -ne 0
do
    SUM=$(expr $SUM + $NUM)
    NUM=$(( $NUM - 1))      # alternative way to do arithmetic
done
echo $SUM
The output of bash -x numsum 4 looks like this (some comments added).
+ NUM=4
+ SUM=0
+ test 4 -ne 0       # first time through the loop
++ expr 0 + 4
+ SUM=4
+ NUM=3
+ test 3 -ne 0       # second time through the loop
++ expr 4 + 3
+ SUM=7
+ NUM=2
+ test 2 -ne 0       # third time through the loop
++ expr 7 + 2
+ SUM=9
+ NUM=1
+ test 1 -ne 0       # fourth time through the loop
++ expr 9 + 1
+ SUM=10
+ NUM=0
+ test 0 -ne 0
+ echo 10
10
The number of + signs shows the nesting level. The lines beginning with ++ show commands that are run inside a command substitution, $(...). In this case, that's expr.
Recall the earlier linecounter example, where I piped the file into the while loop instead of redirecting the input. Debug mode let me see that the NUMLINES variable was being regularly incremented, only to revert to 0 at the end. From that, and with a little help from StackExchange, I was able to figure out that the commands in a pipeline run in subshells, and subshell variables have no connection to parent-shell variables.
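The effect can be reproduced with the following sketch. It is not the original linecounter script; the function names and the temporary file are made up, and it assumes bash's default settings (the lastpipe shell option is off).

```shell
#!/bin/bash
# Both functions count the lines of the file named in $1.

count_piped() {
    NUMLINES=0
    cat "$1" | while read -r THELINE
    do
        NUMLINES=$(( NUMLINES + 1 ))    # increments a copy inside the subshell
    done
    echo "$NUMLINES"                    # the parent's NUMLINES was never changed
}

count_redirected() {
    NUMLINES=0
    while read -r THELINE
    do
        NUMLINES=$(( NUMLINES + 1 ))    # the loop runs in the current shell
    done < "$1"
    echo "$NUMLINES"
}

TMPFILE=$(mktemp)                       # scratch file for the demonstration
printf 'one\ntwo\nthree\n' > "$TMPFILE"
count_piped "$TMPFILE"
count_redirected "$TMPFILE"
rm -f "$TMPFILE"
```

The piped version reports 0 and the redirected version reports 3, even though the two loop bodies are identical.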
You can turn debug mode on and off from within your script dynamically with
set -x # enable debugging
set +x # disable debugging
Note that minus, "-", is used to add debugging, and plus, "+", is used to take away debugging.
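As a sketch of how this might be used, the following traces only the loop of a numsum-like function and leaves the rest quiet. The function name sum_to is made up for the example. The braces and redirection around set +x are a common idiom to keep the set +x command itself out of the trace.

```shell
#!/bin/bash
# sum_to is a hypothetical function: add up the integers from 1 to $1,
# tracing only the loop
sum_to() {
    local num=$1 sum=0
    set -x                      # enable debugging
    while test "$num" -gt 0
    do
        sum=$(( sum + num ))
        num=$(( num - 1 ))
    done
    { set +x; } 2>/dev/null     # disable debugging, hiding this line from the trace
    echo "$sum"
}

sum_to 3
```

The trace goes to stderr and the result goes to stdout, so sum_to 3 2>/dev/null prints just the sum.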
Enabling syntax highlighting in your text editor can also help here. In vi/vim, you do that with :syntax on, in command mode.
Here are a few common ones:
Note that leaving off a do or then can be confusing, since bash does allow multiple commands as part of if/while tests, and the do/then marks the end of this list.
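One way to catch this kind of mistake before running anything is bash's -n option, which reads a script and reports syntax errors without executing the commands. The sketch below is only an illustration: check_syntax is a made-up helper, and the broken script is written to a temporary file just for the demonstration.

```shell
#!/bin/bash
# check_syntax is a hypothetical helper: report whether bash can parse the file named in $1
check_syntax() {
    if bash -n "$1" 2>/dev/null     # -n: read the commands but do not execute them
    then
        echo "syntax ok"
    else
        echo "syntax error"
    fi
}

BROKEN=$(mktemp)                    # scratch file holding a loop with the do left off
cat > "$BROKEN" <<'EOF'
NUM=3
while test $NUM -gt 0
    NUM=$(( NUM - 1 ))
done
EOF
check_syntax "$BROKEN"
rm -f "$BROKEN"
```

Dropping the 2>/dev/null shows bash's own message, which complains about the done rather than the missing do, for the reason given above.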
Shotts has a good section on this in Chapter 30.
Recall that bash does line expansion whenever a command (even a built-in) is executed. This exposes you to issues with file-name globbing and variable expansion. The Shotts example is this:
number=         # empty string!
if test $number = 1
then
echo something
fi
We should have written if test "$number" = 1, with quotes, but we didn't. So the second line expands to if test = 1, which is an illegal test expression.
Expansion problems in bash are legion. Watch out!
If we create a file with the above (so that the if test $number = 1 is line 7) and run it, we get
line 7: test: =: unary operator expected
That's mysterious, since looking at the code we're clearly using = as a binary operator. But here it is with bash -x:
+ number=
+ test = 1
trouble: line 7: test: =: unary operator expected
Now it is clear that the arguments to test are = 1, and indeed something is missing.
If we put "" around $number (as we should), the original script runs fine (though as written it produces no output, because "" is not equal to 1, and there is no else clause).
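Here is a sketch of the quoted form, with an else clause added so that there is some output to look at. The helper name is_one is made up for the example.

```shell
#!/bin/bash
# is_one is a hypothetical helper: succeed only when its first argument is the string 1
is_one() {
    test "${1:-}" = 1       # quoted, so an empty value is still one argument to test
}

number=                     # empty string!
if is_one "$number"
then
    echo "number is 1"
else
    echo "number is not 1"
fi
```

With the quotes, test sees three arguments (an empty string, = and 1), so = really is used as a binary operator and the comparison simply fails.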
A couple more bash-specific issues are: