Shell Lab
This shell lab is inspired by the one by Bryant and O’Hallaron for Computer Systems: A Programmer’s Perspective, Third Edition
Due: Wednesday, November 9, 11:59pm
For this assignment, you will implement a simple shell-scripting language, whoosh. The whoosh language is not entirely unlike bash, but whoosh is intended exclusively for batch mode. The starting code includes the language parser and an initial evaluation framework that works for a single command. You will change the initial evaluation so that it supports running multiple processes, in some cases piping the output of one command to the input of another. Implementing just the interpreter without support for Ctl-C is worth a check grade (i.e., 80%). Adding support for Ctl-C is worth check~ (which will count as 90%). Finally, adding support for background tasks is worth a check+ (i.e., 100%).
whoosh Script Syntax
A whoosh script can contain blank lines or lines that start immediately with #, and those lines are ignored. Any other line must have the form of a ‹group›:
Text with a gray background, such as |, indicates characters that appear verbatim in a script. Text in angle brackets, such as ‹command›, is a non-terminal that refers to a grammar production. A * on a non-terminal means zero or more repetitions of the non-terminal, and a ? means that the non-terminal is optional. Non-linebreaking whitespace is implicitly allowed between grammar elements.
| ‹group› | ::= | ‹modifier›? ‹commands› |
We’ll return to the ‹modifier› options later. The ‹commands› part specifies either a single command, multiple commands joined by || to be run independently in parallel, or multiple commands joined by | to be run in a pipeline:
| ‹commands› | ::= | ‹command› |
|
| | | ‹or-commands› |
|
| | | ‹and-commands› |
| ‹or-commands› | ::= | ‹command› |
|
| | | ‹command› || ‹or-commands› |
| ‹and-commands› | ::= | ‹command› |
|
| | | ‹command› | ‹and-commands› |
A single ‹command› could also parse as ‹or-commands› or ‹and-commands›. All three interpretations behave the same way, but the parser reports a single ‹command› as a special case.
Each ‹command› is much like a ‹command› in any shell language: the path of an executable file (as an absolute path or relative to the current directory) followed by arguments to the executable. A ‹command› can optionally end in @ ‹variable›, as explained further below:
| ‹command› | ::= | ‹executable› ‹argument›* ‹at-variable›? |
| ‹at-variable› | ::= | @ ‹variable› |
If a ‹command› ends with @ ‹variable›, then ‹variable› will be set to the command’s process ID. An ‹executable› or ‹argument› can be a ‹literal›, such as /bin/ls or -l, where " acts as an escape to allow arbitrary ASCII characters (other than " itself) until a closing ". Instead of a ‹literal›, an ‹argument› can be a ‹variable›, which always starts with $.
| ‹executable› | ::= | ‹literal› |
| ‹argument› | ::= | ‹literal› |
|
| | | ‹variable› |
| ‹literal› | ::= | sequence of characters a-z, A-Z, 0-9, ., :, _, -, =, and/or / and/or other characters between matching "s |
| ‹variable› | ::= | $ followed by a sequence of characters a-z, A-Z, and/or 0-9 |
Finally, the ‹modifier› at the start of a ‹group› can indicate a number of repetitions, a ‹variable› that receives the exit status of a ‹command› in the ‹group›, or both:
| ‹modifier› | ::= | repeat ‹n› |
|
| | | ‹variable› = |
|
| | | repeat ‹n› ‹variable› = |
whoosh Script Semantics
Each ‹command› in a whoosh program starts a process in the usual way. When a ‹group› contains multiple ‹command›s, the corresponding processes are all started at once.
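Starting a process "in the usual way" means forking, replacing the child's image with the executable, and waiting in the parent. The following is a minimal sketch of that pattern, not the handout's code: run_and_wait is a hypothetical name, and the sketch calls the plain system calls instead of the "csapp.c" wrappers so that it compiles on its own.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of the handout): start argv[0] with the
   arguments in argv, wait for it, and return the raw waitpid status.
   Returns -1 if the process could not be created or waited for. */
int run_and_wait(char *const argv[]) {
  pid_t pid;
  int status;

  assert(argv != NULL && argv[0] != NULL);

  pid = fork();
  if (pid < 0)
    return -1;

  if (pid == 0) {
    /* Child: replace this process image with the executable. */
    execv(argv[0], argv);
    _exit(127); /* reached only if execv failed */
  }

  /* Parent: retry if a signal handler interrupts the wait. */
  while (waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR)
      return -1;
  }
  return status;
}
```

Calling run_and_wait once per script line gives the "one group at a time" behavior for scripts that contain only single commands.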
Except for ‹command›s in an ‹and-commands›, each process for a ‹command› inherits the input, output, and error file descriptors of the whoosh process. Within an ‹and-commands›, the first ‹command› inherits the input of whoosh, and the last ‹command› inherits the output of whoosh; otherwise, the output of each ‹command› is piped to the next ‹command›’s input.
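The wiring described above can be sketched with pipe and dup2. This is an illustration under assumptions, not the required design: run_pipeline and MAX_STAGES are hypothetical names, the stages are passed as plain argument vectors instead of the structures from "ast.h", and error handling is minimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_STAGES 16 /* arbitrary limit for this sketch */

/* Hypothetical helper: runs stages[0] | stages[1] | ... | stages[n-1].
   Returns the raw waitpid status of the last stage, or -1 on error.
   On error, already-started children are not cleaned up (sketch only). */
int run_pipeline(char *const *stages[], int n) {
  pid_t pids[MAX_STAGES];
  int fds[2];
  int prev_read = -1; /* read end of the pipe that feeds the next stage */
  int i, status = -1, last_status = -1;

  assert(n >= 1 && n <= MAX_STAGES);

  for (i = 0; i < n; i++) {
    int has_next = (i < n - 1);

    if (has_next && pipe(fds) < 0)
      return -1;

    pids[i] = fork();
    if (pids[i] < 0)
      return -1;

    if (pids[i] == 0) {
      if (prev_read != -1) { /* not the first stage: read from the pipe */
        dup2(prev_read, STDIN_FILENO);
        close(prev_read);
      }
      if (has_next) { /* not the last stage: write to the new pipe */
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
      }
      execv(stages[i][0], stages[i]);
      _exit(127); /* reached only if execv failed */
    }

    /* Parent: close every pipe end it no longer needs. A write end left
       open here would keep the reader from ever seeing end-of-file. */
    if (prev_read != -1)
      close(prev_read);
    if (has_next) {
      close(fds[1]);
      prev_read = fds[0];
    }
  }

  /* An and-commands group completes when all of its commands complete. */
  for (i = 0; i < n; i++) {
    if (waitpid(pids[i], &status, 0) < 0)
      return -1;
    if (i == n - 1)
      last_status = status;
  }
  return last_status;
}
```

The first stage keeps whoosh's input and the last stage keeps whoosh's output simply because nothing is redirected for them.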
Each ‹group› in a whoosh program runs to completion before the next group is started. The definition of “completion” depends on the ‹group› form:
A ‹command› completes when the single process for the ‹command› terminates, either by return/exit or by a signal. In a version of whoosh that supports background tasks, a ‹command› also completes if it is suspended via SIGTSTP.
An ‹or-commands› completes when any one of the ‹command›s completes (in the same sense as a ‹command› by itself). As soon as one command completes, processes for not-yet-completed commands are terminated using SIGTERM. For the purposes of this assignment, assume that SIGTERM will always terminate a process.
An ‹and-commands› completes when all of the ‹command›s complete.
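The ‹or-commands› rule above (first to complete wins, the rest get SIGTERM) might look like the sketch below. It is only an illustration: run_race and MAX_COMMANDS are hypothetical names, and the sketch assumes that the calling process has no other children, since it waits for "any child" with waitpid(-1, ...).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_COMMANDS 16 /* arbitrary limit for this sketch */

/* Hypothetical helper: starts all n commands, waits for the first one to
   terminate, and sends SIGTERM to the others. Stores the index of the
   first finisher in *winner and returns its raw waitpid status.
   Returns -1 on error (already-started children are not cleaned up). */
int run_race(char *const *commands[], int n, int *winner) {
  pid_t pids[MAX_COMMANDS];
  pid_t first;
  int i, status, ignored;

  assert(n >= 1 && n <= MAX_COMMANDS && winner != NULL);

  for (i = 0; i < n; i++) {
    pids[i] = fork();
    if (pids[i] < 0)
      return -1;
    if (pids[i] == 0) {
      execv(commands[i][0], commands[i]);
      _exit(127); /* reached only if execv failed */
    }
  }

  /* Assumption: these are the only children of this process. */
  first = waitpid(-1, &status, 0);
  if (first < 0)
    return -1;

  for (i = 0; i < n; i++) {
    if (pids[i] == first) {
      *winner = i;
    } else {
      kill(pids[i], SIGTERM);        /* harmless if it already exited */
      waitpid(pids[i], &ignored, 0); /* reap it so no zombie remains */
    }
  }
  return status;
}
```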
If whoosh receives SIGINT, such as when Ctl-C is pressed, then it immediately terminates all processes for the current ‹group› using SIGTERM and moves on to the next ‹group› (if any).
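One way to get this behavior without polling is to keep SIGINT and SIGCHLD blocked except while waiting inside sigsuspend. The sketch below shows the idea for a single command. It is not the reference solution: run_interruptible and the handler names are hypothetical, and it uses sigaction and the plain system calls instead of the "csapp.c" wrappers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigint = 0;

static void on_sigint(int sig) { (void)sig; got_sigint = 1; }
static void on_sigchld(int sig) { (void)sig; } /* only wakes sigsuspend */

/* Hypothetical helper: runs argv[0] and waits for it. If SIGINT arrives
   first, the child is sent SIGTERM and *interrupted is set to 1.
   Returns the raw waitpid status, or -1 on error. */
int run_interruptible(char *const argv[], int *interrupted) {
  struct sigaction action;
  sigset_t blocked, previous, while_waiting;
  pid_t pid, reaped = 0;
  int status = -1;

  assert(argv != NULL && argv[0] != NULL && interrupted != NULL);
  *interrupted = 0;

  sigemptyset(&action.sa_mask);
  action.sa_flags = SA_RESTART;
  action.sa_handler = on_sigint;
  sigaction(SIGINT, &action, NULL);
  action.sa_handler = on_sigchld;
  sigaction(SIGCHLD, &action, NULL);

  /* Block both signals so that neither can slip in between checking
     the flag and calling sigsuspend. */
  sigemptyset(&blocked);
  sigaddset(&blocked, SIGINT);
  sigaddset(&blocked, SIGCHLD);
  sigprocmask(SIG_BLOCK, &blocked, &previous);
  got_sigint = 0;

  /* Mask used while waiting: the old mask with both signals unblocked. */
  while_waiting = previous;
  sigdelset(&while_waiting, SIGINT);
  sigdelset(&while_waiting, SIGCHLD);

  pid = fork();
  if (pid < 0) {
    sigprocmask(SIG_SETMASK, &previous, NULL);
    return -1;
  }
  if (pid == 0) {
    setpgid(0, 0); /* keep the terminal's Ctl-C away from the child */
    sigprocmask(SIG_SETMASK, &previous, NULL);
    execv(argv[0], argv);
    _exit(127); /* reached only if execv failed */
  }

  while (reaped == 0) {
    if (got_sigint) {
      got_sigint = 0;
      *interrupted = 1;
      kill(pid, SIGTERM);
    }
    reaped = waitpid(pid, &status, WNOHANG);
    if (reaped == 0)
      sigsuspend(&while_waiting); /* atomically unblock and wait */
  }

  sigprocmask(SIG_SETMASK, &previous, NULL);
  return reaped < 0 ? -1 : status;
}
```

Because the signals are unblocked only inside sigsuspend, a Ctl-C that arrives just after the flag check is still pending when sigsuspend starts and wakes it immediately.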
A ‹group› that starts repeat ‹n› is the same as ‹n› lines that contain the ‹group› without the repeat ‹n› prefix.
A ‹group› line that starts ‹variable› = causes the exit status of the completion-determining ‹command› in the ‹group› to be assigned to ‹variable›. For a ‹group› that is a single ‹command› or an ‹and-commands›, the exit status is used from the last ‹command› in the ‹group›; for a ‹group› that is an ‹or-commands›, the exit status is used from the first ‹command› to terminate (or any of the first terminating ‹command›s if more than one terminates at the same time). The value installed into ‹variable› should be the exit code if the process terminates via return/exit or the negation of the relevant signal number if a ‹command› terminates or suspends due to a signal.
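The mapping from a wait status to the value stored in ‹variable› can be sketched with the standard status macros. The function name status_to_value is hypothetical, and the sketch assumes that the status came from waitpid called with at most the WUNTRACED option.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/wait.h>

/* Hypothetical helper: converts a raw waitpid status into the value that
   whoosh stores in a variable. Assumes the status was produced by
   waitpid with no options or with WUNTRACED (never WCONTINUED). */
int status_to_value(int status) {
  if (WIFEXITED(status))
    return WEXITSTATUS(status); /* return/exit: the exit code */
  if (WIFSIGNALED(status))
    return -WTERMSIG(status);   /* terminated by a signal */
  /* Under the stated assumption, the only remaining case is a stop. */
  assert(WIFSTOPPED(status));
  return -WSTOPSIG(status);     /* suspended by a signal */
}
```

With this mapping, a process that calls exit(42) produces 42, and a process terminated by SIGTERM (signal number 15 on Linux) produces -15.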
When a ‹command› ends with @ ‹variable›, then ‹variable› is set to the ID of the process created to run ‹command›, and the variable is set immediately when that process is started. Processes for ‹command›s in a ‹group› are started left-to-right for the purpose of setting and using variables, but they are started “at once” relative to any other scripting event.
In a whoosh variant that supports background tasks, a command that completes by being suspended becomes a background task. The whoosh interpreter pauses after all ‹group›s in a script have completed, waiting until all background tasks have also terminated. (Hopefully, some other ‹command› along the way resumes the task with a SIGCONT.) While waiting on background tasks to terminate after all ‹group›s have completed, a SIGINT/Ctl-C causes all remaining background tasks to be terminated with SIGTERM.
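Detecting suspension relies on the WUNTRACED option of waitpid, which makes a stopped child count as "completed" for the current ‹group›. The sketch below illustrates that, plus the final wait and the SIGTERM cleanup. All function names are hypothetical, and a real implementation would track many tasks and react to signals instead of blocking in waitpid.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts argv[0] and returns the child's pid. */
pid_t spawn(char *const argv[]) {
  pid_t pid;

  assert(argv != NULL && argv[0] != NULL);
  pid = fork();
  if (pid == 0) {
    execv(argv[0], argv);
    _exit(127); /* reached only if execv failed */
  }
  return pid;
}

/* Waits until the process terminates or is suspended. Sets *suspended
   to 1 if it stopped (it is now a background task), 0 otherwise.
   Returns the raw status, or -1 on error. */
int wait_for_completion(pid_t pid, int *suspended) {
  int status;

  assert(suspended != NULL);
  if (waitpid(pid, &status, WUNTRACED) < 0)
    return -1;
  *suspended = WIFSTOPPED(status) ? 1 : 0;
  return status;
}

/* After all groups have run: waits until a background task has really
   terminated. Without WUNTRACED, stops are not reported. */
int wait_for_termination(pid_t pid) {
  int status;

  if (waitpid(pid, &status, 0) < 0)
    return -1;
  return status;
}

/* Terminates a background task that may still be suspended. A stopped
   process does not act on SIGTERM until it is continued, so SIGCONT
   follows the SIGTERM. */
int terminate_task(pid_t pid) {
  kill(pid, SIGTERM);
  kill(pid, SIGCONT);
  return wait_for_termination(pid);
}
```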
Every ‹variable› used by a script is initialized to 0 when the script starts.
Example Scripts
If whoosh were used in practice, then most scripts would resemble the first few examples below, which just run programs and maybe pipe output from one to the other. The other examples are meant to probe the process-control issues, even though those issues would show up less frequently in practice.
/bin/ls -l
Lists the content of the current directory in long form.
# This is a comment
/bin/ls -l
/bin/date
Lists the content of the current directory in long form and then reports the current time and date.
/bin/ls -l | /usr/bin/wc -l
Pipes the current directory’s content to a program that counts lines.
/usr/bin/curl www.google.com || /usr/bin/curl www.bing.com
Gets the web page at www.bing.com or www.google.com, stopping when one of them is completely received and printed. Part or all of both may be printed.
$bash = /bin/bash -c "exit 42"
/bin/echo $bash
Prints 42, since the bash process exits with that value.
$result = /bin/bash -c "kill 0"
/bin/echo $result
Prints -15, since 0 as a process ID means “every process in the caller’s process group,” which includes the bash process itself, so the bash process is terminated by signal number 15 (SIGTERM).
/bin/echo hi @ $echoPid
/bin/echo $echoPid
Prints hi followed by the process ID used to run the first echo process (although that process ID is of no use, since echo has terminated).
/bin/sleep 1000 @ $sleep | /bin/kill $sleep
Ends quickly, because the sleep process is terminated by kill.
/bin/sleep 1000
/bin/sleep 1000
Prints nothing for at least 33 minutes and 20 seconds. Hitting Ctl-C once reduces the time to 16 minutes and 40 seconds. Hitting Ctl-C twice can end the script quickly.
(If you don’t implement Ctl-C behavior, then hitting Ctl-C once will likely terminate the script immediately.)
$result = /bin/sleep 3
$patience = /bin/test $result "<" 0
/bin/echo "Patience level =" $patience
Prints Patience level = 1, but only if you’re patient enough.
/bin/sleep 3 @ $sleep | /bin/kill -TSTP $sleep
/bin/echo done
When background tasks are supported, prints done and waits for Ctl-C.
/bin/sleep 3 @ $sleep | /bin/kill -TSTP $sleep
/bin/echo done
/bin/kill -CONT $sleep
When background tasks are supported, prints done and waits for 3 seconds (or Ctl-C).
If you create an interesting example script to test your implementation of whoosh, consider posting the example on the discussion forum.
Implementation
The shlab-handout.zip archive provides an initial whoosh implementation that works for the simplest example above. Your job is to change "whoosh.c" to implement the rest of the whoosh functionality. Within "whoosh.c", you can add functions, change function signatures, or do whatever else you need to implement new functionality.
You will hand in a single file, whoosh.c, which must use only ANSI standard C syntax, standard C libraries, Linux system libraries, and the "csapp.c" wrapper functions.
You will need to read "ast.h" to know how the whoosh parser represents programs, but you will not need to modify or understand "parse.c". Feel free to use the fail function provided by "fail.c".
Examples and Tests
The "scripts" directory of the unpacked archive includes three subdirectories: "basic", "ctl-c", and "full". The scripts in those directories correspond to the three specified levels of completion for this assignment: basic support, the addition of Ctl-C support, and the further addition of full support for background tasks.
For example, after unpacking the archive, you can use
$ make
$ ./whoosh scripts/basic/ls-l.whoosh
to try the first example.
The example scripts include (to varying degrees of precision) the expected output of each example. The expected-output information is in a format recognized by the "test.rkt" script. You can run the initial whoosh build on "ls-l.whoosh" as a test with
$ racket test.rkt scripts/basic/ls-l.whoosh
If you supply a directory to "test.rkt", then all files in the directory that end with ".whoosh" are run as tests:
$ racket test.rkt scripts/basic
The "test.rkt" script also accepts an optional --program option to specify a whoosh implementation other than "./whoosh".
Naturally, grading may test your implementation on more or different scripts.
Tips
Start by making a sequence of individual commands work, which involves adding fork.
For ‹and-commands›, you’ll need to use the Pipe and Dup2 wrapper functions. Don’t forget to close pipes appropriately after forking; otherwise, pipelines can get stuck.
To protect whoosh from a command that uses kill with 0 to send a signal to all processes in a group, you’ll need to use Setpgid.
When you start implementing Ctl-C handling, you’ll need functions like Signal, Sigprocmask, and sigsuspend. Note that Setpgid will prevent a Ctl-C that is intended for whoosh from being sent to child processes of whoosh, since a terminal will send Ctl-C to a process group. Remember that relevant signals will need to be blocked while a process is being created.
If a process is suspended on Linux, it won’t receive a SIGTERM until resumed with SIGCONT.
If your implementation includes a call to sleep, Sleep, pause, or Pause, then you’re doing it wrong.
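As an illustration of the Setpgid tip above, the following sketch places each child in its own process group, so that a command such as /bin/bash -c "kill 0" signals only itself and not whoosh. The name spawn_in_own_group is hypothetical, and the sketch calls setpgid directly instead of the "csapp.c" wrapper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts argv[0] in a new process group whose ID
   equals the child's pid. Returns the child's pid, or -1 on error. */
pid_t spawn_in_own_group(char *const argv[]) {
  pid_t pid;

  assert(argv != NULL && argv[0] != NULL);

  pid = fork();
  if (pid == 0) {
    setpgid(0, 0); /* child: leave the parent's process group */
    execv(argv[0], argv);
    _exit(127);    /* reached only if execv failed */
  }
  if (pid > 0) {
    /* Parent: make the same call so the group exists no matter which
       process runs first. This fails harmlessly if the child has
       already called execv. */
    setpgid(pid, pid);
  }
  return pid;
}
```

A child started this way also stops receiving the terminal's Ctl-C, which is why whoosh itself must forward termination with SIGTERM.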