CS 3210 - Lab7 - Signal Processing / Job Control

This lab is all about handling signals on a UNIX system and managing running jobs. The chapters corresponding to this lab are Chapters 13 and 14.

Signal Handling

Signal handling is discussed in your book in Chapter 13 on p. 235.

What Is A Signal?

Basically, a "signal" in Unix parlance is similar to a hardware interrupt. Signals can be thought of as "soft" interrupts because the kinds of events signals handle are generated by software rather than hardware. Some examples of situations that would send a signal are given in your book on p. 235.

A complete list of all signals can be found in the table on p. 248. An explanation of each is given on pgs. 247-251. Alternatively, typing kill -l (letter ell) will show you a listing of all the signals that can be sent from the "kill" userspace program ("man kill").

Why Do We Have Signals?

The reason why signals exist is so that programs can respond to important events such as an alarm (timer), a segmentation fault (the program is gonna crash), a hangup (someone logged off or got kicked off the system), or a floating point exception (e.g. division by zero). Again, for the full list, see your book on pgs. 247-251.

One example that I'll show in class is how Apache (Web server) is programmed to perform a graceful restart when it receives the USR1 signal.

The Easy Hack

The way signals were originally implemented on UNIX systems was straightforward: you install a callback function that will be called when a signal is sent to the program.

Example Program: easy-signal.c - This one just handles the HUP signal. You can send it a HUP by getting the process ID (using ps), then typing kill -HUP pid.

Here's an alternate program that handles all signals: handle-every-sig.c. Note that signal #9 (SIGKILL) cannot be handled. (Otherwise you would never be able to get rid of the process.)

The Better (And More Complicated) Way

The easy hack, while nice and simple, has a large problem: What do you do if, say, a SIGINT signal arrives at the very moment that the program is running the handler for SIGINT? Either the signal handler would have to be reentrant (discussed on p. 237) or the kernel should just throw away any signal received while a process is in the middle of handling that same signal (which is the default behavior). This is only a partial solution, however; what if you have a time-critical program that needs to know how many times a certain signal was received? Ignoring pending signals while you're in the middle of handling one will not give you what you want. Thus, POSIX signals were introduced. (See p. 240-241)

The price of improved signal handling is increased complexity, as you can see in your book on pages 241-247. I'm not going to repeat all the stuff they say in your book, so read up on it. Basically, you can handle a signal and explicitly block which signals you don't want to receive while you're in the middle of handling another. This helps prevent race conditions. Further info on preventing race conditions is described in your book on pgs. 244-246 (under section 13.2.4 - "Manipulating a Process's Signal Mask").

Example Program: sighup.c - This is one from your book. It shows how log rotation could be gracefully performed by handling the HUP signal, and it uses the sigaction() syntax. The good news is that the syntax is still pretty concise for the common case. Again, you can test this with kill -HUP pid.

And yes, the "Easy Hack" syntax is still supported for backward compatibility. (Although the man page recommends against using it.)

Job Control

Job control is discussed in your book in Chapter 14 on p. 257.

What Is Job Control?

Job Control is a feature of the OS (supported by the shell) that allows a single terminal to run multiple jobs. A job is defined as a group of one or more processes, occasionally having pipes ('|') or redirects (>, >>, <) in them.

The following list shows the major commands used in job control:

Example Program: sit-n-spin.c - You can use this program to test the job control features above.

Handling Job Signals

The good news is that handling job signals is done just like handling any other signal: you install a signal handler and there you go.

The two most important job control signals are stop (SIGTSTP) and continue (SIGCONT). Catching them lets you do any cleanup work before the process stops and any setup work before it resumes. A listing of a few others is found in your book on p. 258.

Example Program: monitor.c - Catches SIGTSTP and SIGCONT signals and displays a message when they are caught. We'll go over this in class.

Further Study

Please note that the following links are not required reading, simply additional reading if you want further explanation or a different point of view on these concepts.

Signals by Beej.

Job Control in Bash From the Bash Reference Manual

Your book, Chapters 13 and 14.

Lab 7

This will be a lab on signals. (Surprise!)

Pre-requisite: Make a lab7 Directory

Make a subdirectory called 'lab7' underneath the unix directory in your home dir. All of the files you make for this lab belong in the lab7 directory. Don't put them anywhere else.

Description / Requirements

For this lab, I want you to write a program that does the following:

Save this in a file called sigcounter.c. The whole file should be about 30 or so lines long (including whitespace and curlies). Be sure to have a Makefile as well.

Output

If you follow the directions above, you should get some output that looks like this:

My pid is: 18481
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, resetting the counter...
1, 2, 3, 4, 5, resetting the counter...
1, 2, 3, 4, 5, 6, 7, resetting the counter...
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 

All the "resetting the counter..." messages are generated by typing kill -HUP 18481 in a separate window.

A Note on Output

You will probably have code that looks like this:

	for (i = 1; i <= 20; i++ ) {
		printf("%i, ", i);
		sleep(1);
	}

If you have code like this, the program won't print anything until the counter hits 20, and then it will print all the numbers 1 through 20 at once at the end. This is not what you want. The problem occurs because the C library buffers output sent to stdout. When stdout is a terminal it is normally line-buffered, and this loop never prints a newline, so nothing gets flushed. To flush the buffer right away, add fflush() like so:

	for (i = 1; i <= 20; i++ ) {
		printf("%i, ", i);
		fflush(stdout);
		sleep(1);
	}

This is a gotcha that gets everyone.

Files

Here is a list of the files created in this lab.

Please turn in just the sigcounter.c file with your name, lab # and class # at the top.