This lab is all about how to match strings, either with globbing or with regular expressions.
The chapter correponding to this lab is Chapter 21.
The alert student will note that we are skipping chapters 19 and 20. Chapter 19 deals with Virtual Consoles, which are only usable at the console (not from an SSH / telnet session). Most people that need more than one console simply open up a new SSH session or use the X Window system. Chapter 20 talks about the Linux Console and explains some escape sequences that you can print to the screen which make text boldface, change colors, etc. We will cover all of that stuff next time when we talk about the Slang library.
As we have already learned in Chapter 11 (on pgs. 193-199) globbing can be used to perform quick wildcard matches. A list of valid globbing metacharacters is found on pgs. 407-408 in your book.
Example Program: fnmatch.c - This program takes one glob expression as an agrument and applies it to the lines read from stdin. To test this program type something like:
$ ls | ./fnmatch "*.c"Make sure you enclose the glob expression in quotes (single or double) otherwise the shell will try to expand it.
Regular Expressions were invented by Steven Kleene as a notation for describing "the algebra of regular sets". They have since become the standard way of matching patterns of text. Regulare Expressions (or "regexes" for short) contain much more powerful pattern matching capabilities than simple globbing. Numerous Unix programs use regexes, including grep, sed, awk, perl, and numerous others.
If you are unfamiliar with regular expressions, I highly recommend you read Using Regular Expressions by Stephen Ramsay. Numerous other regex resources are listed below in the Further Study section.
Use the incredibly useful grep program, passing it a regular expression and a file to search in like so:
$ grep "^q.*z$" /usr/share/dict/words quartz quiz
What that means is "find all the words that begin with 'q', end with 'z', and have anything else in the middle".
The following table gives a brief synopsis of regex metacharacters.
| Wildcard Matching | matches characters | 
|---|---|
| . | matches any one character | 
| [adf] | matches any one of the letters 'a', 'd', or 'f' | 
| [^adf] | matches any letter EXCEPT 'a', 'd', or 'f' | 
| Quantifiers | how many times to match a character | 
| ? | match previous character zero or one times | 
| * | match previous character zero to N times | 
| + | match previous character one to N times | 
| \{5\} | match previous character 5 times | 
| \{2, 6\} | match previous character at least 2 times but no more than 6 times | 
| Anchors | where it matches | 
| ^ | match expression at start of line | 
| $ | match expression at end of line | 
| \< | match expression at beginning of word | 
| \> | match expression at end of word | 
| \b | match expression at word boundary (beginning or end) | 
| \B | match expression in the middle of a word | 
These can be combined to form any expression you need. Examples.
| .* | match anything (any character, any number of times) | 
| v | match any string that has a 'v' | 
| [^v] | match any string that does not have a 'v' | 
| ^v | match any string that begins with a 'v' | 
| ^v.*e$ | match any string that begins with a 'v' and ends with an 'e' | 
| v{2,3} | match any string with two or three consecutive 'v's | 
Feel free to experiment by grepping through /usr/share/dict/words.
Four functions are involved in using regular expressions in C. They are listed in your book on p. 409. It basically goes like this:
The following example shows an application of those four steps.
Example Program: regmatch.c - A program that tests applying a regular expression to a sequence of strings. Experiment with this one by changing the regex, adding new strings, etc.
 Confusing Example Program: match.c - From
your book around pgs. 410-411. This is a very obtuse example of using regular
expressions. Simply put, it tests if a line begins with a # (meaning it's a
comment), but NOT a /#. It's not a terrific example.
  Further Study 
 
If you are unfamiliar with regular expressions, I recommend you look at the following pages:
Using Regular Expressions by Stephen Ramsay from the Electronic Text Center. A questions-and-answers approach to teaching regular expressions.
So What's A $#!%% Regular Expression, Anyway?! on DevShed.com. This is a fairly gentle introduction.
Regular Expressions on Linux-Learn.org. A straightfoward explanation of how to use regular expressions with application to grep and sed.
Regular Expressions from Zytrax.com. This is a "big picture" overview with a Web-centric approach.
Streams and sed -- The Stream Editor from rute.sourceforge.net. A sed tutorial that briefly explains regular expressions. Also contains some good information on piping and redirection in the shell.
Mastering Regular Expressions, 2nd Edition by O'Reilly and Associates.
sed & awk, 2nd Edition by O'Reilly and Associates.
Your book, Chapter 21
These programs are all on Gautama.
grep - Matches regular expressions in files or the output of other programs.
locate - See which files on the hard drive match a regular expression.
vi - In non-insert mode, you can type "/" followed by a string to search for a regular expression. You can also type ":s/regex/replace/" to search and replace text.
sed - The stream editor. Contains an odd little programming language.
awk - A strange little programming language.
perl - An even stranger big programming language with some nice regex extensions.
This will be a lab on regular expression matching.
For this lab, I want you to write a miniature version of grep. Your program will take one argument, a regular expression, and tries to match that regex with lines read from stdin.
Some steps to help you:
Save this in a file called minigrep.c. The whole file should be about 80 or so lines long, but you will be able to re-use a lot of the code from the previous examples. It shouldn't take you too long. Be sure to have a Makefile as well. Put both files in a lab11/ directory.
Calling the program like this from the command line:
$ ./minigrep "^f.*z.*e$" < /usr/share/dict/wordsshould produce the following output:
familiarize fantasize faze fertilize finalize fizzle formalize frazzle freeze frieze frozeAnother example: calling the program like this:
$ ./minigrep "^pu[zl].*e$" < /usr/share/dict/wordsshould produce the following output:
pulsate pulse puzzle
You can test if your program is working right by comparing the output of your program with the output of "grep".
Here is a list of the files created in this lab.
Please turn in just the minigrep.c file with your name, lab # and class # at the top.