CS 3210 - Lab 9 - Networking with Sockets

Today we'll talk all about how networking is done in UNIX. The chapter corresponding to this lab is Chapter 16.

The fundamental mechanisms for networking machines in Unix have existed for many years; the Unix way of networking has been adopted by numerous other operating systems and has served as the basis for the Internet we know today.

Please note that we are not going to cover networking in extensive detail, just touch on the high points. Whole books have been written about the topic.

As always, the example programs we will be looking at are found in /tmp/lab9. To copy them into your own directory, type: cp /tmp/lab9/* . (don't forget the final dot). I have included a Makefile for your benefit.

First, Some Theory

Ideal vs. Real

Ideally, a network connection would always reach its intended destination and data would arrive in exactly the same order that it was sent. In practice, this is not the case. Data must be split up into packets, and each router forwards them to another router (called the next hop) that will get the packets closer to their final destination. The example in section 16.1.2 on pg. 321 in your book spells it out nicely. You can also use the traceroute program to observe how packets travel from hop to hop.

Because the packets can become disordered, the sending computer must tag the packets with sequence numbers. If packets arrive out of order (e.g. packet 5 is received before packet 3) the destination computer must re-sequence the packets into the same order that they were sent. This usually means holding the out-of-order packets in a buffer until the missing ones arrive. (The term TCP/IP stack does not come from this buffer, by the way; it refers to the protocols being layered one on top of another.) Likewise, if packets get lost ("dropped") along the way, they must be sent again: in TCP the destination computer acknowledges what it has received, and the sending computer re-sends anything that goes unacknowledged. This is known as error control. Page 322 in your book spells out some useful terms and explanations.

Two Types of Connections

There are actually two types of network conversations that can take place:

Connection-oriented or stream protocols, such as TCP/IP, where each packet is sequenced and acknowledged. This type of connection is more reliable, but incurs more overhead (network traffic). You can see all the open TCP ports on a Unix host by typing nmap localhost or netstat -atpl.
Connectionless or datagram protocols, such as UDP/IP, where packets are neither sequenced nor acknowledged. This type of connection is less reliable, but does not consume as much bandwidth. Most games (Everquest, etc.) use UDP to communicate. You can see all the open UDP ports on a Unix host by typing nmap -sU localhost (must be root) or netstat -aupl.

We will be dealing only with connection-oriented (stream) protocols in this lab.

Getting from Here to There: Hosts and Ports

It looks like we are going to have to talk about the OSI model after all. Oh well.

To get to a remote host, you must use a number that uniquely identifies the host, called an IP address (p. 342); this is the "network" layer in the OSI model. Human beings remember names better than numbers, so a name-to-address service called Domain Name System (or DNS) exists to resolve names to numbers.

Example Program: lookup.c - This program will look up the IP address of a hostname passed on the command line. (Try passing this program a hostname with more than one IP address, such as www.yahoo.com.) You can also use the "host" program on gautama to look up addresses by hostname.

However, there is an additional problem: one host could be running numerous services (or daemons) so you also need to specify a port number (p. 342) to indicate which service you want to access. Port numbers function at the transport layer in the OSI model.

Example Program: services.c - This program takes the name of a service (such as ftp, http, etc.) on the command line and reports what port it runs on. This information comes from the file /etc/services. (Try running this with a service with aliases, such as "www".)

As above, the nmap / netstat programs can show you which services are running. You can use telnet or netcat (nc) to connect to a host at a particular port.

Sockets in Practice

The basic syscalls that are used in creating a client-side and server-side connection are illustrated very clearly in Figure 16.1 on pg. 328 in your book.

There are two ways to use sockets on a Unix system: Domain (single host) and Internet (multi-host).

Unix Domain Sockets

The first application of sockets is to allow two processes to communicate on a single host. This might seem a little odd because we already have pipes, which accomplish the same thing. The advantage of using sockets is that they allow bi-directional communication, whereas a pipe carries data in one direction only. Another advantage is that they allow n-way communication: with pipes you have one sender and one receiver; with sockets, each connected process could be a sender or a receiver. One could use Domain Sockets to make a multi-user chat program.

When a domain socket is created, a file is created in the filesystem which programs can use for reading and writing ("everything is a file"). Note that you cannot "cat" this file, though; you must use the socket API to read from it.

Example Program: userver.c and uclient.c - This pair of programs shows how to use sockets to communicate on a single machine. Note that the output of nmap / netstat does not change when the server runs, because the socket created is a file. (Note also that userver.c only listens for a single connection. You could write a forking server, or use select or poll to handle more than one connection.)

Passing File Descriptors

Another point worth noting is that Domain Sockets can be used to pass file descriptors between two processes. This is a rather unusual capability, something you don't normally see.

Example Program: passfd.c - Yet another re-implementation of 'cat' in which the program forks and the child process opens a file descriptor which it then passes to the parent for reading.

TCP/IP Networking

The second, and more commonly seen, application of sockets is to create network connections between two hosts. In order to accomplish this, you need a hostname (or IP address) and a port, which will connect you to a service. For the basics on both sides of the connection, I refer you to Figure 16.1 on p. 328.

Example Program: tserver.c and tclient.c - This pair of programs shows how to use sockets to communicate between two hosts over the network.

Further Study

Beej's Guide to Network Programming

Programming UNIX Sockets in C - Frequently Asked Questions (There's a link to a tarball of sample source code, too.)

Books: UNIX Network Programming and UNIX Network Programming, Volume 2: Interprocess Communications by W. Richard Stevens. Warning: Not what I would call light reading.

Your book, Chapter 16

Example programs on Gautama: telnet, nc, nmap, netstat, ping, fping, bing, traceroute, ifconfig, host

Lab 9

Description / Requirements

For this lab, I want you to write a program that connects to the chargen daemon on gautama (or any other host) and reads one line.

Have the program take a hostname on the command line and resolve it. If the argument parses as a dotted-decimal IP address (inet_aton), you can look the host up by address (gethostbyaddr); otherwise look it up by name via DNS (gethostbyname).
Create an Internet TCP socket (socket, PF_INET, SOCK_STREAM).
Tell the socket to look at port 19 (chargen - type "nmap localhost" to see).
Connect to the host (connect).
Read one line from the socket (i.e. until you read a newline) and print it to stdout.
Close the socket and exit.

Save this in a file called cgclient.c. The whole file should be about 60 or so lines long. Be sure to have a Makefile as well.

Output

If you follow the directions above, you should get some output that looks like this:

!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefg

If your output looks like that, you probably did the program correctly.

Tip

Don't use the copyData() function included in sockutil.c. Instead, just use the low-level read/write syscalls like so:

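char ch;
...
/* Read one byte at a time until a newline, end of file, or an error. */
while (read(sock, &ch, sizeof(ch)) > 0 && ch != '\n') {
	write(STDOUT_FILENO, &ch, 1);
}
/* Finish the line with write() too, so buffered and unbuffered output never mix. */
write(STDOUT_FILENO, "\n", 1);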

Files

Here is a list of the files created in this lab.

Makefile
cgclient.c

Please turn in just the cgclient.c file with your name, lab # and class # at the top.