Chapter 11. Sockets and Networking: There’s no place like 127.0.0.1


Programs on different machines need to talk to each other.

You’ve learned how to use I/O to communicate with files and how processes on the same machine can communicate with each other. Now you’re going to reach out to the rest of the world, and learn how to write C programs that can talk to other programs across the network and across the world. By the end of this chapter, you’ll be able to create programs that behave as servers and programs that behave as clients.

The Internet knock-knock server

C is used to write most of the low-level networking code on the Internet. Most networked applications need two separate programs: a server and a client.

You’re going to build a server in C that tells jokes over the Internet. You’ll be able to start the server on one machine like this:

image with no caption

Other than telling you it’s running, the server won’t display anything else on the screen. However, if you open a second console, you’ll be able to connect to the server using a client program called telnet. Telnet takes two parameters: the address of the server, and the port the server is running on. If you are running telnet on the same machine as the server, you can use 127.0.0.1 for the address:

image with no caption

Watch it!

You’ll be using telnet quite a lot in this chapter to test your server code.

If you try to use the built-in Windows telnet, you might have problems because of the way it communicates with the network. If you install the Cygwin version of telnet, you should be fine.

Do this!

You will need a telnet program in order to connect to the server. Most systems come with telnet already installed. You can check that you have telnet by typing:

telnet

on the command line.

If you don’t have telnet, you can install it in one of these ways:

Cygwin:

Run the setup.exe program for Cygwin and search for telnet.

Linux:

Search for telnet in your package manager. On many systems, the package manager is called Synaptic.

Mac:

If you don’t have telnet, you can install it from www.macports.org or www.finkproject.org.

Knock-knock server overview

The server will be able to talk to several clients at once. The client and the server will have a structured conversation called a protocol. There are different protocols used on the Internet. Some of them are low-level protocols, like the internet protocol (IP), which are used to control how binary 1s and 0s are sent around the Internet. Other protocols are high-level protocols, like the hypertext transfer protocol (HTTP), which controls how web browsers talk to web servers. The joke server is going to use a custom high-level protocol called the Internet knock-knock protocol (IKKP).

A protocol is a structured conversation.

image with no caption

The client and the server will exchange messages like this:

image with no caption

A protocol always has a strict set of rules. As long as the client and the server both follow those rules, everything is fine. But if one of them breaks the rules, the conversation usually stops pretty abruptly.

image with no caption

BLAB: how servers talk to the Internet

When C programs need to talk to the outside world, they use data streams to read and write bytes. You’ve used data streams that are connected to files or to the Standard Input and Output. But if you’re going to write a program to talk to the network, you need a new kind of data stream called a socket.

image with no caption

Before a server can use a socket to talk to a client program, it needs to go through four stages that you can remember with the acronym BLAB: Bind, Listen, Accept, Begin.

Bind to a port.

Listen.

Accept a connection.

Begin talking.

1. Bind to a port

A computer might need to run several server programs at once. It might be sending out web pages, posting email, and running a chat server all at the same time. To prevent the different conversations from getting confused, each server uses a different port. A port is just like a channel on a TV. Different ports are used for different network services, just like different channels are used for different content.

When a server starts up, it needs to tell the operating system which port it’s going to use. This is called binding the port. The knock-knock server is going to use port 30000, and to bind it you’ll need two things: the socket descriptor and a socket name. A socket name is just a struct that means “Internet port 30000.”

image with no caption
image with no caption
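Here’s a minimal sketch of the bind step, assuming an IPv4 TCP socket on a Unix-style system. The function names open_listener_socket() and bind_to_port() are illustrative choices, and the socket name is built from the standard struct sockaddr_in:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Create an Internet streaming (TCP) socket.
   Returns the socket descriptor, or -1 on failure. */
int open_listener_socket(void)
{
    return socket(PF_INET, SOCK_STREAM, 0);
}

/* Bind the socket to `port` on every local address.
   Returns 0 on success, or -1 (with errno set) on failure. */
int bind_to_port(int socket_d, int port)
{
    struct sockaddr_in name;              /* the "socket name" */
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons((unsigned short)port);  /* network byte order */
    name.sin_addr.s_addr = htonl(INADDR_ANY);
    return bind(socket_d, (struct sockaddr *)&name, sizeof(name));
}
```

For the knock-knock server, you would call bind_to_port(listener_d, 30000) on the descriptor returned by open_listener_socket(), and check that the result isn’t -1.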

2. Listen

If your server becomes popular, you’ll probably get lots of clients connecting to it at once. Would you like the clients to wait in a queue for a connection? The listen() system call tells the operating system how long you want the queue to be:

image with no caption

Calling listen() with a queue length of 10 means that roughly 10 clients can be waiting to connect to the server at once. They won’t all be answered immediately, but they’ll be able to wait. A client that arrives when the queue is full may be told the server is too busy, although the exact limit and behavior depend on the operating system.

image with no caption
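As a rough sketch, the listen step might be wrapped like this. The name start_listening() is hypothetical, and the function assumes the descriptor it’s given has already been bound to a port:

```c
#include <stdio.h>
#include <sys/socket.h>

/* Ask the operating system to queue pending connections on a bound
   socket. `queue_length` is the requested length of that queue.
   Returns 0 on success, or -1 after printing the reason. */
int start_listening(int listener_d, int queue_length)
{
    if (listen(listener_d, queue_length) == -1) {
        perror("Can't listen");
        return -1;
    }
    return 0;
}
```

A call such as start_listening(listener_d, 10) asks for a queue of 10.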

3. Accept a connection

Once you’ve bound a port and set up a listen queue, you then just have to...wait. Servers spend most of their lives waiting for clients to contact them. The accept() system call waits until a client contacts the server, and then it returns a second socket descriptor that you can use to hold a conversation on.

image with no caption

This new connection descriptor (connect_d) is the one that the server will use to...

Begin talking.

Brain Barbell

Why do you think the accept() system call creates the descriptor for a new socket? Why don’t servers just use the socket they created to listen to the port?

A socket’s not your typical data stream

So far, data streams have all been the same. Whether you’re connected to files or Standard Input/Output, you’ve been able to use functions like fprintf() and fscanf() to talk to them. But sockets are a little different. A socket is two-way: it can be used for both input and output. That means it needs different functions to talk to it.

If you want to output data on a socket, you can’t use fprintf(). Instead, you use a function called send():

image with no caption

Remember: it’s important to always check the return value of system calls like send(). Network errors are really common, and your servers will have to cope with them.
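A minimal sketch of sending a string with an error check might look like this. The helper name say() and its error message are illustrative assumptions:

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Send the string `s` on a connected socket and check the result.
   Returns the number of bytes sent, or -1 on error. */
int say(int socket_d, const char *s)
{
    int result = (int)send(socket_d, s, strlen(s), 0);
    if (result == -1)
        perror("Error talking to the client");
    return result;
}
```

For short messages, send() normally writes everything in one call. For large buffers it can send fewer bytes than you asked for, so a more careful version would loop until the whole string has gone out.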

Geek Bits

What port should I use?

You need to be careful when you choose a port number for a server application. There are lots of different servers available, and you need to make sure you don’t use a port number that’s normally used for some other program. On Cygwin and most Unix-style machines, you’ll find a file called /etc/services that lists the ports used by most of the common servers. When you choose a port, make sure there isn’t another application that already uses the same one.

Port numbers can be between 0 and 65535, and you need to decide whether you want to use a low number (< 1024) or a high one. Port numbers that are lower than 1024 are usually only available to the superuser or administrator on most systems. This is because the low port numbers are reserved for well-known services, like web servers and email servers. Operating systems restrict these ports to administrators only, to prevent ordinary users from starting unwanted services.

Most of the time, you’ll probably want to use a port number of 1024 or higher.

Sometimes the server doesn’t start properly

image with no caption
image with no caption
image with no caption

The server looks like it’s starting correctly the second time, but the client can’t get any response from it. Why is that?

Remember that the code was written without any error checking. Let’s add a little error check into the code and see if we can figure out what’s happening.

Why your mom always told you to check for errors

If you add an error check on the line that binds the socket to a port:

image with no caption

Then you’ll get a little more information from the server if it is stopped and restarted quickly:

image with no caption

If the server has responded to a client and then gets stopped and restarted, the bind() system call fails. But because the original version of the program never checked for errors, the rest of the server code ran even though it couldn’t use the server port.

Bound ports are sticky

When you bind a socket to a port, the operating system will prevent anything else from rebinding to it for the next 30 seconds or so, and that includes the program that bound the port in the first place. To get around the problem, you just need to set an option on the socket before you bind it:

ALWAYS check for errors on system calls.

image with no caption

This code makes the socket reuse the port when it’s bound. That means you can stop and restart the server and there will be no errors when you bind the port a second time.
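As a sketch, the option in question is SO_REUSEADDR, set with the standard setsockopt() call. The wrapper name allow_port_reuse() is an assumption for illustration:

```c
#include <stdio.h>
#include <sys/socket.h>

/* Mark the socket so its port can be bound again straight after a
   restart. Call this BEFORE bind().
   Returns 0 on success, or -1 after printing the reason. */
int allow_port_reuse(int socket_d)
{
    int reuse = 1;   /* any nonzero value switches the option on */
    if (setsockopt(socket_d, SOL_SOCKET, SO_REUSEADDR,
                   &reuse, sizeof(reuse)) == -1) {
        perror("Can't set the reuse option on the socket");
        return -1;
    }
    return 0;
}
```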

Reading from the client

You’ve learned how to send data to the client, but what about reading from the client? In the same way that sockets have a special send() function to write data, they also have a recv() function to read data.

<bytes read> = recv(<descriptor>, <buffer>, <bytes to read>, 0);

If someone types a line of text into a client and hits return, the recv() function stores the text in a character array like this:

image with no caption

There are a few things to remember:

  • The characters are not terminated with a \0 character.

  • When someone types text in telnet, the string always ends with \r\n.

  • The recv() function returns the number of characters received, or –1 if there’s an error, or 0 if the client has closed the connection.

  • You’re not guaranteed to receive all the characters in a single call to recv().

This last point is important. It means you might have to call recv() more than once:

image with no caption

That means recv() can be tricky to use. It’s best to wrap recv() in a function that stores a simple \0-terminated string in the array it’s given. Something like this:

image with no caption
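Here is one possible sketch of such a wrapper. It is not necessarily the same as the book’s read_in(): this version assumes it’s acceptable to read one character per recv() call, which is simple and never reads past the end of the line, but is slower than reading in bigger chunks.

```c
#include <sys/socket.h>

/* Read one line from `socket_d` into `buf`, which holds `len` bytes.
   The trailing "\r\n" is dropped and the result is \0-terminated.
   Returns the length of the string, or -1 on error. If the client
   closes the connection, whatever arrived so far is returned. */
int read_in(int socket_d, char *buf, int len)
{
    int count = 0;
    if (len <= 0)
        return -1;
    while (count < len - 1) {          /* leave room for the \0 */
        char ch;
        int c = (int)recv(socket_d, &ch, 1, 0);
        if (c < 0)
            return -1;                 /* network error */
        if (c == 0 || ch == '\n')
            break;                     /* client hung up, or end of line */
        buf[count++] = ch;
    }
    if (count > 0 && buf[count - 1] == '\r')
        count--;                       /* strip the carriage return */
    buf[count] = '\0';
    return count;
}
```

If a line is longer than the buffer, this sketch returns the first len - 1 characters and leaves the rest waiting on the socket for the next call.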

Go Off Piste

This is one way of simplifying recv(), but could you do better? Why not write your own version of read_in() and let us know at headfirstlabs.com.

image with no caption

Ready-Bake Code

Here are some other functions that are useful when you are writing a server. Do you understand how each of them works?

image with no caption

Now that you have a set of server functions, let’s try them out...

The server can only talk to one person at a time

There’s a problem with the current server code. Imagine someone connects to it and he is a little slow with his responses:

image with no caption

Then, if someone else tries to get through to the server, she can’t; it’s busy with the first guy:

image with no caption

The problem is that the server is still busy talking to the first guy. The main server socket will keep the client waiting until the server calls the accept() system call again. But because of the guy already connected, it will be some time before that happens.

Brain Power

The server can’t respond to the second user, because it is busy dealing with the first. What have you learned that might help you deal with both clients at once?

You can fork() a process for each client

When the clients connect to the server, they start to have a conversation on a separate, newly created socket. That means the main server socket is free to go and find another client. So let’s do that.

When a client connects, you can fork() a separate child process to deal with the conversation between the server and the client.

image with no caption

While the client is talking to the child process, the server’s parent process can go connect to the next client.

image with no caption

The parent and child use different sockets

One thing to bear in mind is that the parent server process will only need to use the main listener socket. That’s because the main listener socket is the one that’s used to accept() new connections. On the other hand, the child process will only ever need to deal with the secondary socket that gets created by the accept() call. That means once the parent has forked the child, the parent can close the secondary socket and the child can close the main listener socket.

image with no caption

Writing a web client

What if you want to write your own client program? Is it really that different from a server? To see the similarities and differences, you’re going to write a web client for the hypertext transfer protocol (HTTP).

HTTP is a lot like the Internet knock-knock protocol you coded earlier. All protocols are structured conversations. Every time a web client and server talk, they say the same kind of things. Open telnet and see how to download http://en.wikipedia.org/wiki/O'Reilly_Media.

Do this!

image with no caption

When your program connects to the web server, it will need to send at least three things:

Note

Most web clients actually send a lot more information, but you’ll just send the minimum amount.

  • A GET command

    GET /wiki/O'Reilly_Media HTTP/1.1
  • The hostname

    Host: en.wikipedia.org
  • A blank line
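Those three pieces can be assembled into a single string before sending. Here’s a small sketch, assuming a hypothetical helper called build_get_request(). Each line of an HTTP request ends with \r\n, so the blank line is just one more \r\n:

```c
#include <stdio.h>

/* Write a minimal HTTP/1.1 GET request for `path` on `host` into `buf`.
   Returns the length of the request, or -1 if it doesn't fit. */
int build_get_request(char *buf, size_t size,
                      const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"   /* the GET command */
                     "Host: %s\r\n"          /* the hostname */
                     "\r\n",                 /* the blank line */
                     path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For the page above, you would pass "en.wikipedia.org" as the host and "/wiki/O'Reilly_Media" as the path.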

But before you can send any data at all to the server, you need to make a connection from the client. How do you do that?

Clients are in charge

Clients and servers communicate using sockets, but the way that each gets hold of a socket is a little different. You’ve already seen that servers use the BLAB sequence:

  1. Bind a port.

  2. Listen.

  3. Accept a conversation.

  4. Begin talking.

A server spends most of its life waiting for a fresh connection from a client. Until a client connects, a server really can’t do anything. Clients don’t have that problem. A client can connect and start talking to a server whenever it likes. This is the sequence for a client:

  1. Connect to a remote port.

  2. Begin talking.

image with no caption

Remote ports and IP addresses

When a server connects to the network, it just has to decide which port it’s going to use. But clients need to know a little more: they need to know the port of the remote server, but they also need to know its internet protocol (IP) address:

image with no caption

Internet addresses are kind of hard to remember, which is why most of the time human beings use domain names. A domain name is just an easier-to-remember piece of text like:

www.oreilly.com

Even though human beings prefer domain names, the actual packets of information that flow across the network only use the numeric IP address.

Create a socket for an IP address

Once your client knows the address and port number of the server, it can create a client socket. Client sockets and server sockets are created the same way:

image with no caption

The difference between client and server code is what they do with sockets once they’re created. A server will bind the socket to a local port, but a client will connect the socket to a remote port:

image with no caption
image with no caption

The above code works only for numeric IP addresses.
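A minimal sketch of connecting to a numeric address, assuming IPv4, might look like this. The function name connect_to_ip() is hypothetical; inet_pton() is the standard call for converting a dotted address string into binary form:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect to a numeric IPv4 address such as "127.0.0.1".
   Returns the connected socket, or -1 on failure. */
int connect_to_ip(const char *ip, int port)
{
    struct sockaddr_in si;
    memset(&si, 0, sizeof(si));
    si.sin_family = AF_INET;
    si.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &si.sin_addr) != 1) {
        fprintf(stderr, "Not a numeric IPv4 address: %s\n", ip);
        return -1;
    }
    int s = socket(PF_INET, SOCK_STREAM, 0);
    if (s == -1) {
        perror("Can't open socket");
        return -1;
    }
    if (connect(s, (struct sockaddr *)&si, sizeof(si)) == -1) {
        perror("Can't connect to the server");
        close(s);
        return -1;
    }
    return s;
}
```

Passing a domain name such as "www.oreilly.com" to this function fails at the inet_pton() step, because it never consults the domain name system.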

To connect a socket to a remote domain name, you’ll need a function called getaddrinfo().

getaddrinfo() gets addresses for domains

The domain name system is a huge address book. It’s a way of converting a domain name like www.oreilly.com into the kinds of numeric IP addresses that computers need to address the packets of information they send across the network.

image with no caption

Create a socket for a domain name

Most of the time, you’ll want your client code to use DNS to create sockets. That way, your users won’t have to look up the IP addresses themselves. To use DNS, you need to construct your client sockets in a slightly different way:

image with no caption

The getaddrinfo() function constructs a new data structure on the heap called a naming resource. The naming resource represents a port on a server with a given domain name. Hidden away inside the naming resource is the IP address that the computer will need. Sometimes very large domains can have several IP addresses, but the code here will simply pick one of them. You can then use the naming resource to create a socket.

image with no caption

Finally, you can connect to the remote socket. Because the naming resource was created on the heap, you’ll need to tidy it away with a function called freeaddrinfo():

image with no caption
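Putting the three steps together, a sketch of the whole sequence might look like this. The function name open_socket() is an assumption. Unlike the simpler approach of picking a single address, this version walks the list that getaddrinfo() returns and uses the first address that accepts a connection:

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Look up `host` and `port` (for example "www.oreilly.com" and "80")
   and connect to the first address that works.
   Returns the connected socket, or -1 on failure. */
int open_socket(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */

    int rc = getaddrinfo(host, port, &hints, &res);
    if (rc != 0) {
        fprintf(stderr, "Can't resolve %s: %s\n", host, gai_strerror(rc));
        return -1;
    }
    int s = -1;
    for (p = res; p != NULL; p = p->ai_next) {
        s = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (s == -1)
            continue;                  /* try the next address */
        if (connect(s, p->ai_addr, p->ai_addrlen) == 0)
            break;                     /* connected */
        close(s);
        s = -1;
    }
    freeaddrinfo(res);                 /* tidy away the naming resource */
    return s;
}
```

Notice that getaddrinfo() reports failure through its return value rather than errno, which is why the error message uses gai_strerror().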

Once you’ve connected a socket to a remote port, you can read and write to it using the same recv() and send() functions you used for the server. That means you should have enough information now to write a web client...

Go Off Piste

Why not update the code to automatically replace characters like spaces for you? For more details on how to replace characters for web addresses, see:

http://www.w3schools.com/tags/ref_urlencode.asp

Your C Toolbox

You’ve got Chapter 11 under your belt, and now you’ve added sockets and networking to your toolbox. For a complete list of tooltips in the book, see Appendix B.

image with no caption