Chapter 11. Sockets and Networking: There’s no place like 127.0.0.1

Programs on different machines need to talk to each other.

You’ve learned how to use I/O to communicate with files and how processes on the same machine can communicate with each other. Now you’re going to reach out to the rest of the world, and learn how to write C programs that can talk to other programs across the network and across the world. By the end of this chapter, you’ll be able to create programs that behave as servers and programs that behave as clients.

The Internet knock-knock server

C is used to write most of the low-level networking code on the Internet. Most networked applications need two separate programs: a server and a client.

You’re going to build a server in C that tells jokes over the Internet. You’ll be able to start the server on one machine like this:

Other than telling you it’s running, the server won’t display anything else on the screen. However, if you open a second console, you’ll be able to connect to the server using a client program called telnet. Telnet takes two parameters: the address of the server, and the port the server is running on. If you are running telnet on the same machine as the server, you can use 127.0.0.1 for the address:

Watch it!

You’ll be using telnet quite a lot in this chapter to test our server code.

If you try to use the built-in Windows telnet, you might have problems because of the way it communicates with the network. If you install the Cygwin version of telnet, you should be fine.

Do this!

You will need a telnet program in order to connect to the server. Most systems come with telnet already installed. You can check that you have telnet by typing:

telnet

on the command line.

If you don’t have telnet, you can install it in one of these ways:

Cygwin:

Run the setup.exe program for Cygwin and search for telnet.

Linux:

Search for telnet in your package manager. On many systems, the package manager is called Synaptic.

Mac:

If you don’t have telnet, you can install it from www.macports.org or www.finkproject.org.

Knock-knock server overview

The server will be able to talk to several clients at once. The client and the server will have a structured conversation called a protocol. There are different protocols used on the Internet. Some of them are low-level protocols, like the internet protocol (IP), which are used to control how binary 1s and 0s are sent around the Internet. Other protocols are high-level protocols, like the hypertext transfer protocol (HTTP), which controls how web browsers talk to web servers. The joke server is going to use a custom high-level protocol called the Internet knock-knock protocol (IKKP).

A protocol is a structured conversation.

The client and the server will exchange messages like this:

A protocol always has a strict set of rules. As long as the client and the server both follow those rules, everything is fine. But if one of them breaks the rules, the conversation usually stops pretty abruptly.

BLAB: how servers talk to the Internet

When C programs need to talk to the outside world, they use data streams to read and write bytes. You’ve used data streams that are connected to files or to the Standard Input and Output. But if you’re going to write a program to talk to the network, you need a new kind of data stream called a socket.

Before a server can use a socket to talk to a client program, it needs to go through four stages that you can remember with the acronym BLAB: Bind, Listen, Accept, Begin.

Bind to a port.
Listen.
Accept a connection.
Begin talking.

1. Bind to a port

A computer might need to run several server programs at once. It might be sending out web pages, posting email, and running a chat server all at the same time. To prevent the different conversations from getting confused, each server uses a different port. A port is just like a channel on a TV. Different ports are used for different network services, just like different channels are used for different content.

When a server starts up, it needs to tell the operating system which port it’s going to use. This is called binding the port. The knock-knock server is going to use port 30000, and to bind it you’ll need two things: the socket descriptor and a socket name. A socket name is just a struct that means “Internet port 30000.”
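Here’s a minimal sketch of the bind stage, assuming a TCP socket over IPv4. The helper name bind_listener() is made up for illustration; listener_d is the descriptor name used later in this chapter.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket and bind it to `port` on all
   local addresses. Returns the socket descriptor, or -1 on failure. */
int bind_listener(int port)
{
    int listener_d = socket(PF_INET, SOCK_STREAM, 0);
    if (listener_d == -1)
        return -1;

    /* The socket name: a struct that means "Internet port <port>". */
    struct sockaddr_in name;
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons((unsigned short)port);  /* network byte order */
    name.sin_addr.s_addr = htonl(INADDR_ANY);     /* any local address */

    if (bind(listener_d, (struct sockaddr *)&name, sizeof(name)) == -1) {
        close(listener_d);
        return -1;
    }
    return listener_d;
}
```

With a helper like this, the knock-knock server would call bind_listener(30000) and keep the returned descriptor for the next stages.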

2. Listen

If your server becomes popular, you’ll probably get lots of clients connecting to it at once. Would you like the clients to wait in a queue for a connection? The listen() system call tells the operating system how long you want the queue to be:

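Calling listen() with a queue length of 10 means that roughly 10 clients can be waiting to connect to the server at once. They won’t all be answered immediately, but they’ll be able to wait. What happens to the 11th client depends on the operating system: it may be told the server is too busy, or its connection attempt may be ignored until it retries.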
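The listen stage might be sketched like this. The helper name start_listening() is an assumption for illustration; it expects a socket that has already been bound to a port.

```c
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: ask the operating system to queue pending
   connections on a bound socket. Returns 0 on success, -1 on error. */
int start_listening(int listener_d, int queue_length)
{
    if (listen(listener_d, queue_length) == -1) {
        perror("Can't listen");   /* prints the reason held in errno */
        return -1;
    }
    return 0;
}
```

For the knock-knock server, the call would be start_listening(listener_d, 10).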

3. Accept a connection

Once you’ve bound a port and set up a listen queue, you then just have to...wait. Servers spend most of their lives waiting for clients to contact them. The accept() system call waits until a client contacts the server, and then it returns a second socket descriptor that you can use to hold a conversation on.

This new connection descriptor (connect_d) is the one that the server will use to...

Begin talking.
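A minimal sketch of the accept stage follows. The helper name accept_client() is hypothetical; connect_d is the descriptor for the conversation with one particular client.

```c
#include <sys/socket.h>

/* Hypothetical helper: wait until a client contacts the server, then
   return a new descriptor for that conversation, or -1 on error. */
int accept_client(int listener_d)
{
    struct sockaddr_storage client_addr;          /* filled in by accept() */
    socklen_t address_size = sizeof(client_addr);

    int connect_d = accept(listener_d,
                           (struct sockaddr *)&client_addr,
                           &address_size);
    return connect_d;
}
```

The listener socket is left untouched by this call, so the server can pass it to accept() again for the next client.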

Brain Barbell

Why do you think the accept() system call creates the descriptor for a new socket? Why don’t servers just use the socket they created to listen to the port?

A socket’s not your typical data stream

So far, data streams have all been the same. Whether you’re connected to files or Standard Input/Output, you’ve been able to use functions like fprintf() and fscanf() to talk to them. But sockets are a little different. A socket is two way: it can be used for input and output. That means it needs different functions to talk to it.

If you want to output data on a socket, you can’t use fprintf(). Instead, you use a function called send():

Remember: it’s important to always check the return value of system calls like send(). Network errors are really common, and your servers will have to cope with them.
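Here’s one way sending a string might look, with the return value checked. The helper name say() and the message text in the usage note are assumptions. The loop is there because send() is allowed to send only part of the data in one call.

```c
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send a whole string on a socket.
   Returns 0 on success, -1 if send() reports an error. */
int say(int socket_d, const char *msg)
{
    size_t len = strlen(msg);
    size_t sent = 0;

    while (sent < len) {
        ssize_t n = send(socket_d, msg + sent, len - sent, 0);
        if (n == -1) {
            perror("Error talking to the client");
            return -1;
        }
        sent += (size_t)n;   /* send() may accept only part of the data */
    }
    return 0;
}
```

A server could then write say(connect_d, "Knock! Knock!\r\n") and stop the conversation if the result is -1.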

Geek Bits

What port should I use?

You need to be careful when you choose a port number for a server application. There are lots of different servers available, and you need to make sure you don’t use a port number that’s normally used for some other program. On Cygwin and most Unix-style machines, you’ll find a file called /etc/services that lists the ports used by most of the common servers. When you choose a port, make sure there isn’t another application that already uses the same one.

Port numbers can be between 0 and 65535, and you need to decide whether you want to use a low number (< 1024) or a high one. On most systems, port numbers lower than 1024 are available only to the superuser or administrator. This is because the low port numbers are reserved for well-known services, like web servers and email servers. Operating systems restrict these ports to administrators only, to prevent ordinary users from starting unwanted services.

Most of the time, you’ll probably want to use a port number of 1024 or higher.

This server generates random advice for any client that connects to it, but it’s not quite complete. You need to fill in the missing system calls. Also, this version of the code will send back a single piece of advice and then end. Part of the code needs to be inside a loop. Which part?

And for a bonus point, if you add in the missing #include statements, the program will work. But what has the programmer missed out? Hint: look at the system calls.

The programmer has forgotten to ______________________________

Sometimes the server doesn’t start properly

The server looks like it’s starting correctly the second time, but the client can’t get any response from it. Why is that?

Remember that the code was written without any error checking. Let’s add a little error check into the code and see if we can figure out what’s happening.

Why your mom always told you to check for errors

If you add an error check on the line that binds the socket to a port:

Then you’ll get a little more information from the server if it is stopped and restarted quickly:

If the server has responded to a client and then gets stopped and restarted, the call to bind() fails. But because the original version of the program never checked for errors, the rest of the server code ran even though it couldn’t use the server port.
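A small error-checking helper could look something like the sketch below. The name failed() and the message text in the usage note are assumptions. The idea is simply to test a system call’s result for -1 and report the reason the operating system left in errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if a system call returned -1, print `msg` together
   with the reason stored in errno. Returns 1 if the call failed, else 0. */
int failed(int result, const char *msg)
{
    if (result == -1) {
        fprintf(stderr, "%s: %s\n", msg, strerror(errno));
        return 1;
    }
    return 0;
}
```

You might wrap the bind() call in it, as in failed(bind(...), "Can't bind the port"), and exit the program if it returns 1 instead of carrying on with a socket that was never bound.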

Bound ports are sticky

When you bind a socket to a port, the operating system will prevent anything else from rebinding to it for the next 30 seconds or so, and that includes the program that bound the port in the first place. To get around the problem, you just need to set an option on the socket before you bind it:

ALWAYS check for errors on system calls.

This code makes the socket reuse the port when it’s bound. That means you can stop and restart the server and there will be no errors when you bind the port a second time.
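As a sketch, setting the reuse option might look like this. SO_REUSEADDR is the standard socket option for this job; the helper name allow_port_reuse() is made up for illustration.

```c
#include <sys/socket.h>

/* Hypothetical helper: mark a socket so that its port can be bound again
   right after a restart. Call it BEFORE bind().
   Returns 0 on success, -1 on error. */
int allow_port_reuse(int listener_d)
{
    int reuse = 1;   /* nonzero switches the option on */
    return setsockopt(listener_d, SOL_SOCKET, SO_REUSEADDR,
                      &reuse, sizeof(reuse));
}
```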

Reading from the client

You’ve learned how to send data to the client, but what about reading from the client? In the same way that sockets have a special send() function to write data, they also have a recv() function to read data.

<bytes read> = recv(<descriptor>, <buffer>, <bytes to read>, 0);

If someone types a line of text into a client and hits return, the recv() function stores the text in a character array like this:

There are a few things to remember:

The characters are not terminated with a \0 character.
When someone types text in telnet, the string always ends with \r\n.
recv() returns the number of characters read, or -1 if there’s an error, or 0 if the client has closed the connection.
You’re not guaranteed to receive all the characters in a single call to recv().

This last point is important. It means you might have to call recv() more than once:

That means recv() can be tricky to use. It’s best to wrap recv() in a function that stores a simple \0-terminated string in the array it’s given. Something like this:
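The sketch below is one possible read_in(), not necessarily the book’s own version. It assumes the client sends one line at a time, keeps calling recv() until the data ends with a newline, strips the trailing \r\n, and adds the \0 terminator.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Sketch of read_in(): read one line from a socket into buf, which holds
   `len` bytes. Returns the length of the stored string, or -1 on error. */
int read_in(int socket_d, char *buf, int len)
{
    int used = 0;

    if (len <= 0)
        return -1;

    while (used < len - 1) {               /* leave room for the '\0' */
        ssize_t c = recv(socket_d, buf + used, (size_t)(len - 1 - used), 0);
        if (c < 0)
            return -1;                     /* network error */
        if (c == 0)
            break;                         /* client closed the connection */
        used += (int)c;
        if (buf[used - 1] == '\n')
            break;                         /* reached the end of the line */
    }

    /* Drop the "\r\n" that telnet adds to the end of the line. */
    while (used > 0 && (buf[used - 1] == '\n' || buf[used - 1] == '\r'))
        used--;

    buf[used] = '\0';
    return used;
}
```

A return value of 0 can mean either an empty line or a closed connection, so a stricter server might want to tell those two cases apart.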

Go Off Piste

This is one way of simplifying recv(), but could you do better? Why not write your own version of read_in() and let us know at headfirstlabs.com.

Ready-Bake Code

Here are some other functions that are useful when you are writing a server. Do you understand how each of them works?

Now that you have a set of server functions, let’s try them out...

Now it’s time to write the code for the Internet knock-knock server. You’re going to write a little more code than usual, but you’ll be able to use the ready-bake code from the previous page. Here’s the start of the program.

Now it’s over to you to write the main function. You’ll need to create a new server socket and store it in listener_d. The socket will be bound to port 30000, and the queue depth should be set to 10. Once that’s done, you need to write code that works like this:

Try to check error codes and if the user says the wrong thing, just send an error message, close the connection, and then wait for another client.
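To decide whether the user has said the right thing, the server needs to compare each reply with the text the protocol expects. Here’s a small sketch of that check. The helper name reply_matches() is hypothetical, and the reply text in the usage note is an assumption about what the protocol expects.

```c
#include <string.h>
#include <strings.h>

/* Hypothetical helper: returns 1 if `reply` starts with the text the
   protocol expects at this step, ignoring upper/lower case; otherwise 0. */
int reply_matches(const char *reply, const char *expected)
{
    return strncasecmp(reply, expected, strlen(expected)) == 0;
}
```

If reply_matches(buf, "Who's there?") returns 0, the server can send its error message, close connect_d, and go back to accept() for the next client.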

Good luck!

This is the kind of code you should have written. Is yours similar? It doesn’t matter if the code is exactly the same. The important thing is that your code can tell the joke in the right way, and cope with errors.

Now that you’ve written the knock-knock server, it’s time to compile it and fire it up.

The server’s waiting for a connection, so open a separate console and connect to it with telnet:

The server can tell you a joke, but what happens if you break the protocol and send back an invalid response?

The server is able to validate the data you send it and close the connection immediately. Once you’re done running the server, you can switch back to the server window and hit Ctrl-C to close it down neatly. It even sends you a farewell message:
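One way a neat Ctrl-C shutdown could be wired up is with a signal handler, sketched below. The names handle_shutdown() and catch_signal() and the "Bye!" text are assumptions, and listener_d is assumed to be a global so that the handler can reach it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

int listener_d = -1;   /* main listener socket, shared with the handler */

/* Hypothetical handler: close the listener, say goodbye, and exit. */
void handle_shutdown(int sig)
{
    static const char msg[] = "Bye!\n";
    (void)sig;
    if (listener_d != -1)
        close(listener_d);
    /* write() is safe to call inside a signal handler; fprintf() is not. */
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    _exit(0);
}

/* Hypothetical helper: register `handler` for signal `sig`.
   Returns 0 on success, -1 on error. */
int catch_signal(int sig, void (*handler)(int))
{
    struct sigaction action;
    action.sa_handler = handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    return sigaction(sig, &action, NULL);
}
```

The server would call catch_signal(SIGINT, handle_shutdown) once, near the start of main().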

That’s great! The server does everything you need it to do.

Or does it?

The server can only talk to one person at a time

There’s a problem with the current server code. Imagine someone connects to it and he is a little slow with his responses:

Then, if someone else tries to get through to the server, she can’t; it’s busy with the first guy:

The problem is that the server is still busy talking to the first guy. The main server socket will keep the client waiting until the server calls the accept() system call again. But because of the guy already connected, it will be some time before that happens.

Brain Power

The server can’t respond to the second user, because it is busy dealing with the first. What have you learned that might help you deal with both clients at once?

You can fork() a process for each client

When the clients connect to the server, they start to have a conversation on a separate, newly created socket. That means the main server socket is free to go and find another client. So let’s do that.

When a client connects, you can fork() a separate child process to deal with the conversation between the server and the client.

While the client is talking to the child process, the server’s parent process can go connect to the next client.

The parent and child use different sockets

One thing to bear in mind is that the parent server process will only need to use the main listener socket. That’s because the main listener socket is the one that’s used to accept() new connections. On the other hand, the child process will only ever need to deal with the secondary socket that gets created by the accept() call. That means once the parent has forked the child, the parent can close the secondary socket and the child can close the main listener socket.
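The hand-off between parent and child might be sketched like this. The names serve_in_child() and send_greeting() are made up for illustration, and send_greeting() is only a stand-in for the real conversation with the client.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for the real conversation: just sends one line. */
void send_greeting(int connect_d)
{
    const char msg[] = "Knock! Knock!\r\n";
    send(connect_d, msg, sizeof(msg) - 1, 0);
}

/* Hypothetical helper: hand one accepted connection to a child process.
   Returns the child's process ID in the parent, or -1 if fork() failed. */
pid_t serve_in_child(int listener_d, int connect_d, void (*talk)(int))
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        close(listener_d);   /* the child never accepts new clients */
        talk(connect_d);     /* hold the conversation */
        close(connect_d);
        _exit(0);            /* the child is done once the client is */
    }

    close(connect_d);        /* the parent leaves this client to the child */
    return pid;
}
```

In the main loop, the parent would call accept(), pass the new descriptor to serve_in_child(), and go straight back to accept().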

Q:
If I create a new process for each client, what happens if hundreds of clients connect? Will my machine create hundreds of processes?
A:
Yes. If you think your server will get a lot of clients, you need to control how many processes you create. The child can signal you when it’s finished with a client, and you can use that to maintain a count of current child processes.

Let’s try the modified version of the server. You can compile and run it in the same way:

If you open a separate console and start telnet, you can connect, just like you did before:

Everything seems the same, but if you leave the client running with the joke half-told, you should be able to see what’s changed.

If you open a third console, you will see that there are now two processes for the server: one for the parent and one for the child:

That means you can connect, even while the first client is still talking to the server:

Now that you’ve built an Internet server, let’s go look at what it takes to build a client, by writing something that can read from the Web.

Writing a web client

What if you want to write your own client program? Is it really that different from a server? To see the similarities and differences, you’re going to write a web client for the hypertext transfer protocol (HTTP).

HTTP is a lot like the Internet knock-knock protocol you coded earlier. All protocols are structured conversations. Every time a web client and server talk, they say the same kind of things. Open telnet and see how to download http://en.wikipedia.org/wiki/O'Reilly_Media.

Do this!

When your program connects to the web server, it will need to send at least three things:

Note

Most web clients actually send a lot more information, but you’ll just send the minimum amount.

A GET command
```
GET /wiki/O'Reilly_Media HTTP/1.1
```
The hostname
```
Host: en.wikipedia.org
```
A blank line
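Those three pieces can be put together into a single string before sending. Here’s a minimal sketch; the helper name build_get_request() is an assumption. Each line of the request ends with \r\n, and the blank line is just one more \r\n.

```c
#include <stdio.h>

/* Hypothetical helper: write a minimal HTTP request into buf.
   Returns the number of characters written, or -1 if buf is too small. */
int build_get_request(char *buf, size_t size,
                      const char *host, const char *path)
{
    int n = snprintf(buf, size,
                     "GET %s HTTP/1.1\r\n"   /* the GET command */
                     "Host: %s\r\n"          /* the hostname    */
                     "\r\n",                 /* the blank line  */
                     path, host);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

For the page above, the call would be build_get_request(buf, sizeof(buf), "en.wikipedia.org", "/wiki/O'Reilly_Media").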

But before you can send any data at all to the server, you need to make a connection from the client. How do you do that?

Clients are in charge

Clients and servers communicate using sockets, but the way that each gets hold of a socket is a little different. You’ve already seen that servers use the BLAB sequence:

Bind a port.
Listen.
Accept a conversation.
Begin talking.

A server spends most of its life waiting for a fresh connection from a client. Until a client connects, a server really can’t do anything. Clients don’t have that problem. A client can connect and start talking to a server whenever it likes. This is the sequence for a client:

Connect to a remote port.
Begin talking.

Remote ports and IP addresses

When a server connects to the network, it just has to decide which port it’s going to use. But clients need to know a little more: they need to know the port of the remote server, but they also need to know its internet protocol (IP) address:

Internet addresses are kind of hard to remember, which is why most of the time human beings use domain names. A domain name is just an easier-to-remember piece of text like:

www.oreilly.com

Even though human beings prefer domain names, the actual packets of information that flow across the network only use the numeric IP address.

Create a socket for an IP address

Once your client knows the address and port number of the server, it can create a client socket. Client sockets and server sockets are created the same way:

The difference between client and server code is what they do with sockets once they’re created. A server will bind the socket to a local port, but a client will connect the socket to a remote port:
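Connecting to a numeric address might be sketched as follows, assuming IPv4 and TCP. The helper name connect_to_ip() is hypothetical, and inet_pton() is used here to turn the dotted text address into its binary form.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to `port` on a numeric IPv4 address such
   as "127.0.0.1". Returns a connected socket, or -1 on failure. */
int connect_to_ip(const char *ip, int port)
{
    struct sockaddr_in si;
    memset(&si, 0, sizeof(si));
    si.sin_family = AF_INET;
    si.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &si.sin_addr) != 1)
        return -1;                     /* not a valid numeric address */

    int s = socket(PF_INET, SOCK_STREAM, 0);
    if (s == -1)
        return -1;

    if (connect(s, (struct sockaddr *)&si, sizeof(si)) == -1) {
        close(s);
        return -1;
    }
    return s;
}
```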

The above code works only for numeric IP addresses.

To connect a socket to a remote domain name, you’ll need a function called getaddrinfo().

getaddrinfo() gets addresses for domains

The domain name system is a huge address book. It’s a way of converting a domain name like www.oreilly.com into the kinds of numeric IP addresses that computers need to address the packets of information they send across the network.

Create a socket for a domain name

Most of the time, you’ll want your client code to use DNS to create sockets. That way, your users won’t have to look up the IP addresses themselves. To use DNS, you need to construct your client sockets in a slightly different way:

The getaddrinfo() function constructs a new data structure on the heap called a naming resource. The naming resource represents a port on a server with a given domain name. Hidden away inside the naming resource is the IP address that the computer will need. Sometimes very large domains can have several IP addresses, but the code here will simply pick one of them. You can then use the naming resource to create a socket.

Finally, you can connect to the remote socket. Because the naming resource was created on the heap, you’ll need to tidy it away with a function called freeaddrinfo():
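Putting those steps together, a domain-name version of the client socket code might look like the sketch below. The helper name open_socket() is an assumption. Unlike the description above, this sketch tries each address in the naming resource in turn until one connects, rather than using only the first.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: connect to `port` (for example "80") on `host`
   (for example "www.oreilly.com"). Returns a connected socket, or -1. */
int open_socket(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6, whichever works */
    hints.ai_socktype = SOCK_STREAM;   /* a TCP stream */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;                     /* the name could not be resolved */

    int s = -1;
    for (p = res; p != NULL; p = p->ai_next) {
        s = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (s == -1)
            continue;
        if (connect(s, p->ai_addr, p->ai_addrlen) == 0)
            break;                     /* connected */
        close(s);
        s = -1;
    }

    freeaddrinfo(res);                 /* tidy the naming resource away */
    return s;
}
```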

Once you’ve connected a socket to a remote port, you can read and write to it using the same recv() and send() functions you used for the server. That means you should have enough information now to write a web client...

Go Off Piste

Why not update the code to automatically replace characters like spaces for you? For more details on how to replace characters for web addresses, see:

http://www.w3schools.com/tags/ref_urlencode.asp

Q:
Should I create sockets with IP addresses or domain names?
A:
Most of the time, you’ll want to use domain names. They’re easier to remember, and occasionally some servers will change their numeric addresses but keep the same domain names.
Q:
So, do I even need to know how to connect to a numeric address?
A:
Yes. If the server you are connecting to is not registered in the domain name system, such as machines on your home network, then you will need to know how to connect by IP.
Q:
Can I use getaddrinfo() with a numeric address?
A:
Yes, you can. But if you know that the address you are using is a numeric IP, the first version of the client socket code is simpler.

Your C Toolbox

You’ve got Chapter 11 under your belt, and now you’ve added sockets and networking to your toolbox. For a complete list of tooltips in the book, see Appendix B.
