Extending and Expanding UNIX with Your Own Programs

The set of commands that UNIX comprises offers a remarkable amount of flexibility, but even with fancy shell script programming, you'll doubtless come across situations where a program is your only solution. There are two popular UNIX solutions: the Perl scripting language (covered in the next hour) and the C programming language, as explored in this hour.

Task 21.1: fget, a Smarter FTP Client

When you learned about the FTP program earlier in this book, you probably thought to yourself, Sheesh, that's pretty ugly and hard to use. You're not alone, and if you're stuck on the command line, it's quite surprising that there aren't any widely distributed alternatives. The good news is that all the commands you type within FTP can be stored in a file and the session scripted, so that it automatically does whatever you want.


Want to have a file transferred to you at three in the morning? You can do that. More importantly, however, we can write an easier interface to FTP. The key is the -n flag to the ftp program, which turns off the automatic login prompt so that ftp takes all of its commands, including the user line that logs you in, from standard input.

  1. Let's say I wanted to see what files were available on the FTP archive on ftp.intuitive.com. Here's how that would look with a regular FTP session:

    % ftp ftp.intuitive.com
    Connected to www.intuitive.com.
    220 limbo.hostname.com FTP server (Version wu-2.4.2-academ[BETA-15](1) Sat Nov 1
     03:08:32 EST 1997) ready.
    Name (ftp.intuitive.com:taylor): ftp
    331 Guest login ok, send your complete e-mail address as password.
    Password:
    230-Please read the file README
    230-  it was last modified on Sun Sep 13 18:50:12 1998 - 0 days ago
    230 Guest login ok, access restrictions apply.
    Remote system type is UNIX.
    Using binary mode to transfer files.
    ftp> dir
    200 PORT command successful.
    150 Opening ASCII mode data connection for /bin/ls.
    total 7
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 .
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 ..
    -rw-rw-r--   1 root     root          175 Sep 13 18:50 README
    d--x--x--x   2 root     root         1024 Mar 26 23:09 bin
    d--x--x--x   2 root     root         1024 Mar 26 23:09 etc
    drwxr-xr-x   2 root     root         1024 Mar 26 23:09 lib
    dr-xr-sr-x   2 root     ftp          1024 Sep 13 18:48 pub
    226 Transfer complete.
    ftp> quit
    221 Goodbye.
    %
    

    That's a lot of information for a simple directory listing.

  2. As it turns out, everything I had to type in the preceding step (the login name, the password, and the dir and quit commands) can be saved in a file and fed directly to the ftp program:

    % cat ftp-script
    user ftp [email protected]
    dir
    quit
    %
    

    The only thing I've had to add is the user line, which lets you see the password I'm using for the “ftp” account in this interaction. For anonymous FTP interaction, you'll almost always use your own email address, but other than that, it's exactly what I typed earlier.

    Now the secret step: I'm going to use the -n flag to FTP and feed it the preceding script. Watch what happens:

    % ftp -n ftp.intuitive.com < ftp-script
    total 7
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 .
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 ..
    -rw-rw-r--   1 root     root          175 Sep 13 18:50 README
    d--x--x--x   2 root     root         1024 Mar 26 23:09 bin
    d--x--x--x   2 root     root         1024 Mar 26 23:09 etc
    drwxr-xr-x   2 root     root         1024 Mar 26 23:09 lib
    dr-xr-sr-x   2 root     ftp          1024 Sep 13 18:48 pub
    %
    

    Way cool!

  3. What the fget program does is create the simple script shown previously on-the-fly so that you can specify the FTP server and have it automatically produce a file listing.

    I've added two other features to the fget program to make it as useful as possible: You can specify any directory you'd like to see on the remote system, and you can specify a file on the remote system and it will copy it into the current local directory automatically.

    Here's a synopsis of how to use fget:

    % fget
    
    Usage: fget host:remotefile { local}
    

    If you omit the :remotefile portion, fget will produce a listing of the files on the remote system. To specify a particular directory on the remote system, replace local with that directory name. For example, the command fget ftp.intuitive.com /pub will list the contents of the /pub directory on that machine. Copy a file from the remote system to the local system with fget ftp.intuitive.com:README, or rename it as you go by using fget ftp.intuitive.com:README new.readme. To display a file directly on the screen, use - as the value for local.

A handy hint: Always try to make the “usage” of your programs as helpful as possible!


  1. The logic flow of the program is as follows:

    figure out what elements the user has specified
    create a temporary work file
    output the 'user' line so you can log in to the server
    if (remotefile is not specified)
       if (localfile is specified)
         output "cd localfile";
       output "dir"
    else
       output "get remotefile localfile"
    feed temporary file to "ftp" program.
    
  2. The main C program, fget.c, is shown here:

    /**                             fget.c                  **/
    
    /** (C) Copyright 1998, Dave Taylor. All Rights Reserved.***/
    
    #include "fget.h"
    
    main(argc, argv)
    int argc;
    char **argv;
    {
            FILE *fd;
            char buffer[SLEN], username[NLEN], hostname[NLEN];
            char remotehost[SLEN], remotefile[SLEN], localfname[SLEN];
    
            if (argc < 2) usage();  /* too few args: usage and quit */
    
            splitword(argv[1], remotehost, remotefile); /* split host/file */
    
            if (argc == 2) strcpy(localfname, basename_of(remotefile));
            else           strcpy(localfname, argv[2]);
    
            initialize(username, hostname); /* get username and local host */
    
            if ((fd = fopen(TEMPFILE, "w")) == NULL) {
              fprintf(stderr,
                 "Couldn't open tempfile '%s': move into your home directory?\n",
                 TEMPFILE);
              exit(1);
            }
    
            /** now build the information to hand to ftp in the temp file **/
    
            fprintf(fd, "ascii\nuser %s %s@%s\n", ANONFTP, username, hostname);
    
            if (strlen(remotefile) == 0) {
              if (strlen(localfname) > 0)          /* directory specified? */
                fprintf(fd, "cd %s\n", localfname); /*    add 'cd' command */
              fprintf(fd, "dir\n");
            }
            else    /* get a file from the remote site */
              fprintf(fd, "get %s %s\n", remotefile, localfname);
    
            fprintf(fd, "bye\n");
    
            fclose(fd);
    
            /* the input file is built, now to hand it to 'ftp' */
    
            sprintf(buffer, "ftp -n %s < %s; rm %s", remotehost,TEMPFILE, TEMPFILE);
    
            exit(system(buffer));
    }
    

    I won't go into details about exactly how this works or what the individual C statements do; that's beyond the scope of this book.

    What is important to notice, however, is the line #include "fget.h" at the top, which tells the C compiler to include an additional header file, a clue that the program is built from more than a single source file.

  3. Here's what fget.h looks like:

    /**                             fget.h                  **/
    
    /** Header file for the FGET program.  See "fget.c" for more info.
        (C) Copyright 1998, Dave Taylor, All Rights Reserved.     ***/
    
    #include <stdio.h>
    
    #define FTP            "ftp -n"    /* how to invoke FTP in silent mode */
    #define TEMPFILE       ".fget.tmp"  /* the temp file for building cmds */
    #define ANONFTP        "ftp"        /* anonymous FTP user account */
    
    #define SLEN           256          /* length of a typical string */
    #define NLEN           40           /* length of a short string   */
    
    char *basename_of(), *getenv();
    

    It turns out that a third file is required too: utils.c, which contains the actual subroutines initialize, splitword, basename_of and usage. It's too long to present here in the book, but it's easily available online.
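Since utils.c isn't reproduced here, the following is only a guess at what two of its routines might look like, inferred from the way fget.c calls them. The function bodies, the const qualifiers, and the assumption that both output buffers hold SLEN bytes are mine, not the author's; the real versions are on the Web site.

```c
#include <stdio.h>
#include <string.h>

#define SLEN 256   /* same value as in fget.h */

/* Guess at splitword(): copy the part of 'word' before the first ':'
 * into 'host' and the part after it into 'file'. With no ':' present,
 * 'file' becomes the empty string, which fget.c treats as "list files".
 * Both output buffers are assumed to hold SLEN bytes. */
void splitword(const char *word, char *host, char *file)
{
    const char *colon = strchr(word, ':');
    size_t hostlen = colon ? (size_t)(colon - word) : strlen(word);

    if (hostlen >= SLEN)
        hostlen = SLEN - 1;             /* truncate rather than overflow */
    memcpy(host, word, hostlen);
    host[hostlen] = '\0';

    snprintf(file, SLEN, "%s", colon ? colon + 1 : "");
}

/* Guess at basename_of(): return a pointer to the text after the last
 * '/' in 'path', or 'path' itself when it contains no '/'. */
char *basename_of(char *path)
{
    char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}
```

With these definitions, splitting ftp.intuitive.com:pub/README would yield the host ftp.intuitive.com and the remote file pub/README, and basename_of would reduce the latter to README, which is the default local filename fget.c uses when no second argument is given.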

All three of these files—and more!—are available on the Web. Visit http://www.intuitive.com/tyu24/ to find them.


There's more to the fget program than the snippet you've seen here, but if you pop over to the Web site, you'll have everything you need in order to follow the rest of this lesson.
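One detail of fget.c worth a second look is its last step, where sprintf assembles the shell command that is handed to system(). sprintf trusts that the host name fits in the buffer. As a minimal sketch of a more defensive variant, the same command string can be built with snprintf, which reports when the result would not fit. The helper name build_ftp_command and its return convention are assumptions made for illustration; they are not part of the fget sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper, not part of the original fget sources.
 * Builds the same "ftp -n host < tempfile; rm tempfile" command that
 * fget.c hands to system(), but refuses to produce a truncated command.
 * Returns 0 on success, -1 if the command does not fit in buf. */
int build_ftp_command(char *buf, size_t buflen,
                      const char *remotehost, const char *tempfile)
{
    int needed = snprintf(buf, buflen, "ftp -n %s < %s; rm %s",
                          remotehost, tempfile, tempfile);

    if (needed < 0 || (size_t)needed >= buflen)
        return -1;                      /* formatting error or truncation */
    return 0;
}
```

A caller would check the return value and print an error instead of calling system() when the command was too long for the buffer.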


Task 21.2: Compiling the Program with cc, the C Compiler

Unlike Perl, BASIC, and UNIX shell scripts, C is one of a class of programming languages that must be translated into machine language before it becomes usable. The tool for translating source code like what's shown above to executable binaries is a compiler. Sensibly enough, the C compiler is cc.


A few different things happen behind the scenes when you compile a program. The sequence is “add all the include files, compile the program to an intermediate object file, and then link all the object files together with any runtime libraries needed to produce the final program.”

Fortunately you don't need to worry about these steps, except to know that because we're working with multiple source files, we'll need to slightly change the default behavior of cc.

  1. The C compiler has oodles of command options, but there's really only one I need to be concerned with right now. The -c flag tells the compiler to build the object file associated with the individual C program, but not to try to link all the runtime libraries:

    % cc -c fget.c
    %
    

    Not much output, but no output is good news: Everything worked just fine.

You might not have the cc compiler on your system. If not, no worries; try typing gcc -c fget.c. Gcc is the GNU C compiler, and for everything in this lesson it works the same way as cc. Gcc is included with Linux, for example.


  1. There's one more file to compile before I'm done:

    % cc -c utils.c
    %
    

    Again, no problems.

  2. A quick peek with ls shows what has happened:

    % ls -l
    total 21
    -rw-r--r--   1 taylor   taylor       2523 Sep 13 13:08 fget.c
    -rw-r--r--   1 taylor   taylor        624 Sep 13 11:58 fget.h
    -rw-rw-r--   1 taylor   taylor       1988 Sep 13 13:08 fget.o
    -rw-r--r--   1 taylor   taylor       2802 Sep 13 11:58 utils.c
    -rw-rw-r--   1 taylor   taylor       2592 Sep 13 11:58 utils.o
    

    There are two new files: fget.o and utils.o. Those are the object files.

  3. Now let's put everything together and create the fget program itself. Notice that I won't need the -c flag but will instead use the -o flag to specify the name of the output file (the final program to be created).

    % cc fget.o utils.o -o fget
    %
    

    Great! It worked without a problem. Now let's try it:

    % fget ftp.intuitive.com
    total 7
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 .
    drwxr-xr-x   6 root     root         1024 Sep 13 18:50 ..
    -rw-rw-r--   1 root     root          175 Sep 13 18:50 README
    d--x--x--x   2 root     root         1024 Mar 26 23:09 bin
    d--x--x--x   2 root     root         1024 Mar 26 23:09 etc
    drwxr-xr-x   2 root     root         1024 Mar 26 23:09 lib
    dr-xr-sr-x   2 root     ftp          1024 Sep 13 18:48 pub
    

    Super!

Using the C compiler to put all the pieces together is easy, but as you might expect, it can become quite tedious. The more C files you have for your program, the more typing you'll end up doing, and what's worse, you're never sure that you have the very latest versions of every file unless you rebuild every one every time. That's where the make utility comes to the rescue.


Task 21.3: The Invaluable make Utility

You saw earlier that you can definitely type the cc commands needed each time to rebuild the object files from the C source files you have, and then use cc again to create the final program—but wouldn't it be nice if there were an easier way? In particular, imagine that you have a big program that includes 3 header (.h) files and 16 source files. It would be a nightmare to type all those commands each time you wanted to update the program!


Instead, UNIX has a great utility called make that lets you define a set of rules regarding how files should be compiled and linked together and can then build them automatically.

Having rules for compilation is good, but being able to keep track of the minimum amount of recompilation for any given modification to the program is really the big win for make, as you'll see.

The one price you pay is that you need to create a Makefile, a somewhat peculiar-looking data file that defines all the rules and dependencies in your project.

  1. First, I need to create one of these Makefile files, which means I need to define the rule for compiling a C program. In its most basic form, it's cc -c sourcefile, but I'll make it more general purpose by having the compiler specified as a Makefile variable and adding another optional variable, CFLAGS, in case I want to specify any other possible compilation flags.

    It looks like this:

    CC=/usr/bin/cc
    CFLAGS=
         $(CC) $(CFLAGS) -c sourcefile
    							

    If I specify this for the two files in the fget project, it will look like this:

    $(CC) $(CFLAGS) -c fget.c
    $(CC) $(CFLAGS) -c utils.c
    
  2. But there's more that can be included in the Makefile—there are file dependencies. These are straightforward, thankfully: The object file or program I'm trying to create is specified, followed by a colon, followed by a list of the files that are required.

    In fact, Makefile rules generally take this form:

    target: dependencies
            commands for building the target
    

    Note that make requires each command line to begin with a tab character, not spaces.

    Taking that into account, here are the two rules for the current project:

    CC=/usr/bin/cc
    CFLAGS=
    
    fget.o: fget.c fget.h
            $(CC) $(CFLAGS) -c fget.c
    utils.o: utils.c fget.h
            $(CC) $(CFLAGS) -c utils.c
    

    These rules tell make how to create the intermediate object files from the original C source files using cc, but not how to build the actual fget program. That takes one more rule, plus two variables that name the program and the object files it's built from, and then we've got the entire Makefile written:

    TARGET=fget
    OBJ=fget.o utils.o
    
    $(TARGET): $(OBJ) fget.h
            $(CC) $(CFLAGS) $(OBJ) -o $(TARGET)
    
  3. Now, finally, here's the entire Makefile:

    #
    # Makefile for the FGET utility
    
    TARGET=fget
    OBJ=fget.o utils.o
    
    CFLAGS=
    CC=/usr/bin/cc
    
    $(TARGET): $(OBJ) fget.h
            $(CC) $(CFLAGS) $(OBJ) -o $(TARGET)
    
    fget.o: fget.c fget.h
            $(CC) $(CFLAGS) -c fget.c
    
    utils.o: utils.c fget.h
            $(CC) $(CFLAGS) -c utils.c
    
  4. Now we can build the program with a single command: make.

    % make
    /usr/bin/cc  -c fget.c
    /usr/bin/cc  -c utils.c
    /usr/bin/cc  fget.o utils.o -o fget
    %
    

    Now watch what happens if I try to build it again, without having made any changes to any of the C source files:

    % make
    make: `fget' is up to date.
    

    If I make a change to the main source file, fget.c (which I'll simulate by using the handy touch command), notice that only the minimum number of files are recompiled to build the program:

    % touch fget.c
    % make
    /usr/bin/cc  -c fget.c
    /usr/bin/cc  fget.o utils.o -o fget
    

    A lot less typing!

Once you get the hang of creating Makefiles, you'll find yourself unable to live without 'em, even for as small a project as the fget program. It's a good habit to learn, and it will pay dividends as your projects grow in size and complexity.


Task 21.4: Additional Useful C Tools

Between vi for entering programs and cc and make for building them, it seems as though programming is a breeze on UNIX. In fact, a standard cycle for developing software is edit->compile->test in a loop. What's missing from this picture are the UNIX test facilities, also known as debuggers.


Unfortunately, about a half-dozen debugging programs are available on different types of UNIX systems, and they are very different from each other! The good news is that they are but one of a set of useful commands you'll want to research further as you continue to learn how to program C within the UNIX environment.

  1. The easiest way to see exactly what C utilities you have on your system is to use the man -k command:

    % man -k c | wc -l
    2252
    

    With 2,252 matching entries, it's clear that the man command is showing every entry that has a letter c somewhere in its name or description. Not too useful. Instead, a simple filter is needed: one that shows only entries that include the letter c standing alone, with a space on either side, and then keeps only lines containing a 1, a rough way of selecting section one of the UNIX online manual, which holds the actual interactive commands:

    % man -k c | grep -i ' c ' | grep 1 | wc -l
         12
    
  2. Much better. Let's have a look and see what these commands are on my Linux system by removing the “word count” at the end of the pipe:

    % man -k c | grep -i ' c ' | grep 1
    c2lout (1)           - convert C and C++ source code into Lout
    cdecl, c++decl (1)   - Compose C and C++ type declarations
    cproto (1)           - generate C function prototypes and convert function definitions.
    ctags (1)            - Generate C language tag files for use with vi
    f2c (1)              - Convert Fortran 77 to C or C++
    gcc, g++ (1)         - GNU project C and C++ Compiler (v2.7)
    indent (1)           - changes the appearance of a C program by inserting or deleting whitespace.
    p2c (1)              - Pascal to C translator, version 1.20
    perlembed (1)        - how to embed perl in your C program
    tcsh (1)             - C shell with file name completion and command line editing
    imake (1)            - C preprocessor interface to the make utility
    makestrs (1)         - makes string table C source and header(s)
    

    The most useful command on this list, in my experience, is the C compiler (here it's gcc).

  3. If I try the same command on a different version of UNIX, Solaris, you'll see that the output is quite different:

    % man -k c | grep -i ' c ' | grep 1
    ansic (7V)              - ANSI C (draft of December 7 1988) lint library
    cb (1)                  - a simple C program beautifier
    cc (1V)                 - C compiler
    cflow (1V)              - generate a flow graph for a C program
    cpp (1)                 - the C language preprocessor
    csh, %, @, alias, bg, break, breaksw, case, continue, default, dirs, else, end, endif, endsw, eval, exec, exit, fg, foreach, glob, goto, hashstat, history, if, jobs, label, limit, logout, notify, onintr, popd, pushd, rehash, repeat, set, setenv, shift, source, stop, suspend, switch, then, umask, unalias, unhash, unlimit, unset, unsetenv, while (1) - C shell built-in commands, see csh(1)
    ctrace (1V)             - generate a C program execution trace
    cxref (1V)              - generate a C program cross-reference
    gcc, g++ (1)            - GNU project C and C++ Compiler (v2 preliminary)
    h2ph (1)                - convert .h C header files to .ph Perl header files
    indent (1)              - indent and format a C program source file
    lint (1V)               - a C program verifier
    mkstr (1)               - create an error message file by massaging C source files
    ref (1)                 - Display a C function header
    tcsh (1)                - C shell with file name completion and command line editing
    xstr (1)                - extract strings from C programs to implement shared strings
    

    Quite a different result! In this case, the most useful commands are unquestionably lint and cc.

    Lint is an interesting program and worth using if you have access: It runs a fine-tooth comb over your program to see whether any potential problems are lurking in your coding style (for example, variables used before they're initialized) or in your use of functions and system libraries (for example, you call a library function with a number, but it expects a string).

    The output is usually quite verbose. It's worth experimenting with different options to limit what's reported, and really clean programs have zero output from lint, though it can be quite a bit of work!

  4. The program we haven't found yet is the debugger, and a quick peek at the man page for gdb, the GNU debugger, explains why:

    % man gdb | head -15
    
    gdb(1)                      GNU Tools                      gdb(1)
    
    NAME
           gdb - The GNU Debugger
    
    SYNOPSIS
           gdb    [-help] [-nx] [-q] [-batch] [-cd=dir] [-f] [-b bps]
                  [-tty=dev] [-s symfile] [-e prog] [-se prog] [-c
                  core] [-x cmds] [-d dir] [prog[core|procID]]
    
    DESCRIPTION
           The  purpose  of a debugger such as GDB is to allow you to
           see what is going on "inside" another program  while  it
    

    The problem is that gdb isn't described as being specific to the C programming language, so the search for a standalone c never matched it!

If you don't have gdb, look for either cdb or dbx, two of the other common debugging environments.


Many kinds of tools are available to help you with developing programs written in the C programming language. Personally, I end up using mostly vi to edit things, make to hide the compilation step, and, sporadically, gdb to find problems while running the program itself.

