Chapter 8. Creating and Using Libraries

Executable binaries can get functions from libraries in one of two ways: the functions can be copied from a static library directly into the executable binary image, or they can be indirectly referenced in a shared library file that is read when the executable is run. This chapter teaches you how to use and create both types of libraries.

Static Libraries

Static libraries are simply collections of object files arranged by the ar (archiver) utility. ar collects object files into one archive file and adds a table that records which object files in the archive define which symbols. The linker, ld, then binds references to a symbol in one object file to the definition of that symbol in an object file in the archive. Static libraries use the suffix .a.

You can convert a group of object files into a static library with a command like

ar rcs libname.a foo.o bar.o baz.o

You can also add one object file at a time to an existing archive.

ar rcs libname.a foo.o
ar rcs libname.a bar.o
ar rcs libname.a baz.o

In either case, libname.a will be the same. The options used here are:

r

Include the object files in the library, replacing any object files already in the archive that have the same names.

c

Silently create the library if it does not already exist.

s

Maintain the table mapping symbol names to object file names.

There is rarely any need to use other options when building static libraries. However, ar has other options and other capabilities; the ar man page describes them in detail.

Shared Libraries

Shared libraries have several advantages over static libraries:

  • Linux shares the memory used for executable code among all the processes that use the shared library, so whenever you have more than one program using the same code, it is to your advantage, and to your users’ advantage, to put the code in a shared library.

  • Because shared libraries save system memory, they can make the whole system work faster, especially in situations in which memory is not plentiful.

  • Because code in a shared library is not copied into the executable, only one copy of the library code resides on disk, saving both disk space and the computer’s time spent copying the code from disk to memory when programs are run.

  • When bugs are found in a library, you can replace the shared library with a version that has the bugs fixed, instead of having to recompile every program that uses the library.

The cost exacted by these advantages is primarily complexity. The executable consists of several interdependent parts, and if you give a binary executable to someone who does not have a shared library that the executable requires, it will not run. A secondary cost is the time it takes at program startup to find and load the shared libraries; this cost is generally offset because the shared libraries have usually already been loaded into memory for other processes, so they do not have to be loaded from disk again when the new process starts.

Linux originally used a simplistic binary file format (actually, three variations on a simplistic binary file format) that made the process of creating shared libraries difficult and time-consuming. Once created, the libraries could not be easily extended in a backward-compatible way. The author of the library had to leave space for data structure expansion by hand-editing tables, and even that did not always work.

Now, the standard binary file format on almost every Linux platform is the modern, extensible Executable and Linking Format (ELF).[1] This means that on practically all Linux platforms, the steps you take to create and use shared libraries are exactly the same.

Designing Shared Libraries

Building shared libraries is only marginally harder than building normal static libraries. There are a few constraints, all of which are easy to manage. There is also one major feature, designed to manage binary compatibility across library versions, that is unique to shared libraries.

Shared libraries are intended to preserve backward compatibility. That is, a binary built against an older version of the library still works when run against a newer version of the library. However, there needs to be a way to mark libraries as incompatible with each other for cases in which developers find it necessary to modify interfaces in a non-backward-compatible manner.

Managing Compatibility

Every Linux shared library is assigned a special name, called a soname, that includes the name of the library and a version number. When developers change interfaces, they increment the version number, altering the name. Some libraries do not have stable interfaces; developers may change their interface in an incompatible way when a new version is released that has changed only a minor number. Most library developers attempt to maintain stable interfaces that change in an incompatible manner only when they release a new major version of the library.

For example, the developers and maintainers of the Linux C library attempt to maintain backward compatibility for all releases of the C library with the same major number. Version 5 of the C library has gone through five minor revisions, and with few exceptions, programs that worked with the first minor revision work with the fifth. (The exceptions have been poorly coded programs that took advantage of C library behavior that was not specified or that was buggy in older versions and fixed in newer versions.)

Because all version 5 C libraries are intended to be backward compatible with older versions, they all use the same soname, libc.so.5, which is related to the name of the file in which it is stored, /lib/libc.so.5.m.r, where m is the minor version number and r is the release number.

Applications that link against a shared library do not link directly against /lib/libc-2.3.2.so (for instance), even though that file exists. The ldconfig program, a standard Linux system utility, creates a symbolic link from /lib/libc.so.6 (the soname) to /lib/libc-2.3.2.so, the real name of the library. This makes upgrading shared libraries easy. To upgrade from 2.3.2 to 2.3.3, it is necessary only to put the new libc-2.3.3.so into the /lib directory and run ldconfig. The ldconfig program looks at all the libraries that provide the libc.so.6 soname and makes the symbolic link from the soname to the latest library that provides the soname. Then all the applications linked against /lib/libc.so.6 automatically use the new library the next time they are run, and /lib/libc-2.3.2.so can be removed immediately, since it is no longer in use.[2]

Unless you have a particular reason to do so, do not link against a specific version of a library. Always use the standard -llibname option to the C compiler or the linker, and you will never accidentally link against the wrong version. The linker will look for the file liblibname.so, which will be a symlink to the correct version of the library.

So, for linking against the C library, the linker finds /usr/lib/libc.so, which tells the linker to use /lib/libc.so.6, which is a link to the /lib/libc-2.3.2.so file. The application is linked against libc-2.3.2.so’s soname, libc.so.6, so when the application is run, it finds /lib/libc.so.6 and links to libc-2.3.2.so, because libc.so.6 is a symlink to libc-2.3.2.so.

Incompatible Libraries

When a new version of a library needs to be incompatible with an old version, it should be given a different soname. For instance, to release a new version of the C library that is incompatible with the old one, developers used the soname libc.so.6 instead of libc.so.5, which shows that it is incompatible and also allows applications linked against either version to coexist on the same system. Applications linked against some version of libc.so.5 will continue to use the latest library version that provides the libc.so.5 soname, and applications linked against some version of libc.so.6 will use the latest library version that provides the libc.so.6 soname.

Designing Compatible Libraries

When you are designing your own libraries, you need to know what makes a library incompatible. There are three main causes of incompatibilities:

  1. Changing or removing exported function interfaces

  2. Changing exported data items, except adding optional items to the ends of structures that are allocated within the library

  3. Changing the behavior of functions to something outside the original specification

To keep new versions of your libraries compatible, you can:

  • Add new functions with different names rather than change the definitions or interfaces of existing functions.

  • When changing exported structure definitions, add items only to the end of the structures, and make the extra items optional or filled in by the library itself. Do not expand structures unless they are allocated within the library. Otherwise, applications will not allocate the right amount of data. Do not expand structures that are used in arrays.
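The structure guideline above can be sketched in C. This is a minimal illustration, not code from any real library: the names struct widget, widget_new(), widget_set_color(), and widget_free() are hypothetical. Because the library allocates the structure itself, a later release can append the color member without disturbing applications built against the earlier layout, and the new capability arrives as a new function rather than as a change to an existing one.

```c
#include <stdlib.h>

/* Hypothetical exported structure. Release 1 had only `size`;
 * release 2 appended `color` at the end, so the offset of `size`
 * is unchanged for applications built against release 1. */
struct widget {
    int size;   /* present since release 1 */
    int color;  /* appended in release 2; optional, defaults to 0 */
};

/* The library allocates the structure, so it always reserves the
 * right amount of memory for the layout it was built with. */
struct widget *widget_new(int size) {
    struct widget *w = calloc(1, sizeof(*w));
    if (w == NULL)
        return NULL;
    w->size = size;
    /* color stays 0: the library fills in the new member itself */
    return w;
}

/* New in release 2: added under a new name instead of changing the
 * signature of widget_new(). */
void widget_set_color(struct widget *w, int color) {
    w->color = color;
}

void widget_free(struct widget *w) {
    free(w);
}
```

An application built against release 1 only calls widget_new() and reads size, so it keeps working with release 2; an application built against release 2 can also call widget_set_color().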

Building Shared Libraries

Once you have grasped the concept of sonames, the rest is easy. Just follow a few simple rules.

  • Build your sources with gcc’s -fPIC flag. This generates position-independent code that can be linked and loaded at any address.[3]

  • Do not use the -fomit-frame-pointer compiler option. The libraries will still work, but debuggers will be useless. When you need a user to provide you with a traceback because of a bug in your code (or a savvy user wants a traceback to do his or her own debugging), it will not work.

  • When linking the library, use gcc rather than ld. The C compiler knows how to call the linker in order to link properly, and there is no guarantee that the interface to ld will remain constant.

  • When linking the library, do not forget to provide the soname. You use a special compiler option: -Wl passes options on to ld, with commas replaced with spaces. Use

    gcc -shared -Wl,-soname,soname -o libname filelist liblist
    

    to build your library, where soname is the soname; libname is the name of the library, including the whole version number, such as libc.so.5.3.12; filelist is the list of object files that you want to put in the library; and liblist is the list of other libraries that provide symbols that will be accessed by this library. The last item is easy to overlook, because the library will still work without it on the system on which it was created, but it may not work in all other situations, such as when multiple libraries are available. For nearly every library, the C library should be included in that list, so explicitly place -lc at the end of this list.

    To create the file libfoo.so.1.0.1, with a soname of libfoo.so.1, from the object files foo.o and bar.o, use this invocation:

    gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0.1 foo.o bar.o \
            -lc
    
  • Do not strip the library unless you are in a particularly space-hungry environment. Shared libraries that have been stripped will still work, but they have the same general disadvantages as libraries built from object files compiled with -fomit-frame-pointer.
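The relationship between the file name and the soname in the example above (libfoo.so.1.0.1 and libfoo.so.1) follows a common convention: the soname is the file name cut off after the major version number. As a rough sketch of that convention only, the hypothetical helper conventional_soname() below derives one from the other. It does not read the library file; the real soname is whatever was passed to the linker with -soname, and some libraries (libc-2.3.2.so, for example) do not follow this pattern.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Derive the conventional soname ("libfoo.so.1") from a fully
 * versioned file name ("libfoo.so.1.0.1"). Returns 0 on success,
 * -1 if the name does not fit the pattern or `out` is too small.
 * This only mirrors the naming convention; the real soname is
 * whatever was passed to the linker with -soname. */
int conventional_soname(const char *filename, char *out, size_t outlen) {
    const char *so = strstr(filename, ".so.");
    if (so == NULL)
        return -1;

    const char *major = so + 4;          /* first char after ".so." */
    const char *end = major;
    while (isdigit((unsigned char)*end)) /* consume the major number */
        end++;
    if (end == major)                    /* no digits: not versioned */
        return -1;

    size_t len = (size_t)(end - filename);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, filename, len);
    out[len] = '\0';
    return 0;
}
```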

Installing Shared Libraries

The ldconfig program does all the hard work of installing a shared library. You just need to put the files in place and run ldconfig. Follow these steps:

  1. Copy the shared library to the directory in which you want to keep it.

  2. If you want the linker to be able to find the library without giving it a -Ldirectory flag, install the library in /usr/lib, or make a symlink in /usr/lib named libname.so that points to the shared library file. You should use a relative symlink (with /usr/lib/libc.so pointing to ../../lib/libc.so.5.3.12) rather than an absolute symlink (with /usr/lib/libc.so pointing to /lib/libc.so.5.3.12).

  3. If you want the linker to be able to find the library without installing it on the system (or before installing it on the system), create a libname.so link in the current directory just like the system-wide one. Then use -L. to tell gcc to look in the current directory for libraries.

  4. If the full pathname of the directory in which you installed the shared library file is not listed in /etc/ld.so.conf, add it, one directory path per line of the file.

  5. Run the ldconfig program, which will make another symlink in the directory in which you installed the shared library file from the soname to the file you installed. It will then make an entry in the dynamic loader cache so that the dynamic loader finds your library when you run programs linked with it, without having to search many directories in an attempt to find it.[4]

You need to create entries in /etc/ld.so.conf and run ldconfig only if you are installing the libraries as system libraries—if you expect that programs linked against the library will automatically work. Other ways to use shared libraries are explained in the next section.

Example

As an extremely simple but still instructive example, we have created a library that contains one short function. Here, in its entirety, is libhello.c:

/* libhello.c */

#include <stdio.h>

void print_hello(void) {
    printf("Hello, library.\n");
}

Of course, we need a program that makes use of libhello:

/* usehello.c */

#include "libhello.h"

int main (void) {
    print_hello();
    return 0;
}

The contents of libhello.h are left as an exercise for the reader.
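For readers who want to compare their answer, one minimal possibility is sketched here. The include-guard name LIBHELLO_H is our own choice; the only requirement implied by libhello.c and usehello.c is a prototype for print_hello() that matches its definition.

```c
/* libhello.h: one possible header for the example library (a sketch) */
#ifndef LIBHELLO_H
#define LIBHELLO_H

/* Prints a greeting on standard output; defined in libhello.c. */
void print_hello(void);

#endif /* LIBHELLO_H */
```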

In order to compile and use this library without installing it in the system, we take the following steps:

  1. Use -fPIC to build an object file for a shared library:

    gcc -fPIC -Wall -g -c libhello.c
    
  2. Link libhello against the C library for best results on all systems:

    gcc -g -shared -Wl,-soname,libhello.so.0 -o libhello.so.0.0 \
            libhello.o -lc
    
  3. Create a pointer from the soname to the library:

    ln -sf libhello.so.0.0 libhello.so.0
    
  4. Create a pointer for the linker to use when linking applications against -lhello:

    ln -sf libhello.so.0 libhello.so
    
  5. Use -L. to cause the linker to look in the current directory for libraries, and use -lhello to tell it what library to link against:

    gcc -Wall -g -c usehello.c -o usehello.o
    gcc -g -o usehello usehello.o -L. -lhello
    

    (This way, if you install the library on the system instead of leaving it in the current directory, your application will still link with the same command line.)

  6. Now run usehello like this:

    LD_LIBRARY_PATH=$(pwd) ./usehello
    

    The LD_LIBRARY_PATH environment variable tells the system where to look for libraries (see the next section for details). Of course, you can install libhello.so.* in the /usr/lib directory and avoid setting the LD_LIBRARY_PATH environment variable, if you like.

Using Shared Libraries

The easiest way to use a shared library is to ignore the fact that it is a shared library. The C compiler automatically uses shared libraries instead of static ones unless it is explicitly told to link with static libraries. However, there are three other ways to use shared libraries. One, explicitly loading and unloading them from within a program while the program runs, is called dynamic loading, and is described in Chapter 27. The other two are explained here.

Using Noninstalled Libraries

When you run a program, the dynamic loader usually looks in a cache (/etc/ld.so.cache, created by ldconfig) of libraries that are in directories mentioned in /etc/ld.so.conf to find libraries that the program needs. However, if the LD_LIBRARY_PATH environment variable is set, it first dynamically scans the directories mentioned in LD_LIBRARY_PATH (which has the same format as the PATH environment variable) and loads any required libraries it finds there, before it looks in its cache.

This means that if you want to use an altered version of the C library when running one specific program, you can put that library in a directory somewhere and run the program with the appropriate LD_LIBRARY_PATH to access that library. As an example, a few versions of the Netscape browser that were linked against the 5.2.18 version of the C library would die with a segmentation fault when run with the standard 5.3.12 C library because of a more stringent enforcement of malloc() policies. Many people put a copy of the 5.2.18 C library in a separate directory, such as /usr/local/netscape/lib/, moved the Netscape binary there, and replaced /usr/local/bin/netscape with a shell script that looked something like this:

#!/bin/sh
export LD_LIBRARY_PATH=/usr/local/netscape/lib:$LD_LIBRARY_PATH
exec /usr/local/netscape/lib/netscape "$@"
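The first step of the search described at the start of this section, walking a colon-separated directory list in order and stopping at the first match, can be modelled in a few lines of C. This is a simplified sketch and not the dynamic loader's actual code: the function name find_in_path_list() is hypothetical, empty list entries are simply skipped, and the fallback to /etc/ld.so.cache is left out.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Search a colon-separated directory list (same format as PATH or
 * LD_LIBRARY_PATH) for `name`. On success, write the full path of
 * the first match to `out` and return 0; otherwise return -1.
 * Simplified model: empty entries are skipped, whereas the real
 * loader has its own rules for them. */
int find_in_path_list(const char *dirs, const char *name,
                      char *out, size_t outlen) {
    const char *p = dirs;

    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of this entry */
        if (len > 0) {
            int n = snprintf(out, outlen, "%.*s/%s", (int)len, p, name);
            if (n > 0 && (size_t)n < outlen && access(out, F_OK) == 0)
                return 0;               /* first match wins */
        }
        p += len;
        if (*p == ':')
            p++;                        /* step over the separator */
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

Because the first match wins, a directory placed at the front of the list, as in the wrapper script above, takes precedence over every directory that follows it.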

Preloading Libraries

Sometimes, rather than replacing an entire shared library, you wish to replace only a few functions. Because the dynamic loader searches for functions starting with the first loaded library and proceeds through the stack of libraries in order, it would be convenient to be able to tack an alternative library on top of the stack to replace only the functions you need.

An example is zlibc. This library replaces file functions in the C library with functions that deal with compressed files. When a file is opened, zlibc looks for both the requested file and a gzipped version of the file. If the requested file exists, zlibc mimics the C library functions exactly, but if it does not exist, and a gzipped version exists instead, it transparently uncompresses the gzipped file without the application knowing. There are limitations, which are described in the library’s documentation, but it can trade off speed for a considerable amount of space.
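The lookup order that makes this kind of replacement possible can be pictured with a toy model in C. Nothing here comes from the dynamic loader itself: struct toy_lib and toy_resolve() are hypothetical names, and each "library" is just a list of symbol names. The point is only that the search stops at the first library that defines the symbol, so a library placed at the front of the list overrides the ones behind it for exactly the symbols it defines.

```c
#include <stddef.h>
#include <string.h>

/* Toy model of one loaded library: a name plus a NULL-terminated
 * list of the symbols it defines. */
struct toy_lib {
    const char *name;
    const char *const *symbols;
};

/* Return the name of the first library in `libs` that defines
 * `symbol`, or NULL if none does. Earlier entries take precedence,
 * which is why a preloaded library placed first can override
 * selected functions from the libraries after it. */
const char *toy_resolve(const struct toy_lib *libs, size_t nlibs,
                        const char *symbol) {
    for (size_t i = 0; i < nlibs; i++) {
        for (const char *const *s = libs[i].symbols; *s != NULL; s++) {
            if (strcmp(*s, symbol) == 0)
                return libs[i].name;
        }
    }
    return NULL;
}
```

With a preloaded library that defines only open listed ahead of a C library that defines open, read, and close, a lookup of open resolves to the preloaded library, while read and close still resolve to the C library.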

There are two ways to preload a library. To affect only certain programs, you can set an environment variable for the cases you wish to affect:

LD_PRELOAD=/lib/libsomething.o exec /bin/someprogram "$@"

However, as with zlibc, you might want to preload a library for every program on the system. The easiest way to do that is to add a line to the /etc/ld.so.preload file specifying which library to preload. In the case of zlibc, it would look something like this:

/lib/uncompress.o


[1] See Understanding ELF Object Files and Debugging Tools [Nohr, 1994] or ftp://tsx-11.mit.edu/pub/linux/packages/GCC/ELF.doc.tar.gz for detailed information on the ELF format. The Linux-specific details are covered in the document ftp://tsx-11.mit.edu/pub/linux/packages/GCC/elf.ps.gz.

[2] That is, you can use the rm command to remove it from the directory structure immediately; programs that are still using it keep it on the disk automatically until they exit. See page 194 for an explanation of how this works.

[3] The difference between -fPIC and -fpic relates to how the position-independent code is generated. On some architectures, only relatively small shared libraries can be built with -fpic while on others they do exactly the same thing. Unless you have a very good reason to use -fpic, just use -fPIC and things will work properly on every architecture.

[4] If you remove /etc/ld.so.cache, you may be able to detect the slowdown in your system. Run ldconfig to regenerate /etc/ld.so.cache.
