Chapter 6. The GNU C Library

The GNU C Library (glibc) is the standard C library on Linux systems. Other C libraries exist and are sometimes used for special purposes (such as very small subsets of the standard C libraries used for embedded systems and bootstrapping), but glibc is the standard C library on all Linux distributions. It provides a significant portion of the functionality documented in Linux Application Development—in fact, this book might more accurately but less concisely have been titled Linux and glibc Application Development.

Feature Selection

In glibc, a set of feature selection macros determines which standards glibc complies with. Standards sometimes conflict, so glibc lets you choose exactly which standards (formal, de jure, and informal, de facto) to comply with, fully or partially. These macros are technically called feature test macros.

You need to be aware of these macros because the default set of macros defined does not provide all the functionality of glibc. A few mechanisms discussed in this book are not available with the default feature set selected; we document the required feature macros to use each of these mechanisms.

The feature test macros specify which standards (de jure or de facto), and in some cases precisely which versions of those standards, glibc should comply with. This compliance often includes not defining functions and macros beyond what is specified by a standard for header files that are themselves defined by that standard. That means that an application written to conform with a standard can define its own functions and macros without conflicting with extensions not defined by that standard.

The feature test macros do not guarantee that your application is fully compatible with the set of standards specified by the set of macros you define. Setting the feature test macros may find some use of nonportable extensions, but it will not show, for example, use of header files that are entirely unspecified by the standard.

The macros are interpreted in the system header file features.h, which you should not include directly. Instead, every other header file that might be affected by the contents of features.h includes it.

The default set of feature macros if none are defined is _SVID_SOURCE=1, _BSD_SOURCE=1, _POSIX_SOURCE=1, and _POSIX_C_SOURCE=199506L. Each option is described in more detail below, but this essentially translates into “support the capabilities of the 1995 POSIX standard (see page 8; this is from before POSIX and the Single Unix Standard were combined), all standard System V features, and all BSD features that do not conflict with System V features.” This default set of feature macros suffices for most programs.

When you give gcc the -ansi option, as documented on page 46, it automatically defines the internal __STRICT_ANSI__ macro, which turns off all the default feature macros.

With the exception of the __STRICT_ANSI__ macro, which is special (and which should be set only by the compiler in the context of the -ansi command line option), these feature macros are cumulative; you can define any combination of them. The exact definition of _BSD_SOURCE changes depending on which other feature macros are set (as documented below); the rest are purely cumulative.

Some of the feature test macros are defined by various versions of POSIX or other standards, some are common in the industry, and others are strictly limited to glibc.

_POSIX_SOURCE

If this macro is defined, all the interfaces defined as part of the original POSIX.1 specification are made available.

This macro was defined by the original POSIX.1-1990 standard.

_POSIX_C_SOURCE

This macro supersedes _POSIX_SOURCE. If it is set to 1, it is equivalent to _POSIX_SOURCE. If it is >=2, then it also includes C interfaces defined by POSIX.2, including regular expressions. If it is >=199309L, then it also includes additional C interfaces defined in the 1993 revision of POSIX, particularly including the soft real-time functionality; if it is >=199506L (the default), it also includes additional C interfaces defined in the 1995 revision of POSIX, particularly including POSIX threads. This macro was defined by versions of POSIX released after 1990 in order to differentiate support for various versions of the POSIX (and now also Single Unix) standards. It is largely superseded by _XOPEN_SOURCE.

_XOPEN_SOURCE

The _XOPEN_SOURCE macro is defined by the XSI portion of the Single Unix Standard, and defines a logical superset of the interfaces included by _POSIX_C_SOURCE. It was also defined by XPG. If it is defined at all, base-level conformance with XPG4 (Unix95) is included. If it is defined as 500, then base-level conformance with XPG5 (Unix98, SuS version 2) is included. If it is defined as 600, base-level conformance with IEEE Std 1003.1-2003 (the combined POSIX and SuS document) is included.

_ISOC99_SOURCE

This feature test macro exports the interfaces defined by the new ISO/IEC C99 standard.

_SVID_SOURCE

This feature test macro makes functionality specified by the System V Interface Definition (SVID) available. This does not imply that glibc provides a complete implementation of the SVID standard; it merely exposes the SVID-specified functionality that exists in glibc.

_BSD_SOURCE

BSD features can conflict with other features, and the conflicts are always resolved in favor of System V or standard-compliant behavior if any POSIX, X/Open, or System V feature macro is defined or implied, so the only feature macro that allows BSD behavior to be asserted is _ISOC99_SOURCE. (The exact definition of this feature test macro has changed from time to time, and may change again, since it is not specified by any standard.)

_GNU_SOURCE

_GNU_SOURCE turns on everything possible, favoring System V interfaces to BSD interfaces in cases of conflict. It also adds some GNU- and Linux-specific interfaces, such as file leases.

When the standard set of feature test macros will not suffice, the most commonly useful feature macros to define are _GNU_SOURCE (turn everything on—the easiest solution), _XOPEN_SOURCE=600 (most things you are likely to care about, a subset of _GNU_SOURCE), or _ISOC99_SOURCE (use features from the most recent C standard, a subset of _XOPEN_SOURCE=600).
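As a minimal sketch of how a feature test macro is used in practice, the following fragment requests _XOPEN_SOURCE=600 and then reports the levels that the headers actually put in effect. The function name format_feature_summary() is hypothetical and exists only for illustration. The important detail is ordering: the macro must be defined before the first system header is included, because that first header pulls in features.h.

```c
/* Feature test macros only take effect if they are defined before the
 * first system header is included. The guard avoids a redefinition
 * warning if the macro was already set on the compiler command line. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif

#include <stdio.h>

/* Hypothetical helper: write the feature levels visible to this
 * translation unit into buf. Returns the snprintf() result. */
int format_feature_summary(char *buf, size_t len)
{
#ifdef _POSIX_C_SOURCE
    /* glibc derives this from _XOPEN_SOURCE when it is not set explicitly. */
    long posix_level = (long)(_POSIX_C_SOURCE - 0);
#else
    long posix_level = 0;   /* the headers did not define it */
#endif
    /* "- 0" keeps the expression valid even if the macro is defined empty. */
    long xopen_level = (long)(_XOPEN_SOURCE - 0);

    return snprintf(buf, len, "_POSIX_C_SOURCE=%ld _XOPEN_SOURCE=%ld",
                    posix_level, xopen_level);
}
```

The same effect can usually be had without touching the source by passing the definition on the compiler command line, for example -D_XOPEN_SOURCE=600, which guarantees that the macro is seen before any header.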

POSIX Interfaces

POSIX Required Types

POSIX defines several typedefs defined in the header file <sys/types.h> and used for many arguments and return values. These typedefs are important because the standard C types can vary from machine to machine and are loosely defined by the C standard. The C language is more useful on a wide range of hardware because of this loose definition—a 16-bit machine does not have the same native word size as a 64-bit machine, and a low-level programming language should not pretend it does—but POSIX needs more guarantees, and so requires that the C library’s <sys/types.h> header file define a set of consistent types for each machine that implements POSIX. Each of these typedefs can be easily distinguished from a native C type because it ends in _t.

The subset used for interfaces described in this book is:

dev_t

An arithmetic type holding the major and minor numbers corresponding to device special files, normally found in the /dev subdirectory. In Linux, a dev_t can be manipulated using the major(), minor(), and makedev() macros found in <sys/sysmacros.h>. It is normally used only for system programming, and is described on page 189.

uid_t, gid_t

Integer types holding a unique user ID number or group ID number, respectively. The user ID and group ID credentials are described on page 108.

pid_t

An integer type providing a unique value for a process on a system, described on page 107.

id_t

An integer type capable of holding, without truncation, any pid_t, uid_t, or gid_t.

off_t

A signed integer type measuring a file size in bytes.

size_t

An unsigned integer type measuring an in-memory object, such as a character string, array, or buffer.

ssize_t

A signed integer type that holds a count of bytes (positive) or an error return code (negative).

time_t

An integer (on all normal systems) or real floating point (so that VMS can be considered a POSIX operating system) type giving the time in seconds, as described on page 484.

The type descriptions are intentionally vague. There is no guarantee that the types will be the same on two different Linux platforms, or even two different environments running on the same platform. It is quite likely that a 64-bit machine that supports both 64-bit and 32-bit environments will have different values in each environment for some of these types. Also, these types may change in future versions of Linux, within the scope allowed by POSIX.
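Because the widths of these types are not fixed, code that prints them should not guess at a printf() conversion. One common approach, sketched here under the assumption of a C99 library, is to cast signed POSIX types to intmax_t and print them with %jd, and to use %zu for size_t. The helper names are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Format a process ID without assuming the width of pid_t.
 * POSIX requires pid_t to be a signed integer type, so intmax_t
 * can represent every possible value. */
int format_pid(char *buf, size_t len, pid_t pid)
{
    return snprintf(buf, len, "%jd", (intmax_t)pid);
}

/* Format a file size or offset. off_t is signed and may be 32 or
 * 64 bits wide depending on the platform and build environment. */
int format_offset(char *buf, size_t len, off_t offset)
{
    return snprintf(buf, len, "%jd", (intmax_t)offset);
}

/* size_t has its own length modifier, so no cast is needed. */
int format_size(char *buf, size_t len, size_t count)
{
    return snprintf(buf, len, "%zu", count);
}
```

The cast costs nothing on platforms where the type is already as wide as intmax_t, and it keeps the same source correct in both 32-bit and 64-bit environments.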

Discovering Run-Time Capabilities

Many system capabilities have limits, others are optional, and some may have information associated with them. A limit on the length of the string of arguments passed to a new program protects the system from arbitrary demands for memory that could otherwise bring the system to a standstill. Not all POSIX systems implement job control. A program may wish to know the most recent version of the POSIX standard the currently running system claims to implement.

The sysconf() function provides this type of system-specific information that may differ from system to system for a single executable, information that cannot be known at the time the executable is compiled.

#include <unistd.h>

long sysconf(int);

The integer argument to sysconf() is one of a set of macros prefixed with _SC_. Here are the ones that are most likely to be useful to you:

_SC_CLK_TCK

Return the number of kernel internal clock ticks per second, as made visible to programs. Note that the kernel may have one or more clocks that run at a higher rate; _SC_CLK_TCK provides the accounting clock tick unit used to report information from the kernel and is not an indicator of system latency.

_SC_STREAM_MAX

Return the maximum number of C standard I/O streams that a process can have open at once.

_SC_ARG_MAX

Return the maximum length, in bytes, of the command-line arguments and environment variables used by any of the exec() functions. If this limit is exceeded, E2BIG is returned by the exec() call.

_SC_OPEN_MAX

Returns the maximum number of files that a process can have open at once; it is the same as the RLIMIT_NOFILE soft limit that can be queried by getrlimit() and set by setrlimit(). This is the only sysconf() value that can change value during the execution of a program; when setrlimit() is called to change the RLIMIT_NOFILE soft limit, _SC_OPEN_MAX follows the new soft limit.

_SC_PAGESIZE or _SC_PAGE_SIZE

Returns the size of a single page in bytes. On systems that can support multiple page sizes, returns the size of a single normal page as allocated to resolve a normal user-space request for memory, which is considered the native page size for the system.

_SC_LINE_MAX

Returns the length in bytes of the maximum line length that text-processing utilities on the system are required to handle, including the trailing newline character. Note that many of the GNU utilities used on Linux systems actually have no hard-coded maximum length and can take arbitrarily long input lines. However, a portable program must not provide text-processing utilities with text with line lengths longer than _SC_LINE_MAX; many Unix systems have utilities with fixed maximum line lengths, and exceeding this line length may produce undefined output.

_SC_NGROUPS_MAX

Returns the number of supplemental groups (as discussed in Chapter 10) a process can have.
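A return value of -1 from sysconf() is ambiguous: it can mean that the name is not recognized (errno is set to EINVAL) or that the limit is indeterminate or the option unsupported (errno is left unchanged). The following sketch wraps that convention in a hypothetical helper, query_sysconf(), which clears errno before the call so the two cases can be told apart.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper around sysconf().
 * Returns  1 and stores the result in *value on success,
 *          0 if the limit is indeterminate or the option unsupported,
 *         -1 if the name is not recognized (errno is EINVAL). */
int query_sysconf(int name, long *value)
{
    errno = 0;                      /* so a stale errno is not misread */
    long result = sysconf(name);

    if (result == -1)
        return errno == 0 ? 0 : -1;

    *value = result;
    return 1;
}
```

A caller might use query_sysconf(_SC_PAGESIZE, &size) to size a buffer, or query_sysconf(_SC_CLK_TCK, &ticks) to convert tick counts reported by the kernel into seconds.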

Finding and Setting Basic System Information

A few pieces of information about the system on which a program is running can be useful. The operating system name and version, for example, can be used to change what features system programs provide. The uname() system call allows a program to discover this run-time information.

#include <sys/utsname.h>

int uname(struct utsname * unameBuf);

The function returns nonzero on error, which occurs only if unameBuf is invalid. Otherwise, the structure it points to is filled in with NUL-terminated strings describing the system the program is running on. Table 6.1 describes the members of struct utsname.

Table 6.1. Members of struct utsname

Member        Description
sysname       The name of the operating system running (Linux for the purposes of this book).
release       The version number of the kernel that is running. This is the full version, such as 2.6.2. This number can be easily changed by whoever builds a kernel, and it is common for more than these three numbers to appear. Many distributions use an additional number to describe what patches they have applied, leading to release numbers like 2.4.17-23.
version       Under Linux, this contains a time stamp describing when the kernel was built.
machine       A short string specifying the type of microprocessor on which the operating system is running. This could be i686 for a Pentium Pro or later processor, alpha for an Alpha-class processor, or ppc64 for a 64-bit PowerPC processor.
nodename      The host name of the machine, which is often the machine’s primary Internet host name.
domainname    The NIS (or YP) domain the machine is part of, if any.
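A short sketch of uname() in use follows. The helper format_system_id() is a hypothetical name; it fills a caller-supplied buffer with a one-line summary built from four of the members. The sketch leaves out domainname, which glibc exposes under that name only when _GNU_SOURCE is defined.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: summarize the running system in buf.
 * Returns the snprintf() result, or -1 if uname() fails. */
int format_system_id(char *buf, size_t len)
{
    struct utsname info;

    if (uname(&info) != 0)
        return -1;

    /* Every member is a NUL-terminated string, so %s is safe. */
    return snprintf(buf, len, "%s %s (%s) on %s",
                    info.sysname, info.release,
                    info.machine, info.nodename);
}
```

Programs that branch on the kernel version should parse release defensively, since distributions append their own suffixes to it.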

The nodename member is what is commonly called the system host name (it is what the hostname command displays), but it should not be confused with an Internet host name. While these are the same on many systems, they are not necessarily the same thing. A system with multiple Internet addresses has multiple Internet host names, but only a single node name, so there is not a one-to-one equivalence.

A more common situation is home computers on broadband Internet connections. They normally have Internet host names something like host127-56.raleigh.myisp.com, and their Internet host names change whenever they have disconnected their broadband modem for an extended period of time.[1] People who own those machines give them node names that better suit their personalities, along the lines of loren or eleanor, which are not proper Internet addresses at all. If they have multiple machines behind a home gateway device, all of those machines will share a single Internet address (and a single Internet host name), but may have names like linux.mynetwork.org and freebsd.mynetwork.org, which are still not Internet host names. For all of these reasons, assuming that the system’s node name is a valid Internet host name for the machine is not a good idea.

The system’s node name is set using the sethostname() system call,[2] and its NIS (YP) domain name[3] is set by the setdomainname() system call.

#include <unistd.h>

int sethostname(const char * name, size_t len);
int setdomainname(const char * name, size_t len);

Both of these system calls take a pointer to a string (not necessarily NUL-terminated) containing the appropriate name and the length of the string.
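The following sketch shows one way to call sethostname() safely. The wrapper set_node_name() is hypothetical; it checks the length before making the system call and passes the length without the terminating NUL. Two assumptions are worth stating: the declaration is requested with _DEFAULT_SOURCE, which recent glibc releases use (older releases used _BSD_SOURCE), and the fallback value of 64 for HOST_NAME_MAX is the Linux value. The call itself succeeds only for a suitably privileged process; otherwise it fails with EPERM.

```c
/* Assumption: recent glibc, where _DEFAULT_SOURCE exposes sethostname(). */
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif

#include <errno.h>
#include <limits.h>
#include <string.h>
#include <unistd.h>

#ifndef HOST_NAME_MAX
#define HOST_NAME_MAX 64    /* assumption: the value used on Linux */
#endif

/* Hypothetical wrapper: validate the name, then set the node name.
 * Returns 0 on success, -1 with errno set on failure. */
int set_node_name(const char *name)
{
    if (name == NULL) {
        errno = EINVAL;
        return -1;
    }

    size_t len = strlen(name);
    if (len == 0 || len > HOST_NAME_MAX) {
        errno = EINVAL;     /* reject before reaching the kernel */
        return -1;
    }

    /* The length does not include the terminating NUL. */
    return sethostname(name, len);
}
```

setdomainname() takes the same arguments and could be wrapped the same way.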

Compatibility

Applications that are compiled with header files from and linked to one version of glibc are intended to continue to work with later versions of glibc. This backward compatibility generally means that you do not have to rebuild applications just because a new version of glibc has been released.

There are practical limitations to this backward compatibility. First, mixing objects linked against different versions of glibc in a single executable may sometimes accidentally work, but not by intentional design. Note that this includes dynamically loaded objects as well as statically linked objects. Second, the application should be standard-conforming; an application that depends on side effects of bugs or on unspecified behavior in one version of glibc may not continue to work with later versions of glibc, and an application that links to private symbols in glibc (generally speaking, symbols prefixed with a _ character) is also very unlikely to work with later versions of glibc.

One way that this backward compatibility is maintained is with versioned symbols. When the authors of glibc wish to introduce an incompatible change in glibc, they preserve the original implementation or write a compatible implementation of the interface in question and mark it with the older glibc version number. They then implement the newer, different interface (which may differ in semantics, signature, or both) and mark it with the new glibc version number. Applications built against the older glibc version use the older interface; applications built against the newer glibc version use the new interface.
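Versioned symbols are resolved by the dynamic linker, so applications rarely need to inspect version numbers themselves. Still, the difference between the glibc an application was built against and the glibc it is running on can be made visible. The sketch below compares the compile-time macros __GLIBC__ and __GLIBC_MINOR__ with the string returned by gnu_get_libc_version(), a glibc-specific function declared in <gnu/libc-version.h>. The helper name runtime_glibc_is_current() is hypothetical, and the code does not build against other C libraries.

```c
#include <gnu/libc-version.h>   /* glibc-specific: gnu_get_libc_version() */
#include <stdio.h>

/* Hypothetical helper.
 * Returns  1 if the running glibc is at least as new as the one this
 *            code was compiled against,
 *          0 if it is older,
 *         -1 if the version string could not be parsed. */
int runtime_glibc_is_current(void)
{
    int major = 0, minor = 0;

    /* The string has the form "major.minor", possibly with a suffix. */
    if (sscanf(gnu_get_libc_version(), "%d.%d", &major, &minor) != 2)
        return -1;

    if (major != __GLIBC__)
        return major > __GLIBC__;

    return minor >= __GLIBC_MINOR__;
}
```

A result of 1 is the supported case described above: an application built against an older glibc running on the same or a newer one.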

Most other libraries maintain compatibility by including the version number in the library name and having multiple different versions installed at the same time if necessary; for example, GTK+ 1.2 and GTK+ 2.0 might both be installed on the same system, each with its own set of header and library files and with the version name embedded in the path to the header files and in the names of the library files.

Part of the standard for naming shared libraries on Linux includes a major version number, in order to provide multiple versions of a library on a system. This is not much used for its intended purpose, because it does not allow you to link new applications against multiple versions of a library on a single system; it merely allows backward compatibility to be maintained for already-built applications built on older systems. In practice, developers have needed to build applications against multiple versions of the same library on one system, and therefore most major libraries have migrated to providing the version number of the library as part of the name of the library.



[1] Most, but not all, home Internet services assign dynamic IP addresses rather than static ones.

[2] Despite the misleading name, this system call sets the node name, not the machine’s Internet host name.

[3] Network Information Service, or NIS, is a mechanism for machines on a network to share information such as user names and passwords. It used to be called Yellow Pages, or YP, but was renamed. The NIS domain name is part of this mechanism, which is implemented outside of the kernel with the exception of the domain name being stored in struct utsname.
