The GNU C Library Reference Manual

Sandra Loosemore with Roland McGrath, Andrew Oram, and Richard M. Stallman

last updated 9 April 1993

for version 1.06 Beta

Copyright (C) 1993 Free Software Foundation, Inc.

Introduction

The C language provides no built-in facilities for performing such common operations as input/output, memory management, string manipulation, and the like. Instead, these facilities are defined in a standard library, which you compile and link with your programs.

The GNU C library, described in this document, defines all of the library functions that are specified by the ANSI C standard, as well as additional features specific to POSIX and other derivatives of the Unix operating system, and extensions specific to the GNU system.

The purpose of this manual is to tell you how to use the facilities of the GNU library. We have mentioned which features belong to which standards to help you identify things that are potentially nonportable to other systems. But the emphasis of this manual is not on strict portability.

Getting Started

This manual is written with the assumption that you are at least somewhat familiar with the C programming language and basic programming concepts. Specifically, familiarity with ANSI standard C (see section ANSI C), rather than "traditional" pre-ANSI C dialects, is assumed.

The GNU C library includes several header files, each of which provides definitions and declarations for a group of related facilities; this information is used by the C compiler when processing your program. For example, the header file `stdio.h' declares facilities for performing input and output, and the header file `string.h' declares string processing utilities. The organization of this manual generally follows the same division as the header files.

If you are reading this manual for the first time, you should read all of the introductory material and skim the remaining chapters. There are a lot of functions in the GNU C library and it's not realistic to expect that you will be able to remember exactly how to use each and every one of them. It's more important to become generally familiar with the kinds of facilities that the library provides, so that when you are writing your programs you can recognize when to make use of library functions, and where in this manual you can find more specific information about them.

Standards and Portability

This section discusses the various standards and other sources that the GNU C library is based upon. These sources include the ANSI C and POSIX standards, and the System V and Berkeley Unix implementations.

The primary focus of this manual is to tell you how to make effective use of the GNU library facilities. But if you are concerned about making your programs compatible with these standards, or portable to operating systems other than GNU, this can affect how you use the library. This section gives you an overview of these standards, so that you will know what they are when they are mentioned in other parts of the manual.

See section Summary of Library Facilities, for an alphabetical list of the functions and other symbols provided by the library. This list also states which standards each function or symbol comes from.

ANSI C

The GNU C library is compatible with the C standard adopted by the American National Standards Institute (ANSI): American National Standard X3.159-1989---"ANSI C". The header files and library facilities that make up the GNU library are a superset of those specified by the ANSI C standard.

If you are concerned about strict adherence to the ANSI C standard, you should use the `-ansi' option when you compile your programs with the GNU C compiler. This tells the compiler to define only ANSI standard features from the library header files, unless you explicitly ask for additional features. See section Feature Test Macros, for information on how to do this.

Being able to restrict the library to include only ANSI C features is important because ANSI C puts limitations on what names can be defined by the library implementation, and the GNU extensions don't fit these limitations. See section Reserved Names, for more information about these restrictions.

This manual does not attempt to give you complete details on the differences between ANSI C and older dialects. It gives advice on how to write programs to work portably under multiple C dialects, but does not aim for completeness.

POSIX (The Portable Operating System Interface)

The GNU library is also compatible with the IEEE POSIX family of standards, known more formally as the Portable Operating System Interface for Computer Environments. POSIX is derived mostly from various versions of the Unix operating system.

The library facilities specified by the POSIX standard are a superset of those required by ANSI C; POSIX specifies additional features for ANSI C functions, as well as specifying new additional functions. In general, the additional requirements and functionality defined by the POSIX standard are aimed at providing lower-level support for a particular kind of operating system environment, rather than general programming language support which can run in many diverse operating system environments.

The GNU C library implements all of the functions specified in IEEE Std 1003.1-1988, the POSIX System Application Program Interface, commonly referred to as POSIX.1. The primary extensions to the ANSI C facilities specified by this standard include file system interface primitives (see section File System Interface), device-specific terminal control functions (see section Low-Level Terminal Interface), and process control functions (see section Child Processes).

Some facilities from draft 11 of IEEE Std 1003.2, the POSIX Shell and Utilities standard (POSIX.2) are also implemented in the GNU library. These include utilities for dealing with regular expressions and other pattern matching facilities (see section Pattern Matching).

Berkeley Unix

The GNU C library defines facilities from some other versions of Unix, specifically from the 4.2 BSD and 4.3 BSD Unix systems (also known as Berkeley Unix) and from SunOS (a popular 4.2 BSD derivative that includes some Unix System V functionality).

The BSD facilities include symbolic links (see section Symbolic Links), the select function (see section Waiting for Input or Output), the BSD signal functions (see section BSD Signal Handling), and sockets (see section Sockets).

SVID (The System V Interface Description)

The System V Interface Description (SVID) is a document describing the AT&T Unix System V operating system. It is to some extent a superset of the POSIX standard (see section POSIX (The Portable Operating System Interface)).

The GNU C library defines some of the facilities required by the SVID that are not also required by the ANSI or POSIX standards, for compatibility with System V Unix and other Unix systems (such as SunOS) which include these facilities. However, many of the more obscure and less generally useful facilities required by the SVID are not included. (In fact, Unix System V itself does not provide them all.)

Incomplete: Are there any particular System V facilities that ought to be mentioned specifically here?

Using the Library

This section describes some of the practical issues involved in using the GNU C library.

Header Files

Libraries for use by C programs really consist of two parts: header files that define types and macros and declare variables and functions; and the actual library or archive that contains the definitions of the variables and functions.

(Recall that in C, a declaration merely provides information that a function or variable exists and gives its type. For a function declaration, information about the types of its arguments might be provided as well. The purpose of declarations is to allow the compiler to correctly process references to the declared variables and functions. A definition, on the other hand, actually allocates storage for a variable or says what a function does.)

In order to use the facilities in the GNU C library, you should be sure that your program source files include the appropriate header files. This is so that the compiler has declarations of these facilities available and can correctly process references to them. Once your program has been compiled, the linker resolves these references to the actual definitions provided in the archive file.

Header files are included into a program source file by the `#include' preprocessor directive. The C language supports two forms of this directive; the first,

#include "header"

is typically used to include a header file `header' that you write yourself; this would contain definitions and declarations describing the interfaces between the different parts of your particular application. By contrast,

#include <file.h>

is typically used to include a header file `file.h' that contains definitions and declarations for a standard library. This file would normally be installed in a standard place by your system administrator. You should use this second form for the C library header files.

Typically, `#include' directives are placed at the top of the C source file, before any other code. If you begin your source files with some comments explaining what the code in the file does (a good idea), put the `#include' directives immediately afterwards, following the feature test macro definition (see section Feature Test Macros).

For more information about the use of header files and `#include' directives, see section 'Header Files' in The GNU C Preprocessor Manual.

The GNU C library provides several header files, each of which contains the type and macro definitions and variable and function declarations for a group of related facilities. This means that your programs may need to include several header files, depending on exactly which facilities you are using.

Some library header files include other library header files automatically. However, as a matter of programming style, you should not rely on this; it is better to explicitly include all the header files required for the library facilities you are using. The GNU C library header files have been written in such a way that it doesn't matter if a header file is accidentally included more than once; including a header file a second time has no effect. Likewise, if your program needs to include multiple header files, the order in which they are included doesn't matter.

Compatibility Note: Inclusion of standard header files in any order and any number of times works in any ANSI C implementation. However, this has traditionally not been the case in many older C implementations.

Strictly speaking, you don't have to include a header file to use a function it declares; you could declare the function explicitly yourself, according to the specifications in this manual. But it is usually better to include the header file because it may define types and macros that are not otherwise available and because it may define more efficient macro replacements for some functions. It is also a sure way to have the correct declaration.
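As a minimal sketch of the first point, the fragment below declares abs by hand instead of including `stdlib.h'. The wrapper name magnitude is hypothetical, and the hand-written declaration is assumed to match the library's own declaration of abs; including the header remains the safer choice.

```c
/* Declare the library function by hand instead of including
   <stdlib.h>.  The declaration must match the library's own.  */
extern int abs (int);

/* Return the absolute value of x (hypothetical wrapper).
   Undefined for INT_MIN, exactly as abs itself is.  */
int
magnitude (int x)
{
  return abs (x);
}
```

A call such as magnitude (-5) then resolves to the library's abs at link time, but none of the types or macros from `stdlib.h' are available in this file.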

Macro Definitions of Functions

If we describe something as a function in this manual, it may have a macro definition as well. This normally has no effect on how your program runs--the macro definition does the same thing as the function would. In particular, macro equivalents for library functions evaluate arguments exactly once, in the same way that a function call would. The main reason for these macro definitions is that sometimes they can produce an inline expansion that is considerably faster than an actual function call.

Taking the address of a library function works even if it is also defined as a macro. This is because, in this context, the name of the function isn't followed by the left parenthesis that is syntactically necessary to recognize a macro call.
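The following sketch illustrates this with abs. The names int_fn, map_ints, and abs_all are hypothetical helpers invented for the example; only abs comes from the library.

```c
#include <stdlib.h>

/* Pointer to a function that takes and returns an int.  */
typedef int (*int_fn) (int);

/* Apply fn to each of the n elements of v, in place.  */
void
map_ints (int *v, size_t n, int_fn fn)
{
  size_t i;

  for (i = 0; i < n; i++)
    v[i] = fn (v[i]);
}

/* Replace each element of v by its absolute value.  Here `abs' is
   not followed by a left parenthesis, so even if <stdlib.h> also
   defines a macro named abs, this names the function itself.  */
void
abs_all (int *v, size_t n)
{
  map_ints (v, n, abs);
}
```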

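You might occasionally want to avoid using the macro definition of a function, perhaps to make your program easier to debug. There are two ways you can do this: enclose the name of the function in parentheses at the point of use, or remove the macro definition for the rest of the source file with the `#undef' preprocessor directive.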

For example, suppose the header file `stdlib.h' declares a function named abs with

extern int abs (int);

and also provides a macro definition for abs. Then, in:

#include <stdlib.h>
int f (int *i) { return (abs (++*i)); }

the reference to abs might refer to either a macro or a function. On the other hand, in each of the following examples the reference is to a function and not a macro.

#include <stdlib.h>
int g (int *i) { return ((abs)(++*i)); }

#undef abs
int h (int *i) { return (abs (++*i)); }

Since macro definitions that double for a function behave in exactly the same way as the actual function version, there is usually no need for any of these methods. In fact, removing macro definitions usually just makes your program slower.

Reserved Names

The names of all library types, macros, variables and functions that come from the ANSI C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:

In addition to the names documented in this manual, reserved names include all external identifiers (global functions and variables) that begin with an underscore (`_') and all identifiers, regardless of use, that begin with either two underscores or an underscore followed by a capital letter. This is so that the library and header files can define functions, variables, and macros for internal purposes without risk of conflict with names in user programs.

Some additional classes of identifier names are reserved for future extensions to the C language. While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of the C standard, so you should avoid these names.

In addition, some individual header files reserve names beyond those that they actually define. You only need to worry about these restrictions if your program includes that particular header file.

Feature Test Macros

The exact set of features available when you compile a source file is controlled by which feature test macros you define.

If you compile your programs using `gcc -ansi', you get only the ANSI C library features, unless you explicitly request additional features by defining one or more of the feature macros. See section 'Options' in The GNU CC Manual, for more information about GCC options.

You should define these macros by using `#define' preprocessor directives at the top of your source code files. You could also use the `-D' option to GCC, but it's better if you make the source files indicate their own meaning in a self-contained way.
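As a minimal sketch of this advice, a source file might open as shown below. The file name count.c, the function count_digits, and the choice of _GNU_SOURCE are assumptions made for illustration; the point is only the order of the comment, the feature test macro, and the `#include' directives.

```c
/* count.c: count the decimal digits in a string
   (hypothetical example file).  */

/* The feature test macro comes first, before any header,
   so that every header sees it.  */
#define _GNU_SOURCE

#include <ctype.h>
#include <stddef.h>

/* Return the number of decimal digit characters in s.  */
size_t
count_digits (const char *s)
{
  size_t n = 0;

  for (; *s != '\0'; s++)
    if (isdigit ((unsigned char) *s))
      n++;
  return n;
}
```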

Macro: _POSIX_SOURCE

If you define this macro, then the functionality from the POSIX.1 standard (IEEE Standard 1003.1) is available, as well as all of the ANSI C facilities.

Macro: _POSIX_C_SOURCE

If you define this macro with a value of 1, then the functionality from the POSIX.1 standard (IEEE Standard 1003.1) is made available. If you define this macro with a value of 2, then both the functionality from the POSIX.1 standard and the functionality from the POSIX.2 standard (IEEE Standard 1003.2) are made available. This is in addition to the ANSI C facilities.

Macro: _BSD_SOURCE

If you define this macro, functionality derived from 4.3 BSD Unix is included as well as the ANSI C, POSIX.1, and POSIX.2 material.

Some of the features derived from 4.3 BSD Unix conflict with the corresponding features specified by the POSIX.1 standard. If this macro is defined, the 4.3 BSD definitions take precedence over the POSIX definitions.

Macro: _SVID_SOURCE

If you define this macro, functionality derived from SVID is included as well as the ANSI C, POSIX.1, and POSIX.2 material.

Macro: _GNU_SOURCE

If you define this macro, everything is included: ANSI C, POSIX.1, POSIX.2, BSD, SVID, and GNU extensions. In the cases where POSIX.1 conflicts with BSD, the POSIX definitions take precedence.

If you want to get the full effect of _GNU_SOURCE but make the BSD definitions take precedence over the POSIX definitions, use this sequence of definitions:

#define _GNU_SOURCE
#define _BSD_SOURCE
#define _SVID_SOURCE

We recommend you use _GNU_SOURCE in new programs. If you don't specify the `-ansi' option to GCC and don't define any of these macros explicitly, the effect is the same as defining _GNU_SOURCE.

When you define a feature test macro to request a larger class of features, it is harmless to define in addition a feature test macro for a subset of those features. For example, if you define _POSIX_C_SOURCE, then defining _POSIX_SOURCE as well has no effect. Likewise, if you define _GNU_SOURCE, then defining either _POSIX_SOURCE or _POSIX_C_SOURCE or _SVID_SOURCE as well has no effect.

Note, however, that the features of _BSD_SOURCE are not a subset of any of the other feature test macros supported. This is because it defines BSD features that take precedence over the POSIX features that are requested by the other macros. For this reason, defining _BSD_SOURCE in addition to the other feature test macros does have an effect: it causes the BSD features to take priority over the conflicting POSIX features.

Roadmap to the Manual

Here is an overview of the contents of the remaining chapters of this manual.

If you already know the name of the facility you are interested in, you can look it up in section Summary of Library Facilities. This gives you a summary of its syntax and a pointer to where you can find a more detailed description. This appendix is particularly useful if you just want to verify the order and type of arguments to a function, for example.

Error Reporting

Many functions in the GNU C library detect and report error conditions, and sometimes your programs need to check for these error conditions. For example, when you open an input file, you should verify that the file was actually opened correctly, and print an error message or take other appropriate action if the call to the library function failed.

This chapter describes how the error reporting facility works. Your program should include the header file `errno.h' to use this facility.

Checking for Errors

Most library functions return a special value to indicate that they have failed. The special value is typically -1, a null pointer, or a constant such as EOF that is defined for that purpose. But this return value tells you only that an error has occurred. To find out what kind of error it was, you need to look at the error code stored in the variable errno. This variable is declared in the header file `errno.h'.
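A minimal sketch of this two-step check, using fopen as the library call: the return value shows that the call failed, and errno shows why. The function name try_open is hypothetical, and the sketch assumes a system where fopen sets errno on failure, as POSIX systems do.

```c
#include <errno.h>
#include <stdio.h>

/* Try to open path for reading.  Return 0 on success, or the
   error code that fopen left in errno on failure.  */
int
try_open (const char *path)
{
  FILE *fp = fopen (path, "r");

  if (fp == NULL)
    /* The null pointer says only that the call failed;
       errno says what kind of error it was.  */
    return errno;

  fclose (fp);
  return 0;
}
```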

Variable: volatile int errno

The variable errno contains the system error number. You can change the value of errno.

Since errno is declared volatile, it might be changed asynchronously by a signal handler; see section Defining Signal Handlers. However, a properly written signal handler saves and restores the value of errno, so you generally do not need to worry about this possibility except when writing signal handlers.
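The sketch below shows one way a handler might save and restore errno. The handler name handle_sigusr1, the driver errno_after_signal, and the use of SIGUSR1 are assumptions chosen for illustration; the deliberately failing write stands in for any work in a handler that could disturb errno.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Signal handler that leaves errno as it found it.  */
static void
handle_sigusr1 (int signum)
{
  int saved_errno = errno;
  ssize_t result;

  (void) signum;

  /* This write fails and sets errno to EBADF.  */
  result = write (-1, "x", 1);
  (void) result;

  errno = saved_errno;
}

/* Install the handler, set errno to value, deliver SIGUSR1 to this
   process, and return the value of errno seen afterward.  Return -1
   if the handler could not be installed.  */
int
errno_after_signal (int value)
{
  if (signal (SIGUSR1, handle_sigusr1) == SIG_ERR)
    return -1;

  errno = value;
  raise (SIGUSR1);
  return errno;
}
```

Without the save and restore in the handler, the interrupted code would see EBADF in place of the value it had stored.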

The initial value of errno at program startup is zero. Many library functions are guaranteed to set it to certain nonzero values when they encounter certain kinds of errors. These error conditions are listed for each function. These functions do not change errno when they succeed; thus, the value of errno after a successful call is not necessarily zero, and you should not use errno to determine whether a call failed. The proper way to do that is documented for each function. If the call failed, you can examine errno.

Many library functions can set errno to a nonzero value as a result of calling other library functions which might fail. You should assume that any library function might alter errno.
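One practical consequence is to copy errno into a local variable before calling anything else. The sketch below is an illustration only: the function name remove_or_report is hypothetical, and it assumes a remove that sets errno on failure, as POSIX systems do.

```c
#include <errno.h>
#include <stdio.h>

/* Delete the file named path.  Return 0 on success, or the error
   code from the failed call to remove.  */
int
remove_or_report (const char *path)
{
  if (remove (path) != 0)
    {
      /* Copy errno first: fprintf is itself a library function
         and might change it.  */
      int saved_errno = errno;

      fprintf (stderr, "cannot remove %s\n", path);
      return saved_errno;
    }
  return 0;
}
```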

Portability Note: ANSI C specifies errno as a "modifiable lvalue" rather than as a variable, permitting it to be implemented as a macro. For example, its expansion might involve a function call, like *_errno (). In fact, that is what it is on the GNU system itself. The GNU library, on non-GNU systems, does whatever is right for the particular system.

There are a few library functions, like sqrt and atan, that return a perfectly legitimate value in case of an error, but also set errno. For these functions, if you want to check to see whether an error occurred, the recommended method is to set errno to zero before calling the function, and then check its value afterward.
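The sketch below applies the same set-to-zero-then-check pattern. It substitutes strtol for the math functions named above, so that the example does not depend on the math library: on overflow strtol returns LONG_MAX, which is also a legitimate result, and sets errno to ERANGE. The function name parse_long is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert text to a long in base 10 and store it in *out.
   Return 0 on success, or the error code set by strtol
   (ERANGE when the value does not fit in a long).  */
int
parse_long (const char *text, long *out)
{
  /* Clear errno so that a nonzero value afterward can only
     have come from this call.  */
  errno = 0;
  *out = strtol (text, NULL, 10);
  return errno;
}
```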

All the error codes have symbolic names; they are macros defined in `errno.h'. The names start with `E' and an upper-case letter or digit; you should consider names of this form to be reserved names. See section Reserved Names.

The error code values are all positive integers and are all distinct. (Since the values are distinct, you can use them as labels in a switch statement, for example.) Your program should not make any other assumptions about the specific values of these symbolic constants.
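As a sketch of the switch statement mentioned above, the function below sorts a few error codes into broad categories. The enumeration failure_kind and the function classify_error are hypothetical names, and the grouping is an arbitrary choice for illustration.

```c
#include <errno.h>

/* Broad categories a program might care about (hypothetical).  */
enum failure_kind
{
  FAIL_MISSING,
  FAIL_FORBIDDEN,
  FAIL_RESOURCES,
  FAIL_OTHER
};

/* Map an error code to a category.  Only the symbolic names are
   used; nothing here depends on their numeric values.  */
enum failure_kind
classify_error (int err)
{
  switch (err)
    {
    case ENOENT:
      return FAIL_MISSING;
    case EACCES:
    case EPERM:
      return FAIL_FORBIDDEN;
    case ENOMEM:
      return FAIL_RESOURCES;
    default:
      return FAIL_OTHER;
    }
}
```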

The value of errno doesn't necessarily have to correspond to any of these macros, since some library functions might return other error codes of their own for other situations. The only values that are guaranteed to be meaningful for a particular library function are the ones that this manual lists for that function.

On non-GNU systems, almost any system call can return EFAULT if it is given an invalid pointer as an argument. Since this could only happen as a result of a bug in your program, and since it will not happen on the GNU system, we have saved space by not mentioning EFAULT in the descriptions of individual functions.

Error Codes

The error code macros are defined in the header file `errno.h'. All of them expand into integer constant values. Some of these error codes can't occur on the GNU system, but they can occur using the GNU library on other systems.

Macro: int EPERM

Operation not permitted; only the owner of the file (or other resource) or processes with special privileges can perform the operation.

Macro: int ENOENT

No such file or directory. This is a "file doesn't exist" error for ordinary files that are referenced in contexts where they are expected to already exist.

Macro: int ESRCH

No process matches the specified process ID.

Macro: int EINTR

Interrupted function call; an asynchronous signal occurred and prevented completion of the call. When this happens, you should try the call again.

You can choose to have functions resume after a signal that is handled, rather than failing with EINTR; see section Primitives Interrupted by Signals.
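A minimal sketch of retrying after EINTR, using the POSIX read function as the interrupted call. The wrapper name read_retry is hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Like read, but start the call again whenever it fails
   with EINTR.  Any other result is returned unchanged.  */
ssize_t
read_retry (int fd, void *buf, size_t count)
{
  ssize_t n;

  do
    n = read (fd, buf, count);
  while (n == -1 && errno == EINTR);

  return n;
}
```

Errors other than EINTR still reach the caller, with errno set by the last call to read.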

Macro: int EIO

Input/output error; usually used for physical read or write errors.

Macro: int ENXIO

No such device or address. Typically, this means that a file representing a device has been installed incorrectly, and the system can't find the right kind of device driver for it.

Macro: int E2BIG

Argument list too long; used when the arguments passed to a new program being executed with one of the exec functions (see section Executing a File) occupy too much memory space. This condition never arises in the GNU system.

Macro: int ENOEXEC

Invalid executable file format. This condition is detected by the exec functions; see section Executing a File.

Macro: int EBADF

Bad file descriptor; for example, I/O on a descriptor that has been closed or reading from a descriptor open only for writing (or vice versa).

Macro: int ECHILD

There are no child processes. This error happens on operations that are supposed to manipulate child processes, when there aren't any processes to manipulate.

Macro: int EDEADLK

Deadlock avoided; allocating a system resource would have resulted in a deadlock situation. For an example, see section File Locks.

Macro: int ENOMEM

No memory available. The system cannot allocate more virtual memory because its capacity is full.

Macro: int EACCES

Permission denied; the file permissions do not allow the attempted operation.

Macro: int EFAULT

Bad address; an invalid pointer was detected.

Macro: int ENOTBLK

A file that isn't a block special file was given in a situation that requires one. For example, trying to mount an ordinary file as a file system in Unix gives this error.

Macro: int EBUSY

Resource busy; a system resource that can't be shared is already in use. For example, if you try to delete a file that is the root of a currently mounted filesystem, you get this error.

Macro: int EEXIST

File exists; an existing file was specified in a context where it only makes sense to specify a new file.

Macro: int EXDEV

An attempt to make an improper link across file systems was detected.

Macro: int ENODEV

The wrong type of device was given to a function that expects a particular sort of device.

Macro: int ENOTDIR

A file that isn't a directory was specified when a directory is required.

Macro: int EISDIR

File is a directory; attempting to open a directory for writing gives this error.

Macro: int EINVAL

Invalid argument. This is used to indicate various kinds of problems with passing the wrong argument to a library function.

Macro: int ENFILE

There are too many distinct file openings in the entire system. Note that any number of linked channels count as just one file opening; see section Linked Channels.

Macro: int EMFILE

The current process has too many files open and can't open any more. Duplicate descriptors do count toward this limit.

Macro: int ENOTTY

Inappropriate I/O control operation, such as trying to set terminal modes on an ordinary file.

Macro: int ETXTBSY

An attempt to execute a file that is currently open for writing, or write to a file that is currently being executed. (The name stands for "text file busy".) This is not an error in the GNU system; the text is copied as necessary.

Macro: int EFBIG

File too big; the size of a file would be larger than allowed by the system.

Macro: int ENOSPC

No space left on device; write operation on a file failed because the disk is full.

Macro: int ESPIPE

Invalid seek operation (such as on a pipe).

Macro: int EROFS

An attempt was made to modify a file on a read-only file system.

Macro: int EMLINK

Too many links; the link count of a single file is too large.

Macro: int EPIPE

Broken pipe; there is no process reading from the other end of a pipe. Every library function that returns this error code also generates a SIGPIPE signal; this signal terminates the program if not handled or blocked. Thus, your program will never actually see EPIPE unless it has handled or blocked SIGPIPE.
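The sketch below shows the case where SIGPIPE is handled, so that the failed write can report EPIPE. The names on_sigpipe and write_to_broken_pipe are hypothetical, and the handler deliberately does nothing.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Do-nothing handler: its only purpose is to keep SIGPIPE
   from terminating the program.  */
static void
on_sigpipe (int signum)
{
  (void) signum;
}

/* Write one byte to a pipe that has no reader.  Return the error
   code from the failed write, 0 if the write unexpectedly
   succeeded, or -1 if the pipe could not be set up.  */
int
write_to_broken_pipe (void)
{
  int fds[2];
  int err = 0;

  if (signal (SIGPIPE, on_sigpipe) == SIG_ERR)
    return -1;
  if (pipe (fds) == -1)
    return -1;

  close (fds[0]);               /* No reader remains.  */
  if (write (fds[1], "x", 1) == -1)
    err = errno;                /* Save before calling close.  */
  close (fds[1]);

  return err;
}
```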

Macro: int EDOM

Domain error; used by mathematical functions when an argument value does not fall into the domain over which the function is defined.

Macro: int ERANGE

Range error; used by mathematical functions when the result value is not representable because of overflow or underflow.

Macro: int EAGAIN

Resource temporarily unavailable; the call might work if you try again later. Only fork returns error code EAGAIN for such a reason.

Macro: int EWOULDBLOCK

An operation that would block was attempted on an object that has non-blocking mode selected.

Portability Note: In 4.4BSD and GNU, EWOULDBLOCK and EAGAIN are the same. Earlier versions of BSD (see section Berkeley Unix) have two distinct codes, and use EWOULDBLOCK to indicate an I/O operation that would block on an object with non-blocking mode set, and EAGAIN for other kinds of errors.
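Code meant to run on both kinds of system can test for either code. A minimal sketch, with the hypothetical helper name would_block:

```c
#include <errno.h>

/* Return nonzero if err means that a non-blocking operation
   would have blocked.  Both names are tested, so the result is
   right whether or not the system gives them the same value.  */
int
would_block (int err)
{
  return err == EAGAIN || err == EWOULDBLOCK;
}
```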

Macro: int EINPROGRESS

An operation that cannot complete immediately was initiated on an object that has non-blocking mode selected.

Macro: int EALREADY

An operation is already in progress on an object that has non-blocking mode selected.

Macro: int ENOTSOCK

A file that isn't a socket was specified when a socket is required.

Macro: int EDESTADDRREQ

No destination address was supplied on a socket operation.

Macro: int EMSGSIZE

The size of a message sent on a socket was larger than the supported maximum size.

Macro: int EPROTOTYPE

The socket type does not support the requested communications protocol.

Macro: int ENOPROTOOPT

You specified a socket option that doesn't make sense for the particular protocol being used by the socket. See section Socket Options.

Macro: int EPROTONOSUPPORT

The socket domain does not support the requested communications protocol. See section Creating a Socket.

Macro: int ESOCKTNOSUPPORT

The socket type is not supported.

Macro: int EOPNOTSUPP

The operation you requested is not supported. Some socket functions don't make sense for all types of sockets, and others may not be implemented for all communications protocols.

Macro: int EPFNOSUPPORT

The socket communications protocol family you requested is not supported.

Macro: int EAFNOSUPPORT

The address family specified for a socket is not supported; it is inconsistent with the protocol being used on the socket. See section Sockets.

Macro: int EADDRINUSE

The requested socket address is already in use. See section Socket Addresses.

Macro: int EADDRNOTAVAIL

The requested socket address is not available; for example, you tried to give a socket a name that doesn't match the local host name. See section Socket Addresses.

Macro: int ENETDOWN

A socket operation failed because the network was down.

Macro: int ENETUNREACH

A socket operation failed because the subnet containing the remote host was unreachable.

Macro: int ENETRESET

A network connection was reset because the remote host crashed.

Macro: int ECONNABORTED

A network connection was aborted locally.

Macro: int ECONNRESET

A network connection was closed for reasons outside the control of the local host, such as by the remote machine rebooting.

Macro: int ENOBUFS

The kernel's buffers for I/O operations are all in use.

Macro: int EISCONN

You tried to connect a socket that is already connected. See section Making a Connection.

Macro: int ENOTCONN

The socket is not connected to anything. You get this error when you try to transmit data over a socket, without first specifying a destination for the data.

Macro: int ESHUTDOWN

The socket has already been shut down.

Macro: int ETIMEDOUT

A socket operation with a specified timeout received no response during the timeout period.

Macro: int ECONNREFUSED

A remote host refused to allow the network connection (typically because it is not running the requested service).

Macro: int ELOOP

Too many levels of symbolic links were encountered in looking up a file name. This often indicates a cycle of symbolic links.

Macro: int ENAMETOOLONG

Filename too long (longer than PATH_MAX; see section Limits on File System Capacity) or host name too long (in gethostname or sethostname; see section Host Identification).

Macro: int EHOSTDOWN

The remote host for a requested network connection is down.

Macro: int EHOSTUNREACH

The remote host for a requested network connection is not reachable.

Macro: int ENOTEMPTY

Directory not empty, where an empty directory was expected. Typically, this error occurs when you are trying to delete a directory.

Macro: int EUSERS

The file quota system is confused because there are too many users.

Macro: int EDQUOT

The user's disk quota was exceeded.

Macro: int ESTALE

Stale NFS file handle. This indicates an internal confusion in the NFS system which is due to file system rearrangements on the server host. Repairing this condition usually requires unmounting and remounting the NFS file system on the local host.

Macro: int EREMOTE

An attempt was made to NFS-mount a remote file system with a file name that already specifies an NFS-mounted file. (This is an error on some operating systems, but we expect it to work properly on the GNU system, making this error code impossible.)

Macro: int ENOLCK

No locks available. This is used by the file locking facilities; see section File Locks.

Macro: int ENOSYS

Function not implemented. Some functions have commands or options defined that might not be supported in all implementations, and this is the kind of error you get if you request them and they are not supported.

Macro: int ED

The experienced user will know what is wrong.

Macro: int EGRATUITOUS

This error code has no purpose.

Error Messages

The library has functions and variables designed to make it easy for your program to report informative error messages in the customary format about the failure of a library call. The functions strerror and perror give you the standard error message for a given error code; the variable program_invocation_short_name gives you convenient access to the name of the program that encountered the error.

Function: char * strerror (int errnum)

The strerror function maps the error code (see section Checking for Errors) specified by the errnum argument to a descriptive error message string. The return value is a pointer to this string.

The value errnum normally comes from the variable errno.

You should not modify the string returned by strerror. Also, if you make subsequent calls to strerror, the string might be overwritten. (But it's guaranteed that no library function ever calls strerror behind your back.)

The function strerror is declared in `string.h'.
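
Because a later call to strerror might overwrite the string, a program that needs to keep a message can copy it into its own storage first. Here is a minimal sketch of that idea; the helper copy_error_message is a hypothetical name used for illustration, not a library function:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the message for ERRNUM into BUF,
   truncating it if necessary, and return BUF.  */
char *
copy_error_message (int errnum, char *buf, size_t buflen)
{
  const char *msg;

  assert (buf != NULL && buflen > 0);
  msg = strerror (errnum);
  /* Copy at most BUFLEN - 1 characters and always terminate.  */
  strncpy (buf, msg, buflen - 1);
  buf[buflen - 1] = '\0';
  return buf;
}
```

The copy in buf stays valid no matter how many times strerror is called afterward.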

Function: void perror (const char *message)

This function prints an error message to the stream stderr; see section Standard Streams.

If you call perror with a message that is either a null pointer or an empty string, perror just prints the error message corresponding to errno, adding a trailing newline.

If you supply a non-null message argument, then perror prefixes its output with this string. It adds a colon and a space character to separate the message from the error string corresponding to errno.

The function perror is declared in `stdio.h'.
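
As a small illustration, the following sketch reports a failed fopen with perror, using the file name as the message prefix. The function name open_or_report is hypothetical, chosen only for this example:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: try to open NAME for reading.  On failure,
   report the reason on stderr and return a null pointer.  */
FILE *
open_or_report (const char *name)
{
  FILE *stream;

  assert (name != NULL);
  stream = fopen (name, "r");
  if (stream == NULL)
    /* Prints NAME, a colon and a space, then the message for errno.
       No other library call comes between fopen and perror, so errno
       still describes the fopen failure.  */
    perror (name);
  return stream;
}
```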

strerror and perror produce the exact same message for any given error code; the precise text varies from system to system. On the GNU system, the messages are fairly short; there are no multi-line messages or embedded newlines. Each error message begins with a capital letter and does not include any terminating punctuation.

Compatibility Note: The strerror function is a new feature of ANSI C. Many older C systems do not support this function yet.

Many programs that don't read input from the terminal are designed to exit if any system call fails. By convention, the error message from such a program should start with the program's name, sans directories. You can find that name in the variable program_invocation_short_name; the full file name is stored in the variable program_invocation_name:

Variable: char * program_invocation_name

This variable's value is the name that was used to invoke the program running in the current process. It is the same as argv[0].

Variable: char * program_invocation_short_name

This variable's value is the name that was used to invoke the program running in the current process, with directory names removed. (That is to say, it is the same as program_invocation_name minus everything up to the last slash, if any.)

Both program_invocation_name and program_invocation_short_name are set up by the system before main is called.

Portability Note: These two variables are GNU extensions. If you want your program to work with non-GNU libraries, you must save the value of argv[0] in main, and then strip off the directory names yourself. We added these extensions to make it possible to write self-contained error-reporting subroutines that require no explicit cooperation from main.
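
For a program that must work with non-GNU libraries, stripping the directory names might look like the sketch below. The name strip_directories is hypothetical, and the code assumes that `/' is the only directory separator:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: return the part of NAME that follows the
   last slash, or all of NAME if it contains no slash.  */
const char *
strip_directories (const char *name)
{
  const char *slash;

  assert (name != NULL);
  slash = strrchr (name, '/');
  return slash != NULL ? slash + 1 : name;
}
```

You would call strip_directories (argv[0]) at the start of main and save the result where your error-reporting subroutines can find it.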

Here is an example showing how to handle failure to open a file correctly. The function open_sesame tries to open the named file for reading and returns a stream if successful. The fopen library function returns a null pointer if it couldn't open the file for some reason. In that situation, open_sesame constructs an appropriate error message using the strerror function, and terminates the program. If we were going to make some other library calls before passing the error code to strerror, we'd have to save it in a local variable instead, because those other library functions might overwrite errno in the meantime.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

FILE *
open_sesame (char *name)
{ 
  FILE *stream;

  errno = 0;                     
  stream = fopen (name, "r");
  if (!stream) {
    fprintf (stderr, "%s: Couldn't open file %s; %s\n",
             program_invocation_short_name, name, strerror (errno));
    exit (EXIT_FAILURE);
  } else
    return stream;
}

Memory Allocation

The GNU system provides several methods for allocating memory space under explicit program control. They vary in generality and in efficiency.

Dynamic Memory Allocation Concepts

Dynamic memory allocation is a technique in which programs determine as they are running where to store some information. You need dynamic allocation when the number of memory blocks you need, or how long you continue to need them, depends on the data you are working on.

For example, you may need a block to store a line read from an input file; since there is no limit to how long a line can be, you must allocate the storage dynamically and make it dynamically larger as you read more of the line.

Or, you may need a block for each record or each definition in the input data; since you can't know in advance how many there will be, you must allocate a new block for each record or definition as you read it.

When you use dynamic allocation, the allocation of a block of memory is an action that the program requests explicitly. You call a function or macro when you want to allocate space, and specify the size with an argument. If you want to free the space, you do so by calling another function or macro. You can do these things whenever you want, as often as you want.

Dynamic Allocation and C

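The C language supports two kinds of memory allocation through the variables in C programs:

Static allocation is what happens when you declare a static or global variable. The space is allocated once, when your program is started, and is never freed.

Automatic allocation happens when you declare an automatic variable, such as a function argument or a local variable. The space is allocated when the compound statement containing the declaration is entered, and is freed when that compound statement is exited.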

Dynamic allocation is not supported by C variables; there is no storage class "dynamic", and there can never be a C variable whose value is stored in dynamically allocated space. The only way to refer to dynamically allocated space is through a pointer. Because it is less convenient, and because the actual process of dynamic allocation requires more computation time, programmers use dynamic allocation only when neither static nor automatic allocation will serve.

For example, if you want to allocate dynamically some space to hold a struct foobar, you cannot declare a variable of type struct foobar whose contents are the dynamically allocated space. But you can declare a variable of pointer type struct foobar * and assign it the address of the space. Then you can use the operators `*' and `->' on this pointer variable to refer to the contents of the space:

{
  struct foobar *ptr
     = (struct foobar *) malloc (sizeof (struct foobar));
  ptr->name = x;
  ptr->next = current_foobar;
  current_foobar = ptr;
}

Unconstrained Allocation

The most general dynamic allocation facility is malloc. It allows you to allocate blocks of memory of any size at any time, make them bigger or smaller at any time, and free the blocks individually at any time (or never).

Basic Storage Allocation

To allocate a block of memory, call malloc. The prototype for this function is in `stdlib.h'.

Function: void * malloc (size_t size)

This function returns a pointer to a newly allocated block size bytes long, or a null pointer if the block could not be allocated.

The contents of the block are undefined; you must initialize it yourself (or use calloc instead; see section Allocating Cleared Space). Normally you would cast the value as a pointer to the kind of object that you want to store in the block. Here we show an example of doing so, and of initializing the space with zeros using the library function memset (see section Copying and Concatenation):

struct foo *ptr;
...
ptr = (struct foo *) malloc (sizeof (struct foo));
if (ptr == 0) abort ();
memset (ptr, 0, sizeof (struct foo));

You can store the result of malloc into any pointer variable without a cast, because ANSI C automatically converts the type void * to another type of pointer when necessary. But the cast is necessary in contexts other than assignment operators or if you might want your code to run in traditional C.

Remember that when allocating space for a string, the argument to malloc must be one plus the length of the string. This is because a string is terminated with a null character that doesn't count in the "length" of the string but does need space. For example:

char *ptr;
...
ptr = (char *) malloc (length + 1);

See section Representation of Strings, for more information about this.

Examples of malloc

If no more space is available, malloc returns a null pointer. You should check the value of every call to malloc. It is useful to write a subroutine that calls malloc and reports an error if the value is a null pointer, returning only if the value is nonzero. This function is conventionally called xmalloc. Here it is:

void *
xmalloc (size_t size)
{
  register void *value = malloc (size);
  if (value == 0)
    fatal ("virtual memory exhausted");
  return value;
}
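
The function fatal is not part of the library; it stands for whatever error-exit routine your program provides. The following sketch shows one possible definition, repeated together with xmalloc so that the pair compiles on its own. The body of fatal is an assumption made for illustration:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error-exit routine: print MESSAGE on stderr and
   terminate the program unsuccessfully.  */
void
fatal (const char *message)
{
  assert (message != NULL);
  fprintf (stderr, "%s\n", message);
  exit (EXIT_FAILURE);
}

/* Like malloc, but never returns a null pointer.  */
void *
xmalloc (size_t size)
{
  void *value = malloc (size);
  if (value == 0)
    fatal ("virtual memory exhausted");
  return value;
}
```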

Here is a real example of using malloc (by way of xmalloc). The function savestring will copy a sequence of characters into a newly allocated null-terminated string:

char *
savestring (const char *ptr, size_t len)
{
  register char *value = (char *) xmalloc (len + 1);
  memcpy (value, ptr, len);
  value[len] = 0;
  return value;
}

The block that malloc gives you is guaranteed to be aligned so that it can hold any type of data. In the GNU system, the address is always a multiple of eight; if the size of the block is 16 or more, then the address is always a multiple of 16. Only rarely is any higher boundary (such as a page boundary) necessary; for those cases, use memalign or valloc (see section Allocating Aligned Memory Blocks).

Note that the memory located after the end of the block is likely to be in use for something else; perhaps a block already allocated by another call to malloc. If you attempt to treat the block as longer than you asked for it to be, you are liable to destroy the data that malloc uses to keep track of its blocks, or you may destroy the contents of another block. If you have already allocated a block and discover you want it to be bigger, use realloc (see section Changing the Size of a Block).

Freeing Memory Allocated with malloc

When you no longer need a block that you got with malloc, use the function free to make the block available to be allocated again. The prototype for this function is in `stdlib.h'.

Function: void free (void *ptr)

The free function deallocates the block of storage pointed at by ptr.

Function: void cfree (void *ptr)

This function does the same thing as free. It's provided for backward compatibility with SunOS; you should use free instead.

Freeing a block alters the contents of the block. Do not expect to find any data (such as a pointer to the next block in a chain of blocks) in the block after freeing it. Copy whatever you need out of the block before freeing it! Here is an example of the proper way to free all the blocks in a chain, and the strings that they point to:

struct chain
  {
    struct chain *next;
    char *name;
  };

void
free_chain (struct chain *chain)
{
  while (chain != 0)
    {
      struct chain *next = chain->next;
      free (chain->name);
      free (chain);
      chain = next;
    }
}

Occasionally, free can actually return memory to the operating system and make the process smaller. Usually, all it can do is allow a later call to malloc to reuse the space. In the meantime, the space remains in your program as part of a free-list used internally by malloc.

There is no point in freeing blocks at the end of a program, because all of the program's space is given back to the system when the process terminates.

Changing the Size of a Block

Often you do not know for certain how big a block you will ultimately need at the time you must begin to use the block. For example, the block might be a buffer that you use to hold a line being read from a file; no matter how long you make the buffer initially, you may encounter a line that is longer.

You can make the block longer by calling realloc. This function is declared in `stdlib.h'.

Function: void * realloc (void *ptr, size_t newsize)

The realloc function changes the size of the block whose address is ptr to be newsize.

Since the space after the end of the block may be in use, realloc may find it necessary to copy the block to a new address where more free space is available. The value of realloc is the new address of the block. If the block needs to be moved, realloc copies the old contents.

Like malloc, realloc may return a null pointer if no memory space is available to make the block bigger. When this happens, the original block is untouched; it has not been modified or relocated.
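
A program that wants to keep running after such a failure must not overwrite its only pointer to the block with the value of realloc. The sketch below shows one way to arrange this for a buffer that grows as data is appended. The type struct buffer and the function buffer_append are hypothetical names, and the doubling strategy is just one reasonable choice:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer.  Start with all members zero.  */
struct buffer
{
  char *data;    /* block obtained from realloc, or a null pointer */
  size_t used;   /* number of bytes currently stored */
  size_t size;   /* number of bytes allocated */
};

/* Append LEN bytes from SRC to BUF, enlarging the block if needed.
   Return 0 on success.  Return -1 if the block cannot be enlarged;
   in that case BUF still describes the old, unmodified block.  */
int
buffer_append (struct buffer *buf, const char *src, size_t len)
{
  assert (buf != NULL && (src != NULL || len == 0));
  if (len > buf->size - buf->used)
    {
      size_t newsize = buf->size ? buf->size : 16;
      char *newdata;

      while (len > newsize - buf->used)
        {
          if (newsize > (size_t) -1 / 2)
            return -1;          /* doubling would overflow */
          newsize *= 2;
        }
      /* Keep the result in a temporary so that the old pointer is
         not lost when realloc returns a null pointer.  */
      newdata = realloc (buf->data, newsize);
      if (newdata == NULL)
        return -1;
      buf->data = newdata;
      buf->size = newsize;
    }
  if (len > 0)
    memcpy (buf->data + buf->used, src, len);
  buf->used += len;
  return 0;
}
```

The first call passes a null pointer to realloc, which ANSI C defines to behave like malloc.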

In most cases it makes no difference what happens to the original block when realloc fails, because the application program cannot continue when it is out of memory, and the only thing to do is to give a fatal error message. Often it is convenient to write and use a subroutine, conventionally called xrealloc, that takes care of the error message as xmalloc does for malloc:

void *
xrealloc (void *ptr, size_t size)
{
  register void *value = realloc (ptr, size);
  if (value == 0)
    fatal ("virtual memory exhausted");
  return value;
}

You can also use realloc to make a block smaller. The reason you would do this is to avoid tying up a lot of memory space when only a little is needed. Making a block smaller sometimes necessitates copying it, so it can fail if no other space is available.

If the new size you specify is the same as the old size, realloc is guaranteed to change nothing and return the same address that you gave.

Allocating Cleared Space

The function calloc allocates memory and clears it to zero. It is declared in `stdlib.h'.

Function: void * calloc (size_t count, size_t eltsize)

This function allocates a block long enough to contain a vector of count elements, each of size eltsize. Its contents are cleared to zero before calloc returns.

You could define calloc as follows:

void *
calloc (size_t count, size_t eltsize)
{
  size_t size = count * eltsize;
  void *value = malloc (size);
  if (value != 0)
    memset (value, 0, size);
  return value;
}

We rarely use calloc today, because it is equivalent to such a simple combination of other features that are more often used. It is a historical holdover that is not quite obsolete.
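
One thing the simple definition shown above does not address is that the product count * eltsize can overflow for very large arguments. The following sketch adds a check for that case. The name checked_calloc is hypothetical, and SIZE_MAX is assumed to be available from `stdint.h', which is newer than ANSI C:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of calloc that refuses requests whose total
   size cannot be represented in a size_t.  */
void *
checked_calloc (size_t count, size_t eltsize)
{
  size_t size;
  void *value;

  if (eltsize != 0 && count > SIZE_MAX / eltsize)
    return NULL;                /* count * eltsize would overflow */
  size = count * eltsize;
  value = malloc (size);
  if (value != NULL)
    memset (value, 0, size);
  return value;
}
```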

Efficiency Considerations for malloc

To make the best use of malloc, it helps to know that the GNU version of malloc always dispenses small amounts of memory in blocks whose sizes are powers of two. It keeps separate pools for each power of two. This holds for sizes up to a page size. Therefore, if you are free to choose the size of a small block in order to make malloc more efficient, make it a power of two.

Once a page is split up for a particular block size, it can't be reused for another size unless all the blocks in it are freed. In many programs, this is unlikely to happen. Thus, you can sometimes make a program use memory more efficiently by using blocks of the same size for many different purposes.

When you ask for memory blocks of a page or larger, malloc uses a different strategy; it rounds the size up to a multiple of a page, and it can coalesce and split blocks as needed.

The reason for the two strategies is that it is important to allocate and free small blocks as fast as possible, but speed is less important for a large block since the program normally spends a fair amount of time using it. Also, large blocks are normally fewer in number. Therefore, for large blocks, it makes sense to use a method which takes more time to minimize the wasted space.

Allocating Aligned Memory Blocks

The address of a block returned by malloc or realloc in the GNU system is always a multiple of eight. If you need a block whose address is a multiple of a higher power of two than that, use memalign or valloc. These functions are declared in `stdlib.h'.

With the GNU library, you can use free to free the blocks that memalign and valloc return. That does not work in BSD, however--BSD does not provide any way to free such blocks.

Function: void * memalign (size_t boundary, size_t size)

The memalign function allocates a block of size bytes whose address is a multiple of boundary. The boundary must be a power of two! The function memalign works by calling malloc to allocate a somewhat larger block, and then returning an address within the block that is on the specified boundary.

Function: void * valloc (size_t size)

Using valloc is like using memalign and passing the page size as the value of the boundary argument.
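
Here is a minimal sketch of requesting an aligned block and checking the result. It assumes the declaration of memalign found in `malloc.h' in current releases of the GNU C library, which takes the boundary as the first argument and the size as the second; the wrapper name aligned_block is hypothetical:

```c
#include <assert.h>
#include <malloc.h>
#include <stdint.h>
#include <stdlib.h>

/* Return nonzero if BOUNDARY is a power of two.  */
static int
is_power_of_two (size_t boundary)
{
  return boundary != 0 && (boundary & (boundary - 1)) == 0;
}

/* Hypothetical wrapper: allocate SIZE bytes at an address that is a
   multiple of BOUNDARY, or return a null pointer on failure.  */
void *
aligned_block (size_t boundary, size_t size)
{
  void *block;

  assert (is_power_of_two (boundary));
  block = memalign (boundary, size);
  /* A successful result must satisfy the requested alignment.  */
  assert (block == NULL || (uintptr_t) block % boundary == 0);
  return block;
}
```

With the GNU library, the block can later be released with free.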

Heap Consistency Checking

You can ask malloc to check the consistency of dynamic storage by using the mcheck function. This function is a GNU extension, declared in `malloc.h'.

Function: int mcheck (void (*abortfn) (void))

Calling mcheck tells malloc to perform occasional consistency checks. These will catch things such as writing past the end of a block that was allocated with malloc.

The abortfn argument is the function to call when an inconsistency is found. If you supply a null pointer, the abort function is used.

It is too late to begin allocation checking once you have allocated anything with malloc. So mcheck does nothing in that case. The function returns -1 if you call it too late, and 0 otherwise (when it is successful).

The easiest way to arrange to call mcheck early enough is to use the option `-lmcheck' when you link your program.

Storage Allocation Hooks

The GNU C library lets you modify the behavior of malloc, realloc, and free by specifying appropriate hook functions. You can use these hooks to help you debug programs that use dynamic storage allocation, for example.

The hook variables are declared in `malloc.h'.

Variable: __malloc_hook

The value of this variable is a pointer to a function that malloc uses whenever it is called. You should define this function to look like malloc; that is, like:

void *function (size_t size)

Variable: __realloc_hook

The value of this variable is a pointer to a function that realloc uses whenever it is called. You should define this function to look like realloc; that is, like:

void *function (void *ptr, size_t size)

Variable: __free_hook

The value of this variable is a pointer to a function that free uses whenever it is called. You should define this function to look like free; that is, like:

void function (void *ptr)

You must make sure that the function you install as a hook for one of these functions does not call that function recursively without restoring the old value of the hook first! Otherwise, your program will get stuck in an infinite recursion.

Here is an example showing how to use __malloc_hook properly. It installs a function that prints out information every time malloc is called.

static void *(*old_malloc_hook) (size_t);
static void *
my_malloc_hook (size_t size)
{
  void *result;
  __malloc_hook = old_malloc_hook;
  result = malloc (size);
  __malloc_hook = my_malloc_hook;
  printf ("malloc (%u) returns %p\n", (unsigned int) size, result);
  return result;
}

int
main (void)
{
  ...
  old_malloc_hook = __malloc_hook;
  __malloc_hook = my_malloc_hook;
  ...
}

The mcheck function (see section Heap Consistency Checking) works by installing such hooks.

Statistics for Storage Allocation with malloc

You can get information about dynamic storage allocation by calling the mstats function. This function and its associated data type are declared in `malloc.h'; they are a GNU extension.

Data Type: struct mstats

This structure type is used to return information about the dynamic storage allocator. It contains the following members:

size_t bytes_total
This is the total size of memory managed by malloc, in bytes.

size_t chunks_used
This is the number of chunks in use. (The storage allocator internally gets chunks of memory from the operating system, and then carves them up to satisfy individual malloc requests; see section Efficiency Considerations for malloc.)

size_t bytes_used
This is the number of bytes in use.

size_t chunks_free
This is the number of chunks which are free -- that is, that have been allocated by the operating system to your program, but which are not now being used.

size_t bytes_free
This is the number of bytes which are free.

Function: struct mstats mstats (void)

This function returns information about the current dynamic memory usage in a structure of type struct mstats.

Summary of malloc-Related Functions

Here is a summary of the functions that work with malloc:

void *malloc (size_t size)
Allocate a block of size bytes. See section Basic Storage Allocation.

void free (void *addr)
Free a block previously allocated by malloc. See section Freeing Memory Allocated with malloc.

void *realloc (void *addr, size_t size)
Make a block previously allocated by malloc larger or smaller, possibly by copying it to a new location. See section Changing the Size of a Block.

void *calloc (size_t count, size_t eltsize)
Allocate a block of count * eltsize bytes using malloc, and set its contents to zero. See section Allocating Cleared Space.

void *valloc (size_t size)
Allocate a block of size bytes, starting on a page boundary. See section Allocating Aligned Memory Blocks.

void *memalign (size_t boundary, size_t size)
Allocate a block of size bytes, starting on an address that is a multiple of boundary. See section Allocating Aligned Memory Blocks.

int mcheck (void (*abortfn) (void))
Tell malloc to perform occasional consistency checks on dynamically allocated memory, and to call abortfn when an inconsistency is found. See section Heap Consistency Checking.

void *(*__malloc_hook) (size_t size)
A pointer to a function that malloc uses whenever it is called.

void *(*__realloc_hook) (void *ptr, size_t size)
A pointer to a function that realloc uses whenever it is called.

void (*__free_hook) (void *ptr)
A pointer to a function that free uses whenever it is called.

struct mstats mstats (void)
Read information about the current dynamic memory usage. See section Statistics for Storage Allocation with malloc.

Obstacks

An obstack is a pool of memory containing a stack of objects. You can create any number of separate obstacks, and then allocate objects in specified obstacks. Within each obstack, the last object allocated must always be the first one freed, but distinct obstacks are independent of each other.

Aside from this one constraint of order of freeing, obstacks are totally general: an obstack can contain any number of objects of any size. They are implemented with macros, so allocation is usually very fast as long as the objects are usually small. And the only space overhead per object is the padding needed to start each object on a suitable boundary.

Creating Obstacks

The utilities for manipulating obstacks are declared in the header file `obstack.h'.

Data Type: struct obstack

An obstack is represented by a data structure of type struct obstack. This structure has a small fixed size; it records the status of the obstack and how to find the space in which objects are allocated. It does not contain any of the objects themselves. You should not try to access the contents of the structure directly; use only the functions described in this chapter.

You can declare variables of type struct obstack and use them as obstacks, or you can allocate obstacks dynamically like any other kind of object. Dynamic allocation of obstacks allows your program to have a variable number of different stacks. (You can even allocate an obstack structure in another obstack, but this is rarely useful.)

All the functions that work with obstacks require you to specify which obstack to use. You do this with a pointer of type struct obstack *. In the following, we often say "an obstack" when strictly speaking the object at hand is such a pointer.

The objects in the obstack are packed into large blocks called chunks. The struct obstack structure points to a chain of the chunks currently in use.

The obstack library obtains a new chunk whenever you allocate an object that won't fit in the previous chunk. Since the obstack library manages chunks automatically, you don't need to pay much attention to them, but you do need to supply a function which the obstack library should use to get a chunk. Usually you supply a function which uses malloc directly or indirectly. You must also supply a function to free a chunk. These matters are described in the following section.

Preparing for Using Obstacks

Each source file in which you plan to use the obstack functions must include the header file `obstack.h', like this:

#include <obstack.h>

Also, if the source file uses the macro obstack_init, it must declare or define two functions or macros that will be called by the obstack library. One, obstack_chunk_alloc, is used to allocate the chunks of memory into which objects are packed. The other, obstack_chunk_free, is used to return chunks when the objects in them are freed.

Usually these are defined to use malloc via the intermediary xmalloc (see section Unconstrained Allocation). This is done with the following pair of macro definitions:

#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free

Though the storage you get using obstacks really comes from malloc, using obstacks is faster because malloc is called less often, for larger blocks of memory. See section Obstack Chunks, for full details.

At run time, before the program can use a struct obstack object as an obstack, it must initialize the obstack by calling obstack_init.

Function: void obstack_init (struct obstack *obstack_ptr)

Initialize obstack obstack_ptr for allocation of objects.

Here are two examples of how to allocate the space for an obstack and initialize it. First, an obstack that is a static variable:

struct obstack myobstack;
...
obstack_init (&myobstack);

Second, an obstack that is itself dynamically allocated:

struct obstack *myobstack_ptr
  = (struct obstack *) xmalloc (sizeof (struct obstack));

obstack_init (myobstack_ptr);

Allocation in an Obstack

The most direct way to allocate an object in an obstack is with obstack_alloc, which is invoked almost like malloc.

Function: void * obstack_alloc (struct obstack *obstack_ptr, size_t size)

This allocates an uninitialized block of size bytes in an obstack and returns its address. Here obstack_ptr specifies which obstack to allocate the block in; it is the address of the struct obstack object which represents the obstack. Each obstack function or macro requires you to specify an obstack_ptr as the first argument.

For example, here is a function that allocates a copy of its argument string in a specific obstack, which is the variable string_obstack:

struct obstack string_obstack;

char *
copystring (char *string)
{
  size_t len = strlen (string) + 1;
  char *s = (char *) obstack_alloc (&string_obstack, len);
  /* Copy the characters and the terminating null character.  */
  memcpy (s, string, len);
  return s;
}

To allocate a block with specified contents, use the function obstack_copy, declared like this:

Function: void * obstack_copy (struct obstack *obstack_ptr, void *address, size_t size)

This allocates a block and initializes it by copying size bytes of data starting at address.

Function: void * obstack_copy0 (struct obstack *obstack_ptr, void *address, size_t size)

Like obstack_copy, but appends an extra byte containing a null character. This extra byte is not counted in the argument size.

The obstack_copy0 function is convenient for copying a sequence of characters into an obstack as a null-terminated string. Here is an example of its use:

char *
obstack_savestring (char *addr, size_t size)
{
  return obstack_copy0 (&myobstack, addr, size);
}

Contrast this with the previous example of savestring using malloc (see section Examples of malloc).

Freeing Objects in an Obstack

To free an object allocated in an obstack, use the function obstack_free. Since the obstack is a stack of objects, freeing one object automatically frees all other objects allocated more recently in the same obstack.

Function: void obstack_free (struct obstack *obstack_ptr, void *object)

If object is a null pointer, everything allocated in the obstack is freed. Otherwise, object must be the address of an object allocated in the obstack. Then object is freed, along with everything allocated in obstack since object.

Note that if object is a null pointer, the result is an uninitialized obstack. To free all storage in an obstack but leave it valid for further allocation, call obstack_free with the address of the first object allocated on the obstack:

obstack_free (obstack_ptr, first_object_allocated_ptr);

Recall that the objects in an obstack are grouped into chunks. When all the objects in a chunk become free, the obstack library automatically frees the chunk (see section Preparing for Using Obstacks). Then other obstacks, or non-obstack allocation, can reuse the space of the chunk.
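
The stack discipline makes it easy to release a whole group of temporary objects at once: remember the first one and free it when you are done. The sketch below illustrates this. The names ob_strdup and use_scratch_strings are hypothetical, and the chunk macros use malloc directly to keep the example short, where a real program would more likely use xmalloc:

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* The obstack library calls these to get and release chunks.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: copy the string STR into OB.  */
char *
ob_strdup (struct obstack *ob, const char *str)
{
  assert (ob != NULL && str != NULL);
  return obstack_copy0 (ob, str, strlen (str));
}

/* Allocate some scratch strings in OB, then release them all by
   freeing the oldest one.  Objects that were allocated in OB
   before this call are left alone.  */
void
use_scratch_strings (struct obstack *ob)
{
  char *mark = ob_strdup (ob, "first scratch string");
  ob_strdup (ob, "second scratch string");
  ob_strdup (ob, "third scratch string");

  /* Frees MARK and everything allocated in OB after it.  */
  obstack_free (ob, mark);
}
```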

Obstack Functions and Macros

The interfaces for using obstacks may be defined either as functions or as macros, depending on the compiler. The obstack facility works with all C compilers, including both ANSI C and traditional C, but there are precautions you must take if you plan to use compilers other than GNU C.

If you are using an old-fashioned non-ANSI C compiler, all the obstack "functions" are actually defined only as macros. You can call these macros like functions, but you cannot use them in any other way (for example, you cannot take their address).

Calling the macros requires a special precaution: namely, the first operand (the obstack pointer) may not contain any side effects, because it may be computed more than once. For example, if you write this:

obstack_alloc (get_obstack (), 4);

you will find that get_obstack may be called several times. If you use *obstack_list_ptr++ as the obstack pointer argument, you will get very strange results since the incrementation may occur several times.

In ANSI C, each function has both a macro definition and a function definition. The function definition is used if you take the address of the function without calling it. An ordinary call uses the macro definition by default, but you can request the function definition instead by writing the function name in parentheses, as shown here:

char *x;
void *(*funcp) ();
/* Use the macro.  */
x = (char *) obstack_alloc (obptr, size);
/* Call the function.  */
x = (char *) (obstack_alloc) (obptr, size);
/* Take the address of the function.  */
funcp = obstack_alloc;

This is the same situation that exists in ANSI C for the standard library functions. See section Macro Definitions of Functions.

Warning: When you do use the macros, you must observe the precaution of avoiding side effects in the first operand, even in ANSI C.

If you use the GNU C compiler, this precaution is not necessary, because various language extensions in GNU C permit defining the macros so as to compute each argument only once.
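
If your code must also work with other compilers, the simplest way to observe the precaution is to compute the obstack pointer into a variable first, so that the side effect happens exactly once. Here is a sketch of that approach; the function alloc_from_next is a hypothetical name, and the chunk macros use malloc directly for brevity:

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Hypothetical helper: allocate SIZE bytes in the obstack that
   *CURSOR points to, then advance *CURSOR to the next entry in
   the list of obstack pointers.  */
void *
alloc_from_next (struct obstack ***cursor, size_t size)
{
  struct obstack *ob;

  assert (cursor != NULL && *cursor != NULL);
  /* Perform the side effect once, outside the macro call.  */
  ob = *(*cursor)++;
  return obstack_alloc (ob, size);
}
```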

Growing Objects

Because storage in obstack chunks is used sequentially, it is possible to build up an object step by step, adding one or more bytes at a time to the end of the object. With this technique, you do not need to know how much data you will put in the object until you come to the end of it. We call this the technique of growing objects. The special functions for adding data to the growing object are described in this section.

You don't need to do anything special when you start to grow an object. Using one of the functions to add data to the object automatically starts it. However, it is necessary to say explicitly when the object is finished. This is done with the function obstack_finish.

The actual address of the object thus built up is not known until the object is finished. Until then, it always remains possible that you will add so much data that the object must be copied into a new chunk.

While the obstack is in use for a growing object, you cannot use it for ordinary allocation of another object. If you try to do so, the space already added to the growing object will become part of the other object.

Function: void obstack_blank (struct obstack *obstack_ptr, size_t size)

The most basic function for adding to a growing object is obstack_blank, which adds space without initializing it.

Function: void obstack_grow (struct obstack *obstack_ptr, void *data, size_t size)

To add a block of initialized space, use obstack_grow, which is the growing-object analogue of obstack_copy. It adds size bytes of data to the growing object, copying the contents from data.

Function: void obstack_grow0 (struct obstack *obstack_ptr, void *data, size_t size)

This is the growing-object analogue of obstack_copy0. It adds size bytes copied from data, followed by an additional null character.

Function: void obstack_1grow (struct obstack *obstack_ptr, char c)

To add one character at a time, use the function obstack_1grow. It adds a single byte containing c to the growing object.

Function: void * obstack_finish (struct obstack *obstack_ptr)

When you are finished growing the object, use the function obstack_finish to close it off and return its final address.

Once you have finished the object, the obstack is available for ordinary allocation or for growing another object.

When you build an object by growing it, you will probably need to know afterward how long it became. You need not keep track of this as you grow the object, because you can find out the length from the obstack just before finishing the object with the function obstack_object_size, declared as follows:

Function: size_t obstack_object_size (struct obstack *obstack_ptr)

This function returns the current size of the growing object, in bytes. Remember to call this function before finishing the object. After it is finished, obstack_object_size will return zero.
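As an illustration of how these functions fit together, here is a minimal sketch that grows a string from three pieces, asks for its size, and then finishes it. The names join_words and chunk_alloc are hypothetical helpers made up for this example; the sketch also assumes the obstack has already been set up with obstack_init.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: a chunk allocator that aborts instead
   of returning a null pointer.  */
static void *
chunk_alloc (size_t size)
{
  void *p = malloc (size);
  if (p == NULL)
    abort ();
  return p;
}

#define obstack_chunk_alloc chunk_alloc
#define obstack_chunk_free free

/* Grow the string FIRST SEP SECOND in OB and finish it.  Store its
   length, not counting the terminating null character, in *LEN.  */
char *
join_words (struct obstack *ob, const char *first, char sep,
            const char *second, size_t *len)
{
  obstack_grow (ob, first, strlen (first));
  obstack_1grow (ob, sep);
  /* obstack_grow0 appends the data plus a null character.  */
  obstack_grow0 (ob, second, strlen (second));
  /* Ask for the size before finishing; afterward it would be zero.  */
  *len = obstack_object_size (ob) - 1;
  return (char *) obstack_finish (ob);
}
```

The returned pointer stays valid until the object is freed with obstack_free; no other allocation may be made in the same obstack between the first growth call and obstack_finish.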

If you have started growing an object and wish to cancel it, you should finish it and then free it, like this:

obstack_free (obstack_ptr, obstack_finish (obstack_ptr));

This has no effect if no object was growing.

You can use obstack_blank with a negative size argument to make the current object smaller. Just don't try to shrink it beyond zero length--there's no telling what will happen if you do that.

Extra Fast Growing Objects

The usual functions for growing objects incur overhead for checking whether there is room for the new growth in the current chunk. If you are frequently constructing objects in small steps of growth, this overhead can be significant.

You can reduce the overhead by using special "fast growth" functions that grow the object without checking. In order to have a robust program, you must do the checking yourself. If you do this checking in the simplest way each time you are about to add data to the object, you have not saved anything, because that is what the ordinary growth functions do. But if you can arrange to check less often, or check more efficiently, then you make the program faster.

The function obstack_room returns the amount of room available in the current chunk. It is declared as follows:

Function: size_t obstack_room (struct obstack *obstack_ptr)

This returns the number of bytes that can be added safely to the current growing object (or to an object about to be started) in obstack obstack_ptr using the fast growth functions.

While you know there is room, you can use these fast growth functions for adding data to a growing object:

Function: void obstack_1grow_fast (struct obstack *obstack_ptr, char c)

The function obstack_1grow_fast adds one byte containing the character c to the growing object in obstack obstack_ptr.

Function: void obstack_blank_fast (struct obstack *obstack_ptr, size_t size)

The function obstack_blank_fast adds size bytes to the growing object in obstack obstack_ptr without initializing them.

When you check for space using obstack_room and there is not enough room for what you want to add, the fast growth functions are not safe. In this case, simply use the corresponding ordinary growth function instead. Very soon this will copy the object to a new chunk; then there will be lots of room available again.

So, each time you use an ordinary growth function, check afterward for sufficient space using obstack_room. Once the object is copied to a new chunk, there will be plenty of space again, so the program will start using the fast growth functions again.

Here is an example:

void
add_string (struct obstack *obstack, char *ptr, size_t len)
{
  while (len > 0)
    {
      if (obstack_room (obstack) > len)
        {
          /* We have enough room: add everything fast.
             Decrement len inside the loop body so that the
             unsigned count stops at zero instead of wrapping.  */
          while (len > 0)
            {
              obstack_1grow_fast (obstack, *ptr++);
              len--;
            }
        }
      else
        {
          /* Not enough room. Add one character slowly,
             which may copy to a new chunk and make room.  */
          obstack_1grow (obstack, *ptr++);
          len--;
        }
    }
}

Status of an Obstack

Here are functions that provide information on the current status of allocation in an obstack. You can use them to learn about an object while still growing it.

Function: void * obstack_base (struct obstack *obstack_ptr)

This function returns the tentative address of the beginning of the currently growing object in obstack_ptr. If you finish the object immediately, it will have that address. If you make it larger first, it may outgrow the current chunk--then its address will change!

If no object is growing, this value says where the next object you allocate will start (once again assuming it fits in the current chunk).

Function: void * obstack_next_free (struct obstack *obstack_ptr)

This function returns the address of the first free byte in the current chunk of obstack obstack_ptr. This is the end of the currently growing object. If no object is growing, obstack_next_free returns the same value as obstack_base.

Function: size_t obstack_object_size (struct obstack *obstack_ptr)

This function returns the size in bytes of the currently growing object. This is equivalent to

obstack_next_free (obstack_ptr) - obstack_base (obstack_ptr)

Alignment of Data in Obstacks

Each obstack has an alignment boundary; each object allocated in the obstack automatically starts on an address that is a multiple of the specified boundary. By default, this boundary is 4 bytes.

To access an obstack's alignment boundary, use the macro obstack_alignment_mask, whose function prototype looks like this:

Macro: int obstack_alignment_mask (struct obstack *obstack_ptr)

The value is a bit mask; a bit that is 1 indicates that the corresponding bit in the address of an object should be 0. The mask value should be one less than a power of 2; the effect is that all object addresses are multiples of that power of 2. The default value of the mask is 3, so that addresses are multiples of 4. A mask value of 0 means an object can start on any multiple of 1 (that is, no alignment is required).

The expansion of the macro obstack_alignment_mask is an lvalue, so you can alter the mask by assignment. For example, this statement:

obstack_alignment_mask (obstack_ptr) = 0;

has the effect of turning off alignment processing in the specified obstack.

Note that a change in alignment mask does not take effect until after the next time an object is allocated or finished in the obstack. If you are not growing an object, you can make the new alignment mask take effect immediately by calling obstack_finish. This will finish a zero-length object and then do proper alignment for the next object.
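The steps just described can be wrapped in a small helper. This is a sketch under stated assumptions: set_obstack_alignment and chunk_alloc are hypothetical names, and the helper must only be called when no object is growing, because it calls obstack_finish.

```c
#include <obstack.h>
#include <stdlib.h>

/* Assumption for this sketch: a chunk allocator that aborts instead
   of returning a null pointer.  */
static void *
chunk_alloc (size_t size)
{
  void *p = malloc (size);
  if (p == NULL)
    abort ();
  return p;
}

#define obstack_chunk_alloc chunk_alloc
#define obstack_chunk_free free

/* Make objects subsequently allocated in OB start on a multiple of
   BOUNDARY bytes.  Return 0 on success, or -1 if BOUNDARY is not a
   power of 2.  */
int
set_obstack_alignment (struct obstack *ob, size_t boundary)
{
  if (boundary == 0 || (boundary & (boundary - 1)) != 0)
    return -1;
  /* The mask is one less than the power of 2.  */
  obstack_alignment_mask (ob) = (int) (boundary - 1);
  /* Finish a zero-length object so the new mask applies at once.  */
  (void) obstack_finish (ob);
  return 0;
}
```

For example, set_obstack_alignment (obstack_ptr, 16) requests that later objects start on 16-byte boundaries.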

Obstack Chunks

Obstacks work by allocating space for themselves in large chunks, and then parceling out space in the chunks to satisfy your requests. Chunks are normally 4096 bytes long unless you specify a different chunk size. The chunk size includes 8 bytes of overhead that are not actually used for storing objects. Regardless of the specified size, longer chunks will be allocated when necessary for long objects.

The obstack library allocates chunks by calling the function obstack_chunk_alloc, which you must define. When a chunk is no longer needed because you have freed all the objects in it, the obstack library frees the chunk by calling obstack_chunk_free, which you must also define.

These two must be defined (as macros) or declared (as functions) in each source file that uses obstack_init (see section Creating Obstacks). Most often they are defined as macros like this:

#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free

Note that these are simple macros (no arguments). Macro definitions with arguments will not work! It is necessary that obstack_chunk_alloc or obstack_chunk_free, alone, expand into a function name if it is not itself a function name.

The function that actually implements obstack_chunk_alloc cannot return "failure" in any fashion, because the obstack library is not prepared to handle failure. Therefore, malloc itself is not suitable. If the function cannot obtain space, it should either terminate the process (see section Program Termination) or do a nonlocal exit using longjmp (see section Non-Local Exits).
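The xmalloc named in the macro definition above is not defined in this section. One possible definition, given here only as a sketch, wraps malloc and terminates the process when no space can be obtained; the wording of the message and the choice of exit over longjmp are assumptions of this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate SIZE bytes, or terminate the process if that is not
   possible.  This function never returns a null pointer.  */
void *
xmalloc (size_t size)
{
  /* Ask for at least one byte, because malloc (0) is permitted to
     return a null pointer even when memory is available.  */
  void *p = malloc (size != 0 ? size : 1);
  if (p == NULL)
    {
      fputs ("virtual memory exhausted\n", stderr);
      exit (EXIT_FAILURE);
    }
  return p;
}
```

Because the function either returns usable storage or does not return at all, the obstack library never sees a failure indication.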

If you allocate chunks with malloc, the chunk size should be a power of 2. The default chunk size, 4096, was chosen because it is long enough to satisfy many typical requests on the obstack yet short enough not to waste too much memory in the portion of the last chunk not yet used.

Macro: size_t obstack_chunk_size (struct obstack *obstack_ptr)

This returns the chunk size of the given obstack.

Since this macro expands to an lvalue, you can specify a new chunk size by assigning it a new value. Doing so does not affect the chunks already allocated, but will change the size of chunks allocated for that particular obstack in the future. It is unlikely to be useful to make the chunk size smaller, but making it larger might improve efficiency if you are allocating many objects whose size is comparable to the chunk size. Here is how to do so cleanly:

if (obstack_chunk_size (obstack_ptr) < new_chunk_size)
  obstack_chunk_size (obstack_ptr) = new_chunk_size;

Summary of Obstack Functions

Here is a summary of all the functions associated with obstacks. Each takes the address of an obstack (struct obstack *) as its first argument.

void obstack_init (struct obstack *obstack_ptr)
Initialize use of an obstack. See section Creating Obstacks.

void *obstack_alloc (struct obstack *obstack_ptr, size_t size)
Allocate an object of size uninitialized bytes. See section Allocation in an Obstack.

void *obstack_copy (struct obstack *obstack_ptr, void *address, size_t size)
Allocate an object of size bytes, with contents copied from address. See section Allocation in an Obstack.

void *obstack_copy0 (struct obstack *obstack_ptr, void *address, size_t size)
Allocate an object of size+1 bytes, with size of them copied from address, followed by a null character at the end. See section Allocation in an Obstack.

void obstack_free (struct obstack *obstack_ptr, void *object)
Free object (and everything allocated in the specified obstack more recently than object). See section Freeing Objects in an Obstack.

void obstack_blank (struct obstack *obstack_ptr, size_t size)
Add size uninitialized bytes to a growing object. See section Growing Objects.

void obstack_grow (struct obstack *obstack_ptr, void *address, size_t size)
Add size bytes, copied from address, to a growing object. See section Growing Objects.

void obstack_grow0 (struct obstack *obstack_ptr, void *address, size_t size)
Add size bytes, copied from address, to a growing object, and then add another byte containing a null character. See section Growing Objects.

void obstack_1grow (struct obstack *obstack_ptr, char data_char)
Add one byte containing data_char to a growing object. See section Growing Objects.

void *obstack_finish (struct obstack *obstack_ptr)
Finalize the object that is growing and return its permanent address. See section Growing Objects.

size_t obstack_object_size (struct obstack *obstack_ptr)
Get the current size of the currently growing object. See section Growing Objects.

void obstack_blank_fast (struct obstack *obstack_ptr, size_t size)
Add size uninitialized bytes to a growing object without checking that there is enough room. See section Extra Fast Growing Objects.

void obstack_1grow_fast (struct obstack *obstack_ptr, char data_char)
Add one byte containing data_char to a growing object without checking that there is enough room. See section Extra Fast Growing Objects.

size_t obstack_room (struct obstack *obstack_ptr)
Get the amount of room now available for growing the current object. See section Extra Fast Growing Objects.

int obstack_alignment_mask (struct obstack *obstack_ptr)
The mask used for aligning the beginning of an object. This is an lvalue. See section Alignment of Data in Obstacks.

size_t obstack_chunk_size (struct obstack *obstack_ptr)
The size for allocating chunks. This is an lvalue. See section Obstack Chunks.

void *obstack_base (struct obstack *obstack_ptr)
Tentative starting address of the currently growing object. See section Status of an Obstack.

void *obstack_next_free (struct obstack *obstack_ptr)
Address just after the end of the currently growing object. See section Status of an Obstack.

Automatic Storage with Variable Size

The function alloca supports a kind of half-dynamic allocation in which blocks are allocated dynamically but freed automatically.

Allocating a block with alloca is an explicit action; you can allocate as many blocks as you wish, and compute the size at run time. But all the blocks are freed when you exit the function that alloca was called from, just as if they were automatic variables declared in that function. There is no way to free the space explicitly.

The prototype for alloca is in `stdlib.h'. This function is a BSD extension.

Function: void * alloca (size_t size)

The return value of alloca is the address of a block of size bytes of storage, allocated in the stack frame of the calling function.

Do not use alloca inside the arguments of a function call--you will get unpredictable results, because the stack space for the alloca would appear on the stack in the middle of the space for the function arguments. An example of what to avoid is foo (x, alloca (4), y).
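The safe alternative is to call alloca in a statement of its own and pass the resulting pointer. In the sketch below, foo is a hypothetical stand-in for the function in the text and call_foo is an invented wrapper; the example includes `alloca.h', which is where current GNU systems declare alloca.

```c
#include <alloca.h>
#include <string.h>

/* Hypothetical stand-in for `foo': clears the 4-byte buffer BUF and
   returns X + Y.  */
static int
foo (int x, char *buf, int y)
{
  memset (buf, 0, 4);
  return x + y;
}

int
call_foo (int x, int y)
{
  /* Allocate first, in a statement of its own...  */
  char *buf = (char *) alloca (4);
  /* ...then pass the pointer as an ordinary argument.  */
  return foo (x, buf, y);
}
```

The block is still freed automatically when call_foo returns.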

alloca Example

As an example of use of alloca, here is a function that opens a file name made from concatenating two argument strings, and returns a file descriptor or minus one signifying failure:

int
open2 (char *str1, char *str2, int flags, int mode)
{
  char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1);
  strcpy (name, str1);
  strcat (name, str2);
  return open (name, flags, mode);
}

Here is how you would get the same results with malloc and free:

int
open2 (char *str1, char *str2, int flags, int mode)
{
  char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1);
  int desc;
  if (name == 0)
    fatal ("virtual memory exceeded");
  strcpy (name, str1);
  strcat (name, str2);
  desc = open (name, flags, mode);
  free (name);
  return desc;
}

As you can see, it is simpler with alloca. But alloca has other, more important advantages, and some disadvantages.

Advantages of alloca

Here are the reasons why alloca may be preferable to malloc:

Disadvantages of alloca

These are the disadvantages of alloca in comparison with malloc:

GNU C Variable-Size Arrays

In GNU C, you can replace most uses of alloca with an array of variable size. Here is how open2 would look then:

int
open2 (char *str1, char *str2, int flags, int mode)
{
  char name[strlen (str1) + strlen (str2) + 1];
  strcpy (name, str1);
  strcat (name, str2);
  return open (name, flags, mode);
}

But alloca is not always equivalent to a variable-sized array, for several reasons:

Note: If you mix use of alloca and variable-sized arrays within one function, exiting a scope in which a variable-sized array was declared frees all blocks allocated with alloca during the execution of that scope.

Relocating Allocator

Any system of dynamic memory allocation has overhead: the amount of space it uses is more than the amount the program asks for. The relocating memory allocator achieves very low overhead by moving blocks in memory as necessary, on its own initiative.

Concepts of Relocating Allocation

When you allocate a block with malloc, the address of the block never changes unless you use realloc to change its size. Thus, you can safely store the address in various places, temporarily or permanently, as you like. This is not safe when you use the relocating memory allocator, because any and all relocatable blocks can move whenever you allocate memory in any fashion. Even calling malloc or realloc can move the relocatable blocks.

For each relocatable block, you must make a handle--a pointer object in memory, designated to store the address of that block. The relocating allocator knows where each block's handle is, and updates the address stored there whenever it moves the block, so that the handle always points to the block. Each time you access the contents of the block, you should fetch its address anew from the handle.

To call any of the relocating allocator functions from a signal handler is almost certainly incorrect, because the signal could happen at any time and relocate all the blocks. The only way to make this safe is to block the signal around any access to the contents of any relocatable block--not a convenient mode of operation. See section Signal Handling and Nonreentrant Functions.

Allocating and Freeing Relocatable Blocks

In the descriptions below, handleptr designates the address of the handle. All the functions are declared in `malloc.h'; all are GNU extensions.

Function: void * r_alloc (void **handleptr, size_t size)

This function allocates a relocatable block of size size. It stores the block's address in *handleptr and returns a non-null pointer to indicate success.

If r_alloc can't get the space needed, it stores a null pointer in *handleptr, and returns a null pointer.

Function: void r_alloc_free (void **handleptr)

This function is the way to free a relocatable block. It frees the block that *handleptr points to, and stores a null pointer in *handleptr to show it doesn't point to an allocated block any more.

Function: void * r_re_alloc (void **handleptr, size_t size)

The function r_re_alloc adjusts the size of the block that *handleptr points to, making it size bytes long. It stores the address of the resized block in *handleptr and returns a non-null pointer to indicate success.

If enough memory is not available, this function returns a null pointer and does not modify *handleptr.

Memory Usage Warnings

You can ask for warnings as the program approaches running out of memory space, by calling memory_warnings. This is a GNU extension declared in `malloc.h'.

Function: void memory_warnings (void *start, void (*warn_func) (char *))

Call this function to request warnings for nearing exhaustion of virtual memory.

The argument start says where data space begins, in memory. The allocator compares this against the last address used and against the limit of data space, to determine the fraction of available memory in use. If you supply zero for start, then a default value is used which is right in most circumstances.

For warn_func, supply a function that malloc can call to warn you. It is called with a string (a warning message) as argument. Normally it ought to display the string for the user to read.

The warnings come when memory becomes 75% full, when it becomes 85% full, and when it becomes 95% full. Above 95% you get another warning each time memory usage increases.

Character Handling

Programs that work with characters and strings often need to classify a character--is it alphabetic, is it a digit, is it whitespace, and so on--and perform case conversion operations on characters. The functions in the header file `ctype.h' are provided for this purpose.

Since the choice of locale and character set can alter the classifications of particular character codes, all of these functions are affected by the current locale. (More precisely, they are affected by the locale currently selected for character classification--the LC_CTYPE category; see section Categories of Activities that Locales Affect.)

Classification of Characters

This section explains the library functions for classifying characters. For example, isalpha is the function to test for an alphabetic character. It takes one argument, the character to test, and returns a nonzero integer if the character is alphabetic, and zero otherwise. You would use it like this:

if (isalpha (c))
  printf ("The character `%c' is alphabetic.\n", c);

Each of the functions in this section tests for membership in a particular class of characters; each has a name starting with `is'. Each of them takes one argument, which is a character to test, and returns an int which is treated as a boolean value. The character argument is passed as an int, and it may be the constant value EOF instead of a real character.

The attributes of any given character can vary between locales. See section Locales and Internationalization, for more information on locales.

These functions are declared in the header file `ctype.h'.

Function: int islower (int c)

Returns true if c is a lower-case letter.

Function: int isupper (int c)

Returns true if c is an upper-case letter.

Function: int isalpha (int c)

Returns true if c is an alphabetic character (a letter). If islower or isupper is true of a character, then isalpha is also true.

In some locales, there may be additional characters for which isalpha is true--letters which are neither upper case nor lower case. But in the standard "C" locale, there are no such additional characters.

Function: int isdigit (int c)

Returns true if c is a decimal digit (`0' through `9').

Function: int isalnum (int c)

Returns true if c is an alphanumeric character (a letter or a digit); in other words, if either isalpha or isdigit is true of a character, then isalnum is also true.

Function: int isxdigit (int c)

Returns true if c is a hexadecimal digit. Hexadecimal digits include the normal decimal digits `0' through `9' and the letters `A' through `F' and `a' through `f'.
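A common use of isxdigit is to validate a character before converting it to its numeric value. The function below is a minimal sketch; hex_digit_value is a hypothetical name, and the arithmetic on letters assumes a character set such as ASCII in which `a' through `f' are contiguous.

```c
#include <ctype.h>

/* Return the value (0 through 15) of the hexadecimal digit C, or -1
   if C is not a hexadecimal digit.  */
int
hex_digit_value (int c)
{
  if (!isxdigit (c))
    return -1;
  if (isdigit (c))
    return c - '0';
  /* Assumes `a' through `f' have consecutive codes, as in ASCII.  */
  return tolower (c) - 'a' + 10;
}
```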

Function: int ispunct (int c)

Returns true if c is a punctuation character. This means any printing character that is not alphanumeric or a space character.

Function: int isspace (int c)

Returns true if c is a whitespace character. In the standard "C" locale, isspace returns true for only the standard whitespace characters:

' '
space

'\f'
formfeed

'\n'
newline

'\r'
carriage return

'\t'
horizontal tab

'\v'
vertical tab

Function: int isblank (int c)

Returns true if c is a blank character; that is, a space or a tab. This function is a GNU extension.

Function: int isgraph (int c)

Returns true if c is a graphic character; that is, a character that has a glyph associated with it. The whitespace characters are not considered graphic.

Function: int isprint (int c)

Returns true if c is a printing character. Printing characters include all the graphic characters, plus the space (` ') character.

Function: int iscntrl (int c)

Returns true if c is a control character (that is, a character that is not a printing character).

Function: int isascii (int c)

Returns true if c is a 7-bit unsigned char value that fits into the US/UK ASCII character set. This function is a BSD extension and is also an SVID extension.
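To show several of the classification functions working together, here is a sketch that tallies the kinds of characters in a string. The struct char_counts type and the classify_string function are hypothetical names invented for this example. Each character is converted through unsigned char before the call, so that a plain char with a negative value is never passed to these functions.

```c
#include <ctype.h>
#include <stddef.h>

struct char_counts
{
  size_t letters;
  size_t digits;
  size_t spaces;
  size_t other;
};

/* Count the letters, digits, whitespace characters, and remaining
   characters in the null-terminated string S.  */
struct char_counts
classify_string (const char *s)
{
  struct char_counts n = { 0, 0, 0, 0 };
  for (; *s != '\0'; s++)
    {
      int c = (unsigned char) *s;
      if (isalpha (c))
        n.letters++;
      else if (isdigit (c))
        n.digits++;
      else if (isspace (c))
        n.spaces++;
      else
        n.other++;
    }
  return n;
}
```

The results depend on the current locale; a program that never selects another locale runs in the standard "C" locale.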

Case Conversion

This section explains the library functions for performing conversions such as case mappings on characters. For example, toupper converts any character to upper case if possible. If the character can't be converted, toupper returns it unchanged.

These functions take one argument of type int, which is the character to convert, and return the converted character as an int. If the conversion is not applicable to the argument given, the argument is returned unchanged.

Compatibility Note: In pre-ANSI C dialects, instead of returning the argument unchanged, these functions may fail when the argument is not suitable for the conversion. Thus for portability, you may need to write islower(c) ? toupper(c) : c rather than just toupper(c).

These functions are declared in the header file `ctype.h'.

Function: int tolower (int c)

If c is an upper-case letter, tolower returns the corresponding lower-case letter. If c is not an upper-case letter, c is returned unchanged.

Function: int toupper (int c)

If c is a lower-case letter, toupper returns the corresponding upper-case letter. Otherwise c is returned unchanged.

Function: int toascii (int c)

This function converts c to a 7-bit unsigned char value that fits into the US/UK ASCII character set, by clearing the high-order bits. This function is a BSD extension and is also an SVID extension.

Function: int _tolower (int c)

This is identical to tolower, and is provided for compatibility with the SVID. See section SVID (The System V Interface Description).

Function: int _toupper (int c)

This is identical to toupper, and is provided for compatibility with the SVID.
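As a short illustration of the conversion functions, the following sketch converts a string to upper case in place. The name upcase_string is hypothetical. Characters that have no upper-case equivalent are left alone, because toupper returns them unchanged.

```c
#include <ctype.h>

/* Convert the null-terminated string S to upper case in place and
   return S.  */
char *
upcase_string (char *s)
{
  char *p;
  for (p = s; *p != '\0'; p++)
    /* Convert through unsigned char so that a negative char value
       is not passed to toupper.  */
    *p = (char) toupper ((unsigned char) *p);
  return s;
}
```

The argument must be a modifiable array; passing a string constant is not safe.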

String and Array Utilities

Operations on strings (or arrays of characters) are an important part of many programs. The GNU C library provides an extensive set of string utility functions, including functions for copying, concatenating, comparing, and searching strings. Many of these functions can also operate on arbitrary regions of storage; for example, the memcpy function can be used to copy the contents of any kind of array.

It's fairly common for beginning C programmers to "reinvent the wheel" by duplicating this