The GNU C Library Reference Manual
Sandra Loosemore with Roland McGrath, Andrew Oram, and Richard M. Stallman
last updated 9 April 1993
for version 1.06 Beta
Copyright (C) 1993 Free Software Foundation, Inc.
The C language provides no built-in facilities for performing such common operations as input/output, memory management, string manipulation, and the like. Instead, these facilities are defined in a standard library, which you compile and link with your programs.
The GNU C library, described in this document, defines all of the library functions that are specified by the ANSI C standard, as well as additional features specific to POSIX and other derivatives of the Unix operating system, and extensions specific to the GNU system.
The purpose of this manual is to tell you how to use the facilities of the GNU library. We have mentioned which features belong to which standards to help you identify things that are potentially nonportable to other systems. But the emphasis in this manual is not on strict portability.
This manual is written with the assumption that you are at least somewhat familiar with the C programming language and basic programming concepts. Specifically, familiarity with ANSI standard C (see section ANSI C), rather than "traditional" pre-ANSI C dialects, is assumed.
The GNU C library includes several header files, each of which provides definitions and declarations for a group of related facilities; this information is used by the C compiler when processing your program. For example, the header file `stdio.h' declares facilities for performing input and output, and the header file `string.h' declares string processing utilities. The organization of this manual generally follows the same division as the header files.
If you are reading this manual for the first time, you should read all of the introductory material and skim the remaining chapters. There are a lot of functions in the GNU C library and it's not realistic to expect that you will be able to remember exactly how to use each and every one of them. It's more important to become generally familiar with the kinds of facilities that the library provides, so that when you are writing your programs you can recognize when to make use of library functions, and where in this manual you can find more specific information about them.
This section discusses the various standards and other sources that the GNU C library is based upon. These sources include the ANSI C and POSIX standards, and the System V and Berkeley Unix implementations.
The primary focus of this manual is to tell you how to make effective use of the GNU library facilities. But if you are concerned about making your programs compatible with these standards, or portable to operating systems other than GNU, this can affect how you use the library. This section gives you an overview of these standards, so that you will know what they are when they are mentioned in other parts of the manual.
See section Summary of Library Facilities, for an alphabetical list of the functions and other symbols provided by the library. This list also states which standards each function or symbol comes from.
The GNU C library is compatible with the C standard adopted by the American National Standards Institute (ANSI): American National Standard X3.159-1989, "ANSI C". The header files and library facilities that make up the GNU library are a superset of those specified by the ANSI C standard.
If you are concerned about strict adherence to the ANSI C standard, you should use the `-ansi' option when you compile your programs with the GNU C compiler. This tells the compiler to define only ANSI standard features from the library header files, unless you explicitly ask for additional features. See section Feature Test Macros, for information on how to do this.
Being able to restrict the library to include only ANSI C features is important because ANSI C puts limitations on what names can be defined by the library implementation, and the GNU extensions don't fit these limitations. See section Reserved Names, for more information about these restrictions.
This manual does not attempt to give you complete details on the differences between ANSI C and older dialects. It gives advice on how to write programs to work portably under multiple C dialects, but does not aim for completeness.
The GNU library is also compatible with the IEEE POSIX family of standards, known more formally as the Portable Operating System Interface for Computer Environments. POSIX is derived mostly from various versions of the Unix operating system.
The library facilities specified by the POSIX standard are a superset of those required by ANSI C; POSIX specifies additional features for ANSI C functions, as well as specifying new additional functions. In general, the additional requirements and functionality defined by the POSIX standard are aimed at providing lower-level support for a particular kind of operating system environment, rather than general programming language support which can run in many diverse operating system environments.
The GNU C library implements all of the functions specified in IEEE Std 1003.1-1988, the POSIX System Application Program Interface, commonly referred to as POSIX.1. The primary extensions to the ANSI C facilities specified by this standard include file system interface primitives (see section File System Interface), device-specific terminal control functions (see section Low-Level Terminal Interface), and process control functions (see section Child Processes).
Some facilities from draft 11 of IEEE Std 1003.2, the POSIX Shell and Utilities standard (POSIX.2) are also implemented in the GNU library. These include utilities for dealing with regular expressions and other pattern matching facilities (see section Pattern Matching).
The GNU C library defines facilities from some other versions of Unix, specifically from the 4.2 BSD and 4.3 BSD Unix systems (also known as Berkeley Unix) and from SunOS (a popular 4.2 BSD derivative that includes some Unix System V functionality).
The BSD facilities include symbolic links (see section Symbolic Links), the
select function (see section Waiting for Input or Output), the BSD signal
functions (see section BSD Signal Handling), and sockets (see section Sockets).
The System V Interface Description (SVID) is a document describing the AT&T Unix System V operating system. It is to some extent a superset of the POSIX standard (see section POSIX (The Portable Operating System Interface)).
The GNU C library defines some of the facilities required by the SVID that are not also required by the ANSI or POSIX standards, for compatibility with System V Unix and other Unix systems (such as SunOS) which include these facilities. However, many of the more obscure and less generally useful facilities required by the SVID are not included. (In fact, Unix System V itself does not provide them all.)
Incomplete: Are there any particular System V facilities that ought to be mentioned specifically here?
This section describes some of the practical issues involved in using the GNU C library.
Libraries for use by C programs really consist of two parts: header files that define types and macros and declare variables and functions; and the actual library or archive that contains the definitions of the variables and functions.
(Recall that in C, a declaration merely provides information that a function or variable exists and gives its type. For a function declaration, information about the types of its arguments might be provided as well. The purpose of declarations is to allow the compiler to correctly process references to the declared variables and functions. A definition, on the other hand, actually allocates storage for a variable or says what a function does.)
In order to use the facilities in the GNU C library, you should be sure that your program source files include the appropriate header files. This is so that the compiler has declarations of these facilities available and can correctly process references to them. Once your program has been compiled, the linker resolves these references to the actual definitions provided in the archive file.
Header files are included into a program source file by the `#include' preprocessor directive. The C language supports two forms of this directive; the first,
#include "header"
is typically used to include a header file header that you write yourself; this would contain definitions and declarations describing the interfaces between the different parts of your particular application. By contrast,
#include <file.h>
is typically used to include a header file `file.h' that contains definitions and declarations for a standard library. This file would normally be installed in a standard place by your system administrator. You should use this second form for the C library header files.
Typically, `#include' directives are placed at the top of the C source file, before any other code. If you begin your source files with some comments explaining what the code in the file does (a good idea), put the `#include' directives immediately afterwards, following the feature test macro definition (see section Feature Test Macros).
For more information about the use of header files and `#include' directives, see section 'Header Files' in The GNU C Preprocessor Manual.
The GNU C library provides several header files, each of which contains the type and macro definitions and variable and function declarations for a group of related facilities. This means that your programs may need to include several header files, depending on exactly which facilities you are using.
Some library header files include other library header files automatically. However, as a matter of programming style, you should not rely on this; it is better to explicitly include all the header files required for the library facilities you are using. The GNU C library header files have been written in such a way that it doesn't matter if a header file is accidentally included more than once; including a header file a second time has no effect. Likewise, if your program needs to include multiple header files, the order in which they are included doesn't matter.
Compatibility Note: Inclusion of standard header files in any order and any number of times works in any ANSI C implementation. However, this has traditionally not been the case in many older C implementations.
Strictly speaking, you don't have to include a header file to use a function it declares; you could declare the function explicitly yourself, according to the specifications in this manual. But it is usually better to include the header file because it may define types and macros that are not otherwise available and because it may define more efficient macro replacements for some functions. It is also a sure way to have the correct declaration.
If we describe something as a function in this manual, it may have a macro definition as well. This normally has no effect on how your program runs--the macro definition does the same thing as the function would. In particular, macro equivalents for library functions evaluate arguments exactly once, in the same way that a function call would. The main reason for these macro definitions is that sometimes they can produce an inline expansion that is considerably faster than an actual function call.
Taking the address of a library function works even if it is also defined as a macro. This is because, in this context, the name of the function isn't followed by the left parenthesis that is syntactically necessary to recognize a macro call.
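Here is a minimal sketch of taking the address of a library function. The helpers map_ints and make_nonnegative are hypothetical names invented for this illustration; only abs comes from the library.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: replace each element of VALUES with the result
   of applying FN to it.  */
void
map_ints (int *values, size_t count, int (*fn) (int))
{
  size_t i;

  for (i = 0; i < count; i++)
    values[i] = fn (values[i]);
}

/* Here abs is not followed by a left parenthesis, so the name refers
   to the function itself even if stdlib.h also defines abs as a macro.  */
void
make_nonnegative (int *values, size_t count)
{
  map_ints (values, count, abs);
}
```

Calling make_nonnegative on an array holding -3, 0 and 5 leaves it holding 3, 0 and 5.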
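You might occasionally want to avoid using the macro definition of a function, perhaps to make your program easier to debug. There are two ways you can do this: enclose the name of the function in parentheses at the specific use, or remove the macro definition with the `#undef' preprocessor directive.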
For example, suppose the header file `stdlib.h' declares a function
named abs with
extern int abs (int);
and also provides a macro definition for abs. Then, in:
#include <stdlib.h>
int f (int *i) { return (abs (++*i)); }
the reference to abs might refer to either a macro or a function.
On the other hand, in each of the following examples the reference is
to a function and not a macro.
#include <stdlib.h>
int g (int *i) { return ((abs)(++*i)); }
#undef abs
int h (int *i) { return (abs (++*i)); }
Since macro definitions that double for a function behave in exactly the same way as the actual function version, there is usually no need for any of these methods. In fact, removing macro definitions usually just makes your program slower.
The names of all library types, macros, variables and functions that come from the ANSI C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:
For example, other people reading your code could get very confused if
you defined a function named exit to do something completely different
from what the standard exit function does. Preventing
this situation helps to make your programs easier to understand and
contributes to modularity and maintainability.
In addition to the names documented in this manual, reserved names include all external identifiers (global functions and variables) that begin with an underscore (`_') and all identifiers, regardless of use, that begin with either two underscores or an underscore followed by a capital letter. This is so that the library and header files can define functions, variables, and macros for internal purposes without risk of conflict with names in user programs.
Some additional classes of identifier names are reserved for future extensions to the C language. While using these names for your own purposes right now might not cause a problem, they do raise the possibility of conflict with future versions of the C standard, so you should avoid these names.
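For example, names formed by adding the suffix `f' or `l' to the name of
an existing mathematics function are reserved for corresponding
functions that operate on float or long double arguments,
respectively.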
In addition, some individual header files reserve names beyond those that they actually define. You only need to worry about these restrictions if your program includes that particular header file.
The exact set of features available when you compile a source file is controlled by which feature test macros you define.
If you compile your programs using `gcc -ansi', you get only the ANSI C library features, unless you explicitly request additional features by defining one or more of the feature macros. See section 'Options' in The GNU CC Manual, for more information about GCC options.
You should define these macros by using `#define' preprocessor directives at the top of your source code files. You could also use the `-D' option to GCC, but it's better if you make the source files indicate their own meaning in a self-contained way.
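Here is a minimal sketch of a source file that requests features in this way. It assumes the macro _POSIX_C_SOURCE and the POSIX.1 function fileno; the helper descriptor_of is a hypothetical name used only for illustration.

```c
/* The feature test macro comes before the first #include, so that
   every header sees it.  */
#define _POSIX_C_SOURCE 2

#include <stdio.h>

/* Hypothetical helper: return the file descriptor behind STREAM.
   fileno is a POSIX.1 function, not an ANSI C one, so its declaration
   is requested by the macro above.  */
int
descriptor_of (FILE *stream)
{
  return fileno (stream);
}
```

Putting the definition in the source file, rather than passing `-D' to the compiler, keeps the file's requirements visible to anyone who reads it.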
Macro: _POSIX_SOURCE
If you define this macro, then the functionality from the POSIX.1 standard (IEEE Standard 1003.1) is available, as well as all of the ANSI C facilities.
Macro: _POSIX_C_SOURCE
If you define this macro with a value of 1, then the
functionality from the POSIX.1 standard (IEEE Standard 1003.1) is made
available. If you define this macro with a value of 2, then both
the functionality from the POSIX.1 standard and the functionality from
the POSIX.2 standard (IEEE Standard 1003.2) are made available. This is
in addition to the ANSI C facilities.
Macro: _BSD_SOURCE
If you define this macro, functionality derived from 4.3 BSD Unix is included as well as the ANSI C, POSIX.1, and POSIX.2 material.
Some of the features derived from 4.3 BSD Unix conflict with the corresponding features specified by the POSIX.1 standard. If this macro is defined, the 4.3 BSD definitions take precedence over the POSIX definitions.
Macro: _SVID_SOURCE
If you define this macro, functionality derived from SVID is included as well as the ANSI C, POSIX.1, and POSIX.2 material.
Macro: _GNU_SOURCE
If you define this macro, everything is included: ANSI C, POSIX.1, POSIX.2, BSD, SVID, and GNU extensions. In the cases where POSIX.1 conflicts with BSD, the POSIX definitions take precedence.
If you want to get the full effect of _GNU_SOURCE but make the
BSD definitions take precedence over the POSIX definitions, use this
sequence of definitions:
#define _GNU_SOURCE
#define _BSD_SOURCE
#define _SVID_SOURCE
We recommend you use _GNU_SOURCE in new programs.
If you don't specify the `-ansi' option to GCC and don't define
any of these macros explicitly, the effect is the same as defining
_GNU_SOURCE.
When you define a feature test macro to request a larger class of
features, it is harmless to define in addition a feature test macro for
a subset of those features. For example, if you define
_POSIX_C_SOURCE, then defining _POSIX_SOURCE as well has
no effect. Likewise, if you define _GNU_SOURCE, then defining
either _POSIX_SOURCE or _POSIX_C_SOURCE or
_SVID_SOURCE as well has no effect.
Note, however, that the features of _BSD_SOURCE are not a subset
of any of the other feature test macros supported. This is because it
defines BSD features that take precedence over the POSIX features that
are requested by the other macros. For this reason, defining
_BSD_SOURCE in addition to the other feature test macros does
have an effect: it causes the BSD features to take priority over the
conflicting POSIX features.
Here is an overview of the contents of the remaining chapters of this manual.
sizeof operator and the symbolic constant NULL, and how to
write functions accepting variable numbers of arguments.
isspace) and functions for
performing case conversion.
char data type.
FILE * objects). These are the normal C library functions
from `stdio.h'.
setjmp and
longjmp functions.
If you already know the name of the facility you are interested in, you can look it up in section Summary of Library Facilities. This gives you a summary of its syntax and a pointer to where you can find a more detailed description. This appendix is particularly useful if you just want to verify the order and type of arguments to a function, for example.
Many functions in the GNU C library detect and report error conditions, and sometimes your programs need to check for these error conditions. For example, when you open an input file, you should verify that the file was actually opened correctly, and print an error message or take other appropriate action if the call to the library function failed.
This chapter describes how the error reporting facility works. Your program should include the header file `errno.h' to use this facility.
Most library functions return a special value to indicate that they have
failed. The special value is typically -1, a null pointer, or a
constant such as EOF that is defined for that purpose. But this
return value tells you only that an error has occurred. To find out
what kind of error it was, you need to look at the error code stored in the
variable errno. This variable is declared in the header file
`errno.h'.
The variable errno contains the system error number. You can
change the value of errno.
Since errno is declared volatile, it might be changed
asynchronously by a signal handler; see section Defining Signal Handlers.
However, a properly written signal handler saves and restores the value
of errno, so you generally do not need to worry about this
possibility except when writing signal handlers.
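Here is a sketch of a signal handler that saves and restores errno. The names last_signal and note_signal are hypothetical, and the handler is assumed to be installed with signal or a similar function; write is used because it is one of the calls a handler might make that can change errno.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Set by the handler so the main program can notice the signal later.  */
volatile sig_atomic_t last_signal = 0;

/* Hypothetical handler: it calls write, which may change errno, so it
   saves errno on entry and restores it before returning.  */
void
note_signal (int signum)
{
  static const char text[] = "signal received\n";
  int saved_errno = errno;

  last_signal = signum;
  if (write (STDERR_FILENO, text, sizeof text - 1) == -1)
    {
      /* Nothing useful can be done about a failed write here.  */
    }
  errno = saved_errno;
}
```

Code that was interrupted by the signal sees the same value of errno after the handler returns as it did before.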
The initial value of errno at program startup is zero. Many
library functions are guaranteed to set it to certain nonzero values
when they encounter certain kinds of errors. These error conditions are
listed for each function. These functions do not change errno
when they succeed; thus, the value of errno after a successful
call is not necessarily zero, and you should not use errno to
determine whether a call failed. The proper way to do that is
documented for each function. If the call failed, you can
examine errno.
Many library functions can set errno to a nonzero value as a
result of calling other library functions which might fail. You should
assume that any library function might alter errno.
Portability Note: ANSI C specifies errno as a
"modifiable lvalue" rather than as a variable, permitting it to be
implemented as a macro. For example, its expansion might involve a
function call, like *_errno (). In fact, that is what it is
on the GNU system itself. The GNU library, on non-GNU systems, does
whatever is right for the particular system.
There are a few library functions, like sqrt and atan,
that return a perfectly legitimate value in case of an error, but also
set errno. For these functions, if you want to check to see
whether an error occurred, the recommended method is to set errno
to zero before calling the function, and then check its value afterward.
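Here is a sketch of that method. As an assumption made for this illustration, it uses strtol rather than sqrt, because strtol also returns an ordinary-looking value (LONG_MAX or LONG_MIN) on overflow and ANSI C requires it to set errno to ERANGE in that case. The helper parse_long is a hypothetical name.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert TEXT to a long in base 10.  Returns 1
   and stores the value in *RESULT on success, or 0 if strtol reported
   an error through errno.  It does not check for non-numeric input.  */
int
parse_long (const char *text, long *result)
{
  long value;

  errno = 0;                    /* Clear any stale error code.  */
  value = strtol (text, NULL, 10);
  if (errno != 0)               /* The call itself set errno.  */
    return 0;
  *result = value;
  return 1;
}
```

Without the assignment of zero, a nonzero errno left over from some earlier call could be mistaken for a failure of this one.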
All the error codes have symbolic names; they are macros defined in `errno.h'. The names start with `E' and an upper-case letter or digit; you should consider names of this form to be reserved names. See section Reserved Names.
The error code values are all positive integers and are all distinct.
(Since the values are distinct, you can use them as labels in a
switch statement, for example.) Your program should not make any
other assumptions about the specific values of these symbolic constants.
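Here is a sketch of using error codes as labels in a switch statement. The function describe_open_failure and its message texts are hypothetical; the three codes shown are ones that an attempt to open a file might plausibly produce.

```c
#include <errno.h>

/* Hypothetical helper: choose a short explanation for CODE.  The case
   labels work because the error codes are distinct integer constants.  */
const char *
describe_open_failure (int code)
{
  switch (code)
    {
    case ENOENT:
      return "the file does not exist";
    case EACCES:
      return "permission was denied";
    case EMFILE:
      return "too many files are open";
    default:
      return "some other error occurred";
    }
}
```

The default label matters: the value of errno does not have to be one of the codes you anticipated.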
The value of errno doesn't necessarily have to correspond to any
of these macros, since some library functions might return other error
codes of their own for other situations. The only values that are
guaranteed to be meaningful for a particular library function are the
ones that this manual lists for that function.
On non-GNU systems, almost any system call can return EFAULT if
it is given an invalid pointer as an argument. Since this could only
happen as a result of a bug in your program, and since it will not
happen on the GNU system, we have saved space by not mentioning
EFAULT in the descriptions of individual functions.
The error code macros are defined in the header file `errno.h'. All of them expand into integer constant values. Some of these error codes can't occur on the GNU system, but they can occur using the GNU library on other systems.
Operation not permitted; only the owner of the file (or other resource) or processes with special privileges can perform the operation.
No such file or directory. This is a "file doesn't exist" error for ordinary files that are referenced in contexts where they are expected to already exist.
No process matches the specified process ID.
Interrupted function call; an asynchronous signal occurred and prevented completion of the call. When this happens, you should try the call again.
You can choose to have functions resume after a signal that is handled,
rather than failing with EINTR; see section Primitives Interrupted by Signals.
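Here is a sketch of trying a call again after it fails with EINTR. It assumes the POSIX read function; the wrapper read_retrying is a hypothetical name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: like read, but if a signal interrupts the call
   before any data is transferred, simply make the call again.  */
ssize_t
read_retrying (int fd, void *buffer, size_t size)
{
  ssize_t count;

  do
    count = read (fd, buffer, size);
  while (count == -1 && errno == EINTR);

  return count;
}
```

Any failure other than EINTR is passed back to the caller unchanged, with errno still set by read.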
Input/output error; usually used for physical read or write errors.
No such device or address. Typically, this means that a file representing a device has been installed incorrectly, and the system can't find the right kind of device driver for it.
Argument list too long; used when the arguments passed to a new program
being executed with one of the exec functions (see section Executing a File) occupy too much memory space. This condition never arises in the
GNU system.
Invalid executable file format. This condition is detected by the
exec functions; see section Executing a File.
Bad file descriptor; for example, I/O on a descriptor that has been closed or reading from a descriptor open only for writing (or vice versa).
There are no child processes. This error happens on operations that are supposed to manipulate child processes, when there aren't any processes to manipulate.
Deadlock avoided; allocating a system resource would have resulted in a deadlock situation. For an example, see section File Locks.
No memory available. The system cannot allocate more virtual memory because its capacity is full.
Permission denied; the file permissions do not allow the attempted operation.
Bad address; an invalid pointer was detected.
A file that isn't a block special file was given in a situation that requires one. For example, trying to mount an ordinary file as a file system in Unix gives this error.
Resource busy; a system resource that can't be shared is already in use. For example, if you try to delete a file that is the root of a currently mounted filesystem, you get this error.
File exists; an existing file was specified in a context where it only makes sense to specify a new file.
An attempt to make an improper link across file systems was detected.
The wrong type of device was given to a function that expects a particular sort of device.
A file that isn't a directory was specified when a directory is required.
File is a directory; attempting to open a directory for writing gives this error.
Invalid argument. This is used to indicate various kinds of problems with passing the wrong argument to a library function.
There are too many distinct file openings in the entire system. Note that any number of linked channels count as just one file opening; see section Linked Channels.
The current process has too many files open and can't open any more. Duplicate descriptors do count toward this limit.
Inappropriate I/O control operation, such as trying to set terminal modes on an ordinary file.
An attempt to execute a file that is currently open for writing, or write to a file that is currently being executed. (The name stands for "text file busy".) This is not an error in the GNU system; the text is copied as necessary.
File too big; the size of a file would be larger than allowed by the system.
No space left on device; write operation on a file failed because the disk is full.
Invalid seek operation (such as on a pipe).
An attempt was made to modify a file on a read-only file system.
Too many links; the link count of a single file is too large.
Broken pipe; there is no process reading from the other end of a pipe.
Every library function that returns this error code also generates a
SIGPIPE signal; this signal terminates the program if not handled
or blocked. Thus, your program will never actually see EPIPE
unless it has handled or blocked SIGPIPE.
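Here is a sketch of a program seeing this error code after choosing to ignore SIGPIPE. It assumes the POSIX functions pipe, write and close; the function write_to_closed_pipe is a hypothetical name. Note that it changes the handling of SIGPIPE for the whole process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demonstration: write to a pipe whose reading end has
   been closed.  Returns the error code from the write, 0 if the write
   unexpectedly succeeded, or -1 if the pipe could not be created.  */
int
write_to_closed_pipe (void)
{
  int fds[2];
  int code;

  if (pipe (fds) == -1)
    return -1;

  signal (SIGPIPE, SIG_IGN);    /* Otherwise the signal would kill us.  */
  close (fds[0]);               /* Now nothing can read from the pipe.  */

  code = (write (fds[1], "x", 1) == -1) ? errno : 0;
  close (fds[1]);
  return code;
}
```

With the signal ignored, the write fails in the ordinary way and the function returns the error code for a broken pipe.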
Domain error; used by mathematical functions when an argument value does not fall into the domain over which the function is defined.
Range error; used by mathematical functions when the result value is not representable because of overflow or underflow.
Resource temporarily unavailable; the call might work if you try again
later. Only fork returns error code EAGAIN for such a
reason.
An operation that would block was attempted on an object that has non-blocking mode selected.
Portability Note: In 4.4BSD and GNU, EWOULDBLOCK and
EAGAIN are the same. Earlier versions of BSD (see section Berkeley Unix) have two distinct codes, and use EWOULDBLOCK to indicate
an I/O operation that would block on an object with non-blocking mode
set, and EAGAIN for other kinds of errors.
An operation that cannot complete immediately was initiated on an object that has non-blocking mode selected.
An operation is already in progress on an object that has non-blocking mode selected.
A file that isn't a socket was specified when a socket is required.
No destination address was supplied on a socket operation.
The size of a message sent on a socket was larger than the supported maximum size.
The socket type does not support the requested communications protocol.
You specified a socket option that doesn't make sense for the particular protocol being used by the socket. See section Socket Options.
The socket domain does not support the requested communications protocol. See section Creating a Socket.
The socket type is not supported.
The operation you requested is not supported. Some socket functions don't make sense for all types of sockets, and others may not be implemented for all communications protocols.
The socket communications protocol family you requested is not supported.
The address family specified for a socket is not supported; it is inconsistent with the protocol being used on the socket. See section Sockets.
The requested socket address is already in use. See section Socket Addresses.
The requested socket address is not available; for example, you tried to give a socket a name that doesn't match the local host name. See section Socket Addresses.
A socket operation failed because the network was down.
A socket operation failed because the subnet containing the remote host was unreachable.
A network connection was reset because the remote host crashed.
A network connection was aborted locally.
A network connection was closed for reasons outside the control of the local host, such as by the remote machine rebooting.
The kernel's buffers for I/O operations are all in use.
You tried to connect a socket that is already connected. See section Making a Connection.
The socket is not connected to anything. You get this error when you try to transmit data over a socket, without first specifying a destination for the data.
The socket has already been shut down.
A socket operation with a specified timeout received no response during the timeout period.
A remote host refused to allow the network connection (typically because it is not running the requested service).
Too many levels of symbolic links were encountered in looking up a file name. This often indicates a cycle of symbolic links.
Filename too long (longer than PATH_MAX; see section Limits on File System Capacity) or host name too long (in gethostname or
sethostname; see section Host Identification).
The remote host for a requested network connection is down.
The remote host for a requested network connection is not reachable.
Directory not empty, where an empty directory was expected. Typically, this error occurs when you are trying to delete a directory.
The file quota system is confused because there are too many users.
The user's disk quota was exceeded.
Stale NFS file handle. This indicates an internal confusion in the NFS system which is due to file system rearrangements on the server host. Repairing this condition usually requires unmounting and remounting the NFS file system on the local host.
An attempt was made to NFS-mount a remote file system with a file name that already specifies an NFS-mounted file. (This is an error on some operating systems, but we expect it to work properly on the GNU system, making this error code impossible.)
No locks available. This is used by the file locking facilities; see section File Locks.
Function not implemented. Some functions have commands or options defined that might not be supported in all implementations, and this is the kind of error you get if you request them and they are not supported.
The experienced user will know what is wrong.
This error code has no purpose.
The library has functions and variables designed to make it easy for
your program to report informative error messages in the customary
format about the failure of a library call. The functions
strerror and perror give you the standard error message
for a given error code; the variable
program_invocation_short_name gives you convenient access to the
name of the program that encountered the error.
Function: char * strerror (int errnum)
The strerror function maps the error code (see section Checking for Errors) specified by the errnum argument to a descriptive error
message string. The return value is a pointer to this string.
The value errnum normally comes from the variable errno.
You should not modify the string returned by strerror. Also, if
you make subsequent calls to strerror, the string might be
overwritten. (But it's guaranteed that no library function ever calls
strerror behind your back.)
The function strerror is declared in `string.h'.
Function: void perror (const char *message)
This function prints an error message to the stream stderr;
see section Standard Streams.
If you call perror with a message that is either a null
pointer or an empty string, perror just prints the error message
corresponding to errno, adding a trailing newline.
If you supply a non-null message argument, then perror
prefixes its output with this string. It adds a colon and a space
character to separate the message from the error string corresponding
to errno.
The function perror is declared in `stdio.h'.
strerror and perror produce the exact same message for any
given error code; the precise text varies from system to system. On the
GNU system, the messages are fairly short; there are no multi-line
messages or embedded newlines. Each error message begins with a capital
letter and does not include any terminating punctuation.
Compatibility Note: The strerror function is a new
feature of ANSI C. Many older C systems do not support this function
yet.
Many programs that don't read input from the terminal are designed to
exit if any system call fails. By convention, the error message from
such a program should start with the program's name, sans directories.
You can find that name in the variable
program_invocation_short_name; the full file name is stored in the
variable program_invocation_name:
Variable: char * program_invocation_name
This variable's value is the name that was used to invoke the program
running in the current process. It is the same as argv[0].
Variable: char * program_invocation_short_name
This variable's value is the name that was used to invoke the program
running in the current process, with directory names removed. (That is
to say, it is the same as program_invocation_name minus
everything up to the last slash, if any.)
Both program_invocation_name and
program_invocation_short_name are set up by the system before
main is called.
Portability Note: These two variables are GNU extensions. If
you want your program to work with non-GNU libraries, you must save the
value of argv[0] in main, and then strip off the directory
names yourself. We added these extensions to make it possible to write
self-contained error-reporting subroutines that require no explicit
cooperation from main.
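For programs that must also work with non-GNU libraries, the stripping of directory names described above can be sketched as follows. The function name short_program_name is hypothetical; it is not part of any library, and it assumes that `/' is the only directory separator.

```c
#include <string.h>

/* Return the part of ARGV0 after the last slash, or all of ARGV0
   if it contains no slash.  The result points into ARGV0 itself,
   so it stays valid for as long as ARGV0 does.  */
const char *
short_program_name (const char *argv0)
{
  const char *slash = strrchr (argv0, '/');
  return slash ? slash + 1 : argv0;
}
```

You would typically call this once in main with argv[0] and save the result in a variable of your own for later error messages.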
Here is an example showing how to handle failure to open a file
correctly. The function open_sesame tries to open the named file
for reading and returns a stream if successful. The fopen
library function returns a null pointer if it couldn't open the file for
some reason. In that situation, open_sesame constructs an
appropriate error message using the strerror function, and
terminates the program. If we were going to make some other library
calls before passing the error code to strerror, we'd have to
save it in a local variable instead, because those other library
functions might overwrite errno in the meantime.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
FILE *
open_sesame (char *name)
{
  FILE *stream;
  errno = 0;
  stream = fopen (name, "r");
  if (!stream) {
    fprintf (stderr, "%s: Couldn't open file %s; %s\n",
             program_invocation_short_name, name, strerror (errno));
    exit (EXIT_FAILURE);
  } else
    return stream;
}
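The remark about saving the error code in a local variable can be illustrated with a variant that makes another library call before formatting the message. This is only a sketch: the function name describe_open_failure and its buffer-based interface are assumptions made for the example, and the call to fflush merely stands in for "some other library calls".

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open NAME for reading.  On failure, write a message into
   BUF (at most BUFLEN bytes) and return the saved error code;
   on success, close the file again and return 0.  */
int
describe_open_failure (const char *name, char *buf, size_t buflen)
{
  FILE *stream = fopen (name, "r");
  int saved_errno;

  if (stream != NULL)
    {
      fclose (stream);
      return 0;
    }

  /* Save errno immediately, before any other library call.  */
  saved_errno = errno;

  /* This call stands in for other library calls; it may change errno.  */
  fflush (stderr);

  snprintf (buf, buflen, "Couldn't open file %s; %s",
            name, strerror (saved_errno));
  return saved_errno;
}
```

Because the message is built from saved_errno rather than errno, it does not matter what the intervening calls do to errno.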
The GNU system provides several methods for allocating memory space under explicit program control. They vary in generality and in efficiency.
The malloc facility allows fully general dynamic allocation.
See section Unconstrained Allocation.
Obstacks are another facility, less general than malloc but more
efficient and convenient for stacklike allocation. See section Obstacks.
The function alloca lets you allocate storage dynamically that
will be freed automatically. See section Automatic Storage with Variable Size.
Dynamic memory allocation is a technique in which programs determine as they are running where to store some information. You need dynamic allocation when the number of memory blocks you need, or how long you continue to need them, depends on the data you are working on.
For example, you may need a block to store a line read from an input file; since there is no limit to how long a line can be, you must allocate the storage dynamically and make it dynamically larger as you read more of the line.
Or, you may need a block for each record or each definition in the input data; since you can't know in advance how many there will be, you must allocate a new block for each record or definition as you read it.
When you use dynamic allocation, the allocation of a block of memory is an action that the program requests explicitly. You call a function or macro when you want to allocate space, and specify the size with an argument. If you want to free the space, you do so by calling another function or macro. You can do these things whenever you want, as often as you want.
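The C language supports two kinds of memory allocation through the variables in C programs:
Static allocation is what happens when you declare a static or global variable. Each static or global variable defines one block of space, of a fixed size. The space is allocated once, when your program is started, and is never freed.
Automatic allocation happens when you declare an automatic variable, such as a function argument or a local variable. The space for an automatic variable is allocated when the compound statement containing the declaration is entered, and is freed when that compound statement is exited.
In GNU C, the length of the automatic storage can be an expression that varies. In other C implementations, it must be a constant.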
Dynamic allocation is not supported by C variables; there is no storage class "dynamic", and there can never be a C variable whose value is stored in dynamically allocated space. The only way to refer to dynamically allocated space is through a pointer. Because it is less convenient, and because the actual process of dynamic allocation requires more computation time, programmers use dynamic allocation only when neither static nor automatic allocation will serve.
For example, if you want to allocate dynamically some space to hold a
struct foobar, you cannot declare a variable of type struct
foobar whose contents are the dynamically allocated space. But you can
declare a variable of pointer type struct foobar * and assign it the
address of the space. Then you can use the operators `*' and
`->' on this pointer variable to refer to the contents of the space:
{
struct foobar *ptr
= (struct foobar *) malloc (sizeof (struct foobar));
ptr->name = x;
ptr->next = current_foobar;
current_foobar = ptr;
}
The most general dynamic allocation facility is malloc. It
allows you to allocate blocks of memory of any size at any time, make
them bigger or smaller at any time, and free the blocks individually at
any time (or never).
To allocate a block of memory, call malloc. The prototype for
this function is in `stdlib.h'.
Function: void * malloc (size_t size)
This function returns a pointer to a newly allocated block size bytes long, or a null pointer if the block could not be allocated.
The contents of the block are undefined; you must initialize it yourself
(or use calloc instead; see section Allocating Cleared Space).
Normally you would cast the value as a pointer to the kind of object
that you want to store in the block. Here we show an example of doing
so, and of initializing the space with zeros using the library function
memset (see section Copying and Concatenation):
struct foo *ptr;
...
ptr = (struct foo *) malloc (sizeof (struct foo));
if (ptr == 0) abort ();
memset (ptr, 0, sizeof (struct foo));
You can store the result of malloc into any pointer variable
without a cast, because ANSI C automatically converts the type
void * to another type of pointer when necessary. But the cast
is necessary in contexts other than assignment operators or if you might
want your code to run in traditional C.
Remember that when allocating space for a string, the argument to
malloc must be one plus the length of the string. This is
because a string is terminated with a null character that doesn't count
in the "length" of the string but does need space. For example:
char *ptr;
...
ptr = (char *) malloc (length + 1);
See section Representation of Strings, for more information about this.
If no more space is available, malloc returns a null pointer.
You should check the value of every call to malloc. It is
useful to write a subroutine that calls malloc and reports an
error if the value is a null pointer, returning only if the value is
nonzero. This function is conventionally called xmalloc. Here
it is:
void *
xmalloc (size_t size)
{
register void *value = malloc (size);
if (value == 0)
fatal ("virtual memory exhausted");
return value;
}
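The xmalloc function above calls fatal, which is not defined in this section. Here is a minimal sketch of the pair, assuming that fatal simply prints its message on stderr and exits with a failure status; a real program would probably also print the program name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: the text calls fatal but does not define it.
   This version prints MESSAGE to stderr and exits with failure.  */
void
fatal (const char *message)
{
  fprintf (stderr, "%s\n", message);
  exit (EXIT_FAILURE);
}

/* Same logic as the xmalloc shown in the text.  */
void *
xmalloc (size_t size)
{
  void *value = malloc (size);
  if (value == 0)
    fatal ("virtual memory exhausted");
  return value;
}
```

Note that on some systems a request for zero bytes may legitimately return a null pointer, which this wrapper would treat as a failure.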
Here is a real example of using malloc (by way of xmalloc).
The function savestring will copy a sequence of characters into
a newly allocated null-terminated string:
char *
savestring (const char *ptr, size_t len)
{
register char *value = (char *) xmalloc (len + 1);
memcpy (value, ptr, len);
value[len] = 0;
return value;
}
The block that malloc gives you is guaranteed to be aligned so
that it can hold any type of data. In the GNU system, the address is
always a multiple of eight; if the size of the block is 16 or more, then the
address is always a multiple of 16. Only rarely is any higher boundary
(such as a page boundary) necessary; for those cases, use
memalign or valloc (see section Allocating Aligned Memory Blocks).
Note that the memory located after the end of the block is likely to be
in use for something else; perhaps a block already allocated by another
call to malloc. If you attempt to treat the block as longer than
you asked for it to be, you are liable to destroy the data that
malloc uses to keep track of its blocks, or you may destroy the
contents of another block. If you have already allocated a block and
discover you want it to be bigger, use realloc (see section Changing the Size of a Block).
When you no longer need a block that you got with malloc, use the
function free to make the block available to be allocated again.
The prototype for this function is in `stdlib.h'.
Function: void free (void *ptr)
The free function deallocates the block of storage pointed at
by ptr.
Function: void cfree (void *ptr)
This function does the same thing as free. It's provided for
backward compatibility with SunOS; you should use free instead.
Freeing a block alters the contents of the block. Do not expect to find any data (such as a pointer to the next block in a chain of blocks) in the block after freeing it. Copy whatever you need out of the block before freeing it! Here is an example of the proper way to free all the blocks in a chain, and the strings that they point to:
struct chain
  {
    struct chain *next;
    char *name;
  };
void
free_chain (struct chain *chain)
{
  while (chain != 0)
    {
      /* Fetch the link before the block is freed. */
      struct chain *next = chain->next;
      free (chain->name);
      free (chain);
      chain = next;
    }
}
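The free_chain function assumes that every block in the chain, and every name, was allocated separately with malloc. Here is a sketch of a matching constructor together with free_chain; the name push_chain is hypothetical and chosen only for this illustration.

```c
#include <stdlib.h>
#include <string.h>

struct chain
{
  struct chain *next;
  char *name;
};

/* Hypothetical constructor: push a copy of NAME onto the front of
   HEAD and return the new head, or a null pointer on failure
   (in which case HEAD is left unchanged).  */
struct chain *
push_chain (struct chain *head, const char *name)
{
  struct chain *node = malloc (sizeof *node);
  if (node == 0)
    return 0;
  node->name = malloc (strlen (name) + 1);
  if (node->name == 0)
    {
      free (node);
      return 0;
    }
  strcpy (node->name, name);
  node->next = head;
  return node;
}

/* Free every block and every name, reading next before freeing.  */
void
free_chain (struct chain *chain)
{
  while (chain != 0)
    {
      struct chain *next = chain->next;
      free (chain->name);
      free (chain);
      chain = next;
    }
}
```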
Occasionally, free can actually return memory to the operating
system and make the process smaller. Usually, all it can do is allow a
later call to malloc to reuse the space. In the meantime,
the space remains in your program as part of a free-list used internally
by malloc.
There is no point in freeing blocks at the end of a program, because all of the program's space is given back to the system when the process terminates.
Often you do not know for certain how big a block you will ultimately need at the time you must begin to use the block. For example, the block might be a buffer that you use to hold a line being read from a file; no matter how long you make the buffer initially, you may encounter a line that is longer.
You can make the block longer by calling realloc. This function
is declared in `stdlib.h'.
Function: void * realloc (void *ptr, size_t newsize)
The realloc function changes the size of the block whose address is
ptr to be newsize.
Since the space after the end of the block may be in use, realloc
may find it necessary to copy the block to a new address where more free
space is available. The value of realloc is the new address of the
block. If the block needs to be moved, realloc copies the old
contents.
Like malloc, realloc may return a null pointer if no
memory space is available to make the block bigger. When this happens,
the original block is untouched; it has not been modified or relocated.
In most cases it makes no difference what happens to the original block
when realloc fails, because the application program cannot continue
when it is out of memory, and the only thing to do is to give a fatal error
message. Often it is convenient to write and use a subroutine,
conventionally called xrealloc, that takes care of the error message
as xmalloc does for malloc:
void *
xrealloc (void *ptr, size_t size)
{
register void *value = realloc (ptr, size);
if (value == 0)
fatal ("Virtual memory exhausted");
return value;
}
You can also use realloc to make a block smaller. The reason you
would do this is to avoid tying up a lot of memory space when only a little
is needed. Making a block smaller sometimes necessitates copying it, so it
can fail if no other space is available.
If the new size you specify is the same as the old size, realloc
is guaranteed to change nothing and return the same address that you gave.
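The line-buffer scenario that motivates realloc can be sketched as follows. The function name read_line, the initial size of 16 and the doubling policy are all assumptions made for this example. Unlike xrealloc, it returns a null pointer instead of exiting when memory runs out, so it keeps the old pointer until realloc has succeeded.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line from STREAM into a newly allocated, null-terminated
   string, without the trailing newline.  Returns a null pointer at
   end of file (with nothing read) or if memory runs out.  The caller
   frees the result.  */
char *
read_line (FILE *stream)
{
  size_t size = 16;             /* current capacity of the block */
  size_t used = 0;              /* bytes stored so far */
  char *line = malloc (size);
  int c;

  if (line == 0)
    return 0;

  while ((c = getc (stream)) != EOF && c != '\n')
    {
      if (used + 1 == size)
        {
          /* Keep the old pointer until realloc succeeds, because
             on failure the original block is left untouched.  */
          char *bigger = realloc (line, size * 2);
          if (bigger == 0)
            {
              free (line);
              return 0;
            }
          line = bigger;
          size *= 2;
        }
      line[used++] = (char) c;
    }

  if (c == EOF && used == 0)
    {
      free (line);
      return 0;
    }
  line[used] = '\0';
  return line;
}
```

Doubling the size each time keeps the number of calls to realloc small even for very long lines.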
The function calloc allocates memory and clears it to zero. It
is declared in `stdlib.h'.
Function: void * calloc (size_t count, size_t eltsize)
This function allocates a block long enough to contain a vector of
count elements, each of size eltsize. Its contents are
cleared to zero before calloc returns.
You could define calloc as follows:
void *
calloc (size_t count, size_t eltsize)
{
size_t size = count * eltsize;
void *value = malloc (size);
if (value != 0)
memset (value, 0, size);
return value;
}
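The definition above multiplies count by eltsize without checking whether the product fits in a size_t. Here is a sketch of a variant that refuses such requests. The name checked_calloc is hypothetical; this is not how any particular library implements calloc.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Like the calloc definition in the text, but refuse requests whose
   total size would not fit in a size_t.  The name is hypothetical.  */
void *
checked_calloc (size_t count, size_t eltsize)
{
  size_t size;
  void *value;

  if (eltsize != 0 && count > SIZE_MAX / eltsize)
    return 0;                   /* count * eltsize would overflow */

  size = count * eltsize;
  value = malloc (size);
  if (value != 0)
    memset (value, 0, size);
  return value;
}
```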
We rarely use calloc today, because it is equivalent to such a
simple combination of other features that are more often used. It is a
historical holdover that is not quite obsolete.
To make the best use of malloc, it helps to know that the GNU
version of malloc always dispenses small amounts of memory in
blocks whose sizes are powers of two. It keeps separate pools for each
power of two. This holds for sizes up to a page size. Therefore, if
you are free to choose the size of a small block in order to make
malloc more efficient, make it a power of two.
Once a page is split up for a particular block size, it can't be reused for another size unless all the blocks in it are freed. In many programs, this is unlikely to happen. Thus, you can sometimes make a program use memory more efficiently by using blocks of the same size for many different purposes.
When you ask for memory blocks of a page or larger, malloc uses a
different strategy; it rounds the size up to a multiple of a page, and
it can coalesce and split blocks as needed.
The reason for the two strategies is that it is important to allocate and free small blocks as fast as possible, but speed is less important for a large block since the program normally spends a fair amount of time using it. Also, large blocks are normally fewer in number. Therefore, for large blocks, it makes sense to use a method which takes more time to minimize the wasted space.
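If you are free to choose the size of a small block, a helper like the following can round a requested size up to a power of two. This is only a sketch; the name round_up_pow2 is hypothetical, and the benefit depends on the allocator behaving as described in this section, which other versions of malloc may not do.

```c
#include <stddef.h>

/* Return the smallest power of two that is >= N (and at least 1).
   Returns 0 if no such power of two fits in a size_t.  */
size_t
round_up_pow2 (size_t n)
{
  size_t p = 1;
  while (p < n)
    {
      if (p > (size_t) -1 / 2)
        return 0;               /* doubling would overflow */
      p *= 2;
    }
  return p;
}
```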
The address of a block returned by malloc or realloc in
the GNU system is always a multiple of eight. If you need a block whose
address is a multiple of a higher power of two than that, use
memalign or valloc. These functions are declared in
`stdlib.h'.
With the GNU library, you can use free to free the blocks that
memalign and valloc return. That does not work in BSD,
however--BSD does not provide any way to free such blocks.
Function: void * memalign (size_t boundary, size_t size)
The memalign function allocates a block of size bytes whose
address is a multiple of boundary. The boundary must be a
power of two! The function memalign works by calling
malloc to allocate a somewhat larger block, and then returning an
address within the block that is on the specified boundary.
Function: void * valloc (size_t size)
Using valloc is like using memalign and passing the page size
as the value of the boundary argument.
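Here is a sketch of a small wrapper that checks the power-of-two requirement before calling memalign. It assumes the argument order declared in current versions of the GNU library's `malloc.h' (boundary first, then size); the names alloc_aligned and is_aligned are hypothetical.

```c
#include <malloc.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate SIZE bytes whose address is a multiple of BOUNDARY, which
   must be a power of two.  Returns a null pointer on failure or if
   BOUNDARY is not a power of two.  */
void *
alloc_aligned (size_t boundary, size_t size)
{
  if (boundary == 0 || (boundary & (boundary - 1)) != 0)
    return 0;
  return memalign (boundary, size);
}

/* Report whether PTR is a multiple of BOUNDARY.  */
int
is_aligned (const void *ptr, size_t boundary)
{
  return (uintptr_t) ptr % boundary == 0;
}
```

With the GNU library, a block obtained this way can be released with free, as described above.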
You can ask malloc to check the consistency of dynamic storage by
using the mcheck function. This function is a GNU extension,
declared in `malloc.h'.
Function: int mcheck (void (*abortfn) (void))
Calling mcheck tells malloc to perform occasional
consistency checks. These will catch things such as writing
past the end of a block that was allocated with malloc.
The abortfn argument is the function to call when an inconsistency
is found. If you supply a null pointer, the abort function is
used.
It is too late to begin allocation checking once you have allocated
anything with malloc. So mcheck does nothing in that
case. The function returns -1 if you call it too late, and
0 otherwise (when it is successful).
The easiest way to arrange to call mcheck early enough is to use
the option `-lmcheck' when you link your program.
The GNU C library lets you modify the behavior of malloc,
realloc, and free by specifying appropriate hook
functions. You can use these hooks to help you debug programs that use
dynamic storage allocation, for example.
The hook variables are declared in `malloc.h'.
Variable: __malloc_hook
The value of this variable is a pointer to the function that malloc
uses whenever it is called. You should define this function to look
like malloc; that is, like:
void *function (size_t size)
Variable: __realloc_hook
The value of this variable is a pointer to the function that realloc
uses whenever it is called. You should define this function to look
like realloc; that is, like:
void *function (void *ptr, size_t size)
Variable: __free_hook
The value of this variable is a pointer to the function that free
uses whenever it is called. You should define this function to look
like free; that is, like:
void function (void *ptr)
You must make sure that the function you install as a hook for one of these functions does not call that function recursively without restoring the old value of the hook first! Otherwise, your program will get stuck in an infinite recursion.
Here is an example showing how to use __malloc_hook properly. It
installs a function that prints out information every time malloc
is called.
static void *(*old_malloc_hook) (size_t);
static void *
my_malloc_hook (size_t size)
{
  void *result;
  __malloc_hook = old_malloc_hook;
  result = malloc (size);
  __malloc_hook = my_malloc_hook;
  printf ("malloc (%u) returns %p\n", (unsigned int) size, result);
  return result;
}
main ()
{
  ...
  old_malloc_hook = __malloc_hook;
  __malloc_hook = my_malloc_hook;
  ...
}
The mcheck function (see section Heap Consistency Checking) works by
installing such hooks.
You can get information about dynamic storage allocation by calling the
mstats function. This function and its associated data type are
declared in `malloc.h'; they are a GNU extension.
This structure type is used to return information about the dynamic storage allocator. It contains the following members:
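size_t bytes_total
This is the total size of memory managed by malloc, in bytes.
size_t chunks_used
This is the number of chunks in use. (The storage allocator internally gets chunks of memory from the operating system, and then carves them up to satisfy individual malloc requests; see section Efficiency Considerations for malloc.)
size_t bytes_used
This is the number of bytes in use.
size_t chunks_free
This is the number of chunks which are free.
size_t bytes_free
This is the number of bytes which are free.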
Function: struct mstats mstats (void)
This function returns information about the current dynamic memory usage
in a structure of type struct mstats.
Here is a summary of the functions that work with malloc:
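void *malloc (size_t size)
Allocate a block of size bytes. See section Basic Storage Allocation.
void free (void *addr)
Free a block previously allocated by malloc. See section Freeing Memory Allocated with malloc.
void *realloc (void *addr, size_t size)
Make a block previously allocated by malloc larger or smaller,
possibly by copying it to a new location. See section Changing the Size of a Block.
void *calloc (size_t count, size_t eltsize)
Allocate a block of count * eltsize bytes using malloc, and set its contents to zero. See section Allocating Cleared Space.
void *valloc (size_t size)
Allocate a block of size bytes, starting on a page boundary. See section Allocating Aligned Memory Blocks.
void *memalign (size_t boundary, size_t size)
Allocate a block of size bytes, starting on an address that is a multiple of boundary. See section Allocating Aligned Memory Blocks.
int mcheck (void (*abortfn) (void))
Tell malloc to perform occasional consistency checks on
dynamically allocated memory, and to call abortfn when an
inconsistency is found. See section Heap Consistency Checking.
void *(*__malloc_hook) (size_t size)
A pointer to a function that malloc uses whenever it is called.
void *(*__realloc_hook) (void *ptr, size_t size)
A pointer to a function that realloc uses whenever it is called.
void (*__free_hook) (void *ptr)
A pointer to a function that free uses whenever it is called.
struct mstats mstats (void)
Return information about the current dynamic memory usage of malloc.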
An obstack is a pool of memory containing a stack of objects. You can create any number of separate obstacks, and then allocate objects in specified obstacks. Within each obstack, the last object allocated must always be the first one freed, but distinct obstacks are independent of each other.
Aside from this one constraint of order of freeing, obstacks are totally general: an obstack can contain any number of objects of any size. They are implemented with macros, so allocation is usually very fast, provided the objects are mostly small. And the only space overhead per object is the padding needed to start each object on a suitable boundary.
The utilities for manipulating obstacks are declared in the header file `obstack.h'.
An obstack is represented by a data structure of type struct
obstack. This structure has a small fixed size; it records the status
of the obstack and how to find the space in which objects are allocated.
It does not contain any of the objects themselves. You should not try
to access the contents of the structure directly; use only the functions
described in this chapter.
You can declare variables of type struct obstack and use them as
obstacks, or you can allocate obstacks dynamically like any other kind
of object. Dynamic allocation of obstacks allows your program to have a
variable number of different stacks. (You can even allocate an
obstack structure in another obstack, but this is rarely useful.)
All the functions that work with obstacks require you to specify which
obstack to use. You do this with a pointer of type struct obstack
*. In the following, we often say "an obstack" when strictly
speaking the object at hand is such a pointer.
The objects in the obstack are packed into large blocks called
chunks. The struct obstack structure points to a chain of
the chunks currently in use.
The obstack library obtains a new chunk whenever you allocate an object
that won't fit in the previous chunk. Since the obstack library manages
chunks automatically, you don't need to pay much attention to them, but
you do need to supply a function which the obstack library should use to
get a chunk. Usually you supply a function which uses malloc
directly or indirectly. You must also supply a function to free a chunk.
These matters are described in the following section.
Each source file in which you plan to use the obstack functions must include the header file `obstack.h', like this:
#include <obstack.h>
Also, if the source file uses the macro obstack_init, it must
declare or define two functions or macros that will be called by the
obstack library. One, obstack_chunk_alloc, is used to allocate the
chunks of memory into which objects are packed. The other,
obstack_chunk_free, is used to return chunks when the objects in
them are freed.
Usually these are defined to use malloc via the intermediary
xmalloc (see section Unconstrained Allocation). This is done with
the following pair of macro definitions:
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free
Though the storage you get using obstacks really comes from malloc,
using obstacks is faster because malloc is called less often, for
larger blocks of memory. See section Obstack Chunks, for full details.
At run time, before the program can use a struct obstack object
as an obstack, it must initialize the obstack by calling
obstack_init.
Function: void obstack_init (struct obstack *obstack_ptr)
Initialize obstack obstack_ptr for allocation of objects.
Here are two examples of how to allocate the space for an obstack and initialize it. First, an obstack that is a static variable:
struct obstack myobstack;
...
obstack_init (&myobstack);
Second, an obstack that is itself dynamically allocated:
struct obstack *myobstack_ptr
  = (struct obstack *) xmalloc (sizeof (struct obstack));
obstack_init (myobstack_ptr);
The most direct way to allocate an object in an obstack is with
obstack_alloc, which is invoked almost like malloc.
Function: void * obstack_alloc (struct obstack *obstack_ptr, size_t size)
This allocates an uninitialized block of size bytes in an obstack
and returns its address. Here obstack_ptr specifies which obstack
to allocate the block in; it is the address of the struct obstack
object which represents the obstack. Each obstack function or macro
requires you to specify an obstack_ptr as the first argument.
For example, here is a function that allocates a copy of its argument
string in a specific obstack, which is the variable string_obstack:
struct obstack string_obstack;
char *
copystring (char *string)
{
  size_t len = strlen (string) + 1;
  char *s = (char *) obstack_alloc (&string_obstack, len);
  /* Copy the terminating null character too. */
  memcpy (s, string, len);
  return s;
}
To allocate a block with specified contents, use the function
obstack_copy, declared like this:
Function: void * obstack_copy (struct obstack *obstack_ptr, void *address, size_t size)
This allocates a block and initializes it by copying size bytes of data starting at address.
Function: void * obstack_copy0 (struct obstack *obstack_ptr, void *address, size_t size)
Like obstack_copy, but appends an extra byte containing a null
character. This extra byte is not counted in the argument size.
The obstack_copy0 function is convenient for copying a sequence
of characters into an obstack as a null-terminated string. Here is an
example of its use:
char *
obstack_savestring (char *addr, size_t size)
{
return obstack_copy0 (&myobstack, addr, size);
}
Contrast this with the previous example of savestring using
malloc (see section Basic Storage Allocation).
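Putting the pieces of this section together, here is a sketch of a small self-contained module that keeps strings in its own obstack. The names name_obstack, names_init, names_save and names_free_all are hypothetical, and plain malloc is used for chunk allocation instead of xmalloc only to keep the example short.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

/* Chunk functions required by obstack_init; plain malloc and free
   are used here instead of xmalloc to keep the sketch short.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

static struct obstack name_obstack;     /* hypothetical obstack */

/* Must be called once before any allocation in name_obstack.  */
void
names_init (void)
{
  obstack_init (&name_obstack);
}

/* Copy the first LEN characters of ADDR into name_obstack as a
   null-terminated string.  */
char *
names_save (const char *addr, size_t len)
{
  return (char *) obstack_copy0 (&name_obstack, addr, len);
}

/* Release everything allocated in name_obstack.  */
void
names_free_all (void)
{
  obstack_free (&name_obstack, NULL);
}
```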
To free an object allocated in an obstack, use the function
obstack_free. Since the obstack is a stack of objects, freeing
one object automatically frees all other objects allocated more recently
in the same obstack.
Function: void obstack_free (struct obstack *obstack_ptr, void *object)
If object is a null pointer, everything allocated in the obstack is freed. Otherwise, object must be the address of an object allocated in the obstack. Then object is freed, along with everything allocated in obstack since object.
Note that if object is a null pointer, the result is an
uninitialized obstack. To free all storage in an obstack but leave it
valid for further allocation, call obstack_free with the address
of the first object allocated on the obstack:
obstack_free (obstack_ptr, first_object_allocated_ptr);
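The stack discipline of obstack_free can be observed with a small experiment. This sketch assumes the GNU implementation and blocks small enough to share one chunk, in which case the space of a freed object is handed out again by the next allocation; the function name demo_stack_free is hypothetical.

```c
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Allocate three blocks in OB, free the second one, and report
   where the next allocation lands.  Freeing the second block also
   frees the third, so the next block of the same size is expected
   to reuse the second block's address.  Returns 1 if it does.  */
int
demo_stack_free (struct obstack *ob)
{
  void *first = obstack_alloc (ob, 16);
  void *second = obstack_alloc (ob, 16);
  void *third = obstack_alloc (ob, 16);
  void *again;

  (void) first;
  (void) third;
  obstack_free (ob, second);    /* frees second and third */
  again = obstack_alloc (ob, 16);
  return again == second;
}
```

After the call to obstack_free, the pointers second and third must no longer be used to refer to the old objects; only first is still valid.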
Recall that the objects in an obstack are grouped into chunks. When all the objects in a chunk become free, the obstack library automatically frees the chunk (see section Preparing for Using Obstacks). Then other obstacks, or non-obstack allocation, can reuse the space of the chunk.
The interfaces for using obstacks may be defined either as functions or as macros, depending on the compiler. The obstack facility works with all C compilers, including both ANSI C and traditional C, but there are precautions you must take if you plan to use compilers other than GNU C.
If you are using an old-fashioned non-ANSI C compiler, all the obstack "functions" are actually defined only as macros. You can call these macros like functions, but you cannot use them in any other way (for example, you cannot take their address).
Calling the macros requires a special precaution: namely, the first operand (the obstack pointer) may not contain any side effects, because it may be computed more than once. For example, if you write this:
obstack_alloc (get_obstack (), 4);
you will find that get_obstack may be called several times.
If you use *obstack_list_ptr++ as the obstack pointer argument,
you will get very strange results since the incrementation may occur
several times.
In ANSI C, each function has both a macro definition and a function definition. The function definition is used if you take the address of the function without calling it. An ordinary call uses the macro definition by default, but you can request the function definition instead by writing the function name in parentheses, as shown here:
char *x;
void *(*funcp) ();
/* Use the macro. */
x = (char *) obstack_alloc (obptr, size);
/* Call the function. */
x = (char *) (obstack_alloc) (obptr, size);
/* Take the address of the function. */
funcp = obstack_alloc;
This is the same situation that exists in ANSI C for the standard library functions. See section Macro Definitions of Functions.
Warning: When you do use the macros, you must observe the precaution of avoiding side effects in the first operand, even in ANSI C.
If you use the GNU C compiler, this precaution is not necessary, because various language extensions in GNU C permit defining the macros so as to compute each argument only once.
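A simple way to observe the precaution with any compiler is to evaluate the obstack pointer expression once, store it in a local variable, and pass that variable to the macro. Here is a sketch; get_obstack is the hypothetical function from the example above, given a call counter purely for illustration, and alloc_from_pool is likewise a made-up name.

```c
#include <obstack.h>
#include <stdlib.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

static struct obstack pool;     /* hypothetical obstack */
static int pool_ready;
int get_obstack_calls;          /* counts calls, for illustration */

/* Hypothetical accessor with a side effect: it counts its calls
   and initializes the obstack on first use.  */
struct obstack *
get_obstack (void)
{
  get_obstack_calls++;
  if (!pool_ready)
    {
      obstack_init (&pool);
      pool_ready = 1;
    }
  return &pool;
}

/* Evaluate get_obstack exactly once, then pass the plain variable
   to the macro, so repeated evaluation of the operand is harmless.  */
void *
alloc_from_pool (size_t size)
{
  struct obstack *ob = get_obstack ();
  return obstack_alloc (ob, size);
}
```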
Because storage in obstack chunks is used sequentially, it is possible to build up an object step by step, adding one or more bytes at a time to the end of the object. With this technique, you do not need to know how much data you will put in the object until you come to the end of it. We call this the technique of growing objects. The special functions for adding data to the growing object are described in this section.
You don't need to do anything special when you start to grow an object.
Using one of the functions to add data to the object automatically
starts it. However, it is necessary to say explicitly when the object is
finished. This is done with the function obstack_finish.
The actual address of the object thus built up is not known until the object is finished. Until then, it always remains possible that you will add so much data that the object must be copied into a new chunk.
While the obstack is in use for a growing object, you cannot use it for ordinary allocation of another object. If you try to do so, the space already added to the growing object will become part of the other object.
Function: void obstack_blank (struct obstack *obstack_ptr, size_t size)
The most basic function for adding to a growing object is
obstack_blank, which adds space without initializing it.
Function: void obstack_grow (struct obstack *obstack_ptr, void *data, size_t size)
To add a block of initialized space, use obstack_grow, which is
the growing-object analogue of obstack_copy. It adds size
bytes of data to the growing object, copying the contents from
data.
Function: void obstack_grow0 (struct obstack *obstack_ptr, void *data, size_t size)
This is the growing-object analogue of obstack_copy0. It adds
size bytes copied from data, followed by an additional null
character.
Function: void obstack_1grow (struct obstack *obstack_ptr, char c)
To add one character at a time, use the function obstack_1grow.
It adds a single byte containing c to the growing object.
Function: void * obstack_finish (struct obstack *obstack_ptr)
When you are finished growing the object, use the function
obstack_finish to close it off and return its final address.
Once you have finished the object, the obstack is available for ordinary allocation or for growing another object.
When you build an object by growing it, you will probably need to know
afterward how long it became. You need not keep track of this as you grow
the object, because you can find out the length from the obstack just
before finishing the object with the function obstack_object_size,
declared as follows:
Function: size_t obstack_object_size (struct obstack *obstack_ptr)
This function returns the current size of the growing object, in bytes.
Remember to call this function before finishing the object.
After it is finished, obstack_object_size will return zero.
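Here is a sketch that uses the growing-object functions described in this section to build a string piece by piece, asking for its size just before finishing it. The function name repeat_word is hypothetical, and plain malloc is assumed for chunk allocation.

```c
#include <obstack.h>
#include <stdlib.h>
#include <string.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Build in OB a null-terminated string made of COUNT copies of WORD
   separated by single spaces.  Stores the length of the finished
   object, including the null character, in *SIZE_OUT.  */
char *
repeat_word (struct obstack *ob, const char *word, int count,
             size_t *size_out)
{
  int i;

  for (i = 0; i < count; i++)
    {
      if (i > 0)
        obstack_1grow (ob, ' ');
      obstack_grow (ob, word, strlen (word));
    }
  obstack_1grow (ob, '\0');

  /* Ask for the size before finishing; afterward it would be zero.  */
  *size_out = obstack_object_size (ob);
  return (char *) obstack_finish (ob);
}
```

The caller never needs to know the final length in advance; the obstack moves the object to a new chunk if it outgrows the current one.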
If you have started growing an object and wish to cancel it, you should finish it and then free it, like this:
obstack_free (obstack_ptr, obstack_finish (obstack_ptr));
This has no effect if no object was growing.
You can use obstack_blank with a negative size argument to make
the current object smaller. Just don't try to shrink it beyond zero
length--there's no telling what will happen if you do that.
The usual functions for growing objects incur overhead for checking whether there is room for the new growth in the current chunk. If you are frequently constructing objects in small steps of growth, this overhead can be significant.
You can reduce the overhead by using special "fast growth" functions that grow the object without checking. In order to have a robust program, you must do the checking yourself. If you do this checking in the simplest way each time you are about to add data to the object, you have not saved anything, because that is what the ordinary growth functions do. But if you can arrange to check less often, or check more efficiently, then you make the program faster.
The function obstack_room returns the amount of room available
in the current chunk. It is declared as follows:
Function: size_t obstack_room (struct obstack *obstack_ptr)
This returns the number of bytes that can be added safely to the current growing object (or to an object about to be started) in obstack obstack_ptr using the fast growth functions.
While you know there is room, you can use these fast growth functions for adding data to a growing object:
Function: void obstack_1grow_fast (struct obstack *obstack_ptr, char c)
The function obstack_1grow_fast adds one byte containing the
character c to the growing object in obstack obstack_ptr.
Function: void obstack_blank_fast (struct obstack *obstack_ptr, size_t size)
The function obstack_blank_fast adds size bytes to the
growing object in obstack obstack_ptr without initializing them.
When you check for space using obstack_room and there is not
enough room for what you want to add, the fast growth functions
are not safe. In this case, simply use the corresponding ordinary
growth function instead. Very soon this will copy the object to a
new chunk; then there will be lots of room available again.
So, each time you use an ordinary growth function, check afterward for
sufficient space using obstack_room. Once the object is copied
to a new chunk, there will be plenty of space again, so the program will
start using the fast growth functions again.
Here is an example:
void
add_string (struct obstack *obstack, char *ptr, size_t len)
{
  while (len > 0)
    {
      size_t room = obstack_room (obstack);
      if (room == 0)
        {
          /* Not enough room.  Add one character slowly,
             which may copy to a new chunk and make room.  */
          obstack_1grow (obstack, *ptr++);
          len--;
        }
      else
        {
          /* Add fast as much as we have room for.  */
          if (room > len)
            room = len;
          len -= room;
          while (room-- > 0)
            obstack_1grow_fast (obstack, *ptr++);
        }
    }
}
Here are functions that provide information on the current status of allocation in an obstack. You can use them to learn about an object while still growing it.
Function: void * obstack_base (struct obstack *obstack_ptr)
This function returns the tentative address of the beginning of the currently growing object in obstack_ptr. If you finish the object immediately, it will have that address. If you make it larger first, it may outgrow the current chunk--then its address will change!
If no object is growing, this value says where the next object you allocate will start (once again assuming it fits in the current chunk).
Function: void * obstack_next_free (struct obstack *obstack_ptr)
This function returns the address of the first free byte in the current
chunk of obstack obstack_ptr. This is the end of the currently
growing object. If no object is growing, obstack_next_free
returns the same value as obstack_base.
Function: size_t obstack_object_size (struct obstack *obstack_ptr)
This function returns the size in bytes of the currently growing object. This is equivalent to
obstack_next_free (obstack_ptr) - obstack_base (obstack_ptr)
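For illustration, here is a small sketch that inspects an object while it is still growing. The helper last_grown_byte and the allocator xmalloc are hypothetical names introduced for this example only.

```c
#include <assert.h>
#include <obstack.h>
#include <stdlib.h>

/* Assumed helper: like malloc, but never returns a null pointer.  */
void *
xmalloc (size_t size)
{
  void *block = malloc (size);
  if (block == NULL)
    abort ();
  return block;
}

#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free

/* Return the last byte added to the object growing in OB,
   or -1 if the growing object is still empty.  */
int
last_grown_byte (struct obstack *ob)
{
  assert (ob != NULL);
  if (obstack_object_size (ob) == 0)
    return -1;
  /* The first free byte sits just past the end of the object.  */
  return ((unsigned char *) obstack_next_free (ob))[-1];
}
```

A check like this can be useful when deciding, for instance, whether a separator must be appended before growing the object further.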
Each obstack has an alignment boundary; each object allocated in the obstack automatically starts on an address that is a multiple of the specified boundary. By default, this boundary is 4 bytes.
To access an obstack's alignment boundary, use the macro
obstack_alignment_mask, whose function prototype looks like
this:
Macro: int obstack_alignment_mask (struct obstack *obstack_ptr)
The value is a bit mask; a bit that is 1 indicates that the corresponding bit in the address of an object should be 0. The mask value should be one less than a power of 2; the effect is that all object addresses are multiples of that power of 2. The default value of the mask is 3, so that addresses are multiples of 4. A mask value of 0 means an object can start on any multiple of 1 (that is, no alignment is required).
The expansion of the macro obstack_alignment_mask is an lvalue,
so you can alter the mask by assignment. For example, this statement:
obstack_alignment_mask (obstack_ptr) = 0;
has the effect of turning off alignment processing in the specified obstack.
Note that a change in alignment mask does not take effect until
after the next time an object is allocated or finished in the
obstack. If you are not growing an object, you can make the new
alignment mask take effect immediately by calling obstack_finish.
This will finish a zero-length object and then do proper alignment for
the next object.
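The following sketch puts these two steps together: it assigns a new mask and then finishes a zero-length object so that the mask applies right away. The helpers set_alignment, is_aligned, and xmalloc are assumptions made for this example, not part of the obstack interface.

```c
#include <assert.h>
#include <obstack.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed helper: like malloc, but never returns a null pointer.  */
void *
xmalloc (size_t size)
{
  void *block = malloc (size);
  if (block == NULL)
    abort ();
  return block;
}

#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free

/* Make every later object in OB start on a multiple of BOUNDARY,
   which must be a power of 2.  No object may be growing in OB.  */
void
set_alignment (struct obstack *ob, int boundary)
{
  assert (ob != NULL);
  assert (boundary > 0 && (boundary & (boundary - 1)) == 0);
  assert (obstack_object_size (ob) == 0);
  obstack_alignment_mask (ob) = boundary - 1;
  /* Finish a zero-length object so the new mask applies at once.  */
  (void) obstack_finish (ob);
}

/* Return nonzero if ADDRESS is a multiple of BOUNDARY.  */
int
is_aligned (const void *address, int boundary)
{
  return ((uintptr_t) address & (uintptr_t) (boundary - 1)) == 0;
}
```

After set_alignment (&ob, 64), for example, objects returned by obstack_alloc should satisfy is_aligned (object, 64).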
Obstacks work by allocating space for themselves in large chunks, and then parceling out space in the chunks to satisfy your requests. Chunks are normally 4096 bytes long unless you specify a different chunk size. The chunk size includes 8 bytes of overhead that are not actually used for storing objects. Regardless of the specified size, longer chunks will be allocated when necessary for long objects.
The obstack library allocates chunks by calling the function
obstack_chunk_alloc, which you must define. When a chunk is no
longer needed because you have freed all the objects in it, the obstack
library frees the chunk by calling obstack_chunk_free, which you
must also define.
These two must be defined (as macros) or declared (as functions) in each
source file that uses obstack_init (see section Creating Obstacks).
Most often they are defined as macros like this:
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free
Note that these are simple macros (no arguments). Macro definitions with
arguments will not work! It is necessary that obstack_chunk_alloc
or obstack_chunk_free, alone, expand into a function name if it is
not itself a function name.
The function that actually implements obstack_chunk_alloc cannot
return "failure" in any fashion, because the obstack library is not
prepared to handle failure. Therefore, malloc itself is not
suitable. If the function cannot obtain space, it should either
terminate the process (see section Program Termination) or do a nonlocal
exit using longjmp (see section Non-Local Exits).
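A minimal sketch of such a function, assuming you are content to terminate the process when memory runs out. The name xmalloc is a common convention rather than something the library is assumed to provide; the message text and the helper init_pool are likewise invented for this example.

```c
#include <assert.h>
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>

/* Like malloc, but never returns failure: if no space can be
   obtained, report the problem and terminate the process.  */
void *
xmalloc (size_t size)
{
  /* malloc (0) may legitimately return a null pointer, so ask
     for at least one byte.  */
  void *block = malloc (size != 0 ? size : 1);
  if (block == NULL)
    {
      fputs ("virtual memory exhausted\n", stderr);
      exit (EXIT_FAILURE);
    }
  return block;
}

/* Simple macros without arguments, as the obstack library requires.  */
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free

/* Hypothetical helper: set up OB so it draws its chunks from xmalloc.  */
void
init_pool (struct obstack *ob)
{
  assert (ob != NULL);
  obstack_init (ob);
}
```

A variant that performs a nonlocal exit would call longjmp where this sketch calls exit.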
If you allocate chunks with malloc, the chunk size should be a
power of 2. The default chunk size, 4096, was chosen because it is long
enough to satisfy many typical requests on the obstack yet short enough
not to waste too much memory in the portion of the last chunk not yet used.
Macro: size_t obstack_chunk_size (struct obstack *obstack_ptr)
This returns the chunk size of the given obstack.
Since this macro expands to an lvalue, you can specify a new chunk size by assigning it a new value. Doing so does not affect the chunks already allocated, but will change the size of chunks allocated for that particular obstack in the future. It is unlikely to be useful to make the chunk size smaller, but making it larger might improve efficiency if you are allocating many objects whose size is comparable to the chunk size. Here is how to do so cleanly:
if (obstack_chunk_size (obstack_ptr) < new_chunk_size)
  obstack_chunk_size (obstack_ptr) = new_chunk_size;
Here is a summary of all the functions associated with obstacks. Each
takes the address of an obstack (struct obstack *) as its first
argument.
void obstack_init (struct obstack *obstack_ptr)
void *obstack_alloc (struct obstack *obstack_ptr, size_t size)
void *obstack_copy (struct obstack *obstack_ptr, void *address, size_t size)
void *obstack_copy0 (struct obstack *obstack_ptr, void *address, size_t size)
void obstack_free (struct obstack *obstack_ptr, void *object)
void obstack_blank (struct obstack *obstack_ptr, size_t size)
void obstack_grow (struct obstack *obstack_ptr, void *address, size_t size)
void obstack_grow0 (struct obstack *obstack_ptr, void *address, size_t size)
void obstack_1grow (struct obstack *obstack_ptr, char data_char)
void *obstack_finish (struct obstack *obstack_ptr)
size_t obstack_object_size (struct obstack *obstack_ptr)
void obstack_blank_fast (struct obstack *obstack_ptr, size_t size)
void obstack_1grow_fast (struct obstack *obstack_ptr, char data_char)
size_t obstack_room (struct obstack *obstack_ptr)
int obstack_alignment_mask (struct obstack *obstack_ptr)
size_t obstack_chunk_size (struct obstack *obstack_ptr)
void *obstack_base (struct obstack *obstack_ptr)
void *obstack_next_free (struct obstack *obstack_ptr)
The function alloca supports a kind of half-dynamic allocation in
which blocks are allocated dynamically but freed automatically.
Allocating a block with alloca is an explicit action; you can
allocate as many blocks as you wish, and compute the size at run time. But
all the blocks are freed when you exit the function that alloca was
called from, just as if they were automatic variables declared in that
function. There is no way to free the space explicitly.
The prototype for alloca is in `stdlib.h'. This function is
a BSD extension.
Function: void * alloca (size_t size)
The return value of alloca is the address of a block of size
bytes of storage, allocated in the stack frame of the calling function.
Do not use alloca inside the arguments of a function call--you
will get unpredictable results, because the stack space for the
alloca would appear on the stack in the middle of the space for
the function arguments. An example of what to avoid is foo (x,
alloca (4), y).
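One way to stay clear of this problem, sketched here, is to call alloca in a statement of its own and pass the resulting pointer afterward. The function foo below is only a stand-in for the call shown above, and the sketch includes `alloca.h' directly, which is where current GNU systems are assumed to declare alloca.

```c
#include <alloca.h>
#include <assert.h>
#include <string.h>

/* Hypothetical callee standing in for `foo' in the text.  */
size_t
foo (int x, const char *buffer, int y)
{
  assert (buffer != NULL);
  return strlen (buffer) + (size_t) x + (size_t) y;
}

size_t
call_foo_safely (void)
{
  /* Allocate in a statement of its own, before the call begins...  */
  char *buffer = (char *) alloca (4);
  strcpy (buffer, "abc");
  /* ...then pass the resulting pointer as an ordinary argument.  */
  return foo (1, buffer, 2);
}
```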
alloca Example
As an example of use of alloca, here is a function that opens a file
name made from concatenating two argument strings, and returns a file
descriptor or minus one signifying failure:
int
open2 (char *str1, char *str2, int flags, int mode)
{
  char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1);
  strcpy (name, str1);
  strcat (name, str2);
  return open (name, flags, mode);
}
Here is how you would get the same results with malloc and
free:
int
open2 (char *str1, char *str2, int flags, int mode)
{
  char *name = (char *) malloc (strlen (str1) + strlen (str2) + 1);
  int desc;
  if (name == 0)
    fatal ("virtual memory exceeded");
  strcpy (name, str1);
  strcat (name, str2);
  desc = open (name, flags, mode);
  free (name);
  return desc;
}
As you can see, it is simpler with alloca. But alloca has
other, more important advantages, and some disadvantages.
Advantages of alloca
Here are the reasons why alloca may be preferable to malloc:
alloca wastes very little space and is very fast. (It is
open-coded by the GNU C compiler.)
Since alloca does not have separate pools for different sizes of
block, space used for any size block can be reused for any other size.
alloca does not cause storage fragmentation.
Nonlocal exits done with longjmp (see section Non-Local Exits)
automatically free the space allocated with alloca when they exit
through the function that called alloca. This is the most
important reason to use alloca.
To illustrate this, suppose you have a function
open_or_report_error which returns a descriptor, like
open, if it succeeds, but does not return to its caller if it
fails. If the file cannot be opened, it prints an error message and
jumps out to the command level of your program using longjmp.
Let's change open2 (see section alloca Example) to use this
subroutine:
int
open2 (char *str1, char *str2, int flags, int mode)
{
  char *name = (char *) alloca (strlen (str1) + strlen (str2) + 1);
  strcpy (name, str1);
  strcat (name, str2);
  return open_or_report_error (name, flags, mode);
}
Because of the way alloca works, the storage it allocates is
freed even when an error occurs, with no special effort required.
By contrast, the previous definition of open2 (which uses
malloc and free) would develop a storage leak if it were
changed in this way. Even if you are willing to make more changes to
fix it, there is no easy way to do so.
Disadvantages of alloca
These are the disadvantages of alloca in comparison with
malloc:
Some systems do not support alloca, so it is less
portable. However, a slower emulation of alloca written in C
is available for use on systems with this deficiency.
In GNU C, you can replace most uses of alloca with an array of
variable size. Here is how open2 would look then:
int
open2 (char *str1, char *str2, int flags, int mode)
{
  char name[strlen (str1) + strlen (str2) + 1];
  strcpy (name, str1);
  strcat (name, str2);
  return open (name, flags, mode);
}
But alloca is not always equivalent to a variable-sized array, for
several reasons:
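A variable-sized array's space is freed at the end of the scope of the
array's name. The space allocated with alloca usually
remains until the end of the function.
It is possible to use alloca within a loop, allocating an
additional block on each iteration. This is impossible with
variable-sized arrays. On the other hand, this is also slightly
unclean.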
Note: If you mix use of alloca and variable-sized arrays
within one function, exiting a scope in which a variable-sized array was
declared frees all blocks allocated with alloca during the
execution of that scope.
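To make the loop case concrete, here is a sketch that allocates one list node per iteration with alloca; all of the nodes stay valid until the function returns. The function reversed_number and the node layout are invented for this illustration, and `alloca.h' is included on the assumption that the system declares alloca there.

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>

struct node
{
  int digit;
  struct node *next;
};

/* Read the decimal digits in DIGITS[0..COUNT-1] back to front as
   one number, using a temporary list whose nodes live in this
   function's stack frame.  */
long
reversed_number (const int *digits, size_t count)
{
  struct node *head = NULL;
  long result = 0;
  size_t i;

  assert (digits != NULL || count == 0);
  for (i = 0; i < count; i++)
    {
      /* One more block per iteration; none is freed until return.  */
      struct node *n = (struct node *) alloca (sizeof *n);
      n->digit = digits[i];
      n->next = head;
      head = n;
    }
  for (; head != NULL; head = head->next)
    result = result * 10 + head->digit;
  return result;
}
```

Because the stack grows on every iteration, a loop like this is only reasonable when the iteration count is known to be small.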
Any system of dynamic memory allocation has overhead: the amount of space it uses is more than the amount the program asks for. The relocating memory allocator achieves very low overhead by moving blocks in memory as necessary, on its own initiative.
When you allocate a block with malloc, the address of the block
never changes unless you use realloc to change its size. Thus,
you can safely store the address in various places, temporarily or
permanently, as you like. This is not safe when you use the relocating
memory allocator, because any and all relocatable blocks can move
whenever you allocate memory in any fashion. Even calling malloc
or realloc can move the relocatable blocks.
For each relocatable block, you must make a handle--a pointer object in memory, designated to store the address of that block. The relocating allocator knows where each block's handle is, and updates the address stored there whenever it moves the block, so that the handle always points to the block. Each time you access the contents of the block, you should fetch its address anew from the handle.
To call any of the relocating allocator functions from a signal handler is almost certainly incorrect, because the signal could happen at any time and relocate all the blocks. The only way to make this safe is to block the signal around any access to the contents of any relocatable block--not a convenient mode of operation. See section Signal Handling and Nonreentrant Functions.
In the descriptions below, handleptr designates the address of the handle. All the functions are declared in `malloc.h'; all are GNU extensions.
Function: void * r_alloc (void **handleptr, size_t size)
This function allocates a relocatable block of size size. It
stores the block's address in *handleptr and returns
a non-null pointer to indicate success.
If r_alloc can't get the space needed, it stores a null pointer
in *handleptr, and returns a null pointer.
Function: void r_alloc_free (void **handleptr)
This function is the way to free a relocatable block. It frees the
block that *handleptr points to, and stores a null pointer
in *handleptr to show it doesn't point to an allocated
block any more.
Function: void * r_re_alloc (void **handleptr, size_t size)
The function r_re_alloc adjusts the size of the block that
*handleptr points to, making it size bytes long. It
stores the address of the resized block in *handleptr and
returns a non-null pointer to indicate success.
If enough memory is not available, this function returns a null pointer
and does not modify *handleptr.
You can ask for warnings as the program approaches running out of memory
space, by calling memory_warnings. This is a GNU extension
declared in `malloc.h'.
Function: void memory_warnings (void *start, void (*warn_func) (char *))
Call this function to request warnings for nearing exhaustion of virtual memory.
The argument start says where data space begins, in memory. The allocator compares this against the last address used and against the limit of data space, to determine the fraction of available memory in use. If you supply zero for start, then a default value is used which is right in most circumstances.
For warn_func, supply a function that malloc can call to
warn you. It is called with a string (a warning message) as argument.
Normally it ought to display the string for the user to read.
The warnings come when memory becomes 75% full, when it becomes 85% full, and when it becomes 95% full. Above 95% you get another warning each time memory usage increases.
Programs that work with characters and strings often need to classify a character--is it alphabetic, is it a digit, is it whitespace, and so on--and perform case conversion operations on characters. The functions in the header file `ctype.h' are provided for this purpose.
Since the choice of locale and character set can alter the
classifications of particular character codes, all of these functions
are affected by the current locale. (More precisely, they are affected
by the locale currently selected for character classification--the
LC_CTYPE category; see section Categories of Activities that Locales Affect.)
This section explains the library functions for classifying characters.
For example, isalpha is the function to test for an alphabetic
character. It takes one argument, the character to test, and returns a
nonzero integer if the character is alphabetic, and zero otherwise. You
would use it like this:
if (isalpha (c))
  printf ("The character `%c' is alphabetic.\n", c);
Each of the functions in this section tests for membership in a
particular class of characters; each has a name starting with `is'.
Each of them takes one argument, which is a character to test, and
returns an int which is treated as a boolean value. The
character argument is passed as an int, and it may be the
constant value EOF instead of a real character.
The attributes of any given character can vary between locales. See section Locales and Internationalization, for more information on locales.
These functions are declared in the header file `ctype.h'.
Function: int islower (int c)
Returns true if c is a lower-case letter.
Function: int isupper (int c)
Returns true if c is an upper-case letter.
Function: int isalpha (int c)
Returns true if c is an alphabetic character (a letter). If
islower or isupper is true of a character, then
isalpha is also true.
In some locales, there may be additional characters for which
isalpha is true--letters which are neither upper case nor lower
case. But in the standard "C" locale, there are no such
additional characters.
Function: int isdigit (int c)
Returns true if c is a decimal digit (`0' through `9').
Function: int isalnum (int c)
Returns true if c is an alphanumeric character (a letter or
digit); in other words, if either isalpha or isdigit is
true of a character, then isalnum is also true.
Function: int isxdigit (int c)
Returns true if c is a hexadecimal digit. Hexadecimal digits include the normal decimal digits `0' through `9' and the letters `A' through `F' and `a' through `f'.
Function: int ispunct (int c)
Returns true if c is a punctuation character. This means any printing character that is not alphanumeric or a space character.
Function: int isspace (int c)
Returns true if c is a whitespace character. In the standard
"C" locale, isspace returns true for only the standard
whitespace characters:
' '   space
'\f'  formfeed
'\n'  newline
'\r'  carriage return
'\t'  horizontal tab
'\v'  vertical tab
Function: int isblank (int c)
Returns true if c is a blank character; that is, a space or a tab. This function is a GNU extension.
Function: int isgraph (int c)
Returns true if c is a graphic character; that is, a character that has a glyph associated with it. The whitespace characters are not considered graphic.
Function: int isprint (int c)
Returns true if c is a printing character. Printing characters include all the graphic characters, plus the space (` ') character.
Function: int iscntrl (int c)
Returns true if c is a control character (that is, a character that is not a printing character).
Function: int isascii (int c)
Returns true if c is a 7-bit unsigned char value that fits
into the US/UK ASCII character set. This function is a BSD extension
and is also an SVID extension.
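As a brief illustration of the classification functions, the sketch below tallies the characters of a string by class. The structure char_counts and the function count_classes are hypothetical names. Each character is converted to unsigned char before being passed on, since the functions expect either such a value or EOF.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

struct char_counts
{
  size_t letters;
  size_t digits;
  size_t spaces;
  size_t puncts;
};

/* Count the letters, digits, whitespace characters and punctuation
   characters in the string S, according to the current locale.  */
struct char_counts
count_classes (const char *s)
{
  struct char_counts n = { 0, 0, 0, 0 };

  assert (s != NULL);
  for (; *s != '\0'; s++)
    {
      /* Avoid passing a negative value for characters above 127.  */
      int c = (unsigned char) *s;
      if (isalpha (c))
        n.letters++;
      else if (isdigit (c))
        n.digits++;
      else if (isspace (c))
        n.spaces++;
      else if (ispunct (c))
        n.puncts++;
    }
  return n;
}
```

In the standard "C" locale, the string "Hi, 42 cats!\n" contains 6 letters, 2 digits, 3 whitespace characters and 2 punctuation characters by this count.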
This section explains the library functions for performing conversions
such as case mappings on characters. For example, toupper
converts any character to upper case if possible. If the character
can't be converted, toupper returns it unchanged.
These functions take one argument of type int, which is the
character to convert, and return the converted character as an
int. If the conversion is not applicable to the argument given,
the argument is returned unchanged.
Compatibility Note: In pre-ANSI C dialects, instead of
returning the argument unchanged, these functions may fail when the
argument is not suitable for the conversion. Thus for portability, you
may need to write islower(c) ? toupper(c) : c rather than just
toupper(c).
These functions are declared in the header file `ctype.h'.
Function: int tolower (int c)
If c is an upper-case letter, tolower returns the corresponding
lower-case letter. If c is not an upper-case letter,
c is returned unchanged.
Function: int toupper (int c)
If c is a lower-case letter, toupper returns the corresponding
upper-case letter. Otherwise c is returned unchanged.
Function: int toascii (int c)
This function converts c to a 7-bit unsigned char value
that fits into the US/UK ASCII character set, by clearing the high-order
bits. This function is a BSD extension and is also an SVID extension.
Function: int _tolower (int c)
This is identical to tolower, and is provided for compatibility
with the SVID. See section SVID (The System V Interface Description).
Function: int _toupper (int c)
This is identical to toupper, and is provided for compatibility
with the SVID.
Operations on strings (or arrays of characters) are an important part of
many programs. The GNU C library provides an extensive set of string
utility functions, including functions for copying, concatenating,
comparing, and searching strings. Many of these functions can also
operate on arbitrary regions of storage; for example, the memcpy
function can be used to copy the contents of any kind of array.
It's fairly common for beginning C programmers to "reinvent the wheel" by duplicating this