An Outline of the Source Code

This chapter describes the overall organization of Flick's source code and provides some insights into why things are structured as they are. Programmers should read this chapter thoroughly before proceeding to later parts of the manual.

2.1 Flick Programs, Files, and Libraries


[Figure 2.1: Overview of the Flick idl Compiler.]

Flick works in three phases, each implemented as a separate program: the front end, the presentation generator, and the back end. These steps are illustrated in Figure 2.1. A Flick front end is simply a parser that translates some idl input into an intermediate representation called an aoi (Abstract Object Interface, `.aoi') file. Next, a presentation generator determines how the constructs in the aoi file (the parsed idl file) are to be mapped onto type and function definitions in the C or C++ programming language. In other words, the presentation generator determines how the generated stubs will appear -- e.g., the stubs' function prototypes -- and how they will interact with user code -- e.g., the stubs' parameter passing conventions. This information about the stubs is written into a pres_c (Presentation in C/C++, `.prc') file. The final phase of compilation, the back end, reads a pres_c file and one or more scml (Source Code Markup Language) files, and from those descriptions produces the C or C++ code for the optimized stubs, written to use a particular transport system (e.g., tcp) and message format. A Makefile is generally used to run the desired Flick compiler programs in series.

Flick has multiple implementations of each of the three phases described above. There are three separate front ends: one to parse corba idl files, one to parse onc rpc (Sun) idl files, and one to parse mig idl files. There are five different presentation generators, implementing the corba (C and C++), onc rpc, mig, and Fluke mappings of idl constructs onto C or C++. Finally, there are seven separate back ends for producing stubs that use iiop (C and C++), onc/tcp, Mach messages, Trapeze, Khazana-style messages, and Fluke ipc.

As shown in Figure 2.1, the separate programs communicate through intermediate aoi and pres_c files. (An aoi file also contains meta data; a pres_c file also contains meta, mint, and cast data.) This means that Flick must be able to write all of these intermediate languages to files and read them back again. To facilitate these steps, Flick defines these intermediate languages in onc rpc idl files (`.x' files, contained in the mom directory). When Flick is compiled, these files are processed by rpcgen to produce the functions to read and write Flick's intermediate languages to and from files.

Each of Flick's compilation stages is primarily implemented by one or more large libraries of C and C++ code. Different implementations of an idl compilation stage are created by specializing library C++ classes or overriding library C functions to implement new, tailored behavior. For instance, the onc rpc presentation generator was created by deriving from the base presentation generator C++ class, and then supplying a small amount of new code to implement the specifics of the onc rpc language mapping. The libraries that manipulate Flick's intermediate languages are described in Part II of this manual; the libraries for individual compiler stages are described in Part III.

Before moving on to the organization of Flick's source code, here are some retrospective thoughts on the general design and implementation of Flick:

Using rpcgen to Generate Code for Flick's Intermediate Representations.
The primary reason for using rpcgen in defining Flick's intermediate representations was to eliminate the need for programmers to write IR serialization and other I/O code by hand. In retrospect, this early decision has saved the Flick implementors a lot of work. At the same time, however, it has made other parts of the work more difficult. The limitations of the onc rpc idl forced compromises in the design of the IRs, e.g., in the representation of scopes and references in aoi. Further, because the rpcgen-produced data types are transparent C types, Flick's programmers were tempted to treat them as transparent -- not abstract -- entities. At many points in the code, rpcgen-defined structures are filled out field-by-field, rather than by calling a separate constructor or initialization function. This has made maintenance more difficult, because when the onc rpc idl definition of a data type is changed, all the manipulating code must be updated as well. This is not a consistent problem: there are many Flick library functions to help programmers manipulate IR data structures as abstract objects. It would have been useful, however, to force this style of programming throughout Flick's development.
Back Ends.
A back end is responsible for generating code that works with a particular transport system, using a particular message format and a particular data encoding format, in conjunction with a certain runtime library. Thus, there are at least four messaging aspects that each back end decides. It would be useful to separate these aspects in Flick's back ends, so that each could be specified separately.
Mix-and-Match.
In principle, Flick's design makes it possible and straightforward to mix different front ends, presentation generators, and back ends in novel ways to create a wide variety of stubs. In practice, however, there are several significant barriers to the random mixing of components:

Therefore, in practice, Flick developers have tended to focus on well-defined pairings between different Flick components.

2.2 Organization of the Source Code

Flick's source code is organized in a directory hierarchy as follows:

In addition, there are top-level directories for other major components of the source code, including the runtime libraries and headers (runtime), extra support files for certain platforms (support), the Flick testing infrastructure (test), and documentation (doc).

Below is a detailed listing of the major directories in the Flick source code tree. This catalog is generated automatically by a script that collects the contents of the WHAT-IS-THIS files in the Flick source tree. Some of the listed directories -- those that contain out-of-date or obsolete code -- may not be included as part of general Flick distributions.

.
This is the root of the source tree for Flick, the Flexible, Optimizing IDL Compiler Kit.
c
This directory contains all of the Flick libraries and components that are specific to presentation- and code-generation for the C and C++ programming languages.
c/libcast
This directory contains general-purpose routines for creating and manipulating C Abstract Syntax Tree (CAST) data structures.
c/libcparse
This directory contains lex/yacc code to parse a C source code file into a C Abstract Syntax Tree (CAST). This isn't currently used by Flick, but can be used for testing CAST code.
c/libpres_c
This directory contains general-purpose routines for creating and manipulating PRES_C (Presentation in C/C++) data structures.
c/pbe
This directory contains the code for all back ends that produce C or C++ code. Aside from certain data structure and general support libraries (libcast, libcompiler, etc.), this directory should contain all code related to the translation of PRES_C (.prc) files to C and C++ source and header files.
c/pbe/fluke
This directory contains code for stub generation using the Fluke IPC transport mechanism. This back end produces stubs that are suitable for use within the Fluke kernel server.
c/pbe/iiop
This directory contains code for stub generation using the CORBA IIOP transport mechanism.
c/pbe/iiopxx
This directory contains code for stub generation using the TAO runtime. TAO is the real-time, open source CORBA ORB from Washington University in St. Louis.
c/pbe/khazana
This directory contains code for generating stubs for use within Khazana. Khazana is a distributed service exporting the abstraction of a distributed, secure, persistent, globally shared store that applications can use to store their data. See http://www.cs.utah.edu/flux/ for more information.
c/pbe/lib
This directory contains the base implementation of all Flick back ends that produce C or C++ code. The core of the library is an abstract base class; certain member functions must be defined in each derived class to create a complete back end.
c/pbe/local
This directory contains (incomplete?) back-end code for generation of a local transport. Intraprocess? Host-local ports? Not really sure. This code is no longer maintained, and it hasn't been substantially changed since 1995, so it is most certainly out of date.
c/pbe/mach3
This directory contains back-end code to complete the back end library for the Mach 3 transport. This is different from the Mach3MIG back-end code because it does not attempt to implement the MIG message format. This code is no longer maintained. It has severe bit-rot and almost certainly does not compile or work any longer.
c/pbe/mach3mig
This directory contains code to generate Mach 3 transport stubs that emulate the MIG message format.
c/pbe/mach4
This directory contains code to produce Mach 4 transport stubs. The code for this back end is incomplete and is no longer maintained.
c/pbe/sun
This directory contains code for stub generation using the ONC RPC (Sun RPC) transport mechanism.
c/pbe/trapeze
This directory contains code for stub generation using the Trapeze transport mechanism.
c/pdl
This directory contains old code for the Presentation Definition Language (PDL) piece of Flick, which also served as a top-level driver program for all of the separate Flick programs (front end, presentation generator, and back end). PDL allowed one to modify a C language presentation (as described by a PRES_C file); however, the PDL program is old, out of date, and is no longer maintained. Some of the code is derived from Sun's `rpcgen' source code.
c/pfe
This directory contains all of the code (except for common data structure manipulation libraries) for translation from AOI to PRES_C. This is called the "presentation generation" phase of IDL compilation.
c/pfe/corba
This directory contains the code for Flick's CORBA C language presentation generator. Because the CORBA presentation style is implemented as a library (libcorba), this directory is practically empty.
c/pfe/corbaxx
This directory contains the code for Flick's CORBA C++ language presentation generator. Like the CORBA C presentation, this is also implemented as a library (libcorbaxx), so this directory is practically empty.
c/pfe/fluke
This directory contains the Fluke/MOM presentation generator for C language stubs. The Fluke language mapping is for the most part like the CORBA C language mapping, but with all the generated stubs and type names changed.
c/pfe/lib
This directory contains a "generic" AOI-to-C presentation generator. The generic mapping is largely but not entirely CORBA-like; some of the more complicated mapping procedures are simplified and more `rpcgen'-like. All C and C++ language presentation generators (except the MIG presentation generator in `fe/mig') are derived from this generic presentation generator base class, either directly or indirectly.
c/pfe/libcorba
This directory contains the CORBA-specific modifications to the "generic" AOI to C language mapping. The CORBA language mapping is organized as a library so that the Fluke presentation generator can be derived from it.
c/pfe/libcorbaxx
This directory contains the CORBA C++-specific modifications to the "generic" AOI to C language mapping. It is a library so that other presentation generators can be derived from it.
c/pfe/sun
This directory contains the Sun RPC, "rpcgen"-like presentation generator. The Sun presentation generator is derived from the "generic" base class for C language presentation generators.
doc
This directory contains Flick documentation.
doc/guts
This directory contains the LaTeX sources for the Flick Programmer's Manual.
doc/usersguide
This directory contains the LaTeX sources for the Flick User's Manual.
fe
This directory contains code for Flick's front ends, which translate from some IDL (CORBA IDL, ONC RPC IDL, MIG IDL, ...) to AOI. Actually, the MIG front end skips over AOI and generates PRES_C directly.
fe/corba
This directory contains a "back-end" for Sun's CORBA IDL compiler front end (which can be obtained from ftp.omg.org). It translates from Sun's CORBA AST to AOI. The code in this directory is out of date and is no longer maintained. Flick's current CORBA front end is in the `newcorba' directory.
fe/mig
This directory contains Flick's MIG front end and presentation generator. MIG contains way too much presentation glop to use the AOI IR, so this MIG front end generates PRES_C directly.
fe/newcorba
This directory contains Flick's CORBA IDL front end, which translates CORBA IDL input files into AOI.
fe/sun
This directory contains Flick's ONC RPC (a.k.a. Sun RPC) IDL front end, which translates ONC RPC IDL files into AOI files. This front end uses the original `rpcgen' code to parse the IDL; Flick-specific code then translates the parse tree into AOI format.
libaoi
This directory contains data creation and manipulation routines for the AOI data format. The AOI library is used extensively in Flick's front ends and presentation generators.
libcompiler
This directory contains utility code used throughout Flick, including error-checked malloc/calloc/realloc routines, uniform warning and error printing routines, and some code printing routines.
libmeta
This directory contains routines for creating and manipulating the metadata within Flick's intermediate files.
libmint
This directory contains data creation and manipulation routines for the MINT data format. It is used extensively throughout presentation generation and back end code generation.
mom
This directory contains the definitions of all of the ONC RPC IDL-specified intermediate data formats used internally by Flick (AOI, MINT, CAST, PRES_C, and META). These formats are specified in ONC RPC IDL because Flick needs to send these intermediate formats between separate programs, and Flick does that by writing data to intermediate files. Because the formats are specified in IDL, we can use `rpcgen' to automatically create the code to read and write Flick's intermediate data representations.

This directory also contains the header files for the libraries that manipulate Flick's non-C/C++-specific intermediate representations (i.e., libaoi, libmint, and libmeta).

mom/c
This directory contains header files that are specific to the C and C++ language-specific phases of Flick (i.e., the presentation generation and back end stages).
mom/c/be
This directory contains additional header files specific to Flick's C/C++ back ends.
test
This directory is the root of the test tree. The structure of the test directory tree is patterned after the stages of the Flick compiler. There are separate directories to hold the results of each front end, presentation generator, and back end. The output from each stage can be verified against known-good copies of the output files. Driver programs are compiled and linked separately. There is a Makefile in the root of the testing tree that simplifies the testing process, and a README file to explain everything.
util
This directory contains Flick utility programs.
util/aoid
This directory contains the source code for `aoid', a pretty-printer for AOI files. When the AOI format is changed, this tool should be modified to reflect those changes. Given a `filename.aoi', the `aoid' program will create a human-readable `filename.aod' file.
util/presd
This directory contains `presd', a pretty-printer for PRES_C files. As PRES_C changes, this program should be changed as well. Given a `filename.prc', the `presd' program will create a human-readable `filename.prd' file.

2.3 Coding Style

Flick is not implemented in an extremely rigid style. Nevertheless, the code has been written using a few general rules and idioms, and these rules should be followed whenever one adds new code or modifies existing code.

2.3.1 Files

Obviously, new files should be placed in accordance with the existing directory hierarchy described in Section 2.2. File names should be chosen to match the file's existing peers; e.g., if the existing file names use underscores to separate words, the new file name should follow this pattern. Flick source files generally use the following file name extensions:

.cc
C++ source files.
.hh
C++-only header files.
.c
C source files.
.h
C header files. Note that C header files must be #include-able by C++ code. In general, this means that one uses preprocessor magic to ensure that whenever the file is included by C++ source, the header contents are wrapped within an extern "C" declaration.
.idl
corba idl files.
.x
onc rpc idl files.
.defs
mig idl files.
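
The extern "C" wrapping required of `.h' files is conventionally written with the __cplusplus preprocessor symbol. The following is a minimal single-file sketch and is not taken from Flick: the guard name EXAMPLE_SHAPES_H and the function example_rect_area are hypothetical, and the definition that would normally live in a separate `.c' file is shown inline so that the fragment compiles by itself.

```c
#include <assert.h>

/* ---- contents of a hypothetical example_shapes.h ---- */
#ifndef EXAMPLE_SHAPES_H
#define EXAMPLE_SHAPES_H

/*
 * C++ compilers define __cplusplus; C compilers do not.  The
 * declarations below therefore get C linkage in both languages.
 */
#ifdef __cplusplus
extern "C" {
#endif

/* Return the area of a width-by-height rectangle. */
int example_rect_area(int width, int height);

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_SHAPES_H */

/* ---- contents of a hypothetical example_shapes.c ---- */
int
example_rect_area(int width, int height)
{
        assert(width >= 0 && height >= 0);
        return width * height;
}
```

A C++ source file that includes such a header can then call example_rect_area and link against the C object file without name-mangling problems.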

Each make-able directory contains a GNUmakefile.in file that is processed by Flick's configure script to produce a GNUmakefile for building the directory contents. (Flick requires GNU Make.) Most "leaf" GNUmakefiles do little more than define a few macros. The GNUmakerules.* files at the root of the source tree contain the actual rules used to build various Flick libraries and programs. Consult the user's manual for further help in building Flick.

2.3.2 Aesthetics

When writing new code or modifying existing code, please follow these guidelines:

Maintain consistency.
When modifying Flick in any way, the primary rule is to make new code look like the code that's already there. A corollary to the rule is this: do not reformat existing code. Keep the same declaration style, commenting style, indentation style, and so on as you write and modify code. Most any good, consistent coding style can be readable, but an inconsistent style creates Programmer's Hell. Gratuitous reformatting of existing code makes it difficult to tell from a diff what was actually changed and what is just cosmetic.
Comment the code.
Source code is useless if people cannot understand it. Comment new code as you write it. Fix poorly commented code as you find it. Comments come in a variety of forms, but the preferred style is to have the initiators (/*) and terminators (*/) on separate lines with each line of text starting with an asterisk (*). The result looks clean, even though it can be troublesome to edit afterwards. Be careful not to use `//'-style comments in C code; not all C compilers accept such comments.
Avoid tricks; write portable code.
Obvious code is best. Avoid tricks and non-portable idioms; in particular, be careful to avoid language extensions provided by gcc and g++. Use the GNU compiler options -pedantic and -ansi to weed out GNUisms in new code.
Treat compiler warnings as errors.
Not only should your code compile on a variety of platforms with a variety of compilers, it should also compile without any warnings. A warning is a signal that something is wrong. A constant stream of warnings makes it too easy for new warnings to hide.
Use C for general-purpose libraries.
The libraries for Flick's intermediate representations (aoi, mint, and so on) are implemented in C so that they may be used in both C and C++ programs. The functions in libcompiler are implemented in C for the same reason.
Maintain language consistency.
Some parts of Flick are implemented in C, and others are implemented in C++. However, it is confusing when a single component is implemented using both languages. When working on an existing component, use the language that is already in use for that component. When starting a new component, use the language that is common for components of that class.
Use the Flux Project coding style.
The Flux coding style should be used in all C- and C++-based programs implemented by the Flux Project. The most obvious hallmark of this style is 8-character indentation stops -- chosen to make vi users happy. A second hallmark is that braces are "hung" after if, else, for, and switch. Braces are not hung, however, around the bodies of functions, structure type definitions, and so on. The following code snippet illustrates these points of the Flux coding style:
      int
      make_payments(. . .)
      {
              . . .
              for (i = 0; i < pmt_max; ++i) {
                      if (i == pmt_num) {
                              skip_payment(pmts, i);
                              . . .
                      } else {
                              double_payment(pmts, i);
                              . . .
                      }
                      print_current_payments(pmts);
              }
      }
     

It is more important to maintain consistency with existing code than it is to apply the Flux coding style. In particular, this means that when "foreign" code is assimilated into Flick (e.g., bits of rpcgen for the onc rpc front end, and bits of mig for the mig front end), the formatting of the foreign code should be preserved.

Use library routines as much as possible.
In particular, avoid treating Flick's intermediate representation types (e.g., the various aoi and pres_c node types) as "exposed" C structures whenever it is feasible and reasonable to do so. As much as possible, use the appropriate library functions to manipulate IR nodes as abstract data types. This makes the code easier to read and maintain. (This advice is somewhat idealistic. While it is generally straightforward to create nodes through a library interface, there is currently very little library support for accessing node data in any kind of abstract way.)
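
The last guideline can be sketched as follows. The type example_node and the functions example_node_new and example_node_name are hypothetical stand-ins chosen for illustration; they are not part of libaoi, libpres_c, or any other Flick library. The point is only that callers go through a constructor and an accessor instead of touching the structure fields directly.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * A hypothetical IR node.  Flick's real node types are generated
 * by rpcgen; this one exists only for illustration.
 */
struct example_node {
        int kind;
        char *name;
        struct example_node *next;
};

/*
 * Constructor: the one place that knows how to initialize every
 * field.  If a field is added later, only this function changes.
 */
struct example_node *
example_node_new(int kind, const char *name)
{
        struct example_node *n;

        assert(name != NULL);
        n = (struct example_node *) malloc(sizeof(*n));
        if (!n)
                abort();
        n->kind = kind;
        n->name = (char *) malloc(strlen(name) + 1);
        if (!n->name)
                abort();
        strcpy(n->name, name);
        n->next = NULL;
        return n;
}

/* Accessor: callers need not know how the name is stored. */
const char *
example_node_name(const struct example_node *n)
{
        assert(n != NULL);
        return n->name;
}
```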

2.3.3 Storage Allocation

Although Flick-generated stubs and skeletons are careful to manage memory in a robust manner, Flick itself handles memory in a way that is more appropriate to a compiler.

Flick is written with the assumption that memory is plentiful, and this assumption is evident in two ways throughout the code. First, Flick makes frequent use of library routines that implement "allocate memory or panic" behavior (see Section 9.2). Thus, if memory runs out, Flick will exit ungracefully. Second, Flick is often not very careful in freeing memory. Memory may "leak" when the last pointer to a bit of previously allocated storage is lost.
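
An "allocate memory or panic" routine is typically a thin wrapper around malloc. The sketch below shows the general idea only: example_must_malloc is a hypothetical name, and the routines that Flick actually provides are the ones described in Section 9.2.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate `size' bytes or terminate the program.  Callers never
 * have to check the result for a null pointer.
 */
void *
example_must_malloc(size_t size)
{
        void *p;

        /* malloc(0) may legitimately return NULL, so avoid it. */
        if (size == 0)
                size = 1;
        p = malloc(size);
        if (!p) {
                fprintf(stderr, "out of memory (%lu bytes requested)\n",
                        (unsigned long) size);
                exit(EXIT_FAILURE);
        }
        return p;
}
```

The trade-off is the one described above: call sites become simpler because they carry no error-handling code, but the program cannot recover when memory is exhausted.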

In practice, this style of coding is not problematic for Flick: there is plenty of memory on modern computer systems, the compiler's memory demands are not large in any case, and the various Flick compiler passes run very quickly. It is certainly possible to find idl inputs, however, that cause Flick to consume inordinate amounts of memory or CPU time. For this and other reasons, we expect to eventually "clean up" Flick's memory management code, either through recoding or through the addition of some kind of garbage collector.

2.3.4 Tags

Many functions in Flick take their arguments as "tag lists": lists of name-value associations. Each association consists of an identifier, representing some attribute, followed by the set of (zero or more) values for that attribute. A special tag identifier signals the end of the tag list. For instance, the function add_function accepts a list of tags describing the properties of a C++ function:
     add_function(tl,
                  sc_out_name,
                  PFA_Protection, current_protection,
                  PFA_Constructor,
                  PFA_FunctionKind, "T_out(T_var &)",
                  PFA_Scope, scope,
                  PFA_ReturnType, cast_new_type(CAST_TYPE_NULL),
                  PFA_Parameter, type, "r", NULL,
                  PFA_TAG_DONE);

Each tag (the PFA_* identifiers) in the above example is paired with zero or more data values. Most tags have a single associated value; for example, PFA_Protection is paired with the value of the variable current_protection. Some tags, however, are associated with no values (e.g., PFA_Constructor in the example above) or with multiple values (e.g., PFA_Parameter). When a tag has no value, it usually signifies a boolean switch: the attribute is true when the tag appears in the tag list. The special PFA_TAG_DONE identifier in the above example signals the end of the tag list.

Tags are generally grouped into sets or "families" according to their purpose. For instance, all of the tag identifiers in the above example begin with PFA because they are all "presentation function attributes" as defined in mom/c/libpres_c.h. The documentation for any tag set should explain the meaning of each tag identifier and the values that each tag must be associated with. (For more information about the PFA tags, see Section 7.3.)

Essentially all functions that accept tag lists are varargs-style functions, meaning that they may be called with an arbitrary-length list of tags and values. This is why a special tag is required to mark the end of a tag list. Functions that accept tag lists generally do so because they construct objects (e.g., pres_c nodes) or just because they have a large number of possible arguments. When used to construct objects, the tags generally correspond to slots or flags within the constructed object. Some subset of the possible tags may not be present in the argument list given to a function: for instance, in the above example, the PFA_GlobalFunction tag was not specified. The documentation for a function that takes a tag list should describe what tags are required (if any) and what happens when certain tags are not present in a given tag list.
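
To make the mechanics concrete, here is a minimal sketch of a varargs function that consumes a tag list. All of the names (the EX_* tags, struct example_window, and example_window_init) are hypothetical and are not Flick's; the sketch only illustrates tags with zero, one, and two values, an end-of-list tag, and defaults for tags that are absent.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical tag identifiers; these are not Flick's PFA_* tags. */
enum example_tag {
        EX_TAG_DONE = 0,        /* marks the end of the tag list */
        EX_NAME,                /* one value: char * */
        EX_SIZE,                /* two values: int width, int height */
        EX_VISIBLE              /* no values: a boolean switch */
};

struct example_window {
        const char *name;
        int width, height;
        int visible;
};

/*
 * Fill in `w' from a tag list.  Tags that do not appear leave the
 * defaults set below in place.  Returns 0 on success and -1 if an
 * unknown tag is seen.
 */
int
example_window_init(struct example_window *w, ...)
{
        va_list ap;
        int tag, status = 0;

        assert(w != NULL);
        w->name = "untitled";
        w->width = 80;
        w->height = 24;
        w->visible = 0;

        va_start(ap, w);
        while (status == 0 && (tag = va_arg(ap, int)) != EX_TAG_DONE) {
                switch (tag) {
                case EX_NAME:
                        w->name = va_arg(ap, char *);
                        break;
                case EX_SIZE:
                        w->width = va_arg(ap, int);
                        w->height = va_arg(ap, int);
                        break;
                case EX_VISIBLE:
                        w->visible = 1;
                        break;
                default:
                        status = -1;
                        break;
                }
        }
        va_end(ap);
        return status;
}
```

A call such as example_window_init(&w, EX_VISIBLE, EX_SIZE, 100, 40, EX_TAG_DONE) sets the switch and the size, and leaves the name at its default because EX_NAME was not given.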

Because tag lists are generally passed to varargs-style functions, programmers must be extremely careful to ensure that the right types and number of values are associated with each tag: the C compiler cannot typecheck a tag list! Most notably, programmers must make sure to use NULL rather than 0 when passing a literal null pointer to a tags-based function. On some architectures (e.g., 64-bit machines), a null pointer and an integer 0 have different representations, and may not even be represented by the same number of bits. In a varargs list, the compiler does not know what types of values are required for each tag, and thus the compiler cannot automatically convert zeros to null pointer values.
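
The null-pointer hazard can be seen in a small sketch. The function example_count_strings is hypothetical; it reads each variable argument as a char *, so the terminator must really be passed as a pointer. Note that the C standard permits NULL itself to be defined as a plain integer 0 on some implementations, so the most conservative way to write the terminator is with an explicit cast, as the usage below assumes.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Count the strings in an argument list that is terminated by a
 * null `char *'.  Every variable argument, including the
 * terminator, is read as a pointer.
 */
int
example_count_strings(const char *first, ...)
{
        va_list ap;
        const char *s;
        int count = 0;

        va_start(ap, first);
        for (s = first; s != NULL; s = va_arg(ap, char *))
                ++count;
        va_end(ap);
        return count;
}
```

A safe call looks like example_count_strings("a", "b", (char *) NULL). Writing a bare 0 as the last argument would pass an int where the function reads a pointer, which is exactly the mismatch that the compiler cannot detect.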

The notion of tag lists can be extended to create entire object hierarchies. For example, the following statement uses tag lists to create a nest of four pres_c nodes: a PRES_C_INLINE_ATOM containing a PRES_C_MAPPING_TEMPORARY, which contains a PRES_C_MAPPING_ARGUMENT, which contains a PRES_C_MAPPING_DIRECT. Tag lists are used to specify the hierarchy of nodes and fill in the nodes' data slots.
     ac->length = PRES_C_I_ATOM,
                    PIA_Index, 0,
                    PIA_Mapping, PRES_C_M_TEMPORARY,
                      PMA_Name, "string_len",
                      PMA_CType, lctype,
                      PMA_PreHandler, "stringlen",
                      PMA_TempType, TEMP_TYPE_ENCODED,
                      PMA_Target, PRES_C_M_ARGUMENT,
                        PMA_ArgList, ac->arglist_name,
                        PMA_Name, "length",
                        PMA_Mapping, PRES_C_M_DIRECT, END_PRES_C,
                        END_PRES_C,
                      END_PRES_C,
                    END_PRES_C;

The node types are specified by special PRES_C_* macros, which expand into calls to functions. Each function takes a tag list: PIA_* tags specify "pres_c inline attributes" and PMA_* tags specify "pres_c mapping attributes." The special tag END_PRES_C marks the end of each tag list.

The tag-based code makes it clear how the four nodes relate to each other. Moreover, the above tag-based statement is easier to read and maintain than the alternative: a block of C statements that create and initialize each node individually.
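
One way that such nesting macros could be built is sketched below. This is an assumption made for illustration, not a description of how Flick's PRES_C_* macros are actually defined: each hypothetical "opening" macro expands into the start of a call to a varargs constructor, and the closing macro supplies the terminating tag together with the closing parenthesis. All EX_* and ex_* names are invented.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

/* Hypothetical node kinds and tags; not Flick's PRES_C_* or PMA_* names. */
enum { EX_KIND_LEAF, EX_KIND_WRAPPER };
enum {
        EX_END_TAG = 0,         /* ends a tag list */
        EX_LABEL,               /* one value: char * */
        EX_CHILD                /* one value: struct ex_node * */
};

struct ex_node {
        int kind;
        char *label;
        struct ex_node *child;
};

/* Create a node of the given kind and fill its slots from a tag list. */
struct ex_node *
ex_node_new(int kind, ...)
{
        struct ex_node *n = (struct ex_node *) malloc(sizeof(*n));
        va_list ap;
        int tag;

        if (!n)
                abort();
        n->kind = kind;
        n->label = NULL;
        n->child = NULL;
        va_start(ap, kind);
        while ((tag = va_arg(ap, int)) != EX_END_TAG) {
                switch (tag) {
                case EX_LABEL:
                        n->label = va_arg(ap, char *);
                        break;
                case EX_CHILD:
                        n->child = va_arg(ap, struct ex_node *);
                        break;
                default:
                        abort();        /* unknown tag: give up */
                }
        }
        va_end(ap);
        return n;
}

/*
 * Each "opening" macro expands into the start of a constructor
 * call; EX_END supplies the final tag and the closing parenthesis.
 */
#define EX_LEAF         ex_node_new(EX_KIND_LEAF
#define EX_WRAPPER      ex_node_new(EX_KIND_WRAPPER
#define EX_END          EX_END_TAG)

/* Build a wrapper node that contains a leaf node. */
struct ex_node *
ex_build_example(void)
{
        return EX_WRAPPER,
                 EX_LABEL, "outer",
                 EX_CHILD, EX_LEAF,
                   EX_LABEL, "inner",
                   EX_END,
                 EX_END;
}
```

After preprocessing, the body of ex_build_example is an ordinary nested call: the inner ex_node_new call is evaluated first and its result becomes the value of the EX_CHILD tag in the outer call.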

2.4 Summary and Comments

Flick began as a collection of C-only programs and libraries. Originally, individual presentation generators and back ends would "specialize" library code simply by providing new versions of certain functions. At compile time, the linker would include the specialized function in the generated program. This technique made it difficult to create a specialized version of a function that extended the behavior of the library routine; in object-oriented terms, it was impossible to invoke a parent method. Further, because libraries were simply collections of C functions (and types), it was not inherently obvious which sets of functions applied to which "objects" in the library.

For these reasons, most of Flick was converted to C++ in 1995. Some parts, notably the libraries for manipulating Flick's intermediate languages, remained in C. (This made it easier to write C programs based on these libraries.) The conversion was done in the most straightforward manner: certain C structure types were replaced with C++ classes, and functions on those structures were transformed into methods. Most of the existing C idioms (e.g., the use of stdio, malloc, and so on), however, were left in place. This approach was highly pragmatic: the existing C code worked, so there was no need to replace it; the programmers already knew the C idioms; and in 1995, many C++ implementations were notoriously buggy.

The result was that Flick became a combination of C and C++, programmed in a very C-like style. Largely, Flick has remained in this state, although newer parts of Flick take greater advantage of C++-specific idioms.

The overall design of the original C implementation has survived through Flick's conversion to C++ and through the many extensions that have been made since. The remaining chapters of this manual describe Flick's current design and implementation in detail. In places, this design is starting to show its age. Therefore, where appropriate, this manual will point out problems that have become apparent in the current design and implementation of Flick, and will suggest ways in which these problems might be corrected in the future.