Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations


Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
University of Arizona
{vin,dkl,greg}@cs.arizona.edu

Abstract

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as iterative grid computations, recursive fork/join programs, the bodies of parallel FOR loops, and the implicit parallelism in functional or dataflow languages. It is useful both to describe massively parallel computations and as a target for compilers. However, fine-grain parallelism has long been thought to be inefficient due to the overheads of process creation, context switching, and synchronization. This paper describes a software kernel, Distributed Filaments (DF), that implements fine-grain parallelism both portably and efficiently on a workstation cluster. DF runs on existing, off-the-shelf hardware and software. DF has a simple interface, with only seven routines, so it is easy to use. It achieves efficiency by using stateless threads on each node, overlapping communication and computation, employing a new low-overhead, reliable datagram communication protocol, and automatically balancing the work generated by fork/join computations. The net result is good speedup on a variety of applications.