Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations
Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
University of Arizona
{vin,dkl,greg}@cs.arizona.edu
Abstract
A fine-grain parallel program is one in which processes are typically
small, ranging from a few to a few hundred instructions. Fine-grain
parallelism arises naturally in many situations, such as iterative
grid computations, recursive fork/join programs, the bodies of parallel
FOR loops, and the implicit parallelism in functional or dataflow
languages. It is useful both to describe massively parallel computations
and as a target for compilers. However, fine-grain parallelism has long
been thought to be inefficient due to the overheads of process creation,
context switching, and synchronization. This paper describes a software
kernel, Distributed Filaments (DF), that implements fine-grain
parallelism both portably and efficiently on a workstation cluster.
DF runs on existing, off-the-shelf hardware and software. DF has a simple
interface, with only seven routines, so it is easy to use. It achieves
efficiency by using stateless threads on each node, overlapping
communication and computation, employing a new low-overhead, reliable
datagram communication protocol, and automatically balancing the work
generated by fork/join computations. The net result is good speedup on
a variety of applications.