NSF Workshop on Future Directions for Systems Research
July 31, 1997
Write-up: John Carter, retrac@cs.utah.edu
Working group participants:
- John Carter, University of Utah, retrac@cs.utah.edu
- Pei Cao, University of Wisconsin, cao@cs.wisc.edu
- Michael Dahlin, University of Texas, dahlin@cs.utexas.edu
- Michael Scott, University of Rochester, scott@cs.rocheter.edu
- Marc Shapiro, INRIA, shapiro@prof.inria.fr
- Willy Zwaenepoel, Rice University, willy@cs.rice.edu
Charter:
The distributed state management focus group discussed what kind(s) of system infrastructure should be developed to support wide-area sharing of state between distributed applications. In essence, the group was discussing the extent to which the common infrastructure on which distributed applications are built could be lifted from IP (raw communication) to that of distributed access to shared state. Particular attention was given to the following issues:
- Applications (what types of future applications might benefit from the infrastructure?)
- Scalability (thousands to millions of nodes, millions to billions of "objects")
- Consistency (multiple flavors, under user/system control)
- Persistence (on demand versus automatic, does garbage collection play a role?, …)
- Availability (replication policies, automatic gradual reduction of QoS)
- Distributed resource management (how to do it, how to scale it, "charge" model?)
- Dynamic replication and migration of data
- Scalable location management
- Extensibility
- Security
- Naming
Key conclusion:
Given the extensive advances in distributed systems technology that have occurred over the last two decades, the number of different distributed applications that have been deployed has been surprisingly small. One reason for this is that many difficult problems (e.g., consistency management, fault tolerance, security, location transparency, scalability, and performance) must be addressed to build a complex distributed application, and there is very little infrastructure available to ease the effort. In addition, the amount of bandwidth being consumed on the Internet for the WWW is increasing exponentially. Both of these observations suggest that a serious effort to build an infrastructure for wide-area state sharing is highly recommended. Such an infrastructure could help current applications scale better (e.g., via improved distributed web page caching) and significantly increase the rate at which complex distributed applications are deployed in the future. The working group's recommendations for what should be in this wide area sharing infrastructure are described below.
Motivating applications:
The group identified three major flavors of distributed applications that might benefit from a distributed state sharing infrastructure:
- Data servers: Many existing distributed services at their core are some form of distributed database (e.g., name servers, file servers, web servers, databases, directory servers, routers, …). To scale such services, you want to replicate its data to multiple peer servers that can each service part of the demand. To maintain the same quality of service (QoS) as the unreplicated server, you want to provide guarantees of some form about the quality of the data. The specific requirements (e.g., how consistent the replicated data is, the extent to which the service can tolerate node and network failures, etc.) tends to be very application specific. Among the primary needs from a distributed state management service for this type of application are user-controllable consistency management mechanisms, acceptable fault tolerance semantics, and security.
- Data collectors: This type of application is typified by a "data mining" tool that collects information from a wide variety of fairly static locations (e.g., data warehouses, sensors, user input, files, web pages, etc.) and performs some form of filtering or analysis on this heterogeneous data. For example, a stock market analysis tool that collects information for diverse sources to generate buy-sell signals or a command and control tool that collects inputs from a variety of remote sensors and information databases, and uses this information to control some real-time process. Among the primary needs from a distributed state management service for this type of application is the ability to integrate heterogeneous data formats, total system scalability (e.g., using some form of cooperative caching), and scalable location management mechanisms.
- Peer-to-peer applications: This flavor of application is typified by applications that cooperate with one another to perform some function. For example, distributed simulations, multi-player games, and collaborative environments all fall into this category. This type of application was deemed "interesting" in the sense that both the source and destinations of shared data will tend to move around over time, unlike the above systems where the data tends to have a primary "home." Among the primary needs from a distributed state management service for this type of application is scalable location management mechanisms, authentication, survivability, persistence, and some form of abstraction for data that would allow the data to evolve and different applications to be able to access it (e.g., objects).
In addition, it was observed that streamed data is becoming increasingly important as multimedia applications increase in popularity, not to mention the rapidly increasing number of "smart sensors" that are being put on networks. Streamed data does not tend to fit well with a conventional block or file oriented data service, so it should be kept in mind when designing any distributed state management service.
Two possible grand challenge problems were identified:
- Virtual (networked) democracy: The challenge is to build the appropriate infrastructure so that we can hold the 2004 national elections entirely electronically. Although it was pointed out that the number of transactions required to count 150 million votes in a day is not huge by modern standards, the security, fault tolerance, and "remote access" (by millions of interested net cruisers and analysts) needs make such a system a very tough problem.
- Data mining the Internet: Imagine having a declarative interface on most/all data available on the net, with the appropriate schema so that remote programs could easily search most data archives. If such an interface existed, the resulting load on the network would increase dramatically, further exacerbating the need to develop effective caching and replication policies.
Motivation:
- Load on the Internet is expanding exponentially. We need some way to improve the amount of caching and thereby network locality. A wide area caching infrastructure would attack this problem, while at the same time spreading some of the cost of replication to users, unlike static site replication (mirror sites).
- Many distributed services "roll their own" distributed state management mechanisms, because there is no underlying support mechanism that they can use. This leads to redundant development, ad hoc solutions, and limited ability to integrate heterogeneous applications and services. An integrated state sharing infrastructure would ease the development of complex distributed applications, as well as make it more easy for heterogeneous applications to be "federated."
Issues that came up in the discussion:
- What metadata do we need associated with blobs/objects? Size, reference count, location(s), security information, etc. For management purposes, it is desirable to export a way for external entities (tools, users) to view the "blobs", which lets you look at some subset of the interesting metadata.
- It is imperative that the infrastructure be flexible in how data is managed, because different applications have very different coherency and availability requirements. Therefore, there should be ways for clients to tune how the underlying system works. However, a library of common defaults that typical clients can use (e.g., strong consistency, stream consistency, leases, notifications with controlled latency) should be developed and supplied.
- There was no consensus as to whether any distributed state management system should be object-oriented (as proposed in Barbara Liskov's keynote address) or not. A number of advantages to requiring an object veneer include:
- Evolvability: It is easier to evolve the contents of shared state accessed by multiple applications if those applications always access the data via access methods associated with the data. If there is some reason that the format of the data needs to be changed, such as to add new features to some of the applications, legacy applications will be unaffected. Without associating access methods with data, all applications that use a particular piece of shared state will need to be evolved if the format of the data is changed.
- Data vs control migration: If the shared state is treated as a collection of objects, it is easier to dynamically choose between migrating the data to the computation or migrating the computation to the data (i.e., remote method invocation). Depending on the application, it is sometimes best to cache the data on the invoking client for local access, thereby offloading the "server," while other times it is best to invoke an operation on the "server," depending on the relative size of the data required and the computation required.
- Support for heterogeneity: Wide area distributed applications will tend to operate on heterogeneous architectures and store data in different formats far more often than conventional LAN-level distributed services. The extra level of indirection provided by objects, including type information describing the format of the shared state, makes it easier to transparently convert the data formats and/or method code when data is shared between heterogeneous clients and/or servers.
All members of the committee agreed that these properties of object oriented shared repositories are useful in certain circumstances. However, some members felt that they were not needed in all circumstances and would add unnecessary overhead and complexity to applications that did not need the extra flexibility.
- A simple way to think of consistency was discussed. There are producers and there are consumers, and it is the state management system's responsibility to ensure that consumers access acceptably coherent pieces of data generated by producers. After producers produce data, they set a flag indicating that a new version of the data is available (e.g., they perform a synch() operation or unlock a lock that is associated with the shared state). Similarly, consumers interact with the state management system by registering an interest in particular pieces of data (e.g., by performing a "read" operation - indicating a one-time interest in the data, or by setting up a "channel" whereby they specify that they are interested in all future changes). A simple API to accomplish connection between consumers and producers might be for producers to state "I’m about to modify something" and "I’m done modifying it," while clients indicate that "I need to access something" and "I’m done accessing it." Different consistency protocols could be built to support different desired semantics, e.g., strict consistency, leases, causal ordering, etc.
- This discussion led to a discussion of the need for notification services for many distributed applications (e.g., web proxy caches and distributed whiteboards). Such a "notification service" could clearly be hooked off of the registration of "done changing" and "I want to see it" operations. One issue that concerned some members of the committee was the need for predictable (at least probabilistically) propagation delays for notifications sent to potentially large numbers of clients. It was also accepted that some form of hierarchical notification scheme is required for scalability.
- Another point of discussion was the extent to which atomic transactions should play a role. In Barbara Liskov's keynote address, she strongly advocated that they be a fundamental part of the core sharing system. In general, this working group disagreed with that sentiment, feeling that while transactions clearly are important for many distributed applications, they are unnecessary and excessively complex and constraining (to the programmer) in many circumstances. Nevertheless, it was agreed that any system that is developed should be sufficiently flexible and provide a sufficiently rich API that atomic transactions could be supported either optionally or by a layered middleware component.
- In terms of manageability of the large shared state system, the working group discussed the need for identifying sets of related "objects" and managing them as a single entity (e.g., all objects associated with a particular project or site). There should be a convenient way to browse and manage these collections of related "objects". It was also observed that this might provide a convenient interface for things like security.
Leveraging/synergy:
Although the distributed state management group was not officially a "cross-cutting theme" group, there was a general feeling that it essentially was. In many ways, the cross-cutting group(s) discussing support for distributed applications covered many of the issues discussed above. As such, any work in this area could clearly benefit from collaboration across a large number of system disciplines, including:
- Security
- Networking community (resource management)
- Databases and filesystems
- Languages
- Group communication
- Distributed shared memory