Chapter 8
The Resource Manager

This chapter presents the design of the resource manager, a middleware application that adds value to the scheduling hierarchy in two ways: it permits guarantees to be reasoned about dynamically, using the guarantee system described in Chapter 5, and it provides a rule-based system that can enforce high-level policies about the allocation of processor time.

There is a synergy between the scheduling hierarchy and the resource manager--each has capabilities that the other does not. For example, the resource manager, on its own, cannot provide any particular kind of scheduling behavior at all. The scheduling hierarchy, on its own, cannot limit the aggregate CPU usage of a user across all schedulers.

8.1 Introduction

The resource manager is an organized collection of hooks and reflective information that user-supplied code can use to make resource allocation decisions. Its primary goal is, given a set of requirements supplied by applications and a set of rules supplied by users and system administrators, to generate a mapping of threads to schedulers in the scheduling hierarchy that violates no rules and satisfies as many application requirements as possible. It will not always be possible to meet all application requirements, because together they may overload the processor. In general, administrator requirements enforce fairness between users (or other resource principals), and user requirements tell the system which application requirements should be met when there is a conflict between them.

The following list of assumptions is designed to limit the scope of the resource manager to a tractable set of features.

8.2 Entities in the Resource Manager

The following entities exist within the resource manager:

All resource manager entities except events and rules are represented as passive data structures. These data structures are available for inspection (and sometimes modification) by rules, which are represented by executable code that is called by events.

8.3 Guarantee Acquisition

When an application requests a guarantee (or when a guarantee is requested on behalf of an application), the guarantee is used for several distinct purposes. First, the resource manager checks whether the request violates any system- or user-specified rules. Next, if no rules are violated, the resource manager interprets the guarantee in order to determine which schedulers in the hierarchy, if any, are capable of granting it. Finally, the guarantee is passed to individual schedulers, which perform schedulability analysis based on the values in the guarantee.

In response to a new request, the resource manager performs the following actions:

  1. It verifies that the request does not violate any processor allocation rules that the user has set up.
  2. It finds a set of candidate schedulers that (1) can provide a guarantee of the type being requested, and (2) hold a guarantee belonging to the system or to the resource principal making the request. Schedulers advertise what kinds of guarantees they can provide; the advertised guarantees are abstract templates that contain unbound numeric parameters. In general, when the scheduling hierarchy is designed well, there should be only one scheduler that can satisfy a given type of request for each user.
  3. It attempts to acquire the guarantee from each candidate scheduler in turn until one of them grants it.

If the set of candidate schedulers is empty, the request is rejected. The resource manager could, in principle, attempt to load a new scheduler into the hierarchy. We leave this option to future work, since (1) scheduling hierarchies should be simple, and can probably be constructed in skeletal form ahead of time for most situations, (2) experts requiring scheduling behavior not available from default hierarchies will be able to load schedulers manually, and (3) it is unlikely that a heuristic will be able to build a good scheduling hierarchy. So, we currently require the scheduling hierarchy to be pre-built or built manually.
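
To make these steps concrete, the following sketch shows one possible shape for the acquisition logic. The RM_STATUS type and the helpers EvaluateRules, FindCandidateSchedulers, TryAcquire, and MAX_CANDIDATES are illustrative assumptions, not part of the actual resource manager interface.

  // illustrative sketch of guarantee acquisition; the types and
  // helpers used here are assumptions, not the actual interface

  typedef enum { RM_OK, RM_REJECTED } RM_STATUS;

  RM_STATUS AcquireGuarantee (struct REQUEST_INFO *req_info)
  {
          // step 1: give every applicable rule a chance to reject
          if (EvaluateRules (req_info) == RULE_REJECT)
                  return RM_REJECTED;

          // step 2: find schedulers that advertise a matching guarantee
          // template and belong to the system or to this principal
          struct SCHEDULER *candidates[MAX_CANDIDATES];
          int n = FindCandidateSchedulers (req_info, candidates,
                                           MAX_CANDIDATES);
          if (n == 0)
                  return RM_REJECTED;

          // step 3: let each candidate run its own schedulability
          // analysis until one of them grants the guarantee
          for (int i = 0; i < n; i++) {
                  if (TryAcquire (candidates[i], req_info) == RM_OK)
                          return RM_OK;
          }
          return RM_REJECTED;
  }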

8.4 CPU Allocation Rules

8.4.1 Per-User and System Rules

Guarantees describe a contract between a scheduler and a virtual processor about the ongoing allocation of CPU time. Rules describe constraints on the allocation of CPU time that cannot be expressed easily (or at all) using guarantees.

When a guarantee is requested, either by an application or by a scheduler (which may or may not already hold a guarantee), it must be shown that granting the guarantee would not violate any rules. Each rule either rejects a request for a guarantee or allows it to proceed. If no applicable rule rejects a request, the request is passed to a scheduler in the hierarchy, which performs a schedulability analysis to determine whether the guarantee is actually feasible. Rules may have side effects, such as signaling an event (possibly invoking other rules) or causing the revocation or renegotiation of a previously granted guarantee in order to grant a guarantee to a more important application.

System-level rules apply to all requests for guarantees and per-user rules apply only to the particular principal that instantiated the rule. Typically, system rules will enforce overall fairness between resource principals and enforce system-wide policies (such as mandating non-NULL guarantees for time-sharing schedulers). Per-user rules will determine which applications (or kinds of applications) are chosen to run during overload, and also perform special actions such as creating a new scheduler in order to isolate the CPU usage of an untrusted application.

8.4.2 Rule Implementation

Rules are written in C or C++ and run in the address space of the resource manager. They therefore cannot be written by untrusted users, who could exploit the lack of type safety in these languages to subvert the resource manager. Rather, we envision a number of predefined, parameterizable rules being made available to end users. Although it would be beneficial to have a safe, domain-specific language for writing rules, the design of such a language is beyond the scope of this work.

Rules are made available to the resource manager through dynamic link libraries (DLLs). The resource manager, then, simply provides numerous hooks for user- and administrator-supplied routines to tell it how to allocate CPU time. Rules have access to reflective information about the current set of pending and granted requests, the current set of resource principals, and the scheduling hierarchy.
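
For concreteness, the interface seen by rule authors might look roughly like the following declarations. These are an illustrative reconstruction based on the pseudocode in Figures 8.1 and 8.2; the exact names, types, and structure fields are assumptions.

  // illustrative rule interface, reconstructed from the examples in
  // Section 8.4.4; names and fields are assumptions

  typedef int EVENT;                   // event identifier, e.g. NEW_REQUEST
  typedef enum { RULE_PASS, RULE_REJECT } RULE_STATUS;

  struct USER;                         // a resource principal

  struct GUARANTEE {
          int Value;                   // user-assigned importance
          // ... guarantee type and numeric parameters ...
  };

  struct REQUEST_INFO {
          struct USER      *User;      // principal making the request
          struct GUARANTEE *Guarantee; // requested guarantee template
          int               Value;     // importance of the request
  };

  // every rule is a function with this signature, exported from a DLL
  typedef RULE_STATUS (*RULE_FN) (struct REQUEST_INFO *req_info);

  // reflective queries and actions available to rules
  int               req_pct            (struct REQUEST_INFO *req_info);
  int               total_user_pct     (struct USER *user);
  int               num_users          (void);
  struct GUARANTEE *FindLeastImportant (struct USER *user);
  void              EndGuarantee       (struct GUARANTEE *g);

  // associate a rule with the event that triggers it
  void LoadNewRule (RULE_FN rule, EVENT event);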

Users need not select any special rules to implement the default CPU allocation policies of admission control (which is provided by the schedulability analysis routines in reservation-based schedulers) or best effort (which is provided by non-real-time schedulers that simply accept all requests for service).

8.4.3 Rule Evaluation Order

If the signaling of an event triggers more than one rule, the rules are evaluated in an arbitrary order. It is therefore necessary for rule authors to ensure that the outcome of rule evaluation is not order-dependent. If a group of rules needs to be executed in a particular order, this can be accomplished by having the first rule be triggered by a built-in event such as NEW_REQUEST, and then having that rule raise a custom event that triggers the next rule in the sequence, as the sketch below illustrates.
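
In the following sketch of this chaining technique, RaiseEvent and the SECOND_STAGE_CHECK event are hypothetical names used for illustration.

  // hypothetical pair of rules ordered via a custom event; RaiseEvent
  // and SECOND_STAGE_CHECK are assumed names

  RULE_STATUS Rule_FirstStage (struct REQUEST_INFO *req_info)
  {
          // ... checks that must happen first ...

          // explicitly trigger the second rule once this one has run
          RaiseEvent (SECOND_STAGE_CHECK, req_info);
          return RULE_PASS;
  }

  RULE_STATUS Rule_SecondStage (struct REQUEST_INFO *req_info)
  {
          // ... checks that must happen second ...
          return RULE_PASS;
  }

  LoadNewRule (Rule_FirstStage, NEW_REQUEST);
  LoadNewRule (Rule_SecondStage, SECOND_STAGE_CHECK);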

8.4.4 Example Rules

8.4.4.1 Running the Most Important Applications

The pseudocode in Figure 8.1 illustrates the implementation of a per-user resource allocation policy that attempts to admit, from the set of applications that have requested service, the subset that the user has deemed most important. It makes use of several library functions, such as req_pct, which determines the percentage of the CPU that the new request is asking for, and EndGuarantee, which revokes a previously granted guarantee.


  
  RULE_STATUS Rule_AdmitMostImportant (struct REQUEST_INFO *req_info)
  {
          // keep terminating the guarantees of less important victim
          // applications until the new request can be admitted or no
          // more victims can be found

          while ((req_pct (req_info) + total_user_pct (req_info->User)) >
                 100) {
                  struct GUARANTEE *min =
                    FindLeastImportant (req_info->User);
                  if (min && min->Value < req_info->Value) {
                          EndGuarantee (min);
                  } else {
                          return RULE_REJECT;
                  }
          }

          return RULE_PASS;
  }

Figure 8.1: Pseudocode for a rule admitting the subset of applications that have the highest importance

This rule would be loaded into the system using a command such as this one, which stipulates that the rule is to be triggered by the custom NEW_REALTIME_REQUEST event:

  LoadNewRule (Rule_AdmitMostImportant, NEW_REALTIME_REQUEST);

Instead of ending reservations as it runs across them, a real implementation of this rule would need to find a set of reservations to revoke that provides enough extra capacity to grant the new reservation before actually revoking any of them. This would prevent the bad outcome of the rule revoking some reservations without being able to begin the new one. The rule also implicitly assumes that there is a way to convert each guarantee into a percentage, and that CPU addition is linear--that, for example, the total effective utilization of two reservations with utilization 40% is 80%. Utilizations cannot be added linearly for all schedulers. For example, a reservation scheduler based on rate-monotonic scheduling could grant two reservations each having utilization 45% when their periods are harmonic, but not for some other ratios between the periods of the reservations. A more sophisticated version of this rule could take such non-linear CPU addition into account. Finally, in practice, this rule would need to be coupled with a rule that, each time an application cancels a guarantee, attempts to give a guarantee to the most important feasible application that has a pending request.
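
Returning to the first point, a two-phase variant of the rule might look like the following sketch; FindNextLeastImportant, guarantee_pct, and MAX_VICTIMS are assumed helpers that do not appear in the actual rule library.

  // two-phase variant: collect a sufficient victim set first, then
  // revoke; FindNextLeastImportant, guarantee_pct, and MAX_VICTIMS
  // are assumptions

  RULE_STATUS Rule_AdmitMostImportant2 (struct REQUEST_INFO *req_info)
  {
          struct GUARANTEE *victims[MAX_VICTIMS];
          int nvictims = 0;
          int freed_pct = 0;

          // phase 1: find victims until the request fits, revoking nothing
          while ((req_pct (req_info) + total_user_pct (req_info->User) -
                  freed_pct) > 100) {
                  struct GUARANTEE *min =
                    FindNextLeastImportant (req_info->User, nvictims);
                  if (min && min->Value < req_info->Value &&
                      nvictims < MAX_VICTIMS) {
                          victims[nvictims++] = min;
                          freed_pct += guarantee_pct (min);
                  } else {
                          // not enough revocable capacity: reject the
                          // request and leave existing guarantees intact
                          return RULE_REJECT;
                  }
          }

          // phase 2: the request is known to fit, so revoke the victims
          for (int i = 0; i < nvictims; i++)
                  EndGuarantee (victims[i]);

          return RULE_PASS;
  }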

8.4.4.2 Enforcing Fairness Between Users

The pseudocode in Figure 8.2 implements a rule enforcing a global policy that would be useful on multi-user machines: it limits the CPU allocation of each user to his or her fair share of the total available CPU bandwidth.


  
  RULE_STATUS Rule_LimitUserPct (struct REQUEST_INFO *req_info)
  {
          if ((req_pct (req_info) + total_user_pct (req_info->User)) >
              (100 / num_users ())) {
                  return RULE_REJECT;
          } else {
                  return RULE_PASS;
          }
  }

Figure 8.2: Pseudocode for a rule implementing fairness between users

In practice, this rule would need to be supplemented with another rule that revokes resources reserved by users whose reserved allocation exceeds their new maximum number of shares when a new user enters the system. This could be implemented by creating a custom RECALCULATE_SHARES event that either performs the revocation or notifies the user to reduce her resource allocation. Another refinement would be to penalize users for running threads that have very short periods--these threads pollute the cache for other users, effectively reducing the performance of their applications.
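
One possible shape for such a rule is sketched below; FirstUser and NextUser are assumed iteration helpers, and the association of this handler with a RECALCULATE_SHARES event follows the LoadNewRule convention shown earlier.

  // hypothetical handler for the custom RECALCULATE_SHARES event,
  // raised when a user enters or leaves the system

  RULE_STATUS Rule_RecalculateShares (struct REQUEST_INFO *req_info)
  {
          int fair_share = 100 / num_users ();
          struct USER *user;

          for (user = FirstUser (); user; user = NextUser (user)) {
                  // revoke this user's least important guarantees until
                  // the new fair share is respected; an alternative is to
                  // notify the user and let her choose what to give up
                  while (total_user_pct (user) > fair_share) {
                          struct GUARANTEE *min = FindLeastImportant (user);
                          if (!min)
                                  break;
                          EndGuarantee (min);
                  }
          }
          return RULE_PASS;
  }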

8.4.4.3 Other Rules

The following list describes other rules that would be useful to implement in the resource manager:

8.5 Supporting Unmodified Applications

8.5.1 Determining Application Scheduling Parameters

This section proposes a method for determining the scheduling parameters of multi-threaded multimedia applications. This calibration procedure must be run on each hardware platform each time a new real-time application is installed, although vendor-installed real-time software could be pre-calibrated. It will work under the following set of assumptions:

The calibration procedure operates as follows. On an idle machine (i.e., with no other applications running and with services and daemons disabled), the application is run for a long enough period of time that short-term variations in CPU usage are discovered. Instrumentation built into the hierarchical scheduling infrastructure discovers the period and the amount of CPU time used by each thread (threads in a real-time application typically use a timer to schedule themselves with a given period). Threads are identified by their start address, which can be discovered either by adding a hook to the kernel routine that creates threads or by having the calibration software attach itself to the real-time application using debugging hooks.

The resource manager keeps a checksum of the application executable file to determine when it changes, requiring recalibration. In addition, the resource manager should monitor the status of the operating system, shared libraries, device drivers, and system hardware--a change in any of these can potentially affect the CPU utilization of real-time applications. Fortunately, most end users seldom upgrade any of these system components.

Once the CPU usage of all threads has been identified, an amount of CPU time to guarantee future instantiations of the application can be determined. A conservative estimate would take the maximum amount of CPU time that the application used during any time period and then overestimate by a small factor. The degree of conservatism for each application should reflect the application’s importance and how gracefully its performance degrades when receiving less CPU time than it requires. For example, an application that plays music that has small, deterministic CPU requirements could be assigned a generous reservation, allowing it to miss essentially no deadlines.
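
As a rough illustration, converting calibration data into a reservation budget might look like the following; the CALIBRATION structure and the margin parameter are assumptions, and an appropriate margin would be chosen per application.

  // illustrative conversion from calibration data to a reservation
  // budget; the structure and the margin parameter are assumptions

  struct CALIBRATION {
          int period_us;       // observed thread period
          int max_usage_us;    // maximum CPU time used in any period
  };

  // a conservative budget: the worst observed usage plus a safety
  // margin reflecting the application's importance and how gracefully
  // it degrades when it receives less CPU time than it requires
  int ReservationBudgetUs (const struct CALIBRATION *c, int margin_pct)
  {
          return c->max_usage_us + (c->max_usage_us * margin_pct) / 100;
  }

A deterministic music player might warrant only a small margin, while an application with more variable CPU usage would warrant a larger one.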

Applications that do not meet one of the requirements for this method will have to be scheduled in a different way. A progress-based scheduler or a traditional static-priority scheduler can be used to schedule these applications, although without any hard guarantees. Applications whose source code is available can be modified to self-calibrate using calibration library routines, or simply to operate in a manner that is more conducive to calibration by the external tool.

8.5.2 Applying Scheduling Parameters to Applications

The resource manager should maintain a persistent store of information about real-time applications. This information can be supplied by users, by system vendors, or by the calibration procedure described in the previous section. It may be tuned by sophisticated users from time to time to reflect changing preferences about the degree of pessimism different applications require.

When an application begins execution, it can be identified either by name or (more securely and accurately) by a strong checksum of its executable file. Once an application is identified as being one that should receive real-time scheduling, the resource manager can internally request a guarantee for the application.
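
The lookup-and-request path could be as simple as the following sketch; the PROCESS and APP_RECORD structures, Checksum, LookupAppRecord, and RequestGuarantee are all assumptions used for illustration.

  // hypothetical path from application startup to an internally
  // generated guarantee request; all names here are assumptions

  void OnApplicationStart (struct PROCESS *proc)
  {
          // identify the application by a strong checksum of its
          // executable file, which is harder to spoof than its name
          CHECKSUM sum = Checksum (proc->ExecutablePath);

          struct APP_RECORD *rec = LookupAppRecord (sum);
          if (rec && rec->NeedsRealTime) {
                  // request a guarantee on the application's behalf,
                  // using stored (possibly calibrated) parameters
                  RequestGuarantee (proc, rec->GuaranteeTemplate);
          }
          // otherwise the thread stays with the default best-effort
          // scheduler
  }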

8.6 Implementation

The resource manager is invoked infrequently compared with the rate of low-level scheduling decisions--typically only when an application requests real-time service, has a change of requirements, or exits. So, the resource manager can be implemented as a user-level process, with requests being performed through remote procedure calls generated by a library. It interacts with the scheduling hierarchy in the kernel through the HLSCtl interface, to which it has exclusive access. Since the scheduler infrastructure assigns new threads to a default best-effort scheduler, only applications that request other types of scheduling behavior need to interact with the resource manager.

8.7 Doing Without the Resource Manager

The resource manager is designed to provide a flexible level of indirection between requests for scheduling and the scheduling hierarchy. It also allows high-level rules about processor allocation to be enforced.

In a system without a resource manager, the only supported resource management policies are those implemented by the schedulers themselves: for example, admission control in reservation-based schedulers, and best effort in time-sharing and proportional share schedulers that do not perform reservations. Thus, the burden of running a mix of applications that provides high value falls on the user.

Without the resource manager, special-purpose code libraries and scripts can be written to help manage the scheduling hierarchy. These programs are particularly easy to write if they can make assumptions about the structure of the scheduling hierarchy rather than attempting to dynamically find a scheduler that can meet an arbitrary request. For example, the hlsutil command that was implemented for the HLS prototype allows a CPU reservation to be started or ended for an arbitrary thread. It does this by passing the name of the thread and the parameters of its potential reservation to the scheduler infrastructure using the HLSCtl interface; the infrastructure then locates the reservation scheduler by name and requests real-time scheduling for the thread. hlsutil can also manipulate the scheduling hierarchy in a number of other ways: for example, it can create and delete a scheduler, change the default scheduler, move a thread between schedulers, move all threads from the default scheduler to a different scheduler, and change the clock interrupt frequency.

8.8 Conclusions

This chapter has presented the design of the HLS resource manager, a middleware application that provides a level of indirection between applications and the scheduling hierarchy, allowing high-level policies about processor allocation to be enforced. The resource manager can add value, but the utility of the scheduling hierarchy is not contingent on having a resource manager.