2010

Open soft real-time systems, such as mobile robots, must respond adaptively to varying operating conditions, while balancing the need to perform multiple mission specific tasks against the requirement that those tasks complete in a timely manner. Setting and enforcing a utilization target for shared resources is a key mechanism for achieving this behavior. However, because of the uncertainty and non-preemptability of some tasks, key assumptions of classical scheduling approaches do not hold. In previous work we presented foundational methods for generating task scheduling policies to enforce proportional resource utilization for open soft real-time systems with these properties. However, these methods scale exponentially in the number of tasks, limiting their practical applicability.
In this paper, we present a novel parameterized scheduling policy that scales our technique to a much wider range of systems. These policies can represent geometric features of the scheduling policies produced by our earlier methods, but only require a number of parameters that is quadratic in the number of tasks. We provide empirical evidence that the best of these policies are competitive with exact solution methods in small problems, and significantly outperform heuristic methods in larger ones.
2009

Scheduling policies for open soft real-time systems must be able to balance the competing concerns of meeting their objectives under exceptional conditions while achieving good performance in the average case. Balancing these concerns requires modeling strategies that represent the range of possible task behaviors, and solution techniques that are capable of effectively managing uncertainty in order to discover scheduling policies that are effective across the range of system modes. We develop methods for solving a particular class of task scheduling problems in an open soft real-time setting involving repeating, non-preemptable tasks that contend for a single shared resource. We enforce timeliness by optimizing performance with respect to the proportional progress of tasks in the system.
We model this scheduling problem as an infinite-state Markov decision process, and provide guarantees regarding the existence of optimal solutions to this problem. We derive several methods for approximating optimal scheduling policies and provide theoretical justification and empirical evidence that these solutions are good approximations to the optimal solution. We consider cases in which task models are known, and adapt reinforcement learning methods to learn task models when they are not available.

Scheduling the execution of multiple concurrent tasks on shared resources such as CPUs and network links is essential to ensuring the reliable operation of many autonomic systems. Well known techniques such as rate-monotonic scheduling can offer rigorous timing and preemption guarantees, but only under assumptions (i.e., a fixed set of tasks with well-known execution times and invocation rates) that do not hold in many autonomic systems. New hierarchical scheduling techniques are better suited to enforce the more flexible execution constraints and enforcement mechanisms that are required for autonomic systems, but a rigorous and efficient foundation for verifying and enforcing concurrency and timing guarantees is still needed for these approaches. This paper summarizes our previous work on addressing these challenges, on Markov Decision Process based scheduling policy design and on wrapping repeated structure of the scheduling state spaces involved into a more efficient model, and presents a new algorithm called Expanding State Policy Iteration (ESPI), that allows us to compute the optimal policy for a wrapped state model.

Open soft real-time systems, such as mobile robots, experience unpredictable interactions with their environments and yet must respond both adaptively and with reasonable temporal predictability. Because of the uncertainty inherent in such interactions, many of the assumptions of the real-time scheduling techniques traditionally used to ensure predictable timing of system actions do not hold in those environments. In previous work we have developed novel techniques for scheduling policy design where up-front knowledge of execution time distributions can be used to produce both compact representations of resource utilization state spaces and efficient optimal scheduling policies over those state spaces.
This paper makes two main contributions beyond our previous work, to the state of the art in scheduling open soft real-time systems: (1) it shows how to relax the assumption that the entire distribution of execution times is known up front, to allow online learning of an execution time distribution during system run-time; and (2) it shows how to relax the assumption that the execution time of a system action can be characterized by a single distribution, to accommodate different execution time distributions for an action being taken in one of multiple modes. Each of these contributions allows a wider range of system actions to be scheduled adaptively and with temporal predictability, which increases the applicability of our approach to even more general classes of open soft real-time systems.
2008

Open soft real-time systems, such as mobile robots, experience unpredictable interactions with their environments and yet must respond both adaptively and with reasonable temporal predictability. New scheduling approaches are needed to address the demands of such systems, in which many of the assumptions made by traditional real-time scheduling theory do not hold. In previous work we established foundations for a scheduling policy design and verification approach for open soft real-time systems, that can use different decision models, e.g., a Markov Decision Process (MDP), to capture the nuances of their scheduling semantics.
However, several important refinements to the preliminary techniques developed in that work are needed to make the approach applicable in practice. This paper makes three main contributions to the state of the art in scheduling open soft real-time systems: (1) it defines a novel representation of the scheduling state space that is both more compact and more expressive than the model defined in our previous work; (2) it exploits regular structure of that representation to allow efficient verification of properties involving both discrete and continuous system state variables under specific scheduling policies; and (3) it removes the unnecessary use of a time horizon in our previous approach, thus allowing the more precise specification and enforcement of a wider range of scheduling policies for open soft real-time systems.
Scheduling the execution of multiple concurrent tasks on shared resources such as CPUs and network links is essential to ensuring the reliable operation of many autonomic systems. Well known techniques such as rate-monotonic scheduling can offer rigorous timing and preemption guarantees, but only under assumptions (i.e., a fixed set of tasks with well-known execution times and invocation rates) that do not hold in many autonomic systems. New hierarchical scheduling techniques are better suited to enforce the more flexible execution constraints and enforcement mechanisms that are required for autonomic systems, but a rigorous foundation for verifying and enforcing concurrency and timing guarantees is still needed for these approaches. The primary contributions of this paper are: (1) a scheduling policy design technique that can use different decision models across a wide range of systems models, and an example of how a specific (Markov Desicion Process) decision model can be applied to a basic multi-threaded system model; (2) generalized state space analysis techniques that can evaluate the behavior of the system model under the control of the resulting scheduling policy; and (3) an evaluation of the scheduling policy design and verification techniques for a simple but representative example of the kinds of execution scenarious that can arise in autonomic systems.
Scheduling the execution of multiple concurrent tasks on shared resources such as CPUs and network links is essential to ensuring the reliable and correct operation of real-time systems. For closed hard real-time systems in which task sets and the dependences among them are known a priori, existing realtime scheduling techniques can offer rigorous timing and preemption guarantees. However, for open soft-real-time systems in which task sets and dependences may vary or may not be known a priori and for which we would still like assurance of real-time behavior, new scheduling techniques are needed.
Our recent work has shown that modeling non-preemptive resource sharing between threads as a Markov Decision Process (MDP) produces (1) an analyzable utilization state space, and (2) a representation of a scheduling decision policy based on the MDP, even when task execution times are loosened from exact values to known distributions within which the execution times may vary. However, if dependences among tasks, or the distributions of their execution times are not known, then how to obtain the appropriate MDP remains an open problem.
In this paper, we posit that this problem can be addressed by applying focused reinforcement learning techniques. In doing so, our goal is to overcome a lack of knowledge about system tasks by observing their states (e.g., task resource utilizations) and their actions (e.g., which tasks are scheduled), and comparing the transitions among states under different actions to obtain models of system behavior through which to analyze and enforce desired system properties.
Earlier Publications

Reinforcement Learning (RL) has proven to be a useful set of techniques for planning under uncertainty in robot systems. Effective RL algorithms for this domain need to be able to deal with large, continuous state spaces, and must make efficient use of experience. In this paper, we present methods to better leverage observed experience by reusing experience across parts of the problem state space that are known to be similar. We present experimental results in a navigational, goal-based domain. We develop an approach to identifying portions of the world that appear similar based on observed transition samples.

Reinforcement Learning (RL) has been shown to be an effective paradigm for learning control policies for problems with discrete state spaces. For problems with continuous multi-dimensional state spaces, the results are less compelling. When these state spaces can be effectively discretized, traditional techniques can be applied. However, many interesting problems must be discretized into an infeasibly large number of states. In these cases, other techniques must be used.
Value-function approximation (VFA) addresses some of the problems of traditional RL algorithms, including that of dealing with continuous state spaces. Although it has been successful in a number of specific applications, it has been shown to fail in the general case, and has been known to fail even on simple problems.
We propose a novel approach to VFA based on ideas from topology. We identify a key failing of current techniques, and show how our approach avoids this problem by constructing an explicit model of the state space topology. We begin with a brief description of the problems of current VFA techniques. We then motivate our proposed approach, outline its mathematical foundations, and provide results from initial experiments.