TIK-76.275
Seppo Kuikka 2.12.1997
Abstract

This paper discusses planning and control. Planning is seen as an integral part of deliberative intelligent control. The main issues are in what way to plan for real-time control applications and how to integrate planning and control during execution. Various planners and planning algorithms are introduced, with notes on their applicability for control purposes.
Table of Contents
1 Introduction
2 Planning for real-time execution
3 (Re)planning while executing
4 Discussion
5 References
1 Introduction

The success of dynamic, domain-specific AI planning systems requires their tight integration into the environment in which they operate. They have to be able to address the needs of complex application issues in the domain. In this paper the real-time control domain is discussed.
Planning and real-time control try to solve the same problem
in this context, i.e. how to choose actions to influence a
physical process. The levels of abstraction and the time scales
for these activities normally differ, planning having a more abstract
and longer scope than control. It is, however, beneficial to consider
the two activities together in order to achieve the best overall
results.
Planning systems have to be able to access databases and other functional parts of the control application in question. Their user interface must also be compatible with that of the whole application, and expressive and understandable in its own right. Even more important is that the AI planning functionality must be well integrated with the execution of the plans in real time.
Most traditional AI planning languages - for example that of the seminal planner STRIPS, /FN93/ - lack characteristics that are needed for realistic application domains: hierarchical plans, complex conditions, temporal issues and resource limitations. STRIPS-like languages can, however, be extended to incorporate these issues, which is most important from the applicability point of view.
Plans are usually generated and evaluated off-line using
a world model of the domain specific environment for which the
plans are made. In dynamic domains the planner cannot, however, be certain of the accuracy of its model or that all the potential events from the environment (contingencies arising during execution) are included in it. Thus it needs execution monitoring.
When execution monitoring notices problems in the plan execution, it will be necessary to stop the sequence of actions specified in the plan. Subsequently, a new plan has to be formulated or - more often - the steps in the current plan have to be modified; that is, replanning is needed. By relying in this way less on the world model of the planner and more on the feedback from the sensors in the environment - which is normal practice in control systems - a lot of work may be saved from the planner.
In this paper planning is seen as an integral part of deliberative
intelligent control. The main issues are in what way
to plan in order to fulfil the requirements of real-time control
applications and how to integrate planning and (reactive)
feedback control in order to best achieve the overall control
goals.
Purely reactive execution architectures - most notably the subsumption architecture by Brooks, /BRO91/ - that do not contain AI planning functionality are not discussed, although their influence can often be seen in the architectural implementations described. Integrating planning and the execution of plans in control applications is considered both in terms of planning for real-time execution and (re)planning while executing the plan.
A starting point for the first approach, discussed in chapter two, is the separation of time-consuming and complex planning functionality from the execution of the plans into a separate subsystem or activity. The AI planning subsystem then supplies the execution subsystem with plans appropriate for fulfilling real-time requirements. The level of abstraction and completeness of the plans may vary and, accordingly, so does how early they will be ready for execution.
The premise for the second approach, discussed in chapter three, is that planning and execution of the plans may be integrated more tightly into the same real-time control system. This is normally possible only by relying on the fact that what is needed during plan execution is not planning from first principles but modification or adaptation of existing plans - planning from second principles.
The approaches above are not as clear-cut as it might first seem. Most importantly, the original plans in the replanning systems of chapter three are often made in such a way that real-time requirements are considered, and thus the ideas of planning for real time (of chapter two) are included, too. The time used in planning may also be seen as a parameter that can be varied according to the needs of reactivity, as exemplified in so-called anytime algorithms. In some cases one might also consider the required reaction time as a parameter, which could be somewhat relaxed according to the needs of deliberation.

In chapter four of this paper, some conclusions are drawn on the approaches described, and their applicability to various control domains is discussed.
2 Planning for real-time execution

The premise of the planning approach in this chapter is that the planned actions may have unexpected effects on the environment, and that the environment may also change otherwise - for example due to the actions of other agents. It is assumed, however, that the possible effects can be enumerated and described a priori as world state transitions. If that is the case, we may speak of bounded indeterminacy.
Conditional planning or contingency planning deals
with incomplete information about the environment and its dynamic
contingencies, by including sensing actions in the plan
to test the relevant conditions. Conditional planning is thus
a general approach for constructing plans for partially unknown
environments, not only real-time ones. It is, however, included here, because control applications are by nature such that quite a lot of uncertainty about the environment exists.
The basic algorithm to deal with conditions is given in figure
1, /RN95/, p. 395.
function CONDITIONAL-PLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, a plan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  if p = NoPlan then p <- CPOP(current, G, KB)
  if p = NoPlan or p is empty then action <- NoOp
  else
    action <- FIRST(p)
    while CONDITIONAL?(action) do
      if ASK(KB, CONDITION-PART(action))
        then p <- APPEND(THEN-PART(action), REST(p))
        else p <- APPEND(ELSE-PART(action), REST(p))
      action <- FIRST(p)
    end
    p <- REST(p)
  TELL(KB, MAKE-ACTION-SENTENCE(action, t))
  t <- t + 1
  return action
Figure 1. A conditional planning agent
The conditional planning algorithm, CPOP, included in the agent,
is based on considering the context (union of the conditions
that must hold in order for the step to be executed) of each
step in constructing a plan. The context will be inherited from
one step to another in the plan. By inserting conditional links
and conditional steps appropriately for sensing, the
CPOP algorithm constructs a plan having all possible final states
and all paths to these states.
To make sure that a conditional plan constructed as above is executable, the planning agent must sometimes also insert ordinary actions, in addition to the sensing actions described above. These actions serve the purpose of obtaining the information needed to test the conditions. The steps may additionally be conditioned in such a way that their contexts become incompatible with the contexts of those steps whose preconditions they are threatening.
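The execution of such a conditional plan can be illustrated with a small Python sketch (my own illustration, not code from /RN95/; the plan encoding, the `sense` function and the action names are assumptions):

```python
# A minimal sketch of executing a conditional plan. A plan is a list whose
# items are either plain action names (strings) or conditional steps of the
# form ("if", condition, then_branch, else_branch). `sense` is a hypothetical
# sensing function that tests a condition at run time; `act` executes actions.

def execute(plan, sense, act):
    """Walk the plan, splicing in the branch selected by sensing."""
    queue = list(plan)
    while queue:
        step = queue.pop(0)
        if isinstance(step, tuple) and step[0] == "if":
            _, cond, then_branch, else_branch = step
            # Mirrors APPEND(THEN-PART(action), REST(p)) in figure 1.
            queue = list(then_branch if sense(cond) else else_branch) + queue
        else:
            act(step)

# Assumed example: branch on whether a valve is already open.
log = []
plan = ["check-valve",
        ("if", "valve-open", ["close-valve"], ["open-valve", "close-valve"]),
        "report"]
execute(plan, sense=lambda c: False, act=log.append)
# log is now ['check-valve', 'open-valve', 'close-valve', 'report']
```

Because the branch is chosen only when the conditional step is reached, the sensing action ("check-valve" here) naturally precedes the test, as the text above requires.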
One variation of conditional plans is parameterized plans,
where the exact actions to be carried out are not known until
the plan is executed, /RN95/, pp. 398-400. The decisions
of conditions are made according to values of runtime variables
which are retrieved from the knowledge base during execution of
the plan.
Very often, it is also necessary in control applications to take
various kinds of temporal criteria into consideration when
planning, /DW91/, pp. 191-198. There are absolute deadlines
(the production batch must be finished before 10.00 am) for the
plan, there may also be graded time preferences (the longer
it takes, the more it costs) and relative deadlines (reaction
must be finished before the vessel can be emptied). One way to
include time in plans is to use temporal logic when reasoning
about the truth values of the conditions at various points of
time, /DW91/, pp. 57-70.
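An absolute deadline of the kind mentioned above can be checked against worst-case execution time estimates, as in the following Python sketch (an assumed example, not taken from /DW91/; the step names and durations are hypothetical):

```python
# Check whether a plan meets an absolute deadline, given worst-case
# execution time (WCET) estimates for its steps.

def meets_deadline(steps, wcet, start, deadline):
    """Return True if the summed worst-case durations fit before the deadline."""
    finish = start + sum(wcet[s] for s in steps)
    return finish <= deadline

# Assumed batch example: durations in seconds.
wcet = {"fill": 30.0, "heat": 120.0, "empty": 15.0}
meets_deadline(["fill", "heat", "empty"], wcet, start=0.0, deadline=180.0)
# -> True, since the worst case (165 s) finishes before the deadline
```

Graded time preferences would replace the boolean test with a cost function of the finish time; relative deadlines constrain pairs of steps rather than the whole plan.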
An important approach to planning for real time is the Cooperative Intelligent Real-time Control Architecture (CIRCA), /MDS95/. In CIRCA, AI planning functionality - upon which real-time requirements are not imposed - runs in parallel with a real-time control subsystem. The planning subsystem (AIS) and the real-time subsystem (RTS), however, collaborate with each other, the AIS acting as "a real-time system designer" for the RTS, figure 2.
Figure 2. The Cooperative Intelligent Real-time Control Architecture
(CIRCA)
The system separates real-time control-level goals from
non-real-time task-level goals of the plan. Performance
guarantees needed for real-time goals are achieved by using worst
case execution time estimates for (condition) tests and actions
in the plans. Task level goals are achieved on a best effort basis.
Plans in CIRCA are in effect cyclic schedules for test-action
pairs (TAPs). Planning relies heavily on the use of a world
model, the constituents of which are domain independent, but
content is specific to the application domain in question:
(S, F, TE, TA, TT), where:

S = {S1, S2, ..., Sm}, a finite set of states
F, a failure state
TE = {TE1, ..., TEn}, a finite set of event transitions (from the environment)
TA = {TA1, ..., TAp}, a finite set of action transitions (planned)
TT = {TT1, ..., TTq}, a finite set of temporal transitions (from the passage of time)

Each transition Ti belongs to TE U TA U TT, Ti : S -> S. D : T -> S and R : T -> S are the domain and range of a transition, respectively.
At every instance of time the world will be in one state. AIS
planner makes plans in such a manner that no failure states
will be reachable and thus the plans may safely be executed
by the real-time subsystem RTS.
The construction of the world model and the planning of control actions are dynamically interleaved. Backtracking is made possible by a choice-stack state machine. Inputs for planning are transition descriptions, initial world model state descriptions and goal descriptions. CIRCA planning terminates when no failure state is reachable, all control-level goals are achieved and task-level goals are at least reachable.
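The safety requirement - no failure state reachable - amounts to a reachability test over the transitions of the world model. The following Python sketch illustrates the idea (a hypothetical encoding of my own; CIRCA's actual planner interleaves this test with plan construction and handles timing, which is omitted here):

```python
# Safety test sketch: from the initial states, can any chain of transitions
# reach the failure state? `transitions` maps each state to the set of states
# it can move to (event, temporal and planned action transitions combined).

def failure_reachable(initial, transitions, failure):
    seen, frontier = set(initial), list(initial)
    while frontier:
        s = frontier.pop()
        for nxt in transitions.get(s, ()):
            if nxt == failure:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

# Assumed toy model: "hot" drifts to failure unless a planned action cools it.
unsafe = {"idle": {"hot"}, "hot": {"FAILURE"}}
safe   = {"idle": {"hot"}, "hot": {"idle"}}   # the planned action preempts the drift
failure_reachable({"idle"}, unsafe, "FAILURE")   # -> True
failure_reachable({"idle"}, safe, "FAILURE")     # -> False
```

In CIRCA proper, an action transition "preempts" a temporal transition only if its worst-case response time is short enough, which is where the WCET estimates mentioned above enter.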
The issue needing most development in CIRCA has been that during plan generation all paths that the world might take have to be enumerated in detail, which makes the planning very time-consuming. Dynamic Abstraction Planning (DAP), described next, will probably alleviate this problem.
Dynamic abstraction, /GMKB97/, is used to omit a large amount of detail from the world model state representation of CIRCA during planning, reducing both the state space to be explored and the resulting plan. In order to omit detail from the states, the states are represented with a set of features or characteristics:

C = {C1, C2, ..., Cr}
Each feature Ci has a finite set of possible values
val(Ci). The selection of which features to omit is
performed by the planner locally (different parts of the
state space are abstracted in different degrees) and in a way
that the safety of the system is preserved. By leaving out certain
features in reasoning, the planner can thus avoid considering
each individual base-level state.
It may, however, turn out during planning that the abstracted
state does not specify values for all necessary features in the
action's preconditions. Then the planner must be able to increase
the precision by (re)including one or more of the omitted features.
This is called splitting or refinement of the plan.
DAP refines the world model by taking an existing state and splitting
it into a number of more specific states, one for each possible
value of a particular feature Ci.
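Splitting can be sketched concretely in Python (an illustrative encoding of my own, not code from /GMKB97/; the feature names and values are assumptions). An abstract state gives values only for the features it constrains:

```python
# DAP-style splitting: refine an abstract state on one feature, producing
# one more specific state per possible value val(Ci) of that feature.

def split_state(state, feature, values):
    """Return the list of refinements of `state` on `feature`."""
    assert feature not in state, "feature already specified in this state"
    return [{**state, feature: v} for v in values]

abstract = {"alarm": "off"}            # temperature left abstract
refined = split_state(abstract, "temperature", ["low", "ok", "high"])
# -> [{'alarm': 'off', 'temperature': 'low'},
#     {'alarm': 'off', 'temperature': 'ok'},
#     {'alarm': 'off', 'temperature': 'high'}]
```

The planner applies such a split only locally, to the state where an action's preconditions could not be decided, leaving the rest of the state space abstract.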
The DAP planning may be described as a nondeterministic algorithm,
figure 3.
DAP-plan(isd)                ; isd is the initial state description
  N <- {}                    ; the graph
  openlist <- {}
  is <- MAKE-INITIAL-STATE(isd)
  N <- N u {is}
  push(is, openlist)
  while true do
    if no more reachable states in openlist then break
    else
      s <- choose a reachable state from openlist
      openlist <- openlist - {s}
      oneof
        split-state:
          choose a proposition p and split s into |val(p)| states
          remove s from N and insert the new states
          add the new states to openlist
        assign-action:
          choose an action or NoOp that is applicable for s
        fail
  end
Figure 3. The DAP planning algorithm
The algorithm fails if the nondeterministic choose operator above selects a state for which there is no applicable action and no proposition on which to split. An action is applicable in the assign-action part of the algorithm if the state necessarily satisfies its preconditions and if the action preempts all transitions to failure from the given state. When the algorithm fails in a given state, it backtracks (by using a state stack). If the algorithm fails on the initial state, the search as a whole has failed.
The DAP algorithm above is an example of an anytime algorithm, /ZIL95/. The quality of the results of anytime algorithms improves gradually as computation time increases; they thus offer a trade-off between resource consumption and output quality. Through backtracking, the DAP algorithm generates plans that satisfy ever more of the task-level goals, after first guaranteeing the real-time safety of the plan, which is a strict domain-specific requirement.
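The anytime contract can be shown with a generic Python loop (an assumed illustration of the notion in /ZIL95/, not code from that paper): the algorithm always holds a usable answer and simply returns the best one when the time budget expires.

```python
import time

# Generic anytime loop: keep improving the current best solution until the
# wall-clock budget runs out; a valid answer is available at any point.

def anytime_improve(initial, improve, budget_s):
    """improve(x) returns a better solution; budget_s bounds elapsed time."""
    best = initial
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        best = improve(best)
    return best

# Assumed example: successively halve the distance to a target value of 1.0;
# the result's quality (closeness to 1.0) grows with the budget.
result = anytime_improve(0.0, lambda x: x + (1.0 - x) / 2, budget_s=0.01)
```

DAP fits this scheme with "improve" meaning "satisfy one more task-level goal", which is why interrupting it early still yields a safe, if less capable, plan.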
The prototype DAP planner, described above, has been implemented
on some of the example domains of CIRCA. As described, the DAP
planner uses the same principles of guaranteeing control goals
and trying to achieve task goals as CIRCA. It does not, however,
yet take into account the temporal model provided by CIRCA. With the help of DAP, the problematically large search space of CIRCA has been reduced by nearly an order of magnitude, according to preliminary results.
3 (Re)planning while executing

For complicated and/or dynamic domains - like real-time control applications - the nature and number of contingencies in planning becomes quite large, as was described in chapter two. It is often so large that the resulting set of possible world states becomes impossible to reasonably anticipate or enumerate. In this chapter we consider this kind of unbounded indeterminacy. We have to be able to react to unanticipated changes in the environment and replan the respective plans accordingly during execution.
Replanning may be implemented in such a way that a planning agent keeps track of the remaining plan segment p and the complete plan q, figure 4, /RN95/, pp. 402-403.
function REPLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, an annotated plan, initially NoPlan
          q, an annotated plan, initially NoPlan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  if p = NoPlan then
    p <- PLANNER(current, G, KB)
    q <- p
  if p = NoPlan or p is empty then return NoOp
  if PRECONDITIONS(p) not currently true in KB then
    p' <- CHOOSE-BEST-CONTINUATION(current, q)
    p <- APPEND(PLANNER(current, PRECONDITIONS(p'), KB), p')
    q <- p
  action <- FIRST(p)
  p <- REST(p)
  t <- t + 1
  return action
Figure 4. Execution monitoring and replanning
Before executing the first action of p, the agent checks its preconditions, and if they are not met it chooses some point in the complete plan q such that the plan p' from that point to the end of q is easiest to achieve from the current state. The new, replanned plan is then first to achieve the preconditions of p' and then to execute it.
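The replanning step above can be condensed into an executable Python sketch (my own rendering, not RN95's code; `planner`, `precond_ok` and `cost` are hypothetical stand-ins for PLANNER, the precondition test and CHOOSE-BEST-CONTINUATION's ranking):

```python
# If the next action's precondition fails in the current state, pick the
# suffix p' of the complete plan q that is cheapest to reach, and prepend a
# repair plan that re-establishes its preconditions.

def replan(current, p, q, planner, precond_ok, cost):
    """Return the plan to execute next from the current state."""
    if precond_ok(current, p):
        return p                           # plan still valid, keep going
    suffixes = [q[i:] for i in range(len(q))]
    p_prime = min(suffixes, key=lambda s: cost(current, s))
    return planner(current, p_prime) + p_prime

# Assumed example: the agent dropped the object, so "move" cannot start;
# the cheapest continuation backs up to "pick", which needs no repair steps.
q = ["pick", "move", "place"]
repaired = replan(
    current={"holding": False},
    p=["move", "place"], q=q,
    planner=lambda cur, goal_plan: [],
    precond_ok=lambda cur, plan: cur["holding"],
    cost=lambda cur, suffix: 0 if suffix and suffix[0] == "pick" else 1,
)
# repaired restarts from "pick": ['pick', 'move', 'place']
```

The quality of the cost estimate determines whether the agent repairs locally or effectively restarts the plan from an earlier point.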
It is to be noted that the algorithm above introduces only the principle of replanning. It does not consider the way in which the actual planning algorithm constructs the sequence of actions from the current state to the goal. Various planning algorithms, for example those described in /RN95/, pp. 337-391, could be used for the PLANNER above.
Planning and plan execution can also be thought of as a single process in a situated planning agent. The idea is that while the agent is executing some steps of the plan, it simultaneously refines or modifies the plan according to the additional information obtained from the environment during execution. The situated agent thus continuously monitors the environment, updating the world model from new percepts while its reasoning is proceeding, figure 5, /RN95/, pp. 403-407.
function SITUATED-PLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, a plan, initially NoPlan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  EFFECTS(START(p)) <- current
  if p = NoPlan then
    G <- ASK(KB, MAKE-GOAL-QUERY(t))
    p <- MAKE-PLAN(current, G, KB)
  action <- NoOp (the default)
  Termination:
    if there are no open preconditions and p has no steps other than START and FINISH
      then p <- NoPlan, skip remaining steps
  Resolving standard flaws:
    resolve any open condition by adding a causal link from any existing prior step or a new step
    resolve potential threats by promotion or demotion
  Remove unsupported causal links:
    if there is a causal link START -> S protecting a proposition c that no longer holds in START
      then remove the link and the associated bindings
  Extend causal links back to the earliest possible step:
    if there is a causal link Sj -> Sk such that another step Si exists with Si < Sj and the link Si -> Sk is safe
      then replace Sj -> Sk with Si -> Sk
  Remove redundant actions:
    remove any step S that supplies no causal links
  Execute actions when ready for execution:
    if a step S in the plan other than FINISH satisfies the conditions for execution then
      add ordering constraints to force all other steps after S
      remove S from p, and all causal links to and from S
      action <- the action in S
  TELL(KB, MAKE-ACTION-SENTENCE(action, t))
  t <- t + 1
  return action
Figure 5. A situated planning agent
The situated view described above can also be formalized by situated automata theory, /RK95/. The developers of the theory argue that there is much to gain from analyzing the semantics of knowledge representation from the point of view of the needed reactive behaviour. The approach also gives guidelines for designing agents with pure action, pure perception, and combined perception and action properties.
Many approaches to planning while executing focus on the specification of a highly reactive execution module embedded in a planning environment. A major representative of these approaches is the Procedural Reasoning System (PRS), /GL90/, /DW91/ pp. 151-156.
The PRS system consists of a database containing current beliefs or facts about the world, a set of current goals or desires to be realized, a set of procedures (knowledge areas, KAs) describing how certain sequences of actions and tests are performed, and an interpreter to manipulate these components.
The system also has a process stack, which contains all active KAs - the system's current intentions for achieving its goals or reacting to some observed situation. The PRS interpreter runs the system with the help of the process stack by interleaving plan selection, formation and execution.
The plans (KAs) selected are both partial and hierarchical.
PRS tries to fulfil any intentions it has previously committed to, but if a new fact or request becomes known, PRS will reassess its goals and intentions and may accordingly choose to work on something else. PRS can even alter its intentions regarding its own reasoning process using its metalevel capabilities.
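A much-reduced toy version of the interpreter cycle can be sketched in Python (an illustration of my own; real KAs carry invocation conditions, bodies with tests, and metalevel KAs, all omitted here, and the names are hypothetical):

```python
# Toy PRS-style cycle: match each pending goal against the KA library, push
# the chosen KA onto the process stack as an intention, and run its steps.

def prs_cycle(goals, kas, act):
    stack = []                             # the process stack of intentions
    for goal in goals:
        ka = next((k for k in kas if k["achieves"] == goal), None)
        if ka is None:
            continue                       # no applicable KA for this goal
        stack.append(ka["name"])           # commit: this KA becomes an intention
        for step in ka["steps"]:
            act(step)
    return stack

# Assumed example: one KA achieving a process-control goal.
kas = [{"name": "KA-cool", "achieves": "cool-reactor",
        "steps": ["open-valve", "start-pump"]}]
trace = []
intentions = prs_cycle(["cool-reactor"], kas, trace.append)
# intentions == ['KA-cool'], trace == ['open-valve', 'start-pump']
```

What this sketch omits is precisely PRS's strength: new facts arriving mid-cycle can cause the interpreter to suspend the current intention and reassess, rather than running each KA to completion.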
PRS is the first implementation of a BDI (Belief, Desire, Intention) architectural model, /CL90/, based on modal logic. There exist many "BDI successors" to PRS in agent and multiagent applications, for example INTERRAP, described in the next section.
The basic idea in Reactive Action Packages (RAP) is that the action concept of AI is preserved while the actual control is performed by multiple concurrent processes. The plan representation in RAP allows both concurrency and synchronizing execution with events from the environment, which is crucial in using the plans to control large real-time systems.
The overall RAP architecture consists of three layers: a planning
layer, which produces sketchy plans for goals, the RAP execution
system to fill in the details at run time, and a control system
to carry out actions, figure 6, /FIR92/.
Figure 6. RAP architecture
The key component is the RAP executor, which expands vague plan steps into detailed task instructions at run time by choosing an appropriate, context-sensitive method for the next step in execution from a preexisting library.
Waiting until run time allows the use of direct rather than predicted knowledge of the situation, and thus much of the incompleteness and uncertainty disappears. However, incorrect methods for tasks may sometimes be chosen. These cases are handled by trying to make sure that each method achieves its goal and, if it does not (a failure situation), choosing another method for the task and trying again.
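This retry discipline can be sketched in Python (an illustration of my own; /FIR92/'s actual RAP method syntax and failure handling are richer, and all names here are assumptions):

```python
# Sketch of the executor's retry loop: try context-applicable methods for a
# task in turn until one is observed to have achieved the task's goal.

def run_task(task, library, context, achieved):
    """library maps a task to (applicable?, method) pairs; returns the method
    that succeeded, or None if every applicable method failed."""
    for applicable, method in library[task]:
        if not applicable(context):
            continue
        method(context)                    # execute the chosen method
        if achieved(context):              # check the goal; retry otherwise
            return method
    return None

# Assumed example: the first grasp method fails, the second succeeds.
def sloppy(ctx):
    pass                                   # does not actually grasp

def careful(ctx):
    ctx["holding"] = True

library = {"grasp": [(lambda c: True, sloppy), (lambda c: True, careful)]}
ctx = {"holding": False}
winner = run_task("grasp", library, ctx, achieved=lambda c: c["holding"])
# winner is `careful`, and ctx["holding"] is now True
```

The check after each attempt is what makes run-time method choice safe: a wrong guess costs a retry, not plan correctness.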
A restricting assumption of the task representation above - namely that a given task may have only two outcomes, success or failure - has later been relaxed, /FIR94/. In the new version of the RAP architecture there is a task net representation that explicitly recognizes when subtasks have multiple outcomes and allows a different thread to be followed for each one.
Perhaps an even more important successor to the original RAP system is the INTERRAP agent architecture, /MUE96/, which adopts many of RAP's symbolic-action and real-time execution mechanisms. It also adapts existing plans to specific situations, not unlike RAP. In knowledge representation, however, INTERRAP uses BDI theory, like PRS (cf. chapter 3.2).
The principal idea in INTERRAP is that an agent belonging to a multi-agent society is divided into three interacting layers.

The local INTERRAP plans are selected, interpreted, scheduled and executed in the local planning layer of the architecture. INTERRAP uses the principle of planning from second principles by using a plan library. It interleaves planning with execution by utilizing a control cycle that is implemented as an object whose main methods correspond to the activities above.

An important original contribution of INTERRAP is the cooperative planning layer, which extends the system's capabilities to multi-agent planning and execution. It is activated by the local planning layer of the system to cope with situations that cannot be satisfactorily dealt with by local planning. After requests from the local planning layer are received, additional information about other agents' goals is gathered, and then a suitable negotiation strategy is selected for multi-agent interaction.
4 Discussion

Traditional AI planners may be extended to suit the requirements of dynamic domains, /RN95/, pp. 367-389. Extending STRIPS by hierarchical plan decomposition allows nonprimitive operators to be included in plans. Universal quantification allows the preconditions and effects to refer to objects in general. Actions can also be made to generate and consume resources. To some extent, even time can be handled with the mechanisms for manipulating resource measures.

With the extensions above, STRIPS-based extended planners have been able to plan even in such - from a planning perspective computationally complex - control domains as spacecraft missions and manufacturing.
At least conditional planners are, however, normally needed to properly take the dynamic domain environment into account when planning. The number of conditions to be reasoned about grows exponentially with the number of steps in the plan. Thus it is time-consuming to plan realistic real-time plans having enough steps.
Abstraction may be used to reduce unnecessary detail in planning. This leads naturally to planning at a higher level of abstraction and using plan libraries or other ready-made plans or procedures at the lower level. This technique is also used in several replanning approaches.
The CIRCA planner has been used for robot control and aircraft
control simulation applications, both having strict stimulus-response
type requirements in addition to planned sequences of actions.
Applications for PRS and its successors are mainly ones that also need a substantial amount of procedural control. Most notable among them are those using SRI's robot Flakey and some chemical process control applications.
Characteristic of RAP-based and several "RAP-like" planners seems to be that they are often part of a large, layered and integrated architecture. Perhaps the best example is INTERRAP. These kinds of systems may be used to enhance - or in some cases even replace - large traditional distributed process management and control systems. A reactive level takes care of immediate measurements and controls, a local planning level performs reasoning that may be done by the agent itself, and a collaboration level negotiates with the other agents of the application.
The emerging possibilities to apply AI planning in various real-time control applications, described briefly above, offer interesting potential for transferring this technology from laboratories to production use.
5 References

/BRO91/ R. A. Brooks: Intelligence without representation, Artificial Intelligence 47(1991) 139-159
/CL90/ P. R. Cohen, H. J. Levesque: Intention is choice with commitment, Artificial Intelligence 42(1990) 213-261
/DW91/ T. L. Dean, M. P. Wellman: Planning and Control, Morgan Kaufmann
Publishers Inc., USA, 1991
/FIR92/ R. J. Firby: Building Symbolic Primitives with Continuous Control
Routines, Proc. First Int. Conf. On AI Planning Systems, 1992
/FIR94/ R. J. Firby: Task Networks for Controlling Continuous Processes,
Proc. Second Int. Conf. On AI Planning Systems, 1994
/FN93/ R. F. Fikes, N. J. Nilsson: STRIPS, a retrospective, Artificial Intelligence
59(1993) 227-232
/GMKB97/ R. P. Goldman, D. J. Musliner, K. D. Krebsbach, M. S. Boddy: Dynamic
Abstraction Planning, Proc. AAAI, 1997
/GL90/ M. P. Georgeff, A. L. Lansky: Reactive reasoning and planning, Readings in Planning, 1990, 729-734
/MDS95/ D. J. Musliner, E. H. Durfee, K. G. Shin: World modeling for the dynamic
construction of real-time control plans, Artificial Intelligence 74(1995)
83-127
/MUE96/ J. P. Mueller: The Design of Intelligent Agents - A Layered Approach,
Springer-Verlag, Germany, 1996
/RK95/ S. J. Rosenschein, L. P. Kaelbling: A Situated View of Representation and
Control, Artificial Intelligence 73(1995)
/RN95/ S. J. Russell, P. Norvig: Artificial Intelligence - A Modern Approach, Prentice Hall, USA, 1995
/ZIL95/ S. Zilberstein: Operational Rationality through Compilation of Anytime
Algorithms, AI Magazine, 16(1995) 79 - 80