TIK-76.275
Seppo Kuikka 2.12.1997
Abstract

This paper discusses planning and control. Planning is seen as an integral part of deliberative intelligent control. The main issues are in what way to plan for real-time control applications and how to integrate planning and control during execution. Various planners and planning algorithms are introduced, with notes on their applicability for control purposes.
Table of Contents
1 Introduction
2 Planning for real-time execution
3 (Re)planning while executing
4 Discussion
5 References
1 Introduction

The success of dynamic, domain-specific AI planning systems requires their tight integration into the environment in which they operate. They have to be able to address the needs of complex application issues in the domain. In this paper the real-time control domain is discussed.
Planning and real-time control try to solve the same problem
in this context, i.e. how to choose actions to influence a
physical process. The levels of abstraction and the time scales
for these activities normally differ, planning having a more abstract
and longer scope than control. It is, however, beneficial to consider
the two activities together in order to achieve the best overall
results.
Planning systems have to be able to access databases and other functional parts of the control application in question. Their user interface must also be compatible with that of the whole application, and expressive and understandable in its own right. Even more important is that the AI planning functionality must be well integrated with the execution of the plans in real time.
Most traditional AI planning languages - for example that of the seminal planner STRIPS, /FN93/ - lack characteristics that are needed for realistic application domains: hierarchical plans, complex conditions, temporal issues and resource limitations. STRIPS-like languages can, however, be extended to incorporate these issues, which is most important from the applicability point of view.
Plans are usually generated and evaluated off-line using
a world model of the domain specific environment for which the
plans are made. In dynamic domains the planner cannot, however, be certain of the accuracy of its model or that all the potential events from the environment (contingencies arising during execution) are included in it. Thus it needs execution monitoring.
When execution monitoring notices problems in the plan execution, it will be necessary to stop the sequence of actions specified in the plan. Subsequently, a new plan has to be formulated or - more often - the steps in the current plan have to be modified; that is, replanning is needed. By relying in this way less on the world model of the planner and more on the feedback from the sensors in the environment - which is normal practice in control systems - a lot of work may be saved from the planner.
In this paper planning is seen as an integral part of deliberative
intelligent control. The main issues are in what way
to plan in order to fulfil the requirements of real-time control
applications and how to integrate planning and (reactive)
feedback control in order to best achieve the overall control
goals.
Purely reactive execution architectures - most notably the subsumption architecture by Brooks, /BRO91/ - that do not contain AI planning functionality are not discussed, although their influence can often be seen in the architectural implementations described. Integrating planning and the execution of plans in control applications is considered both in terms of planning for real-time execution and (re)planning while executing the plan.
A starting point for the first approach, discussed in chapter two, is the separation of time-consuming and complex planning functionality from the execution of the plans into a separate subsystem or activity. The AI planning subsystem then supplies the execution subsystem with plans appropriate for fulfilling real-time requirements. The level of abstraction and completeness of the plans may vary and, accordingly, so does how early they will be ready for execution.
The premise for the second approach, discussed in chapter three, is that planning and execution of the plans may be integrated more tightly into the same real-time control system. This is normally possible only by relying on the fact that what is needed during plan execution is not planning from first principles but modification or adaptation of existing plans - planning from second principles.
The approaches above are not as clear-cut as it might first seem. Most importantly, the original plans in the replanning systems of chapter three are often made in such a way that real-time requirements are considered, and thus the ideas of planning for real time (of chapter two) are included, too. The time used in planning may also be seen as a parameter that can be varied according to the needs of reactivity, as exemplified in so-called anytime algorithms. In some cases one might also consider the required reaction time as a parameter, which could be somewhat relaxed according to the needs of deliberation.

In chapter four of this paper, some conclusions are drawn on the approaches described, and their applicability to various control domains is discussed.
2 Planning for real-time execution

The premise of the planning approach in this chapter is that the planned actions may have unexpected effects on the environment, and that the environment may also change otherwise - for example due to the actions of other agents. It is assumed, however, that the possible effects can be enumerated and described a priori as world state transitions. If that is the case, we may speak of bounded indeterminacy.
Conditional planning or contingency planning deals
with incomplete information about the environment and its dynamic
contingencies, by including sensing actions in the plan
to test the relevant conditions. Conditional planning is thus
a general approach for constructing plans for partially unknown
environments, not only real-time ones. It is, however, included here, because control applications are by nature such that quite a lot of uncertainty about the environment exists.
The basic algorithm to deal with conditions is given in figure
1, /RN95/, p. 395.
function CONDITIONAL-PLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, a plan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  if p = NoPlan then p <- CPOP(current, G, KB)
  if p = NoPlan or p is empty then action <- NoOp
  else
    action <- FIRST(p)
    while CONDITIONAL?(action) do
      if ASK(KB, CONDITION-PART(action))
        then p <- APPEND(THEN-PART(action), REST(p))
        else p <- APPEND(ELSE-PART(action), REST(p))
      action <- FIRST(p)
    end
    p <- REST(p)
  TELL(KB, MAKE-ACTION-SENTENCE(action, t))
  t <- t + 1
  return action
Figure 1. A conditional planning agent
The conditional planning algorithm, CPOP, included in the agent,
is based on considering the context (union of the conditions
that must hold in order for the step to be executed) of each
step in constructing a plan. The context will be inherited from
one step to another in the plan. By inserting conditional links
and conditional steps appropriately for sensing, the
CPOP algorithm constructs a plan having all possible final states
and all paths to these states.
To make sure that a conditional plan constructed as above is executable, the planning agent must sometimes also insert ordinary actions, in addition to the sensing actions described above. These actions serve the purpose of obtaining the information needed to test the conditions. The steps may additionally be conditioned in such a way that their contexts become incompatible with the contexts of those steps whose preconditions they are threatening.
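The execution of such a conditional plan can be illustrated with a small Python sketch (my own illustration, not code from /RN95/; the plan encoding, the `sense` function and the action names are assumptions):

```python
# A minimal sketch of executing a conditional plan. A plan is a list whose
# items are either plain action names (strings) or conditional steps of the
# form ("if", condition, then_branch, else_branch). `sense` is a hypothetical
# sensing function that tests a condition at run time; `act` executes actions.

def execute(plan, sense, act):
    """Walk the plan, splicing in the branch selected by sensing."""
    queue = list(plan)
    while queue:
        step = queue.pop(0)
        if isinstance(step, tuple) and step[0] == "if":
            _, cond, then_branch, else_branch = step
            # Mirrors APPEND(THEN-PART(action), REST(p)) in figure 1.
            queue = list(then_branch if sense(cond) else else_branch) + queue
        else:
            act(step)

# Assumed example: branch on whether a valve is already open.
log = []
plan = ["check-valve",
        ("if", "valve-open", ["close-valve"], ["open-valve", "close-valve"]),
        "report"]
execute(plan, sense=lambda c: False, act=log.append)
# log is now ['check-valve', 'open-valve', 'close-valve', 'report']
```

Because the branch is chosen only when the conditional step is reached, the sensing action ("check-valve" here) naturally precedes the test, as the text above requires.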
One variation of conditional plans is parameterized plans,
where the exact actions to be carried out are not known until
the plan is executed, /RN95/, pp. 398-400. The decisions
of conditions are made according to values of runtime variables
which are retrieved from the knowledge base during execution of
the plan.
Very often, it is also necessary in control applications to take
various kinds of temporal criteria into consideration when
planning, /DW91/, pp. 191-198. There are absolute deadlines
(the production batch must be finished before 10.00 am) for the
plan, there may also be graded time preferences (the longer
it takes, the more it costs) and relative deadlines (reaction
must be finished before the vessel can be emptied). One way to
include time in plans is to use temporal logic when reasoning
about the truth values of the conditions at various points of
time, /DW91/, pp. 57-70.
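An absolute deadline of the kind mentioned above can be checked against worst-case execution time estimates, as in the following Python sketch (an assumed example, not taken from /DW91/; the step names and durations are hypothetical):

```python
# Check whether a plan meets an absolute deadline, given worst-case
# execution time (WCET) estimates for its steps.

def meets_deadline(steps, wcet, start, deadline):
    """Return True if the summed worst-case durations fit before the deadline."""
    finish = start + sum(wcet[s] for s in steps)
    return finish <= deadline

# Assumed batch example: durations in seconds.
wcet = {"fill": 30.0, "heat": 120.0, "empty": 15.0}
meets_deadline(["fill", "heat", "empty"], wcet, start=0.0, deadline=180.0)
# -> True, since the worst case (165 s) finishes before the deadline
```

Graded time preferences would replace the boolean test with a cost function of the finish time; relative deadlines constrain pairs of steps rather than the whole plan.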
An important approach to planning for real time is the Cooperative Intelligent Real-time Control Architecture (CIRCA), /MDS95/. In CIRCA, AI planning functionality - upon which real-time requirements are not imposed - runs in parallel with a real-time control subsystem. The planning subsystem (AIS) and the real-time subsystem (RTS), however, collaborate with each other, the AIS acting as "a real-time system designer" for the RTS, figure 2.
Figure 2. The Cooperative Intelligent Real-time Control Architecture
(CIRCA)
The system separates real-time control-level goals from
non-real-time task-level goals of the plan. Performance
guarantees needed for real-time goals are achieved by using worst
case execution time estimates for (condition) tests and actions
in the plans. Task level goals are achieved on a best effort basis.
Plans in CIRCA are in effect cyclic schedules for test-action
pairs (TAPs). Planning relies heavily on the use of a world
model, the constituents of which are domain independent, but
content is specific to the application domain in question:
(S, F, TE, TA, TT), where:

S = {S1, S2, ..., Sm}, a finite set of states
F, a failure state
TE = {TE1, ..., TEn}, a finite set of event transitions (from the environment)
TA = {TA1, ..., TAp}, a finite set of action transitions (planned)
TT = {TT1, ..., TTq}, a finite set of temporal transitions (from the passage of time)

Each transition Ti belongs to TE U TA U TT, Ti : S -> S. D : T -> S and R : T -> S are the domain and range of a transition, respectively.
At every instance of time the world will be in one state. AIS
planner makes plans in such a manner that no failure states
will be reachable and thus the plans may safely be executed
by the real-time subsystem RTS.
The construction of the world model and the planning of control actions are dynamically interleaved. Backtracking is made possible by a choice-stack state machine. Inputs for planning are transition descriptions, initial world model state descriptions and goal descriptions. CIRCA planning terminates when no failure state is reachable, all control-level goals are achieved and task-level goals are at least reachable.
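The safety requirement - no failure state reachable - amounts to a reachability test over the transitions of the world model. The following Python sketch illustrates the idea (a hypothetical encoding of my own; CIRCA's actual planner interleaves this test with plan construction and handles timing, which is omitted here):

```python
# Safety test sketch: from the initial states, can any chain of transitions
# reach the failure state? `transitions` maps each state to the set of states
# it can move to (event, temporal and planned action transitions combined).

def failure_reachable(initial, transitions, failure):
    seen, frontier = set(initial), list(initial)
    while frontier:
        s = frontier.pop()
        for nxt in transitions.get(s, ()):
            if nxt == failure:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

# Assumed toy model: "hot" drifts to failure unless a planned action cools it.
unsafe = {"idle": {"hot"}, "hot": {"FAILURE"}}
safe   = {"idle": {"hot"}, "hot": {"idle"}}   # the planned action preempts the drift
failure_reachable({"idle"}, unsafe, "FAILURE")   # -> True
failure_reachable({"idle"}, safe, "FAILURE")     # -> False
```

In CIRCA proper, an action transition "preempts" a temporal transition only if its worst-case response time is short enough, which is where the WCET estimates mentioned above enter.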
The issue needing most development in CIRCA has been that during plan generation all paths that the world might take have to be enumerated in detail, which makes the planning very time-consuming. Dynamic Abstraction Planning (DAP), described next, will probably alleviate this problem.
Dynamic abstraction, /GMKB97/, is used to omit a large amount of detail from the world model state representation of CIRCA during planning, reducing both the state space to be explored and the resulting plan. In order to omit detail from the states, the states are represented with a set of features or characteristics:

C = {C1, C2, ..., Cr}
Each feature Ci has a finite set of possible values
val(Ci). The selection of which features to omit is
performed by the planner locally (different parts of the
state space are abstracted in different degrees) and in a way
that the safety of the system is preserved. By leaving out certain
features in reasoning, the planner can thus avoid considering
each individual base-level state.
It may, however, turn out during planning that the abstracted
state does not specify values for all necessary features in the
action's preconditions. Then the planner must be able to increase
the precision by (re)including one or more of the omitted features.
This is called splitting or refinement of the plan.
DAP refines the world model by taking an existing state and splitting
it into a number of more specific states, one for each possible
value of a particular feature Ci.
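Splitting can be sketched concretely in Python (an illustrative encoding of my own, not code from /GMKB97/; the feature names and values are assumptions). An abstract state gives values only for the features it constrains:

```python
# DAP-style splitting: refine an abstract state on one feature, producing
# one more specific state per possible value val(Ci) of that feature.

def split_state(state, feature, values):
    """Return the list of refinements of `state` on `feature`."""
    assert feature not in state, "feature already specified in this state"
    return [{**state, feature: v} for v in values]

abstract = {"alarm": "off"}            # temperature left abstract
refined = split_state(abstract, "temperature", ["low", "ok", "high"])
# -> [{'alarm': 'off', 'temperature': 'low'},
#     {'alarm': 'off', 'temperature': 'ok'},
#     {'alarm': 'off', 'temperature': 'high'}]
```

The planner applies such a split only locally, to the state where an action's preconditions could not be decided, leaving the rest of the state space abstract.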
The DAP planning may be described as a nondeterministic algorithm,
figure 3.
DAP-plan(isd)                ; isd is the initial state description
  N <- {}                    ; the graph
  openlist <- {}
  is <- MAKE-INITIAL-STATE(isd)
  N <- N u {is}
  push(is, openlist)
  while true do
    if no more reachable states in openlist then break
    else
      s <- choose a reachable state from openlist
      openlist <- openlist - {s}
      oneof
        split-state:
          choose a proposition p and split s into |val(p)| states
          remove s from N and insert the new states
          add the new states to openlist
        assign-action:
          choose an action or NoOp that is applicable for s
        fail
  end
Figure 3. The DAP planning algorithm
The algorithm fails if the nondeterministic choose operator above selects a state for which there is no applicable action and no proposition on which to split. An action is applicable in the assign-action part of the algorithm if the state necessarily satisfies its preconditions and if the action preempts all transitions to failure from the given state. When the algorithm fails in a given state, it backtracks (by using a state stack). If the algorithm fails on the initial state, the search as a whole has failed.
The DAP algorithm above is an example of an anytime algorithm, /ZIL95/. The quality of the results of anytime algorithms improves gradually as computation time increases; they thus offer a trade-off between resource consumption and output quality. Through backtracking, the DAP algorithm generates plans that satisfy ever more of the task-level goals, after first guaranteeing the real-time safety of the plan, which is a strict domain-specific requirement.
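The anytime contract can be shown with a generic Python loop (an assumed illustration of the notion in /ZIL95/, not code from that paper): the algorithm always holds a usable answer and simply returns the best one when the time budget expires.

```python
import time

# Generic anytime loop: keep improving the current best solution until the
# wall-clock budget runs out; a valid answer is available at any point.

def anytime_improve(initial, improve, budget_s):
    """improve(x) returns a better solution; budget_s bounds elapsed time."""
    best = initial
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        best = improve(best)
    return best

# Assumed example: successively halve the distance to a target value of 1.0;
# the result's quality (closeness to 1.0) grows with the budget.
result = anytime_improve(0.0, lambda x: x + (1.0 - x) / 2, budget_s=0.01)
```

DAP fits this scheme with "improve" meaning "satisfy one more task-level goal", which is why interrupting it early still yields a safe, if less capable, plan.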
The prototype DAP planner, described above, has been implemented
on some of the example domains of CIRCA. As described, the DAP
planner uses the same principles of guaranteeing control goals
and trying to achieve task goals as CIRCA. It does not, however,
yet take into account the temporal model provided by CIRCA. With the help of DAP, the problematically large search space of CIRCA has been reduced by nearly an order of magnitude, according to preliminary results.
3 (Re)planning while executing

For complicated and/or dynamic domains - like real-time control applications - the nature and number of contingencies in planning becomes quite large, as was described in chapter two. It is often so large that the resulting set of possible world states becomes impossible to reasonably anticipate or enumerate. In this chapter we consider this kind of unbounded indeterminacy. We have to be able to react to unanticipated changes in the environment and replan the respective plans accordingly during execution.
Replanning may be implemented in such a way that a planning agent keeps track of the remaining plan segment p and the complete plan q, figure 4, /RN95/, pp. 402-403.
function REPLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, an annotated plan, initially NoPlan
          q, an annotated plan, initially NoPlan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  if p = NoPlan then
    p <- PLANNER(current, G, KB)
    q <- p
  if p = NoPlan or p is empty then return NoOp
  if PRECONDITIONS(p) not currently true in KB then
    p' <- CHOOSE-BEST-CONTINUATION(current, q)
    p <- APPEND(PLANNER(current, PRECONDITIONS(p'), KB), p')
    q <- p
  action <- FIRST(p)
  p <- REST(p)
  t <- t + 1
  return action
Figure 4. Execution monitoring and replanning
Before executing the first action of p, the agent checks its preconditions, and if they are not met it chooses some point in the complete plan q such that the plan p' from that point to the end of q is easiest to achieve from the current state. The new, replanned plan is then first to achieve the preconditions of p' and then to execute it.
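The replanning step above can be condensed into an executable Python sketch (my own rendering, not RN95's code; `planner`, `precond_ok` and `cost` are hypothetical stand-ins for PLANNER, the precondition test and CHOOSE-BEST-CONTINUATION's ranking):

```python
# If the next action's precondition fails in the current state, pick the
# suffix p' of the complete plan q that is cheapest to reach, and prepend a
# repair plan that re-establishes its preconditions.

def replan(current, p, q, planner, precond_ok, cost):
    """Return the plan to execute next from the current state."""
    if precond_ok(current, p):
        return p                           # plan still valid, keep going
    suffixes = [q[i:] for i in range(len(q))]
    p_prime = min(suffixes, key=lambda s: cost(current, s))
    return planner(current, p_prime) + p_prime

# Assumed example: the agent dropped the object, so "move" cannot start;
# the cheapest continuation backs up to "pick", which needs no repair steps.
q = ["pick", "move", "place"]
repaired = replan(
    current={"holding": False},
    p=["move", "place"], q=q,
    planner=lambda cur, goal_plan: [],
    precond_ok=lambda cur, plan: cur["holding"],
    cost=lambda cur, suffix: 0 if suffix and suffix[0] == "pick" else 1,
)
# repaired restarts from "pick": ['pick', 'move', 'place']
```

The quality of the cost estimate determines whether the agent repairs locally or effectively restarts the plan from an earlier point.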
It is to be noted that the algorithm above introduces only the principle of replanning. It does not consider the way in which the actual planning algorithm constructs the sequence of actions from the current state to the goal. Various planning algorithms, for example those described in /RN95/, pp. 337-391, could be used for the PLANNER above.
Planning and plan execution can also be thought of as a single process in a situated planning agent. The idea is that while the agent is executing some steps of the plan, it simultaneously refines or modifies the plan according to the additional information obtained from the environment during execution. The situated agent thus continuously monitors the environment, updating the world model from new percepts while its reasoning is proceeding, figure 5, /RN95/, pp. 403-407.
function SITUATED-PLANNING-AGENT(percept)
  static: KB, a knowledge base
          p, a plan, initially NoPlan
          t, a counter
          G, a goal
  TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
  current <- STATE-DESCRIPTION(KB, t)
  EFFECTS(START(p)) <- current
  if p = NoPlan then
    G <- ASK(KB, MAKE-GOAL-QUERY(t))
    p <- MAKE-PLAN(current, G, KB)
  action <- NoOp (the default)
  Termination:
    if there are no open preconditions and p has no steps other than START and FINISH
      then p <- NoPlan, skip remaining steps
  Resolving standard flaws:
    resolve any open condition by adding a causal link from any existing prior step or a new step
    resolve potential threats by promotion or demotion
  Remove unsupported causal links:
    if there is a causal link START -> S protecting a proposition c that no longer holds in START
      then remove the link and the associated bindings
  Extend causal links back to the earliest possible step:
    if there is a causal link Sj -> Sk such that another step Si exists with Si < Sj and the link Si -> Sk is safe
      then replace Sj -> Sk with Si -> Sk
  Remove redundant actions:
    remove any step S that supplies no causal links
  Execute actions when ready for execution:
    if a step S in the plan other than FINISH satisfies the conditions for execution then
      add ordering constraints to force all other steps after S
      remove S from p, and all causal links to and from S
      action <- the action in S
  TELL(KB, MAKE-ACTION-SENTENCE(action, t))
  t <- t + 1
  return action
Figure 5. A situated planning agent
The situated view described above can also be formalized by situated automata theory, /RK95/. The developers of the theory argue that there is much to gain from analyzing the semantics of knowledge representation from the point of view of the needed reactive behaviour. The approach also gives guidelines for designing agents with pure action, pure perception, and combined perception and action properties.
Many approaches to planning while executing focus on the specification of a highly reactive execution module embedded in a planning environment. A major representative of these approaches is the Procedural Reasoning System (PRS), /GL90/, /DW91/ pp. 151-156.
The PRS system consists of a database containing current beliefs or facts about the world, a set of current goals or desires to be realized, a set of procedures (knowledge areas, KAs) describing how certain sequences of actions and tests are performed, and an interpreter to manipulate these components.
The system also has a process stack, which contains all active KAs - the system's current intentions for achieving its goals or reacting to some observed situation. The PRS interpreter runs the system with the help of the process stack by interleaving plan selection, formation and execution.
The plans (KAs) selected are both partial and hierarchical.
PRS tries to fulfil any intentions it has previously committed to, but if a new fact or request becomes known, PRS will reassess its goals and intentions and may accordingly choose to work on something else. PRS can even alter its intentions regarding its own reasoning process using its metalevel capabilities.
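A much-reduced toy version of the interpreter cycle can be sketched in Python (an illustration of my own; real KAs carry invocation conditions, bodies with tests, and metalevel KAs, all omitted here, and the names are hypothetical):

```python
# Toy PRS-style cycle: match each pending goal against the KA library, push
# the chosen KA onto the process stack as an intention, and run its steps.

def prs_cycle(goals, kas, act):
    stack = []                             # the process stack of intentions
    for goal in goals:
        ka = next((k for k in kas if k["achieves"] == goal), None)
        if ka is None:
            continue                       # no applicable KA for this goal
        stack.append(ka["name"])           # commit: this KA becomes an intention
        for step in ka["steps"]:
            act(step)
    return stack

# Assumed example: one KA achieving a process-control goal.
kas = [{"name": "KA-cool", "achieves": "cool-reactor",
        "steps": ["open-valve", "start-pump"]}]
trace = []
intentions = prs_cycle(["cool-reactor"], kas, trace.append)
# intentions == ['KA-cool'], trace == ['open-valve', 'start-pump']
```

What this sketch omits is precisely PRS's strength: new facts arriving mid-cycle can cause the interpreter to suspend the current intention and reassess, rather than running each KA to completion.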
PRS is the first implementation of a BDI (Belief, Desire, Intention) architectural model, /CL90/, based on modal logic. There exist many "BDI successors" to PRS in agent and multiagent applications, for example INTERRAP, described in the next section.
The basic idea in Reactive Action Packages (RAP) is that the action concept of AI is preserved while the actual control is performed by multiple concurrent processes. The plan representation in RAP allows both concurrency and synchronizing execution with events from the environment, which is crucial in using the plans to control large real-time systems.
The overall RAP architecture consists of three layers: a planning
layer, which produces sketchy plans for goals, the RAP execution
system to fill in the details at run time, and a control system
to carry out actions, figure 6, /FIR92/.
Figure 6. RAP architecture
The key component is the RAP executor, which expands vague plan steps into detailed task instructions at run time by choosing an appropriate, context-sensitive method for the next step in execution from a preexisting library.
Waiting until run time allows the use of direct rather than predicted knowledge of the situation, and thus much of the incompleteness and uncertainty disappears. However, incorrect methods for tasks may sometimes be chosen. These cases are handled by trying to make sure that each method achieves its goal and, if it does not (a failure situation), choosing another method for the task and trying again.
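This retry discipline can be sketched in Python (an illustration of my own; /FIR92/'s actual RAP method syntax and failure handling are richer, and all names here are assumptions):

```python
# Sketch of the executor's retry loop: try context-applicable methods for a
# task in turn until one is observed to have achieved the task's goal.

def run_task(task, library, context, achieved):
    """library maps a task to (applicable?, method) pairs; returns the method
    that succeeded, or None if every applicable method failed."""
    for applicable, method in library[task]:
        if not applicable(context):
            continue
        method(context)                    # execute the chosen method
        if achieved(context):              # check the goal; retry otherwise
            return method
    return None

# Assumed example: the first grasp method fails, the second succeeds.
def sloppy(ctx):
    pass                                   # does not actually grasp

def careful(ctx):
    ctx["holding"] = True

library = {"grasp": [(lambda c: True, sloppy), (lambda c: True, careful)]}
ctx = {"holding": False}
winner = run_task("grasp", library, ctx, achieved=lambda c: c["holding"])
# winner is `careful`, and ctx["holding"] is now True
```

The check after each attempt is what makes run-time method choice safe: a wrong guess costs a retry, not plan correctness.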
A restricting assumption of the task representation above - namely that a given task may have only two outcomes, success or failure - has later been relaxed, /FIR94/. In the new version of the RAP architecture there is a task net representation that explicitly recognizes when subtasks have multiple outcomes and allows a different thread to be followed for each one.
Perhaps an even more important successor to the original RAP system is the INTERRAP agent architecture, /MUE96/, which adopts many of RAP's symbolic-action and real-time execution mechanisms. It also adapts existing plans to specific situations, not unlike RAP. In knowledge representation, however, INTERRAP uses BDI theory, like PRS (cf. chapter 3.2).
The principal idea in INTERRAP is that an agent belonging to a multi-agent society is divided into three interacting layers.

The local INTERRAP plans are selected, interpreted, scheduled and executed in the local planning layer of the architecture. INTERRAP uses the principle of planning from second principles by using a plan library. It interleaves planning with execution by utilizing a control cycle that is implemented as an object whose main methods correspond to the activities above.

An important original contribution of INTERRAP is the cooperative planning layer, which extends the system's capabilities to multi-agent planning and execution. It is activated by the local planning layer of the system to cope with situations that cannot be satisfactorily dealt with by local planning. After requests from the local planning layer are received, additional information about other agents' goals is gathered, and then a suitable negotiation strategy is selected for multi-agent interaction.
4 Discussion

Traditional AI planners may be extended to suit the requirements of dynamic domains, /RN95/, pp. 367-389. Extending STRIPS by hierarchical plan decomposition allows nonprimitive operators to be included in plans. Universal quantification allows the preconditions and effects to refer to objects in general. Actions can also be made to generate and consume resources. To some extent, even time can be handled with the mechanisms for manipulating resource measures.

With the extensions above, STRIPS-based extended planners have been able to plan even in such - from a planning perspective computationally complex - control domains as spacecraft missions and manufacturing.
At least conditional planners are, however, normally needed to properly take the dynamic domain environment into account when planning. The number of conditions to be reasoned about grows exponentially with the number of steps in the plan. Thus it is time-consuming to plan realistic real-time plans having enough steps.
Abstraction may be used to reduce unnecessary detail in planning. This leads naturally to planning at a higher level of abstraction and using plan libraries or other ready-made plans or procedures at the lower level. This technique is also used in several replanning approaches.
The CIRCA planner has been used for robot control and aircraft
control simulation applications, both having strict stimulus-response
type requirements in addition to planned sequences of actions.
Applications for PRS and its successors are mainly ones that also need a substantial amount of procedural control. Most notable among them are those using SRI's robot Flakey and some chemical process control applications.
Characteristic of RAP-based and several "RAP-like" planners seems to be that they are often part of a large, layered and integrated architecture. Perhaps the best example is INTERRAP. These kinds of systems may be used to enhance - or in some cases even replace - large traditional distributed process management and control systems. A reactive level takes care of immediate measurements and controls, a local planning level performs reasoning that may be done by the agent itself, and a collaboration level negotiates with the other agents of the application.
The emerging possibilities to apply AI planning in various real-time control applications, described briefly above, offer interesting potential for transferring this technology from laboratories to production use.
5 References

/BRO91/ R. A. Brooks: Intelligence without representation, Artificial Intelligence 47(1991) 139-159
/CL90/ P. R. Cohen, H. J. Levesque: Intention is choice with commitment, Artificial Intelligence 42(1990) 213-261
/DW91/ T. L. Dean, M. P. Wellman: Planning and Control, Morgan Kaufmann
Publishers Inc., USA, 1991
/FIR92/ R. J. Firby: Building Symbolic Primitives with Continuous Control
Routines, Proc. First Int. Conf. On AI Planning Systems, 1992
/FIR94/ R. J. Firby: Task Networks for Controlling Continuous Processes,
Proc. Second Int. Conf. On AI Planning Systems, 1994
/FN93/ R. F. Fikes, N. J. Nilsson: STRIPS, a retrospective, Artificial Intelligence
59(1993) 227-232
/GMKB97/ R. P. Goldman, D. J. Musliner, K. D. Krebsbach, M. S. Boddy: Dynamic
Abstraction Planning, Proc. AAAI, 1997
/GL90/ M. P. Georgeff, A. L. Lansky: Reactive reasoning and planning, Readings in Planning, 1990, 729-734
/MDS95/ D. J. Musliner, E. H. Durfee, K. G. Shin: World modeling for the dynamic
construction of real-time control plans, Artificial Intelligence 74(1995)
83-127
/MUE96/ J. P. Mueller: The Design of Intelligent Agents - A Layered Approach,
Springer-Verlag, Germany, 1996
/RK95/ S. J. Rosenschein, L. P. Kaelbling: A Situated View of Representation and
Control, Artificial Intelligence 73(1995)
/RN95/ S. J. Russell, P. Norvig: Artificial Intelligence - A Modern Approach, Prentice Hall, USA, 1995
/ZIL95/ S. Zilberstein: Operational Rationality through Compilation of Anytime
Algorithms, AI Magazine, 16(1995) 79 - 80