J. Koehler and J. Hoffmann
Volume 12, 2000
Links to Full Text:Journal of Artificial Intelligence Research 12 {2000} 339{386 Submitted 12/99;
published 6/00
On Reasonable and Forced Goal Orderings and their Use in
an Agenda-Driven Planning Algorithm
Jana Koehler janakoehler@ch.schindler.com Schindler Lifts, Ltd.
R & D Technology Management
6031 Ebikon, Switzerland
Jªorg Hoffmann hoffmann@informatik.uni-freiburg.de Institute for Computer
Science Albert Ludwigs University
Georges-Kªohler-Allee, Geb. 52 79110 Freiburg, Germany Abstract
The paper addresses the problem of computing goal orderings, which is one
of the
longstandingissues in AI planning. It makes two new contributions. First,
it formally
defines and discusses two different goal orderings, which are called the
reasonable and the forced ordering. Both orderings are defined for simple
STRIPS operators as well as for
more complex ADL operators supporting negation and conditional effects.
The complexity
of these orderings is investigated and their practical relevance is discussed.
Secondly, two different methods to compute reasonable goal orderings are
developed. One of them is
based on planning graphs, while the other investigates the set of actions
directly. Finally,
it is shown how the ordering relations, which have been derived for a given
set of goals
G , can be used to compute a so-called goal agenda that divides G into an
ordered set of subgoals. Any planner can then, in principle, use the goal
agenda to plan for increasing
sets of subgoals. This can lead to an exponential complexity reduction,
as the solution to a
complex planning problem is found by solving easier subproblems. Since only
a polynomial
overheadis caused by the goal agenda computation, a potential exists to
dramatically speed
up planning algorithms as we demonstrate in the empirical evaluation, where
we use this method in the IPP planner. 1. Introduction
How to effectively plan for interdependent subgoals has been in the focus
of AI planning
research for a very long time. Starting with the early work on ABSTRIPS
{Sacerdoti, 1974}
or on conjunctive-goal planningproblems {Chapman, 1987}, quite a number
of approaches
have been presented and the complexity of the problems has been studied.
But until today, planners have made only some progress in solving bigger
planning instances and scalability of classical planning systems is still
a problem.
In this paper, we focus on the following problem: Given a set of conjunctive
goals, can
we define and detect an ordering relation over subsets from the original
goal set? To arrive
at such an ordering relation over subsets, we first focus on the atomic
facts contained in the goal set. We formally define two closely related ordering
relations over such atomic goals,
c
fl2000 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.Koehler
& Hoffmann whichwe call re asonable and forc e d ordering, and study their
complexity. It turns out that both are very hard to decide. Consequently,
we introduce two efficient methods that can both be used to approximate
re asonable goal orderings. The definitions are first given for simple STRIPS
domains, where thedesired theoretical properties can be easily proven. Afterwards,
we extend our definitions
to ADL operators {Pednault, 1989} handling conditional effects and negative
preconditions,
and discuss why we do not further invest any effort in trying to find forc
e d orderings. We show how a set of ordering relations between atomic goals
can be used to divide the
goal set into disjunct subsets, and how these subsets can be ordered with
respect to each
other. The resulting sequence of subsets comprises the so-called goal agenda,
whichcan then be used to control an agenda-driven planning algorithm.
The method, called Goal Agenda Manager, is implemented in the context of
the IPP planning system, where we show its potential of exponentially reducing
computation times
on certain planningdomains. The paper is organized as follows: Section 2
introduces and motivates reasonable and
forced goal orderings. Starting with simple STRIPS operators, they are formally
defined,
and their complexity is investigated. In Section 3, we present two methods,
whichcom-
pute an approximation of the reasonable ordering and discuss both orderings
from a more practical point of view. The section concludes with an extension
of our definitions to ADL
operators having conditional effects. Section 4 shows how a planning system
can benefit
from ordering information by computing a goal agenda that guides the planner.
We define
how subsets of goals can be ordered with respect to each other and discuss
how a goal
agendacan affect the theoretical properties, in particular the completeness
of a planning
algorithm. Section 5 contains the empirical evaluation of our work, showing
results that we obtained using the goal agenda in IPP. In Section 6 we summarize
our approach in the light
of related work. The paper concludes with an outlook on possible future
research directions
in Section 7.
2. Ordering Relations between Atomic Goals For a start, weonly investigate
simple STRIPS domains just allowing sets of atoms to
describe states, the preconditions, and the add and delete lists of operators.
Definition 1 {State} The set of al l ground atoms is denoted with P . A state
s 2 2
P
is a
subset of ground atoms. Note that all states are assumed to be complete,
i.e., we always know for an atom p whether
p 2 s or p 62 s holds. We also assume that all operator schemata are ground,
i.e., we only
talk about actions.
Definition 2 {Strips Action} A STRIPSaction o has the usual form
pre{o} 000! ADD add{o} DEL del{o}
where pre{o} are the pre c onditions ofo, add{o} is the Add list of o and
del{o} is the Delete list of the action, each being a set of ground atoms.
We also assume that del{o} \ add{o} = ;.
The result of applying a STRIPS action to a state is defined as usual:
340On Reasonable and Forced Goal Orderings
Result{s; o} :=
032
{s [ add{o}) n del{o} if pre{o} 022 s
s otherwise
If pre{o} 022 s holds, the action is said to be applicable in s. The result
of applying a
sequenc e of more than one action to a state is re cursively defined as
Result{s; ho
1
; : : : ; o n
i} := Result{Result{s; ho 1
; : : : ; o
n0001
i}; o n
}:
Definition 3 {Planning Problem} A planning problem {O; I ; G } is a triple
where O is
the set of actions, and I {the initial state} and G {thegoals} are sets
of ground atoms. A
plan P is an order e d sequenc e of actions. If al l actions in a plan are
taken out of a certain action set O, we denote this by writing P
O
. Note that we define a plan to be a sequence of actions, not a sequence
of par al lel steps,
as it is done for graphplan {Blum & Furst, 1997}, forexample. This makes
the subsequent
theoretical investigation more readable. The results directly carry over
to parallel plans.
Given two atomic goals A and B , various ways to define an ordering relation
over
themcan be imagined. First, one can distinguishbetween domain-spe cific
and domain-
independent goal ordering relations. But although domain-specific orderings
can be very
effective, they need to be redeveloped for each single domain. Therefore,
one is in particular
interested in domain-independent ordering relations having a broader range
of applicability.
Secondly , following Hªullem et al. {1999}, onecan distinguish the goal
selection and the goal
achievement order. The first ordering determines in which order a planner
works on the
various atomic goals, while the second one determines the order, in which
the solution
plan achieves the goals. In this paper, we compute an ordering of the latter
type. In the agenda-driven planning approach that we propose later in the
paper, both orderings
coincide anyway. The goals that are achieved first in the plan are those
that the planner
works on first.
The following scenario motivates how an achievement order for goals can
be possibly
defined. Given two atomic goals A and B , for which a solution plan exists,
let us assume
the planner has just achieved the goal A, i.e., it has arrived at a state
s
{A;:B} , in which A
holds, but B does not hold yet. Now, if there exists a plan that is executable
in s
{A;:B}
and achieves B without ever deleting A, a solution has been found. If no
such plan can be found, then two possiblereasons exist: 1. The problem is
unsolvable|achieving A first leads the planner into a deadlock situa-
tion. Thus, the planner is forc e d to achieve B before or simultaneously
with A.
2. The only existing solution plans have to destroy A temporarily in order
to achieve B .
But then, A should not be achieved first. Instead, it seems to be re asonable
to achieve
B before or simultaneously with A for the sake of shorter solution plans.
In the first situation, the ordering \B before or simultaneously with A"
is forc e d by in-
herent properties of the planning domain. In the second situation, the ordering
\B before or
simultaneously with A" appears to be re asonable in order to avoid non-optimal
plans. Con-
sequently, we will define two goal orderings, called the forc e d and the
re asonable ordering.
For the sake of clarity, we first give some more basic definitions.
341Koehler & Hoffmann Definition 4 {Reachable State} Let {O; I ; G } be
a planning problem and let P be the set of ground atoms that oc cur in the
problem. We say that a state s 022 P is reachable, iff
there exists a sequenc e ho
1
; : : : ; o
n
i out of actions in O for which s= Result{I ; ho
1
; : : : ; o n
i}
holds. Definition 5 {Generic State s {A;:B}
} Let {O; I ; G } be a planning problem. By s {A;:B}
we denote any re achable state in which A has just be en achieved, but B
is false,
i.e., B 62 s {A;:B}
and there is a sequenc e of actions ho
1
; : : : ; o
n
i such that s
{A;:B}
=
Result{I ; ho
1
; : : : ; o n
i}, with A 2 add{o
n }.
One can imagine s
{A;:B} as a state about which we only have incomplete information.
All the states s it represents satisfy sj= A; :B , but the other atoms p
2 P with p 6= A; B
can adopt arbitrary truth values.
Definition 6 {Reduced Action Set O
A } Let {O; I ; G } be a planning problem, andlet
A 2 G be an atomic goal. By O
A
we denote the set of al l actions that do not delete A,
i.e., O A
= fo 2 O j A 62 del{o}g.
We are now prepared to define what we exactly mean by forc e d and re asonable
goal orderings.
Definition 7 {Forced Ordering 024
f } Let {O; I ; G } be a planning problem, and let A; B 2
G be two atomic goals. We say that there is a forced ordering betwe en B
and A, written
B 024
f
A, if and only if
8 s
{A;:B}
: :9 P O
: B 2Result{s
{A;:B}
; P
O }
If Definition 7 is satisfied, then each plan achieving A and B mustachieve
B before orsimultaneously with A, because otherwise it will encounter a deadlock,
rendering the
problem unsolvable. Definition 8 {Reasonable Ordering 024
r
} Let {O; I ; G } be a planning problem, and let
A; B 2 G be two atomic goals. We say that there is a reasonable ordering
betwe en B and
A, written B 024 r
A, if and only if 8 s
{A;:B}
: :9 P O
A
: B 2Result{s
{A;:B}
; P O
A
}
Definition 8 gives B 024 r
A the meaning that if, after the goal A has been achieved, there is no plan
anymore that achieves B without|atleast temporarily|destroying A, then B
isa goal prior to A.
We remark that obviously B 024
f
A implies B 024
r
A, but not vice versa. We also make
a slightly less obvious observation at this point: The formulae in Definitions
7 and 8 use a universal quantification over states s
{A;:B} . If in a planning problem there is no such
state at all, the formulae are satisfied and the goals A and B get ordered,
i.e., B 024
f
A and
B 024
r
A follow, respectively. In this case, however, there is not much information
gained
by a goal ordering between A and B , because any sequence of actions will
achieve B prior or simultaneously with A - A cannot be achieved with B still
being false. Thus in this
case, the ordering relations B 024
f
A and B 024
r
A are trivial in the sense that no reasonable planner would invest much
effort in considering the goals A and B ordered the other way
roundanyway.
342On Reasonable and Forced Goal Orderings
Definition 9 {Trivial Ordering Relation} Let {O; I ; G } be a planning problem,
and let
A; B 2 G be two atomic goals. An ordering relation B 024 f
A or B 024 r
A is cal led trivial iff
there is no state s
{A;:B}
.
In this paper, we will usually consider forced and reasonable goal orderings
as non-trivial
orderings and make the distinction explicit only if we have to do so.
Definitions 7 and 8 seem to deliver promising candidates for an achievement
order.
Unfortunately, both are very hard to test: it turns out that their corresponding
decision problems are PSPACE hard. Theorem 1 Let FORDER denote the fol lowing
problem:
Given two atomic facts A and B , as well as an action set O andan initial
state I , does
B 024 f
A hold ?
Deciding FORDER is PSPACE-har d.
Proof: The proof proceeds by polynomially reducing PLANSAT {Bylander, 1994}|the
decision problem of whether there exists a solution plan for a given arbitrary
STRIPS
planning instance|to the problem of deciding FORDER. LetI , G , and O denote
the initial state, the goal state, and the action set in an arbitrary
STRIPS instance. Let A, B , and C be new atomic facts not contained in the
instance so
far. We build a new action set and initial state for our FORDER instance
by setting
O
0
:= O [
8 <
:
o
I 1
= fC g 000! ADD fAg DEL fC g;
o I
2
= fAg 000! ADD I DEL fAg;
o G
= G 000! ADD fB g DEL ;
9
=
; and
I
0 := fC g
With these definitions, reaching B from A is equivalent to solving the original
problem. The
other way round, unreachability of B fromA|forced ordering B 024
f A|isequivalent to the unsolvability of the original problem. In order
to prove this, we consider the following:
The only way of achieving A is by applying o I
1
to I 0
. Consequently, the only state s
{A;:B}
is fAg, cf. Definition 5. Thus starting with the assumption that B 024
f
A is valid, we apply
the following equivalences:
B 024
f A
, 8 s
{A;:B}
: :9 P
O
0
: B 2Result{s {A;:B}
; P
O
0 } cf. Definition 7 , :9 P
O
0
: B 2 Result{fAg; P
O
0
} fAg is the only reachable state s
{A;:B} , :9 P
O
: G 022 Result{I ; P
O
} with the definition of O 0
, no solution plan exists for I ; G and given O
343Koehler & Hoffmann Thus, the complement of PLANSAT can be polynomially
reduced to FORDER. As PSPACE
= co-PSPACE, we are done.Theorem 2 Let RORDER denote the fol lowing problem:
Given two atomic facts A and B , as well as an action set O andan initial
state I , does
B 024
r
A hold ?
Deciding RORDER is PSPACE-har d.
Proof: The proof proceeds by polynomially reducing PLANSAT to RORDER.
Let I , G , and O bethe initial state, the goal state, and the action set
in an arbitrary
STRIPS planning instance. Let A, B , C , and D benew atomic facts not contained
in the
instance so far. We define the new action set O
0
by setting
O 0
:= O [
8
<
:
o I
1
= fC g 000! ADD fA; Dg DEL fC g;
o
I
2
= fA; Dg 000! ADD I DEL fDg;
o
G = G 000! ADD fB g DEL ;
9
=
;
and the new initial state by
I
0
:= fC g As in the proof of Theorem 1, the intention behindthese definitionsis
to make solvability
of the original problem equivalent to reachability of B from A. For reasonable
orderings,
reachability is concerned with actions that do not delete A, which is why
we need the safety
condition D.
Precisely, the only way to achieve A is by applying o
I
1
to I
0
, i.e., per Definition 5 the
only state s
{A;:B}
is fA; Dg. As no action in the new operator set O
0
deletes A, we have
thefollowing sequence of equivalences. B 024
r
A , 8 s
{A;:B}
:9 P
O 0
A
: B 2 Result{s
{A;:B}
; P
O
0
A
} cf. Definition 8
, :9 P
O 0
A
: B 2 Result{fA; Dg; P
O
0
A
} fA; Dg is the only reachable state s
{A;:B} , :9 P
O
0
: B 2 Result{fA; Dg; P O
0
} no action in O
0
deletes A
, :9 P O
such that G 022 Result{I ; P
O
} with the definition of O
0 , no solution plan exists for I ; G ; O
Thus, thecomplement of PLANSAT can be polynomiallyreduced to RORDER. With
PSPACE = co-PSPACE, we are done.Consequently, finding reasonable and forced
ordering relations between atomic goals is
already as hard as the original planning problem and it appears unlikely
that a planner will
gain any advantage from doing that. A possible way out of the dilemma is
to define new
ordering relations, which can be decided in polynomial time and which are,
ideally, sufficient for the existence of reasonable or forced goal orderings.
In the following, we introduce two
such orderings. 344On Reasonable and Forced Goal Orderings
3. The Computation of Goal Orderings
In this section, we will 1. define a goal ordering 024 e
, which can be computed using graphplan's exclusivity
information about facts. We prove that this ordering is sufficient for 024
r
and that it can be decided in polynomial time {the subscript \e" stands
for \efficient"}. 2. define a goal ordering 024 h
, whichis computed based on a heuristic method that is
much faster than the computation based on graphplan, and also delivers powerful
goal ordering information {the subscript \h" stands for \heuristic"}.
3. discuss that most of the currently available benchmark planning domains
do not con- tain forced orderings, i.e., 024 f
will fail in providing a problem decomposition for them.
4. show how our orderings can be extended to handle more expressive ADL
operators. 3.1 Reasonable Goal Orderings based on graphplan
A goal ordering is always computed for a specific planning problem involving
an initial
state I , a goal set G fA; B g, and the set O of all ground actions. In
order to develop an efficient computational method, we proceed in two steps
now:
1. We compute more knowledge about the generic state s
{A;:B}
.
2. We define the relation 024
e and investigate its theoretical properties. In particular, we
prove that 024
e
implies 024 r
.
The state s
{A;:B} represents states that are reachable from I , and in which A has
been achieved, but B doesnot hold. Given this information about s
{A;:B}
, one can derive
additional knowledge about it. In particular, it is possible to determine
a subset of atoms F,
of which one definitely knows that F \ s {A;:B}
= ; must hold. One method to determine F is
obtained via the computation of invariants, i.e., logical formulae that
hold in all reachable states, cf. {Fox & Long, 1998}. After having determined
the invariants, one assumes that A
holds, but B doesnot, and then computes the logical implications. Another
possibility is to
simply use graphplan {Blum & Furst, 1997}. Starting from I with O, the planning
graph
is built until the graph has leveled off at some time step. The proposition
level at this time step represents a set of states, which is a superset of
all states that are reachable from I
when applying actions from O. All atoms, whichare marked as mutual ly exclusive{Blum
& Furst, 1997} of A in this level can never hold in a state satisfyingA.
Thus, they cannot
hold in s
{A;:B}
. We denote this set with F
A
GP
|the False set with respect to A returned by
graphplan.
1
F
A GP
:=fp j p is exclusive of A when the graph has leveled offg {1}
Note that the planninggraph is only grown once for a given I and O, but
can be used to
determine the F
A
GP
sets for all atomic goals A 2 G .1. We assume the reader to be familiar
with graphplan, because this planning system is very well known in the planning
research community. Otherwise,{Blum & Furst, 1997} provide the necessary
background.
345Koehler & Hoffmann Lemma 1 F
A
GP \ s
A
= ; holds for al l states s A
satisfying A 2 s
A
that are re achable from
I using actions from O. The proof follows immediately from the definitions
of \level-off " and \two propositions
being mutual exclusive" given in {Blum & Furst, 1997}. We now provide a
simple test which is sufficient for the existence of a reasonable ordering
B 024
r A between two atomic goals A and B .
Definition 10 {Efficient Ordering 024
e
} Let {O; I ; G fA; B g} be a planning problem.
L et F
A
GP
be the False set for A. The ordering B 024 e
A holds if and only if 8 o 2 O
A : B 2add{o} } pre{o}\ F A
GP
6= ;
This means, B isordered before A if the reduced action set only contains
actions, which
either do not have B in their add lists or if they do, then they require
a precondition which
is contained in the False set. Such preconditions can never hold in a state
satisfying A and
thus, these actions willnever be applicable.
Theorem 3 B024
e
A } B 024
r
A
Proof: Assume that B 6024
r
A, i.e., B 2 Result{s {A;:B}
; P
O
A } for a reachable state s {A;:B}
with A 2 s
{A;:B}
, B 62 s
{A;:B} , and a Plan P
O A
= ho
1
; : : : ; o
n
i where o
i
2 O
A
for 1 024 i 024 n. As A62 del{o i
} for all i {Definition 6}, we have
A 2 Result{s {A;:B}
; ho
1
; : : : ; o
i
i} for 0 024 i 024 n
and, with Lemma 1,
F
A
GP
\ Result{s
{A;:B}
; ho
1
; : : : ; o i
i} = ; for 0 024 i 024 n {2}
Furthermore, as B 62 s
{A;:B}
, butB 2 Result{s
{A;:B}
; ho
1
; : : : ; o n
i}, there must be a
step which makes B true, i.e.,
91 024 k 024n : B 62 Result{s
{A;:B}
; ho
1 ; : : : ; o
k0001
i} ^ B 2 Result{s
{A;:B}
; ho
1
; : : : ; o k
i}
For this step, we obviously have B 2 add{o
k
} and consequently, with the definition
of B 024
e
A,pre{o
k
} \ F
A
GP
6= ;. Now, as o k
must be applicable in the state where it
is executed {otherwise it would not add anything to this state}, the preconditions
of o
k must hold, i.e., pre{o
k
} 022 Result{s
{A;:B}
; ho 1
; : : : ; o k0001
i}. This immediately leads to F
A
GP
Result{s
{A;:B}
; ho
1 ; : : : ; o
k0001
i} 6= ;, which is a contradiction to Equation {2}.Quite obviously, the ordering
024
e can be decided in polynomial time. Theorem 4 Let EORDER denote the fol
lowing problem:
Given two atomic facts A and B , as well as an initialstate I and an action
set O, does B 024
e
A hold ?
Then, EORDER can be decide d in polynomial time: EORDER 2 P.
346On Reasonable and Forced Goal Orderings
Proof: To begin with, we need to show that computing F
A
GP
takes only polynomial time.
From the results in {Blum & Furst, 1997}, it follows directly that building
a planninggraph
is polynomial in jI j, jOj, l and t, where l is the maximal length of any
precondition, add or delete list of an action, and t is the number of time
steps built. Taking l as a parameter of the input size, it remains to show
that a planninggraph levels off after a polynomial
number t of time steps. Now, a planninggraph has leveled off if between
some time steps
t and t + 1 neither the set of facts nor the number of exclusion relations
change. Between
two subsequent time steps, the set of facts can only increase|facts already
occuring in the
graph remain there|and the number of exclusions can only decrease|non-exclusive
facts
will be non-exclusive in all subsequent layers. Thus, the maximal number
of time steps to
be built until the graph has leveled off is dominated by the maximal number
of changes
that can occur between two subsequent layers, which is dominated by the
maximal number of facts plus the maximal number of exclusion relations. The
maximal number of facts is
O{jI j + jOj 003 l}, and the maximal number of exclusions is O{(jI j +
jOj 003 l} 2
}, the square of the maximal number of facts. Having computed F
A GP
in polynomial time, testing B 024
e
A involves looking at all actions in O, and rejecting them if they either
* delete A, which is decidable in time O{l}, or
* have a precondition, which is an element of F
A
GP
, decidable in time O{l 003 {jI j + jOj 003 l}).
Thus we have an additional runtime for the test, which is O{jOj 003 l 003
{jI j + jOj 003 l}).Let us consider the following example, which illustrates
the computation of 024 e
using
a common representational variant of the blocks world with actions to stack,
unstack,
pickup, and putdown blocks: pickup{?ob}
clear{?ob} on-table{?ob} arm-empty{} 000! ADD holding{?ob}
DEL clear{?ob} on-table{?ob} arm-empty{}.
putdown{?ob}
holding{?ob} 000! ADD clear{?ob} arm-empty{} on-table{?ob} DEL holding{?ob}.
stack{?ob,?underob} clear{?underob} holding{?ob} 000! ADD arm-empty{} clear{?ob}
on{?ob,?underob}
DEL clear{?underob} holding{?ob}. unstack{?ob,?underob}
on{?ob,?underob} clear{?ob} arm-empty{} 000! ADD holding{?ob} clear{?underob}
DEL on{?ob,?underob} clear{?ob} arm-empty{}. Given the simple task of stacking
three blocks:
initial state: on-table{a} on-table{b} on-table{c}
goal state: on{a,b} on{b,c}
347Koehler & Hoffmann is there a reasonable ordering between the two atomic
goals? Intuitively, the blocks world
domain possesses a very natural goal ordering, namely that the planner should
start building each tower from the bottom to the top and not the other way
round. 2
Let us first investigate whether the relation on{a; b} 024
e
on{b; c} holds. Vividly speaking,
it asks whether it is still possible to stack the block a on b after on{b;
c} has been achieved.
As a first step, we run graphplan to find out which atoms are exclusive
of on{b; c} when
the planninggraph, which corresponds to this problem, has leveled off. The
result is
F
on{b;c}
GP
=fclear{c}, on-table{b}, holding{c}, holding{b}, on{a,c}, on{c,b}, on{b,a}g
One observes immediately that these atoms can never be true in a state that
satisfies
on{b; c}.
Secondly, we remove all ground actions which delete on{b; c} {in this case,
onlythe action
unstack{b,c} satisfies this condition} and obtain the reduced action set
O
on{b;c} .
Now we are ready to test if on{a; b} 024 e
on{b; c} holds. The only action, which can add
on{a; b} is stack{a,b}. It has the preconditions holding{a} and clear{b},
neitherof which
is a member of F
on{b;c}
GP
. The test fails and we get on{a; b} 6024
e
on{b; c}. As a next step, we test whether on{b; c} 024 e
on{a; b} holds. graphplan returns the following False set:
F
on{a;b} GP
= fclear{b}, on-table{a}, holding{b}, holding{a}, on{a,c}, on{c,b}, on{b,a}g
The action unstack{a,b} is not contained in O
on{a;b} because it deletes on{a; b}. The
only action which adds on{b; c} is stack{b,c}. It needs the preconditions
clear{c} and
holding{b}. The second precondition holding{b} iscontained in the set of
false facts,
i.e., holding{b} 2 F
on{a;b}
GP
and thus, we conclude on{b; c} 024
e
on{a; b}. Altogether, we have on{a; b} 6024 e
on{b; c} and on{b; c} 024
e
on{a; b}, whichcorrectly reflects the intuition that b
needs to be stacked onto c befor e a can be stacked onto b. Although 024
e
appears to impose very strict conditions on a domain in order to derive
a
reasonable goal ordering, it succeeds in finding reasonable goal orderings
in all available test
domainsin which such orderings exists. For example, in the tyreworld, in
bul ldozer problems,
in the shopping problem {Russel & Norvig, 1995}, thefridgeworld, the glass
domain, the tower of hanoi domain, the link-world, and the woo dshop. Its
only disadvantage are the computational resources it requires, since buildingplanning
graphs, while being theoretically polynomial, is a quite time- and memory-consuming
thing to do.
3 Therefore, the next section presents a fast heuristic computation of goal
orderings, which analyzes the domain actions directly and does not need to
buildplanning graphs anymore.2. Note that the goals do not specify where
the block c has to go, but leave this to the planner.
3. More recent implementations of planning graphs, which are for example
developed for STAN {Fox & Long, 1999} and IPP 4.0 {Koehler, 1999} do not
build the graphs explicitly anymore and are orders of
magnitude faster than the original graphplan implementation, but still the
computation of the planning
graph takes almost all the time that is needed to determine the 024
e
relations.
348On Reasonable and Forced Goal Orderings
3.2 Reasonable Goal Orderings derived by a Fast Heuristic Method One can
analyze the available actions directly using a method we will call Dire ct
Analysis
{DA}. It determines an initial value for F by computing the intersection
of all delete lists of
all actions which contain A in their add list, as defined in the following
equation.
F
A DA
:=
o 2 O; A 2 add{o}
del{o} {3}
The atoms in this set are all false in a state where A has just been achieved:
they are
deleted from the state description independently of the action that is used
to add A. As a
short example, let us consider the two actions
000! ADD fAg DEL fC; Dg
000! ADD fA; C g DEL fDg
Only the atom D is deleted by both actions, and thus D is the only element
initially contained in F
A DA
.
However, Equation {3} only says that when A is added then the atoms from
F
A
DA
will be
deleted. It does not say anything about whether it might be possible to
reestablish atoms in F
A
DA . One can easily imagine that actions exist, which leave A true, and
at the same
time add such atoms. If this is the case, there are reachable states in
which A and atoms
from F
A
DA
hold.
Now, our goal is to derive an ordering relation that can be easily computed,
and that
ideally, like the 024
e
relation, is sufficient for the 024
r
relation. Therefore, we want to make sure that the atoms in F
A
DA
are really false in any state after A has been achieved. We arrive at an
approximation of atoms that remain false by performing a fixpoint reduction
on the F
A
DA
set, removing those atoms that are achievable in the following sense.
Definition 11 {Achievable Atoms} An atom p is achievable from a state s
given an
action set O {written A{s; p; O}} if and only if p 2 s _ 9 o 2 O : p 2 add{o}
^ 8 p
0
2 pre{o} : A{s; p
0
; O}
The definition says that an atom p is achievable from a state s if it holds
in s, or if there exists an action in the domain, which adds p and whose
preconditions are all achievable
from s. This is a necessary condition for the existence of a plan P
O
from s to a state where
p holds. Lemma 2 9 P
O : p 2 Result{s; P
O
} } A{s; p; O} Proof: The atom p must either already be contained in the
state s, or it has to be added by a step o out of P O
. In the second case, all preconditions of o need to be established by
P
O
in the same way. Thus p and all preconditions of the step, which adds it,
are achievable
in the sense of Definition 11.349Koehler & Hoffmann There are two obvious
difficulties with Definition 11: First, p2 s must be tested. With
complete knowledge about the state s, this should not cause any problems.
In our case, however, we only have the generic state s
{A;:B}
and cannot decide whether an arbitrary
atom is contained in it or not. Secondly, we observe an infinite regression
over preconditions,
which must be tested for achievability . As for the first problem, it turns
out that it is a good heuristic to simply assume p62 s,
i.e., no test is performed at all. As for the second problem, in order to
avoid infinite
looping of the \achievable"-test, one needs to terminate the regression
over preconditions at a particular level. The point in question is how far
to regress? A quick approximation
simply decides \achievable" after the first recursive call. Definition
12 {Possibly Achievable Atoms} An atom p is possiblyachievable given an action
set O {written pA{p; O}} if and only if
9 o 2 O : p 2 add{o} ^ 8 p
0
2 pre{o} : 9 o
0
2 O : p
0
2 add{o
0
}
holds, i.e.,there is an action that adds p and al l of its pre c onditions
are add effects of other
actions in O.
If the assumption is justified that none of the atoms p is contained in
the state s, then being possibly achievable is a necessary condition for
being achievable.
Lemma 3 Let s be a state for which p 62 s and also 8o 2 O : p 2 add{o} }
pre{o} \ s = ;
holds. Then we have
A{s; p; O} } pA{p; O}
Proof: From A{s; p; O} and p 62 s, we know that there is a step o 2 O, p
2 add{o}, with
8 p
0
2 pre{o} A{s; p 0
; O}. We also know that pre{o} \ s = ;, so for each p
0
2 pre{o} there
must be an achiever o 0
2 O : p 0
2 add{o 0
}.The condition that all of the facts p must not be contained in the state
s seems to be
rather rigid. Nevertheless, the condition of being possibly achievable delivers
good results onall of the benchmark domains and it is easy to decide. We
can now use this test to both
* perform a fixpoint reduction on the set F
A
DA
and
* decide whether an atomic goal B should be ordered before A.
The fixpoint reduction, as depicted in Figure 1 below, uses the approximative
test pA{f ; O 003
}
to remove facts from F
A
DA that can be achieved. It finds al l these facts under certain restrictions,
see below. As a side effect of the fixpoint algorithm, we obtain the set
O
003 of
actions that our method assumes to be applicable after a state s {A;:B}
. We then order B
before A iff it cannot possiblybe achieved using these actions.
350On Reasonable and Forced Goal Orderings
F
003 := F
A
DA O
003
:= O A
nfo j F 003
\ pre{o} 6= ;g
f ixpointreached := false
while :f ixpointreached
f ixpointreached := true for f 2 F
003 if pA{f ; O
003
} then F
003
:= F
003
nff g
O
003
:= O
A
n fo j F
003 \ pre{o} 6= ;g
f ixpointreached := false endif
endfor
endwhile return F
003
, O
003
Figure 1: Quick, heuristic fixpoint reduction of the set F
A
DA
.
Thecomputation checks whether atoms of F
003
, which is initially set to F
A
DA
, are possibly
achievable using only those actions, which do not delete A and which do
not require atoms
from F
003 as a precondition. Achievable atoms are removed from F
003
, and O
003 gets updated
accordingly. If in one iteration, F
003
does not change, the fixpoint is reached, i.e., F
003
will not
further decrease and O
003
will not further increase|the final sets F
003 of false facts and O
003
of
applicable actions are returned.
Let us illustrate the fixpoint computation with a short example consisting
of the empty
initial state, the goals fA; B g, and the following set of actions
op1: 000! ADD f A g DEL f C, D g
op2: 000! ADD fA, C g DEL f D g
op3: f C g 000! ADD f D g
op4: f D g 000! ADD f B g
When assuming that A has been achieved, we obtain F
003
= F
A
DA = fDg as the initial value of the False set, since D is the only atom
that op1 and op2 delete when adding A. Figure 2 illustrates a hypothetical
planning process. Starting in the empty initial state
and trying to achieve A first, we get two different states s
{A;:B} in which A holds. The atom D does not hold in any of them and thus
in both states, no action is applicable that
requires D as a precondition. This excludes op4 from O
A
, yieldingthe initial action set
O
003 = fop1; op2; op3g. Now, op4 is the only action that can add B . Therefore,
if we used
this action set to see if B can still be achieved, we would find that this
is not the case.
Consequently, without performing the fixpoint computation, we would order
B beforeA.
Butas can be seen in Figure 2, this would not be a reasonable ordering:
there is the plan
hop3 ;op4i that achieves B fromthe state s
{A;:B} = Result{I ;op2} without destroying A.
The fixpoint computation works us around this problem as follows: There
is the ac-
tionop3, which can add the precondition D of op4 without deleting A. When
checking pA{D; O
003
} in the first iteration, the fixpoint procedure finds this action. It then
checks
351Koehler & Hoffmann whether the preconditions of op3 are achievable in
the sense that they are added by an-
other action. This is the case since the only precondition C is added by
op2. Thus, D is
removedfrom F
003
, which becomes empty now. The action op4 is put back into the set O
003
,
which now becomes identical with the action set O
A . This set, inturn, is identical with the
original action set O as no action deletes A. The fixpoint process terminates
and B will
not be ordered before A as it can be achieved using the action op4. This
correctly reflects
the fact that there exists a plan from the state s {A;:B}
= Result{I ; hop2i} = fC; Ag to a
state that satisfies B withoutdestroying A.
Jana Koehler0/op1op2op3op4C, A, DC, A, D, BC, AADeadlockD holds in a state
satisfying AThere is a plan from A to BFigure 2: An example illustrating
why we need the fixpoint computation.
As already pointed out, the intention behind the fixpoint procedure is the
following:
Starting from a state s
{A;:B} , we want to know which facts can become true without
destroying A, and consequently, which actions can become applicable. In
the first step,
only actions that do not use any of the facts in F
A
DA
are applicable, as all those facts are
deleted from the state descriptionwhen A is added. However, such actions
may make facts
in F
A
DA
true, so we want to remove those facts from F
A
DA
. If we manage to find al l the facts
that can be made true without destroying A, then the final set F
003
will contain only those facts that do not hold in a state reachable from
s
{A;:B}
without destroying A. In this case,
the final action set O
003
will contain al l the actions that can be applied after s
{A;:B}
, and we
can safely use this action set to determine whether another goal B canstill
be achieved or not.
However, aswe only use the approximative test pA{f ; O
003
} with f 2 F
003
to find out if
afact in the current F
003
set is achievable, there may be facts which are achievable without
destroying A, butwhich remain in the set F 003
. This could exclude actions from the set
O
003
which can be safely applied after s
{A;:B}
. Under certain restrictions, however, we can
prove that this will not happen. In order to do so, we need to impose a
restriction on the particular state s
{A;:B}
, in which we achieved the goal A: If none of the preconditions of
actions, whichadd facts contained in F
A
DA
, occur in the state s
{A;:B} , then the fixpoint
procedure will remove all facts from F
A
DA
that are achievable without destroying A. We will
use this property of the fixpoint procedure later to show that our heuristic
ordering relation
approximates reasonable orderings.
352On Reasonable and Forced Goal Orderings
Lemma 4 Let {O; I ; G } be a planning problem, and let A 2 G be an atomic
goal. Let
s
{A;:B}
b e a re achable state where A has just be en achieved. Let P O
A
= ho 1
; : : : ; o n
i be
a sequenc e of actions not destroying A. Let F 003
be the set of facts that is returne d by the
fixpoint computation depicted in Figure 1. If we have 8f 2 F
A DA
: 8o 2 O A
: f 2 add{o} } pre{o} \ s
{A;:B}
= ; {003}
then no fact in F
003
holds in the state that is re ache d by applying P O
A
, i.e., Result{s
{A;:B}
; P O
A
} \ F
003
= ; Proof:
Let F
003
j
and O 003
j
denote the state of the fact and action sets, respectively, after j iterations
of the algorithm depicted in Figure 1. As F
003
only decreases during the computation, we have
F 003
022 F
003
j
for all j . Let s
0
; : : : ; s
n
denote the sequence of states that are encountered when
executing P
O
A
= ho 1
; : : : ; o n
i in s
{A;:B}
, i.e., s
0
= s
{A;:B}
and s
i
= Result{s
i0001
; ho
i
i} for
0 024 i 024 n. We can assume that each action o
i
is applicable in state s
i0001 , i.e., pre{o
i
} 022 s
i0001
.
Otherwise, o
i
does not cause any state transition, and we can skip it from P
O
A
. Obviously,
we have s
n
= Result{s
{A;:B}
; P
O A
}, so we need to show that s
n
\F 003
= ;. The proof proceeds
by induction over the length n of P
O A
.
n = 0: P
O
A
= hi and s n
= s
0 = s
{A;:B}
. All facts in F
A
DA
are deleted from the state
description when A is added, so we have s n
\ F
A DA
= ;. As F
A
DA
= F
003
0 and F
003
022 F
003
0 , the
proposition follows immediately.
n ! n + 1: P
O
A
= ho 1
; : : : ; o n
; o
n+1 i. From the induction hypothesis, we know that
s i
\ F
003 = ; for 0024 i 024 n. What we need to show is s
n+1
\ F
003
= ;.
Let j be the step in the fixpoint iteration where F
003
j
S
i=0;:::;n
s
i
becomes empty, i.e., j
denotes the iteration in which the intersection of all the states s
i
; i 024 n with F
003
j
is empty
for the first time. Such an iteration exists, because all the intersections
s
i \ F
003
with i 024 n
are empty.
Now each action o i
; 1 024 i 024 n + 1 is applicable in state s
i0001
, i.e., pre{o
i
} 022 s
i0001 , and
thus pre{o
i
} \ F
003
j
= ; for all the actions o
i
in P
O A
. Therefore, all these actions are contained
in O 003
j
, as this set contains all the actions out of O A
whose intersection with F
003
j
is empty.
Let us focus on the facts in the state s
n+1 . All these facts are achieved by executing P
O
A in
s
{A;:B}
. In other words, there is a plan from s
{A;:B}
to each of these facts. As we have just
seen, this plan consists out of actions in O
003
j
. Applying Lemma 2 to all the facts p 2 s
n+1
using s
{A;:B}
and P
O
A
{= P
O 003
j
}, we know that all facts p are achievable using actions from
O
003 j
.
8p2 s
n+1
: A{s
{A;:B}
; p; O
003
j
}
We willnow show that those facts f 2 s
n+1 we are interested in, namely the F facts that are
added by o
n+1
and that are still contained in F
j
, are also possibly achievable using actions
from O 003
j
. Let f be a fact f 2 s
n+1
, f 2 F
003
j . We apply Lemma 3 using s
{A;:B} , f , and
353Koehler & Hoffmann O
003
j
. We can apply Lemma 3 as obviously f 62 s
{A;:B}
, and as 8o 2 O
003
j
: f 2 add{o} }
pre{o} \ s
{A;:B}
= ; by prerequisite {003}. With A{s
{A;:B} ; p; O
003
j
}, we arrive at 8f 2 s
n+1
\ F
003 j
: pA{f ; O
003
j }
What remains to be proven is that all these facts f will be removed from
F
003 during the
fixpoint computation. With the argumentation above, it is sufficient to
show that all the facts f 2 s
n+1
\ F
003 j
will get tested for pA{f ; O
003
j
} in iteration j + 1 of the fixpoint computation.
These tests will succeed and lead to s
n+1
\ F
003
j+1 = ;, yielding, as desired, s
n+1
\ F
003
= ;.
Remember that F
003
j+1
F
003
. There are two cases, which we need to consider:
1. j = 0: all intersections s
i
\ F
003
0 are initially empty, i.e., s
i
\ F
A
DA
= ; for 0024 i 024 n. In
this case, all facts f 2 s
n+1 \ F
A
DA are tested for pA{f ; O
003
0
} in iteration j + 1 = 1 of
the fixpoint computation.
2. j > 0: in this case, at least one of the intersections s
i
\ F
003
j
became empty in iteration
j by definition of j , i.e.,at least one fact was removed from F
003
in this iteration.
Therefore,the fixpoint has not been reached yet, andthe computation performs
at least one more iteration, namely iteration j + 1. All facts in F 003
j
will be tested in this
iteration, in particular all facts f 2 s
n+1
\ F
003
j .
With these observations, the induction is complete and the proposition is
proven.As has already been said, we now simply order B beforeA, if it is
not possiblyachievable
using the action set that resulted from the fixpoint computation. The ordering
relation 024
h
{where h stands for \heuristic"} obtained in this way approximates the
reasonable goal
ordering 024 r
.
Definition 13 {Heuristic Ordering 024
h } Let {O; I ; G fA; B g} be a planning problem.
L et O
003 be the set of actions that is obtained from O by performing the fixpoint
computation
shown in Figure 1.
The ordering B024
h
A holds if and only if
:pA{B ; O 003
}
If A has been reached in a particular state s
{A;:B} where the assumptions made by the fixpoint computation and by the
test for pA{B ; O 003
} are justified, then being not pos-
sibly achievable is a sufficient condition for the non-existence of a plan
to B thatdoes not
temporarily destroy A. Theorem 5 Let {O; I ; G } be a planning problem,
and let A; B 2 G be two atomic goals. Let
s
{A;:B}
b e a re achable state where A has just be en achieved, but B is stil l
false, i.e., B 62
s
{A;:B} . Let F
003 and O
003
be the sets of facts and actions, resp e ctively, that are derived by the
fixpoint computation shown in Figure 1. If we have
8f 2 F
A
DA
[ fB g : 8o 2 O A
: f 2 add{o} } pre{o} \ s
{A;:B}
= ; {003003}
then we have :pA{B ; O 003
} } :9P O
A
: B 2 Result{s
{A;:B}
; P
O
A
}
354On Reasonable and Forced Goal Orderings
Proof: Assume that there is a plan P
O A
= ho
1
; : : : ; o
n
i that does not destroy A, but
achieves B , i.e., B 2 Result{s
{A;:B}
; ho
1 ; : : : ; o
n
i}. With the restriction of {003003} to the
facts in F
A
DA
, Lemma 4 can be applied to each action sequence ho
1
; : : : ; o
i0001 i yielding
Result{s
{A;:B}
; ho
1 ; : : : ; o
i0001
i} \ F
003
= ;. Consequently, each o
i
is either
* not applicable in Result{s
{A;:B}
; ho
1 ; : : : ; o
i0001
i},
* or its preconditions are contained in Result{s
{A;:B}
; ho 1
; : : : ; o
i0001
i}, yielding pre{o
i
} F
003
= ;.
In the first case, we simply skip o
i
as it does not have any effects. In the second case,
o
i 2 O
003
follows. Thus, we have a plan constructed out of actions in O
003
that achieves B
from s
{A;:B} . Applying Lemma 2 leads us to A{s
{A;:B}
; B ; O 003
}. We have B 62s
{A;:B}
.
We also know, from {003003} withrespect to B , as O
003
022 O
A , that 8o 2 O
003
: B 2 add{o} }
pre{o} \ s
{A;:B}
= ; holds. Therefore, we can now apply Lemma 3 and arrive at pA{B ; O
003
},
which is a contradiction.We return to the blocks world example and show
how the computation of 024
h
proceeds.
Let us first investigate whether on{a; b} 024
h
on{b; c} holds. The initial value for F
on{b;c}
DA
is
obtained from the delete list of the stack{b,c} action, which is the only
one that adds this
goal.
F on{b;c}
DA = fclear{c}; holding{b}g Intuitively, it is immediately clear that neither
of these facts can ever hold in a state
where on{b; c} is true: if b is on c, then c is not clear and the gripper
cannot hold b. It
turns out that the fixpoint computation respects this intuition and leaves
the set F
on{b;c} DA
unchanged,yielding F
003
= fclear{c}; holding{b}g. We do not repeat the fixpoint process in
detail here, because it can be reconstructed from Figure 1 and the details
are not necessary for understanding how the correct ordering relations are
derived. In short, for both facts
there are achievers in the reduced action set, but all of them need preconditions
for which
no achiever is available. For example, holding{b} can be achieved by either
an unstack or
a pickup action. Both either need b to stand on another block or to stand
on the table. All actions that can achieve these facts need holding{b} to
be true and are thus excluded
from the reduced action set.
After finishing the fixpoint computation, the planner tests pA{on{a; b};
O
003
}, where O
003
contains all actions except those that delete on{b; c} and those that use
clear{c} or holding{b}
as a precondition. It finds that the action stack{a,b} adds on{a; b}. The
preconditions
of this action are holding{a} and clear{b}. These conditions are added by
the actions
pickup{a} and unstack{a,b}, respectively, which are both contained in O
003
: neither of them needs c to be clear or b to be in the gripper. Thus, the
test finds that in fact, on{a; b}
is possibly achievable using the actions in O 003
, and no ordering is derived, i.e., on{a; b} 6024
h
on{b; c} follows.
Now, the other way round, on{b; c} 024
h
on{a; b} is tested. The initial value for F
on{a;b}
DA
is
obtained from the single action stack{a,b} as
F
on{a;b}
DA = fclear{b}; holding{a}g
355Koehler & Hoffmann Again, the fixpoint computation does not cause any
changes, resulting in F
003
= fclear{b};
holding{a}g. The process now tests whether pA{on{b; c}; O
003
} holds, where O
003 contains
all actions except those that delete on{a; b} and those that use clear{b}
or holding{a} as a precondition. The only action that can add on{b; c} is
stack{b,c}. This action needs
as preconditions the facts holding{b} and clear{c}. The process now finds
that a crucial
conditionfor achieving the first fact is violated: Each action that can
achieve holding{b}
has clear{b} as a precondition, because b must be clear first before the
gripper can hold it.
Since clear{b} is an element of F 003
, none of the actions achieving holding{b} is contained in
O
003
. Consequently, the test for pA{on{b; c}; O
003
} fails and we obtain the ordering on{b; c} 024
h
on{a; b}. This makes sense as the gripper cannot grasp b and stack it onto
c anymore, once
on{a; b} is achieved.
3.3 On Forced Goal Orderings and Invertible Planning Problems
Sofar, we have introducedtwo easily computable ordering relations 024 h
and 024
e
that both
approximate the reasonable goal ordering 024
r
. One might wonder why we do not invest any
effort in trying to find forc e d goal orderings. There are two reasons
for that: 1. As we have already seen in Section 2, any forced goal ordering
is also a reasonable
goal ordering, i.e., a method that approximates the latter can also be used
as a crude
approximation to the former. 2. Many benchmark planning problems are invertible
in a certain sense. Those problems
do not contain forced orderings anyway.
In this section, we elaborate in detail the second argument. The results
are a bit more general than necessary at this point. We want to make use
of them later when we show that
the Agenda-Driven planning algorithm we propose is complete with respect
to a certain class
of planning problems. We proceed by formally defining this class of planning
problems, show that these problems do not contain forced orderings, and identify
a sufficient criterion for
the membership of a problem in this class. Finally, we demonstrate that
many benchmark
planning problems do in fact satisfy this criterion. For a start, we introduce
the notion of
a dead lock in a planning problem. Definition 14 {Deadlock} Let {O; I ;
G } be a planning problem. A re achable state s is
cal led a deadlock iff there is no sequenc e of actions that leads from
s to the goal, i.e.,iff
s = Result{I ; P
O
} and :9 P
0 O
: G 022 Result{s; P
0
O
}.
The class of planning problems we are interested in is the class of problems
that are dead lock-fr e e. Naturally, a problem is called deadlock-free if
none of its reachable states is a deadlock in the sense of Definition 14.
Non-trivial forced goal orderings imply the existence of deadlocks {remember
that an ordering B 024
f A or B 024
r A is called trivial iff there is no state s
{A;:B}
at all}.
Lemma 5 Let {O; I ; G } be a planning problem, and let A; B 2 G be two atomic
goals. If there is a non-trivial forc e d ordering B 024 f
A betwe en A and B , then there exists a dead lock
state s in the problem.
356On Reasonable and Forced Goal Orderings
Proof: Recalling Definition 9 and assuming non-triviality of 024
f
, we know that there is
at least one state s {A;:B}
where A is made true, but B is still false. From Definition 7,
we know that there is no plan in any such state that achieves B . In particular,
it is not
possible to achieve all goals starting out from s
{A;:B}
. Thus, the state s := s
{A;:B}
must be
a deadlock.We willnow investigate deadlocks in more detail and discuss that
most of the commonly
used benchmark problems do not contain them, i.e., they are deadlock-free.
With Lemma 5,
we then also know that such domains do not contain non-trivial forced goal
orderings either|so there is not much point in trying to find them. We do
not care about trivial goal
orderings. Such orderings force any reasonable planning algorithm to consider
the goals in
the correct order. The existence of deadlocks depends on structural properties
of a planningproblem: There must be action sequences, which, once executed,
lead into states from which the goals
cannot be reached anymore. These sequences must have undesiredeffects, whichcannot
be
inverted by any other sequence of actions in O. Changing perspective, one
obtains a hint on how a sufficient condition for the non-existence of deadlocks
might be defined. Assume
we have a planning problem where the effects of each action sequence in
the domain can
be inverted by executing a certain other sequence of actions. In such an
invertible planning
problem,it is in particular possible to get back to the initial state from
each reachable state.
Therefore, if such a problem is solvable, then it does not contain deadlocks:
From any state,
one can reach all goals by going back to the initial state first, and then
execute an arbitrary
solution thereafter. We willnow formally define the notion of invertible
planning problems,
and turn the above argumentation into a proof.
Definition 15 {Invertible Planning Problem} Let {O; I ; G } be a planning
problem, and let s denote the states that are re achable from I with actions
from O. The problem is cal led
invertible if and only if 8 s : 8 P
O
: 9P
O
: Result{Result{s; P O
};P
O
} = s Theorem 6 Let {O; I ; G } be an invertible planning problem, for which
a solution exists. Then {O; I ; G } does not contain any dead locks.
Proof: Let s= Result{I ; P
O
s
} be an arbitrary reachable state. As the problem is invert- ible, we know
that there is a sequence of actionsP
O s
for which Result{s;P
O s
} = I holds. As the problem is solvable, we have a solution plan P
O starting from I and achieving G 022 Result{I ; P
O
}. Together, we obtain G 022Result{Result{s;P
O
s }; P
O
}. Therefore, the
concatenation ofP
O
s and P
O
is a solution plan executable in s and consequently, s is no
deadlock.We now know that invertible planning problems, if solvable, do
not contain deadlocks and consequently, they do not contain {non-trivial}
forced goal orderings. What we will see next
is that, as a matter of fact, most benchmark planning problems are invertible.
We arrive
ata sufficient condition for invertibility through the notion of inverse
actions.
357Koehler & Hoffmann Definition 16 {Inverse Action} Given an action set
O containing an action o of the
form pre{o} 000! add{o} del{o}. An actiono 2 O is cal led inverse to o
if and only ifo has
the form pre{o} 000! add{o} del{o} and satisfies the fol lowing conditions
1. pre{o} 022 pre{o} [ add{o} n del{o}
2. add{o} = del{o} 3. del{o} = add{o}
Under certain conditions, applying an inverse action leads back to the state
one started from.
Lemma 6 Let s be a state and o be an action, which is applicable in s. If
del{o} 022 pre{o}
and s \ add{o} = ; hold, then an actiono that is inverse to o in the sense
of Definition 16 is applicable in Result{s; hoi} and Result{Result{s; hoi};
hoi} = s fol lows. Proof: As o is applicable in s, we have pre{o} 022 s.
The atoms in add{o} are added, and theatoms in del{o} are removed from s,
so altogether we have Result{s; hoi} {pre{o}[ add{o}) n del{o} pre{o}
Thus,o is applicable in Result{s; hoi}.
Furthermore, we have Result{s; hoi} = s [ add{o} n del{o} and with that
Result{Result{s; hoi}; hoi}
= Result{s [ add{o} n del{o}; hoi}
= {s [ add{o} n del{o}) [ add{o} n del{o}
= {s [ add{o} n del{o}) [ del{o} n add{o} {cf. Definition 16}
= s [ add{o} n add{o} {because del{o} 022 pre{o} 022 s}
= s {because s \ add{o} = ;}Lemma 6 states two prerequisites: {1} inclusionof
the operator's delete list in its precon-
ditions and {2} an empty intersection of the operator's add list with the
state where it is
applicable. A planning problem is called invertible if it meets both prerequisites
and if there
is an inverse to each action.
Theorem 7 Given a planning problem {O; I ; G } with the set of ground actions
O satisfying
del{o} 022 pre{o} and pre{o} 022 s } add{o} \ s = ; for al l actions
and re achable states s. If
there is an inverse actiono 2 O for each action o 2 O, then the problem
is invertible.
Proof: Let s be a reachable state, andlet P
O
= ho 1
; : : : o n
i be a sequence of actions. We
need to show the existence of a sequenceP
O
for which Result{Result{s; P
O
};P
O
} = s {003 003 003}
358On Reasonable and Forced Goal Orderings
holds. We defineP
O := ho
n
; : : : ;o
1
i, and prove {003 003 003} by induction over n. n = 0: Here, we have
P
O
=P
O = hi, and Result{Result{s; hi}; hi} = s is obvious. n! n + 1: Now P
O
= ho
1
; : : : ; o
n
; o n+1
i. From the induction hypothesis we know that Result{Result{s; ho
1
; : : : ; o
n
i}; ho
n
; : : : ;o
1
i} = s. To make the following a bit more readable,
let s
0
denote s
0
:= Result{s; ho
1
; : : : ; o
n
i}. We have
Result{Result{s; ho 1
; : : : ; o n+1
i}; ho
n+1 ; : : : ;o 1
i}
= Result{Result{s
0
; ho n+1
i}; ho
n+1 ; : : : ;o 1
i}
= Result{Result{Result{s
0 ; ho
n+1 i}; ho
n+1
i}; ho
n ; : : : ;o 1
i}
= Result{s
0 ; ho
n
; : : : ;o
1
i} {cf. Lemma 6 ons
0 and o
n+1
}
= s {per induction}Altogether, we know now that invertible problems, if
solvable, do not contain forced orderings. We also know that problems, where
there is an inverse action to each action in
O, are invertible following Theorem 7. Theorem 7 requires del{o} 022 pre{o}
to hold for each action o, and pre{o} 022 s } add{o} \ s = ; to hold for
all actions and reachable states s.
We willsee that all conditions, {a} inclusion of the delete list in the
precondition list, {b}
empty intersection of an action's add list with reachable states where it
is applicable, and
{c} existence of inverse actions, hold in most currently used benchmark
domains.
4
Concerning the condition {a} that actions only delete facts they require
as precondi- tions, one finds this phenomenon in al l domains that are commonly
used by the planning
community, at least in those that are known to the authors. It is just something
that seems to hold in any reasonable logical problem formulation. Some authors
even postulate it as
anassumption for their algorithms to work, cf. {Fox & Long, 1998}. Similarly
in the case of conditions {b} and {c}: One usually finds inverse actions
in
benchmark domains. Also, an action's preconditions usually imply|by state
invariants|
that its add effects are all false. For example in the blocks world, stackand
unstack
actions invert each other, andan action's add effects are exclusive of its
preconditions|
the former are contained in the union of the False constructed for the preconditions,
see Section 3.1. Similarly in domains that deal with logistics problems,
for example logistics,
tr ains, ferry, gripper etc., one can often find inverse pairs of actions
with their preconditions
always excluding the add effects. Sometimes, two different ground instances
of the same
operator schema yield an inverse pair. For example, in gripper, the two
ground instances
move{roomA, roomB}
at-robby{roomA} 000! ADD at-robby{roomB} DEL at-robby{roomA}.
and4. Inorder to avoid reasoning about reachable states in condition {b},
one could also postulate that an
action has all of its add effects as negative preconditions, cf.{Jonsson,
Haslum, & Bªackstrª om, 2000}.
This is, however, not commonly used in the typical planning benchmark problems.
359Koehler & Hoffmann move{roomB, roomA}
at-robby{roomB} 000! ADD at-robby{roomA} DEL at-robby{roomB}.
of the move{?from,?to} operator schema invert each other. Similarly, in
towers of hanoi,
where there is only the single move operator schema, aninverse instance
can be found
for each ground instance of the schema, and the add effects are always false
when the
preconditions are true.
Only very rarely, non-invertible actions can be found in benchmark domains.
If they
occur, their role in the domain is often quite limited as for example the
operators cuss and
inflate in Russel's Tyreworld.
cuss
000! DEL annoyed{}.
inflate{?x:wheel} have{pump} not-inflated{?x} intact{?x} 000! ADD inflated{?x}
DEL not-inflated{?x}.
Obviously, there is not much point in defining something like a decuss or
a deflate operator. More formally speaking, none of the ground actions to
these operators destroys
a goal or a precondition of any other action in the domain. Therefore, it
does not matter
that their effects cannot be inverted. In particular, no forced goal ordering
can be derived
wrt. these actions. 5
The importance of inverse actions in real-world domains has also been discussed
by
Nayak and Williams {1997}, who describe the planner BURTON controlling the
Cassini
spacecraft. In contrast to these domains, problemssuch as those for example
used by
Barrett et al. in {1994} almost never contain inverse actions. Consequently,
in these domains
plenty of forced goal orderings could be discovered and used by a planner
to avoid deadlock
situations. The widespread, although perhaps unconscious use of invertible
problems for
benchmarking is a current phenomenon related to STRIPS descending planning
systems. As
one of the anonymous reviewers pointed out to us, quite a number of non-invertible
planning
problems have also been proposed in the planning literature, e.g., the re
gister assignment
problem {Nilsson, 1980}, therob ot crossing a ro ad problem {Sanborn & Hendler,
1988}, some
instances of manufacturing problems {Regli, Gupta, & Nau, 1995}, and the
Yale Shooting problem {McDermott &Hanks, 1987}. For these problems, i.e.,
for problems that are not
invertible, one could|in the spirit of argument 1 at the very beginning
of this section|
simply use 024 e
and 024
h to approximate forced orderings if one is interested in finding at least
those. More precisely, 024 e
and 024
h
are methods that might detect forced orderings|as those
are also reasonable|but that might also find more, not necessarily forced,
orderings. If
one is not interested in finding only the forced orderings, this is a possible
way to go. For
example, in a simple blocks world modificationwhere blocks cannot be unstacked
anymore
once they are stacked|which forc es the planner to buildthe stacks bottom
up|both 024
e and 024
h
are still capable of finding the correct goal orderings.5. The cuss operator,
by the way, is the only one known to the authors that deletes a fact it is
not using
as a precondition. It is also the only one we know that could be removed
from the domain description without changing anything.
360On Reasonable and Forced Goal Orderings
3.4 An Extension of Goal Orderings to ADL Actions
The orderings, whichhave been introduced so far, can be easily extended
to deal with
ground ADL actions having conditional effects and using negation instead
of delete lists. Such actions have the following syntactic structure:
o : 036 0
{o} = pre 0
{o} 000! eff
+
0
{o}; eff
000
0
{o} 036
1
{o} = pre
1
{o} 000! eff
+
1
{o}; eff
000
1 {o}
.
. .
036
n
{o} = pre
n
{o} 000! eff
+
n
{o}; eff
000
n
{o}
All unconditional elements of the action are summarized in 036
0
{o}: The precondition of the action is denoted with pre 0
{o}, andits unconditional positive and negative effects with eff
+
0 {o} and eff
000
0
{o}, respectively. Each conditional effect 036
i
{o} consists of an effect
condition {antecedent} pre
i
{o}, and the positive and negative effects eff
+
i
{o} and eff
000
i
{o}.
Additionally, we denote with 010{o} the set of all unconditional and conditional
effects, i.e., 010{o} = f036
0
{o}; 036
1
{o}; : : : ; 036
n {o}g.
The computation of 024
e
immediately carries over to ADL actions when an extension of
planning graphs is used, which can handle conditional effects, e.g., IPP
{Koehler, Nebel,
Hoffmann, & Dimopoulos, 1997} or SGP {Anderson & Weld, 1998}. One simply
takes the
set of exclusive facts that is returned by these systems to determine the
set F A
GP
. The test
from Definition 10, which decides whether there is an ordering B 024 e
A of two atomic goals A and B , is extended to ADL as follows.
Definition 17 {Ordering 024
e
for ADL} Let {O; I ; G fA; B g} be a planning problem. L et F
A
GP be the False set for A. The ordering B 024
e
A holds if and only if
8 o 2 O; 036
i
{o} 2 010{o} : B 2 eff
+
i
{o} ^ A 62 D i
{o} } {pre
i
{o}[ pre
0 {o}) \ F
A
GP
6= ; Here, D
i
{o} denotes al l negative effects that are implied by the conditions of
036
i
{o}.
D
i
{o} :=
032
eff
000
0
{o} [
S pre
j
{o} 022 pre
i {o}
eff 000
j
{o} i 6= 0
eff 000
0
{o} i = 0
Thus, B is ordered before A if all {unconditional or conditional} effects
that add B either
imply an effect that deletes A, or need conditions that cannot be made true
together with
A. Note that an effect 036 i
requires all the conditions in pre
i
{o} [ pre
0
{o} to be satisfied, which is impossible in any state where A holds because
of the non-empty intersection with
F
A GP
.
The computation of 024
h
requires a little more adaptation effort. In order to obtain the set F
A
DA , we now need to investigate the conditional effects as well. For each
action that
has A as a conditional or unconditional effect, we determine which atoms
are negated by
it, no matter which effect is used to achieve A. We obtain these atoms by
intersecting the
appropriate sets D i
{o}.
D{o} :=
\ A 2 eff
+ i
{o}
D
i
{o} 361Koehler & Hoffmann These are exactly the facts that are always deleted
by o when achieving A, no matter which
effect we use.
The intersection of the sets D{o} for all actions o yields the desired set
F
A DA
. Let us
consider the following small example to clarify the computation.
036
0 {o}= fU g 000! fW g f:X g;
036
1 {o} = fV ; W g 000! fAg f:X g;
036 2
{o} = fW g 000! fU g f:Y g
We obtain D
1
{o} = f:X g [ f:Yg = f:X; :Y g, because the precondition of 036
2
{o} is implied by the first conditional effect 036
1
{o}. As 036
1
{o} is the only effect that can achieve A,
we get D{o} = D
1
{o} = f:X; :Y g.
We obtain a smaller set D{o}, if we add A as an unconditionalpositive effect
of the
action.
036
0
{o} = fU g 000! fW; Ag f:X g;
036
1
{o} = fV ; W g 000! fAg f:X g; 036
2
{o} = fW g 000! fU g f:Y g
In this case, weneed to intersect the sets D
0
{o} = f:X g and D 1
{o} = f:X; :Y g,
yielding D{o} = f:X g. This reflects the fact that, when achieving A via
the unconditional
effect of o, only X gets removed from the state.
The fixpoint computation requires to adapt the computation of O
003
. First, we repeat the
same steps as in the case of simple STRIPS actions and consider the unconditionalnegative
effects and the intersection of the preconditions with the False set: O
003
:= O n fo j A 2 eff 000
0
{o}_ F
A
DA \ pre
0
{o} 6= ;g
Then, we additionallyremove from each action the conditional effects that
either imply the
deletion of A or have an impossibleeffect condition.
O 003
:= red{O
003
} = fred{o}jo 2 O 003
g
Here, red is a function red{o} : o 7! o
0 such that
010{o 0
}= 010{o}n f036
k
{o} j A 2 D
k
{o} _ pre k
{o} \ F
A
DA
6= ;g
Finally, we need to redefine Definition 12, which expresses the conditions
under which a
fact is believed to be possibly achievable given a certain set of operators
O.
Definition 18 {Possibly Achievable Atoms for ADL} An atom p is possibly
achiev- able given an action set O {written pA{p; O}} if and only if
9 o 2 O; 036
i
2 010{o} : p 2 eff
+
i
{o} ^
8 p
0
2 {pre
i
{o} [ pre
0
{o}) : 9 o
0
2 O; 036
i
0
2 010{o
0
} : p
0
2 eff
+
i
0
{o
0 }
holds, i.e., there is a positive effect for p and al l of its conditions
and pre c onditions can be
made true by other effects in the re duc e d action set. 362On Reasonable
and Forced Goal Orderings
The process, which decides whether an atomic goal B is heuristically ordered
before another
goal A {i.e., whether B 024
h A holds} proceeds in exactly the same way as described in
Section 3.2: The False set F A
DA
for A is reduced by the fixpoint computation, which remains
unchanged, but employs the updated routines for computing O
003
and for deciding pA{f ; O
003 }.
As a result, B is ordered before A {B 024 h
A} if and only if it is not possibly achievable
pA{B ; O
003 } using the action set that results from the fixpoint.
4. The Use of Goal Orderings During Planning After having determined the
ordering relations that hold between pairs of atomic goals
from a given goal set, the question is how to make use of them during planning.
Several
proposals have been made in the literature, see Section 6 for a detailed
discussion. In this paper, we propose a novel approach that extracts an explicit
ordering between subsets of
the goal set|called the goal agenda. The planner, in our case IPP, is then
run successively on the planning subproblems represented in the agenda.
4.1 The Goal Agenda The first step one has to take for computing the goal
agenda is to perform a so-called goal
analysis. During goal analysis, each pair A; B 2 G of atomic goals must
be examined in
order to find out whether an ordering relation A 024 B , or B 024 A, or
both, or none holds
between them. For the ordering relation 024, an arbitrary definition can
be used. In our experiments, the relation 024 was always either 024
e
or 024
h .
After having determined all ordering relations that hold between atomic
goals, we want
to split the goal set into smaller sets based on these relations, and we
want to order the
smaller sets, also based on these relations. More precisely, our goal is
to have a sequence of goal sets G
1
; : : : ; G
n
with
n
[
i=1
G
i = G
and
G i
\ G
j = ;
for i 6= j; 1 024 i; j 024 n. We also want the sequence of goal sets to
respect the ordering relations that have been derived between atomic goals.
To make this explicit, we first
introduce a simple representation for the detected atomic orderings: the
goal graph G.
G := {V ; E }
where
V := G
and
E :=f{A; B } 2 G 002 G jA 024 B g
Now, the desired properties, which the sequence of goal sets should possess,
can be easily
stated:
363Koehler & Hoffmann
* Goals A; B thatlie on a cycle in G belong to the same set, i.e., A; B
2 G
i .
* If G contains a path from a goal A to a goal B , but not vice versa, then
A is ordered
before B , i.e., A 2 G
i and B 2 G
j with i < j .
These are the only properties that appear to be reasonable for a goal-set
sequence respecting the atomic orderings. We willnow introduce a simple algorithmic
method that does produce
a sequence of goal sets which meets these requirements.
First of all, the transitive closure of G is computed. This can be done
in at most cubic time in the size of the goal set {Warshall, 1962}. Then,
for each node A in the transitive
closure, the ingoing edges A
in and outgoing edges A
out are counted. All disconnected nodes with A
in
= A
out
= 0 are moved into a separate set of goals G-sep containing now those
atomic goals, which do not participate in a 024 relation. For all other
nodes A, their degr e e
d{A} = A
in
000 A
out
is determined as the difference between the number of ingoing edges and
the number of outgoing edges. Nodes with identical degree are merged into
one set. The
sets are then ordered by increasing degree and yield our desired sequence
of goal sets. The
only problem remaining is the set G-sep. If it is non-empty, it is not clear
in which place to put it.
Let us consider a small example of the process. Figure 3 depicts on the
left the goal
graph, which results from the goal set G = fA; B ; C; D; E g and the ordering
relations
A 024 B ; B 024 C and B 024D, and its transitive closure on the right.Jana
KoehlerABCDEEABDCFigure 3: On the left, the goal graph depicting the 024
relations between the atomic subgoals.
On the right, the transitive closure of this graph. In Figure 4, the number
of in- and outgoing edges of each goal, the corresponding degrees,
and resulting goal-set sequence are shown.
Jana Koehler00E-3-12EG-sep0312002A2DCB{A}{B}{C,D}Figure 4: On the left, the
number of in- and outgoing edges for each node. On the right,
the degree of the nodes and the merged sets of goals having same degree.
The
node E becomesa member of the G-sep set and remains unordered.
It is not difficult to verify that the resulting goal sequence respects
the atomic goal orderings:
364On Reasonable and Forced Goal Orderings
* Nodes occurring on a cycle in a graph have isomorphicin- and outgoing
edges in the transitive closure of that graph. In particular, they have the
same degree and get
merged into the same set G
i
.
* Say we have a graph, where there is a path from A to B , but not vice
versa. Then,
in the transitive closure of that graph, we willhave an edge from A to each
node
that B has a path to, and additionally the edge from A to B , i.e., A out
> B
out follows. Similarly, we have an ingoing edge to B for each node that
has a path to
A, and additionally, the edge from A to B , whichgives us B
in
> A
in
. Altogether,
d{A} = A
in
000 A
out < B
in
000 A
out
< B in
000 B
out
= d{B } and thus, the degree of A is
smaller than the degree of B and as required, A gets ordered before B .
Note that nothing is said in this argumentation about the set of unordered
goals, G-
sep. This set could, in principle, be inserted anywhere in the sequence
with the resulting
sequence still respecting the atomic orderings. A possible heuristic may
use this goal set as
the first in the sequence, because apparently there is no problem to reach
all other goals
after the goals in this set have been achieved. Another heuristic could
put this set at the end as there is neither a problem to reach this goal
set from all other goals. We have decided to
deal with the problem in a more sophisticated way by trying to derive an
ordering relation
between G-sep and the other goal sets G i
that have already been derived. In order to do
so,we need to extend our definitions of goal orderings to sets of goals.
4.2 Extension of Goal Orderings to Goal Sets
Given a set of atomic goals, it has always been a problem which of the exponentially
many
subsets should be compared with each other in order to derive a reasonable
goal ordering
betweengoal sets. A consideration of all possible subsets is out of question,
because it will
result in an exponential overhead. The partial goal agenda that we have
obtained so far
offers one possible answer. It suggests taking the set G-sep and trying
to order it with
respect to the goal sets emerging from the goal graph. Given a planningproblem
{O; I ; G } and two subsets of atomic goals fA
1 ; : : : ; A
n g 022 G
and fB
1
; : : : ; B k
g 022 G , the definition of 024
e and 024
h
for sets of atomic goals is straightforward. For the sake of simplicity,
we consider only STRIPS actions here. The definitions can be
directly extended to ADL.
To define an ordering 024
E
, which extends 024
e
to sets, we begin by defining a set F fA
1
;:::;A n
g
GP
of all atoms, which are exclusive of at least one atomic goal A
i
in the planninggraph
generated for {O; I ; G }:
F
fA 1
;:::;A
n g
GP
:= fp j p is exclusive of at least one A
i
when the graph has leveled off g
The set O fA
1
;:::;A n
g
is obtained accordingly by removing from O all actions that delete
at least one of the A
i
, i.e., O fA
1
;:::;A n
g
= fo 2 O j 8 i 2 f1; : : : ; ng : A
i
62 del{o}g.
Definition 19 {Ordering 024
E
over Goal Sets} Let {O; I ; G } be a planning problem with
fA
1 ; : : : ; A
n g 022 G andfB 1
; : : : ; B k
g 022 G . Let F
fA
1
;:::;A
n g
GP
be the False set for fA 1
; : : : ; A n
g.
The ordering fB
1 ; : : : ; B
k g 024
E
fA
1
; : : : A n
g holds if and only if 9 j 2 f1; : : : ; kg : 8 o 2 O
fA
1 ;:::;A
n
g : B
j
2 add{o} } pre{o} \ F
fA 1
;:::;A
n g
GP
6= ;:
365Koehler & Hoffmann In a similar way, 024 h
can be extended to 024
H
. For each A
i
, the sets F
A
i
DA are determined
based on Equation {3}. The set F
fA 1
;:::;A
n g
DA
is simply the union over the individual sets:
F
fA
1
;:::;A
n
g
DA
:=
[
i
F
A
i DA
{4}
Then the fixpoint computation is entered with
O
003
:=O n fo 2 O j 9 i 2 f1; : : : ; ng : A i
2 del{o} _ F
fA
1
;:::;A
n g
DA
\ pre{o} 6= ;g {5}
The recomputation of O 003
in each iteration of the fixpoint algorithm from Figure 1 is done accordingly.
Apart from this, the algorithm remains unchanged.
Definition 20 {Ordering 024
H
} Let {O; I ; G } be a planning problem with fA
1 ; : : : ; A
n g 022
G and fB
1
; : : : ; B k
g 022 G . Let O
003
be the set of actions that is obtained by performing
the fixpoint computation shown in Figure 1, modifie d to hand le sets of
facts as defined in Equations {4} and {5}. The ordering fB
1 ; : : : ; B
k g 024
H
fA
1
; : : : ; A n
g holds if and only if 9 j 2 f1; : : : ; kg : :pA{B
j
; O
003
}
All given goal sets then undergo goal analysis, i.e., each pair of sets
is checked for an
ordering relation 024
E or 024
H
. Each derived relation defines an edge in a graph with the
subgoal sets as nodes. The transitive closure is determined as before, and
the degree of
each node is computed. If the graph contains no disconnected nodes, then
a total ordering
over subsets of goals results by ordering the nodes based on their degree.
This ordering defines the goal agenda. In the case of disconnected nodes,
we default to the heuristic of
addingthe corresponding goals to the last goal set in the agenda.
4.3 The Agenda-Driven Planning Algorithm Given a planning problem {O; I
; G }, let us assume that a goal agenda G 1
; G
2 ; : : : ; G
k with
k entries has been returned by the analysis. Each entry contains a subset
G
i
022 G . The
basic idea for the agenda-driven planning algorithm is now to first feed
the planner with
the original initial state I
1
:= I and the goals G
1
:=G
1
, then execute the solution plan P
in I , yieldingthe new initial state I
2
= Result{I
1 ; P }. Then, a new planning problem is
initializedas {O; I
2
; G 2
}. After solving this problem, we want the goals in G 2
to be true,
but we also want the goals in G
1
to remain true, so we set G
2 := G
1
[ G
2
. The continuous
merging of successive entries from the agenda yields a sequence of incrementally
growing
goal sets for the planner, namely
G
i
:=
i
[
j=1
G
j
Ina little more detail, the agenda-driven planning algorithm we implemented
for IPP works as follows. First, IPP is called on the problem {O; I ; G
1
} and returns the plan P
1
, which
achievesthe subgoal set G
1
. P
1
is a sequence of parallel sets of actions, which is returned by IPP similarly
to graphplan. Given this plan, the resulting state R{I ; P
1 } = I
2
is
366On Reasonable and Forced Goal Orderings
computed based on the operational semantics of the planning actions. 6
In the case of a set of STRIPS actions, one simply adds all ADD effects
to and deletes all DEL effects from astate description in order to obtain
the resulting state, following the Result function in
Definition 2. For STRIPS, the Result function coincides directly with the
R function. In the case of a set of parallel ADL actions, one needs to consider
all possible linearizations of the parallel action set and has to deal with
the conditional effects separately. For each
linearization, a different resulting state can be obtained, but each of
them will satisfy the
goals. To obtain the new initial state I
2
, one takes the intersection of the resulting states
for each possible linearization of the actions in a parallel set. This means
to compute n!
linearizations for a parallel action set of n actions in each time step.
Since n is usually
small {more than 5 or 6 ADL actions per time step are very rare}, the practical
costs for this computation are neglectible. This way, givena solution to
a subproblem{O; I
i
; G
i }, one calculates the new initial state I
i+1
and runs the planner on the subsequent planning problem {O; I
i+1
; G
i+1 } until
the planningproblem {O; I
k
; G
k
} is solved. The plan solving the original planning problem {O; I ; G }
is obtained by taking the
sequenceof subplans P
1
; P
2
; : : : ; P
k
. One could argue that planningfor increasing goal
sets can lead to highly non-optimal plans. But IPP still uses the \no-ops
first" strategy to
achieve goals, which was originally introduced in the graphplan system {Blum
& Furst,
1997}. Employing this strategy, the graphplan algorithm, in short, first
tries to achieve
goals by simply keeping them true, if possible. Since all goals G
1
; G 2
; : : : ; G i
are already
satisfied in the initial state I i+1
, starting from which the planner tries to achieve G i+1
, this
strategy ensures that these goals are only destroyed and re-established
if no solution can be found otherwise. The no-ops first strategy is merely
a graphplan feature, but any
reasonable planning strategy should preserve goals that are already true
in the initial state
whenever possible.
The soundness of the agenda-driven planning algorithm is obvious because
G
k
= G and
we have a sequence of sound subplans yielding a state transition from the
initial state I to
astate satisfying G .
The completeness of the approach is less obvious and holds only if the planner
cannot
make wrong decisions before finally reaching the goals. More precisely,
the approach is
complete on problems that do not contain dead locks as they were introduced
in Definition 14.
Theorem 8 Given a solvable planning problem {O; I ; G }, and a goal agenda
G
1
; G 2
; : : : G k
with G
i 022 G
i+1
and G
k
= G . Running any complete planner in the agenda-driven manner
describe d above wil l yield a solution if the problem is dead lock-fr e
e.
Proof: Let us assume the planner does not find a solution in step i of the
agenda-driven
algorithm, i.e., no solution is found for the subproblem {O; I
i
; G i
}. As the planner is assumed
to be complete on each subproblem, this implies unsolvability of {O; I
i ; G
i
}. If this problem
is not solvable, then neither is the problem {O; I
i
; G } solvable, since G i
022 G holds. Therefore, the goals cannot be reached from I
i
. Furthermore, I
i
is a reachable state|it was reached
by executing the partial solution plans P
1 ; : : : ; P
i0001
in the initial state. Consequently, I
i must be a deadlock state in the sense of Definition 14, which is a contradiction.6.
See {Koehler et al., 1997} for the exact definition of R, which we do not
want to repeat here.
367Koehler & Hoffmann This result states the feasibility of our approach:
As we have shown, most benchmark
problems that are currently investigated do contain inverse actions, are
therefore invertible
{Theorem 7}, and are with that also deadlock-free {Theorem 6}. Thus, with
Theorem 8, our approach preserves completeness in these domains.
However inthe general case, completenesscannot be guaranteed. The following
example
illustrates a situation where the assumption s
{A;:B}
6j= p {assuming that preconditions of achieving actions are not contained
in the state where A is reached, cf. the derivation of the
ordering 024 h
in Section 3} is wrong and yields a goal ordering under which no plan can
be
foundanymore although the problem is solvable.
Given the initial state fC; Dg and the goals fA; B g, the planner has the
following set
of ground STRIPS actions : op1: fC g 000! ADD fB g DEL fDg
op2: fDg 000! ADD fE g
op3: fE g 000! ADD fF g op4: fF g 000! ADD fAg
The analysis will return an ordering B 024 h
A because B is only added by op1, but its precondition C is not an effect
of any of the other actions. Thus it concludes that C is
not reachable from a state in which A holds. But in this example, C holds
in all reachable
states. The assumption s
{A;:B}
6j= C as made by the test pA{B ; O
003 } is wrong. Thus, B can be reached after A. On the other hand, A 024
r
B holds, we even have a forced ordering
A 024 f
B . But when testing for A 024
h
B , this ordering remains undetected, because our
method does not discover that the precondition F of op4 is not achievable
from the state
in which B holds: we obtain F B
DA
= fDg, which excludes op2 from O
003
, but op3 and op4
remain in the set of usable actions. Thus, op4 is considered a legal achiever
of A, and op3 isconsidered a legal achiever for its precondition F . We could
only detect the right ordering
if we regressed over the action chain op4, op3, op2 and found out that,
with D being in
the F set of B , all these actions must be excluded from O
003 .
Consequently, the goal agenda fB g; fAg is fed into the planner, which solves
the first
subproblem using op1, but then fails in achieving A from the state fB ;
C g since there is no inverse action to op1 and D cannot be re-established
in any other way.
5. Empirical Results
We implemented both methods to approximate 024
r as a so-called Goal Agenda Manager {GAM} for the IPP planningsystem {Koehler
et al., 1997}. GAM is activated after the
set of ground actions has been determined and either uses 024 e
or 024
h to approximate the
reasonable goal ordering. Then it calls the IPP planningalgorithm on each
entry from the
goal agenda and outputs the solution plan as the concatenation of the solution
plans that have been found for each entry in the agenda.
77. The source code of GAM, which is based on IPP 3.3, and the collection
of domains from which
we draw the subsequent examples can be downloaded from http://www.informatik.uni-freiburg.de/~
koehler/ipp/gam.html. All experiments have been performed on a SPARC 1/170.
368On Reasonable and Forced Goal Orderings
The empirical evaluation that we performed uses the IPP domain collection,
which con- tains 48 domains with more than 500 planningproblems. Out of these
domains, we were able to derive goal ordering information in 10 domains.
These domains indeed pose con- straints on the ordering in which a planner
has to a achieve a set of goals. In all other
domains, where no goal orderings could be derived, we found that either
only a single goal
has to be achieved, for example in the manhattan, movie, molgen, and montlake
domains
or the goals can be achieved in any order, as for example in the logistics,
gripper, and ferry
domains. We found no benchmark domain, in which a natural goal ordering
existed, but
our method failed to detect it. As a matter of fact, looking at a goal ordering
that seems to be natural, one usuallyfinds that the ordering is reasonable
in the sense of Definition 8, see
for example the blocks world, woo dshop, and tyreworld domains. Our method
finds almost
all of the reasonable orderings, which indicates that both approximation
techniques 024
e
and
024
h
are appropriate for detecting ordering information.
In the following, we will first compare the 024
e and 024
h
techniques in terms of runtime
and number of goal agenda entries generated. Then we take a closer look
at the agendas
thatare generated in selected domains and investigate how they influence
the performance
of the IPP planning system. The exact definitionof all domains can be downloaded
from
the IPP webpage, we just give the name of the domain and the name of the
particular
planning problem as well as the number of {ground} actions a domain contains,
because this parameter nicely characterizes the size of a domain and with
that usually the difficulty
to handle it.
In all examples, the times shown to compute the goal agenda contain the
effort to
parse and instantiate the operators, i.e., to compute the set of actions.
Times for parsing
and instantiation are not listed explicitly, because they are, on the test
examples used here,
usually very close to zero and do not influence the performance of the planner
in a significant way.
5.1 Comparison of 024
h
and 024
e
We begin our comparison with a summary of results that we obtained in different
represen-
tational variants of the blocks world. The bwlargea to bwlarged examples
originate from
the SATPLAN test suite {Kautz & Selman, 1996} to which we added the larger
examples
bwlargee to bwlargeg. The par cplan example comes from {El-Kholy & Richards,
1996}
and uses multiple grippers and limited space on the table. The stackn examples
use the
graphplan blocks world representation and simply require to stack n blocks
on each other,
which are all on the table in the initialstate.
The two methods return exactly the same ordering relations across all blocks
world
problems. But as Figure 5 confirms, the computation of 024
e
based on planning graphs is
much more time-consuming. It hits the computational border when a domain
contains more
than 10000 actions. The computation of 024
h is much faster and also scales to larger action
sets.
369Koehler & Hoffmannproblem#actions#agenda entriesCPU{024
e }CPU{024
h
}bwlargea16210.690.07bwlargeb24251.450.11bwlargec45074.850.22bwlarged7221114.180.35bwlargee7221112.950.35bwlargef1250644.930.58bwlargeg1800997.110.88parcplan1960425.841.47stack20800196.910.36stack40320039160.001.74stack60720059840.424.85stack801280079-11.38Figure
5: Comparison of 024 e
and 024
h
on blocks world problems. #actions shows the number of actions in the set
O, from which the planner tries to construct a plan. #agenda
entries says how many goal subsets have been detected and ordered by GAM.
Column 4 and 5 display the CPU time that is required by both methods to
computethe agenda when provided with the set O. A dash will always mean that
IPP ran out of memory on a 1 Gbyte machine.
Figure 6 and Figure 7 show the results for the other domains, in which our
method
is able to detect reasonable orderings. Figure 6 lists the domains, in which
both methods
return the same goal agendas. The tyreworld, hanoi, andfridgeworld domainsoriginate
from
UCPOP {Penberthy & Weld, 1992}, whilethe link-rep e at domain can be found
in {Veloso & Blythe, 1994}. The performance results coincide with those shown
in Figure 5. Figure 7
shows the same picture in terms of runtime performance, but in these domains
different
agendas are returned by 024
e
and 024 h
.
The woo dshop andscheduling domains contain actions with conditional effects,
while
the other domains only use STRIPS operators. The computation of 024 e
fails to derive goal orderings for all scheduling world problems {of which
we only display the largest problem
sched6} and for the woo d1 problem. The explanation for this behavior can
be found in the
different treatment of conditional effects by both methods. IPP does only
find a very limited form of mutex relations between conditional effects when
building the planning graph. A
goal, which is achieved with a conditional effect, will not very often be
exclusive to a large
number of other facts in the graph. Thus, the F sets are very small or sometimes
even empty
and consequently, only very few actions can be excluded when performing
the reachability analysis and thus, reasonable orderings may remain undetected.
Direct analysis investigates
the conditional effects in more detail and is therefore able to derive much
larger F sets.
The behavior of the 024
h
method in the STRIPS domains bul ldozer, glassworld, and
shopping world iscaused by the same phenomenon. In these domains, one can
derive much
largerF sets using planning graphs and in turn these sets exclude more actions.
Since direct
analysis finds smaller or empty F sets, it also finds less 024
h
relations. The woo dshop domain
370On Reasonable and Forced Goal Orderingsdomainproblem#actions#agenda entriesCPU{024
e }CPU{024
h
}tyreworldfixit12660.050.01fixit25960.200.03fixit310860.450.06fixit417360.840.10fixit525461.560.15fixit10899616.290.64hanoihanoi34830.050.02hanoi49040.100.04hanoi515050.190.08hanoi623160.350.12hanoi733670.630.19fridgeworldfridge77920.770.55link-repeatlink103120.190.01link303120.210.01Figure
6: Comparison of 024
e
and 024
h
on those benchmark domains, in which they return
identical agendas.domainproblem#actions#agenda entriesCPU{024
e }CPU{024 h
}bulldozerbull612/10.090.03glassworldglass1262/10.020.01glass21142/10.190.09glass31222/10.220.09shoppingworldshop812/10.070.02schedulingsched61041/401.00.12woodshopwood1151/30.030.01wood2156/50.030.01wood3436/50.140.06Figure
7: Domains in which 024
e and 024
h
return different goal agendas, which we give in the
form n
1 =n
2
. The number before the slash says how many entries are contained
in the agenda computed by 024
e
, the number following the slash says how many
entries are contained in the agenda computed by 024
h
. #agenda entries=1 means that the agenda contains only a single entry,
namely the original goal set, and no ordering was derived.
shows that the results can differ within the same domain, but depending
on the specific planning problem. The problem woo d2 varies from the problem
woo d1 in the sense that one goal is slightly different|an ob ject needs
to be put into a different shape|and that two
more goals are present. While there are no goal orderings derived between
pairs of the old 371Koehler & Hoffmann goalsfrom woo d1, lots of 024
e
relations are derived between mixed pairs of old and new goals
in woo d2, yielding a detailed goal agenda. The problem woo d3 contains
additional ob jects
and many more goals, which can also be successfully ordered.
In the subsequent experiments, we decided to solely use the heuristic ordering
024 h
because
the computation of 024
h
is less costly than the computation of 024
e
in all cases, yielding
comparable agendas in most cases. In the three domains that we investigate
more closely,
namely the blocks world, tyreworld andhanoi domains, the agendas derived
by both methods
are, in fact, exactly the same. 5.2 Influence of Goal Orderings on the Performance
of IPP and Interaction with RIFO
In this section, we analyze the influence of the goal agenda on the performance
of IPP
and combine it with another domain analysis method, called RIFO {Nebel,
Dimopoulos, &
Koehler, 1997}. RIFO is a family of heuristics that enables IPP to exclude
irrelevant actions and initial facts from a planningproblem. It can be very
effectively combined with GAM,
because if IPP plans for only a subset of goals from the original goal set,
it is very likely thatalso only a subset of the relevant actions is needed
to find a plan. More precisely, we
obtain one subproblemfor each entry in the agenda, and, for each such subproblem,
we use RIFO for preprocessing before planning with IPP. In this configuration,
GAM reduces
the search space for IPP by decreasing the number of subgoals the planner
has to achieve
at each moment, while RIFO reduces the search space dramatically by selecting
only those actions that are relevant for this goal subset.
5.2.1 The Blocks World Figure 8 illustratesthe par cplan problem{El-Kholy
& Richards, 1996} indetail. Seven
robot arms can be used to order 10 blocks into 3 stacks on 5 possible positions
on the table.Jana Koehler232123231131244221451321114131211224232221332311Figure
8: The par cplan problemwith limited space on the table, seven robot arms,
and
several stacks.
The goal agenda derived by IPP orders the blocks into horizontal layers:
1: on-table{21, t2} ^ on-table{11, t1}
2: on-table{31, t3} ^ on{22, 21} ^ on{12, 11}
3: on{32, 31} ^ on{13, 12} ^ on{23, 22} 4: on{14, 13} ^ on{24, 23} 372On
Reasonable and Forced Goal Orderings
The optimal plan of 20 actions solving the problem is found by IPP using
GAM in 14 s,
where it spends one second on computing the goal agenda, almost 13 seconds
to build the
planning graphs, but only 0.01 second to search for a plan. Only 70 actions
have to be tried to find the solution. Without the goal analysis, IPP needs
approx. 47 s and searches 52893
actions in more than 26 seconds.
RIFO {Nebel et al., 1997} failsin detecting a subset of relevant actions
when the original
goalset has to be considered, but it succeeds in selecting relevant actions
for the subproblems
stated in the agenda. It reduces runtime down to less than 8 s with 1 s
again spent on the
goal agenda, almost6 s spent on the removal of irrelevant actions and initial
facts, less
than 1 s spent on building the planning graphs. As previously, almost no
time is spent on planning.
Figure 9 shows IPP on the SATPLAN blocks world examples from {Kautz & Selman,
1996}, the bwlarge.e example taken from {Dimopoulos, Nebel, & Koehler, 1997},
and two very large examples bwlarge.f {containing 25 blocks and requiring
to build 6 stacks in the goal state} and bwlarge.g with 30 blocks/8 stacks.SATPLAN#
actionsplan lengthIPP+G+G+R+G+R+Lbwlarge.a16212 {12}0.700.740.580.34bwlarge.b24222
{18}26.710.860.550.52bwlarge.c45048-7.342.422.58bwlarge.d72254-11.623.743.81bwlarge.e72252-11.143.993.97bwlarge.f125090---16.01bwlarge.g180084--117.5628.71Figure
9: Performance on the extended SATPLAN blocks world test suite. The second
column shows the number of ground actions in this domain, the third column
shows the plan length, i.e., the number of actions contained in the plan,
generated by GAM and in parentheses the plan length generated by IPP without
GAM given that IPP without GAM is able to solve the corresponding problem.
+G means that IPP is using GAM, +G+R means IPP uses GAM and RIFO, +G+R+L
means that subgoals from the same set in the agenda are arbitrarily linearized.
All runtimes cover the whole planning process starting with parsing the
operator
and domain file, performing the GAM and RIFO analysis {if active}, andthen
searching the graph until a plan is found.
IPP 3.3 without GAM can only solve the bwlarge.a and bwlarge.b problems.
Using a
goal agenda, some plans become slightly longer, but performance is increasing
dramatically. Plan length is growing because blocks are accidentally put
in positions where they cut off
goals that are still ahead in the agenda and thus, additional actions need
to be added to
the plan to remove these blocks from wrong positions. A further speed-up
is possible when
RIFO is additionally used, because it reduces the size of planning graphs
dramatically. Finally , goals that belong to the same subset in the agenda
can be linearized based on the
373Koehler & Hoffmann heuristic assumption that if the analysis found no
reasonable goal orderings, then the goals are achievable in any order. With
this option, the problems are solved almost instantly.
The reader may wonder at this point why we use linearization of agenda entries
only
as an extra option and do not investigate it further. There are two reasons
for that. First,
linearization does have negative side effects in most domains that we investigated.
For
example, it yields much longer plans in the logistics domain and all its
variants. When
linearizing the single entry that the agenda for a logistics problem contains,
all packages get
transported to their goal position one by one. Of course, thistakes much
more planning steps than simultaneously transporting packages withcoinciding
destinations. Secondly, the effects of linearization are somewhat unpredictible,
even in domains where it usually tends to yield good results. This is because
GAM does not recognise al l inter-
actions between goals. Consider a blocks world problem with four blocks
A, B , C and D. SayB is positioned on C initially, the other blocks being
each on the table, and the goal is to have on{A; B } and on{C; D}. The agenda
for this problem will comprise a single entry
containing both goals. In fact, there is no reasonable goal ordering here.
Nevertheless,
stacking A onto B immedeatly is a bad idea, as the planner needs to move
C to achieve
on{C; D}. Being not aware of this, GAM might linearize the single agenda
entry to have
on{A; B } up front, which makes the problem harder than it actually is.
Thus, the runtime
advantages that linearization sometimes yields on the blocks world can be
more or less seen
as cases of \good luck". Figure 10 shows IPP on the stackn problems. IPP
without any domain analysis can handle up to 12 blocks in less than 5 minutes,
but for 13 blocks more than 15 minutes are
needed. Using GAM, 40 blocks can be stacked in less than 5 minutes. Using
GAM and
RIFO, the 5 minutes limit is extended to 80 blocks, whilestack100 is solved
in 11.5 min where 11.3 min are spent for both analysis methods and only 0.2
min are needed for building
the planninggraphs and extracting a plan.
Jana Koehler708010203040506090100blocksin stime IPPIPP+G600450300150IPP+G+RFigure
10: IPP 3.3 on a simple, but huge stacking problem. Figure 11 shows the sharing
of the overall problem-solving time between GAM, RIFO and the IPP search
algorithm on blocks world problems. Similar results are obtained in the
tyreworld. GAM takes between 3 and 16 045, RIFO takes between 75 and 96
045, and the
search effort is reduced down to approx. 1 045. The overall problem solving
time is clearly
determined by RIFO, while the search effort becomes a marginal factor in
the determination of performance. This indicates that a further speed-up
is possible when improving the
374On Reasonable and Forced Goal Orderings
performance of GAM and RIFO. It also indicates that even the hardest planning
problems
can become easy if they are structured and decomposed in the right way.problem#
actionsGAMRIFOsearch algorithmstack208000.31 = 16 0451.44 = 75 0450.13
= 7 045stack4032001.57 = 7 04518.77 = 90 0450.51 = 2 045stack6072004.40
= 4 04593.10 = 94 0451.15 = 1 045stack80128009.60 = 3 045283.60 = 96
0452.33 = 1 045parcplan19600.86 = 12 0455.52 = 76 0450.83 = 11 045Figure
11: Distribution of problem-solving time on blocks world examples between
GAM,
RIFO, and the search algorithm, which comprises the time to buildand search
the planning graph. The remaining fraction of total problem-solving time,
which
is not shown in the table, is spent on parsing and instantiating the operators.
5.2.2 The Tyreworld Thetyreworld problem, originally formulated by Stuart
Russell, asks a planner to find out
how to replace a flat tire. It is easily solved by IPP within a few milliseconds.
The problem becomes much harder if the number of flat tires is increasing,
cf. Figure 12.Tires# actionsIPP+G+R+G+R+LSearch Space1260.10 {12/19}0.15
{14/19}0.16 {17/19}1298/8825917.47 {18/30}0.41 {24/32}0.32 {30/34}1290182/2103108-2.87
{32/44}0.63 {41/46}-/3664173--1.12 {52/60}-/5655254--1.93 {63/73}-/8076353--3.42
{73/85}-/10927464--4.81 {84/98}-/14208593--8.07 {95/121}-/17919738--11.27
{106/124}-/ 220510899--16.89 {118/136}-/2662Figure 12: IPP in the Tyreworld.
The numbers in parentheses show the time steps, followed
by the number of actions in the generated plan. The last column compares
the
search spaces. The number before the slash shows the \number of actions
tried" parameter for the plain IPP planning algorithm, whilethe number following
the slash shows the \number of actions tried" for IPP using GAM, RIFO, and
the linearization of entries in the agenda. A dash means that the \number
of actions tried" is unknown because IPP failed in solving the corresponding
planning problem.
375Koehler & Hoffmann IPP is only able to solve the problem for 1 and 2
tires. Using GAM and RIFO, 3
tires can be handled. Solution length under GAM is slightly increasing,
which is caused
by superfluous jack-up and jack-down actions. In short, this is explained
as follows. Each
wheel needs to be mounted on its hub, which is expressed by an on{?r, ?h}
goal. To mount
a wheel, its hub must be jacked up. After mounting, the nuts are done up.
Then, the hub needs to be jacked down again, in order to tighten the nuts
achieving a tight{?n, ?h} goal.
Now, GAM puts all of the on goals into one entry preceeding the tight goals.
Thus, solving the entry containing the on goals, each hub is jacked up, the
wheel is put on, and the hub
is immediatly jacked down again in order to replace the next wheel. Afterwards,
solving
thetight goals, each hub must be jacked up|anddown|one more time for doing
up the nuts. Solving the problem in this manner, the planner inserts one
superfluous jack-up, and
one superfluousjack-down action for each wheel. More precisely, superfluous
actions are
inserted for all but one wheel, namely the wheel that is last mounted when
solving the on
goals. After mounting this wheel, all on goals are achieved, andthe planner
proceeds to
the next agenda entry with this wheel still being jacked up. Then, trying
to achieve the
tight goals, IPP recognizes that the shortest plan {in terms of the number
of parallel steps}
results when the nuts are first done up on the hub that is already jacked
up. Thus, this hub
is only jacked up one time, achieving the corresponding on goal, and jacked
down again one
time, before achieving its tight goal.
In the case of 3 tires, the following goal subsets are identified and ordered:
1: inflated{r3}, inflated{r2}, inflated{r1}
2: on{r3, hub3}, on{r1, hub1}, on{r2, hub2}
3: tight{n2, hub2}, tight{n3, hub3}, tight{n1, hub1}
4: in{w3, boot}, in{pump, boot}, in{w1, boot}, in{w2, boot}
5: in{jack, boot}
6: in{wrench, boot}
7: closed{boot} The hardest subproblem in the agenda is to achieve the on{r
i
; hub
i } goals in entry 2,
i.e., to mount inflated spare wheels on the various hubs. Trying to generate
a maximum par-
allelized plan is impossible for IPP for more than 3 tires. But since the
goals are completely independent of each other, any linearization of them
will perfectly work. The resulting
plans become slightly longer due to the way that the tight goals are achieved
when using
the -L option. We noticed earlier that for one wheel {the one that is last
mounted when solving the on goals} no superfluous jack-up and jack-down actions
need to be inserted into
the plan. Linearizing the agenda entries, superfluous jack-up and jack-down
actions must mostlikely be inserted for al l wheels, yielding plans that
are two steps longer. The reason
for that is that any tight goal might be the first in the linearization.
Most likely, this is
not the tight goal corresponding to the hub that is still jacked up, so
the planner needs to insert one superfluous jack-down action here. Later,
it must jack up this hub again, yielding
another superfluous action. Using +G+R+L in the case of 10 tires, only 2662
actions need
to be tried until a plan of 136 actions is found, which takes 0.08 s. GAM
requires 0.55 s, RIFO requires 14.42 s, 1.74 s are consumed to generate the
planning graphs, and 0.08 s are spent to compute the initial states for all
subproblems. The remaining 0.02 s are consumed for parsing and instantiating.
376On Reasonable and Forced Goal Orderings
5.2.3 The Tower of Hanoi
A surprisingresult is obtained in the tower of hanoi domain. In this domain,
a stack of discs
has to be moved from one peg to a third peg with an auxiliary second peg
between them,
but never a larger disc can be put onto a smaller disc. In the case of three
discs d1, d2, d3
of increasing size, the goals are stated as on{d3,pe g3}, on{d2,d3}, on{d1,d2}.
GAM returns
the following agenda, which correctly reflects the ordering that the largest
disc needs to be
put in its goal position first.
1: on{d3,peg3} 2: on{d2,d3}
3: on{d1,d2} The goal agenda leads to a partition into subproblems that
corresponds to the recursive formulation of the problem solving algorithm,
i.e., to solve the problem for n discs, the
planner first has to solve the problem for n 000 1 discs, etc. For the
first entry, a plan of 4
actions{time steps 0 to 3 below} is generated, which achieves the goal on{d3,pe
g3}.
8 Then
a plan of 2 actions {time steps 4 and 5} achieves the goals on{d3,pe g3}
andon{d2,d3} with
on{d3,pe g3} holding already in the initial state. Finally, a one-step plan
{time step 6} is generated that moves the third disc with the other two discs
being already in the goal
position.
time step 0: move{d1,d2,peg3}
time step 1: move{d2,d3,peg2}
time step 2: move{d1,peg3,d2}
time step 3: move{d3,peg1,peg3}
time step 4: move{d1,d2,peg1}
time step 5: move{d2,peg2,d3}
time step 6: move{d1,peg1,d2}
Surprisingly, IPP is not able to benefit from this information, but runtime
of IPP using GAM is exploding dramatically for increasing numbers of discs,
see Figure 13.discs#actionsIPPIPP +GUCPOPUCPOP on subproblems2210.020.020.12
{27}0.06 {17} + 0.02 {6}3480.080.078.00 {2291}0.18 {48} + 0.06 {13} + 0.01
{6}4900.330.25--51501.573.10--62319.7188.45--733669.442339.94--Figure 13:
Runtimes of IPP with and without the goal agenda on hanoi problems com-
pared to UCPOP without agenda and UCPOP on the agenda subproblemsusing
ZLIFO and the ibf control strategy.8. A move action takes as first argument
the disc to be moved, as second the disc from which it is moved,
and as third argument the disc or peg to which it is moved. 377Koehler &
Hoffmann We are not able to provide an explanation for this phenomenon, but
the divisioninto
subproblemscauses a much larger search space for the planner although the
same solution plans result. RIFO cannot improve on the situation because
it selects all actions as relevant.
The tower of hanoi domain is the only one we found where IPP's performance
is deteri-
orated by GAM. We do currently not see a way of how one can tell in advance
whether IPP
will gain an advantage from using GAM or not. The overhead caused by the
goal analysis itself is very small, but an \inadequate" split of the goals
into subgoal sets can lead to more
search, see also Section 6. However in this case, the phenomenon seems to
be specific to IPP. We simulated the
information that is provided by GAM in UCPOP and obtained a quite different
picture.
The fifth column in Figure 13 shows the runtime of UCPOP using ZLIFO {Pollack,
Joslin,
& Paolucci, 1997} and the ibf control strategy with the number of explored
partial plans in parentheses. UCPOP can only solve the problem for 2 and
3 discs. In the last column
of the figure, we show the runtime and number of explored partial plans,
which result
when UCPOP is run on the subproblems that result from the agenda. These
are exactly
the same subproblems which IPP has to solve, butthe performance of UCPOP
improves significantly. Instead of taking 8 s and exploring 2291 partial
plans, UCPOP only takes
0.18+0.06+0.01=0.25 s and explores only 48+13+6=67 plans. Unfortunately,
any problems
or subproblems with more than 3 discs remain beyond the performance of UCPOP.
The
performance improvement is independent of the search strategies used by
UCPOP. For
example, if ibf control is used without ZLIFO, the number of explored partial
plans is reduced from 78606 down to 2209 inthe case of the problem with 3
discs. Runtime improves
from 65 seconds down to 2 seconds. Similarly, when using bf control without
ZLIFO the
number of explored partial plans reduces from 1554 down to 873.
Knoblock {1994} also reports an improvement in performance for the Prodigy
planner
{Fink& Veloso, 1994} when it is using the abstraction hierarchy generated
for this domain
by the alpine module, which provides in essence the same information as
the goal agenda. 9
6. Summary and Comparison to Related Work
Many related approaches have been developed to provide a planner with the
ability to
decompose a planning problem by giving it any kindof goal ordering information.
Subse-
quently, we discuss the most important of them and review our own work in
the light of
theseapproaches. Our method introduces a preprocessing approach, whichderives
a total ordering for
subsets of goals by performing a static, heuristic analysis of the planning
problem at hand.
The approach works for domains described with STRIPS or ADL operators and
is based
on polynomial-time algorithms. The purpose of this method is to provide
a planner with
search control, i.e., we opt at deriving a goal achievement order and then
successively call
the planner on the totally ordered subsets of goals.
The method preserves the soundness of the planning system, but the completeness
only in the case that the planning domain does not contain deadlocks. We
argue that9. However, to find that goal ordering information, alpine requires
to represent the tower of hanoi domain
involving several operators, cf. {Knoblock, 1991}.
378On Reasonable and Forced Goal Orderings
benchmark domains quite often possess this property, whichis also supported
by other authors {Williams & Nayak, 1997}. The computation of 024
h and 024
e
requires only polynomial time, but both methods are incomplete in the sense
that they will not detect all reasonable goal orderings in the general case.
The complexity of deciding on the existence of forc e d and re asonable goal
orderings
has been proven to be PSPACE-hard in Section 2 and therefore, trading completeness
for
efficiency seems to be an acceptable solution. Our complexity results relate
to those found
by Bylander {1992} who proves the PSPACE-completeness of serial decomposability
{Korf,
1987}. Given a set of subgoals, serial dec omp osability meansthat previously
satisfied sub-
goals do not need to be violated later in the solution path, i.e., once
a subgoal has been achieved, it remains valid until the goal is reached.
The purpose of our method is to derive constraints that make those orderings
explicit under which no serial decomposability of a set
of goals can be found, i.e., we consider the complementary problem, which
is also reflected
in our complexity proofs.
In many cases, we found that the goal agenda manager can significantly improve
the
performance of the IPP planning system, butwe found at least one domain,
namely the
tower of hanoi, where a dramatic decrease inperformance can be observed
although IPP
still generates the optimal plan when processing the ordered goals from
the agenda. So
far, the complexity results of Bªackstrª om and Jonsson {1995} predictedthat
planningwith
abstraction hierarchies can be exponentially less efficient, but because
exponentially longer
plans can be generated. The idea to analyze the effects and preconditions
of operators and to derive ordering
constraints based on the interaction of operators can also be found in a
variety of approaches.
While we analyze harmful interactions of operators in our method by studying
the delete effects, the approaches described in {Dawsson & Siklossy, 1977;
Korf, 1985; Knoblock,
1994} concentrate on the positive interactions between operators. The successful
matching of effects to preconditions forms the basis to learn macro-op er
ators, see {Dawsson & Siklossy,
1977; Korf, 1985}.
The alpine system {Knoblock, 1994} learns abstraction hierarchies for the
Prodigy
planner {Fink & Veloso, 1994}. The approach is based on an ordering of the
preconditions
and the effects of each operator, i.e.,all effects of an operator must be
in the same abstraction
hierarchy and its preconditions must be placed at the same or a lower level
than its effects. Thisintroduces an ordering between the possiblesubgoals
in a domain, which is orthogonal
to the ordering we compute: In alpine, a subgoal A is ordered before a subgoal
B if
A enables B , i.e., A must be possibly achieved first in order to achieve
B . Our method orders A before B if A cannot be achieved without nec essarily
destroying B . The result of
alpine and GAM are a set of binary constraints. In the case of alpine, the
constraints are computed between all atoms in a domain, whileGAM restricts
the analysis to the goals only. Both approaches represent the binary constraints
in a graph structure. alpine
merges atomic goals together if they belong to a strongly connected component
in the graph.
GAM merges sets of goals together if they have identical degree. Then they
both compute
a topological sorting of the sets that is consistent with the constraints.
The resulting goal
orderings can be quite similar as the examples by Knoblock {1994} demonstrate,
but GAM approximates reasonable goal orderings in domains where alpine fails
in finding abstraction
hierarchies. Two further examples {Knoblock, 1991} are the tower of hanoi
domainusing
379Koehler & Hoffmann only one move operator and the blocks world. In both
domains, alpine cannot detect
the orderings because it investigates the operator schemata, not the set
of ground actions,
and therefore cannot distinguishthe orderings between different instantiations
of the same literal. Although alpine could be modified to handle ground actions,
this will significantly
increase the amount of computation it requires. GAM on the other hand, handles
large sets
of ground actions in an efficient way, in particular if dire ct analysis
is used. 10
An analysis, which is quite similar to alpine, but which is performed in
the framework
ofHTN planning, is described by Tsuneto et al. {1998}. The approach analyzes
the external conditions of methods, which cannot be achieved when decomposing
the method further.
This means, such conditions have to be established by the decomposition
of those methods,
which pre c e de the method using this external condition. Two strategies
to determine the
decomposition order of methods are defined and empirically compared. Here
lies the main difference to the other approaches described so far: Instead
of trying to automatically construct the decomposition orderings, they are
predefined and fixed for all domains and problems.
Harmful interactions among operators are studied by Smith and Peot {1993}
and Etzioni
{1993}. A thre at of an operator o to a precondition p occurs if there is
an instantiation of
o such that its effects are inconsistent with p {Smith & Peot, 1993}. The
knowledge about
threats is used to control a plan-space planner. In contrast to a state-space
plannersuch as
IPP, computing an explicit ordering of goals does not prevent the presence
of threats in a
partial plan because the order in which the goals are processed does not
determine the order
in which actions occur in the plan. The notion of forced and reasonable
goal orderings is
not comparable to that of a threat because a threat stillhas the potential
of being resolved by adding binding or ordering constraints to the plans.
In contrast to this, a forced or
reasonable goal ordering persists under all bindings and enforces a specific
ordering of the
subgoals.
Given a planning problem, sta tic {Etzioni, 1993} computes a backchaining
tree from the
goals in the form of an AND/OR graph, which it subsequently analyzes for
the occurrence of goal interactions that will nec essarily occur. This analysis
is much more complicated than ours, because sta tic has to deal with uninstantiated
operators and axioms, which
describe properties of legal states. The result of the analysis are goal
ordering rules, which
ordergoals if certain conditions are satisfied in a state. This is the main
difference to GAM, which generates explicitgoal orderings independently of
a specific state. It does not need to
extract conditionsthat a specific state has to satisfy because it considers
the generic state
s {A;:B}
in the analysis, which represents all states satisfying A, but not B . As
GAM, sta tic
is incomplete in the sense that it cannot detect all existing goal interactions.
The problem for GAM is that deciding reasonable orderings is PSPACE-hard,
as we have proven in this
paper. The problem for sta tic is that it has to compute the nec essary
effects of an operator in a given state. As Etzioni {1993} conjectures and
Nebel and Bªackstrª om {1994} prove, this10. Abstraction hierarchies are
more general than the goal orderings we compute. They cannot only serve
for the purpose of providing a planner with goal ordering information, but
also allow to generate plans at different levels of refinement, see also
{Bacchus & Yang, 1994}. Two other approaches generating abstraction hierarchies
based on numerical criticality values can be found in {Sacerdoti, 1974; Bundy,
Giunchiglia, Sebastiani, & Walsh, 1996}.
380On Reasonable and Forced Goal Orderings
problem is computationally intractable and therefore, any polynomial-timeanalysis
method
must be incomplete. Last, but not least there have been quite a number of
approaches in the late Eighties,
which focused directly on subgoal orderings. These fall into two categories:
The approaches
described in {Drummond & Currie, 1989; Hertzberg & Horz, 1989} focus on
the detection of
conflicts caused by goal interdependencies to guide a partial-order planner
during search. We
do not investigate these approaches in more detail here because they do
not extract explicit
goal orderings as a preprocess to planningas we do. The works described
in {Irani & Cheng,
1987; Cheng & Irani, 1989; Joslin & Roach, 1990} implementpreprocessing
approaches, which perform a structural analysis of the planning task to determine
an appropriate goal ordering before planning starts. Irani and Cheng {1987}
compute a relation 036 between
pairs of goals, which|roughly speaking|orders a goal A after a goal B if
B mustbe
achieved before A can be achieved. Their formalism is rather complicated
and the theoretical
properties of the relation are not investigated. In {Cheng & Irani, 1989},
theapproach is extended such that sets of goals can be ordered with respect
to each other. The exact
properties of the formalism remain unclear. In {Joslin & Roach, 1990}, a
graph-theoretical
approach is described that generates a graph with all atoms from a given
domain description
as nodes and draws an arc between a node A and a node B if an operator exists
that takes
A as precondition and has B asan effect. When assuming that all operators
have inverse
counterparts, identifying connected components in the graph is proposed
as a means to
order goals. The approach is unlikely to scale to the size of problem spaces
today's planners
consider and it is also completely outdated in terms of terminology. Finally
, one can wonder how the reasonable and forced goal orderings relate to others
defined in the literature. There is only one attempt of which we know where
an ordering
relation is explicitly defined and its properties are studied, see {Hªullem
et al., 1999}. In
this paper, the notion of nec essary goal orderings is introduced, which
must be true in
all minimal solution plans {Kambhampati, 1995}. 11
The approach extends operator graphs
{Smith & Peot, 1993} and orders a goal based on three criteria called goal
subsumption, goal clobb ering, and pre c ondition violation. Goal subsumption
A < B holds if every solution plan
achieving a goal B in a state s also achieves a goal A in a state s
0
preceding s, and no plan
achieving one of the goals in G n fAg deletes A. Goal clobbering holds if
any solution plan
for A deletes B and thus, A< B . Precondition violation holds if any solution
for B results
in a deadlock from which A cannot be reached anymore, i.e., again A < B
. A composite
criterion is defined that tests all three criteria simultaneously.
12
A goal A is nec essarily or der e d before B if it satisfies the composite
criterion. We remark that pre c ondition violation seems to be equivalent
to the forc e d orderingswe introduced, while goal clobbering appears to
be similar to our re asonable orderings. It is not
possible for us to verify this conjecture as the authors of {Hªullem et
al., 1999} do not give exact formal definitions. We have nothing similar
to goal subsumption and we argue that
this criterion will be rarely satisfied in natural problems: if a goal A
is achieved by every11. A plan is minimal if it contains no subplan that
is also a solution plan. We remark that minimality does not mean that only
shortest plans having the least number of actions are considered. In fact,
minimal
plans can be highly non-optimal as long as no action is truly superfluous.
12. Here, the authors are not very precise about what they mean with this.
We argue that this means that two goals are ordered if they satisfy at least
one of the criteria.
381Koehler & Hoffmann solution for a goal B anyway, then the goal A can
be removed from the goal set without
changing the planning task.
The authors report that they are able to detect nec essary orderings in
the artificial
domains D
i
S
i , cf. {Barrett & Weld, 1994}, butfail in typical benchmark domains such
as
the blocks world or the tyreworld. The reason for this seems to be that
their operator graphs do not represent all possible instantiations of operator
schemes. As the authors claim, this
makes operator graph analysis very efficient. However, the heuristic ordering
024
h
that we introduced in this paper also takes almost no computation time,
and succeeds in finding
thegoal orderings in these domains. 7. Outlook
Three promising avenues for future research are the following:
First, one can imagine that goal ordering information is also used during
the search process, i.e., by not only ordering the original goal set, but
also other goals that emerge
during search. The ma jor challenge seems to balance the effort on computing
the goal
ordering information with the savings that can result for the search process.
One can
easily imagine that ordering all goal sets that are ever generated can become
a quite costly
investment without yielding a ma jor benefit for the planner.
Secondly, the refinement of the goal agenda with additional subgoals is
another inter-
esting future line of work. A first investigation using so-called intermediate
goals {these are facts that the planner must make true before it can achieve
an original goal} has been
explored inside GAM and the results are reported in {Koehler & Hoffmann,
1998}. Earlier
work addressing the task of learning intermediate goals can be found in
{Ruby & Kibler, 1989}, butthis problem has not been in the focus of AI planning
research since then. A third line of work addresses the interaction of GAM
with a forward-searching plan- ning system. We have seen that GAM preserves
the correctness of a planner, and that
it preserves the completeness at least on deadlock-free planningdomains.
We have also
seen, however, thatsolution plans using GAM can get longer, i.e., GAM does
not pre-
serve the optimality of a planner. Recently, planning systems that do not
deliver plans of guaranteed optimality have demonstrated an impressive performance
in terms of runtime and plan length, e.g., HSP, which is first mentioned
in {Bonet, Loerincs, & Geffner, 1997},
GRT {Refanidis & Vlahavas, 1999}, andin particular ff {Hoffmann, 2000}.
These systems are heuristic-search planners searching forward in the state
space with non-admissible, but informative heuristics.
The ff planning system developed by one of the authors has been awarded
\Group A Distinguished Performance Planning System" and has also won the
Schindler Award for the best performing planning system in the Miconic 10
Elevator domain {ADL track} at
theAIPS 2000 planning competition. The integration of goal agenda techniques
into the
planner is one of the factors that enabled the excellent behavior of ff
in the competition:
they were crucial for scaling to blocks world problems of 50 blocks, helped
by about a factor 2
on schedule and Miconic 10, and never slowed down the algorithm.
Forward state-space search is a quite natural framework to be driven by
the goal agenda:
Simply let the planner solve a subproblem, and start the next search from
the state where
the last search ended. Even more appealing, heuristic forward-search planners
have a deeper 382On Reasonable and Forced Goal Orderings
kind of interaction with GAM than for example graphplan-style planners.
In addition
to the smaller problems they are facing when using the goal agenda, theirheuristics
are
influenced because they employ techniques for estimating the goal distance
from a state.
When using the goal agenda, different goal sets result at each stage of
the planning process
and therefore, the goal-distance estimate willbe different, too. Currently
a heuristic device
inside the ff search algorithm is being developed, which knows that it is
being driven by
a goal agenda, and which has access to the complete set of goals. This information
can be
used to further prune unpromising branches from the search space when it
discovers that
currently achieved goals will probably have to be destroyed and reachieved
later on.
References Allen, J. {Ed.}, AIPS-98 {1998}. Pro c e e dings of the 4th International
Conferenc e on Artifi-
cial Intel ligence Planning Systems. AAAI Press, Menlo Park.
Anderson, C., & Weld, D. {1998}. Conditional effects inGraphplan. In Allen
{Allen, 1998},
pp. 44{53.
Bacchus, F., & Yang, Q. {1994}. Downward refinement and the efficiency of
hierarchical problem solving. Artificial Intel ligence, 71, 43{100. Bªackstrª
om, C., & Jonsson, P. {1995}. Planning with abstraction hierarchies can be
expo-
nentiallyless efficient. In Mellish {Mellish, 1995}, pp. 1599{1604.
Barrett, A., & Weld, D. {1994}. Partial-order planning: Evaluating possible
efficiency gains. Artificial Intel ligence, 67, 71{112.
Blum, A., & Furst, M. {1997}. Fast planning through planning graph analysis.
Artificial Intel ligence, 90 {1{2}, 279{298.
Bonet, B., Loerincs, G., & Geffner, H. {1997}. A robust and fast action
selection mecha-
nism for planning. In Pro c e e dings of the 14th National Conferenc e of
the American
Asso ciation for Artificial Intel ligence, pp. 714{719.
Bundy, A., Giunchiglia, F., Sebastiani, R., & Walsh, T. {1996}. Computing
abstraction hierarchies by numerical simulation. In Weld, & Clancey {Weld
& Clancey, 1996}, pp.
523{529. Bylander, T. {1992}. Complexity results for serial decomposability.
In Pro c e e dings of the 10th National Conferenc e of the American Association
for Artificial Intel ligence, pp.
729{734 SanJose, CA. MIT Press.
Bylander, T. {1994}. The computational complexity of propositional STRIPS
planning.
Artificial Intel ligence, 69, 165{204. Chapman, D. {1987}. Planning for
conjunctive goals. Artificial Intel ligence, 32 {3}, 333{377.
Cheng, J., & Irani, K. {1989}. Ordering problem subgoals. In Sridharan {Sridharan,
1989},
pp. 931{936. 383Koehler & Hoffmann Dawsson, C., & Siklossy, L. {1977}. The
role of preprocessing in problem solving systems.
In Pro c e e dings of the 5th International Joint Conferenc e on Artificial
Intel ligence, pp.
465{471 Cambridge, MA.
Dimopoulos, Y., Nebel, B., & Koehler, J. {1997}. Encoding planning problems
in non-
monotonic logic programs. In Steel {Steel, 1997}, pp.169{181. Drummond,
M., & Currie, K. {1989}. Goal ordering in partially ordered plans. In Sridharan
{Sridharan, 1989}, pp. 960{965.
El-Kholy, A., & Richards, B. {1996}. Temporal and resource reasoning in
planning: the
parcPLAN approch. In Wahlster, W. {Ed.}, Pro c e e dings of the 12th Europ
e an Con-
ferenc e on Artificial Intel ligence, pp. 614{618. JohnWiley & Sons, Chichester,
New York.
Etzioni, O. {1993}. Acquiring search-control knowledge via static analysis.
Artificial Intel-
ligence, 62, 255{301. Fink, E., & Veloso, M. {1994}. Prodigy planning algorithm.
Technical report CMU-94-123,
Carnegie Mellon University.
F ox, M., & Long, D. {1998}. The automatic inference of state invariants
in TIM. Journal
of Artificial Intel ligence Rese ar ch, 9, 367{421. Fox, M., & Long, D.
{1999}. Efficient implementation of the plan graph in STAN. Journal
of Artificial Intel ligence Rese ar ch, 10, 87{115.
Hertzberg, J., & Horz, A. {1989}. Towards a theory of conflict detection
and resolution in
nonlinear plans. In Sridharan {Sridharan, 1989}, pp. 937{942.
Hoffmann, J. {2000}. A heuristic for domain independent planning and its
use in an enforced
hill-climbing algorithm. In 12th International Symposium on Methods for
Intel ligent Systems.
Hªullem, J., Munoz-Avila, H., & Weberskirch, F. {1999}. Extracting goal
orderings to improve partial-order and Graphplan-based planning. Technical
report, University of
Kaiserslautern.
Irani, K., & Cheng, J. {1987}. Subgoal ordering and goal augmentation for
heuristic prob-
lem solving. In McDermott, D.{Ed.}, Pro c e e dings of the 10th International
Joint Conferenc e on Artificial Intel ligence, pp. 1018{1024 Milan, Italy.
Morgan Kaufmann.
Jonsson, P., Haslum, P., & Bªackstrª om, C.{2000}. Towards efficient universal
planning: A
randomized approach. Artificial Intel ligence, 117 {1}, 1{29.
Joslin, D., & Roach, J.{1990}. A theoretical analysis of conjunctive-goal
problems. Artificial
Intel ligence, 41, 97{106.
Kambhampati,S. {1995}. Admissible pruning strategies based on plan minimality
for plan-
space planning. In Mellish {Mellish, 1995}, pp. 1627{1633. 384On Reasonable
and Forced Goal Orderings
Kautz, H., & Selman, B. {1996}. Pushing the envelope: Planning, propositional
logic, and stochastic search. In Weld, & Clancey {Weld & Clancey, 1996},
pp. 1194{1201.
Knoblock, C. {1991}. Automatical ly Generating Abstractions for Problem
Solving. Ph.D.
thesis, Carnegie Mellon University.
Knoblock, C. {1994}. Automatically generating abstractions for planning.
Artificial Intel-
ligence, 68 {2}, 243{302.
Koehler, J. {1999}. Handling of conditional effects and negative goals in
IPP. Techni-
cal report 128, Universityof Freiburg, Institute of Computer Science. available
at
http://www.informatik.uni-freiburg.de/~ koehler/ipp.html.
Koehler, J.,& Hoffmann, J. {1998}. Planning with goal agendas. Technical
report
110, University of Freiburg. available at http://www.informatik.uni-freiburg.de/~
koehler/ipp.html.
Koehler, J., Nebel, B., Hoffmann, J., & Dimopoulos, Y. {1997}. Extending
planning graphs to an ADL subset. In Steel {Steel, 1997}, pp.273{285.
Korf, R. {1985}. Macro-operators: A weak method for learning. Artificial
Intel ligence, 26,
35{77.
Korf, R. {1987}. Planning as search: A quantitative approach. Artificial
Intel ligence, 33,
65{88.
McDermott, D., & Hanks, S. {1987}. Nonmonotonic logic and temporal pro jection.
Artificial
Intel ligence, 33, 379{412. Mellish,C. {Ed.}, IJCAI-95 {1995}. Pro c e e
dings of the 14th International Joint Conferenc e
on Artificial Intel ligence. Morgan Kaufmann, San Francisco, CA.
Nebel, B., & Bªackstrª om, C. {1994}. On the computational complexity of
temporal pro jec-
tion, planning, and plan validation. Journal of Artificial Intel ligence,
66 {1}, 125{160.
Nebel, B., Dimopoulos, Y., & Koehler, J. {1997}. Ignoring irrelevant facts
and operators in plan generation. In Steel {Steel, 1997}, pp.338{350.
Nilsson, N. {1980}. Principles of Artificial Intel ligence. Tioga PublishingCompany,
Palo
Alto.
Pednault, E. {1989}. ADL: Exploring the middle ground between STRIPS and
the Situation
Calculus. In Brachman, R., Levesque, H., & Reiter, R. {Eds.}, Pro c e e
dings of the 1st
International Conferenc e on Principles of Know ledge Repr esentation and
Re asoning,
pp. 324{332 Toronto, Canada. Morgan Kaufmann.
Penberthy, J.,& Weld, D. {1992}. UCPOP: A sound, complete, partial order
planner for ADL. In Nebel, B., Swartout, W., & Rich, C. {Eds.}, Pro c e e
dings of the 3rd
International Conferenc e on Principles of Know ledge Repr esentation and
Re asoning,
pp. 103{113. Morgan Kaufmann, San Mateo.
385Koehler & Hoffmann Pollack, M., Joslin,D., &Paolucci, M. {1997}. Selection
strategies for partial-order planning.
Journal of Artificial Intel ligence Rese ar ch, 6, 223{262.
Refanidis, I., & Vlahavas, I. {1999}. GRT: A domain independent heuristic
for STRIPS
worlds based on greedy regression tables. In Pro c e e dings of the 5th
Europ e an Confer-
ence on Planning, pp. 346{358.
Regli, W., Gupta, S., & Nau, D.{1995}. AI planning versus manufactoring-operation
planning: A case study. In Mellish {Mellish, 1995}, pp.1670{1676.
Ruby, D., & Kibler, D. {1989}. Learning subgoal sequences for planning.
In Sridharan
{Sridharan, 1989}, pp. 609{615.
Russel, S., & Norvig, P. {1995}. Artificial Intel ligence - A modern Appro
ach. Prentice Hall.
Sacerdoti, E. {1974}. Planning in a hierarchy of abstraction spaces. Artificial
Intel ligence,
5, 115{135.
Sanborn,J., & Hendler, J. {1988}. Near-term event pro jection through dynamic
simulation
or how did the robot cross the road? In Pro c e e dings of the 2nd Conferenc
e on AI and
Simulation.
Smith, D., & Peot, M. {1993}. Postponing threats in partial-order planning.
In Pro c e e dings
of the 11th National Conferenc e of the American Association for Artificial
Intel ligence,
pp. 500{506. AAAIPress, MIT Press.
Sridharan, N. {Ed.}, IJCAI-89 {1989}. Pro c e e dings of the 11th International
Joint Confer- ence on Artificial Intel ligence, Detroit, MI. Morgan Kaufmann.
Steel, S. {Ed.}, ECP-97 {1997}. Pro c e e dings of the 4th Europ e an Conferenc
e on Planning,
Vol. 1348 of LNAI. Springer.
Tsuneto, R., Hendler, J., & Nau, D. S. {1998}. Analyzing external conditions
to improve
theefficiency of HTN planning. In Allen {Allen, 1998}, pp.913{920. Veloso,
M., & Blythe, J. {1994}. Linkability: Examining causal link commitments in
partial-
order planning. In Hammond, K. {Ed.}, Pro c e e dings of the 2nd International
Con-
ferenc e on Artificial Intel ligence Planning Systems, pp. 170{175. AAAI
Press, Menlo
Park.
Warshall, J. {1962}. A theorem on boolean matrices. Journal of the ACM,
9 {1}, 11{12.
Weld, D., & Clancey, B. {Eds.}., AAAI-96 {1996}. Pro c e e dings of the
14th National Con-
ferenc e of the American Association for Artificial Intel ligence. AAAI
Press.
Williams, B., & Nayak, R. {1997}. A reactive planner for a model-based executive.
In
Pro c e e dings of the 15th International Joint Conferenc e on Artificial
Intel ligence, pp.
1178{1185. MorganKaufmann, San Francisco, CA. 386