AltAltp: Online Parallelization of Plans with Heuristic State Search

R. Sanchez and S. Kambhampati

Volume 19, 2003

Abstract:
Despite their near dominance, heuristic state search planners still lag behind disjunctive planners in the generation of parallel plans in classical planning. The reason is that directly searching for parallel solutions in state space planners would require the planners to branch on all possible subsets of parallel actions, thus increasing the branching factor exponentially. We present a variant of our heuristic state search planner AltAlt, called AltAltp which generates parallel plans by using greedy online parallelization of partial plans. The greedy approach is significantly informed by the use of novel distance heuristics that AltAltp derives from a graphplan-style planning graph for the problem. While this approach is not guaranteed to provide optimal parallel plans, empirical results show that AltAltp is capable of generating good quality parallel plans at a fraction of the cost incurred by the disjunctive planners.

Extracted Text


Journal of Artificial Intelligence Research 19 (2003) 631-657 Submitted 10/02;
published 12/03

AltAltp: Online Parallelization of Plans with

Heuristic State Search

Romeo Sanchez Nigenda (rsanchez@asu.edu)
Subbarao Kambhampati (rao@asu.edu)
Department of Computer Science and Engineering, Arizona State University, Tempe AZ 85287-5406

Abstract

Despite their near dominance, heuristic state search planners still lag behind disjunctive planners in the generation of parallel plans in classical planning. The reason is that directly searching for parallel solutions in state space planners would require the planners to branch on all possible subsets of parallel actions, thus increasing the branching factor exponentially. We present a variant of our heuristic state search planner AltAlt, called AltAltp, which generates parallel plans by using greedy online parallelization of partial plans. The greedy approach is significantly informed by the use of novel distance heuristics that AltAltp derives from a Graphplan-style planning graph for the problem. While this approach is not guaranteed to provide optimal parallel plans, empirical results show that AltAltp is capable of generating good quality parallel plans at a fraction of the cost incurred by the disjunctive planners.

1. Introduction

Heuristic state space search planning has proved to be one of the most efficient planning frameworks for solving large deterministic planning problems (Bonet, Loerincs, & Geffner, 1997; Bonet & Geffner, 1999; Bacchus, 2001). Despite its near dominance, its one Achilles heel remains the generation of "parallel plans" (Haslum & Geffner, 2000). Parallel plans allow concurrent execution of multiple actions in each time step. Such concurrency is likely to become even more important as we progress to temporal domains. While disjunctive planners such as Graphplan (Blum & Furst, 1997), SATPLAN (Kautz & Selman, 1996) and GP-CSP (Do & Kambhampati, 2000) seem to have no trouble generating such parallel plans, planners that search in the space of states are overwhelmed by this task. The main reason is that straightforward methods for the generation of parallel plans would involve progression or regression over sets of actions. This increases the branching factor of the search space exponentially: given n actions, the branching factor of a simple progression or regression search is bounded by n, while that of a progression or regression search for parallel plans will be bounded by 2^n.
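To make the combinatorics concrete, the gap between the two branching factors can be sketched with a toy calculation (not part of AltAltp itself):

```python
# Branching factor of sequential vs. naive parallel regression.
# With n applicable actions, branching over single actions gives n
# children, while branching over every nonempty subset of actions
# (before even filtering for pairwise independence) gives 2^n - 1.
def sequential_branching(n: int) -> int:
    return n

def naive_parallel_branching(n: int) -> int:
    return 2 ** n - 1  # all nonempty subsets of n actions

for n in (5, 10, 20):
    print(n, sequential_branching(n), naive_parallel_branching(n))
# at n = 20 the naive parallel search already faces over a million branches
```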

The inability of state search planners to produce parallel plans has been noted in the literature previously, and past attempts to overcome this limitation have not been very successful. Indeed, Haslum and Geffner (2000) consider the problem of generating parallel plans using a regression search in the space of states. They note that the resulting planner, HSP*p, scales significantly worse than Graphplan. They present TP4 (Haslum & Geffner, 2001), which, in addition to being aimed at actions with durations, also improves

©2003 AI Access Foundation. All rights reserved.

the branching scheme of HSP*p by making it incremental along the lines of Graphplan. Empirical studies reported by Haslum and Geffner (2001), however, indicate that even this new approach, unfortunately, scales quite poorly compared to Graphplan variants. Informally, this Achilles heel of heuristic state search planners has been interpreted as a sort of last stand of the disjunctive planners - only they are capable of generating parallel plans efficiently.

Given that the only way of efficiently generating optimal parallel plans involves using disjunctive planners, we might want to consider ways of generating near-optimal parallel plans using state search planners. One obvious approach is to post-process the sequential plans generated by the state search planners to make them parallel. While this can easily be done using approaches such as those explored by Backstrom (1998), one drawback is that such approaches are limited to transforming the sequential plan given as input. Parallelization of sequential plans often results in plans that are not close to optimal parallel plans.1

An alternative, which we explore in this paper, involves incremental online parallelization. Specifically, our planner AltAltp, which is a variant of the AltAlt planner (Sanchez, Nguyen, & Kambhampati, 2000; Nguyen, Kambhampati, & Sanchez, 2002), starts its search in the space of regression over single actions. Once the most promising single action to regress is selected, AltAltp then attempts to parallelize ("fatten") the selected search branch with other independent actions. This parallelization is done in a greedy incremental fashion: actions are considered for addition to the current search branch based on the heuristic cost of the subgoals they promise to achieve. The parallelization continues to the next step only if the state resulting from the addition of the new action has a better heuristic cost. The sub-optimality introduced by the greedy nature of the parallelization is offset to some extent by a plan-compression procedure called Pushup, which tries to rearrange the evolving parallel plans by pushing actions up to higher levels in the search branch (i.e., later stages of execution in the plan).

Despite the seeming simplicity of our approach, it has proven to be quite
robust in practice. In fact, our experimental comparison with five competing
planners - STAN (Long & Fox, 1999), LPG (Gerevini & Serina, 2002), Blackbox
(Kautz & Selman, 1996), SAPA (Do & Kambhampati, 2001) and TP4 (Haslum & Geffner,
2001) - shows that AltAltp is a viable and scalable alternative for generating
parallel plans in several domains. For many problems, AltAltp is able to
generate parallel plans that are close to optimal in makespan. It also seems
to retain the efficiency advantages of heuristic state search over disjunctive
planners, producing plans in a fraction of the time taken by the disjunctive
planners in many cases. AltAltp has also been found to be superior to post-processing
approaches. Specifically, we compared AltAltp to an approach that involves
post-processing the sequential plans generated by AltAlt using the techniques
from Backstrom (1998). We found that AltAltp is able to generate shorter
parallel plans in many cases. Finally, we show that AltAltp incurs very little
additional overhead compared to AltAlt.

In the rest of this paper, we discuss the implementation and evaluation of our approach to generating parallel plans with AltAltp. Section 2 starts by providing a review of the AltAlt planning system, on which AltAltp is based. Section 3 describes the generation of parallel plans in AltAltp. Section 4 presents an extensive empirical evaluation of AltAltp. The evaluation includes both comparison and ablation studies. Finally, Section 5 discusses some related work in classical as well as metric temporal planning. Section 6 summarizes our contributions.

1. We will empirically demonstrate this later; curious readers may refer to the plots in Figure 15.

[Figure 1: Architecture of AltAlt. A Graphplan-style plan extension phase (based on STAN) processes the action templates and problem specification and builds a serial planning graph; the heuristics extracted from it, together with the actions in its last level, feed a regression planner (based on HSP-R).]

2. AltAlt Background: Architecture and Heuristics

The AltAlt planning system is based on a combination of Graphplan (Blum & Furst, 1997; Long & Fox, 1999; Kautz & Selman, 1999) and heuristic state space search (Bonet et al., 1997; Bonet & Geffner, 1999; McDermott, 1999) technology. AltAlt extracts powerful heuristics from a planning graph data structure to guide a regression search in the space of states. The high level architecture of AltAlt is shown in Figure 1. The problem specification and the action template description are first fed to a Graphplan-style planner (in our case, STAN from Long & Fox, 1999), which constructs a planning graph for that problem in polynomial time (we assume the reader is familiar with the Graphplan algorithm of Blum & Furst, 1997). This planning graph structure is then fed to a heuristic extractor module that is capable of extracting a variety of effective heuristics (Nguyen & Kambhampati, 2000; Nguyen et al., 2002). These heuristics, along with the problem specification and the set of ground actions in the final action level of the planning graph structure, are fed to a regression state-search planner.

To explain the operation of AltAlt at a more detailed level, we need to provide
some further background on its various components. We shall start with the
regression search module. Regression search is a process of searching in
the space of potential plan suffixes. The suffixes are generated by starting
with the goal state and regressing it over the set of relevant action instances
from the domain. The resulting states are then (nondeterministically) regressed
again over relevant action instances, and this process is repeated until
we reach a state (set of subgoals) which is satisfied by the initial state.
A state S in our framework is a set of (conjunction of) literals that can
be seen as "subgoals" that need to be made true on the way to achieving the
top level goals. An action instance a is considered relevant to a state S
if the effects of a give at least one element of S and do not delete

any element of S. The result of regressing S over a is then (S - eff(a)) ∪ prec(a), which is essentially the set of goals that still need to be achieved before the application of a, such that everything in S would have been achieved once a is applied. For each relevant action a, a separate search branch is generated, with the result of regressing S over that action as the new fringe in that branch. Search terminates with success at a node if every literal in the state corresponding to that node is present in the initial state of the problem.
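The relevance test and the regression step above can be sketched in a few lines, using a hypothetical STRIPS-style encoding (the action and literal names below are invented for illustration):

```python
from typing import FrozenSet, NamedTuple

class Action(NamedTuple):
    name: str
    prec: FrozenSet[str]    # preconditions
    add: FrozenSet[str]     # positive effects
    delete: FrozenSet[str]  # negative effects

# a is relevant to S if it gives some element of S and deletes none
def relevant(a: Action, S: FrozenSet[str]) -> bool:
    return bool(a.add & S) and not (a.delete & S)

# Regress(S, a) = (S - eff(a)) | prec(a)
def regress(S: FrozenSet[str], a: Action) -> FrozenSet[str]:
    return (S - a.add) | a.prec

unload = Action("unload(p1,truck,ASU)",
                prec=frozenset({"in(p1,truck)", "at(truck,ASU)"}),
                add=frozenset({"at(p1,ASU)"}),
                delete=frozenset({"in(p1,truck)"}))
goal = frozenset({"at(p1,ASU)", "at(p2,ASU)"})
assert relevant(unload, goal)
print(sorted(regress(goal, unload)))  # subgoals still needed before unload
```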

The crux of controlling the regression search involves providing a heuristic
function that can estimate the relative goodness of the states on the fringe
of the current search tree and guide the search in the most promising directions.
The heuristic function needs to evaluate the cost of achieving the set S
of subgoals (comprising a regressed state) from the initial state. In other
words, the heuristic computes the length of the plan needed to achieve the
subgoals from the initial state. We now discuss how such a heuristic can
be computed from the planning graph, which provides optimistic reachability estimates.

Normally, the planning graph data structure supports "parallel" plans - i.e.,
plans where at each step more than one action may be executed simultaneously.
Since we want the planning graph to provide heuristics to the regression
search module of AltAlt, which generates sequential solutions, we first make
a modification to the algorithm so that it generates a "serial planning graph."
A serial planning graph is a planning graph in which, in addition to the
normal mutex relations, every pair of non-noop actions at the same level
are marked mutex. These additional action mutexes propagate to give additional
propositional mutexes. Finally, a planning graph is said to level off when
there is no change in the action, proposition and mutex lists between two
consecutive levels.

We will assume for now that given a problem, the Graphplan module of AltAlt
is used to generate and expand a serial planning graph until it levels off.
As discussed by Sanchez et al. (2000), we can relax the requirement of growing
the planning graph to level-off, if we can tolerate a graded loss of informedness
of heuristics derived from the planning graph. We will start with the notion
of level of a set of propositions:

Definition 1 (Level) Given a set S of propositions, lev(S) is the index of the first level in the leveled serial planning graph in which all propositions in S appear and are non-mutex with one another. If S is a singleton, then lev(S) is just the index of the first level where the singleton element occurs. If no such level exists, then lev(S) = ∞, provided the planning graph has been grown to level-off.

The intuition behind this definition is that the level of a literal p in
the serial planning graph provides a lower bound on the length of the plan
(which, for a serial planning graph, is equal to the number of actions in
the plan) to achieve p from the initial state. Using this insight, a simple
way of estimating the cost of a set of subgoals will be to sum their levels.

Heuristic 1 (Sum heuristic) h_sum(S) := Σ_{p ∈ S} lev({p})
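With level values in hand, the sum heuristic is just a table lookup and a sum. A minimal sketch, where the propositions and their levels are invented for illustration:

```python
# lev maps each proposition to the index of the first planning-graph
# level where it appears (values here are hypothetical).
lev = {"at(p1,ASU)": 2, "at(p2,ASU)": 2, "at(p3,ASU)": 3}

# h_sum(S) := sum of lev({p}) over all p in S
def h_sum(S):
    return sum(lev[p] for p in S)

print(h_sum({"at(p1,ASU)", "at(p3,ASU)"}))  # -> 5
```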

The sum heuristic is very similar to the greedy regression heuristic used in UNPOP (McDermott, 1999) and the heuristic used in the HSP planner (Bonet et al., 1997). Its main limitation is that it makes the implicit assumption that all the subgoals (elements of S) are independent. The h_sum heuristic is neither admissible nor particularly informed, as it ignores the interactions between the subgoals. To develop more effective heuristics,

we need to consider both positive and negative interactions among subgoals in a limited fashion.

In (Nguyen et al., 2002), we discuss a variety of ways of using the planning graph to incorporate negative and positive interactions into the heuristic estimate, and discuss their relative tradeoffs. One of the best heuristics according to that analysis was a heuristic called h_AdjSum2M. We adopted this heuristic as the default heuristic in AltAlt. The basic idea of h_AdjSum2M is to adjust the sum heuristic to take positive and negative interactions into account. This heuristic approximates the cost of achieving the subgoals in some set S as the sum of the cost of achieving S while considering positive interactions and ignoring negative interactions, plus a penalty for ignoring the negative interactions. The first component can be computed as the length of a "relaxed plan" RP(S) for supporting S, which is extracted by ignoring all the mutex relations. To approximate the penalty induced by the negative interactions alone, we proceed with the following argument. Consider any pair of subgoals p, q ∈ S. If there are no negative interactions between p and q, then lev({p, q}), the level at which p and q are present together, is exactly the maximum of lev(p) and lev(q). The degree of negative interaction between p and q can thus be quantified by:

δ(p, q) = lev({p, q}) - max(lev(p), lev(q))

We now want to use the δ-values to characterize the amount of negative interaction present among the subgoals of a given set S. If all subgoals in S are pairwise independent, clearly all δ values will be zero; otherwise, different pairs of subgoals in S will have different values. The largest such δ value among any pair of subgoals in S is used as the measure of the negative interactions present in S in the heuristic h_AdjSum2M. In summary, we have

Heuristic 2 (Adjusted 2M) h_AdjSum2M(S) := length(RP(S)) + max_{p,q ∈ S} δ(p, q)
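The adjusted heuristic can be sketched in the same style. The single and pairwise level values below are invented, and the relaxed-plan length is passed in as a stub rather than extracted from a real planning graph:

```python
from itertools import combinations

# lev1: first level where each proposition appears (hypothetical values);
# lev2: first level where a pair appears together non-mutex.
lev1 = {"p": 1, "q": 2, "r": 2}
lev2 = {frozenset({"p", "q"}): 2, frozenset({"p", "r"}): 2,
        frozenset({"q", "r"}): 4}  # q and r interact negatively

# delta(p, q) = lev({p, q}) - max(lev(p), lev(q))
def delta(p, q):
    return lev2[frozenset({p, q})] - max(lev1[p], lev1[q])

# h_AdjSum2M(S) = length(RP(S)) + max delta over all pairs in S
def h_adjsum2m(S, relaxed_plan_len):
    penalty = max((delta(p, q) for p, q in combinations(S, 2)), default=0)
    return relaxed_plan_len + penalty

print(h_adjsum2m({"p", "q", "r"}, relaxed_plan_len=3))  # 3 + delta(q,r)=2 -> 5
```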

The analysis by Nguyen et al. (2002) shows that this is one of the more robust
heuristics in terms of both solution time and quality. This is thus the default
heuristic used in AltAlt (as well as AltAltp; see below).

3. Generation of Parallel Plans Using AltAltp

The obvious way to make AltAlt produce parallel plans would involve regressing over subsets of (non-interfering) actions. Unfortunately, this increases the branching factor exponentially and is infeasible in practice. Instead, AltAltp uses a greedy depth-first approach that makes use of its heuristics to regress single actions, and incrementally parallelizes the partial plan at each step, rearranging the partial plan later if necessary.

The high level architecture of AltAltp is shown in Figure 2. Notice that
the heuristic extraction phase of AltAltp is very similar to that of AltAlt,
but with one important modification. In contrast to AltAlt which uses a "serial"
planning graph as the basis for its heuristic (see Section 2), AltAltp uses
the standard "parallel" planning graph. This makes sense given that AltAltp
is interested in parallel plans while AltAlt was aimed at generating sequential
plans. The regression state-search engine for AltAltp is also different from
the search module in AltAlt. AltAltp augments the search engine of AltAlt
with 1) a fattening step and 2) a plan compression procedure (Pushup). The
details of these procedures are discussed below.

[Figure 2: Architecture of AltAltp. As in AltAlt, a Graphplan-style plan extension phase (based on STAN) processes the action templates and problem specification, but it now builds a parallel planning graph; the extracted heuristics and the actions in the last level feed a search engine with node expansion (fattening), node ordering and selection, and the plan compression algorithm (Pushup).]

[Figure 3: AltAltp notation. Regressing a state S over the set of applicable actions A = {a1, a2, ..., ap, ..., am} generates the children states S1, S2, ..., Sp, ..., Sm.]

parexpand(S)
  A ← get set of applicable actions for current state S
  forall a_i ∈ A
    S_i ← Regress(S, a_i)
    CHILDREN(S) ← CHILDREN(S) + S_i
  S_p ← the state among CHILDREN(S) with minimum h_AdjSum2M value
  a_p ← the action that regresses S to S_p
  /** Fattening process */
  O ← {a_p}
  forall g ∈ S, ranked in decreasing order of lev(g)
    Find an action a_g ∈ A supporting g such that a_g ∉ O and a_g is
      pairwise independent with each action in O. If there are multiple
      such actions, pick the one with minimum h_AdjSum2M(Regress(S, O + a_g)).
    If h_AdjSum2M(Regress(S, O + a_g)) < h_AdjSum2M(Regress(S, O))
      O ← O + a_g
  S_par ← Regress(S, O)
  CHILDREN(S) ← CHILDREN(S) + S_par
  return CHILDREN(S)
END;

Figure 4: Node Expansion Procedure
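The fattening loop of Figure 4 can also be rendered as an executable sketch. All helpers here (regress, h, lev, supports, independent) are hypothetical stand-ins supplied by the caller, and the toy instance at the end is invented for illustration:

```python
def fatten(S, pivot, candidates, regress, h, lev, supports, independent):
    """Greedily grow the action set O around the pivot action."""
    O = [pivot]
    for g in sorted(S, key=lev, reverse=True):  # hardest subgoals first
        current_h = h(regress(S, O))
        best, best_h = None, current_h
        for a in candidates:
            if a in O or not supports(a, g):
                continue
            if not all(independent(a, o) for o in O):
                continue
            new_h = h(regress(S, O + [a]))
            if new_h < best_h:                  # strictly better heuristic only
                best, best_h = a, new_h
        if best is not None:
            O.append(best)
    return O, regress(S, O)

# toy instance: two subgoals, one supporting action each, all independent
S = frozenset({"g1", "g2"})
adds = {"a1": {"g1"}, "a2": {"g2"}}
regress = lambda S, O: frozenset(S - set().union(*(adds[a] for a in O)))
O, S_par = fatten(S, "a1", ["a2"], regress,
                  h=len, lev=lambda g: 1,
                  supports=lambda a, g: g in adds[a],
                  independent=lambda a, b: True)
print(O, sorted(S_par))  # -> ['a1', 'a2'] []
```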

The general idea in AltAltp is to select a fringe action ap from among those actions A used to regress a particular state S during any stage of the search (see Figure 3). Then, the pivot branch given by the action ap is "fattened" by adding more actions from A, generating a new state that is the consequence of regression over multiple parallel actions. The candidate actions used for fattening the pivot branch must (a) come from the sibling branches of the pivot branch and (b) be pairwise independent with all the other actions currently in the pivot branch. We use the standard definition of action independence: two actions a1 and a2 are considered independent if the state S′ resulting after regressing both actions simultaneously is the same as that obtained by applying a1 and a2 sequentially in any of their possible linearizations. A sufficient condition for this is that the preconditions and effects of the actions do not interfere:

(|prec(a1)| ∪ |eff(a1)|) ∩ (|prec(a2)| ∪ |eff(a2)|) = ∅

where |L| refers to the non-negated versions of the literals in the set L. We now discuss the details of how the pivot branch is selected in the first place, and how the branch is incrementally fattened.
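A sketch of this sufficient condition, assuming a simple string encoding in which negated literals carry a "not-" prefix (this representation, and the example actions, are illustrative rather than AltAltp's internals):

```python
# |L|: the non-negated versions of the literals in L
def positive(lits):
    return {l[4:] if l.startswith("not-") else l for l in lits}

# a1, a2 independent if (|prec(a1)| | |eff(a1)|) & (|prec(a2)| | |eff(a2)|) is empty
def independent(a1, a2):
    s1 = positive(a1["prec"]) | positive(a1["eff"])
    s2 = positive(a2["prec"]) | positive(a2["eff"])
    return not (s1 & s2)

load = {"prec": {"at(p1,ASU)", "at(truck,ASU)"},
        "eff": {"in(p1,truck)", "not-at(p1,ASU)"}}
fly = {"prec": {"at(plane,PHX)"},
       "eff": {"at(plane,LA)", "not-at(plane,PHX)"}}
unload_same = {"prec": {"in(p1,truck)"},
               "eff": {"at(p1,ASU)", "not-in(p1,truck)"}}
print(independent(load, fly))          # -> True: no shared literals
print(independent(load, unload_same))  # -> False: both touch p1's literals
```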

Selecting the Pivot Branch: Figure 4 shows the procedure used to select and
parallelize the pivot branch. The procedure first identifies the set of regressable
actions A for the current node S, and regresses each of them, computing the
new children states. Next, the action leading to the child state with the
lowest heuristic cost among the new children is selected as the pivot action
ap, and the corresponding branch becomes the pivot branch.

[Figure 5: After the regression of a state S (top-level goals At(pack1,ASU), At(pack2,ASU), At(pack3,ASU), At(pack4,Home)), we can identify the Pivot branch (a_p: Unload(pack1,airp1,ASU), H=21) and the related set of pairwise independent actions, e.g., Unload(pack2,airp1,ASU), Unload(pack3,airp1,ASU), and Unload(pack4,airp2,Home).]

The heuristic cost of the states is computed with the h_AdjSum2M heuristic from AltAlt, based on a "parallel" planning graph. Specifically, in the context of the discussion of the h_AdjSum2M heuristic at the end of Section 2, we compute the δ(p, q) values, which in turn depend on lev(p), lev(q) and lev({p, q}), in terms of the levels in the parallel planning graph rather than the serial planning graph. It is easy to show that the level of a set of conditions on the parallel planning graph will be less than or equal to the level on the serial planning graph. The length of the relaxed plan is still computed in terms of the number of actions. We show later (see Figure 19(a)) that this change does improve the quality of the parallel plans produced by AltAltp.

The search algorithm used in AltAltp is similar to that used in HSP-r (Bonet & Geffner, 1999) - it is a hybrid between greedy depth-first and weighted A* search. It goes depth-first as long as the heuristic cost of any of the children states is lower than that of the current state. Otherwise, the algorithm resorts to a weighted A* search to select the next node to expand. In this latter case, the evaluation function used to rank the nodes is f(S) = g(S) + w * h(S), where g(S) is the length of the current partial plan in terms of the number of steps, h(S) is our estimated cost given by the heuristic function (e.g., h_AdjSum2M), and w is the weight given to the heuristic function; w is set to 5 based on our empirical experience.2

2. For the role of w in Best-First search, see Korf (1993).

Breaking Ties: In case of a tie in selecting the pivot branch, i.e., when more than one branch leads to a state with the lowest heuristic cost, we break the tie by choosing the action that
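The hybrid selection rule can be sketched as follows; the fringe representation (tuples of g, h, and a state label) is a simplification assumed here, not AltAltp's actual data structure:

```python
W = 5  # weight given to the heuristic (set empirically, per the text)

def select(current_h, children):
    """children: list of (g, h, state) fringe nodes under the current node."""
    improving = [c for c in children if c[1] < current_h]
    if improving:
        # greedy depth-first step: some child beats the current state's h
        return min(improving, key=lambda c: c[1])[2]
    # weighted A* fallback: rank nodes by f(S) = g(S) + W * h(S)
    return min(children, key=lambda c: c[0] + W * c[1])[2]

print(select(10, [(2, 11, "S1"), (3, 9, "S2"), (1, 12, "S3")]))  # -> S2
```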

supports subgoals that are harder to achieve. Here, the hardness of a literal l is measured in terms of the level in the planning graph at which l first appears. The standard rationale for this decision (c.f. Kambhampati & Sanchez, 2000) is that we want to fail faster by considering the most difficult subgoals first. We have an additional justification in our case: we also know that a subgoal with a higher level value requires more steps and actions for its achievement, because it appears later in the planning graph. So, by supporting it first, we may be able to achieve other easier subgoals along the way and thereby reduce the number of parallel steps in our partial plan.

Fattening the Pivot Branch: Next, the procedure needs to decide which subset O ⊆ A of the sibling actions of the pivot action ap will be used to fatten the pivot branch. The obvious first idea would be to fatten the pivot branch maximally by adding all pairwise independent actions found during that search stage. The problem with this idea is that it may add redundant and heuristically inferior actions to the branch, and satisfying their preconditions may lead to an increase in the number of parallel steps.

So, in order to avoid fattening the pivot branch with such irrelevant actions, before adding any action a to O, we require that the heuristic cost of the state S′ that results from regressing S over O + a be strictly lower than that of S. This is in addition to the requirement that a be pairwise independent of the current set of actions in O. This simple check also ensures that we do not add more than one action for supporting the same set of subgoals in S.

The overall procedure for fattening the pivot branch thus involves picking the next hardest subgoal g in S (with hardness measured in terms of the level of the subgoal in the planning graph), and finding the action a_g ∈ A achieving g which is pairwise independent of all actions in O and which, when added to O and used to regress S, leads to the state S′ with the lowest heuristic cost, which in consequence must be lower than the cost of S. Once found, a_g is then added to O, and the procedure is repeated. If there is more than one action that can serve as a_g, then we break ties by considering the degree of overlap between the preconditions of the candidate action and those of the set of actions currently in O. The degree of precondition overlap between a and O is defined as |prec(a) ∩ {∪_{o∈O} prec(o)}|. The action a with the higher degree of overlap is preferred, as this reduces the amount of additional work needed to establish its preconditions. Notice that because of the fattening process, a search node may have multiple actions leading to it from its parent, and multiple actions leading from it to each of its children.
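The overlap tie-breaker is a small set computation. A sketch, with hypothetical precondition sets loosely following the logistics example:

```python
# Degree of precondition overlap: |prec(a) & union of prec(o) for o in O|
def overlap(a_prec, O_precs):
    pooled = set().union(*O_precs) if O_precs else set()
    return len(a_prec & pooled)

O_precs = [{"in(p1,airp1)", "at(airp1,ASU)"}]      # preconditions of actions in O
cand1 = {"in(p2,airp1)", "at(airp1,ASU)"}           # reuses airp1's location
cand2 = {"in(p2,airp2)", "at(airp2,ASU)"}           # uses the other plane
print(overlap(cand1, O_precs), overlap(cand2, O_precs))  # -> 1 0
```

The candidate with the higher overlap (here, the one unloading from the same plane) is preferred, mirroring the example in the text.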

Example: Figure 5 illustrates the use of this node expansion procedure for a problem from the logistics domain (Bacchus, 2001). In this example we have four packages pack1, pack2, pack3 and pack4. Our goal is to place the first three of them at ASU and the remaining one at home. There are two planes, airp1 and airp2, to carry out the plans. The figure shows the first level of the search after S has been regressed. It also shows the pivot action ap, given by Unload(pack1,airp1,ASU), and a candidate set of pairwise independent actions with respect to ap. Finally, we can see in Figure 6 the generation of the parallel branch. Notice that each node can be seen as a partial regressed plan. As described in the paragraphs above, only actions regressing to lower heuristic estimates are considered in a_par to fatten the pivot branch. Notice that the action Unload(pack4,airp2,Home) has been discarded because it leads to a state with higher cost, even though it is not inconsistent

[Figure 6: S_par (H=19) is the result of incrementally fattening the Pivot branch (a_p: Unload(pack1,airp1,ASU), H=21) with the pairwise independent actions in O, Unload(pack2,airp1,ASU) and Unload(pack3,airp1,ASU).]

with the rest of the actions chosen to fatten the pivot branch. Furthermore,
we can also see that we have preferred actions using the plane airp1, since
they overlap more with the pivot action ap.

Offsetting the Greediness of Fattening: The fattening procedure is greedy,
since it insists that the state resulting after fattening have a strictly
better heuristic value. While useful in avoiding the addition of irrelevant
actions to the plan, this procedure can also sometimes preclude actions that
are ultimately relevant but were discarded because the heuristic is not perfect.
These actions may then become part of the plan at later stages during search
(i.e., earlier parts of the execution of the eventual solution plan; since
we are searching in the space of plan suffixes). When this happens, the length
of the parallel plan is likely to be greater, since more steps that may be
needed to support the preconditions of such actions would be forced to come
at even later stages of search (earlier parts of the plan). Had the action
been allowed into the partial plan earlier in the search (i.e., closer to
the end of the eventual solution plan), its preconditions could probably
have been achieved in parallel to the other subgoals in the plan, thus improving
the number of steps.

In order to offset this negative effect of greediness, AltAltp re-arranges the partial plan to promote such actions higher up the search branch (i.e., into later parts of the execution of the eventual solution plan). Specifically, before expanding a given node S, AltAltp checks to see if any of the actions in A_S leading to S from its parent node (e.g., Figure 6 shows that a_par leads to S_par) can be pushed up to higher levels in the search branch. This online

pushUP(S)
  A_S ← get actions leading to S
  forall a ∈ A_S
    x ← 0
    S_x ← get parent node of S
    /** Getting highest ancestor for each action */
    Loop
      A_x ← get actions leading to S_x
      If (parallel(a, A_x))
        x ← x + 1
        S_x ← get parent node of S_{x-1}
      Else
        a_j ← get action conflicting with a from A_x
        If (Secondary Optimizations)
          Remove a and a_j from branch
          Include a_new if necessary
        Else
          A_{x-1} ← A_{x-1} + a
          A_S ← A_S - a
        break
    End Loop
  /** Adjusting the partial plan */
  S_x ← get highest ancestor x in history
  createNewBranchFrom(S_x)
  while x > 0
    S_new ← regress S_x with A_{x-1}
    S_x ← S_new
    x ← x - 1
END;

Figure 7: Pushup Procedure

re-arrangement of the plan is done by the Pushup procedure, which is shown in Figure 7. The Pushup procedure is called each time before a node gets expanded, and it will try to compress the partial plan. For each of the actions a ∈ A_S, we find the highest ancestor node S_x of S in the search branch to which the action can be applied (i.e., it gives some literal in S_x without deleting any other literals in S_x, and it is pairwise independent of all the actions A_x currently leading out of S_x; in other words, the condition parallel(a, A_x) is satisfied). Once S_x is found, a is then removed from the set of actions A_S leading to S and introduced into the set of actions leading out of S_x (to its child in the current search branch). Next, the states in the search branch below S_x are adjusted to reflect this change. The adjustment involves recomputing the regressions of all the search nodes below S_x. At first glance, this might seem like a transformation of questionable utility, since the preconditions of a (and their regressions) just become part of the descendants of S_x, and this does not necessarily reduce the length of the plan. We however expect a length reduction, because actions supporting the preconditions of a will themselves get "pushed up" eventually during later expansions.
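The ancestor search at the heart of Pushup can be sketched as a walk up the branch. The helpers and the toy four-node chain below are hypothetical stand-ins, not AltAltp's data structures:

```python
# Walk up the search branch from S while action a stays pairwise
# independent of every set of actions leading into the ancestor;
# return the highest ancestor whose incoming action set a can join.
def highest_ancestor(S, a, parent, actions_into, independent):
    best = None
    node = parent(S)
    while node is not None:
        if all(independent(a, b) for b in actions_into(node)):
            best = node
            node = parent(node)
        else:
            break  # first conflicting ancestor stops the climb
    return best

# toy branch: S0's parent is S1, then S2, then the root S3;
# action "a" conflicts only with the actions leading into S3
parents = {"S0": "S1", "S1": "S2", "S2": "S3", "S3": None}
incoming = {"S1": ["b1"], "S2": ["b2"], "S3": ["b3"]}
top = highest_ancestor("S0", "a",
                       parent=lambda s: parents[s],
                       actions_into=lambda s: incoming[s],
                       independent=lambda a, b: b != "b3")
print(top)  # -> S2
```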

[Figure 8: Rearranging of the partial plan. (a) Finding the highest ancestor node to which an action can be pushed up. (b) The Pushup procedure generates a new search branch.]


Rather than doctoring the existing branch, in the current implementation we simply add a new branch
below Sx that reflects the changes made by the Pushup procedure.3 The new
branch then becomes the active search branch, and its leaf node is expanded
next.

Aggressive Variation of Pushup: The Pushup procedure, as described above,
is not expensive, as it only affects the current search branch, and the only
operations involved are recomputing the regressions in the branch. Of course,
it is possible to be more aggressive in manipulating the search branch. For
example, after applying an action a to its ancestor Sx, the set of literals
in the child state, say Snew, changes, and thus additional actions may become
relevant for expanding Snew. In principle, we could re-expand Snew in light
of the new information. We decided not to go with the re-expansion option,
as it typically does not seem to be worth the cost. In Section 4.3, we
compare our default version of the Pushup procedure with a variant that re-expands
all nodes in the search branch, and the results of those studies support
our decision to avoid re-expansion. Finally, although we introduced the Pushup
procedure as an add-on to the fattening step, it can also be used independently
of the latter, in which case the net effect is an incremental parallelization
of a sequential plan.

Example: In Figure 8(a), we have two actions leading to the node Spar (at
depth two): Unload(pack4,airp2,Home) and fly(airp1,LocX,ASU).
So, before expanding Spar, we check whether either of the two actions leading to it
can be pushed up. While the second action is not pushable, since it interacts
with the actions in its ancestor node, the first one is. We find the highest
ancestor node in the partial plan to which our pushable action can be applied;
in our example, this is the root node. So, we insert our pushable
action Unload(pack4,airp2,Home) directly below the root node. We then re-adjust
the state Spar to Snew at depth 1, as shown in Figure 8(b), adding a new
branch and reflecting the changes in the states below. Notice that the action
Unload(pack4,airp2,Home) was initially discarded by the greediness of the
fattening procedure (see Figure 6), but we have offset this negative effect
with our plan compression algorithm. Note also that we have not re-expanded
the state Snew at depth 1; we have only made the adjustments to the partial
plan using the actions already present in the search trace.4

4. Evaluating the Performance of AltAltp

We implemented AltAltp on top of AltAlt. We have tested our implementation on a suite of problems that were used in the 2000 and 2002 AIPS competitions (Bacchus, 2001; Long & Fox, 2002), as well as on other benchmark problems (McDermott, 2000). Our experiments are broadly divided into three sets, each aimed at comparing the performance of AltAltp under different scenarios:

1. Comparing the performance of AltAltp to other planning systems capable of producing parallel plans.

3. Because of the way our data structures are set up, adding the new branch turns out to be a more robust option than manipulating the existing search branch.

4. In contrast, the aggressive Pushup modification would expand Snew at depth 1, generating states similar to those generated by the expansion of Spar at the same depth.


[Figure 9: Performance on the Gripper (AIPS-98) and the Elevator (AIPS-00) domains. (a) Plan length in steps on Gripper for AltAltp, STAN, TP4, Blackbox, and LPG (2nd iteration). (b) Plan length in steps on Elevator for AltAltp, STAN, Blackbox, and LPG (2nd iteration).]

[Figure 10: Performance on the Schedule domain (AIPS-00). (a) Plan length in steps for AltAltp, STAN, and Blackbox. (b) Running time.]

2. Comparing our incremental parallelization technique to AltAlt + Post-Processing.

3. Ablation studies to analyze the effect of the different parts of the AltAltp approach on its overall performance.

Our experiments were all done on a Sun Blade-100 workstation running SunOS 5.8 with 1GB of RAM. Unless noted otherwise, AltAltp was run with the hAdjSum2M heuristic described in Section 2 of this paper, and with a parallel planning graph grown until the first level at which the top-level goals are present without being mutex. All times are in seconds.

[Figure 11: Performance on the Logistics domain (AIPS-00). (a) Plan length in steps for AltAltp, STAN, and Blackbox. (b) Running time.]

[Figure 12: Performance on the DriverLog domain (AIPS-02). (a) Plan length in steps for AltAltp, STAN, TP4, Blackbox, LPG (2nd iteration), and SAPA. (b) Running time.]

4.1 Comparing AltAltp with Competing Approaches

In the first set of experiments we have compared the performance of our planner with the results obtained by running STAN (Long & Fox, 1999), Blackbox (Kautz & Selman, 1999), TP4 (Haslum & Geffner, 2001), LPG (Gerevini & Serina, 2002), and SAPA (Do & Kambhampati, 2001). Unless noted otherwise, every planner was run with its default settings. Some of the planners could not be run in some domains due to parsing problems or memory allocation errors. In such cases, we simply omit that planner from consideration for those particular domains.

4.1.1 Planners Used in the Comparison Studies

STAN is a disjunctive planner: an optimized version of the Graphplan algorithm that reasons with invariants and symmetries to reduce the size of the search space. Blackbox is also based on the Graphplan algorithm, but it works by converting planning problems specified in STRIPS notation (Fikes & Nilsson, 1971) into boolean satisfiability problems and solving them with a SAT solver (the version we used defaults to SATZ).5 LPG (Gerevini & Serina, 2002) was judged the best performing planner at the 3rd International Planning Competition (Long & Fox, 2002); it is based on planning graphs and local search inspired by the Walksat approach. LPG was run with its default heuristics and settings. Since LPG employs an iterative improvement algorithm, the quality of the plans it produces can be improved by running it for multiple iterations (thus increasing the running time). To make the comparisons meaningful, we decided to run LPG for two iterations (n=2), since beyond that the running time of LPG was generally worse than that of AltAltp. Finally, we have also chosen two metric temporal planners, which are able to represent parallel plans because of their representation of time and durative actions. TP4 (Haslum & Geffner, 2001) is a temporal planner based on HSP*p (Haslum & Geffner, 2000), an optimal parallel state space planner with an IDA* search algorithm. The last planner in our list is SAPA (Do & Kambhampati, 2001), a powerful domain-independent heuristic forward chaining planner for metric temporal domains that employs distance-based heuristics (Kambhampati & Sanchez, 2000) to control its search.

4.1.2 Comparison Results in Different Domains

We have run the planners on the Gripper domain from the 1998 International Planning Competition (McDermott, 2000), on three domains (Logistics, Schedule, and Elevator/Miconic-STRIPS) from the 2000 competition (Bacchus, 2001), and on three more from the 2002 competition (Long & Fox, 2002): DriverLog, ZenoTravel, and Satellite. In cases where there were multiple versions of a domain, we used the "STRIPS Untyped" versions.6 We discuss the results for each of the domains below.

Gripper: In Figure 9(a), we compare the performance of AltAltp on the Gripper domain (McDermott, 2000) to the rest of the planners, excluding SAPA. The plot shows the results in terms of the number of (parallel) steps. We can see that even in this simple domain, AltAltp and LPG are the only planners capable of scaling up and generating parallel

5. We have not chosen IPP (Koehler, 1999), which is also an optimized Graphplan planning system, because results reported by Haslum and Geffner (2001) show that it is less efficient than STAN.

6. Since SAPA does not read the STRIPS file format, we have run SAPA on equivalent problems with unit-duration actions from Long and Fox (2002).


plans. None of the other approaches is able to solve more than four problems.7 AltAltp is able to scale up without any difficulty to problems involving 30 balls. Furthermore, AltAltp returns better plans than LPG.

[Figure 13: Performance on the ZenoTravel domain (AIPS-02). (a) Plan length in steps for AltAltp, STAN, LPG (2nd iteration), and SAPA. (b) Running time.]

Elevator: In Figure 9(b), we compare AltAltp to STAN, Blackbox, and LPG on the Elevator (Miconic-STRIPS) domain (Bacchus, 2001).8 AltAltp approaches the quality of the solutions produced by the optimal approaches (e.g., Blackbox and STAN). Notice that Blackbox can only solve around half of the problems solved by AltAltp in this domain.

Scheduling: Results from the Scheduling domain are shown in Figure 10. Only
Blackbox and STAN are considered for comparison.9 AltAltp seems to reasonably
approximate the optimal parallel plans for many problems (around 50 of them),
but does produce significantly suboptimal plans for some. However, it is
again able to solve more problems than the other two approaches and in a
fraction of the time.

Logistics: The plots corresponding to the Logistics domain from Bacchus (2001)
are shown in Figure 11.10 For some of the most difficult problems AltAltp
outputs lower quality solutions than the optimal approaches. However, only
AltAltp and LPG are able to scale up to more complex problems, and we can
easily see that AltAltp provides better quality solutions than LPG. AltAltp
also seems to be more efficient than any of the other approaches.

7. Although STAN is supposed to be able to generate optimal step-length plans, in a handful of cases it seems to have produced nonoptimal solutions for the Gripper domain. We have no explanation for this behavior, but have informed the authors of the code.

8. We did not include the traces from TP4 because the pre-processor of the planner was not able to read the domain.

9. The TP4 pre-processor cannot read this domain, LPG runs out of memory, and SAPA has parsing problems.

10. Only SAPA is excluded, due to parsing problems.


[Figure 14: Performance on the Satellite domain (AIPS-02). (a) Plan length in steps for AltAltp, STAN, TP4, Blackbox, LPG (2nd iteration), and SAPA. (b) Running time.]

The LPG solutions for problems 49 to 61 are obtained with only one iteration, since LPG was not able to complete the second iteration in a reasonable amount of time. This explains the low time taken by LPG, and the lower quality of its solutions.

DriverLog: We see in Figure 12(a) that AltAltp does reasonably well in terms of quality with respect to the other approaches in the DriverLog domain. Every planner is considered this time. AltAltp is one of only two planners able to scale up. Figure 12(b) also shows that AltAltp is more efficient than any of the other planners.

Zeno-Travel: Only AltAltp, SAPA, and LPG are able to solve most of the problems
in this domain.11 AltAltp solves them very efficiently (Figure 13(b)) providing
very good solution quality (Figure 13(a)) compared to the temporal metric
planners.

Satellite: The results from the Satellite domain are shown in Figure 14.
Although every planner is considered, only AltAltp, SAPA, and LPG can solve
most of the problems. SAPA solves all problems but produces lower quality
solutions for many of them. AltAltp produces better solution quality than
SAPA, and is also more efficient. However, AltAltp produces lower quality
solutions than LPG in four problems. LPG cannot solve one of the problems
and produces lower quality solutions in 5 of them.

Summary: In summary, we note that AltAltp is significantly superior in the Elevator and Gripper domains. It also performs very well in the DriverLog, ZenoTravel, and Satellite domains from the 2002 competition (Long & Fox, 2002). The performance of all planners is similar in the Schedule domain. In the Logistics domain, the quality of AltAltp plans is second only to that of Blackbox for the problems that this optimal planner can solve. However, AltAltp scales up along with LPG to bigger problems, returning very good step-length quality plans. TP4, the only other heuristic state search regression planner capable of producing parallel plans, is not able to scale up in most of the domains. SAPA, a heuristic search progression planner, while competitive, is still outperformed by AltAltp in planning time and solution quality.

11. Blackbox and TP4 are not able to parse this domain.

[Figure 15: AltAlt and Post-Processing vs. AltAltp (Logistics domain). (a) Plan length in steps for AltAlt-PostProc, AltAltp, and AltAlt. (b) Running time.]

[Figure 16: AltAlt and Post-Processing vs. AltAltp (ZenoTravel domain). (a) Plan length in steps. (b) Running time.]


SOLUTION: solution found (length = 9)
  Time 1: load-truck(obj13,tru1,pos1) Level: 1
  Time 1: load-truck(obj12,tru1,pos1) Level: 1
  Time 1: load-truck(obj11,tru1,pos1) Level: 1
  Time 2: drive-truck(tru1,pos1,apt1,cit1) Level: 1
  Time 3: unload-truck(obj12,tru1,apt1) Level: 3
  Time 3: fly-airplane(apn1,apt2,apt1) Level: 1
  Time 3: unload-truck(obj11,tru1,apt1) Level: 3
  Time 4: load-airplane(obj12,apn1,apt1) Level: 4
  Time 4: load-airplane(obj11,apn1,apt1) Level: 4
  Time 5: load-truck(obj21,tru2,pos2) Level: 1
  Time 5: fly-airplane(apn1,apt1,apt2) Level: 2
  Time 6: drive-truck(tru2,pos2,apt2,cit2) Level: 1
  Time 6: unload-airplane(obj11,apn1,apt2) Level: 6
  Time 7: load-truck(obj11,tru2,apt2) Level: 7
  Time 7: unload-truck(obj21,tru2,apt2) Level: 3
  Time 8: drive-truck(tru2,apt2,pos2,cit2) Level: 2
  Time 9: unload-airplane(obj12,apn1,apt2) Level: 6
  Time 9: unload-truck(obj13,tru1,apt1) Level: 3
  Time 9: unload-truck(obj11,tru2,pos2) Level: 9
Total number of actions in plan: 19

(a) AltAltp solution

POST PROCESSED PLAN ...
  Time 1: load-truck(obj13,tru1,pos1)
  Time 1: load-truck(obj12,tru1,pos1)
  Time 1: load-truck(obj11,tru1,pos1)
  Time 1: fly-airplane(apn1,apt2,apt1)
  Time 1: load-truck(obj21,tru2,pos2)
  Time 2: drive-truck(tru1,pos1,apt1,cit1)
  Time 2: drive-truck(tru2,pos2,apt2,cit2)
  Time 3: unload-truck(obj12,tru1,apt1)
  Time 3: unload-truck(obj11,tru1,apt1)
  Time 3: unload-truck(obj21,tru2,apt2)
  Time 3: unload-truck(obj13,tru1,apt1)
  Time 4: load-airplane(obj12,apn1,apt1)
  Time 4: load-airplane(obj11,apn1,apt1)
  Time 5: fly-airplane(apn1,apt1,apt2)
  Time 6: unload-airplane(obj11,apn1,apt2)
  Time 6: unload-airplane(obj12,apn1,apt2)
  Time 7: load-truck(obj11,tru2,apt2)
  Time 8: drive-truck(tru2,apt2,pos2,cit2)
  Time 9: unload-truck(obj11,tru2,pos2)
END OF POST PROCESSING: Actions = 19, Length: 9

(b) AltAltp plus post-processing

Figure 17: Plots showing that AltAltp solutions cannot be improved any further by post-processing.

4.2 Comparison to Post-Processing Approaches

As we mentioned earlier (see Section 1), one way of producing parallel plans that has been studied previously in the literature is to post-process sequential plans (Backstrom, 1998). To compare online parallelization to post-processing, we have implemented Backstrom's (1998) "Minimal De-ordering Algorithm", and used it to post-process the sequential plans produced by AltAlt (running with its default heuristic hAdjSum2M using a serial planning graph). In this section we compare our online parallelization procedure to this offline method.

The first set of experiments is on the Logistics domain (Bacchus, 2001).
The results are shown in Figure 15. As expected, the original AltAlt has
the longest plans since it allows only one action per time step. The plot
shows that post-processing techniques do help in reducing the makespan of
the plans generated by AltAlt. However, we also notice that AltAltp outputs
plans with better makespan than either AltAlt or AltAlt followed by post-processing.
This shows that online parallelization is a better approach than post-processing
sequential plans. Moreover, the plot in Figure 15(b) shows that the time
taken by AltAltp is largely comparable to that taken by the other two approaches.
In fact, there is not much additional cost overhead in our procedure.

Figure 16 repeats these experiments in the ZenoTravel domain (Long & Fox, 2002). Once again, we see that AltAltp produces better makespan than post-processing the sequential plans of AltAlt. Notice that this time, AltAlt plus post-processing is clearly less efficient than either of the other two approaches. In summary, the results of this section demonstrate that AltAltp is superior to AltAlt plus post-processing.

[Figure 18: Analyzing the effect of the Pushup procedure. (a) Plan length in steps on the Logistics domain (AIPS-00) for AltAltp, AltAltp without Pushup, and AltAltp with aggressive Pushup. (b) Running time on Logistics. (c) Plan length in steps on the Satellite domain (AIPS-02). (d) Running time on Satellite.]

One might wonder if the plans generated by AltAltp can also benefit from the post-processing phase. We have investigated this issue and found that
the specific post-processing routines that we used do not produce any further improvements. The main reason for this behavior is that the Pushup procedure already tries to exploit any opportunity for shortening the plan length by promoting actions up in the partial plan.

[Figure 19: Plots showing the utility of using parallel planning graphs in computing the heuristics, and characterizing the overhead incurred by AltAltp in serial domains. (a) Utility of using parallel planning graphs: plan length in steps on the Logistics domain (AIPS-00) with a serial vs. a parallel planning graph. (b) Solving a serial domain: running time of AltAltp vs. AltAlt on the BlocksWorld domain (AIPS-00).]

As an illustrative example, we show in Figure 17 the parallel plan output by AltAltp for a problem from the Logistics domain (logistics-4-1 from Bacchus, 2001), and the result of post-processing this solution. Although the two solutions differ in terms of step contents,
we notice that they have the same step length. The difference in step contents
can be explained by the fact that the de-ordering algorithm relaxes the ordering
relations in the plan, allowing for some actions to come earlier, while Pushup
always moves actions towards the end of the plan. We have run more comprehensive
studies in three different domains (Logistics, Satellite and Zenotravel),
and found that in no case is the step length of a plan produced by AltAltp
improved by the post-processing routine (we omit the comparison plots since
they essentially show the curves corresponding to AltAltp and AltAltp with
post-processing coincident).12

4.3 Ablation Studies

This section analyzes the impact of the different parts of AltAltp on its performance.

Utility of the Pushup Procedure: Figure 18 shows the effects of running AltAltp with and without the Pushup procedure (but with the fattening procedure), as well as running it with a more aggressive version of Pushup which, as described in Section 3, re-expands all the nodes in the search branch after an action has been pushed up. We can see that running AltAltp with the Pushup
and fattening procedures is better than just the latter. Comparison of results
in Figure 15(a) and Figure 18(a) shows that even just the fattening procedure
performs better than the original AltAlt. In Figure 18(b) we can see that
although the Pushup procedure does not add much overhead, the aggressive
version of Pushup does get quite expensive. We also notice that only around
20 problems are solved within time limits

12. We also verified this with at least one problem in all the other domains.


with aggressive Pushup. The plots in Figure 18(c) and Figure 18(d) show the results of the same experiments in the Satellite domain. We see that the situation is quite similar in this domain. We can conclude, then, that the Pushup procedure, used to offset the greediness of the algorithm, achieves its purpose.

Utility of basing heuristics on Parallel Planning Graphs: We can see in Figure 19(a) that using a parallel planning graph as the basis for deriving heuristic estimates in AltAltp is a winning idea. The serial planning graph overestimates the heuristic values in terms of steps, producing somewhat longer parallel solutions. The fact that the version using the serial planning graph runs out of time on many problems also demonstrates that running times are improved by the use of parallel planning graphs.

Comparison to AltAlt: One final concern is how much of an extra computational hit is taken by the AltAltp algorithm in serial domains (e.g., Blocks-World from Bacchus, 2001). We expect the overhead to be negligible, and to confirm our intuition we ran AltAltp on a set of problems from the sequential Blocks-World domain. We see from Figure 19(b) that the time performance of AltAlt and AltAltp is equivalent for almost all of the problems.

5. Related Work

The idea of partial exploration of parallelizable sets of actions is not new (Kabanza, 1997; Godefroid & Kabanza, 1991; Do & Kambhampati, 2001). It has been studied in the area of concurrent and reactive planning, where one of the main goals is to approximate optimal parallelism. However, most of the research there has focused on forward chaining planners (Kabanza, 1997), where the state of the world is completely known. It has been implied that backward-search methods are not suitable for this kind of analysis (Godefroid & Kabanza, 1991) because the search nodes correspond to partial states. We have shown that backward-search methods can also be used to approximate parallel plans in the context of classical planning.

Optimization of plans according to different criteria (e.g., execution time, quality) has also been done as a post-processing step. The post-processing of a given plan to maximize its parallelism has been discussed by Backstrom (1998). Reordering and de-ordering techniques are used to maximize the parallelism of the plan: in de-ordering, ordering relations can only be removed, not added, while in reordering, arbitrary modifications to the plan are allowed. In the general case this problem is NP-hard and difficult to approximate (Backstrom, 1998). Furthermore, as discussed in Sections 1 and 4, post-processing techniques are concerned only with modifying the order among the existing actions of a given plan. Our approach not only considers modifying such orderings but also inserts new actions online, which can reduce the number of parallel steps in the overall plan.
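To illustrate the de-ordering idea, the following is a simplified Python sketch, not Backstrom's actual algorithm: the Step structure and the exact removability test are our own assumptions. An ordering between two steps can only be dropped when neither step supplies or clobbers the other:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    pre: frozenset   # preconditions
    add: frozenset   # add effects
    dele: frozenset  # delete effects

def can_deorder(a, b):
    """True if the ordering a -> b may be removed: b consumes no effect of a,
    and neither step deletes the other's preconditions or add effects.
    Steps left unordered with respect to each other can run in parallel."""
    causal = bool(a.add & b.pre)
    clobber = bool(a.dele & (b.pre | b.add)) or bool(b.dele & (a.pre | a.add))
    return not (causal or clobber)
```

Since de-ordering can only remove such constraints, it cannot recover parallelism that the sequential plan's action choices have already ruled out.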

We have already discussed Graphplan based planners (Long & Fox, 1999; Kautz
& Selman, 1999), which return optimal plans based on the number of time steps.
Graphplan uses IDA* to include the greatest number of parallel actions at
each time step of the search. However, this iterative procedure is very time
consuming and it does not provide any guarantee on the number of actions
in its final plans. There have been a few attempts to minimize the number
of actions in these planners (Huang, Selman, & Kautz, 1999) by


using domain control knowledge based on the generation of rules for each specific planning domain. The Graphplan algorithm tries to maximize its parallelism by satisfying most of the subgoals at each time step; if the search fails, it backtracks and reduces the set of parallel actions being considered at the previous level. AltAltp does the opposite: it tries to guess initial parallel nodes given the heuristics, and iteratively adds as many actions to these nodes as possible with the Pushup procedure later during search.

More recently, there has been some work on generalizing forward state search to handle action concurrency in metric temporal domains. Of particular relevance to this work are Temporal TLPlan (Bacchus & Ady, 2001) and SAPA (Do & Kambhampati, 2001). Both of these planners are designed specifically for handling metric temporal domains, and use similar search strategies. The main difference between them is that Temporal TLPlan depends on hand-coded search control knowledge to guide its search, while SAPA (like AltAltp) uses heuristics derived from (temporal) planning graphs. As such, both planners can be co-opted to produce parallel plans in classical domains. Both do a forward chaining search, and like AltAltp, both achieve concurrency incrementally, without projecting sets of actions, in the following way. Normal forward search planners start with the initial state S0, corresponding to time t0, consider all actions that apply to S0, and choose one, say a1, applying it to S0 to obtain S1. They simultaneously progress the "system clock" from t0 to t1. In order to allow for concurrency, the planners of Bacchus and Ady (2001) and Do and Kambhampati (2001) essentially decouple "action application" and "clock progression": at every point in the search there is a nondeterministic choice between progressing the clock and applying (additional) actions at the current time point. From the point of view of these planners, AltAltp can be seen as providing heuristic guidance for this nondeterministic choice (modulo the difference that AltAltp does regression search). The results of empirical comparisons between AltAltp and SAPA, which show that AltAltp outperforms SAPA, suggest that the heuristic strategies employed in AltAltp, including the incremental fattening and the Pushup procedure, can be gainfully adapted to these planners to increase the concurrency in their solution plans. Finally, HSP*, and TP4, its extension to temporal domains, are both heuristic state search planners using regression that are capable of producing parallel plans (Haslum & Geffner, 2000). TP4 can be seen as the regression version of the approach used in SAPA and Temporal TLPlan. Our experiments, however, demonstrate that neither of these planners scales well in comparison to AltAltp.
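The decoupling of action application from clock progression described above can be sketched as follows. This is a minimal Python illustration under our own simplified STRIPS-style state model, not the actual SAPA or Temporal TLPlan successor function:

```python
from collections import namedtuple

Action = namedtuple("Action", ["name", "pre", "add", "dele"])

def successors(state, clock, actions):
    """Each search node is a (state, clock) pair.  One kind of branch applies
    a single applicable action at the current time point (the clock stays put,
    leaving room for further concurrent actions); the other kind advances the
    clock without applying anything."""
    succs = [((state | a.add) - a.dele, clock)
             for a in actions if a.pre <= state]
    succs.append((state, clock + 1))  # progress the "system clock"
    return succs
```

Heuristic guidance then chooses among these branches; in AltAltp's regression setting, the fattening and Pushup steps play the analogous role.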

The Pushup procedure can be seen as a plan compression procedure. As such, it is similar to other plan compression procedures such as "double-back optimization" (Crawford, 1996). One difference is that double-back is used in the context of local search, while Pushup is used in the context of systematic search. Double-back could also be applied to any finished plan or schedule, but as with any other post-processing approach, its outcome would depend strongly on the plan given as input.

6. Concluding Remarks

Motivated by the acknowledged inability of heuristic search planners to generate parallel plans, we have developed and presented an approach to generating parallel plans in the context of AltAlt, a heuristic state space planner. This is a challenging problem because of


the exponential
branching factor incurred by naive methods. Our approach avoids this blowup
through greedy, online parallelization of the evolving partial plans. A plan
compression procedure called Pushup is used to offset the ill effects of the
greedy search. Our empirical results show that, in comparison to other planners
capable of producing parallel plans, AltAltp is able to provide parallel plans
of reasonable quality in a fraction of the time taken by competing approaches.
Our approach also seems to provide better quality plans than can be achieved by
post-processing sequential plans. These results show that AltAltp provides an
attractive tradeoff between quality and efficiency in the generation of
parallel plans.

In the future, we plan to adapt the AltAltp approach to metric temporal
domains, where the need for concurrency is more pressing. One idea is to
transfer some of the sources of strength of AltAltp to SAPA, a metric temporal
planner being developed in our group (Do & Kambhampati, 2001).

Acknowledgments

We thank Minh B. Do and XuanLong Nguyen for helpful discussions and feedback.
We also thank David Smith and the JAIR reviewers for many constructive
comments. This research is supported in part by the NASA grants NAG2-1461 and
NCC-1225, and the NSF grant IRI-9801676.

References

Bacchus, F. (2001). The AIPS'00 planning competition. AI Magazine, 22(3), 47-56.

Bacchus, F., & Ady, M. (2001). Planning with resources and concurrency: a
forward chaining approach. In Proceedings of IJCAI-01, pp. 417-424.

Backstrom, C. (1998). Computational aspects of reordering plans. Journal of
Artificial Intelligence Research, 9, 99-137.

Blum, A., & Furst, M. (1997). Fast planning through planning graph analysis.
Artificial Intelligence, 90, 281-300.

Bonet, B., & Geffner, H. (1999). Planning as heuristic search: new results. In
Proceedings of ECP-99.

Bonet, B., Loerincs, G., & Geffner, H. (1997). A robust and fast action
selection mechanism for planning. In Proceedings of AAAI-97, pp. 714-719. AAAI
Press.

Crawford, J. (1996). An approach to resource-constrained project scheduling. In
Proceedings of the 1996 Artificial Intelligence and Manufacturing Research
Planning Workshop. AAAI Press.

Do, M. B., & Kambhampati, S. (2000). Solving planning graph by compiling it
into a CSP. In Proceedings of AIPS-00, pp. 82-91.

Do, M. B., & Kambhampati, S. (2001). SAPA: a domain-independent heuristic
metric temporal planner. In Proceedings of ECP-01.

Fikes, R., & Nilsson, N. (1971). STRIPS: a new approach to the application of
theorem proving to problem solving. Artificial Intelligence, 2(3-4), 189-208.

Gerevini, A., & Serina, I. (2002). LPG: a planner based on local search for
planning graphs. In Proceedings of AIPS-02. AAAI Press.

Godefroid, P., & Kabanza, F. (1991). An efficient reactive planner for
synthesizing reactive plans. In Proceedings of AAAI-91, Vol. 2, pp. 640-645.
AAAI Press/MIT Press.

Haslum, P., & Geffner, H. (2000). Admissible heuristics for optimal planning.
In Proceedings of AIPS-00, pp. 140-149.

Haslum, P., & Geffner, H. (2001). Heuristic planning with time and resources.
In Proceedings of ECP-01. Springer.

Huang, Y., Selman, B., & Kautz, H. (1999). Control knowledge in planning:
benefits and tradeoffs. In Proceedings of AAAI/IAAI-99, pp. 511-517.

Kabanza, F. (1997). Planning and verifying reactive plans (position paper). In
Proceedings of the AAAI-97 Workshop on Immobots: Theories of Action, Planning
and Control.

Kambhampati, S., & Sanchez, R. (2000). Distance based goal ordering heuristics
for Graphplan. In Proceedings of AIPS-00, pp. 315-322.

Kautz, H., & Selman, B. (1996). Pushing the envelope: planning, propositional
logic, and stochastic search. In Proceedings of AAAI-96, pp. 1194-1201. AAAI
Press.

Kautz, H., & Selman, B. (1999). Blackbox: unifying SAT-based and graph-based
planning. In Proceedings of IJCAI-99.

Koehler, J. (1999). RIFO within IPP. Tech. rep. 126, University of Freiburg.

Korf, R. (1993). Linear-space best-first search. Artificial Intelligence, 62,
41-78.

Long, D., & Fox, M. (1999). Efficient implementation of the plan graph in
STAN. Journal of Artificial Intelligence Research, 10, 87-115.

Long, D., & Fox, M. (2002). The 3rd international planning competition:
results and analysis. To appear in JAIR.

McDermott, D. (1999). Using regression-match graphs to control search in
planning. Artificial Intelligence, 109(1-2), 111-159.

McDermott, D. (2000). The 1998 AI planning systems competition. AI Magazine,
21(2), 35-55.

Nguyen, X., & Kambhampati, S. (2000). Extracting effective and admissible
heuristics from the planning graph. In Proceedings of AAAI/IAAI-00, pp.
798-805.

Nguyen, X., Kambhampati, S., & Sanchez, R. (2002). Planning graph as the basis
for deriving heuristics for plan synthesis by state space and CSP search.
Artificial Intelligence, 135(1-2), 73-123.

Sanchez, R., Nguyen, X., & Kambhampati, S. (2000). AltAlt: combining the
advantages of Graphplan and heuristic state search. In Proceedings of KBCS-00,
Bombay, India.