An Architectural Approach to Ensuring Consistency in Hierarchical Execution

R. E. Wray and J. E. Laird

Volume 19, 2003


Journal of Artificial Intelligence Research 19 (2003) 355-398. Submitted 10/02; published 10/03.

An Architectural Approach to Ensuring Consistency in Hierarchical Execution

Robert E. Wray (wrayre@acm.org)
Soar Technology, Inc., 3600 Green Court, Suite 600, Ann Arbor, MI 48105 USA

John E. Laird (laird@umich.edu)
The University of Michigan, 1101 Beal Avenue, Ann Arbor, MI 48109 USA

Abstract
Hierarchical task decomposition is a method used in many agent systems to organize agent knowledge. This work shows how the combination of a hierarchy and persistent assertions of knowledge can lead to difficulty in maintaining logical consistency in asserted knowledge. We explore the problematic consequences of persistent assumptions in the reasoning process and introduce novel potential solutions. We implemented one of the possible solutions, Dynamic Hierarchical Justification, and demonstrate its effectiveness with an empirical analysis.

1. Introduction
The process of executing a task by dividing it into a series of hierarchically organized subtasks is called hierarchical task decomposition. Hierarchical task decomposition has been used in a large number of agent systems, including the Adaptive Intelligent Systems architecture (Hayes-Roth, 1990), ATLANTIS (Gat, 1991a), Cypress (Wilkins et al., 1995), the Entropy Reduction Engine (Bresina, Drummond, & Kedar, 1993), the Procedural Reasoning System (Georgeff & Lansky, 1987), RAPS (Firby, 1987), Soar (Laird, Newell, & Rosenbloom, 1987; Laird & Rosenbloom, 1990), and Theo (Mitchell, 1990; Mitchell et al., 1991), and is a cornerstone in belief-desire-intention-based agent implementations (Rao & Georgeff, 1991; Wooldridge, 2000). Hierarchical task decomposition helps both an agent's knowledge developer and the agent itself manage environmental complexity. For example, an agent may consider high-level tasks such as "find a power source" or "fly to Miami" independent of low-level subtasks such as "go east 10 meters" or "turn to heading 135." The low-level tasks can be chosen dynamically based on the currently active high-level tasks and the current situation; thus the high-level task is progressively decomposed into smaller subtasks. This division of labor simplifies the design of agents, thus reducing their cost. Additional advantages of hierarchical task decomposition include knowledge sharing (a low-level subtask can be invoked for many different high-level procedures), modularity (the decomposition helps insulate subtasks from interaction with other knowledge), and the naturalness of this representation (Simon, 1969).
Without careful design, it can be difficult to ensure consistent reasoning in agents employing hierarchical task decompositions. By "consistency," we mean that reasoning does not lead to a set of assertions that contains a contradiction. Ensuring consistency becomes much more difficult, and thus more costly, as the complexity of an agent's knowledge grows. Although this problem can be solved through careful design of agent knowledge, such an approach requires an understanding of all possible interactions in the hierarchy. Thus, the correctness of this solution depends on the skill and vigilance of the knowledge engineer. Our bias is to seek solutions in which the operation of an agent's primitive memories and processes is structured to ensure inconsistencies do not arise. Thus, we will prefer architectural solutions to knowledge-based ones. Architectural solutions can guarantee consistency for all tasks and domains, reducing brittleness due to omissions in task knowledge. Further, while developing an architectural solution may be costly, it should be less costly than repeatedly developing knowledge-based solutions for different domains.

The following sections describe the inconsistency problem and introduce a space of solutions to the problem, including two novel solutions. Through both theoretical and empirical analysis, one of the new solutions, Dynamic Hierarchical Justification, is shown to provide an efficient architectural solution to the problem of ensuring reasoning consistency in hierarchical execution.
2. Maintaining Reasoning Consistency in Hierarchical Agents

This section describes the inconsistency problem in greater detail. We review methods for ensuring consistency in non-hierarchical systems and discuss the limitations of these approaches in hierarchical systems.

2.1 Consistency in Non-hierarchical Systems

Truth maintenance systems (TMSs) are often used to maintain consistency in non-hierarchical systems (Doyle, 1979; McDermott, 1991; Forbus & de Kleer, 1993). An inference engine uses domain knowledge to create two different kinds of assertions of knowledge in an agent's knowledge base: assumptions and entailments. The inference engine enables assumptions that it has decided to treat as being true, without requiring that the assertion be justified. Agents often treat environmental percepts as assumptions or "unquestioned beliefs" (Shoham, 1993). Entailments are justified assertions. A data structure, the justification, captures the reasons for asserting the entailment. When the reasons no longer hold (the entailment is no longer justified), the TMS retracts it from the set of asserted beliefs. Thus, a TMS automatically manages the assertion and retraction of entailments as an agent's situation changes, ensuring all entailments are consistent with the external environment and the enabled assumptions.
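To make the assumption/entailment distinction concrete, here is a minimal sketch in Python (our own illustration, not code from the paper or from any particular TMS; all class and method names are invented): entailments are believed only while some justification still holds, while enabled assumptions persist until the inference engine explicitly disables them.

    class TMS:
        # Minimal justification-based truth maintenance (illustrative sketch only).

        def __init__(self):
            self.assumptions = set()   # enabled by the inference engine; not justified
            self.justifications = {}   # entailment -> list of sets of supporting assertions

        def enable_assumption(self, name):
            self.assumptions.add(name)

        def disable_assumption(self, name):
            self.assumptions.discard(name)

        def add_entailment(self, name, support):
            # 'support' is the set of assertions used to derive 'name'
            self.justifications.setdefault(name, []).append(frozenset(support))

        def beliefs(self):
            # An entailment is believed only while at least one of its justifications
            # is fully contained in the current belief set; assumptions always are.
            believed = set(self.assumptions)
            changed = True
            while changed:
                changed = False
                for e, justs in self.justifications.items():
                    if e not in believed and any(j <= believed for j in justs):
                        believed.add(e)
                        changed = True
            return believed

    tms = TMS()
    tms.enable_assumption("block-3-is-clear")
    tms.add_entailment("can-stack-on-block-3", ["block-3-is-clear"])
    assert "can-stack-on-block-3" in tms.beliefs()
    tms.disable_assumption("block-3-is-clear")                # context changes
    assert "can-stack-on-block-3" not in tms.beliefs()        # entailment retracted automatically

Managing the assumptions themselves (deciding when block-3-is-clear should be disabled) remains the job of domain knowledge, which is exactly the issue developed in the remainder of this section.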
Careful construction of the domain knowledge is required to ensure that no enabled assumptions are contradictory. For example, if some assumption is inconsistent with the current input, then the agent must have domain knowledge that recognizes the situation and removes the assumption. Thus, when an agent utilizes a TMS, the problem of maintaining consistency in reasoning is largely one of managing assumptions through the agent's domain knowledge.

Assumptions often reflect hypothetical reasoning about the world (hence "assumptions"). However, assumptions can be used to represent any persistent feature. Although researchers have explored structuring the external environment to provide persistent memory (Agre & Horswill, 1997), internal, persistent memory is usually necessary in agent domains. For example, persistence is required for hypothetical reasoning, nonmonotonic revisions of assertions (such as when counting), and remembering.

Figure 1: A hierarchical agent.

2.2 Truth Maintenance in Hierarchical Agents
The TMS agent framework introduced above can be extended to hierarchical agent architectures. In such an agent, the inference engine and TMS are more or less identical to those of a non-hierarchical agent. When the agent initiates a new subtask via dynamic hierarchical task decomposition, it also creates a new database that will contain assumptions and entailments specific to the subtask. Further decomposition can result in a stack of subtasks, each containing entailments and assumptions specific to the subtask, as shown in Figure 1. We consider the creation and deletion of these distinct databases of assertions the sine qua non of a hierarchical architecture. The architecture decomposes the task not only by identifying relevant subtasks, but also by dynamically organizing its memory according to the current decomposition.

A new system component, "hierarchy maintenance," is responsible for creating and destroying the subtask databases when subtasks begin and terminate. When a subtask is achieved (or determined to be no longer worth pursuing), hierarchy maintenance responds by immediately removing all the assertions associated with the subtask. This function is of central importance in hierarchical architectures because it allows the agent to automatically retract all assertions associated with a terminated subtask, requiring no agent knowledge to "clean up" or remove individual assertions associated with the terminated subtask. The hierarchy maintenance component can efficiently remove the assertions because they are (conceptually) located in a distinct unit, the subtask database.
Figure 2: An example of hierarchy maintenance. Assumptions ("a") and entailments ("e") are asserted within subtasks.
An agent's hierarchy maintenance function can be employed to help maintain consistency, illustrated notionally in Figure 2. The agent architecture identifies assertions at each of the higher levels in the hierarchy that led to a new subtask. These assertions together form a subtask "support set." In Figure 2, assertions a14, a15, e14, and e19 form the support set for Subtask 2, while a12, a22, and e22 "support" Subtask 3. These support sets, in effect, form justifications for subtasks in the hierarchy. When an assertion in a support set is removed (e.g., a22), the agent responds by removing the subtask (Subtask 3). While not all hierarchical architectures use architectural processes to create and destroy subtask databases, this example illustrates how an architectural hierarchy maintenance function can be realized via a process similar to that of justification in truth maintenance.
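The following sketch (ours, with invented names, assuming the subtask-database model just described) illustrates how an architectural hierarchy-maintenance function might retract a subtask, and everything beneath it, as soon as any member of its support set disappears; the example mirrors the Subtask 3 retraction from Figure 2.

    class Subtask:
        def __init__(self, name, support_set):
            self.name = name
            self.support_set = set(support_set)   # context assertions that justify this subtask
            self.assertions = set()               # local assumptions and entailments

    class HierarchicalAgent:
        def __init__(self):
            self.top_state = set()   # level-1 assertions (e.g., percepts)
            self.stack = []          # subtask stack; index 0 sits just below the top level

        def push_subtask(self, name, support_set):
            self.stack.append(Subtask(name, support_set))

        def assertions_above(self, index):
            context = set(self.top_state)
            for sub in self.stack[:index]:
                context |= sub.assertions
            return context

        def maintain_hierarchy(self):
            # Retract the first subtask whose support set no longer holds,
            # together with every subtask beneath it (re-run after further changes).
            for i, sub in enumerate(self.stack):
                if not sub.support_set <= self.assertions_above(i):
                    del self.stack[i:]
                    return

    # Removing a22 from Subtask 2 removes Subtask 3, whose support set includes a22.
    agent = HierarchicalAgent()
    agent.top_state |= {"a12", "a14", "a15", "e14", "e19"}
    agent.push_subtask("Subtask 2", support_set={"a14", "a15", "e14", "e19"})
    agent.stack[0].assertions |= {"a21", "a22", "e22"}
    agent.push_subtask("Subtask 3", support_set={"a12", "a22", "e22"})
    agent.stack[0].assertions.discard("a22")
    agent.maintain_hierarchy()
    assert [s.name for s in agent.stack] == ["Subtask 2"]

In this sketch the support set is fixed at subtask creation; the Dynamic Hierarchical Justification solution of Section 3.3 differs precisely in updating it as local reasoning proceeds.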
Within a specific subtask, reason maintenance can go on as before. However, the hierarchical structure adds a complication to the maintenance of logical consistency. Assumptions at some level in the hierarchy can be dependent on entailments and assumptions in higher levels of the hierarchy.(1) This dependence relationship is suggested in Figure 1 by the curved lines extending from one subtask to the one below it. Higher levels of the hierarchy form a "context" for reasoning in the local subtask.

(1) Assumptions in a lower-level subtask are always at least indirectly dependent on the higher level assertions. This observation will be exploited in Section 3.3.

For execution agents embedded in dynamic domains, the hierarchical context may change at almost any time. The changing context is not problematic for entailments; the
TMS can readily determine dependent context changes and retract affected entailments. However, changes in higher levels of the hierarchy (such as those deriving from inputs) may also invalidate the assumptions of lower levels. Without any additional architectural mechanisms, domain knowledge is required to ensure consistency among assumptions and the hierarchical context, as in non-hierarchical systems. The domain knowledge for ensuring consistency in the assumptions is complicated by the necessity of spanning multiple (possibly many) subtasks. We refer to such knowledge as "across-level" consistency knowledge. As described in further detail below, identifying and creating across-level consistency knowledge is a tedious, costly, and often incomplete process. Across-level knowledge must explicitly consider the interactions between different subtasks (in different levels of the hierarchy), rather than focus solely on the local subtask, compromising the benefit of the hierarchical decomposition.
Before continuing, we note that hierarchical architectures should be contrasted with hierarchical task network (HTN) planners (Sacerdoti, 1975; Erol, Hendler, & Nau, 1994) and execution-oriented systems that use HTN representations, such as DECAF (Graham & Decker, 2000) and RETSINA (Sycara, Decker, Pannu, Williamson, & Zeng, 1996). A planning problem for an HTN planner is represented by an initial task network that can consist of primitive and non-primitive tasks. The planner uses operators to find a plan to solve the tasks. Methods allow the planner to match non-primitive tasks with other task networks that describe how to accomplish the task; thus, methods enable hierarchical decomposition of the planning problem into a family of connected task networks.

The main difference between HTN systems and hierarchical architectures is that the planner represents its plan in a single global state. That is, while methods represent decomposition steps, the hierarchical structure of an evolving plan is represented in a blackboard-like database that does not also reflect the structure of the decomposition. The following sections discuss problems and especially solutions that depend on the hierarchical organization of asserted knowledge during execution, in addition to a hierarchical task decomposition encoded as an agent's task knowledge. Thus, the following will not be generally applicable to HTN-based execution systems. However, HTN systems need to address the inconsistency problem; Section 5.1.1 examines the consequences of global state with respect to inconsistency arising from persistence in a hierarchy.
2.3 Failing to Respond to Relevant Changes in Hierarchical Context

As mentioned in the introduction, when an agent fails to respond to a relevant change in its hierarchical context and leaves a now-inconsistent assumption enabled, the resulting behavior can become irrational; that is, not consistent with its knowledge. This section explores how such irrational behavior can arise with several illustrative examples.

2.3.1 The Blocks World

We use a variant of the blocks world to illustrate the inconsistency problem in a domain familiar to most readers. This domain is an execution domain rather than a planning domain, which we call the "Dynamic Blocks World" to reflect this difference from the static
blocks world used in planning. We assume the agent has knowledge to build an ordered tower (1-on-2-on-3) without resorting to planning and uses hierarchical task decomposition to determine what actions to take as it builds the tower.

Figure 3: Failing to respond to relevant changes in hierarchical context in the Dynamic Blocks World.
In Figure 3, the agent is placing block-2 on the table, in order to reach block-3 and begin the goal tower. The put-down subtask finds an empty location on the table. The agent places the empty assertion in the memory associated with the put-down subtask. In the figure, the space immediately to the left of the gripper was chosen. Whether or not a space is empty may not be directly observable but may need to be inferred from a number of other facts in the domain and stored as an assumption in memory. Assume the empty assertion is an assumption. Now, assume block-3 is suddenly placed underneath block-2. The result is an inconsistency between the assumption (the location is a good place to put block-2) and the hierarchical context (the location is no longer a good place to put the block on the table).

If the agent fails to recognize that block-3 has moved, it will attempt to put block-2 into the same location occupied by block-3. This behavior is irrational, or not consistent with the agent's goals and knowledge (assuming the agent has knowledge that indicates that blocks should not be placed in positions already occupied by other blocks). The inconsistency arises because the agent has failed to recognize that its previously-derived assumption (empty) is no longer true in the current situation.
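A toy rendering of this failure (our illustration; the predicates and location names are invented) shows why ordinary truth maintenance does not help here: because empty is cached as an assumption rather than derived as an entailment, nothing connects it to the changed context, and action selection happily uses the stale belief.

    # Percepts (hierarchical context) and the put-down subtask's local database.
    percepts = {"holding(block-2)", "at(block-3, initial-spot)"}
    put_down = {"assumption: empty(left-of-gripper)"}      # derived earlier, then cached

    def choose_action():
        # Local reasoning trusts the cached assumption rather than re-deriving it.
        if "assumption: empty(left-of-gripper)" in put_down:
            return "move-gripper(left-of-gripper)"
        return "find-new-space"

    # The world changes: block-3 is slid under the gripper's target location.
    percepts.discard("at(block-3, initial-spot)")
    percepts.add("at(block-3, left-of-gripper)")

    # Nothing ties the assumption to the percepts, so it persists, and the agent
    # tries to put block-2 where block-3 now sits.
    assert choose_action() == "move-gripper(left-of-gripper)"   # irrational choice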
Although this example may appear contrived, this specific situation arose in an experimental system developed to explore architecture and learning issues. Of course, it is possible in such a simple domain to reformulate the task such that the problem does not occur. This reformulation of the task via changes to or additions of knowledge is exactly the solution we wish to avoid. That is, we desire that the architecture guarantee consistency between the hierarchical context and local assumptions, such that the architecture provides a priori constraints (guidance) in the knowledge development process and increased robustness in execution (via consistency). The conclusion returns to this example to describe how an architectural solution to the inconsistency problem solves this particular problem, without requiring any reformulation of the agent's task knowledge.

Figure 4: Decomposition of behavior into subtasks.
2.3.2 TacAir-Soar

TacAir-Soar agents pilot virtual military aircraft in a complex, real-time computer simulation of tactical combat (Tambe et al., 1995; Jones et al., 1999). The TacAir-Soar domain is only indirectly accessible (each agent uses simulated aircraft sensor models and can perceive only what a pilot in a real aircraft would sense), nondeterministic (from the point of view of the agent, the behavior of other agents cannot be strictly predicted or anticipated), non-episodic (the decisions an agent makes early in the simulation can impact later options and capabilities), dynamic (the world changes in real time while the agent is reasoning), and continuous (individual inputs have continuous values). Domains with these characteristics are the most difficult ones in which to create and apply agents (Russell & Norvig, 1995). The domain knowledge of TacAir-Soar agents is organized into over 450 subtasks; during execution, the resulting hierarchical task decomposition sometimes reaches depths of greater than 10 subtasks. Each agent can have one of several different mission roles, among them flying a patrol mission and acting as a partner or "wing" to some other agent's "lead."

Consider a pair of planes on patrol, which have been given specific instructions for engaging enemy aircraft. When enemy aircraft enter the patrol area, the lead agent decides to intercept the aircraft. The lead then decomposes the intercept into a series of situation-dependent subtasks, which themselves may be further decomposed. For example, Figure 4 shows that the complex task of intercepting an enemy aircraft has been decomposed into a decision to turn the agent's aircraft to a specific heading. The agent turns to this heading in order to get close enough to the enemy agent (via achieve-proximity) to launch an attack.

Assume three different kinds of attack can be chosen for an intercept. The first tactic (scare) is to engage and attempt to scare away enemy planes without using deadly force. This tactic is selected when the rules of engagement specify that deadly force should not be used, regardless of the number of aircraft in the area. One of the remaining two tactics will
be chosen when deadly force is allowed. Offensive attack is appropriate when friendly planes outnumber or equal enemy planes. Defensive attack is used when enemy planes outnumber friendly planes.

Figure 5: Trace of behavior leading to intercept tactic in TacAir-Soar.

Figure 6: Inconsistency due to persistence.

Choosing between offensive and defensive attack requires counting the current aircraft
in the area. Figure 5 shows the evolution of an executing example decomposition. The agent must count relevant enemy and friendly planes. Determining a plane's "side" and its relevance to the count often requires remembering and is sufficiently complex that entailment of the count is not possible. For instance, non-combatant aircraft should not be counted, requiring some reasoning about the type of each aircraft. If the agent determines that enemy planes outnumber friendly ones, the agent selects defensive-attack, leading to further decomposition.

What happens if an enemy plane flees, thus reducing the actual count of relevant enemy planes by one? The count maintained by the agent is now invalid. Standard TMS mechanisms are insufficient because the count was asserted as an assumption. If the actual number of enemy and friendly planes is now equal, then the agent should switch its tactic to offensive attack. Continuing the defensive attack is not consistent with the agent's knowledge. Additionally, other "friendly" agents participating in the attack may base their behavior on the expectation that the agent is pursuing an offensive attack. Thus the agent needs to recognize the inconsistency and remove the current count.
Figure 6 presents a conceptual illustration of the problem. Assumptions are represented as squares, entailments as circles. The horizontal line represents a hierarchical relationship between the assertions (i.e., assumptions and entailments) in the hierarchical context (above the line) and assertions in a local subtask (below the line). The arrowed lines represent dependence in the creation of an assertion. As in the previous examples, some reasoning in the subtask may require persistence, leading to the creation of an assumption such as assumption 1. However, the persistent assertion may still depend on other assertions. This work focuses on the dependent assertions in the higher level context, such as A, B, C, D, and E1 in the figure.

Suppose the world changes so that E1 is retracted from memory and E2 is asserted. Assumption 1 remains in memory. If E2 would not also lead to 1 (e.g., it could lead to some new assumption 2, as shown), then 1 is no longer justified and may not be consistent with the higher level context. Whether or not this potential inconsistency among the assertions leads to inconsistent behavior depends on the use of assumption 1 in later reasoning.
3. Solutions

Our goal is to develop architectural solutions that allow an agent to support persistent assumptions and simultaneously avoid inconsistencies across the hierarchical context that can lead to irrational behavior. Before introducing two new architectural solutions, however, we examine knowledge-based approaches and their consequences in order to provide further rationale for the architectural approach.

3.1 Knowledge-based Solutions
Inconsistency can be avoided in hierarchical agents by creating domain knowledge that recognizes potential inconsistencies and responds by removing assumptions. Many planning and agent systems use explicit domain knowledge to represent knowledge about the interactions among assertions in the world. For example, the Entropy Reduction Engine (ERE) (Bresina et al., 1993) is one agent system that relies on this knowledge-based assumption consistency (KBAC). ERE requires domain constraints, or knowledge that describes the physics of the task domain. Domain constraints identify impossible conditions. For instance, a domain constraint would indicate that a robot cannot occupy two different physical locations simultaneously.

In ERE, domain constraints are specifically used "to maintain consistency in the current world model state during execution" (Bresina et al., 1993, p. 166). However, many other architectures use KBAC as well (perhaps in conjunction with other methods). KBAC knowledge can be viewed simply as domain knowledge that must be added to the system to achieve consistent behavior.

KBAC will always be necessary to maintain consistency among the assumptions within a level of the hierarchy. However, in order to guarantee consistency for assumptions distributed throughout the hierarchy, all possible interactions leading to inconsistency must be identified throughout the hierarchy. This knowledge engineering problem can add significant cost to agent development. A knowledge designer must not only specify the conditions under which an assumption is asserted but also all the conditions under which it must be removed. In the TacAir-Soar interception example, when the enemy plane flees, the agent requires knowledge that disables all assumptions that depend upon the number of enemy airplanes. Similarly, in the Dynamic Blocks World, the agent must have knowledge that recognizes any situation, in any subtask, that should cause the disabling of empty. In both cases, this KBAC knowledge "crosses" levels of the hierarchy. A complete KBAC solution requires that an agent's knowledge capture all potential dependencies between assumptions in a local subtask and any higher levels in the hierarchy.
Although it will be possible to encode complete across-level consistency knowledge for simple domains, experience in TacAir-Soar and other complex agent systems has convinced us that KBAC requires significant investments of time and energy. Further, because it is often not possible to enumerate all conditions under which an assumption must be removed, agents are also brittle, failing in difficult-to-understand, difficult-to-duplicate ways.

The insufficiency of knowledge-based solutions led us to consider architectural solutions to the problem. Architectural solutions eliminate the need for domain knowledge encoded only to address inconsistency between the hierarchical context and assumptions within a subtask. Thus, the cost of developing individual agents should be reduced. In addition to their generality, architectural solutions are, by definition, also complete, and thus able to guarantee consistency between the hierarchical context and assumptions within a subtask, at all times, for any agent task. Such completeness should improve the robustness of agent systems, especially in situations not explicitly anticipated by their designers.
3.2 Assumption Justification

One potential architectural solution to the inconsistency problem is to justify each assumption in the hierarchy with respect to assertions in higher levels of the hierarchy. Assumption Justification is an extension of the truth maintenance approaches to consistency outlined previously. Each assumption in the hierarchy is treated as if it were an entailment with respect to dependent assertions higher in the hierarchy. A new data structure, the assumption justification, is created that captures the reasons in the hierarchical context for a particular assumption. Locally, an assumption is treated exactly like an assumption in a non-hierarchical system. However, when the assumption justification is no longer supported (indicating a change in the dependent hierarchical context), the architecture retracts the assumption.
Refer again to Figure 2. When the agent asserts a34, the architecture builds an assumption justification for the assumption that includes a22 and a21. If the agent retracts a21, the assumption justification for a34 is no longer supported and the architecture also retracts a34. The architecture ensures reasoning consistency across hierarchy levels because an assumption persists no longer than the context assertions that led to its creation.

Assumption Justification solves the inconsistency problem because all dependencies in the hierarchical context are captured in the justification. Within the subtask, domain knowledge is still required to ensure consistency among the enabled assumptions in the subtask. However, no across-level consistency knowledge is needed. Assumption Justification still supports local nonmonotonic and hypothetical reasoning. Thus, Assumption Justification appears to meet functional evaluation criteria. However, in order to assess its impact on performance, some implementation details must be considered.
3.2.1 Implementing Assumption Justification

Creating assumption justifications requires computing context dependencies for each assumption, similar to the computation of justifications for entailments.(2) Figure 7 outlines a procedure for computing the assumption justification data structure. This procedure is invoked when any assertion is created, and it creates assumption justifications for every assertion in a local subtask; that is, for entailments as well as assumptions. This approach allows the architecture to cache context dependencies for each local assertion. The advantage of this caching is that the architecture can simply concatenate the assumption justifications of the local assertions contributing directly to the creation of the assumption (in #2 of Figure 7).

(2) The Assumption Justification procedure, as presented, requires that the inference engine record a justification for every assertion during the course of processing. In Soar, the architecture in which Assumption Justification was implemented, these calculations are available from its production rule matcher for both assumptions and entailments. However, justification calculations for assumptions may not be supported in other architectures, requiring modifications to the underlying inference engine. Laird and Rosenbloom (1995) and the Soar User's Manual (Laird, Congdon, & Coulter, 1999) describe the specific mechanisms of justification creation in Soar.
PROC create-new-assertion(...)
  {An assumption justification is computed when each new assertion A is created.
   Thus, assumption justifications are computed for both assumptions and entailments.}
  ...
  A_just ← create-justification(...)
    {Justifications can be created via well-known, textbook algorithms
     (e.g., Forbus & de Kleer, 1993; Russell & Norvig, 1995).}
  A_aj ← make-assumption-justification-for-assertion(A)
  ...
END

PROC make-assumption-justification-for-assertion(assertion A)
  AJ ← NIL
  FOR each assertion j in A_just, the justification of A
    #1 IF (Level(j) closer to the root than Level(A))
         AJ ← append(j, AJ)          {add j to the assumption justification}
    #2 ELSE                          {j and A at the same level}
         AJ ← concatenate(j_aj, AJ)  {add the assumption justification of j to the
                                       assumption justification of A}
  return AJ, a list of assertions comprising the assumption justification of A
END

PROC Level(assertion A)
  Return the subtask level associated with assertion A

Figure 7: A procedure for building assumption justifications.
We chose this caching option for computing assumption justifications over an "on-demand" implementation that would, when an assumption was created, recursively follow local dependencies until all context dependencies were determined. The advantage of the caching implementation is that the context dependencies for any assertion must be computed only once, even when a local assertion contributes to the creation of multiple local assumptions.
The procedure that creates an assumption justification loops over the assertions in the justification of a new assertion A. The assertions in the justification can be either context or local assertions. Context assertions, in #1, are added to the assumption justification directly. However, local assertions should not be added to the assumption justification, because the assumption justification should include only context dependencies. For example, the architecture can retract a local assertion for reasons other than a change in the hierarchical context (e.g., a non-monotonic reasoning step or a change in enabled assumptions in the subtask), and in these cases the agent should not necessarily retract dependent assumptions. Because assumption justifications for all local assertions in the justification have already been computed (i.e., they are cached, as described above), the assumption justification of a local assertion j can simply be added to the assumption justification of A, as in #2. For an on-demand implementation, the procedure here would instead recur through local assertions until all context dependencies for local assertions contributing to A had been identified.
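The caching computation of Figure 7 might be sketched in Python as follows (our sketch under the stated assumptions; the Assertion record and its fields are invented for illustration): a new assertion's assumption justification is the union of the context assertions in its justification and the cached assumption justifications of its local supporters.

    class Assertion:
        def __init__(self, name, level, justification=()):
            self.name = name
            self.level = level                        # subtask level; smaller = closer to the root
            self.justification = list(justification)  # assertions used to derive this one
            self.assumption_justification = set()     # cached context dependencies

    def create_new_assertion(name, level, justification):
        a = Assertion(name, level, justification)
        for j in a.justification:
            if j.level < a.level:                     # 1: context assertion, add directly
                a.assumption_justification.add(j)
            else:                                     # 2: local assertion, reuse its cache
                a.assumption_justification |= j.assumption_justification
        return a

    # Context assertion C (level 1) supports local entailment E (level 2),
    # which supports local assumption X (level 2): X depends on C.
    C = create_new_assertion("C", 1, [])
    E = create_new_assertion("E", 2, [C])
    X = create_new_assertion("X", 2, [E])
    assert X.assumption_justification == {C}

Retracting any assertion in X's assumption justification would then retract X itself, which is what restores across-level consistency.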
The worst-case computational complexity of this algorithm is polynomial in the number of assertions in the subtask. The addition of a higher-level assertion can be done in constant time (a single pointer reference). However, #2 must uniquely add context assertions from the assumption justification of the local assertion. There are at most (n - 1) local assumption justifications whenever the nth assertion is created. Thus, the concatenation needs to be performed no more than (n - 1) times for any call to the assumption justification procedure. This limit provides an upper bound of O(n) on the complexity of the assumption justification procedure: the worst-case cost of building an individual assumption justification is linear in the number of assertions, n, in the level. However, the architecture executes the assumption justification procedure for every assertion in the level. Thus, the worst-case cost for building all the justifications in a particular level is O(1 + 2 + ... + n), or O(n^2).
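For reference, the quadratic total is simply the arithmetic series over the linear per-assertion costs (a restatement of the bound above, not an additional result):

    \sum_{k=1}^{n} O(k) = O\left(\frac{n(n+1)}{2}\right) = O(n^{2})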
Figure 8: Technical problems with Assumption Justification. In (a), an assumption replaces another assumption nonmonotonically. In (b), multiple assumption justifications for the same assumption must be supported.

Non-monotonic changes complicate the implementation. The architecture must disable a replaced assumption, rather than delete it, because the initial assumption may need to be restored. For example, in Figure 8 (a), assume that the assertion of E leads to both the assertion of 2 in the local subtask and the retraction of 1 (i.e., 2 is a revision of 1). If the agent retracts E, Assumption Justification will retract 2, as desired, but it must also re-enable 1. Thus, assumption 1 must remain available in memory, although disabled.

Figure 8 (b) illustrates a second problem. An assumption can have multiple assumption justifications. These justifications can change as reasoning progresses. Assumption 1 initially depends on assertions A, B, and C in higher levels. Now assume that later in the processing, the agent removes A, which normally would result in the retraction of 1. However, in the meantime, the context has changed such that 1 is now also justified by {C, D, E}. Now when the agent removes A, the architecture should not immediately retract 1 but must determine if 1 is justified from other sources.
An implementation of Assumption Justification in Soar was completed by members of the Soar research group at the University of Michigan. Experiments using Air-Soar, a flight simulator domain (Pearson et al., 1993), showed that the overhead of maintaining all prior assumptions in a level produced a significant negative impact on agent performance. In this domain, Assumption Justification incurred significant computational cost, requiring at least 100% more time than the original Air-Soar agent. Further, the number of assumption justifications maintained within a level continued to grow during execution, for the reasons explained above. Some subtasks required minutes to execute as the aircraft performed a maneuver, leading to large (and problematic) increases in the amount of memory required. Thus, Assumption Justification failed to meet efficiency requirements on both theoretical and empirical grounds. Although the limitations of Assumption Justification might be improved by developing solutions to its technical problems, we abandoned further exploration of this approach after such strongly discouraging results.
3.3 Dynamic Hierarchical Justification

Figure 2 introduced the notion of a support set for subtasks. Both the Procedural Reasoning System (PRS) (Georgeff & Lansky, 1987) and Soar (Laird et al., 1987) use architectural mechanisms to retract complete levels of a subtask hierarchy when the support set no longer holds. In this section, we consider a solution that leverages the hierarchy maintenance function to ensure consistency between assumptions and the higher level context.

A significant disadvantage of the support set in existing systems is that it is fixed. In Soar, the support set is computed at the initiation of the subtask but is not updated to reflect reasoning that occurs within the subtask. For example, in Figure 2, suppose that assumption a34 depends on assumptions a22 and a21 (represented by the dashed, arrowed lines). The support set does not include a21; this assertion may not have even been present when Subtask 3 was created. When a local assumption depends on an assertion not in the support set, a change in that assertion will not directly lead to the retraction of the assumption (or the subtask). Thus, approaches using such Fixed Hierarchical Justification (FHJ) still require knowledge-based solutions for consistency. FHJ is discussed further in Section 5.1.2.

We propose a novel solution, Dynamic Hierarchical Justification (DHJ), that is similar to Fixed Hierarchical Justification but dynamically updates the support set as reasoning progresses. Assumption justifications for individual assumptions are unnecessary. However, one consequence of this simplification is that a subtask (and all assertions within it) will be retracted when the dependent context changes. Refer again to Figure 2. When a DHJ agent asserts a34, the architecture updates the support set for Subtask 3 to include a21. Assumption a22 is already a member of the support set and need not be added again. When any member of the support set for Subtask 3 changes, the architecture retracts the entire subtask. Thus Dynamic Hierarchical Justification enforces reasoning consistency across the hierarchy because a subtask persists only as long as all of its dependent context assertions.
3.3.1 Implementing Dynamic Hierarchical Justification

Figure 9 outlines the procedure for computing the support set in DHJ. As in Assumption Justification, the architecture can directly add context assertions to the support set (#1). When the architecture computes the dependencies for a local assertion (#3), the assertion is marked as having been inspected (#4). Inspected assertions can simply be ignored in the future (#2), because the architecture has already added the assertion's dependencies to the support set. The architecture will also ignore dependent, local assumptions (#2) because the dependencies of those assumptions will have already been added to the support set.
PROC create-new-assertion(...)
  {Whenever a new assumption is asserted, the support set is updated
   to include any additional context dependencies.}
  ...
  A_just ← create-justification(...)
  IF A is an assumption
    S is the subtask in which A is asserted
    S_supportset ← append(S_supportset, add-dependencies-to-support-set(A))
  ...
END

PROC add-dependencies-to-support-set(assertion A)
  FOR each assertion j in A_just, the justification of A
    #1 IF (Level(j) closer to the root than Level(A))
         append(j, S)    {append context dependency to the support set}
    #2 ELSEIF (Level(j) same as Level(A) AND j is NOT an assumption
               AND j has not previously been inspected)
    #3   S ← append(S, add-dependencies-to-support-set(j))
           {compute support-set dependencies for j and add them to S}
    #4   j_inspected ← true
           {j's context dependencies have now been added to the support set}
  return S, the list of new dependencies in the support set
END

PROC Level(assertion A)
  Return the subtask level associated with assertion A

Figure 9: A procedure for Dynamic Hierarchical Justification.
Because DHJ needs to inspect any local assertion only once, context dependencies are computed on demand, rather than cached as in Assumption Justification. Condition #2 will be true whenever there is a local entailment whose context dependencies have not yet been computed. These dependencies are determined by calling add-dependencies-to-support-set recursively. Recursive instantiations of add-dependencies-to-support-set each receive a local assertion in the justification of an uninspected entailment, j, and return a list comprising the context dependencies of j. The return value is then appended to the support set S in the prior instantiation of add-dependencies-to-support-set.
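A compact Python rendering of Figure 9 (again our own sketch, not the paper's implementation; the Assertion record extends the one in the earlier sketch with is_assumption and inspected flags) makes the single-inspection property explicit.

    class Assertion:
        def __init__(self, name, level, is_assumption=False, justification=()):
            self.name = name
            self.level = level
            self.is_assumption = is_assumption
            self.justification = list(justification)
            self.inspected = False

    def add_dependencies_to_support_set(A, support_set):
        # Walk A's justification once, adding context assertions to the subtask's
        # support set and recurring through uninspected local entailments.
        for j in A.justification:
            if j.level < A.level:                                 # 1: context dependency
                support_set.add(j)
            elif (not j.is_assumption) and (not j.inspected):     # 2: uninspected local entailment
                add_dependencies_to_support_set(j, support_set)   # 3: recur and collect its deps
                j.inspected = True                                # 4: never inspect j again
        return support_set

    # Context assumptions a21 and a22 (level 2) support reasoning in Subtask 3 (level 3);
    # asserting assumption a34 pulls both into the support set, as in Figure 2.
    a21 = Assertion("a21", 2, is_assumption=True)
    a22 = Assertion("a22", 2, is_assumption=True)
    e31 = Assertion("e31", 3, justification=[a21])                # local entailment
    a34 = Assertion("a34", 3, is_assumption=True, justification=[e31, a22])
    subtask3_support = set()
    add_dependencies_to_support_set(a34, subtask3_support)
    assert subtask3_support == {a21, a22}

Marking j as inspected at #4 is what keeps the total work linear: each local entailment's justification is walked at most once per subtask.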
The recursive call to add-dependencies-to-support-set at #3 is the only non-constant-time operation in the procedure. It must be made only once for any assertion j_i, and thus the worst-case complexity to compute the dependencies is linear in the number of assertions in the level, as in Assumption Justification. However, DHJ requires only a single inspection of any individual assertion, rather than repeated inspections for each new assumption as in Assumption Justification. Thus the architecture needs to call add-dependencies-to-support-set at most n times for any subtask consisting of n assertions, and the worst-case cost of updating the support set in a level remains O(n). This reduction in complexity potentially makes Dynamic Hierarchical Justification a more efficient solution than Assumption Justification, especially as the number of local assertions increases.
Additionally, the two technical problems outlined for Assumption Justification do not impact DHJ. DHJ never needs to restore a previous assumption: when a dependency changes, the architecture retracts the entire level, so DHJ can immediately delete replaced assumptions from memory. DHJ collects all dependencies for assumptions, so there is no need to switch from one justification to another. In Figure 8 (b), dependencies A, B, C, D, and E are all added to the support set. These simplifications can make the support set overly specific but reduce the memory and computation overhead required by Dynamic Hierarchical Justification.
DHJ retractions will sometimes be followed by the regeneration of the subtask and the re-assertion of reasoning that was retracted. For example, if the enemy plane fled as described in the TacAir-Soar scenario, DHJ would retract the entire level associated with the counting subtask. The count would then need to be re-started from the beginning. Section 3.4.3 examines potential problems introduced by interruption and regeneration. The cost incurred through the regeneration of previously-derived assertions is the primary drawback of Dynamic Hierarchical Justification.
3.4 Implications of Dynamic Hierarchical Justification

Dynamic Hierarchical Justification solves the specific problem of maintaining reasoning consistency in a hierarchy, guaranteeing consistency and utilizing an efficient algorithm. The heuristic DHJ employs assumes that assumptions are so closely associated with their subtasks that retracting subtasks is nearly equivalent to retracting individual assumptions. This section explores the implications of this heuristic, focusing on task decompositions, the impact on the agent's ability to use persistent assumptions, and the feasibility of interrupting an agent (with a subtask retraction) in the midst of reasoning.
3.4.1 The Influence of the Task Decomposition

An agent's reasoning can be viewed as knowledge search (Newell, 1990). From this perspective, the inconsistency problem is failure to backtrack in knowledge search. The world changes, leading to changes in the agent hierarchy. The agent must retract some of the knowledge it has previously asserted, so it should backtrack to a knowledge state consistent with the world state.(3) Each solution can be described in terms of the way it achieves (or avoids) backtracking in the knowledge search. For instance, KBAC leads to knowledge-based backtracking, in which the KBAC knowledge tells the agent how to correct its assumptions given the current situation.

(3) Obviously, this world state will usually be different than the agent's initial state, and it is often impossible to return to a prior state in an execution system. We use "backtrack" in this section to refer to the retraction of asserted execution knowledge such that any remaining asserted knowledge is consistent with the currently perceived world state.
Figure 10: Examples of (a) disjoint dependencies and (b) intersecting assumption dependencies.

Assumption Justification is a form of dependency-directed backtracking
(Stallman & Sussman, 1977). In dependency-directed backtracking, regardless of the chronological order in which an architecture makes assertions, the architecture can identify and retract those assertions that contributed to failure in a search and retain all other assertions. In Assumption Justification, the architecture retracts only those assumptions that are directly affected by a change in the context. Assumptions created later in the processing, not dependent on the change, are unaffected. Consider the examples in Figure 10. In (a), assumptions 1 and 2 each depend upon disjoint sets of assertions. With Assumption Justification, removal of any assertion in 1's assumption justification will result in the retraction of 1; 2 is unchanged, even if the architecture asserted 2 after 1.
Dynamic Hierarchical Justification is similar to backjumping (Gaschnig, 1979). Backjumping heuristically determines the state to which a current search should backtrack or "backjump." The heuristics used by backjumping are based on syntactic features of the problem. For instance, in constraint satisfaction problems, the backjumping algorithm identifies which variable assignments are related to other variable assignments via the constraints specified in the problem definition. When a violation is discovered, the algorithm backtracks to the most recent, related variable (Dechter, 1990). Intervening variable assignments are discarded. In DHJ, when an assertion in the hierarchy changes, the system "backjumps" in its knowledge search to the highest subtask in the hierarchy not dependent on the change. In Figure 10(a), all dependent assertions are collected in the support set for the subtask. If any of the higher level assertions change, the entire subtask is removed.
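By contrast, a DHJ-style retraction can be sketched as follows (again our illustration with invented subtask names, not the architecture's actual data structures): each subtask maintains a single support set collecting all assertions its local reasoning depends on, and a change retracts the highest dependent subtask together with every subtask below it.

# A minimal sketch of DHJ-style "backjumping": each subtask has one support
# set; a changed assertion removes the highest dependent subtask and all of
# its descendants in the hierarchy.

class Subtask:
    def __init__(self, name, support, parent=None):
        self.name = name
        self.support = set(support)   # assertions this level depends on
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

def retract_dependents(root, changed_assertion):
    """Return the names of all subtasks removed because of the change."""
    removed = []

    def walk(task):
        if changed_assertion in task.support:
            collect(task)            # remove this level and everything below
        else:
            for child in list(task.children):
                walk(child)

    def collect(task):
        removed.append(task.name)
        if task.parent is not None:
            task.parent.children.remove(task)
        for child in list(task.children):
            collect(child)

    walk(root)
    return removed

# Figure 10(a), read as one subtask whose support set collects A-H.
top = Subtask("root", support=[])
count = Subtask("count-aircraft", support=list("ABCDEFGH"), parent=top)
Subtask("record-count", support=["1", "2"], parent=count)

print(retract_dependents(top, "A"))   # ['count-aircraft', 'record-count']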
When using DHJ, as in backjumping, some previous knowledge search may need to be repeated after backtracking. Assume the removal of the subtask in Figure 10(a) was due to a change in A. If a similar subtask is reinitiated, assumption 2 may need to be regenerated. This regeneration is unnecessary because 2 did not need to be retracted to avoid inconsistency. Under Dynamic Hierarchical Justification, the agent retracts all reasoning in the dependent subtask (and all lower levels in the hierarchy); assertions not dependent on the change in the context can also be removed. Thus, like backjumping, DHJ uses a syntactic feature of reasoning (decomposition into subtasks) to choose a backtracking point and this backtracking is not always as conservative as possible.

Although the subtask decomposition is a syntactic feature of the knowledge search, it is a strongly principled one, reflecting a semantic analysis of the task by a knowledge designer. Hierarchical task decomposition is based on the premise that tasks can be broken down into discrete units that have little interaction with other units; they are nearly decomposable (Simon, 1969). Thus, the goal of a hierarchical decomposition is to separate mostly independent subtasks from one another. A consequence of this separation is that dependencies in higher levels will be limited as much as possible (interaction between subtasks should be minimized) and the dependencies among the assertions in any particular subtask will be shared (otherwise, the subtask could be subdivided into two or more independent subtasks). Of course, it is often possible to decompose a given task in many different ways. In most cases the domain imposes minimal constraint and the knowledge engineer has significant latitude in crafting the task decomposition.
In the situation illustrated in Figure 10, (b) would be a more complete decomposition of a task by the knowledge engineer than (a), assuming the two alternatives represent a decomposition of the same task. In (b), the number of dependent assertions does not necessarily grow as a function of the number of assumptions in the local level, while in (a) it does. Further, in (a), two independent assumptions are being pursued. These assumptions could potentially be inferred in separate subtasks in an alternate decomposition. In (b), on the other hand, the assumptions in the subtask are closely tied together in terms of their dependencies and thus better asserted within the same subtask. Because the dependencies of assumptions 1 and 2 have considerable overlap in (b), Assumption Justification pays a high overhead cost to track individual assumptions because (most) everything in the local subtask would be removed simultaneously if assertions B, C, or D changed. Because DHJ incurs no such overhead, DHJ is a better choice when the intersection between assumption dependencies is high. Task knowledge structured more like the situation in (b), rather than (a), would lead to few unnecessary retractions. Because (b) appears to better reflect well-decomposed tasks, Dynamic Hierarchical Justification will constrain the knowledge development process and improve the resulting decompositions. Consequently, nearly-decomposed tasks should allow DHJ to avoid most unnecessary regenerations while avoiding the processing overhead of Assumption Justification.
3.4.2 Limiting Persistence under DHJ

DHJ limits persistence in subtasks, resulting in assumptions that are not as persistent as assumptions in typical truth maintenance systems. This section explores the consequences of these limitations to determine if DHJ architectures⁴ can still provide the persistence necessary for agent execution (Section 2.1).
DHJ will retract a subtask when a potential inconsistency could impact hypothetical and recursive reasoning like counting. Consider the aircraft classification and counting example. Perhaps an aircraft's altitude contributes to a hypothetical classification of the aircraft (e.g., particular altitude and speed combinations might suggest a reconnaissance aircraft). The agent would create assumptions locally that depend on this aircraft's altitude. If the altitude (or an altitude boundary) changes, then the assumption should be retracted. This retraction is required to avoid inconsistency. If the contact's altitude no longer suggests that it is a reconnaissance aircraft, then the assumption that depended on that assertion should be removed. DHJ captures these dependencies and performs the retraction. The agent now has the opportunity to reconsider the classification of the aircraft, or pursue other tasks if the classification is no longer important.

4. For clarity, and because Assumption Justification has already been eliminated as a candidate solution, the following discussion focuses exclusively on DHJ. However, Assumption Justification limits persistence similarly.
DHJ will also retract a subtask if an assumption was created in the local subtask for the purpose of remembering some input (or elaboration of input). For example, if an agent needs to remember a particular aircraft's altitude at a particular point in time, that assumption cannot be stored in a local subtask. DHJ limits persistence in such a way that remembering within a local subtask is generally impossible.
In order to remember previous situations, assumptions can be asserted in the root task. Any assumption asserted in this level will never be retracted because there are no higher level dependencies (assuming percepts are associated with the top level and not a higher "input level," as in Theo, Mitchell et al., 1991). The primary drawback of this requirement for remembering is that remembered items are no longer local to the subtask that created them, requiring additional domain knowledge to manage remembered assumptions. However, remembering already requires domain knowledge; it is not possible to remember an assertion regardless of its dependencies and also be able to retract it architecturally.
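One way to picture this trade-off is the following sketch (our own naming, not architecture code): a remembered item is written to the root level, where the architecture captures no dependencies, so domain knowledge must decide when the memory is stale and should be deliberately removed. The time-based policy here is only one example of the kind of management knowledge required.

# Sketch: remembering under DHJ. Local assumptions carry dependencies and can
# be retracted architecturally; items written to the root level persist until
# domain knowledge deliberately removes them.

root_memory = {}          # root-task assertions: no dependencies captured

def remember(key, value, created_at):
    """Store a remembered item at the root level with bookkeeping the
    domain knowledge can use later (here, just a timestamp)."""
    root_memory[key] = {"value": value, "created_at": created_at}

def manage_memory(now, max_age):
    """Example of the extra domain knowledge DHJ requires: a simple
    time-based policy for discarding stale remembered items."""
    stale = [k for k, item in root_memory.items()
             if now - item["created_at"] > max_age]
    for k in stale:
        del root_memory[k]
    return stale

remember("contact-42-altitude", 9500, created_at=10.0)
print(manage_memory(now=400.0, max_age=300.0))   # ['contact-42-altitude']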
These examples show that Dynamic Hierarchical Justification still allows all forms of persistence, but trades capturing dependencies for nonmonotonic assumptions in local subtasks with remembering assumptions in the root task, where no dependencies are captured. Because DHJ forces remembered items into the root task, it also suggests that a fundamental aspect of this root task should be managing these remembered assumptions. We view this requirement as a positive consequence of DHJ, because it forces knowledge engineers to better recognize the reasons for creating an assumption (e.g., remembering vs. a hypothetical) and circumscribes remembering so that we can now develop or adopt functional or temporal theories to manage assumptions created for remembering (e.g., Allen, 1991; Altmann & Gray, 2002).

3.4.3 Recovery from Interruption with DHJ
Dynamic Hierarchical Justification makes an agent more reactive to its environment, ensuring that relevant changes in the environment lead to the retraction of any dependent subtasks. DHJ imposes an automatic interruption of the agent for a subtask retraction, without evaluating the state of the system first. Although automatic interruption increases the reactivity of the system, it can lead to difficulties if there is no way to override it. In this section we examine two cases where uncontrolled interruption can cause problems. The problems arise because DHJ biases the system to be reactive; that is, to respond automatically to changes in the environment without deliberation. However, in both cases, additional agent knowledge can overcome that bias, make the system more deliberate, and avoid uncontrolled interruption.
The first problem arises when there is a sequence of actions that must be completed without interruption in order for a subgoal to be achieved. If the processing is interrupted, then it is possible, because of the dynamics of the world, that the task cannot be resumed. For example, imagine an aircraft nearing the point where it can launch a missile at a target. When the task is interrupted and then resumed, the aircraft's position may have changed enough, relative to the target, that additional steering commands are necessary before the missile can be launched. In this case, it may be preferable not to interrupt the original launch sequence once it has begun.
Consider two possible approaches to achieving this capability with Dynamic Hierarchical Justification architectures. The first is to move the processing to the root task. Because the root task is not interrupted, the processing will not be interrupted. However, this approach greatly restricts how a task can be hierarchically decomposed and thus should be considered only as a last resort. The second approach is to add new reasoning for the task that "freezes" the external situation with respect to additional reasoning in the subtask. The new processing initiates the execution of the subtask and creates persistent structures in the root task. These persistent structures represent a deliberate commitment to not being interrupted. The remaining processing in the subtask accesses only these structures in the execution of the task. Thus, because they are persistent, even if there are changes to the surrounding situation that would have interrupted the subtask, its processing is now insensitive to those changes and interruption is prevented. This approach also requires additional reasoning to recognize completion of the uninterruptible behavior and remove the persistent structures built by the initial subtask. This reasoning reflects a deliberate act, signaling that the commitment no longer holds. In the abstract, together these additions provide a mechanism for overcoming automatic interruption. The disadvantage of this approach is that, as part of the system design, those subgoals that cannot be interrupted must be identified beforehand. For those subtasks, additional agent knowledge must be implemented to create and remove encapsulations of any dynamic data.
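The "freeze" approach can be sketched as follows (our illustration; names such as frozen-launch are invented): when the launch sequence begins, the dynamic values it needs are copied into persistent root-task structures, and the remaining subtask reasoning reads only the copies, so later changes to the live values cannot retract it.

# Sketch of the "freeze" approach to uninterruptible sequences: copy the
# dynamic values a subtask needs into persistent root-task structures at
# initiation, reason only over the copies, and remove them on completion.

live_situation = {"target-bearing": 87.0, "target-range": 19.5}
root_memory = {}                       # persistent root-task structures

def begin_launch_sequence():
    """Deliberately commit to the sequence by freezing the values it uses."""
    root_memory["frozen-launch"] = dict(live_situation)

def launch_step():
    """Subtask reasoning reads only the frozen copy, so changes to the live
    situation no longer retract the subtask."""
    frozen = root_memory["frozen-launch"]
    return f"steer to bearing {frozen['target-bearing']:.1f} and fire"

def complete_launch_sequence():
    """Deliberate recognition that the commitment no longer holds."""
    del root_memory["frozen-launch"]

begin_launch_sequence()
live_situation["target-bearing"] = 92.3   # the world keeps changing...
print(launch_step())                      # ...but the frozen value is used
complete_launch_sequence()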
A more critical problem for DHJ is the "Wesson Oil" problem: when someone is cooking dinner and a higher-priority activity suddenly occurs (a hurt child), the cook should turn off the stove (a "cleanup" procedure) before leaving for the hospital (Gat, 1991b). This problem occurs when there is a change in the hierarchical context at a level far from the terminal level of the hierarchy. In this situation, similar tasks may not be resumed or initiated following the interruption. The agent must therefore recognize whether "cleanup" of the external and/or internal states is necessary, and, if so, perform that cleanup. Even with DHJ, the agent can still behave appropriately if it has the right knowledge. In particular, the agent must be able to recognize partially completed tasks (like cooking dinner) and be able to select cleanup actions specific to the task state (like turning off a stove burner). Because DHJ requires all remembered assumptions to be asserted in the root level of the hierarchy, this recognition task has internal state available; it need not try to reconstruct that state from the external environment alone. However, it does require some analysis of the task domain(s) by a knowledge engineer so that any interruptible activity requiring cleanup includes triggering assertions for cleanup in the root task.
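A sketch of the kind of knowledge this requires (ours; the marker and task names are invented for illustration): remembered root-task assertions record which tasks are partially complete, and cleanup knowledge matches those assertions whenever the corresponding subtask is no longer selected.

# Sketch of "Wesson Oil" cleanup knowledge: remembered root-task assertions
# mark partially completed tasks, and cleanup actions are proposed for any
# marked task whose subtask has been interrupted (is no longer selected).

root_memory = {"stove-on": True, "dinner-started": True}   # remembered state
active_subtasks = {"drive-to-hospital"}                    # after interruption

# Domain knowledge: task-state markers and the cleanup each one calls for.
cleanup_knowledge = {
    "stove-on": ("cook-dinner", "turn-off-stove"),
    "oven-preheating": ("cook-dinner", "turn-off-oven"),
}

def propose_cleanup():
    """Return cleanup actions for interrupted, partially completed tasks."""
    actions = []
    for marker, (task, action) in cleanup_knowledge.items():
        if root_memory.get(marker) and task not in active_subtasks:
            actions.append(action)
    return actions

print(propose_cleanup())   # ['turn-off-stove']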
This work was prompted by a desire for architectural solutions to inconsistency, yet maintaining consistency efficiently can lead to interruptions, which, under DHJ, require knowledge-based solutions to problems arising from automatic interruption.⁵ However, most of the requirements imposed by DHJ are positive consequences. Subtask retractions and observed recovery in the development process help define what must be remembered in the root task for cleanup, which is significantly different than the laborious process of debugging agent programs that are failing due to inconsistency. In theory, Dynamic Hierarchical Justification imposes requirements for handling interruptions that do pose serious questions about its overall utility. In practice, we have not found addressing these questions to be a problem in a variety of recent agent implementations using the Soar-DHJ architecture (e.g., Laird, 2001; Wray et al., 2002).

5. Dynamic Hierarchical Justification could also be used as a trigger for meta-level deliberation rather than immediate subtask retraction. It would then possibly provide an architectural solution to the question of when to deliberate about potential inconsistency for intention reconsideration (see Section 5.2).
4. Empirical Evaluation of Dynamic Hierarchical Justification

Architectural mechanisms like DHJ must be efficient. We have demonstrated that the algorithm itself is efficient, but its impact on the overall behavior-generation capability of an agent remains an open question due to interruption and regeneration. Given the complexity of both agent-based systems and the domains in which they are applied, analytical evaluations must be extremely narrow in scope, and even then require specialized techniques (Wooldridge, 2000). This section instead pursues an empirical evaluation of Dynamic Hierarchical Justification, focusing on efficiency and responsiveness in two domains at extremes in the continua of agent domain characteristics. Because the architectural solution to inconsistency was motivated by the cost (and incompleteness) of knowledge-based solutions, knowledge development costs will also be estimated.

4.1 Methodological Issues

Dynamic Hierarchical Justification is a general solution, applicable in a wide range of agent tasks. In order to evaluate such a solution, a number of methodological issues must be addressed. The following describes three important issues and the choices made for this evaluation.
4.1.1 Relative vs. Absolute Evaluation

What constitutes "good" or "poor" cost and performance evaluations? In general, an absolute evaluation of performance and cost is difficult because the task itself, in addition to the agent's knowledge and architecture, determines overall cost and performance results. We circumvent this problem by making relative comparisons between agents using the original, Fixed Hierarchical Justification Soar architecture ("FHJ agents") and new agents ("DHJ agents"). The FHJ agents provide cost and performance benchmarks, obviating the need for absolute evaluations.
4.1.2 Addressing Multiple Degrees of Freedom in Agent Design

Even when architecture and task are fixed, many different functional agents can be developed. How can one know if comparative results are valid and general if the experimenter has control over both benchmarks and new agents?

DHJ agents will be compared to agents previously implemented by others. Such systems will provide good performance targets, because they were optimized for performance, and will minimize bias, because they were developed independently. FHJ systems were used as fixed benchmarks, and were not modified. DHJ agents use the identical task decompositions employed by the FHJ agents and the same initial knowledge base. We observed opportunities to improve performance in the DHJ agents by modifying the task decomposition or by re-designing significant portions of the agent knowledge base. However, agent knowledge was modified only when necessary for correct behavior, in order to ensure that DHJ agents remained tightly constrained by their FHJ counterparts, thus limiting bias in the evaluation.

4.1.3 The Choice of Representative Tasks
This evaluation will be limited to execution agents in the Dynamic Blocks World and in a reduced-knowledge version of TacAir-Soar ("micro-TacAir-Soar"). The choice of only a few tasks or domains is a considerable drawback of benchmarks (Hanks, Pollack, & Cohen, 1993). Although these choices were motivated primarily by the availability of domains with pre-existing FHJ agents, the two domains do represent opposite extremes for many domain characteristics. Micro-TacAir-Soar, like TacAir-Soar, is inaccessible, nondeterministic, dynamic, and continuous, while the Dynamic Blocks World simulator used in the experiments is accessible, deterministic, static and discrete. The primary motivation for using the Dynamic Blocks World, which is less representative of typical agent tasks than Micro-TacAir-Soar, is to assess the cost of employing DHJ in a domain where a priori it appears it would not be useful (although Section 6 suggests DHJ can prove useful even in relatively static domains). Thus, the Dynamic Blocks World will provide a baseline for the actual cost of deploying the algorithm, even though little benefit is expected from its deployment in this domain.
4.2 Evaluation Hypotheses

Although specific expectations will differ in different domains, differences in the dimensions of knowledge cost and performance can be anticipated when comparing DHJ agents to baseline agents. The following discusses the expectations and the metric(s) used for each dimension.

4.2.1 Knowledge Engineering Cost

Knowledge engineering effort in DHJ agents should decrease in comparison to previously developed agents. Knowledge in Soar is represented with production rules. Each production represents a single, independent knowledge unit. We assume the addition of more productions represents an increase in cost and measure knowledge cost by counting the number of productions in each type of agent. The number of productions, of course, provides only a coarse metric of cost because the complexity of individual productions varies significantly. However, the productions that will be removed in DHJ agents are often the most difficult ones to create. Therefore, the difference in number of productions is probably a conservative metric for knowledge cost in DHJ.
4.2.2 Performance: Efficiency and Responsiveness

In general, overall performance should change little in DHJ agents, as compared to FHJ counterparts. Although Dynamic Hierarchical Justification does add a new architectural mechanism, the algorithm itself is efficient and should not contribute to significant differences in performance. Further, less domain knowledge will need to be asserted because all across-level consistency knowledge is now incorporated in the architecture. Thus, if applying across-level KBAC knowledge represented a significant expense in the overall cost of executing a task, DHJ agents might perform better than FHJ agents.

There are two specific exceptions to this expectation. First, in domains where consistency knowledge is (mostly) unnecessary for task performance, FHJ agents may perform better than DHJ agents. For example, the Dynamic Blocks World requires little consistency knowledge but the DHJ architecture will still update the support set, even though few inconsistency-causing context changes should arise.

Second, if regeneration is problematic, overall performance will suffer. In DHJ, whenever the dependent context changes, a subtask will be retracted. If the change does not lead to a different choice of subtask, the subtask will be necessarily regenerated. Thus, under DHJ, some subtask regeneration will occur, and, if regeneration is significant, performance degradation will result.
CPU execution time provides a simple, single dimension of gross performance. The CPU time reported for individual experiments reflects the time the agent spends reasoning and initiating actions rather than the time it takes to execute those actions in the environment.
Decisions: In Soar, subtasks correspond to the selection of operators and subgoals for implementing operators. The selection of an operator is called a decision. When Soar selects an operator, it tries to apply the operator. Soar reaches an impasse when it cannot apply a newly selected operator. These non-primitive operators lead to the generation of a subgoal in the subsequent decision. For example, Soar selects the put-down operator in one decision and creates a subgoal to implement put-down in the subsequent decision. Together, these two steps constitute the notion of a subtask in Soar.

The number of decisions can thus be used as an indication of the number of subtasks undertaken for a task. In FHJ, a subtask was generally never interrupted until it terminated (either successfully or unsuccessfully). In DHJ, subtasks will be interrupted whenever a dependent change occurs. Thus, decisions should increase in DHJ agents because subtasks will be interrupted and re-started. Further, if decisions increase substantially (suggesting significant regeneration), overall performance will degrade.

Production Firings: A production rule "fires" when its conditions match and its result is applied to the current situation. Production firings should decrease in DHJ for two reasons. First, any across-level consistency knowledge that was previously used in FHJ agents will no longer be necessary (or represented); therefore, this knowledge will not be accessed. Second, any reasoning that occurred after inconsistency arose in FHJ agents will be interrupted and eliminated. However, production firings will increase if significant regeneration is necessary.
4.3 Empirical Evaluation in the Blocks World

Agents in this Dynamic Blocks World domain have execution knowledge to transform any initial configuration of three blocks into an ordered tower using a simulated gripper arm. The table in this simulation has a width of nine blocks. The agent's task goal is always to build the 1-on-2-on-3 tower. Each agent built a tower from each of the resulting 981 unique, non-goal, initial configurations of blocks. Table 1 summarizes the results of these tasks. As expected, total knowledge decreased. Overall performance improved. Decisions increased, as expected, but the number of rule firings increased as well, which was not anticipated.
                        FHJ                DHJ
                        x̄        s.d.      x̄        s.d.
  Rules                 188      -         175      -
  Decisions (avg.)      87.1     20.9      141.1    38.7
  Rule Firings (avg.)   720.3    153.5     855.6    199.6
  CPU Time (ms, avg.)   413.1    121.6     391.6    114.0

Table 1: Summary of knowledge and performance data from the Blocks World. The agents performed the tower-building task for each of 981 configurations. Task order was randomly determined.

4.3.1 Knowledge Differences

Total knowledge decreased about 7% in the DHJ agent. This small reduction is consistent with expectation. The aggregate comparison is misleading because knowledge was both added (16 productions) and deleted (29).
Removing Consistency Knowledge: In Soar, the subtask operator and subgoal are terminated separately. Soar monitors impasse-causing assertions to determine if a subgoal (such as the subtask goal) should be removed via FHJ. However, the removal of a subtask operator requires knowledge. The original, FHJ architecture treats the initiation of an operator as a persistent assumption and requires knowledge to recognize when a selected operator should be interrupted or terminated. This knowledge can be categorized as consistency knowledge because it determines the time at which a subtask should be terminated, even when the initiating conditions for the subtask no longer hold.

In DHJ, only the effects of operators are persistent; all other assertions are entailments of the situation. Thus, the initiation of a subtask is now treated as an entailment and a subtask remains selected only as long as the initiation conditions for the subtask hold. This change removes the need for knowledge to terminate the subtask: when the subtask initiation conditions are no longer true, the subtask is automatically retracted. Thus, termination knowledge was removed for all subtask operators.
Filling Gaps in Domain Knowledge: The persistence of subtasks in the original architecture allows FHJ agents to ignore large parts of the state space in their domain knowledge. For example, the knowledge that initiates stack and put-on-table subtasks assumes that the gripper is currently not holding a block. As these tasks are executed, the gripper, of course, does grasp individual blocks. The conditions for initiating stack or put-on-table when holding a block were ignored in the original domain knowledge.

The DHJ agent now requires knowledge to determine which subtasks it should choose when holding blocks, because subtasks can be interrupted while the agent still holds a block. Sixteen productions were necessary, primarily for the stack and put-on-table operators. It is important to note that this knowledge is necessary domain knowledge. FHJ agents could not solve any problem in which they began a task holding a block because they lacked domain knowledge for these states. These additions are thus a positive consequence of DHJ. The architecture's enforcement of consistency revealed gaps in domain knowledge.

4.3.2 Performance Differences

Somewhat surprisingly, overall performance of the DHJ agents (measured in CPU time) improves slightly in comparison to the FHJ agents, even though both decisions and production firings increase. Each of the Soar-specific performance metrics is considered individually below, and then the overall performance improvement is considered.
Decisions: FHJ agents, on average, made considerably fewer decisions than DHJ agents. The difference was consistent across every task. These additional decisions result from the removal and subsequent regeneration of subtasks. For example, when the agent picks up a block in pursuit of a stack task, the selection of the stack task must be regenerated. The knowledge in the DHJ agents could be modified to avoid testing specific configurations of blocks and thus avoid many of these regenerations.
Production Firings: The number of production firings also increased in the Blocks World. The increase in production firings can be attributed to the knowledge added to the system and the regeneration of subtasks that made the additions necessary. The relative increase in the number of production firings (19%) was much smaller than the increase in decisions (62%). The smaller difference can be attributed to the productions that were removed (and thus did not fire).

CPU Time: Generally, when production firings increase in Soar, an increase in CPU time is expected. However, CPU time in DHJ decreased slightly in comparison to FHJ even though production firings increased. To explain this result, some additional aspects of Soar's processing must be considered.
The match cost of a production is not constant but grows linearly with the number of tokens, partial instantiations of the production (Tambe, 1991). Each token indicates what conditions in the production have matched and the variable bindings for those conditions. Thus, each token represents a node in a search over the agent's memory for matching instantiation(s) of the production. The more specific a production's conditions are, the more constrained the search through memory, and thus it costs less to generate the instantiation. The new productions added to the DHJ Blocks World agent were more specific to the agent's memory (i.e., its external and internal state) than the productions removed. Further, simply having fewer total productions also can reduce the amount of total search in memory.⁶ An informal inspection of the match time and tokens for several FHJ and DHJ runs showed that the number of tokens decreased in DHJ by 10-15%. This reduction in token activity is the primary source of improvement in Dynamic Blocks World DHJ agent CPU time. This improvement, of course, is not a general result and provides no guarantee that in some other task or domain the cost of matching will not increase rather than decrease.

6. The RETE algorithm (Forgy, 1979) shares condition elements across different productions. Thus, the removal of productions only decreases the total search if removed productions contain condition elements not appearing in the remaining productions. We did not perform an exhaustive analysis of the condition elements to determine if the removed productions reduce the number of unique condition elements in the RETE network.
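The effect of condition specificity on match cost can be illustrated with a toy partial-match count (our illustration, unrelated to Soar's actual Rete implementation): each partial instantiation corresponds to a token, and a more selective condition placed early in the match prunes the combinations that must be considered in later joins.

# Toy illustration of why more specific conditions reduce token (partial
# instantiation) counts: tokens correspond to surviving partial matches as
# conditions are joined left to right.

from itertools import product

blocks = ["b1", "b2", "b3"]
facts = {("clear", b) for b in blocks} | {("on-table", b) for b in blocks}
facts.add(("holding", "b2"))

def count_tokens(conditions):
    """Count partial instantiations produced by joining conditions in order.
    Each condition is (predicate, variable); a token is a variable binding."""
    tokens = 0
    bindings = [dict()]
    for pred, var in conditions:
        new_bindings = []
        for b, block in product(bindings, blocks):
            if var in b and b[var] != block:
                continue
            if (pred, block) in facts:
                new_bindings.append({**b, var: block})
        tokens += len(new_bindings)
        bindings = new_bindings
    return tokens

general = [("clear", "x"), ("on-table", "x")]
specific = [("holding", "x"), ("clear", "x"), ("on-table", "x")]
print(count_tokens(general), count_tokens(specific))   # 6 vs. 3: the selective
                                                       # first condition prunes
                                                       # partial matches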
4.4 Empirical Evaluation in µTacAir-Soar

Converting TacAir-Soar to the DHJ architecture would be very expensive, requiring many months of effort. DHJ agents were instead developed for a research and instruction version of TacAir-Soar, "Micro-TacAir-Soar" (µTAS). µTAS agents use the TacAir-Soar simulation environment (ModSAF) and interface but have knowledge to fly only a few missions, resulting in an order of magnitude decrease in the number of productions in the agents. However, µTAS uses the same tactics and doctrine for its missions as TacAir-Soar.

In µTAS, a team of two agents ("lead" and "wing") fly the patrol mission described previously. They engage any hostile aircraft that are headed toward them and are within a specific range. The lead agent's primary role is to fly the patrol route and intercept enemy planes. The wing's responsibility is to fly in formation with the lead. Because the total knowledge is significantly reduced, converting µTAS agents to DHJ should be relatively inexpensive. However, the results should be representative of TacAir-Soar because µTAS retains the complexity and dynamics of TacAir-Soar.

The patrol mission has no clearly-defined task termination condition like the Dynamic Blocks World. To address this problem, each agent in the simulation executes for ten minutes of simulator time. During this time, each agent has the opportunity to take off, fly in formation with its partner on patrol, intercept one enemy agent, and return to patrol after the intercept. In an actual TacAir-Soar scenario, these activities would normally be separated by much larger time scales. However, an agent spends much of its time on a patrol mission simply monitoring the situation (waiting), rather than taking new actions. Ten minutes of simulated time proved to be brief enough that overall behavior was not dominated by wait-states, while also providing time for a natural flow of events.
When running for a fixed period of time, an increase in the number of decisions can be attributed to regeneration or simply an improvement in decision cycle time. We avoid this potential confusion by running the simulator with a constant cycle time. In this mode, each simulator update represents 67 milliseconds of simulated time. Because each agent now runs for a fixed period of time with fixed updates, each FHJ and DHJ agent will execute the same number of decisions. Any problems due to regeneration will be apparent in the number of rule firings and degradation in responsiveness. Additionally, the general results do not change significantly if the scenarios are executed with the real-time mode normally used for TacAir-Soar agents. The fixed cycle simply eliminates some variability.
Although the patrol scenario was designed to minimize variation from run to run, the µTAS simulator is inherently stochastic and the specific actions taken by an agent and the time course of those actions vary when the same task is repeated. To control for this variation, each scenario was run for the lead and wing agents approximately 50 times. Logging and data collection significantly impacted CPU time and other performance statistics. In order to control for this effect, we actually ran each scenario 99 times, randomly choosing one agent (lead or wing) to perform logging functions (and discarding its performance measures). The other agent performed no logging functions. Data from logging agents was used to create Figure 9. The performance measures of the "no logging" agents were recorded at the conclusion of each scenario and are summarized in Table 2.
                        Lead Agent                         Wing Agent
                        FHJ              DHJ               FHJ              DHJ
  Rules                 591              539               591              539
  Number of runs (n)    43               53                56               46
                        x̄       s.d.     x̄       s.d.      x̄       s.d.     x̄       s.d.
  Decisions             8974    0.0      8974    0.0       8958    0.0      8958    0.0
  Outputs               109.1   6.71     142.8   7.03      1704    42.7     869     12.8
  Rule Firings          2438    122      2064    81.1      16540   398      6321    104
  CPU Time (msec)       1683    301      1030    242       12576   861      2175    389

Table 2: Summary of µTAS run data.
4.4.1 Improving Task Decompositions

µTacAir-Soar DHJ agents required extensive knowledge revision. Such revision was not unexpected. For instance, unlike the Dynamic Blocks World, µTAS agents remember many percepts, such as the last known location of an enemy aircraft. As previously described, assertions for remembering must now be located in the root level of the hierarchy, thus requiring some knowledge revision. However, other problems were discovered. In some cases, FHJ agents took advantage of inconsistency in asserted knowledge. In other words, the FHJ agent not only allowed inconsistency in the assertions but actually depended on those inconsistencies to apply new knowledge. There were two major categories of this knowledge. "Within-level" consistency knowledge recognized specific inconsistencies (e.g., retraction of the proposal for the subtask) as a trigger for actions such as "clean up" of the subtask state. "Complex subtasks" allowed the non-interruptible execution of a complex procedure regardless of the continuing acceptability of the subtask. In both cases, agent knowledge was modified to remove any dependence on inconsistency. Appendix A provides further explanation of the original knowledge and subsequent changes. Section 4.4.3 summarizes the changes quantitatively.
4.4.2 Results

Table 2 lists average data for the FHJ and DHJ lead and wing agents for the patrol/intercept scenario after the modifications to the DHJ agent's knowledge base were completed. The results in this domain are consistent with expectations: total knowledge decreases, rule firings decrease and performance improves, substantially so for the DHJ wing agent. The following sections explore each of these results in greater detail.
4.4.3 Knowledge Differences

Table 3 quantifies the changes to the Soar production rules described above.⁷ Modifications include deletions, additions and changes. A rule was considered "changed" only if its conditions changed slightly, but it made the same type of computation for the same subtask. For example, most of the "changed" within-level consistency knowledge now refers to an entailed structure rather than one created as an assumption, but that structure is located in the same subtask. This somewhat restrictive definition of a change inflates the addition and deletion accounting. In many cases a production was "deleted" and then immediately "added" to a different subtask. For example, the productions that manipulate motor commands were all moved from local subtasks to the highest subtask. Almost all the additions and deletions in the "Remembering" category can be attributed to this move, which required no synthesis of new production knowledge.

7. The DHJ agent data was generated with a knowledge base that included some changes to accommodate learning (Wray, 1998) and these changes are included in the table for completeness. The presence of these rules in the knowledge base has negligible impact on the performance data reported here.

                         Across-level   Remembering   Within-level   Complex    Learning   Miscellaneous   TOTALS
                         Consistency                  Consistency    Subtasks
  FHJ Agent: 591
    Deletions:           44             36            9              10         4          8               (111)
    Additions:           0              32            5              21         0          1               59
  DHJ Agent: 539
    Additional Changes:  0              33            8              0          24         0               65

Table 3: Quantitative summary of changes to production rules in the FHJ agent knowledge base for DHJ agents.
Total knowledge required for the DHJ agents decreased. This approximately 9% reduction was achieved by making some type of modification to about 40% of the FHJ agent rules, and may seem a modest gain, given the conversion cost. However, this cost is an artifact of the chosen methodology. Had the DHJ agents been constructed in this domain without previously existing FHJ agents, at least a 9% decrease in the total knowledge would be expected. This result thus suggests a reduction in the cost of the agent design. The high conversion cost does suggest that converting a much larger system, like TacAir-Soar, would probably be very costly. On the other hand, the modifications were made evident by identifiable regenerations in the architecture. Thus, the 235 total changes made to the FHJ knowledge base were much easier to make than constructing a similar number of rules.
4.4.4 Performance Differences

As the performance results in Table 2 show, DHJ agents improved in performance relative to their FHJ peers. However, the improvements of the lead and wing agents were substantially different. Differences in the tasks of lead and wing pilots led to the differences in relative improvements.

Lead and Wing Agents: The lead and wing agent share the same knowledge base but perform different tasks in the µTAS scenario.⁸ These differences lead to differences in their absolute performance. Recall that the lead's primary responsibility is to fly the patrol route and intercept enemy aircraft. On the other hand, the wing's primary mission role is to follow the lead. These different tasks require different responses in the agents.

8. The agents share the same knowledge base because they can dynamically swap roles during execution. For instance, if the lead exhausts its long-range missiles, it will order the wing to take over the lead role, and then take the role of wing itself.

Figure 11: Cumulative outputs over the course of one ten minute scenario for DHJ (black) and FHJ (gray) agents. Cumulative outputs for lead agents are represented with solid lines, wing agents with dashed lines.
An agent's overall reasoning activity is often correlated with its output activity; that is, the commands it sends to the external environment to take action in it. Figure 11 summarizes the output activity of two pairs of lead and wing agents (FHJ & DHJ) over the course of a ten-minute scenario. The output activity of both leads is mostly concentrated at a few places over the course of the scenario (take-off, intercept, launch-missile, and when resuming patrol following the intercept). The wings' most concentrated output activity occurs when the leads turn to a new leg of the patrol and the wings must follow the lead through a 180 degree turn. In the remainder of this section, we focus only on DHJ agents to contrast lead and wing agent behavior. The discussion of the performance metrics will examine differences between FHJ and DHJ leads and wings.
The lead actually spends most of the scenario waiting, with short bursts of reasoning and output activity occurring at tactically important junctures in the scenario. On patrol, the lead flies straight and makes a decision to turn when it reaches the end of a patrol leg. The lead monitors the environment and searches for enemy planes. This search is (mostly) passive; the agent's radar notifies the agent if any new entities have been detected. After detecting and classifying an enemy plane as a potential threat, the lead commits to an intercept. The lead immediately makes a number of course, speed, and altitude adjustments, based on the tactical situation. These actions are evident in the figure by the pulse labeled "intercept." The lead spends most of the time in the intercept closing the distance between the aircraft to get within weapon range, again having to maneuver very little and thus requiring few actions in the environment (thus the relatively flat slope following intercept). When the agent reaches missile range of the enemy plane, the lead executes a number of actions very quickly. The lead steers the plane into a launch window for the missile, pushes the fire button, waits for the missile to clear, and then determines a course to maintain radar contact as the missile flies to its target (at launch-missile). Once the intercept has been completed, the lead resumes its patrol task. Again, it issues a large number of output commands in a short period of time. These examples show that the lead's reasoning focuses primarily on reacting to discrete changes in the tactical situation (patrol leg ended, enemy in range, etc.) and the behavior generally requires little continuous adjustment.
The execution of the wing's follow-leader task, on the other hand, requires reaction to continuous change in the lead's position in order to maintain formation. Position corrections require observing the lead's position, recognizing an undesired separation in the formation, and then responding by adjusting speed, course, altitude, etc. Because the wing is following the lead throughout the scenario, it is executing this position maintenance knowledge almost constantly. When the lead is flying straight and level, as on a patrol leg, the wing's task does not require the generation of many outputs. In Figure 11, these periods of little activity are evident in the periodic flat segments in the wing's cumulative outputs. When the lead begins a maneuver (e.g., a turn), the wing must maintain the formation throughout the maneuver. During a turn the wing generates many motor commands as it follows the lead. Because the turn takes a few seconds to complete, the outputs increase gradually over the course of the turn, as can be seen in the figure. Thus, the wing periodically encounters a dynamic situation that requires significant reasoning and motor responses. Further, the response to this change is not discrete, as in the lead, but occurs continuously over the course of the lead's maneuver.
These differences in the tasks for the two agents account for the relatively large absolute differences in the performance metrics between the lead and wing agents. Because the wings are adjusting their positions relative to the leads, they issue many more output commands than the leads, which requires many more inferences to determine what those commands should be.
Decisions: The difference in the number of decisions between the lead and wing is due to an artifact of the data collection. The lead agents ran for an extra second after the wings halted in order to initiate data collection.
Production Firings: In both the lead and wing agents, production firings decrease. However, the wing's production firings decrease by 62%, while in the lead, the decrease is only 15%. One reason for the large improvement in the DHJ wing is the elimination of some redundant output commands in the FHJ agents. The FHJ wing sometimes issues the same motor command more than once. The reason for this duplication is that a specific motor command is computed locally, and is thus not available to other subtasks. In some cases, two subtasks may issue the same motor command. When the command is stored locally, a command may be issued again because the agent cannot recognize that the command has already been issued by another subtask. Because motor commands are remembered in the top subtask by DHJ agents, they can be inspected by all subtasks. The DHJ wing thus never issues a redundant motor command. The large relative decrease in outputs in the wing agent from FHJ to DHJ (Figure 11) can be attributed to this improvement. Production firings decrease with the decrease in output activity because most reasoning activity in the wing concerns reacting to the lead's maneuvers.
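The effect of remembering motor commands at the top level can be sketched as follows (our illustration; the function and command names are not from the agents): because every subtask can inspect the shared record of issued commands, a duplicate command is recognized and suppressed.

# Sketch: remembering issued motor commands in the top-level (root) task so
# that any subtask can check for duplicates before issuing a command.

issued_commands = set()        # root-level record, visible to all subtasks

def issue_motor_command(command, output_link):
    """Issue a command only if an identical one has not already been sent."""
    if command in issued_commands:
        return False           # another subtask already issued it
    issued_commands.add(command)
    output_link.append(command)
    return True

output_link = []
issue_motor_command(("set-heading", 180.0), output_link)   # from subtask A
issue_motor_command(("set-heading", 180.0), output_link)   # duplicate from B
print(output_link)             # [('set-heading', 180.0)] - issued only once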
In contrast to the wing, the lead's average number of outputs actually increases. Regeneration is the source of these additional outputs. In a few situations, a DHJ agent's subtask for adjusting heading, speed or altitude can get updated repeatedly in a highly dynamic situation (e.g., a hard turn). The FHJ agent uses subtask knowledge to decide if the current output command needs to be updated. However, in DHJ, the subtask may be retracted due to dependence on a changing value (e.g., current heading). When the subtask is regenerated following a retraction, the lead may generate a slightly different motor command. For example, the lead might decide to turn to heading 90.1° instead of 90.2°. This decision causes the generation of a new output command that would not have been re-issued in FHJ agents and accounts for the small increase in outputs. It also suggests that without the self-imposed constraint of the methodology, the knowledge base could be further modified to avoid this regeneration and further decrease production firings.

Although the large magnitude of the improvement in the wing is primarily due to remembering motor commands, both agents also needed less consistency knowledge and thus accessed less knowledge while performing the same task. The agents perform the same tasks using less knowledge.

CPU Time: CPU time decreases in both the DHJ lead and wing agents. The improvement in the lead (39%) is about half the improvement in the wing (81%). These differences are due primarily to the decrease in production firings. There are fewer production firings and thus fewer instantiations to generate, leading to improvements in CPU time. Match time also improved, contributing to the overall performance improvement.⁹ The larger improvements in CPU time as compared to production firings improvements (39% vs. 15% in the lead, 81% vs. 62% in the wing) might be attributable to decreases in both the number of rule firings and match time. Again, these results offer no guarantee that match time will always decrease with DHJ. It is important to note, however, that in two very different domains DHJ reduces total knowledge and further constrains the remaining knowledge. The architecture then leveraged these small differences for improved overall performance.

9. As in the Dynamic Blocks World, these trends are based on a few observations of the data, rather than a significant analysis. In particular, in µTAS, data for the number of tokens generated was not collected. The results reported here are consistent with the expectation that the token activity falls in DHJ agents, as compared to FHJ agents.

4.4.5 Differences in Responsiveness

Because CPU time decreases in the DHJ agents, responsiveness should generally improve. However, because some agent knowledge has been split into several different subtasks, some actions may not be initiated as quickly as would be initiated by the FHJ agent. In this section, we explore differences in responsiveness in one of these situations.
          Avg. In-Range Time (sec)   Avg. Launch Time (sec)   Reaction Time (sec)   n
  FHJ     161.816                    162.084                  .268                  95
  DHJ     162.048                    162.993                  .945                  99

Table 4: A comparison of average reaction times for launching a missile in µTAS.
When an enemy plane comes in range, the agent executes a series of actions, leading to the firing of a missile. Reaction time is the difference between the time at which an enemy agent comes in range and the time when the agent actually pushes the fire button to launch a missile. This reaction time is one measure of the agent's responsiveness. As Table 4 shows, the FHJ agent is able to launch the missile in just over a quarter of a second. However, the DHJ agent is about three-and-a-half times slower than the FHJ agent in launching the missile, taking almost a full second, on average.
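Concretely, from Table 4, the FHJ reaction time is 162.084 - 161.816 = 0.268 seconds, while the DHJ reaction time is 162.993 - 162.048 = 0.945 seconds; the ratio 0.945 / 0.268 is approximately 3.5, which is the source of the "three-and-a-half times slower" figure.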
Split subtasks, regeneration, and subtask selection all contribute to the increase in reaction time. Splitting a subtask with n steps, which may have all been executed in a single decision previously, may now take n decisions in the DHJ agent. Only a few actions are necessary for launching a missile so one would expect an increase of, at most, a few hundred milliseconds for this change. However, by dividing subtasks into separate steps, the sequential series of actions can be interrupted. In particular, a number of regenerations occur under the launch-missile subtask as the agent prepares to fire the missile in a highly dynamic situation. The agent sometimes chooses to undertake a similar action because the situation has changed enough that a slightly different action might be necessary, as described above. The result is that the DHJ agents are taking more accurate aim than the FHJ agents, as they are responding more quickly to the dynamics of the environment. This "aiming," however, takes more time, although the increase in time is not tactically significant (i.e., enemy planes were not escaping that were previously hit by FHJ agents).
Some additional re-engineering of the knowledge would improve the reaction time (e.g., as described in Section 3.4.3). However, decreases in responsiveness will be difficult to avoid, in general. Dynamic Hierarchical Justification requires that subtasks with different dependencies be initiated and terminated separately, or risk unnecessary regeneration. However, by splitting complex tasks into separate subtasks, individual actions are delayed both because the subtasks are now separate procedures, and because the selection of a particular subtask in the series can be postponed when additional subtask choices are available.
4.5 Summary of Empirical Evaluations

Figure 12 summarizes results from the Dynamic Blocks World and µTAS. In both domains, DHJ agents require fewer total productions, suggesting a decrease in knowledge cost. Performance is roughly the same in the Dynamic Blocks World and for the lead agents in µTAS. The DHJ wing agents show a much greater improvement in overall performance, which is due both to DHJ and to changes in knowledge. These results suggest that Dynamic Hierarchical Justification can be expected to reduce engineering effort and not degrade performance in a variety of domains, simple and complex. However, responsiveness in some situations may decrease.
Figure 12: Mean CPU Time vs. knowledge in productions for FHJ (black) and DHJ (gray) agents in the Dynamic Blocks World and µTAS. The graph includes the actual distribution of CPU time for each agent as well as the mean for each agent. Means for the Dynamic Blocks World agents are illustrated with squares, µTAS lead agents with triangles, and µTAS wing agents with diamonds.
5. Discussion

Solutions other than KBAC and the new solutions introduced here can be developed for the inconsistency problem (Wray, 1998). We briefly introduce a few additional solutions, and also consider the relationship of Dynamic Hierarchical Justification to intention reconsideration in belief-desire-intention agents and belief revision.

5.1 Other Solutions to Inconsistency Across the Hierarchy

In this section, we review other existing architectural solutions to the problem of inconsistency arising from persistence in a hierarchy of assertions.
5.1.1 Limiting Persistence

One obvious approach to eliminating inconsistency arising from persistence is to disallow persistent assumptions altogether. This approach was adopted in Theo (Mitchell et al., 1991). All reasoning in Theo is entailed from sensors; only perceptual inputs are unjustified. Theo cannot reason non-monotonically about any particular world state; only the world can change non-monotonically. Thus, Theo cannot generally remember previous inputs.
Another possible limitation would be to restrict all assumptions to a single memory (global state), or, equivalently, allow assumptions only in the root level of the hierarchy in a hierarchical architecture. This solution ensures that the hierarchical context is always consistent (all assertions within an associated subtask are entailments) and also allows persistence. Because HTN execution systems such as RETSINA and DECAF, mentioned previously, have only a global state, they obviously do not suffer from inconsistency over a hierarchy. However, the interactions between persistent assertions and new information derived from sensors will be a problem in systems with global state.
RETSINA has recently adopted rationale-based monitoring (Veloso, Pollack, & Cox, 1998) to identify environmental changes that could impact a currently executing task network (Paolucci et al., 1999). Rationale-based monitoring uses the structure of plan knowledge (in this case, plan operators, including task networks) to alleviate inconsistency. Monitors for relevant world features are created dynamically as planning progresses by identifying pre-conditions in operators and instantiating them via a straightforward taxonomy of monitor types (e.g., a monitor for quantified conditions). The collection of monitors forms a plan rationale, "reasons that support a planner's decisions" (Veloso et al., 1998). Plan rationales are thus similar to the justifications used in truth maintenance. Monitors are activated when a pre-condition element in the world changes. They then inform the planner of the change, and the planner can then deliberate about whether the change should impact the plan under construction and, if so, consider appropriate repairs.
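A rough sketch of the monitor-creation idea (our reading of the cited description, not RETSINA code; the operator names are invented): as operators are added to the plan, a monitor is instantiated for each precondition, and an activated monitor only flags the affected plan steps for the planner to deliberate about.

# Sketch of rationale-based monitoring: monitors are instantiated from the
# preconditions of plan operators as planning proceeds; when a monitored
# feature changes, the planner is informed and deliberates about repairs.

class Monitor:
    def __init__(self, feature, operator):
        self.feature = feature       # world feature the precondition mentions
        self.operator = operator     # plan step whose rationale it supports

def create_monitors(plan):
    """Build one monitor per operator precondition (the plan rationale)."""
    return [Monitor(pre, op["name"]) for op in plan for pre in op["preconds"]]

def world_changed(feature, monitors):
    """Activated monitors do not retract anything; they flag plan steps
    whose rationale may be affected so the planner can deliberate."""
    return [m.operator for m in monitors if m.feature == feature]

plan = [
    {"name": "fly-to-waypoint", "preconds": ["route-clear", "fuel-ok"]},
    {"name": "deliver-cargo",   "preconds": ["at-waypoint"]},
]
monitors = create_monitors(plan)
print(world_changed("fuel-ok", monitors))   # ['fly-to-waypoint']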
Rationale-based monitoring is similar to Dynamic Hierarchical Justification, especially because they both leverage the structures of their (different) underlying task representations to provide consistency. However, there are two important differences. First, because DHJ identifies the specific subtask impacted by a change, it does not require deliberation to determine the impact of the change; immediate return to a consistent knowledge state is possible. When a monitor is activated in rationale-based monitoring, the planner must first determine where and how the plan is affected, which can require deliberation. Second, because monitors trigger deliberation, rather than automatically retracting reasoning, an agent using rationale-based monitoring can determine if the plan should be repaired and how. DHJ (as implemented) does not offer this flexibility; retraction is automatic. Automatic retraction assumes that retrieving (or regenerating) pre-existing plan knowledge is less costly than deliberating to determine if/how the plan can be revised. Because plan modification can be as expensive as plan generation (Nebel & Koehler, 1995), this assumption is reasonable. However, invoking a deliberate revision process could circumvent potential problems arising from recovery from interruption (Section 3.4.3).

5.1.2 Fixed Hierarchical Justification
As mentioned previously, both the pre-DHJ version of Soar and the Procedural Reasoning System (PRS) (Georgeff & Lansky, 1987) use Fixed Hierarchical Justification to retract complete levels of the hierarchy when the support set no longer holds. In PRS, the support set consists of a set of context elements that must hold during the execution of the subtask. These elements are defined by a knowledge engineer. Fixed Hierarchical Justification offers a complete solution to the inconsistency problem if context references within the reasoning in a subtask are limited to the support set. This approach guarantees consistency. However, it requires that the knowledge designer identify all potentially relevant features used in reasoning within the subtask. Additionally, the resulting system may be overly sensitive to the features in the support set if those features only rarely impact reasoning, leading to unnecessary regeneration.
Fixed Hierarchical Justification requires less explicit consistency knowledge than knowledge-based solutions. However, KBAC knowledge is still required if access to the whole task hierarchy is possible. Thus, an agent's ability to make subtask-specific reactions to unexpected changes in the environment is limited by the knowledge designer's ability to anticipate and explicitly encode the consequences of those changes.
5.2 Intention Reconsideration

In the belief-desire-intention (BDI) model of agency, an intention represents a commitment to achieving a goal (Rao & Georgeff, 1991; Wooldridge, 2000). An intention is thus similar to the instantiation of a subtask in a hierarchical architecture.

Dynamic Hierarchical Justification can be viewed as a partial implementation of intention reconsideration (Schut & Wooldridge, 2000, 2001). Intention reconsideration is the process of determining when an agent should abandon its intentions (due to goal achievement, recognition of failure, or recognition that the intention itself is no longer desired). Dynamic Hierarchical Justification is only a partial implementation of intention reconsideration because it is only able to capture syntactic features of the problem solving (i.e., the identification of dependencies via the support set) to determine when to reconsider an intention. Situations that require deliberation to determine that an intention should be abandoned are not captured in DHJ.¹⁰ Schut & Wooldridge (2001) describe an initial attempt to allow the run-time determination of reconsideration policies. An optimal policy would maximize the likelihood that deliberate intention reconsideration actually leads to abandoning an intention (i.e., the agent reconsiders when reconsideration is necessary). In contrast, Dynamic Hierarchical Justification offers a low-cost, always available, domain general process for abandoning intentions, but cannot automatically identify reconsiderations requiring semantic analysis of the problem state.
In BDI models, agents can choose to execute their current action plans with or without reconsidering their current intentions first. Kinny and Georgeff (1991) showed that, in more static domains, "bold" agents that never reconsider their intentions perform more effectively than "cautious" agents that always reconsider before executing a plan step. The opposite is true in highly dynamic domains: "cautious" agents outperform "bold" ones. Both Soar and PRS can be described as being cautious via Fixed Hierarchical Justification. That is, at each plan step, the architectures determine whether elements in the support set remain asserted before executing the step. FHJ approaches are, in effect, more "bold" than they appear, because they do not reconsider intentions when assertions have changed in the dependent context, but not in the support set. Dynamic Hierarchical Justification provides more "cautious" agents, because it ensures that the agent's reconsideration function takes into account all context dependencies for subtask reasoning. From the perspective of intention reconsideration, the problems introduced by more dynamic domains prompted us to explore more "cautious" solutions.
The results of the empirical analysis were somewhat consistent with those of Kinny & Georgeff. The "cautious" DHJ agents performed better than less "cautious" FHJ agents in the highly dynamic µTacAir-Soar domain. In the Dynamic Blocks World, the performance differences were more equivocal. In comparison to FHJ, the number of new intentions (measured as Soar decisions) increased with Dynamic Hierarchical Justification. While there was a slight overall performance improvement with DHJ, it was due to improvements in the match time of productions, a Soar-specific measure that likely will not generalize to other systems. These results suggest that DHJ is possibly overly cautious in static domains. However, because Dynamic Hierarchical Justification did not present a significant performance cost and unexpectedly played a constructive role in agent execution even in the static domain, DHJ seems warranted in both static and dynamic domains.
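The bold/cautious distinction above can be summarized with a deliberately simplified control loop; the agent interface used here (intentions_still_desirable, reconsider_intentions) is hypothetical and stands in for whatever reconsideration machinery an architecture provides.

    def execute_plan(agent, plan, cautious=True):
        """A cautious agent reconsiders its intentions before every plan step;
        a bold agent executes the plan to completion without reconsidering."""
        for step in plan:
            if cautious and not agent.intentions_still_desirable():
                agent.reconsider_intentions()  # may abandon or replace the current intention
                return
            step()  # each plan step is modeled as a callable action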
5.3 Belief Revision

Belief revision refers to the process of changing beliefs to accommodate newly acquired information. The inconsistency problem is an example of the need for revision in asserted beliefs: some change in the hierarchical context (deriving ultimately from perceived changes in the world) leads to a situation in which a currently asserted assumption would not (necessarily) be regenerated if it were re-derived. Theories of belief revision identify functions that can be used to update a belief set so that it remains consistent.

The best known theory of belief revision is the "AGM" theory (Alchourron, Gärdenfors, & Makinson, 1985; Gärdenfors, 1988, 1992). AGM is a coherence theory, meaning that changes to beliefs are determined based on mutual coherence with one another. This approach contrasts with the foundations approach, in which justifications (reasons) determine when and how to revise a belief set. Dynamic Hierarchical Justification is obviously an extension of the foundations approach to belief revision. However, as the foundations and coherence approaches can be reconciled (Doyle, 1994), in this section we explore the repercussions of Dynamic Hierarchical Justification in the context of the AGM theory of belief revision.
In AGM theory, when a new sentence is presented to a database of sentences representing the current knowledge state, an agent is faced with the task of revising its knowledge base via one of three processes: expansion (adding sentences to the knowledge base), contraction (removing sentences from the knowledge base), and revision (a combination of expansions and contractions). AGM theory emphasizes making minimal changes to a knowledge base and epistemic entrenchment, a notion of the usefulness of a sentence within the database. AGM theory prefers that sentences with high epistemic entrenchment (relative to other sentences) be retained during revision.
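As a toy illustration of these three operations, the sketch below manipulates a set of propositional literals, ignoring logical closure and using only a trivial stand-in for negation and entrenchment; revision is defined through contraction followed by expansion (the Levi identity).

    def negate(sentence: str) -> str:
        # Toy negation over string literals, e.g. "p" <-> "~p".
        return sentence[1:] if sentence.startswith("~") else "~" + sentence

    def expand(beliefs: set, sentence: str) -> set:
        """Expansion: add the sentence, whether or not it is consistent."""
        return beliefs | {sentence}

    def contract(beliefs: set, sentence: str) -> set:
        """Contraction: give up the sentence. A full AGM contraction would also
        remove enough of its support that it is no longer entailed, preferring
        to retain the most epistemically entrenched sentences."""
        return beliefs - {sentence}

    def revise(beliefs: set, sentence: str) -> set:
        """Revision: contract the negation, then expand with the new sentence."""
        return expand(contract(beliefs, negate(sentence)), sentence)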
Comparing Dynamic Hierarchical Justification to Assumption Justification suggests that it is sometimes cheaper to remove a subtask (and all asserted beliefs associated with that subtask) than it is to compute the minimal revision with Assumption Justification. In the context of belief revision, this result is not surprising, since it has been shown that computing a minimal revision to a knowledge base can be computationally harder than deduction (Eiter & Gottlob, 1992). This theoretical result has led to applications that compute belief updates via incremental derivations of a belief state, rather than via belief revision (Kurien & Nayak, 2000).
The power of the heuristic approach used by DHJ over the analytic solution follows from the characteristics outlined in Section 3.4.1: the hierarchical structure and organization of the agent assertions and the efficiency of the underlying reasoning system in regenerating any unnecessarily removed assertions. Assumptions (persistent beliefs) are associated with particular subtasks in hierarchical architectures. A change in perception (an epistemic input) leads to a revision. Rather than determining the minimal revision, DHJ uses a heuristic that, in this context, says that persistent beliefs in a subtask have epistemic entrenchment similar to that of the subtask/intention itself. In some cases, this heuristic will be incorrect, leading to regeneration, but, when correct, it provides a much simpler mechanism for revision. Gärdenfors (1988) anticipates such conclusions, suggesting that systems possessing additional internal structure (as compared to the relatively unstructured belief sets of AGM theory) may provide additional constraints for orderings of epistemic entrenchment.
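A minimal sketch of this heuristic revision is given below, under assumed data structures (similar to the earlier Fixed Hierarchical Justification fragment, but with a support set that is recorded dynamically as the subtask reasons); it retracts the highest affected subtask and all of its local assumptions rather than computing a minimal revision.

    from dataclasses import dataclass, field

    @dataclass
    class Subtask:
        name: str
        support_set: set = field(default_factory=set)    # features referenced so far (recorded dynamically)
        assumptions: dict = field(default_factory=dict)  # persistent local beliefs

    def dhj_revise(hierarchy, changed_features):
        """Heuristic revision sketch: treat a subtask's assumptions as having the
        same entrenchment as the subtask itself, so a change to any supporting
        feature removes the subtask, its assumptions, and everything below it."""
        for level, subtask in enumerate(hierarchy):
            if subtask.support_set & changed_features:
                removed = hierarchy[level:]
                del hierarchy[level:]
                return removed  # may be regenerated later if still appropriate
        return []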
6. Conclusion

The empirical results from both the Dynamic Blocks World and µTAS domains were consistent with expectations: knowledge engineering cost decreased and overall performance with DHJ was roughly the same as (or slightly improved over) independently developed FHJ benchmarks. Development cost decreases because the designer is freed from the task of creating across-level consistency knowledge. One drawback of DHJ is that responsiveness can degrade when regeneration occurs.
DHJ has been incorporated into the currently released version of Soar (Soar 8) for over three years, and the experience of users further confirms that development cost decreases. Developers do need a somewhat deeper understanding of the architecture to realize this benefit. However, DHJ removes the need to encode across-level consistency knowledge, which has proven difficult to understand and encode in many systems. DHJ also makes understanding the role of assumptions in Soar systems more straightforward, by imposing design and development constraints. For instance, the knowledge designer must now think about why, when, and where persistence should be used in the agent. Once the knowledge designer determines the functional role of some persistent assumption, DHJ guides the development of the knowledge necessary for that assumption. For a nonmonotonic or hypothetical assumption, no knowledge must be created that "looks outside" the subtask in order to ensure consistency (i.e., no across-level knowledge is necessary). Assumptions for remembering must be asserted in the root level of the hierarchy, and knowledge must be created to manage the remembered assumption. Functions of the root task now include monitoring, updating, and removing remembered assumptions (we are developing domain-general methods for managing these remembered assumptions to further reduce cost). Thus, while DHJ does increase the complexity of the architecture, it makes design decisions more explicit and manageable than previous KBAC approaches.
Regeneration, seemingly one of the drawbacks of DHJ, also contributes to decreased knowledge development costs. Regeneration serves as a debugging tool, allowing immediate localization of problem areas in the domain knowledge (and its specific decomposition). This debugging aid contrasts with previous knowledge development, in which inconsistency often became evident only in irrational behavior, making it difficult to determine the actual source of a problem. Thus, in addition to reducing the total knowledge necessary for some task, Dynamic Hierarchical Justification might also reduce the cost per knowledge unit when creating agent knowledge by localizing problems via regeneration. However, if some domain cannot be decomposed into nearly decomposable subunits, regeneration could be debilitating.
Another positive consequence of DHJ is that an agent may behave more robustly in novel situations not anticipated by the knowledge engineer. For example, as a simple experiment, the FHJ and DHJ Dynamic Blocks World agents were placed in the situation described in Figure 3. The FHJ agent fails when the block moves because it lacks knowledge to recognize moving blocks; the knowledge designer assumed a static domain. With the same knowledge, however, the DHJ agent responds to this situation gracefully. In the specific situation in Figure 3, the DHJ agent immediately retracts the put-on-table(3) subtask, because block-3 is on the table, and thus the selection of that subtask is no longer consistent with the current situation. The agent then chooses stack(2,3) and decomposes this subtask into actions to put block-2 on block-3. If a new block (e.g., block-4) is placed in the empty space below block-2, the architecture responds by retracting the subtask goal for put-down(2) (i.e., the subtask that contains the empty assumption). It then begins to search for empty spaces in order to continue its attempt to put block-2 on the table. Because the architecture, rather than agent knowledge, ensures consistency across the hierarchy, DHJ agents should be less brittle in situations not explicitly anticipated in agent design.
DHJ also provides a solution to the problem of learning rules with non-contemporaneous constraints (Wray, Laird, & Jones, 1996). Non-contemporaneous constraints arise when temporally distinct assertions (e.g., red light, green light) are collected in a single learned rule via knowledge compilation. A rule with non-contemporaneous constraints will not lead to inappropriate behavior but rather will never apply. This problem makes it difficult to use straightforward explanation-based learning approaches to operationalize agent execution knowledge. Non-contemporaneous constraints arise when the architecture creates persistent assumptions that can become inconsistent with the hierarchical context (Wray et al., 1996). Because DHJ never allows such inconsistency, it solves the non-contemporaneous problem. For instance, agents in both the Dynamic Blocks World and µTAS were able to learn unproblematically in the new architecture, with little or no knowledge re-design. Wray (1998) provides additional details and an empirical assessment of the learning.
Dynamic Hierarchical Justification operates at a higher level of granularity than Assumption Justification or knowledge-based solution methods, trading fine-grained consistency for lower computational cost. This higher level of abstraction does introduce additional cost in execution. In particular, necessary regeneration led to some redundancy in knowledge search in both the Dynamic Blocks World and µTAS agents. Although overall efficiency improved, some of the improvement was due to improvements in the average match cost of productions, which cannot be guaranteed in all domains or in other architectures. Further, Dynamic Hierarchical Justification requires that complex subtasks be split into distinct subtasks. This requirement improves the knowledge decomposition and reduces regeneration in performance but can reduce responsiveness. However, with the straightforward compilation of reasoning in subtasks that DHJ enables, the reduction in responsiveness can be overcome with learning (Wray, 1998).
Although the implementation and evaluation of DHJ was limited to Soar, we attempted to reduce the specificity of the results to Soar in two ways. First, we identified the problems that across-level consistency knowledge introduces in knowledge-based approaches: it is expensive to develop, degrades the modularity and simplicity of the hierarchical representation, and is only as robust as the knowledge designer's imagination. When agents are developed in sufficiently complex domains, the expense of creating this knowledge will grow prohibitive. This cost may lead additional researchers to consider architectural assurances of consistency. Second, Dynamic Hierarchical Justification gains its power via the structure of hierarchically decomposed tasks. Although specific implementations may differ for other agent architectures, the heuristic simplifications employed by DHJ should transfer to any architecture utilizing a hierarchical organization of memory for task decomposition. Dynamic Hierarchical Justification is an efficient, architectural solution that ensures reasoning consistency across the hierarchy in agents employing hierarchical task decompositions. This solution allows agents to act more reliably in complex, dynamic environments while more fully realizing low-cost agent development via hierarchical task decomposition.

Acknowledgments
This work would not have been possible without those who contributed directly to the development and evaluation of Dynamic Hierarchical Justification. Scott Huffman, John Laird and Mark Portelli implemented Assumption Justification in Soar. Ron Chong implemented a precursor to DHJ. Randy Jones, John Laird, and Frank Koss developed µTacAir-Soar. Sayan Bhattacharyya, Randy Jones, Doug Pearson, Peter Wiemer-Hastings, and other members of the Soar group at the University of Michigan contributed to the development of the Dynamic Blocks World simulator. The anonymous reviewers provided valuable, constructive comments on earlier versions of the manuscript. This work was supported in part by a University of Michigan Rackham Graduate School Pre-doctoral fellowship, contract N00014-92-K-2015 from the Advanced Systems Technology Office of DARPA and NRL, and contract N6600I-95-C-6013 from the Advanced Systems Technology Office of DARPA and the Naval Command and Ocean Surveillance Center, RDT&E division. Portions of this work were presented at the 15th National Conference on Artificial Intelligence in Madison, Wisconsin.
Appendix A: Improving Task Decompositions

This appendix describes in detail the changes that were made to the µTAS agent knowledge for DHJ.

Remembering: Figure 4 showed an agent computing a new heading as a subtask of the achieve-proximity subtask. This calculation usually depends upon the current heading. When the agent generates the command to turn, the heading changes soon thereafter. In this situation, the DHJ agent must "remember" that it has already made a decision to turn to a new heading by placing the assumption that reflects the new heading in the top level. If it places the assumption in the local level, then the new current heading will trigger the removal of turn-to-heading and then regeneration of the subtask (if the agent determines that it still needs to turn to some new heading).
In the FHJ agents, all output commands (such as turn to some specific heading) were asserted as assumptions in the local subtask. The DHJ agent's knowledge was changed to issue output commands directly to the output interface (which, in Soar, is always part of the highest subtask in the hierarchy). No unnecessary regeneration now occurs because the agent remembers all motor commands and generates a new one only when a different output is necessary. This change, of course, requires consistency knowledge, because the motor commands are unjustified and thus must be explicitly removed, as is true for any remembered knowledge with DHJ.
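The following sketch (with hypothetical structures; not Soar production syntax) illustrates the choice just described: a remembered assumption placed in the top level survives the heading change, whereas one placed in the local subtask is retracted as soon as the current heading, on which that subtask depends, changes.

    from dataclasses import dataclass, field

    @dataclass
    class Level:
        name: str
        support_set: set = field(default_factory=set)    # features this level's reasoning depends on
        assumptions: dict = field(default_factory=dict)  # persistent assertions held at this level

    def remember_commanded_heading(hierarchy, new_heading, at_top=True):
        """Store the commanded heading at the root level (remembering) or locally.
        If stored locally in a subtask whose support set includes 'current-heading',
        the heading change that follows the turn retracts that subtask and the
        assumption with it, forcing regeneration."""
        target = hierarchy[0] if at_top else hierarchy[-1]
        target.assumptions["commanded-heading"] = new_heading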
Within-level Consistency Knowledge: Dynamic Hierarchical Justification, like all solutions to the across-level consistency problem, still requires consistency knowledge within an individual subtask. Some of this knowledge in the FHJ agents is used to remove intermediate results in the execution of a subtask. This "clean up" knowledge allows the agent to remove local assertions that contributed to some terminating subtask and thus avoid the (mis)use of these assertions in later reasoning.
As an example, consider the achieve-proximity subtask. This subtask is used in a number of different situations in which an agent needs to get closer to another agent. If the wing strays too far from the lead, it may invoke achieve-proximity to get back into formation with the lead. The lead uses achieve-proximity to get close enough to an enemy aircraft to launch a missile. The subtask requires many local computations as the agent reasons about what heading it should take to get closer to another aircraft. The specific computation depends on what information is available about the other aircraft. When the wing is pursuing the lead, it may know the lead's heading and thus calculate a collision course to maximize the rate of convergence. Sometimes the other agent's heading is not available. In this case, the agent simply moves toward the current location of the other agent.
These local computations are stored in the local subtask. When achieve-proximity is terminated in the FHJ agent, the agent removes the local structure. Removing the structure is important both because it interrupts entailment of the local structure (e.g., calculation of the current collision course) and because it guarantees that, if the agent decides to achieve-proximity with a different aircraft, supporting data structures are properly initialized. This knowledge thus maintains consistency in the local subtask by removing the local structure when the achieve-proximity subtask is no longer selected.
The FHJ agent could recognize when it was going to remove a subtask. The termination conditions in FHJ agents acted as a signal to the within-level consistency knowledge. The knowledge that removes the local structure for achieve-proximity can be summarized as: "if the achieve-proximity operator is selected, but its initiation conditions no longer hold, then remove the local achieve-proximity data structure." Thus, the FHJ agent uses a recognition of an inconsistency in the assertions to trigger the activation of this within-level consistency knowledge.
When the subtask's initiating conditions are no longer supported in the DHJ agents, the selected subtask is removed immediately. Thus, the DHJ agent never has the opportunity to apply the FHJ agent's within-level consistency knowledge. The failure to utilize this knowledge led to a number of problems, including more regenerations than expected.
To solve this problem, the local subtask data structure was created as an entailment of the initiation conditions of the subtask itself. When the subtask initiation conditions no longer hold, both the subtask selection and the local structure are immediately removed by the architecture, requiring no additional knowledge. Thus, this change obviated the need for some within-level consistency knowledge. However, the local data structure may need to be regenerated if a subtask is temporarily displaced. For instance, the FHJ within-level consistency knowledge could determine under what conditions the local structure should be removed. The DHJ solution has lost that flexibility.
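The contrast can be sketched as follows (all names are illustrative, and compute_collision_course stands in for the heading calculation described above): because the local structure is computed directly from the initiation conditions rather than stored as a persistent assumption, it disappears automatically when those conditions fail, so no separate "clean up" knowledge is needed.

    def compute_collision_course(world_state):
        # Placeholder for the heading calculation described in the text.
        return world_state.get("target-bearing")

    def local_structure(initiation_conditions, world_state):
        """Entailment-style local structure: derived from the subtask's initiation
        conditions, so it is retracted automatically (along with the subtask)
        the moment those conditions no longer hold."""
        if all(condition(world_state) for condition in initiation_conditions):
            return {"collision-course": compute_collision_course(world_state)}
        return None  # nothing persists once the subtask is retracted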
Subtasks with Complex Actions: FHJ agents can execute a number of actions in rapid succession, regardless of any inconsistency in the local assertions. A single subtask operator can be initiated in a situation representing the conditions under which to apply the first action in a sequence, and terminated when the last step in the sequence has applied. If some intermediate step invalidates the initiation conditions, the subtask still executes the actions.
Consider the process of launching a missile. An actual missile launch requires only the push of a button, assuming that previous steps such as selecting the target and an appropriate missile have been accomplished beforehand. After pushing the fire button, the pilot must fly straight and level for a few seconds while the missile rockets ignite and launch the missile into flight. Once the missile has cleared the aircraft, the agent "supports" the missile by keeping radar contact with the target. In FHJ agents, the push-fire-button subtask includes both the act of pushing the fire button and counting while the missile clears the aircraft. These tasks have different and mutually exclusive dependencies. The initiation condition for push-fire-button requires that no missile is already launched. However, the subsequent counting requires monitoring the newly launched missile.
DHJ agents using the FHJ knowledge base always remove the push-fire-button subtask as soon as the missile is perceived to be in the air, interrupting the complete procedure. Regeneration of the push-fire-button subtask occurs because the agent never waits for the missile to clear and thus never realizes that the missile just launched needs to be supported. The DHJ agent unsuccessfully fires all available missiles at the enemy plane.
Pushing the fire button and waiting for the missile to clear are independent tasks that happen to arise in serial order in the domain. We enforced this independence by creating a new subtask, wait-for-missile-to-clear, which depends only on having a newly launched missile in the air. The DHJ agent now pushes the fire button, selects wait-for-missile-to-clear to count a few seconds before taking any other action, and then supports the missile if it clears successfully.
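The split can be expressed declaratively as two subtasks with mutually exclusive dependencies; the dictionary-based encoding below is purely illustrative and is not the µTAS rule syntax.

    # Hypothetical declarative encoding of the two subtasks described above.
    PUSH_FIRE_BUTTON = {
        "name": "push-fire-button",
        # Depends on a selected target and on no missile already being in the air.
        "initiation": lambda s: s["target-selected"] and not s["missile-in-air"],
        "action": "press-fire-button",
    }

    WAIT_FOR_MISSILE_TO_CLEAR = {
        "name": "wait-for-missile-to-clear",
        # Depends only on a newly launched missile that has not yet cleared.
        "initiation": lambda s: s["missile-in-air"] and not s["missile-cleared"],
        "action": "hold-course-and-count",
    }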
This solution reduces regeneration and improves behavior quality, but it does have a non-trivial cost. Whenever a subtask is split, the effects of subtask actions no longer occur in rapid succession within a decision. Instead, the effect of the first subtask occurs in one decision, the effect of the second subtask in the second decision, etc. Thus, this solution can compromise responsiveness.

References
Agre, P. E., & Horswill, I. (1997). Lifeworld analysis. Journal of Artificial Intelligence Research, 6, 111-145.
Alchourron, C. E., Gärdenfors, P., & Makinson, D. (1985). On the logic of theory change: Partial meet contraction and revision functions. Journal of Symbolic Logic, 50(2), 510-530.
Allen, J. F. (1991). Time and time again. International Journal of Intelligent Systems, 6(4), 341-355.
Altmann, E. M., & Gray, W. D. (2002). Forgetting to remember: The functional relationship of decay and interference. Psychological Science, 13, 27-33.
Bresina, J., Drummond, M., & Kedar, S. (1993). Reactive, integrated systems pose new problems for machine learning. In Minton, S. (Ed.), Machine Learning Methods for Planning, pp. 159-195. Morgan Kaufmann, San Francisco, CA.
Dechter, R. (1990). Enhancement schemes for constraint processing: Backjumping, learning and cutset decomposition. Artificial Intelligence, 41, 273-312.
Doyle, J. (1979). A truth maintenance system. Artificial Intelligence, 12, 231-272.
Doyle, J. (1994). Reason maintenance and belief revision. In Gärdenfors, P. (Ed.), Belief Revision, pp. 29-51. Cambridge University Press, Cambridge, UK.
Eiter, T., & Gottlob, G. (1992). On the complexity of propositional knowledge base revision, updates, and counterfactuals. Artificial Intelligence, 57, 227-270.
Erol, K., Hendler, J., & Nau, D. S. (1994). HTN planning: Complexity and expressivity. In Proceedings of the 12th National Conference on Artificial Intelligence, pp. 1123-1128.
Firby, R. J. (1987). An investigation into reactive planning in complex domains. In Proceedings of the 6th National Conference on Artificial Intelligence, pp. 202-206.
Forbus, K. D., & de Kleer, J. (1993). Building Problem Solvers. MIT Press, Cambridge, MA.
Forgy, C. L. (1979). On the Efficient Implementation of Production Systems. Ph.D. thesis, Computer Science Department, Carnegie-Mellon University.
Gärdenfors, P. (1988). Knowledge in Flux: Modeling the Dynamics of Epistemic States. MIT Press, Cambridge, MA.
Gärdenfors, P. (1992). Belief revision. In Pettorossi, A. (Ed.), Meta-Programming in Logic. Springer-Verlag, Berlin, Germany.
Gaschnig, J. (1979). Performance measurement and analysis of certain search algorithms. Tech. rep. CMU-CS-79-124, Computer Science Department, Carnegie-Mellon University, Pittsburgh, Pennsylvania.
Gat, E. (1991a). Integrating planning and reacting in a heterogeneous asynchronous architecture for mobile robots. SIGART Bulletin, 2, 71-74.
Gat, E. (1991b). Reliable, Goal-directed Control of Autonomous Mobile Robots. Ph.D. thesis, Virginia Polytechnic Institute and State University, Blacksburg, VA.
Georgeff, M., & Lansky, A. L. (1987). Reactive reasoning and planning. In Proceedings of the 6th National Conference on Artificial Intelligence, pp. 677-682.
Graham, J., & Decker, K. (2000). Towards a distributed, environment-centered agent framework. In Wooldridge, M., & Lesperance, Y. (Eds.), Lecture Notes in Artificial Intelligence: Agent Theories, Architectures, and Languages VI (ATAL-99). Springer-Verlag, Berlin.
Hanks, S., Pollack, M., & Cohen, P. R. (1993). Benchmarks, test beds, controlled experimentation and the design of agent architectures. AI Magazine, 14, 17-42.
Hayes-Roth, B. (1990). An architecture for adaptive intelligent systems. In Workshop on Innovative Approaches to Planning, Scheduling and Control, pp. 422-432.
Jones, R. M., Laird, J. E., Neilsen, P. E., Coulter, K. J., Kenny, P., & Koss, F. V. (1999). Automated intelligent pilots for combat flight simulation. AI Magazine, 20(1), 27-41.
Kinny, D., & Georgeff, M. (1991). Commitment and effectiveness of situated agents. In Proceedings of the 12th International Joint Conference on Artificial Intelligence, pp. 82-88.
Kurien, J., & Nayak, P. P. (2000). Back to the future for consistency-based trajectory tracking. In Proceedings of the 17th National Conference on Artificial Intelligence, pp. 370-377.
Laird, J. E. (2001). It knows what you are going to do: Adding anticipation to a Quakebot. In Proceedings of the 5th International Conference on Autonomous Agents, pp. 385-392.
Laird, J. E., Congdon, C. B., & Coulter, K. J. (1999). Soar user's manual version 8.2. Manual, Department of Electrical Engineering and Computer Science, University of Michigan, http://ai.eecs.umiuch.edu/soar/docs.html.
Laird, J. E., Newell, A., & Rosenbloom, P. S. (1987). Soar: An architecture for general intelligence. Artificial Intelligence, 33, 1-64.
Laird, J. E., & Rosenbloom, P. S. (1990). Integrating execution, planning, and learning in Soar for external environments. In Proceedings of the 8th National Conference on Artificial Intelligence, pp. 1022-1029.
Laird, J. E., & Rosenbloom, P. S. (1995). The evolution of the Soar cognitive architecture. In Steier, D., & Mitchell, T. (Eds.), Mind Matters: Contributions to Cognitive and Computer Science in Honor of Allen Newell. Lawrence Erlbaum Associates, Hillsdale, NJ.
McDermott, D. (1991). A general framework for reason maintenance. Artificial Intelligence, 50, 289-329.
Mitchell, T. M., Allen, J., Chalasani, P., Cheng, J., Etzioni, O., Ringuette, M., & Schlimmer, J. C. (1991). Theo: A framework for self-improving systems. In VanLehn, K. (Ed.), Architectures for Intelligence, chap. 12, pp. 323-355. Lawrence Erlbaum Associates, Hillsdale, NJ.
Mitchell, T. M. (1990). Becoming increasingly reactive. In Proceedings of the 8th National Conference on Artificial Intelligence, pp. 1051-1058.
Nebel, B., & Koehler, J. (1995). Plan reuse versus plan generation: A theoretical and empirical analysis. Artificial Intelligence, 76, 427-454.
Newell, A. (1990). Unified Theories of Cognition. Harvard University Press, Cambridge, MA.
Paolucci, M., Shehory, O., Sycara, K. P., Kalp, D., & Pannu, A. (1999). A planning component for RETSINA agents. In Wooldridge, M., & Lesperance, Y. (Eds.), Lecture Notes in Artificial Intelligence: Agent Theories, Architectures, and Languages VI (ATAL-99), pp. 147-161. Springer-Verlag, Berlin.
Pearson, D. J., Huffman, S. B., Willis, M. B., Laird, J. E., & Jones, R. M. (1993). A symbolic solution to intelligent real-time control. Robotics and Autonomous Systems, 11, 279-291.
Rao, A. S., & Georgeff, M. P. (1991). Modeling rational agents within a BDI-architecture. In Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning, pp. 471-484.
Russell, S., & Norvig, P. (1995). Artificial Intelligence: A Modern Approach. Prentice Hall, Upper Saddle River, NJ.
Sacerdoti, E. D. (1975). The nonlinear nature of plans. In Proceedings of the 4th International Joint Conference on Artificial Intelligence, pp. 206-214.
Schut, M., & Wooldridge, M. (2000). Intention reconsideration in complex environments. In Proceedings of the 4th International Conference on Autonomous Agents, pp. 209-216.
Schut, M., & Wooldridge, M. (2001). Principles of intention reconsideration. In Proceedings of the 5th International Conference on Autonomous Agents, pp. 340-347.
Shoham, Y. (1993). Agent-oriented programming. Artificial Intelligence, 60(1), 51-92.
Simon, H. A. (1969). The Sciences of the Artificial. MIT Press, Cambridge, MA.
Stallman, R. M., & Sussman, G. J. (1977). Forward reasoning and dependency-directed backtracking in a system for computer aided circuit analysis. Artificial Intelligence, 9(2), 135-196.
Sycara, K., Decker, K., Pannu, A., Williamson, M., & Zeng, D. (1996). Distributed intelligent agents. IEEE Expert, 11(6), 36-46.
Tambe, M. (1991). Eliminating Combinatorics from Production Match. Ph.D. thesis, Carnegie-Mellon University. (Also published as Technical Report CMU-CS-91-150, Computer Science Department, Carnegie Mellon University.)
Tambe, M., Johnson, W. L., Jones, R. M., Koss, F., Laird, J. E., Rosenbloom, P. S., & Schwamb, K. (1995). Intelligent agents for interactive simulation environments. AI Magazine, 16(1), 15-39.
Veloso, M. M., Pollack, M. E., & Cox, M. T. (1998). Rationale-based monitoring for planning in dynamic environments. In Proceedings of the 4th International Conference on Artificial Intelligence Planning Systems, pp. 171-180.
Wilkins, D. E., Myers, K. L., Lowrance, J. D., & Wesley, L. P. (1995). Planning and reacting in uncertain and dynamic environments. Journal of Experimental and Theoretical Artificial Intelligence, 7(1), 197-227.
Wooldridge, M. (2000). Reasoning about Rational Agents. MIT Press, Cambridge, MA.
Wray, R. E. (1998). Ensuring Reasoning Consistency in Hierarchical Architectures. Ph.D. thesis, University of Michigan. Also published as University of Michigan Technical Report CSE-TR-379-98.
Wray, R. E., & Laird, J. (1998). Maintaining consistency in hierarchical reasoning. In Proceedings of the 15th National Conference on Artificial Intelligence, pp. 928-935.
Wray, R. E., Laird, J., & Jones, R. M. (1996). Compilation of non-contemporaneous constraints. In Proceedings of the 13th National Conference on Artificial Intelligence, pp. 771-778.
Wray, R. E., Laird, J. E., Nuxoll, A., & Jones, R. M. (2002). Intelligent opponents for virtual reality trainers. In Proceedings of the Interservice/Industry Training, Simulation and Education Conference (I/ITSEC) 2002.