G. A. Kaminka, D. V. Pynadath and M. Tambe
Volume 17, 2002
Links to Full Text:Journal of Arti034cial Intelligence Research 17 {2002} 83025135Submitted
10/01; published 08/02
Monitoring Teams by Overhearing: A Multi-Agent Plan-Recognition Approach
Gal A. Kaminka galk@cs.biu.ac.il Computer Science Department Bar Ilan University
Ramat Gan 52900, Israel
David V. Pynadath pynadath@isi.edu
Milind Tambetambe@usc.edu
Computer Science Department and Information Sciences Institute University
of Southern California
4676 Admiralty Way
LosAngeles, CA 90292, USA
Abstract Recent years are seeing an increasing need for on-line monitoring
of teams of coop- erating agents, e.g., for visualization, or performance
tracking. However,in monitoring
deployed teams, we oftencannot rely on the agents to always communicate
their state
to the monitoring system.This paper presents a non-intrusiveapproach to
monitoring by
overhearing, where the monitored team's state is inferred{via plan-recognition}
from team-
members'routine communications, exchanged aspart of their coordinated task
execution, and observed {overheard} by themonitoring system. Key challenges
in thisapproach in-
clude the demanding run-timerequirements of monitoring, the scarceness of
observations
{increasing monitoring uncertainty}, and the need to scale-up monitoring
to addresspoten-
tially large teams. To address these, we present a set ofcomplementary novel
techniques, exploiting knowledge of the socialstructures and procedures in
the monitored team:{i}
an e036cient probabilisticplan-recognition algorithm, well-suited for processingcommuni-
cations as observations;{ii} an approach to exploiting knowledge ofthe team's
social be-
havior to predict future observations during execution {reducingmonitoring
uncertainty};
and{iii} monitoring algorithms that trade expressivityfor scalability, representingonly
certain useful monitoring hypotheses, but allowing for any number of agents
and their di033erent activities to be represented in a single coherent entity.
We present an empirical evaluation of these techniques, in combination and
apart, in monitoring a deployed team of agents, running on machines physically
distributed across the country, and engaged in
complex, dynamic taskexecution. We also compare the performanceof these
techniques
to human expertand novice monitors, and show that the techniques presented
are capable
of monitoringat human-expert levels, despite the di036culty of the task.
1. Introduction Recent years have seen tremendousgrowth of applications
involving distributed multi-agent
teams, formed of agents thatcollaborate on a speci034c joint task {e.g.,Jennings,
1995; Pe-
choucek,Marik, & Stepankova, 2000,2001;Kumar & Cohen, 2000; Kumar,Cohen,
& Levesque, 2000; Horling, Benyo, & Lesser, 2001; Lenser, Bruce, & Veloso,
2001; Barber&
Martin, 2001}. This growth hasled to increasing need for monitoring techniquesthat
allow a
c
fl2002 AI Access Foundation and MorganKaufmann Publishers. All rights reserved.Kaminka,
Pynadath,& Tambe
synthetic agent or human operator to monitor and identify the stateof the
distributed team.
Previouswork has discussed the critical role of monitoringin visualization
{e.g., Ndumu,
Nwana,Lee, & Collis, 1999}, in identifying failures inexecution {e.g., Horling
et al., 2001}, in providing advice to improve performance {e.g., Aiello,
Busetta, Dona, & Sera034ni, 2001}, and in facilitating collaboration between
the monitoring agent and the members of theteam
{e.g., Grosz & Kraus, 1996}. This paper focuses on monitoring cooperative
agent teams by overhearing theirinter-
nal communications. Thisallows a human operator or a syntheticagent to monitor
the
coordinatedexecution of a task, by listening to the messages team-members
exchange with
eachother. It contrasts with previous techniquesthat are impractical in
settings where direct observations of the team members are unavailable {e.g.,
when team-members are
physically distributed awayfrom the observer}, or in large-scale applicationscomposed
of
alre ady-deploye dagents that are dynamically integrated tojointly execute
a task.
Forexample, one common technique, rep ort-b ase d monitoring, requires each
monitored team-member to communicate its state tothe monitoring agent at
regular intervals, or at
least whenever the team-member changes its state. Suchreporting provides
the monitoring
agent with accurate information on the state ofthe team. Unfortunately,
report-based mon- itoring su033ers from several di036culties in monitoring
large deployed teams of interest inthe
real-world {see Section 2 for adetailed discussion}: First, it requires
intrusive modi034cations
to the behavior ofagents, such that they report their state asneeded by
the di033erent moni-
toring applications. However,sinceagents are already deployed, such repeated
modi034cations
to the behaviorof the agents are di036cult to implement andcomplex to manage.
In par-
ticular, legacy and proprietary systems are notoriously expensiveto modify
{for instance,
consider thenotorious modi034cations to address the Year 2000 bug, also
known as Y2K}. Second, the bandwidth requirements of report-based monitoring
{which relies on communi- cation channels} can be unrealistic{Jennings, 1993,
1995; Grosz & Kraus, 1996; Pechoucek
et al., 2000, 2001; Vercouter, Beaune, & Sayettat, 2000}.In addition, network
delays and unreliable or lossy communication channels area key concern with
report-based monitoring approaches.
We therefore advocate an alternative monitoring approach, based onmulti-agent
keyhole
plan-recognition{Tambe, 1996; Huber & Hadley, 1997; Devaney & Ram, 1998;
Intille& Bobick, 1999; Kaminka & Tambe, 2000}. In this approach,themonitoring
system infers
theunobservable state of the agents based on their observable actions, using
knowledge of the plans that give rise to the actions.This approach is non-intrusive,
requiring nochanges
to agents' behaviors;and it allows for changes in the requestedmonitoring
information.
It assumes access toknowledge of plans that may explain observable action026however
this
knowledge is readily availableto the monitoring system as we assume it isdeployed
in a
collaborativeenvironment. Indeed, in some cases, the monitoring system may
be deployed
by the human operator of the team. An additional bene034t of a plan-recognition
approach is that it can rely on inference to compensate for occasional communication
losses,and can therefore be robust to communicationfailures.
In general, the only observable actions of agents in a distributed teamare
their routine
communications,which the agents exchange as part of taskexecution {Ndumu
et al., 1999}. Fortunately , the growing popularityof agent integration tools
{Tambe,Pynadath, Chauvat,
Das, & Kaminka, 2000; Martin, Cheyer, & Moran, 1999} andagent communications
{Finin,
84Monitoring Teams by Overhearing Labrou, & May034eld, 1997; Reed, 1998}increases
standardization of aspects of agent commu-
nications, and provides increasing opportunities for observing and interpreting
inter-agent
communications. We assume thatmonitored agents are truthful in their messages,since
they are communicating to theirteammates; and that they are not attempting
to deceive
the monitoring agent or prevent itfrom overhearing {as it is deployed bythe
human oper-
ator of the team}.Given a {possibly stochastic} model of the plans that
the agents may
be executing, a monitoring system usingplan-recognition can infer the current
state of the agents from such observed routinemessages.
However, the application ofplan-recognition techniques for overhearing posessigni034cant
challenges. First,a key characteristic ofthe overhearing taskis the scarcity
of observations. Explanations for overheard messages {i.e.,the observed actions}
can sometimes be fairlyeasy
to disambiguate, but uncertaintyarises because there are relatively few
of themto observe:
team members cannot and donot in practice continuously communicate amongthemselves
about their state {Jennings,1995; Grosz & Kraus, 1996}. Thusteam-members
change their
state while keeping quiet. Another key characteristic of overhearing is
that the observable actions are inherently multi-agent actions: When agents
communicate, it is only a single
agent that sendsthe messages. The others implicitly acttheir role in the
communications by listening. Yet despite the scarcity of observable communications,
andthe multi-agent nature
of the observed actions,a monitoring system must infer the state of allagents
in the team,
atall times.Previous investigations of multi-agentplan-recognition {Tambe,
1996; Devaney
& Ram, 1998; Intille & Bobick,1999; Kaminka & Tambe,2000} have typically
made the assumption that all changes to the state ofagents have an observable
e033ect:Uncertainty
resulted from ambiguityin the explanations for the observed actions.Furthermore,
these
investigationshave addressed settings where observable actions were individual
{each action is carried out by a single agent}. In addition to these challenges
that areunique to overhearing, a monitoring system must address additional
challenges stemming fromthe use of monitoring in service of visualiza- tion.
The representation and algorithms must support soft real-time response; reasoning
must be done quickly to be usefulfor visualization. Furthermore, real-worldapplications
demand techniques that canscale up as the number of agents increases, for
monitoring large
teams. However,many current representations for plan-recognitionare computationally
in-
tense {e.g.,Kj346rul033, 1992}, or only address single-agentrecognition
tasks {e.g., Pynadath
&Wellman, 2000}. Multi-agentplan-recognition investigations have typicallynot
explicitly
addressed scalability concerns{Devaney & Ram, 1998; Intille & Bobick,1999}.
This paper presentsOverseer, animplemented monitoring systemcapable of moni-
toring large distributedapplications composed of previously-deployed agents.
Overseer
builds on previous work in multi-agent plan-recognition {Tambe, 1996; Intille
& Bobick, 1999; Kaminka & Tambe,2000} by utilizing knowledge of the relationships
between agents
to understand howtheir decisions interact. However,as previous techniques
proved insu036- cient, Overseer includes a number of novel multi-agent plan-recognition
techniques that
address the scarcity of observations, as well as the severe response-timeand
scale-up re-
quirements imposed by realistic applications. Key contributions include:{i}
a linear time
probabilistic plan-recognition representation and associated algorithms,
which exploit the nature of observed communications fore036ciency; {ii}
a method for addressing unavailable
observ ations by exploiting knowledge of the social pro c e dur esof teams
to e033ectively predict 85Kaminka, Pynadath,& Tambe
{and hence e033ectivelymonitor} future observations during normal andfailed
execution, thus
allowinginference from lack of such observations; and {iii} YOYO*, an algorithm
thatuses
knowledge of the team organizationalstructure {team-hier ar chy }to model
the agent team
{withall the di033erent parallel activities taken by individual agents}
using a single structure, instead of modeling each agent individually. YOYO*
sacri034ces some expressivity {theability
to accurately monitor the team incertain coordination failure states} for
signi034cant gains in
e036ciency and scalability.
W e present a rigorous evaluation of Overseer's di033erentmonitoring techniques
in one
ofits application domains and show that the techniques presented result
in signi034cant boosts
to Overseer's monitoring accuracyand e036ciency, beyond techniques explored
in previous
work. Weevaluate Overseer's capability toaddress lossy observations, a key
concernwith
report-based monitoring. Furthermore, weevaluate Overseer's performance
in comparison
with humanexpert and novice monitors, and show thatOverseer's performance
is compa- rable to that of human experts, despitethe di036culty of the task,
and Overseer's reliance
on computationally-simple techniques. One of the key lessons that wedraw
in Overseer
is that a combination of computationally-cheap multi-agentplan-recognition
techniques, ex-
ploitingknowledge of the expected structures and interactions among team-members,
can
be competitive with approaches which focus on accurate modeling of individual
agents{and
may be computationally expensive}.
This paper is organized as follows. Section 2 presents the motivation for
the design
of Overseer, using examples from an actual distributedapplication in which
Overseer
was applied. Section 3 presents a novel single-agent plan-recognition representation
and associated algorithms, particularlysuited to monitoring an agent based
on its observed communications. Section 4 explores severalmethods Overseer
uses to address uncertainty
in using this representation formonitoring a team of agents. Section5 presents
YOYO*,
whichallows e036cient reasoning using the methodspreviously discussed.
Section 6 presents an evaluation of the di033erent techniques incorporated
in YOYO*. Section7 contrasts the
techniques presentedwith previous related investigations, and 034nally,
Section 8 concludes
andpresents ourplans for future work. In addition, severalappendices present
all pseudo-
codefor algorithms discussed in the text, and portions of the data used
in our experiments, for those readers who may wish to replicatethe experiments.
2. Motivation and Illustrative Examples Several considerations, based on
our experience with actual distributed applications, have directed us towards
the plan-recognitionapproach we advocate in this paper.We present
these considerations inthe context of an illustrative complex distributedapplication,
which
wealso use for evaluating Overseer in Section 6.In this application, a distributed
team of 11 to 20 agents executes a simulation ofan evacuation of civilians
from a threatenedlocation.
The integrated system allowsa human commander to interactively provide locations
of the
stranded civilians, safeareas for evacuation and other key points. Simulated
helicopters then
035y a coordinated mission to evacuate the civilians, relying on various
information agents to
dynamicallyobtain information about enemy threats, {re}planroutes to avoid
threats and
obstacles,etc. The distributed team is composed of diverse agents from four
di033erent re- search groups: A Quicksetmulti-modal command input agent
{Cohen,Johnston, McGee,
86Monitoring Teams by Overhearing Oviatt, Pittman, Smith, Chen, & Clow,1997},
a Retsina route planner {Payne,Sycara,
Lewis, Lenox, & Hahn, 2000}, theAriadne information agent {Knoblock, Minton,
Am-
bite, Ashish, Modi, Muslea, Philpot, & Tejada, 1998} and eight synthetichelicopter
pilots
{Tambe,Johnson, Jones, Koss, Laird, Rosenbloom, & Schwamb, 1995}.
The agents were notdesigned to work together on this task026they were already
built
and deployed prior tothe creation of the team. The team is integrated using
Teamcore
{Tambeet al., 2000}, which accomplishes integration by 020wrapping021
each agent with a proxy that maintains collaboration withother agents {via
their own proxies}.The proxies
and agents form a team,jointly executing a distributed application described
by a team-
oriente dpro gr am. Such a program consistsof:
* A team hierarchy, where a team decomposes into subteams, andsub-subteams.
* A plan hierarchy, which contains team plans that decompose into subteam
plans
* Assignment of teams from the team hierarchy to plans in the plan hierarchy.
As an example, Figure 1-a showsa part of the team/subteam hierarchy used
inthe
evacuation-domain {described below}. Here, for instance, TRANSPORTis a subteam
of
FLIGHT-TEAM, itself asubteam of TASK-FORCE. Figure 1-b showsan abbreviated
plan-
hierarchyfor the same domain. High-level team plans,such as Evacuate, typically
de- compose into other team plans, such asProcess-Orders, and, ultimately,intoleaf-level
plans that are executedby individuals. Temporal transitions areused to constrain
the or-
der of executionof plans. There are teams assigned to executethe plans,
e.g., the TASK
FORCEteam jointly executes Evacua te, while only the TRANSPORT subteam executes
the Transport-Opera tions{Transport-Ops} step. The team-oriented program
for
this application consists of about 40 team-plans. Some plans may get executed
repeatedly
though, so each agent mayexecute up to hundreds of plan steps as part ofthe
execution of
a single team-orientedprogram.
To execute the team-orientedprogram, each proxy uses a domain-independentteamwork
model, called STEAM {Tambe, 1997}. The teamwork modelautomatically generates
the
communicationmessages required to ensure appropriate coordinationamong the
proxies.
For instance, STEAM requires that if an agent privatelyobtains a belief
bel thatterminates
ateam plan, then that agentshould send a message to the rest of the team
toterminate that
team plan, along with the private belief bel thatled to that termination.
To avoidjamming
the communication channels with a 035ood of messages about every single
plan, STEAM
chooses to communicate selectively. Thus, whereas communicating aboutthe
initiation and
termination of each and every plan would have led to 2000 ormore messages
generated in
one run, only about 100 messages get exchanged in any one runwhen using
STEAM {Tambe
et al., 2000}.
Figure 2 displayssome of the messages exchanged among team members in the
evac-
uation application,through the use of STEAM. The 034rst message issent
from a proxy
called 020teamquickset021 to members of a team TEAM-EVAC {another name
for TASK FORCE}. The content of this messageindicates that the team should
terminate a plan called determine-number-of-helos.The second message is sent
from a proxycalled
87Kaminka, Pynadath,& Tambe
Gal A. KaminkaTASK FORCEFLIGHTTEAMTRANSPORTESCORTROUTEPLANNERESCORTFOLLOWTRANSPORTDIVISION
1........ESCORTLEAD {a}{b}EVACUATE.....[TASK FORCE]EXECUTEMISSION[TASK FORCE]PROCESS[TASK
FORCE]MANEUVERSZONELANDING....FLY-FLIGHTPLAN[FLIGHT TEAM]FLY-CONTROLROUTE....ORDERS[FLIGHT
TEAM][FLIGHT TEAM]TRANSPORTOPERATIONSESCORT[ESCORT][TRANSPORT]OPERATIONSORDERSGET[GET
ORDERS]GET ORDERSROLEFigure1: Portions of the team-hierarchy {a}and plan-hierarchy
{b} used in our domain. Dotted lines show temporal transitions. 020team_auto2021
to members of asubteam TEAM-ESCORT-FOLLOW {asubteam ofES-
CORTS}. The content of thismessage indicates that the subteam should establishcommit-
ment to a plan namedprepare-to-execute-mission. The online appendix presents
sample
logs of the overheard messages from complete runs, as well asthe plan and
team hierarchies
for the evacuation application.
As discussed inSection 1, the capability for automatically monitoring the
progress of
the team is critical.This need for team monitoring is furtherampli034ed
in distributed
settings,since a human operator in one place cannotdirectly observe the
agents executing in a remote location. Forinstance, in trial runs of the
evacuationsimulation application
describedabove, monitoring sometimes required a seriesof frantic phone calls
among human operators in di033erent states, trying to verify the successful
execution of the system as itwas
operating. And even when thisagent team was co-located on multiplecomputers
in one
room, the diversity of agents made it extremely di036cult for an observer
to automatically
monitor the state ofthe team just from observing the di033erent agent output
screens.
Overseerwas built to provide such monitoring bytracking the routine communica-
tions among the agents {Figure 2}.Using plan-recognition, itallows humans
andagents
to query about the present andfuture likely plans of the entire team, itssubteams
and
individuals026to monitor progress,compute likelihoods of failure, etc.However,
given that
theagent team communicates selectively about theplans being executed, Overseer's
plan- recognition faces signi034cant uncertainty. Furthermore, Overseer
must beable to answer
queries on-line, andmust therefore work e036ciently.As discussed later,
addressing these challenges has required several novelteam-based plan-recognition
techniques to be devel-
oped.
Several considerationshave led us away from report-basedmonitoring for this
and other
Teamcoreapplications. First, report-based monitoringrequires that agents'
code be mod- i034ed to communicate the reports needed for monitoring; as
monitoring requirements change 88Monitoring Teams by Overhearing Log Message
Received; Fri Sep 17 18:27:541999:
Logging Agent: teamquickset Message==> tell
:content teamquicksetterminate-jpg constant determine-number-of-helos number-of-helos-determined
*yes* 4 4 98kqml_string
:receiver TEAM-EVAC 9 kqml_word :reply-withnil 3 kqml_word
:team TEAM-EVAC9 kqml_word
:sender teamquickset 12kqml_word
:kqml-msg-id21547+tsevet.isi.edu+722 kqml_word
Log Message Received; Fri Sep17 18:30:35 1999:
Logging Agent: TEAM_auto2 Message==> tell
:content TEAM_auto2establish-commitment prepare-to-execute-mission
58 kqml_string
:receiverTEAM-ESCORT-FOLLOW 18 kqml_word
:reply-withnil 3 kqml_word
:team TEAM-ESCORT-FOLLOW18 kqml_word
:sender TEAM_auto2 10 kqml_word :kqml-msg-id20752+dui.isi.edu+16 20kqml_word
Figure 2: ExampleKQML messages used as observationsby Overseer.
89Kaminka, Pynadath,& Tambe
from one application to thenext, so does the information needed about each
agent. Un-
fortunately,theagents and their proxies are already deployed in several
government lab- oratories and universities. Modifyingthe agents at each deployed
location isproblematic
and intrusive026modi034cationsinterfere with carefully designed timing
speci034cations of given
tasks, requiring further modi034cations by other agent developers.The distributed
nature of
Teamcoreimplies that there is no centralized serverwhich controls the behavior
of the agents, but instead changes are required inthe di033erent proxy types.
Indeed,in general,
modifying legacy andproprietary applications {including the integration
architecture} is of
course known to bea di036cult process, and so a solution thatrequires constant
modi034cations
to the agents and architecture will not scaleup.
A second important consideration was the computational and bandwidth requirements
of report-based monitoring. As has beenrepeatedly noted in the literature,
one cannot expect
agents to be able to communicate continuously and fully monitor all other
agents {e.g.,
Jennings, 1993, 1995; Grosz & Kraus, 1996; Pechoucek et al., 2001; Vercouter
et al., 2000}.
Ina teamof 11 {used as an example in this paper},regularly scheduled state
reports from the agents at the required temporal resolutionwould require
approximately 50,000 messages to be sent during a 15-minute run, with the
number nearly doubling when we reach 20agents.
If we instead have the 11 agents only report on state changes, announcing
plan initiation
and termination, approximately2,000 messages have to be sent.However, this
is still an
order-of-magnitude more than the normal 100 messagesor so that are exchanged
by the 11agents as part of routine execution.Even if the network could support
thebandwidth
necessary for report-based monitoring,there is also a signi034cant computational
burdenon
themonitoring system to process all the incoming reports.
On the other hand, aplan-recognition approach seemed like a natural034t
for the task.
First, it doesn'trequire any changes in the behavior of themonitored agents,
and is thus
very suitable for monitoring agents that arealready deployed. Second, it
doesn't addany
computational burdens to the monitoredagents or the network, sinceit uses
onlywhat
observations are already available. Third, the main knowledge sourceplan-recognition
sys-
tems typically relyon026a plan library026is in fact easily available in
accessible form to the monitoring system from the team-oriented programwhich
is used to integrate the agents, since the operator deploying the monitoringsystem
is assumed to be the one to describe the integration team-oriented program
in the 034rst place. Thus plan-recognition's sometimes criticized assumption
of a correct plan-libraryis in fact satis034ed fully in this monitoring
application.
Note that this assumptionholds even if agents are not all using thesame
integra-
tion architecture:The only knowledge we rely on is a {possibly stochastic}
model of how components of execution 034t together, andthe communications
that are used to integrate them. Therefore, while this paper focuseson team-oriented
programs {described above}, the
techniques introduced appeargeneralizable to other types of representationlanguages
for
distributed systems, such asT306MS {Decker, 1995}, team-oriented programming{Tidhar,
1993a} and others. Furthermore, the plan-library need not containimplemetation
details026
only the names ofthe key steps. Thus even agentsutilizing radically-di033erent
representations than a plan-hierarchy can be monitored, as long as they have
execution states corresponding
to the team-oriented program {which they have to have in any case inorder
to coordinate
with other team-members}.
90Monitoring Teams by Overhearing Monitoring by overhearing posesuniquechallenges
as previously discussed. However,it
also o033ers unique opportunities forplan recognition. We had earlier stated
ourassumption
that agents are truthful intheir communications, and do not seek todeceive
their teammates
or the monitoringsystem, nor prevent overhearing in any way {e.g., encryption}.
This
assumptionis justi034ed as the monitoring system is deployed by the operator
of the monitored agents, or by an agent team-member.Failures of the team
to coordinate{e.g., due to clock
asynchrony orunintentional erroneous messages} will thereforecause corresponding
failures
inmonitoring. However, wedo not makeadditional assumptions about the messages
beyond those that are made by the monitoredagents themselves.
This assumption allowsa plan-recognition system to treat observations with
certainty:
When a messageis overheard terminating plan X, the monitoring system can
infer with certainty that indeed the planX is no longer executed. However,thisdoes
not eliminate
planrecognition ambiguity. First,multiple instantiations of plan Xmay exist,
andthe
messagedoes not specify which one was terminated.Second, upon termination
of the plan, the monitored team-member must often choose between multiple
alternative plansteps to
follow X ,and yet this choice is not evident in theobservations. Indeed,
thedi036culty of monitoring by overhearing isdemonstrated by human monitoring
performance:Novice
human monitors have managedto only achieve approximately 60045 accuracy
on average.
3. Monitoringa Team of Agents as Separate Individuals In this section, we
present a representation and associated baseline algorithms to support
overhearing based on the plan-hierarchy and team-hierarchy. We begin by
making an as-
sumptionof agent independence, where observations and beliefs about one
agent's stateof
execution have no bearing on our beliefs about another agent's state.This
assumption can
becontrasted withanother: If we assume instead that team-members are successful
in their
coordination,then knowing that one agent has begunexecuting a joint plan
would natu- rally increase the likelihood that itsteammates have begun as
well, as agentswould not be
considered independent.In fact, successful teamwork re quir es interdependency
among the agents {Grosz, 1996}.
However,an initial assumption of agent independence provides a baseline
of comparison,
asit more closely follows current approaches tomulti-agent plan recognition,
which often assume that observations about eachindividual agent are continuously
available. Later
sections {Sections 4 and5} will highlight the unique challenges tackled
in monitoring by
overhearing,and will take agent interdependencies intoaccount.
We thus begin bymaintaining a separate plan recognizer for eachagent. Each
recog-
nizer observesonly those messages that its respective agentsends. On the
basis of these
observations, the recognizer maintains aprobabilistic estimate of the state
of execution of the various plans the agent may be currently executing. Knowledgeof
the plans assigned to
agentsand their team memberships is available in our application from the
plan-hierarchy
and team-hierarchy of the team-oriented program used in constructing the
monitored appli- cation.
Section 3.1 presents thelanguage we use for the probabilistic representation
of a team-
oriented program.We exploit various independence properties within team-oriented
pro-
gramsto achieve a compact representation ofthe possible plan states of the
agents. Sec- 91Kaminka, Pynadath,& Tambe
tion 3.2 presents analgorithm for updating the recognizer's beliefs about
the agents' plan
states upon theobservation of a message. Thisalgorithm performs the update
with an e036ciency gained by exploiting the particularsemantics of communicated
messages, namely that each such message is an observation that indicates
the initiation/termination ofapar-
ticular plan with certainty. Section 3.3 presents an algorithm for updating
the recognizer's
beliefs about theagents' plan states when no messagehas been observed. In
the absence of any such evidence, this algorithm e036ciently updates the
recognizer's beliefs by usinga
temporal model of the agents' planexecution that makes a strong Markovianassumption.
Finally, Section 3.4 presents the overall recognition procedure, as well
as an illustration and
complexity analysisof that procedure.
3.1 Plan-StateRepresentation
We address uncertainty in monitoring through a probabilistic modelthat supports
quanti-
tative evaluation of the recognized plan hypotheses.Since we are monitoring
these agents through the duration of their execution, weuse a time series
of plan-state variables.At
each point in time, the agent's plan state is the state of the team-orientedprogram
that it
is currently executing, i.e., a path from root to leaf in the team-orientedprogram
tree. We
representthe plans in the program by a set of boolean random variables,
fX t
g, where each variable X
t
is true if and only if the agent is actively executing plan X at timet.
We then
representour beliefs about the agent's actual state at time t as a probability
distribution over all variables fX t
g. The distribution takes into account dependencies among thedi033erent
plans in the team-orientedprogram {e.g., parent-child relationships}, as
well as the tempo-
ral dependencies between the plan state at times t andt + 1. To simplify
the dependency
structure, it is useful to introduce additional boolean random variables,done{X;
t},that are
true if and only if planX was executed at time t000 1 and its execution
has terminated at time t.
There are a number of possible representations for capturingthe distribution
and per-
forming inference over these variables. However,the generality of the plan
hierarchy, the
dynamic nature of the domain, and the requirements of the task eliminate
mostexisting ap-
proaches from consideration.For instance, we could potentiallygenerate a
DBN026Dynamic
Belief Network{Kj346rul033, 1992}026to represent the probabilisticdistribution
over the plan
variables. To do so, we include nodesrepresenting all of the plan variables,X
t
, as well as representing done{X;t}. The links among these nodesrepresent
the structure of the plan hierarchy {e.g., parent-childrelationships, temporal
constraints}, and we can034ll in the con-
ditional probabilitytables accordingly. We also representthe temporal progress
of the team
by including nodes for the variablesat the next time slice, X
t+1
. We add links fromthe X
t
nodesto the X
t+1
nodes and represent the dynamics in theconditional probability tables
on those links.For each transition from a nodeX
t
to a nodeY
t+1
{X6= Y }, we would also add binary nodes indicating the observation of a
message along that transition.Thus,
for a plan hierarchy withM plan nodes, the corresponding DBNrepresentation
will have
O{4M + M
2 } = O{M
2
} binary random variables.
The standard DBN inferencealgorithms maintain a belief state,b
t
, representing the posterior probability distribution overthe variables
in time slice, t, conditioned on all of
the observations made so far {from time 0025t}. These inference algorithms
can update the
92Monitoring Teams by Overhearing belief state to incorporate newevidence
about any variables,X
t
, and they can alsocompute
the next time-tick's belief state, b
t+1
. We can extract the desired probability overplan-
state variables by examiningthe posterior probabilities stored in b t
. Given the dependency structure of our plan model, the space andtime complexity
of performing inference using this DBN {either incorporating a single observation,
or computing b
t+1
} is O{2
M
2
} for a single agent.
This DBN method isnot su036ciently e036cient to support on-linemonitoring
in real-world
domains,sinceon each and every time step, therecognizer must perform an
inferential step of exponential computational complexity. There exist single-agent
plan-recognition techniques that avoid the exponential complexity of DBNs
by using a representation and
inference algorithms aimed at theparticular properties of the plan-recognition
task{e.g.,
Pynadath & Wellman,2000}. Such specialized representations avoid the full
generality
ofDBNs, while still capturing a broad class of interesting planning agent
models. Given a specialized representation,the single-agent plan-recognition
algorithms canexploit the
particular structure of the planmodels to achieve e036cient online inference.
Drawing our inspiration from the success ofthis work in single-agent domains,
we adopt a similar methodology in our multi-agentdomain. In other words,
we have developed a
novel plan-recognition representation more suited to capturing team-orientedprograms.
The
structural assumptions we make in this representation support e036cientinference
with our
specializedalgorithms, as well as more naturally supportingan extension
to represent inter-
agent dependencies {as discussed in Section 4}. We represent the team-oriented
planas a directed graph, whose vertices are plans, and whose edges signify
temporal and hierarchical decomposition transitions between plans: Children
edges denote hierarchical decomposition of a plan into sub-plans. Siblingedges
denote temporal orderings between plans. Following the structure of theplan
hierarchy, the
variables fX
t
g form a directed connected graph, such thateach node X
t
has at most one
hierar chic al-dec omp osition incoming transitionfrom a parent node {representing
itsparent
plan}, and any number oftempor al incoming transitionsfrom plans that precede
it in order of execution. The graph may contain multiple nodes for a single
plan, if the plan isthe po-
tential child of multipleparent plans. The node may have any number of temporal
outgoing transitions to immediate successor sibling nodes {representing plans
that may follow it in order of execution}, and any numberof hierarchical-decomposition
outgoing transitions to the node's 034rst children{i.e., those that will
be executed 034rst bya decomposition of the
plan X t
. The graph forms a treealong hierarchical decomposition transitions, sothat
no
plan can have itself as adescendent. On the other hand, there may be cycles
along temporal
transitions{to siblings}. In other words, a plan may have an outgoing temporal
transition to itself {meaning that it can be selected for execution again
upon termination},or to a
node that has a temporalpath leading back to the plan {meaning that itis
the 034rst node
in a temporalsequence of plans that may be executed repeatedly}. It may
also have two alternative temporal paths leading indirectlyfrom one node
to another.
To perform inference with this representation, we borrow the standard DBN
inference algorithms' notion of a belief state,b
t
. As in the DBNcase, the belief state represents the posterior probability
distribution overthe variables in time slice, t, conditioned on all of the
observations made so far. In addition, for eachplan, we distinguish between
a state ofactual
execution and a blocke d state, indicating that execution hasterminated,
but execution of
93Kaminka, Pynadath,& Tambe
a successor has not yetbegun {perhaps because the agent is in the process
of sending a
message}.Thus, b
t
{X; block} is our belief thatX has terminated, but the agent has not begun
execution of a successor;b
t
{X;:block} is then our belief attime t that the monitored
agentis currently executing X , which hasnot yet terminated. More precisely,
we de034ne
b
t {X; block} 021Pr{X
t
; done{X; t + 1}jE }and b
t
{X;:block} 021 Pr{X
t
; :done{X; t + 1}jE },
where E again denotes all ofthe evidence we have received so far.If the
recognizer observes
a message froman agent at time t, it updates itsprevious belief state, b
t , by incorporating
theevidence into its new belief state,b
t+1
, according to themethod described in Section
3.2.If it does not observe a message from anagent at time t, it propagates
belief into its
new belief state, b t+1
, using the methoddescribed in Section 3.3 to simulate planexecution
over time.
3.2Belief Update with Observed Message While observing team communications,
therecognizer can expect to occasionally receive evidence in the form of
messages {sent by an individual agent member} that identifyeither
plan initiation or termination.In incorporating this evidence, we exploit
theassumption
that the agents are truthful intheir messages. In other words, if weobserve
an initiation
message for a plan,X , at time t, then X t
is true with certainty. Likewise, if we observe a termination message for
a plan,X , at time t, then done{X; t + 1} is true withcertainty.
More precisely, the algorithms presented in this section arespecialized
to exploit the prop-
erty of observed communications, whereforany observation 012, eitherPr{X
t
j012; E } = 1 or Pr{done{X; t}j012; E } = 1, for any possible previously
observedevidence, E .
Though messages areassumed truthful, there still remains ambiguity. First,
while a
messageuniquely speci034es the relevantplan, it does not uniquely specify
therelevant node.
In other words, the recognizer is still unsureabout which particular X
t
node the message
refers to, since the graph may contain multiple X
t
nodes consistent with the message. Fur-
thermore,when a message announces termination of a plan {even with no ambiguity
about
the corresponding node}, there still remains ambiguity about the next plan
selected by the agent.
The observationsavailable in the overhearing tasks ofimmediate interest
to us fall into this level of ambiguity.In our evacuation scenario example,
thereare two nodes correspond-
ing to the plan land-troops, becausethere is one instance of land-troopsfor
picking up
the people to betransported and another for dropping them o033.If the recognizer
observes
a messageindicating that an agent has initiated execution ofland-troops,
then there is ambiguity about which of the twoinstances is currently relevant.Furthermore,
there may
existambiguity about which plan the agent willselect after terminating land-troops.
Algorithm 1 presents the pseudo-code for thecomplete procedure for incorporating
evi- dence from observations.
Incorporating Evidence of an Observed Initiation Message {lines 30258} Suppose
that, at time t, we haveobserved a message, msg, that correspondsto initiation.
If only one
plan,X , is consistent with msg, then we know, with certainty, that the
agent is executing X,
regardless of whatever evidence wehave previously observed. Therefore,we
can simply set
our belief thatX
t
is true to be 1.0.If multiple plans are consistent withmsg, we distribute
theunit probability over each consistent plan, weighted by our prior belief
in seeing the given
94Monitoring Teams by OverhearingAlgorithm 1Incorporate-Evidence{msg m,
beliefsb, plans M }1: Initialize distributionsb
0
; b
t+1
0:0for all plans in M
2:for all plans X 2 Mconsistent with m do
3: if m is an initiation messagethen
4: for all plansW that precede X do 5: b
0
{X; :block} b
0
{X;:block} + b
t {W;block}026 wx
031
wx 6: else {m is a terminationmessage}
7: for allplans Y 2 M that succeedX do
8: b
0
{Y ; :block} b
0
{Y ; :block} + b t
{X;block}026
xy
031 xy
9: Normalize distributionb
0
10: forall plans X 2 M withb
0
> 0do
11: b
t+1
{X; :block} b
0
{X; :block}
12:Propaga te-Down{X;b
0
{X; :block}; b; M }
13: tmp X
14:while parent{tmp}6= null do
15:b
t+1
{parent{tmp}; :block} b
t+1 {parent{tmp}; :block} + b t+1
{tmp;:block}
16: tmp parent{tmp}message. This prior belief depends on all predecessor
plans of Xthat may have terminated
prior to seeing this message.
To support the computation of the beliefs over transitions from predecessor
plans to successors, as well as the beliefs ofseeing a message for a given
transition,Overseer stores
two parameters:031 and 026. The former is the probability of entering
a successor plan,X ,
given that predecessor plan,W , has just completed: 031 wx
021 Pr{X
t+1
jW t
; done{W; t+ 1}).
The latter is theprobability of seeing a message, given that theagent took
the speci034ed
transition:026
wx
021Pr{msg
t
jW
t
; done{W; t + 1}; X
t+1
}. Wecan use previous runs to acquire
suitable values for these parameters,031 and 026, by producing afrequency
count over transitions and messages seen during those runs {seeSection 4.2
for more discussion of the use of026 in
Overseer}. Therefore, given the observationof an initiation message, msg,
at timet, we wish to
distributethe unit probability over all plans,X , {in the unblocke dstate}
that are consistent
with msg. We can derive our newbelief in plan X at timet + 1 as follows:
Pr{X
t+1
jmsg; E } =
Pr{msg; X
t+1
jE }Pr{msgjE }
The denominator issimply a normalization factor, and it is the samefor all
candidate plans,
X .Therefore, we ignore it in this derivation, and focus on only the numerator,
which we
can expand over all possible predecessor plans, W , and possibletermination
states of W :
/
X
W
Pr{msg; X
t+1 ; W
t
; done{W; t + 1}jE } +
X
W
Pr{msg; X
t+1 ; W
t
; :done{W; t + 1}jE}
95Kaminka, Pynadath,& Tambe
The second term is 0,since we cannot proceed from Wto X if W hasnot terminated.
In the second term,we can expand the joint probability intoits component
conditional
probabilities: /
X
W
[Pr{msgjW
t ; done{W; t + 1}; X
t+1
;E }
001 Pr{X t+1
jW
t
; done{W; t+ 1}; E }
001Pr{W
t
;done{W; t + 1}jE}]
We assume that the probability of sending a message and the distribution
over plan tran-
sitions obey a Markovproperty, sothat they are independent of the plan history
before
timet, given the current plan at timet. Thus, the 034rst two conditional
probabilities are
independentof our previous history of observations.The third is exactly
our previous belief that W is blocked: /
X
W
[Pr{msgjW
t ; done{W; t + 1}; X
t+1
} Pr{X
t+1
jW t
; done{W; t+ 1})
001 b
t
{W; block}] The 034rst two conditional probabilitiesare exactly our parameters,
026 and031:
/
X W
026
wx 031
wx
b t
{W; block} {1}
Lines 40255 of Algorithm1 perform exactly the derived summation ofEquation
1 {the
normalization step iscarried out on line 9 {see below}.A similar procedure
is followed when a message is observed indicating the termination of X {lines
60258}. Insuch a case, we know
that the agent was executing X in the previous timestep but that it has
moved on to some successor. Thus, for each ofX 's potential successor plansY
, we set our belief inY to be
proportional to atransition probability, similar to that forthe initiation
message:
Pr{Y
t+1
jmsg;E } =
Pr{msg;Y
t+1
jE }Pr{msgjE}
The denominator is again anormalization factor that we ignore. We can expand
the nu-
meratorover possible states of X 'sexecution:
/ Pr{msg; Y t+1
; X
t
; done{X; t+ 1}jE }
+ Pr{msg;Y
t+1
;:X
t
; done{X; t + 1}jE } + Pr{msg;Y
t+1
; X
t
; :done{X; t + 1}jE }
+ Pr{msg;Y t+1
; :X t
; :done{X; t + 1}jE }
96Monitoring Teams by Overhearing Only the 034rst term is nonzero, since
the others correspond to states of execution that are inconsistent with the
observed message: / Pr{msg; Y
t+1
; X
t
; done{X; t + 1}jE }
We can rewrite this joint probability as a product of conditionalprobabilities:
/ Pr{msgjX
t
; done{X; t + 1}; Y t+1
; E } 001 Pr{Y
t+1
jX
t
; done{X; t + 1}; E }
001 Pr{X
t
; done{X; t + 1}jE }
We again use our Markovian assumptionsto simplify the conditional probabilities,
and we rewrite the third probability using our belief state:
/ Pr{msgjX
t
; done{X; t + 1}; Y t+1
} Pr{Y t+1
jX
t
; done{X; t+ 1})
001 b
t
{X; block} Finally, we rewrite the 034rst two conditional probabilities
using our parameters,026 and 031:
/026
xy
031
xy
b
t
{X; block} {2} Lines 70258 of Algorithm 1 perform exactlythe derived summation
of Equation 2. Normalization of the sum {line 9}.Line 9 normalizes the sum
to recapture a well-
formed probability distribution.Note that the normalization step must take
into account
the fact that evidence maybe incorporated for plan steps where one is anancestor
of
another026inwhich case theevidence for the ancestor plan is probabilisticallyredundant.
The more speci034c evidence{for the descendent plan} will be more usefulfor
visualization,
asit is more accurate. Propagation of Evidence {lines 1002516}Finally,
the recalculated beliefs are set{line
11} and then the changes arerecursively propagated down the decompositionhierarchy
to
the plan's children{line 12}, via the call to Algorithm 2.In addition, the
recalculated beliefs are propagated up to the plan's ancestors inthe decomposition
hierarchy {lines 1302516},since
evidence of a child plan beingactive is evidence of its parent being active
as well. We assume
here that we have no knowledge aboutthe relative likelihood of the childplans,
so we treat
each as equallylikely. If we had additional knowledge about these likelihoods,
we could easily exploit it in our Propaga te-Down algorithm.Algorithm 2Propaga
te-Down{plan Y , probability032, beliefs b, plansM }1:C fc j c 2M; c 034rst
child of Yg
2: 032
0 032= j C j 3: for all plans c2 C do
4: b t+1
{Y ;:block} b
t+1
{Y ; :block} + 032
0 5: Propaga te-Down{c; 032
0
; b; M }97Kaminka, Pynadath,& Tambe
3.3 Belief Update withNo Observation
In overhearingtasks, there is a great deal of uncertainty about when agents
complete the
executionof their plan steps, sinceagents do notnecessarily send messages
upon every termination or initiation of a plan.Therefore, ifno messages are
observed at timet, then the
system's beliefs for timet + 1 must be calculated based onthe possibility
that the agents may have initiated or terminated plans withoutsending any
messages. To support thenecessary
belief update, we need a model of plan execution that provides us with aprobability
of plan
terminationover time {i.e., Pr{done{X; t})}. Inprinciple, this probability
distribution can be arbitrarily complex, and its structure may vary enormously
from domain to domain, and even from plan to plan within the samedomain.
In some domains, obtaining an accurate model of this distribution requires
complex knowledge acquisition from domain experts or else a complex learning
process on the partof the agent. In addition, an accurate model
may be too complex to support e036cient online inference.
Overseerinstead uses a temporal model that supportsboth e036cient inference
and
simpleparameter estimation procedures. Overseermodels the duration of a
{leaf }plan,
X , as an exponentialrandom variable. In other words, theprobability of
the plan completing
execution within 034 time units increasesas 1 000 e
000034 025 X
. The single parameter,025
X
, corresponds to 1/{mean duration of X },which we can easily acquire from
domain experts or previous
runs. As for inference,the exponential random variablehas a Markovian property,in
that
the probability of the plan'scompletion between times t andt + 1 is
Pr{done{X; t + 1}jX t
} 021 1000e
000025
x ;
independent ofhow long the agent has been executingX before time t. Thisstrong
assump-
tion may not fully holdin some real-world domains, but it is often a good
approximation.
Also, the error associated with this approximation may beacceptable, given
the enormous
gain ininferential e036ciency {as we show in theremainder of this section}.
Thesee036ciency gains manifest themselves whenOverseer rolls the model
forward in time to compute its belief state forthe next time slice. Given
the exponential random
variable as a model of planduration, the probability of completion of a
leafplan is a constant,
1 000e
000025
x , for each plan X .For plans with children, the probabilityof completion
is exactly
the probabilityof completion of its last child {according to the temporal
ordering of the
children}. Having computed the probability of plantermination, Overseer
then evaluateswhich
plan the agent may execute next.It examines the possible successors and,
foreach, com-
putes the probability oftaking the corresponding transition, conditioned
onthe fact that no
message was observed{1 000 026
xy }, and on the prior probability oftaking this message {031
xy }.
Again, as mentioned in Section3.2, Overseer makes a Markovianassumption
that the
plan history before timet does not a033ect the likelihood ofthe various
transitions. Given this assumption, it can combine the two parameters, 031
and 026, to get thedesired conditional
98Monitoring Teams by Overhearing probability of the transition, given that
we observed no message:
Pr{Y
t+1
jX t
; done{X; t+ 1}; :msg
t }
=
Pr{:msg t
jX
t ; done{X; t + 1}; Y
t+1
} Pr{Y
t+1
jX t
; done{X; t+ 1})Pr{:msg
t
jX
t
; done{X; t+ 1})
=
{1 000026
xy
}031 xyX
Z
Pr{:msg
t
jX
t
; done{X; t + 1}; Z
t+1
} Pr{Z
t+1
jX t
; done{X; t+ 1})
=
{1 000026
xy
}031 xyX
Z
{1 000 026
xz
}031
xz =
{1 000 026
xy
}031
xy021
X {3}
The normalizing denominator,021
X
, is the sum of thenumerator over all possible succes- sors, Y , which we
canpre-compute o033-line. We can use the value of 021
X
to determine the
likelihood that theagent will send a message upon terminating planX at time
t. In the special case when 021
X
= 0, Equation 3 is not well-de034ned, as al l possibletransitions from
X re quir ea message. In this case, the agentcannot have begun execution
of any successor, even though it has completed execution of X . 021
X
is therefore the probability mass signifying our belief that the agent is
no longerexecuting X at time t + 1, and is not waiting for a
message {i.e., it is in a blocked state}.In other words, it is our increased
beliefthat the
agent is executing one ofX 's immediate successors at timet + 1, given that
we haveseen
nomessage.
Algorithm3 presents the pseudo-code for the process of propagating the probabilities
forwardin time when a message is not observed.First, it initializes all
the valuesto 0 {lines
10255}. The process continues by going over all plansX 2 M , in post-or
der 026we explore
children plans {i.e., plans reachable byhierarchical decomposition transitions}
beforetheir
parents, and sibling plans in orderof execution. For each plan, the algorithm
executes four
stages: {1} It determinesthe plan's outgoing probabilities {lines 702510};{2}
it determines 021
x ,
the outgoing probability massthat is propagated along the outgoing temporaltransitions
without being blocked bywaiting for a message {lines 1102512}; {3}itpropagates
021
x
along the non-blocked temporal outgoingtransitions {lines 1302520}; and
034nally {4} itcomputes our
belief that the agent will execute the plan at the next time-tickb
t+1
{X;:block} or will be blocking {lines 2102522}. Theremainder of this section
explains these four stagesin detail.
Calculating the outgoing probability out
x
{lines 702510}.In Algorithm 3, the variable out
x
representsthe total temporal outgoing probability from plan, X , given our
belief that the agent was executing Xat time t. If a planX is a leaf, then
we derive its temporal
outgoing probability,out
x
, from the temporalmodel discussed previously, given our belief
that the agent is currentlyexecuting X {lines 70258}. IfX is a parent,
lines 902510 are, infact,
redundant: They serve only toremind the reader that for a parent,Y , out
y
follows from
Y 'schildren when they execute line 20.This depends critically on the post-order
traversal
of the plan-hierarchy:the outgoing probability of a parentY is derived from
the outgoing probabilities of its last hierarchical-decomposition children,
and thus all children'soutgoing
probabilities must be calculated before their parents'.
99Kaminka, Pynadath, & TambeAlgorithm 3Propaga te-For w ard{beliefs b, plans
M}1: forall plans X 2 M do 2: b
t+1
{X; :block} 0:0
3: b t+1
{X; block} 0:0
4:out
x
0:0
5: 021
x 0:0
6:for all plans X 2 Min post-order do {children in temporal order before
parents}
7:if X is a leaf then 8: out
x
b
t
{X;:block}{1 000 e 000025
x
} {calculate probability of Xterminating at time t}
9: else {X is a parent} 10: out
x
is known { because post-order guaranteesall children set it in line 20}
11: for all temporaloutgoing transitions T
x!y
from X do 12: 021
x
021
x
+{1 000 026
xy
}031
xy
13:if 021
x
>0 then {some transition can be taken} 14: for all temporal outgoingtransitions
T
x!y from X do
15:032 out
x
{1000 026
xy
}031
xy
16:if T
x!y
leads to a successor plan Ythen
17: b
t+1
{Y ; :block} b
t+1 {Y ; :block} +032
18: Propaga te-Down{Y ; 032; b; M } 19: else {T
x!y
is a terminating transition} 20: out
parent{x}
out parent{x}
+ {1 000 026
xy }031
xy
{parent's outgoing probability is its chil- dren's}
21: b
t+1
{X; block} b
t+1 {X; block} + out x
000 021
x
22: b
t+1 {X; :block} b
t+1
{X; :block} 000out
xDeterminingthe non-blocked outgoing probability021
x
{lines 1102512}.The prob-
ability,021
x
is the sum overall possible values of the numerator inEquation 3 {i.e.,
over all
temporaloutgoing transitions originating in X}, as illustrated in the derivation.As
we see
in line 21, 021 x
is critical for calculating the belief that the agent has terminated execution
of X; but has not yet begunexecution of a successor {i.e., the beliefb
t+1
{X;block} that the
agentis blocking}.
Propagating 021 x
along temporal outgoing transitions {lines 1302520}. This is the keycomponent
in the propagation.For every temporal outgoing transitionT
x!y
, Over- seercalculates 032, a temporary variable that holds the probability
mass corresponding to
Overseer's belief inthe joint event of {i} the agent having completed execution
of X ,{ii}
the agent taking the transitionT
X !Y
,and {iii} the agent doing so without sendingout an
observable message.The calculation of 032 is derived asfollows:
032 = Probabilitythat X is done ^ nomessage was observed ^ agentchose T
x!y = Pr{done{X;t}jX
t
} Pr{:msgjX
t
; done{X; t}) Pr{Y
t+1 jX
t
; done{X; t}; :msg t
}
= out x
003 021
x
003
{1 000026
xy
}031 xy021 x
= out
x 003 {1 000 026
xy
}031
xy {4}
If the transitionT
x!y
leads to asuccessor plan Y {lines 1602518}, then032 is added to
Y 'sfuture state {at time t + 1} as temporal incoming probability. Since
decomposition
isassumed to be immediate, thisincoming probability is propagated {added}
to Y 's034rst
100Monitoring Teams by Overhearing children {Algorithm 2}. Ifthere are multiple
034rst children, then theydenote alternative plan
decompositionsfor a single agent, and we compute theprobability over them
by dividing the probability incoming to the parent amongthem. If any children
have 034rst child plans
of their own, we distributethis new incoming probability in turn, using
thesame method.
Only in the next time-stepdoes the algorithm propagate from 034rst childrento
the next
child, in order of execution.The reason for this is that we assume thatall
plans take at
least a single timestep to complete.
Ifthe transitionT
x!y
is the special-case termination transition {line 1902520}, thenX has
no successors. Inthis case, the outgoing temporal probability032 is added
to X 's parent's out- going probability out
parent{x}
so that it may be used when propagating parent{x}'s temporal
outgoingprobability along its own temporal outgoingtransitions. Note again
that the post- order traversal of the plan-hierarchyguarantees that all children
are explored before their
parents, thus out parent{x}
is fully computed by the time the algorithmreaches parent{x}. Computing
X 's new blockedand non-blocked probabilities {lines 2102522}.Now
that the outgoing probabilitymass has been propagated to X 'schildren and
siblings, the
only stepsremaining involve re-calculation ofOverseer's belief in X's blocked
and non-
blockedstates. The total temporal outgoing probability {whether blocked
or not} isout
x
; it
must be subtracted from future belief thatthe agent is executing X . Theprobability
mass
that left b t
{X; :block} but is blocking on a message that was not observed by Overseer
is out
x
000021
x
. It is added toX 's future blocked state. 3.4 Discussion
The overhearing approach outlined in this section maintains aseparate plan-recognition
mechanismfor each agent, ignoring any inter-agentdependencies. Using an
array of indi- vidual models {Figure 3} that are updated with the passage
of time, or as messages are observed, the state of a team is taken to be
the combination of the most likelystate of each
individual agent.Algorithm 4 embodies this approach:It is called every time
tick, collects all messages that are observed, and updatesthe state of the
agents.
rootEVACUATE.....[TASK FORCE]EXECUTEMISSION[TASK FORCE]PROCESS[TASK FORCE]MANEUVERSZONELANDING....FLY-FLIGHTPLAN[FLIGHT
TEAM]FLY-CONTROLROUTE....ORDERS[FLIGHT TEAM][FLIGHT TEAM]TRANSPORTOPERATIONSESCORT[ESCORT][TRANSPORT]OPERATIONS[GET
ORDERS]ORDERSGETEVACUATE.....[TASK FORCE]EXECUTEMISSION[TASK FORCE]PROCESS[TASK
FORCE]MANEUVERSZONELANDING....FLY-FLIGHTPLAN[FLIGHT TEAM]FLY-CONTROLROUTE....ORDERS[FLIGHT
TEAM][FLIGHT TEAM]TRANSPORTOPERATIONSESCORT[ESCORT][TRANSPORT]OPERATIONS[GET
ORDERS]ORDERSGETFigure 3:Array of single-agent recognizers026one for each
agent.
As an illustration of the operation of this algorithm, consider the example
domainof
the evacuation scenario.Overseer begins with a belief that the agent is
executing its top-
levelplan {and its 034rst child, Process-Orders} at time 0 {i.e., b
0 {E vacuate; :block} = 1:0,
b 0
{P rocessOrders;:block} = 1:0}. If Overseer observes a messageabout the
initiation of
Fly-Flight-Planby one of the helicopters, then it appliesIncorporate-Evidence
{Algo-
101Kaminka, Pynadath, & TambeAlgorithm 4Array-Overseer{beliefs b, plan-hierarchyarray
M [], agentsA}1:for all Agents a2 Ado
2: if A messagem
a
from a was observed then
3: Incorporate-Evidence{m
a
; b; M[a]}
4: else{No message was sent by a}
5: Propogate-For w ard{b; M [a]}rithm 1}. Fromthe plan-hierarchy {Figure
1b} it is knownthat Process-Orders cannot be a possible current or future
plan of the agent, and that the helicopter in question is executing Fly-Flight-Plan,
i.e., b t
{P rocessOrders;:block} = 0, b t
{F lyF lightP lan; :block} = 1:0.
Thisprobability mass is propagatedto Fly-Flight-Plan's 034rst children,
ofwhich there is
one, and thus the belief in this child is set to 1.0 as well. After some
time passes and no message is observed, there is uncertainty as to whether
Fly-Flight-Plan and Landing-Zone-Maneuversare active, as both are possible
future states,and the duration of Fly-Flight-Planis uncertain. Overseer would
stillassign
aprobability of 1.0 to the top-level plan Evacuate. However,some probability
mass from
Fly-Flight-Planwould be propagated every time-tick toLanding-Zone-Maneuvers
by
Propaga te-For w ard {Algorithm 3}.For each such propagation, the incomingtemporal
probability mass being addedto the belief in the execution ofLanding-Zone-Maneuvers
wouldbe propagated to its 034rst children immediately. Assuming that the
helicopter agent is free to select either Transport-Operationsor Escort-Operations,
the incoming proba- bility would be split evenly and addedto the prior belief
in each of the two034rst children.
In the same temporalpropagation step, any outgoing belief from these034rst
children would
be propagated viatheir own outgoing temporal transitions. The inference
procedure described byAlgorithms 10254 exploits the particular structure
of our representation in ways that moregeneral existing algorithms cannot.
Thepseudo-code
demonstrates that for a singlemonitored agent, both types of belief updates
have a time
complexitylinear in the number of plansand transitions in M , i.e.,O{M }.
Thusfor N
agents, the space and timecomplexity of Algorithm 4 is O{M N }.
We gain thise036ciency {compared to an approach such asDBN} from two sources.
First, we make a Markovian assumption thatthe probability of observing a
message depends on only the relevant plan being active, independently of
execution history. With this assump-
tion,we can incorporate evidence, based on only ourbeliefs at time t. Second,we
make
another Markovian assumption in the temporal model, allowing our propagation
algorithm
to reason forward to timet + 1 based on only our beliefs attime t, without
regard for
previous history.
4. Monitoringa Team by Overhearing
The previous section has outlined an e036cient plan-recognition mechanism
that is particularly suitable for monitoring a single agent basedon its communications.
Monitoring a team was achieved by monitoring each member of the team independently
of the others.Unfortunately,
although the time complexity of this approach is acceptable, its monitoring{recognition}
results are poor.The evaluation in Section 6.1 providesmore details, but,
inshort, the
average accuracy using this approach over all experiments was less than
4045.
102Monitoring Teams by Overhearing The main cause for this low accuracyis
the scarcity of observations,one of the identifying
characteristicsof monitoring by overhearing. Aspreviously discussed, agents
often switch their state unobservably {i.e., withoutsending a message}. Therefore,
the monitoringsystem
critically needs to estimate correctlythe times at which agents switch state.Since
some
agents rarely communicate{i.e., there are very few observationsabout them},
variance in
their temporal behavior {with respect tothe system's predictions} tends
to cause large errors in monitoring.
Toaddress this issue, we bring back for discussion the agent independence
assumption
which we have made in the previoussection. After all, team-members do not
communicate
independently of each other:Communication in a team is an action that isintended
to
change the state of alistener {Cohen & Levesque, 1990}. Agentsthat only
rarely send a
messagemay still change their state uponre c eiving a message. Inother words,
although
observedmessages are used in the previous section to update the belief in
the state of
the sender, they could also be used to update the state of any listeners.
To do this, the
monitoring system mustknow about the relationships between theteam-members.
Knowledge of the socialstructures enables additional sophisticated forms
ofmonitoring.
For instance, in order tomaintain their social structures, team-memberscommunicate
with
each other predictably, during particular points in the executionof a task.
Such predictions
of future observable behavior026communications026can be used to further
reduce theuncer-
tainty. However,it is often the case that while it can bedi036cult to correctly
predict that
aspeci034c agent will communicate at a speci034c point in task execution,
it is easy to predict
that some team-member will.Knowledge of the procedures employed bya team
to maintain
its social structurescan be very useful allows a monitoring systemto make
such predictions.
To reason about the e033ects of communicationson receivers, and about future
observ- able behavior of team-members,a monitoring system must utilize knowledge
ofthe so-
cial structures and social procedures used by team-members to maintainthese
structures.
Such exploitation of social knowledge for monitoring is called Socially-Attentive
Monitoring
{Kaminka& Tambe, 2000}. This sectiondiscusses these concepts in detail.
4.1 Exploiting Social Structures While computationally cheap, the approachdescribed
earlier proved insu036cient in theevac-
uation domain. Inmonitoring by overhearing tasks, the monitoringsystem must
address
scarce observations, as agents rarely communicate all atthe same time. Indeed,
in the evac-
uation application, only a singlemessage was observed {on average} for every
20 combined
individual state changes. Under such challenging conditions, a systemfor
monitoring by overhearing must come to rely extensively on its ability toestimate
when agents change their internal state with-
out sending a message.The representation presented earlier used asimple,
but e036cient,
temporalmodel to do this, based on the estimated average duration of plans.
However,we
have found high variance in the actual duration of plan execution,compared
to the duration
predictedby the average-duration model:
* Plan execution times vary depending on the external environment. For instance,
when all the agents in the team are running on a local network, their response
times toqueries
103Kaminka, Pynadath, & Tambe
may be shorterthan when communicating acrosscontinents.Indeed, latency times
in
the Internet vary greatly, and are di036cult to predict.
* Plan execution times vary depending on when a plan-step is execute d internal
ly. For instance, thetraveling plans, usedrepeatedly within the given evacuationteam-
oriented program, take anywherefrom 15 seconds to almost two minutes toexecute,
depending on the particular route being followed.
* Planexecution times vary depending onthe outcome of a plan-step.For instance,
when
the route-plannerisfunctioning correctly, it responds withina few seconds.
However,
whenitcrashes it does not return an answer at all,and the other agents wait
for a relatively long time before relying on atime-out to decide that it
had failed. This problem can be addressed in principleby a more expressive
model of executionduration,
for instance taking into account the internal execution context. However,
in practice, such
a model would likely be much more expensivecomputationally, asit would need
to rely on knowledge of previous and future steps,breaking the Markovian
assumption {e.g., to determine duration based on whena plan-step is execute
d,an improved temporal model would have to reason about the likelihood that
a given instance of the plan-step is the
second instance, as opposed to a third}. As applications grow in scale in
the real world, an
increasingly more complextemporal model would have to be continuously re034ned
to cover
the increasingly complex temporal behavior ofagents. Fortunately , atemporal
model is
only one way in which amonitoring system can estimate the times in whichagents
change
their internal stateunobservedly.
An alternative methodfor estimating unobserved state changes is toutilize
known de-
pendenciesbetween agents to exploit evidence aboutthe state of one agent
to infer the state of another. In particular, it is often true in team settings
that one agent would send ames-
sage intending to a033ect thestate of all its receivers in a particular
way. Thus in principle,
under the assumption that the receivers do change their state predictably,
an observation
of such a message can be usedas evidence in the inference of the sender's
state,as well as
all receivers', i.e., thestate of all team-members. Wecan trade the agent
independence as- sumption made earlier with an assumption ofsuccessful coordination.
This is a reasonable assumption in team settings, given that agents are actively
attempting to maintain their teamwork with such communications {Tambe, 1997;
Kumar et al., 2000;Dunin-Keplicz &
Verbrugge, 2001}. The e033ects of a message on a receiverare dependent
on the relationship betweenthe
sender and the receiver {where wetake such a relationship to be described
by a mathematical
relation between thepossible states of the sender and the receiver}.In principle,
such rela-
tionshipsunderly social structures026structures of interactions between
agents that make the
decisions of one team-member dependent, to some predictable degree, onthose
of its team-
mates. Usingknowledge of these dependencies, a monitoring agent may use
observations of
a communication action by an agent to inferthe possible state of another.
One simpleexample of such a structure is common in manyteams {e.g., Jennings,
1993;
Kinny, Ljungberg, Rao, Sonenberg, Tidhar, &Werner, 1992}, and indeed is
present also in our application: rolesthat govern which team-members undertake
what tasks in service
104Monitoring Teams by Overhearing of the team goal. Suchroles ideally bias
the decision mechanism of theteam-members to-
wardsmaking decisionsthat are appropriate for their roles. Thusknowledge
of the roles of
team-memberscan be useful to counter the uncertaintyfaced by a monitoring
agent. For
instance, supposethe monitoring agent knows that in the evacuation application,
apar-
ticular team-member is to choose Transport-Ops, rather than Escort-Ops,
as a child of
Landing-Zone-Maneuvers{because the team-member belongs to theTRANSPORT team,
rather than the ESCORT team}. This knowledge can reduce theuncertainty the
monitor-
ing agenthas026under the assumption that the team-memberdid not incorrectly
choose an
inappropriate plan for its role.Overseer in fact uses knowledge of rolesin
such a manner
to alleviate uncertainty. This monitoring use of role information has been
used in previous
work {Tambe, 1996; Intille & Bobick, 1999},discussed in Section 7.
However, a much more important social structure exists in teams. Agents
in teams work together, as team-member areideally in agre ement abouttheir
joint goals and plans {Cohen & Levesque, 1991; Levesque, Cohen, & Nunes,1990;
Jennings, 1995; Grosz & Kraus, 1996, 1999; Tambe, 1997; Rich& Sidner, 1997;
Lesh, Rich, & Sidner, 1999; Kumar& Cohen,
2000; Kumar et al.,2000}. This phenomenon026sometimes calledteam coher
enc e {Kaminka
& Tambe, 2000}026holds atdi033erent levels in the team. Agentsin an atomic
subteam work
togetheron the plans selected for the subteam, subteams work together with
sibling subteams
on higher level joint plans, etc.Individual agents may still choose theirown
execution, but
theydo so in serviceof agreed-upon joint plans. Providedthe monitoring agent
knows what
plans are to be jointly executed by which subteams, and what transitions
are to be taken
together by which subteams, it canuse coherence as a heuristic, preferring
hypotheses in
which team-members are in agreement about their joint plans, over hypotheses
in which
they are in disagreement.
For example, suppose that the entire team is known to be executingFly-Flight-Plan
{Figure1-b}.Now, a message from one member of theTRANSPORT subteam is observed,
indicating that it has begun execution of theTransport-Ops plan step. Since
this plan step is to be jointly executed by allmembers of the TRANSPORT subteam
{andonly them},
we can use coherence to prefer the hypothesis that the other subteam members
have also
initiated execution ofTransport-Ops. Furthermore, since thisplan-step is
in service of the
Landing-Zone-Maneuvers plan, which is to bejointly executed by the TRANSPORTand
ESCORT subteams, we can preferthe coherent hypothesis that team-members
of ESCORT
are executingLanding-Zone-Maneuvers. Now, based on theirknown role, we can
now come
backdown the plan-hierarchy and infer thatmembers of the ESCORT subteam
areexecuting
Escort-Ops, etc. This knowledge of the expectedrelationships, and in particular
knowledge of which plans
are joint to team-members{i.e., are sub ject to coherence}, is part ofthe
speci034cation of a
distributedapplication026and can thus be provided to anoverhearing system
by the designer or operator. In fact, it is oftenreadily available, since
it is used bythe agents themselves
in their coordination. For instance, we haveearlier discussed the assumption
that team- oriented programs are availableto the monitoring agent, andthat
these hold knowledge
about what plans in the hierarchy are to be executed by which {sub}teamsis
encoded in the
plan-hierarchy. The team hierarchy contains the knowledge about what subteam/agent
is part of another subteam.
105Kaminka, Pynadath, & Tambe
Coherencecan be a very powerful heuristic.It assumes non-failing cases,
where team- members successfully maintain their joint execution of particular
plans. Underthis assump-
tion, evidence about adecision made by one team-member in035uences{through
coherence},
our belief of whatits team-mates have decided. Andlacking such evidence,
coherence prefers hypotheses in which at least the team-members have made
joint decisions.For instance, sup-
posea transition from a team plan is to be taken only by the TRANSPORT team.Under
non-failure circumstances, there are only two coherent hypotheses considering
thistransi-
tion: Either all members ofTRANSPORT took the transition, or none did.Evidence
for
one member, supportingone of these hypotheses, can be used toinfer the state
of the other
members. The signi034cance of this property ofcoherence is that if the
monitoring system can reduce the uncertainty for even one agent, then this
reduction will be ampli034ed throughthe use
of the coherence heuristic to applyto the other agents as well. Theuse of
the coherence
heuristic can thuslead to a signi034cant boost in monitoringaccuracy, since
the number of hypotheses underlying any further{probabilistic} disambiguation
is cut downdramatically.
Section 6.1 provides anin-depth evaluation of the use of coherenceand knowledge
of roles
to select planrecognition hypotheses in Overseer. The use of coherence signi034cantlyincreases
the time complexity of the computation. At the very least, it requires settinginter-agent
links in the array of planrecognizers used
by Overseer{Section 3.4}, such that these links representa probabilistic
association be-
tween plans that are to be executed jointly {in contrast with the temporal
and hierarchic
decomposition transitions used thusfar}. For instance, if a speci034c planX
is to be executed
jointly by agents A and B , then such a link would be constructed between
the variable X
A
t {representing agent A's execution of a plan X } and the variableX
B
t
{representing agent B 's
execution of the sameplan}. In general, there would be N 003{N 0001}2
suchinter-agent links be-
tweenN agents, for each one of the joint plans {of which there are at mostM
}. Thus given
N agents, and the array of recognizers M [], where each individual agent's
plan-hierarchy
is of sizeM , therun-time complexity of anexact-inference algorithm would
be at least O{M N
2 } and quite likely much worse{since in general there is an exponential
number of
coherent and non-coherent hypotheses to select from}. In the nextsection
{Section 5.1},
we describe ahighly scalable {in the number of agents}representation for
reasoning about
coherent hypotheses.
4.2Exploiting Procedures that Maintain SocialStructures
A monitoring system can exploitknowledge of the procedures agents use to
maintain their
socialstructures to alleviatesome of the uncertainty resulting from thescarceness
of obser-
vations.For instance, if the monitoring system couldaccurately predict future
observable behavior of monitored agents, thenwhile it has not observed the
predicted behavior, the
monitoring system may infer that the agents have not reached the state associated
with the
predicted behavior.Thus such predictions can be used toeliminate monitoring
hypotheses,
by setting an individual agent's 026 X Y
probabilities to re035ect a prediction that a message will
betransmitted by the agent as its execution ofX terminates and it initiates
Y. For instance,
in our ownapplication, the Ariadne information agent isqueried for possible
threats before each route is followed in the evacuation. It may therefore
be possible to predict that before
106Monitoring Teams by Overhearing each route is taken by thehelicopters,
a message will be sent by theAriadne agent to its
teammates; thuswhile no such message is observed, theAriadne agent can be
inferred to have not yet executed this step.Furthermore, under the assumption
of coherence{discussed
above}, the monitoringsystem may further infer that all team-membershave
not yet exe-
cuted this step,i.e., a new route was not taken by theteam. Such inference
is obviously dependent on the system's observational capabilities, but we
have found itto be useful even
under lossy observations by the monitoring system {see Section6.2}.
However, in general, such speci034c individual predictions can be di036cult
tomake. Team-
membersare often engaged in joint tasks, which requiremany agents to tackle
a problem together. In these settings, predictingindividual communications
may be impossible.For
instance, consider a distributedsearch problem in which a target solution
is to be found
somewhere in the search-space;di033erent areas of the search space are
divided amongst the
agents, with the understandingthat the 034rst to 034nd the target will
communicate with the
others. It would be di036cult to accurately predict which one of theagents
will communicate
{034nd thetarget}, since if we could predict that, wecould focus all agents'
e033orts on that area alone. Yet it is easy to predict that at least one
agent will 034nd the target andcommunicate.
Similarly, in the evacuation application, it may be di036cult topredict
which helicopter will
reachthe civilians 034rst026but it is easy to predictthat one of them
will, andwill then communicate their location.
Indeed, teams utilize social pro c e dur es or conventions{Jennings, 1993}
by which team- members maintain their relationships withone another. Removal
of the agent independence
assumption allows the monitoringsystem to exploit knowledge of such procedures,by
mak-
ing predictions as to the behavior of team-members in coordinating with
oneanother. For
instance, knowledge ofthe failure-recovery procedures used by ateam to recover
from coordi-
nation failures allows the monitoring system topredict the future behavior
of team-members in case of failed execution. Similarly, knowledge of the
communication proceduresused by
the team {as part of itsteam-members' coordination} allows predictingfuture
observable
messages026futureinteractions between team-members026withoutnecessarily
specifying a par-
ticularindividual agent that will carry them out. For example, suppose Overseeroverhears
a message indicating that the 035ight team has
initiated joint execution ofFly-Flight-Plan {Figure 1-b}. Aftersome time
has passed, it is
nowpossible that the team is either still executingFly-Flight-Plan, or it
has terminated it already and begun joint execution ofLanding-Zone-Maneuvers.
However,if Over-
seer knows that atleast one team-member will explicitly communicateafter
terminating
Fly-Flight-Planand before initiating Landing-Zone-Maneuvers, then while
such commu-
nicationsare not observed, the monitoring system caneliminate the possibility
that the team is executing the latter, eliminating anyuncertainty in this
case {only Fly-Flight-Planis
possible}.
Weleave discussion of how technically a social procedure of the form 020at
least one team- member will communicate when itssubteam will take this transition
fromX to Y 021 can be converted into 026
X Y
values to the nextsection, where we present a technique forrepresenting
team-wide 026probabilities in a way that allows e036cient reasoning. In
the remainder of this section, we address instead how knowledgeof such social
procedures may be acquired. Social procedures of communications may be simple
per-case rules, or may involve
complexalgorithms. Forinstance, Jennings{1993} suggests using heuristicapplication-
107Kaminka, Pynadath, & Tambe
dependentrules to determine communication decisions.STEAM {Tambe, 1997}instead
uses a decision-theoretic procedure thatconsiders the cost of communication
and the cost of miscoordination in the decision to communicate. Other procedures
have been proposed
as well {e.g., Cohen & Levesque,1991; Jennings, 1995; Rich & Sidner, 1997}.However,
regardless of their complexity, a key point is that a monitoringsystem does
not necessarily
have to have full knowledge of these procedures in order to exploit them
for predictions: it only needs to approximate their outcome, since it can
use a combination of techniques to combat plan-recognition ambiguity, rather
than relying just on one technique. The decisions of social procedures can
be acquired by learning from previous runs of the system. Although a detailed
explorationof appropriate learning mechanisms is out- side the scope of this
paper, we provide a strict demonstration of the feasibility oflearning
social procedures by simplerote-learning, which proved e033ective ingenerating
a useful com-
municationsmodel that signi034cantly reduced the uncertainty in monitoring
the evacuation
application. This simple mechanism records during execution which plans
are explicitly communicated about, and whether they were initiated or terminated.
The learned rules are e033ective immediately, and are storedfor future monitoring
of the same task. Figures 4a025d present the results fromusing of this rote-learning
mechanism in four di033erentruns on the same tasks.The X-axis denotes observed
communicationmessage-
exchanges as the task progresses.Overall, between 22 and 45 exchangestake
place in a
run, each exchangeincluding between one and a dozen broadcastmessages in
which agents
announcetermination or initiation of a plan. TheY-axis shows the number
of hypotheses considered by Overseer after seeingeach message, without using
any probabilistic temporal
knowledge. Thus greater uncertainty about which hypothesis is correct would
be re035ected
by higher values on the Y-axis. At the beginningof task execution, all possible
plans are considered possible, since we ignore temporal knowledge in this
graph. Asprogress is made
on the task, less and less steps remain possible before the end is reached,
and so we expect
tosee a gradual {non-monotonic} decline as we move along the X-axis. A technique
that successfully eliminates hypotheses fromconsiderations results in Y-axis
valueslower than
those of this baselineexecution curve.
In Figure 4, the line marked No Le arning showsthis baseline {i.e., no predictions,
and withthe learning component turned o033}. The baseline shows that a relativelyhigh
level of
ambiguity exists, sincethe system cannot make any predictions aboutfuture
states of the
agents,other than that they are possible.When the learning technique is
applied on-line {i.e., any message seen is immediately usedfor future predictions},
some learned experience is immediately useful, and ambiguity isreduced somewhat
{the line marked On-Line Le arning }.
However,some exchanges are either encountered late duringtask execution,
or are seen only
once. Those cannot be e033ectively used toreduce the ambiguity of the monitoring
system on the 034rst run. However,the third line {After Le arning} presents
the number of hypotheses
consideredwhen a fully-learned modelis used. Here, the model was learned
onrun G, then
applied without any modi034cations in the other runs of the system.As can
be seen, it shows
a signi034cantly reduction in the numberof hypotheses considered by Overseer.
Further
ev aluation of the useof communications predictions is presented inSections
6.1 and 6.2;
however, a fullexploration of the use of learning for this task is beyond
the scope of this
paper.
108Monitoring Teams by OverhearingatendTimes-Roman0510152025051015202530354045Number
of Recognized PlansObserved Communication ExchangesNo learningOn-line learningUsing
previously learned predictions{a}Learning in experiment C
atendTimes-Roman0510152025051015202530354045Number of Recognized PlansObserved
Communication ExchangesNo learningOn-line learningUsing previously learned
predictions{b} Experiment E
atendTimes-Roman0510152025051015202530354045Number of Recognized PlansObserved
Communication ExchangesNo learningOn-line learningUsing previously learned
predictions{c} Experiment G
atendTimes-Roman0510152025051015202530354045Number of Recognized PlansObserved
Communication ExchangesNo learningOn-line learningUsing previously learned
predictions{d} Experiment I
Figure 4: Learningof communication decisions in di033erent experiments.
109Kaminka, Pynadath, & Tambe
4.3 Discussion A key characteristic of monitoring byoverhearing tasksis
the scarcity of observations avail-
able to themonitoring system. Fortunately , the observations available to
the monitoring system can often be viewed as observations of multi-agent
actions :The sender of the message
notonly changes its own state, but often also intends to change the state
of the recipients {Cohen & Levesque, 1990}. Thuseven a single observation
can be usedas evidence for
inferring the state of both sender and receivers. This stands in contrast
to previous work,
which addressedmonitoring of multiple single-agentactions.
In monitoring a team, themonitoring system can use knowledge of socialstructures
and
procedures to exploitinformation about the activities of one team-member,
in hypothesiz-
ing about theactivities of another team-member. Thesetechniques are not
speci034c to the representation presented earlier.For instance, an increased
belief in oneagent's execution
of a plan Xbased on evidence for a teammate's execution ofX can be also
used by con- structing appropriate probabilistic links between nodes representing
these beliefs in alarge
DBN representing the two agents.If we start with the DBN representation
asdiscussed in
Section 3.1, we can replicatethe single-agent network {containingM plans}
for each of the N separate agents. Thenumber of nodes is then O{M
2
N }, since we represent the plans and transitions for each individual agent.We
can also introduce the appropriate inter-agent links
to capture the inter-agent dependencies represented by our model ofteamwork.
However,
uponintroducing such links, the computationalcomplexity of performing DBN
inference explodes to O{2
M
2
N
}.
Obviously, such 020socialreasoning021 can be computationally expensive,
even with the ef-
034cientrepresentation described earlier. Thenext section provides details
of an e036cient mechanism for reasoning about a team using information about
role and coherence, and uti- lizing communications predictions.Using this
mechanism, the techniques described in this
section have resulted in anaccuracy of up to 97045 {84045 average acrossall
experiments}026
compared to average 4045 without the use of social knowledge.Sections 6.1
and 6.2 present
a detailed discussion of these results. 5. Plan-Recognition for Overhearing
The previous section has outlined socially-attentive monitoring techniques,
alleviatingthe
uncertaintyin monitoring a team ofagents by exploiting knowledge of the
socialstructures
and social procedures of themonitored team. It discussed using coher enc
e and role mainte- nance to exploit knowledge of theideal agreement of agents
that speci034c plansbe executed
together, and that other speci034c plans are assigned to agents ful034llingtheir
roles. Further-
more,it discussed disambiguation based on predictions offuture observable
behavior, based on knowledge of the social proceduresemployed by team-members.
Thesedisambiguation
heuristics eliminate many{incorrect} hypotheses from being considered.However,
reasoning
using these techniques can be computationally expensive. This section presents
an e036cientalgorithm, building on the representation previously presented,
which facilitates scalablemonitoring by overhearing of large teams.The key
idea
here is to representonly those hypotheses which the heuristics would have
considered valid,
eliminating from consideration plans and transitionsthat would be considered
illegal with the heuristics. Relying on the team-hierarchy forbookeeping,
all coherent hypotheses are
110Monitoring Teams by Overhearing representedusing a single recognizerinstead
of an array of recognizers, o033eringconsiderable
scalability in team monitoring.However, since the algorithm can no longerrepresent
certain
hypotheses,this scalability comes at the expense ofexpressivity. We discuss
the scalable representationand the trade-o033 it o033ers below.
5.1 E036cient Reasoning withTeam Coherence
Coherenceis a very strong constraint, sincefor a teamof agents there are
only a linear number{O{M} where M is the size of theplan-hierarchy} of coherent
hypotheses, butan
exponential number of incoherent hypotheses {O{M N
} where Nis the number of agents; the proof is in Appendix A}. We can exploit
this property by designingmonitoring algorithms
that reason only about the linear number of coherent hypotheses, and therefore
o033er better scalability as the number of agents increases. Such algorithms
may not beable to reason
aboutincoherent hypotheses, and are therefore less expressive.However, Section
6 demon-
strates that the level of accuracy even withsuch limited expressiveness
is su036cient forour
purposes. Furthermore,algorithms that reason only about coherent hypotheses
may still
be able to detectincoherent hypotheses, representing a failurestate in which
two or more
team-members are in disagreement with eachother.
We begin by presentingthe YOYO* algorithm, an e036cient technique for reasoning
about
coherent hypotheses{Algorithm 5}. YOYO* replaces the array-based algorithm
described
earlier {Algorithm 4}.Similarly to it, YOYO* is called every timetick. If
no message is
observed,the state of the entire team is propagated forward in time. Otherwise,
all observed messagesare collected together and used asevidence for the {di033erent}
plans implied bythese
messages.
YOYO*'skey novelty is that it relies on asingle plan-hierarchy that is used
torepresent
all team-members together{regardless of their number}, instead of anarray
of such struc-
tures.In other words, each variableX
t
represents Overseer's belief that al lagents in the
teams associatedwith X {as described in the team-oriented program} are executing
the plan
X at time t. ThusYOYO* makes extensive use of the informationassociating
plans and
transitions inM with teams and subteams inH , the team-hierarchy.The team
hierarchy
playsa critical bookeeping role in this respect,since it maintains the knowledge
critical for correctly applying coherence in the singlerecognizer.
This key distinction between YOYO* and the array-based approach causesa
subtle,
but critical, di033erence in theway probabilities are propagated along
transitions.In a plan-
hierarchy Mof an individual agent, part of an arrayof such models, each
outgoing transition represented a hierarchical decomposition or temporal
step that the agent is allowed to take.
Alternative outgoing transitionstherefore represent alternative paths of
executionavailable
to the agent.On the other hand, ina plan-hierarchyM used by YOYO*, alternative
outgoing transitions tagged by di033erentsubteams {that are not ancestors
of one another} represent not a decision point for theagent, but alternative
paths of execution asdecided
by the agents' roles and team-memberships.
This creates a criticaldi033erence in how the values of031
X Y
and026
X Y
areto be interpreted.
Where previously {in Section 3} the value of 031 xy
referred to the probabilitythat a speci034c
agent will take atransition X ! Y {giventhat it has terminated execution
of X}, in
YOYO* it refers to theprobability that an entire team will take thetransition
together.
111Kaminka, Pynadath, & TambeAlgorithm 5YOYO*{plan-hierarchyM , team-hierarchy
H, beliefs b}1: if no new messages areobserved then
2: Team-Propaga te-For w ard{b, M}
3: else
4:Initialize distributions b
0
; b
t+1 0 for all plans U2 M . ; Initialize I; E to be empty sets. 5: for
all Messagesm
i
do
6:I I [ fXj X 2 M; m
i
is a an initiation message; X consistent with m i
g
7: E E [fY j Y2 M; m
i
is a termination message; Yconsistent with m
i g
8: for allplans X 2 I do
9: T team
msg
{X } {T is the agent sending the messageinitiating X }
10: forall plans W 2 M thatprecede X , where the transitionW ! X is allowed
forT do
11: b
0
{X; :block} b
0
{X; :block} + b t
{W;block}026
wx
031 wx
12: for allplans X 2 E do
13: T team
m
sg{X } {T is the agent sending the messageterminating X }
14:for all plans Y 2 M;Y =2 I that succeedX , where the transition X! Y
is allowed forT
do
15: b 0
{Y ; :block} b
0 {X; :block} +b
t
{X;block}026
xy
031 xy
16: Normalizedistribution b
0
taking teamsinto account
17: forall plans X where b 0
{X; :block} > 0 do
18: b
t+1
{X; :block} b
0
{X;:block}
19:Team-Propaga te-Down{X;b
0
{X; :block}; b; M }
20: T team{X }
21: P X
22: while parent{P } 6= nulldo
23: b
t+1
{parent{P}; :block} b
t+1
{P;:block}
24:if team{parent{P}) = parent
team {T } then
25: Scale{parent{P }; T ; P; b} 26: T = parent team
{T } 27: P parent{P }112Monitoring Teams by Overhearing YOYO* is unable
to represent hypotheses in which some team-members take onetransition,
and others do not026unless thesetwo di033erent groups of members formdi033erent
subteams
that are representedin the team-hierarchy, andthe di033erent transitions
are tagged as being
allowed for the di033erent subteams. The value of 026
X Y
is also interpreted di033erently, ina very critical way. Where in the
previous sections it was taken to represent the probability that aspeci034c
individual will
communicate when a transition X ! Y istaken, in YOYO* its value representsinstead
the probability that one or more team-memb ers wil l communic ate when the
transition is takenby the team. Thusit no longer refers to individual agents,
but toa {sub-}team. In
this way, YOYO* solves the issue of how torepresent predictions of the type
020at leastone
team-member will communicate whenthis step is reached021, discussed previously.
F or example, supposeYOYO* sets the belief that the team isexecuting the
Landing-Zone-Maneuversplan-step to some probability p. Landing-Zone-Maneuvers,
in YOYO*, has two {034rst} children:Escort-Ops and Transport-Ops, to beexecuted
by mem-
bers of th ESCORT and TRANSPORT subteams, respectively. Unlike in the individual
agent case, the probability p should not be divided among these two children,
butshould
be duplicated to them:A belief that the entire team is executingLanding-Zone-Maneuvers
implies an equally-likely belief that the TRANSPORT subteam isexecuting
Transport-Ops,
andthat the ESCORT subteam is executingEscort-Ops. We explain YOYO*'s operation
in detail below:
No message is observed {lines 10252}.Since no observations are available,
the state of the
entireteam is jointly propagated forward in time by calling Team-Propaga
te-for w ard {Algorithm 7, Appendix A}.This is a slightly modi034ed version
ofthe propaga te-for w ard {Algorithm 3} that takes di033erentsubteams into
account in propagating beliefs:Given some
total outgoing probability{either to a sibling or child transition}, ifthe
outgoing transitions
are to be taken by di033erent teams where one team is notan ancestor of
another {such as
theTRANSPORT and ESCORT sub-teams},the same total probability would be used
for each transition, instead of splitting theoutgoing probability between
the transitions.Ap-
propriately, Team-Propaga te-for w ard relies on a modi034ed version of
the Propaga te- Down algorithm {Algorithm 2}, calledTeam-Propaga te-Down
{Algorithm 6,Appendix
A}. This latter algorithm isalso used in the incorporation of evidence {lines302527}.
The
run-time complexity of the propagation process is O{M}.
One or more messages areobserved {lines 30257}. If one or moremessages
are ob-
served{since YOYO* isa single algorithm monitoring multiple potentialmessage
senders,
more than one message maybe observed at once}, YOYO* begins toincorporate
these ob-
servationsinto the maintained beliefs about the team.This process is somewhat
similar to the Incorporate-Evidence algorithm,described earlier {Algorithm
1}, but takes into ac-
count multiple observations {since all N agentsmay have sent a message}.
Multiplemessages
{from di033erent agents} may all refer to the same plan, but YOYO* mustnot
incorporate
evidence for them multiple times.
The simple loop {lines50257} builds the set I {ofinitialized plans} and
E {ofterminated
plans} by going over allincoming messages that have arrived at timet. The
run-time
complexityof this process {in the worst case} isO{N }. Here, YOYO* does
better than the
113Kaminka, Pynadath, & Tambe
array approach,since multiple messages always cause multipleupdates in the
array, but in YOYO*, multiple messages may all refer toa single plan, thus
triggering a single update. Incorporating evidence about initiatedand terminated
plans {lines 802515}.For
each one of these plansX in I {line 8}, YOYO* now sets the new belief b
0 , weighted by any
priorbelief in X 's initiation {lines1002511}, similarly to how this is
done in theincorporate-
evidence algorithm{Algorithm 1}, but taking into account the team implied
by the sender
of the processedmessage {line 9}. This is done by a lookup into M using
the senderT
{team
msg {m
i
}}: Only transitions in M thatT is allowed to take are followed. By de034nition,
anytransition that is allowed to be taken by a super-team of T is allowed
for T . A similar
process is then done with any termination messages {lines1202515}, but
of course looking at possible successors of any plans consistent with the
messages. However,since we do not
wantto cause updates in both line 11 and line 15 in cases where a termination
message and an initiation message refer to the sametransition, the loop over
the plansY {line 14} skips
anyplans which have already been addressed inthe previous step. Overall,
therun-time complexityof this process isO{M }.
Normalizingthe temporary distribution b
0
{line 16}. The temporarydistribution
b
0
resulting from the processing of initiation andtermination messages is normalized,
in a similar fashion to the analogous step inalgorithm 1. However, the process
must take into
account not only theplan-hierarchy in question, but also theteam-hierarchy.
Unlike a typical normalization procedure, evidence for twodi033erent plans,
selected by two di033erent teams,
may not necessarily competewith each other, and therefore may notnecessarily
require
normalization. For instance, if two messages are observed,one implying that
team A has
initiated execution of plan P ,and another implying that team B hasinitiatedexecution
of plan Q, then ifP and Q are both children of ajoint parent J {executed
jointly by the
two subteams A;B }, then the same normalized likelihood {1.0} should be
assigned to Pand
Q {and J 026butthis will be assigned to it by the propagation steps described
below}. The run-time complexity of this process isO{M }.
Propagatingthe evidence up and down M {lines1702527}. First, the beliefs
are set for each plan implied by the observations, and its children {lines
1802519}.Then, the team T
that is to executethis plan is determined by a lookup intoM using team{X
}{line 20}. Now
YOYO*begins to propagate the evidence up to the plan's parents {lines 2102526}.
Any belief in the child plan is propagated and addedto the belief in its
parent {line 23}.However,
if the parent plan {parent{P }} is to beexecuted by a super-team of the
current teamT ,
then any change to itsprobability must be propagated to its other children,
that are to be
executed byother {subteams}. Thus the upwardpropagation is alternated with
downward propagation along hierarchical decompositiontransitions
1
. Thisdownward stepis executed
wheneverthe team that is responsible for joint execution of the parent plan
is no longer the current subteam being considered {T }, but its parent team
in theteam-hierarchy H ,
givenby parent
team
{T } {lines 2402526}.When this condition is satis034ed, any changein the
beliefs about the parent plan must be propagated down to any childrenit
has that are to be
executed byother subteams. This is done via theScale algorithm {Algorithm
8, Appendix A}.1. Thisalternating upward-downward propagation isthe origin
for YOYO*'s name.
114Monitoring Teams by Overhearing The downward propagation {line 25}implements
a subtle but critical step:It re-aligns
any beliefs YOYO* maintains about subteams other than those implied bythe
message so
that these beliefs are madecoherent with existing evidence. TheScale procedure,
which
re-distributesthe new state probability of a parent amongits children, such
that each child gets scaled based on its relative weight in the parent. The
end result is that thestate
probabilities of the children aremade to sum up to the state probability
of theparent. The
process is recursive,but never re-visits a subtree, since it is onlycarried
out for hierarchical-
decompositiontransitions that were not previously updated. Once this downward
propagation is done,YOYO* updates the current team to be its parent in the
team-hierarchy, in line 26. Note that the call toparent
team
re035ects a lookup
in the team-hierarchy H, rather than the plan-hierarchyM . Finally, regardless
of whether downward propagation took place, thetemporary variable P isupdated
to climb up the
hierarchicaldecomposition in M {line 27}. Each iteration through the loop
begun online 17 is O{M + H} since in the worst case both the plan-hierarchy
and team-hierarchy are traversed. However,this loop many repeat
{inthe worst case} for each of the plans in theplan-hierarchy, and thus
overall,the run-time
complexity of this processis O{M {M +H }) = O{M 2
+ M H }.
An example run of YOYO*.The following example illustrates YOYO*'sinference
upon an observationof a message. Suppose a single memberof the TRANSPORT
subteam
communicatesthat it is initiating the Transport-Opsplan. Upon observing
this message, YOYO* looks up the sender, to determinewhat transitions can
be taken by it {line8}. It then
proceeds to determine thenew beliefs in team T 's execution ofthe Transport-Ops
plan {lines
902510, then 16}, and incorporates these new beliefs to re035ect a much
increased beliefthat the
TRANSPORT subteam is executingTransport-Ops and its children {lines 1802519}.Since
this plan's parent,Landing-Zone-Maneuvers, is not null, YOYO* enters the
loop in lines
2202527.First, it increases the belief in the executionof the parent {line
23}. Then,it checks
the condition on line 24:Indeed, the team that is to executeLanding-Zone-Maneuvers
is
TEAM-FLY-OUT, the parent of the TRANSPORT subteam {i.e., Landing-Zone-Maneuvers
is to be executed jointly by theTRANSPORT and ESCORT subteams}.YOYO* there-
fore calls theScale procedure {line 25} to re-adjustLanding-Zone-Maneuvers'
other
childrensubtrees. Landing-Zone-Maneuvers has twohierarchical-decomposition
children: Transport-Ops {which YOYO* has alreadyupdated} which is to be executed
by the TRANSPORT subteam, and Escort-Ops, which is to be executed by the
ESCORT subteam.
Scale climbsdown from Landing-Zone-Maneuversto Escort-Ops, increasing YOYO*'s
beliefs that the ESCORT team isexecuting the Escort-Ops plan. Thisprocess
re-aligns
any prior beliefs YOYO* had about the likelihood thatEscort-Ops was being
executed
with current evidence, in e033ect updating beliefs about the plans executed
by the ESCORT
subteam, based on a single observation made of a member of the TRANSPORT
team. The
process now repeatsthis loop until the entire set of beliefsis updated and
aligned with
respectto the observed message.
5.2Scalability in the Number of Agents YOYO* o033ers signi034cant computational
advantages when compared to the individual rep- resentation {array} approach.YOYO*
requires only a single,fully-expanded plan-hierarchy
115Kaminka, Pynadath, & Tambe
to represent theentire team. This hierarchy isa union of all the individual
agent plan- hierarchies, containing all transitions andplans, tagged by the
subteams that are allowed to execute them. In addition YOYO* uses a single
copy of the team hierarchy. Suppose
M is the size ofthe plan-hierarchy, H isthe size of the team-hierarchy,andN
the number
of agents in the team. When agents areadded to the monitored team, the team
hierarchy grows by one new node that represents the new agent, and is connected
to theappropriate
sub-team in the team hierarchy. YOYO*'s space complexity is thereforeO{M
+ H }.Since
H grows with N, we could write it O{M + N } {compare to the array approach:
O{M N },
Algorithm 4}.
Toanalyze YOYO*'s run-time complexity, we have to consider the behaviorof
Algorithm
5 separately in cases where no communications are observed, andin cases
whereat least
one message is observed.If no messages are observed, then an updatetakes
the form of
a single call toTeam-Propaga te-For w ard{Algorithm 7}, an O{M} process.
This is
clearly a best-case scenario for YOYO*.If one agent communicates, then YOYO*
would
haveto go through Mand H in its upward-downward propagation process only
once, thus O{M + H }= O{M + N }.
The worst case scenario for YOYO*occurs if all agents send messages, and
eachone of
these N messagesrefers to a di033erent plan {messages aboutthe same plans
would be merged
in lines 50257}. In this case, there wouldbe up to M di033erent plans
for which evidence exists,
and each one of themwould require a separate update through lines1702527.
Thus YOYO*'s
run-timecomplexity in this case is
O{N + M + M+ M {M + H}) = O{N +M
2
+ M H} = O{N + M 2
+ M N } Clearly, this worst-case cannot becontinuously sustained by a monitored
team,since agents
cannot continuously communicate about their state. Wethus believe that the
average case in real-world domains with manyagents would be much closer to
theO{M + N }case
presented earlier {see Section 6.4 for empiric evaluation}. In any case,
YOYO*'s complexity
compares favorablywith a procedure reasoning about coherent hypotheses using
an array
of recognizers, an O{M N
2
} process {at least}, even if onlyone agent communicates {Section
4.1}.
5.3 Discussion
YOYO* explicitly represents a team as asingle coherent entity. Itsspace and
run-time re-
quirementsare preferable to the array based approach whenthe number of agents
grow, and it considerably simpli034es reasoning aboutcoherence and communications
predictions.On
the other hand, YOYO* sacri034ces the capability to represent failing team
activities{inco-
herent hypotheses}, where oneteam-member is executing one team-plan while
itsteammate
is executing another.This does not at all mean thatindividual actions taken
by agents aresomehow locked together in synchronous execution, or that individual
agents mustall
execute the same individual action atthe same time. For instance, twoteam-members
A; B
thatare each executing a completely di033erent pathof execution at the
same time {i.e., plan steps A
1
;:::; A
k
and B 1
; :::; B
l
} can be easily representedby a plan hierarchy that includes an overall
joint plan J, having two 034rst hierarchical decomposition children, A
1 and B
1
, to 116Monitoring Teams by Overhearing be selected by A andB , respectively.
A 1
would have an outgoingtemporal transition to A
2 ,
etc. and similarly B 1
would have an outgoing temporal transition to B
2
, etc. Since J is to be executed by the two team-membersjointly, any initial
evidence for anyone of the agent
executing any of itsindividual plans would be used by YOYO* asevidence for
the other
team-memberhaving begun its own parallel execution ofits own individual
execution path.
Further evidence about one agent executingits own individual actions would
only increase the likelihood that the other agentis continuing its own execution,
at its own pace. However,
it would beimpossible for YOYO* to correctly represent amonitoring hypothesis
in which
Ais executing some child of J, A
i
, whileB isexecuting some plan that is notJ , nor a child
ofJ . Given the results of the evaluation we conducted {Section 6}, whichdemonstrated
the
importance of coherence inaccurate visualization, the tradeo033 of expressivityvs.
scalability
is justi034ed:Overseer's accuracy was much improved due to the use of coherence.
Although YOYO* sacri034ces the capability toreason about certain failure
{incoherence} hypotheses,it is still capable of supporting failure-detection,
animportant secondary goal of
visualization. In earlier work, we have shown the merits of coherence in
service of detecting
disagreements in a team, inparticular demonstrating that coherent monitoring
leadsto
sound centralized disagreementdetection, and may lead to sound and completedisagreement
detection under speci034ccircumstances {Kaminka & Tambe,2000}. As YOYO*
is in fact a very e036cient way to reason about coherent hypotheses, it
provides a goodbasis for providing
sound disagreementdetection results.
A concern about thegenerality of the technique may be raisedbased on YOYO*'s
re-
liance on theteam-hierarchy. However,we believe it is reasonable to expect
that large,
complex, real-world multi-agentsystemsof the type targeted by this paperwould
have
an organizational hierarchy of some sort associated with them {see, forinstance,
Tidhar,
1993b}. Humanorganizations certainly demonstrate the emergence of such hierarchies,
es-
peciallyas theorganizations grow larger {e.g., big corporations,government
organizations}
or tacklemission-critical tasks {e.g., military organizations}.In addition,
team-hierarchies
forcomputational agents are critical for planning, formaintaining network
and system secu- rity, etc. Thus we believe our use of a team-hierarchy is
not a weakness in our approach, as
organizationalstructures will become as wide-spread in computational multi-agent
systems
as they already arein human multi-agent systems. Indeed,it may be possible
to gradually learna team-hierarchy for a given coordinated team for the purpose
of monitoring;however,
discussion of this possibility is outside the scope of this paper. Indeed,
using a team-hierarchy, we can apply our assumption of coherence toother
rep-
resentations and algorithms as well. For instance, if we start with theDBN
representation
of the team fromSection 4.3, we can unify the multiple random variables
used to repre-
sentthe separate agents into a single random variable for an overall team/subteam.As
in
YOYO*, the size of therepresentation grows with the size of the planhierarchy,
andnot
the number of agents. Thus, the number ofnodes will be the same as for the
single-agent
case, O{M
2
}, as discussed in Section 3.1.However, again, the complexity ofinference
in
answering plan-recognitionqueries will still be exponential in the number
of nodes, O{2
M
2
}. 117Kaminka, Pynadath, & Tambe
6. Evaluation
This section presents a detailed evaluation of the di033erent contributionscontained
within
Overseer. We begin by exploring the relative contribution of each technique
to the success of Overseeras a whole {Section 6.1}.We then focus on evaluatingOverseer's
use of
communicationspredictions with respect to lossless and lossyobservations
{Section 6.2}.
We then present a comparison ofOverseer'sperformance with that of humanexperts
and non-experts {Section 6.3}.Finally, we empirically evaluateYOYO*'s scalability
in our
applicationdomain {Section 6.4}.
6.1 AccuracyEvaluation
The 034rst part of theevaluation tests the contribution ofthedi033erent
techniques in Over- seer to the successful recognition of thecorrect state
of the team-members.Figure 5 com-
pares the averageaccuracy for a sample of our actual runs, markedA through
J {X-axis}.
In each such1002520-minute run, the team executed its taskcompletely. At
di033erent points during the execution, the actualstate of the system was
compared to thestate pre dicte d
byOverseer, where the prediction was takento be the current most-likely
hypothesis. Each run had 2202545 such comparisons{data-points}. The percentage
of correctmonitoring
hypothesesfor each run across those comparisons is given in the 0-1 {0-100045}
range, on the
Y-axis.
linuxCompile Date: Aug 4 1999 13:39:24XpDummyUseratendatendAverage AccuracyFigure
5: Percent accuracy insample runs.
The accuracy when using theindividual models with no coherence {as in Section3}
is
presented in the leftmost bar {marked Temp or al } in each group{Figure
5}, and is clearly
verylow. This approach is a straightforwardattempt at monitoring multiple
agents by mon- itoring each individual, without consideringthe interactions
between them, as describedin
Section 3. The next bar presentsthe monitoring accuracy when only coherence
is used to rule out hypotheses {Section 5.1},with ties broken randomly. Thenext
bar to the right
{Coherent, Temp or al }presents the results of combining bothcoherence and
the probabilistic
temporalmodel {Sections 3 and 5.1}. Then,the bar marked {Coherent,Comm }
shows the
e033ects of combining the use of coherence with the use ofpredictions based
on knowledge of
118Monitoring Teams by Overhearing the communication procedures used bythe
team {Section 4.2}. Here, the communications
predictions were used to restrictthe set of coherent hypotheses considered,
withties bro-
ken randomly. Theremaining bar {Coherenc e, Temp or al, Comm } presents
themonitoring
accuracy in each run using thecombination of all techniques.
The resultspresented in Figure 5 demonstrate the e033ectiveness of the
socially-attentive
monitoring techniques we presented.First, the results show that the coherenceheuristic
brings the accuracy up by1502530045 without using any probabilistic reasoning.This
boost in
performanceis a particularly interesting result, because ofthe relation
between the coherence technique and previous techniques exploredin the literature
{Tambe, 1996; Intille& Bobick,
1999}. Previous work hassuccessfully used the relationships between agents
to increase the
accuracy of monitoring.The boost in Overseer's accuracybased on the use
of role and
teamworkrelationships con034rms the results from previous investigations.
However, the results also demonstrate that the techniqueis not su036cient
in this domain. Overseer adds a number of novel techniques not addressed
in previous work.The
034rst such technique combinescoherence with a temporal model of plan-duration{Coherent,
T emp or al}, and it results in signi034cant increasesto the accuracy,
because the probabilistic temporal information now allowsOverseer to better
handle the lack ofobservations. A
possiblealternative, which we explore in this evaluation, is to rely instead
on the commu- nications predictions to rule out hypothesesabout future states
that may or may not have
been reached {Coherent, Comm }. It is therefore interesting to compare the
performance of these two techniques by comparing the{Coherent, Temp or al}
and {Coherent, Comm} bars.
In almost all runs the average accuracy when using coherence and communications
pre-
dictions is signi034cantly higher thanwhen using coherence and the temporal
model.This is
despite the fact that the moree033ective coherence technique uses arbitrary{random}
selection
among the available hypotheses: The reason for thisis that in many cases
the communica- tion predictions are powerful enough torule out all hypotheses
but one or two,signi034cantly
decreasing the uncertainty of the agents' plan-horizons. Thuseven a random
selection stands
a better chance than a more informed {bya temporal model} selection among
many more {1002520} hypotheses.
However,runsJ and B show a reversal of this trendcompared to the other runs.
Figures 6a025b show the accumulative numberof errors as task execution progresses
during run I {Figure 6-a} and during run J {Figure6-b}. An error is de034ned
as a failure tochoose the
correct hypothesis as themost likely one {i.e., the most likely hypothesis
does not re035ect
the true stateof the agent/team}. Each message exchangecorresponds to one
to a dozen
messages communicated by the agents,establishing or terminating a plan.
Inthe two 034gures,
a lower slope means better performance {less errors}.The line marked Coherentshows
the
accumulativenumber of errors if only coherence is usedto select the correct
hypothesis026 most such choices turn out to beerroneous since a random choice
is made among the competing hypotheses. Theline marked Coherent, Temp or
alshows the results using both coherence and the temporal model to choose
the most likely hypothesis.Similarly, the line
markedCoherent, Comm shows the results usingboth coherence and the communications
predictions. Finally, the remaining linedisplays the results of using the
combined technique,
using coherence, the temporal model, and the communications predictions.
In Figure 6-a, we see that the two techniques {Coherent, Temp or aland Coherent,
Comm } have almost equal slopes and result inalmost equal number of errors
at the end ofrun I,
119Kaminka, Pynadath, & Tambe
atendatendThis job requires more memory than is available in this printer.Try
one or more of the following, and then print again:In the PostScript dialog
box, click Optimize For Portability.In the Device Options dialog box, make
sure the Available Printer Memory is accurate.Reduce the number of fonts
in the document.Print the document in parts.%%[ PrinterError: Low Printer
VM ]%%G00F0ArialMSTT31c37d00F0_59030024024030025025030026026030027027030F2ArialMSTT31c39300F2_58024030034024026024032025024025030025034026026026032027024027030F4Times,BoldMSTT31c34d00F4_752EVHUYHG0030HVVDJH003{[FKDQJHVF6Times
New Roman,BoldMSTT31c39e00F6_72$FFXPXODWLYH003006003{UURUVF8Times New
RomanMSTT31c36500F8_75&RKHUHQW&RKHUHQW
* &RPP&RKHUHQW
* 7HPSRUDO&RKHUHQW
* 7HPSRUDO
* &RPP{a}Run I
atendatendThis job requires more memory than is available in this printer.Try
one or more of the following, and then print again:In the PostScript dialog
box, click Optimize For Portability.In the Device Options dialog box, make
sure the Available Printer Memory is accurate.Reduce the number of fonts
in the document.Print the document in parts.%%[ PrinterError: Low Printer
VM ]%%G00F0ArialMSTT31c38700F0_58030024024030025025030026026030027027030F2TimesMSTT31c39200F2_58024030034024026024032025024025030025034026026F4Times,BoldMSTT31c35900F4_752EVHUYHG0030HVVDJH003{[FKDQJHVF6Times
New Roman,BoldMSTT31c39d00F6_72$FFXPXODWLYH003006003{UURUVF8TimesMSTT31c37100F8_75&RKHUHQW&RKHUHQW
* &RPP&RKHUHQW
* 7HPSRUDO&RKHUHQW
* 7HPSRUDO
* &RPP{b} Run J Figure 6: Accumulativenumber of errors in runs I and J. though
from Figure 5 we know that due tothe alleviated uncertainty, the use ofcommuni-
cations predictions leads to overall higher probability of success {i.e.,
theCoherent, Comm
techniqueresults in fewer alternative hypotheses, andthus has a better chance
of being cor- rect}. However, in Figure 6-b wesee that in run J the situation
has changeddramatically.
First, we see that thetwo lines are no longer similar.The line marked Coherent,
Commhas
greater slope than in run I,indicating that the communications predictions
are not able to
reduce the uncertainty, resultingin lower average accuracy. Second, we see
that the tem- poral model results in many lesserrors, as evidenced by the
much slower-rising slope of the
line markedCoherent, Temp or al. Thusin this case, the actual duration of
plans matched
thetemporal model more accuratelythan in other runs.
In trying to understand this di033erence between runs J, B and theother
runs of the
system, we discovered that runs J and B involved relatively more failures
on the part of
team-members,including agents crashing or not responding atall. The communications
predictions,however, werelearned based on successfulruns026and thus did
not correctly
predict the communication messagesthat wouldresult as the team detected
and recovered from the failures. Thus the uncertainty was not alleviated,
and the arbitrary selectionwas
made among relatively many hypotheses. This explains the relatively loweraccuracy
of the
{Coherent,Comm} technique in run J and B. This clearlyshows a limitation
of the simple
learning approach we took, and we intend to address it in future work.However,
there
are other factorsthat in035uence the accuracy of the communicationmodels,
since this lower
accuracy didnot occur in other runs where failures haveoccurred.
The results of theCoherent, Temp or al techniquevary as well. We have been
able to
determine that failures cause arelative increase in the relative accuracy
of the Coherent,
T emp or altechnique. However, variancein the results is due to additional
factors.In run C,
for instance, this techniqueresults in relatively higher accuracy, but no
failure has occurred. Certainly, the mission speci034cationsthemselves di033er
between runs, machineloads cause the
mission execution to runslower or faster, etc. The great variance in the
temporal behavior of the system was the principal reason forour using the
communication prediction.This
variance is obvious in thegraphs.
120Monitoring Teams by Overhearing In summary, despite the variance in the
results of the Coherent, Temp or al technique{due
to variance in the temporal behavior of the system and the simplicity of
the temporal model},
and the possiblesensitivity of the Coherent, Commtechnique to learned predictions,
it is clear that the two techniques work well in combination, building on
the coherenceheuristic,
and compensating for eachother's weaknesses. In all runs, the combined technique
Coherent,
T emp or al, Comm was superior to either technique alone. Its performance
varied between
72045 accuracy {RunE} to 97045 {Run I}. The average accuracyacross all
runs of this all-
combinationtechnique was 84045, resulting in verysigni034cant increases
in accuracy compared to the initial solution with which we began our investigation
{less than 4045}, andto human
novice performance {seeSection 6.3}. Thus the communicationspredictions
need not be
perfect,and the temporal knowledge need not beprecise, in order to be useful.
6.2 Evaluating the Use of Communications Predictions
One key question aboutthe use of the communications predictions is theirsensitivity
to loss
of observations. The e036cacy of the technique {seeFigure 5} stems from
its capability to make inferences based on an expectedfuture observation.
Thepredictions used in the previous
sectionassumed no observation loss, i.e., if aprediction stated that a particular
message was to be observed, than the probability assigned to this prediction
was 1.0.But in settings
involvinglossy observation streams, such inference will prove incorrect,
as Overseer will 020wait021 for the observationand will therefore not correctly
monitor the actualstate of team-
members.
To evaluate the predictions' sensitivity to observation loss, we chose three
of the exper-
imental runs, E, I, and J,which represent the extreme performance resultsof
Overseer:
Run E had the lowest accuracy {72045}, Run I had the highest{97045}, and
run J showed an interesting reverse in relative performance of the Coherent,
Temp or aland Coherent, Comm
{see Figure 5}. For each of theseruns, we simulated observationloss at a
rate of 10045,
repeatingeach trial three times with di033erent randomseeds. In other words,
we ran a total of 9 trials, inwhich a random 10045 ofthe messages to be
observed byOverseer were
not observable to Overseer {though they still reached the evacuation team-members026
team-performance was identical to theoriginal settings}. We then set thepredictions
to
appropriately use 9004502510045 settings: each expected message waspredicted
to appear with
0.9 probability{as opposed to 1.0 probability originally}. The results of
these experiments arepresented in Figure 7. For each ofthe three di033erent
runs, two barsare presented. The left {shaded} bar showsthe original results
as presented in the previous section {i.e., with no observation loss, and
no treatment of possible loss in the
predictions}. The right barshows the average accuracy achieved byOverseer
on the three
trials{for each run} in which 10045 of the observations were not observableto
Overseer.
The error-bars on the right bar mark the minimum and maximumaccuracy values
achieved
in the three trials for each run.Run I's error-bars are unseen since all
threetrials resulted
in the same accuracy.
There are a number of promisingconclusions that can be drawn from these
results. First, in both runs E and I,Overseer's average accuracy dropped
by less than 8045, i.e.,
the performance ofOverseerdropped by less than the level of loss introduced.
Indeed, in run E, in which the original performance was the poorest, there
was almostno change
121Kaminka, Pynadath, & Tambe
atendatendLaserWriter II NT47.0 ---- -mark- -dictionary- -null- -filestream-
-savelevel- -fontid- /{}-string- {}[]-array- {}[]-packedarray- VMErrorERROR:
OFFENDING COMMAND: STACK:%%[ Error: ; OffendingCommand: ]%%This job requires
more memory than is available in this printer.
Try one or more of the following, and then print again:
For the output format, choose Optimize For Portability.
In the Device Settings page, make sure the Available PostScript Memory is
accurate.Reduce the number of fonts in the document.
Print the document in parts.
%%[ PrinterError: Low Printer VM ]%%00.20.40.60.81Run ERun IRun JAverage
AccuracyFigure7: Comparison of average accuracy resultswith 0045 and 10045
observation losses. in performance. Performancein run J did drop by slightly
more than10045, and that can
be at least partiallyexplained by run J's previously discussed failuresto
exploit the com-
munications predictions.Thus one promising conclusion to be drawnfrom these
results is
that Overseer's performance can degrade gracefully, at a rate comparable
to the rate of degradation to Overseer's input. A second conclusion is that
Overseer's performance under observation-losssettings is
fairly invariant.Again, both run E and I, which can beconsidered normative,
show very
little {if any} variance from one trial to the next, despite the change
in the selectionof
observations to be made unobserved from one trial to the next. Evenrun J,
which is not
a representative ofthe normative runs, shows little variance with respect
to its average accuracy under observation loss.This result suggests that
while there may bea drop in
performance with observation loss {as expected}, Overseerperforms consistently
under
varying lossy settings.
6.3Overseer and Human Monitoring by Overhearing
Another important facet to theevaluation of Overseerexamines its performance
in com-
parisonto that of novice and expert monitors of the evacuation application.
This evaluation sheds some light on the di036culty ofthe monitoring task,
and demonstrates thatOver-
seer's performance is comparable {sometimes higher, sometimes lower} to
humanexpert
performance, and signi034cantlybetter than that of novices.
To conduct this evaluation,we examined the same three runs representatives
of Over-
seer'sbounds on performance discussed above {runs E, I, andJ}. The 034rst
author of this
paperserved as an expert monitor, having as much experience in overhearing
in the evacua-
122Monitoring Teams by Overhearing tion application as possible {and speci034cally
in the actual test runs E, I and J} 2
. We estab- lished a group of novice monitors, made upfrom 034ve sub jects
who were generallyfamiliar
with hierarchical controlstructures but unfamiliar with either monitoring
byoverhearing or
with the evacuation application or its component agents.Each sub ject was
presented with printed books {one for each run}containing the overheard messages
{in human-readable
form}, the same messages overheard by Overseer under optimal {lossless}conditions.
As
reference material, eachsub ject was given a copy of theplan-hierarchy,
team-hierarchy, and
the same average durationinformation available to Overseer{the parameter
025 for di033erent leaf plans}. For each overheard message, a second line
of print indicatedthe time passed
since overhearing themessage, and the sub ject was asked towrite down their
best estimate
for the agents' current state {i.e., afterthe message was overheard and
the speci034ed time
passed}. If they felt di033erentagents or di033erent sub-teams had di033erentstates,
they were to
specifywhat each agent or subteam is doing.We emphasize that the sub jectswere
presented
with exactly the sameruns on which Overseer was evaluated.
The actual test process beganwith a short explanation of the task, with
a fullexpla-
nation of the plan-hierarchy{including answering any questions the subjects
had about the
semanticsof di033erent transitions, etc.}, and with ashort test run which
allowed each sub- ject to use the plan-hierarchy and teamhierarchy {but without
providing any feedback as
to the sub ject's accuracy}.Then, once all questions had been answered,the
sub jects were
presentedwith the test books and were givenunlimited time to complete them,
in any order {though once started on a book, they were required to 034nish
it}. Sub jectswere not allowed to
look forwardat the next message before completing their answer, nor to go
back and change their previous answers. Theywere encouraged to seek assistance
in locatingplans {in the
plan-hierarchy}consistent with messages, but no information wasprovided
as to which plan
may becorrect if two or more plans were consistent. After all sub jects
034nished with allruns,
the test books were takentogether and a score was computed. Eachcorrect
prediction was
awardeda 1, incorrect prediction a 0026same standard usedin scoring Overseer's
accu-
racy. A non-speci034c answer {i.e.,020it was somewhere in the sub-tree
ofFly-Flight-Plan021}
wasconsidered to be an incorrect prediction, as subjects were repeatedly
instructed that a speci034c answer was required. The results of the test
are presented inFigure 8. As in previous graphs, theY-axis denotes percentage
of accurate monitoring hypotheses across all the data-points in a run {2202545,
depending on the run}.The X-axis has three categories, for the threedi033erent
runs.
For each run, the left bar {marked 020Novices021} presents theaverage
accuracy achieved by the novice monitors, the middle bar {marked 020Experts021}
presents the accuracy achieved by
the human expert monitor, andthe 034nal bar {020YOYO*021} re-prints theresults
presented in
Figure 5 above. The results show that the averageaccuracy of the novices
is clearly inferior to that of the expert monitor and toOverseer. Overseer's
performance is above that
of the human expert inruns I and J. However, thehuman expertdoes much better
than
Overseerin run E.
We draw severalconclusions from these results. First,the monitoring task
Overseer
faced in the evacuation application isnottrivial: The novices failed to achievemore
than2. We have had to settle for one expertsince training an expert in this
task is verytime consuming and
requires much familiarity with the internals of the evacuationapplication
as well as the TEAMCORE architecture.
123Kaminka, Pynadath, & Tambe
70045 on average {in their best run}, and generally performedsigni034cantly
worse {by 15045 and more} than a human expert.Second, Overseer's performance
indi033erent runs was
comparable to that ofthe human expert {sometimes better, sometimes worse}.
However,
Overseer's performance tended to follow the sametrend as the novices. In
other words, Overseer's accuracy tended to go upand down on di033erent runs
in a similarmanner
to that of the average novicehuman monitor, while the expert's accuracy
remained fairly
constantacross all runs.atendatendLaserWriter II NT47.0 ---- -mark- -dictionary-
-null- -filestream- -savelevel- -fontid- /{}-string- {}[]-array- {}[]-packedarray-
VMErrorERROR: OFFENDING COMMAND: STACK:%%[ Error: ; OffendingCommand: ]%%This
job requires more memory than is available in this printer.
Try one or more of the following, and then print again:
For the output format, choose Optimize For Portability.
In the Device Settings page, make sure the Available PostScript Memory is
accurate.Reduce the number of fonts in the document.
Print the document in parts.
%%[ PrinterError: Low Printer VM ]%%00.20.40.60.81IJENovicesExpertsYOYO*Figure
8:Accuracy of human novice and expertmonitors compared to Overseer. 6.4 Evaluating
YOYO*'s Trading of Expressivity for Scalability We examine a key trade-o033
between the expressivity and e036ciency involved in the plan-
recognition techniques wehave presented. From the accuracydiscussion above,
it is clear
thatcoherence is a useful heuristic.YOYO* takes an extreme approach, strictlyruling
out
reasoning about incoherences.It is impossible for YOYO*, for instance, torepresent
an
incoherence in which twoteam-members are in disagreement about theplan executed
by
the common team.It may thus be impossible for YOYO*to explicitly represent
hypothe- ses associated with communication lossesanddelays, which cause such
incoherences.An
approach in which each individualis represented separately allows for suchrepresentation,
and in this respect ismore expressive. However, with a fewfailure-checks
in place, YOYO* is able to detect many incoher enc es, as previously discussed.
On the other hand, YOYO* o033erssigni034cant computational scalability
with respect to
the number of agents monitored.Analysis of YOYO*'s complexity {in contrast
to the array
124Monitoring Teams by Overhearing approach} was already presented inSection
5.2, and we follow it here withempirical evalu-
ation. Figure9 reports on the space requirement of YOYO*and the array-based
approach
in threedi033erent domains: the evacuationdomain, where YOYO* has been
evaluated and
deployed, and twoadditional domains in which we have built multi-agent teams026ModSAF
{Tambe et al., 1995; Calder, Smith,Courtemanche, Mar, & Ceranowicz, 1993}and
RoboCup
{Tambe, Adibi, Al-Onaizan,Erdem, Kaminka, Marsella, & Muslea, 1999;Marsella,
Adibi,
Al-Onaizan, Kaminka, Muslea, Tallis, & Tambe,2001}. YOYO* is currently being
evaluated
in these domains, and while it has not yet been fully deployed there, we
believe the partial
existing implementations are su036cient to provide robust projections of
the space savings
achieved in these domains. Webelieve that such pro jectedsavings of implementation
in
thesetwo domains could provide a rough guide asto the savings that designers
could expect from deploying YOYO* in additional domains. For each domain,
Figure 9 compares thespace requirements of the array-based approach {left
bar} with those of YOYO* {rightbar}. In addition, the dark-shaded region
on top of each bar shows the space requiredfor representing each additional
agent in thetwo
approaches, under the assumptionthat no additional plans are added to theplan-hierarchy
as more agents areadded. As discussed above, this assumptionis favorable
to the array-
based representation. The 034gure shows thesigni034cant space savings
achieved by YOYO*.
First, in representing the teams intheir current size, YOYO*'s space requirementsare
signi034cantly smaller. Furthermore, YOYO*'s savings really shine whenwe
examine the
scalability of the two approaches. While the array-based approach requires
at least the
amount of space shown in the 034gure as darkly-shaded area, YOYO*'srequirements
grow
by one node witheach additional agent. Its space requirements for representing
additional
agents areso small, that they don't show in the 034gure.atendatendLaserWriter
II NT47.0 ---- -mark- -dictionary- -null- -filestream- -savelevel- -fontid-
/{}-string- {}[]-array- {}[]-packedarray- VMErrorERROR: OFFENDING COMMAND:
STACK:%%[ Error: ; OffendingCommand: ]%%This job requires more memory than
is available in this printer.
Try one or more of the following, and then print again:
For the output format, choose Optimize For Portability.
In the Device Settings page, make sure the Available PostScript Memory is
accurate.Reduce the number of fonts in the document.
Print the document in parts.
%%[ PrinterError: Low Printer VM ]%%0300600900ArrayYOYO*ArrayYOYO*ArrayYOYO*Evacuation
{11 Agents}RoboCup {11 Agents}ModSAF {3 Agents}Number of NodesEachAdditionalAgentCurrentApplicationFigure
9:Empirical savings in applying YOYO* in the evacuation and other domains.
Earlier, inSection 5.2, we have analyzed YOYO*'s worst case run-time complexity,
but argued that this worst case behavior isvery extreme, and cannot be sustained
in practice since it involves continuous communications among all agents,
the infeasibility ofwhich
125Kaminka, Pynadath, & Tambe
providedthe motivation for exploring a plan-recognitionapproach. As further
evidence
for the average case, consider the evacuation application, where agents
communicate on average once every 20 state changes. In this application,
agents communicate in parallel in
4 or 5 exchanges {out ofdozens}, but in all cases but one, such parallelcommunications
all
referred to the sameplan, thus still requiring only a single updatein YOYO*
{see discussion
in Section 5.2}.Only once during task execution would 3 agents {out of 11}
be expected to communicate in parallel about di033erentplans, a scenario
still di033erent than YOYO*'sworst
case scenario.
Theaverage length of task execution in this domainis approximately 900 time-ticks.
The array approach would update the stateof each agent, at each time tick,
whethera
message would appear or not.Thus its average complexity per time-tick is
the same as
its worst-case, which is at least O{M N 2
}. ForYOYO*, the average complexity would be signi034cantly di033erent:899
out of 900 time-ticks it would resultin an O{M + H }process, and
only one time {out of900} it would be result in a process threetimes as
expensive {updating
the state of 3 di033erent agents}.The worst case scenario did not occur
atall in any of the
di033erent runs. 7. Related Work
Aiello et al. {2001} present several bene034ts to overhearing agent conversations.They
suggest
that the overhearermay infer the intent of the agents engagedin conversations,
and o033er
speci034csuggestions for improving the agents' performance. For instance,
overhearing a conversation between two agents about a keyword search on the
web,the overhearer may
suggest alternativekeywords to conduct the same search.This work is closely
related to our research on Overseer, and indeedpoints out several potential
additional bene034ts of
overhearing technology. However, in contrast to our work, Aiello et al.
do not address the problem of intent- or plan-recognition.They do not present
algorithms for inferringplans,
nor for disambiguating recognized plans. Overseer di033ers from most previouswork
on plan-recognition in being focused on monitoringmultiple agents, not a
single agent. While previous work in multi-agentplan
recognition has either focused onexploiting explicit teamwork reasoning
{e.g.,Tambe,
1996}, or explicitlyreasoning about uncertainty when recognizing multi-agent
plans {e.g.,
Devaney& Ram, 1998; Intille & Bobick, 1999}, akey novelty in Overseer is
that it e033ectively blends these two threadstogether. We provide a detailed
discussionbelow.
Like Overseer, RE S C
team {Tambe, 1996} reasons explicitlyabout team intentions
forinferring team plans fromobservations, similarly to Overseer's use of
the coherence
heuristic.RE S C
team
usescoherence to restrict the space requirements of the plan-library
used, similarly to YOYO*.However, Overseer usesa more advanced teamwork
model
{e.g., it canpredict failure states and recovery actions}, uses knowledge
about procedures
usedby a team {i.e., communication decisions}, andalso explicitly reasons
about uncertainty and time, allowing it to answer queriesrelated to the likelihood
of current andfuture team
plans {issues not addressed inRE S C
team
}.Indeed, RE S C
team does not explicitly represent ordering constraints between plans, anddoes
not address scarce observations:It assumes
that observationsare available that account for possiblechanges in the state
of each of the observed agents.
126Monitoring Teams by Overhearing Work such as {Devaney & Ram, 1998; Intille
& Bobick,1999}focuses on explicitly
addressinguncertainty in plan recognition in multi-agentcontexts, but does
not exploit
explicit notions of teamwork. Devaney and Ram {1998} use pattern matching
torecognize
team-tacticsin military operations.Their approach relies on team-plan libraries,
veri034ed by
domain experts, that combinethe team- and plan-hierarchies; theorganizational
knowledge
is not explicitly represented intheir technique. Similarly, IntilleandBobick
{1999} rely
entirelyon coordination constraints among agents torecognize team-tactics
in football, and in this sense use a socially-attentivetechnique that prefers
hypotheses in whichagents are
maintaining their roles.Intille and Bobick's work uses a singlestructure
for each di033erent
recognized tactic. Both investigations useposition trace data of the monitored
human teams. Our work di033ers from {Devaney & Ram, 1998; Intille & Bobick,
1999}in several ways.
First, these previous investigations have been applied insettings where
observations are con- tinuously available about eachmonitored agent. In contrast,
Overseeris targeted towards
overhearing, where limited observationsare available, both in time, and
in the number of
agents actually observed. Overseer introduces a number of novel techniques
{such as the communications predictions} which are usefulin such settings.
A second importantdi033er-
ence is the underlying representation usedin reasoning. We introducea novel
representation
particularly suitedfor monitoring by overhearing, while Intilleand Bobick
rely on standard
beliefnetworks, constructed in a particular way to support reasoning about
spatial/temporal coordination. Finally, the explicit usewe make of teamwork
and organizationalstructure
{the team-hierarchy} enables YOYO* in principle to reason about coordinationand
team-
work failures, where the previousmonitoring techniques would fail to recognize
theteam's
actions {Intille & Bobick, 1999}. Huber {1996} reports on the use ofprobabilistic
plan recognition in service of observation-
based coordination in the Net-trekdomain, and shows that agents using planrecognition
for coordination outperform agents using communications for coordination.Huber
takes
coordinationto be cooperative actions on the part ofthe self- interested
agents, e.g., joining an agent in attacking a common enemy. Huber's work
does not exploit anyknowledge of
relationships betweenthe agents to limit the computation or increasethe
accuracy. Huber's
systemdoes allow for some uncertainty caused by missing observations, but
in contrast to our work, does not introduce specialized mechanisms {such
as ours} to explicitly address
these.
Plan Recognition Bayesian Networks {PRBNs} {Charniak & Goldman,1993} provide
a
very general model for plan events, evidence, and inference.However, a PRBN
is a static Bayesian network, soit must includenodes for all plans and observationsthroughout
the
execution of the plans.Therefore, instead of representing only the events
of a single time
step {as in theDBNs described in Section 3.1}, it must includenodes over
all time steps.
Therefore, for N agents, executing a plan hierarchy of size M , over a034nite
time horizon
of Tsteps, the number of nodes in the network will be O{T N M 2
}. Inferencewill have a
space/time complexity exponential in the number of nodes,O{2
T N M
2
}, which is prohibitive over the lengths of execution found in ourexample
domains {e.g., T = 900}.
The representation used by YOYO* is related to existing approaches to the
modeling
of stochastic processes, inparticular those used for probabilistic plan
recognition.The
representation we present perhaps most closely resembles Hidden Markov Models
{HMMs}
{Rabiner, 1989}, used forplan-recognition in {Han & Veloso,1999}. One could,
in theory, 127Kaminka, Pynadath, & Tambe
representthe plan state of a team of agents within theunconstrained state
space of an
HMM.However, theHMM state space would have to represent all possible combinations
of the individual plan states of the agents,sothe size of the HMM state space
would be exponential in the number of agents and plans. Thus, the standard
algorithmsfor HMM
inference would not be able to exploit the structure of the plan and team
hierarchies, northe
particular forms of evidence {as described in Section 3.2}, in the way thatwe
do in YOYO*.
Generalizedversions ofthe HMM model {Ghahramani & Jordan, 1997; Jordan,Ghahramani,
& Saul, 1997} could more compactlyrepresent the same state space as in YOYO*,
but exact
inference is intractable for thesemodels. These models have more e036cient
algorithms for
approximate inference, butthese would have di036culty with thedeterminism
present in our
planning models. Pynadath and Wellman report on theProbabilistic State-Dependent
Grammar {PSDG} model {2000} that avoids the fullcomplexity of DBN inference
by making simplifyingas-
sumptions appropriate for plan recognition.However, while PSDG can incorporatebroader
classes of inference than YOYO*, itis intended for single-agent plan recognition,
anddoes
not support concurrency in a generalenough fashion for multi-agent plan
recognition. Goldman, Geib and Miller {1999} develop aconceptual model for
Bayesian plan recogni- tion which does include, as one of its key novelties,
the ability to infer the plansof a single
agent from lack of observation of its action. However,Goldman et al. deal
with a di033erent issue altogether than the one our communications predictions
address. Their framework looks at a sequence of observations, in which an
observationmay be missing, but observa- tions of actions following it appear.Their
framework then allows inference thatplans that
should have given rise tothe missing observation can be ruled out as recognition
hypothe-
ses. Incontrast, our approach uses the communicationspredictions to make
inference of
plan-steps that did not yet occur.Overseer probabilistically expects thepredictions
to
come true, and does notinfer additional information from a missing {predicted}
observation
that is followed byanother. In addition, our approach is fullyimplemented
and deployed in
multi-agent settings, rather than single agent. A complementary line of
work {in thecontext of the TEAMCORE architecture} has focused on intended
plan-recognitionformonitoring, where team-members may adapttheir
communications such that monitoringis made easier {Tambe etal., 2000}. This
work
{i} reduced, but did not eliminate uncertainty, and {ii} did not present
any methods to
address uncertainty, as we do here, However, it presents an interesting
future direction for Overseer's development. 8. Summary and FutureWork
This paper introducedmonitoring by overhearing, a techniquethat will be
increasingly
importantwith the growing need to monitor agent systems,particularly distributed
or de-
ployed.We presented Overseer, a systemfor monitoring teams by overhearing
the routine communications team-members exchange aspart of the execution
of their joint tasks.Moni-
toring by overhearing, while being a plan-recognition task, presents characteristic
challenges
not previously addressed.These include the scarcity of observations compared
to the rate
of changein agent's state, and the fact that agents arenot individually
observable, as the observations are essentially of multi-agent actions. In
addition to these,familiar challenges
128Monitoring Teams by Overhearing such as demanding response times andmaintaining
performance in face of a scale-up inthe
number of monitored agents, arealso present.
To address these challenges, Overseer employs a numberof novel techniques,
which
exploitknowledge of the relationships between the agents to alleviate uncertainty
and in- crease e036ciency of monitoring: {i}An e036cient probabilistic
algorithm forplan-recognition,
particularly suited formonitoring communications; {ii} YOYO*, anapproach
for e036cient
maintenanceof recognition of coherent hypotheses; and{iii} use of social
structures and procedures, e.g., team coherence and communications to maintain
coherence, toalleviate uncertainty. To demonstrate thegenerality of these
techniques, we havediscussed the po-
tential use of thesetechniques with representations other than aplan-hierarchy,
inparticular DBNs {Kj346rul033, 1992}.
We provided an in-depth empirical evaluation of these techniques in one
of thedomains
in which Overseeris applied. The evaluation carefullyexamines the contribution
ofeach technique to the overall recognitionsuccess, and demonstrates that
these techniques work
best together, as they complementrelative weaknesses of each other.The paper
also pre-
sented an evaluation of the scalability of YOYO*, and its performance under
conditions of
observation loss. Finally, we presented acomparison of Overseer's performance
withthat
of human expert and novicemonitors, and demonstrated that Overseerperformance
is
comparable to that of human experts, despite the di036culty of themonitoring
task.
Several opportunities forfuture research directions arise from the experimental
results.
First, the use of rote-learning to predict when messages will be observed
{provided as feasi-
bility demonstration}, proved e033ective for normative runs.However, the
simple mechanism was damaging when rare patterns of communications arose,
as some of the experiments have
shown. In-depth exploration of therole of learning is therefore one of the
directionswe hope
to pursue in the future.In addition, learning mechanisms that can derive
plan-hierarchy
and team-hierarchystructures from records of conversations are alsoof much
interest.
Acknowledgements
This paper is based in part on anAgents-2001 paper by the same authors {Kaminka,
Py-
nadath, & Tambe,2001}. Parts of this research werecarried out while the
034rst author was a Post Doctorate Fellowat the Computer Science Department,
Carnegie MellonUniversity.
W e thank Manuela Veloso for her enthusiastic support ofthis pro ject at
Carnegie Mellon University, and we thank Yves Lesp351rance, Victor Lesser,
George Bekey, Je033 Rickel, and
DanO'Leary for useful comments. OshraKaminka deserves special thanks for
herhelp in
analyzing and processing the data.This research was supported by DARPA awards
F30602-
98-2-0108,F30602-98-2-0135, and F30602-00-2-0549, managedbythe Air Force
Research
Labs/Romesite.
Appendix A. Additionalalgorithms and proofs
Thisappendix contains the pseudo-code for allalgorithms described in the
paper, for which pseudo-code was not provided in the body of the text itself.
These include the modi034cations
to the propagation proceduresnecessary for propagation within YOYO*.In addition,
we
129Kaminka, Pynadath, & Tambe
providea proof that the number of coherent hypotheses for N agents is linear
inthe size of
the plan-library M.
A.1 The Number of Incoherent and Coherent Hypotheses
LetM
i
be the monitoringplan-library for agent i; 1024 i 024 N . Whenmonitoring
agent i, a
monitoringsystem reasons about monitoring hypotheses inM
i
. In other words,we can view
M
i as the 034nite set of all possible plans agent i may be executing.Given
a query as to the
agent'scurrent state by the monitoring system, theplan-recognition algorithm
picks some
k
i
speci034cmembers of M
i
as hypotheses as to the current state ofthe agent026call these sets
of hypotheses m
i
wherejm
i
j =k
i
.
To construct an overall team hypothesis,the monitoring system must combine
the indi- vidual hypotheses to form a hypothesis for the team's state. For
each agenti, the monitoring
system chooses one individual hypothesis h
i 2 m
i
.The combination ofthese forms the team state hypothesis. If there is nouncertainty
about the state of any of theagent, i.e., k
i
=1
for all i, then oneteam hypothesis exists. However,ifuncertainty existsabout
the state agents, then clearly, the process ofselecting individual hypotheses
becomes combinatorial
in nature, as all possible combinations of all individual hypotheses are
possible in principle.
Let us consider howmany coherent hypotheses exist.If we restrict ourselves
to coherent hypotheses, then the selection of individualhypotheses for each
agent are constrained such
that the selections are in agreement026the same individual hypothesis is
selected for each
agent. Given a selection ofan individual state hypothesis h 1
2 m
1 for the 034rst agent, we must choose h
2 2 m
2
for the secondagent, h
3
2m
3
for the third agent,etc., such that
h
1 = h
2
=h
3
= ::: =h
N
. Sincethere are not more than k
1
024 jM
1 j individual state
hypothesesfor the 034rst agent, it follows that the number of coherent
team-state hypotheses is bounded by jM
1
j, i.e., the size of the planlibrary for the agents. In fact, the number
of
coherent hypotheses is bounded by mink
i 026since only members of m min k
i
can be matched
with members of the other individual hypothesis sets,m. In contrast, by
de034nition, allother
team-state hypotheses are incoherent. There will be k
1 003 k
2
003 k
3
003::: 003 k
N
000 {min k
i
} of these
hypotheses. A.2 YOYO* Propagation Algorithms {Section5.1}
The algorithms presented in thissection support those presented in the main
textof the
paper, and are provided here forcompleteness. Some of them may contain astep
which
iteratesover all teams that can take an outgoing transition {e.g., line
1 of algorithm 6, or
line 13 of algorithm 7}.This step requires some further clari034cation:When
iterating over
all outgoing teamsthat meet the condition, the algorithm consults theteam-hierarchy
to
carry out the iterationonly for the topmost teams {in terms ofthe team-hierarchy}
that
meetthe condition. For instance, in ourapplication domain, the team TASK-FORCEhas
{among others} two subteamsTRANSPORTS and ESCORTS. If a transitionis allowed
to be taken byTRANSPORTS only, then an iteration 020over all teams that
are allowed to take the transition021 will not considereither ESCORTS or
TASK-FORCE.However, if the
transition allows TASK-FORCE, then the iteration step will take place only
once026it will
beexecuted once for the team TASK-FORCE,which is the parent team for TRANSPORTS
and ESCORTS.
130Monitoring Teams by OverhearingAlgorithm 6Team-Propaga te-Down{plan Y,
probability 032, beliefs, b, plans M }1: for allteamsT who are allowed to
take anoutgoing hierarchical-decomposition transition from Y do
2: C T
fc j c2 M; c 034rst child ofY ; c is to be taken by teamT g
3: 032
0
032= j C T
j
4: forall plans c2 C
T
do
5: b t+1
{Y ;:block} b
t+1
{Y ; :block} + 032
0 6: Team-Propaga te-Down{c; 032
0
; b; M }Algorithm 7Team-Propaga te-For w ard{team-hierarchyH , beliefs b,
plansM }1:for all plans X 2 Mdo
2: b
t+1
{X; :block} 0:0
3:b
t+1
{X;block} 0:0 4: out
x
0:0
5: 021 x
0:0 6: for all plans X2 M in post-orderdo {children in temporal order
beforeparents}
7: if Xis a leaf then
8: out x
b
t {X; :block}{1000 e
000025
x
} {calculate probability ofX terminating at time t} 9: else {X is a parent}
10: out
x
is known { because post-order guaranteesall children set it in line 21}
11: for all temporaloutgoing transitions T
x!y
from X do 12: 021
x
021
x
+{1 000 026
xy
}031
xy
13:for all teams E whoare allowed to take a temporal outgoingtransition
do
14: if021
x
> 0then {some transition can be taken} 15: for all temporal outgoingtransitions
T
x!y from X to be taken byE do
16: 032 out
x
{1000026
xy
}031 xy
17: if T x!y
leads to a successorplan Y then
18: b t+1
{Y ;:block} b
t+1
{Y ; :block} + 032
19:Team-Propaga te-Down{Y ; 032; b; M }
20:else {T
x!y is a terminating transition}
21: out
parent{x}
out
parent{x}
+{1 000 026
xy
}031
xy
{parent'soutgoing probability is its chil- dren's}
22: b
t+1
{X; block} b
t+1 {X; block} + out x
000 021
x
23: b
t+1 {X; :block} b
t+1
{X; :block} 000out
x131Kaminka, Pynadath, & Tambe
Algorithm8 below may require some clari034cations.First, it is important
to note that the plans Y {line 1} are traversedin pre-order026parents beforechildren.The
scaling calculation
dependson the parent having the scaled probability. Second, the iteration
over sub-plans Y essentially captures all plans inthe subtree rooted in the
parent planP , except for those
in the subtreerooted by P 's childX , which already has been adjusted by
YOYO* prior to
the call to thisalgorithm. In fact, the use of X's team T to scale only
other plansmakes
sure that any of X's siblings, that are alternatives toX for the team T
,do not get scaled.
This is correct because this procedure is called when incorporatingevidence
for X {rather
thanany of its siblings}.Algorithm 8Scale{parentplan P , team T ,child planX
, beliefsb}1:for all subplans Y ofP , where team{Y} 6= T , in pre-orderdo
2: b
t+1
{Y ; :block} b
t+1 {Y ; :block}+ b
t
{Y;:block}b
t
{parent{Y };:block} b
t+1
{parent{Y }; :block}
3: b
t+1
{Y ; block} b
t+1 {Y ; block}+ b
t
{Y;block}b
t {parent{Y };:block}
b t+1
{parent{Y }; :block}References Aiello, M., Busetta, P.,Dona, A., & Sera034ni,
L. {2001}. Ontologicaloverhearing. In In-
telligent Agents VIII, Pro c e e dingsof the international workshop on Agents,
Theories,
Ar chite ctur es, and Languages {AT AL-2001}.
Barber, K. S., & Martin, C. E. {2001}.Dynamic reorganization of decision-making
groups. In Pro c e e dings ofthe FifthInternational Conferenc e on Autonomous
Agents {Agents-01},
pp.513025520. ACM Press.
Calder,R. B., Smith, J. E., Courtemanche, A. J., Mar,J. M. F., & Ceranowicz,
A.Z. {1993}. Modsaf behavior simulation and control. In Pro c e e dingsofthe
Third Conferenc eon
Computer Generate d For c es and Behavioral Rer esentationOrlando, Florida.
Institute
for Simulation and Training, University of CentralFlorida.
Charniak, E., & Goldman, R. P. {1993}. A Bayesian model of planrecognition.
Arti034cial
Intelligence, 64 {1}, 5302579. Cohen, P. R., Johnston, M., McGee, D.,Oviatt,
S., Pittman, J., Smith, I., Chen, L., & Clow,
J. {1997}. Quickset:Multimodal interaction for distributed applications.
InPro c e e dings
of the Fifth Annual International Multimodal Conferenc e {Multimedia '97},
pp. 3102540. Cohen, P. R., & Levesque, H. J.{1990}. Rational interaction
as the basis forcommunica-
tion. In Cohen, P. R., Morgan, J., & Pollack, M. E.{Eds.}, Intentions in
Commu-
nication, Systems Development Foundation Benchmark Series, chap. 12, pp.221025255.
MIT Press.
Cohen, P. R., & Levesque, H. J. {1991}.Teamwork. Nous, 35.
Decker, K. {1995}.Environment Centere d Analysis andDesign of Coor dination
Mechanisms.
Ph.D. thesis, Department of ComputerScience, University of Massachusetts,
Amherst. Devaney , M., & Ram, A. {1998}.Needles in a haystack: Plan recognition
inlarge spatial
domains involving multiple agents. In Pro c e e dingsof the Fifteenth National
Conferenc e
on Arti034cial Intelligence {AAAI-98}, pp. 942025947 Madison,WI.
132Monitoring Teams by Overhearing Dunin-Keplicz, B., & Verbrugge,R. {2001}.
The role of dialogue in collective problem
solving. In Pro c e e dings of034fth International Symposium on the Lo
gic al Formalization
of Commonsense Re asoning{Commonsense 2001}, pp. 89025104. Finin, T., Labrou,
Y., & May034eld {1997}.KQML as an agent communication language.In
Bradshaw, J. {Ed.}, Software Agents. MIT Press.
Ghahramani, Z., & Jordan, M. I. {1997}.Factorial hidden Markov models.Machine
Le arning,
29, 245025275.
Goldman, R. P., Geib, C. W., & Miller, C. A. {1999}.A new model of plan
recognition. In Pro c e e dings ofthe Conferenc e on Uncertainty in Arti034cialIntel
ligence {UAI-1999}
Stockholm, Sweden.
Grosz, B. {1996}.Collaborating systems. AI Magazine, 17 {2}.
Grosz, B. J., &Kraus, S. {1999}. The evolution ofSharedPlans. In Wooldridge,
M., & Rao, A. {Eds.}, Foundations and Theories of Rational Agency, pp. 227025262.
Grosz, B. J., & Kraus,S. {1996}. Collaborative plans for complexgroup actions.
Arti034cial
Intelligence, 86, 269025358. Han, K., & Veloso, M. {1999}.Automated robot
behavior recognition appliedto robotic soc-
cer. InPro c e e dings of the IJCAI-99 Workshop on Te am Behavior and Plan-Re
c o gnition.
Also appears inProceedings of the 9th International Symposiumof Robotics
Research
{ISSR-99}. Horling, B., Benyo, B., & Lesser, V.{2001}. Using self-diagnosis
to adaptorganizational
structures. In Pro c e e dings ofthe Fifth International Conferenc e on
Autonomous Agents {A gents-01}, pp. 529025536. Huber, M. J. {1996}. Plan-Based
Plan Re c o gnition Models for the E033ective Coor dination of
Agents Through Observation. Ph.D. thesis, University of Michigan.
Huber, M. J., & Hadley, T. {1997}. Multiple roles, multipleteams, dynamic
environment:
Autonomousnetrek agents. In Johnson, W. L. {Ed.},Pro c e e dings of the
First Interna- tional Conferenc e on Autonomous Agents {Agents-97}, pp. 332025339
Marina del Rey,
CA. ACM Press. Intille, S. S., & Bobick, A. F. {1999}.A framework for recognizing
multi-agent action from
visual evidence. InPro c e e dings ofthe Sixteenth National Conferenc e
on Arti034cial Intel ligence {AAAI-99},pp. 518025525. AAAI Press.
Jennings, N.R. {1993}. Commitments and conventions:the foundations of coordination
in
multi-agent systems. Know ledge Engineering Review,8 {3}, 223025250.
Jennings,N. R. {1995}. Controlling cooperativeproblem solving in industrial
multi-agent systems using joint intentions.Arti034cial Intel ligence,75
{2}, 195025240.
Jordan,M. I., Ghahramani, Z., & Saul, L. K. {1997}.Hidden Markov decision
trees. In Mozer, M. C., Jordan, M. I., & Petsche,T. {Eds.}, Advanc es inNeural
Information
Pro c essingSystems,Vol. 9, p. 501. The MIT Press. Kaminka, G. A., Pynadath,
D. V., & Tambe, M. {2001}. Monitoringdeployed agent teams.
InPro c e e dings of the Fifth International Conferenc e on Autonomous Agents
{Agents-
01},pp. 308025315.
133Kaminka, Pynadath, & Tambe
Kamink a,G. A., & Tambe, M. {2000}.Robust multi-agent teams via socially-attentive
monitoring. Journal of Arti034cialIntel ligence Rese ar ch, 12, 105025147.
Kinny, D., Ljungberg, M., Rao, A., Sonenberg,E., Tidhar, G., & Werner, E.
{1992}.Planned
team activity. InCastelfranchi, C., & Werner, E. {Eds.},Arti034cial Social
Systems,
Le ctur e notes in AI 830, pp.227025256. Springer Verlag, New York.
Kj346rul033, U. {1992}.A computational scheme for reasoning in dynamicprobabilistic
net-
works. InPro c e e dings ofthe Conferenc e on Uncertainty in Arti034cialIntel
ligence {UAI-
1992}, pp. 121025129 San Mateo, CA. Morgan Kaufmann. Knoblock, C. A., Minton,
S., Ambite,J. L., Ashish, N., Modi, P. J., Muslea,I., Philpot,
A. G., & Tejada,S. {1998}. Modeling Web sources forinformation integration.
In
Pro c e e dings of the FifteenthNational Conferenc e on Arti034cial Intelligence
{AAAI-98}.
Kumar,S., & Cohen, P. R. {2000}.Towards a fault-tolerant multi-agentsystemarchitecture.
In Pro c e edings of the Fourth International Conferenc e on Autonomous
Agents {Agents-
00}, pp. 459025466 Barcelona,Spain. ACM Press.
Kumar, S., Cohen, P. R., & Levesque, H. J. {2000}.The adaptive agent architecture:
Achiev- ing fault-tolerance using persistent broker teams. In Pro c e e dingsof
the Fourth Inter-
national Conferenc e on Multiagent Systems {ICMAS-00}, pp. 159025166 Boston,
MA.
IEEEComputer Society.
Lenser,S., Bruce, J., & Veloso, M. {2001}.Cmpack: A complete software system
for au- tonomous legged soccer robots.In Pro c e e dings ofthe Fifth International
Conferenc e on Autonomous Agents {Agents-01}, pp. 204025211. ACM Press.
Lesh, N., Rich, C., & Sidner, C. L.{1999}. Using plan recognition in human-computer
col-
laboration. In Pro c e edings ofthe Seventh International Conferenc e on
User Model ling
{UM-99} Ban033, Canada.
Levesque, H.J., Cohen, P. R., & Nunes, J. H. T.{1990}. On acting together.
InPro c e e dings
of the EigthNational Conferenc e on Arti034cial Intelligence {AAAI-90}
Menlo-Park, CA. AAAI Press.
Marsella, C. S., Adibi, J., Al-Onaizan, Y., Kaminka,G. A., Muslea, I., Tallis,
M., & Tambe,
M. {2001}. Onbeing a teammate: Experiences acquired in thedesign of robocup
teams.
Journalof Autonomous Agents and Multi-Agent Systems, 4 {10252}. Martin,
D. L., Cheyer, A. J., & Moran, D. B. {1999}. The open agent architecture:A
framework for building distributed software systems. Applied Arti034cial
Intelligence,
13 {1-2},92025128.
Ndumu, D. T., Nwana, H. S.,Lee, L. C., & Collis, J. C. {1999}.Visualizing
and debugging
distributed multi-agent systems. In Pro c e edings of the Third International
Conferenc e
on Autonomous Agents{Agents-99}. ACM Press. Payne, T. R., Sycara, K., Lewis,
M.,Lenox, T. L., & Hahn, S. {2000}.Varying the user
interactionwithin multi-agent systems. InPro c e e dings ofthe Fourth International
Conferenc eon Autonomous Agents {Agents-00}, pp. 412025418.
Pechoucek,M., Marik, V., & Stepankova,O. {2000}. Role of acquaintance modelsinan
agent-based production planning.In Klusch, M., & Kerschberg, L. {Eds.},
Coop er ative
134Monitoring Teams by Overhearing Information Agents IV, Pro c e edings
of the Fourth International Workshop {CIA-2000},
No. 1860 in LNAI,pp. 179025190. Springer Verlag.
Pechoucek, M., Marik, V., & Stepankova, O. {2001}. Towardsreducing communication
tra036c
inmulti-agent systems. Journal of Applied System Studies.
Pynadath,D. V., & Wellman, M. P.{2000}. Probabilisticstate-dependent grammarsfor
plan
recognition. In Pro c e e dings ofthe Conferenc eon Uncertainty in Arti034cial
Intelligence
{UAI-2000}, pp. 507025514. Rabiner, L. R. {1989}. Atutorial on Hidden Markov
Models and selectedapplications in
speech recognition.Pro c e e dings ofthe IEEE, 77 {2}, 257025286.
Reed, C. {1998}. Dialogue frames in agentcommunications. In Pro c e e dingsof
the Third
InternationalConferenc e on Multiagent Systems {ICMAS-98}, pp. 246025253.
Rich, C., & Sidner,C. L. {1997}. COLLAGEN: When agents collaborate with
people.
In Johnson, W. L.{Ed.}, Pro c e e dings of the FirstInternational Conferenc
e on Au- tonomous Agents {Agents-97}, pp. 284025291 Marina del Rey,CA. ACM
Press.
Tambe,M. {1996}. Tracking dynamic team activity. In Pro c e e dingsof the
National Confer-
enceon Arti034cial Intel ligence {AAAI}.
Tambe, M. {1997}.Towards 035exible teamwork.Journal of Arti034cial Intel
ligence Rese ar ch,
7, 83025124.
Tambe,M., Adibi, J., Al-Onaizan, Y., Erdem, A., Kaminka, G. A., Marsella,
S. C., &
Muslea, I. {1999}. Buildingagent teams using an explicit teamwork model
and learning. Arti034cial Intel ligence, 111 {1}, 215025239. Tambe, M.,
Johnson, W. L., Jones,R., Koss, F., Laird, J. E., Rosenbloom, P. S., & Schwamb,
K.{1995}. Intelligent agents for interactive simulation environments.AI
Magazine,
16 {1}. Tambe, M., Pynadath, D. V., Chauvat, N., Das, A., & Kaminka, G.
A.{2000}. Adaptive
agentintegration architectures for heterogeneous team members. In Pro c
e e dingsof
the Fourth International Conferenc e on Multiagent Systems {ICMAS-00}, pp.
301025308
Boston, MA. Tidhar, G. {1993a}. Team orientedprogramming: Preliminary report.
Tech.rep. 41, Aus-
tralian Arti034cial Intelligence Institute, Melbourne, Australia. Tidhar,
G. {1993b}. Teamoriented programming: Social structures. Tech. rep. 47, Australian
Arti034cialIntelligence Institute, Melbourne, Australia. Vercouter, L.,
Beaune, P.,& Sayettat, C. {2000}. Towardsopen distributed information
systems by the way of a multi-agent conception framework.In Working Notes
of
the AAAI-2000 Workshop on Agent-Oriente d InformationSystems {AOIS-2000},
pp.
2902538.
135