Acta Polytechnica 55(3):154–161, 2015, doi:10.14311/AP.2015.55.0154
© Czech Technical University in Prague, 2015, available online at http://ojs.cvut.cz/ojs/index.php/ap

COMPARISON OF CLASSICAL AND INTERACTIVE MULTI-ROBOT EXPLORATION STRATEGIES IN POPULATED ENVIRONMENTS

Nassim Kalde^{a,b,c,∗}, Olivier Simonin^{d}, François Charpillet^{a,b,c}

a Inria, Villers-lès-Nancy, 54600, France
b Université de Lorraine, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France
c CNRS, LORIA, UMR 7503, Vandoeuvre-lès-Nancy, 54506, France
d CITI-Inria Laboratory, INSA Lyon, 69621 Villeurbanne, France
∗ corresponding author: nassim.kalde@loria.fr

Abstract. Multi-robot exploration consists in coordinating robots to map an unknown environment. It raises several issues concerning task allocation, robot control, path planning and communication. We study exploration in populated environments, in which pedestrian flows can severely impact performance. However, humans have adaptive skills that let them take advantage of these flows while moving. Therefore, in order to exploit these human abilities, we propose a novel exploration strategy that explicitly allows for human-robot interactions. Our model for exploration in populated environments combines the classical frontier-based strategy with our interactive approach. We implement interactions in which robots can locally choose a human guide to follow, and we define a parametric heuristic to balance interaction and frontier assignments. Finally, we evaluate to what extent human presence impacts our exploration model in terms of coverage ratio, travelled distance and elapsed time to completion.

Keywords: artificial intelligence, multi-agent system, robotic exploration, human-robot interaction.

1. Introduction
Mobile robots intervene in our daily life, providing services (e.g., guidance, assistance [1, 2]) or even leisure activities (e.g., providing company, dancing [3, 4]). As mobile robots enter citizens' day-to-day lives, they must take people into account and seek social compliance. Human activities and motion patterns have already been studied [5], so that a robot can learn a model of human behaviour and use it to generate socially compliant control. For example, by observing pedestrians walking nearby, a robot could model the pedestrian dynamics and derive its own navigation control for moving efficiently in populated environments.

Multi-Robot Exploration (MRE) consists in reconstructing all the reachable space of an unknown environment by controlling mobile robots. Introducing human-presence awareness into a robotic exploration system for populated environments is an interesting research direction: it paves the way for exploration approaches based on human-robot interaction (HRI). In populated environments, the exploration task raises new concerns regarding clean reconstruction and efficient robot coordination. Concerning reconstruction quality, it is particularly difficult to separate the static aspects (background) from the dynamic aspects (people, robots) of the scene [6]; mobile robot perceptions are biased by the dynamics of the environment, which hinders localisation and mapping. Regarding the selection of targets to explore, pedestrian movements make the reachability of known and unknown areas spatio-temporal, which complicates exploration: the reachable space evolves dynamically with the density of the human presence.
Nevertheless, humans can understand the dynamics of their environment, and can sense, decide and act adequately. In this sense, we can assume that every person has an adaptive heuristic, depending on the local environment, that allows him or her to walk readily through dense areas (e.g., crowds). We are interested in exploiting such human skills as possible heuristics for the exploration task, and we propose a weighted heuristic that incorporates human presence for selecting areas to explore or for initiating human-robot interactions.

The paper is organised as follows. The next section gives a brief state of the art in MRE and situates our approach among HRI applications in mobile robotics. The third section formalises the multi-agent system for exploration in populated environments and presents the framework of our study. The fourth section defines the mixed exploration approach (robot-frontier/interaction) and proposes a human-aware exploration heuristic for establishing human-following interactions. We then report several simulation experiments with our mixed approach, underlining how performance varies with the environment. Finally, we discuss our results and perspectives regarding machine learning for adaptive heuristic parameterisation.

Figure 1. Multi-Agent System simulated in V-REP [19].

2. Related Work
This section first presents previous work in the field of MRE, and then situates our study among mobile robotic applications of HRI.

2.1. Multi-Robot Exploration
The MRE problem consists in acquiring an accurate representation of an environment by efficiently coordinating the actions of robots within it. Representation accuracy refers to the degree of closeness to the ground truth. Coordination of the robots arises from the teamwork involved in solving the task; coordination efficiency can be evaluated at several levels, e.g., energy consumption, trajectory overlapping, etc. Thus, MRE solutions design efficient robot control for accurately completing a chosen representation of the environment (e.g., a graph). The proposed solutions can be roughly classified into reactive, goal-based and utility-based agent designs.

In reactive and bio-inspired approaches, the actions of an agent are hardwired to its perceptions, and simple navigation rules can be created, e.g., following walls, or moving in circular or boustrophedon patterns [7–10]. These approaches are usually concerned only with coverage; they do not always consider mapping, as typical reactive agents are memory-less.

For goal-based agents, frontier-based methods offer a good computation/performance tradeoff, making them particularly suitable for deployment in real embedded systems. The idea is to incrementally assign robots to frontiers (which separate the known free space from the unknown space), thus serialising the exploration task into subgoals. Various frontier evaluation and target selection methods are discussed in the literature [11–13].

In utility-based approaches, an agent makes decisions according to the value of world states. An information gain value is proposed in [14], and [15] considers curiosity, surprise and hunger as agent motivations.

In the present study, we consider goal-based agents, with frontiers to reach and humans to interact with as subgoals during exploration.
We are looking for a parametric heuristic that evaluates and balances frontiers and interactions, so as to exploit human adaptive skills.

2.2. Human-Robot Interaction
HRI is defined as the study of humans, robots and their mutual influences [16]. To the best of our knowledge, the office-conversant robot Jijo-2 is the only HRI application in mobile robotic exploration. This robot exhibits socially embedded learning mechanisms [17], gathering information by conversing with people; it thus realises semi-supervised learning by incorporating local oracle heuristics while exploring.

We present an application in mobile robotics considering close interactions, established by proximity or direct perception, between humans and robots. This type of interaction belongs to the intimate interaction class of Takeda's HRI classification [18]. Our study bridges intimate HRI applications and goal-based MRE algorithms.

3. Multi-Agent System Formalisation
3.1. Modelling the Environment and the Agents
We propose a model of the multi-agent system for populated environments (Fig. 1). In (1), the environment to explore is described by a navigation map E, which evolves over time; this evolution results from the actions of the agents (humans H and robots R). At each time t, each robot R_i from R has a configuration from which it observes the environment; O_i^t ⊂ E is the robot's observation at time t. Formally, let

E be a navigation map,
R = {R_1, …, R_n} be a set of robots,
H = {H_1, …, H_m} be a set of humans,
O_i^t ⊂ E be the observation of R_i at time t.  (1)

3.2. Exploration and Completion
For the exploration task, we must represent the environment explored by the robots over time (2). Let θ_i^{0:t} be the set of observations, namely the local t-time history, that agent R_i has accumulated up to time t. Similarly, Θ^{0:t} is the global t-time history, which aggregates the local t-time histories of all robots in R. Thus, we have

θ_i^{0:t} = θ_i^{0:t−1} ∪ O_i^t,   Θ^{0:t} = ⋃_{i=1}^{n} θ_i^{0:t}.  (2)

It is of fundamental importance for the robots to know when the exploration is finished. The completion criterion determines this moment and can be defined locally on each robot: robots determine exploration completion from the already explored space Θ. The mission is over as soon as no configuration in the already explored space allows for new observations.

3.3. Instantiating the Multi-Agent System
We represent E as a discrete occupancy grid of l × w square cells. Each cell has four possible states: unknown (not observed), occupied (walls, objects), animated (humans, robots) and free (empty). State transitions are illustrated in Fig. 2.

Figure 2. Cell state transition diagram (unknown, occupied, free, animated).

In this grid representation, R becomes the set of cells animated by the robots, and R_i describes the position of one robot on the grid. The observation area of each robot is a limited circle.

An environment, a robot and a human are represented in Fig. 3a: the robot is located on cell R_1 at (1, 1) and the human on cell H_1 at (1, 2). The maximum field of view of R_1 is within the dashed arc in Fig. 3b. O_1^1 consists of 7 cells: 3 are free, 2 are occupied and 2 are animated. The explored environment θ_1^{0:1} is limited to this first observation.

Figure 3. Full grid (a: E, R_1, H_1) and robot observation (b: θ_1^{0:1} = O_1^1) at t = 1.
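To make this instantiation concrete, the following minimal Python sketch (our own illustration, not the authors' implementation; all names are hypothetical) shows the four cell states, the observation unions of (2), and a completion test that searches for a remaining frontier cell:

```python
from enum import Enum

class CellState(Enum):
    UNKNOWN = "?"   # never observed
    OCCUPIED = "#"  # walls, static objects
    ANIMATED = "A"  # humans or robots (transient)
    FREE = "."      # observed empty

def merge_observation(theta, observation):
    """Local history update of (2): θ_i^{0:t} = θ_i^{0:t-1} ∪ O_i^t.
    Histories and observations map cell coordinates (x, y) to a CellState;
    fresh readings overwrite old ones, so an ANIMATED cell can become FREE."""
    merged = dict(theta)
    merged.update(observation)
    return merged

def global_history(local_histories):
    """Global history of (2): Θ^{0:t}, the union of all local histories."""
    merged = {}
    for theta in local_histories:
        merged.update(theta)
    return merged

def exploration_complete(grid):
    """Completion criterion: the mission is over when no explored free cell
    borders an unknown cell, i.e., no frontier allows a new observation.
    `grid` is assumed to map every cell of the l × w map to a CellState."""
    for (x, y), state in grid.items():
        if state is not CellState.FREE:
            continue
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if grid.get((x + dx, y + dy)) is CellState.UNKNOWN:
                return False
    return True
```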
We have thus provided an instance of a multi-agent system for exploration in populated environments: the environment is represented by a discrete occupancy grid, agents are characterised by their identifier and their coordinates on the grid, and observations are made by casting rays within the viewing range of a robot. Our study is based on this representation of a multi-agent system. In the next section, we present the frontier/interaction exploration approach.

4. Mixed Exploration Approach by Frontiers and Interactions
First, let us consider the MRE problem defined as a target allocation problem for robots in an unknown environment [12–14]. A solution to the MRE problem defines a way to explore an unknown space, i.e., how to assign robots from R to tasks/targets from T. To achieve this, we look for an assignment matrix A_RT that is optimal with respect to the cost matrix C_RT (cf. Fig. 4).

Figure 4. Multi-robot exploration as a task allocation problem: a cost matrix C_RT = (c_{R_i T_j}) is optimised to obtain an assignment matrix A_RT = (a_{R_i T_j}).

4.1. Various Approaches for Multi-Robot Exploration
We show how different target sets define the classical frontier-based exploration, our new interactive approach, and the mixed approach (frontier/interaction).

4.1.1. Frontier-Based Exploration
A frontier is the observed boundary between explored space and unexplored space [11]. Classical frontier-based exploration is defined by choosing the targets from the set of frontiers F (3). Let

c_{R_i F_j} be the cost for R_i to reach F_j,
a_{R_i F_j} = 1 if R_i must go to F_j, 0 otherwise.  (3)

In populated environments this approach can fail when the path to a chosen frontier is congested by humans.

4.1.2. Interactive Exploration
Human-robot interaction is defined as the reciprocal influence between a human and a robot, followed by one or more effects. We introduce an interactive approach that takes the presence of humans into account in order to establish human-robot interactions (opening a door, guiding through a crowd, etc.). Targets are now chosen from the set of humans H (4). Let

c_{R_i H_j} be the cost for R_i to interact with H_j,
a_{R_i H_j} = 1 if R_i must interact with H_j, 0 otherwise.  (4)

A purely interactive approach can be inefficient in sparsely populated environments: without any perception of human presence, the robots adopt a wait-and-see policy and pause the exploration.

4.1.3. Mixed Exploration
Mixed exploration enables robots both to initiate interactions and to reach frontiers. Thus, we combine the two target sets (frontiers and humans) into a new set G (5). Let

c_{R_i G_j} be the mixed cost for R_i to target G_j,
a_{R_i G_j} = 1 if R_i is assigned to G_j, 0 otherwise.  (5)

This approach requires smartly balancing interaction and frontier assignments, so as to overcome the two issues mentioned above (the wait-and-see policy and congested frontiers).
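For illustration, one way to read Fig. 4 and (3)–(5) is as a linear assignment over the mixed target set G = F ∪ H. The sketch below is ours (the paper's own allocation rules are the greedy methods described in Section 5); it computes an optimal one-to-one assignment matrix A_RG from a given cost matrix using SciPy's Hungarian solver:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def optimal_assignment(C):
    """Given the mixed cost matrix C_RG (n robots × |G| targets, G = F ∪ H),
    return a 0/1 assignment matrix A_RG minimising the summed cost, with
    each robot assigned to at most one target and vice versa."""
    A = np.zeros(C.shape, dtype=int)
    rows, cols = linear_sum_assignment(C)  # Hungarian algorithm
    A[rows, cols] = 1
    return A

# Columns 0..|F|-1 of C index frontiers and the remaining columns index
# humans, so A directly tells each robot whether its subgoal is a frontier
# or an interaction.
```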
4.2. Mixed Cost Model
In this study, robots can interact only by following pedestrians. The optimisation criterion is to explore a possibly populated environment in minimum distance and time. Thus, we define mixed costs using distances and weighted penalties (see Fig. 5); the weight σ balances interaction and frontier penalties. First, we detail the distances; then we introduce the penalties and explain the different weights used in the cost formula.

Figure 5. Distances (a: d_{R_i F_j}, d_{R_i H_k}) and penalties (b: σ · p_{R_i F_j}, (1 − σ) · p_{R_i H_k}) considered in the system.

4.2.1. Distance
First, we incorporate the distances between robots and targets as immediate costs (Fig. 5a). Thus, we initialise C_RG with normalised robot-frontier and robot-human distances (D_RF, D_RH) in (6):

D_RG = (D_RF | D_RH),  where D_RX = (d_{R_i X_j}) is the n × |X| matrix of distances from each robot R_i to each target X_j.  (6)

Distance costs have multiple drawbacks; two examples follow.

If a robot travels towards a frontier but a crowd hinders its navigation, the robot cannot adapt the exploration to the feasibility of navigation. Remote but reachable frontiers are not re-evaluated as good options: their distance cost is prohibitive, so the next target is always chosen between the last frontier and humans who are nearby. A solution is to use a planned distance, which is set to infinity when a target is momentarily unreachable.

If a robot follows a pedestrian walking nearby but the person stops to talk with other people, the robot cannot decide whether to maintain or stop the current interaction depending on the human activity. Because of the distance costs, the robot resumes exploration only when a person moves again, which also causes growing unease among the people. A solution is to evaluate an interaction a priori and to update this evaluation a posteriori, while the interaction is taking place.

4.2.2. Penalty
We tackle these two drawbacks with a heuristic that associates a penalty with each robot-frontier/human pair (Fig. 5b). A penalty p_{R_i X_j} is defined as the sum of a time penalty and an orientation penalty. The time penalty t_{R_i X_j} is the time elapsed since a frontier was discovered, or during which a human has remained idle. The orientation penalty o_{R_i X_j} is the smallest unsigned angle between the orientation of the robot and that of the frontier/human (a frontier is oriented towards the unknown). Thus, we define P_RG with normalised robot-frontier and robot-human penalties (P_RF, P_RH) in (7):

P_RG = (σ · P_RF | (1 − σ) · P_RH), σ ∈ [0, 1],  where P_RX = (p_{R_i X_j}) and p_{R_i X_j} = t_{R_i X_j} + o_{R_i X_j}.  (7)

The parameter σ puts more or less weight on the frontier penalties relative to the interaction penalties. When σ is high, it increases the frontier costs and decreases the interaction costs; this favours interactions over frontiers.

4.2.3. Distance and Penalty
The mixed cost matrix C_RG, which combines the distances D_RG and the penalties P_RG, is given in (8):

C_RG = α · D_RG + (1 − α) · P_RG, α ∈ [0, 1].  (8)

The parameter α modulates the immediate distance cost against the information coming from the penalty heuristic: when α is high (resp. low), the importance of the penalties is reduced (resp. increased). Distances and penalties are thus counterbalanced by α, while σ sets more or less focus on frontiers or on interactions. Figure 6 summarises the influence of α and σ on the cost formula: each parameter ranges from 0 to 1, and the formula obtained when one parameter is set to an extreme value is written on the corresponding side of the figure.

Figure 6. C_RG according to α and σ: at σ = 1 the cost reduces to α · D_RG + (1 − α) · (P_RF | 0), and at σ = 0 to α · D_RG + (1 − α) · (0 | P_RH).
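The cost model of (6)–(8) translates directly into code. A minimal sketch, assuming the distance and penalty matrices are already normalised to [0, 1] (function and variable names are ours, not the authors'):

```python
import numpy as np

def penalty(idle_time, angle):
    """Penalty of (7): p = t + o, the idle time of the target (since a
    frontier was discovered, or since a human stopped moving) plus the
    smallest unsigned angle between robot and target orientations."""
    return idle_time + angle

def mixed_cost_matrix(D_RF, D_RH, P_RF, P_RH, alpha, sigma):
    """Mixed cost C_RG of (8) for n robots, |F| frontiers and |H| humans.
    D_RF, P_RF: (n, |F|) arrays; D_RH, P_RH: (n, |H|) arrays; all in [0, 1].
    A high sigma penalises frontiers and favours interactions; a high alpha
    favours raw distances over the penalty heuristic."""
    D_RG = np.hstack((D_RF, D_RH))                        # (6)
    P_RG = np.hstack((sigma * P_RF, (1 - sigma) * P_RH))  # (7)
    return alpha * D_RG + (1 - alpha) * P_RG              # (8)
```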
We have thus adopted a mixed approach and defined a parametric cost matrix based on a penalty heuristic. We now evaluate the exploration performance of this heuristic for two greedy optimisation methods and different values of α and σ.

5. Experimental Framework
We use the V-REP robotic simulator for our experiments [19]. The environment is discretised into 0.5 m square cells. The robots share their exploration map, so every discovered frontier is known to every robot. Contiguous frontier cells are grouped into a frontier area; inside a frontier area, the targeted cell is the one minimising the sum of distances to the other cells. Assignments are computed locally by each robot: to optimise its assignment, a robot takes into account the entire set of frontiers known so far, but only the robots and pedestrians it perceives locally (within a 2 m radius). Planning is done using a potential field propagated on the grid.

Figure 7. Environments: (a) empty (100 m²); (b) cave (144 m²); (c) structured (242 m²).

5.1. Protocol
The parameters are as follows:
• Map: Three environments are considered: the first contains no obstacles (empty), the second has unstructured obstacles (unstructured), and the third is composed of a corridor and three rooms (structured). The maps are shown in Fig. 7.
• Population density: Environments are populated by humans at 0 % or 30 % occupation. Each human agent moves in a straight line and avoids obstacles by stopping and rotating.
• Number of robots: Two explorers, represented as cylinders, are used in each experiment.
• Optimisation method: We use two different cost optimisation methods. The first is a local greedy method, in which each robot chooses the minimum-cost target among only its own possible targets, as in [11] for distances. The second is a group greedy method, in which, over all locally visible robots at each time step, the minimum-cost robot-target pair is iteratively assigned and removed from consideration until the local robot itself is assigned. Both rules are sketched after this list.
• Modulators: α and σ are discretised from 0 to 1 with a step of 0.25.
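The two allocation rules can be sketched as follows (our reading of the description above; the names are hypothetical). Local greedy looks only at the robot's own row of the cost matrix, while group greedy repeatedly fixes the globally cheapest remaining robot-target pair among the locally visible robots until the deciding robot is assigned:

```python
import numpy as np

def local_greedy(C, i):
    """Robot i picks the minimum-cost target among its own targets only."""
    return int(np.argmin(C[i]))

def group_greedy(C, i):
    """Among the locally visible robots, repeatedly assign the globally
    cheapest remaining (robot, target) pair and remove both, until robot i
    itself is assigned.  Assumes at least as many targets as robots."""
    robots = list(range(C.shape[0]))
    targets = list(range(C.shape[1]))
    while True:
        r, t = min(((r, t) for r in robots for t in targets),
                   key=lambda pair: C[pair[0], pair[1]])
        if r == i:
            return t
        robots.remove(r)
        targets.remove(t)
```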
5.2. Metrics
Each scenario is evaluated with classical MRE metrics: coverage, distance and time. In addition, we use a common HRI metric called Robotic Attention Demand (RAD), which measures the autonomy of a robot during its task [20, 21]; here, we consider the number of interactions initiated during exploration.

5.3. Results
First, let us consider environments without humans. We study the influence of α while fixing σ to 1, so that only the balance between distances and frontier penalties is adjusted. This is legitimate, since the absence of humans implies no interaction penalties. The performances averaged over 10 runs are plotted in Fig. 8 for local greedy and in Fig. 9 for group greedy.

Figure 8. Local greedy, not dense: empty (top), unstructured (middle), structured (bottom); coverage (%), travelled distance (m) and elapsed time (s) as functions of α.

Figure 9. Group greedy, not dense: empty (top), unstructured (middle), structured (bottom); coverage (%), travelled distance (m) and elapsed time (s) as functions of α.

For local greedy (Fig. 8), on the empty and structured maps we distinguish two steps: in the first, distance and time increase; in the second, they both decrease until α = 1. The unstructured map for local greedy, and all maps for group greedy (Fig. 9), show only one step, in which distance and time decrease as α increases. In these cases, when α is high, the penalties fade and the robots make fewer round trips between remote frontiers in the scene. Thus, in non-populated environments, our heuristic does not improve performance.

Now we consider the presence of pedestrians. The maps are populated at 30 %, up to 1 human/m², thus enabling human-robot interactions. Figures 10 and 11 give the mean performances of local greedy and group greedy, respectively, over 10 runs of each (α, σ) combination. When σ increases, the penalty on interactions is reduced, favouring interactions to the detriment of frontiers.

Figure 10. Local greedy, dense: empty (top), unstructured (middle), structured (bottom); coverage (%), travelled distance (m) and elapsed time (s) as functions of α and σ.

For the empty case (Fig. 10), full coverage with the shortest distance and time is obtained at (α, σ) = (0, 0): only penalties are used, and frontiers are preferred over interactions. No interaction was initiated (RAD), but an average of 28 frontiers were assigned. In the unstructured case, the best average performance is at (0.75, 0): distances are overweighted compared to penalties, and interactions are heavily penalised; nevertheless, an average of 18 interactions (RAD) were initiated, against 26 frontier assignments. In the structured environment, the best performances and the lowest standard deviations are at (0.5, 0): distances and penalties are equally weighted, and again interactions are penalised. An average of 31 frontier assignments and 18 interaction assignments took place. Here, interactions are interesting because a robot can discover the corridor by following someone.

Figure 11. Group greedy, dense: empty (top), unstructured (middle), structured (bottom); coverage (%), travelled distance (m) and elapsed time (s) as functions of α and σ.

For group greedy (Fig. 11), the best performance on the empty scene is located at (0.25, 0): penalties have more weight than distances, and frontiers are preferred over interactions; an average of 16 frontier assignments and 7 interaction assignments is observed. The unstructured environment reaches maximum coverage, with minimum travelled distance and time, at (0.5, 0); the average number of frontiers assigned is 8 times the number of interactions (only 4). For the last map, the best average performance is at (0.25, 0), with a frontier/interaction ratio of 29/4.

These results show that, in the presence of humans, distance alone does not suffice for choosing the best targets; instead, a smart equilibrium with our penalty heuristic always gives the best performance (α ≠ 1). Here, with σ = 0, frontiers were chosen considering only distances, while interactions, being heavily penalised, were chosen only carefully. Thus, our heuristic already suffices to select interactions only when necessary, but it is not yet able to actively promote them.

6. Conclusion
In this paper, we have defined interaction-based exploration, in which robots target the humans they perceive. Interactive exploration paves the way for exploiting natural human heuristics, towards a better understanding of the dynamics of a populated environment.
The mixed approach, based upon frontier and interactive exploration, aims at bringing out the best of both. For this purpose, we designed a parametric heuristic to balance frontier and interaction (pedestrian-following) assignments. This heuristic applies penalties based on the idle state of the targets (frontiers, humans) and on their orientation. We have shown in simulation that, in some cases, incorporating an interactive aspect into exploration can be beneficial, even with this simplistic heuristic. To enable efficient dynamic exploration, it is therefore paramount to identify these particular cases. In this sense, machine learning and online tuning of the weights might be of interest for achieving adaptive robotic heuristics. This work opens up prospects for exploiting human adaptiveness in the robotic exploration of populated environments.

References
[1] W. Burgard, A. B. Cremers, D. Fox, et al. Experiences with an interactive museum tour-guide robot. Artificial Intelligence 114(1):3–55, 1999.
[2] K. Kosuge, Y. Hirata. Human-robot interaction. In Proceedings of the IEEE International Conference on Robotics and Biomimetics, pp. 8–11. 2004.
[3] R. C. Arkin, M. Fujita, T. Takagi, R. Hasegawa. An ethological and emotional basis for human-robot interaction. Robotics and Autonomous Systems 42(3):191–201, 2003.
[4] M. P. Michalowski, S. Sabanovic, H. Kozima. A dancing robot for rhythmic social interaction. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction. 2007.
[5] M. Bennewitz, W. Burgard, G. Cielniak, S. Thrun. Learning motion patterns of people for compliant robot motion. The International Journal of Robotics Research 24(1):31–48, 2005.
[6] A. Dubois, A. Dib, F. Charpillet. Using HMMs for discriminating mobile from static objects in a 3D occupancy grid. In Proceedings of the 23rd IEEE International Conference on Tools with Artificial Intelligence (ICTAI). 2011.
[7] D. Baronov, J. Baillieul. Reactive exploration through following isolines in a potential field. In Proceedings of the American Control Conference. 2007.
[8] R. Morlok, M. Gini. Dispersing robots in an unknown environment. In Distributed Autonomous Robotic Systems, pp. 253–262. 2007.
[9] M. Andries, F. Charpillet. Multi-robot exploration of unknown environments with identification of exploration completion and post-exploration rendez-vous using ant algorithms. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. 2013.
[10] E. Ferranti, N. Trigoni, M. Levene. Brick&Mortar: an on-line multi-agent exploration algorithm. In Proceedings of the IEEE International Conference on Robotics and Automation. 2007.
[11] B. Yamauchi. A frontier-based approach for autonomous exploration. In Proceedings of the IEEE International Symposium on Computational Intelligence in Robotics and Automation. 1997.
[12] J. Faigl, M. Kulich, L. Preucil. Goal assignment using distance cost in multi-robot exploration. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. 2012.
[13] A. Bautin, O. Simonin, F. Charpillet. MinPos: A Novel Frontier Allocation Algorithm for Multi-robot Exploration. In Proceedings of the 5th International Conference on Intelligent Robotics and Applications. 2012.
[14] W. Burgard, M. Moors, C. Stachniss, F. E. Schneider. Coordinated multi-robot exploration. IEEE Transactions on Robotics 21(3):376–386, 2005.
[15] L. Macedo, A. Cardoso. Exploration of unknown environments with motivational agents. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems. 2004.
[16] M. A. Goodrich, A. C. Schultz. Human-robot interaction: a survey. Foundations and Trends in Human-Computer Interaction, 2007.
[17] H. Asoh, Y. Motomura, F. Asano, et al. Jijo-2: An office robot that communicates and learns. IEEE Intelligent Systems 16(5):46–55, 2001.
[18] H. Takeda, N. Kobayashi, Y. Matsubara, T. Nishida. Towards ubiquitous human-robot interaction. In Working Notes of the IJCAI Workshop on Intelligent Multimodal Systems. 1997.
[19] E. Rohmer, S. P. Singh, M. Freese. V-REP: A versatile and scalable robot simulation framework. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems. 2013.
[20] D. R. Olsen, M. A. Goodrich. Metrics for evaluating human-robot interactions. In Proceedings of PERMIS. 2003.
[21] A. Steinfeld, T. Fong, D. Kaber, et al. Common metrics for human-robot interaction. In Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction. 2006.