Acta Polytechnica doi:10.14311/APP.2016.56.0067 Acta Polytechnica 56(1):67–75, 2016 © Czech Technical University in Prague, 2016 available online at http://ojs.cvut.cz/ojs/index.php/ap ADAPTIVE DISTRIBUTION OF A SWARM OF HETEROGENEOUS ROBOTS Amanda Prorok∗, M. Ani Hsieh, Vijay Kumar General Robotics, Automation, Sensing & Perception (GRASP) Laboratory, University of Pennsylvania, Philadelphia, USA ∗ corresponding author: prorok@seas.upenn.edu Abstract. We present a method that distributes a swarm of heterogeneous robots among a set of tasks that require specialized capabilities in order to be completed. We model the system of heterogeneous robots as a community of species, where each species (robot type) is defined by the traits (capabilities) that it owns. Our method is based on a continuous abstraction of the swarm at a macroscopic level as we model robots switching between tasks. We formulate an optimization problem that produces an optimal set of transition rates for each species, so that the desired trait distribution is reached as quickly as possible. Since our method is based on the derivation of an analytical gradient, it is very efficient with respect to state-of-the-art methods. Building on this result, we propose a real-time optimization method that enables an online adaptation of transition rates. Our approach is well-suited for real-time applications that rely on online redistribution of large-scale robotic systems. Keywords: heterogeneous multi-robot systems; swarm robotics; stochastic systems; task allocation. 1. Introduction Technological advances in embedded systems, such as component miniaturization and improved efficiency of sensors and actuators, are enabling the deploy- ment of very large-scale robot systems, i.e., swarms of robots. However, the smaller we design our plat- forms, the more stringent the tradeoffs we need to make with respect to endowed capabilities. As a consequence, investigators are composing their robot systems with multiple, heterogeneous types of robots in order to tackle increasingly challenging tasks [1, 2]. Our premise is that, in a swarm of robots, one single type of robot cannot cater to all aspects of the task at hand, because at the individual level, it is governed by design rules that limit the scope of its capabilities. In this work, our objective is to distribute a swarm of heterogeneous robots as quickly and efficiently as possible among tasks that require specialized compe- tences. This objective is part of a larger vision to develop control and coordination strategies for teams of heterogeneous robots with specific capabilities. For example, a larger robot may be able to carry more powerful sensors, but may be less agile than its smaller counterpart. Or, we could consider the limited pay- load of aerial robots: If a given task requires rich sensory feedback, multiple heterogeneous aerial robots can complement each other by carrying distinct sen- sors, altogether more than a single one could carry on its own. Initially, we consider tasks that are to be performed in parallel, continuously, and indepen- dently, yet we develop a framework that can easily be extended to accommodate temporal and precedence constraints. Instances of information gathering lend themselves naturally to our problem formulation, with applications to surveillance, environmental monitor- ing, and situational awareness [3–5]. Given a set of tasks, and knowledge about what the task requirements are, our problem considers which robots should be allocated to which tasks. This prob- lem is an instance of the MT-MR-TA: Multi-Task Robots, Multi-Robot Tasks problem [6], and can be reformulated as a set-covering problem that stems from combinatorial optimization. This concept con- siders subsets of robots in a multi-robot system, and pairs them optimally (given a cost function) to tasks. This problem is strongly NP-hard [7]. A number of heuristic algorithms have been proposed. However, the running times of these algorithms are functions of the sizes of the feasible subsets (of robots paired to a task), and hence, become very expensive for large robot teams and swarms. Furthermore, the algorithms are penalized when multiple robot-subset-to-task com- binations are feasible (this is the case, for example, when robots have overlapping capabilities). These algorithms are not suitable for large-scale sys- tems, such as robot swarms. In particular, for systems that are required to adapt to changing task require- ments online, we need to consider algorithms that are efficient and implementable on mobile platforms that run them in real-time. Our method builds on previous work done in the domain of dynamic redistribution of homogenous robot swarms [8–10]. We consider a strategy that is scalable in the number of robots and their capabilities, and is robust to changes in the robot population. The key property of this strat- egy is its inherently decentralized architecture, with robots switching between behaviors (tasks) stochas- tically. This model is inspired by previous work in the swarm-robotic domain that explores self-organized behavior of natural systems [11, 12]. 67 http://dx.doi.org/10.14311/APP.2016.56.0067 http://ojs.cvut.cz/ojs/index.php/ap A. Prorok, M. Ani Hsieh, Vijay Kumar Acta Polytechnica The present work focuses on the optimization of transition rates that enables the robot swarm to con- verge quickly to a configuration that satisfies a de- sired trait distribution. The key difference between our work and previous work is that we formulate our desired state as a distribution of traits among sites, instead of specifying the desired state as a direct mea- sure of the robot distribution. In other words, our framework allows a user to specify how much of a given capability is needed for a given task, irrespec- tive of which robot type satisfies that need. As a consequence, we do not employ optimization methods that utilize final robot distributions in their formula- tions (which is the case in the works presented by [8] and [13]). Instead, we explicitly optimize the distribu- tion of traits, and implicitly solve the combinatorial problem of distributing the right number of robots of a given type to the right tasks. Indeed, we will later show that there are cases where multiple final robot distributions satisfy the desired trait distribution. In such cases, state-of-the-art strategies require that we first determine the best final robot distribution (one that can be reached the fastest), and subsequently optimize transition rates using methods such as in [8]. 2. Problem Formulation Heterogeneity and diversity are core concepts of this work. To develop our formalism, we will borrow termi- nology from biodiversity literature [14, 15]. We define our robot system as a community of robots. Each robot belongs to a species, defining the unique set of traits that encodes the robots’ capabilities. In this work, we will consider binary instantiations of traits (corresponding to the presence or absence of a given trait in a species). As an example, one trait might consider the presence/absence of a particular sensor, such as a camera or laser range finder. Another trait might consider the capability of fitting through a pas- sageway with a fixed width. In this work, we assume that the tasks have been encoded through binary char- acteristics that represent the skill sets critical to task completion. 2.1. Notation We consider a community of S robot species, with a total number of robots N, and N(s) robots per species such that ∑S s=1 N (s) = N. The community is defined by a set of U traits, and each robot species owns a subset of these traits. A species is defined by a vector q(s) = [q(s)1 ,q (s) 2 , ...,q (s) U ]. We can then define a S ×U matrix Q, with rows q(s): Qsu = { 0 if species s does not have trait u, 1 if species s has trait u. We model the interconnection topology of the M sites via a directed graph, G = (E,V) where the set of vertices, V, represents task sites {1, . . . ,M} and the set of edges, E, represents the ordered pairs (i,j), such that (i,j) ∈V×V, and i and j are adjacent. We assume the graph G is a strongly connected graph, i.e., a path exists between any pair of vertices (in contrast to a fully connected graph, where an edge exists between any pair of vertices). In other words, if the nodes in our graph are physically distributed sites, then, via some road, we can reach any other site. We assign every edge in E a transition rate, k (s) ij > 0, where k (s) ij defines the transition probability per unit time for one robot of species s at site i to switch to site j. Here k(s)ij is a stochastic transition rule. We assume every robot has knowledge of G as well as all the transition rates of its species k(s)ij . We note that this information is represented by a small number of values (at most M2 ·S values, or much less if the graph is sparse), and needs to be transmitted to the robots only once, at the start of a run. We impose a limitation on the maximum rate of each edge with k(s)ij < k (s) ij,max. These values can be determined by applying system identification methods on the actual system. For example, in a system where nodes represent physically distributed sites, the transition rate represents the rate with which a specific path is chosen. This value can depend on observed factors, such as typical road congestion or the condition of the terrain. The distribution of the robots belonging to a species s at time t is described by a vector x(s)(t) = [x(s)1 (t), ..., x (s) M (t)] >. Then, if x(s) are the columns of X(t), and q(s) are the rows of Q, we have the M ×U matrix Y that describes the distribution of traits on sites. For time t this relationship is given by Y(t) = X(t) · Q (1) As we will see in Section 2.3, there may be several robot distributions X(t) that satisfy this equation for a given Y(t). 2.2. Problem Statement The initial state of the system is described by X(0), and hence, the initial configuration of traits at the sites is described by Y(0). The time evolution of the number of robots of species s at site i is given by a linear law dx(s)i dt = ∑ ∀j|(i,j)∈E kjix (s) i (t) − ∑ ∀j|(i,j)∈E kijx (s) i (t). (2) Then, for all species s, our base model is given by dx(s) dt = K(s)x(s) ∀s ∈ 1, . . . ,S, (3) where K(s) ∈ RM×M is a rate matrix with the prop- erties K(s) > 1 = 0, (4) K(s)ij ≥ 0 ∀(i,j) ∈E. (5) 68 vol. 56 no. 1/2016 Adaptive Distribution of a Swarm of Heterogeneous Robots These two properties result in the following definition: K(s)ij =   k (s) ji , if i 6= j, (i,j) ∈E, 0, if i 6= j, (i,j) /∈E, − ∑M i=1,(j,i)∈E k (s) ij if i = j. Since the total number of robots and the number of robots per species is conserved, the system in Eq. 3 is subject to the constraint X> · 1 = [N(1),N(2), . . . ,N(S)]>. (6) The goal is to find an optimal rate matrix K(s)F for each species s so that we have Ȳ = X̄ · Q. (7) In other words, the task is to redeploy the robots of each species configured according to X(0) initially, so that a desired trait configuration Ȳ is reached. In doing this, we reach a robot configuration X̄ that satisfies Eq. 1, subject to Eq. 6. 2.3. System Properties Since we describe the desired configuration of our system through Ȳ, the final robot distribution X̄ is not known a priori. Given knowledge of Q we can infer the following properties: If a solution to Eq. 7 subject to Eq. 6 exists, then (1.) If rank(Q) < S, the system is underdetermined, and an infinite number of solutions X̄ will satisfy Eq. 7. In other words, at least one species in the system can be replaced by a combination of the other species. (2.) If rank(Q) = S, only one solution X̄ exists that satisfies Eq. 7. In other words, no species in the system can be expressed as a combination of the other species. We note that solving case (1) is relevant when we embed redundancy into the robot system, and case (2) is relevant when we consider fully complementary robot species. 3. Methodology In this section, we describe our methodology for ob- taining an optimal transition matrix K(s)F for each species so that the desired trait distribution is reached. Berman et al. [8] present an exposé of optimization methods that can be used to obtain optimal transition rates for a homogenous robot swarm that is required to converge to a desired distribution. Two general approaches are considered: convex optimization and stochastic optimization. The convex optimization ap- proach requires knowledge of the desired final robot distribution. Indeed, our problem formulation spec- ifies a desired trait distribution Ȳ without explicit definition of the final robot distribution X̄. Hence, convex optimization strategies as in [8] are not appli- cable to our problem, unless rank(Q) = S, and we can infer X̄. Given this rationale, we choose an optimiza- tion approach that is able to find optimal transition rates with knowledge of Ȳ and X(0), without knowl- edge of X̄. Although fully stochastic schemes such as Metropolis optimization have been shown to pro- duce similar results [8], they are not computationally efficient, and are ill-suited to real-time applications. In the following, we present a differentiable objective function that can be efficiently minimized through gra- dient descent techniques. We show that our method has a computational complexity that is well-suited to real-time applications. Additionally, our method explicitly minimizes the convergence time of K(s), un- like the convex optimization methods presented in [8] which approximate K(s) with a symmetric equivalent (forcing bidirectionally equal transition rates between sites). 3.1. Design of Optimal Transition Rates We combine the solution of the linear ordinary differ- ential equation, Eq. 3, and Eq. 7 to obtain the solution for a desired trait distribution Ȳ = S∑ s=1 eK (s)Fτ x(s)0 · q(s), (8) where τ is the time at which the desired state is reached. We design our objective function as follows. To find the values of K(s)F for all species for the given initial configuration, X(0), we consider the following optimization minimize J (1) = ∥∥∥∥Ȳ − S∑ s=1 eK (s)τ x(s)0 · q(s) ∥∥∥∥2 F (9) such that k(s)ij < k (s) ij,max, which formulates that a minimum cost is found when the final trait distribution corresponds to the desired trait distribution, subject to maximum transition rates k (s) ij,max. The notation x (s) 0 is shorthand for x(s)(0). The operator ‖ · ‖F denotes the Frobenius norm of a matrix. There is no closed-form solution to the optimization problem in Eq. 9, but we can use the derivatives of J (1) with respect to the parameters to perform gradient descent. So that the implementation of the optimization function is efficient, it is impor- tant that the function is differentiable and that an analytical gradient can be computed. By applying the chain rule, the derivative of our objective function with respect to the transition matrix K(s) is ∂J (1) ∂K(s) = ∂J (1) ∂eK (s)τ · ∂e K(s)τ ∂K(s)τ · ∂K (s)τ ∂K(s) . (10) We first compute the derivative of the cost with respect to the expression eK (s)τ: ∂J (1) ∂eK (s)τ = 2 [∑ r∈S eK (r)τ x(r)0 · q(r) − Ȳ ] · [ x(s)0 · q(s) ]> . (11) 69 A. Prorok, M. Ani Hsieh, Vijay Kumar Acta Polytechnica The derivation of the 2nd element of Eq.10 requires the derivative of the matrix exponential. Computing the derivative of the matrix exponential is not trivial. We adapt the closed-form solution given in [16] to our problem, and write the gradient of our cost function as ∂J (1) ∂K(s) = V−1> [ V> ∂J (1) ∂eK (s)τ V−1> � W(τ) ] V>τ, (12) where � is the Hadamard product, K(s) = VDV−1 is the eigendecomposition of K(s). V is the M ×M matrix whose jth column is a right eigenvector corre- sponding to eigenvalue di, and D = diag(d1, . . . ,dM ). The matrix W(t) is composed as follows 1 W(t) = { (edit −edjt)/(dit−djt) if i 6= j, edit if i = j. 3.2. Optimization of Convergence Time The cost function in Eq. 9 does not consider conver- gence time τ as a variable. By adding a term that pe- nalizes high convergence time values, we can compute transition rates that explicitly optimize convergence time. The modified objective function is minimize J (2) = J (1) + ατ2 (13) such that k(s)ij < k (s) ij,max and τ > 0, and α > 0. By increasing α, we increase the impor- tance of the convergence time (by penalizing high values of τ). The derivative with respect to the tran- sition rates is ∂J (2) ∂K(s) = ∂J (1) ∂K(s) . (14) In order to optimize the convergence time, we need the derivative with respect to τ. This derivative is computed analogously to the derivative with respect to K(s) (confer Eq. 12). We have ∂J (2) ∂τ = ∂J (1) ∂τ + 2ατ (15) with ∂J (1) ∂τ = S∑ s=1 1>V−1>A1V> · K(s)1 (16) and A1 = V> ∂J (1) ∂eK (s)τ V−1> � W(τ). (17) The optimization of Eq. 13 produces transition rates that lead to the desired trait distribution quickly, but there is no guarantee that this is the steady state of K(s). If we compute the transition rates at the outset 1Here, we assume that that K(s) has M distinct eigenvalues. If this is not the case, an analogous decomposition of K(s) to Jordan canonical form is possible, as elaborated in [16]. We note that for most models of interest, however, this is rarely the case. of the experiment (without refining them online), we may wish to ensure that the state reached at the optimal time tF remains near-constant. Hence, we modify our cost function in Eq. 13 as follows. minimize J (3) = J (2) + β S∑ s=1 ∥∥eK(s)τ x(s)0 (18) −eK (s)(τ+ν)x(s)0 ∥∥2 2 such that k(s)ij < k (s) ij,max and τ > 0, and β > 0. The additional term in our cost func- tion allows us to ensure that the robot distribution reached by employing K(s)F remains near-constant for arbitrarily long time intervals ν. This is possible because our model in Eq. 3 is stable [9], and the dif- ference between the current robot distribution and the one at steady state can only decrease monoton- ically over time. By increasing the value of β, the difference of the robot distributions at times τ and τ + ν is decreased. In other words, as we will see in Section 4, the trait distribution corresponding to the steady state robot distribution of K(s) gets arbitrarily close to the desired trait distribution Ȳ as β increases (the same is true when we increase ν). Note that when α = 0, β should not be infinitely large, as in this case, K(s)F = 0. However, for all practical purposes β is bounded and α > 0. Let us refer to this additional third term of J (3) (and second term of Eq. 18) as J (3,3). Then, the derivative of the new objective function with respect to the transition rates can be expressed as ∂J (3) ∂K(s) = ∂J (2) ∂K(s) + ∂J (3,3) ∂K(s) . (19) Again, we apply the chain rule to obtain ∂J (3,3) ∂K(s) = ∂J (3,3) ∂eK (s)τ ∂eK (s)τ ∂K(s)τ ∂K(s)τ ∂K(s) (20) − ∂J (3,3) ∂eK (s)(τ+ν) ∂eK (s)(τ+ν) ∂K(s)(τ + ν) ∂K(s)(τ + ν) ∂K(s) . The outer derivative is ∂J (3,3) ∂eK (s)τ = ∂J (3,3) ∂eK (s)(τ+ν) (21) = 2β [ eK (s)τ x(s)0 −eK (s)(τ+ν)x(s)0 ] · x(s)0 > . We apply the same development as in Eq. 12 to obtain the equation ∂J (3,3) ∂K(s) = V−1>A2V> · τ − V−1 >A3V> · (τ + ν) (22) with A2 = V> · ∂J (3,3) ∂eK (s)τ · V−1> � W(τ) (23) 70 vol. 56 no. 1/2016 Adaptive Distribution of a Swarm of Heterogeneous Robots and A3 = V> · ∂J (3,3) ∂eK (s)(τ+ν) · V−1> � W(τ + ν). (24) For all above cost functions, z = 1, 2, 3, the derivative with respect to the off-diagonal elements ij of the matrix K(s), with (i,j) ∈E, is ∂J (z) ∂K(s)ij = { ∂J (z) ∂K(s) } ij − { ∂J (z) ∂K(s) } jj , (25) where {·}ij denotes the element on row i and column j. The derivative with respect to time τ is analogous. Finally, we summarize our optimization problem as follows: K(s)F,τF = argmin K(s),τ J (3), (26) under the constraints shown in Eq. 18. To solve the system, we implement a basin-hopping optimiza- tion algorithm [17], which attempts to find the global minimum of a smooth scalar function. Locally, our basin-hopping algorithm uses a quasi-Newton method (namely, the Broyden-Fletcher-Goldfarb-Shanno algo- rithm [18] with bound constraints). 3.3. Computational Complexity The computational complexity of computing the gra- dient of our objective function is O(S ·M3 +S ·M2·U). The first part of this complexity is dictated by the eigenvalue decomposition, which is known to be O(M3) for non-sparse matrices [19]2. We compute this decomposition only once per optimization (see Eq. 12, where K(s) = VDV−1), for each optimization of K(s). The second part is dictated by the multipli- cation of the matrices in Eq. 12, for which the cost is O(M2 · U). Globally speaking, the computation grows linearly with the number of species and traits, and it grows cubically with respect to the number of tasks. Overall, for the results shown in Section 4, the average time to compute the gradient for a system with M = 8, U = 4, and S = 4 is around 1.35 ms with ν = 0, and 2.2 ms with ν > 0 (the number of param- eters to optimize can be as large as 225 in this case, depending on the graph’s adjacency matrix). The code was implemented in Python using the NumPy and SciPy libraries, and tested on a 2 GHz Intel Core i7 using a single CPU. 3.4. Continuous Optimization of K(s) As shown above, the optimization of the objective function J (3) is efficient and can be performed in real- time. Building on this result, we can implement a continuous, online optimization strategy that allows us to refine the optimal K(s)F as a function of the current state. In noisy systems, where the trajectories of individual agents exhibit deviations from predicted 2In the special case where all eigenvalues are dis- tinct, the eigenvalue decomposition can be reduced to O(M 2.376 log(M )) [20]. macroscopic trajectories, this strategy inevitably leads to an improvement of the convergence time. By taking the actual robot distribution into account, it can re- compute updated optimal transition rates. Practically, we initially compute K(s)F at time t = 0 to control the system over a finite period δ from t = 0 to t = δ. After that period (at time t = δ), we optimize a new value of K(s)F that controls the system for the next period, as a function of the actual robot distribution that was encountered at time t = δ. This process can be repeated indefinitely. The value δ is called the sampling time. Formally, we write our control policy as K(s)F(t),τF(t) = argmin K(s),τ J̃ (3)(X(tp)) (27) with tp ≤ t < tp + δ tp ∈ kδ,k ∈ N, where we rewrite our cost equation as a function of the robot distribution J̃ (3)(X) = ∥∥∥Ȳ − S∑ s=1 eK (s)τ x(s) · q(s) ∥∥∥2 F + ατ2 (28) + β S∑ s=1 ∥∥∥eK(s)τ x(s) −eK(s)(τ+ν)x(s)∥∥∥2 2 . Note that J̃ (3)(X(0)) is equal to J (3). In practice, if δ is small with respect to the rates at which robots transition between sites (cf. kij,max), we set β = 0, since the continuous optimization of K(s) enforces a current state that is close to the desired state, irre- spective of the steady-state distribution. For cases where the optimization time becomes large (implying that δ also becomes large), we either need to set β > 0, or use a strategy that accounts for computation delay, such as those presented in [21]. Also, we note that we can accelerate the computations by setting the initial values of the present sampling window with optimized values of the preceding sampling window (i.e., warm start). 4. Results Previous work has shown the benefit of validating methods over multiple, complementary levels of ab- straction (sub-microscopic, microscopic, and macro- scopic) [22]. In the present work, we propose an evaluation of our methods on two levels: microscopic and macroscopic. Indeed, the most efficient way of simulating the swarm of robots is by considering a continuous macroscopic model, derived directly from the ordinary differential equation, Eq. 3. In order to validate the control policy at a lower level of ab- straction, we also implement a discrete microscopic model that emulates the behavior of individual robot controllers. This agent-level control is based on the transition rates k(s)ij encoded by the transition matrix K(s): A robot of species s at site i transitions to site 71 A. Prorok, M. Ani Hsieh, Vijay Kumar Acta Polytechnica ©5 ©6 (a) 5 ©6 (b) Figure 1. A strongly connected instance of a graph with 10 nodes representing spatially distributed sites (nodes) and their paths (edges). The system includes 4 traits. The trait abundance at each node is represented by a bar plot. Nodes 5 and 6 are highlighted. (a) Initial configuration (b) Desired final configuration. j according to probability p(s)ij that is an element of the matrix P(s) = eK (s)∆T , where ∆T is the dura- tion of one time-step. Running multiple iterations of the microscopic model enables us to capture the stochasticity resulting from our control system. Our performance metric considers the degree of con- vergence to Ȳ, expressed by the fraction of misplaced traits µ(Y) = ‖(Y − Ȳ)‖1 ‖Y‖1 . (29) We say that one system converges faster than another if it takes less time for µ(Y) to decrease to some relative error µthresh, such as µthresh = 5%. Similar performance metrics have been proposed in [8–10]. We will consider three optimization methods, two of which stem from this paper, and one of which stems from [8]: Fixed-NC — We consider the time-constant, non- convex optimization problem posed in Eq. 26, with α = 1, β = 5, and ν = 2, producing a fixed K(s)F for each species. These values were not tuned in any way to improve performance. Adaptive-NC — We consider the time-continuous optimization problem in Eq. 27, with α = 1, β = 0, and a sampling time δ = 0.08 s, producing an adaptive control policy K(s)F(t) for each species. Fixed-C — We adapt the convex optimization method presented in [8], denoted in the latter work as [P1]. This adapted method optimizes the asymptotic convergence rate (of a system of ho- mogenous robots) by minimizing the second eigen- value λ2 of a symmetric matrix S(s), such that λ2(S(s)) ≥ Re(λ2(K(s))). Since this method re- quires the knowledge of the desired species distri- bution X̄, we compute a random instantiation of X̄ that satisfies the desired trait distribution defined by Eq. 7. We note that in practical applications, computing a good instantiation of X̄ is not trivial. For all optimization methods, we set k(s)ij,max = 2 [s −1]. 4.1. Example To illustrate our method, we consider an example of N = 3893 robots moving between 10 sites. We sample a random initial robot distribution X(0), and generate a random, feasible desired trait distribution Ȳ. The initial trait distribution is visualized in Fig. 1(a), and the desired trait distribution is visualized in Fig. 1(b). The graph is generated randomly according to the Watts-Strogatz model [23] (with a neighboring node degree of K = 3, and a rewiring probability of γ = 0.6; the graph is guaranteed to be connected). The robot community consists of 3 species and 4 traits, and is defined as follows: Q =   1 0 1 01 0 0 1 0 1 0 1   with X> · 1 = [987, 1490, 1416]> (30) We solve the system for Ȳ as shown in Fig. 1(b). We show the evolution of the robot distribution in Fig. 2(a) and trait distribution in Fig. 2(b)—to avoid clutter, we plot two selected sites (5 and 6) and one selected trait (corresponding to the 4th trait, shown as the top bar in cyan in Fig. 1). The plots demonstrate a good agreement between the macroscopic model and the discrete microscopic model. Fig. 3 shows the ratio of misplaced traits µ(Y) over time for the initial and desired trait distributions de- picted in Fig. 1. The macroscopic model demonstrates that the system successfully achieves a negligible er- ror. We run 20 iterations of the discrete microscopic model, once with method Fixed-NC, and once with 72 vol. 56 no. 1/2016 Adaptive Distribution of a Swarm of Heterogeneous Robots Microscopic Macroscopic N um be r of ro bo ts Time [s] 6 5 (a) Microscopic Macroscopic N um be r of tr ai ts Time [s] 6 5 (b) Figure 2. The plot shows the macroscopic model as well as the average over 20 iterations of the microscopic model, on the graph shown in Fig. 1. We plot values for nodes 5 and 6, which are highlighted in the graph in Fig. 1. (a) Number of robots of species 1 present at nodes 5 and 6 (b) Number of trait 4 (top bar in cyan in Fig. 1) present at nodes 5 and 6. µ (Y ) Time [s] Mic. Fixed-NC Mic. Adapt.-NC Mac. Fixed-NC Figure 3. Ratio of misplaced traits over time for the initial and desired trait distributions depicted in Fig. 1. The simulation was run with 3893 robots and a total of 7786 present traits. The plot shows the macroscopic model as well as the average over 20 iterations of the microscopic model with and without continuous optimization of transition rates. The errorbars show the standard deviation. method Adaptive-NC. Since the adaptive optimiza- tion method takes the current robot configuration into account, it produces transition rates that lead to the desired configuration faster. Also, since it continuously optimizes toward the desired final trait distribution, the error remains low. We note that, due to the stochasticity of the microscopic model, the error ratio (which counts absolute differences) will always be larger than 0. 4.2. Comparison of Methods We compare the three optimization methods, Fixed- NC, Adaptive-NC, Fixed-C and evaluate their per- formance with respect to the metric in Eq. 29. We instantiate 40 random graphs with M = 6 nodes, and random matrices Q with S = 4 species and U = 4 traits, and generate random desired trait distributions Ȳ for each graph. The microscopic model is iterated 4 times on each graph instantiation. For the method Fixed-C, we compute a random robot distribution X̄ that satisfies the desired trait distribution. We mea- sure the time tµ,thresh at which the system converges to a value µthresh = 5% of misplaced traits. Since we sample random matrices Q, we obtain different rank values. Fig. 4(a) shows results for rank(Q) = 3, i.e., a system with redundant species, and Fig. 4(b) shows results for rank(Q) = 4, i.e., a system with comple- mentary species (cf. the description in Sec. 2.3). 73 A. Prorok, M. Ani Hsieh, Vijay Kumar Acta Polytechnica t µ ,t hr es h [s ] Fixed-C Fixed-NC Adapt.-NC (a) Fixed-C Fixed-NC Adapt.-NC (b) Figure 4. The plot shows the convergence time of three optimization methods, evaluated on the microscopic model, with tµthresh for µthresh = 0.05, for 40 random graphs with M = 6 and random matrices Q with 4 species and 4 traits. The microscopic model was iterated 4 times over each graph instantiation. (a) rank(Q) = 3 (b) rank(Q) = 4. The boxplots show the median and the 25th and 75th percentiles. The plots show that our method Fixed-NC is able to improve upon Fixed-C: for rank(Q) = 4 by 16%, and for rank(Q) = 3 by 25%. The stronger improve- ment in the lower-rank case points towards the impor- tance of finding a good final robot distribution X̄ when several are possible that satisfy Eq. 7. The results for Adaptive-NC confirm the fast convergence towards desired trait distributions, with a 75% improvement over Fixed-NC in both cases. It is clear that this method outperforms the other two methods because of its ability to take into account the current state of the robot distribution, and to adapt the transition rates as a function of this. Finally, we compute the error obtained by our method Fixed-NC by comparing the analytical steady-state distribution of traits (obtained by tak- ing the eigenvectors that correspond to the zero- eigenvalues of each rate matrix K(s) and multiply- ing them by Q) with the desired trait distribution Ȳ. The median, 90th percentile and maximum error from the steady-state to the desired trait distribution are 0.108%, 0.572% and 0.812%, respectively. These results demonstrate that, despite the fact that our method is not explicitly optimizing for the steady- state, it reaches a steady-state error smaller than system noise (at steady-state). 5. Conclusion We present a method that distributes a swarm of het- erogeneous robots among a set of tasks with the goal of satisfying a desired distribution of robot capabilities among those tasks. We propose a formulation for het- erogeneous robot systems through species and traits, and show how this formulation is used to achieve an optimal distribution of robots as a function of a de- sired final trait configuration. To find the optimal transition rates, we pose an optimization problem, and develop a solution based on an analytical gradient that is computationally efficient and capable of pro- ducing fast convergence times. Building on this result, we propose a variant real-time optimization method that enables an online adaptation of transition rates as a function of the state of the current robot distri- bution. We validate our approach on random graph instantiations, and show that our baseline method outperforms a classical alternative approach. Also, we show how, when using the variant adaptive optimiza- tion, a significant gain in convergence speed is made. We believe that this method is well-suited to appli- cations that control large-scale teams of robots that need to converge quickly to desired configurations as a function of their capabilities, and that need to adapt to changes in real-time. Future work will include a more in-depth study of the implications of diversity in swarm-robotic systems, as well as an implementation of the proposed framework on real robots. Acknowledgements We gratefully acknowledge the support of ONR grants N00014-15-1-2115 and N00014-14-1-0510, ARL grant W911NF-08-2-0004, NSF grant IIS-1426840, and Ter- raSwarm, one of six centers of STARnet, a Semiconductor Research Corporation program sponsored by MARCO and DARPA. 74 vol. 56 no. 1/2016 Adaptive Distribution of a Swarm of Heterogeneous Robots References [1] T. Balch, L. E. Parker. Special issue on Heterogeneous Multi-Robot Systems. Autonomous Robots 8:207–383, 2000. [2] E. G. Jones, B. Browning, M. B. Dias, et al. Dynamically Formed Heterogeneous Robot Teams Performing Tightly Coordinated Tasks. International Conference on Robotics and Automation (ICRA) pp. 1–8, 2006. doi:10.1109/ROBOT.2006.1641771. [3] B. Charrow. Information-theoretic active perception for multi-robot teams. Ph.D. thesis, University of Pennsylvania, 2015. [4] A. M. Hsieh, A. Cowley, F. J. Keller, et al. Adaptive teams of autonomous aerial and ground robots for situational awareness. Journal of Field Robotics 24:991–1014, 2007. doi:10.1002/rob.20222. [5] P. Tokekar, J. Vander Hook, D. Mulla, V. Isler. Sensor Planning for a Symbiotic UAV and UGV system for Precision Agriculture. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2013. doi:10.1109/IROS.2013.6697126. [6] B. P. Gerkey, M. J. Mataric. A Formal Analysis and Taxonomy of Task Allocation in Multi-Robot Systems. Interntional Journal of Robotics Research 23(9):939–954, 2004. doi:10.1177/0278364904045564. [7] B. Korte, J. Vygen. Combinatorial Optimization: Theory and Algorithms. Springer-Verlag, Berlin., 2000. [8] S. Berman, Á. Halasz, M. A. Hsieh, V. Kumar. Optimized Stochastic Policies for Task Allocation in Swarms of Robots. IEEE Transactions on Robotics 25:927–937, 2009. doi:10.1109/TRO.2009.2024997. [9] Á. Halasz, M. A. Hsieh, S. Berman, V. Kumar. Dynamic Redistribution of a Swarm of Robots Among Multiple Sites. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2007. doi:10.1109/IROS.2007.4399528. [10] M. A. Hsieh, Á. Halasz, S. Berman, V. Kumar. Biologically inspired redistribution of a swarm of robots among multiple sites. Swarm Intelligence 2(2-4):121–141, 2008. doi:10.1007/s11721-008-0019-z. [11] M. J. B. Krieger, J. B. Billeter, L. Keller. Ant-like task allocation and recruitment in cooperative robots. Nature 406:992–995, 2000. [12] T. H. Labella, M. Dorigo, J.-L. Deneubourg. Division of labor in a group of robots inspired by ants’ foraging behavior. ACM Transactions on Autonomous Adaptive Systems 1:4–25, 2006. [13] L. Matthey, S. Berman, V. Kumar. Stochastic strategies for a swarm robotic assembly system. In IEEE International Conference on Robotics and Automation (ICRA), pp. 1953–1958. IEEE, 2009. doi:10.1109/ROBOT.2009.5152457. [14] D. Tilman. Functional Diversity. Encyclopedia of Biodiversity 3:109–120, 2001. [15] O. L. Petchey, K. J. Gaston. Functional diversity: back to basics and looking forward. Ecology Letters 9(6):741– 758, 2006. doi:10.1111/j.1461-0248.2006.00924.x. [16] J. D. Kalbfleisch, J. F. Lawless. The Analysis of Panel Data Under a Markov Assumption. Journal of American Statistical Association 80:863–871, 1985. [17] D. J. Wales, J. P. K. Doye. Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. arXivorg pp. 1–8, 1998. [18] A. Mordecai. Nonlinear programming: analysis and methods. Courier Corporation, 2003. [19] J. Demmel, I. Dumitriu, O. Holtz. Fast Linear Algebra is Stable. arXivorg pp. 1–26, 2007. [20] V. Y. Pan, Z. Q. Chen. The Complexity of the Matrix Eigenproblem. ACM Symposium on Theory of Computing pp. 507–516, 1999. [21] M. Milam, R. Franz, J. Hauser, M. Murray. Receding horizon control of vectored thrust flight experiment. IEE Proceedings on Control Theory and Applications 152:340–348, 2015. [22] A. Martinoli. Swarm Intelligence in Autonomous Collective Robotics: From Tools to the Analysis and Synthesis of Distributed Collective Strategies. Ph.D. thesis, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1999. [23] D. J. Watts, S. H. Strogatz. Collective dynamics of ‘small-world’ networks. Nature 393:440–442, 1998. 75 http://dx.doi.org/10.1109/ROBOT.2006.1641771 http://dx.doi.org/10.1002/rob.20222 http://dx.doi.org/10.1109/IROS.2013.6697126 http://dx.doi.org/10.1177/0278364904045564 http://dx.doi.org/10.1109/TRO.2009.2024997 http://dx.doi.org/10.1109/IROS.2007.4399528 http://dx.doi.org/10.1007/s11721-008-0019-z http://dx.doi.org/10.1109/ROBOT.2009.5152457 http://dx.doi.org/10.1111/j.1461-0248.2006.00924.x Acta Polytechnica 56(1):67–75, 2016 1 Introduction 2 Problem Formulation 2.1 Notation 2.2 Problem Statement 2.3 System Properties 3 Methodology 3.1 Design of Optimal Transition Rates 3.2 Optimization of Convergence Time 3.3 Computational Complexity 3.4 Continuous Optimization of K(s) 4 Results 4.1 Example 4.2 Comparison of Methods 5 Conclusion Acknowledgements References