Analysis of Rate-Based Pull and Push Strategies with Limited Migration Rates in Large Distributed Networks W. Minnebo and B. Van Houdt Department of Mathematics and Computer Science University of Antwerp - iMinds Middelheimlaan 1, B-2020 Antwerp, Belgium {wouter.minnebo,benny.vanhoudt}@uantwerpen.be ABSTRACT In this paper we analyze the performance of pull and push strategies in large homogeneous distributed systems where the number of job transfers per time unit is limited. Job transfer strategies which rely on lightly-loaded servers to attract jobs from heavily-loaded servers are known as pull strategies, whereas for push strategies the heavily loaded servers initiate the job transfers to lightly loaded servers. To this end, servers transmit probe messages to discover other servers that are able to take part in a job transfer. Previous work on rate-based pull and push strategies focused on the impact of the probe rate on the mean job response time. In this paper we also limit the overall migration rate and show that any predefined migration rate can be matched by both the rate-based pull and push strategies. We present closed form formulas for the mean response time (as a func- tion of the allowed probe and migration rate) and validate their accuracy by simulation. We also introduce and analyze a new pull strategy and show that under high loads it is superior to the push strategies considered, while the push strategies offer only a very lim- ited gain for medium to low load scenarios. Categories and Subject Descriptors C.4 [Computer Systems Organization]: Performance of Systems; D.4.8 [Operating Systems]: Performance Keywords Distributed computing, performance analysis, processor schedul- ing 1. INTRODUCTION In order to optimally use the available resources in a dis- tributed network it is desirable to be able to dynamically relocate jobs among a large number of processing nodes. Jobs may enter the network via one or multiple central dispatchers (e.g., [4,10,12,14,15]) or via the processing nodes themselves (e.g., [2, 3, 8, 13]). A central dispatcher will dis- tribute the jobs among the nodes using some load balancing algorithm. In a more distributed approach, the nodes them- selves will arrange for jobs to relocate after they are sched- uled. Two approaches are common: push and pull. In a push variant (or work sharing) highly loaded nodes attempt to find lightly loaded nodes to migrate jobs to. Pull variants (or work stealing) reverse the roles, so that lightly loaded nodes try to attract work from the highly loaded nodes. Several authors studied the performance of push and pull strategies. A comparison for a homogeneous distributed sys- tem with Poisson arrivals and exponential job lengths was presented in [1, 2] and extended to heterogeneous systems in [9, 11]. These studies showed that the pull strategy is su- perior under high load conditions, while the push strategy achieves a lower mean delay under low to moderate loads. Nodes typically communicate by means of probe messages, exchanging information such as queue length. For simplic- ity we assume that sending/receiving probe messages is in- stantaneous and does not incur an extra computational or bandwidth cost. When a node wants to push or pull a job, it probes a random other node to see if a transfer between the nodes would be allowed. Under a traditional pull or push strategy a server sends a maximum of Lp probes the instant its last job completes or the instant a job arrives when the server is already busy [1,2]. The fraction of queues sending probe messages is different, and as a result pull and push strategies achieve a different overall probe rate for the same load of the system. This makes a performance comparison biased, as sometimes the strategy with the higher probe rate is best [5]. In [5] rate-based pull and push variants are introduced that can match any predetermined probe rate R, allowing the comparison of pull and push strategies when they use the same number of probes. In these variants, probes are no longer sent at job arrival or completion times but at a fixed rate r as long as the server is idle (for pull) or has jobs waiting (for push). The main result in [5] showed that the rate-based push strategy results in a lower mean delay if and only if λ < √ (R + 1)2 + 4(R + 1) − (R + 1) 2 , under the so-called infinite system model and that a hybrid pull/push strategy is always inferior to the pure pull or push strategy. VALUETOOLS 2015, December 14-16, Berlin, Germany Copyright © 2016 ICST DOI 10.4108/eai.14-12-2015.2262564 In [6] the model was extended with an extra parameter T, where a node is considered highly loaded if it has more than T jobs. This allowed the construction of the max-push strat- egy that extended the range of λ values where the push vari- ants outperformed the pull strategy. All prior work, including [5, 6], assumed zero cost for job transfers, which is not always realistic. When jobs are dif- ficult to migrate, it would be desirable to be able to limit migrations to a predefined overall migration rate M, while not exceeding the predefined overall probe rate R. This paper makes the following contributions: 1. We indicate how to set the parameter r (and T) of the push, pull and max-push strategy to match any predefined migration rate M. 2. We argue that setting T = 1 for the pull strategy is no longer optimal when an overall migration limit M is considered, as was the case in [6] and introduce a new pull strategy, called the conditional-pull strategy. 3. We show that the conditional-pull strategy is equiva- lent in stationary queue length distribution to the max- push variant when the overall probe rate R tends to infinity, i.e., when only an overall migration limit M is considered. 4. We consider a system where both an overall probe limit (R) and an overall migration limit (M) are imposed. For this system we compare the performance of push and pull strategies. We find that even for moderate R the conditional-pull performs almost as well as the max-push for low to moderate loads, and performs sig- nificantly better for higher loads. The paper is structured as follows. Section 2 summarizes the rate-based strategies considered in this paper. In Sec- tion 3 we briefly summarize earlier work concerning rate- based pull and push strategies, and introduce an overall mi- gration limit M for these strategies. Also, we derive an ex- pression for the corresponding maximum probe rate rboth|M for the rate-based push and pull, and rewrite the mean delay in an equivalent form. In Section 4 we adapt the max-push strategy to match M by finding the corresponding probe rate rmp|M , after summarizing earlier work. Section 5 con- siders pull strategies with T > 1, and introduces the new conditional-pull strategy. It is shown that the conditional- pull is equivalent to the max-push strategy in case there is no probe limit R and only a migration limit M. In addi- tion, the infinite system model describing the evolution of the conditional pull strategy is numerically validated, and argued to be the proper limiting process as the system size tends to infinity. Finally, we compare the mean delay of max-push and conditional-pull in Section 6. 2. RATE BASED STRATEGIES We consider a continuous-time system of N queues, where each queue has a single server and infinite buffer. Each queue operates under Poisson job arrivals with rate λ < 1, and exponential service time with mean 1. Jobs are processed in a first-come-first-served order. Traditional strategies send a maximum of Lp probes the instant a server’s last job completes or the instant a job arrives when the server is already busy. In contrast, under rate-based strategies probes are no longer sent at job arrival or completion times but at a fixed rate r as long as the server is idle (pull) or has at least T jobs waiting (push). More formally, probe messages are transmitted by a server according to an interrupted Poisson process with rate r. The strategies considered in this paper can be summarized as follows: 1. Rate-based Push: As soon as the queue length exceeds T, a server starts to generate probe messages according to a Poisson process with rate r. Whenever the queue length drops below T, this process is interrupted until the queue length exceeds T again. The node that is probed is selected at random and is only allowed to accept a job if it is idle. 2. Rate-based Pull: Whenever a server is idle it generates probe messages according to a Poisson process with rate r. This process is interrupted whenever the server is busy. The node that is probed is selected at random and is only allowed to transfer one of its jobs if its queue length exceeds T. 3. Max-Push: The instant a new job arrives at a queue with length T, probes are sent at an infinite rate. When λ < 1 this corresponds to stating that the job is instantaneously transferred to an empty server. A server with T jobs in its queue, generates probe mes- sages according to a Poisson process with rate r. When- ever the queue length drops to T − 1, this process is interrupted as long as the queue length remains below T. The node that is probed is selected at random and is only allowed to accept a job if it is idle. 4. Conditional-Pull: Whenever a node is idle, the node will generate probe messages according to a Poisson process with rate r. This process is interrupted when- ever the server becomes busy. The probed node is selected at random and the probe is always success- ful if there are at least T jobs waiting to be served, and successful with some probability p (matching M, see (26)) if there are exactly T − 1 jobs waiting to be served. We do not consider hybrid strategies, which combine both push and pull behavior. These were proven to be inferior to a pure push or pull strategy when T = 1 [5, Theorem 4]. 3. PULL AND PUSH STRATEGIES Infinite system models and closed form solutions for both pull and push strategies were introduced in [5] and [6]. Be- fore introducing new constraints and strategies, we briefly summarize the main findings of [5] and [6]. The evolution of both the rate-based pull and push strat- egy under the infinite system model is described by a set of ODEs denoted as d dt x(t) = F(x(t)), where x(t) = (x1(t), x2(t), . . .) and xi(t) represents the fraction of the number of nodes with at least i jobs at time t. From [6, Theorem 2 and 3], it is known that d dt x(t) = F(x(t)) has a unique fixed point π̄ = (π̄1, π̄2, . . .) with ∑ i≥1 π̄i < ∞ that is a global attractor, given by π̄i = λ ( (1 + r)λi−1 −rλT ) 1 + r(1 −λT ) 1 ≤ i ≤ T + 1, (1) π̄i = π̄T+1 ( λ 1 + (1 −λ)r )i−T−1 i > T + 1. (2) This fixed point is used in conjunction with Little’s Law in [6, Corollary 1] to formulate the mean delay Dboth of a job under the push or pull strategy: Dboth = 1 1 −λ − rλT ( λ (1−λ)(1+r) + T ) 1 + r(1 −λT ) . (3) From the relationships R = (1 − π̄1)rpull|R and R = rpush|Rπ̄T+1, we find R = (1 −λ)rpull|R, (4) and R = λT+1 (1 −λT ) + 1/rpush|R . (5) It follows that whenever R > λT+1/(1−λT ), the rate rpush|R can be chosen arbitrarily large (i.e., rpush|R = ∞). 3.1 Limiting the Overall Migration Rate When the overall migration rate is limited, the choice of r must satisfy this constraint. We first indicate how to set r to match M, and rewrite the formula for the mean delay. Then we show whether R or M is the strictest constraint for a given load λ. Theorem 1. Both rate-based pull and push strategies match a predefined migration rate M by letting the probe rate r = rboth|M , with rboth|M : rboth|M = M (λ(1 −λ) + M)λT −M . (6) For this setting, both strategies achieve the same mean delay. Proof. The relationship (6) readily follows from the for- mulation of the overall migration rate for both rate-based strategies: Mboth = r(1 −λ)λT+1 r(1 −λT ) + 1 . (7) From a push perspective this equation describes the fraction of nodes with queue length larger than or equal to T + 1 (π̄T+1 = λ T+1/(1 + r(1 − λT )) from (1)) sending probes at rate r, succeeding with probability (1 − λ). For a pull strategy the overall migration rate is expressed as the frac- tion of empty queues (1 − λ) sending probes at rate r and succeeding when probing a queue of length T + 1 or longer (π̄T+1 = λ T+1/(1 + r(1 −λT )) from (1)). Theorem 2. The mean delay of rate-based pull and push strategies can be expressed as Dboth = 1 1 −λ ( 1 − Mboth λ ( T + λ (1 −λ)(1 + r) )) . (8) 0.4 0.5 0.6 0.7 0.8 0.9 1 0 5 10 15 20 T 1 T 2 T 3 T 1 T 2 T 3 r push|R r pull|R r both|M R=1 M=0.125 1−M/R Load (λ) P ro b e r a te ( r) Figure 1: The probe rates r imposed by either the probe limit R = 1 for push (dot-dashed) and pull (dashed), or migration limit M = 1/8 (full), for T = 1, 2, 3. Note that rpull|R is independent of T. Proof. Equation (8) follows by rewriting (7) to rλT 1 + r(1 −λT ) = Mboth λ(1 −λ) , and substituting this expression in (3). The previous theorem shows that the improvement in mean delay compared to a standard M/M/1 queue, can be ex- pressed as a migration frequency (M/λ) times a migration gain (T + λ/((1 −λ)(1 + r))). The migration frequency de- notes how many migrations per job take place on average. The migration gain quantifies the number of places in the queue the migrating job skips. All migrating jobs skip at least T places by construction of the strategy, and skip more places depending on the queue length of the job sender. The average number of places skipped above T equals the aver- age number of customers in an M/M/1 queue with service rate 1 + r(1 −λ), which equals λ/((1 −λ)(1 + r)). Both the overall migration limit M and the overall probe limit R impose a maximum on r. In case R ≤ M, all probes are allowed to generate a migration, so r will only be con- strained by R. In any practical setting there will be more probes allowed than migrations, and the probe rate will be constrained by R or M depending on λ. An overview of the probe rates matching R or M, for both push and pull strategies with T = 1, 2, 3, is given in Figure 1. To determine the range of λ in which each constraint is most strict, we determine the intersection of rboth|M with rpush|R and rpull|R. Lemma 1. The probe rates rpush|R and rboth|M intersect at λ = 0 and 1 −M/R only, and both rates are positive for λ = 1 −M/R if and only if ( 1 − M R )T > R 2 R−M+R2 . The proof is given in Appendix A of [7]. Theorem 3. For the rate-based push strategy r should be set as follows in order to respect both the probe rate R and migration rate M: 1. R ≥ λT+1/(1 −λT ) and M ≥ λT+1(1 −λ)/(1 −λT ): r can be arbitrarily large. 2. R ≥ λT+1/(1 −λT ) and M < λT+1(1 −λ)/(1 −λT ): r can be at most rboth|M . 3. R < λT+1/(1 −λT ) and M ≥ λT+1(1 −λ)/(1 −λT ): r can be at most rpush|R. 4. R < λT+1/(1 −λT ) and M < λT+1(1 −λ)/(1 −λT ): Let τ = ( 1 − M R )T and υ = R 2 R−M+R2 . • If τ > υ, r can be at most rboth|M if λ < 1−M/R and at most rpush|R otherwise. • If τ ≤ υ, r can be at most rpush|R. The proof is given in Appendix B of [7]. Lemma 2. If M < R 1+TR , rboth|M − rpull|R has a unique root λT in (0, 1), otherwise it has no roots in (0, 1). The proof is given in Appendix C of [7]. For T = 1, the unique root λT of Lemma 2 reduces to λ1 = √ M + M/R. However, there seems to be no closed form expression for general T. Theorem 4. For the rate-based pull strategy r should be set as follows in order to respect both the probe rate R and migration rate M: 1. M ≥ λT+1(1 −λ)/(1 −λT ): r can be at most rpull|R. 2. M < λT+1(1 −λ)/(1 −λT ): • If the migration limit is sufficiently high (M ≥ R 1+TR ), then r can be at most rpull|R. • If the migration limit is sufficiently low (M < R 1+TR ), r can be at most rpull|R if λ < λT , and at most rboth|M otherwise. The proof is given in Appendix D of [7]. 4. MAX-PUSH The rate-based push is unable to reach an overall request rate higher than λT+1/(1−λT ) for any T. When the overall probe limit R exceeds this value, it is possible to use the remaining request rate by using the max-push variant as introduced in [6]. This strategy lets nodes with a queue length of T send probes at a finite rate rmp|R, and migrates all new arrivals to queues with length T by sending probes at an infinite rate until an empty server is found. As λT+1/(1− λT ) is an increasing function in λ and decreasing in T, the unique solution for λ to λT+1/(1 −λT ) = R is increasing in T. Therefore, there is a unique T > 1 satisfying λ T+1 /(1 −λT ) ≤ R < λT/(1 −λT−1). (9) For this T, the evolution of the max-push strategy can be described by a set of ODEs d dt x(t) = G(x(t)), where x(t) = (x1(t),x2(t), . . .) and xi(t) represents the fraction of the num- ber of nodes with at least i jobs at time t. Theorems 7, 8 from [6] show that the set of ODEs has a unique fixed point π̇ = (π̇1, . . . , π̇T ) that is a global attrac- tor, and can be expressed as π̇i = λ i 1 + ( λ 1−λ + r)(1 −λ T−i) 1 + ( λ 1−λ + r)(1 −λ T−1) , (10) for 1 ≤ i ≤ T. The mean delay Dmp of a job under the max-push strategy is given by [6, Corollary 3]: Dmp = 1 −λT + ( λ 1−λ + r)(1 −Tλ T−1 + (T − 1)λT ) 1 + r(1 −λ)(1 −λT−1) −λT . (11) For the max-push strategy the overall probe rate R equals R = π̇T ( λ 1 −λ + r ) , (12) as the instantaneous transfer of an arrival to a queue with T jobs requires 1/(1 − λ) probe messages on average. There- fore, a predefined overall probe rate R can be matched by setting rmp|R = R λT−1(R + λ) −R − λ 1 −λ , (13) where 0 ≤ rmp|R < ∞ for λT+1/(1 − λT ) ≤ R < λT/(1 − λT−1). 4.1 Limiting the Overall Migration Rate As the rate-based push strategy cannot exceed the probe rate λT+1/(1 − λT ), it is unable to exceed an overall mi- gration rate of (1 −λ)λT+1/(1 −λT ). It follows that when M > (1−λ)λT+1/(1−λT ), queues with a length of at least T +1 can probe at an arbitrarily high rate without exceeding the migration limit M, effectively reducing all queues to a length of at most T. Queues with length T can then send probes with a finite r to match M. In other words, in order to match M (instead of R as in the previous section), set T such that (1 −λ)λT+1 1 −λT ≤ M < (1 −λ)λT 1 −λT−1 . (14) and determine the probe rate rmp|M when the queue length equals T by the following theorem: Theorem 5. The max-push strategy matches a predefined migration rate M by letting the probe rate r = rmp|M with rmp|M = λ ( M ((1 −λ)λ + M) λT −λM − 1 1 −λ ) . (15) When matching M this way, π̇i for i ≤ T reduces to π̇i = λ i − M(1 −λi−1) 1 −λ , (16) Proof. The relationship (15) follows from the formula- tion of the overall migration rate. The migrations of new arrivals in queues with length T are given by π̇Tλ. The migrations resulting from a successful probe sent by queues with length T are given by π̇Tr(1−λ). Probes are successful if they locate an empty server, which they do with proba- bility 1 − λ. Therefore, the overall migration rate can be expressed as Mmp = π̇T ( λ 1 −λ + r ) (1 −λ) (17) = (1 −λ)((1 −λ)r + 2λ)λT+1 λ(r(1 −λ) + 1) − ((1 −λ)r + λ)λT . The reduction of π̇i to (16) follows from substitution of (15) in (10), and shows the improvement over an M/M/1 queue directly. Theorem 6. For the max-push strategy r and T should be set as follows in order to respect both the probe rate R and migration rate M: • If λ < 1 −M/R, T must be chosen according to (14) and r can be at most rmp|M . • If λ > 1 − M/R, T must be chosen according to (9) and r can be at most rmp|R. • If λ = 1 −M/R, both constraints are equivalent. The proof is given in Appendix E of [7]. Theorem 7. The mean delay of the max-push strategy can be expressed as Dmp = 1 1 −λ ( 1 − Mmp λ ( α + β Mmp )) , (18) with α = π̇TλT and β = π̇Tr(1 − λ)(T − 1). When r = rmp|M , the mean delay Dmp reduces to Dmp|M = 1 1 −λ + M ( 1 −λT ) (1 −λ)2λ − MT + λT+1 (1 −λ)λ . (19) Proof. The improvement in mean delay compared to a standard M/M/1 queue, can be expressed as a migration frequency (Mmp/λ) times a migration gain. The migration frequency denotes how many migrations per job take place on average. The migration gain quantifies the number of places in the queue the migrating job skips. The fraction of migrating jobs arriving at a queue with length T (π̇Tλ/Mmp) skip T places in the queue. The fraction of migrating jobs from queues with length T, being πTr(1−λ)/Mmp, skip T− 1 places in the queue. Hence, the migration gain is (α + β)/Mmp. The reduction to Dmp|M is found by applying Little’s Law to the expression for π̇ in (16), and shows the improvement over an M/M/1 queue explicitly. 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 1 2 3 4 5 6 R=1 M=0.1 T 1 T 4 Load (λ) M e a n D e la y ( D ) Figure 2: The mean delay of the pull strategy for T = 1, ..., 4. The probe rate r is constrained by both R and M (full lines). The delay shown in dashed lines is achieved when there is no migration limit, and only the probe limit is in effect. Choosing T = 1 is no longer optimal when a maximum migration rate is imposed. 5. CONDITIONAL PULL When only considering a maximum allowed probe rate R, the optimal choice for a pull strategy is to let T = 1 [6, The- orem 5]. This is no longer the case when taking a maximum allowed migration rate M into account, as shown in Figure 2. Intuitively, when the migration limit is small, it is best to pull jobs from longer queues only, resulting in a lower mean delay. To reduce the mean delay of the rate-based pull strategy, we introduce the conditional pull strategy that can match both R and M. Empty servers send probes according to an interrupted Poisson process with rate r. Under the condi- tional pull strategy, empty nodes always accept jobs from queues with length of at least T + 1 and also accept jobs from a queue with length T with some probability p. This strategy relies on the choice of p to match the migration rate M, and lets the probe rate r be determined by R, i.e. r = rpull|R = R/(1 −λ). Thus, both R and M are matched, this is in contrast with the previous strategies, where r was always chosen as large as possible without exceeding R and M. First we note that one can easily see that for λT , being the unique root defined in Lemma 2, λT < λT+1 (as M/((λ(1− λ) + M)λT − M) increases in T and R 1−λ increases in λ independent of T). Given λ, the conditional pull strategy sets T such that λT−1 ≤ λ < λT , (20) with λ0 = 0. To analyze the response time of a job under the conditional pull strategy we introduce a set of ODEs d dt xi(t) = H(x(t)), where x(t) = (x1(t),x2(t), . . .) and xi(t) represents the fraction of the number of nodes with at least i jobs at time t. As explained below, the set of ODEs H(x(t)) describing the time evolution of the queue lengths under the conditional pull strategy is defined as 0.3 0.4 0.5 0.6 0.7 0.8 0.9 0.5 1 1.5 2 2.5 3 3.5 4 4.5 R=1 M=0.1 T 1 T 5 Load (λ) M e a n D e la y ( D ) Figure 3: Showing the mean delay of the pull strat- egy with T > 1, respecting a migration limit Ṁ. The conditional pull variant is shown in dashed lines. dx1(t) dt = −(x1(t) −x2(t)) + (λ + rxT+1(t) + rp(xT (t) −xT+1(t)))(1 −x1(t)) (21) dxi(t) dt = λ(xi−1(t) − xi(t)) − (xi(t) − xi+1(t)), (22) for 1 < i < T , and dxi(t) dt = λ(xi−1(t) −xi(t)) − (1 + rp1[i=T](1 −x1(t)))(xi(t) −xi+1(t)), (23) for i ≥ T, where 1[A] = 1 if A is true and 1[A] = 0 other- wise. The terms λ(xi−1(t) − xi(t)) and xi(t) − xi+1(t), for i ≥ 1, correspond to arrival and service completions, respec- tively. Queues of length 1 are created by job transfers at rate (rxT+1(t) + rp(xT (t)−xT+1(t)))(1−x1(t)) as the frac- tion of empty nodes (1−x1(t)) probe at rate r, and a probe is successful with probability xT+1(t) + p(xT (t) −xT+1(t)). Similarly, migrating jobs reduce the number of queues with exactly i jobs, for i > T, at rate r(1−x1(t))(xi(t)−xi+1(t)) and at rate rp(1 −x1(t))(xT (t) −xT+1(t)) for i = T. The next theorem shows that this set of ODEs has a unique fixed point with ∑ i≥1 π̂i < ∞. In Appendix F of [7] we briefly argue why this fixed point can be used to approxi- mate the queue length distribution of a node as the number of nodes becomes large. The argument is similar to the one used in [5]. We also validate the accuracy of this approxi- mation by simulation in Section 5.1. Theorem 8. The set of ODEs d dt x(t) = H(x(t)) has a unique fixed point π̂ = (π̂1, π̂2, . . .) with ∑ i≥1 π̂i < ∞. The fixed point can be expressed as: π̂i = λi ( (1 −λ)r (∑T−i j=0 λj + p(r + 1) ( 1 −λT−i )) + 1 ) (1 −λ)r (∑T−1 j=0 λj + p(r + 1) (1 −λT−1) ) + 1 (24) for 1 ≤ i ≤ T , and for i > T as π̂i = πT ( λ 1 + r(1 −λ) )i−T (25) Proof. Assume π̂ is a fixed point with ∑ i≥1 π̂i < ∞, meaning Hi(π̂) = 0 for i ≥ 1, where H(x) = (H1(x),H2(x), . . .). When ∑ i≥1 π̂i < ∞, we can simplify ∑ i≥1 Hi(π) = 0 to λ − π̂1 = 0. Hence, π̂1 must equal λ. The expressions for π̂i then readily follow from the conditions Hi(π̂) = 0, for i ≥ 1. Theorem 9. A predefined overall migration rate M can be matched by setting p = pcp|M , with pcp|M = M − π̄T+1r(1 −λ) (π̄T − π̄T+1)r(1 −λ) (26) = λ ( r ( −λ2 + λ + M ) λT −M(r + 1) ) (1 −λ)r(r + 1) (λM − (−λ2 + λ + M) λT ) . When matching M by setting p = pcp|M , π̂i reduces to π̂i = λ i − M(1 −λi−1) 1 −λ , (27) for i ≤ T . Proof. The fraction of empty queues (1−λ) send probes at rate r. Probes are successful with probability 1 if they locate a queue with length at least T + 1 (π̄T+1). Probes are successful with probability p if they locate a queue with length equal to T (π̄T − π̄T+1). In other words: Mcp = r(1 −λ)(π̄T+1 + p(π̄T − π̄T+1), (28) from which (26) follows by algebraic manipulation. The re- duction of π̂i to (27) is found by substituting (26) in (24). Theorem 10. The mean delay Dcp of a job under the conditional pull strategy equals Dcp = 1 1 −λ ( 1 − Mcp λ (α−β) ) , (29) with α = T + λ (1 −λ)(1 + r) and β = r(1 −λ)pπ̄T Mcp . Proof. The improvement in mean delay compared to a standard M/M/1 queue, can be expressed as a migration frequency (Mcp/λ) times a migration gain (α − β). The fraction of migrating jobs where the job is pulled from a queue with length at least T + 1, (r(1 −λ)π̄T+1/Mcp) skip α places in the queue: The same remarks as in Theorem 2 apply. The other jobs ((r(1 −λ)p(π̄T − π̄T+1))/Mcp) are pulled from a queue with length equal to T, thus skipping exactly T −1 places. In other words, the migration gain can be expressed as: αr(1 −λ)π̄T+1 + (T − 1)(r(1 −λ)p(π̄T − π̄T+1)) Mcp , which can be rewritten as α−β. Figure 3 shows the mean delay of the conditional pull strat- egy. The dots represent λT , i.e., the intersection points of rpull|R and rboth|M . The conditional pull strategy achieves a lower mean delay compared to the rate-based pull strategies with T > 1, as it transfers more jobs. Theorem 11. When R = +∞ and M is finite, the max- push and conditional pull strategies have the same stationary queue length distribution. Proof. When there is no probe limit R, the parameter r is allowed to be arbitrarily large for the conditional pull strategy. In this case the maximum queue length will be T as limr→∞ π̂i = 0 for i > T, see (25). We therefore know from Theorems 5 and 9 that both the max-push and conditional pull strategy have the same queue length distribution when only matching M, if they use the same T. What remains to be shown is that both strategies make use of the same T. Recall that λT was defined as the solution in (0, 1) of rpull|R− rboth|M , that is, R 1 −λ = M (λ(1 −λ) + M)λT −M . The left hand side tends to infinity as R tends to infinity. Hence, rboth|M must tend to infinity, meaning λT is the so- lution to M = (1 − λ)λT+1/(1 − λT ). The λT ’s are thus exactly the M values where the max-push strategy changes its T value, see (14). Hence, both the conditional pull and max-push strategy choose the same T. 5.1 Model Validation We validate the infinite system model for the conditional pull strategy by comparing the closed form results of Theo- rem 10 with time consuming simulation results for systems with a finite number of nodes N. The infinite and finite sys- tem model only differ in the system size. Hence, the rate r and probability p in the simulation experiments is indepen- dent of N and was determined by using the expression for p from Equation (26) and r = R/(1 − λ). Each simulated point in the figures represents the average value of 25 simu- lation runs. Each run has a length of 106 time units (where the service time is exponentially distributed with a mean of 1 time unit) and a warm-up period of length 106/3 time units. Load (λ) 0.5 0.65 0.7 0.75 0.8 25 1.1e-2 1.7e-2 1.9e-2 2.4e-2 2.9e-2 50 5.6e-3 8.2e-3 9.5e-3 5.7e-3 7.1e-3 System 100 2.8e-3 3.9e-3 4.6e-3 5.7e-3 7.1e-3 Size 200 1.3e-3 2.0e-3 2.3e-3 2.7e-3 3.4e-3 (N) 400 7.0e-4 1.0e-3 1.2e-3 1.3e-3 1.7e-3 800 3.5e-4 5.0e-4 5.6e-4 6.6e-4 8.1e-4 1600 1.6e-4 2.5e-4 2.6e-4 3.8e-4 4.3e-4 Table 1: Relative error of mean delay, given by (29), for the conditional pull strategy with R = 1 and M = 0.1 when compared to simulation results. Table 1 compares the mean delay in a finite system with N nodes with the mean delay in the infinite system model under the conditional pull strategy with R = 1 and M = 0.1 for N = 25, 50, . . . , 1600 and λ = 0.5, 0.65, 0.7, 0.75 and 0.8. For each combination of N and λ we also show the relative error. The error clearly decreases as N grows, and is worse for larger λ values. The observed overall migration rate in the simulation is strictly lower than the predefined M, meaning less jobs will be transferred than anticipated. Hence, the mean delay in the simulation experiments is pessimistic. This error is in part due to the choice of p, which was determined using (26). This choice relies on the infinite system model whereas we are now studying a finite system. The relative error in the observed overall migration rate is nearly load-insensitive and decreases linearly as the system doubles in size, as shown in Table 2. N 25 50 100 200 400 800 1600 Rel. Err. 4% 2% 1% .5% 0.25% .13% .064% Table 2: Relative error of the observed overall mi- gration rate for finite system size when compared to the targeted migration rate M. 6. PUSH VERSUS PULL STRATEGIES We compare the performance of the max-push and the condi- tional pull strategies with a predefined overall probe limit R and migration limit M (using Theorems 7 and 10). The pa- rameter T is determined by the load λ, as each strategy is only defined for a specific T given any λ (see (9), (14) and (20)). For the max-push, the value for T and r is chosen to match the strictest constraint of either R or M depending on the load (see Theorem 6). For the conditional pull all idle servers probe with rate r = R/(1 −λ), and p is chosen to match M (see Theorem 9). The mean delay of the max-push and conditional pull strat- egy with M = 0.1 and R = 0.4 is shown in Figure 4. The max-push strategy is limited by the probe limit when λ > 1 −M/R and by the migration limit when √ M < λ < 1 − M/R. The mean delay of the push strategy is one in case λ < √ M, as all newly arriving jobs at a busy server can be migrated instantaneously to an empty server without violating the R and M constraints. For the conditional pull strategy the limiting factor is R when λ < √ M + M/R, and M for λ > √ M + M/R. The intervals where both strategies are constrained by M do not always overlap, i.e. √ M + M/R can be larger than 1 − M/R, as is the case for R = 1 and M = 0.3. When√ M + M/R < 1 −M/R both strategies transfer the same number of jobs when √ M + M/R < λ < 1 − M/R. How- ever, the max-push will outperform the conditional pull as the average migration gain is larger. This is not unexpected as the max-push strategy avoids that queues become larger than T, whereas queues with a length exceeding T exist for the conditional pull as it only sends random probes at a finite rate. 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 3 4 5 6 7 1 − M/R √ M √ M + M/R Push Pull R=0.4 M=0.1 Load (λ) M e a n D e la y ( D ) Figure 4: Mean delay of the max-push and condi- tional pull strategies, with R = 0.4 and M = 0.1. 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 2 3 4 5 6 7 1 − M/R √ M √ M + M/R Push Pull R=1 M=0.1 Load (λ) M e a n D e la y ( D ) Figure 5: Mean delay of the max-push and condi- tional pull strategies, with R = 1 and M = 0.1. As expected from Theorem 11, the difference in performance between max-push and conditional-pull becomes smaller when increasing R, as shown in Figure 5. As the empty queues send probes with rate r = R/(1 − λ), they are allowed to send more probes as R increases. This increases the odds that a long queue is probed, thus lowering the mean delay. This can also be observed by looking at the values for T. By increasing R, a larger value for T can be used for the same load. This requires that jobs are pulled from longer queues, increasing the migration gain per transfer. In conclusion, whenever the maximum allowed probe rate R clearly exceeds the maximum allowed migration rate M (which is the case that is mainly of practical interest), the pull strategy is either clearly superior (for large λ) or has a similar performance to the max-push strategy (for medium to low λ). 7. REFERENCES [1] D. Eager, E. Lazowska, and J. Zahorjan. Adaptive load sharing in homogeneous distributed systems. Software Engineering, IEEE Transactions on, SE-12(5):662 –675, may 1986. [2] D. Eager, E. Lazowska, and J. Zahorjan. A comparison of receiver-initiated and sender-initiated adaptive load sharing. Perform. Eval., 6(1):53–68, 1986. [3] N. Gast and B. Gaujal. A mean field model of work stealing in large-scale systems. SIGMETRICS Perform. Eval. Rev., 38(1):13–24, 2010. [4] Y. Lu, Q. Xie, G. Kliot, A. Geller, J. R. Larus, and A. Greenberg. Join-idle-queue: A novel load balancing algorithm for dynamically scalable web services. Perform. Eval., 68:1056–1071, 2011. [5] W. Minnebo and B. Van Houdt. A fair comparison of pull and push strategies in large distributed networks. IEEE/ACM Transactions on Networking, 2013. [6] W. Minnebo and B. Van Houdt. Improved rate-based pull and push strategies in large distributed networks. In Proc. of the IEEE 21-th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, San Francisco (USA), 2013. [7] W. Minnebo and B. Van Houdt. Tech. report: Analysis of rate-based pull and push strategies with limited migration rates in large distributed networks. http://win.ua.ac.be/~vanhoudt/papers/reports/ MigrationPaper.pdf, 2015. [8] R. Mirchandaney, D. Towsley, and J. Stankovic. Analysis of the effects of delays on load sharing. IEEE Trans. Comput., 38(11):1513–1525, 1989. [9] R. Mirchandaney, D. Towsley, and J. A. Stankovic. Adaptive load sharing in heterogeneous distributed systems. J. Parallel Distrib. Comput., 9(4):331–346, 1990. [10] M. Mitzenmacher. The power of two choices in randomized load balancing. IEEE Trans. Parallel Distrib. Syst., 12:1094–1104, October 2001. [11] I. V. Spilbeeck and B. V. Houdt. Performance of rate-based pull and push strategies in heterogeneous networks. Performance Evaluation, 91:2 – 15, 2015. Special Issue: Performance 2015. [12] A. Stolyar. Pull-based load distribution in large-scale heterogeneous service systems. Queueing Systems, 80(4):341–361, 2015. [13] B. Van Houdt. Performance comparison of aggressive push and traditional pull strategies in large distributed systems. In Proceedings of QEST 2011, Aachen (Germany), IEEE Computer Society, pages 265–274, SEP 2011. [14] N. Vvedenskaya, R. Dobrushin, and F. Karpelevich. Queueing system with selection of the shortest of two queues: an asymptotic approach. Problemy Peredachi Informatsii, 32:15–27, 1996. [15] L. Ying, R. Srikant, and X. Kang. The power of slightly more than one sample in randomized load balancing. In Proc. of IEEE INFOCOM, 2015. http://win.ua.ac.be/~vanhoudt/papers/reports/MigrationPaper.pdf http://win.ua.ac.be/~vanhoudt/papers/reports/MigrationPaper.pdf Introduction Rate Based Strategies Pull and Push Strategies Limiting the Overall Migration Rate Max-Push Limiting the Overall Migration Rate Conditional Pull Model Validation Push versus pull strategies References