JACODESMATH / ISSN 2148-838X J. Algebra Comb. Discrete Appl. 2(3) • 191-209 Received: 11 May 2015; Accepted: 21 August 2015 DOI 10.13069/jacodesmath.36015 Journal of Algebra Combinatorics Discrete Structures and Applications Approximate counting with m counters: A probabilistic analysis Research Article Guy Louchard1∗, Helmut Prodinger2∗∗ 1. Université Libre de Bruxelles, Département d’Informatique, CP 212, Boulevard du Triomphe, B-1050 Bruxelles, Belgium 2. University of Stellenbosch, Mathematics Department, 7602 Stellenbosch, South Africa Abstract: Motivated by a recent paper by Cichoń and Macyna [1], who introduced m counters (instead of just one) in the approximate counting scheme first analysed by Flajolet [2], we analyse the moments of the sum of the m counters, using techniques that proved to be successful already in several other contexts [11]. 2010 MSC: 60C05, 05A16, 68Q24 Keywords: Approximate counting, Moments, Constant and fluctuating components, Complex analysis, Product of Fourier series 1. Introduction Approximate counting is a technique that was first analysed by Flajolet [2]; some subsequent pa- pers [6, 12–14] added to the analysis. A counter C is kept, and each time an item arrives and needs to be counted, a random experiment is performed; if the current value of the counter is i, then with probability 2−i the counter is increased by 1, otherwise it keeps its value; at the beginning, the counter value is C = 1. After n random increments, the value of the counter is typically close to log2 n, and the cited papers contain exact and asymptotic values for average and variance. For instance, Flajolet [2] gives the dominant constant part of mean and variance and the periodic part of the mean. Recently, Cichoń and Macyna [1] used this idea as follows: Instead of one counter, they keep m counters, where m ≥ 1 is an integer. For our subsequent analysis we will assume that m is fixed. When a new element arrives (and needs to be counted), it is randomly (with probability 1 m ) assigned to one ∗ E-mail: louchard@ulb.ac.be ∗∗ E-mail: hproding@sun.ac.za (Corresponding Author) 191 Approximate counting with m counters of the m counters, and then the random experiment is performed as usual. The parameter that Cichoń and Macyna are interested in is the total number of changes of any counter. They also consider another strategy which is not discussed in the present paper. In other words, if we (for convenience) assume that the initial setting of a counter to the value 1 counts as a change, Cichoń and Macyna are interested in the sum of the values of the m counters. The paper [16] provided the first analysis of Cichoń and Macyna’s scheme: Based on exact expres- sions, asymptotics for expectation and variance are derived with Rice’s method. There is a price to be paid for dealing with these exact expressions, as there are computational hardships to be dealt with. Let us also mention Fuchs, Lee and Prodinger [4] who analyze this algorithm via the Poisson-Laplace-Mellin method. In the present paper, the approach is different: Going to approximations immediately, one loses exact expressions, but on the other hand the computations become much more manageable, so that one can go to higher moments, which we do here, mostly, to show the power of the method. The new interest in approximate counting that Cichoń and Macyna’s scheme initiated, motivated us to provide this paper: We had a long report [10] on asymptotics of the moments of extreme-value related distribution functions, but only a shortened version of it was published [11]; especially the analysis of classical approximate counting had to be left out. We present it here, together with additional material that deals with the m counters (instead of just one, as in the classical case). We also present new simplifications of products of some Fourier series. Let J(m,n) be the random variable (RV): “total value of the counter after n items have arrived”; we can write J(m,n) = ∑m 1 Ji(n) where Ji(n) is related to the ith counter. When we have only one counter, we can just write J(n). The most important motivation of the paper is to compute the asymptotic distribution and the moments of J(m,n). The asymptotic distribution is related to the extreme-value Gumbel distribution function (DF): exp(−exp(−x)). The moments are usually given by a constant part and a small fluctuating part. There we use Laplace and Mellin transforms and singularity analysis. Our aim is to derive an (almost) purely mechanical computation of constant and fluctuating com- ponents, with the help of computer algebra systems (we use Maple here). As an example, we provide the first four moments, (even the third moment is very rarely computed in the literature) but the treatment is completely automatic (with some human guidance of course). The fourth moment is particularly in- teresting: it presents a wide variety of combinatorial and mathematical constants as well as several types of Fourier series (including products of them). A last but not least motivation is to simplify the analytic treatment by using only easy complex analysis: only simple poles are needed, and we do not use alternating series (so we do not need Rice’s formula). A small number of analytic functions are the only tools we need. This should be compared with the complicated techniques sometimes used in previous papers. We have uniform integrability for the moments of our RV’s. To show that the limiting moments are equivalent to the moments of the limiting distributions, we need a suitable rate of convergence. This is related to a uniform integrability condition (see Loève [8, Section 11.4]). The total error term related to our asymptotics of moments is detailed in [11]; it is given by O(n−C), where C is some constant. Another technical point of interest are the periodic oscillations that always occur in approximate counting and related questions: When one goes for higher moments, there are many extra terms coming in, making the Fourier coefficients very complicated. In particular, there are high powers of some (simple) periodic functions. We present a technique to bring the Fourier coefficients of these into some standard form, using residue calculus. To summarize, we had several motivations to write this paper: • show the power of the method we introduced in [11] • present, with great details, the analysis of classical Approximate counting • give new simplifications of product of some Fourier series 192 G. Louchard, H. Prodinger • analyze, with precision, Approximate counting with m counters The paper is organized as follows: Definitions, notations and known properties are recalled in Section 2. Classical approximate counter (1 counter) is analyzed in Section 3. The case of m counters is considered in Section 4. The asymptotic moments of J(n) are given in Section 5. Section 6 concludes the paper. 2. Definitions, notations and known properties Let us first give the notations we will need throughout the paper. They will be used in Section 3, but we prefer to summarize them here instead of introducing them one by one. L := ln 2, log := log2, ε := small real > 0, α̃ := α/L, Q(j) := j∏ k=1 (1 − 2−k), Q(0) = 1, Q := Q(∞) = 0.288788095087 . . . , R(j) := (−1)j+1 2j(j+1)/2Q(j) , R(0) = −1, χl := 2πil L , ρk := ∑ l 6=0 Γ(χl)ψ(χl) ke−2lπi log n, k = 1, 2, 3, ρ4 := ∑ l 6=0 Γ(χl)ψ(1,χl)e −2lπi log n. These functions appear in many analyses of algorithms (see, for instance, Flajolet [2], Flajolet and Sedgewick [3], Hwang et al. [5], Louchard [9], Louchard and Prodinger [11]). The following facts will be frequently used: Q(i) ≥ Q, (1 −u)n = e−nu [ 1 −nu2/2 + O(nu3) ] , u ∈ ]0, 1[ . For the integer-valued RV J (from now on, we drop i and n from Ji(n) in order to ease the notation; J is now related to one counter, with n items), we set p(j) := Pr(J = j), P(j) := Pr(J ≤ j). Setting η = j − log n, we will first compute f and F such that p(j) ∼ f(η), P(j) ∼ F(η), n →∞, and, of course, f(η) = F(η) −F(η − 1). Asymptotically, the distribution will be a periodic function of log n in the following sense: fix n; P(J(n) ≤ j) ∼ F(η),η = j − log n = j −blog nc−{log n}. Set log n′ = log n + 1 (hence n′ = 2n, the base could of course be changed). Then P(J(n′) ≤ j + 1) ∼ F(j + 1 − log n′) = F(η). The asymptotic distribution of J(n′) is the same as the one of J(n), only shifted by 1. All n with the same {log n} lead to the same points of F(η). The distribution P(j) does not converge in the weak sense, it does however converge along subsequences nm for which the fractional part of log nm is constant. This type of convergence is not uncommon in the Analysis of Algorithms. Many examples are given in [11]. 193 Approximate counting with m counters Next, we must check that E ( Jk ) = ∑ j jkp(j) ∼ ∑ j (η + log n)kf(η), (1) by computing a suitable rate of convergence. This is related to a uniform integrability condition (see Loève [8, Section 11.4].) 3. Classical approximate counting (one counter). Analysis of J(n) In this section, we provide the asymptotic distribution and the first four asymptotic moments of the RV J(n) based on n items. From Flajolet [2, Proposition 1], we have p(j) = j−1∑ k=0 2k −R(k) Q(j − 1 −k) (1 − 1/2j−k)n. (2) Letting η = j − log n, it is proved in [2] that f(η) = ∞∑ k=0 2k −R(k) Q exp(−2−η+k). (Compare also [6, 13, 14].) It has been pointed out in [9] that it has some similarities with the Digital Search Tree distribution. Actually, as noticed by S. Janson (private communication), the distribution for approximate counting is the same as for unsuccessful search in Digital Search Trees (not only asymptot- ically). The rate of the convergence problem is completely solved in Flajolet [2]. Also, we obtain by summing P(j) = − j−1∑ k=0 j∑ u=k+1 2k R(k) Q(u− 1 −k) (1 − 1/2u−k)n, F(η) = − ∞∑ k=0 2k R(k) Q ∞∑ i=k exp(−2−η+i). (3) We note that the algorithm can be generalized by changing the base. The analysis is quite similar, and we won’t provide details here. The first three asymptotic moments are given in our unpublished report [10]. We take the opportunity to present them here, with some complements. In particular we analyze the product of Fourier series, which leads to convolutions of the coefficients. In order to show the power of the methods we use, we also give the fourth moment. Using the techniques we described in our published paper [11], we proceed as follows. 3.1. Some preliminary identities Some preliminary identities are necessary. They appear in the asymptotic moments, but we prefer to avoid overloading Sections 3.3 and 3.4. Using a classical Euler identity, we will derive several summation formulae. This identity is ∞∏ k=0 (1 + qkz) = ∞∑ i=0 ziqi(i−1)/2/Q(i). 194 G. Louchard, H. Prodinger (for a simple proof, see Knuth [7, Ex 5.1.1–16]). It holds for general q, once the definition of Q(i) it adapted by replacing 1 2 by q. Set Π∗ := ∞∏ k=1 (1 + qkz), Π := (1 + z)Π∗, Σk := (k − 1)! ∞∑ i=1 qki/(1 + zqi)k. It is not hard to see that, with z = −1, q = 1/2, we get the following expansions Π∗ → Q, Σ1 → C1, Σ2 → C2, Σ3 → 2!C3, Σ4 → 3!C4, with the abbreviations Ck := ∞∑ j=1 1 (2j − 1)k . Now we want to compute sums of type Uk := ∞∑ i=0 ik2iR(i). We obtain Π∗ ′ = Σ1Π ∗, Π′ = Π∗ + (1 + z)Σ1Π ∗, and setting z = −1, q = 1/2, we derive U1 = Q. Similarly, set T1 := zΠ′, compute T ′1, etc. With the same procedure, we obtain: U2 = −Q(−1 + 2C1), U3 = Q(1 − 6C1 − 3C2 + 3C21 ), U4 = −Q(−1 + 14C1 + 18C2 − 18C21 + 8C3 − 12C1C2 + 4C 3 1 ), U5 = Q(1 − 30C1 − 75C2 + 75C21 − 80C3 + 120C1C2 − 40C 3 1 − 30C4 + 40C1C3 + 15C22 − 30C 2 1C2 + 5C 4 1 ). (4) Also U0 = 0; (5) this is Equation (21) in Flajolet [2]. More generally, setting z = −2k, q = 1/2 in Π, we derive ∞∑ i=0 2(k+1)iR(i) = 0. (6) 3.2. Slow increase property It is necessary that the functions we need decrease exponentially in the direction i∞. Indeed, (see ([11] for details), some integration in the complex plane (along a rectangle) must be convergent. It will appear that all functions we use here are analytic (in some domain), depending on classical functions such as Γ, ζ, ψ(k,s) (the (k + 1)-gamma function). 195 Approximate counting with m counters We know that Γ(s) decreases exponentially in the direction i∞: |Γ(σ + it)| ∼ √ 2π|t|σ−1/2e−π|t|/2. Also, we have a “slow increase property” for all functions we encounter: let s = σ + it, |ζ(s)| = O(|t|1−σ), σ < 1. We will also use the function H2(−Ls), with H2(α) := ∞∑ k=0 eαk2kR(k). To analyze this function, we use the “sum splitting technique” as described in Knuth [7, p. 131], and used in Flajolet [2]: let σ > −1 and ρ(x) be an increasing function. For k < ρ(|s|), the contribution is bounded by ρ(|s|) sup 0≤k≤ρ(|s|) ( 1 2σk+k 2 ) = O ( 1 + ρ(|s|)2ρ(|s|) ) ; for k ≥ ρ(|s|), the contribution is bounded by ∞∑ k=ρ(|s|) 2k 2k 2 = O ( 2ρ(|s|) 22ρ(|s|) ) . Choosing ρ(x) = log x insures the slow increase property. S. Janson (private communication) mentioned that, with θ(z) := ∏ k≥1(1 −z/2 k), we have H2(α) = −θ(2eα) = (eα − 1)θ(eα). This easily leads to H2(α) ∼−Q(α−L) + O((α−L)2), H2(α) ∼ 3Q(α− 2L) + O((α− 2L)2), H2(α) ∼−21Q(α− 3L) + O((α− 3L)2), (7) and similar asymptotics for α = kL,k ≥ 4. 3.3. The asymptotic moments The detailed proofs of relations (8) to (17) are given in [11]. Let an (integer-valued) RV K be such that Pr(K − log n ≤ η) ∼ F(η), where F(η) is the DF of a continuous RV Z with mean m1, second moment m2, variance σ2 and centered moments µk. Assume that F(η) is either an extreme-value DF or a convergent series of such and that (1) is satisfied. Let ϕ(α) = E(eαZ) = 1 + ∞∑ k=1 αk k! mk = e αm1λ(α), (8) say, with λ(α) = 1 + α2 2 σ2 + ∞∑ k=3 αk k! µk. (9) 196 G. Louchard, H. Prodinger Also ϕ(α) = ∫ +∞ −∞ eαηF ′(η)dη. (10) We have here, for J, φ(α) := ∫ ∞ −∞ eαηf(η)dη = − 1 QL H2(α)Γ(−α̃), <(α) < 0. But we need a larger range for α. But, by (7), we can continue φ(α) analytically for all α: all singularities of Γ(−α̃) are cancelled by the successive roots of H2(α). Moreover, this implies eαηf(η) → 0, η →∞, η →−∞. (11) Now, we have ∫ +∞ −∞ eαη[F ′(η) −F ′(η − 1)]dη = (1 −eα)ϕ(α) = [eαη[F(η) −F(η − 1)]]∞−∞ −α ∫ +∞ −∞ eαη[F(η) −F(η − 1)]dη = [eαηf(η)] ∞ −∞ −α ∫ +∞ −∞ eαηf(η)dη = −αφ(α), by (11). This gives ϕ(α) = α eα − 1 φ(α) = − H2(α)Γ(1 − α̃) Q(1 −eα) . This leads to (we give only the first two expressions for mi) m1 = γ L −C1, m2 = π2 + 6γ2 6L2 − 2γC1 L −C2 + C21 −C1, µ2 = σ 2 = π2 6L2 −C1 −C2, µ3 = 2ζ(3) L3 − 2C3 − 3C2 −C1, µ4 = 3C 2 1 −C1 − 7C2 − 12C3 + 3C 2 2 − π2C1 L2 + 6C1C2 − π2C2 L2 + 3π4 20L4 − 6C4. Let w, κ’s (with or without subscripts) denote periodic functions of log n, with period 1 and 0 mean (for w) or non-zero small mean (for κ) and small (of order 10−4) amplitude. Actually, these functions depend on the fractional part of log n: {log n}. The moments of J(n)−log n are asymptotically given by m̃i +wi. The constant part m̃i comes from the residue of some function at 0 (see [11]) and wi comes from the residues of the same function at some complex poles. The generating function of m̃i is given by φ(α) := ∫ ∞ −∞ eαηf(η)dη = 1 + ∞∑ i=1 αi i! m̃i = ϕ(α) eα − 1 α . (12) 197 Approximate counting with m counters This leads to m̃1 = 1 2 −C1 + γ L , m̃2 = π2 + 6γ2 6L2 + γ − 2γC1 L + 1 3 − 2C1 −C2 + C21. More generally, the centered moments of J(n) are asymptotically given by µi = µ̃i + κi, with the asymptotic dominant constant centered moment generating function given by Θ(α) := 1 + ∞∑ k=2 αk k! µ̃k = 2 α sinh(α/2)λ(α). (13) The neglected part is of order 1/nβ with 0 < β < 1. We derive, with (4), the following result about these centered moments. Theorem 3.1. The asymptotic dominant constant parts of the centered moments of J(n) are given by µ̃2 = π2 6L2 + 1 12 −C1 −C2, µ̃3 = 2ζ(3) L3 − 2C3 − 3C2 −C1, µ̃4 = 1 80 + 3C21 − 3C1 2 − 15C2 2 − 12C3 − 6C4 + 3π4 20L4 + π2 12L2 + 6C1C2 − π2C1 L2 − π2C2 L2 + 3C22. Note that Θ(α) = φ(α)e−αm̃1. Now E(J(n) − log n)k ∼ m̃k + wk, with wk = 1 L ∑ l 6=0 Υ∗k(χl)e −2lπi log n, (14) and Υ∗k(s) = L φ (k)(α) ∣∣∣ α=−Ls . (15) With (5), we check that we have no singularity of φ(k), k > 0, at α = 0. The fundamental strip for (15) is <(s) ∈ 〈−1, 0〉. We first obtain w1 = − 1 L ∑ l 6=0 Γ(χl)e −2lπi log n; m̃1, µ̃2 and w1 are identical to the expressions given in Flajolet [2]. To compute the periodic components κi (with non-zero small mean) to be added to the centered moments µ̃i, we first set m1 := m̃1 + w1. 198 G. Louchard, H. Prodinger Now, we start from φ(α) := 1 + ∞∑ k=1 αk k! m̃k = ϕ(α) eα − 1 α . We replace m̃k by m̃k + wk, leading to φp(α) = φ(α) + ∞∑ k=1 αk k! wk. But it is easy to check that ∑ l 6=0 φ(−Lχl)e −2lπi log n = 0, so we obtain φp(α) = φ(α) + ∞∑ k=0 ∑ l 6=0 φ(k)(α) ∣∣∣ α=−Lχl e−2lπi log n αk k! = φ(α) + ∑ l 6=0 φ(α−Lχl)e −2lπi log n. (16) Finally, we compute Θp(α) = φp(α)e −αm1 = 1 + ∞∑ k=2 αk k! (µ̃k + κk) = Θ(α) + ∞∑ k=2 αk k! κk, (17) leading to the (exponential) generating function of κk. By expansion and taking differences, we have a result about the oscillating parts. Theorem 3.2. The asymptotic oscillating parts of the centered moments of J(n) are given by κ2 = −w21 − 2γw1 L + 2 L2 ρ1, κ22 = w 4 1 + 4γ2w21 L2 + 4 L4 ρ21 + 4γw31 L − 4w21 L2 ρ1 − 8w1 L3 ρ1, κ3 = 4L2w21 + 12w1Lγ + 6γ 2 −π2 2L2 w1 − 6(γ + w1L) L3 ρ1 − 3 L3 ρ2 − 3 L3 ρ4, κ4 = w1 [ −w1/2 + 12γC2 L + 12γC1 L − 3w31 + π2w1 L2 − 8ζ(3) L3 + 6C1w1 + 6w1C2 − 12γ2w1 L2 − 4γ3 L3 − 12w21γ L − γ L ] + L2 − 12C2L2 − 12C1L2 + 24γw1L + 12w21L2 + 12γ2 L4 ρ1 + 12 L4 ∑ l 6=0 e−2lπi log nψ(χl)ψ(1,χl)Γ(χl) + 12(w1L + γ) L4 ρ4 + 4 L4 ∑ l 6=0 e−2lπi log nψ(2,χl)Γ(χl) + 4 L4 ρ3 + 12(w1L + γ) L4 ρ2. All algebraic manipulations of this paper are mechanically performed by Maple.1 1 Γ′(x) = Γ(x)ψ(x), Γ′′(x) = Γ(x)ψ(1,x) + Γ(x)ψ2(x), Γ′′′(x) = Γ(x)ψ(2,x) + 3Γ(x)ψ(1,x)ψ2(x) + Γ(x)ψ3(x) etc. 199 Approximate counting with m counters Note that the non-zero κi mean must be added to the dominant constant parts of the centered moments. This will be considered in the next section. 3.4. The corrections Products of Fourier series may have a constant term, even if the factors do not. This term must be included in the dominant constant part of our moments. This is the object of the present subsection. We denote by [f]k the coefficient of e −2kπi log n in the Fourier expansion of f. In [11], we have proved the following relations c1[0] := [w 2 1]0 = 1 L2 ∑ k 6=0 Γ(−χk)Γ(χk) = − 2 L D − 11 12 + π2 6L2 with D := ∑ l≥1 (−1)l l(2l − 1) . The coefficient c1[k] of e−2kπi log n in the Fourier expansion of w21 is given by c1[k] = 1 L2 ∑ j 6=0,k Γ(−χj)Γ(χk + χj) = 2 L ∑ l≥1 (−1)lΓ(χk + l) l!(2l − 1) + 2 L2 Γ(χk) ( ψ(χk) + γ ) , c2[0] := [w 3 1]0 = 1 + 2ζ(3) L3 + 1 L D − 6 L2 D1 − 2 log 3 L − 2 L D2, with D1 = ∑ l≥1 (−1)lHl−1 l(2l − 1) , D2 = ∑ l,j≥1 (−1)l+j (l + j)(2l − 1) [ 1 2j − 1 + 1 2j+l − 1 ]( l + j j ) . This has been checked numerically and gives the tiny value −9.428177 × 10−25. The coefficient c2[k] of e−2kπi log n in the Fourier expansion of w31 is given by c2[k] = −   2L3 [ 2L ∑ l≥1 (−1)l l!(2l − 1) Γ(l + χk) ( ψ(l + χk) + γ ) −L2 ∑ l≥1 (−1)lΓ(l + χk) l! 2l (2l − 1)2 + 1 12 ( 18Ψ(1,χk) + 18 ( ψ(χk) + γ )2 − 11L2 −π2)Γ(χk) ] + 2 L2 ∑ l≥1 (−1)l l!(2l − 1) [ L ∑ h≥0 (−1)h h!(2h+l − 1) Γ(h + l + χk) + L ∑ h≥1 Γ(l + h + χk) (−1)h h!(2h − 1) + LΓ(l + χk)2 −l −LΓ(l + χk) − (l− 1)!Γ(χk) + Γ(χk) ( ψ(χk) + γ + L 2 )] + [ π2 6L3 + 1 12L − 1 L − 2 L2 ∑ l≥1 (−1)l−1 l(2l − 1) ] Γ(χk)   . 200 G. Louchard, H. Prodinger For a complete description of the Fourier coefficients of the oscillations occurring in the third and fourth moment, we need the following expressions (note that Maple splits the higher derivatives of the Gamma function; if one could rework that, one could reduce the number of necessary expressions): c3[0] := [w1ρ1]0 , c4[0] := [w 4 1]0, c5[0] := [ (ρ1) 2 ] 0 , c6[0] := [ w21ρ1 ] 0 , c7[0] := [w1ρ4]0 , c8[0] := [w1ρ2]0 , c3[k] := [w1ρ1]k , c4[k] := [w 4 1]k, c5[k] := [ (ρ1) 2] k , c6[k] := [ w21ρ1 ] k , c7[k] := [w1ρ4]k , c8[k] := [w1ρ2]k . We will show now how to “compute” some Fourier coefficients we need. “Computing” is perhaps a very ambitious word, it might be better replaced by “rewriting”. We have w1 = − 1 L ∑ k 6=0 Γ(χk)e −2πikx, and higher powers have convolutions as coefficients: w41 = 1 L4 ∑ k 6=0 ∑ k1+k2+k3+k4=k Γ(χk1 )Γ(χk2 )Γ(χk3 )Γ(χk4 )e −2πikx, and none of the k1, . . . ,k4 is allowed to be zero. The only thing that we are able to achieve is to have only one Gamma-term in the (multiple) sum, where a typical term might look like∑ j1+···+jt=j C (1) j1 . . .C (t) jt Γ(s)(χk + j)e −2πikx. Such a representation is not a priori better than the straight-forward convolution, but we will sketch now how to achieve them. One (small) advantage is that the zeroth term can be explicitly determined, and extracted, and what is left is then oscillating around zero. As an example, we consider W := w21 − [w 2 1]0 = ∑ k 6=0 c1[k]e −2πikx, with c1[k] = 2 L ∑ l≥1 (−1)lΓ(χk + l) l!(2l − 1) + 2 L2 ( Γ′(χk) + γΓ(χk) ) . Let us discuss W2. It is clear that the convolution of W with itself contains already several terms. For instance, [W2]0 = 4 L2 ∑ j,l≥1 (−1)j+l j!(2j − 1)l!(2l − 1) ∑ k 6=0 Γ(−χk + j)Γ(χk + l) + 8 L3 ∑ l≥1 (−1)l l!(2l − 1) ∑ k 6=0 Γ(−χk + l) ( Γ′(χk) + γΓ(χk) ) + 4 L4 ∑ k 6=0 ( Γ′(−χk) + γΓ(−χk) )( Γ′(χk) + γΓ(χk) ) . We would like to demonstrate how to rewrite the k-sums in these expressions. The survey paper [15] has many similar examples. The approach we found most versatile is via residue calculus. One writes a suitable function, computes all the residues in the complex plane, and the sum of them is zero. There are of course some technical subtleties, like showing that integral tends to zero for larger and larger radii, 201 Approximate counting with m counters and also there are usually some series that do not converge absolutely. The suitable limit of them is the Abel limit, i.e., consider a power series in x, and let x tend to a point at the boundary of convergence. Here, we want to concentrate on the computational part only. The function that is suitable for the first part is L 2z − 1 Γ(j −z)Γ(l + z). A first contribution comes from the poles at z = χk: S1 = ∑ k∈Z Γ(j −χk)Γ(l + χk). The next contribution stems from the poles at z = j,j + 1, . . . : S2 = (l + j − 1)!L ∑ h≥0 (−1)h−1 2j+h − 1 ( l + j + h− 1 h ) . Now we look at the poles at z = −l,−l− 1, . . . : ∑ h≥0 L 2−l−h − 1 Γ(j + l + h) (−1)h h! . This series does not converge absolutely. It is best to pull out the “bad” part, which leads to two contributions: S3 = (l + j − 1)!L ∑ h≥0 (−1)h−1 2l+h − 1 ( l + j + h− 1 h ) , S4 = (l + j − 1)!L ∑ h≥0 (−1)h−1 ( l + j + h− 1 h ) . As announced, we must interpret S4 as a limit: S4 = −(l + j − 1)!L lim x→1 ∑ h≥0 (−x)h ( l + j + h− 1 h ) = −(l + j − 1)!L2−l−j. Altogether we found ∑ k 6=0 Γ(j −χk)Γ(l + χk) = −(j − 1)!(l− 1)! − (l + j − 1)!L ∑ h≥0 (−1)h−1 2j+h − 1 ( l + j + h− 1 h ) − (l + j − 1)!L ∑ h≥0 (−1)h−1 2l+h − 1 ( l + j + h− 1 h ) + (l + j − 1)!L2−l−j. Now, let us look at ∑ k 6=0 Γ(−χk + l) ( Γ′(χk) + γΓ(χk) ) . The proper function is L 2z − 1 Γ(−z + l) ( Γ′(z) + γΓ(z) ) . 202 G. Louchard, H. Prodinger The poles at z = χk, k 6= 0, lead to S1 = ∑ k 6=0 Γ(−χk + l) ( Γ′(χk) + γΓ(χk) ) . The pole at z = 0 leads to S2 = Γ(l) (π2 −L2 12 − γ2 2 − Lγ 2 ) − Γ′(l) ( γ + L 2 ) − 1 2 Γ′′(l). The poles at z = l, l + 1, . . . lead to S3 = L ∑ h≥0 1 2l+h − 1 (−1)h h! ( Γ′(l + h) + γΓ(l + h) ) = L ∑ h≥0 1 2l+h − 1 (−1)h(l + h− 1)! h! Hl+h. The poles at z = −1,−2, . . . lead to − ∑ h≥1 L 1 − 2−h Γ(h + l) (−1)hγ h! = −Lγ ∑ h≥1 1 2h − 1 (−1)h(l + h− 1)! h! + Lγ(1 − 2−l). Therefore∑ k 6=0 Γ(−χk + l) ( Γ′(χk) + γΓ(χk) ) = −Γ(l) (π2 −L2 12 − γ2 2 − Lγ 2 ) + Γ′(l) ( γ + L 2 ) + 1 2 Γ′′(l) −L ∑ h≥0 1 2l+h − 1 (−1)h(l + h− 1)! h! Hl+h + Lγ ∑ h≥1 1 2h − 1 (−1)h(l + h− 1)! h! −Lγ(1 − 2−l). Finally, let us consider ∑ k 6=0 ( Γ′(−χk) + γΓ(−χk) )( Γ′(χk) + γΓ(χk) ) . The function of interest is L 2z − 1 ( Γ′(−z) + γΓ(−z) )( Γ′(z) + γΓ(z) ) . The poles at z = χk, k 6= 0, lead to S1 = ∑ n 6=0 ( Γ′(−χk) + γΓ(−χk) )( Γ′(χk) + γΓ(χk) ) . The pole at z = 0 leads to S2 = − 11π4 360 − L2π2 72 − L4 720 . The poles at z = 1, 2, . . . lead to S3 = Lγ ∑ h≥1 (−1)h−1Hh h(2h − 1) . 203 Approximate counting with m counters The poles at z = −1,−2, . . . lead to S3 = −Lγ ∑ h≥1 (−1)hHh h(1 − 2−h) = Lγ ∑ h≥1 (−1)h−1Hh h(2h − 1) −Lγ ∑ h≥1 (−1)h−1Hh h . Altogether ∑ k 6=0 ( Γ′(−χk) + γΓ(−χk) )( Γ′(χk) + γΓ(χk) ) = 11π4 360 + L2π2 72 + L4 720 − 2Lγ ∑ h≥1 (−1)h−1Hh h(2h − 1) + Lγ ∑ h≥1 (−1)h−1Hh h . The last alternating sum has a closed form evaluation: ∑ h≥1 (−1)hHh h = L2 2 − π2 12 . This gives us the constant term in w41: c4[0] = [w 4 1]0 = ( 2 L ∑ h≥1 (−1)h−1 h(2h − 1) − 11 12 + π2 6L2 )2 + 4 L2 ∑ j,l≥1 (−1)j+l j!(2j − 1)l!(2l − 1) × [ −(j − 1)!(l− 1)! − (l + j − 1)!L ∑ h≥0 (−1)h−1 2j+h − 1 ( l + j + h− 1 h ) − (l + j − 1)!L ∑ h≥0 (−1)h−1 2l+h − 1 ( l + j + h− 1 h ) + (l + j − 1)!L2−l−j ] + 8 L3 ∑ l≥1 (−1)l l!(2l − 1) × [ −Γ(l) (π2 −L2 12 − γ2 2 − Lγ 2 ) + Γ′(l) ( γ + L 2 ) + 1 2 Γ′′(l) −Lγ(1 − 2−l) −L ∑ h≥0 1 2l+h − 1 (−1)h(l + h− 1)! h! Hl+h + Lγ ∑ h≥1 1 2h − 1 (−1)h(l + h− 1)! h! ] + 4 L4 [ 11π4 360 + L2π2 72 + L4 720 − 2Lγ ∑ h≥1 (−1)h−1Hh h(2h − 1) −Lγ (L2 2 − π2 12 )] . Although a few simplifications in this expression are still possible, it is clear that the complexity of the expressions does not make it attractive enough to write more similar evaluations. We can now compute the corrected values. Theorem 3.3. Taking the contribution of products of Fourier series into account, the corrected asymp- 204 G. Louchard, H. Prodinger totic constant and oscillating parts of the centered moments of J(n) are given by µ̃2,c = µ̃2 − c1[0] = 1 −C1 −C2 + 2 D L , µ̃3,c = µ̃3 + 2c2[0] + 6 L γc1[0] − 6 L2 c3[0] = −C1 + 6ζ(3) L3 − 2C3 + 2 + 2 D L − 12 L2 D1 − 12 L2 γD − 11 2L γ + γπ2 L3 − 3C2 − 4 L D2 − 4 L log(3) − 6 L2 c3[0], µ̃4,c = µ̃4 − 1 2 c1[0] − 3c4[0] + π2 L2 c1[0] + 6C1c1[0] + 6C2c1[0] − 12 L2 γ2c1[0] − 12 L γc2[0] + 24 L3 γc3[0] + 12 L2 c6[0] + 12 L3 c7[0] + 12 L3 c8[0] κ2,c = − ∑ l 6=0 c1[l]e −2lπi log n − 2γw1 L + 2 L2 ρ1, κ3,c = 2 ∑ l 6=0 c2[l]e −2lπi log n + 6 L γ ∑ l 6=0 c1[l]e −2lπi log n + (6γ2 −π2)w1 2L2 − 6γ L3 ρ1 − 6 L2 ∑ l 6=0 c3[l]e −2lπi log n − 3 L3 ρ2 − 3 L3 ρ4, κ4,c = [ 12γC2 L + 12γC1 L − 8ζ(3) L3 − 4γ3 L3 − γ L ] w1 + L2 − 12C2L2 − 12C1L2 + 12γ2 L4 ρ1 + 12 L4 ∑ l 6=0 e−2lπi log nψ(χl)ψ(1,χl)Γ(χl) + 12γ L4 ρ4 4 L4 ∑ l 6=0 e−2lπi log nψ(2,χl)Γ(χl) + 4 L4 ∑ l 6=0 ρ3 + 12γ L4 ρ2 − 1 2 ∑ l 6=0 c1[l]e −2lπi log n − 3 ∑ l 6=0 c4[l]e −2lπi log n + π2 L2 ∑ l 6=0 c1[l]e −2lπi log n + 6C1 ∑ l 6=0 c1[l]e −2lπi log n + 6C2 ∑ l 6=0 c1[l]e −2lπi log n − 12 L2 γ2 ∑ l 6=0 c1[l]e −2lπi log n − 12 L γ ∑ l 6=0 c2[l]e −2lπi log n + 24 L3 γ ∑ l 6=0 c3[l]e −2lπi log n + 12 L2 ∑ l 6=0 c6[l]e −2lπi log n + 12 L3 ∑ l 6=0 c7[l]e −2lπi log n + 12 L3 ∑ l 6=0 c8[l]e −2lπi log n. Note that µ̃2,c fits with the result given in [16]. 4. m counters: Asymptotic independence of the m counters In this section, we analyze the asymptotic properties of the RV J(m,n). We will prove that, asymptotically, the counters are independent with n/m items each. We must analyze the random variable J(m,n) = ∑m 1 Ji(n) where Ji(n) has the distribution pη(j) with η is now given by νi: the number of items arriving in counter i. The quantity νi is bin[ñ, ñ(1 − 1/m))] with ñ := n/m. Actually, {ν1, . . . ,νm} is given by a multinomial distribution. We know that we can construct a “box” [ñ− ñθ, ñ + ñθ]m, 1 2 < θ < 1, (18) 205 Approximate counting with m counters such that, by large deviation analysis, the probability that {ν1, . . . ,νm} is outside this box is bounded by exp(−Cñ2θ−1). We will analyze J(m,n) −m log(ñ) = m∑ i=1 Xi, with Xi := Ji − log(ñ). The rate of convergence is analyzed as follows. 4.1. Let us first assume that νi is exactly given by its mean ñ. As mentioned in the previous section, the rate of convergence problem is solved in Flajolet [2]. 4.2. Now we assume that we are inside the box (18). We drop the ˜ sign from ñ for convenience. Let 0 < ε < 1. We must bound S2 := ∑ j jk ∣∣pn(j) −pn+ξnθ (j)∣∣ , with |ξ| < 1. Note that n + ξnθ = n[1 + ξnθ−1]. Set 1 > β > θ. • For j < β log n, we have 2−η > n1−β, |pn(j)| ≤ 1 Q2 ∞∑ k=0 1 2k(k−1)/2 exp(−n1−β2k) = O(exp(−n1−β)), |pn+ξnθ (j)| ≤ 1 Q2 ∞∑ k=0 1 2k(k−1)/2 exp(−n1−β(1 + ξnθ−1)2k) = O(exp(−n(1−β)(1−ε))), |pn(j) −pn+ξnθ (j)| = O(exp(−n (1−β)(1−ε))). • For β log n ≤ j < 2 log n, we have 2−η > 1 n , (1 − 1/2j)n − (1 − 1/2j)n+ξn θ = (1 − 1/2j)n(1 − (1 − 1/2j)ξn θ ) = O(nθ/2j). We use again the “sum splitting technique.” Set r = √ 2 log n. 1. Truncating the sum in (2) to k ≥ r leads to an error E1: E1 ≤ 1 Q2 ∞∑ k=r 1 2k(k−1)/2 [ exp(− 2k n ) + exp(− 1 + ξnθ−1 n 2k) ] = O ( 1 n ) 2. The remaining sum k ≤ r leads to E2 ≤ 1 Q2 r∑ k=0 1 2k(k−1)/2 [ (1 − 1/2j−k)n − (1 − 1/2j−k)n+ξn θ ] = r∑ k=0 O( nθ 2j−k ) = O( 1 n(β−θ)(1−ε) ). 206 G. Louchard, H. Prodinger • For j = 2 log n + x, x ≥ 0, we set r = √ 2 log n + √ 2x. So 1/2r 2/2 ≤ 2−x/n. We proceed now as in the second range 1. E1 = O ( 2−x n ) . 2. E2 ≤ 1 Q2 r∑ k=0 1 2k(k−1)/2 nθ 2j−k = O ( nθ n2(1−ε)2x(1−ε) ) . Now we come to S2. We get S2 = O ( (β log n)k+1 exp(−n(1−β)(1−ε)) ) + (2 log n)k+1O ( 1 n(β−θ)(1−ε) ) + O (∑ x≥0 (2 log n + x)kO (2−x n )) = O ( 1 n(β−θ)(1−ε) ) ; (not with the same ε, of course). 4.3. Now we consider the case ν > n + nθ. We will show that S3 := ∑ j pν(j)j k exp(−Cn2θ−1) is small. We notice that 1 < ν/n < m. But, by the rate of convergence proved in [2], ∑ j pν(j)j k is asymptotically bounded by O((log ν)k) = O((log(nm))k) and S3 is asymptotically small. 4.4. The last case to consider is 0 < ν < n−nθ. We have here 0 < ν/n < 1. The analysis proceeds like in the previous subsection. We therefore omit the details. In conclusion, as S2 and S3 are asymptotically small, we can assume that νi can be deterministically chosen as ñ for all i, and that the counters are asymptotically independent. 5. m counters: Asymptotic moments If Ji are iid RV, with asymptotic centered moments µk = µ̃k + κk, then J = ∑m i=1 Ji has asymptotic distribution given by the convolution f(η)(m), mean m log(ñ) + mm1 and asymptotic centered moments µk(m) given by µ2(m) = mµ2, µ3(m) = mµ3, µ4(m) = m[µ4 + 3(m− 1)µ 2 2]. 207 Approximate counting with m counters For µ2(m) and µ3(m), we immediately use µ2 and µ3. For µ4(m), we have µ2 = µ̃2 + κ2, so µ 2 2 = µ̃22 + κ 2 2 + 2µ̃2κ2. Hence µ̃4(m) = m[µ̃4 + 3(m− 1)µ̃22], κ4(m) = m[κ4 + 3(m− 1)(κ22 + 2µ̃2κ2)]. Also, we have [κ22]0 = c4[0] + 4γ2c1[0] L2 + 4c5[0] L4 + 4γc2[0] L − 4c6[0] L2 − 8c3[0] L3 , [κ22]k = c4[k] + 4γ2c1[k] L2 + 4c5[k] L4 + 4γc2[k] L − 4c6[k] L2 − 8c3[k] L3 . The corrected moments must now be computed. The interesting case is the fourth moment, since here the dependency on m is more involved: we obtain our last theorem. Theorem 5.1. Taking the contribution of products of Fourier series into account, the asymptotic constant and oscillating parts of the corrected fourth centered moment of J are given by µ̃4,c(m) = m [ µ̃4,c + 3(m− 1)µ̃22 ] + 3m(m− 1) [ [κ22]0 − 2µ̃2c1[0] ] , κ4,c(m) = m [ κ4,c + 3(m− 1) ∑ l 6=0 [κ22]le −2lπi log n + 3(m− 1)2µ̃2κ2,c ] . 6. Conclusion If we compare the approach in this paper with other ones that appeared previously, then we can notice the following. Traditionally, one would stay with exact enumerations as long as possible, and only at a late stage move to asymptotics. Doing this, one would, in terms of asymptotics, carry many unimportant contributions around, which makes the computations quite heavy, especially when it comes to higher moments. Here, however, approximations are carried out as early as possible, and this allows for streamlined (and often automatic) computations of asymptotic distributions and higher moments. Acknowledgment: We would like to thank two referees for many useful suggestions that improved the paper. References [1] J. Cichoń and W. Macyna, Approximate counters for flash memory, 17th IEEE International Confer- ence on Embedded and Real-Time Computing Systems and Applications (RTCSA), Toyama, Japan, 2011. [2] P. Flajolet, Approximate counting: A detailed analysis, BIT, 25, 113–134, 1985. [3] P. Flajolet and R. Sedgewick, Digital search trees revisited, SIAM J. Comput., 3, 748–767, 1986. [4] M. Fuchs, C. K. Lee, and H. Prodinger, Approximate counting via the Poisson-Laplace-Mellin method, DMTCS, proc AQ, 13–28, 2012. [5] H.-K. Hwang, M. Fuchs and V. Zacharovas, Asymptotic variance of random symmetric digital search trees, Discrete Math. Theor. Comput. Sci., 12, 103–166, 2010. [6] P. Kirschenhofer and H. Prodinger, Approximate counting: An alternative approach, RAIRO Theor. Inform. Appl., 25, 43–48, 1991. [7] D. E. Knuth, The art of computer programming, volume 3: Sorting and searching, Addison-Wesley, 1973, (Second edition, 1998). 208 G. Louchard, H. Prodinger [8] M. Loève, Probability theory, 3rd ed, D. Van Nostrand, 1963. [9] G. Louchard, Exact and asymptotic distributions in digital and binary search trees, RAIRO Theor. Inform. Appl., 21(4), 479–496, 1987. [10] G. Louchard and H. Prodinger, Asymptotics of the moments of extreme-value related distribution functions, Technical report, 2005. Long version: http://www.ulb.ac.be/di/mcs/louchard/moml.ps. [11] G. Louchard and H. Prodinger, Asymptotics of the moments of extreme-value related distribution functions, Algorithmica, 46, 431–467, 2006. [12] G. Louchard and H. Prodinger, Generalized approximate counting revisited, Theoret. Comput. Sci., 391, 109–125, 2008. [13] H. Prodinger, Hypothetic analyses: Approximate counting in the style of Knuth, path length in the style of Flajolet, Theoret. Comput. Sci., 100, 243–251, 1992. [14] H. Prodinger, Approximate counting via Euler transform, Mathematica Slovaka, 44, 569–574, 1994. [15] H. Prodinger, Periodic oscillations in the analysis of algorithms and their cancellations, J. Iran. Stat. Soc., 3, 251–270, 2004. [16] H. Prodinger, Approximate counting with m counters: a detailed analysis, Theoret. Comput. Sci., 439, 58–68, 2012. 209 Introduction Definitions, notations and known properties Classical approximate counting (one counter). Analysis of J(n) m counters: Asymptotic independence of the m counters m counters: Asymptotic moments Conclusion References