JACODESMATH / ISSN 2148-838X J. Algebra Comb. Discrete Appl. 2(2) • 117-149 Received: 1 December 2014; Accepted: 22 April 2015 DOI 10.13069/jacodesmath.28239 Journal of Algebra Combinatorics Discrete Structures and Applications Identifying long cycles in finite alternating and symmetric groups acting on subsets Research Article Steve Linton1∗, Alice C. Niemeyer2∗∗, Cheryl E. Praeger3§ 1. School of Computer Science, University of St. Andrews, North Haugh, St. Andrews, Fife, KY16 9SX, Scotland 2. Department of Mathematics and Statistics, Maynooth University, Co. Kildare, Ireland 3. Centre for the Mathematics of Symmetry and Computation, The University of Western Australia, 35 Stirling Hwy, Crawley, WA 6009, Australia Abstract: Let H be a permutation group on a set Λ, which is permutationally isomorphic to a finite alternating or symmetric group An or Sn acting on the k-element subsets of points from {1, . . . ,n}, for some arbitrary but fixed k. Suppose moreover that no isomorphism with this action is known. We show that key elements of H needed to construct such an isomorphism ϕ, such as those whose image under ϕ is an n-cycle or (n−1)-cycle, can be recognised with high probability by the lengths of just four of their cycles in Λ. 2010 MSC: 20B30, 60C05, 20P05, 05A05 Keywords: Symmetric and alternating groups in subset actions, Large base permutation groups, Finding long cycles 1. Introduction The second and third authors predicted in [9] that, for a permutation group H on a set Λ, which is permutationally isomorphic to a symmetric group Sn acting on the k-element subsets of points from {1, . . . ,n} (that is in its k-set action), for some arbitrary but fixed k, it should be possible to recognise an element in H corresponding to an n-cycle in Sn by the lengths of just four of its cycles in Λ. The purpose of this paper is to prove this result. Theorem 1.1. Let H be a permutation group on a set Λ, which is permutationally isomorphic, via an unknown isomorphism ϕ, to a finite symmetric group Sn in its k-set action, for some k. Let h be ∗ E-mail: sal@cs.st-andrews.ac.uk ∗∗ E-mail: alice.niemeyer@nuim.ie § E-mail: Cheryl.Praeger@uwa.edu.au 117 Identifying long cycles a uniformly distributed random element of H and let λ1, . . . ,λ4 be independent, uniformly distributed random points of Λ. Then there exist positive constants N0 and c such that, for n ≥ N0, Prob ( ϕ(h) is an n-cycle ∣∣∣∣ the h-cycle containing λi haslength n, for i = 1, . . . , 4 ) > 1 − c n 1 6 . Subset actions of Sn and the alternating group An play a crucial role in algorithms for permutation groups. They are examples of ‘large-base’ actions. Most primitive permutation groups are ‘small-base’ and very efficient algorithms are available to compute with them (for a detailed definition see [12, p. 51]). However these algorithms become prohibitively expensive when applied to large-base groups and, there- fore, alternative means of handling large-base groups are essential (see [12, Chapter 10] for a discussion on currently available algorithms for this case). The large-base primitive permutation groups all contain in their socles alternating groups with associated subset actions. Hence finding efficient algorithms for these actions is important. The probabilistic algorithm described in [6] recognises alternating and symmetric groups in their actions on k-sets constructively. It takes as input a group H and an integer n. Under the assumption that H is isomorphic to Sn, the algorithm examines a number of random elements of H seeking to find key-elements, namely elements for which good estimates for their proportions in Sn are known and which have particular properties. Should this search fail, the algorithm concludes that the assumption that H is isomorphic to Sn is incorrect and reports that H is not isomorphic to Sn. Otherwise, it attempts to construct an isomorphism from H to Sn. Since this algorithm is randomised, there is a possibility that the search appears to succeed but in fact has not found suitable elements. In most settings this will be detected as part of the larger algorithm, see [6]. The theoretical underpinning of the algorithm described above is to determine very good bounds for the probability of finding the key-elements in H under the assumption that H is isomorphic to Sn. This is the purpose of the current paper. For the connection between estimation results and probabilistic algorithms in the context of recognition algorithms for groups see [10]. The groups the algorithm can take as input need not be permutation groups. Rather, they may belong to a more general class of groups called groups of black box permutations (see [6]). This allows the algorithm to be employed in different computational models, for example, to deal with groups given by a set of generating matrices, which are permutation isomorphic to Sn acting on the underlying vector space, without converting such a group into a permutation group. The crucial requirement is the ability to compute the image of a point under the action of a generator of the group. Suppose that t is an upper bound for the time taken to perform this action, which might be viewed as a black box procedure. Then the time required to determine that a word of length r in the group generators permutes a given point in a cycle of length m is at most mrt. In some contexts this time can even be very much less than the time required to find the product of two group generators. For suitable n and k the algorithms in [6] have running time (excluding any time needed to read the input) growing significantly more slowly than ( n k ) , implying that they can use the input black box permutations only through computing their action on a selection of the ( n k ) points, and not through examining any other aspect of their structure, or computing any other elements of the group they generate. For example, checking that the n-th power of an element is the identity may already be more expensive than our entire algorithm. The key elements sought by the algorithm to recognise An or Sn in its k-set action are elements containing an m-cycle for large m, as described in Table 1. Our Theorem 1.1 follows from a more general theorem, Theorem 3.1, which determines the probability of finding elements of each of these types among a certain number of random elements of H. Once these key elements have been found, a permutational isomorphism from H on Λ to An or Sn on k-sets can be implemented using the methods described in [2, Section 4], especially Method B, or those described in [1, Sections 4 and 5], especially Lemmas 4.1, 4.3 and 5.5. Alternative methods, focussing in particular on the k-set action, are developed in [6]. 118 S. Linton, A.C. Niemeyer C.E. Praeger 1.1. Context of our results In a seminal collection of papers, Erdös and Turán initiated the study of asymptotic behaviour of the proportions of various kinds of elements in permutation groups. For example, they showed [4, 5] that for n large enough, most elements in the symmetric group Sn of degree n have order n( 1 2 +o(1)) log(n). In the same vein Warlimont [13] proved that the conditional probability that a random element g in Sn is an n-cycle, given that gn = 1, is 1 −O(n−1). Applied algorithmically, Warlimont’s result is used to conclude, from the fact that the nth power of a ‘hidden’ permutation g ∈ Sn is the identity, that g is almost certainly an n-cycle. Finding an n-cycle is a key step in many algorithms that ‘constructively recognise’ Sn, so this is valuable. However, testing whether gn = 1 requires log(n) multiplications of black box permutations. In computational models where a single multiplication costs about ( n k ) operations, employing this approach would not yield an algorithm whose running time grows significantly more slowly than ( n k ) . The results of [9] provide a basis for extending this to a situation where we know only that 〈g〉 has an orbit of length n in some action. An extension of this nature to k-set actions of Sn and An is the subject of this paper. We refine and improve significantly the main result of [9]. For example, we employ a similar division of the elements of Sn into several families according to properties of points which lie in cycles of lengths dividing n. However, examining this subdivision alone is not sufficient to achieve the results in this paper. We need to study the probability that several k-element subsets of {1, . . . ,n} have exactly n distinct images under 〈g〉 for g an element in one of the families. Moreover, in our algorithmic applications we also required analogous results for elements of Sn and An containing m-cycles, for m ≥ n− 6. Line G n m r key-elements ρ(G,n,m) 1 Sn n 1 n-cycle 1 2 odd n− 2 2 2-cycle 1 3 even n− 3 2 2-cycle 2/3 4 An odd n 1 n-cycle 1 5 even n− 1 1 (n−1)-cycle 1 6 2 or 4 (mod 6) n− 3 3 3-cycle 1 7 3 or 5 (mod 6) n− 4 3 3-cycle 3/4 8 0 (mod 6) n− 5 3 3-cycle 7/20 9 1 (mod 6) n− 6 3 3-cycle 9/40 Table 1. Groups and types of elements In Section 2 we briefly describe the algorithmic application, and in particular we explain the meaning of the parameters r and ρ in Table 1. In Section 3 we introduce the notation which we shall use throughout the paper and give the precise statement of the main result (Theorem 3.1). The proof of Theorem 3.1 (and hence of Theorem 1.1) is given in Section 4. In particular we exhibit an explicit value for the constant c of Theorem 3.1(a) and Theorem 1.1. We present some background material in Sections 5 and 6. Sections 7 - 11 contain the various parts which are pulled together in Section 4 for the proof of Theorem 3.1. 2. Algorithmic application The results in this paper are motivated by algorithmic applications in [6] and [7]. In these applica- tions, H is a permutation group acting on a set Λ of ( n k ) points. We wish to test whether H is permutation isomorphic to G = An or G = Sn acting on the set ( Ω k ) of k-element subsets of Ω = {1, . . . ,n}. That is 119 Identifying long cycles to say, whether there is a group isomorphism ϕ : H → G and a bijection f : Λ → ( Ω k ) such that, for each h ∈ H and λ ∈ Λ, (λh)f = (λ)fhϕ. These isomorphisms will be constructed in the form of a computer program rather than listing the image of each element. We say that an element h ∈ H corresponds to an element g ∈ G if the permutation isomorphism ϕ maps h to g. The algorithms construct a ‘nice generating’ set for H of size 2. In the case where H is permutation isomorphic to Sn in its action on ( Ω k ) , this generating set consists of elements that, in the natural representation of Sn on n points, correspond to an n-cycle and a 2-cycle interchanging two consecutive points of the n-cycle. In the case where H is permutation isomorphic to An in its action on( Ω k ) the nice generating set consists of elements that in An correspond to an n-cycle or (n−1)-cycle, and to a 3-cycle. We wish to find these elements by selecting independent, uniformly distributed random elements from the group H. However, the proportion of 2-cycles in Sn or 3-cycles in An is too small to allow us to find such elements directly by random selection. Therefore, we seek elements in H which correspond to permutations containing a 2-cycle or a 3-cycle together with one long cycle of length m, say, where m is at least n− 6 and m is coprime to 2 or 3, respectively. The algorithms in [6] and [7] seek elements h ∈ H which correspond to the kinds of elements g listed in Table 1, where H is permutation isomorphic to G = Sn or G = An, with G,n as in the second and third columns. The fourth column, labelled m, lists the length of the m-cycle which the element g contains. The fifth column, labelled r, lists an integer between 1 and 3. Ultimately we wish to find an element h in H which corresponds to an element in G with cycle type as recorded in the sixth column. This element is constructed as a power of the element h. The first element in the nice generating set for H corresponds to an element satisfying the conditions of Line 1, 4, or 5, namely it corresponds to an n-cycle or an (n − 1)-cycle. The second nice generator corresponds to a 2-cycle if G = Sn and is constructed from an element h ∈ H which corresponds to g as in Line 2 or 3. If G = An, the second nice generator corresponds to a 3-cycle and is constructed from h ∈ H corresponding to an element g as in Line 6, 7, 8 or 9. The last column, labelled ρ(G,n,m), records a rational number such that the proportion of elements h of H which correspond to elements of G containing an m-cycle and with order dividing rm is ρ(G,n,m) m (see (1)). The group H acts on a set Λ of size |Λ| = ( n k ) , and in the context of the algorithm m, n and k are so large that it is ‘too expensive’ to compute the full cycle structure of elements of H in their action on Λ. Instead we compute the cycle lengths of elements h ∈ H on a handful of randomly chosen points of Λ, that is to say, we ‘trace’ these points under the action of 〈h〉. In computer experiments in GAP [3], we discovered that if H is permutation isomorphic to G = Sn or An on ( Ω k ) then, for m,r as in one of the lines of Table 1, most elements of H which produced cycles of lengths a multiple of m and dividing rm, when we traced each of four or five independent random points of Λ, corresponded to elements of G containing an m-cycle. This computer experiment is formalised in procedures FindMCycle and TraceCycle. Our experimental observation turns out to be true in general, and is proved in Theorem 3.1 for sufficiently large n. Our experiments also suggest that the results hold for smaller values of n. For clarity of exposition the proofs of Theorem 3.1 are written in terms of the action of G on ( Ω k ) . For n,m and r as in one of the lines of Table 1, define N(n,m) to be the set of all g ∈ Sn that contain an m-cycle and Ngood(G,n,m) to be the set of all g ∈ N(n,m) ∩G for which o(g) divides rm. Note that, for given G,n,m, only one of the lines of Table 1 is satisfied, and hence r is determined by G,n,m. We define ρ(G,n,m) to be the rational number satisfying |Ngood(G,n,m)| |G| = ρ(G,n,m) m . (1) As an example of how to interpret this information, consider Line 3 of Table 1. The proportion of elements g of Sn containing an (n−3)-cycle is 1n−3, and 2/3 of these elements contain also a 2-cycle or three 1-cycles on the remaining 3 points. Thus the proportion of elements of Sn containing an (n−3)-cycle and having order dividing 2(n− 3) is 2/3 n−3 = ρ(Sn,n,n−3) n−3 . In order to construct a 2-cycle (the entry in column 6 for 120 S. Linton, A.C. Niemeyer C.E. Praeger this line), we raise the element g to the (n − 3)rd power producing x = gn−3. Since n − 3 is odd, the element x is the identity if g has three fixed points, a 2-cycle if g contains a 2-cycle, or possibly a 3-cycle if g contains a 3-cycle and 3 does not divide n. Thus three quarters of the elements of Ngood(Sn,n,n−3) yield a 2-cycle by powering. The algorithm FindMCycle can therefore easily be incorporated into a Monte Carlo algorithm to construct a transposition in this case: by repeating FindMCycle a number of times we will with high probability construct a transposition by powering the output of FindMCycle. The other lines have a similar interpretation for ρ(G,n,m). We now describe the two algorithms. Algorithm 1 assumes that we have a function RandomGrpElt which takes as input a generating set Y for a group H and returns independent, uniformly distributed random elements of H. Algorithm 2 assumes that we have a function RandomPoint which takes as input a finite set Λ and returns independent, uniformly distributed random points of Λ. Note that Algorithm 1 calls Algorithm 2 and that we assume that Algorithm 2 has access to the variables of Algorithm 1. Algorithm 1: FindMCycle(n,m,r,H, Λ,ε,M) Data: Let (n,m,r) be as in one of the lines of Table 1. Let H be a permutation group with a generating set Y acting on a finite set Λ. Let ε be a real number with 0 < ε < 1 and let M be an integer with M ≥ 4. Result: An element h ∈ H or fail; This algorithm inspects up to O(n log(ε−1)) uniformly distributed independent random elements from H to find one which has orbits of length a multiple of m and dividing rm on each of M randomly selected points from Λ. If such an h ∈ H is found it returns h, otherwise it returns fail. Set N := ⌈ 5n log( 2 ε ) ⌉ ; for i = 1, . . . ,N do hi := RandomGrpElt(Y ); if TraceCycle(hi) = true then return hi; return fail; Algorithm 2: TraceCycle(h) Data: A permutation h ∈ H; Result: A boolean ‘true’ or ‘false’ This algorithm tests whether the permutation h ∈ H has orbits of length a multiple of m and dividing rm on M randomly selected points from Λ. If this is the case it returns true, otherwise it returns false. for i = 1, ..,M do λi := RandomPoint(Λ); Put Γ = {λj}Mj=1; for λ ∈ Γ do if |λ〈g〉| 6= r0m for some r0 | r then return false; return true; Remark 2.1. (a) The number M of random points of Λ tested in the algorithm TraceCycle is often a bounded constant (as, for example, in Theorem 3.1), but in our analysis we allow it to be as large as O(n), see (2). (b) The algorithm TraceCycle performs O(n) image computations to check whether |λ〈g〉| = r0m, for each random point λ. Thus if ξrp, ξrge, νim, are upper bounds for the costs of producing a random point using RandomPoint, producing a random group element using RandomGrpElt, and computing the image of a point of Λ under an element of H, respectively, then the cost of FindMCycle is O(n log(ε−1)(ξrge + Mξrp + Mnνim)). 121 Identifying long cycles This cost is modest when compared with the cost ( n k ) νim of computing the product of two permutations of Λ (especially when k = O(n)) or the cost of directly computing the order of any element. Our main result Theorem 3.1 shows that these simple and inexpensive procedures provide an effective way to find and identify elements of Sn and An containing m-cycles from their actions on k-element subsets. 3. Statement of the main theorem and notation In order to state our main theorem we introduce several parameters that are used throughout the paper. Suppose that the triple (G,n,m) satisfies one of the lines of Table 1, and note that r is determined by G,n,m. The integer M used in the algorithm FindMCycle is assumed to satisfy 4 ≤ M ≤ log ( 9 8 ) n− 2 2 . (2) Let d(x) be the number of positive divisors of an integer x. By [11, pp. 395-396], d(x) = xo(1). In fact, for every δ > 0, there is a positive constant cδ such that d(x) ≤ cδxδ (3) for all x. Choose real numbers δ and s satisfying 0 < δ < min{1 −s, s 3 ,s− 1 2 } and 1 2 < s < M − 1 M . (4) Further let ` = min{M(1 −s), 3 − 2s− 2δ, 1 + s− 3δ, 2s− 2δ} . (5) By (4), all of M(1 −s) > 1, 3 − 2s− 2δ > 1, 1 + s− 3δ > 1 and 2s− 2δ > 1 hold. Hence ` > 1. Next we define the constant aδ by aδ := 5 4 ( 1 + 3 cδ 150s−δ + ( cδ 150s−δ )2) , (6) with cδ as in (3), and the constant bM,δ,s, which we usually abbreviate to bM, by bM = ( 33 8 )M + 72 aδc 2 δr 2s+2δ + 6.24 aδc 3 δr 3δ + c2δ r2s−2δ + ( 31 r1−s )M . (7) The theorem involves an ‘error probability’ ε, that is, a real number satisfying 0 < ε < 1. We assume that the integer n satisfies the following inequalities: n ≥   12(rn)s + 6 (rn)s log n( 10bM ε )1/(`−1) . (8) Theorem 3.1. Let (G,n,m) be as in one of the lines of Table 1, and let k be a positive integer satisfying 2 ≤ k ≤ n/2. Suppose that H is a permutation group permutation isomorphic to G acting on k-element subsets of {1, . . . ,n} (via the unknown isomorphism ϕ : H → G). Then the following hold (a) Let h be a uniformly distributed random element of H and let λ1, . . . ,λ4 be independent, uniformly distributed random points of Λ. Then there exist positive constants N0 and c such that, for n ≥ N0, Prob   ϕ(h) contains an m-cycle ∣∣∣∣∣∣∣ for i = 1, . . . , 4 the h-cycle on λi has length rim for some ri | r   > 1 − c n 1 6 . 122 S. Linton, A.C. Niemeyer C.E. Praeger (b) Let M be an integer satisfying (2), and let s,δ be real numbers satisfying (4), and ` as in (5). Then FindMCycle is a Monte Carlo Algorithm which, given as input the permutation group H, an error probability ε > 0 and the integer M, returns an output h such that, provided n satisfies (8), (i) the probability that h ∈ H and ϕ(h) contains an m-cycle is at least 1 −ε, (ii) the probability that h ∈ H and ϕ(h) does not contain an m-cycle is at most ε/2, and (iii) the probability that h = Fail is at most ε/2. Notation 3.2. For the rest of the paper we assume that n,m,r and G are as in one of the lines of Table 1, noting that r is determined by G,n,m. Let M be an integer satisfying (2), let s,δ be real numbers satisfying (4), and let `,cδ,aδ and bM be as in (5), (3), (6) and (7) respectively. Let Sn act naturally on Ω = {1, 2, . . . ,n}. Let k and k0 be positive integers satisfying 2 ≤ k ≤ n/2, and 1 ≤ k0 ≤ k. A k0-element subset of Ω is called a k0-subset. We use the notation in Table 2 to describe an element g ∈ Sn, where γ0 is a k0-subset of Ω. Here we identify a cycle of g with the subset of Ω it permutes. ck0 (γ0,g) length of the g-cycle containing γ0 on k0-subsets s-small g-cycle g-cycle in Ω of length less than (rn)s s-large g-cycle g-cycle in Ω of length at least (rn)s ∆(g) union of g-cycles in Ω whose lengths divide rm Σ(g) Ω \ ∆(g) v cardinality of ∆(g) u cardinality of Σ(g) Table 2. Table for Notation 3.2 We define in Table 3 several classes of elements in G. We usually omit mentioning n and m in our notation. For example, we refer to N(n,m) (defined in Section 2) simply as N and to Ngood(G,n,m) simply as Ngood. N set of all g ∈ Sn that contain an m-cycle Ngood set of all g ∈N ∩G for which o(g) divides rm F set of all g ∈ G\N such that m | o(g) R set of all g ∈F such that |∆(g)| ≤ 4(rn)s S0 set of all g ∈F such that |∆(g)| > 4(rn)s and all g-cycles in ∆(g) are s-small S+1 set of all g ∈F such that |∆(g)| > 4(rn) s, exactly one g-cycle C in ∆(g) is s-large, and |∆(g) \C| > 3(rn)s S−1 set of all g ∈F such that |∆(g)| > 4(rn) s, exactly one g-cycle C in ∆(g) is s-large, and |∆(g) \C| ≤ 3(rn)s S≥2 set of all g ∈F such that |∆(g)| > 4(rn)s and at least two g-cycles in ∆(g) are s-large Table 3. Families of Elements 123 Identifying long cycles Remark 3.3. (a) The definition of aδ is not too critical. We simply need aδ to be greater than or equal to the right hand side of (6) for the values of rm we are considering, see Remark 8.2 and Lemma 8.3. For example, if rm ≥ c1/(s−δ)δ then we may take aδ = 25/4. (b) Currently Equation (8) limits the practical applicability of Theorem 3.1 severely, but we note that in our analysis we allow k to be as large as n/2. The first two inequalities of (8) imposed on n are due to the subdivision of the set of permutations of order divisible by m into disjoint subsets which depend on s. We give a uniform proof that holds for all values of k in the range 2 ≤ k 6= n/2. If, for example, k were bounded as n increases, then several of the arguments would be simpler and the constraints on n correspondingly less severe. (c) The main constraint forcing n to be very large is the third inequality in (8). For example, for our parameter choice in Theorem 3.1, namely M = 4,s = 17 24 and δ = 1 6 , we have cδ ≤ 138.32 and, for n large enough, aδ = 254 . In this case we find bM > 2·10 8 and the last inequality of (8) dictates n > 3.3·10112/ε12. Moreover, even though a larger value of M allows us to choose a smaller value for cδ, the choice might result in a smaller value for `, which in turn has undesired consequences, making bM larger, and hence requiring n to be larger. 4. Proof of the main theorem The proof of the main theorem, Theorem 3.1, relies on many supporting results. In this section we subdivide the proof into various parts and show how these parts are then brought together to give a complete proof. The individual parts of the proof are proved in later sections. The main idea of the proof is to divide the elements of Sn that could possibly be returned by FindMCycle into disjoint families, and to compute the probability that TraceCycle returns true for an element of each of these families. The families of elements in this subdivision are defined in Table 3, namely N ,R,S0,S+1 ,S − 1 ,S≥2, and we use the notation introduced in this table throughout the paper. Proof of Theorem 3.1(b). We prove this theorem by analysing the algorithm FindMCycle. Let N = ⌈ 5n log( 2 ε ) ⌉ . A call to algorithm FindMCycle can terminate in one of three possible ways: (G) For some i with 1 ≤ i ≤ N the i-th iteration of the for-loop returns an element in N . We call this a good outcome. (B) For some i with 1 ≤ i ≤ N the i-th iteration of the for-loop returns an element which is not in N . We call this a bad outcome. (U) The for-loop is executed N times and TraceCycle returns false for each of the selected random elements. In this case the algorithm returns Fail. We call this an ugly outcome. Thus to prove the three parts of Theorem 3.1 we must prove Prob(G) ≥ 1 −ε, Prob(B) ≤ ε/2, Prob(U) ≤ ε/2. Clearly any two of these inequalities implies the third. We shall therefore prove only Prob(B) ≤ ε/2 and Prob(U) ≤ ε/2. To study these outcomes more closely we define the following events. Ei the i-th iteration of the for-loop is executed. Let gi denote the random element selected in the i-th iteration. Gi event Ei occurs, gi ∈N and TraceCycle(gi) = true Bi event Ei occurs, gi /∈N and TraceCycle(gi) = true Ui event Ei occurs and TraceCycle(gi) = false 124 S. Linton, A.C. Niemeyer C.E. Praeger Note that Ei = Gi∪̇Bi∪̇Ui and that Prob(E1) = 1. Further, for i > 1 we have that Ei = U1 ∩ . . .∩Ui−1 = Ui−1. (9) Thus G = G1 ∨G2 ∨ . . .∨GN B = B1 ∨B2 ∨ . . .∨BN U = U1 ∧U2 ∧ . . .∧UN = UN. (10) Proof that Prob(U) ≤ ε/2: For a uniformly distributed random element g ∈ G, let p1 = Prob(TraceCycle(g) = false | g ∈Ngood) p2 = Prob(TraceCycle(g) = false | g /∈Ngood) and let p = ρ m p1 + m−ρ m p2, where ρ := ρ(G,n,m) (see Table 1), the proportion of elements of G containing an m-cycle that have order dividing rm. Note that, since the proportion of elements containing an m-cycle in Sn is 1/m, we have Prob(g ∈Ngood) = ρm. Given Ei, the event Ui is the disjoint union of the events Ui1, that gi ∈Ngood and TraceCycle(gi) = false, and Ui2, that gi 6∈Ngood and TraceCycle(gi) = false. Thus Prob(Ui | Ei) = ρ m Prob(TraceCycle(gi) = false | gi ∈Ngood) + m−ρ m Prob(TraceCycle(gi) = false | gi /∈Ngood) = ρ m p1 + m−ρ m p2 = p. Note, in particular, that this probability is independent of i. By (9) we have Ei = Ui−1, and hence Prob(Ui) = Prob(Ei)Prob(Ui | Ei) = Prob(Ui−1) ·p. As this is true for all i with 1 ≤ i ≤ N, we have Prob(Ui) = pi, (11) and in particular, Prob(U) = Prob(UN ) = pN. The required inequality Prob(U) ≤ ε/2 holds whenever pN ≤ ε/2. We now prove the latter inequality. By Proposition 7.5 we have 1 −p1 ≥ ( n−2 n )M . Therefore, p ≤ ρ m p1 + m−ρ m = 1 − ρ m (1 −p1) ≤ 1 − ρ n ( n− 2 n )M . (12) Now N = ⌈ 5n log( 2 ε ) ⌉ = ⌈ log((ε/2)−1) (5n)−1 ⌉ , and so by Lemma 5.2, (1 − 1 5n )N ≤ ε/2. Thus pN ≤ ε/2 holds if 1− ρ n ( n−2 n )M ≤ 1− 1 5n , or equivalently, if ( n n−2 )M ≤ 5ρ. Since ρ ≥ 9/40 (see Table 1), it is sufficient to prove that ( n n−2 )M ≤ 9 8 . By our assumption, M ≤ log( 9 8 ) n−2 2 , and hence M log ( n n− 2 ) = M log ( 1 + 2 n− 2 ) ≤ M 2 n− 2 ≤ log ( 9 8 ) and exponentiating both sides gives the required inequality. Thus pN ≤ ε/2 and hence Prob(U) ≤ ε/2 is proved. 125 Identifying long cycles Proof that Prob(B) ≤ ε/2: Recall the definition of B in (10). Note that, if TraceCycle(g) = true, then o(g) is divisible by m. Thus, by the definition of F, for a uniformly distributed, random element g ∈ G, q := Prob(g ∈F and TraceCycle(g) = true) (13) = Prob(g 6∈N and TraceCycle(g) = true). Now, for all i with 1 ≤ i ≤ N, we have that Prob(Bi | Ei) = Prob(gi 6∈N and TraceCycle(gi) = true) = q. Hence Prob(Bi) = Prob(Ei)Prob(Bi | Ei) = Prob(Ei) q. If i ≥ 2 then Ei = Ui−1 by (9), and so by (11), Prob(Bi) = pi−1q. Therefore, Prob(B) = N∑ i=1 Prob(Bi) = q N∑ i=1 pi−1 = q 1 −pN 1 −p < q 1 −p . (14) The most substantial part of the paper is devoted to finding an upper bound for q. It follows from Table 3 that F = R∪̇S0 ∪̇S+1 ∪̇S − 1 ∪̇S≥2. Hence q = q(R) + q(S0) + q(S+1 ) + q(S − 1 ) + q(S≥2), where We estimate these proportions in Sections 8 - 11. Recall the definition of ` in (5), and that ` > 1. q(R) = Prob(g ∈R and TraceCycle(g) = true) q(S0) = Prob(g ∈S0 and TraceCycle(g) = true) q(S+1 ) = Prob(g ∈S + 1 and TraceCycle(g) = true) q(S−1 ) = Prob(g ∈S − 1 and TraceCycle(g) = true) q(S≥2) = Prob(g ∈S≥2 and TraceCycle(g) = true). Table 4. Subdivision of the probability q of (13). Define bM (R) = ( 33 8 )M and note that q(R) = Prob(g ∈ R) · Prob(TraceCycle(g) = true | g ∈ R) ≤ Prob(TraceCycle(g) = true | g ∈R). Then Proposition 7.3 gives q(R) ≤ bM (R) nM(1−s) ≤ bM (R) n` . Define bM (S0) = aδc2δr 2s+2δ72. Then Proposition 8.4 and (3) give q(S0) ≤ bM (S0) n3−2s−2δ ≤ bM (S0) n` . Define bM (S+1 ) = aδc 3 δr 3δ6.24. Then Proposition 9.1 and (3) give q(S+1 ) ≤ bM (S+1 ) n1+s−3δ ≤ bM (S+1 ) n` . 126 S. Linton, A.C. Niemeyer C.E. Praeger Define bM (S≥2) = c2δr 2δ−2s. Then Proposition 10.1 gives q(S≥2) ≤ bM (S≥2) n2s−2δ ≤ bM (S≥2) n` . Define bM (S−1 ) = ( 31 r1−s )M . Then Proposition 11.1(b) yields q(S−1 ) ≤ bM (S−1 ) nM(1−s) ≤ bM (S−1 ) n` . Thus by (7), bM (R) + bM (S0) + bM (S+1 ) + bM (S≥2) + bM (S − 1 ) ≤ ( 33 8 )M + aδc 2 δr 2s+2δ72 + aδc 3 δr 3δ6.24 + c2δ r2s−2δ + ( 31 r1−s )M = bM and q ≤ bM n` . (15) Remark 4.1. We make a critical observation that the argument up to this point relies only on the first two inequalities of (8), and does not depend on the third inequality of (8). By (15) and the inequalities (14) and (12), we have that Prob(B) < q 1 −p ≤ bM n` ρ n ( n−2 n )M = bM ρ ( n n− 2 )M 1 n`−1 . We showed above that ( n n−2 )M ≤ 9 8 ≤ 5ρ. Thus Prob(B) < 5bM n`−1 . By assumption n ≥ ( 10bM ε )1/(`−1) and so this is at most ε/2. Hence Prob(B) < ε/2. The proof of Theorem 3.1(a) requires a short argument applying Theorem 3.1(b). Proof of Theorem 3.1(a). We use the algorithm TraceCycle with M = 4. Note first that the probability that a random element h ∈ H corresponds to an element g ∈ G containing an m-cycle, given that the h-cycles containing four random k-subsets λ1, . . . ,λ4 all have lengths of the form rim with ri | r, is Prob(g ∈N | TraceCycle(g) = true). Recall the definition of q in (13). Then Prob(g ∈N | TraceCycle(g) = true) = Prob(g ∈N and TraceCycle(g) = true) Prob(TraceCycle(g) = true) = Prob(TraceCycle(g) = true) −q Prob(TraceCycle(g) = true) = 1 − q Prob(TraceCycle(g) = true) ≥ 1 − q Prob(g ∈Ngood and TraceCycle(g) = true) = 1 − q Prob(TraceCycle(g) = true | g ∈Ngood) ·Prob(g ∈Ngood) . 127 Identifying long cycles Set s = 5 8 , δ = 1 24 and let ` = 1 + 1 6 . Note that ` = min{M(1 −s), 3 − 2s− 2δ, 1 + s− 3δ, 2s− 2δ}, so in particular the inequalities (4) and (5) all hold. We choose N0 to be the least natural number for which inequality (8) holds. Hence the inequality (2) holds and in particular also 12(rn)s + 6 ≤ n and (rn)s log(n) ≤ n. Inequality (15) holds by Remark 4.1, so we have q ≤ b4 n` , where, since M = 4, the constant b4 given by (7), satisfies b4 ≤ ( 33 8 )4 + 72 aδc 2 δr 5/3 + 6.24 aδc 3 δr 3/8 + c2δ r7/6 + ( 31 r7/24 )4 . By Proposition 7.5 we have that Prob(TraceCycle(g) = true | g ∈ Ngood) ≥ ( n−2 n )4 . Also, by Equation (1), Prob(g ∈Ngood) = ρ(G,n,m) m . Hence, using n ≥ N0, and the displayed inequality above, we have Prob(g ∈N | TraceCycle(g) = true) ≥ 1 − b4 n1+ 1 6 ( n n− 2 )4 m ρ(G,n,m) ≥ 1 − ( N0 N0 − 2 )4 b4 ρ(G,n,m) · 1 n 1 6 = 1 − c n 1 6 , where c = ( N0 N0−2 )4 b4 ρ(G,n,m) . 5. Preliminaries It is useful to collect together some of the arithmetic facts we use in the rather delicate estimations in the remaining sections. Lemma 5.1. Let n,m,r be as in one of the lines of Table 1, and let d be a divisor of rm with d ≤ n. Then either d = m, or d ≤ 2m/7, or r,d are as in Table 5. In particular, either d ≤ 2m/7 or d is one of r d 1 m 3 m 2 2 m 3 2m 5 2m 3 3 3m 5 3m 7 Table 5. possibilities for r and d at most 3 different divisors of rm greater than 2m/7 and in the latter case d ≤ 2m/3 ≤ 2n/3. Proof. We have d = r0 mj , where r0 divides r and j divides m. If j = 1 then d = m since 2m ≥ 2(n− 6) > n. So assume j ≥ 2. Assume also that d > 2m/7, or equivalently 7r0 > 2j. If m is even, then (see Table 1) r = 1. Hence r0 = 1 and j ≤ 2. Thus d = m/2 or m/3 as in Table 5. So assume now that m is odd, so j ≥ 3. If j = 3 then we have the examples (r,d) = (1, m 3 ), (2, m 3 ), (1, 2m 3 ) in Table 5 and no others since if r = 3 then (see Table 1) gcd(m, 6) = 1. Now assume that j ≥ 5. Then r0 > 1 and we find (r,d) = (2, 2m 5 ), (3, 3m 5 ), (3, 3m 7 ) in Table 5 and no others (since gcd(m, 6) = 1 when r = 3.). The next result follows from the fact that log(1 −p) > −p for 0 < p < 1. 128 S. Linton, A.C. Niemeyer C.E. Praeger Lemma 5.2. Let ε,p be real numbers such that 0 < ε < 1 and 0 < p < 1. Set N(ε,p) := ⌈ log(ε−1) p ⌉ . If m ≥ N(ε,p) then (1 −p)m ≤ ε. Lemma 5.3. Let s be a real number with 1 2 < s < 1 and n,r,t positive integers such that 12(rn)s + 6 ≤ n. Then (i) ms/n < ns/n < (rn)s/n < 1/12. (ii) n ≥ 156. (iii) 2(rn)s − t > 24−t 12 (rn)s. (iv) if s = 2/3 then n ≥ 1746. Proof. (i) This follows directly from 12(rm)s < 12(rn)s < 12(rn)s + 6 ≤ n. (ii) As s > 1/2 and r ≥ 1 we have 12 √ n + 6 ≤ 12 √ rn + 6 < 12(rn) s + 6 ≤ n. An easy calculation shows that this implies n ≥ 156. (iii) Note that n ≥ 156 implies ns > n1/2 ≥ √ 156 > 12 and so 2(rn)s−t = (2rs− t ns )ns > (2rs− t 12 )ns = 24rs−t 12 ns ≥ 24−t 12 rsns. (iv) By calculator. The next inequalities are easily verified. Lemma 5.4. Let x ∈ R with x > 12. Then (a) x ( 1 2 )x < 1 4x , and (b) ( 11 12 )x < 5 x . For the estimates in our last arithmetic result Lemma 5.6, we first restate how to estimate sums via integrals. Lemma 5.5. Let a,b ∈ Z with a < b, and let f(x) be a function defined on the interval [a − 1,b + 1], satisfying one of the lines of Table 6. Then b∑ x=a f(x) ≤ ∫ b+ε a−δ f(t)dt. conditions on f δ ε increasing in [a,b + 1] 0 1 decreasing in [a− 1,b] 1 0 non-negative in [a− 1,b + 1] and for some c ∈ (a,b) decreasing in [a− 1,c] and increasing in [c,b + 1] 1 1 Table 6. Conditions on f 129 Identifying long cycles Lemma 5.6. Let a,c ∈ R+ and n ∈ Z+ with n > a > c + 2 ≥ 3, and let t,` ∈ Z+ with t ≥ 2 and t ≥ `. Then, summing over integers x in the interval (a,n], ∑ a ` the function f(x) = x t (x−c)` is decreasing on (a, tc t−`] and increasing on [ tc t−`,n], while if t = ` then f(x) is decreasing on (a,n]. In either case, by Lemma 5.5 we have∑ a 1, and let c,` be integers such that 1 ≤ ` < c. Then( ca− 1 a− 1 )( c ` ) ≤ ( ca `a ) . Proof. The proof is by induction on `, for fixed c,a. Since ( c ` ) = ( c c−` ) and ( ca `a ) = ( ca (c−`)a ) , it is sufficient to prove this for 1 ≤ ` ≤ bc/2c. Suppose first that ` = 1. Here it is straightforward to check that ( ca− 1 a− 1 )( c 1 ) = ( ca a ) . 130 S. Linton, A.C. Niemeyer C.E. Praeger Now suppose that 1 ≤ ` < bc/2c and that the inequality holds for `. Then, using induction we have( ca− 1 a− 1 )( c ` + 1 ) = ( ca− 1 a− 1 )( c ` ) c− ` ` + 1 ≤ ( ca `a ) c− ` ` + 1 . This latter quantity is at most ( ca (`+1)a ) if and only if c− ` ` + 1 · 1 (`a)!(ca− `a)! ≤ 1 (`a + a)!(ca− `a−a)! (16) and this is equivalent to c− ` ` + 1 ≤ ca− `a `a + a · ca− `a− 1 `a + a− 1 . . . ca− `a−a + 1 `a + 1 . Now the first factor on the right hand side is equal to (c− `)/(` + 1), and each of the other factors is at least 1 since c ≥ 2` + 1. Thus the inequality (16) holds, and so the induction proof is complete. Lemma 6.2. (a) For 2 ≤ k ≤ d < n we have( d k ) ≤ ( d n )k ( n k ) and moreover, if d ≤ αn for some α < 1 then( d k )( n k ) ≤ αk−1 d−k + 1 n−k + 1 ≤ αk. (b) For 2 ≤ k ≤ 2n/3 we have ( bn/2c bk/2c ) < 2 ( n k )( 3k 4n )dk/2e . Proof. Every part of the proof depends on the following observation: Fact 1: For 0 ≤ i ≤ t ≤ n with i < n we have t−i n−i ≤ t n with strict inequality if t < n. For (a) observe that ( v k )( n k ) = k−1∏ i=0 v − i n− i ≤ k−1∏ i=0 v n = (v n )k . If d ≤ αn for some α < 1 then( d k )( n k ) = ( k−2∏ i=0 d− i n− i ) · d−k + 1 n−k + 1 ≤ ( k−2∏ i=0 αn− i n− i ) · d−k + 1 n−k + 1 . Now, again by Fact 1, (αn− i)/(n− i) ≤ α and d−k+1 n−k+1 ≤ d n ≤ α, and therefore( d k )( n k ) ≤ αk−1 d−k + 1 n−k − 1 ≤ αk. 131 Identifying long cycles For (b) let n0 = bn/2c and k0 = bk/2c. Note then that( n0 k0 )( n k ) = n0(n0 − 1) · · ·(n0 −k0 + 1) n(n− 1) · · ·(n−k + 1) k(k − 1) · · ·(k0 + 1) = k0−1∏ i=0 n0 − i n− i · k−1∏ j=k0 k + k0 − j n− j . Now k + k0 ≤ 2n/3 + n/3 ≤ n. Applying Fact 1 with t = n0 to the first product and with t = k + k0 to the second, we obtain ( n0 k0 )( n k ) ≤ k0−1∏ i=0 n0 n · k−1∏ j=k0 k + k0 n = (n0 n )k0 · ( k + k0 n )k−k0 ≤ ( 1 2 )k0 · ( 3k 2n )k−k0 = 3dk/2e 2k · ( k n )dk/2e ≤ 2 · 3dk/2e 4dk/2e · ( k n )dk/2e . Note that the first inequality is strict if either k0 ≥ 2 or k − 1 > k0, that is, if k ≥ 3. If k = 2 then( n0 k0 ) = bn/2c, while 2 ( n k )( 3k 4n )dk/2e = 3 2 (n− 1) > bn/2c. Thus (b) is proved for all k. Lemma 6.3. Let d,k,t be positive integers and a > 0 such that k ≤ d and t d−k+1 ≤ a. Then (d + t)(d + t− 1) . . . (d + t−k + 1) < d(d− 1) . . . (d−k + 1)(1 + (1 + a)kt a(d−k + 1) ). Proof. Note first that (d + t)(d + t− 1) . . . (d + t−k + 1) = d(1 + t d )(d− 1)(1 + t (d− 1) ) . . . (d−k + 1)(1 + t (d−k + 1) ) ≤ d(d− 1) . . . (d−k + 1)(1 + t (d−k + 1) )k. Set x = t d−k+1, so 0 < x ≤ a. Then (1 + x)k = k∑ j=0 ( k j ) xj = 1 + x k∑ j=1 ( k j ) xj−1 ≤ 1 + x k∑ j=1 ( k j ) aj−1 132 S. Linton, A.C. Niemeyer C.E. Praeger ≤ 1 + x a k∑ j=0 ( k j ) aj = 1 + x a (1 + a)k. Now we state and prove the result on partitions. Proposition 6.4. Let U be a finite set of size u > 1, and let P be a partition of U in which all parts have size at least 2. For 2 ≤ k0 ≤ u, let NP(k0) denote the number of k0-subsets of U that are unions of parts of P. Then NP(k0) ≤ (bu/2c bk0/2c ) , and moreover, if k0 is odd and u is even, then u ≥ 4 and NP(k0) ≤ ( (u−2)/2 (k0−1)/2 ) . In particular, NP(k0) = 1 if k0 = u and NP(k0) ≤ 1u−1 ( u k0 ) otherwise. Proof. First we construct a partition P′ of U having at most two parts of size 1, and all parts of size at most 2. Start with P′ = ∅ and run through the parts of P. For each part P ∈P of even size, choose any partition of P with all parts of size 2, and add the parts of this partition to P′. If all parts of P have even size, then the construction of P′ is completed in this way. So suppose that P has at least one part of odd size. In this case P′ will have 1 or 2 parts of size 1, and its construction is completed as follows. For each part P ∈ P of odd size p := |P |, add (p− 1)/2 parts of size 2 to P′ formed from p− 1 of the points of P. Let P1, . . . ,Pr be the odd length parts of P. Pair up the remaining r points into parts of size 2 and add them to P′, leaving exactly 1 or 2 of these points to form singleton parts of P′. Next we define, for each k0-subset η of U that is a union of parts of P, a k0-subset η′ that is a union of parts of P′. Note that if k0 is odd then η must contain a part of P of odd size, and in this case P′ has one or two singleton parts. If k0 is odd and P′ has two singleton parts, then we choose one of them, and we always place this chosen singleton part in η′. To define η′ for a given η, we start with η′ = ∅ and build it up by considering in turn each of the parts P of P contained in η. If |P | is even, then P is a union of parts of P′ of size 2, and we add all of these parts to η′. If |P | is odd, then we add to η′ all the parts of size 2 of P′ contained in P . At this stage |η′| = k0 −`, where ` is the number of odd sized parts of P contained in η. Next we add to η′ up to b`/2c parts of P′ of size 2 that contain points from two different parts of P. If η′ cannot be completed in this way then either (i) ` is odd, or (ii) ` is even and is equal to the number of odd sized parts of P. Case (i) occurs if and only if k0 is odd, and here we add to η′ the designated singleton part of P′. In case (ii) there are two singleton parts of P′, and we add to η′ these two singleton parts. Note that, if ` ≥ 2, then we may have had some freedom in choosing the b`/2c parts of P′ of size 2 that contain points from two different parts of P, so η′ may not be determined uniquely by η. On the other hand, η′ always determines η uniquely, since η is the union of the parts of P that have at least two points in η′. Thus distinct sets η correspond to distinct sets η′. It follows that NP(k0) ≤ N′ where N′ is the number of k0-subsets γ ⊆ U such that γ is a union of parts of P′ and in addition, if k0 is odd and P′ has two singleton parts, then γ contains a designated one of these singleton parts. Suppose that γ is such a k0-subset. If P′ has at most one part of size 1, then γ contains bk0/2c of the parts of P′ of size 2 (and also a singleton part if k0 is odd). Thus N′ ≤ (bu/2c bk0/2c ) . Note that in this case, if k0 were odd, then P would have at least one odd part, and so P′ would have exactly one odd part, whence u would be odd. Thus the first assertion is proved in this case. So suppose that P′ has two singleton parts, in which case u is even. If k0 is odd then k0 ≥ 3 and γ consists of bk0/2c of the parts of P′ of size 2 and the designated singleton part, whence u ≥ 4 and N′ ≤ ( (u−2)/2 (k0−1)/2 ) < (bu/2c bk0/2c ) . On the other hand, if k0 is even then γ consists of k0/2 of the two-point parts (or k0/2 − 1 parts of size two and the two singleton parts). Again N′ ≤ (bu/2c bk0/2c ) . This proves the first assertion in all cases. Note that bu/2c = bk0/2c if and only if either k0 = u, or k0 = u−1 with u odd. If k0 = u obviously NP(k0) = N ′ = 1. If k0 = u− 1 with u odd then P′ has a unique part of size 1 and its complement is 133 Identifying long cycles the unique k0-subset of U that is a union of parts of P′ - it may or may not be a union of parts of P. Thus NP(k0) ≤ N′ = 1 ≤ 1u−1 ( u k0 ) . So suppose from now on that bk0/2c < bu/2c, and set u1 = bu/2c and k1 = bk0/2c. Then (bu/2c bk0/2c ) =( u1 k1 ) , and by Lemma 6.1, this is at most 1 2u1−1 ( 2u1 2k1 ) . If k0 and u are even, then k0 < u and this quantity is at most 1 u−1 ( u k0 ) . If k0 is even and u is odd, then 2 ≤ k0 and 12u1−1 ( 2u1 2k1 ) = 1 u−2 ( u−1 k0 ) . This in turn is at most 1 u−1 ( u k0 ) . Now suppose k0 and u are odd. Then bk0/2c < bu/2c implies k0 ≤ u − 2, and 1 2u1−1 ( 2u1 2k1 ) = 1 u−2 ( u−1 k0−1 ) which is at most 1 u−1 ( u k0 ) . Finally consider k0 odd and u is even. As shown above u ≥ 4 and N′ ≤ ( (u−2)/2 (k0−1)/2 ) . By Lemma 6.1, this is at most 1 u−3 ( u−2 k0−1 ) , which in turn is at most 1 u−1 ( u k0 ) . For a prime p and an integer n, let np denote the p-part of n, that is the highest power of p dividing n. Recall that, for a positive integer k0 ≤ n, a k0-subset γ′ of Ω, and an element g ∈ Sn, we denote by ck0 (γ ′,g) the length of the g-cycle containing γ′ in the action of g on k0-sets. Lemma 6.5. Let g ∈ Sn, let C be a g-cycle of length t, let k0 be a positive integer such that k0 ≤ t and let p be a prime dividing t. (a) Suppose that γ′ is a k0-subset of C such that the p-part tp does not divide ck0 (γ ′,g). Then γ′ is a union of Z(C,p)-orbits, where Z(C,p) is the subgroup of order p of the cyclic group 〈gC〉 ∼= Zt induced by g on C. In particular p divides gcd(k0, t). (b) The number σ(k0,C) of k0-subsets γ′ of C such that tp does not divide ck0 (γ ′,g) is at most ( bt/2c bk0/2c ) , and in particular, is 1 if k0 = t, and at most 1t−1 ( t k0 ) if k0 < t. Proof. (a) Since tp does not divide ck0 (γ ′,g) and 〈gC〉∼= Zt, it follows that the setwise stabiliser H of γ′ in 〈gC〉 contains the unique subgroup Z(C,p) of 〈gC〉 of order p. As γ′ is H-invariant, γ′ is a union of H-orbits in C, and hence γ′ is a union of Z(C,p)-orbits in C. In particular, p divides k0 as well as t. (b) If k0 = t then C is its unique k0-subset and σ(k0,C) = 1. If k0 < t then, by Proposition 6.4, σ(k0,C) ≤ ( bt/2c bk0/2c ) and also σ(k0,C) ≤ 1t−1 ( t k0 ) . Corollary 6.6. Let G,n,m,r be as in one of the lines of Table 1, and let g ∈ G. Let Σ(g) be as in Table 2 with u = |Σ(g)|, and let k0 be a positive integer such that k0 ≤ u. Then the number σ(k0, Σ(g)) of k0-subsets γ′ of Σ(g) such that ck0 (γ ′,g) divides rm satisfies σ(k0, Σ(g)) ≤   0 if k0 = 1 1 if k0 = u 1 u−1 ( u k0 ) if 1 < k0 < u. Proof. For each g-cycle C in Σ(g), by the definition of Σ(g), |C| does not divide rm, and hence there exists a prime p(C) such that |C|p(C) does not divide rm. Let Z(C,p(C)) denote the subgroup of order p(C) of the cyclic group 〈gC〉 induced by g on C, let P(C) denote the set of Z(C,p(C))-orbits in C (all of length p(C)), and let P = ∪CP(C) denote the corresponding partition of Σ(g). Suppose that γ′ is a k0-subset of Σ(g), and for each g-cycle C in Σ(g), let k(C) = |γ′ ∩ C|. Then ck0 (γ ′,g) is the least common multiple of ck(C)(γ′ ∩C,g), over all g-cycles C such that k(C) 6= 0. Note that ck(C)(γ′ ∩C,g) divides |C|. Suppose now that ck0 (γ ′,g) divides rm. Then for each C such that k(C) 6= 0, also ck(C)(γ′ ∩C,g) divides rm, and hence |C|p(C) does not divide ck(C)(γ′∩C,g). By Lemma 6.5, γ′∩C is a union of parts of P(C). Thus γ′ is a union of parts of P. Since all parts of P have size at least 2, this implies that σ(k0, Σ(g)) = 0 if k0 = 1, and the inequality for 1 < k0 ≤ u follows from Proposition 6.4. 134 S. Linton, A.C. Niemeyer C.E. Praeger 7. Tracing k-subsets For the remainder of this paper we assume that k is an integer with 2 ≤ k ≤ n/2. We use ∆(g), Σ(g) and other notation introduced in Tables 2 and 3. Further, we use without further reference the number M of independent uniformly distributed random k-subsets in Algorithm 2 TraceCycle, where M satisfies (2), in particular M ≥ 4. Proposition 7.1. Let G,n,m,r be as in one of the lines of Table 1, and suppose that g ∈ G does not contain an m-cycle. Set v = |∆(g)| and suppose that v ≤ n−k − 1. Then the proportion of k-subsets γ of Ω such that ck(γ,g) = r0m, for some r0 dividing r, is at most v k nk + 1 n−v−1. Proof. Set u = n−v = |Σ(g)|. Suppose that γ is a k-subset of Ω such that ck(γ,g) = r0m for some r0 dividing r, and set k0 := |γ ∩ Σ(g)|. Then k0 ≤ min{k,u}. By assumption, v ≤ n − k − 1 and so u = n−v ≥ k + 1 and k0 ≤ min{k,u} = k. Also ck0 (γ ∩ Σ(g),g) divides ck(γ,g), and hence divides rm. By Corollary 6.6, the number σ(k0, Σ(g)) of k0-subsets γ′ of Σ(g) such that ck0 (γ ′,g) divides rm is 0 if k0 = 1, 1 if k0 = u, and at most 1u−1 ( u k0 ) , otherwise. If k0 = 0 then γ is one of the ( v k ) k-subsets of ∆(g). Thus the number of possibilities for γ is at most( v k ) + k∑ k0=2 σ(k0, Σ(g)) ( n−u k −k0 ) ≤ ( v k ) + 1 u− 1 k∑ k0=2 ( u k0 )( n−u k −k0 ) < ( v k ) + 1 u− 1 ( n k ) . Now u− 1 = n−v− 1, hence the above is ( v k ) + 1 n−v−1 ( n k ) . By Lemma 6.2(a), ( v k ) is at most (v/n)k ( n k ) , which completes the proof. Lemma 7.2. Let G,n,m,r be as in one of the lines of Table 1. Let g be a uniformly distributed random element of G, and suppose that g does not contain an m-cycle, and that v = |∆(g)| ≤ n−k − 1. Then the following both hold. (a) Prob(TraceCycle(g) = true) ≤ 2M (( vk nk )M + ( 1 n−v−1 )M) , (b) Prob(TraceCycle(g) = true) ≤ 16 max {( v n )4 , ( 1 n−v−1 )4} . Moreover, if 3 ≤ v ≤ n− 3 then Prob(TraceCycle(g) = true) ≤ 16 ( v n )4 . Proof. Now TraceCycle(g) = true if and only if ck(γ,g) = r0m, for some r0 dividing r, for each of the M independent uniformly distributed random k-sets γ tested during the algorithm. Thus if g does not contain an m-cycle, the probability that TraceCycle(g) = true is pM, where p is the proportion of k- subsets γ such that ck(γ,g) = r0m for some r0 dividing r. By Proposition 7.1, p ≤ v k nk + 1 n−v−1 . Note that pM ≤ p4 since p ≤ 1 and M ≥ 4. Set x = v k nk and y = 1 n−v−1 . If x ≤ y then (x + y) M ≤ (2y)M = 2MyM, and similarly if x ≥ y then (x + y)M ≤ 2MxM. It follows that pM ≤ 2M (xM + yM ), proving part (a). For (b), we observe that pM ≤ p4 ≤ (x + y)4 ≤ (2 max{x,y})4 = 16 · max{x,y}4. Part (b) follows on noting that x ≤ v/n (since v ≤ n). Finally suppose that 3 ≤ v ≤ n − 3. Then n ≥ v + 3 ≥ v + 2 + 2 v−1 so n(v−1) ≥ v 2 + v and hence (n−v−1)v ≥ n, that is, v n ≥ 1 (n−v−1) . The last assertion now follows from part (b). 135 Identifying long cycles Now we analyse the effect of TraceCycle applied to elements of R. Proposition 7.3. Let G,n,m,r be as in one of the lines of Table 1 and suppose that 12(rn)s + 6 ≤ n. Then, for a uniformly distributed random element g ∈ G, Prob(TraceCycle(g) = true | g ∈R) ≤ ( 33 8n1−s )M . Proof. By definition, for g ∈ R, v = |∆(g)| ≤ 4(rn)s and g does not contain an m-cycle. By our assumptions on n and k and the hypothesis, we have n−k − 1 ≥ n/2 − 1 > 4(rn)s ≥ v. Thus by Proposition 7.1, the proportion of k-subsets γ such that ck(γ,g) = r0m, for some r0 dividing r, is at most v k nk + 1 n−v−1 ≤ (4rs)k nk(1−s) + 1 n−4(rn)s−1 . Now TraceCycle(g) = true if and only if ck(γ,g) = r0m, for some r0 dividing r, for each of M independent uniformly distributed random k-sets γ tested during the algorithm. Thus, given g ∈R, the probability of this occurring is at most( (4rs) k nk(1−s) + 1 n− 4(rn)s − 1 )M . Now 12(rn)s < n, that is to say, 4r s n1−s < 1 3 . Also k ≥ 2, r ≤ 3 and s < 1. Therefore ( 4r s n1−s )k ≤ ( 4r s n1−s )2 < 4rs 3n1−s < 4 n1−s . Also, by assumption, n − 4(rn)s − 1 ≥ 8(rn)s + 5 > 8rsns > 8rsn1−s. Therefore, the probability that TraceCycle(g) = true is at most( 4 n1−s + 1 8rsn1−s )M ≤ ( 33 8n1−s )M . Next we analyse the effect of TraceCycle applied to elements of Ngood (defined in Table 3). Lemma 7.4. Let G,n,m,r be as in one of the lines of Table 1, and let k0 be an integer satisfying 0 ≤ k0 ≤ k. Let g ∈N and let C be the m-cycle contained in g. Then the number of k0-subsets of C that can occur as γ ∩C, for a k-subset γ of Ω such that ck(γ,g) is not divisible by m, is at most σk0 =   1 if k0 = 0 and k ≤ n−m 0 if gcd(m,k0) = 1 or if k0 < k −n + m ω(gcd(m,k0)) (bm/2c bk0/2c ) if gcd(m,k0) > 1 and k0 ≥ max{1,k −n + m} where ω(d) is the number of distinct prime divisors of an integer d. Proof. Let σ′ be the number of k0-subsets of C that can occur as γ ∩ C, for a k-subset γ of Ω such that ck(γ,g) is not divisible by m. Note that, if γ is such a k-subset, then γ \ C is contained in the complement C of C and hence k = |γ| ≤ k0 + |C| = k0 + n−m. Thus if k0 < k −n + m then σ′ = 0. Also if k0 = 0 ≥ k −n + m, then γ ∩C = ∅ so σ′ ≤ 1. Suppose now that k0 > 0 and k0 ≥ k −n + m, that is, k0 ≥ max{1,k −n + m}. Let γ be such that ck(γ,g) is not divisible by m. Then ck0 (γ ∩C,g) properly divides m, and hence there exists a prime p dividing m such that the p-part mp does not divide ck0 (γ∩C,g). By Lemma 6.5(a), p divides gcd(m,k0) (and in particular if gcd(m,k0) = 1 then σ′ = 0). If such a prime p exists then, by Lemma 6.5(b), the number of k0-subsets γ ∩C such that mp does not divide ck0 (γ ∩C,g) is at most(bm/2c bk0/2c ) . Finally there are at most ω(gcd(m,k0)) primes p to consider, and the proof is complete. 136 S. Linton, A.C. Niemeyer C.E. Praeger Proposition 7.5. Let G,n,m,r be as in one of the lines of Table 1 and suppose that g ∈ Ngood, and 12(rn) s + 6 ≤ n. Then the proportion of k-subsets γ of Ω such that ck(γ,g) 6= mr0, for any r0 dividing r, is at most k∑ k0=max{k−(n−m),0} σk0 ( n−m k −k0 ) / ( n k ) ≤ √ 8k ( 3k 4m )dk/2e , where σk0 is as in Lemma 7.4. Moreover, for a uniformly distributed random element g ∈ G, Prob(TraceCycle(g) = true | g ∈Ngood) ≥ ( n− 2 n )M . Proof. Let C denote the m-cycle in g and let γ be a k-subset of Ω such that ck(γ,g) 6= mr0 for any r0 dividing r. By the definition of Ngood, this implies that ck0 (γ ∩ C,g) is not divisible by m, where k0 = |γ ∩C|. Now 0 ≤ k0 ≤ min{k,m} = k, and moreover k0 ≥ k− (n−m) since γ ⊆ (γ ∩C) ∪ (Ω \C). Given γ ∩ C, there are at most ( n−m k−k0 ) choices for γ \ C. Hence, by Lemma 7.4, the number of such k-subsets γ is at most X := k∑ k0=max{k−n+m,0} σk0 ( n−m k −k0 ) (17) where σ0 = 1, and σk0 = ω((gcd(m,k0)) (bm/2c bk0/2c ) for k0 > 0. Now ω(gcd(m,k0)) ≤ ω(k0) ≤ √ 2k0 ≤ √ 2k (see for example, [11, p. 395]). Hence, X ≤ √ 2k ∑k k0=max{k−n+m,0} (bm/2c bk0/2c )( n−m k−k0 ) and by Lemma 6.2(b), we have, X ≤ √ 2k k∑ k0=max{k−n+m,0} 2 ( m k0 )( 3k0 4m )dk0/2e(n−m k −k0 ) ≤ √ 8k ( 3k 4m )dk/2e k∑ k0=max{k−n+m,0} ( m k0 )( n−m k −k0 ) ≤ √ 8k ( 3k 4m )dk/2e( n k ) and hence the proportion X/ ( n k ) ≤ p where p := √ 8k ( 3k 4m )dk/2e . Now we consider the final assertion. Note that TraceCycle(g) = true if and only if, for each of the M independent uniformly distributed random k-subsets γ tested, we have ck(γ,g) = r0m for some r0 dividing r. The class Ngood is, for some lines of Table 1, a union of several conjugacy classes of elements of Sn, say Ngood = ∪CN(C). For g ∈N(C), the proportion p(C) of k-subsets γ of Ω, such that ck(γ,g) 6= r0m for any r0 dividing r, may depend on the class C, although, as we have shown above, p(C) ≤ p for all C. Thus, given g ∈ N(C), the probability that TraceCycle(g) = true is (1 − p(C))M ≥ (1 − p)M. This implies that Prob(TraceCycle(g) = true | g ∈Ngood) ≥ (1 −p) M . Thus to complete the proof it is sufficient to prove that p ≤ 2 n for some upper bound p of X/ ( n k ) . Note that, by Lemma 5.3(ii), m ≥ n − 6 ≥ 150. Suppose first that 4 ≤ k ≤ n 2 . We consider the function F(x) = ( 3x 4m )x 2 = e x 2 log 3x 4m on the interval [4, n 2 ]. Note that 3x 4m ≤ 3n 8m < 1 and k 2 ≤ dk 2 e, so 137 Identifying long cycles F(k) ≥ ( 3k 4m )dk/2e , and hence p ≤ √ 8kF(k). Differentiating we have F ′(x) = F(x) 1 2 ( log( 3x 4m ) + 1 ) , and since F(x) > 0 for x > 0, it follows that F(x) has a unique minimum at log 3x 4m = −1, that is, when x = 4m 3e (which may or may not lie in the interval [4, n 2 ]). Thus the maximum of F(x) on the interval [4, n 2 ] occurs at one of the endpoints. We claim that max{F(4),F(n 2 )} < 1 n3/2 . It follows from a proof of this claim that p ≤ √ 8kF(k) ≤ √ 8k 1 n3/2 ≤ 2 n , since k ≤ n 2 . Since m ≥ n − 6 ≥ 150, we have m2 > 9n3/2, which implies that F(4) = ( 3 m )2 < 1 n3/2 . Also 3n 8m ≤ 3 8 + 6 8m < 1 2 , and n3/2 < 2n/4. Then, applying Lemma 5.4(a), we find F( n 2 ) = ( 3n 8m )n/4 < ( 1 2 )n/4 < 1 n3/2 proving the claim for k ≥ 4. For the remaining cases where k = 2 or 3, note that ω(gcd(m,k0)) ≤ 1 for 1 ≤ k0 ≤ 3, σk0 = 0 when k0 = 1, n ≥ 156, and n−m ≤ 6. If k = 2 then by (17), X( n 2 ) ≤ (62)(n 2 ) + (bm/2c1 ) ·(60)(n 2 ) ≤ 15 · 2 155 · 1 n + m n− 1 · 1 n < 2 n . If k = 3 then, again by (17), X( n 3 ) ≤ (63)(n 3 ) + (bm/2c1 )(61)(n 3 ) + (bm/2c1 )(60)(n 3 ) ≤ 20 · 6 154 · 155 · 1 n + 3 ·m(6 + 1) 154(n− 1) · 1 n < 2 n . 8. Bounding S0 Let G,m,n,r be as in one of the lines of Table 1, so G is An or Sn. To estimate the probability of a uniformly distributed random element g ∈ G being in S0 or S+1 , and TraceCycle(g) = true we use the following result from [8]. Recall the definitions of an s-small and an s-large cycle and of v from Notation 3.2. Let i ∈{1, 2, 3}. In the next two sections we use the following notation: Notation 8.1. 1. For v ≥ 1 let P(v,rm) denote the proportion of elements of Sv of order dividing rm, and let P(0,rm) = 1. 2. For v ≥ 1 let P0(v,rm) denote the proportion of elements of Sv of order dividing rm, all of whose cycles are s-small, and let P0(0,rm) = 1. 3. Let P +1 (v,rm) denote the proportion of elements g ∈ Sv of order dividing rm, and such that g has exactly one s-large cycle of length d, say, where in addition, d satisfies (rn)s ≤ d < v − 3(rn)s. 4. Let D denote the set of all divisors of rm which are at most n. 5. Let D+1 (v) denote the set of all divisors d of rm satisfying (rn) s ≤ d < v − 3(rn)s. Note that r = 1 or r is a prime. Hence the number d(rm) of positive divisors of rm is at most 2d(m), as d|rm if and only if either d|m or d = rd0 and d0|m. Remark 8.2. The following result is essentially [8, Lemma 2.4]. Suppose that s, δ and cδ are as in Notation 3.2. In particular, s > δ. In Lemma 8.3 we may use as a′δ any constant such that a ′ δ ≥ 138 S. Linton, A.C. Niemeyer C.E. Praeger a′δ m0 25/4 c 1/(s−δ) δ aδ as in (6) 150 Table 7. Possible values of a′δ for Lemma 8.3 5 4 (1 + 3 cδ (rm)s−δ + ( cδ (rm)s−δ )2 ) for all sufficiently large values of rm, say rm ≥ m0. These conditions hold in particular for a′δ,m0 in one of the lines of Table 7. Note that, for the proof of Theorem 3.1, we have n ≥ 156 by Lemma 5.3(ii), so rm ≥ 150, as in line 2 of Table 7. Lemma 8.3. Let m,n,r be as in one of the lines of Table 1. Further, let v ≥ 16 and s, δ, cδ and aδ be as in Notation 3.2. Let a′δ and m0 be as in one of the lines of Table 7 (or more generally as in Remark 8.2) and suppose that rm ≥ m0. Then (a) P0(v,rm) < a′δd(rm)r 2sn2s v3 . (b) If 3(rn)s < v then P0(v,rm) ≤ a′δd(rm) 2r2sn2s v(v − (rn)s)3 . (c) P +1 (v,rm) = ∑ d∈D+1 (v) 1 d P0(v −d,rm). Proof. This result follows from [8, Lemma 2.4] and its proof. A direct application of [8, Lemma 2.4] would require that rm ≥ v, which we cannot guarantee to hold. However, the proof of that lemma shows, without the assumption that rm ≥ v, that P0(v,rm) ≤ d(rm)(rm) 2s (1 + 3cδ(rm) δ−s + (cδ(rm) δ−s )2) v(v − 1)(v − 2) whenever v ≥ 3. Statement (a) follows from this, since m ≤ n, δ < s and, for v ≥ 16, v(v−1)(v−2) > 4 5 v3. To prove (b) we let Ds denote the set of all divisors d of rm such that d < min{v, (rn) s}. By [8, Lemma 2.3(a)] we have that P0(v,rm) = 1v ∑ d∈Ds P0(v−d,rm), where P0(j,m) = 0 for j ≤ 0. Since, using Lemma 5.3(ii), v−d > 3(rn)s−(rn)s > 24 for d ∈ Ds, we have by (a) that P0(v−d,rm) ≤ a′δd(rm)r 2sn2s (v−d)3 . Thus P0(v,rm) ≤ 1v ∑ d∈Ds a′δd(rm)r 2sn2s (v−d)3 . Since v −d > v − r sns > 0 for d ∈ Ds, and |Ds| ≤ d(rm), we have P0(v,rm) ≤ a′δd(rm) 2r2sn2s v(v−(rn)s)3 . Finally, we prove (c). The number of permutations in Sv of order dividing rm with exactly one s-large cycle of a given length, d say, where d divides rm and (rn)s ≤ d < v−3(rn)s is ( v d ) (d−1)!P0(v− d,rm)(v − d)!. Hence the proportion in Sv of such permutations is 1dP0(v − d,rm). Summing over all d ∈ D+1 (v) yields the desired result. Proposition 8.4. Let G,m,n,r be as in one of the lines of Table 1. If 12(rn)s+6 ≤ n and (rn)s log(n) ≤ n then, for a uniformly distributed random element g ∈ G, Prob(g ∈S0 ∩G and TraceCycle(g) = true) ≤ aδd(rm)2r2s 72 n3−2s where aδ is as in (6). 139 Identifying long cycles Proof. The set S0 = ∪̇S0(v), where S0(v) is the set of all g ∈S0 with |∆(g)| = v, where v ranges over all integers satisfying 4(rn)s < v ≤ n. For g ∈ S0(v), the restriction g∆(g) of g to ∆(g) is a permutation in Sym(∆(g)) of order dividing rm with all cycles of length less than (rn)s. Consider a fixed v-set ∆. If G = Sn, then all elements of Sym(∆) are induced by permutations in G. On the other hand if G = An, then one of the lines 4-9 of Table 1 holds and hence rm is odd; thus all elements of Sym(∆) of order dividing rm actually lie in Alt(∆) and are therefore induced by elements of G. Therefore in all cases the number of possibilities for the restriction g∆ of elements g ∈ G, for a given v-subset ∆ = ∆(g), is v!P0(v,rm) and the restriction gΣ where Σ = Ω\∆ lies in Sym(Σ) or Alt(Σ) according as G = Sn or An, respectively. Hence the number of permutations in S0 ∩G corresponding to this value of v satisfies |S0(v) ∩G| ≤ ( n v ) v!P0(v,rm) (n−v)! |Sn : G| = n! P0(v,rm) |Sn : G| = |G| ·P0(v,rm). As 3(rn)s < 4(rn)s < v, we have n ≥ 156 by Lemma 5.3(ii) so rm ≥ 150, and hence we can apply Lemma 8.3(b) with a′δ = aδ. Thus, for a random g ∈ G, Prob(g ∈S0(v) ∩G) ≤ P0(v,rm) ≤ aδd(rm) 2r2sn2s v(v − (rn)s)3 . For any g ∈ Sn with |∆(g)| = v and v ≤ n − k − 1, we have in particular 3 ≤ v ≤ n − 3. Hence by Lemma 7.2(b), given that g ∈S0(v) ∩G with v ≤ n−k − 1, Prob(TraceCycle(g) = true) ≤ 16 (v n )4 . Hence, if v ≤ n − k − 1, then the probability that g ∈ S0(v) and TraceCycle(g) = true is at most aδd(rm)2r2sn2s 16v(v−(rn)s)3 ( v n )4 ; and if n − k − 1 < v ≤ n, this probability is at most aδd(rm) 2r2sn2s 1 v(v−(rn)s)3 . Summing over the values of v, we find Prob(g ∈S0 ∩G and TraceCycle(g) = true) ≤ Σ1 + Σ2 where Σ1 = 16aδd(rm) 2 r 2sn2s n4 ∑ 4(rn)s 2(rn)s, and find Σ1 = 16aδd(rm) 2 r 2sn2s n4 ∑ 4(rn)s d + 3(rn) s (see Notation 3.2). Σ1 < 16 ∑ d∈D` 1 d   ∑ 3(rn)s+d 23 12 rsns by Lemma 5.3(iii) and, since d ≥ (rn)s, also d+(rn) s d ≤ 2. Note also that d + (rn)s < n and n + 1 −d− (rn)s < n. Hence Σ1 ≤ 16 aδd(rm) 3r2sn2s n4 ( 2 · 123 ·n3 3 · 233 ·r3sn3s + 4 · 122n2 232r2sn2s + 12 · 12n1 23rsns + 8 · log(n) + n1 (rn) s ) = 16aδd(rm) 3 n1+s ( 3456 36501rs + 576 529n1−s + 144rs 23n2−2s + 8r2s log(n) n3−3s + rs n2−2s ) . Since, by hypothesis (rn)s log(n) ≤ n and by Lemma 5.3(i) ns/n ≤ rsns/n ≤ 1/12 and r ≥ 1, the last expression is at most 16aδd(rm) 3 n1+s ( 3456 36501 + 576 529 · 12 + 144 23 · 122 + 8 122 + 1 122 ) ≤ 4.7aδ d(rm)3 n1+s . We now consider Σ2 = ∑ n−k≤v≤n (∑ d∈D+1 (v) 1 d P0(v −d,rm) ) . As v−d > 3(rn)s and n−k ≥ n/2 we have by Lemma 8.3(b) (with a′δ = aδ) that Σ2 aδd(rm)2(rn) 2s ≤ ∑ n/2≤v≤n   ∑ d∈D+1 (v) 1 d(v −d− (rn)s)4   = ∑ d∈D+1 (v) 1 d   ∑ v(d)≤v≤n 1 (v −d− (rn)s)4   where v(d) = max{n 2 ,d + 3(rn)s} since, by Notation 8.1, each d ∈ D+1 (v) is less than v − 3(rn) s. By 143 Identifying long cycles Lemma 5.5, this quantity is at most ∑ d∈D+1 (v) 1 d (∫ n v(d)−1 1 (v −d− (rn)s)4 dv ) = ∑ d∈D+1 (v) 1 d [ − 1 3 1 (v −d− (rn)s)3 ]n v(d)−1 < ∑ d∈D+1 (v) 1 3d 1 (v(d) − 1 −d− (rn)s)3 . In particular each d ∈ D+1 (v) is less than m. By Lemma 5.1, there are at most three divisors of rm which are less than m and greater than 2m/7, and the sum of the reciprocals 1 d of these divisors is at most 7 m , which is less than 7.3 n since n ≥ 156 (by Lemma 5.3(ii)). Using v(d) ≥ d + 3(rn)s and Lemma 5.3(iii), the contribution from these exceptional divisors is therefore at most 1 (2(rn)s − 1)3 ∑ d∈D+1 (v),d>2m/7 1 3d < ( 12 23(rn)s )3 7.3 3n < 0.35 (rn)3sn . Finally we estimate the contribution of the remaining elements d of D+1 (v). We note that each such d is at most 2n 7 and at least (rn)s, and that (rn)s < n−6 12 by our hypothesis. Thus, using v(d) ≥ n 2 , the remaining contribution is at most d(rm) 3(rn) s 1 (n 2 − 1 − 2n 7 − n−6 12 )3 . Observe that n 2 −1− 2n 7 − n−6 12 = 11n−42 84 and since n > 84 by Lemma 5.3(a) we have 11n−42 84 > n 8 . Hence, using also that (rn) s n < 1 12 (by Lemma 5.3(i)), the above expression is less than d(rm) (rn)s 83 3n3 < d(rm) 83 122 · 3 (rn)3sn < 1.19 d(rm) (rn)3sn . Thus Σ2 aδd(rm)2(rn)2s < 0.35 (rn)3sn + 1.19 d(rm) (rn)3sn ≤ 1.54 d(rm) (rn)2sn1+s and hence Prob(g ∈S+1 ∩G and TraceCycle(g) = true) < 6.24 aδ d(rm)3 n1+s . 10. Bounding S≥2 Proposition 10.1. Let G,m,n,r be as in one of the lines of Table 1. Then |S≥2 ∩G| |G| ≤ d(rm)2 (rn) 2s . 144 S. Linton, A.C. Niemeyer C.E. Praeger Proof. If g is an element of S≥2∩G then it has two cycles of lengths d1,d2, where di|rm, and di ≥ (rn) s. There are at most d(rm) choices for each di. Thus, there are at most d(rm)2 choices for the two divisors d1 and d2. For a given d1,d2, the proportion of elements in G having cycles of lengths d1 and d2 is at most (d1d2) −1 ≤ (rn)−2s. Thus altogether we get a proportion of at most d(rm)2(rn)−2s. 11. Bounding S−1 Proposition 11.1. Let G,m,n,r be as in one of the lines of Table 1. Suppose that n is such that 12(rn) s + 6 ≤ n. Let k be a fixed integer with 2 ≤ k ≤ n/2. Then (a) the proportion of k-subsets γ such that ck(γ,g) = r0m, for some r0 dividing r, for g ∈ S−1 ∩G, is less than 31/((rn)1−s). (b) If TraceCycle is Algorithm 2 and M is as defined there, then for a uniformly distributed random element g ∈ G, Prob(TraceCycle(g) = true | g ∈S−1 ∩G) < ( 31 (rn) 1−s )M , and so Prob(g ∈S−1 ∩G and TraceCycle(g) = true) < ( 31 (rn) 1−s )M . Proof. We start by recording some important facts used throughout the proof. Let g ∈ S−1 ∩G and put v = |∆(g)| and u = |Σ(g)|, such that u + v = n. The definition of S−1 implies that g has a unique s-large cycle C in ∆(g) of length d and we have (i) d ≤ n and d 6= m since g ∈F; (ii) v > 4(rn)s and v −d ≤ 3(rn)s. By Lemma 5.1 and the hypothesis n ≥ 12(rn)s + 6, it follows that d ≤ 2m/3 ≤ 2n/3. Hence u = n−v ≥ n−d−3(rn)s ≥ n 3 −3(rn)s ≥ 4(rn)s + 2−3(rn)s = (rn)s + 2. Also, v ≤ d + 3(rn)s ≤ 2n 3 + 3(rn) s . This implies that v = n−u ≤ n− 2 − (rn)s and hence in particular v ≤ n− 3 (18) and 1 u− 1 < 1 (rn) s < 1 (rn) 1−s . (19) Set t = v −d so that t = v −d ≤ 3(rn)s. Then v = d + t ≤ 2n/3 + 3(rn)s. (20) Suppose that γ is a k-subset for which ck(γ,g) = r0m, for some r0 dividing r, and set k0 := |γ∩Σ(g)|. Then ck0 (γ ∩ Σ(g),g) divides rm, and hence the number of possibilities for the k0-subset γ ∩ Σ(g) is at 145 Identifying long cycles most the number σ(k0, Σ(g)) of Corollary 6.6. In particular σ(k0, Σ(g)) = 0 if k0 = 1. Thus k0 = 0 or 2 ≤ k0 ≤ min{u,k}, and the case k0 = 0 is only possible if v ≥ k. First we prove the following upper bound for the number K¬0 = K¬0(g) of k-subsets γ such that k0 = |γ ∩ Σ(g)| ≥ 2. K¬0( n k ) < 97 96(rn) 1−s . (21) By the remarks above K¬0 ≤ min{k,u}∑ k0=2 σ(k0, Σ(g)) ( n−u k −k0 ) . If k0 ≤ u−1 then, by Corollary 6.6 and our considerations above, σ(k0, Σ(g)) ≤ 1u−1 ( u k0 ) ≤ 1 (rn)(1−s) ( u k0 ) , while if k0 = u then σ(k0, Σ(g)) = 1. Thus K¬0( n k ) ≤ 1(n k ) (rn)1−s min{u−1,k}∑ k0=2 ( u k0 )( n−u k −k0 ) + R¬0, where R¬0 =  0 if k ≤ u− 1,(n−uk−u) (nk) if k ≥ u. Hence K¬0( n k ) ≤ 1(n k ) (rn)1−s min{u,k}∑ k0=0 ( u k0 )( n−u k −k0 ) + R¬0 = 1 (rn) 1−s + R¬0. (22) Thus (21) is proved if k ≤ u− 1, so suppose that k ≥ u. Recall that u > (rn)1−s + 1 by (19). Hence R¬0 ≤ ( n−u k−u )( n k ) = u∏ i=1 (k −u + i) (n−u + i) ≤ ( k n )u ≤ ( 1 2 )u ≤ 1 2 ( 1 2 )(rn)(1−s) . Using n1−s > 12 > 8 (see Lemma 5.3(i)) and Lemma 5.4(a), we have R¬0 ≤ 12 ( 1 2 )(rn)(1−s) < 1 2 1 4(rn)2(1−s) < 1 96 1 (rn)1−s , and now the inequality (21) follows from inequality (22). To complete the proof of part (a) it remains to estimate the number K=0 = K=0(g) of k-subsets γ ⊆ ∆(g) such that ck(γ,g) = r0m for some r0 dividing r. Since this number is zero if v < k, we assume that v ≥ k. Recall that C is the unique s-large cycle of g contained in ∆(g) and d = |C|. By Lemma 5.1, d ≤ 2m/3 < 2n/3. Since m divides ck(γ,g) it follows that γ 6⊆ C. We prove K=0( n k ) ≤ 30.6 (rn) 1−s . (23) The number K=0 of such k-subsets is at most ( v k ) − ( d k ) . 146 S. Linton, A.C. Niemeyer C.E. Praeger Set t = v −d so that t = v −d ≤ 3(rn)s. Then we have( v k ) = 1 k! (d + t)(d + t− 1) . . . (d + t−k + 1). We consider separately the cases (i) (rn)s < k, (ii) k ≤ min{(rn)s,d− t + 1}, and (iii) d− t + 1 < k ≤ (rn)s. Recall that (rn)s ≤ d. Consider first Case (ii), so k ≤ (rn)s and d − t + 1 ≥ k. If d ≤ m/2 define a = 1 and observe that t d−k+1 ≤ a. If d > m/2 then, by Lemma 5.1, it follows that d ≥ 3m/5. In this case t d−k+1 ≤ 3(rn)s 3m/5−(rn)s = 3(rn)s m(3/5−(rn)s/m). By the hypothesis (rn) s ≤ (n − 6)/12 ≤ m/12 and by Lemma 5.3(i) we have then t d−k+1 ≤ 3 12 1 (3/5−1/12) = 15 31 . In this case define a = 15 31 . Then again t d−k+1 ≤ a. Setting d(k) := d(d− 1) . . . (d−k + 1), by Lemma 6.3 we obtain( v k ) = 1 k! (d + t)(d + t− 1) . . . (d + t−k + 1) < 1 k! ( d(k) ( 1 + (1 + a)kt a(d−k + 1) )) = ( d k )( 1 + (1 + a)kt a(d−k + 1) ) . (24) If d ≤ m/2 we have a = 1 and so ( v k ) ≤ ( d k ) + ( d k ) 2kt d−k + 1 . Applying Lemma 6.2(a) with α = 1 2 , ( d k ) ≤ 1 2k−1 ( n k ) d−k + 1 n−k + 1 . Hence K=0 ≤ ( v k ) − ( d k ) < ( d k ) 2kt d−k + 1 ≤ ( n k ) 2t n−k + 1 < ( n k ) 2t n−k . On the other hand, if m/2 < d ≤ 2n/3, then a = 15 31 , and (24) becomes( v k ) ≤ ( d k ) + ( d k )( 46 31 )k 31t 15(d−k + 1) . By Lemma 6.2(a) with α = 2 3 , ( d k ) ≤ 2k−1 3k−1 ( n k ) d−k + 1 n−k + 1 and hence K=0 = ( v k ) − ( d k ) < ( d k ) (46/31) k 31t 15(d−k + 1) < ( n k ) 2k−1 3k−1 (46/31) k 31t 15(n−k + 1) < ( n k ) 92k 93k 31t 10(n−k) < ( n k ) 31t 10(n−k) . 147 Identifying long cycles Note that by Lemma 5.3, since k ≤ (rn)s and by our assumptions, t n−k ≤ 3(rn)s n(1−k/n) ≤ 3(rn)s n(1−(rn)s/n) ≤ 3(rn)s n(11/12) = 36(rn)s 11n . Thus for all d we have K=0( n k ) < 31 · 36 10 · 11 · (rn) s n < 10.2 r (rn) 1−s ≤ 30.6 (rn) 1−s and (23) is proved for Case (ii). Now consider Cases (i) and (iii). Recall from (20) that v = d + t ≤ 2n/3 + 3(rn)s. By Lemma 5.3(i), (rn) s ≤ 1 12 n. Therefore v ≤ 2n/3 + 3 12 n = 11 12 n. This shows, using Lemma 6.2(a), that K=0( n k ) ≤ (vk)−(dk)(n k ) < (vk)(n k ) ≤ (v n )k ≤ ( 11 12 )k . In Case (i) we have k > (rn)s and hence, observing that (rn)s > n1/2 > 12 by Lemma 5.3(i), and using Lemma 5.4(b), we have ( 11 12 )k < ( 11 12 )(rn)s < 5 (rn)s < 5 (rn)1−s . Thus K=0 (nk) < 5 (rn)1−s and (23) holds for Case (i). In Case (iii) we have (rn)s ≥ k > d− t + 1 and so d < 4(rn)s as t ≤ 3(rn)s. Therefore, v = d + t < 7(rn) s, and using Lemmas 5.3(i) and 6.2(a), K=0( n k ) ≤ (v n )k < ( 7(rn) s n )k ≤ ( 7(rn) s n )2 < 49(rn) s 12n ≤ 49 4(rn) 1−s . Thus (23) holds for Case (iii) and hence in all cases. Combining (23) with (21), we conclude that the proportion of k-subsets γ such that ck(γ,g) = r0m, for some r0 dividing r, is less than 31/((rn) 1−s ) for all values of k and v. This proves (a). Now TraceCycle(g) = true if and only if ck(γ,g) = r0m, for some r0 dividing r, for each of the M independent uniformly distributed random k-sets γ tested in the algorithm. Thus, given g ∈S−1 ∩G, the probability that TraceCycle(g) = true is at most ( 31/((rn) 1−s ) )M . The last assertion follows on noting that for events A and B we have Prob(A∩B) = Prob(A)Prob(B | A) ≤ Prob(B | A). Acknowledgment: The first author acknowledges the support of EPSRC grant EP/C523229 and the second and third authors acknowledge the support of ARC Discovery grants DP0879134 and DP140100416. We thank Yohei Negi and Sven Reichard for some discussions on an early draft of this paper. We thank an anonymous referee for valuable suggestions. 148 S. Linton, A.C. Niemeyer C.E. Praeger References [1] R. M. Beals, C. R. Leedham-Green, A. C. Niemeyer, C. E. Praeger, and Á. Seress, A black-box algorithm for recognizing finite symmetric and alternating groups, I, Trans. Amer. Math. Soc., 355, 2097-2113, 2003. [2] S. Bratus and I. Pak, Fast constructive recognition of a black box group isomorphic to Sn or An using Goldbach’s conjecture, J. Symbolic Comput., 29(1), 33-57, 2000. [3] The GAP Group, GAP - Groups, Algorithms, and Programming, Version 4.7.7, 2015, http://www.gap-system.org. [4] P. Erdős, and P. Turán, On some problems of a statistical group-theory. I, Wahrscheinlichkeitstheorie Verw. Gebiete, 4, 175-186, 1965. [5] P. Erdős, and P. Turán, On some problems of a statistical group-theory. III, Acta Math. Acad. Sci. Hungar., 18 , 309-320, 1967. [6] S. Linton, A. C. Niemeyer and C. E. Praeger, Constructive recognition of Sn in its action on k-sets, in preparation. [7] Y. Negi, Recognising large base actions of finite alternating groups, Honours Thesis, School of Math- ematics and Statistifcs, The University of Western Australia, 2006. [8] A. C. Niemeyer and C. E. Praeger, On permutations of order dividing a given integer, J. Algebraic Combinatorics, 26, 125-142, 2007. [9] A. C. Niemeyer and C. E. Praeger, On the proportion of permutations of order a multiple of the degree, J. London Math. Soc., 76, 622-632, 2007. [10] A. C. Niemeyer, C. E. Praeger and Á. Seress, Estimation problems and randomised group algorithms, Probabilistic group theory, combinatorics, and computing, Lecture Notes in Math., 2070, Springer, London, 35-82, 2013. [11] I. Niven, H. S. Zuckerman and H. L. Montgomery, An Introduction to the Theory of Numbers, John Wiley & Sons Inc., New York, fifth edition, 1991. [12] Á. Seress, Permutation group algorithms, Cambridge Tracts in Mathematics, 152, Cambridge Uni- versity Press, Cambridge, 2003. [13] R. Warlimont, Über die Anzahl der Lösungen von xn = 1 in der symmetrischen Gruppe Sn, Arch. Math., 30, 591-594, 1978. 149 Introduction Algorithmic application Statement of the main theorem and notation Proof of the main theorem Preliminaries Binomial inequalities and partitions Tracing k-subsets Bounding S0 Bounding S1+ Bounding S2 Bounding S1- References