ISSN 2148-838X
http://dx.doi.org/10.13069/jacodesmath.729477
J. Algebra Comb. Discrete Appl. 7(2) (2020) 195–207
Received: 10 September 2019; Accepted: 18 April 2020

Journal of Algebra Combinatorics Discrete Structures and Applications

Generalization of the ball-collision algorithm

Research Article

Carmelo Interlando, Karan Khathuria, Nicole Rohrer, Joachim Rosenthal, Violetta Weger

Carmelo Interlando; Department of Mathematics and Statistics, San Diego State University, San Diego, CA 92182-7720 (email: carmelo.interlando@sdsu.edu).
Karan Khathuria (Corresponding Author), Nicole Rohrer, Joachim Rosenthal, Violetta Weger; Institute of Mathematics, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland (email: karan.khathuria@math.uzh.ch, nicole.rohrer@uzh.ch, rosenthal@math.uzh.ch, violetta.weger@math.uzh.ch).

Abstract: In this paper we generalize the ball-collision algorithm by Bernstein, Lange and Peters from the binary field to a general finite field. We also provide a complexity analysis and compare the asymptotic complexity to that of other generalized information set decoding algorithms.

2010 MSC: 94B35, 94A60

Keywords: Coding theory, ISD, Ball-collision

1. Introduction

Since 1978 it has been known that decoding a random linear code is an NP-complete problem; this was shown in [7] by Berlekamp, McEliece and van Tilborg. The interesting task therefore arises of determining the complexity of decoding a random linear code using the best algorithms available. To date, two main methods for decoding have been proposed: information set decoding (ISD) and the generalized birthday algorithm (GBA). ISD is more efficient if the decoding problem has only a small number of solutions, whereas GBA is efficient when there are many solutions. Other ideas such as statistical decoding [1], gradient decoding [2] and supercode decoding [5] have also been proposed, but they fail to outperform ISD algorithms.

The input of an ISD algorithm is a corrupted codeword, and it outputs the error vector. ISD algorithms are often formulated via the parity check matrix, since it suffices to find a vector of a certain weight which has the same syndrome as the corrupted codeword; this problem is also referred to as the syndrome decoding problem. ISD algorithms are based on a decoding algorithm proposed by Prange [28] in 1962, and their structure does not deviate much from the original: as a first step one chooses an information set, then Gaussian elimination brings the parity check matrix into a standard form; assuming that the errors lie outside of the information set, the same row operations applied to the syndrome reveal the error vector, provided its weight does not exceed the given error correction capacity.

The problem of decoding random linear codes has recently gained prominence with the proposal of code-based public key cryptosystems for an upcoming post-quantum cryptographic public key standard. The idea of using linear codes in public key cryptography was first formulated by Robert McEliece [24]. Since the publication of McEliece a large amount of research has been done, and the interested reader will find more information in a recent survey [9].
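To make the ISD skeleton described above concrete, here is a minimal Python sketch of Prange's original algorithm over $\mathbb{F}_2$. This is our own illustration, not taken from any of the cited works; the function names and the dense-matrix representation are our choices, and no optimizations are attempted.

```python
import numpy as np

def systematize(H, s, perm):
    """Reduce [H|s] so the last n-k permuted columns of H become the identity.
    Returns the reduced matrix, or None if those columns are singular."""
    n_k, n = H.shape
    M = np.concatenate([H[:, perm], s.reshape(-1, 1)], axis=1) % 2
    for r, c in enumerate(range(n - n_k, n)):   # pivot on the last n-k columns
        pivots = np.nonzero(M[r:, c])[0]
        if pivots.size == 0:
            return None
        M[[r, r + pivots[0]]] = M[[r + pivots[0], r]]
        for i in range(n_k):                    # clear the column in all other rows
            if i != r and M[i, c]:
                M[i] = (M[i] + M[r]) % 2
    return M

def prange(H, s, t, max_iters=10**6):
    """Plain Prange ISD over F_2: hope that all t errors avoid the information set."""
    n_k, n = H.shape
    rng = np.random.default_rng()
    for _ in range(max_iters):
        perm = rng.permutation(n)
        M = systematize(H, s, perm)
        if M is None:
            continue
        s_prime = M[:, -1]                      # transformed syndrome U s
        if s_prime.sum() == t:                  # all errors outside the information set
            e = np.zeros(n, dtype=int)
            e[perm[n - n_k:]] = s_prime         # undo the column permutation
            return e
    return None
```

An iteration succeeds only if all $t$ errors avoid the chosen information set; all later ISD variants, including ball-collision, refine exactly this weight assumption.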
If the secret code is hidden well enough, an adversary who wants to break a code-based cryptosystem faces the decoding problem for a random linear code. It is therefore of crucial importance to understand the complexity of the best algorithms capable of decoding a general linear code. ISD algorithms are often considered when proposing a variant of the McEliece cryptosystem, in order to find the key size needed for a given security level. ISD algorithms hence do not break a code-based cryptosystem, but they determine the choice of secure parameters. Since some of the new proposals (for example [3, 4, 18]) involve codes over general finite fields, having efficient ISD algorithms generalized to $\mathbb{F}_q$ is an essential problem.

Bernstein, Lange and Peters found a clever improvement of the ISD algorithm which they called ball-collision decoding [8]. The algorithm of Bernstein et al. was presented for random binary linear codes. The main contribution of our paper is a generalization of the ball-collision decoding algorithm to arbitrary finite fields.

The paper is structured as follows: in Section 2 we discuss previous work on ISD algorithms, focusing on those which have been generalized to an arbitrary finite field. In Section 3 we describe the ball-collision algorithm over the binary field together with the notations and concepts involved in the algorithm. In Section 4 we present the ball-collision algorithm over $\mathbb{F}_q$, and in Section 5 we perform the complexity analysis of our algorithm, including numerical parameter optimization and asymptotic analysis.

2. Related work

Even though the cost of one iteration of Prange's original ISD algorithm was very low, the algorithm still came with a huge complexity due to the number of iterations needed. Many improvements of Prange's simplest form of ISD have been suggested (see for example [11–13, 19, 21, 31]); they all focus on a more elaborate and more likely weight distribution of the error vector, which results in a higher cost per iteration, but fewer iterations have to be performed.

The improvements split early on into two directions: the first direction follows the splitting of Lee and Brickell [20] into the information set and the redundant set, i.e. they ask for $v$ errors in the information set and $t-v$ errors outside. The second direction is Dumer's splitting approach [13], which asks for $v$ errors in $k + \ell$ bits containing an information set, and $t-v$ errors in the remaining $n-k-\ell$ bits.

Apart from improvements regarding the success probability, one can also improve the cost of one iteration: Canteaut and Chabaud [10] provided a speed-up for finding information sets. They show that the information set should not be taken at random after an unsuccessful iteration; rather, a part of the previous information set should be reused, so that a part of the Gaussian elimination step is already performed.

The second direction has resulted in many improvements. For example, in 2009 Finiasz and Sendrier [14] built two intersecting subsets of the $k + \ell$ bits containing an information set, asking for $v$ disjoint errors in both sets and $t - 2v$ errors in the remaining $n - k - \ell$ bits. In 2010, Niebuhr, Persichetti, Cayrel, Bulygin and Buchmann [26] improved the performance of ISD algorithms over $\mathbb{F}_q$ based on the idea of Finiasz and Sendrier [14]. In 2011 May, Meurer and Thomae [22] proposed an improvement using the representation technique introduced by Howgrave-Graham and Joux [17].
In 2012, Becker, Joux, May and Meurer [6] (BJMM) introduced further improvements to this algorithm. In the same year, Meurer in his dissertation [25] proposed a new generalized ISD algorithm based on these two papers. In 2015, May and Ozerov [23] used the nearest neighbor algorithm to improve the BJMM version of ISD. Later, in 2017, the nearest neighbor algorithm over $\mathbb{F}_q$ was applied to the generalized BJMM algorithm by Gueye, Klamti and Hirose [15].

Now we focus on the first direction of improvements, which were first proposed for codes over the binary field and only later generalized to an arbitrary finite field. In 1988, the same year in which Lee and Brickell proposed their algorithm, Leon [21] introduced a zero window of size $\ell$ inside the redundant set, where no errors are allowed. In 1993 Stern [30] kept this zero window and proposed to partition the information set into two sets, asking for $v$ errors in each part and $t-2v$ errors outside the information set. The generalizations of both the Lee-Brickell and the Stern algorithm to a general finite field $\mathbb{F}_q$ were carried out by Peters [27] in 2010. In 2011 Bernstein, Lange and Peters proposed the ball-collision algorithm [8], where they keep the partitioning of the information set but reintroduce errors in the zero window: in fact, they partition the zero window into two sets, ask for $w$ errors in each of them, and hence for $t-2v-2w$ errors outside. In 2016, Hirose [16] generalized the nearest neighbor algorithm to $\mathbb{F}_q$ and applied it to the generalized Stern algorithm.

It is important to remark (see [25]) that the BJMM algorithm, even though it has the smallest complexity, comes with a different cost: memory. In order to achieve a complexity of $2^{128}$, BJMM needs about $10^9$ terabytes of memory. In fact, Meurer observed that if one restricts the memory to $2^{40}$, BJMM and the ball-collision algorithm perform almost the same.

In this paper we provide the missing generalization of the ball-collision algorithm. The ordering of the complexities of ISD algorithms over $\mathbb{F}_2$ is consistent with that of their generalizations over $\mathbb{F}_q$.

3. Preliminaries

3.1. Notation

We first want to fix some notation: let $q$ be a prime power and let $n, k, t \in \mathbb{N}$ be positive integers such that $k, t < n$. We denote by $\mathbb{1}_n$ the $n \times n$ identity matrix. For an $m \times n$ matrix $A$ and a set $S \subseteq \{1, \dots, n\}$ of size $k$, we denote by $A_S$ the $m \times k$ matrix consisting of the columns of $A$ indexed by $S$. For a set $S \subseteq \{1, \dots, n\}$ of size $k$, we denote by $\mathbb{F}_q^n(S)$ the set of vectors in $\mathbb{F}_q^n$ having support in $S$. The projection of $x \in \mathbb{F}_q^n(S)$ to $\mathbb{F}_q^k$ is then canonical and denoted by $\pi_S(x)$. Conversely, we denote by $\sigma_S$ the canonical embedding of a vector $x \in \mathbb{F}_q^k$ into $\mathbb{F}_q^n(S)$, where $S \subseteq \{1, \dots, n\}$ is again of size $k$.

For an $[n,k]$ linear code $\mathcal{C}$ over $\mathbb{F}_q$ we denote by $H$ a parity check matrix of size $(n-k) \times n$ and by $G$ a generator matrix of size $k \times n$. We denote the Hamming weight of a vector $x \in \mathbb{F}_q^n$ by $w(x)$. The corrupted codeword $c \in \mathbb{F}_q^n$ is given by $c = mG + e$, where $m \in \mathbb{F}_q^k$ is the message and $e \in \mathbb{F}_q^n$ is the error vector. The syndrome of $c$ is then defined as $s = Hc^\top$ and coincides with the syndrome of the error vector, since
$$Hc^\top = H(mG + e)^\top = HG^\top m^\top + He^\top = He^\top.$$
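As a quick illustration of this identity, the following Python snippet (a toy example with parameters of our own choosing) builds a random systematic code over $\mathbb{F}_5$ and checks that the syndrome of the corrupted codeword equals that of the error vector:

```python
import numpy as np

q, n, k, t = 5, 10, 4, 2
rng = np.random.default_rng(0)
# systematic generator G = [I_k | P] and matching parity check H = [-P^T | I_{n-k}]
G = np.concatenate([np.eye(k, dtype=int), rng.integers(0, q, (k, n - k))], axis=1) % q
H = np.concatenate([(-G[:, k:].T) % q, np.eye(n - k, dtype=int)], axis=1) % q
assert not (G @ H.T % q).any()          # G H^T = 0

m = rng.integers(0, q, k)               # message
e = np.zeros(n, dtype=int)
e[rng.choice(n, t, replace=False)] = rng.integers(1, q, t)  # weight-t error
c = (m @ G + e) % q                     # corrupted codeword
assert np.array_equal(H @ c % q, H @ e % q)  # the syndrome depends only on e
```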
3.2. Ball-collision algorithm over the binary field

In what follows we describe the ball-collision algorithm over the binary field, as proposed in [8] by Bernstein, Lange and Peters.

3.3. Concepts

There are a few concepts involved in computing the complexity of the ball-collision algorithm of [8], which we present here and generalize to arbitrary finite fields beforehand. In general, the complexity of an ISD attack consists of the cost of one iteration times the expected number of iterations. The cost in the following refers to operations, i.e. additions or multiplications, over the given field.

Algorithm 1: Ball-collision over the binary field
Input: The $(n-k) \times n$ parity check matrix $H$, the syndrome $s \in \mathbb{F}_2^{n-k}$ and positive integers $v_1, v_2, w_1, w_2, k_1, k_2, \ell_1, \ell_2 \in \mathbb{Z}$, such that $k = k_1 + k_2$, $v_i \leq k_i$, $w_i \leq \ell_i$ and $t - v_1 - v_2 - w_1 - w_2 \leq n - k - \ell_1 - \ell_2$.
Output: $e \in \mathbb{F}_2^n$ with $He^\top = s$ and $w(e) = t$.
1: Choose a uniform random information set $I \subseteq \{1, \dots, n\}$ of size $k$.
2: Choose a uniform random partition of $I$ into disjoint sets $X_1$ and $X_2$ of sizes $k_1$ and $k_2 = k - k_1$ respectively.
3: Choose a uniform random partition of $Y = \{1, \dots, n\} \setminus I$ into disjoint sets $Y_1, Y_2$ and $Y_3$ of sizes $\ell_1, \ell_2$ and $\ell_3 = n - k - \ell_1 - \ell_2$.
4: Find an invertible matrix $U \in \mathbb{F}_2^{(n-k) \times (n-k)}$ such that $(UH)_Y = \mathbb{1}_{n-k}$ and $(UH)_I = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$, where $A_1 \in \mathbb{F}_2^{(\ell_1+\ell_2) \times k}$ and $A_2 \in \mathbb{F}_2^{\ell_3 \times k}$.
5: Compute $Us = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}$ with $s_1 \in \mathbb{F}_2^{\ell_1+\ell_2}$ and $s_2 \in \mathbb{F}_2^{\ell_3}$.
6: Compute the set $S$ consisting of all triples $(A_1(\pi_I(x_1)) + \pi_{Y_1 \cup Y_2}(y_1),\, x_1,\, y_1)$, where $x_1 \in \mathbb{F}_2^n(X_1)$, $w(x_1) = v_1$ and $y_1 \in \mathbb{F}_2^n(Y_1)$, $w(y_1) = w_1$.
7: Compute the set $T$ consisting of all triples $(A_1(\pi_I(x_2)) + \pi_{Y_1 \cup Y_2}(y_2) + s_1,\, x_2,\, y_2)$, where $x_2 \in \mathbb{F}_2^n(X_2)$, $w(x_2) = v_2$ and $y_2 \in \mathbb{F}_2^n(Y_2)$, $w(y_2) = w_2$.
8: for each $(a, x_1, y_1) \in S$ do
9: for each $(a, x_2, y_2) \in T$ do
10: if $w(A_2(\pi_I(x_1 + x_2)) + s_2) = t - v_1 - v_2 - w_1 - w_2$ then Output: $e = x_1 + y_1 + x_2 + y_2 + \sigma_{Y_3}(A_2(\pi_I(x_1 + x_2)) + s_2)$.
11: else start over with Step 1 and a new selection of $I$.

i) The success probability over the binary field is usually determined by having chosen the correct weight distribution of the error vector. For example, if one assumes that the error vector has weight $v$ in the information set and $t-v$ outside the information set, the success probability is
$$\binom{k}{v}\binom{n-k}{t-v}\binom{n}{t}^{-1}.$$
This does not change over $\mathbb{F}_q$, since the scalar multiples cancel out (a numeric check of this cancellation appears in the sketch following this list):
$$\frac{\binom{k}{v}(q-1)^v \binom{n-k}{t-v}(q-1)^{t-v}}{\binom{n}{t}(q-1)^t} = \binom{k}{v}\binom{n-k}{t-v}\binom{n}{t}^{-1}.$$
From there on one computes the sum of two multiples of the columns, there are ( n 2 ) (q−1)2 many and each sum costs k additions. Hence proceeding in the same manner the cost turns out to be (q−1)kn multiplications and (L̄(n,t)−n(q−1))k additions, where L̄(n,t) = t∑ i=1 ( n i ) (q − 1)i. iii) The next concept called early abort is also important whenever a computation is done while checking the weight of the result. For example one wants to compute x + y, where x,y ∈ Fn2 , which usually costs n additions, but we only proceed in the algorithm if w(x + y) = t. Hence we compute and check the weight simultaneously and if the weight of the partial solution exceeds t one does not need to continue. Over the binary one expects a randomly chosen bit to have weight 1 with probability 1 2 , hence after 2t we should reach the wanted weight t, and after 2(t + 1) we should exceed the weight t. Hence on average we expect to compute only 2(t + 1) many bits of the solution, before we can abort. Over Fq, we expect a randomly chosen element to have weight 1 with probability q−1q , therefore we need to compute q q−1 (t + 1) many entries of the solution before we can abort. iv) An important step in the ball-collision algorithm is to check for a collision, i.e. if Ax> = By> one continues, where again A,B ∈ Fk×n2 and x,y are living in some sets S and T respectively. There are | S | · | T | many choices for (x,y), assuming that they are distributed uniformly over Fn2 , then on average one expects the number of collisions to be | S | · | T | 2−n. Similarly over Fq the number of collisions will be | S | · | T | q−n. 4. Generalization of the ball-collision algorithm In this section we generalize the ball-collision algorithm from the binary case [8] to a general finite field. Again, as in the binary case, the idea of the algorithm is to solve UHe> = Us instead of He> = s, where an invertible U is chosen such that (up to permutation) UH = ( A 1n−k ) and Us = ( s1 s2 ) with s1 ∈ F`1+`2q , s2 ∈ F`3q . We are therefore looking for a vector e ∈ Fnq fulfilling UHe> = ( A1 1`1+`2 0 A2 0 1`3 )e1e2 e3   = (s1 s2 ) , with e1 ∈ Fkq, e2 ∈ F`1+`2q , e3 ∈ F`3q . This leads to the following system of equations: A1e1 + e2 = s1, A2e1 + e3 = s2. The algorithm solves the above by finding e1 = πI (x1 + x2), e2 = πY1∪Y2 (y1 + y2), e3 = −A2(πI (x1 + x2)) + s2, such that A1(πI (x1)) + πY1∪Y2 (y1) = s1 −A1(πI (x2)) −πY1∪Y2 (y2). 199 C. Interlando et al. / J. Algebra Comb. Discrete Appl. 7(2) (2020) 195–207 Algorithm 2 Ball-collision over Fq Input: The (n−k)×n parity check matrix H, the syndrome s ∈ Fn−kq and the positive integers v1, v2, w1, w2, k1, k2, `1, `2 ∈ Z, such that k = k1 + k2, vi ≤ ki, wi ≤ `i and t−v1 −v2 −w1 −w2 ≤ n−k − `1 − `2. Output: e ∈ Fnq with He> = s and w(e) = t. 1: Choose an information set I ⊆{1, ...,n} of H of size k. 2: Partition I into two disjoint subsets X1 and X2 of size k1 and k2 = k −k1 respectively. 3: Partition Y = {1, ...,n}\I into disjoint subsets Y1 of size `1, Y2 of size `2 and Y3 of size `3 = n−k − `1 − `2. 4: Find an invertible matrix U ∈ F(n−k)×(n−k)q , such that (UH)Y = 1n−k and (UH)I = ( A1 A2 ) , where A1 ∈ F(`1+`2)×kq and A2 ∈ F`3×kq . 5: Compute Us = ( s1 s2 ) , where s1 ∈ F`1+`2q and s2 ∈ F`3q . 
4. Generalization of the ball-collision algorithm

In this section we generalize the ball-collision algorithm from the binary case [8] to a general finite field. Again, as in the binary case, the idea of the algorithm is to solve $UHe^\top = Us$ instead of $He^\top = s$, where an invertible $U$ is chosen such that (up to permutation)
$$UH = \begin{pmatrix} A & \mathbb{1}_{n-k} \end{pmatrix} \quad \text{and} \quad Us = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix},$$
with $s_1 \in \mathbb{F}_q^{\ell_1+\ell_2}$ and $s_2 \in \mathbb{F}_q^{\ell_3}$. We are therefore looking for a vector $e \in \mathbb{F}_q^n$ fulfilling
$$UHe^\top = \begin{pmatrix} A_1 & \mathbb{1}_{\ell_1+\ell_2} & 0 \\ A_2 & 0 & \mathbb{1}_{\ell_3} \end{pmatrix} \begin{pmatrix} e_1 \\ e_2 \\ e_3 \end{pmatrix} = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix},$$
with $e_1 \in \mathbb{F}_q^k$, $e_2 \in \mathbb{F}_q^{\ell_1+\ell_2}$, $e_3 \in \mathbb{F}_q^{\ell_3}$. This leads to the following system of equations:
$$A_1 e_1 + e_2 = s_1,$$
$$A_2 e_1 + e_3 = s_2.$$
The algorithm solves the above by finding
$$e_1 = \pi_I(x_1 + x_2), \quad e_2 = \pi_{Y_1 \cup Y_2}(y_1 + y_2), \quad e_3 = -A_2(\pi_I(x_1 + x_2)) + s_2,$$
such that
$$A_1(\pi_I(x_1)) + \pi_{Y_1 \cup Y_2}(y_1) = s_1 - A_1(\pi_I(x_2)) - \pi_{Y_1 \cup Y_2}(y_2).$$

Algorithm 2: Ball-collision over $\mathbb{F}_q$
Input: The $(n-k) \times n$ parity check matrix $H$, the syndrome $s \in \mathbb{F}_q^{n-k}$ and positive integers $v_1, v_2, w_1, w_2, k_1, k_2, \ell_1, \ell_2 \in \mathbb{Z}$, such that $k = k_1 + k_2$, $v_i \leq k_i$, $w_i \leq \ell_i$ and $t - v_1 - v_2 - w_1 - w_2 \leq n - k - \ell_1 - \ell_2$.
Output: $e \in \mathbb{F}_q^n$ with $He^\top = s$ and $w(e) = t$.
1: Choose an information set $I \subseteq \{1, \dots, n\}$ of $H$ of size $k$.
2: Partition $I$ into two disjoint subsets $X_1$ and $X_2$ of sizes $k_1$ and $k_2 = k - k_1$ respectively.
3: Partition $Y = \{1, \dots, n\} \setminus I$ into disjoint subsets $Y_1$ of size $\ell_1$, $Y_2$ of size $\ell_2$ and $Y_3$ of size $\ell_3 = n - k - \ell_1 - \ell_2$.
4: Find an invertible matrix $U \in \mathbb{F}_q^{(n-k) \times (n-k)}$ such that $(UH)_Y = \mathbb{1}_{n-k}$ and $(UH)_I = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$, where $A_1 \in \mathbb{F}_q^{(\ell_1+\ell_2) \times k}$ and $A_2 \in \mathbb{F}_q^{\ell_3 \times k}$.
5: Compute $Us = \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}$, where $s_1 \in \mathbb{F}_q^{\ell_1+\ell_2}$ and $s_2 \in \mathbb{F}_q^{\ell_3}$.
6: Compute the set $S = \{(A_1(\pi_I(x_1)) + \pi_{Y_1 \cup Y_2}(y_1),\, x_1,\, y_1) \mid x_1 \in \mathbb{F}_q^n(X_1),\, w(x_1) = v_1,\, y_1 \in \mathbb{F}_q^n(Y_1),\, w(y_1) = w_1\}$.
7: Compute the set $T = \{(-A_1(\pi_I(x_2)) + s_1 - \pi_{Y_1 \cup Y_2}(y_2),\, x_2,\, y_2) \mid x_2 \in \mathbb{F}_q^n(X_2),\, w(x_2) = v_2,\, y_2 \in \mathbb{F}_q^n(Y_2),\, w(y_2) = w_2\}$.
8: for $(a, x_1, y_1) \in S$ do
9: for $(a, x_2, y_2) \in T$ do
10: if $w(-A_2(\pi_I(x_1 + x_2)) + s_2) = t - v_1 - v_2 - w_1 - w_2$ then Output: $e = x_1 + x_2 + y_1 + y_2 + \sigma_{Y_3}(-A_2(\pi_I(x_1 + x_2)) + s_2)$.
11: else go to Step 1 and choose a new information set $I$.

This last condition is fulfilled by the collision between $S$ and $T$ in Step 9. Observe that for $q = 2$ the above algorithm is equivalent to the one proposed over the binary field; we hence did not change it in its substantial form.

We now want to prove that the ball-collision algorithm over $\mathbb{F}_q$ works, i.e. that it returns any vector $e$ of the desired form, if such a vector exists. For this we follow the idea of [8].

Theorem 4.1. The ball-collision algorithm over $\mathbb{F}_q$ finds any vector $e$ that fulfills $UHe^\top = Us$ and is of the desired form, i.e., $e$ has $v_1, v_2, w_1, w_2$ and $t - v_1 - v_2 - w_1 - w_2$ nonzero entries in $X_1, X_2, Y_1, Y_2$ and $Y_3$ respectively.

Proof. Clearly the output $e$ is of the desired form:
• $x_1$ is of weight $v_1$ and lies in $\mathbb{F}_q^n(X_1)$,
• $x_2$ is of weight $v_2$ and lies in $\mathbb{F}_q^n(X_2)$,
• $y_1$ is of weight $w_1$ and lies in $\mathbb{F}_q^n(Y_1)$,
• $y_2$ is of weight $w_2$ and lies in $\mathbb{F}_q^n(Y_2)$,
• $w(-A_2(\pi_I(x_1 + x_2)) + s_2) = w(A_2(\pi_I(x_1 + x_2)) - s_2) = t - v_1 - v_2 - w_1 - w_2$, and this part lies in $\mathbb{F}_q^n(Y_3)$.
As the above subspaces do not intersect, $w(e)$ can be calculated by adding up the weights of these parts. Hence $w(e) = t$, and each part has the desired weight distribution by construction.

It remains to prove that $UHe^\top = Us$. Let us write the parts of $e$ lying in $\mathbb{F}_q^n(I)$, $\mathbb{F}_q^n(Y_1 \cup Y_2)$ and $\mathbb{F}_q^n(Y_3)$ separately:
$$UHe^\top = \begin{pmatrix} A_1 & \mathbb{1}_{\ell_1+\ell_2} & 0 \\ A_2 & 0 & \mathbb{1}_{\ell_3} \end{pmatrix} \begin{pmatrix} \pi_I(x_1 + x_2) \\ \pi_{Y_1 \cup Y_2}(y_1 + y_2) \\ -A_2(\pi_I(x_1 + x_2)) + s_2 \end{pmatrix} = \begin{pmatrix} A_1(\pi_I(x_1 + x_2)) + \pi_{Y_1 \cup Y_2}(y_1 + y_2) \\ A_2(\pi_I(x_1 + x_2)) - A_2(\pi_I(x_1 + x_2)) + s_2 \end{pmatrix} = \begin{pmatrix} A_1(\pi_I(x_1 + x_2)) + \pi_{Y_1 \cup Y_2}(y_1 + y_2) \\ s_2 \end{pmatrix}.$$
And we know that $A_1(\pi_I(x_1 + x_2)) + \pi_{Y_1 \cup Y_2}(y_1 + y_2) = s_1$ by the collision of $S$ and $T$ in Step 9.

We now prove that the algorithm returns every such vector with $He^\top = s$, under the assumption that we worked with a correct partitioning into $X_1, X_2, Y_1, Y_2, Y_3$. We do this by checking that the algorithm considers all possible combinations and does not exclude any possible solution. $U$ is invertible and hence does not exclude any solution when multiplied with $H$ and $s$. In Steps 6 and 7, where we build the sets $S$ and $T$, we go through all possible error vectors $x_1, x_2, y_1$ and $y_2$ that have the desired weight distribution. There are only two steps in the algorithm where we exclude certain vectors:
1. When we only keep the collisions between $S$ and $T$ in Step 9. This is justified, as $A_1 e_1 + e_2 = s_1$, i.e.
$$A_1(\pi_I(x_1)) + \pi_{Y_1 \cup Y_2}(y_1) = -A_1(\pi_I(x_2)) + s_1 - \pi_{Y_1 \cup Y_2}(y_2),$$
needs to be satisfied.
2. When we check whether $w(-A_2(\pi_I(x_1 + x_2)) + s_2) = t - v_1 - v_2 - w_1 - w_2$. This is also justified, as $e_3 \in \mathbb{F}_q^{\ell_3}$ needs this weight in order to complete the weight of $e$ to $t$.
Hence we consider all possible error vectors that have the given weight distribution and satisfy $UHe^\top = Us$.
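Steps 6 to 9 are the meet-in-the-middle core of the algorithm. The following Python sketch is our own reorganization of these steps, not part of the paper's description: all names are ours, we work directly on the projected coordinates $\pi_I$ and $\pi_{Y_1 \cup Y_2}$, and we bucket the values of $S$ in a hash map so that collisions with $T$ are found by lookup rather than by the literal double loop of Steps 8 and 9.

```python
import numpy as np
from itertools import combinations, product
from collections import defaultdict

def supported_vectors(q, positions, weight, length):
    """All length-`length` vectors over F_q with exactly `weight` nonzero
    entries, supported on the (0-indexed) index set `positions`."""
    for supp in combinations(positions, weight):
        for vals in product(range(1, q), repeat=weight):
            v = np.zeros(length, dtype=int)
            v[list(supp)] = vals
            yield v

def collisions(q, A1, s1, X1, X2, Z1, Z2, v1, v2, w1, w2):
    """Steps 6-9 on projected coordinates: X1, X2 index positions inside
    pi_I(e), and Z1, Z2 index positions inside pi_{Y1 u Y2}(e)."""
    l, k = A1.shape
    buckets = defaultdict(list)
    for x1 in supported_vectors(q, X1, v1, k):           # set S, bucketed by value
        for y1 in supported_vectors(q, Z1, w1, l):
            buckets[tuple((A1 @ x1 + y1) % q)].append((x1, y1))
    for x2 in supported_vectors(q, X2, v2, k):           # set T, matched by lookup
        for y2 in supported_vectors(q, Z2, w2, l):
            key = tuple((-(A1 @ x2) + s1 - y2) % q)
            for x1, y1 in buckets[key]:
                yield x1, x2, y1, y2                     # candidates for Step 10
```

A matching key means exactly $A_1(x_1 + x_2) + (y_1 + y_2) = s_1$; each yielded quadruple still has to pass the weight check of Step 10. Bucketing by value is the standard way to realize the collision search in time roughly $|S| + |T|$ plus the number of collisions.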
5. Complexity analysis

In this section we analyze the complexity of the extended ball-collision algorithm over $\mathbb{F}_q$. Since the cost will be given in operations over $\mathbb{F}_q$, we denote by $M$ the number of multiplications and by $A$ the number of additions needed.

Note that one addition over $\mathbb{F}_q$ costs $\log_2(q)$ bit operations and one multiplication over $\mathbb{F}_q$ costs $\log_2(q)^2$ bit operations; observe that one could also use the speed-up of [29], so that a multiplication costs $\log_2(q)\log_2(\log_2(q))\log_2(\log_2(\log_2(q)))$ bit operations.

Success probability of one iteration

We have the same success probability over $\mathbb{F}_q$ as over $\mathbb{F}_2$, as observed in Section 3.3; hence one iteration succeeds with probability
$$\binom{n}{t}^{-1} \binom{\ell_3}{t-v_1-v_2-w_1-w_2} \binom{k_1}{v_1} \binom{k_2}{v_2} \binom{\ell_1}{w_1} \binom{\ell_2}{w_2}. \quad (1)$$

Cost of one iteration

1. In Step 4 of the algorithm, one uses Gaussian elimination to find an invertible matrix $U$ bringing $H$ into systematic form. Since we will also need to compute $Us$, we directly perform Gaussian elimination on the matrix $(H \mid s)$, where the vector $s$ is adjoined as a column to $H$. A broad estimate of the cost of this step is $(n-k)^2(n+1)(A + M)$.

2. To build the set $S$ we use the concept of intermediate sums over $\mathbb{F}_q$ described before. To compute $A_1(\pi_I(x_1))$ for all $x_1 \in \mathbb{F}_q^n(X_1)$ we need $(q-1)(\ell_1+\ell_2)k_1$ multiplications and $(\bar{L}(k_1,v_1) - k_1(q-1))(\ell_1+\ell_2)$ additions. After having computed $A_1(\pi_I(x_1))$, we add $\pi_{Y_1 \cup Y_2}(y_1)$ using intermediate sums. This costs $\bar{L}(\ell_1,w_1)$ additions for each of the $x_1 \in \mathbb{F}_q^n(X_1)$, which are $\binom{k_1}{v_1}(q-1)^{v_1}$ many. This results in a total cost of
$$(q-1)(\ell_1+\ell_2)k_1 M + (\bar{L}(k_1,v_1) - k_1(q-1))(\ell_1+\ell_2) A + \binom{k_1}{v_1}(q-1)^{v_1}\bar{L}(\ell_1,w_1) A.$$

3. To build the set $T$ we proceed similarly, the only difference being that $s_1$ needs to be added in the first step of the intermediate sums over $\mathbb{F}_q$, adding a cost of $(\ell_1+\ell_2)(q-1)k_2$ additions. The total cost of this step is hence
$$(q-1)(\ell_1+\ell_2)k_2(M + A) + (\bar{L}(k_2,v_2) - k_2(q-1))(\ell_1+\ell_2) A + \binom{k_2}{v_2}(q-1)^{v_2}\bar{L}(\ell_2,w_2) A.$$

4. In Step 9, when checking for collisions between $S$ and $T$, we want to calculate the number of collisions we can expect on average. The elements of $S$ and $T$ are all of length $\ell_1+\ell_2$, hence there is a total of $q^{\ell_1+\ell_2}$ possible values. $S$ has $\binom{k_1}{v_1}\binom{\ell_1}{w_1}(q-1)^{v_1+w_1}$ many elements and $T$ has $\binom{k_2}{v_2}\binom{\ell_2}{w_2}(q-1)^{v_2+w_2}$ many elements; we therefore expect
$$\binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2} \frac{(q-1)^{v_1+v_2+w_1+w_2}}{q^{\ell_1+\ell_2}}$$
collisions on average.

5. For each collision, we check whether $w(-A_2(\pi_I(x_1+x_2)) + s_2) = t - v_1 - v_2 - w_1 - w_2$ is satisfied. For this we use the method of early abort: computing one entry of the result costs $(v_1+v_2+1)$ additions and $(v_1+v_2)$ multiplications; hence this step costs on average
$$\frac{q}{q-1}(t-v_1-v_2-w_1-w_2+1)\big((v_1+v_2+1)A + (v_1+v_2)M\big).$$

Hence the total cost of one iteration is given by
$$\begin{aligned}
&(n-k)^2(n+1)(A+M) \\
&+ (\ell_1+\ell_2)\big[(q-1)((k_1+k_2)M + k_2 A) + (\bar{L}(k_1,v_1) - k_1(q-1))A + (\bar{L}(k_2,v_2) - k_2(q-1))A\big] \\
&+ \binom{k_1}{v_1}\bar{L}(\ell_1,w_1)(q-1)^{v_1}A + \binom{k_2}{v_2}\bar{L}(\ell_2,w_2)(q-1)^{v_2}A \\
&+ \binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2}(q-1)^{v_1+v_2+w_1+w_2} q^{-(\ell_1+\ell_2)} \cdot \frac{q}{q-1}(t-v_1-v_2-w_1-w_2+1)\big((v_1+v_2+1)A + (v_1+v_2)M\big).
\end{aligned} \quad (2)$$
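The formulas (1) and (2) are easy to evaluate numerically. The following Python sketch is our own implementation; for simplicity it counts additions and multiplications together as single field operations, rather than tracking $A$ and $M$ separately, and all function names are ours.

```python
from math import comb

def Lbar(n, t, q):
    """L-bar(n,t) = sum_{i=1..t} C(n,i)(q-1)^i, the number of intermediate sums."""
    return sum(comb(n, i) * (q - 1)**i for i in range(1, t + 1))

def success_probability(n, t, k1, k2, l1, l2, v1, v2, w1, w2):
    """Probability (1) that one iteration guessed the right weight distribution."""
    l3 = n - k1 - k2 - l1 - l2
    return (comb(l3, t - v1 - v2 - w1 - w2) * comb(k1, v1) * comb(k2, v2)
            * comb(l1, w1) * comb(l2, w2)) / comb(n, t)

def iteration_cost(q, n, t, k1, k2, l1, l2, v1, v2, w1, w2):
    """Field operations per iteration, following formula (2)."""
    k, l = k1 + k2, l1 + l2
    cost = (n - k)**2 * (n + 1) * 2                         # Gaussian elimination (A + M)
    cost += l * ((q - 1) * (k1 + k2) + (q - 1) * k2         # building S and T
                 + Lbar(k1, v1, q) - k1 * (q - 1)
                 + Lbar(k2, v2, q) - k2 * (q - 1))
    cost += comb(k1, v1) * (q - 1)**v1 * Lbar(l1, w1, q)
    cost += comb(k2, v2) * (q - 1)**v2 * Lbar(l2, w2, q)
    ncoll = (comb(k1, v1) * comb(k2, v2) * comb(l1, w1) * comb(l2, w2)
             * (q - 1)**(v1 + v2 + w1 + w2)) / q**l         # expected collisions
    cost += ncoll * q / (q - 1) * (t - v1 - v2 - w1 - w2 + 1) \
            * (2 * (v1 + v2) + 1)                           # early-abort weight checks
    return cost

def expected_total_cost(q, n, t, *params):
    """Overall cost: cost of one iteration divided by the success probability."""
    return iteration_cost(q, n, t, *params) / success_probability(n, t, *params)
```

Dividing the iteration cost by the success probability gives the expected overall cost, which is exactly the quantity combined in the next paragraph and minimized in the parameter optimization below.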
Overall cost

Combining the results from (1) and (2), the overall cost of the ball-collision algorithm over $\mathbb{F}_q$ amounts to
$$\begin{aligned}
&\binom{n}{t}\left(\binom{\ell_3}{t-v_1-v_2-w_1-w_2}\binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2}\right)^{-1} \\
&\cdot \Big[(n-k)^2(n+1)(A+M) \\
&\quad + (\ell_1+\ell_2)\big[(q-1)((k_1+k_2)M + k_2 A) + (\bar{L}(k_1,v_1) - k_1(q-1))A + (\bar{L}(k_2,v_2) - k_2(q-1))A\big] \\
&\quad + \binom{k_1}{v_1}\bar{L}(\ell_1,w_1)(q-1)^{v_1}A + \binom{k_2}{v_2}\bar{L}(\ell_2,w_2)(q-1)^{v_2}A \\
&\quad + \binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2}(q-1)^{v_1+v_2+w_1+w_2} q^{-(\ell_1+\ell_2)} \cdot \frac{q}{q-1}(t-v_1-v_2-w_1-w_2+1)\big((v_1+v_2+1)A + (v_1+v_2)M\big)\Big].
\end{aligned}$$

5.1. Asymptotic complexity

In this subsection we determine the asymptotic complexity of the ball-collision algorithm over $\mathbb{F}_q$. Fix real numbers $0 < T < 1/2$ and $R$ with $-T\log_q(T) - (1-T)\log_q(1-T) \leq 1-R < 1$. We consider codes of large length $n$: we fix functions $k, t \colon \mathbb{N} \to \mathbb{N}$ which satisfy $\lim_{n\to\infty} t(n)/n = T$ and $\lim_{n\to\infty} k(n)/n = R$. We fix real numbers $V, W, L$ with $0 \leq V \leq R/2$, $0 \leq W \leq L$ and $0 \leq T - 2V - 2W \leq 1 - R - 2L$. We fix the parameters $v_1, v_2, w_1, w_2, \ell_1, \ell_2, k_1, k_2$ of the ball-collision algorithm over $\mathbb{F}_q$ such that, for $i \in \{1, 2\}$,

i) $\lim_{n\to\infty} v_i/n = V$,
ii) $\lim_{n\to\infty} w_i/n = W$,
iii) $\lim_{n\to\infty} k_i/n = R/2$,
iv) $\lim_{n\to\infty} \ell_i/n = L$.

We use the convention that $x\log_q(x) = 0$ for $x = 0$. In what follows we will use the following asymptotic formula for binomial coefficients:
$$\lim_{n\to\infty} \frac{1}{n}\log_q\binom{(\alpha + o(1))n}{(\beta + o(1))n} = \alpha\log_q(\alpha) - \beta\log_q(\beta) - (\alpha-\beta)\log_q(\alpha-\beta).$$
With this formula we get:

i) $\lim_{n\to\infty} \frac{1}{n}\log_q\binom{n}{t} = -T\log_q(T) - (1-T)\log_q(1-T)$,
ii) $\lim_{n\to\infty} \frac{1}{n}\log_q\binom{k_i}{v_i} = \frac{R}{2}\log_q(R/2) - V\log_q(V) - (R/2-V)\log_q(R/2-V)$,
iii) $\lim_{n\to\infty} \frac{1}{n}\log_q\binom{\ell_i}{w_i} = L\log_q(L) - W\log_q(W) - (L-W)\log_q(L-W)$,
iv) $\lim_{n\to\infty} \frac{1}{n}\log_q\binom{n-k-\ell_1-\ell_2}{t-v_1-v_2-w_1-w_2} = (1-R-2L)\log_q(1-R-2L) - (T-2V-2W)\log_q(T-2V-2W) - (1-R-2L-T+2V+2W)\log_q(1-R-2L-T+2V+2W)$.

Success probability

We denote by $S(V,W,L)$ the asymptotic exponent of the success probability:
$$\begin{aligned}
S(V,W,L) &= \lim_{n\to\infty} \frac{1}{n}\log_q\left(\binom{n}{t}^{-1}\binom{n-k-\ell_1-\ell_2}{t-v_1-v_2-w_1-w_2}\binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2}\right) \\
&= T\log_q(T) + (1-T)\log_q(1-T) + (1-R-2L)\log_q(1-R-2L) \\
&\quad - (T-2V-2W)\log_q(T-2V-2W) \\
&\quad - (1-R-2L-T+2V+2W)\log_q(1-R-2L-T+2V+2W) \\
&\quad + R\log_q(R/2) - 2V\log_q(V) - (R-2V)\log_q(R/2-V) \\
&\quad + 2L\log_q(L) - 2W\log_q(W) - 2(L-W)\log_q(L-W).
\end{aligned}$$

Cost of one iteration

We denote by $C(V,W,L)$ the asymptotic exponent of the cost of one iteration:
$$\begin{aligned}
C(V,W,L) &= \lim_{n\to\infty} \frac{1}{n}\log_q\bigg(\binom{k_1}{v_1}(q-1)^{v_1} + \binom{k_2}{v_2}(q-1)^{v_2} + \binom{k_1}{v_1}\binom{\ell_1}{w_1}(q-1)^{v_1+w_1} \\
&\qquad + \binom{k_2}{v_2}\binom{\ell_2}{w_2}(q-1)^{v_2+w_2} + \binom{k_1}{v_1}\binom{k_2}{v_2}\binom{\ell_1}{w_1}\binom{\ell_2}{w_2}(q-1)^{v_1+v_2+w_1+w_2}q^{-\ell_1-\ell_2}\bigg) \\
&= \max\Big\{\log_q(q-1)V + \tfrac{R}{2}\log_q(R/2) - V\log_q(V) - (R/2-V)\log_q(R/2-V), \\
&\qquad \log_q(q-1)(V+W) + \tfrac{R}{2}\log_q(R/2) - V\log_q(V) - (R/2-V)\log_q(R/2-V) \\
&\qquad\quad + L\log_q(L) - W\log_q(W) - (L-W)\log_q(L-W), \\
&\qquad \log_q(q-1)(2V+2W) - 2L + R\log_q(R/2) - 2V\log_q(V) \\
&\qquad\quad - (R-2V)\log_q(R/2-V) + 2L\log_q(L) - 2W\log_q(W) - (2L-2W)\log_q(L-W)\Big\}.
\end{aligned}$$

Overall cost

The overall asymptotic cost exponent of the ball-collision algorithm over $\mathbb{F}_q$ is given by the difference of $C(V,W,L)$ and $S(V,W,L)$:
$$\begin{aligned}
D(V,W,L) &= \max\Big\{\log_q(q-1)V - \tfrac{R}{2}\log_q(R/2) + V\log_q(V) + (R/2-V)\log_q(R/2-V) \\
&\qquad\quad - 2L\log_q(L) + 2W\log_q(W) + 2(L-W)\log_q(L-W), \\
&\qquad \log_q(q-1)(V+W) - \tfrac{R}{2}\log_q(R/2) + V\log_q(V) + (R/2-V)\log_q(R/2-V) \\
&\qquad\quad - L\log_q(L) + W\log_q(W) + (L-W)\log_q(L-W), \\
&\qquad \log_q(q-1)(2V+2W) - 2L\Big\} \\
&\quad - T\log_q(T) - (1-T)\log_q(1-T) - (1-R-2L)\log_q(1-R-2L) \\
&\quad + (T-2V-2W)\log_q(T-2V-2W) \\
&\quad + (1-R-2L-T+2V+2W)\log_q(1-R-2L-T+2V+2W).
\end{aligned}$$
The asymptotic complexity is then given by $q^{D(V,W,L)n + o(n)}$.

Asymptotically, we assume that the code attains the Gilbert-Varshamov bound, i.e. the code rate $R = k/n$ and the relative distance $D = d/n$ are related via
$$R = 1 + D\log_q(D) + (1-D)\log_q(1-D) - D\log_q(q-1). \quad (3)$$
In order to compute the asymptotic complexity of half-distance decoding (i.e. $T = D/2$) for a fixed rate $R$, we performed a numerical optimization of the parameters $V, W$ and $L$ such that the overall cost $D(V,W,L)$ is minimized subject to the following constraints: $0 \leq V \leq R/2$, $0 \leq W \leq L$ and $0 \leq T - 2V - 2W \leq 1 - R - 2L$. Let $F(q,R)$ be the exponent of the optimized asymptotic complexity. The asymptotic complexity of half-distance decoding at rate $R$ over $\mathbb{F}_q$ is then given by $q^{F(q,R)n + o(n)}$. In Table 1, the values refer to the exponent of the worst-case complexity of the distinct algorithms, i.e. $F(q, R_w)$ where $R_w = \operatorname{argmax}_{0 < R < 1} F(q,R)$.
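To illustrate how $F(q,R)$ can be computed, the following Python sketch (our own; the paper does not specify the optimizer used, so a coarse grid search stands in for it) evaluates $D(V,W,L)$, solves the Gilbert-Varshamov equation (3) for the relative distance by bisection, and minimizes over the feasible region for half-distance decoding:

```python
import numpy as np
from math import log

def hq(x, q):
    """x * log_q(x), with the convention 0 * log_q(0) = 0."""
    return 0.0 if x <= 0 else x * log(x, q)

def exponent_D(q, R, T, V, W, L):
    """Overall asymptotic cost exponent D(V,W,L) = C(V,W,L) - S(V,W,L)."""
    lq = lambda x: log(x, q)
    t1 = lq(q - 1) * V - hq(R / 2, q) + hq(V, q) + hq(R / 2 - V, q) \
         - 2 * hq(L, q) + 2 * hq(W, q) + 2 * hq(L - W, q)
    t2 = lq(q - 1) * (V + W) - hq(R / 2, q) + hq(V, q) + hq(R / 2 - V, q) \
         - hq(L, q) + hq(W, q) + hq(L - W, q)
    t3 = lq(q - 1) * (2 * V + 2 * W) - 2 * L
    rest = -hq(T, q) - hq(1 - T, q) - hq(1 - R - 2 * L, q) \
           + hq(T - 2 * V - 2 * W, q) + hq(1 - R - 2 * L - T + 2 * V + 2 * W, q)
    return max(t1, t2, t3) + rest

def gv_distance(q, R, tol=1e-12):
    """Relative GV distance: solve (3) for D by bisection on (0, (q-1)/q)."""
    f = lambda D: 1 + hq(D, q) + hq(1 - D, q) - D * log(q - 1, q) - R
    lo, hi = 1e-12, (q - 1) / q
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return lo

def F(q, R, grid=40):
    """Coarse grid search for min D(V,W,L) under the stated constraints."""
    T = gv_distance(q, R) / 2            # half-distance decoding
    best = float("inf")
    for V in np.linspace(0, min(R / 2, T / 2), grid):
        for L in np.linspace(0, (1 - R) / 2, grid):
            for W in np.linspace(0, min(L, (T - 2 * V) / 2), grid):
                if 0 <= T - 2 * V - 2 * W <= 1 - R - 2 * L:
                    best = min(best, exponent_D(q, R, T, V, W, L))
    return best
```

A finer grid, or a constrained nonlinear optimizer seeded by the grid optimum, tightens the value; maximizing the result over $R$ then gives the worst-case exponent $F(q, R_w)$ reported in Table 1.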