Acta Polytechnica Acta Polytechnica 53(4):344–346, 2013 © Czech Technical University in Prague, 2013 available online at http://ctn.cvut.cz/ap/ ON AN ALGORITHM FOR MULTIPERIODIC WORDS Štěpán Holub∗ Department of Algebra, Faculty of Mathematics and Physics, Charles University, Sokolovská 83, 175 86 Praha ∗ corresponding author: holub@karlin.mff.cuni.cz Abstract. We consider an algorithm by Tijdeman and Zamboni constructing a word of length k that has periods p1, . . . , pr, and the richest possible alphabet. We show that this algorithm can be easily stated and its correctness briefly proved using the class equivalence approach. Keywords: periodicity, combinatorics on words. 1. A short (personal) history Non-trivial words with a given set of periods P = {p1, p2, . . . , pr} have received a lot of attention in the past decade. The motivation was to generalize the result by Fine and Wilf dealing with two periods, which has become part of the folklore. A word with periods P is called trivial if gcd(P ) is its period too. Papers [1] and [2] are two (independent) results con- sidering non-trivial such words with maximal length and maximal cardinality of the alphabet. These pa- pers supplemented some older research of Castelli, Mignosi, Restivo and Justin (see, e.g., [3] for more details and references). Already in 1998, I wrote a short manuscript giving an analogous result (without considering publishing it), which I showed to Sorin Constantinescu during the WORDS 2003 conference in Turku, where he presented their results. Since this was passed without notice in the subsequent publi- cation, and since I considered my approach simpler and more natural, I later decided to publish it in [4]. There was a gap in my paper, discovered by Gwénaël Richomme, which is fixed in [5]. The present paper extends the same approach to the construction of the richest word with a given set of periods and a given length. The basic idea is to con- sider relations defined by the periods and understand letters as (names of) equivalence classes generated by those relations. The idea is obvious and well known, usually expressed using the graph terminology (edges and connected components), rather than the alge- braic terminology (relations and equivalence classes). Tijdeman and Zamboni [3] point out that the straight- forward algorithm based on the graph approach is “simple but inefficient”, and then present an algorithm based on less transparent combinatorial analysis. The aim of this paper is to give a short description of their algorithm, as well as a short and intuitive proof of its correctness, using consistently the graph/equivalence viewpoint. 2. Notation Let w be a word of length k over an alphabet A. The set of all letters that occur in w is denoted by alph(w). The i-th letter of w is denoted by w[i − 1] so that w = w[0]w[1] · · ·w[k − 1]. The prefix of w of length n is denoted by pref n(w). We say that a positive integer p is a period of a word w if w[i] = w[i + p] for all 0 ≤ i ≤ |w|− p − 1 (where |w| denotes the length of the word). Note that any p ≥ |w| is a period of u. If P is a set of positive integers such that each p ∈ P is a period of w, we say that w has periods P . The word of length k having periods P and the maximal possible cardinality of alph(w) is called an FW-word relative to P (where FW stands for “Fine and Wilf” for historic reasons). The word is called trivial with respect to P if gcd(P ) is a period of w. The longest non-trivial FW-word relative to P is called an extremal FW-word relative to P . We denote its length by L(P ) (note that L(P ) = L(P ) − 1 where L(P ) is the notation adopted in [3]). 3. Classes of Equivalence Let w be a word which has periods P . For the rest of the paper we denote m = min P . Obviously, if i ≡ j mod m or |i − j| ∈ P , then w[i] = w[j]. These two conditions induce the relation ∼P,k on integers {0, . . . , k − 1} defined by: i ∼P,k j if a) i ≡ j mod m, or b) there are integers i′, j′ ∈{0, . . . , k − 1} such that i ≡ i′ mod m, j ≡ j′ mod m and |i′ − j′| ∈ P. Let ≈P,k be the equivalence closure of ∼P,k. In other words, we have i ≈P,k j if and only if i and j lie in the same connected component of the graph defined by edges i ∼P,k j. The class of ≈P,k containing i will be denoted by [i]P,k and represented by its minimal element min[i]P,k. Then we define a word FW(P, k) of length k over the alphabet N by FW(P, k)[i] = min[i]P,k. 344 http://ctn.cvut.cz/ap/ vol. 53 no. 4/2013 On an Algorithm for Multiperiodic Words The construction immediately yields that FW(P, k) is the unique (up to renaming of letters) FW-word of length k relative to P . The alphabet eventually used in FW(P, k) depends on the number of equivalence classes. In fact, its cardinality is part of the informa- tion yielded by the algorithm. Example 1. Let P = {5, 7}. The following picture illustrates the construction of the word FW(P, 8) = 01034010. The upper edges correspond to i ≡ j mod 5, and the lower edge corresponds to |i − j| = 7. 0 1 2 3 4 5 6 7 Note that the condition • i ∼P,k j if |i − j| ∈ P would alone be enough in order to generate the equiv- alence ≈P,k. However, conditions a) and b), though more complicated, are convenient since they allow to limit the number of equivalence classes to m (and the alphabet to {0, 1, . . . , m − 1}) from the very beginning. Consider, for example, how 0 ≈P,8 2 is immediately seen in the following adjusted picture. 0 1 2 3 4 0 1 2 4. The algorithm The basic step of the algorithm is the reduction of P to a new set of periods Q defined by Q = {p − m | p ∈ P, p 6= m}∪{m} (1) (where m = min P according to our convention). This reduction is, in fact, one step in the Eucledean algo- rithm, and is well known in the literature on multiperi- odic words. The key fact about P and Q is expressed in the following lemma, which is an improved version of Lemma 2 from [4]. Lemma 1. Let k ≥ 0. Then for all i, j ∈{0, 1, . . . , k} [i]Q,k = [j]Q,k if and only if [i]P,k+m = [j]P,k+m. Proof. “⇒”: If [i]Q,k = [j]Q,k, then there is a sequence i = i0, . . . , i` = j, of numbers from {0, 1, . . . , k − 1} such that is ∼Q,k is+1 for each s = 0, . . . , ` − 1. The relation is ∼Q,k is+1 implies is ∼P,k+m is+1, since either • is ≡ is+1 mod m, or • max{is, is+1} + m − min{is, is+1}∈ P . Therefore [i]P,k+m = [j]P,k+m. “⇐”: On the other hand, let i = i0, . . . , i` = j, be a sequence of numbers from {0, . . . , k + m − 1}, with i, j ∈{0, . . . , k − 1}, such that is ∼P,k+m is+1 for each s = 0, . . . , ` − 1. Certainly, we can suppose that the numbers in the sequence are pairwise distinct, whence |is − is+1| ≥ m and both min{is, is+1} and max{is, is+1}− m are in {0, 1, . . . , k − 1}. We now see that max{is, is+1}− m ∼Q,k min{is, is+1}. Therefore the sequence i = i0, (i0 mod m), . . . , (i` mod m), i` = j proves [i]Q,k = [j]Q,k. We have an immediate corollary. Corollary 1. For any k ≥ 0, the word FW(Q, k) is a prefix of FW(P, k + m). The following lemma is an easy observation. Lemma 2. Let k−m ≤ i ≤ m−1. Then [i]P,k = {i}. Proof. Both i − p and i + p are out of range {0, 1, . . . , k − 1} for any p ∈ P (including m). There- fore i is not related by ∼P,k to any other element. From Corollary 1 and Lemma 2, the formula LP = m + max{LQ, m − 1} can be readily derived (see [4, 5]). In addition, it yields the following construction of FW(P, k), equivalent to Algorithm B described in [3]. FW-construction (Algorithm B). (1.) If k ≤ m, then Lemma 2 gives FW(P, k) = 0 · 1 · · ·(k − 1). (Recall that we consider integers as letters. To stress this, we use the typewriter font for them. The multiplication sign means concatenation). (2.) Let k > m. Since the word FW(P, k) has a period m, it is determined by its prefix w of length m. Denote u = FW(Q, k−m). Corollary 1 and Lemma 2 imply that • w = pref m(u) if m ≤ k − m, and • w = u · |u| · (|u| + 1) · · ·(m − 1) otherwise. This can be succinctly stated as: FW(P, k)[i] =   FW(Q, k − m)[i mod m] if (i mod m) < k − m; i mod m otherwise. 345 Štěpán Holub Acta Polytechnica Example 2. Let P = {5, 7} and k = 8 as in Example 1. Recursive definition of FW(P, 8) leads to P = Q0 = {5, 7} k = k0 = 8 Q1 = {2, 5} k1 = 3 Q2 = {2, 3} k2 = 1 In order to obtain the word u0 = FW(Q0, k0) = FW(P, 8) we will need words u1 = FW(Q1, k1) and u2 = FW(Q2, k2). Since k2 = 1, we have u2 = 0. From the point (2.) above we have u1 = pref 3(w ω 1 ) where w1 = 01. Therefore u1 = 010. Similarly, we get u0 = pref 8(w ω 0 ) where w0 = 01034, whence FW(P, 8) = 01034010. Schematically: Q0 = {5, 7} Q1 = {2, 5} Q2 = {2, 3} k0 = 8 k1 = 3 k2 = 1 u0 = 01034010 u1 = 010 u2 = 0 w0 = 01034 w1 = 01 From the above example we see that the procedure has two parts: “descending” and “ascending”, which are called “Reduction” and “Extension” in [3]. The end of reduction can be defined in several ways. We have seen that we can turn to extension as soon as we know FW(Qi, ki). This typically happens if ki ≤ min Qi, or if min Qi = gcd(Qi). 5. Concluding remarks As already remarked, the above algorithm is identical with Algorithm B from [3]. Even all arguments we use can be in some way traced back to similar arguments in the literature. Nevertheless, I believe that the description presented here provides further evidence that the equivalence class approach is not only simple but it also yields an intuition sufficient to formulate and understand the construction. (Another elegant example, in my opinion, is the proof of the fact that the extremal FW-word is a palindrome, given in [4].) That said, one should stress that the inefficiency claim concerning the equivalence class approach is valid if we consider the naïve procedure suggested by Example 1. The precise computational complexity of Algorithm B goes beyond the scope of this paper (see the discussion in [3]). One possible drawback can be a bit discouraging no- tation like ∼P,k, and the fact that notions like “equiv- alence closure” may sound “too algebraic” to some ears. Computer theorists could therefore like to trans- late the exposition into graph language and speak about edges instead of generating relations, and about connected components instead of equivalence classes. The rest will be the same. Acknowledgements This work was supported by Czech Science Foundation grant 13-01832S. References [1] S. Constantinescu, et al. Generalised Fine and Wilf’s theorem for arbitrary number of periods. Theoret Comput Sci 339(1):49–60, 2005. [2] R. Tijdeman, et al. Fine and Wilf words for any periods. Indag Math (NS) 14(1):135–147, 2003. [3] R. Tijdeman, et al. Fine and wilf words for any periods II. Theor Comput Sci 410(30-32):3027–3034, 2009. [4] Š. Holub. On multiperiodic words. Theor Inform Appl 40(4):583–591, 2006. [5] Š. Holub. Corrigendum: On multiperiodic words. Theor Inform Appl 45(4):467–469, 2011. 346 Acta Polytechnica 53(4):344–346, 2013 1 A short (personal) history 2 Notation 3 Classes of Equivalence 4 The algorithm 5 Concluding remarks Acknowledgements References