AP07_2-3.vp 1 Introduction The study of factor complexity, palindromic complexity, balance property, return words, and the recurrence function of infinite aperiodic words is an interesting combinatorial problem. Moreover, the investigation of infinite words coding �-integers ��, for � being a Pisot number, can be interpreted as an investigation of one-dimensional quasicrystals. In this paper, new results concerning return words and recurrence function will be presented. In general, little is known about re- turn words. A return word of factor w of an infinite word u is any word which starts at some occurrence of w in u and ends just before the next occurrence of w. Vuillon [1] has shown that an infinite word over a bilateral alphabet is sturmian if and only if each of its factors has exactly two return words. Justin and Vuillon [2] proved that each factor of an Arnoux-Rauzy word of order m has m return words. Moreover, Vuillon proved that each factor of an -interval exchange coding word also has return words. Return words in the Thue-Morse se- quence and return words in the infinite words associated with simple Parry numbers have been studied in [4]. Even less is known about the recurrence function which, for an infinite uniformly recurrent word u, determines the minimal length Ru(n) such that if we take any subword of length Ru(n) of u, we will find there all the factors of length n. Cassaigne [5] deter- mined this function for sturmian words taking into account the continued fractions of their slope and, in [6], he given a general algorithm describing how to determine the recur- rence function if we know the return words. This algorithm will constitute the headstone of our further results. We will describe the cardinality of the set of return words for any factor w of the infinite word u� associated with qua- dratic non-simple Parry (and, thus Pisot) number � and, as a consequence, we will be able to deduce the exact formula for the recurrence function. These results will complete the list of properties already studied: factor complexity has been stud- ied in [7], palindromic complexity is described in [8], and re- sults on balances can be found in [9]. 2 Preliminaries First, let us introduce our “language”, which will be used throughout this paper. An alphabet � is a finite set of symbols called letters. A concatenation of letters is a word. The length of word w is denoted by w . The set �* of all finite words (includ- ing the empty word �) provided with the operation of concate- nation is a free monoid. We will also deal with right-sided infi- nite words u u u u� 0 1 2 � . A finite word w is called a factor of the word u (finite or infinite) if there exist a finite word w(1) and a word w(2) (finite or infinite) such that u w ww� ( ) ( )1 2 . The word w is a prefix of u if w( )1 � � and it is a suffix of u if w( )2 � �. A concatenation of k letters a (or words a) will be denoted by ak. The language of u, �(u), is the set of all factors of a word u, and �n(u) is the set of all factors of u of length n. We say that a letter a �� is a left extension of a factor w u� �( ) if the factor aw belongs to the language �(u) and w is left special if w has at least two left extensions. Right extensions and right special factors are defined by analogy. We call a factor w u� �( ) bispecial if it is left special and right special. A mapping � on the free monoid �* is called a morphism if � � �( ) ( ) ( )vW v w� for all v w, *�� . Obviously, for determining any morphism it suffices to give �(a) for all a ��. The action of a morphism can be naturally extended on right-sided infinite words by the prescription � � � �( ) : ( ) ( ) ( )u u u u u u0 1 2 0 1 2� �� A non-erasing morphism �, for which there exists a letter a �� such that �( )a aw� for some non-empty word w �� *, is called a substitution. An infinite word u such that �( )u u� is called a fixed point of the substitution �. Obviously, every sub- stitution has at least one fixed point, namely lim ( ) n n a �� � . In everything that follows, we will focus on the infinite word u� being the only fixed point of the substitution � given by � �( ) , ( ) , ,0 0 1 1 0 1 1� � � � �a b a b a �. (1) Let us only briefly mention that this infinite word codes the set of �-integers ��, where � is a quadratic non-simple Parry number. All details can be found, for example, in [8]. Since our main interest is in studying return words and the re- currence function, let us give the corresponding definitions. Definition 1 Let w be a factor of an infinite word u u u� 0 1 � (with u j ��), w � �. An integer � is an occurrence of w in u if u u u wj j j� � � �1 1� � . Let j, k, j 0 such that any segment of u of length � R n( ) contains all words in �n(u). As we deal with the infinite word u n� �� ��lim ( )0 , where �( )0 0 1� a , �( )1 0 1� b , a b� �1 , let us also recall what is known about the uniform recurrence of words being fixed points of a substitution. Definition 2 A substitution � over an alphabet � � a a ak1 2, , , ,� is called primitive if there exists K � � such that for any letter ai ��, the word �K(ai) contains all letters of �. Queffélec in [10] has shown that any fixed point of a prim- itive substitution � is a uniformly recurrent word. Thus, since the infinite word ub is a fixed point of a primitive substitution, u� is uniformly recurrent. It is not difficult to see that the set of return words of w is finite for any factor w u� �( ) if u is a uni- formly recurrent word. Definition 3 The recurrence function of an infinite word u is the function Ru : � �� � �� defined by � R n N v u v uu N n n( ) inf ( ), ( ) ( )� � � � � � ��� � � � In other words, Ru(n) is the smallest length such that if we take any segment of u of length Ru(n), we will find there all the factors of u of length n. Clearly, u is uniformly recurrent if and only if Ru(n) is fi- nite for every n � �. To get another expression for Ru(n) that is convenient to work with, let us introduce some more terms. Definition 4 Let u be an infinite uniformly recurrent word. � Let w u� �( ), then l w v v wu( ) max ( )� � Ret is the maxi- mal return time of w in u. � For all n � �, we define l n l w w uu u n( ) max ( ) ( )� � � . Once we have determined the lengths of the return words, the following proposition from [5] will allow us to calculate the recurrence function Ru(n). Proposition 5 For any infinite uniformly recurrent word u and for any n � �, one has R n l n nu u( ) ( )� � � 1 3 Return words The aim of this section is to determine return words of the infinite word u� being the fixed point of the substitution � in- troduced in (1). Vuillon in [1] has shown the following result for sturmian words. Proposition 6 Let u be an infinite word over a two letter alphabet. Then u is sturmian if and only if # ( )Ret w � 2 for every factor w of u. Let us mention that u� is sturmian for a b� �1 , as follows from [7]. Example 7 Let u � � 001001010010010100101� be the fixed point of the substitution � �( ) , ( )0 001 1 01� � (sturmian case). Let us show ex- amples of return words: Ret( ) { , }0 0 01� Ret( ) { , }00 001 00101� Ret( ) { , }001 001 00101� Ret( ) { , }0010 001 00101� Throughout this section, we will use methods analogous to those ones introduced in [4]. In order to study return words Ret(w) of factors w of an infinite uniformly recurrent word u, it is possible to limit our considerations to bispecial factors. Namely, if a factor w is not right special, i.e., if it has a unique right extension a ��, then the sets of occurrences of w and wa coincide, and Ret Ret( ) ( )w wa� (2) If a factor w has a unique left extension b ��, then j � 1 is an occurrence of w in the infinite word u if and only if j � 1 is an occurrence of bw. This statement does not hold for j � 0. Nevertheless, if u is a uniformly recurrent infinite word, then the set Ret(w) of return words of w stays the same no matter whether we include the return word corresponding to the pre- fix w of u or not. Consequently, we have Ret Ret Ret( ) ( ) ( )bw b w b bvb v w� � �� �1 1 , (3) where bvb�1 means that the word v is extended to the left by the letter b and the letter b on the right end is erased (Note that b is always the suffix of v for v w� Ret( )). Observation 8 Note that relation (2) implies that if w u� �( ) is not right special, then the maximal return times from Definition 4 satisfy l w l wau u( ) ( )� , where a �� is the only right extension of w. Simi- larly, relation (3) implies that if w is not left special, then l w l bwu u( ) ( )� , where b �� is the only left extension of w. For an aperiodic uniformly recurrent infinite word u, each factor w can be extended to the left and to the right to a bi- special factor. To describe the cardinality of Ret( )w , it suffices therefore to consider bispecial factors w. Observation 9 The only bispecial factors of u� that do not contain the letter 1 are 0 r, r a� � 1. Observation 10 Every bispecial factor w of u� containing at least one letter 1 has the prefix 0b1 and the suffix 1 0b. Consequently, there exists a bispecial factor v such that w vb b� 0 1 0�( ) . (The empty word � is bispecial, too.) Let us summarize the previous observations to get the de- scription of all bispecial factors. Corollary 11 The set of all the bispecial factors of u� is given by B r a nr n( ) , , ,� � �0 1� � , where B Br n b r n b( ) ( )( )� �1 0 1 0� for all n � �, B0 1( ) � �, and Br r( )1 0� for r a� �0 1, ,� . In order to find return words of bispecial factors, let us re- call a lemma from [4]. To formulate it, we also need to recall a definition. Definition 12 Let w be a factor of a fixed point u of a substitution �. We say that a word v v v v un n n0 1 1 1� � �� � ( ) is an ancestor of w if � w is a factor of �� v v v vn n0 1 1� � ), 16 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 47 No. 2–3/2007 � w is neither a factor of �� v v vn n1 1� � ) nor of �� v v v vn n0 1 1� � ). Clearly, any factor of the form �( )w has at least one ances- tor, namely the factor w. Lemma 13 Let an infinite word u be a fixed point of a substitution � and let be a factor of u. If the only ancestor of �(w) is the factor w, then Ret Ret( ( )) ( ( ))� �w w� . Proposition 14 Let w be a non-empty factor of u�, then �(w) has a unique ancestor, namely w. Proof. It is a direct consequence of the form of the substitution defined by (1) that any factor v having suffix 1 occurs only as suffix of �(w) for some factor w u� �( )� . Moreover, any factor �v which has only 1 as its left extension, occurs only as prefix of �( � )w for some � ( )w u� � � . The statement follows by injectivity of �. Now we are able to explain how to get return words of all bispecial factors. Corollary 15 Let Br n( ) be a bispecial factor of u�. Then # ( ) # ( )( ) ( )Ret RetB Br n r n� �1 for every n � 2 and r a� �0 1, ,� . Proof. It follows from Proposition 14 that # ( ) # ( ( ))( ) ( )Ret RetB Br n r n� ��1 1� . Then, Relations (2) and (3) say that # ( ( )) # ( ( ) ) # ( )( ) ( ) ( )Ret Ret Ret� �B B Br n b r n b r n� �� �1 10 1 0 . Let us sum up the results to get the main theorem about the cardinality of the set of return words of any factor of u�. Theorem 16 Let w be a factor of u�. Then 2 3� �# ( )Ret w . Proof. Using Corollary 15, it suffices to consider only bispecial factors of the form 0r, r a� � 1, and � to obtain all possible cardinalities of the sets of return words of factors of u�. It is not difficult to see that the return words of the simplest bispecial factors are the following: � Let r b� then # ( ) ,Ret 0 0 0 1r r� . � Let b r a� � � 1 then # ( ) , ,Ret 0 0 0 1 0 10r r r b� . � Since # ( ( ) ) ( )( ) , ( )( )Ret 0 1 0 0 1 0 0 1 0 1 1 0 1 0 10 1 1b b b b b b b � � � �� � � � a b b� , ,0 1 it is useful to put Ret( ) ,� � 0 1 . Observation 17 From the proof of Theorem 16, we can observe that in the case b a� � 1, # ( )Ret w � 2 for all the factors of u�. Thanks to Proposi- tion 6, we have another confirmation of the fact that u� for b a� � 1 is a sturmian word. 4 How to compute the recurrence function Let us show how to apply the knowledge of return words to describe the recurrence function (Definition 3) of the infinite word u�. The methods used here follow the ideas from [6]. As the reader will have anticipated, we want to compute lu(n) in order to get the recurrence function Ru(n) for every n � �. This task can be simplified using the notion of singular factors. A factor w u� �( ) is called singular if w � 1 or if there exist a word v u� �( ) and letters x x y y, , ,� � �� such that w xvy� , x x� �, y y� �, and � � �x vy xvy u, ( )� . Obviously, v is a bispecial factor. Proposition 18 Let u be a uniformly recurrent word and n � 1. If l n l nu u( ) ( )� �1 , then there exists a singular factor w of length n such that l w l nu u( ) ( )� . A singular factor w is said to be essential if l w l w l wu u u( ) ( ) ( )� � � 1 . It follows that to calculate l nu( ), it is sufficient to consider sin- gular, or, even, only essential singular factors. Theorem 19 Let u be a uniformly recurrent word and n � 1. l n l w w n and w l w w n and w essen u u u ( ) max ( ) max ( ) � � � � singular tial singular . Now we are able to give an algorithm for computing the recurrence function of an infinite uniformly recurrent word u: 1. Determine bispecial factors. 2. Deduce the form of singular factors and compute their lengths. 3. For every singular factor, determine the associated return words and compute their lengths. 4. Compute the function lu(n) to get the recurrence function Ru(n) for every n � � using Proposition 5. 5 Computation of the recurrence function of u� Let us apply the above described algorithm for computing the recurrence function of the infinite word u� being the only fixed point of the substitution �( )0 0 1� a , �( )1 0 1� b , a b� �1 . 1. The bispecial factors of u� are described in Corollary 11. 2. Let us show how the task of describing the singular factors can be simplified using the relation between a factor and its image under the substitution �. Observation 20 Let n � 2 and r a� �0 1, ,� . Then, S x y x B yr n r n( ) ( )( , ) � is a singular factor if and only if S x y x B yr n r n( ) ( )( , )� ��1 1 is a singu- lar factor. If we describe all the simplest non-trivial singular factors, i.e., those factors of the form S x y x B yr r ( ) ( )( , )1 1� , x y, ,� 0 1 , Observation 20 will allow us to get the set of all singular factors of u�. © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 17 Acta Polytechnica Vol. 47 No. 2–3/2007 Proposition 21 The set of all singular factors of u� is given by 0 1 0 1, ( , ) , , ,( )� � � �S x y r a nr n � � where S x y x B yr n r n( ) ( )( , ) � for all n � � and x, y come from the form of the simplest non-trivial singular factors: S r ar r( )( , ) ,1 0 0 00 0 2� � � Sb b( )( , )1 1 0 10 0� , Sb b( )( , )1 0 1 00 1� , Sb b( )( , )1 1 1 10 1� , S0 1 0 0 00( )( , ) � . To compute the length of singular factors, it is, of course, enough to compute the length of bispecial factors since S x y Br n r n( ) ( )( , ) � � 2 . For the lengths of bispecial factors, we have B B Br n r n r n( ) ( ) ( )� � 0 1 where denotes the number of 0’s in Br n( ) and Br n( ) 1 denotes the number of 1’s in Br n( ). Then B B a b B B r n r n r n r n ( ) ( ) ( ) ( ) 0 1 1 0 11 1 � � � � � � � � � � � � � �� � � �� � � 1 � � � � � � � � � � for all n � 2 . 3. To describe the return words of singular factors, we can again make use of a relation between return words of a singular factor and return words of its image under �. Proposition 22 Let n � 2 , r a� �0 1, ,� . The following sets are equal w w S x y v v S x yrn rn� � � �Ret Ret( ( , )) ( ) ( ( , ))� 1 . Proof. We will distinguish two situations. a) For x � 0 and y � 0 1, , applying Observation 8 on re- turn words of factors which are not right or left special, it follows that the set of lengths of return words of S x y x B yr n b r n b( ) ( )( , ) ( )� �0 1 01� is the same as that of �� � � �� �x B yr n( )( )�1 . Thus, we get w w S x y w w S x yrn rn� � � �Ret Ret( ( , )) ( ( ( , )))� 1 . b) The singular factor S Br n b r n b( ) ( )( , ) ( )11 10 1 0 11� �� can be extended to the left without changing the set of lengths of return words to 0 10 1 0 1 01 11 1a b r n b r nB B� � � �( ) ( ) ( ) ( )( ) ( )� �� . Thus, we get w w S x y w w Srn rn� � � �Ret Ret( ( , )) ( ( ( , )))� 0 1 11 . The statement then follows using Lemma 12 and Proposi- tion 14, and, in the case of (b), also using Relation (3) for return words of 0 1 11Sr n � ( , ) and Sr n �1 1 1( , ). Now it suffices to determine return words and their lengths for the simplest singular factors S x yr ( )( , )1 . Proposi- tion 22 implies that the lengths of return words of all the other singular factors will be obtained by calculating the lengths of images of the simplest return words. Here are the return words of the simplest singular factors. a) For the trivial singular factors 0, 1, one gets Ret( ) ,0 0 01� and Ret( ) ,1 10 10� a b . (4) b) For Sr r( )( , )1 0 0 00 0� , using the proof of Theorem 16, we have Ret( ) , , ,00 0 0 1 0 10 1 2 0 0 1 0 10 1 22 2r a a b r r b r a b r� � � � �� � if if � � � � � � � �� � � � � a r br 2 0 0 1 1 22, if (5) c) For Sb b( )( , )1 1 0 10 0� , one can easily find Ret( ) ,10 0 10 10 10b a a b� . (6) d) For Sb b( )( , )1 0 1 00 1� , one has Ret( ) ,( ) ( )00 1 0 10 0 10 101 1 1 1b b a b b b a b� � � � � � � . (7) e) Since Sb b( )( , ) ( )1 1 1 10 1 1 1� � � , using Relation (3), we have Ret Ret( ) ( ( )) ( ) , ( ) . 10 1 1 1 1 1 10 1 1 10 1 1 1 1 b a b v v� � � � � � � � � (8) f) For S0 1 0 0 00( )( , ) � , one has to consider more cases: � Let b � 1 and a � 2 (a sturmian case), it holds Ret( ) ,00 001 00101� . � Let b � 1 and a � 2, then Ret( ) ,00 0 00101� . � Let b � 2, then Ret( ) ,00 0 001� . 4. The last step is to compute l k ku � ( ), � �. For simplicity, let us write l(k) instead of l ku � ( ) in the sequel. Before starting, let us exclude singular factors which are not essential. (Let us recall that a singular factor w is essential if l w l w l w( ) ( ) ( )� � � 1 .) Naturally, we will apply Proposi- tion 22 to determine the maximal return times l(w) of singular factor w. a) Note that S S Sb n b n b n( ) ( ) ( )( , ) ( , ) ( , )1 0 0 1 1 1� � , but from Relations (6), (7), and (8), it is clear that for all n � �, one gets l S l S l Sb n b n b n( ( , )) ( ( , )) ( ( , ))( ) ( ) ( )1 0 0 1 1 1� � . Thus, Sb n( )( , )1 0 and Sb n( )( , )0 1 are not essential singular factors and one does not have to consider them in calculation of l(k), k � �. b) By analogy, if b � 2, we have S Sn b n 0 1 0 0 1 1( ) ( )( , ) ( , )� � , while l S l Sn b n( ( , )) ( ( , ))( ) ( )0 1 0 0 1 1� � for all n � �. Hence, the singular factors S n0 1 0 0( )( , )� are not essential. c) Next, if a r b� � �2 , we have S Sr n b n( ) ( )( , ) ( , )0 0 1 1� , nev- ertheless, l S l Sr n b n( ( , )) ( ( , ))( ) ( )0 0 1 1� for all n � �. There- fore, Sr n( )( , )0 0 , r b� , are not essential. 18 © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ Acta Polytechnica Vol. 47 No. 2–3/2007 d) If 1 2� � �r b , then we obtain S Sr n b n( ) ( )( , ) ( , )� �1 0 0 1 1 , but l S l Sr n b n( ( , )) ( ( , ))( ) ( )� �1 0 0 1 1 for all n � �, thus, the singular factors Sr n( )( , )�1 0 0 , 1 2� � �r b , are not essential. e) The last remark is that for the trivial singular factor w � 1, we have w Sr� ( )( , )1 0 0 , 1 2� � �r b , but l S l wr( ( , )) ( ) ( )1 0 0 � , hence, Sr ( )( , )1 0 0 , 1 2� � �r b , are not essential. The previous facts imply that in order to calculate l(k) we have to take into account only the trivial singular factors, the non-trivial singular factors of the form Sb n( )( , )1 1 , and, eventually, S n0 0 0 ( )( , ) and Sb n �1 0 0 ( )( , ). The formulae for l(k), k � �, will split into more cases, according to the values of a and b in the substitution (1). a) For b � 2, combining the previous facts and the descrip- tion of the simplest singular factors, we obtain the fol- lowing formula for l(k), k � �, � If 2 1b a� � , then l l b a( ) ( )1 1� � � �� , l b b( )� � �1 2 3 , l k n a( ) ( )� � 10 , for S k Sb n b n( ) ( )( , ) ( , )1 1 0 01 1� � � � , n � �, l k n b b( ) ( )� �� 0 10 11 , for S k Sb n b n � � �� �1 1 10 0 1 1( ) ( )( , ) ( , ) , n � �. � If 2b+1 < a, then l l b a( ) ( )1 1 1� � � � �� , l k n a( ) ( )� � 10 , for S k Sb n b n( ) ( )( , ) ( , )1 1 1 11� � � , n � �. b) For b � 1 and 2 3� �a , then it holds for l(k), k � �, l a( )1 1� � , l( )2 5� , l k n a( ) ( )� � 10 , for S k Sn n1 0 11 1 0 0( ) ( )( , ) ( , )� � � , n � �, l k n( ) ( )� � 00101 for S k Sn n0 1 1 10 0 1 1( ) ( )( , ) ( , )� �� � , n � �. c) For b � 1 and a � 3, we get l l a( ) ( )1 2 1� � � , l k n a( ) ( )� � 10 , for S k Sn n1 1 11 1 1 1( ) ( )( , ) ( , )� � � , n � �. Having calculated the formula for l(k), k � �, we have also computed the recurrence function, since R k l k k( ) ( )� � � 1. References [1] Vuillon, L.: A Characterization of Sturmian Words by Return Words. Eur. J. Comb., Vol. 22 (2001), No. 2, p. 263–275. [2] Justin, J., Vuillon, L.: Return Words in Sturmian and Episturmian Words. Theor. Inform. Appl., Vol. 34 (2000), p. 343–356. [3] Vuillon, L.: On the Number of Return Words in Infinite Words with Complexity. LIAFA Research Report 2000/15. [4] Balková, L’., Pelantová, E., Steiner, W.: Return Words in the Thue-Morse and Other Sequences. arXiv:math/0608 603 Sci., 2006. [5] Cassaigne, J.: Limit Values of the Recurrence Quotient of Sturmian Sequences. Theor. Comput. Sci., Vol. 218 (1999), No. 1, p. 3–12. [6] Cassaigne, J.: Recurrence in Infinite Words. Lecture Notes in Computer Science, Springer Verlag, Vol. 2010, 2001, p. 1–11. [7] Balková, L’.: Complexity for Infinite Words Associated with Quadratic Non-Simple Parry Numbers. Journal of Geometry and Symmetry in Physics (WGMP 2005 Proceed- ings) Vol. 7 (2006), p. 1–11. [8] Balková, L’., Masáková, Z.: Palindromic Complexity of Infinite Words Associated with Non-Simple Parry Num- bers. submitted to RAIRO Theor. Inform. Appl., 2006, 16 p. [9] Balková, L’., Pelantová, E., Turek, O.: Combinatorial and Arithmetical Properties of Infinite Words Associated with Non-Simple Quadratic Parry Numbers. To appear in RAIRO Theor. Inform. Appl., 2006, 17 p. [10] Queffélec, M.: Substitution Dynamical Systems – Spec- tral Analysis. Lecture Notes in Math., Springer Berlin Vol. 1294, 1987. Ing. L’ubomíra Balková phone: +420 224 358 560 e-mail: l.balkova@centrum.cz Doppler Institute for Mathematical Physics and Applied Mathematics & Department of Mathematics Czech Technical University in Prague Faculty of Nuclear Sciences and Physical Engineering Trojanova 13 120 00 Praha 2, Czech Republic © Czech Technical University Publishing House http://ctn.cvut.cz/ap/ 19 Acta Polytechnica Vol. 47 No. 2–3/2007 k 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 l(k) 3 5 8 8 13 13 13 21 21 21 21 21 34 34 34 34 34 34 34 34 R(k) 3 6 10 11 17 18 19 28 29 30 31 32 46 47 48 49 50 51 52 53 The table of the first 20 values of l(k) and R(k) for the simplest case a � 2 and b � 1