1 Introduction

Two words are called abelian equivalent if each letter occurs equally often in both words; e.g., rotor and torro are abelian equivalent, whereas banana and ananas are not. Abelian equivalence has been studied together with various generalisations and specialisations, such as abelian complexity, k-abelian equivalence, and avoidability of (k-)abelian powers (cf., e.g., [6, 10, 11, 13, 17, 22, 23, 24]). The number of occurrences of each letter is captured by the Parikh vector (also known as Parikh image or Parikh mapping) [21]: given a lexicographical order on the alphabet, the \(i^\mathrm{th}\) component of this vector is the number of occurrences of the \(i^\mathrm{th}\) letter of the alphabet in a given word. Parikh vectors have been studied in [12, 16, 19] and have been generalised to Parikh matrices, which store more information about a word than just the letter counts (cf., e.g., [20, 25]).

A recent generalisation of abelian equivalence, for words over the binary alphabet \(\{\mathsf {0},\mathsf {1}\}\), is prefix normal equivalence (pn-equivalence) [14]. Two binary words are pn-equivalent if their maximal numbers of \(\mathsf {1}\)s in any factor of length n are equal for all \(n\in \mathbb {N}\). Burcsi et al. [5] showed that this relation is indeed an equivalence relation and, moreover, that each class contains exactly one representative, called a prefix normal word. A word w is said to be prefix normal if the prefix of w of any length has at least as many \(\mathsf {1}\)s as any of w’s factors of the same length. For instance, the word is prefix normal but is not, witnessed by the fact that is a factor but not a prefix. Both words are pn-equivalent. In addition to being representatives of the pn-equivalence classes, prefix normal words are also of interest since they are connected to Lyndon words, in the sense that every prefix normal word is a pre-necklace [14]. Furthermore, as shown in [14], the indexed jumbled pattern matching problem (see e.g. [2, 4, 18]) is connected to prefix normal forms: if the prefix normal forms are given, the indexed jumbled pattern matching problem can be solved in time \(\mathcal {O}(n)\), i.e. linear in the word length n. The best known algorithm for this problem has a run-time of \(\mathcal {O}(n^{1.864})\) (see [7]). Consequently there is also an interest in prefix normal forms from an algorithmic point of view. An algorithm for the computation of all prefix normal words of length n in run-time \(\mathcal {O}(n)\) per word is given in [8]. Balister and Gerke [1] showed that the number of prefix normal words of length n is \(2^{n-\varTheta (\log ^2(n))}\) and the class of a given prefix normal word contains at most \(2^{n-O(\sqrt{n\log (n)})}\) elements. A closed formula for the number of prefix normal words is still unknown. 
In “OEIS” [15] the number of prefix normal words of length n (A194850), a list of binary prefix normal words (A238109), and the maximum size of a class of binary words of length n having the same prefix normal form (A238110), can be found. An extension to infinite words is presented in [9].

Our Contribution. In this work we investigate two peculiarities mentioned in [3, 14]: palindromes and extension-critical words. Generalising the result of [3] we prove that prefix normal palindromes (pnPal) play a special role since they are not pn-equivalent to any other word. Since not all palindromes are prefix normal, as witnessed by , determining the number of pnPals is an (unsolved) sub-problem. We show that solving this sub-problem brings us closer to determining the index, i.e. the number of equivalence classes w.r.t. a given word length, of the pn-equivalence relation. Moreover we give a characterisation of pnPals based on the maximum-ones function. The notion of extension-critical words is based on an iterative approach: compute the prefix normal words of length \(n+1\) based on the prefix normal words of length n. A prefix normal word w is called extension-critical if \(w\mathsf {1}\) is not prefix normal. For instance, the word is prefix normal but is not and thus is called extension-critical. This means that all non-extension-critical words contribute to the class of prefix normal words of the next word-length. We investigate the set of extension-critical words by introducing an equivalence relation collapse, grouping all extension-critical words that are pn-equivalent w.r.t. length \(n+1\). Finally we prove that (prefix normal) palindromes and the collapsing relation (extension-critical words) are related. In contrast to [14] we work with suffix-normal words (least representatives) instead of prefix-normal words. It follows from Lemma 1 that both notions lead to the same results.

Structure of the Paper. In Sect. 2, the basic definitions and notions are presented. In Sect. 3, we present the results on pnPals. Finally, in Sect. 4, the iterative approach based on collapsing words is shown. This includes a lower bound and an upper bound for the number of prefix normal words, based on pnPals and the collapsing relation. Due to space restrictions all proofs are in the appendix.

2 Preliminaries

Let \(\mathbb {N}\) denote the set of natural numbers starting with 1, and let \(\mathbb {N}_0=\mathbb {N}\cup \{0\}\). Define \([n]=\{1,\dots ,n\}\), for \(n\in \mathbb {N}\), and set \([n]_0=[n]\cup \{0\}\).

An alphabet is a finite set \(\varSigma \), the set of all finite words over \(\varSigma \) is denoted by \(\varSigma ^{*}\), and the empty word by \(\varepsilon \). Let \(\varSigma ^+=\varSigma ^{*}\backslash \{\varepsilon \}\) be the free semigroup for the free monoid \(\varSigma ^{*}\). Let w[i] denote the \(i^\mathrm{th}\) letter of \(w\in \varSigma ^{*}\) that is \(w=\varepsilon \) or \(w=w[1]\dots w[n]\). The length of a word \(w=w[1]\dots w[n]\) is denoted by |w| and let \(|\varepsilon |=0\). Set \(w[i..j]=w[i]\dots w[j]\) for \(1\le i\le j\le |w|\). Set \(\varSigma ^n=\{w\in \varSigma ^{*}|\,|w|=n\}\) for all \(n\in \mathbb {N}_0\). The number of occurrences of a letter \(\mathsf {x}\in \varSigma \) in \(w\in \varSigma ^{*}\) is denoted by \(|w|_{\mathsf {x}}\). For a given word \(w\in \varSigma ^n\) the reversal of w is defined by \(w^R=w[n]\dots w[1]\). A word \(u\in \varSigma ^{*}\) is a factor of \(w\in \varSigma ^{*}\) if \(w=xuy\) holds for some words \(x,y\in \varSigma ^{*}\). If \(x=\varepsilon \) then u is called a prefix of w and a suffix if \(y=\varepsilon \). Let \({{\,\mathrm{Fact}\,}}(w), {{\,\mathrm{Pref}\,}}(w),{{\,\mathrm{Suff}\,}}(w)\) denote the sets of all factors, prefixes, and suffixes respectively. Define \({{\,\mathrm{Fact}\,}}_k(w)={{\,\mathrm{Fact}\,}}(w)\cap \varSigma ^k\) and \({{\,\mathrm{Pref}\,}}_k(w),{{\,\mathrm{Suff}\,}}_k(w)\) are defined accordingly. Notice that \(|{{\,\mathrm{Pref}\,}}_k(w)|=|{{\,\mathrm{Suff}\,}}_k(w)|=1\) for all \(k\le |w|\). The powers of \(w\in \varSigma ^{*}\) are recursively defined by \(w^0=\varepsilon \), \(w^n=ww^{n-1}\) for \(n\in \mathbb {N}\).

Following [14], we only consider binary alphabets, namely \(\varSigma =\{\mathsf {0},\mathsf {1}\}\) with the fixed lexicographic order induced by \(\mathsf {0}< \mathsf {1}\) on \(\varSigma \). In analogy to binary numbers we call a word \(w\in \varSigma ^n\) odd if \(w[n]=\mathsf {1}\) and even otherwise.

For a function \(f:[n]\rightarrow \varDelta \) for \(n\in \mathbb {N}_0\) and an arbitrary alphabet \(\varDelta \) the concatenation of the images defines a finite word \({{\,\mathrm{\mathsf {serialise}}\,}}(f)=f(1)f(2)\dots f(n)\in \varDelta ^{*}\). Since \({{\,\mathrm{\mathsf {serialise}}\,}}\) is bijective, we will identify \({{\,\mathrm{\mathsf {serialise}}\,}}(f)\) with f and use in both cases f (as long as it is clear from the context). This definition allows us to access f’s reversed function \(g:[n]\rightarrow \varDelta ;k\mapsto f(n-k+1)\) easily by \(f^R\).

Definition 1

The maximum-ones function is defined for a word \(w\in \varSigma ^{*}\) by \(f_w:[|w|]_0 \rightarrow [|w|]_0;\,k\mapsto \max \left\{ \,|{v}|_{\mathsf {1}} \mid v\in {{\,\mathrm{Fact}\,}}_k(w)\right\} ,\) giving for each \(k\in [|w|]_0\) the maximal number of \(\mathsf {1}\)s occurring in a factor of length k. Likewise the prefix-ones and suffix-ones functions are defined by \(p_w:[|w|]_0 \rightarrow [|w|]_0; k\mapsto |{{\,\mathrm{Pref}\,}}_k(w)|_{\mathsf {1}}\) and \(s_w:[|w|]_0 \rightarrow [|w|]_0; k\mapsto |{{\,\mathrm{Suff}\,}}_k(w)|_{\mathsf {1}}\).
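
As an illustration of Definition 1, the three functions can be computed with running sums of \(\mathsf {1}\)s. The following Python sketch is not part of the original development; the function names and the representation of words as strings over "0" and "1" are our own assumptions. Each function returns the list of values for \(k=0,\dots ,|w|\).

```python
def running_ones(w):
    """Number of 1s in each prefix of w, for prefix lengths 0..len(w)."""
    sums = [0]
    for letter in w:
        sums.append(sums[-1] + (letter == "1"))
    return sums


def max_ones(w):
    """f_w: maximal number of 1s in a factor of length k, for k = 0..len(w)."""
    n, pre = len(w), running_ones(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def prefix_ones(w):
    """p_w: number of 1s in the prefix of length k, for k = 0..len(w)."""
    return running_ones(w)


def suffix_ones(w):
    """s_w: number of 1s in the suffix of length k, for k = 0..len(w)."""
    return running_ones(w[::-1])
```

For the word 0110 this gives \(f_w=1222\) and \(p_w=s_w=0122\) (omitting the value at \(k=0\)), so 0110 is neither prefix nor suffix normal.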

Definition 2

Two words \(u,v\in \varSigma ^{n}\) are called prefix-normal equivalent (pn-equivalent, \(u\equiv _{n}v\)) if \(f_u=f_v\) holds and v’s equivalence class is denoted by \([v]_{\equiv }=\{u\in \varSigma ^{n}|\,u\equiv _{n}v\}\). A word \(w\in \varSigma ^{*}\) is called prefix (suffix) normal iff \(f_w=p_w\) (\(f_w=s_w\) resp.) holds. Let \(\sigma (w)=\sum _{i\in [n]}f_w(i)\) denote the maximal-one sum of a \(w\in \varSigma ^{n}\).
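
For small n the classes of \(\equiv _n\) can be enumerated by brute force, grouping all words of \(\varSigma ^n\) by their maximum-ones function. The sketch below is only meant to make Definition 2 concrete; it is exponential in n, and all names in it are illustrative assumptions rather than notation from the paper.

```python
from collections import defaultdict
from itertools import product


def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def is_prefix_normal(w):
    # f_w = p_w: every prefix carries the maximal number of 1s for its length.
    return all(w[:k].count("1") == f for k, f in enumerate(max_ones(w)))


def is_suffix_normal(w):
    # By Remark 1, w is suffix normal iff its reversal is prefix normal.
    return is_prefix_normal(w[::-1])


def pn_classes(n):
    """Partition of all binary words of length n into pn-equivalence classes."""
    classes = defaultdict(list)
    for letters in product("01", repeat=n):
        w = "".join(letters)
        classes[tuple(max_ones(w))].append(w)
    return list(classes.values())
```

With this sketch, `len(pn_classes(n))` is the index of \(\equiv _n\); for \(n=3,4,5\) it evaluates to 5, 8, and 14.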

Remark 1

Notice that \(s_w=p_{w^R}\), \(f_w=f_{w^R}\), and \(p_w(i),s_w(i)\le f_w(i)\) for all \(i\in \mathbb {N}_0\). From \(p_{w^R}=s_w\) and \(f_w=f_{w^R}\) it follows immediately that a word \(w\in \varSigma ^{*}\) is prefix normal iff its reversal is suffix normal.

Fici and Lipták [14] showed that for each word \(w\in \varSigma ^{*}\) there exists exactly one \(w'\in [w]_{\equiv }\) that is prefix normal - the prefix normal form of w. We introduce the concept of least representative, which is the lexicographically smallest element of a class and thus also unique. As mentioned in [5] palindromes play a special role. Immediately by \(w=w^R\) for \(w\in \varSigma ^{*}\), we have \(p_w=s_w\), i.e. palindromes are the only words that can be prefix and suffix normal. Recall that not all palindromes are prefix normal witnessed by .

Definition 3

A palindrome is called prefix normal palindrome (pnPal) if it is prefix normal. Let \({{\,\mathrm{NPal}\,}}(n)\) denote the set of all prefix normal palindromes of length \(n\in \mathbb {N}\) and set \({{\,\mathrm{npal}\,}}(n)=|{{\,\mathrm{NPal}\,}}(n)|\). Let \({{\,\mathrm{Pal}\,}}(n)\) be the set of all palindromes of length \(n\in \mathbb {N}\).

Table 1. Prefix normal palindromes (pnPals).

3 Properties of the Least-Representatives

Before we present specific properties of the least representatives (LR) for a given word length, we mention some useful properties of the maximum-ones, prefix-ones, and suffix-ones functions (for the basic properties we refer to [5, 14] and the references therein). Since we are investigating only words of a specific length, we fix \(n\in \mathbb {N}_0\). Beyond the relation \(p_w=s_{w^R}\) the mappings \(p_w\) and \(s_w\) are determinable from each other. Counting the \(\mathsf {1}\)s in a suffix of length i and adding the \(\mathsf {1}\)s in the corresponding prefix of length \((n-i)\) of a word w, gives the overall amount of \(\mathsf {1}\)s of w, namely

$$\begin{aligned} p_w(n)=p_w(n-i)+s_w(i)\quad \text {and}\quad s_w(n)=p_w(i)+s_w(n-i). \end{aligned}$$

For suffix (resp. prefix) normal words this leads to \(p_w(i)=f_w(n)-f_w(n-i)\) resp. \(s_w(i)=f_w(n)-f_w(n-i)\), witnessing the fact \(p_w=s_w\) for palindromes (since both equations hold). Before we show that pnPals indeed form a singleton class w.r.t. \(\equiv _n\), we need the relation between the lexicographical order and prefix and suffix normality.

Lemma 1

The prefix normal form of a class is the lexicographically largest element in the class and the suffix normal word of a class is its LR.

Lemma 1 implies that a word being prefix and suffix normal forms a singleton class w.r.t. \(\equiv _n\). As mentioned \(p_w=s_w\) only holds for palindromes.

Proposition 1

For a word \(w\in \varSigma ^n\) it holds that \(|[w]_{\equiv }|=1\) iff \(w\in {{\,\mathrm{NPal}\,}}(n)\).
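
Proposition 1 can be checked exhaustively for short words. The following sketch (our own illustrative code, exponential in n) lists the pnPals of a given length and compares them with the singleton classes of \(\equiv _n\).

```python
from collections import defaultdict
from itertools import product


def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def is_pnpal(w):
    """w is a palindrome and prefix normal (f_w = p_w)."""
    prefix_normal = all(w[:k].count("1") == f
                        for k, f in enumerate(max_ones(w)))
    return w == w[::-1] and prefix_normal


def pnpals(n):
    """All prefix normal palindromes of length n in lexicographic order."""
    words = ("".join(letters) for letters in product("01", repeat=n))
    return [w for w in words if is_pnpal(w)]


def singleton_classes(n):
    """Words of length n that are pn-equivalent to no other word."""
    classes = defaultdict(list)
    for letters in product("01", repeat=n):
        w = "".join(letters)
        classes[tuple(max_ones(w))].append(w)
    return sorted(cls[0] for cls in classes.values() if len(cls) == 1)
```

For instance, `pnpals(5)` and `singleton_classes(5)` both yield 00000, 10001, 10101, 11011, 11111.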

The general part of this section is concluded by a somewhat artificial equation which is nevertheless useful for pnPals: by \(s_w(i)=p_w^R(i)-p_w^R(i+1)+s_w(i-1)\) with \(p_w^R(n+1)=0\) for \(i\in [n]\) and \(s_w=p_{w^R}\) we get

$$\begin{aligned} p_{w^R}(i)=p_w^R(i)-p_w^R(i+1)+p_{w^R}(i-1). \end{aligned}$$

The rest of the section will cover properties of the LRs of a class.

Remark 2

For completeness, we mention that \(\mathsf {0}^n\) is the only even LR w.r.t. \(\equiv _{n}\) and the only pnPal starting with \(\mathsf {0}\). Moreover, \(\mathsf {1}^n\) is the largest LR. As we show later in the paper \(\mathsf {0}^n\) and \(\mathsf {1}^n\) are of minor interest in the recursive process due to their speciality.

The following lemma is an extension of [5, Lemma 1] for the suffix-one function by relating the prefix and the suffix of the word \(s_w\) for a least representative. Intuitively the suffix normality implies that the \(\mathsf {1}\)s are more at the end of the word w rather than at the beginning: consider for instance \(s_w=1123345\) for \(w\in \varSigma ^7\). The associated word w cannot be suffix normal since the suffix of length two has only one \(\mathsf {1}\) (\(s_w(2)=1\)) but by \(s_w(5)=3,s_w(6)=4\), and \(s_w(7)=5\) we get that within two letters two \(\mathsf {1}\)s are present and consequently \(f_w(2)\ge 2\). Thus, a word w is only least representative if the amount of \(\mathsf {1}\)s at the end of \(s_w\) does not exceed the amount of \(\mathsf {1}\)s at the beginning of \(s_w\).

Lemma 2

Let \(w\in \varSigma ^n\) be a LR. Then we have

$$\begin{aligned} s_{w}(i)\ge {\left\{ \begin{array}{ll} s_{w}(n)-s_{w}(n-i+1) &{} \text {if }s_{w}(n-i+1)=s_{w}(n-i),\\ s_{w}(n)-s_{w}(n-i+1)+1 &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$

The remaining part of this section presents results for prefix normal palindromes. Notice that for \(w\in {{\,\mathrm{NPal}\,}}(n)\) with \(w=\mathsf {x}v\mathsf {x}\) with \(\mathsf {x}\in \varSigma ,v\) is not necessarily a pnPal; consider for instance with . The following lemma shows a result for prefix normal palindromes which is folklore for palindromes substituting \(f_w\) by \(p_w\) or \(s_w\).

Lemma 3

For \(w\in {{\,\mathrm{NPal}\,}}(n)\backslash \{\mathsf {0}^n\},v\in {{\,\mathrm{Pal}\,}}(n-2)\) with \(w=\mathsf {1}v\mathsf {1}\) we have

$$\begin{aligned} f_w(k)= {\left\{ \begin{array}{ll} 1 &{} \text {if }k = 1,\\ f_v(k-1) + 1 &{} \text {if }1 < k \le |w|-1,\\ f_w(|v|+1) + 1 &{} \text {if }k = |w|. \end{array}\right. } \end{aligned}$$

In the following we give a characterisation of when a palindrome w is prefix normal depending on its maximum-ones function \(f_w\) and a derived function \(\overline{f_w}\). In particular we observe that \(f_w = \overline{f_w}^R\) if and only if w is a prefix normal palindrome. Intuitively \(\overline{f_w}\) captures the progress of \(f_w\) in reverse order. This is an intriguing result because it shows that properties regarding prefix and suffix normality can be observed when \(f_w,s_w,p_w\) are considered in their serialised representation.

Definition 4

For \(w\in \varSigma ^{n}\) define \(\overline{f}_w:[n]\rightarrow [n]\) by \(\overline{f}_w(k)=\overline{f}_w(k-1)-(f_w(k-1)-f_w(k-2))\) with the extension \(f_w(-1)=f_w(0)=0\) of f and \(\overline{f}_w(0) = f_w(n)\). Define \(\overline{p}_w\) and \(\overline{s}_w\) analogously.

Example 1

Consider the pnPal \(w=\mathsf {11011}\) with \(f_w=12234\). Then \(\overline{f}_w\) is 43221 and we have \(f_w=\overline{f}_w^R\). On the other hand for \(v=\mathsf {101101}\) we have \(p_v=112334\) and \(f_v = 122334\) and \(\overline{f}_v=432211\) and thus \(\overline{f}_v^R\ne f_v\).

The following lemma shows a connection between the reversed prefix-ones function and the suffix-ones function that holds for all palindromes.

Lemma 4

For \(w \in \mathrm {Pal}(n)\) we have \(s_w \equiv \overline{p}_w^R\).

By Lemma 4 we get \(p_w\equiv \overline{p}_w^R\) since \(p_w\equiv s_w\) for a palindrome w. As announced earlier, our main theorem of this part (Theorem 1) gives a characterisation of pnPals. The theorem allows us to decide whether a word is a pnPal by only looking at the maximum-ones function; thus a comparison of all factors is not required.

Theorem 1

Let \(w\in {{\,\mathrm{Pal}\,}}(n)\). Then w is a pnPal if and only if \(f_w = \overline{f}_w^R\).
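
The criterion of Theorem 1 is easy to evaluate once \(f_w\) is known. The sketch below implements the recurrence of Definition 4 literally; the function names are our own, and the two words used in the usage note are recovered from the function values quoted in Example 1 (\(p_w=f_w=12234\) and \(p_v=112334\)), which is an assumption on our side.

```python
def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def f_bar(w):
    """Serialised derived function of Definition 4, for k = 1..len(w)."""
    f = max_ones(w)

    def f_ext(j):
        # Extension f_w(-1) = f_w(0) = 0 from Definition 4.
        return f[j] if j >= 0 else 0

    values = [f[len(w)]]  # value at k = 0 is f_w(n)
    for k in range(1, len(w) + 1):
        values.append(values[-1] - (f_ext(k - 1) - f_ext(k - 2)))
    return values[1:]


def satisfies_theorem_1(w):
    """Check f_w = reversed f_bar_w on the serialised values for k = 1..n."""
    return max_ones(w)[1:] == f_bar(w)[::-1]
```

For 11011 the call `f_bar("11011")` returns 4, 3, 2, 2, 1 and the criterion holds, whereas for 101101 it returns 4, 3, 2, 2, 1, 1 and the criterion fails, in line with Example 1.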

Table 2. Number of pnPals. [15] (A308465)

Table 2 presents the number of pnPals up to length 30. These results support the conjecture in [5] that words of even and odd length behave differently.

4 Recursive Construction of Prefix Normal Classes

In this section we investigate how to generate LRs of length \(n+1\) using the LRs of length n. This is similar to the work of Fici and Lipták [14] except they investigated appending a letter to prefix normal words while we explore the behaviour on prepending letters to LRs. Consider the words and , both being (different) LRs of length 4. Prepending a \(\mathsf {1}\) to them leads to and which are pn-equivalent. We say that v and w collapse and denote it by \(v\leftrightarrow w\). Hence for determining the index of \(\equiv _n\) based on the least representatives of length \(n-1\), only the least representative of one class matters.

Definition 5

Two words \(w,v\in \varSigma ^n\) collapse if \(\mathsf {1}w\equiv _{n+1}\mathsf {1}v\) holds. This is denoted by \(w\leftrightarrow v\).
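
Definition 5 translates directly into a comparison of maximum-ones functions. The following sketch (illustrative names, brute force over all words of length n) tests whether two words collapse and groups the LRs of a given length into collapse classes; LRs are recognised as suffix normal words, following Lemma 1.

```python
from collections import defaultdict
from itertools import product


def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def is_suffix_normal(w):
    # f_w = s_w: every suffix carries the maximal number of 1s for its length.
    f = max_ones(w)
    return all(w[len(w) - k:].count("1") == f[k] for k in range(len(w) + 1))


def collapse(w, v):
    """True iff 1w and 1v are pn-equivalent (Definition 5)."""
    return len(w) == len(v) and max_ones("1" + w) == max_ones("1" + v)


def collapse_classes(n):
    """Group the LRs of length n by the maximum-ones function of 1w."""
    classes = defaultdict(list)
    for letters in product("01", repeat=n):
        w = "".join(letters)
        if is_suffix_normal(w):
            classes[tuple(max_ones("1" + w))].append(w)
    return list(classes.values())
```

For length 4, the LRs 0011 and 1001 collapse, since 10011 and 11001 share the maximum-ones function 12223, while 0101 does not collapse with either of them.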

Prepending a \(\mathsf {1}\) to a non-LR will never lead to a LR. Therefore it is sufficient to only look at LRs. Since collapsing is an equivalence relation, denote the equivalence class w.r.t. \(\leftrightarrow \) of a word \(w\in \varSigma ^{*}\) by \([w]_{\leftrightarrow }\). Next, we present some general results regarding the connections between the LRs of lengths n and \(n+1\). As mentioned in Remark 2, \(\mathsf {0}^n\) and \(\mathsf {1}^n\) are LRs for all \(n\in \mathbb {N}\). This implies that they do not have to be considered in the recursive process.

Remark 3

By [14] a word \(w\mathsf {0}\in \varSigma ^{n+1}\) is prefix-normal if w is prefix-normal. Consequently we know that if a word \(w\in \varSigma ^n\) is suffix normal, \(\mathsf {0}w\) is suffix normal as well. In addition to the naïve upper bound of \(2^{n}+1\), this yields the naïve lower bound \(|\varSigma ^n/\equiv _n|\) for \(|\varSigma ^{n+1}/\equiv _{n+1}|\).

Remark 4

The maximum-ones functions for \(w\in \varSigma ^{*}\) and \(\mathsf {0}w\) are equal on all \(i\in [|w|]\) and \(f_{\mathsf {0}w}(|w|+1)=f_w(|w|)\) since the factor determining the maximal number of \(\mathsf {1}\)’s is independent of the leading \(\mathsf {0}\). Prepending \(\mathsf {1}\) to a word w may result in a difference between \(f_w\) and \(f_{\mathsf {1}w}\), but notice that since only one \(\mathsf {1}\) is prepended, we always have \(f_{\mathsf {1}w}(i)\in \{f_{w}(i),f_{w}(i)+1\}\) for all \(i\in [n]\). In both cases we have \(s_w(i)=s_{\mathsf {x}w}(i)\) for \(\mathsf {x}\in \{\mathsf {0},\mathsf {1}\}\) and \(i\in [|w|]\) and \(s_{\mathsf {0}w}(n+1)=s_w(n)\) as well as \(s_{\mathsf {1}w}(n+1)=s_w(n)+1\).

Firstly we improve the naïve upper bound to \(2|\varSigma ^{n}/\equiv _n|\) by proving that only LRs in \(\varSigma ^n\) can become LRs in \(\varSigma ^{n+1}\) by prepending \(\mathsf {1}\) or \(\mathsf {0}\).

Proposition 2

Let \(w\in \varSigma ^n\) not be a LR. Then neither \(\mathsf {0}w\) nor \(\mathsf {1}w\) is a LR in \(\varSigma ^{n+1}\).

By Proposition 1 prefix (and thus suffix) normal palindromes form a singleton class. This implies immediately that a word \(w\in \varSigma ^n\) such that \(\mathsf {1}w\) is a prefix normal palindrome, does not collapse with any other \(v\in \varSigma ^n\backslash \{w\}\). The next lemma shows that even prepending once a \(\mathsf {1}\) and once a \(\mathsf {0}\) to different words leads only to equivalent words in one case.

Lemma 5

Let \(w,v\in \varSigma ^n\) be different LRs. Then \(\mathsf {0}w\equiv _{n+1}\mathsf {1}v\) if and only if \(v=\mathsf {0}^n\) and \(w=\mathsf {0}^{n-1}\mathsf {1}\).

By Lemma 5 and Remark 3 it suffices to investigate the collapsing relation on prepending \(\mathsf {1}\)s. The following proposition characterises the LR \(\mathsf {1}w\) among the elements \(\mathsf {1}v\in [\mathsf {1}w]_\equiv \) for all LRs \(v\in \varSigma ^{n}\) with \(w\leftrightarrow v\) for \(w\in \varSigma ^n\).

Proposition 3

Let \(w\in \varSigma ^n\) be a LR. Then \(\mathsf {1}w\in \varSigma ^{n+1}\) is a LR if and only if \(f_{1w}(i)=f_w(i)\) holds for \(i\in [n]\) and \(f_{1w}(n+1)=f_w(n)+1\).
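
The condition of Proposition 3 can be tested without inspecting the class of \(\mathsf {1}w\): it suffices to compare \(f_w\) and \(f_{\mathsf {1}w}\). The sketch below uses our own function names and assumes that its argument is already a LR.

```python
def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def extends_to_lr(w):
    """Proposition 3 for a LR w: is 1w again a LR?"""
    n = len(w)
    f, g = max_ones(w), max_ones("1" + w)
    # f_{1w}(i) = f_w(i) for i in [n] and f_{1w}(n+1) = f_w(n) + 1.
    return g[:n + 1] == f and g[n + 1] == f[n] + 1
```

For the LRs of length 4, `extends_to_lr("0011")` holds (10011 is suffix normal), whereas `extends_to_lr("1001")` fails because \(f_{\mathsf {1}w}(2)=2\ne 1=f_w(2)\).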

Corollary 1

Let \(w\in {{\,\mathrm{NPal}\,}}(n)\). Then \(f_{w\mathsf {1}}(i)=f_w(i)\) for \(i\in [n]\) and \(f_{w\mathsf {1}}(n+1)=f_w(n)+1\). Moreover \(s_{w\mathsf {1}}(i)=s_w(i)\) for \(i\in [n]\) and \(s_{w\mathsf {1}}(n+1)=s_w(n)+1\).

This characterization is unfortunately not convenient for determining either the number of LRs of length \(n+1\) from the ones of length n or the collapsing LRs of length n. For a given word w, the maximum-ones function \(f_w\) has to be determined, \(f_w\) has to be extended by \(f_w(n)+1\), and finally the associated word, under the assumption \(f_{\mathsf {1}w}\equiv s_{\mathsf {1}w}\), has to be checked for being suffix normal. For instance, given leads to \(f_w=11223\), and is extended to \(f_{\mathsf {1}w}=112234\). This would correspond to which is not suffix normal and thus w is not extendable to a new LR. The following two lemmata reduce the number of LRs that need to be checked for extensibility.

Lemma 6

Let \(w\in \varSigma ^n\) be a LR such that \(\mathsf {1}w\) is a LR as well. Then for all LRs \(v\in \varSigma ^n\backslash \{w\}\) collapsing with \(w,f_v(i)\le f_w(i)\) holds for all \(i\in [n]\), i.e. all other LRs have a smaller maximal-one sum.

Corollary 2

If \(w,v\in \varSigma ^n\) and \(\mathsf {1}w\in \varSigma ^{n+1}\) are LRs with \(w\leftrightarrow v\) and \(v\ne w\) then \(w\le v\).

Remark 5

By Corollary 2 the lexicographically smallest LR w among the collapsing words leads to the LR of \([\mathsf {1}w]\). Thus if w is a LR not collapsing with any lexicographically smaller word then \(\mathsf {1}w\) is a LR.

Before we present the theorem characterizing exactly the collapsing words for a given word w, we show a symmetry-property of the LRs which are not extendable to LRs, i.e. a property of words which collapse.

Lemma 7

Let \(w\in \varSigma ^n\) be a LR. Then \(f_{1w}(i)\ne f_w(i)\) for some \(i\in [n]\) iff \(f_{1w}(n-i+1)\ne f_w(n-i+1)\).

By [5, Lemma 10] a word \(w\mathsf {1}\) is prefix normal if and only if \(|{{\,\mathrm{Suff}\,}}_k(w)|_{\mathsf {1}}<|{{\,\mathrm{Pref}\,}}_{k+1}(w)|_{\mathsf {1}}\) for all \(k \in \mathbb {N}\). The following theorem extends this result for determining the collapsing words \(w'\) for a given word w.

Theorem 2

Let \(w\in \varSigma ^n\) be a LR and \(w'\in \varSigma ^n\backslash \{w\}\) with \(|w|_1=|w'|_1=s\in \mathbb {N}\). Let moreover \(v\not \leftrightarrow w\) for all \(v\in \varSigma ^{*}\) with \(v\le w\). Then \(w\leftrightarrow w'\) iff

  1. \(f_{w'}(i)\in \{f_w(i),f_w(i)-1\}\) for all \(i\in [n]\),

  2. \(f_{w'}(i)=f_w(i)\) implies \(f_{1w'}(i)=f_{w}(i)\),

  3. \(f_{w'}(i)\ge {\left\{ \begin{array}{ll} f_{w'}(n)-f_{w'}(n-i+1) &{} \text {if }f_{w'}(n-i+1)=f_{w'}(n-i),\\ f_{w'}(n)-f_{w'}(n-i+1)+1 &{} \text {otherwise}. \end{array}\right. }\)

Theorem 2 allows us to construct the equivalence classes w.r.t. the least representatives of the previous length, but more tests than necessary have to be performed. Consider, for instance which is a smallest LR of length 17 not collapsing with any lexicographically smaller LR. For w we have \(f_w=1\cdot 2\cdot 3\cdot 4\cdot 5\cdot 5\cdot 6\cdot 7\cdot 8\cdot 8\cdot 8\cdot 9\cdot 10\cdot 10\cdot 11\cdot 12\cdot 13\) where the dots just act as separators between letters. Thus we know for any \(w'\) collapsing with w, that \(f_{w'}(1)=1\) and \(f_{w'}(17)=13\). The constraints \(f_{w'}(2)\in \{f_{w'}(1),f_{w'}(1)+1\}\) and \(f_{w'}(2)\le f_{w}(2)\) imply \(f_{w'}(2)\in \{1,2\}\). First the check that \(f_{w'}(10)=4\) is impossible excludes \(f_{w'}(2)=1\). Since no collapsing word can have a factor of length 2 with only one \(\mathsf {1}\), a band in which the possible values range can be defined by the unique greatest collapsing word \(w'\). It is not surprising that this word is connected with the prefix normal form. The following two lemmata define the band in which the maximum-ones functions of the possible collapsing words lie.

Lemma 8

Let \(w\in \varSigma ^n\backslash \{\mathsf {0}^n\}\) be a LR with \(v\not \leftrightarrow w\) for all \(v\in \varSigma ^n\) with \(v\le w\). Set \(u:=(\mathsf {1}w[1..n-1])^R\). Then \(w\leftrightarrow u \) and for all LRs \(v\in \varSigma ^n\backslash \{u\}\) with \(v\leftrightarrow w\) and all \(i\in [n]\) \(f_{v}(i)\ge f_{u}(i)\), thus \(\sigma (u) = \sum _{i\in [n]}f_u(i) \le \sum _{i\in [n]}f_v(i) = \sigma (v)\).

Notice that \(w'=(\mathsf {1}w[1..n-1])^R\) is not necessarily a LR in \(\varSigma ^n/\equiv _n\) witnessed by the word of the last example. For w we get with \(f_{u}(8)=f_w(8)\) and \(f_{u}(10)=7\ne 8=f_w(10)\) violating the symmetry property given in Lemma 7. The following lemma alters \(w'\) into a LR which represents still the lower limit of the band.

Lemma 9

Let \(w\in \varSigma ^n\) be a LR such that \(\mathsf {1}w\) is also a LR. Let \(w'\in \varSigma ^n\) with \(w\leftrightarrow w'\), and I the set of all \(i\in [\lfloor \frac{n}{2}\rfloor ]\) with

$$\begin{aligned} (f_{w'}(i)=f_w(i)&\wedge f_{w'}(n-i+1)\ne f_w(n-i+1))\text { or } \\ (f_{w'}(i)\ne f_w(i)&\wedge f_{w'}(n-i+1)= f_w(n-i+1)) \end{aligned}$$

and \(f_w(j)=f_{w'}(j)\) for all \(j\in [n]\backslash I\). Then \(\hat{w}\) defined such that \(f_{\hat{w}}(j)=f_{w'}(j)\) for all \(j\in [n]\backslash I\) and \(f_{\hat{w}}(n-i+1)=f_{w'}(n-i+1)+1\) (\(f_{\hat{w}}(i)=f_{w'}(i)+1\) resp.) for all \(i\in I\) holds, collapses with w.

Remark 6

Lemma 9 applied to \((\mathsf {1}w[1..n-1])^R\) gives the lower limit of the band. Let \(\hat{w}\) denote the output of this application for a given \(w\in \varSigma ^n\) according to Lemma 9.

Continuing with the example, we firstly determine \(\hat{w}\) for . We get with \(u=w[n-1..1]\mathsf {1}\). Since for all collapsing \(w'\in \varSigma ^n\) we have \(f_{\hat{w}}(i)\le f_{w'}(i)\le f_w(i)\), \(w'\) is determined for \(i\in [17]\backslash \{5,9,13\}\). Since the value for 5 determines the one for 13 there are only two possibilities, namely \(f_{w'}(5)=5\) and \(f_{w'}(9)=7\), or \(f_{w'}(5)=4\) and \(f_{w'}(9)=8\). Notice that the words \(w'\) corresponding to the generated words \(f_{w'}\) are not necessarily LRs of the shorter length as witnessed by the one with \(f_{w'}(5)=5\) and \(f_{w'}(9)=7\). In this example this leads to at most three words being not only in the class but also in the list of former representatives. Thus we are able to produce an upper bound for the cardinality of the class. Notice that in any case we only have to test the first half of \(w'\)’s positions by Lemma 7. This leads to the following definition.

Table 3. f for .

Definition 6

Let \(h_d:\varSigma ^{*}\times \varSigma ^{*}\rightarrow \mathbb {N}_0\) be the Hamming-distance. The palindromic distance \(p_d:\varSigma ^{*}\rightarrow \mathbb {N}_0\) is defined by \(p_d(w)=h_d(w[1..\lfloor \frac{|w|}{2}\rfloor ],(w[\lceil \frac{|w|}{2}\rceil +1..|w|] )^R )\). Define the palindromic prefix length \(p_{\ell }:\varSigma ^{*}\rightarrow \mathbb {N}_0\) by \(p_{\ell }(w)=\max \left\{ \,k\in [|w|]\,|\,\exists u\in {{\,\mathrm{Pref}\,}}_k(w):\,p_d(u)=0\,\right\} \).
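
Both quantities of Definition 6 are straightforward to compute. In the following sketch the function names are our own, words are 0-indexed Python strings (so \(w[\lceil |w|/2\rceil +1..|w|]\) becomes the slice starting at index \(\lceil |w|/2\rceil \)), and the value 0 for the empty word is an assumption, since the maximum in the definition is taken over the empty set in that case.

```python
def palindromic_distance(w):
    """p_d(w): mismatches between the first half and the reversed second half."""
    n = len(w)
    first_half = w[: n // 2]
    second_half_reversed = w[(n + 1) // 2:][::-1]
    return sum(a != b for a, b in zip(first_half, second_half_reversed))


def palindromic_prefix_length(w):
    """p_l(w): length of the longest palindromic prefix of w (0 if w is empty)."""
    lengths = (k for k in range(1, len(w) + 1)
               if palindromic_distance(w[:k]) == 0)
    return max(lengths, default=0)
```

For example, `palindromic_distance("1100")` is 2, and `palindromic_prefix_length("10110")` is 3 because 101 is the longest palindromic prefix.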

The palindromic distance gives the minimal number of positions in which a bit has to be flipped for obtaining a palindrome. Thus, \(p_d(w)=0\) for all palindromes w, and, for instance, since the first half of w and the reverse of the second half mismatch in two positions. The palindromic prefix length determines the length of w’s longest prefix being a palindrome. For instance and . Since a LR w determines the upper limit of the band and \(w[n-1..1]\mathsf {1}\) the lower limit, the palindromic distance of \(ww[n-1..1]\mathsf {1}\) is in relation to the positions of \(f_w\) in which collapsing words may differ from w.

Theorem 3

If \(w\in \varSigma ^n\) and \(\mathsf {1}w\) are both LRs then \(|[w]_{\leftrightarrow }|\le 2^{\lceil \frac{p_d(ww[n-1..1]\mathsf {1})}{2}\rceil }\).

For an algorithmic approach to determine the LRs of length n, we want to point out that the search for collapsing words can also be reduced using the palindromic prefix length. Let \(w_1,\dots , w_m\) be the LRs of length \(n-1\). For each \(w_i\) we keep track of \(|w_i|-p_{\ell }(w_i)\). For each \(w_i\) we check firstly if \(|w_i|-p_{\ell }(w_i)=1\) since in this case the prepended \(\mathsf {1}\) leads to a palindrome. Only if this is not the case, \([w_i]_{\leftrightarrow }\) needs to be determined. All collapsing words computed within the band of \(w_i\) and \(\hat{w_i}\) are deleted in \(\{w_{i+1},\dots ,w_m\}\).
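
For comparison, a much simpler (and less refined) recursive construction follows from Remark 3 and Propositions 2 and 3 alone: every LR w yields the LR \(\mathsf {0}w\), and \(\mathsf {1}w\) is kept exactly when the condition of Proposition 3 holds. The sketch below implements this baseline; it is not the band-based procedure described above, it does not use the palindromic prefix length, and all names are our own assumptions.

```python
def max_ones(w):
    """f_w as a list indexed by k = 0..len(w)."""
    pre = [0]
    for letter in w:
        pre.append(pre[-1] + (letter == "1"))
    n = len(w)
    return [max(pre[i + k] - pre[i] for i in range(n - k + 1))
            for k in range(n + 1)]


def extends_to_lr(w):
    """Condition of Proposition 3 for a LR w."""
    n = len(w)
    f, g = max_ones(w), max_ones("1" + w)
    return g[:n + 1] == f and g[n + 1] == f[n] + 1


def least_representatives(n):
    """LRs of length n >= 1, built level by level from the LRs of length 1."""
    level = ["0", "1"]
    for _ in range(n - 1):
        with_zero = ["0" + w for w in level]                    # Remark 3
        with_one = ["1" + w for w in level if extends_to_lr(w)]  # Prop. 3
        level = sorted(with_zero + with_one)
    return level
```

The lengths of `least_representatives(n)` for \(n=1,\dots ,5\) are 2, 3, 5, 8, 14, which agrees with the number of classes obtained by brute force for these lengths.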

In the remaining part of the section we investigate the set \({{\,\mathrm{NPal}\,}}(n)\) w.r.t. \({{\,\mathrm{NPal}\,}}(\ell )\) for \(\ell <n\). This leads to a second calculation for an upper bound and a refinement for determining the LRs of \(\varSigma ^n/\equiv _n\) faster.

Lemma 10

If \(w\in {{\,\mathrm{NPal}\,}}(n)\backslash \{\mathsf {1}^n\}\) then \(\mathsf {1}w\) is not a LR but \(w\mathsf {1}\) is a LR.

Remark 7

It follows from Lemma 10 that all words \(w\in {{\,\mathrm{NPal}\,}}(n)\backslash \{\mathsf {1}^n\}\) collapse with a smaller LR. Thus, for all \(n\in \mathbb {N}\), an upper bound for \(|\varSigma ^{n+1}/\equiv _{n+1}|\) is given by \(2|\varSigma ^n/\equiv _n|-({{\,\mathrm{npal}\,}}(n)-1)\), where the term \(-1\) accounts for \(\mathsf {1}^n\), which is excluded in Lemma 10.

For a closed recursive calculation of the upper bound in Remark 7, the exact number \({{\,\mathrm{npal}\,}}(n)\) is needed. Unfortunately we are not able to determine \({{\,\mathrm{npal}\,}}(n)\) for arbitrary \(n\in \mathbb {N}\). The following results show relations between prefix normal palindromes of different lengths. For instance, if \(w\in {{\,\mathrm{NPal}\,}}(n)\) then \(\mathsf {1}w\mathsf {1}\) is a prefix normal palindrome as well. The importance of the pnPals is witnessed by the following estimation.

Theorem 4

For all \(n\in \mathbb {N}_{\ge 2}\) and \(\ell =|\varSigma ^n/\equiv _n|\) we have

$$\begin{aligned} \ell +{{\,\mathrm{npal}\,}}(n-1)\le |\varSigma ^{n+1}/\equiv _{n+1}|\le \ell +{{\,\mathrm{npal}\,}}(n+1)+\frac{\ell -{{\,\mathrm{npal}\,}}(n+1)}{2}. \end{aligned}$$

The following results only consider pnPals that are different from \(\mathsf {0}^n\) and \(\mathsf {1}^n\). Notice for these special palindromes that \(\mathsf {0}^n\mathsf {0}^n,\mathsf {1}^n\mathsf {1}^n,\mathsf {1}^n\mathsf {1}\mathsf {1}^n,\mathsf {0}^n\mathsf {0}\mathsf {0}^n\), \(\mathsf {1}\mathsf {1}^n\mathsf {1}^n\mathsf {1},\mathsf {1}\mathsf {0}^n\mathsf {0}^n\mathsf {1}\in {{\,\mathrm{NPal}\,}}(k)\) for an appropriate \(k\in \mathbb {N}\) but \(\mathsf {0}^n\mathsf {1}\mathsf {0}^n\not \in {{\,\mathrm{NPal}\,}}(2n+1)\).

Lemma 11

If \(w\in {{\,\mathrm{NPal}\,}}(n)\backslash \{\mathsf {1}^n,\mathsf {0}^n\}\) then neither \(ww\) nor \(w\mathsf {1}w\) are prefix normal palindromes.

Lemma 12

Let with \(n\in \mathbb {N}_{\ge 3}\). If is also a prefix normal palindrome then \(w=\mathsf {1}^k\) or for some \(u\in \varSigma ^{*}\) and \(k\in \mathbb {N}\).

A characterisation for \(w\mathsf {1}w\) being a pnPal is more complicated. From \(w\in {{\,\mathrm{NPal}\,}}(n)\) it follows that a block of \(\mathsf {1}\)s contains at most as many \(\mathsf {1}\)s as the previous block. But if such a block contains strictly fewer \(\mathsf {1}\)s, the number of \(\mathsf {0}\)s in between can increase by the same amount by which the number of \(\mathsf {1}\)s decreased.

Lemma 13

Let \(w\in {{\,\mathrm{NPal}\,}}(n)\backslash \{\mathsf {1}^n,\mathsf {0}^n\}\). If \(\mathsf {1}ww\mathsf {1}\) is also a prefix normal palindrome then .

Lemmas 11, 12, and 13 indicate that a characterization of prefix normal palindromes based on smaller ones is hard to determine.

5 Conclusion

Based on the work in [14], we investigated prefix normal palindromes in Sect. 3 and gave a characterisation based on the maximum-ones function. At the end of Sect. 4 results for a recursive approach to determine prefix normal palindromes are given. These results show that easy connections between prefix normal palindromes of different lengths cannot be expected. By introducing the collapsing relation we were able to partition the set of extension-critical words introduced in [14]. This leads to a characterization of collapsing words which can be extended to an algorithm determining the corresponding equivalence classes. Moreover we have shown that palindromes and the collapsing classes are related.

The concrete values for prefix normal palindromes and the index of the collapsing relation remain an open problem as well as the cardinality of the equivalence classes w.r.t. the collapsing relation. Further investigations of the prefix normal palindromes and the collapsing classes lead directly to the index of the prefix equivalence.