“Most of” leads to undecidability: Failure of adding frequencies to LTL

Linear Temporal Logic (LTL) interpreted on finite traces is a robust specification framework popular in formal verification. However, despite the high interest in the logic in recent years, the topic of their quantitative extensions is not yet fully explored. The main goal of this work is to study the effect of adding weak forms of percentage constraints (e.g. that most of the positions in the past satisfy a given condition, or that \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\sigma $$\end{document}σ is the most-frequent letter occurring in the past) to fragments of LTL. Such extensions could potentially be used for the verification of influence networks or statistical reasoning. Unfortunately, as we prove in the paper, it turns out that percentage extensions of even tiny fragments of LTL have undecidable satisfiability and model-checking problems. Our undecidability proofs not only sharpen most of the undecidability results on logics with arithmetics interpreted on words known from the literature, but also are fairly simple. We also show that the undecidability can be avoided by restricting the allowed usage of the negation, and discuss how the undecidability results transfer to first-order logic on words.


Introduction
Linear Temporal Logic [29] (LTL) interpreted on finite traces is a robust logical framework used in formal verification [1,18,19]. However, LTL is not perfect: it can express whether some event happens or not, but it cannot provide any insight on how frequently such an event occurs or for how long such an event took place. In many practical applications, such quantitative information is important: think of optimising a server based on how frequently it receives messages or optimising energy consumption knowing for how long a system is usually used in rush hours. Nevertheless, there is a solution: one can achieve such goals by adding quantitative features to LTL.
It is known that adding quantitative operators to LTL often leads to undecidability. The proofs, however, typically involve operators such as "next" or "until", and are often quite complicated (see the discussion on the related work below). In this work, we study the logic LTL F , a fragment of LTL where the only allowed temporal operator is "sometimes in the future" F . We extend its language with two types of operators, sharing a similar "percentage" flavour: with the Past-Majority PM ϕ operator (stating that most of the past positions satisfy a formula ϕ), and with the Most-Frequent-Letter MFL σ predicates (meaning that the letter σ is among the most frequent letters appearing in the past). These operators can be used to express a number of interesting properties, such as if a process failed to enter the critical section, then the other process was in the critical section the majority of time. Of course, for practical applications, we could also consider richer languages, such as parametrised versions of these operators, e.g. stating that at least a fraction p of positions in the past satisfies a formula. However, we show, as our main result, that even these very simple percentage operators raise undecidability when combined with F .
To make the undecidability proof for both operators similar, we define an intermediate operator, Half , which is satisfied when exactly half of the past positions satisfy a given formula. The Half operator can be expressed easily with PM , but not with MFL -we show, however, that we can simulate it to an extent enough to show the undecidability. Our proof method relies on enforcing a model to be in the language ({wht}{shdw}) + , for some letters wht and shdw, which a priori seems to be impossible without the "next" operator. Then, thanks to the specific shape of the models, we show that one can "transfer" the truth of certain formulae from positions into their successors, hence the "next" operator can be partially expressed. With a combination of these two ideas, we show that it is possible to write equicardinality statements in the logic. Finally, we perform a reduction from the reachability problem of Two-counter Machines [26]. In the reduction, the equicardinality statements will be responsible for handling zerotests. The idea of transferring predicates from each position into its successor will be used for switching the machine into its next configuration.
The presented undecidability proof of LTL with percentage operators can be adjusted to extensions of fragments of first-order logic on finite words. We show that FO 2 M [<], i.e. the two-variable fragment of first-order logic admitting the majority quantifier M and linear order predicate < has an undecidable satisfiability problem. Here the meaning of a formula Mx.ϕ(x, y) is that at least a half of possible interpretations of x satisfies ϕ(x, y). Our result sharpens an existing undecidability proof for (full) FO with Majority from [23] (since in our case the number of variables is limited) but also FO 2 [<, succ] with arithmetics from [25] (since our counting mechanism is weaker and the successor relation succ is disallowed). On the positive side, we show that the undecidability heavily depends on the presence of the negation in front of the percentage operators. To do so, we introduce a logic, extending the full LTL, in which the usage of percentage operators is possible, but suitably restricted. For this logic, we show that the satisfiability problem is decidable.
All the above-mentioned results can be easily extended to the model checking problem, where the question is whether a given Kripke structure satisfies a given formula. The full version of the paper is available on arXiv [4].

Related work
The first paper studying the addition of quantitative features to logic was [21], where the authors proved undecidability of Weak MSO with Cardinalities. They also developed a model of so-called Parikh Automaton, a finite automaton imposing a semi-linear constraint on the set of its final configurations. Such an automaton was successfully used to decide logics with counting as well as logics on data words [27,17]. Its expressiveness was studied in [11].
Another idea in the realm of quantitative features is availability languages [20], which extend regular expressions by numerical occurrence constraints on the letters. However, their high expressivity leads to undecidable emptiness problems. Weak forms of arithmetics have also attracted interest from researchers working on temporal logics. Several extensions of LTL were studied, including extensions with counting [24], periodicity constraints [14], accumulative values [7], discounting [2], averaging [9] and frequency constraints [8]. A lot of work was done to understand LTL with timed constraints, e.g. a metric LTL was considered in [28]. However, its complexity is high and its extensions are undecidable [3].
Arithmetical constraints can also be added to the First-Order logic (FO) on words via so-called counting quantifiers. It is known that weak MSO on words is decidable with threshold counting and modulo-counting (thanks to the famous Büchi theorem [10]), while even FO on words with percentage quantifiers becomes undecidable [23]. Extensions of fragments of FO on words are often decidable, e.g. the two-variable fragment FO 2 with counting [12] or FO 2 with modulo-counting [25]. The investigation of decidable extensions of FO 2 is limited by the undecidability of FO 2 on words with Presburger constraints [25].
Among the above-mentioned logics, the formalisms of this paper are most similar to Frequency LTL [8]. The satisfiability problem for Frequency LTL was claimed to be undecidable, but the undecidability proof as presented in [8] is bugged (see [9,Sec. 8] for discussion). It was mentioned in [9] that the undecidability proof from [8] can be patched, but no correction was published so far. Our paper not only provides a valid proof but also sharpens the result, as we use a way less expressive language (e.g. we are allowed to use neither the "until" operator nor the "next" operator). We also believe that our proof is simpler. The second-closest formalism to ours is average-LTL [9]. The main difference is that the averages of average-LTL are computed based on the future, while in our paper, the averages are based on the past. The second difference, as in the previous case, is that their undecidability proof uses more expressive operators, such as the "until" operator.

Words and logics.
Let AP be a countably-infinite set of atomic propositions, called here also letters. A finite word w ∈ (2 AP ) * is a non-empty finite sequence of positions labelled with sets of letters from AP. A set of words is called a language. Given a word w, we denote its i-th position with w i (where the first position is w 0 ) and its prefix up to the i-th position with w ≤i . We usually use the letters p, q, i, j to denote positions. With |w| we denote the length of w.
The syntax of LTL F , a fragment of LTL with only the finally operator F , is defined with the grammar: ϕ, ϕ ::= a (with a ∈ AP) | ¬ϕ | ϕ ∧ ϕ | F ϕ.
The satisfaction relation |= is defined for words as follows: We write w |= ϕ if w, 0 |= ϕ. The usual Boolean connectives: , ⊥, ∨, →, ↔ can be defined, hence we will use them as abbreviations. Additionally, we use the globally operator G ϕ := ¬F ¬ϕ to speak about events happening globally in the future.
Percentage extension. In our investigation, percentage operators PM, MFL and Half are added to LTL F .
The operator PM ϕ (read as: majority in the past) is satisfied if at least half of the positions in the past satisfy ϕ: For example, the formula G (r ↔ ¬g) ∧ G PM r ∧ G F (g ∧ PM g) is true over words where each request r is eventually fulfilled by a grant g, and where each grant corresponds to at least one request. This can be also seen as the language of balanced parentheses, showing that with the operator PM one can define properties that are not regular.
The operator MFL σ (read as: most-frequent letter in the past), for σ ∈ AP, is satisfied if σ is among the letters with the highest number of appearances in the past, i.e. w, i |= MFL σ if ∀τ ∈ AP. |{j < i : w, j |= σ}| ≥ |{j < i : w, j |= τ }| For example, the formula G ¬(r ∧ g) ∧ G MFL r ∧ G F (g ∧ MFL g) again defines words where each request is eventually fulfilled, but this time the formula allows for states where nothing happens (i.e. when both r and g are false).
The last operator, Half is used to simplify the forthcoming undecidability proofs. This operator can be satisfied only at even positions, and its intended meaning is exactly half of the past positions satisfy a given formula.
It is not difficult to see that the operator Half ϕ can be defined in terms of the past-majority operator as PM (ϕ) ∧ PM (¬ϕ) and that Half ϕ can be satisfied only at even positions.
In the next sections, we distinguish different logics by enumerating the allowed operators in the subscripts, e.g. LTL F,PM or LTL F,MFL .
Computational problems Kripke structures are commonly used in verification to formalise abstract models. A Kripke structure is composed of a finite set S of states, a set of initial states I ⊆ S, a total transition relation R ⊆ S × S, and a finite labelling function : S → 2 AP . A trace of a Kripke structure is a finite word (s 0 ), (s 1 ), . . . , (s k ) for any s 0 , s 1 , . . . , s k satisfying s 0 ∈ I and (s i , s i+1 ) ∈ R for all i < k.
The model-checking problem amounts to checking whether some trace of a given Kripke structure satisfies a given formula ϕ. In the satisfiability problem, or simply in SAT, we check whether an input formula ϕ has a model, i.e. a finite word w witnessing w |= ϕ.

Playing with Half Operator
Before we jump into the encoding of Minsky machines, we present some exercises to help the reader understand the expressive power of the logic LTL F,Half . The tools established in the exercises play a vital role in the undecidability proofs provided in the following section.
We start from the definition of shadowy words. wht shdw wht shdw wht shdw We will call the positions satisfying wht simply white and their successors satisfying shdw simply their shadows.
The following exercise is simple in LTL, but becomes much more challenging without the X operator.

Exercise 1.
There is an LTL F,Half formula ψ shadowy defining shadowy words.
Solution. We start with the "base" formula ϕ ex1 init := wht ∧ G (wht ↔ ¬shdw) ∧ G (wht → F shdw), which states that the position 0 is labelled with wht, each position is labelled with exactly one letter among wht, shdw and that every white eventually sees a shadow in the future. What remains to be done is to ensure that only odd positions are shadows and that only even positions are white.
In order to do that, we employ the formula ϕ ex1 odd := G ((Half wht) ↔ wht). Since Half is never satisfied at odd positions, the formula ϕ ex1 odd stipulates that odd positions are labelled with shdw. An inductive argument shows that all the even positions are labelled with wht: for the position 0, it follows from ϕ ex1 init . For an even position p > 0, assuming (inductively) that all even positions are labelled with wht, the formula ϕ ex1 odd ensures that p is labelled with wht. Putting it all together, the formula ψ shadowy := ϕ ex1 init ∧ ϕ ex1 odd is as required. In the next exercise, we show that it is possible to transfer the presence of certain letters from white positions into their shadows. It justifies the usage of "shadows" in the paper.
We introduce the so-called counting terms. For a formula ϕ, word w and a position p, by # < ϕ (w, p) we denote the total number of positions among 0, . . . , p−1 satisfying ϕ, i.e. the size of {p < p | w, p |= ϕ}. We omit w in counting terms if it is known from the context.
Solution. Note that the first two conditions can be expressed with the conjunction of ψ shadowy , G (σ → wht) and G (σ → shdw). The last condition is more involving.
Assuming that the words under consideration satisfy conditions 1-2, it is easy to see that the third condition is equivalent to expressing that all white positions p satisfy the equation (♥): supplemented with the condition (♦), ensuring that the last white position satisfies the condition 3, i.e.
The proof of the following lemma can be found in the appendix. Going back to Exercise 2, we show how to define (♥) and (♦) in LTL F,Half , taking advantage of shadowness of the intended models. Take an arbitrary white position p of w. The equation (♥) for p is clearly equivalent to: Since p is even, we infer that p 2 ∈ N. From the shadowness of w, we know that there are exactly p 2 shadows in the past of p. Moreover, each shadow satisfies either σ or ¬σ. Hence, the expression p 2 −# < shdw∧σ (w, p) from (♥ ), can be replaced with # < shdw∧¬σ (w, p). Finally, since wht and shdw label disjoint positions, the property that every white position p satisfies (♥) can be written as an LTL F,Half formula ). Its correctness follows from the correctness of each arithmetic transformation and the semantics of LTL F,Half .
For the property (♦), we first need to define formulae detecting the last and the second to last positions of the model. Detecting the last position is easy: since the last position of w is shadow, it is sufficient to express that it sees only shadows in its future, i.e. ϕ ex2 last := G (shdw). Similarly, a position is second to last if it is white and it sees only white or last positions in the future, which results in a formula ϕ ex2 stl := wht ∧ G (wht ∨ ϕ ex2 last ). Note that the correctness of ϕ ex2 last and ϕ ex2 stl follows immediately from shadowness. Hence, we can define the formula ϕ (♦) as F (ϕ ex2 stl ∧ σ) ↔ F (ϕ ex2 last ∧σ). The conjunction of ϕ (♥) and ϕ (♦) formulae gives us to ϕ trans σ σ . We consider a generalisation of shadowy models, where each shadow mimics all letters from a finite set Σ ⊆ AP rather than just a single letter σ. Such a generalisation is described below. In what follows, we always assume that for each σ ∈ Σ there is a uniqueσ, which is different from σ, andσ ∈ Σ. Moreover, we always assume that σ 1 = σ 2 impliesσ 1 =σ 2 .
shdw} be a finite set. A shadowy word w is called truly Σ-shadowy, if for every letter σ ∈ Σ only the white (resp. shadow) positions of w can be labelled with σ (resp.σ) and every white position p of w satisfies w, p |= σ ⇔ w, p+1 |=σ.
Knowing the solution for the previous exercise, it is easy to come up with a formula ψ truly−Σ shadowy defining truly Σ-shadowy models: just take the conjunction of ψ shadowy and ϕ trans σ σ over all letters σ ∈ Σ. The correctness follows immediately from from Exercise 2. The next exercise shows how to compare cardinalities in LTL F,Half over truly Σ-shadowy models. We are not going to introduce any novel techniques here, but the exercise is of great importance: it is used in the next section to encode zero tests of Minsky machines. There exists an LTL F,Half formula ψ #α=#β such that for any truly Σ-shadowy word w and any of its white positions p: The solution is in the appendix, here we briefly discuss the main idea. Follow the previous exercise. The main difficulty is to express the equality of counting terms, written as LHS = RHS. Note that it is clearly equivalent to LHS + ( p 2 − RHS) = p 2 . Unfold p 2 on the left hand side, i.e. replace it with the total number of shadows in the past. Use the fact that w satisfies ϕ trans σ σ , which implies the equality # < wht∧β (w, p) = # < shdw∧β (w, p). Finally, get rid of subtraction and write an LTL F,Half formula by employing Half . The presented exercises show that the expressive power of LTL F,Half is so high that, under a mild assumption of truly-shadowness, it allows us to perform cardinality comparison. We are now only a step away from showing undecidability of the logic, which is tackled next.

Undecidability of LTL extensions
This section is dedicated to the main technical contribution of the paper, namely that LTL F,Half , LTL F,PM and LTL F,MFL have undecidable satisfiability and model checking problems. We start from LTL F,Half . Then, the undecidability of LTL F,PM will follow immediately from the fact that Half is definable by PM. Finally, we will show how the undecidability proof can be adjusted to LTL F,MFL .
We start by recalling the basics on Minsky Machines.
Minsky machines A deterministic Minsky machine is, roughly speaking, a finite transition system equipped with two unbounded-size natural counters, where each counter can be incremented, decremented (only in the case it is positive), and tested for being zero. Formally, a Minsky machine A is composed of a finite set of states Q with a distinguished initial state q 0 and a transition function δ : it means that only the positive counters can be decremented) and q = q (the machine cannot enter the same state two times in a row). Intuitively, the first coordinate of δ describes the current state of the machine, the second and the third coordinates tell us whether the current value of the i-th counter is zero or positive, the next two coordinates denote the update on the counters and the last coordinate denotes the target state. We define a run of a Minsky machine A as a sequence of consecutive transi- , all the following conditions are satisfied: It is not hard to see that this definition is equivalent to the classical one [26]. We say that a Minsky machine reaches a state q ∈ Q if there is a run with a letter containing q on its last coordinate. It is well known that the problem of checking whether a given Minsky machine reaches a given state is undecidable [26].

"Half of" meets the halting problem
We start from presenting the overview of the claimed reduction. Until the end of Section 4, let us fix a Minsky machine A = (Q, q 0 , δ) and its state q ∈ Q. Our ultimate goal is to define an LTL F,Half formula ψ q A such that ψ q A has a model iff A reaches q. To do so, we define a formula ψ A such that there is a one-to-one correspondence between the models of ψ A and runs of A. Expressing the reachability of q, and thus ψ q A , based on ψ A is easy. Intuitively, the formula ψ A describes a shadowy word w encoding on its white positions the consecutive letters of a run of A. In order to express it, we introduce a set Σ A , composed of the following distinguished atomic propositions: from q and to q for all states q ∈ Q, -fVal c and sVal c for counter values c ∈ {0, +}, and -fOP op and sOP op for all operations op ∈ {−1, 0, 1}.
We formalise the one-to-one correspondence as the function run, which takes an appropriately defined shadowy model and returns a corresponding run of A. More precisely, the function run(w) returns a run whose ith configuration is (q, f, s,f ,s, q N ) if and only if the ith white configuration of w is labelled with from q , fVal f , sVal s , fOPf , sOPs and to q N .
The formula ψ A ensures that its models are truly Σ A -shadowy words representing a run satisfying properties P1-P4. To construct it, we start from ψ truly−Σ A shadowy and extending it with four conjuncts. The first two of them represent properties P1-P2 of runs. They can be written in LTL F in an obvious way.
To ensure the satisfaction of the property P3, we observe that in some sense the letters from q and to q are paired in a model, i.e. always after reaching a state in A you need to get out of it (the initial state is an exception here, but we assumed that there are no transitions to the initial state). Thus, to identify for which q we should set the from q letter on the position p, it is sufficient to see for which state we do not have a corresponding pair, i.e. for which state q the number of white from q to the left of p is not equal to the number of white to q to the left of p. We achieve this in the spirit of Exercise 3.
Finally, the satisfaction of the property P4 can be achieved by checking for each position p whether the number of white fOP +1 to the left of p is the same as the number of white fOP −1 to the left of p, and similarly for the second counter. This reduces to checking an equicardinality of certain sets, which can be done by employing shadows and Exercise 3.
The reduction Now we are ready to present the claimed reduction. We first restrict the class of models under consideration to truly Σ A -shadowy words (for the feasibility of equicardinality encoding) with a formula ψ truly−Σ A shadowy . Then, we express that the models satisfy properties P1 and P2. The first property can be expressed with ψ P 1 := from q0 ∧ fVal 0 ∧ sVal 0 . The property P2 will be a conjunction of two formulae. The first one, namely ψ 1 P 2 , is an immediate implementation of P2. The second one, i.e. ψ 2 P 2 , is not necessary, but simplifies the proof; we require that no position is labelled by more than six letters from Σ A .
We put ψ P 2 := ψ 1 P 2 ∧ ψ 2 P 2 and ψ enc-basics := ψ truly−Σ A shadowy ∧ ψ P 1 ∧ ψ P 2 . We now formalise the correspondence between intended models and runs. Let run be the function which takes a word w satisfying ψ enc-basics and returns the word w A such that |w A | = |w|/2 and for each position i we have: The definition of ψ enc-basics makes the function run correctly defined and unambiguous, and that the results of run satisfy properties P1 and P2.

Fact 5 The function run is uniquely defined and returns words satisfying P1 and P2.
What remains to be done is to ensure properties P3 and P4. Both formulas rely on the tools established in Exercise 3 and are defined as follows: (from q ∨ ψ #from q =#toq )).
Proof. The satisfaction of the properties P1 and P2 by run(w) follows from Fact 5. Ad absurdum, assume that run(w) does not satisfy P3. It implies the existence of a white position p in w such that w, p |= to q but w, p+2 |= from q for some q = q . By our definition of Minsky machines, we conclude that w, p |= from q for some q = q. Thus, w, p |= from q . From the satisfaction of ψ P 3 by w we know that w, p |= ψ #from q =#toq . Let k be the total number of positions labelled with from q before p. Since w, p |= ψ #from q =#toq holds, by Exercise 3 we infer that the number of positions satisfying to q before p is also equal to k. Since w, p+2 |= from q and from the satisfaction of ψ P 3 by w we once more conclude w, p+2 |= ψ #from q =#toq . But such a situation clearly cannot happen due to the fact that the number of to q in the past is equal to k + 1, while the number of from q in the past is k.
Finally, let us define ψ A as ψ enc-basics ∧ ψ P 3 ∧ ψ P 4 . The use of ↔ in ψ P 4 guarantees that fVal 0 labels exactly the white positions having the counter empty (and similarly for the second counter). The counters are never decreased from 0, thus the white positions not satisfying fVal 0 are exactly those having the first counter positive.
The proof of the forthcoming fact relies on the correctness of Exercise 3 and is quite similar to the proof of Lemma 2, and is presented in the appendix.

Lemma 3. If w satisfies ψ A , then run(w) is a run of A.
Lastly, to show that the encoding is correct, we need to show that each run has a corresponding model. It is again easy: it can be shown by constructing an appropriate w; the white positions are defined according to ( ), and the shadows can be constructed accordingly.

Fact 6 If w A is a run of A, then there is a word
Observe that the formula ψ q A is satisfiable if and only if A reaches q. The "if" part follows from Lemma 3 and the satisfaction of the conjunct F (to q ) from ψ A . The "only if" part follows from Fact 6. Hence, from undecidability of the reachability problem Minsky machines we infer our main theorem: Theorem 1. The satisfiability problem for LTL F,Half is undecidable.

Undecidability of model-checking
For a given alphabet Σ, we can define a Kripke structure K Σ whose set of traces is the language (2 Σ ) + : the set of states S of K Σ is composed of all subsets of Σ, all states are initial (i.e. I = S), the transition relation is the maximal relation (R = S×S) and (X)=X for any subset X ⊆ Σ. It follows that a formula ϕ over an alphabet Σ is satisfiable if and only if there is a trace of K Σ satisfying ϕ. From the undecidability of the satisfiability problem for LTL F,Half we get:

Theorem 2. Model-checking of LTL F,Half formulae over Kripke structures is undecidable.
The decidability can be regained if additional constraints on the shape of Kripke structures are imposed: model-checking of LTL F,Half formulae over flat structures is decidable [13].
As discussed earlier, the Half operator can be expressed in terms of the PM operator. Hence, we conclude: Corollary 2. Model-checking and satisfiability problems for LTL F,PM are undecidable.

Most-Frequent Letter and Undecidability
We next turn our attention to the MFL operator, which turns out to be a little bit problematic. Typically, formulae depend only on the atomic propositions that they explicitly mentioned. Here, it is not the case. Consider a formula ϕ = MFL a and words w 1 = {a}{}{a} and w 2 = {a, b}{b}{a, b}. Clearly, w 1 , 2 |= ϕ whereas w 2 , 2 |= ϕ. This can be fixed in many ways -for example, by parametrising MFL with a domain, so that it expresses that "a is the most frequent letter among b 1 , . . . , b n ". We show, however, that even this very basic version of MFL is undecidable. The proof is an adaptation of our previous proofs with a little twist inside.
First, we adjust the definition of shadowy words. A word w is strongly shadowy if w is shadowy and for each even position of w we have that wht and shdw are the most frequent letters among the other labelling w while for odd positions wht is the most frequent. Note that the words constructed in the previous sections were strongly shadowy because each letter σ appeared only at whites or at shadows.

Exercise 4. There exists an LTL F,MFL formula ψ MFL
shadowy defining strongly shadowy words.
Proof. It suffices to revisit Exercise 1 and to modify the formula ϕ ex1 odd stipulating that odd positions are exactly those labelled with shdw (since it is the only formulae employing Half ). We claim that ϕ ex1 odd can be expressed with ϕ MFL odd := G [MFL (wht) ∧ (wht ↔ MFL (shdw))] Indeed, take any word w |= ϕ ex1 init ∧ ϕ MFL odd . Of course we have w, 0 |= wht (due to ϕ ex1 init ). Moreover, w, 1 |= shdw holds: otherwise we would get contradiction with shdw not being the most frequent letter in the past of 1. Now assume p > 1 and assume that the word w 0 , . . . , w p−1 is strongly shadowy. Consider two cases. If p is odd, then both wht and shdw are the most frequent letters in the past of p−1 and p−1 is labelled by wht. Then, shdw is not the most frequent letter in the past of p and thus p is labelled by shdw and wht is the most frequent letter in the past of p. If p is even, p−2 is labelled by wht and the most frequent letters in the past of p−2 are wht and shdw, and p−1 is labelled by shdw. Thus both wht and shdw are the most frequent letters in the past of p and therefore wht is labelled by wht. Thus, w 0 , . . . , w p is strongly shadowy. By induction, w is strongly shadowy. It can be readily checked that every strongly shadowy word satisfies ψ MFL shadowy . We argue that over the strongly shadowy models, the formulae Half σ and MFL σ are equivalent. Proof. If w, 2i |= MFL σ, then w, 2i |= MFL wht due to the strongly shadowness of w. Hence # < σ (w, 2i) = # < wht (w, 2i) = 2i 2 , implying w, 2i |= Half σ. Now, assume that w, 2i |= Half σ holds, so σ appears i times in the past. Since w is strongly shadowy we know that wht is the most frequent letter. Moreover, wht appears 2i 2 = i times in the past. Hence, w, 2i |= MFL σ.
We say that a letter σ is importunate in a word w if σ labels more than half of the positions in some even prefix of w. Notice that strongly shadowy words cannot have importunate letters.
With the above lemma, it is tempting to finish the proof as follows: replace each Half (ϕ) in the formulae from Section 4.1 with MFL (p ϕ ) for some fresh atomic proposition p ϕ and require that G (ϕ ↔ p ϕ ) holds. A formula obtained from ϕ in this way will be called a dehalfication of ϕ and will be denoted with dehalf(ϕ). The next lemma shows that dehalf(·) preserves satisfaction of certain LTL F,Half formulae.

Lemma 5.
Let ϕ be an LTL F,Half formula without nested Half operators and without F modality, Λ be the set of all formulae λ such that Half λ appears in ϕ and let w be a word such that w |= ψ MFL shadowy ∧ λ∈Λ G (p λ ↔ λ). Then for all even positions 2p of w we have that w, 2p |= dehalf(ϕ) implies w, 2p |= ϕ.
Proof. The proof goes via structural induction over LTL F,Half formulae without nested Half operators and without F operators. The only interesting case is when ϕ = Half λ, which follows from Lemma 4.
Note, however, that the above lemma works only one way: it fails when the formula ϕ is satisfied in more than half of the positions of some prefix, as that would make p ϕ importunate leading to unsatisfiablity of ψ MFL shadowy .

Most-Frequent Letter: the reduction
The next step is to construct a formula defining truly Σ A -shadowy words, which are the crucial part of ψ MFL enc-basics . To do it, we first need to rewrite a formula ϕ trans σ σ , transferring the truth of a letter σ from whites into their shadows. The main ingredient of ϕ trans σ σ is the formula ϕ (♥) := G (wht → Half ([wht ∧ σ] ∨ [shdw ∧ ¬σ])), which we replace with dehalf(ϕ (♥) ). We call the obtained formula (ϕ trans σ σ ) MFL and show its correctness below.
First, by Lemma 5 we know that every model of (ϕ trans σ σ ) MFL is also a model of ϕ trans σ σ . Then, the models of ϕ trans σ σ can be made strongly shadowy, so dehalfication of ϕ trans σ σ is satisfiability-preserving.

Lemma 6.
Let p ϕ be a fresh letter for ϕ := [wht ∧ σ] ∨ [shdw ∧ ¬σ]. Take w, a strongly shadowy word satisfying w |= ϕ trans σ σ without any occurrences of p ϕ . Then w , the word obtained by labelling with p ϕ all the positions of w satisfying ϕ, is strongly shadowy.
Hence, we obtain the correctness of (ϕ trans σ σ ) MFL . By applying the same strategy to other conjuncts of ψ enc-basics and Fact 5, we obtain ψ MFL enc-basics satisfying: The function run (taking as input the words satisfying ψ MFL enc-basics ) is uniquely defined and returns words satisfying P1 and P2. Moreover the formulae ψ MFL enc-basics and ψ enc-basics are equi-satisfiable.
Towards completing the undecidability proof we need to prepare the rewritings of the formulae ψ P 3 and ψ P 4 . For ψ P 3 we proceed similarly to the previous case. We know that the models of ψ MFL enc-basics ∧ dehalf(ψ P 3 ) satisfy P3 (due to Lemma 5 they satisfy ψ P 3 and hence, by Lemma 2, also P3). To observe the existence of such models, we show again that the satisfiability of ψ P 3 is preserved by dehalfication.

Lemma 7.
Let p q be a fresh letter for ϕ q := [wht ∧from q ]∨[shdw ∧¬ to q ] indexed over q ∈ Q\{q 0 }. Take w, a strongly shadowy word satisfying w |= ψ MFL enc-basics ∧ψ P 3 without any occurrences of p q . Then w , the word obtained by labelling with p q all the positions of w satisfying ϕ q , is strongly shadowy.
The last formula to rewrite is ψ P 4 . We focus only on its first part, speaking about the first counter, i.e.
Note that this time we cannot simply dehalfise this formula: the letter responsible for the inner part of Half would necessarily be importunate -consider an initial fragment of a run of A in which A increments its first counter without decrementing it. Fortunately, we cannot say the same when the machine decrements the counter and hence, it suffices to express the equivalent (due to even length of shadowy models) statement ψ P 4 as follows: As we did before, we show that dehalfication of ψ P 4 preserves satisfiability: Take w, a strongly shadowy word satisfying w |= ψ MFL enc-basics ∧ dehalf(ψ P 3 ) ∧ ψ P 4 without any occurrences of p ϕ . Then w , the word obtained by labelling with p ϕ all the positions of w satisfying ϕ, is strongly shadowy. Thus, by Theorem 1 and the above corollary, we obtain the undecidability of LTL F,MFL . Undecidability of the model-checking problem is concluded by virtually the same argument as in Section 6.1. Hence: Theorem 3. The model-checking and the satisfiability problems for LTL F,MFL are undecidable.

Decidable variants
We have shown that LTL F with frequency operators lead to undecidability. Without the operators that can express F (e.g. F, G or U ), the decision problems become NP-complete. Below we assume the standard semantics of LTL operator X , i.e. w, i |= X ϕ iff i+1 < |w| and w, i+1 |= ϕ. Theorem 4. Model-checking and satisfiability problems for LTL X,MFL,PM are N P -complete.
The complexity of LTL X,MFL,PM is so low because the truth of the formula depends only on some initial fragment of a trace. This is a big restriction of the expressive power. Thus, we consider a different approach motivated by [7].
In the new setting, we allow to use arbitrary LTL formulae as well as percentage operators as long as the they are not mixed with G . We introduce a logic LTL % , which extends the classical LTL [29] with the percentage operators of the form P k% ϕ for any ∈ { ≤, <, =, >, ≥ }, k ∈ N and ϕ ∈ LTL. By way of example, the formula P <20% (a) is true at a position p if less then 20% of positions before p satisfy a. The past majority operator is a special case of the percentage operator: PM ≡ P ≥50% . Formally: To avoid undecidability, the percentage operators cannot appear under negation or be nested. Therefore, the syntax of LTL % is defined with the grammar ϕ, ϕ : The main tool used in the decidability proof is the Parikh Automata [21]. A Parikh automaton P = (A, E) over the alphabet Σ is composed of a finitestate automaton A accepting words from Σ * and a semi-linear set E given as a system of linear inequalities with integer coefficients, where the variables are x a for a ∈ Σ. We say that P accepts a word w if A accepts w and the mapping assigning to each variable x a from E the total number of positions of w carrying the letter a, is a solution to E. Checking non-emptiness of the language of P can be done in NP [17]. Our main decidability results is obtained by constructing an appropriate Parikh automaton recognising the models of an input LTL % formula.

Theorem 5.
Model-checking and satisfiability problems for LTL % are decidable.
Proof. Let ϕ ∈ LTL % . By turning ϕ into a DNF, we can focus on checking satisfiability of some of its conjuncts. Hence, w.l.o.g. we assume that ϕ = ϕ 0 ∧ n i=1 ϕ i , where ϕ 0 is in LTL and all ϕ i have the form F (ψ i,1 LTL ∧ P ki% ψ i,2 LTL ) for some LTL formulae ψ i,1 LTL and ψ i,2 LTL . Observe that a word w is a model of ϕ iff it satisfies ϕ 0 and for each conjunct ϕ i we can pick a witness position p i from w such that w, p i |= ψ i,1 LTL ∧ P ki% ψ i,2 LTL . Moreover, the percentage constraints inside such formulae speak only about the prefix w <pi . Thus, knowing the position p i and the number of positions before p i satisfying ψ i,2 LTL , the percentage constraint inside ϕ i can be imposed globally rather than locally. It suggests the use of Parikh automata: the LTL part of ϕ can be checked by the appropriate automaton A (due to the correspondence that for an LTL formula over finite words one can build a finite-state automaton recognising the models of such a formula [19]) and the global constraints, speaking about the satisfaction of percentage operators, can be ensured with a set of linear inequalities E.
Our plan is as follows: we decorate the intended models w with additional information on witnesses, such that the witness position p i for ϕ i will be labelled by w i (and there will be a unique such position in a model), all positions before p i will be labelled by b i and, among them, we distinguish with a letter s i some special positions, i.e. those satisfying ψ i,2 LTL . More formally, for each ϕ i we produce an LTL formula ϕ i according to the following rules: there is a unique position p i such that w, p i |= w i (selecting a witness for ϕ i ), for all j < p i we have w, j |= b i (the positions before p i are labelled with b i ), LTL ]) (distribution of the special positions among b i ) and Note that w |= ϕ implies w |= ϕ. Moreover, any model w |= ϕ can be labelled with letters b i , s i , w i such that the decorated word satisfies ϕ . Let ϕ := ϕ 0 ∧ n i=1 ϕ i and let E be the system of n inequalities with E i = 100 · x bi k i · x si . Now observe that any model of ϕ satisfies E (i.e. the value assigned to x a is the total number of positions labelled with a), due to the satisfaction of counting operators, and vice versa: every word w |= ϕ satisfying E is a model of ϕ . It gives us a sufficient characterisation of models of ϕ. Let A be a finite automaton recognising the models of ϕ , then a Parikh automaton P = (A, E), as we already discussed, is non-empty if and only if ϕ has a model. Since checking non-emptiness of P is decidable, we can conclude that LTL % is decidable.
A rough complexity analysis yields an NExpTime upper bound on the problem: the automaton P that we constructed is exponential in ϕ (translating ϕ to DNF does not increase the complexity since we only guess one conjunct, which is of polynomial size in ϕ). Moreover, checking non-emptiness can be done non-deterministically in time polynomial in the size of the automaton. The NExpTime bound is not optimal: we conjuncture that the problem is PSpace-complete. We believe that by employing techniques similar to [7], one can construct P and check its non-emptiness on the fly, which should result in the PSpace upper bound.
For the model-checking problem, we observe that determining whether some trace of a Kripke structure K = (S, I, R, l) satisfies ϕ is equivalent to checking the satisfiability of formula ϕ K ∧ ϕ, where ϕ K is a formula describing all the traces of K. Such a formula can be constructed in a standard manner. For simplicity, we treat S as a set of auxiliary letters, and consider the conjunction of (1) s∈I s, (2) G (X → (s,s )∈R (s ∧ X s )) and (3) s∈S G (s → p∈ (s) p), expressing that the trace starts with an initial state, consecutive positions describe consecutive states and that the trace is labelled by the appropriate letters. Thus, the modelchecking problem can be reduced in polynomial time to the satisfiability problem.

Two-Variable First-Order Logic with Majority
The Two-Variable First-Order Logic on words (FO 2 [<]) is a robust fragment of First-Order Logic FO interpreted on finite words. It involves quantification over variables x and y (ranging over the words' positions) and it admits a linear order predicate < (interpreted as a natural order on positions) and the equality predicate =. Henceforth we assume the usual semantics of FO 2 [<] (cf. [16]).
In this section, we investigate the logic FO 2 M [<], namely the extension of FO 2 [<] with the so-called Majority quantifier M. Such quantifier was intensively studied due to its close connection with circuit complexity and algebra, see e.g. [22,5,6]. Intuitively, the formula Mx.ϕ specifies that at least half of all the positions in a model, after substituting x with them, satisfy ϕ. Formally w |= Mx.ϕ holds, if and only if |w| 2 ≤ |{p | w, p |= ϕ[x/p]}|. We stress that the formula Mx.ϕ may contain free occurrences of the variable y.
Note that the Majority quantifier shares similarities to the PM operator, but in contrast to PM, the M quantifier counts globally. We take advantage of such similarities and by reusing the technique developed in the previous sections, we show that the satisfiability problem for FO 2 M [<] is also undecidable. We stress that our result significantly sharpens an existing undecidability result for FO with Majority from [23] (since in our case the number of variables is limited) as well as for FO 2 [<, succ] with Presburger Arithmetics from [25] (since our counting mechanism is limited and the successor relation succ is disallowed).
Proof plan There are three possible approaches to proving the undecidability of FO 2 M [<]. The first one is to reproduce all the results for LTL F,PM , which is rather uninspiring. The second one is to define a translation from LTL F,PM to FO 2 M [<] that produces an equisatisfiable formula. But because of models of odd length, this involves a lot of case study. Here we present a third approach, which, we believe, gives the best insight: we show a translation from LTL F,PM to FO 2 M [<] that works for LTL F,PM formulae whose all models are shadowy. Since we only use such models in the undecidability proof of LTL F,PM , this shows the undecidability of FO 2 M [<].
Shadowy models We first focus on defining shadowy words in FO 2 M [<]. Before we start, let us introduce a bunch of useful macros in order to simplify the forthcoming formulae. Their names coincide with their intuitive meaning and their semantics. Proof. Let ϕ lem9 base be a formula defining the language of all (non-empty) words, where the letters wht and shdw label disjoint positions in the way that the first position satisfies wht and the total number of shdw and wht coincide. It can be written, e.g. with ∀x(wht(x) ↔ ¬shdw(x))∧∃x(first(x)∧wht(x))∧Halfx.wht(x)∧ Halfx.shdw(x). To define shadowy words, it would be sufficient to specify that no neighbouring positions carry the same letter among { wht, shdw }. This can be done with, rather complicated at the first glance, formulae: the only possibility for w to not be shadowy is to have two consecutive positions p, p+1 carrying the same letter. W.l.o.g assume they are both white. Let w be the number of white positions to the left of p and let s be the number of shadows to the right of p. By applying ϕ forbid wht·wht to p we infer that w + s = 1 2 |w|. On the other hand, by applying ϕ forbid wht·wht to p+1 it follows that (w+1)+s = 1 2 |w|, which contradicts the previous equation. Hence, w is shadowy.
The following lemma shows the correctness of the presented translation.

Lemma 10.
An LTL F,PM formula ϕ has a shadowy model iff tr(ϕ) has a model.
Since the formulae used in our undecidability proof for LTL F,PM have only shadowy models, by Lemma 10 we conclude that FO 2 M [<] is also undecidable. Theorem 6. The satisfiability problem for FO 2 M [<] is undecidable.

Conclusions
We have provided a simple proof showing that adding different percentage operators to LTL F yields undecidability. We showed that our technique can be applied to an extension of first-order logic on words, and we hope that our work will turn useful in showing undecidability for other extensions of temporal logics. Decidability results for logics with percentage operators in restricted contexts were also provided.