1 Introduction

A theorem of universal algebra and model theory is proved in [8] that is particularly pertinent to the study of semigroups. The theorem clarifies why it is so often the case that classes of algebras which are closed under the taking of arbitrary direct products (P) and homomorphic images (H), but not necessarily subalgebras (S), may be defined by a set of equation-like sentences. The members of these bases are referred to in [8] as equation systems and generally involve simultaneous equations, meaning that the equations are not assumed to be independent in that a symbol may occur in more than one equation. More formally, a semigroup equation system is a quantified conjunction of equalities between semigroup words (elements of a free semigroup). Thus in comparison with the more familiar theory of varieties and equational logic, we allow both \(\forall \) and \(\exists \) as quantifiers at the front of the sentence and we allow the logical connective of conjunction \(\wedge \) (“and”).

For instance, Example 2.1(ii) of [8] shows that Completely regular semigroups (\(\mathscr{C}\mathscr{R}\)) is the class consisting of all semigroups in which the following equation pair may be solved:

$$\begin{aligned} (\forall \,a)(\exists \,x):(a=axa) \wedge (ax = xa). \end{aligned}$$
(1)

These equations are simultaneous and indeed both the parameter a and the variable x feature in each. Together they capture the property that each element has an inverse with which it commutes, which is one definition of completely regular semigroups. However, we shall show that this class may be defined by a single quantified equation (without conjunction), and in more than one way.
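As an informal illustration, solvability of an equation system such as (1) can be tested by brute force in a finite semigroup. The sketch below assumes the semigroup is given by its Cayley table on the elements 0, ..., n-1; the function names and the two test semigroups are our own choices for illustration and do not come from [8].

```python
from itertools import product

def is_associative(table):
    """Check that the Cayley table defines a semigroup."""
    n = len(table)
    return all(table[table[a][b]][c] == table[a][table[b][c]]
               for a, b, c in product(range(n), repeat=3))

def solves_eq1(table):
    """(forall a)(exists x): a = axa and ax = xa, checked exhaustively."""
    n = len(table)
    return all(any(table[table[a][x]][a] == a and table[a][x] == table[x][a]
                   for x in range(n))
               for a in range(n))

# Cyclic group of order 3, written additively (illustrative test case).
z3 = [[(i + j) % 3 for j in range(3)] for i in range(3)]
# Two-element null semigroup: every product equals 0.
null2 = [[0, 0], [0, 0]]

print(solves_eq1(z3))     # True
print(solves_eq1(null2))  # False
```

The group is completely regular, so the system is solvable there, whereas in the null semigroup the element 1 has no pre-inverse at all.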

Classes defined by equation systems in this fashion are referred to as \(\{E,H,P\}\)-classes or simply EHP-classes, with the E symbol standing for the taking of elementary substructures, H for homomorphic images and P for direct products. An elementary class (or first order class) is one defined by a collection of first order formulae: the quantifiers refer to elements of algebras, hence “elementary”, in contrast with second order logic where we may quantify over relations. The inclusion of the class operator E is necessary to capture equivalence with definability by equation systems, but the reader will not need familiarity with the definition of elementary substructures in this article, because we always proceed by finding equation systems that capture the classes we explore. It is not currently known if closure under the operator EHP is enough to ensure closure under the set of operators \(\{E,H,P\}\).

For relevant background and notation on universal algebra see [1, 2] (Chapter V there also serves as a useful first introduction to model theoretic notions for those wanting to delve further), while [3, 7, 9] are texts covering semigroup facts and terminology.

In the following theorem, closure under taking elementary substructures is subsumed by the assumption that \({\mathscr {C}}\) is an elementary class (all elementary classes are closed under the taking of elementary substructures).

Theorem 1.1

[8, Theorem 3.1] An elementary class \({\mathscr {C}}\) equals the class of models of some family of equation systems if and only if \({\mathscr {C}}\) is closed under taking homomorphic images of direct products. If the elementary class is the model class of a single sentence, then it is a class of models of a single equation system.

The theorem was inspired by the observation that so many of the fundamental classes of algebraic semigroups are EHP-classes, but are not varieties, which is to say the class is not closed under the taking of subsemigroups, and so cannot be defined by semigroup identities. Indeed since equation systems are examples of first order sentences, a class defined by the satisfaction of a set of equations is automatically an elementary class and the initial condition of the theorem is, in the (easy) forward direction, redundant.

In Sect. 2 we highlight examples showing just how rich is EHP theory in the context of semigroups. Many extensively studied semigroup classes are captured through our approach, in many cases by a single equation. We verify this in cases where the equational bases differ from those given in [8].

An EHP-class \({\mathscr {C}}\) that is closed under the taking of all embeddings (and not only elementary embeddings) is a variety in which case, by Birkhoff’s theorem, the equations defining \({\mathscr {C}}\) may be taken to be identities, which is to say the equations do not involve the quantifier \(\exists \). It is natural then to consider a kind of dual to the Birkhoff theorem for EHP-classes \({\mathscr {C}}\) defined by equations that are free from the quantifier \(\forall \). Clearly such a class is closed under the taking of containing algebras. In Sect. 3 we prove the converse does indeed hold for the class of semigroups.

In Theorem 1.1, the number of alternations between the \(\forall \) symbol, which quantifies parameters (denoted by lower case letters from the beginning of the alphabet \(a,b,c,\dots \)), and the \(\exists \) symbol, which quantifies variables (denoted typically by \(x,y,z\)), is finite but unbounded. In Sect. 4 we give examples of EHP-classes which have no basis comprised of equations of the form \((\forall \cdots )(\exists \cdots )\) nor the form \((\exists \cdots )(\forall \cdots )\).

2 Classical semigroup collections as EHP-classes

2.1 Classes of regular semigroups

We begin with six examples that highlight how easily important classes of regular semigroups may be characterised by a single equation. Moreover the proofs of this are simple but elegant exercises in semigroup theory. In particular we see in Proposition 2.1 how the classes in question may be characterised by small adjustments to the equation that defines regularity.

Proposition 2.1

The classes of Regular semigroups (\({\mathscr {R}}eg\)), Left groups (\(\mathscr{L}\mathscr{G}\)), Right groups (\(\mathscr{R}\mathscr{G}\)), Groups (\({\mathscr {G}}\)), Completely regular semigroups (\(\mathscr{C}\mathscr{R}\)), and Completely simple semigroups (\(\mathscr{C}\mathscr{S}\)) are the EHP-classes of semigroups defined by the following equations.

$$\begin{aligned}{} & {} {\mathscr {R}}eg:(\forall a)(\exists x):a=axa. \end{aligned}$$
(2)
$$\begin{aligned}{} & {} \quad \mathscr{L}\mathscr{G}:(\forall a,b)(\exists x):a=axb,\,\,\mathscr{R}\mathscr{G}:(\forall a,b)(\exists x):a=bxa. \end{aligned}$$
(3)
$$\begin{aligned}{} & {} \quad {\mathscr {G}}:(\forall a,b)(\exists x):a=bxb. \end{aligned}$$
(4)
$$\begin{aligned}{} & {} \quad \mathscr{C}\mathscr{R}:(\forall \,a)(\exists x):a=a^{2}xa^{2}. \end{aligned}$$
(5)
$$\begin{aligned}{} & {} \quad \mathscr{C}\mathscr{S}:(\forall a,b)(\exists x):a=abxba. \end{aligned}$$
(6)

Proof

Eq. (2) may be satisfied in any regular semigroup S by taking \(x\in V(a)\). Conversely, if x satisfies (2), then \(xax\in V(a)\).

For Eq. (3), let S be a left group and let \(a,b\in S\). Since S is left simple, there exists \(y\in S\) such that \(a=yb\). By regularity there exists \(z\in S\) such that \(a=aza\), whence \(a=azyb\). Putting \(x=zy\) gives \(a=axb\), as required.

Conversely suppose that \(a=axb\) is solvable in a semigroup S. Putting \(b=a\) shows that \(a=axa\) is solvable, so that S is regular. Moreover \(a=axb\) implies that \(a\le _{\mathscr {L}}b\), and since a and b are arbitrary it follows that \(a\mathrel {\mathscr {L}}b\). Therefore S is left simple and regular and so S is a left group. The left-right dual argument shows that \(\mathscr{R}\mathscr{G}\) is defined by the equation \(a=bxa\) in like manner.

For Eq. (4), if G is a group then for given \(a,b\in G\) there is a unique solution to (4), that being \(x=b^{-1}ab^{-1}\).

Conversely, let S be a semigroup in which (4) is solvable. The equation implies that \(H_{a}\le H_{b}\). By interchanging a and b we obtain the reverse inequality, whence S consists of a single \(\mathscr {H}\)-class, and is therefore a group.

For Eq. (5): from the equation, it follows that \(a\mathrel {\mathscr {H}}a^{2}\), from which we infer that every \(\mathscr {H}\)-class is a group, whence S is a union of groups, which is to say that S is completely regular.

Conversely, let S be a completely regular semigroup. Let \(a\in S\) and put \(x=b^{3}\), where b is the inverse of a in the group \(H_{a}\). Then \(ab=ba\) is the identity element of \(H_{a}\) and so

$$\begin{aligned} a^{2}xa^{2}=a^{2}b^{3}a^{2}=a(ab)b(ba)a=aba=a, \end{aligned}$$

in accord with (5).

Finally, let S be a semigroup that satisfies (6). By putting \(b=a\) we see that (6) implies (5), so that S is completely regular. For any \(a,b\in S\), (6) implies that \(J_{a}\le _{\mathscr {J}}J_{b}\), and by role reversal of a and b, the reverse inequality follows so that \(J_{a}=J_{b}\). Therefore S is a simple completely regular semigroup, which is to say that S is completely simple.

Conversely let S be a completely simple semigroup and take \(a,b\in S\). Then we have \(a\mathrel {\mathscr {R}}ab\mathrel {\mathscr {L}}b\mathrel {\mathscr {R}}ba\mathrel {\mathscr {L}}a\). By Green’s Lemma, the mapping \(\rho _{ba}:H_{a}\rightarrow H_{a}\) whereby \(x\rho _{ba}=xba\) is a bijection, as is the mapping \(\lambda _{ab}:H_{a}\rightarrow H_{a}\) whereby \(x\lambda _{ab}=abx\). It follows that \(\phi =\lambda _{ab}\rho _{ba}=\rho _{ba}\lambda _{ab}:H_{a}\rightarrow H_{a}\) is also a bijection, whereupon there exists a unique \(x\in H_{a}\) such that \(x\lambda _{ab}\rho _{ba}=abxba=a\), thereby proving that S satisfies Eq. (6). \(\square \)
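The six equations of Proposition 2.1 may be compared on small examples. The following sketch is only a finite sanity check under the assumption that semigroups are encoded as Cayley tables on 0, ..., n-1; the helper names `fold`, `WORDS` and `classes`, and the four sample semigroups, are hypothetical choices made for this illustration.

```python
from itertools import product

def fold(table, word):
    """Multiply the letters of a word from left to right."""
    acc = word[0]
    for letter in word[1:]:
        acc = table[acc][letter]
    return acc

# Right-hand sides of Eqs. (2)-(6) as words in a, b, x; the left side is always a.
WORDS = {
    "Reg": lambda a, b, x: [a, x, a],
    "LG": lambda a, b, x: [a, x, b],
    "RG": lambda a, b, x: [b, x, a],
    "G": lambda a, b, x: [b, x, b],
    "CR": lambda a, b, x: [a, a, x, a, a],
    "CS": lambda a, b, x: [a, b, x, b, a],
}

def classes(table):
    """Names of the equations that are solvable for all a, b in the semigroup."""
    n = len(table)
    return [name for name, word in WORDS.items()
            if all(any(fold(table, word(a, b, x)) == a for x in range(n))
                   for a, b in product(range(n), repeat=2))]

z3 = [[(i + j) % 3 for j in range(3)] for i in range(3)]   # cyclic group
left_zero = [[0, 0], [1, 1]]                               # ab = a
# 2 x 2 rectangular band: the pair (i, j) is encoded as 2*i + j and (i,j)(k,l) = (i,l).
rect_band = [[2 * (p // 2) + q % 2 for q in range(4)] for p in range(4)]
semilattice = [[min(i, j) for j in range(2)] for i in range(2)]

print(classes(z3))           # ['Reg', 'LG', 'RG', 'G', 'CR', 'CS']
print(classes(left_zero))    # ['Reg', 'LG', 'CR', 'CS']
print(classes(rect_band))    # ['Reg', 'CR', 'CS']
print(classes(semilattice))  # ['Reg', 'CR']
```

The output agrees with the familiar inclusions: a group lies in all six classes, a left zero semigroup is a left group but not a right group, and a two-element semilattice is completely regular but not completely simple.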

Other classes of regular semigroups may be defined by an equation system consisting of (2) together with one more equation. For instance, Example 2.1(iii) of [8] shows that Semilattices of groups (\(\mathscr{S}\mathscr{G}\)) is the EHP-class of regular semigroups defined by the additional equation:

$$\begin{aligned} (\forall \,a,b)(\exists \,y):ab=bya. \end{aligned}$$
(7)

Some standard properties used in the description of classes of semigroups may be expressed by equations, and that allows for abbreviation. For example, that a certain product u of some parameters and variables is idempotent we write as \(u\in E\), or if v is an inverse of u we write \(v\in V(u)\). Properties defined by Green’s relations are generally not intrinsically equational within the class of semigroups but may become so in the presence of the regularity Eq. (2). However the respective properties of being \(\mathscr {G}\)-simple for any of the five Green’s relations \(\mathscr {G}\) defines an EHP-class, except for the case of \(\mathscr {D}\). In general, bisimple semigroups are not closed under the taking of direct products. Within \({\mathscr {R}}eg\), the condition \(u\mathrel {\mathscr {H}}v\) is expressible through equations in S, with similar comments applying to \(\mathscr {L},\mathscr {R},\) \(\mathscr {J}\), and indeed \(\mathscr {D}\). In particular, the class of regular bisimple semigroups is defined by the regularity equation together with the equational relationships \((\forall a,b)(\exists x):a\mathrel {\mathscr {R}}x\mathrel {\mathscr {L}}b\). This class is however not a variety as, by a theorem of Preston, any semigroup may be embedded in a regular bisimple monoid [7, Corollary 1.2.15].

The property of u belonging to a subgroup, which we write as \(u\in G\), is also equational:

$$\begin{aligned} (\exists x):(x\in V(u))\wedge (ux=xu). \end{aligned}$$

The ascending chain of the three important classes of \({\mathscr {I}}\) (Inverse semigroups), \({\mathscr {O}}\) (Orthodox semigroups), and \(\mathscr{E}\mathscr{S}\), (Idempotent-solid semigroups), which are those regular semigroups whose idempotent generated subsemigroup is a union of groups, may be defined in a uniform fashion that is conveniently displayed if we adjoin two redundant equation types to the definition of regularity:

$$\begin{aligned} {{\textbf {reg}}:\,(\forall \,a,b)(\exists \,x,u,v):(x\in V(a))\wedge (u\in V(a^{2}))\wedge (v\in V(b^{2})).} \end{aligned}$$
(8)

We include two further classes within this sequence. For the first, we have the Right Inverse semigroups introduced by Venkatesan [14] as regular semigroups in which each \(\mathscr {L}\)-class contains a unique idempotent (for that reason, they are also known as \(\mathscr {L}\)-unipotent semigroups). The class \(\mathscr{R}\mathscr{I}\) of Right inverse semigroups is given six further characterisations in [14, Theorem 2.1], one of which is the class of all regular semigroups S for which \(efe = fe\) for any idempotents \(e,f\) in S. It follows that \({\mathscr {I}} \subseteq \mathscr{R}\mathscr{I} \subseteq {\mathscr {O}}\). It is also the case that \(\mathscr{R}\mathscr{I}\) is an EHP-class, and the argument for closure under homomorphisms is given in Theorem 3 of [14].

The second inclusion in the chain is the class \(\mathscr{C}\mathscr{N}\) of Conventional semigroups of Masat [13]: a regular semigroup S is conventional if \(aea'\) is idempotent for all \((a,a') \in V(S)\) and \(e\in E(S)\). Equivalently, by [13, Lemma 2.2], S is conventional if \(eEe\subseteq E\), which is a natural weakening of the \(efe = fe\) condition of Right inverse semigroups. The Conventional semigroup definition is also a weakening of the Orthodox semigroup definition, and so \({\mathscr {O}} \subseteq \mathscr{C}\mathscr{N}\). A consequence of Masat’s Lemma 2.2 and Lallement’s Lemma is that \(\mathscr{C}\mathscr{N}\) is closed under the taking of homomorphisms [13, Lemma 3.1]. Since \(\mathscr{C}\mathscr{N}\) is clearly closed under direct products and is an elementary class, it follows that \(\mathscr{C}\mathscr{N}\) is an EHP-class. With the symbols \(a,b,u,v\) satisfying the equations of reg, we may define each of these five classes by means of one additional equation (to be included within the scope of the quantification of \(\textbf{reg}\)). The following proposition is [8, Theorem 5.2], extended by the inclusion of \(\mathscr{R}\mathscr{I}\) and \(\mathscr{C}\mathscr{N}\).

Proposition 2.2

The classes of Inverse semigroups, Right inverse semigroups, Conventional semigroups, Orthodox semigroups, and Idempotent solid semigroups, are EHP-classes defined by the following equational bases.

$$\begin{aligned}{} & {} {\mathscr {I}}: {\textbf {reg}} \wedge aua\cdot bvb=bvb\cdot aua; \end{aligned}$$
(9)
$$\begin{aligned}{} & {} \quad \mathscr{R}\mathscr{I}: {\textbf {reg}} \wedge aua\cdot bvb\cdot aua = bvb\cdot aua; \end{aligned}$$
(10)
$$\begin{aligned}{} & {} \quad {\mathscr {O}}: {\textbf {reg}} \wedge aua\cdot bvb\in E; \end{aligned}$$
(11)
$$\begin{aligned}{} & {} \quad \mathscr{C}\mathscr{N}: {\textbf {reg}} \wedge \ aua\cdot bvb\cdot aua\in E; \end{aligned}$$
(12)
$$\begin{aligned}{} & {} \quad \mathscr{E}\mathscr{S}: {\textbf {reg}}\wedge aua\cdot bvb\in G. \end{aligned}$$
(13)

Proof

The cases of \({\mathscr {I}}\), \({\mathscr {O}}\) and \(\mathscr{E}\mathscr{S}\) are given in [8, Theorem 5.2]. The proof for \(\mathscr{C}\mathscr{N}\) is indicative of the approach, and we omit the very similar argument for \(\mathscr{R}\mathscr{I}\). Recall that \(\textbf{reg}\) includes the conditions \(u\in V(a^2)\) and \(v\in V(b^2)\) so that aua and bvb are idempotents. Thus any conventional semigroup satisfies the given equation system because of the \(eEe\subseteq E\) condition of [13, Lemma 2.2]. Conversely if S satisfies the equations then S is regular and for any two idempotents a and b we have:

$$\begin{aligned} aba = a^2b^2a^2 = a^2ua^2\cdot b^2vb^2\cdot a^2ua^2 = aua\cdot bvb\cdot aua, \end{aligned}$$

which is idempotent. It follows by [13, Lemma 2.2] that we have an equational basis for Conventional semigroups. \(\square \)
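The equational bases (9) to (13) can likewise be tested exhaustively on small semigroups. The sketch below is an illustration under our own encoding assumptions (Cayley tables on 0, ..., n-1, hypothetical helper names); for each pair a, b it searches for x, u, v satisfying reg together with one additional equation on the idempotents e = aua and f = bvb.

```python
from itertools import product

def fold(table, word):
    """Multiply the letters of a word from left to right."""
    acc = word[0]
    for letter in word[1:]:
        acc = table[acc][letter]
    return acc

def classes(table):
    """Names of the bases (9)-(13) satisfied by the semigroup."""
    elems = range(len(table))
    inv = lambda x, a: fold(table, [a, x, a]) == a and fold(table, [x, a, x]) == x
    idem = lambda w: table[w][w] == w
    # w lies in a subgroup: some inverse of w commutes with w.
    in_group = lambda w: any(inv(y, w) and table[w][y] == table[y][w] for y in elems)
    extra = {
        "I": lambda e, f: table[e][f] == table[f][e],
        "RI": lambda e, f: fold(table, [e, f, e]) == table[f][e],
        "O": lambda e, f: idem(table[e][f]),
        "CN": lambda e, f: idem(fold(table, [e, f, e])),
        "ES": lambda e, f: in_group(table[e][f]),
    }

    def holds(test):
        for a, b in product(elems, repeat=2):
            if not any(inv(x, a) and inv(u, table[a][a]) and inv(v, table[b][b])
                       and test(fold(table, [a, u, a]), fold(table, [b, v, b]))
                       for x, u, v in product(elems, repeat=3)):
                return False
        return True

    return [name for name, test in extra.items() if holds(test)]

semilattice = [[min(i, j) for j in range(2)] for i in range(2)]
right_zero = [[0, 1], [0, 1]]   # ab = b
left_zero = [[0, 0], [1, 1]]    # ab = a

print(classes(semilattice))  # ['I', 'RI', 'O', 'CN', 'ES']
print(classes(right_zero))   # ['RI', 'O', 'CN', 'ES']
print(classes(left_zero))    # ['O', 'CN', 'ES']
```

The right zero semigroup separates \(\mathscr{R}\mathscr{I}\) from \({\mathscr {I}}\), while the left zero semigroup is orthodox but not right inverse, in line with the chain of inclusions described above.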

The equational bases in Propositions 2.1 and 2.2 are not unique. The bases given by Eqs. (9), (11), and (13) of Proposition 2.2 correspond to bases for these classes when considered as e-varieties in the sense of Hall [6]. These are classes of regular semigroups that are HP-closed, and also closed under the taking of regular subsemigroups. Indeed any e-variety (of regular semigroups) is an EHP-class of regular semigroups, and so e-varieties may be defined without the need to introduce a unary operation that selects arbitrary inverses. The only semigroup operation involved is the natural operation of semigroup multiplication. However, not all EHP-classes consisting of regular semigroups are e-varieties (see [8, Theorem 5.1]). In common with e-varieties however is the property that if \({\mathscr {C}}\) is an EHP-class of regular semigroups then the class \({\mathscr {C}}^{loc}\) of all semigroups S whose local subsemigroups eSe lie in \({\mathscr {C}}\) (\(e\in E(S))\) is also an EHP-class (see [8, Theorem 5.5]).

The abstraction of the idea of e-varieties involves taking an EHP-class of algebras \({\mathscr {N}}\), which are labelled nice, and then considering classes \({\mathscr {C}}\) of nice algebras that are HP-closed and closed under the taking of nice subalgebras. Since the nice algebras are defined by first order formulae (equation systems), it follows that \({\mathscr {C}}\) will be closed under the taking of elementary subalgebras, and so \({\mathscr {C}}\) will automatically be another EHP-class. There could however, as in the case of regular semigroups, be EHP-classes of nice algebras that were not closed under the taking of nice subalgebras. If we declare the class of all algebras to be nice, then the corresponding class of e-varieties coincides with varieties in the usual sense of algebras defined by identities (\(\exists \)-free equation systems).

2.2 Classes defined by \((\exists \dots )(\forall \dots )\)

A simple example of a fundamentally different type is the class \({\mathscr {M}}\) of monoids.

$$\begin{aligned} {\mathscr {M}}:(\exists x)(\forall a):ax=xa=a. \end{aligned}$$
(14)

We sometimes write these equations as \(x=1\), and similarly we write \(x=0\) to abbreviate the equations that ensure the existence of a zero element in a semigroup. The point to note here however is that the order of the quantifiers in (14) is \((\exists \,\dots )(\,\forall \,\dots )\), which is the reverse of all our previous examples. Indeed \({\mathscr {M}}\) cannot be represented by equations of the type \((\forall \dots )(\exists \dots )\) because any class that has such a basis is closed under the taking of the union of an ascending chain of algebras from the class, and the class \({\mathscr {M}}\) lacks this property. We will investigate this facet of the theory further in our final section.
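The effect of the quantifier order in (14) can be seen on a two-element example. The following sketch, with function names of our own choosing and semigroups encoded as Cayley tables, compares the sentence (14) with its \((\forall \dots )(\exists \dots )\) counterpart.

```python
def exists_forall_identity(table):
    """(exists x)(forall a): ax = xa = a, which is Eq. (14)."""
    elems = range(len(table))
    return any(all(table[a][x] == a and table[x][a] == a for a in elems)
               for x in elems)

def forall_exists_identity(table):
    """(forall a)(exists x): ax = xa = a, the same equations with quantifiers swapped."""
    elems = range(len(table))
    return all(any(table[a][x] == a and table[x][a] == a for x in elems)
               for a in elems)

semilattice = [[min(i, j) for j in range(2)] for i in range(2)]  # identity element 1
left_zero = [[0, 0], [1, 1]]                                     # ab = a, no identity

print(exists_forall_identity(semilattice), forall_exists_identity(semilattice))  # True True
print(exists_forall_identity(left_zero), forall_exists_identity(left_zero))      # False True
```

Every band satisfies the swapped sentence (take x = a), so the two-element left zero semigroup shows that the \((\exists \dots )(\forall \dots )\) form defines a strictly smaller class here.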

A natural exercise then is to exchange the order of the quantifiers of the examples of Sect. 2.1. This will necessarily result in a more restricted class than that defined by the original equation system. Exchanging the order of the quantifiers in Eq. (2) defines the class \({\mathscr {B}}\) of all semigroups that possess a universal pre-inverse element:

$$\begin{aligned} {\mathscr {B}}:(\exists \,x)(\forall \,a):a=axa. \end{aligned}$$
(15)

We note that for any \(S \in {\mathscr {B}}\), the \(\mathscr {J}\)-class \(J_x\) is the maximum \(\mathscr {J}\)-class of S. Conversely any band B with a maximum \(\mathscr {J}\)-class \(J = J_x\) belongs to \({\mathscr {B}}\). To see this observe that since B is a semilattice of rectangular bands, it follows that for any \(a \in B\), \(ax \in D_a\). Since \(D_a\) is a rectangular band, we have \(ax = a^2x= a\cdot ax \mathrel {\mathscr {R}} a\), whence \(ax\cdot a = a\), showing that \(B \in {\mathscr {B}}\).

The following is a simple reformulation of the condition defined by (15).

Lemma 2.3

A semigroup \(S\in {\mathscr {B}}\) if and only if \(\exists x\in S\) such that \(ax\in R_{a}\cap E(S)\) for all \(a\in S\), which in turn is equivalent to the condition that \(xa\in L_{a}\cap E(S)\) for all \(a\in S\).

Proposition 2.4

Suppose that \(S\in {\mathscr {B}}\), let E denote E(S), and let x denote a fixed choice for satisfying (15). Then

(i) \(x\in E\), S satisfies the identity \(a^{2}=a^{3}\), and \(S=E^{2}\).

(ii) In S, \(\mathscr {D}=\mathscr {J}\), and \(\mathscr {H}\) is the equality relation.

(iii) Let \(J=J_{x}\). Then J is the maximum \(\mathscr {J}\)-class of S, the principal factor \(J\cup \{0\}=S/(S-J)\in {\mathscr {B}}\), and

$$\begin{aligned} R_{x}\cup L_{x}\subseteq E. \end{aligned}$$
(16)

Proof

(i) Taking \(a=x\) in (15) we get \(x=x^{3}\). For any a we have \(a^{2}=a^{2}xa^{2}=a(axa)a=a^{3}\). In particular, \(x^{2}=x^{3}=x\), so that \(x\in E\). Furthermore, since \((ax)^{2}=ax\) and \((xa)^{2}=xa\) we have \(a=axa=(ax)(xa)\), and so \(S=E^{2}\).

(ii) That \(\mathscr {D}=\mathscr {J}\) follows from the satisfaction of \(a^{2}=a^{3}\), as this equality of Green’s relations is true in any periodic semigroup; indeed it is true of any group-bound semigroup (see [4, Theorem 1.2.20]).

Let D be any (regular) \(\mathscr {D}\)-class of S. In any subgroup G of S, the equation \(a^{2}=a^{3}\) implies that \(a=e\), the identity element of G, and so S has trivial subgroups. It follows that every group \(\mathscr {H}\)-class, and hence every \(\mathscr {H}\)-class of S is trivial, which is to say that S is a combinatorial (i.e. \(\mathscr {H}\)-trivial) semigroup.

(iii) Since \({J}_{a}\le {J}_{x}\) it follows that \(J=J_{x}\) is the maximum \(\mathscr {J}\)-class of S (where x represents any solution to (15)). Since EHP-classes are closed under the taking of homomorphisms, the principal factor \(J\cup \{0\}\) also belongs to \({\mathscr {B}}\), and any solution to (15) in S is also a solution to (15) in \(J\cup \{0\}\). Suppose that \(x\mathrel {\mathscr {R}}a\) in S, and hence in \(S/(S-J)\) also. Then since \(x\in E\), it follows that \(a=xa\). But then \(a=axa=a^{2}\), so that \(a\in E\). Dually if \(x\mathrel {\mathscr {L}}a\) then a is also idempotent. We conclude that Condition (16) holds. \(\square \)
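Part (i) of Proposition 2.4 lends itself to a finite sanity check. The sketch below, which is an illustration and not a substitute for the proof, enumerates all associative Cayley tables on at most three elements (isomorphic copies included) and tests the three conclusions of (i) for every solution x of (15); all function names are hypothetical.

```python
from itertools import product

def semigroups(n):
    """Yield every associative Cayley table on {0, ..., n-1}."""
    for cells in product(range(n), repeat=n * n):
        table = [cells[i * n:(i + 1) * n] for i in range(n)]
        if all(table[table[a][b]][c] == table[a][table[b][c]]
               for a, b, c in product(range(n), repeat=3)):
            yield table

def witnesses(table):
    """All x with a = axa for every a, that is, all solutions of (15)."""
    elems = range(len(table))
    return [x for x in elems if all(table[table[a][x]][a] == a for a in elems)]

def part_i_holds(table, x):
    """x is idempotent, a^2 = a^3 for all a, and every element is a product of two idempotents."""
    elems = range(len(table))
    idempotents = {e for e in elems if table[e][e] == e}
    cube_law = all(table[table[a][a]][a] == table[a][a] for a in elems)
    products = {table[e][f] for e in idempotents for f in idempotents}
    return x in idempotents and cube_law and products == set(elems)

members = [(t, x) for n in (1, 2, 3) for t in semigroups(n) for x in witnesses(t)]
print(len(members) > 0)                              # True
print(all(part_i_holds(t, x) for t, x in members))   # True
```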

Remark 2.5

It is possible for a semigroup S to satisfy all three conditions of Proposition 2.4 yet for S not to belong to \({\mathscr {B}}\). For example take the six-element semigroup \(A_{2}^{1}\), which is the Rees matrix 0-simple semigroup with adjoined identity element 1, given by \({\mathscr {M}}^{0}[\{e\},2,2;P]^{1}\), where \(\{e\}\) is a one-element group and

$$\begin{aligned} P=\begin{bmatrix}e &{} e\\ e &{} 0 \end{bmatrix}. \end{aligned}$$

Taking \(x=1\), we see that each of (i), (ii), and (iii) is satisfied. However \(A_{2}^{1}\not \in {\mathscr {B}}\): taking \(a=1\) we see that we must take \(x=1\) in order to satisfy the condition of Proposition 2.4. However with \(x=1\), for the single non-idempotent element \(a=(e;2,2)\) of \(A_{2}^{1}\), we have \(ax=xa=a\not \in E(S)\), contrary to Lemma 2.3.
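The computation in Remark 2.5 can be reproduced mechanically. In the following sketch the encoding is our own assumption: the strings "0" and "1" stand for the zero and the adjoined identity, and the pair (i, lam) stands for the element \((e;i,\lambda )\) of the Rees matrix semigroup, with the sandwich matrix stored as a dictionary.

```python
from itertools import product

# Sandwich matrix P: entry 1 stands for the group element e, entry 0 for zero.
P = {(1, 1): 1, (1, 2): 1, (2, 1): 1, (2, 2): 0}
ELEMS = ["0", "1"] + [(i, lam) for i in (1, 2) for lam in (1, 2)]

def mul(s, t):
    """Product in A_2 with identity adjoined."""
    if s == "1":
        return t
    if t == "1":
        return s
    if s == "0" or t == "0":
        return "0"
    (i, lam), (j, mu) = s, t
    return (i, mu) if P[(lam, j)] else "0"

assert all(mul(mul(s, t), u) == mul(s, mul(t, u))
           for s, t, u in product(ELEMS, repeat=3))

non_idempotents = [s for s in ELEMS if mul(s, s) != s]
# Candidates x forced by the parameter a = 1 in Eq. (15).
forced = [x for x in ELEMS if mul(mul("1", x), "1") == "1"]
in_B = any(all(mul(mul(a, x), a) == a for a in ELEMS) for x in ELEMS)

print(non_idempotents)  # [(2, 2)]
print(forced)           # ['1']
print(in_B)             # False
```

The only candidate for x is the identity, and it fails at the single non-idempotent element, as stated in the remark.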

We next consider the more restricted equational system \({\mathscr {V}}\subseteq {\mathscr {B}}\), consisting of all semigroups that possess a universal inverse element:

$$\begin{aligned} {\mathscr {V}}:(\exists \,x)(\forall \,a):x\in V(a). \end{aligned}$$
(17)

Let \(S \in {\mathscr {V}}\), and so Proposition 2.4 applies. In particular S is a regular periodic combinatorial semigroup. Moreover, (17) implies that S is bisimple, and since any periodic bisimple semigroup is completely simple (Corollary 2.56 of [3]), we conclude that S is completely simple. However, a completely simple combinatorial semigroup is none other than a rectangular band, which certainly satisfies (17). Indeed this gives the following curious formulation of the property of the existence of a universal inverse.

Proposition 2.6

For any semigroup S either

\( \cap _{a \in S} V(a) = \varnothing \) or \(\cap _{a \in S} V(a) = S\),

the latter occurring if and only if S is a rectangular band.
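The dichotomy of Proposition 2.6 is easy to observe on small examples. The sketch below assumes Cayley tables on 0, ..., n-1 and uses hypothetical helper names; it computes the intersection of the sets V(a) directly from the definition of an inverse.

```python
def inverses(table, a):
    """V(a): all x with a = axa and x = xax."""
    elems = range(len(table))
    return {x for x in elems
            if table[table[a][x]][a] == a and table[table[x][a]][x] == x}

def universal_inverses(table):
    """Intersection of V(a) over all elements a."""
    common = set(range(len(table)))
    for a in range(len(table)):
        common &= inverses(table, a)
    return common

# 2 x 2 rectangular band: the pair (i, j) is encoded as 2*i + j and (i,j)(k,l) = (i,l).
rect_band = [[2 * (p // 2) + q % 2 for q in range(4)] for p in range(4)]
semilattice = [[min(i, j) for j in range(2)] for i in range(2)]
z3 = [[(i + j) % 3 for j in range(3)] for i in range(3)]

print(sorted(universal_inverses(rect_band)))    # [0, 1, 2, 3]
print(sorted(universal_inverses(semilattice)))  # []
print(sorted(universal_inverses(z3)))           # []
```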

2.3 Applying the EHP theorem: the Croisot theory

The previous section also serves to introduce a strategy for applying the EHP theorem. The general approach is to systematically list semigroup equations and identify the corresponding semigroup classes. The classical decomposition theory of R. Croisot [5] deals in semigroup classes defined by a collection of simple related equations, which we summarise here. The following two results, based on [5], may be found in Sect. 4.1 of [3].

In this section the quantifiers are suppressed, with \(a,b\) standing for parameters and \(x,y\) for variables. Some left-right dual statements are not explicitly stated.

Theorem 2.7

[3] For the EHP-class defined by \(a = a^mxa^n\) \((m + n \ge 2)\):

(i) The class defined by (m, 0) is the class of all semigroups S in which each \(\mathscr {R}\)-class is a subsemigroup (which is then necessarily a right simple subsemigroup) of S. Equivalently \(a \mathrel {\mathscr {R}} a^2\) for all \(a \in S\).

(ii) The class defined by (m, n) with \(m,n \ge 1\) and \(m + n \ge 3\) is the class of completely regular semigroups.

Theorem 2.8

[3] The EHP-class defined by \(a = xa^2y\) is the class of all semigroups S that are unions of simple subsemigroups; S is then necessarily a semilattice of simple semigroups, and these simple components are the \(\mathscr {J}\)-classes of S.

The proof of Theorem 2.7 involves a mix of syntactic and semantic argument, with equation manipulation and use of properties of Green’s relations. Another semantic proof of Theorem 2.7(ii) comes through observing that \(a = a^2xa\) implies that S is regular and satisfies \(a \mathrel {\mathscr {R}} a^2\) so that each \(\mathscr {R}\)-class is a regular subsemigroup of S. (The general (m, n) case of Theorem 2.7 follows easily from this basic equation.) From this we may deduce that the principal factors of S are completely simple, for if not S would contain a copy of the bicyclic semigroup, B. However, the property of \(\mathscr {R}\)-classes being subsemigroups would be inherited by any inverse subsemigroup of S, which would then be a semilattice of groups, which is contradicted by the presence of B.
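Theorem 2.7(ii) predicts that the equations \(a=a^{m}xa^{n}\) with \(m,n\ge 1\) and \(m+n\ge 3\) all define the same class as the system (1). The sketch below is a finite sanity check only: it compares a few exponent pairs, chosen arbitrarily by us, across every associative Cayley table on at most three elements. The helper names are hypothetical.

```python
from itertools import product

def semigroups(n):
    """Yield every associative Cayley table on {0, ..., n-1}."""
    for cells in product(range(n), repeat=n * n):
        table = [cells[i * n:(i + 1) * n] for i in range(n)]
        if all(table[table[a][b]][c] == table[a][table[b][c]]
               for a, b, c in product(range(n), repeat=3)):
            yield table

def fold(table, word):
    """Multiply the letters of a word from left to right."""
    acc = word[0]
    for letter in word[1:]:
        acc = table[acc][letter]
    return acc

def solvable(table, m, n):
    """(forall a)(exists x): a = a^m x a^n."""
    elems = range(len(table))
    return all(any(fold(table, [a] * m + [x] + [a] * n) == a for x in elems)
               for a in elems)

def eq1(table):
    """(forall a)(exists x): a = axa and ax = xa, which is system (1)."""
    elems = range(len(table))
    return all(any(fold(table, [a, x, a]) == a and table[a][x] == table[x][a]
                   for x in elems)
               for a in elems)

EXPONENTS = [(2, 1), (1, 2), (2, 2), (3, 2)]
agree = all(len({eq1(t)} | {solvable(t, m, n) for m, n in EXPONENTS}) == 1
            for size in (1, 2, 3) for t in semigroups(size))
print(agree)  # True
```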

An advantage of the EHP theorem is that it characterises classes in the first-order language, offering a vehicle for automated theorem proving. The software package Prover 9 [11] found a proof that \(a = a^2xa\) implies \(a \mathscr {L} a^2\), whence \(a \mathscr {H} a^2\), and so S is a union of groups. The proof below is the result of equation manipulation, suggesting that automated theorem provers may play a role in results of this kind.

Remark 2.9

We now present a demonstration of the statement that the equation \(a = a^2xa\) characterises completely regular semigroups.

Proof

As observed above, it is enough to show that \(a\mathrel {\mathscr {L}}a^{2}\) for an arbitrary \(a\in S\).

Fix two members \(a',a^{R}\in S\) such that \(a=aa'a\) and \(a=a^{2}a^{R}\) and put

$$\begin{aligned} r=a^{R}a'a. \end{aligned}$$
(18)

Then r has three properties relevant to our purpose,

$$\begin{aligned}{} & {} a^{2}r=(a^{2}a^{R})a'a=aa'a=a. \end{aligned}$$
(19)
$$\begin{aligned}{} & {} \quad (ar)^{2}=(aa^{R}a'a)(aa^{R}a'a)=(aa^{R}a')(a^{2}a^{R})a'a\nonumber \\{} & {} \qquad =aa^{R}a'(aa'a)=a(a^{R}a'a)=ar. \end{aligned}$$
(20)

Moreover \(a^{2}r=a\) implies \(a^{2}r^{2}=ar\) and since (20) shows that ar is idempotent we infer:

$$\begin{aligned} a^{2}r^{2}=ara^{2}r^{2}. \end{aligned}$$
(21)

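We next deduce that \(a=ara\) as follows, where \(r^{R}\) denotes an element of S satisfying \(r=r^{2}r^{R}\) (such an element exists because the equation \(a=a^{2}xa\) is solvable for every element of S):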

$$\begin{aligned} a&=a^{2}r \quad \text {(by (19))}\\&=a^{2}r^{2}r^{R} \quad \text {(by definition of } r^{R}\text {)}\\&=ara^{2}r^{2}r^{R} \quad \text {(by (21))}\\&=ara^{2}r \quad \text {(by definition of } r^{R}\text {)}\\&=ara \quad \text {(by (19)).} \end{aligned}$$

Now \(ra\mathrel {\mathscr {L}}a^{2}\) as \(ra=a^{R}a'a^{2}\) (by (18)), and from (19), \(a^{2}ra=a^{2}\). Since \(a=ara\) we then have \(a\mathrel {\mathscr {L}}ra\mathrel {\mathscr {L}}a^{2}\), which completes the proof. \(\square \)
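The equational steps of this proof can be replayed in any finite semigroup. The sketch below is our own illustration and is unrelated to the Prover 9 run: it ranges over all admissible choices of \(a'\), \(a^{R}\) and an element \(r^{R}\) with \(r=r^{2}r^{R}\), and asserts (19), (20) and the conclusion \(a=ara\) for each. Semigroups are assumed to be given as Cayley tables, and the function names are hypothetical.

```python
from itertools import product

def fold(table, word):
    """Multiply the letters of a word from left to right."""
    acc = word[0]
    for letter in word[1:]:
        acc = table[acc][letter]
    return acc

def check_remark(table):
    """Assert (19), (20) and a = ara for every admissible choice; return how many were tested."""
    n, count = len(table), 0
    for a, a1, aR in product(range(n), repeat=3):
        # Keep only choices with a = a a' a and a = a^2 a^R.
        if fold(table, [a, a1, a]) != a or fold(table, [a, a, aR]) != a:
            continue
        r = fold(table, [aR, a1, a])               # Eq. (18)
        assert fold(table, [a, a, r]) == a          # Eq. (19)
        ar = table[a][r]
        assert table[ar][ar] == ar                  # Eq. (20)
        for rR in range(n):
            if fold(table, [r, r, rR]) == r:        # r = r^2 r^R
                assert fold(table, [a, r, a]) == a  # conclusion a = ara
                count += 1
    return count

z3 = [[(i + j) % 3 for j in range(3)] for i in range(3)]
# 2 x 2 rectangular band: the pair (i, j) is encoded as 2*i + j and (i,j)(k,l) = (i,l).
rect_band = [[2 * (p // 2) + q % 2 for q in range(4)] for p in range(4)]

print(check_remark(z3) > 0)         # True
print(check_remark(rect_band) > 0)  # True
```

Since each step is a consequence of the chosen equalities alone, the assertions cannot fail in any semigroup; the count merely confirms that admissible choices exist in the two examples.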

Proposition 2.10

The following are equivalent for a semigroup S.

(i) S satisfies \(a=abxba\).

(ii) S satisfies \(a=abxa\).

(iii) S satisfies \(a=axba\).

(iv) S is completely simple.

Proof

We have by Proposition 2.1 that (i) and (iv) are equivalent. Clearly (i) implies (ii) as given \(a=abxba\) we have \(a=abya\), where \(y=xb\). Similarly (i) implies (iii). By symmetry it is enough now to prove that (ii) implies (iv). By taking \(b=a\) in (ii) we see that S satisfies \(a=a^{2}xa\), whence by Theorem 2.7(ii), S is completely regular. It also follows from (ii) that S has only one \(\mathscr {J}\)-class, whence S is completely simple. \(\square \)
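The equivalences of Proposition 2.10 may be confirmed on all semigroups with at most three elements. In the sketch below, which is a sanity check with hypothetical helper names, condition (iv) is tested semantically as "completely regular with a single \(\mathscr {J}\)-class", using Eq. (5) for complete regularity and computing the principal ideals \(S^{1}bS^{1}\) directly.

```python
from itertools import product

def semigroups(n):
    """Yield every associative Cayley table on {0, ..., n-1}."""
    for cells in product(range(n), repeat=n * n):
        table = [cells[i * n:(i + 1) * n] for i in range(n)]
        if all(table[table[a][b]][c] == table[a][table[b][c]]
               for a, b, c in product(range(n), repeat=3)):
            yield table

def fold(table, word):
    """Multiply the letters of a word from left to right."""
    acc = word[0]
    for letter in word[1:]:
        acc = table[acc][letter]
    return acc

def solvable(table, word):
    """(forall a, b)(exists x): a = word(a, b, x)."""
    elems = range(len(table))
    return all(any(fold(table, word(a, b, x)) == a for x in elems)
               for a, b in product(elems, repeat=2))

def completely_simple(table):
    """Completely regular (Eq. (5)) and every principal ideal is the whole semigroup."""
    elems = range(len(table))
    cr = all(any(fold(table, [a, a, x, a, a]) == a for x in elems) for a in elems)

    def ideal(b):
        left = {b} | {table[s][b] for s in elems}
        return left | {table[y][t] for y in left for t in elems}

    return cr and all(ideal(b) == set(elems) for b in elems)

WORDS = [lambda a, b, x: [a, b, x, b, a],   # (i)
         lambda a, b, x: [a, b, x, a],      # (ii)
         lambda a, b, x: [a, x, b, a]]      # (iii)

agree = all(len({solvable(t, w) for w in WORDS} | {completely_simple(t)}) == 1
            for size in (1, 2, 3) for t in semigroups(size))
print(agree)  # True
```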

3 The EHP theorem

3.1 Outline of proof of Theorem 1.1

The proof of Theorem 1.1 in the forward direction is simple for it is clear that if a class of algebras \({\mathscr {C}}\) is defined by an equation system then this property is preserved under the taking of homomorphic images and arbitrary direct products. Moreover, such a class \({\mathscr {C}}\) is automatically an elementary class as \({\mathscr {C}}\) is defined in the first order language.

The converse direction however is a consequence of Lyndon’s positivity theorem, which states that for an elementary class closed under taking surjective homomorphic images, a sentence is equivalent to a positive sentence (one free of negations). Thus we may assume that our class of algebras \({\mathscr {C}}\), which satisfies \(HP({\mathscr {C}})\subseteq {\mathscr {C}}\), is the class of models of a set \(\Sigma \) of positive sentences. There is no loss of generality in assuming that all quantifiers are at the front of the positive sentences. The remaining task then is to show that disjunctions in each \(\rho \in \Sigma \) may be removed.

Consider a sentence \(\rho \in \Sigma \). We may express the equation systems of \(\rho \) as a finite conjunction of disjunctions, \(\bigwedge _{1\le i\le m}\gamma _{i}\), where each \(\gamma _{i}\) is a finite disjunction: \(\gamma _{i}=\alpha _{i,1}\vee \dots \vee \alpha _{i,r_{i}}\), and each \(\alpha _{i,j}\) is an atomic formula involving some subset of the full set of parameters and variables of \(\rho \). Suppose that for some i, \(r_{i}\ge 2\) \((1\le i\le m)\). We show that the conjunct \(\gamma _{i}\) may be replaced by some \(\alpha _{i,j}\) and the resulting reduced sentence is equivalent to \(\rho \) for any elementary class \({\mathscr {C}}\) that is closed under HP. Repeating this for each conjunct \(\gamma _{i}\) will see us arrive at the desired \(\bigvee \)-free sentence. The quantifiers remain unchanged throughout.

In view of this, we suppress the symbol i and use the corresponding symbols \(r=r_{i}\), \(\gamma =\gamma _{i}\), and \(\alpha _{j}=\alpha _{i,j}\). Let \(\rho _{j}\) be the result of replacing \(\gamma \) in \(\rho \) by \(\alpha _{j}\). Note that \(\rho _{j}\vdash \rho \) so that the class of models of \((\Sigma \cup \{\rho _{j}\})\setminus \{\rho \}\) is a subclass of \({\mathscr {C}}\). We wish to show that for some j the reverse containment holds. Assume by way of contradiction that this is not the case. Then for each \(j\,(1\le j\le r)\) there exists a model \(M_{j}\in {\mathscr {C}}\) such that \(\rho _{j}\) fails in \(M_{j}\). Put \(M=\Pi _{j=1}^{r}M_{j}\in {\mathscr {C}}\) and so \(M\models \gamma \). (This stage of the argument only requires that \({\mathscr {C}}\) is closed under the taking of finitary direct products. However since \({\mathscr {C}}\) is an elementary class, \({\mathscr {C}}\) is closed under the taking of ultraproducts; \({\mathscr {C}}\) being closed under finitary direct products and ultraproducts then implies that \({\mathscr {C}}\) is in fact closed under the taking of arbitrary direct products.)

The nature of the argument may be illustrated in the simplest case where \(\rho \) has only one pair of quantifiers; for instance, let us say that each \(\alpha _{j}\) has a single equation:

$$\begin{aligned} \alpha _{j}:(\forall {\varvec{a}}_{j})(\exists {\varvec{x}}_{j}):u_{j}({\varvec{a}}_{j},{\varvec{x}}_{j})=v_{j}({\varvec{a}}_{j},{\varvec{x}}_{j}), \end{aligned}$$

where \({\varvec{a}}_{j}\) and \({\varvec{x}}_{j}\) are the respective vectors of parameters and variables of the equation \(u_{j}=v_{j}\). Since \(M_{j}\not \models \alpha _{j}\) it follows that for \(M_{j}\), \(\exists {\varvec{a}}_{j}\) such that \(\forall {\varvec{x}}_{j}\) \(u_{j}({\varvec{a}}_{j},{\varvec{x}}_{j})\ne v_{j}({\varvec{a}}_{j},{\varvec{x}}_{j})\). Since \(M\models \gamma \) we may select \(\tilde{{\varvec{a}}}=({\varvec{a}}_{1},\dots ,{\varvec{a}}_{j},\dots ,{\varvec{a}}_{r})\) as our parameter choices for M and there exists a corresponding choice of variables \(\tilde{{\varvec{x}}}=({\varvec{x}}_{1},\dots ,{\varvec{x}}_{j},\dots ,{\varvec{x}}_{r})\) such that for some j, \(u_{j}(\tilde{{\varvec{a}}},\tilde{{\varvec{x}}})=v_{j}(\tilde{{\varvec{a}}},\tilde{{\varvec{x}}})\). However, taking the projection of this last equation onto the jth component then yields the contradiction that \(u_{j}({\varvec{a}}_{j},{\varvec{x}}_{j})=v_{j}({\varvec{a}}_{j},{\varvec{x}}_{j})\).
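The role of direct products in eliminating disjunctions can be seen in a toy case. The example below is our own and is not taken from [8]: the sentence \((\forall a,b):(ab=a)\vee (ab=b)\) holds in the two-element left zero and right zero semigroups, each of which fails one of the two disjuncts, and it fails in their direct product. Hence no class closed under direct products can contain both semigroups and also satisfy the disjunction, which is the phenomenon the projection argument exploits. Semigroups are encoded as Cayley tables and the names are hypothetical.

```python
from itertools import product

def direct_product(s, t):
    """Cayley table of S x T, with the pair (i, j) encoded as i * len(t) + j."""
    m = len(t)
    size = len(s) * m
    return [[s[p // m][q // m] * m + t[p % m][q % m] for q in range(size)]
            for p in range(size)]

def alpha1(table):
    """(forall a, b): ab = a."""
    return all(table[a][b] == a for a, b in product(range(len(table)), repeat=2))

def alpha2(table):
    """(forall a, b): ab = b."""
    return all(table[a][b] == b for a, b in product(range(len(table)), repeat=2))

def gamma(table):
    """(forall a, b): ab = a or ab = b."""
    return all(table[a][b] in (a, b) for a, b in product(range(len(table)), repeat=2))

left_zero = [[0, 0], [1, 1]]    # satisfies alpha1, fails alpha2
right_zero = [[0, 1], [0, 1]]   # satisfies alpha2, fails alpha1
prod = direct_product(left_zero, right_zero)

print(gamma(left_zero), gamma(right_zero))  # True True
print(gamma(prod))                          # False
```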

In general, however, an equation system \(\rho \in \Sigma \) may have any finite number of alternations of quantifiers. Since satisfaction for such a sentence is defined recursively on the string of quantifiers, the above argument needs to be carried out by induction on the number of quantifiers, through the stages outlined in the previous discussion. This technical argument does not, however, require any addition to the proof strategy presented in the previous paragraph. The complete argument is given in [8, Theorem 3.1].

3.2 The dual variety theorem for semigroups

An EHP-class \({\mathscr {C}}\) defined without the use of the \(\exists \) quantifier is a variety, and in particular the class is closed under the taking of subalgebras. (Birkhoff’s theorem says that a class of algebras \({\mathscr {C}}\) is defined by a countable list of identities if and only if \({\mathscr {C}}\) is closed under the operator HSP.) On the other hand, if the class is defined without the use of the \(\forall \) symbol then the class is closed under the taking of superalgebras, meaning that if \(A\in {\mathscr {C}}\) and \(A\le B\), where B is an algebra in the defining signature of the algebra class under consideration, then \(B\in {\mathscr {C}}\) also.

Here we prove the converse for the class of Semigroups: if an EHP-class \({\mathscr {C}}\) is closed under the taking of superalgebras it follows that \({\mathscr {C}}\) may be defined by equations of the type \((\exists \,x_{1},\dots ,x_{n}):(\bigwedge _{1\le i\le m}u_i(x_{1},\dots ,x_{n})=v_i(x_{1},\dots ,x_{n}))\).

Definition 3.1

An equation system is called existential if it has no instances of the \(\forall \) quantifier. An EHP-class is called existential if it has a basis of existential equation systems.

In the model theory literature what we call here an existential equation system is known as a primitive positive sentence. First let us suppose that \({\mathscr {C}}\) is an EHP-class that is closed under the taking of containing algebras. Suppose that A is an algebra containing a trivial (one-element) subalgebra, T. Then as \(T\in {\mathscr {C}}\) we have that \(A\in {\mathscr {C}}\) by the containment property. It follows that if our algebra is of a type where every algebra contains a one-element algebra, such as Monoids or Groups, then \({\mathscr {C}}\) is the EHP class of all algebras of the type under consideration. Moreover, in this context all algebras satisfy all existential equation systems. It follows that the converse is trivially true as a class closed under the taking of containing algebras and a class defined by a basis of existential equation systems are both necessarily equal to the class of all algebras.

However, within the class of Semigroups, there are (infinite) semigroups that are idempotent-free, and so the previous observation does not apply. For example the equation \((\exists x):x=x^{2}\) defines the EHP-class of all semigroups that contain an idempotent. This is a proper class of semigroups that is contained in every existentially defined EHP-class of semigroups.
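The containment just mentioned rests on a simple observation: if e is an idempotent of a semigroup, then sending every variable to e satisfies any existential equation system, since every word then evaluates to e. The following Python sketch illustrates this on a finite example. The helper names (`evaluate`, `idempotents`, `satisfied_at_idempotent`), the encoding of words as tuples of variable names, and the test semigroup are our own illustrative assumptions and are not taken from [8].

```python
def evaluate(word, assignment, mul):
    """Multiply out a nonempty word (a tuple of variable names) under an assignment."""
    value = assignment[word[0]]
    for letter in word[1:]:
        value = mul(value, assignment[letter])
    return value


def idempotents(elements, mul):
    """List the elements e with e * e == e."""
    return [e for e in elements if mul(e, e) == e]


def satisfied_at_idempotent(system, e, mul):
    """Check an existential system (a list of pairs of words) with every
    variable sent to the single element e."""
    variables = {x for u, v in system for x in u + v}
    assignment = {x: e for x in variables}
    return all(
        evaluate(u, assignment, mul) == evaluate(v, assignment, mul)
        for u, v in system
    )


# Illustrative semigroup (an assumption): the integers modulo 6 under multiplication.
def mul6(a, b):
    return (a * b) % 6


# Illustrative system: (exists x, y): xyx = yx and xx = y.
example_system = [(("x", "y", "x"), ("y", "x")), (("x", "x"), ("y",))]
results = [satisfied_at_idempotent(example_system, e, mul6)
           for e in idempotents(range(6), mul6)]
```

Every entry of `results` is true, as the argument predicts. An idempotent-free semigroup such as the positive integers under addition admits no such uniform witness, which is why the dual variety theorem is not trivial for semigroups.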

Theorem 3.2

An EHP-class of semigroups \({\mathscr {C}}\) is existential if and only if \({\mathscr {C}}\) is closed under the taking of containing semigroups. Equivalently, \({\mathscr {C}}\) is closed under the taking of codomains of homomorphisms.

Before we embark on the main proof we verify that the two conditions in the statement are equivalent. Suppose that the EHP-class \({\mathscr {C}}\) is closed under the taking of containing semigroups and let \(S\in {\mathscr {C}}\) with \(\alpha :S\rightarrow T\) a homomorphism. Since \({\mathscr {C}}\) is closed under the taking of homomorphic images, \(S\alpha \in {\mathscr {C}}\), and since \(S\alpha \le T\) it then follows that \(T\in {\mathscr {C}}\). Conversely suppose that \({\mathscr {C}}\) is closed under the taking of codomains of homomorphisms and suppose that \(S\le T\) with \(S\in {\mathscr {C}}\). We take \(\alpha :S\rightarrow T\) to be the inclusion mapping of S into T, whence by the given condition \(T\in {\mathscr {C}}\).

Most of the remainder of the section is devoted to the proof of Theorem 3.2, which is completed after some preliminary lemmas and discussion. The main challenge of the proof is facilitating a kind of quantifier elimination, achieved using the free product construction. For any semigroup S, we consider the free product \(F*S\) of S with the free semigroup \(F=F_A\) on a countably infinite alphabet \(A=\{A_1,A_2,\dots \}\).

Consider an arbitrary equation system \(\varepsilon \) satisfied by \(F*S\): a quantified system \(p_1=q_1\wedge \dots \wedge p_\ell =q_\ell \). As \(\varepsilon \) is satisfied, for every evaluation of the parameters (universally quantified variables in \(\varepsilon \)) in \(F*S\), we may find witnesses to the existentially quantified variables, with the choice of each witness being made on the basis of prior quantified variables. As the parameters can be chosen without restriction, we are going to adopt the strategy that each parameter is chosen to be a free generator from the set A that has not appeared within the evaluation of any variable quantified before it: we refer to this as the free dependency condition. Thus if we have \(\forall a_1\exists x_1\forall a_2\), we choose \(a_1\) to be \(A_1\) and if we have chosen the witness \(x_1\) to evaluate as \(A_2A_1sA_5\) for some \(s\in S\), then we will choose \(a_2\) to take the value \(A_3\) (or any other free generator except for those already in use: \(A_1,A_2,A_5\) in the example). Without loss of generality however, it is clear that we may rename the free generators so that each parameter \(a_i\) is assigned the free generator \(A_i\): in the example just given we could rename \(A_2\) and \(A_3\) and choose the witness \(A_3A_1sA_5\) for \(x_1\) and then choose \(a_2\mapsto A_2\). We refer to this as an instance of a canonical evaluation of parameters, and we say that \(F*S\) satisfies \(\varepsilon \) under the canonical evaluation of parameters to mean that witnesses to the existential variables can be made to achieve equality \(p_1=q_1,\dots , p_\ell =q_\ell \) under the evaluation (and satisfying the free dependency condition). The following lemma holds in any variety within any signature of algebras, replacing “free semigroup” with a relatively free algebra in the variety. We use U to denote a semigroup instead of S to match later usage of the lemma.

Lemma 3.3

Let \(\varepsilon \) be an equation system with parameters \(a_1,\dots , a_p\) and U be a semigroup. Let \(F=F_A\) be the denumerably generated free semigroup with free generators \(A=\{A_1,A_2,\dots \}\). If \(F*U\) satisfies \(\varepsilon \) under the canonical parameter evaluation, then U satisfies \(\varepsilon \).

Proof

Every function \(\phi :A\rightarrow U\) extends to a retraction from \(F*U\) onto the subsemigroup U. We have witnesses \(X_1,\dots , X_q\in F*U\) for the canonical evaluation of parameters \(a_1,\dots ,a_p\) as \(A_1,\dots ,A_p\) satisfying the free dependency condition. Let \(\gamma \) be any evaluation of the parameters of \(\varepsilon \) in U, and let \(\phi :A\rightarrow U\) be \(\phi (A_i):=\gamma (a_i)\). We may extend \(\phi \) to a retraction onto U, and use witnesses \(\phi (X_1),\dots ,\phi (X_q)\) in U to verify satisfaction of \(\varepsilon \). \(\square \)

Note that the free dependency condition is used only at the final step of this proof: if \(x_i\) is (existentially) quantified prior to (the universally quantified) \(a_j\), we should have that the choice of \(\phi (X_i)\) can yield satisfaction of \(\varepsilon \) for all subsequent evaluations of \(a_j\). But if \(X_i\) contained an occurrence of \(A_j\), then the value of \(\phi (X_i)\) would in general depend on the evaluation \(\gamma (a_j)=\phi (A_j)\) of \(a_j\).
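The retraction used in the proof of Lemma 3.3, and the role of the free dependency condition, can be illustrated by a small Python sketch. Elements of \(F*U\) are encoded as lists of tokens, `("A", i)` for the free generator \(A_i\) and `("U", u)` for an element u of U. This encoding, the function name `retract`, and the choice of U as the integers modulo 7 under multiplication are illustrative assumptions only.

```python
from functools import reduce


def retract(element, phi, mul):
    """Image in U of an element of F*U under the homomorphism extending phi.

    element: nonempty list of tokens ("A", i) or ("U", u)
    phi:     dict sending the index i of a free generator A_i to an element of U
    mul:     the multiplication of U
    """
    values = [phi[name] if kind == "A" else name for kind, name in element]
    return reduce(mul, values)


# Illustrative choice of U (an assumption): integers modulo 7 under multiplication.
def mul7(a, b):
    return (a * b) % 7


X = [("A", 1), ("U", 3), ("A", 2)]   # a witness that involves A_2
Y = [("U", 4)]                       # an element of U itself

first = retract(X, {1: 2, 2: 5}, mul7)    # gamma(a_2) = 5
second = retract(X, {1: 2, 2: 1}, mul7)   # gamma(a_2) = 1
```

Here `first` and `second` differ, so a witness containing \(A_2\) has an image that changes with the evaluation of \(a_2\). This is exactly what the free dependency condition forbids for a variable quantified before \(a_2\).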

Every element of \(F*S\) may be written uniquely in the form \(f_{1}s_{1}f_2\dots s_{k-1}f_{k}\), where each \(f_i\) is an element of \(F_A\), each \(s_i\) is an element of S and \(f_1\) and possibly \(f_k\) could be empty (though note that if \(k=1\), then this expression is simply \(f_1\) and then \(f_1\) cannot be empty). We refer to this as the normal form for an element of \(F*S\). If \(z_1\dots z_p\) is an arbitrary semigroup word (possibly with repeats in the sequence of letters \(z_1,\dots ,z_p\)), then under any evaluation of the letters \(\{z_1,\dots ,z_p\}\) into \(F*S\), the word \(z_1\dots z_p\) gives rise to a product of normal forms

$$\begin{aligned} \prod _{1\le i\le p}(f_{i,1}s_{i,1}f_{i,2}\dots s_{i,k_i-1}f_{i,k_i}). \end{aligned}$$
(22)

Depending on the value of \(k_i\), and on whether \(f_{i,1}\) or \(f_{i,k_i}\) are empty, the product in (22) may give rise to sequences of consecutive instances of elements of S; a maximal block of consecutive elements of S that arises from such a product will be called an S-run. So for example, if letters \(z_1,z_2,z_3,z_4\) are evaluated in \(F*S\) as \(A_1s_1\), \(s_2A_2s_1A_3s_3\), \(s_2\) and \(s_1A_1\) respectively, then the word \(z_1z_2z_3z_4\) evaluates as \((A_1s_1)(s_2A_2s_1A_3s_3)(s_2)(s_1A_1)\), and we have S-runs \(s_1s_2\), \(s_1\) and \(s_3s_2s_1\). These of course collapse to individual elements of S in the reduction of (22) to normal form, but we are interested in the uncollapsed form. For each S-run we may also create an abstract run, which is a matching semigroup word in the alphabet \(\{x_s\mid s\in S\}\) of variables indexed by elements of S; so the S-run \(s_3s_2s_1\) becomes \(x_{s_3}x_{s_2}x_{s_1}\).
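The extraction of S-runs and abstract runs from an uncollapsed product can be sketched in Python as follows. The token encoding (`("A", name)` for a free generator, `("S", name)` for an element of S) and the function names `s_runs` and `abstract_run` are hypothetical choices made for this illustration.

```python
def s_runs(evaluated_letters):
    """Return the maximal blocks of consecutive elements of S in the
    uncollapsed product of the given evaluated letters.

    evaluated_letters: list of token lists, one per letter of the word.
    """
    runs, current = [], []
    for factor in evaluated_letters:
        for kind, name in factor:
            if kind == "S":
                current.append(name)
            else:
                # A free generator ends the current run, if there is one.
                if current:
                    runs.append(current)
                current = []
    if current:
        runs.append(current)
    return runs


def abstract_run(run):
    """Replace each element s of an S-run by the variable name x_s."""
    return ["x_" + s for s in run]


# The evaluation of z_1, z_2, z_3, z_4 from the example above.
z1 = [("A", "A1"), ("S", "s1")]
z2 = [("S", "s2"), ("A", "A2"), ("S", "s1"), ("A", "A3"), ("S", "s3")]
z3 = [("S", "s2")]
z4 = [("S", "s1"), ("A", "A1")]

runs = s_runs([z1, z2, z3, z4])
abstract = [abstract_run(r) for r in runs]
```

On this input `runs` consists of the three S-runs \(s_1s_2\), \(s_1\) and \(s_3s_2s_1\) found in the text.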

If \(F*S\) satisfies an equation system \(\varepsilon \) under the canonical evaluation of parameters, then the chosen witnesses in \(F*S\) will provide a collection of equalities between S-runs. More precisely, each equality in \(\varepsilon \) produces an equality in \(F*S\) of the form

$$\begin{aligned} f_{1}S_{1}f_2\dots S_{k_i-1}f_{k_i} = g_{1}T_{1}g_2\dots T_{\ell _i-1}g_{\ell _i} \end{aligned}$$
(23)

where \(S_1,\dots , S_{k_i-1}\) and \(T_1,\dots ,T_{\ell _i-1}\) are S-runs. As this is an equality holding in the free product, it follows that \(k_i=\ell _i\) and that \(S_1=T_1, \dots , S_{k_i-1}=T_{k_i-1}\) in S (and that \(f_i=g_i\) are identical as words in \(A^+\), or empty). For such a choice c of witnesses to the canonical evaluation, let \(\varepsilon _c\) denote the existential equation system consisting of the conjunction of the equalities arising from the resulting abstract S-runs, across the witnessing evaluations of all the equalities \(p_1=q_1\wedge \dots \wedge p_\ell =q_\ell \). As an example, consider the equation system

$$\begin{aligned} \varepsilon :(\forall a_1)(\exists x_1)( \forall a_2)(\exists x_2\exists x_3):(a_1a_2x_1x_2x_3a_1=a_1x_2a_1a_2x_3x_1 \wedge a_1x_3a_1=a_1x_3x_3a_1), \end{aligned}$$

and consider a semigroup S in which the canonical evaluation of \(a_1,a_2\) into \(F*S\) has choices for \(x_1,x_2,x_3\) satisfying the free dependency condition that lead to satisfaction of the equalities: an example might be \(X_1:=s_1A_1\), \(X_2:=A_2s_1\) and \(X_3:=s_2\). (Note that as \(x_1\) is quantified prior to \(a_2\) the free dependency condition requires that the word \(X_1\) should not involve \(A_2\), because the value of \(x_1\) should not in general depend on the choice of \(a_2\).) In this hypothetical scenario, we have

$$\begin{aligned} (A_1)(A_2)(s_1A_1)(A_2s_1)(s_2)(A_1)=(A_1)(A_2s_1)(A_1)(A_2)(s_2)(s_1A_1) \end{aligned}$$

and

$$\begin{aligned} (A_1)(s_2)(A_1)=(A_1)(s_2)(s_2)(A_1). \end{aligned}$$

From the first equality we find that \(s_1=s_1\) and \(s_1s_2=s_2s_1\). From the second equality we find that \(s_2=s_2s_2\). Then for this choice c we obtain \(\varepsilon _c\) as

$$\begin{aligned} (\exists x_{s_1}\exists x_{s_2}):(x_{s_1}=x_{s_1}\wedge x_{s_1}x_{s_2}=x_{s_2}x_{s_1}\wedge x_{s_2}=x_{s_2}x_{s_2}). \end{aligned}$$

Obviously the equality \(x_{s_1}=x_{s_1}\) here is redundant and could be removed. One can further see that \(x_{s_1}\) could be replaced by \(x_{s_2}\) in this sentence, so that \(\varepsilon _c\) is logically equivalent to \((\exists x)\ x=x^2\).
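The passage from a choice of witnesses to the equalities between S-runs can be mechanised. The Python sketch below substitutes the witnesses of the worked example into both sides of each equality, checks that the free parts agree, and pairs up the S-runs. The token encoding and the names `split` and `run_equalities` are assumptions made for illustration. The sketch only reads off the equalities that \(\varepsilon _c\) records and does not test whether they hold in a particular S.

```python
def split(tokens):
    """Split a token list into its free skeleton and its list of S-runs.
    Each run is marked in the skeleton by the placeholder "*"."""
    skeleton, runs, current = [], [], []
    for kind, name in tokens:
        if kind == "S":
            current.append(name)
        else:
            if current:
                runs.append(tuple(current))
                skeleton.append("*")
                current = []
            skeleton.append(name)
    if current:
        runs.append(tuple(current))
        skeleton.append("*")
    return skeleton, runs


def run_equalities(system, evaluation):
    """Pairs of S-runs that must be equal in S for the evaluation to
    satisfy every equality of the system in the free product."""
    pairs = []
    for lhs, rhs in system:
        left = split([t for z in lhs for t in evaluation[z]])
        right = split([t for z in rhs for t in evaluation[z]])
        if left[0] != right[0]:
            raise ValueError("free parts differ, so equality is impossible")
        pairs.extend(zip(left[1], right[1]))
    return pairs


# The equation system and witnesses of the worked example.
system = [
    (("a1", "a2", "x1", "x2", "x3", "a1"), ("a1", "x2", "a1", "a2", "x3", "x1")),
    (("a1", "x3", "a1"), ("a1", "x3", "x3", "a1")),
]
evaluation = {
    "a1": [("A", "A1")],
    "a2": [("A", "A2")],
    "x1": [("S", "s1"), ("A", "A1")],
    "x2": [("A", "A2"), ("S", "s1")],
    "x3": [("S", "s2")],
}
pairs = run_equalities(system, evaluation)
```

The resulting `pairs` are \(s_1=s_1\), \(s_1s_2=s_2s_1\) and \(s_2=s_2s_2\), matching the three conjuncts of \(\varepsilon _c\) displayed above.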

Lemma 3.4

Let \(\varepsilon \) be an equation system satisfied by \(F*S\) under the canonical parameter evaluation, for some semigroup S. Then for any choice c of witnesses in \(F*S\) to the satisfaction of the equalities in \(\varepsilon \) we have \(S\models \varepsilon _c\) and \(\varepsilon _c\vdash \varepsilon \) within the class of semigroups.

Proof

Fix a choice c of witnesses, \(x_i\mapsto X_i\in F*S\). The construction of \(\varepsilon _c\) from this choice trivially ensures that \(S\models \varepsilon _c\), which proves the first claim. Now assume that U is any semigroup satisfying \(\varepsilon _c\); we show that \(U\models \varepsilon \). We may find witnesses to the canonical evaluation of parameters in \(F*U\), as the witnesses to satisfaction of \(\varepsilon _c\) in U provide the required U-runs to match those from S in the satisfaction of \(\varepsilon \) in \(F*S\) under the canonical parameter evaluation: note that these will not violate the free dependency condition for existential witnesses, as they only provide the required U-runs, with any free parameters chosen in the evaluation of a variable \(x_i\) following whatever was done in S for the choice c (which satisfied the free dependency condition). Then Lemma 3.3 shows that U satisfies \(\varepsilon \). \(\square \)

Referring to the example given prior to Lemma 3.4, where \(\varepsilon _c\) was logically equivalent to \((\exists x)\ x=x^2\), the statement \(\varepsilon _c\vdash \varepsilon \) in Lemma 3.4 is asserting that any semigroup containing an idempotent will satisfy

$$\begin{aligned} (\forall a_1)(\exists x_1)(\forall a_2)(\exists x_2\exists x_3):(a_1a_2x_1x_2x_3a_1=a_1x_2a_1a_2x_3x_1\wedge a_1x_3a_1=a_1x_3x_3a_1). \end{aligned}$$

For a given equation system \(\varepsilon \) and any S for which \(F*S\) satisfies \(\varepsilon \) under the canonical parameter evaluation, the maximal length of any S-run is bounded by the maximal number of consecutively adjacent existential variables within a word occurring in \(\varepsilon \). Thus there is an upper bound \(\ell \) on the number of S-runs of length more than 1 that appear in \(\varepsilon _c\) (independent of S and c). Because equalities between S-runs of length 1 are trivial (they yield equalities \(x_s=x_s\) for some \(s\in S\)), the number of nontrivial conjuncts within \(\varepsilon _c\) is at most \(\ell \) also, so that up to logical equivalence, there are only finitely many different existential equation systems of the form \(\varepsilon _c\), with the number determined by the structure of the sentence \(\varepsilon \) only. We let \(H(\varepsilon )\) denote this (finite) set of existential sentences. The case of \(H(\varepsilon )=\varnothing \) is possible, and corresponds to the situation where there are no nontrivial S-runs in any canonical evaluation, for any S. In this situation, note that \(\varepsilon \) is trivially equivalent to the equation system \(\varepsilon '\) obtained by including in \(\varepsilon \) the conjunct \(x=x\), where x is a new (existentially quantified) variable. Because any element of \(F*S\) will satisfy \(x=x\), we have the same witnesses as previously for \(\varepsilon \) (which by assumption all avoided any S-runs), along with an arbitrary witness for x. Thus \(H(\varepsilon ')=\{(\exists x):x=x\}\), so that we may let \(\varepsilon _c\) denote the equation system \((\exists x):x=x\).
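The bound on the length of S-runs stated at the start of the preceding paragraph depends only on the words of \(\varepsilon \), and so can be read off mechanically. The following Python sketch computes it; the function names and the encoding of words as tuples of variable names are illustrative assumptions.

```python
def max_existential_block(word, existential):
    """Length of the longest block of consecutive existential variables in a word."""
    best = current = 0
    for letter in word:
        current = current + 1 if letter in existential else 0
        best = max(best, current)
    return best


def run_length_bound(system, existential):
    """Maximum of max_existential_block over all words occurring in the system."""
    return max(
        max_existential_block(word, existential)
        for equality in system
        for word in equality
    )


# The equation system used in the example preceding Lemma 3.4.
system = [
    (("a1", "a2", "x1", "x2", "x3", "a1"), ("a1", "x2", "a1", "a2", "x3", "x1")),
    (("a1", "x3", "a1"), ("a1", "x3", "x3", "a1")),
]
bound = run_length_bound(system, {"x1", "x2", "x3"})
```

For this system the bound is 3, attained by the factor \(x_1x_2x_3\) of the first word. The witnesses chosen in the worked example produced S-runs of length at most 2, consistent with the bound.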

Proof of Theorem 3.2

Clearly an existential class is closed under the taking of containing semigroups. Conversely let us suppose that \({\mathscr {C}}\) is an EHP-class of semigroups that is closed under the operation of taking containing semigroups.

Let \({\mathscr {B}}\) be the existential class with EHP basis \(E'\), which is the set of all existential equation systems that are satisfied by all members of \({\mathscr {C}}\). Clearly \({\mathscr {C}}\subseteq {\mathscr {B}}\), and indeed \({\mathscr {B}}\) is the smallest existential class that contains \({\mathscr {C}}\). Our task is to prove the reverse containment.

Let E be a set of equation systems characterising \({\mathscr {C}}\) and consider any \(\varepsilon \in E\). We will show that there is an equation system of the form \(\varepsilon _c\) that can replace \(\varepsilon \). Repeated application across all members of E leads to a subset of \(E'\), which will complete the proof.

As a first step we need to show that there is an equation system of the form \(\varepsilon _c\) that holds on all members of \({\mathscr {C}}\). We may assume that \(H(\varepsilon )\) is not empty. Assume for contradiction that for each \(\varepsilon _c\in H(\varepsilon )\) there is \(S_c\in {\mathscr {C}}\) that fails \(\varepsilon _c\). As \({\mathscr {C}}\) is closed under taking direct products we have that \(T:=\prod _{\varepsilon _c\in H(\varepsilon )}S_c\in {\mathscr {C}}\), and then as \({\mathscr {C}}\) is closed under taking containing semigroups we have that \(F*T\in {\mathscr {C}}\), so that \(F*T\models \varepsilon \). By Lemma 3.4, \(T\models \varepsilon _c\) for some choice c of witnesses to the canonical parameter evaluation. But then all quotients of T satisfy \(\varepsilon _c\); in particular \(S_c\), which is the image of T under a projection, satisfies \(\varepsilon _c\), contradicting the assumption that \(S_c\) fails \(\varepsilon _c\).

Thus for every \(\varepsilon \in E\) there is an existential equation system of the form \(\varepsilon _c\) such that \(\varepsilon _c\in E'\). By Lemma 3.4 we have that \(\varepsilon _c\vdash \varepsilon \), so that \(\varepsilon _c\) can replace \(\varepsilon \) in E. \(\square \)

The existential equations that are satisfied by every semigroup are explicitly identified in [8, Corollary 6.7]. To conclude this section we note that this may be extended, at least at the level of an algorithmic solution to arbitrary equation systems. Satisfying equations in \(F*S\) by canonical parameter evaluation is somewhat reminiscent of solving equations in free semigroups, a problem originally solved by Makanin [12], and of continued interest and development. The connection turns out to be genuine, and a strong form of Makanin’s algorithm can be used to show that it is decidable to determine if an equation system holds in the variety of all semigroups.

Theorem 3.5

The class of equation systems satisfied in the class of all semigroups is decidable.

Proof

As we now explain, Theorem 3.5 is a direct corollary of Makanin’s celebrated solution [12] to solvability of equations over free semigroups, as extended to allow for rational constraints (see Chapter 12 of [10]). In the context of Makanin’s algorithm, a system of equations on the free semigroup consists of a finite set of equalities \(\varepsilon \) between words in the alphabet \(\{A_1,A_2,\dots \}\cup \{X_1,X_2,\dots \}\) and we are asked whether there is a satisfying evaluation of the variables \(X_i\) in the free semigroup \(A^+\) (where \(A=\{A_1,A_2,\dots \}\) as before). In the absence of any constraint on the choice of the \(X_i\), this coincides with what we called the canonical parameter evaluation of the generators \(A_1,A_2,\dots \) (as themselves) in the equation system \((\forall A_1,\dots ,A_n)(\exists X_1,\dots ,X_m)\varepsilon \) (for suitable n and m determined by the variables that appear in \(\varepsilon \)). Lothaire [10, §12.1.8] details an extension of Makanin’s algorithm to allow for the variables \(X_1,X_2,\dots \) to be constrained by rational languages \(\lambda _1,\lambda _2,\dots \) over the alphabet A. This enables us to additionally enforce the free dependency condition for an equation system \(\varepsilon \): constrain each variable \(X_i\) to lie in the rational language \(\lambda _i\) excluding letters in A that are quantified in \(\varepsilon \) to the right of the existential quantification of \(X_i\). Satisfaction of this constrained instance of the equation problem coincides in definition precisely with satisfaction of \(\varepsilon \) under the canonical parameter evaluation with free dependency condition holding. But this latter property is equivalent to unconditional satisfaction of the equation system \(\varepsilon \): for the nontrivial direction, use the universal mapping property of \(A^+\) with respect to itself (or alternatively use Lemma 3.4, using the fact that \(A^+*A^+\cong A^+\)). Thus the extension of Makanin’s algorithm in [10, §12.1.8] can be used to decide satisfaction of arbitrary equation systems on \(A^+\).

Finally, the equation systems true on \(A^+\) (with A a denumerable alphabet) are precisely those in the class of all semigroups. One direction of this claim follows trivially from the fact that \(A^+\) is a semigroup. For the other direction, use the universal mapping property for \(A^+\) with respect to any other semigroup S to find that if \(\varepsilon \) is satisfied by \(A^+\) (under canonical parameter evaluation, with free dependency), then it is satisfied by S. \(\square \)

4 EHP-classes requiring both types of quantifier alternation

An open question in this new theory of EHP-classes is: For any n, does there exist an EHP-class \({\mathscr {C}}\) such that in any basis for \({\mathscr {C}}\), n alternations of the quantifiers \(\forall \) and \(\exists \) are necessary in at least one equation system of the basis?

Example 2.3(iii) of [8] is the EHP-class \({\mathscr {C}}\) defined by:

$$\begin{aligned} (\exists y)(\forall a)(\exists x,z):a=xyz, \end{aligned}$$

which is the class of semigroups S that have a maximum \(\mathscr {J}\)-class J such that \(S/(S-J)\) is not a null semigroup. It is proved there that \({\mathscr {C}}\) cannot be defined by equation systems exclusively of the type \((\forall \cdots )(\exists \cdots )\) nor by systems with the reverse order of quantifiers \((\exists \cdots )(\forall \cdots )\).

The strategy for proving this type of result is two-fold. To show that \((\forall \dots )(\exists \dots )\) quantification is not possible we find a chain of semigroups \(S_{1}\subseteq S_{2}\subseteq \dots \), with each \(S_{n}\in {\mathscr {C}}\), such that the semigroup union \(S=\cup _{n=1}^{\infty }S_{n}\not \in {\mathscr {C}}\). It follows from this that \({\mathscr {C}}\) cannot be captured by an equation system based on a \((\forall \dots )(\exists \dots )\) quantification as that would imply that \(S\in {\mathscr {C}}\) as well. (This phenomenon we have already observed in Sect. 2 in the context of the EHP-class of Monoids.)

Next we wish to show that definition by a \((\exists \dots )(\forall \dots )\) quantification for \({\mathscr {C}}\) is also impossible. This is done by identifying a semigroup chain \(S_{1}\subseteq S_{2}\subseteq \dots \), where no member of the chain lies in \({\mathscr {C}}\), yet their union \(S=\cup _{n=1}^{\infty }S_{n}\) is a semigroup in \({\mathscr {C}}\). Given that \(S\in {\mathscr {C}}\), if \({\mathscr {C}}\) possessed a quantification of the form \((\exists \dots )(\forall \dots )\), it would follow that \(S_{n}\in {\mathscr {C}}\) for all sufficiently large n.

Our Example 4.1 is complementary to the previous one in that it is defined by quantification of the type \((\forall \dots )(\exists \dots )(\forall \dots )\).

We are working with the class of Semigroups throughout and all quantifications are assumed to take place in some arbitrary semigroup S. Letters at the front of the alphabet, a, b, c, ... (resp. at the end, x, y, z), denote parameters (resp. variables) in a given equation, typically written \(e:p=q\).

Example 4.1

Let \({\mathscr {C}}\) be the EHP-class defined by

$$\begin{aligned} {\mathscr {C}}:\,(\forall a)(\exists x)(\forall b):axb=abx. \end{aligned}$$
(24)

Then \({\mathscr {C}}\) cannot be defined by equation systems of the type \((\forall \cdots )(\exists \cdots )\) nor by systems with the reverse order of quantifiers \((\exists \cdots )(\forall \cdots )\).

Proof

We prove the claim by applying the strategy we have just outlined.

Let \(F_{A}\) be the free semigroup on the infinite set of generators

$$\begin{aligned} A=\{a_{1},a_{2},\dots ,x_{1},x_{2},\dots \}. \end{aligned}$$

Denote the finite subset \(\{a_{1},\dots ,a_{n},x_{1},\dots ,x_{n}\}\) of A by \(A_{n}\). Let \(\rho \) be the congruence on \(F_{A}\) generated by

$$\begin{aligned} \rho ^{0}=\{(b_{i}x_{j},x_{j}b_{i}):\,\text { where } {\,b_{i}\in \{a_{i},x_{i}\},\,} i\le j\}, \end{aligned}$$

and put \(S=F_{A}/\rho \). Let \(S_{n}\le S\), where \(S_{n}=\langle a_{1}\rho ,\dots ,a_{n}\rho ,x_{1}\rho ,\dots ,x_{n}\rho \rangle \). In the following argument we suppress the symbol \(\rho \) and write \(a\in S_{n},b\in S\) and such like to stand for \(a\rho \in S_{n},b\rho \in S\).

Thus we have a semigroup chain \(S_{1}\le S_{2}\le \dots \le S_{n}\le \dots \le S\). Observe that \(S_{n}\in {\mathscr {C}}\) as for any \(a,b\in S_{n}\) we put \(x=x_{n}\) and note that \(axb=abx\) as \(x_{n}\) commutes with each member of \(A_{n}\). However we now show that \(S\not \in {\mathscr {C}}\). Put \(a=a_{1}\) and let \(x\in S\). Then \(x\in S_{n}\) for some n. Put \(b=a_{n+1}\) and consider \(axb=a_{1}xa_{n+1}\). Since \(a_{n+1}\) commutes only with \(x_{j}\) where \(j\ge n+1\), we infer that any word \(w\in F_{A}\) such that \(a_{1}xa_{n+1}=w\) in S has the form \(w=w_{1}a_{n+1}\) where \(w_{1}\) is the result of a permutation of the letters of \(a_{1}x\). In particular \(a_{1}xa_{n+1}\ne a_{1}a_{n+1}x\) in S. Hence S does not satisfy the equation system (24). Therefore \({\mathscr {C}}\) cannot be defined by equations using only \((\forall \dots )(\exists \dots )\) quantification.
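Individual instances of the computations in this paragraph can be checked by machine. Since every pair in \(\rho ^{0}\) swaps two adjacent letters, each congruence class is a finite set of rearrangements of a word and can be enumerated by breadth-first search. The Python sketch below does this; the encoding of \(a_i\) as `("a", i)` and \(x_i\) as `("x", i)`, and the function names, are illustrative assumptions. It tests particular choices of a, x and b only and is not a substitute for the argument above.

```python
from collections import deque


def commute(p, q):
    """True if the adjacent letters p, q may be swapped by a pair of rho^0,
    that is, one of them is x_j and the other has index i <= j."""
    return (q[0] == "x" and p[1] <= q[1]) or (p[0] == "x" and q[1] <= p[1])


def congruence_class(word):
    """All words equal to the given word in S, found by breadth-first search."""
    start = tuple(word)
    seen, queue = {start}, deque([start])
    while queue:
        w = queue.popleft()
        for k in range(len(w) - 1):
            if commute(w[k], w[k + 1]):
                v = w[:k] + (w[k + 1], w[k]) + w[k + 2:]
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return seen


def equal_in_S(u, v):
    return tuple(v) in congruence_class(u)


# Inside S_2: a = a_1 a_2, b = a_2 x_1, with witness x = x_2.
a = [("a", 1), ("a", 2)]
b = [("a", 2), ("x", 1)]
x = [("x", 2)]
inside_S2 = equal_in_S(a + x + b, a + b + x)

# In S: a = a_1, x = x_2 (an element of S_2), b = a_3.
a1, x2, a3 = [("a", 1)], [("x", 2)], [("a", 3)]
in_S = equal_in_S(a1 + x2 + a3, a1 + a3 + x2)
```

The first check succeeds, in line with \(S_{2}\in {\mathscr {C}}\), while the second fails because \(a_3\) cannot be moved past \(x_2\).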

Next we wish to show that definition by a \((\exists \dots )(\forall \dots )\) quantification for \({\mathscr {C}}\) is also impossible. This is done by identifying a semigroup chain \(S_{1}\le S_{2}\le \cdots \), where no member of the chain lies in \({\mathscr {C}}\), yet their union \(S=\cup _{n=1}^{\infty }S_{n}\) is a semigroup in \({\mathscr {C}}\). The existence of such a chain together with the chain of the previous paragraph establishes our claim.

Let \(F_{n}\) be the free semigroup on \(A_{n}=\{a_{1},\dots ,a_{n},x_{1},\dots ,x_{n}\}\) \((n\ge 1)\) and let \(F_{A}\) denote the free semigroup on \(A=\cup _{n=1}^{\infty }A_{n}\). Let \(\rho \) be the congruence on \(F_{A}\) generated by \(\rho ^{0}\) where

$$\begin{aligned} \rho ^{0}=\{(ax_{n+1}b,\,abx_{n+1}),\,n\ge 1,\,a\in F_{n},\,b\in F_{A}\} \end{aligned}$$

and put \(S=F_{A}/\rho \). Then \(S\in {\mathscr {C}}\). To see this, take any \(a\rho \in S\). Then \(a\in F_{n}\) for some \(n\ge 1\). Take \(x=x_{n+1}\) and let \(b\rho \in S\). Then \((ax_{n+1}b,\,abx_{n+1})\in \rho ^{0}\), so that in S we have \(axb=abx\), and therefore \(S\in {\mathscr {C}}\).

Now define \(S_{n}=\{a\rho :a\in F_{n}\}\). Then \(S_{n}\le S\) and we have the semigroup chain \(S_{1}\le S_{2}\le \cdots \le S_{n}\le S_{n+1}\le \cdots \), with S being the union of this chain. Now take \(a=a_{n}\) and consider \(a\rho \), \(x\rho ,b\rho \in S_{n}\). Then \(a\rho x\rho b\rho =(a_{n}xb)\rho \). Since \(a_{n}\not \in F_{m}\) for any \(m<n\), the only \(\rho ^{0}\) pairs involving a word with initial letter \(a_{n}\) contain some letter \(x_{n+m}\not \in A_{n}\) \((m\ge 1)\). Hence \(a_{n}xb\ne a_{n}bx\) in \(S_{n}\), for any b that is not a power of x, from which we infer that \(S_{n}\) does not satisfy the equation system (24). We conclude that \(S_{n}\not \in {\mathscr {C}}\) for all \(n\ge 1\) but, as we have witnessed, \(S\in {\mathscr {C}}\).
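Single instances of equality and inequality in this second semigroup S can also be tested by enumerating congruence classes. A pair of \(\rho ^{0}\) moves a letter \(x_{n+1}\) past a nonempty word b, provided it is preceded by a nonempty word over \(A_{n}\). As the congruence is generated by applying such pairs to arbitrary factors, only the letter immediately before \(x_{n+1}\) matters. The Python sketch below uses this observation; the letter encoding and function names are illustrative assumptions, and only the specific instances shown are verified.

```python
from collections import deque


def neighbours(w):
    """Words obtained from w by applying one pair of rho^0, in either
    direction, to a factor of w. Letters are ("a", i) or ("x", i)."""
    out = set()
    for k in range(1, len(w)):
        kind, m = w[k]
        if kind != "x":
            continue
        # Forward: a x_m b -> a b x_m, where the letter before x_m lies in A_{m-1}.
        if w[k - 1][1] <= m - 1:
            for end in range(k + 2, len(w) + 1):
                out.add(w[:k] + w[k + 1:end] + (w[k],) + w[end:])
        # Backward: a b x_m -> a x_m b, with b nonempty and the last letter of a in A_{m-1}.
        for j in range(1, k):
            if w[j - 1][1] <= m - 1:
                out.add(w[:j] + (w[k],) + w[j:k] + w[k + 1:])
    return out


def equal_in_S(u, v):
    """Decide equality in S by breadth-first search over the class of u."""
    start, target = tuple(u), tuple(v)
    seen, queue = {start}, deque([start])
    while queue:
        w = queue.popleft()
        for v2 in neighbours(w):
            if v2 not in seen:
                seen.add(v2)
                queue.append(v2)
    return target in seen


# In S: a = a_2 lies in F_2, so x = x_3 is a witness; take b = a_5 x_1.
a2, x3, b = [("a", 2)], [("x", 3)], [("a", 5), ("x", 1)]
in_S = equal_in_S(a2 + x3 + b, a2 + b + x3)

# In S_2: a = a_2, x = x_1, b = a_1.
x1, a1 = [("x", 1)], [("a", 1)]
in_S2 = equal_in_S(a2 + x1 + a1, a2 + a1 + x1)
```

The first check succeeds and the second fails: no pair of \(\rho ^{0}\) applies to \(a_2x_1a_1\), so its congruence class is a singleton.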

We therefore conclude that \({\mathscr {C}}\) cannot be defined by a basis of equation systems of the type \((\forall \cdots )(\exists \cdots )\) nor of the type \((\exists \cdots )(\forall \cdots )\). \(\square \)
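For finite semigroups, membership in the class \({\mathscr {C}}\) of Example 4.1 can be decided by exhaustive search over the three quantifiers of (24). The following Python sketch does so for a semigroup given by its element set and multiplication. The function name `satisfies_24` and the three small test semigroups are our own illustrative choices.

```python
def satisfies_24(elements, mul):
    """Decide (forall a)(exists x)(forall b): a x b = a b x by brute force."""
    elements = list(elements)
    return all(
        any(
            all(mul(mul(a, x), b) == mul(mul(a, b), x) for b in elements)
            for x in elements
        )
        for a in elements
    )


# Left-zero semigroup: uv = u. Both sides of (24) evaluate to a.
def left_zero(u, v):
    return u


# Right-zero semigroup: uv = v. Then axb = b while abx = x.
def right_zero(u, v):
    return v


# A commutative example: integers modulo 4 under addition.
def add4(u, v):
    return (u + v) % 4


checks = {
    "left zero": satisfies_24(range(3), left_zero),
    "right zero": satisfies_24(range(3), right_zero),
    "Z_4 under addition": satisfies_24(range(4), add4),
}
```

The left-zero and commutative examples lie in \({\mathscr {C}}\), whereas the three-element right-zero semigroup does not, since no single x can equal every b.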