1 Introduction

This paper aims at studying the tomographical problem of reconstructing convex polyominoes from their orthogonal projections, using tools from the area of combinatorics on words.

Discrete tomography has its own mathematical theory mostly based on discrete mathematics. It shows connections with combinatorics and geometry, and the mathematical techniques developed in this area find applications in other scientific fields such as image processing (Shliferstein and Chien 1978), statistical data security (Irving and Jerrum 1994), biplane angiography (Prause and Onnasch 1996), and graph theory (Brlek and Frosini 2016; Frosini et al. 2013), to name a few. For a survey of the state of the art of discrete tomography we refer the reader to the books edited by Herman and Kuba (1999, 2007).

Interestingly, mathematicians were concerned with abstract formulations of these problems before the emergence of practical applications. Many problems of discrete tomography were first considered as combinatorial problems during the late 1950s and early 1960s. In 1957, Ryser (1963) and Gale (1957) gave a necessary and sufficient condition for a pair of vectors to be the row and column sums, later called horizontal and vertical projections, of an \(m \times n\) binary matrix, and they also defined an O(nm) time algorithm that constructs one such matrix. We refer the reader to an excellent survey on binary matrices with given row and column sums by Brualdi (1980). In general, the number of matrices sharing the same projections grows exponentially with their dimension, so in most practical applications some extra information is needed to achieve a solution as close as possible to the unknown starting object. Researchers therefore tackle the algorithmic challenge of limiting the class of possible solutions in different ways: increasing the number of projections (Gardner and Gritzmann 1997; Gardner et al. 1999), fixing the wideness of the unknown object (Dulio et al. 2015, 2016, 2017b) or adding geometrical information (mainly connectedness and convexity). Concerning the latter, among connected sets a dominant role is played by polyominoes, which are commonly defined as finite 4-connected sets of points of the integer lattice, considered up to translation.

In particular, convex polyominoes are very natural objects, as they can be viewed as the discrete counterpart of Euclidean convex sets. It is remarkable that several problems from various research areas about them remain open.

For instance, concerning enumeration, no exact result has been determined about them. Bodini et al. (2013) performed an asymptotic analysis to obtain a combinatorial symbolic description of convex polyominoes, to analyze their limit properties and to define a uniform sampler.

Our research follows the mainstream of studying the reconstruction of convex polyominoes from orthogonal projections. Different approaches to this problem have been considered in the past (see for example Balazs and Gara 2008), providing interesting results on several classes of polyominoes, and leaving the main problem still unsolved (see again Herman and Kuba 1999, 2007).

In particular, Barcucci et al. (2001) defined an interesting strategy for the reconstruction of polyominoes that are convex along the horizontal and vertical directions only, called hv-convex polyominoes, from the projections along those directions. Their polynomial time algorithm consists of two separate parts: it first reconstructs an internal hv-convex kernel of points which is common to all the convex polyominoes having the input projections; then, it expands the kernel maintaining the hv-convexity by means of a 2-SAT logic formula, one of whose valuations, representing a solution of the problem, can be computed in polynomial time. We underline that this reconstruction strategy provides one convex polyomino among the exponentially many that may satisfy a given pair of projections.

The kernel reconstruction iteratively uses four filling operations that have become quite common when dealing with convex sets. As a matter of fact, several studies following Barcucci et al. (2001) have been devoted to enhancing the efficiency of the four filling operations and to modifying their target, in order to speed up the reconstruction process (Gȩbala 1998) [a fifth operation was also introduced in Brunetti et al. (2001) and studied in Brunetti et al. (2006)] and to specialize the reconstruction to different subclasses of convex polyominoes (Brunetti et al. 2001).

Recently, the problem of reconstructing convex polyominoes from two projections has been approached by Gérard (2017), who considered the possibility of a direct extension of the second part of the strategy in Barcucci et al. (2001) on a convex kernel whose reconstruction can be performed in polynomial time (Graham 1972). As Gérard pointed out, such a direct approach has to manage, in general, complex relationships between points and needs, at first sight, a more complex logic formulation, no longer belonging to 2-SAT.

In our study, we show a possible way of performing the kernel expansion that uses an alternative characterization of convex polyominoes given by Brlek et al. in terms of combinatorial properties of words coding their contour (Brlek et al. 2009). This paper is an extended and enhanced version of Dulio et al. (2017a), where the authors provided a local geometrical property that allows the addition of one single point along the convex border of a kernel without losing its convexity. Here, we push this result further by first characterizing a set of suitable positions where one point at a time can be added, still preserving the convexity. Such a result provides a way of performing multiple expansions of a convex kernel, one or more points at a time, until reaching given horizontal and vertical projections, and following the reconstruction strategy of Barcucci et al. (2001). Then, we show some cases where the addition of one or more points causes a loss of convexity, thus forcing new points to be added to re-establish the property. The existence of these situations may prevent, in general, the convex kernel from reaching the desired horizontal and vertical projections and may cause a failure in the reconstruction process. Furthermore, in these cases, we show how convexity can be recovered without the use of a logic formula as suggested in Gérard (2017). Finally, an example of a class of convex polyominoes whose reconstruction can be performed in polynomial time by the defined kernel expansion is also presented.

The paper is organized as follows:

In Sect. 2 we present the problem of reconstructing (finite) sets of points from projections and we focus on hv-convex polyominoes, sketching the reconstruction strategy defined in Barcucci et al. (2001). Then, we introduce the notions of Christoffel and Lyndon words, which will be used in Sect. 3 to characterize convexity.

In Sect. 3 we characterize some positions in the contour of a polyomino where it is possible to add one or more points in order to maintain the convexity during the reconstruction process. Examples of single or double point additions that do not maintain the convexity are also shown. Finally, we provide a class of WN-paths that can be expanded by adding points obtained with the split operation.

The last section contains some comments on the presented results and some hints for future research that our study might open.

2 Preliminaries and known results

A planar discrete set S is a finite subset of points of the integer lattice \({\mathbb {Z}}^2\) considered up to translation, and it is commonly represented as a set of cells on a squared surface. The dimensions of the set are those of its minimal bounding rectangle, as shown in Fig. 1a.

A polyomino is a 4-connected discrete set of cells (see Fig. 1b–d). There is a vast literature on polyominoes, so for further definitions and results, we refer the interested reader to Guttmann (2009).

A column (resp. row) of a polyomino is the intersection between the polyomino and an infinite strip of cells whose centers lie on a vertical (resp. horizontal) line.

Several subclasses of interest were considered by putting on polyominoes constraints defined by the notion of convexity along different directions. In particular, considering the horizontal and vertical directions, it turns out that a polyomino is h-convex (resp. v-convex) if each of its rows (resp. columns) is connected (see Fig. 1b). A polyomino is hv-convex, if it is both h-convex and v-convex (see Fig. 1c).

Concerning the notion of convexity, there are different definitions in case of discrete sets of points that take care of pathological situations that may arise when continuous shapes are discretized and that are mainly due to the fact that the discretization process does not preserve connectedness or convexity. Since the present research considers polyominoes only, connectedness is assumed, so we consider a polyomino P to be convex, if its convex hull contains no integer points outside P, where the convex hull of P is defined as the intersection of all Euclidean convex sets containing P. Obviously, a convex polyomino is also hv-convex.
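The two notions of convexity recalled above can be checked mechanically on small examples. The following is a minimal sketch, assuming that a polyomino is given as a set of integer points (x, y), one per cell, and that 4-connectedness has already been verified; the function names `is_hv_convex`, `convex_hull`, `in_hull` and `is_convex` are illustrative and do not come from the cited works.

```python
def is_hv_convex(cells):
    """True if every row and every column of `cells` is a contiguous run."""
    rows, cols = {}, {}
    for x, y in set(cells):
        rows.setdefault(y, []).append(x)
        cols.setdefault(x, []).append(y)
    lines = list(rows.values()) + list(cols.values())
    return all(max(v) - min(v) + 1 == len(v) for v in lines)


def cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def in_hull(hull, p):
    """True if p lies inside or on the boundary of the hull."""
    if len(hull) == 1:
        return p == hull[0]
    if len(hull) == 2:  # degenerate hull: a segment
        a, b = hull
        return (cross(a, b, p) == 0
                and min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
                and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))
    return all(cross(hull[i], hull[(i + 1) % len(hull)], p) >= 0
               for i in range(len(hull)))


def is_convex(cells):
    """True if the convex hull contains no lattice point outside `cells`."""
    cells = set(cells)
    hull = convex_hull(cells)
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return all((x, y) in cells or not in_hull(hull, (x, y))
               for x in range(min(xs), max(xs) + 1)
               for y in range(min(ys), max(ys) + 1))


# A hypothetical five-cell hook: hv-convex, but (1, 1) lies in its hull.
hook = {(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)}
print(is_hv_convex(hook), is_convex(hook))
```

The hook is a small instance of the situation described for hv-convex polyominoes that are not convex: the lattice point (1, 1) lies on the hull edge joining (0, 0) and (2, 2) but is not a cell of the set.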

Fig. 1

a A discrete set and its representation as a set of cells inside the minimal bounding rectangle of dimensions \(7 \times 6\); b a v-convex polyomino; c a hv-convex polyomino that is not convex. The grey cell does not belong to the polyomino, but it is included in its convex hull; d a convex polyomino. The convex hull is represented and no cells outside the polyomino belong to it

2.1 Outline of discrete tomography

To each discrete set S of dimensions \(m \times n\), we can associate two integer vectors \(H=(h_{1},\ldots ,h_{m})\) and \(V=(v_{1},\ldots ,v_{n})\) such that for each \(1 \le i \le m\), \(1\le j \le n\), \(h_{i}\) and \(v_{j}\) are the number of cells of S which lie on row i and column j, respectively. We call the vectors H and V horizontal and vertical projections of S, respectively. As an example, the projections of the \(7\times 6\) discrete set in Fig. 1a are

$$\begin{aligned} H=(1,1,2,0,2,2,1) \,\,\,\, \text{ and } \,\,\,\,V=(1,3,2,1,1,1). \end{aligned}$$

One of the main aims in the field of discrete tomography is the faithful reconstruction of an unknown object, regarded as a discrete set of points at a certain resolution, from a set of projections along discrete directions. The existence of different sets of points sharing the same projections may render the whole process meaningless; hence the relevance of a priori information that may guide the reconstruction process toward an element, or a smaller set of elements, of a specific subclass of discrete sets.
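As a small illustration of the definition of H and V and of the ambiguity just mentioned, the following sketch computes the projections of a discrete set. It assumes that the set is stored as a binary matrix (a list of rows); the function name `projections` and the two matrices are hypothetical and are not taken from Fig. 1.

```python
def projections(matrix):
    """Row sums H and column sums V of a binary matrix given as a list of rows."""
    H = tuple(sum(row) for row in matrix)
    V = tuple(sum(col) for col in zip(*matrix))
    return H, V


# Two different discrete sets sharing the same projections.
A = [[1, 0],
     [0, 1]]
B = [[0, 1],
     [1, 0]]
print(projections(A), projections(B))
```

Both matrices have H = (1, 1) and V = (1, 1), so the projections alone cannot distinguish them.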

In particular, Barcucci et al. (2001) defined an algorithm that reconstructs an hv-convex polyomino compatible with a given pair of horizontal and vertical projections, if it exists, and that runs in \(O(m^4n^4)\) time, where \(m\times n\) are the dimensions of the polyomino. In Del Lungo et al. (1996), it has been proved that requiring the reconstructed object to belong to the class of hv-convex polyominoes does not guarantee its uniqueness.

The strategy of the algorithm is to let a preprocessed hv-convex kernel grow by adding points so as both to keep the desired convexity and to satisfy the projections. These additions are performed by coding the two constraints by a 2-SAT formula whose valuation can be obtained in polynomial time w.r.t. the number of clauses.

More precisely, on the input vectors H and V, the two-stage reconstruction can be described as follows:

Stage 1: According to each possible placement of the elements of the polyomino touching the minimal bounding rectangle, detect the cells that are common to all the hv-convex polyominoes having H and V as horizontal and vertical projections, say the kernel. At the same time, a common external area is also detected, called the shell;

Stage 2: Label each cell not yet assigned, i.e., that lies between the kernel and the shell, with a boolean variable whose value determines the inclusion or the exclusion of the cell in the polyomino. Finally, define a 2-SAT formula involving those variables that encodes both the constraints imposed by the projections and the hv-convexity. The valuations of the formula determine all the possible hv-convex polyominoes having H and V as projections, if any.

A possible approach to the reconstruction of convex polyominoes consists in modifying the above algorithm as follows: Stage 1 is enriched with a further operation that produces the convex hull of the detected kernel. This new step can be performed in polynomial time (Graham 1972).

In Stage 2, a different formula can be defined to encode the convexity constraint, whose valuations determine all the solutions of the convex polyomino reconstruction problem. As underlined by Gérard (2017), this formula may involve clauses with up to three literals (so belonging to 3-SAT), and a valuation is not known to be computable in polynomial time in general.

Our purpose is to provide some conditions to bypass the use of the 3-SAT formula in Stage 2 and perform the kernel expansion maintaining both the convexity constraint at each step, and the polynomiality of the whole process. These conditions rely on the possibility of defining the geometry of the border of a convex kernel by means of combinatorial properties of the related boundary word, as described below.

2.2 Notions of combinatorics of words related to discrete geometry

From Lothaire (1997) we borrow all the basic standard terminology in combinatorics on words: alphabet, word, length of a word, occurrence of a letter, factor, prefix, suffix, period, conjugate, primitive, reversal, palindrome, etc. The related notations will be recalled when used.

2.2.1 Christoffel words

In discrete geometry, the theory of Christoffel words, started in Christoffel (1875), has been considered in the last decades and has acquired a prominent role in the study of digital straightness; see for example Eckhardt (2001). Concerning the discretization of line segments: let a, b be two co-prime numbers; the lower Christoffel path of slope a/b is defined as the connected path in the discrete plane joining the origin O(0, 0) to the point (b, a) such that it is the nearest path strictly below the Euclidean line segment joining these two points, that is, there are no points of the discrete plane between the path and the line segment, see Fig. 2a.

Analogously, an upper Christoffel path is defined as the nearest path that lies above the line segment, as depicted in Fig. 2b. By convention, the Christoffel path is exactly the lower Christoffel path.

In this study, without loss of generality, we consider Christoffel paths whose point (b, a) lies in the first quadrant. To each such Christoffel path one can associate a word, called a Christoffel word, on the binary alphabet \(A=\{0,1\}\), such that the letter 0 is associated with a horizontal step, and the letter 1 is associated with a vertical step.

Example 1

Consider the line segment joining the origin O(0, 0) to the point (8, 5). We have \(a=5,b=8\) and \(n=a+b=13\). The Christoffel word of slope 5/8 is \(C(\frac{5}{8})= 0010010100101\), as represented in Fig. 2a.
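The word of Example 1 can be generated directly from the pair (a, b). The sketch below uses the classical modular description of Christoffel words (the i-th letter is 0 when adding a to i·a mod (a+b) does not wrap around, and 1 otherwise); the function name `christoffel` is an assumption of this illustration, and the upper word is obtained as the reversal of the lower one, consistently with the two words reported in the caption of Fig. 2.

```python
from math import gcd


def christoffel(a, b):
    """Lower Christoffel word of slope a/b (a vertical and b horizontal steps)."""
    n = a + b
    if n == 0 or gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    # Letter i is 0 when i*a mod n increases by a without wrapping, 1 otherwise.
    return "".join("0" if (i * a) % n + a < n else "1" for i in range(n))


lower = christoffel(5, 8)
upper = lower[::-1]          # reversal of the lower word
central = lower[1:-1]        # central part of the word
print(lower, upper, central == central[::-1])
```

For (a, b) = (5, 8) this yields the word 0010010100101 of Example 1, and its central part reads the same in both directions.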

The slope a/b of a Christoffel path can be obtained from the related Christoffel word w as \(\rho (w)=\frac{|w|_1}{|w|_0}\), where the notation \(|w|_x\) stands for the number of occurrences of the letter x in w. We further define \(\rho (\epsilon )=1\) and \(\rho (\frac{k}{0})=\infty \), for \(k>0\). We recall the following well-known property from Berstel et al. (2008):

Property 1

Any Christoffel word w of length greater than one can be written as \(w=0w'1\), where \(w'\) is a (possibly void) palindrome.

The part \(w'\) of w of the previous property is called the central part of w. Note that the lower and upper Christoffel words have the same central part.

Finally, we define the minimal point m(w) of a Christoffel word w to be the unique point of the related path that has maximum distance from the line segment (see Fig. 2a).

Fig. 2

The Lower a and Upper b Christoffel paths of the line segment of slope 5/8, and the minimal point m(w). The related Christoffel words are 0010010100101 and 1010010100100, respectively

The uniqueness of the minimal point of a Christoffel path can be obtained from the uniqueness of another point of the Christoffel path, namely the one that is closest to the related line segment. The standard factorization of a Christoffel word introduced by Borel and Laubie splits the word at this closest point:

Theorem 1

(Theorem 2 Borel and Laubie 1993). A Christoffel word w has a unique factorization \(w=uv\) (indicated as standard factorization), where u and v are both Christoffel words.

Later, Chuan (1997) defined the notion of palindromic factorization of a Christoffel word as its unique factorization into two palindromic subwords that always occurs at its minimal point.

The result can be obtained from Theorem 1 and Property 1: starting from the standard factorization \(w=uv\) of a Christoffel word, we can apply Property 1 to each of the Christoffel words w, u and v, and we obtain the result as sketched in Fig. 3.

Fig. 3

The words w, u and v are Christoffel words with palindromic central parts \(w_1\), \(u_1\), and \(v_1\), respectively. Since \(u_1\) and \(v_1\) are palindromes and \(u_1 10 v_1 = w_1\) is also a palindrome, we have \(w_1 = v_1 01 u_1\). This property of palindromes gives the palindromic factorization \(w=p_1 p_2\)

So, since the standard factorization uses the unique point of the Christoffel path closest to the line segment, by construction the palindromic factorization uses the furthest point, which turns out to be unique as well. The uniqueness of the minimal point represents a crucial result in our study.
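Both factorizations can be located by measuring how far each point of the path lies below the segment. The sketch below is an illustration under the following assumptions: the input is a Christoffel word of length at least two, the quantity a·x − b·y (with a and b the numbers of 1s and 0s) is used as a stand-in for the distance, to which it is proportional, and the function names are hypothetical.

```python
def offsets(w):
    """Values a*x - b*y after each proper prefix of w.

    The entry at index i - 1 refers to the prefix of length i; the value is
    proportional to the distance of the reached point from the line segment.
    """
    a, b = w.count("1"), w.count("0")
    x = y = 0
    values = []
    for letter in w[:-1]:
        if letter == "0":
            x += 1
        else:
            y += 1
        values.append(a * x - b * y)
    return values


def standard_factorization(w):
    """Cut w at the point of the path closest to the segment."""
    d = offsets(w)
    i = d.index(min(d)) + 1
    return w[:i], w[i:]


def palindromic_factorization(w):
    """Cut w at the minimal point, i.e. the point furthest from the segment."""
    d = offsets(w)
    i = d.index(max(d)) + 1
    return w[:i], w[i:]


w = "0010010100101"  # the Christoffel word of slope 5/8 of Example 1
print(standard_factorization(w))
print(palindromic_factorization(w))
```

On the word of Example 1 the first cut gives the Christoffel words 00100101 and 00101 (slopes 3/5 and 2/3), while the second gives the palindromes 00100 and 10100101.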

2.2.2 Lyndon words

The second relevant class of words that we consider is that of Lyndon words, introduced by Lyndon in 1954. Among many different characterizations (see Lothaire 1997), we present Lyndon words as those words that are strictly smaller than their proper conjugates with respect to the lexicographical order. By definition, we note that Lyndon words are always primitive, i.e., they cannot be expressed as a power of a strictly shorter word. Lyndon words became immediately very popular and, among others, they have applications in constructing bases in free Lie algebras and finding the lexicographically smallest or largest substring in a string.

The following factorization on Lyndon words is from Lothaire (1997):

Theorem 2

Every non-empty word w admits a unique factorization as a lexicographically decreasing sequence of Lyndon words \( w= w_1^{n_1}w_2^{n_2}\cdots w_k^{n_k}\), such that \(w_1>_l w_2>_l \ldots >_l w_k\), \(n_i \ge 1\) and \(w_i\) are Lyndon words for all \(1\le i\le k\).

As an example, consider the word \(w=0010101001001101001001\). Its Lyndon factorization is \((0010101)^1(001001101)^1(001)^2\). A linear time algorithm to factorize a word can be found in Duval (1983).
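The factorization of Theorem 2 can be computed with the linear time algorithm mentioned above. The following is a sketch of the usual textbook formulation of Duval's algorithm, not a transcription of the cited paper; the function name `lyndon_factorization` is an assumption, and repeated factors are returned one by one rather than as powers.

```python
def lyndon_factorization(s):
    """Factors of the Lyndon factorization of s, in order, with repetitions."""
    n = len(s)
    i = 0
    factors = []
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it is a prefix of a power of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


print(lyndon_factorization("0010101001001101001001"))
```

On the word of the example, the factors are 0010101, 001001101, 001 and 001, in agreement with the factorization given in the text.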

Lyndon words are minimal under cyclic shift (in their conjugacy class); cyclic shifts of Christoffel words are also studied, for instance, in Hegedus and Nagy (2016).

3 Adding points to a convex polyomino

The Freeman code associates to each polyomino its boundary word, see Freeman (1961), i.e., the word on a four letter alphabet \(A'=\{0,{\overline{0}},1,{\overline{1}}\}\) obtained by coding the path that clockwise follows the boundary of the cell representation of the polyomino starting from a specific point s. The letters \({\overline{0}}\) and \({\overline{1}}\) represent the horizontal step and the vertical step when travelled in the opposite directions with respect to 0 and 1, respectively. If the polyomino is hv-convex, we can identify four points W, N, E and S as the points where the polyomino’s boundary first touches the west, north, east and south sides of its minimal bounding rectangle, respectively, when moving clockwise along it.

If we choose to set \(s=W\), then the boundary word can be uniquely decomposed into four different paths joining the four extremal points W, N, E and S defined by their positions as depicted in Fig. 4. The path leading from W to N is called WN-path, and it is WN-convex if it is the WN-path of a convex polyomino. The notions of NE, ES, and SW paths and the related notions of NE, ES, and SW convexity can be similarly defined. If convex, each path uses at most two of the four steps of the Freeman alphabet.

Fig. 4

An hv-convex polyomino and its boundary word \(w=w_1 w_2 w_3 w_4\), where \(w_1=10100101\) is the WN-path, \(w_2= 00{\overline{1}}{\overline{1}}000{\overline{1}}0{\overline{1}}00{\overline{1}}0\) is the NE-path, \(w_3={\overline{1}}{\overline{1}}{\overline{0}}{\overline{0}}{\overline{1}}\) is the ES-path and \(w_4={\overline{0}}{\overline{0}}{\overline{0}}{\overline{0}}11 {\overline{0}}{\overline{0}}1 {\overline{0}}{\overline{0}}1{\overline{0}}{\overline{0}}{\overline{0}}\) is the SW-path
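The boundary word of Fig. 4 can be traced step by step to check that it describes a closed path. In the sketch below the overlined letters are written, purely as an assumption of this illustration, as L (for \({\overline{0}}\), a step to the left) and D (for \({\overline{1}}\), a step down); the function name `trace` is hypothetical as well.

```python
# Unit steps of the Freeman alphabet; "L" and "D" stand for the overlined 0 and 1.
STEPS = {"0": (1, 0), "1": (0, 1), "L": (-1, 0), "D": (0, -1)}


def trace(word, start=(0, 0)):
    """List of lattice points visited by the path coded by `word`."""
    x, y = start
    points = [(x, y)]
    for letter in word:
        dx, dy = STEPS[letter]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


w1 = "10100101"           # WN-path
w2 = "00DD000D0D00D0"     # NE-path
w3 = "DDLLD"              # ES-path
w4 = "LLLL11LL1LL1LLL"    # SW-path

points = trace(w1 + w2 + w3 + w4)
xs = [x for x, _ in points]
ys = [y for _, y in points]
closed = points[0] == points[-1]
width, height = max(xs) - min(xs), max(ys) - min(ys)
print(closed, width, height)
```

With the four words of the caption, the path returns to its starting point W and spans a bounding rectangle of 13 horizontal by 8 vertical unit steps.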

3.1 Perturbations on WN-convex paths

From now on, we will consider the WN-path only, since all the obtained results can be extended to the other three paths up to rotations. Brlek et al. (2009) characterized the boundary words of a convex polyomino using the combinatorial notions of Christoffel and Lyndon words:

Theorem 3

A word w is WN-convex if and only if its unique Lyndon factorization \(w_1^{n_1}w_2^{n_2}\dots w_k^{n_k}\) is such that all \(w_i\)’s are Christoffel words.
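Theorem 3 translates directly into a test on binary words. The following is a minimal sketch, assuming the modular construction of Christoffel words and the textbook form of Duval's algorithm; all function names are illustrative and are not taken from Brlek et al. (2009).

```python
from math import gcd


def christoffel(a, b):
    """Lower Christoffel word of slope a/b, for coprime a and b."""
    n = a + b
    return "".join("0" if (i * a) % n + a < n else "1" for i in range(n))


def is_christoffel(w):
    """True if w is the lower Christoffel word determined by its letter counts."""
    a, b = w.count("1"), w.count("0")
    return gcd(a, b) == 1 and w == christoffel(a, b)


def lyndon_factorization(s):
    """Lyndon factors of s (Duval's algorithm), repeated factors listed one by one."""
    n, i, factors = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def is_wn_convex(w):
    """Criterion of Theorem 3: every Lyndon factor is a Christoffel word."""
    return all(is_christoffel(f) for f in lyndon_factorization(w))


# The four Christoffel words reported in the caption of Fig. 5.
fig5 = ["001010101", "001010010101", "0001001001", "00000100001"]
print(is_wn_convex("".join(fig5)), is_wn_convex("0011"))
```

The concatenation of the four words of Fig. 5 passes the test, while the word 0011 (two horizontal steps followed by two vertical ones) is a Lyndon word that is not a Christoffel word and is therefore rejected.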

Such a result stresses the fact that a WN-convex path can be decomposed, through its Lyndon factorization, into a sequence of Christoffel words arranged in decreasing slope. Figure 5 depicts a part of a WN-convex path in which the minimal points \(m(w_i)\) of each Christoffel word \(w_i\), with \(i=1, \dots , 4\), are highlighted. Let us indicate with \(min(w_i)\) the length of the prefix of \(w_i\) ending in \(m(w_i)\).

Our aim is now to use this decomposition to determine a set of positions of a WN-convex path where it is possible to make local modifications, i.e., adding one single point, without losing the property. As outlined in the Introduction, this result shows its relevance in a reconstruction strategy for convex polyominoes that relies on the approach in Barcucci et al. (1996). More precisely, the kernel expansion defined there requires adding points to the kernel border in order to satisfy some given projections: our results characterize sets of positions where such an expansion can be performed and show some cases where multiple additions are required in order to preserve convexity.

Fig. 5

A (part of a) WN-convex path and its decomposition into four Christoffel words \(w_1=001010101\), \(w_2=001010010101\), \(w_3=0001001001\), and \(w_4=00000100001\) arranged in decreasing slope. The four minimal points of each segment are highlighted

The following proposition from Dulio et al. (2017a) stresses the role of the positions min(w) and \(min(w)+1\) of a Christoffel word w, in order to decompose it into two shorter Christoffel words:

Proposition 1

Let w be a Christoffel word of length n and \(k=min(w)\).

(i) The words \(u=w[1,k-1]\,1\) and \(v=0\, w[k+2,n]\) are two Christoffel words, where the notation \(w[i,j]\) indicates the subword of w from position i to position j, with \(1\le i \le j \le n\).

(ii) For each nonnegative integer \(k'\) different from k, the words \(u'=w[1,k'-1]\, 1\) and \(v'=0\, w[k'+2,n]\) are not both Christoffel words.

A useful consequence follows.

Corollary 1

Let w, u and v be as defined in Proposition 1. It holds \(\rho (u)>\rho (v)\).

Proof

We observe that, by definition of Christoffel path, the points \((|u|_0,|u|_1)\) and \((|v|_0,|v|_1)\) lie above and below the segment associated to the Christoffel word w, respectively. So, it holds that \(\rho (u)>\rho (w)>\rho (v)\). \(\square \)
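Item (i) of Proposition 1 and Corollary 1 can be checked experimentally on small Christoffel words. The sketch below is only an illustration: the names `min_index`, `pieces` and `check_word` are hypothetical, the minimal point is located by maximizing a·x − b·y along the path, and slopes are compared by cross-multiplication so that words without the letter 0 need no special treatment.

```python
from math import gcd


def christoffel(a, b):
    """Lower Christoffel word of slope a/b, for coprime a and b."""
    n = a + b
    return "".join("0" if (i * a) % n + a < n else "1" for i in range(n))


def is_christoffel(w):
    a, b = w.count("1"), w.count("0")
    return gcd(a, b) == 1 and w == christoffel(a, b)


def min_index(w):
    """min(w): length of the prefix of w ending at the minimal point m(w)."""
    a, b = w.count("1"), w.count("0")
    x = y = 0
    best, k = -1, 0
    for i, letter in enumerate(w, start=1):
        if letter == "0":
            x += 1
        else:
            y += 1
        if a * x - b * y > best:
            best, k = a * x - b * y, i
    return k


def pieces(w, k):
    """u = w[1,k-1] 1 and v = 0 w[k+2,n], with 1-based positions as in the text."""
    return w[:k - 1] + "1", "0" + w[k + 1:]


def check_word(w):
    """Item (i) of Proposition 1 together with rho(u) > rho(v) for one word."""
    u, v = pieces(w, min_index(w))
    steeper = u.count("1") * v.count("0") > v.count("1") * u.count("0")
    return is_christoffel(u) and is_christoffel(v) and steeper


w = christoffel(3, 5)
print(min_index(w), pieces(w, min_index(w)), check_word(w))
```

For the word 00100101 of slope 3/5 the minimal point is reached after five steps, and the two resulting words are 00101 and 001, of slopes 2/3 and 1/2.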

3.2 Definition of the split operator

Relying on the previous results, let us define a split operator that acts on a Christoffel word w and decomposes it into the concatenation of two Christoffel words u and v by changing the subword 01 in positions \(w[min(w),min(w)+1]\) into 10, as defined in Proposition 1, i.e., \(split(w)=u\, v\). The split operator can be naturally extended to sequences of Christoffel words, by adding an index to indicate the word where the split operator acts. More formally, if \(w=w_1 \, w_2 \dots w_n\) is a sequence of Christoffel words, then \(split_t(w)=w_1 \, w_2 \, \dots split(w_t)\dots w_n\). Accordingly, consecutive applications of the split operator will be indexed by the corresponding sequence of indexes, in application order.

From a geometrical point of view, the split of the Christoffel word can be regarded as the addition of one point in the minimal position of the related path, preserving the Christoffel property of the obtained factors. If we consider the decomposition of a WN-convex path defined in Theorem 3, then the split operator can be used to add one point at a time on the boundary of a convex polyomino. Unfortunately, when sequences of Christoffel words with decreasing slopes are involved as in a WN-convex path, the application of the split operator may not preserve the decreasing order of the slopes and may cause the loss of the global convexity of the path.

As an example, Fig. 6 shows two cases of this action on \(w_2\): the added cells and the factors \(u_2\) and \(v_2\) are highlighted.
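The action of the split operator on a sequence of Christoffel words can be sketched as follows. The code assumes that the sequence is stored as a list of strings and that the index t is 1-based, as in the text; the names `split`, `split_t` and `slopes_non_increasing` are illustrative. Slopes are compared by cross-multiplication, and equal consecutive slopes are accepted because Theorem 3 allows powers of the same Christoffel word.

```python
def split(w):
    """Change the factor 01 at the minimal point of the Christoffel word w into 10."""
    a, b = w.count("1"), w.count("0")
    x = y = 0
    best, k = -1, 0
    for i, letter in enumerate(w, start=1):
        if letter == "0":
            x += 1
        else:
            y += 1
        if a * x - b * y > best:      # furthest point below the segment so far
            best, k = a * x - b * y, i
    return w[:k - 1] + "1", "0" + w[k + 1:]


def split_t(words, t):
    """Apply split to the t-th word (1-based) of a sequence of Christoffel words."""
    u, v = split(words[t - 1])
    return words[:t - 1] + [u, v] + words[t:]


def slopes_non_increasing(words):
    """True if the slopes of consecutive words never increase."""
    return all(p.count("1") * q.count("0") >= q.count("1") * p.count("0")
               for p, q in zip(words, words[1:]))


path = ["001010101", "00101"]   # slopes 4/5 and 2/3
first = split_t(path, 1)
second = split_t(path, 2)
print(first, slopes_non_increasing(first))
print(second, slopes_non_increasing(second))
```

Splitting the first word gives the slopes 1, 3/4 and 2/3, which are still decreasing, whereas splitting the second word gives 4/5, 1 and 1/2, so the order of the slopes is lost.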

Property 2

Let w be a Christoffel word of slope \(\rho (w)>1\). If \(split(w)=u\,v\), then \(\rho (u)> \rho (w)> \rho (v)\ge 1\).

Proof

By Corollary 1, it only remains to prove that \(\rho (v) \ge 1\). By the geometrical definition of Christoffel word, \(\rho (w) > 1\) implies that \(w=w'11\). Since v is a Christoffel word, if its length is greater than two, then it ends with the factor 11 as well and its slope is greater than one. On the other hand, if it has length equal to two, then \(v=01\) and its slope is one. \(\square \)

A symmetric reasoning holds if \(\rho (w)<1\). We stress that, if \(\rho (w)=1\), i.e., \(w=01\), then the word \(w^k\) with \(k\ge 1\) can be split into two different Christoffel words u and v by changing any of the factors 01 into 10, and it holds that \(\rho (u)>1\) and \(\rho (v)<1\).

The previous property allows us to consider, without loss of generality, the action of the split operator only on those words whose slopes range from one down to zero.

In the next section, we investigate different situations which may occur to a WN-path as a consequence of one or more applications of the split operator. We also show solutions and drawbacks when problems are caused. The following cases are classified according to the number of points added at each step.

3.2.1 Classification of the splitting action

The split operator produces perturbations on the path, which can be classified in three different types according to the slopes of the obtained factors:

Type 1: the point added by the split operator preserves both the Lyndon factorization and the global convexity (see Fig. 6a). This means that the two new factors \(u_i\) and \(v_i\) globally preserve the decreasing slope of the line segments of the path.

Type 2: the added point preserves the convexity of the obtained path, but the Lyndon factorization is not preserved. In practice, \(w_{i-1} u_i v_i w_{i+1}\), with \(w_{i+1}\) possibly void, is not a Lyndon factorization, i.e. the slopes of the line segments of the path are not decreasing. As an example, we can obtain a new Lyndon factorization by joining \(w_{i-1}\) and \(u_{i}\) in a new Christoffel word \((w_{i-1} u_{i}) v_{i} w_{i+1}\), as shown in Fig. 6b.

Type 3: the added point preserves neither the convexity of the path nor the Lyndon factorization. In practice, \(w_{i-1} u_i v_i w_{i+1}\), with \(w_{i+1}\) possibly void, is not a Lyndon factorization and the new Lyndon factorization is not composed of Christoffel words only, as we can see in Example 2. In this case it is necessary to act on the path by adding at least a second point to recover the convexity, as shown in the example below.
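The three types can be told apart by recomputing the Lyndon factorization after the split. The following is a sketch under stated assumptions: the path is given as a list of (primitive) Christoffel words, the index t is 1-based, Duval's algorithm is used in its textbook form, and the name `classify_split` is hypothetical.

```python
from math import gcd


def christoffel(a, b):
    """Lower Christoffel word of slope a/b, for coprime a and b."""
    n = a + b
    return "".join("0" if (i * a) % n + a < n else "1" for i in range(n))


def is_christoffel(w):
    a, b = w.count("1"), w.count("0")
    return gcd(a, b) == 1 and w == christoffel(a, b)


def lyndon_factorization(s):
    """Lyndon factors of s (Duval's algorithm), repeated factors listed one by one."""
    n, i, factors = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


def split(w):
    """Change the factor 01 at the minimal point of w into 10."""
    a, b = w.count("1"), w.count("0")
    x = y = 0
    best, k = -1, 0
    for i, letter in enumerate(w, start=1):
        x, y = (x + 1, y) if letter == "0" else (x, y + 1)
        if a * x - b * y > best:
            best, k = a * x - b * y, i
    return w[:k - 1] + "1", "0" + w[k + 1:]


def classify_split(words, t):
    """Type (1, 2 or 3) of the split of the t-th word, with the new factorization."""
    u, v = split(words[t - 1])
    new_words = words[:t - 1] + [u, v] + words[t:]
    factors = lyndon_factorization("".join(new_words))
    if factors == new_words:
        return 1, factors          # factorization and convexity preserved
    if all(is_christoffel(f) for f in factors):
        return 2, factors          # convex, but with a different factorization
    return 3, factors              # some factor is not a Christoffel word


print(classify_split(["001010101", "00101"], 1))
print(classify_split(["001010101", "00101"], 2))
print(classify_split([christoffel(3, 5), christoffel(11, 20)], 1)[0])
```

The second call reproduces the situation of Fig. 6b, where the new factorization is (00101010101)(001); the third call uses the slopes 3/5 and 11/20 of Example 2 and is reported as Type 3.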

Fig. 6

Two WN-paths of a convex polyomino. The application of the split operator to (the minimal point of) \(w_2\) in a preserves the Lyndon factorization and the global convexity, while in b it preserves the convexity of the whole path, but not the initial Lyndon factorization, i.e., \(w_1w_2=(001010101)(00101)\). The new Lyndon factorization requires \(w_1\) to be concatenated with \(u_2\), obtaining the word \(w_4=w_1\, u_2\). The new Lyndon factorization is \(w_4v_2=(001010101 \, 01)(001)\)

3.2.2 Adding one point

As the simplest case, we consider the splitting of a single word \(w_i\) in the Lyndon factorization of the WN-convex path.

Example 2

Let \(w_1\) and \(w_2\) be two Christoffel words, with \(\rho (w_1)=\frac{3}{5}>\rho (w_2)=\frac{11}{20}\) as in Fig. 7. The application of the split operator to \(w_1\) produces \(split(w_1)=u_1\, v_1\), with \(\rho (u_1)=\frac{2}{3}\) and \(\rho (v_1)=\frac{1}{2}\), so the sequence of slopes \(\rho (u_1)\), \(\rho (v_1)\) and \(\rho (w_2)\) is not decreasing. Furthermore, the Lyndon factorization after the split is \(u_1\, (v_1 \, w_2)\), with \((v_1 \, w_2)\) that is not a Christoffel word, so the corresponding path is not WN-convex any more. We can fix the problem by replacing a subword 01 with a subword 10 in \((v_1 \, w_2)\), obtaining the word \(w_3\) indicated below (changed elements are in boldface). The change of 01 into 10 corresponds to the insertion of a second point in the initial WN-path, as shown in Fig. 7:

$$\begin{aligned} (v_1 \, w_2)= & {} 001 \, 0010010010010\,\mathbf{01}\,0100100100100101\\ w_3= & {} 001 \, 0010010010010\,\mathbf{10}\,0100100100100101. \end{aligned}$$

Now, the WN-convexity is recovered and \(v_1 w_2\) changes into the word \(w_3\), which is the concatenation of two Christoffel words of slope \(\frac{6}{11}\) (right path of Fig. 7).

Fig. 7

The split of the Christoffel word \(w_1\) into \(u_1\) and \(v_1\). The concatenation of \(v_1\) and \(w_2\) forces the addition of a second point

3.2.3 Adding two points

Now, we push our research further by considering the case of the addition of two points in consecutive line segments of a WN-convex path: the results we are going to present can be generalized to the addition of a generic number of points.

A fortiori, when adding two points in a path, one can expect to lose the WN-convexity, so in Dulio et al. (2017a), sufficient conditions for its preservation are provided:

Theorem 4

Let \(w_1\) and \(w_2\) be two consecutive Christoffel words of a WN-convex path, and let \(split(w_1)=u_1\, v_1\) and \(split(w_2)=u_2 \, v_2\). If \(\rho (v_1)>\rho (w_2)\) and \(\rho (w_1)>\rho (u_2)\) (i.e. the split operator causes two perturbations of Type 1 to \(w_1\) and \(w_2\), separately), then \(\rho (v_1)>\rho (u_2)\).

On the other hand, if the splitting of \(w_1\) and \(w_2\) causes \(\rho (v_1)< \rho (u_2)\), different situations occur; in particular it may happen that one or more new points need to be added to recover the WN-convexity.

The next example shows the case when additional points have to be added to gain back the WN-convexity of a path:

Example 3

Let \(w_1\) and \(w_2\) be two Christoffel words of a WN-convex path such that \(\rho (w_1)=\frac{30}{41}>\rho (w_2)=\frac{5}{7}\). Let \(split(w_1)=u_1\, v_1\) and \(split(w_2)=u_2\, v_2\) with \(\rho (u_1)=\frac{11}{15}\), \(\rho (v_1)=\frac{19}{26}\), \(\rho (u_2)=\frac{3}{4}\) and \(\rho (v_2)=\frac{2}{3}\) as in Fig. 8. The four slopes are not in decreasing order, in particular \(\rho (v_1)< \rho (u_2)\), with

$$\begin{aligned} v_1=0010100101010010101001010\, \mathbf{01}\, 010100101010010101, \text{ and } u_2=0010101. \end{aligned}$$

As in the previous section, the concatenation of \(v_1\) and \(u_2\) does not produce a Christoffel word. Then, to get back the convexity, a third point has to be added to the path, precisely by changing the boldface occurrence of 01 in \(v_1\) into 10, and obtaining the word:

$$\begin{aligned} w_3=0010100101010010101001010 \, \mathbf{10} \, 0101001010100101010010101. \end{aligned}$$

In general, the point or the points to be added can be detected by checking where \(v_1\,u_2\) differs from the Christoffel word of slope \(\rho (w_3)=\frac{|v_1\,u_2|_1}{|v_1\,u_2|_0}\).

We underline that \(w_3\) is not primitive, being the concatenation of two Christoffel words of slope \(\frac{11}{15}\). With such an addition the slopes are in decreasing order as required by the WN-convexity.
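The detection step of Example 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `christoffel` and `mismatches` are hypothetical, the Christoffel word is built with the standard modular construction (letter k is 0 when \(k\,b \bmod (a+b)\) increases and 1 when it wraps, for a zeros and b ones), and when the letter counts are not coprime the primitive word is simply repeated, as happens for \(w_3\).

```python
from math import gcd

def christoffel(zeros: int, ones: int) -> str:
    """Lower Christoffel word with the given (positive) letter counts.

    If the counts are not coprime, the primitive word is repeated.
    """
    g = gcd(zeros, ones)
    a, b = zeros // g, ones // g
    n = a + b
    # Letter k is '0' when k*b mod n increases at the next step, '1' when it wraps.
    primitive = "".join(
        "0" if (k * b) % n < ((k + 1) * b) % n else "1" for k in range(n)
    )
    return primitive * g

def mismatches(word: str, target: str) -> list:
    """0-based positions where two words of equal length differ."""
    return [k for k, (x, y) in enumerate(zip(word, target)) if x != y]

# Words of Example 3.
v1 = "0010100101010010101001010" + "01" + "010100101010010101"
u2 = "0010101"
candidate = v1 + u2

# Target word with the same letter counts as v1 u2 (slope 22/30 = 11/15).
w3 = christoffel(candidate.count("0"), candidate.count("1"))

print(mismatches(candidate, w3))  # [25, 26]: the boldface factor 01 of v1
```

The two reported positions are exactly those of the boldface factor 01, so exchanging it into 10 yields \(w_3\).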

Fig. 8

A qualitative representation of the situation of Example 3. The splits of \(w_1\) into \(u_1\) and \(v_1\) and \(w_2\) into \(u_2\) and \(v_2\) do not preserve the convexity of the WN-path. A third point (in red) is added to obtain a new Lyndon factorization \(u_1 w_3 v_2\) and to gain back the decreasing order of the slopes

Finally, a second case occurs when splitting the two Christoffel words \(w_1\) and \(w_2\) produces factors \(u_1, v_1, u_2, v_2\) and the concatenation of \(v_1\) and \(u_2\) gives a Christoffel word \(w_3\), yet the convexity is not preserved (which means that the order of the slopes is not correct). To solve this problem, again we need to add other points, as we can see in Example 4.

Example 4

Let \(w_1,~w_2\) be two Christoffel words in the WN-path boundary of a polyomino P such that \(\rho (w_1)=\frac{3}{5}>\rho (w_2)=\frac{57}{100}\) (the words \(w_1\) and \(w_2\) are not reported for simplicity). By applying the split operator to both words, we get \(split(w_1)=u_1v_1\) and \(split(w_2)=u_2v_2\), with \(v_1=001\) and \(u_2=00100100101\). The slopes of these four paths are \(\rho (u_1)=\frac{2}{3}\), \(\rho (v_1)=\frac{1}{2}\), \(\rho (u_2)=\frac{4}{7}\), and \(\rho (v_2)=\frac{53}{93}\). We observe that the slopes are not in decreasing order, since \(\rho (v_1)<\rho (u_2)\). By concatenating \(v_1\) and \(u_2\) we get \(w_3=00100100100101\), which is a Christoffel word of slope \(\frac{5}{9}\). Unfortunately \(\rho (u_1)>\rho (w_3)<\rho (v_2)\), which means that the order of the slopes is still not correct and the convexity is not respected. Hence, some more points have to be added to regain the convexity. In this case only one more point is needed, placed exactly where the two Christoffel words \(w_3\) and \(v_2\) join.
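The slope comparisons of Example 4 can be checked with exact rational arithmetic. The following is a minimal sketch: the helper `slope` is a hypothetical name implementing \(\rho (w)=|w|_1/|w|_0\), and since the words \(u_1\) and \(v_2\) are not reported in the example, their slopes are entered directly as the fractions given in the text.

```python
from fractions import Fraction

def slope(word: str) -> Fraction:
    """rho(w) = (number of 1s) / (number of 0s)."""
    return Fraction(word.count("1"), word.count("0"))

# Words reported in Example 4.
v1 = "001"
u2 = "00100100101"
w3 = v1 + u2

# Slopes of the factors whose words are not reported (taken from the text).
rho_u1 = Fraction(2, 3)
rho_v2 = Fraction(53, 93)

print(slope(v1), slope(u2))        # 1/2 4/7: the order v1, u2 is wrong
print(slope(w3))                   # 5/9
print(rho_u1 > slope(w3))          # True: u1, w3 are correctly ordered
print(slope(w3) > rho_v2)          # False: w3, v2 still violate convexity
```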

3.3 A remarkable class of WN-paths

In Dulio et al. (2017a) a stability result for WN-convex paths has been given, under a sequence of split operations. However, no class of WN-convex paths to which such a sequence of operations can be effectively applied has been presented. Indeed, all the assumptions of Theorem 4 must be fulfilled at each step, so the problem of real application of the method is not trivial. In what follows we define a family \(\mathcal {WN}\) of WN-convex paths with such a property.

Let us consider the set PER, introduced in de Luca and Mignosi (1994), of all words w having two periods p and q which are coprime and such that \(|w|=p+q-2\), i.e. those words having maximal length and not satisfying the Fine and Wilf theorem (Fine and Wilf 1965). Also, we denote by CP the set of all Christoffel words, and by PAL the set of all palindromes.

The following properties hold (see Berstel and de Luca 1997; de Luca 1997):

(a) \(PER=0^{*}\cup 1^{*}\cup (PAL\cap (PAL\; 01\; PAL))\) (de Luca 1997, Proposition 7).

(b) \(CP=0\; PER\; 1\cup \{0,1\}\) (Berstel and de Luca 1997, Theorem 4.1).

We investigate the following set \(\mathcal {WN}\) of words

$$\begin{aligned} \mathcal {WN}=\left\{ w_{i,j}\in \{0,1\}^\star ,\, w_{i,j}=0((0)^i1)^j,\,\,i,j\in {\mathbb {N}}\right\} . \end{aligned}$$

Theorem 5

For any three fixed positive integers \(i, j, n\), with \(n\le i\), let \(w_{i,j}\in \mathcal {WN}\) and \(w_j(n)=w_{1,j}w_{2,j}\ldots w_{n,j}\). Then, the following holds:

1. \(w_{i,j}\) is a Christoffel word;

2. \(w_j(n)\) represents a WN-path;

3. \(split (w_{i,j})=(0)^{i}1w_{i,j-1}\);

4. for each \(s\in \{1,\ldots ,n\}\), \(split_s(w_j(n))\) represents a WN-path;

5. for each \(s\in \{1,\ldots ,n\}\), \(split_{1,2,\ldots ,s}(w_j(n))\) represents a WN-path.

Proof

For \(i,j<2\) all results are trivial, so, let us assume \(i,j\ge 2\).

1.:

The word \(w_{i,j}\) can be written as

$$\begin{aligned} w_{i,j}=0\left( (0)^i1\right) ^{j-1}(0)^i1=0r_{i,j}1, \end{aligned}$$

where \(r_{i,j}=\left( (0)^i1\right) ^{j-1}(0)^i\), so that \(r_{i,j}\in PAL\). Let \(P=(0)^{i-1}\) and \(Q=\left( (0)^i1\right) ^{j-2}(0)^i\). Then \(P,Q\in PAL\), and \(r_{i,j}=P\;01\;Q\). Therefore, \(r_{i,j}\in PAL\cap (PAL\;01\;PAL)\), and consequently, by de Luca (1997, Proposition 7), \(r_{i,j}\in PER\). Therefore \(w_{i,j}\in 0\;PER\;1\), and consequently, by Berstel and de Luca (1997, Theorem 4.1), \(w_{i,j}\in CP\).
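The palindromic decomposition used in this part of the proof is easy to test on small cases. The sketch below is only a sanity check, not a proof; the function names `w`, `is_pal` and `decomposition_holds` are hypothetical, and the words are built directly from the definition \(w_{i,j}=0((0)^i1)^j\).

```python
def w(i: int, j: int) -> str:
    """The word w_{i,j} = 0 (0^i 1)^j of the family WN."""
    return "0" + ("0" * i + "1") * j

def is_pal(s: str) -> bool:
    """True when s reads the same in both directions."""
    return s == s[::-1]

def decomposition_holds(i: int, j: int) -> bool:
    """Check w_{i,j} = 0 r 1 with r, P, Q palindromes and r = P 01 Q (i, j >= 2)."""
    word = w(i, j)
    r = word[1:-1]
    P = "0" * (i - 1)
    Q = ("0" * i + "1") * (j - 2) + "0" * i
    return (
        word[0] == "0" and word[-1] == "1"
        and is_pal(r) and is_pal(P) and is_pal(Q)
        and r == P + "01" + Q
    )

print(all(decomposition_holds(i, j) for i in range(2, 8) for j in range(2, 8)))
```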

2.:

The prefix of length \(i+1\) of each conjugate of \(w_{i,j}\) other than \(w_{i,j}\) itself has the form \((0)^h1(0)^k\), where \(h,k\in \{0,1,\ldots ,i\}\) and \(h+k=i\). Since the prefix of length \(i+1\) of \(w_{i,j}\) is \((0)^{i+1}\), the word \(w_{i,j}\) is strictly smaller than all its other conjugates with respect to the lexicographic order. Therefore \(w_{i,j}\) is a Lyndon word. Moreover, \(w_{s,j}\) begins with \((0)^{s+1}1\) whereas \(w_{s+1,j}\) begins with \((0)^{s+2}\), so that \(w_{1,j}>w_{2,j}>\cdots >w_{n,j}\). Consequently, the factorization \(w_j(n)=w_{1,j}w_{2,j}\ldots w_{n,j}\) is precisely the Lyndon factorization of \(w_j(n)\). By 1., \(w_{s,j}\in CP\) for all s, and consequently, by Brlek et al. (2009, Proposition 7), \(w_j(n)\) is WN-convex.
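The claim on the Lyndon factorization can be cross-checked computationally. The sketch below uses Duval's algorithm, a standard linear-time method for the Lyndon factorization that is not part of the argument above; the function names are hypothetical and the order on letters is assumed to be 0 < 1.

```python
def w(i: int, j: int) -> str:
    """The word w_{i,j} = 0 (0^i 1)^j of the family WN."""
    return "0" + ("0" * i + "1") * j

def is_lyndon(s: str) -> bool:
    """True when s is strictly smaller than every proper conjugate."""
    return all(s < s[r:] + s[:r] for r in range(1, len(s)))

def lyndon_factorization(s: str) -> list:
    """Duval's algorithm: factor s into a non-increasing product of Lyndon words."""
    factors, i, n = [], 0, len(s)
    while i < n:
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            # Restart the comparison on a strict increase, otherwise advance.
            k = i if s[k] < s[j] else k + 1
            j += 1
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors

j, n = 3, 3
expected = [w(s, j) for s in range(1, n + 1)]
print(all(is_lyndon(f) for f in expected))
print(lyndon_factorization("".join(expected)) == expected)
```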

3.:

By definition of the Christoffel word \(w_{i,j}\), its slope is

$$\begin{aligned} \rho (w_{i,j})=\frac{|w_{i,j}|_1}{|w_{i,j}|_0}=\frac{j}{ij+1} \end{aligned}$$

Let \(Q_{ij}=(\alpha ,\beta )\) be the minimal point of \(w_{i,j}\). Then \(Q_{ij}\) has the maximal vertical distance from the line segment from (0, 0) to \((ij+1,j)\). By Dulio et al. (2017a, Lemma 1), this happens when \(\alpha j-\beta (ij+1)\equiv -1 \pmod {j+ij+1}\), and consequently when \(\alpha =i+1\) and \(\beta =0\). Therefore, independently of j, we have

$$\begin{aligned} split (w_{i,j})=(0)^i10\left( (0)^i1\right) ^{j-1}=(0)^i1w_{i,j-1}. \end{aligned}$$
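The closed form above can be compared with a brute-force computation of the minimal point. The sketch below rests on an assumption about the split operator that is only inferred from the displayed equation: the point of the path with maximal vertical distance below the segment is located, and the factor 01 around it is exchanged into 10. The names `w` and `split_at_minimal_point` are hypothetical.

```python
from fractions import Fraction

def w(i: int, j: int) -> str:
    """The word w_{i,j} = 0 (0^i 1)^j of the family WN."""
    return "0" + ("0" * i + "1") * j

def split_at_minimal_point(word: str) -> str:
    """Exchange the factor 01 at the lattice point farthest below the segment.

    Assumption: this is how the split operator acts on a Christoffel word.
    """
    zeros, ones = word.count("0"), word.count("1")
    x = y = 0
    best_gap, best_k = Fraction(0), 0
    for k, letter in enumerate(word, start=1):
        if letter == "0":
            x += 1
        else:
            y += 1
        # Vertical distance between the segment and the point (x, y).
        gap = Fraction(x * ones, zeros) - y
        if gap > best_gap:
            best_gap, best_k = gap, k
    k = best_k
    if word[k - 1:k + 1] != "01":
        raise ValueError("no factor 01 at the minimal point")
    return word[:k - 1] + "10" + word[k + 1:]

# Compare with the closed form 0^i 1 w_{i,j-1} for small parameters.
print(all(
    split_at_minimal_point(w(i, j)) == "0" * i + "1" + w(i, j - 1)
    for i in range(1, 7) for j in range(2, 7)
))
```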

4.:

By 2. the word \(w_{s,j}\) represents a WN-path for each \(s\in \{1,\ldots ,n\}\). Consider two consecutive words \(w_{s,j}\) and \(w_{s+1,j}\) of the Lyndon factorization of \(w_j(n)\). By 3. we have \(split (w_{s,j})=u_{s,j}v_{s,j}\), where \(u_{s,j}=(0)^s1\) and \(v_{s,j}=0\left( (0)^s1\right) ^{j-1}\). Analogously, it results \(split (w_{s+1,j})=u_{s+1,j}v_{s+1,j}\), where \(u_{s+1,j}=(0)^{s+1}1\) and \(v_{s+1,j}=0\left( (0)^{s+1}1\right) ^{j-1}\). Therefore we have

$$\begin{aligned} \begin{array}{l} \rho (v_{s,j})=\frac{j-1}{sj-s+1}\\ \rho (w_{s+1,j})=\frac{j}{sj+j+1}\\ \\ \rho (w_{s,j})=\frac{j}{sj+1}\\ \rho (u_{s+1,j})=\frac{1}{s+1}. \end{array} \end{aligned}$$

Therefore we have \(\rho (v_{s,j})>\rho (w_{s+1,j})\) if and only if \(j^2-j-1>0\), which is always satisfied for \(j>1\). Also we have \(\rho (w_{s,j})>\rho (u_{s+1,j})\) if and only if \(j>1\). Consequently, all the assumptions of Dulio et al. (2017a, Theorem 3) are fulfilled, so that \(split_s(w_j(n))=w_{1,j}w_{2,j}\ldots split (w_{s,j})\ldots w_{n,j}\) represents a WN-path.
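The two inequalities can also be verified numerically with exact fractions. This is a small sanity check over a finite range of parameters, not a replacement for the argument; the function names are hypothetical and the slopes are those displayed above.

```python
from fractions import Fraction

def rho_w(s: int, j: int) -> Fraction:
    """Slope of w_{s,j}: j ones over sj+1 zeros."""
    return Fraction(j, s * j + 1)

def rho_u(s: int) -> Fraction:
    """Slope of u_{s,j} = 0^s 1."""
    return Fraction(1, s)

def rho_v(s: int, j: int) -> Fraction:
    """Slope of v_{s,j} = 0 (0^s 1)^(j-1)."""
    return Fraction(j - 1, s * (j - 1) + 1)

def hypotheses_hold(s: int, j: int) -> bool:
    """rho(v_s) > rho(w_{s+1}) and rho(w_s) > rho(u_{s+1})."""
    return rho_v(s, j) > rho_w(s + 1, j) and rho_w(s, j) > rho_u(s + 1)

# Satisfied for every j > 1 in the tested range, violated for j = 1.
print(all(hypotheses_hold(s, j) for s in range(1, 40) for j in range(2, 40)))
print(any(hypotheses_hold(s, 1) for s in range(1, 40)))
```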

5.:

This immediately follows from Dulio et al. (2017a, Corollary 2). \(\square \)

Example 5

Below we list some words of \(\mathcal {WN}\), for small values of i and j:

$$\begin{aligned} \begin{array}{l} w_{1,1}=001\,\,\,\,(i=j=1)\\ w_{1,2}=00101\,\,\,\,(i=1, j=2)\\ w_{2,1}=0001\,\,\,\,(i=2, j=1)\\ w_{2,2}=0001001\,\,\,\,(i=j=2). \end{array} \end{aligned}$$

For \(i=j=1\) we have \(w_{i,j}=0r_{i,j}1\), where \(r_{i,j}\) is trivial, and the palindromic words P, Q determined in the proof (part 1.) cannot be defined. The first non-trivial word of \(\mathcal {WN}\) is obtained when \(i=j=2\); in this case we have \(r_{2,2}=00100\), so that \(r_{2,2}=P\;01\;Q\), where \(P=0\) and \(Q=00\). Assuming \(n=i=2\), we also have \(w_2(2)=w_{1,2}w_{2,2}=(00101)(0001001)=001010001001\), and consequently

$$\begin{aligned} \begin{array}{l}split_1(w_2(2))=010010001001\\ split_{1,2}(w_2(2))=010010010001. \end{array} \end{aligned}$$

Example 6

The case \(j=2\) represents a kind of extremal WN-path. In fact \(\rho (v_{s,2})=\rho (u_{s+1,2})=\frac{1}{s+1}\) for all \(s\in \{1,\ldots ,n-1\}\) (see the proof of point 4. in Theorem 5), meaning that the application of the split operator to consecutive words of the Lyndon factorization of \(w_2(n)\) provides, independently of s, pairs of collinear segments. Fig. 9 shows the case \(n=i=3\).

Fig. 9

The three words \(w_{1,2}\), \(w_{2,2}\), \(w_{3,2}\), corresponding to the line segments AB, BC, and CD, respectively. The WN-convex path (bold solid line) determined by their concatenation \(w_2(3)=w_{1,2}w_{2,2}w_{3,2}\), and the WN-path (dashed line) obtained by the iterated application of the split operator

Example 7

If \(j>2\) the split operator provides four distinct segments from each pair of consecutive words of the original WN-convex path. For instance, if \(i\le j=3\), we have

$$\begin{aligned} \begin{array}{l} w_{1,3}=0010101\\ w_{2,3}=0001001001\\ w_{3,3}=0000100010001\\ \\ w_3(3)=0010101\cdot 0001001001\cdot 0000100010001\\ \\ split_{1,2,3}(w_3(3))=01\cdot 00101\cdot 001\cdot 0001001\cdot 0001\cdot 000010001, \end{array} \end{aligned}$$

where each slot represents a different line segment of the resulting WN-path.
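The words of Example 7 can be regenerated from the definition of \(\mathcal {WN}\) and from point 3. of Theorem 5. The sketch below is illustrative only; the names `w`, `split_w` and `split_path` are hypothetical, and the split is applied through the closed form \(split (w_{i,j})=(0)^{i}1w_{i,j-1}\) rather than through a geometric computation.

```python
def w(i: int, j: int) -> str:
    """The word w_{i,j} = 0 (0^i 1)^j of the family WN."""
    return "0" + ("0" * i + "1") * j

def split_w(i: int, j: int) -> list:
    """The two factors of split(w_{i,j}) = 0^i 1 . w_{i,j-1} (Theorem 5, point 3)."""
    return ["0" * i + "1", w(i, j - 1)]

def split_path(j: int, n: int, s: int) -> list:
    """Factors of split_{1,...,s}(w_j(n)): the first s words are split."""
    factors = []
    for t in range(1, n + 1):
        factors += split_w(t, j) if t <= s else [w(t, j)]
    return factors

print("·".join(w(t, 3) for t in range(1, 4)))
# 0010101·0001001001·0000100010001
print("·".join(split_path(3, 3, 3)))
# 01·00101·001·0001001·0001·000010001
```

Each split preserves the length and the letter counts of the path, since it only exchanges one factor 01 into 10 inside the corresponding word.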

Example 8

In the case of the word \(r_{i,j}\) associated with a generic \(w_{i,j}\in \mathcal {WN}\), \(i,j>1\), as described in the proof of Theorem 5, we have shown that \(P=(0)^{i-1}\) and \(Q=\left( (0)^i1\right) ^{j-2}(0)^i\). Therefore \(|P|=i-1\) and \(|Q|=ij+j-i-2\). By de Luca (1997, Section 5), \(p=|P|+2\) and \(q=|Q|+2\) are periods of \(r_{i,j}\) such that \(|r_{i,j}|=p+q-2\), which, in our case, precisely equals \(ij+j-1\).
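The periods of Example 8 can be checked directly on small instances. In the sketch below the function names are hypothetical; the test confirms that \(p=i+1\) and \(q=ij+j-i\) are coprime periods of \(r_{i,j}\) and that \(|r_{i,j}|=p+q-2=ij+j-1\).

```python
from math import gcd

def r(i: int, j: int) -> str:
    """The central word r_{i,j} = (0^i 1)^(j-1) 0^i, so that w_{i,j} = 0 r 1."""
    return ("0" * i + "1") * (j - 1) + "0" * i

def has_period(word: str, p: int) -> bool:
    """True when word[k] == word[k + p] for every valid index k."""
    return all(word[k] == word[k + p] for k in range(len(word) - p))

def periods_ok(i: int, j: int) -> bool:
    """Check the two periods, their coprimality and the length of r_{i,j}."""
    word = r(i, j)
    p, q = i + 1, i * j + j - i
    return (
        has_period(word, p) and has_period(word, q)
        and gcd(p, q) == 1
        and len(word) == p + q - 2 == i * j + j - 1
    )

print(all(periods_ok(i, j) for i in range(2, 10) for j in range(2, 10)))
```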

4 Conclusions and perspectives

In this paper we have studied the possibility of reconstructing convex polyominoes, i.e. finite connected sets of points, from two orthogonal projections. Yan Gérard, in a recent communication (Gérard 2017), suggested an approach to the problem that resembles the strategy defined in Barcucci et al. (2001) for the super class of hv-convex polyominoes. Interestingly, the mutual dependencies between points in the contour of a convex polyomino are not clearly understood yet, preventing an immediate generalization of the reconstruction.

We have studied under which conditions, and how, the addition of one or more points to the contour of a convex polyomino may affect the neighboring areas, with the intent of handling its reconstruction process step by step without falling into non-polynomiality. The obtained results show, on the one hand, that strong geometrical constraints are needed to maintain convexity when the kernel of a convex polyomino is extended by the addition of new points and, on the other hand, that there exist classes of boundary paths where the addition of points can be performed in an unexpectedly simple way.

We have provided several examples illustrating local strategies that can be adopted, i.e., strategies involving one point or few close points, as well as some global results on a special class of WN-paths.

Summing up, more remains to be investigated in order to show that the class of convex polyominoes can be reconstructed in polynomial time, as we would expect on the basis of experimental results. Of prominent relevance is the characterization of the positions and the number of the points that are included in a convex polyomino as a consequence of a single further addition. A related research line is the study of the combinatorial properties of those paths that have a mutually independent growth.