1 Introduction

1.1 Motivation

Fractals are fascinating mathematical objects, typically showing complicated geometrical patterns when drawn on the screen. Strikingly, such complex geometric shapes are generally obtained through very simple production rules. There are several ways to compute and display fractal objects. Popular procedures include escape-time fractals, recursive fractals, finite subdivision rules, random processes (e.g., Brownian motion, percolation clusters, diffusion-limited aggregation, Lévy flights, random walks), strange attractors of dynamical systems, L-systems, and others. The interested reader is referred to [1, 2] for a general overview on the field.

One of the most popular methods for fractal image generation is based on the so-called Iterated Function Systems (IFS). Basically, an IFS is a finite set of contractive affine maps \(\{\varphi _i\}_{i=1,\dots ,N}\), defined on a complete metric space, \({{\mathcal{M}}}\) [3, 4]. These affine maps depend on several parameters encapsulating different 2D geometrical operations (scaling, rotation, translation). The set of parameter values for all affine maps of the IFS is called the IFS code. It can be proved that every IFS has a unique compact fixed set, \(\mathcal{A}\), usually known as the attractor of the IFS. Interestingly, the graphical representation of this attractor is a fractal image.

The use of IFS goes beyond the generation of beautiful fractal images. Actually, one of the most exciting applications of the IFS appears in image reconstruction. Given a digital image, it is possible to obtain an IFS whose attractor replicates the input image accurately. Amazingly, this input image does not necessarily need to be a fractal image; it can be any image. This is a very surprising result, even shocking at first sight, which is based on the observation that very often, some parts of an image resemble other parts of the same image. This feature is also characteristic in fractal geometry, as the fractal objects exhibit the property of self-similarity: they show (at least, approximately) similar patterns at different scales [1]. In this way, the central task of fractal image reconstruction is to identify similar parts within a digital image and then compute affine transformations connecting them, so that the image can be (approximately) reconstructed through iterative application of such transformations on an initial image [5, 6].

Although labeled as a very promising technique, the IFS image reconstruction is severely affected by a limiting factor: given a 2D input image, it is very difficult to obtain the IFS code (i.e., the parametric values of the IFS) providing a close approximation of the input image. This problem, usually referred to as IFS fractal image reconstruction problem, has become more challenging than expected. Although several approaches have been published during the last decades to address this problem (see Sect. 2 for details), the field still lacks a general methodology solving this question in the general case.

An additional issue is that the current methods for IFS image reconstruction address the problem from a purely geometric standpoint. As a result, they can only be applied to binary images, while the case of colored images is not actually considered. A previous paper presented by the authors at the ISCMI 2021 conference [7] addressed the reconstruction of colored images with IFS by introducing the color as a new free variable vector. The method combined the cuckoo search algorithm and k-means clustering to perform reconstruction of both the geometry and the color of the fractal image. However, the method had several limitations. In this paper, the method introduced in our previous conference paper in [7] is extended and enhanced in several ways. The main contributions of this work can be summarized as follows:

  • In the previous conference paper, the number of contractive functions of the IFS was assumed to be known and given as an input of the method. Instead, in this paper this number is assumed to be unknown and computed in a fully automatic way.

  • In the previous paper, the geometry and the color were not specialized, but encapsulated into a single representation. In this new method, genetic algorithms are used to address the discrete color subproblem, while particle swarm optimization is applied to address the continuous geometric subproblem, according to the actual (discrete or continuous) nature of the free variables of the general problem.

  • In the previous paper, the geometry and the color were calculated through two different fitness functions, but their interplay was ignored. This limitation affected the accuracy of the results. In this new method, the interplay of the color and the geometry is embedded in the method by using a hybrid scheme, in which the genetic algorithms and the PSO do not work independently, but mutually coupled. Under this new strategy, the output of each method is used as input of the other in an iterative fashion. This new scheme takes into account the interplay of the geometry and color, as any new computation of one of them modifies automatically the results for the other.

  • A new local refinement step through local search heuristics is now considered for further improvement of the final solution.

  • The global fitness of the previous paper based on the arithmetic mean of the fitness functions for the geometry and the color is now replaced by a weighted convex combination of these individual fitness functions, and the best value for the weight is empirically determined.

As a consequence of these changes, the method provides much more accurate results than the previous method and in a fully automatic way (see Sect. 5 for details).

This paper is organized as follows: Section 2 discusses the previous work in the field. The mathematical background needed to follow the paper is presented in Sect. 3. The proposed method is explained in detail in Sect. 4. The obtained graphical and numerical results are discussed in Sect. 5. The paper closes with the main conclusions and some ideas for future work in the field.

2 Previous work

The notion of using fractals to encode digital images started with the seminal work by Michael Barnsley in the 1980s (see [5] for details). Based on earlier theoretical developments by Hutchinson in [4], he introduced the use of fractal transformations to encode images in [8] and then in [9], where the collage theorem was first presented. That theorem showed that digital images can be efficiently approximated through iterated function systems, leading to the first approaches and some patents for fractal image compression.

Interestingly, the graphical representation and encoding of digital images through IFS constitute a highly asymmetric pair of problems. Rendering the attractor of a given IFS (usually called the direct problem) is a fairly simple task; fast and efficient methods can be found, for instance, in [1, 3, 10]. On the contrary, the IFS encoding problem, i.e., obtaining an IFS whose attractor approximates a given digital image accurately (also known as the inverse problem or the IFS image reconstruction problem), is extremely difficult and elusive. Several methods have been described in the literature to address the inverse problem. The first algorithms for fractal images were introduced in [9, 11]. These early methods were later extended to an automatic approach in [12], based on the concept of partitioned iterated function systems (PIFSs). However, these methods rely on exhaustive search procedures, which are slow and computationally expensive. Several methods were proposed to alleviate this computational load, mostly based on quadtrees, rectangular partitions, and triangular partitions, in some cases combined with clustering. But because they are mostly based on greedy algorithms, they are still slow and generally exhibit very low efficiency. The list of such methods is quite large, and a detailed discussion is beyond the scope of this paper; the interested reader is referred to [13] for a detailed review. Other approaches include Gröbner bases [14], the wavelet transform [15], gradient search [16], and moment matching [17]. Unfortunately, they are also computationally expensive and can only be applied to some (generally simple) particular cases, while failing to solve the general problem.

It has been pointed out that the inverse problem can be formulated as an optimization problem. Consequently, there have been attempts to solve it using classical mathematical optimization techniques. However, it was noticed that these techniques are not powerful enough to solve the general problem. Then, the focus shifted toward metaheuristic techniques, which are generally applied to problems unsolvable with the classical approaches; for instance, when the objective function is not differentiable and hence gradient-based techniques can no longer be applied. Other cases include multimodal problems and ill-conditioned problems.

As a feasible alternative, nature-inspired population-based methods, such as those typically found in evolutionary computing and swarm intelligence, have been considered to tackle the IFS inverse problem. One of the most popular approaches was based on the application of evolutionary techniques, such as genetic algorithms [18, 19] and genetic programming [20, 21]. The combination of fractal compression using PIFS and genetic algorithms is reported in [22,23,24]. Similarly, an evolutionary algorithm has been applied in [25] for fractal coding of binary images. Other examples of these techniques can be found in [26, 27].

Swarm intelligence methods have also been applied in several papers. Fractal image compression with different variations of the particle swarm optimization can be found in [28, 29]. These works do not compute the IFS parameters, but instead perform exhaustive search of similarities between blocks of the image. The papers in [30] and [31] perform IFS image reconstruction through the bat algorithm and the firefly algorithm, respectively. Their results might be acceptable in some settings where the accuracy is not critical, but they provide suboptimal solutions. A more recent paper improves the accuracy while being able to compute the optimal number of contractive maps in addition to their parametric values [32].

All previous methods deal with the problem of shape reconstruction from a geometric point of view exclusively and can only be applied to binary (i.e., black and white) images. In contrast, the case of full-color images has been largely ignored in the literature. In a previous paper presented at the ISCMI 2021 conference, the authors introduced a new method to address the reconstruction of colored images with IFS by considering the color as a new free variable vector [7]. The method combined a swarm intelligence method called the cuckoo search algorithm and k-means clustering to perform reconstruction of both the geometry and the color of the fractal image. An alternative method based on the bat algorithm and multilevel thresholding was also recently reported [33]. To the best of our knowledge, these are the only papers addressing the issue of IFS image reconstruction for colored images. However, these methods also have some limitations. In this paper, they are substantially extended and enhanced in several ways.

3 Mathematical background

This section provides the reader with the mathematical background needed to follow the paper. The interested reader is kindly referred to [3, 6, 34] for further details.

3.1 Contractive maps and iterated function systems

Let \((X,d)\) be a metric space, where X is a non-empty set and d a distance defined on X. A contractive map \(\varphi\) on \((X,d)\) is a function \(\varphi :X \rightarrow X\) such that there is a real number \(0 \le k <1\) holding: \(d(\varphi (x),\varphi (y)) \le k.d(x,y)\), \(\forall x,y \in X\). The Banach fixed-point theorem states that every contractive map \(\varphi\) on a non-empty complete metric space \((X,d)\) has a unique fixed point, denoted as p. Moreover, given any \(x \in X\), the sequence \(x, \varphi (x), \varphi (\varphi (x)),\dots\) obtained by iterative composition of \(\varphi\) with itself converges to the fixed point. In other words, if \(x_{n+1}=\varphi (x_n)\), \(\forall n \in {\mathbb {N}}\), then \(\lim \nolimits _{n\rightarrow \infty } x_n=p\).
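The fixed-point iteration of the Banach theorem can be sketched in a few lines of Python (an illustrative example, not part of the paper's method); the contraction below has factor k = 0.5 and fixed point p = 2:

```python
# Illustrative sketch: Banach fixed-point iteration for a 1D contraction.
# phi(x) = 0.5*x + 1 is contractive with k = 0.5; its fixed point is p = 2.
def iterate_to_fixed_point(phi, x0, tol=1e-12, max_iter=10_000):
    x = x0
    for _ in range(max_iter):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x

p = iterate_to_fixed_point(lambda x: 0.5 * x + 1.0, x0=100.0)
# p is (approximately) 2, regardless of the starting point x0
```

Any starting point yields the same limit, as guaranteed by the theorem.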

Let \(\mathcal{M}=(X,d)\) be a complete metric space. An iterated function system (IFS) is a finite set \(\{\phi _i\}_{i=1,\dots ,N}\) of contractive maps \(\phi _i: X \rightarrow X\) defined on \(\mathcal{M}\). Generally, the IFS will be denoted as \({{\mathcal{W}}}=\{X; \phi _1,\dots ,\phi _N\}\). In this paper, we will focus on 2D bitmap images, so we consider onwards the complete metric space \(({\mathbb {R}}^2,d_2)\), where \(d_2\) denotes the Euclidean distance. Therefore, the contractive affine maps \(\phi _i\) are of the form:

$$\begin{aligned} \left[ \begin{array}{c} x^* \\ y^* \end{array} \right] = \phi _i \left[ \begin{array}{c} x \\ y \end{array} \right] = \left[ \begin{array}{cc} a_i & b_i \\ c_i & d_i \end{array} \right] \left[ \begin{array}{c} x \\ y \end{array} \right] + \left[ \begin{array}{c} e_i \\ f_i \end{array} \right] \end{aligned}$$
(1)

or equivalently, in matrix notation: \(\textbf{x}^* = \mathbf{\Phi }_i(\textbf{x}) = \textbf{A}_i.\textbf{x} + \textbf{b}_i\), where \(\textbf{b}_i\) is a 2D vector describing the translations, and \(\textbf{A}_i\) is a \(2\times 2\) matrix encoding other 2D transformations, such as scaling and rotation. Since the maps \(\phi _i\) are contractive, the matrix \(\textbf{A}_i\) has eigenvalues \(\lambda _1^i,\lambda _2^i\) such that \(|\lambda _j^i| < 1\), \(j=1,2\). Furthermore, the contractivity factor of the mapping \(\phi _i\), denoted as \(s_i\), satisfies \(s_i=|\det (\textbf{A}_i)|<1\), thus making the objects smaller after transformation.

From Eq. (1), we can see that any contractive affine map \(\phi _i\) is uniquely defined by the set of parameters \((a_i,b_i,c_i,d_i,e_i,f_i)\). Furthermore, any IFS, \({{\mathcal{W}}}\), is fully characterized by the collection of parameters \(\{(a_i,b_i,c_i,d_i,e_i,f_i)\}_{i=1,\dots ,N}\), called the IFS code of \({{\mathcal{W}}}\).
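As a minimal illustration (the function and variable names below are our own, not the paper's), an IFS code can be stored as a list of such parameter tuples and validated with the contractivity criterion stated above:

```python
# Illustrative sketch: an IFS code as a list of parameter tuples
# (a, b, c, d, e, f), with the contractivity check |det(A)| < 1 from the text.
def apply_map(params, x, y):
    a, b, c, d, e, f = params
    return a * x + b * y + e, c * x + d * y + f

def is_valid_ifs_code(code):
    # every map must satisfy s_i = |a_i*d_i - b_i*c_i| < 1
    return all(abs(a * d - b * c) < 1.0 for a, b, c, d, e, f in code)

# the classical 3-map IFS of the Sierpinski triangle, each with s_i = 0.25
sierpinski = [(0.5, 0, 0, 0.5, 0.0, 0.0),
              (0.5, 0, 0, 0.5, 0.5, 0.0),
              (0.5, 0, 0, 0.5, 0.25, 0.5)]
```

Here the IFS code of the Sierpinski triangle serves as a quick sanity check of the representation.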

3.2 Hutchinson operator and IFS attractor

Let \({{\mathcal{C}}_S}(X)\) denote the set of all compact (i.e., closed and bounded) subsets of \(X \subset {\mathbb {R}}^2\). Note that the bitmap images are compact subsets of \({\mathbb {R}}^2\). The Hausdorff metric h on \({{\mathcal{C}}_S}(X)\) is defined as:

$$\begin{aligned} h(\mathcal{R},\mathcal{S})=\max \left\{ d_h(\mathcal{R},\mathcal{S}),d_h(\mathcal{S},\mathcal{R})\right\} \end{aligned}$$
(2)

where \(d_h(\mathcal{R},\mathcal{S})=\underset{x \in \mathcal{R}}{\max } \; \underset{y \in \mathcal{S}}{\min } \; d_2(x,y).\)
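For finite point sets (such as the active pixels of a bitmap), the Hausdorff metric of Eq. (2) can be computed directly from its definition; the following Python fragment is an illustrative sketch only:

```python
# Illustrative sketch: the Hausdorff distance of Eq. (2) for finite point
# sets, using the Euclidean distance d_2.
from math import dist  # Euclidean distance between two points (Python 3.8+)

def d_h(R, S):
    # directed Hausdorff distance: max over x in R of min over y in S of d_2(x, y)
    return max(min(dist(x, y) for y in S) for x in R)

def hausdorff(R, S):
    # symmetric Hausdorff metric h(R, S) of Eq. (2)
    return max(d_h(R, S), d_h(S, R))
```

Note that the directed distance d_h is not symmetric, which is why Eq. (2) takes the maximum of both directions.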

Because \(({\mathbb {R}}^2,d_2)\) is a complete metric space, it can be proved that \(({{\mathcal{C}}_S}(X),h)\) is also a complete metric space [3]. The Hutchinson operator, \({{\mathcal{H}}}\), defining the joint action of all contractive maps \(\phi _k\) on \(\mathcal{C_S}(X)\), is given by:

$$\begin{aligned} {{\mathcal{H}}}(\mathcal{B})=\bigcup \limits _{k=1}^{N} \phi _{k}(\mathcal{B}) \quad \quad \forall \mathcal{B} \in {{\mathcal{C}}_S}(X) \end{aligned}$$
(3)

Since all the \(\phi _k\) are contractions in \(({\mathbb {R}}^2,d_2)\), \({{\mathcal{H}}}\) is also a contraction on \({{\mathcal{C}}_S}(X)\) with the induced Hausdorff metric [3, 4]. Then, the Banach fixed-point theorem states that \({{\mathcal{H}}}\) has a unique fixed point, \({{\mathcal{H}}}(\mathcal{A})={{\mathcal{A}}}\), called the attractor of the IFS. Interestingly, the attractor set \({{\mathcal{A}}}\) is a fractal image, regardless of the initial set \(\mathcal{B}\) in Eq. (3).

There are several approaches to render the attractor of an IFS [35]. A simple and popular procedure is the probabilistic algorithm, where each contractive map \(\phi _k\) is associated with a probability \(\omega _k >0\), such that \(\sum \nolimits _{k=1}^{N} \omega _{k} = 1\). The method proceeds iteratively: starting with a compact set \({{\mathcal{S}}}_0 \in {{\mathcal{C}}_S}(X)\), one of the maps of the IFS is randomly chosen with probability \(\omega _k\) at iteration j to obtain \(\mathcal{S}_j=\phi _k({{\mathcal{S}}}_{j-1})\). The process is then repeated for the resulting set, and so on. It can be proved that \(\overline{\{\mathcal{S}_j\}}_{j} ={\mathcal{A}}\), meaning that this iterative process can render the attractor [3, 10]. Theoretically, the initial set \({{\mathcal{S}}}_0\) can be any compact set, but since the \(\phi _k\) are contractive, the size of \({{\mathcal{S}}}_0\) decreases over the iterations, eventually collapsing to a point. Consequently, it is advisable to take \({{\mathcal{S}}}_0\) as a single point for computational efficiency.

There are several approaches to compute the probabilities \(\omega _k\) [1, 36]. The most popular method, usually called Barnsley’s algorithm (also, the chaos game), considers a probability value \(\omega _k\) related to the area filled by the contractive map \(\phi _k\), which is proportional to its contractivity factor, \(s_k =|\det (\textbf{A}_k)|=|a_k.d_k-b_k.c_k|\). Then, the method computes \(\omega _{i}\) as:

$$\begin{aligned} \omega _{i}={s_{i}\over \sum \nolimits _{k=1}^N s_{k}}, \quad i=1,\dots ,N. \end{aligned}$$
(4)

In this paper, we follow this classical choice. Other methods are also feasible, even leading to more efficient values [10]. However, this problem is out of the scope of this work and is not addressed here.
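As an illustration of this rendering procedure, the chaos game can be sketched in Python using the well-known IFS code of Barnsley's fern. Since the first map of the fern is degenerate (s_1 = 0), Eq. (4) would assign it zero weight; the sketch therefore uses the commonly published probabilities, which give that map a small floor value:

```python
# Illustrative sketch: the probabilistic rendering algorithm (chaos game),
# using the classical IFS code of Barnsley's fern with its commonly published
# probabilities.
import random

FERN = [((0.00,  0.00,  0.00, 0.16, 0.0, 0.00), 0.01),
        ((0.85,  0.04, -0.04, 0.85, 0.0, 1.60), 0.85),
        ((0.20, -0.26,  0.23, 0.22, 0.0, 1.60), 0.07),
        ((-0.15, 0.28,  0.26, 0.24, 0.0, 0.44), 0.07)]

def chaos_game(ifs, n_points, seed=0):
    rng = random.Random(seed)
    maps, weights = zip(*ifs)
    x, y, points = 0.0, 0.0, []
    for _ in range(n_points):
        # pick one contractive map with probability omega_k and apply it
        a, b, c, d, e, f = rng.choices(maps, weights=weights)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points

pts = chaos_game(FERN, 100_000)  # the points approximate the fern attractor
```

Plotting the returned points reproduces the fern image used as the running example in Sect. 4.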

3.3 The Collage theorem and the IFS reconstruction optimization problem

An important result in fractal theory is the collage theorem [9]. Given an IFS \({{\mathcal{W}}}\) on \(({\mathbb {R}}^2,d_2)\) with contractivity factor \(s=\max \limits _{i} s_i\), any \({{\mathcal{I}}}\in {{\mathcal{C}}_S}({\mathbb {R}}^2)\), and a threshold \(\varepsilon \ge 0\), if \(H\left( {{\mathcal{I}}},\bigcup \nolimits _{i=1}^{N} \phi _{i}({{\mathcal{I}}})\right) \le \varepsilon\), then \(H\left( {{\mathcal{I}}},{{\mathcal{A}}}\right) \le {{{1} \over {1-s}}} H\left( \mathcal{I},\bigcup \nolimits _{i=1}^{N} \phi _{i}({{\mathcal{I}}})\right) \le {{\varepsilon } \over {1-s}}\), with H(., .) being the induced Hausdorff metric on \({{\mathcal{C}}_S}({\mathbb {R}}^2)\), and \({{\mathcal{A}}}\) the attractor of \({{\mathcal{W}}}\). In practical terms, this means that given any digital image \({{\mathcal{I}}}\), there exists an IFS whose attractor has a graphical representation \({{\mathcal{A}}}\) that approximates \({{\mathcal{I}}}\) accurately, according to the Hausdorff metric.

The challenging question is how to compute this IFS \({{\mathcal{W}}}\), which is the problem addressed in this paper. To reconstruct a digital image \(\mathcal{I}\), we need to compute the collection of parameters of an IFS (i.e., its IFS code) such that the attractor of the IFS provides an accurate approximation of \({{\mathcal{I}}}\), according to a given metric \(\mathbf {\Delta }\). Thus, this problem can be formulated as the following optimization problem:

$$\begin{aligned} \underset{\{\textbf{A}_i,\textbf{b}_i\}_{i=1,\dots ,N}}{minimize}\; \left[ \mathbf {\Delta }\left( \mathcal{I},\bigcup \limits _{i=1}^{N} \phi _{i}(\mathcal{I})\right) \right] \end{aligned}$$
(5)

such that \(s_i=|det(\textbf{A}_i)|<1\) for all \(i=1,\dots ,N\).

It is worthwhile to remark that it is actually enough to approximate \({{\mathcal{I}}}\) by \({{\mathcal{H}}}({{\mathcal{B}}})\), where \({{\mathcal{B}}}\) is any initial image (note that the attractor of the IFS \({{\mathcal{W}}}\) is independent of the initial image \({{\mathcal{B}}}\)). Therefore, the optimization problem becomes:

$$\begin{aligned} \underset{\{\textbf{A}_i,\textbf{b}_i\}_{i=1,\dots ,N}}{minimize}\; \left[ \mathbf {\Delta }\left( \mathcal{I},\bigcup \limits _{i=1}^{N} \phi _{i}(\mathcal{B})\right) \right] \end{aligned}$$
(6)

This problem is extremely challenging, as it is continuous (since all free variables in \(\{\textbf{A}_i,\textbf{b}_i\}_{i=1,\dots ,N}\) are real-valued), constrained (because the corresponding functions \(\phi _k\) must all be contractive), and multimodal (there can be several global or local optima of the objective function \(\mathbf {\Delta }\)). In short, we have to solve a very difficult constrained multimodal continuous optimization problem, one that cannot be solved through classical mathematical optimization techniques. Several techniques have been proposed to tackle this issue, but only partial solutions have been reported in the literature so far (see Sect. 2 for details). The general problem remains open, and the community is still searching for more powerful methods to address it. Furthermore, the difficulty of this problem is exacerbated by three additional major factors: (1) the optimal number of contractive functions, N, is unknown in advance, so the method should typically include some procedure to determine it; (2) the problem can be high-dimensional for complex images, as they might require a large number of contractive maps for accurate reconstruction; (3) the optimization problem in Eq. (6) does not include any reference to color, so the formulation must be modified to add this new attribute. This paper proposes a new method to solve all these problems simultaneously, as discussed in the next section.
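To make the objective of Eq. (6) concrete, the sketch below evaluates one possible choice of the metric Delta for binary images: both point sets are rasterized onto a low-resolution grid and compared through the normalized symmetric pixel difference. This particular choice is our own illustrative assumption; the formulation above leaves the metric generic:

```python
# Illustrative sketch: one possible evaluation of the objective of Eq. (6),
# rasterizing both point sets and comparing the resulting binary bitmaps.
def rasterize(points, size=64):
    # map the unit square onto a size x size binary grid of occupied cells
    grid = set()
    for x, y in points:
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
            grid.add((min(int(x * size), size - 1), min(int(y * size), size - 1)))
    return grid

def hamming_error(target_pts, candidate_pts, size=64):
    # normalized symmetric-difference error: 0 means identical bitmaps
    t, c = rasterize(target_pts, size), rasterize(candidate_pts, size)
    return len(t ^ c) / float(size * size)
```

An optimizer would minimize this error over the IFS code parameters, subject to the contractivity constraint.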

4 The proposed method

The proposed method is presented in this section. Firstly, a brief overview of the method is described. Then, the different elements of the method are discussed individually in further detail.

4.1 Overview of the method

The input of our method is a fractal image, \({{\mathcal{I}}}\). It is assumed that \({{\mathcal{I}}}\) is given as a rectangular colored bitmap image of size \(P \times Q\) (measured in pixels) on the compact domain \(\mathbf {\Omega }=[a,b]\times [c,d] \subset {\mathbb {R}}^2\). Without loss of generality, we can consider the domain to be the unit square, that is, \(a=c=0\) and \(b=d=1\). To be more precise, the image is mathematically represented by a two-dimensional matrix \(\textbf{D}\) of size \(P \times Q\), where each matrix element \(\textbf{d}_{i,j}\) is a vector of three components, \(\textbf{d}_{i,j}=(R_{i,j},G_{i,j},B_{i,j})\), accounting, respectively, for the values of the red, green, and blue color channels of the corresponding pixel (i, j) of the image. According to the collage theorem in Sect. 3.3, there is an IFS \({{\mathcal{W}}}\) whose attractor approximates \({{\mathcal{I}}}\) accurately. Our goal is to reconstruct the input image \({{\mathcal{I}}}\) in both geometry and color using that IFS. Note that this task requires obtaining the optimal number of contractive functions, N, which was assumed to be known in our previous conference paper [7] but is automatically computed in this extended version. It also requires determining all relevant parameters of the IFS \({{\mathcal{W}}}\), that is, its IFS code. Finally, it also requires obtaining the color map of the input figure. This color subproblem can be solved by determining the color of each of the N contractive maps of the IFS.

Fig. 1 Workflow of the method. On the left, the different steps of the method are shown (from top to bottom). On the right, the techniques used for each step are indicated

Figure 1 shows the workflow of our method. Given an input fractal image, the procedure consists of four main steps:

  1. Firstly, color-based image clustering is carried out to identify the main regions in the image (see Sect. 4.2 for details).

  2. Then, the color histogram of each cluster is computed and its dominant color is extracted (see Sect. 4.3 for details).

  3. Thirdly, a memetic approach hybridizing genetic algorithms and particle swarm optimization is applied to determine the optimal number of contractive functions and the IFS code (via continuous optimization with particle swarm optimization), as well as the color map of each contractive function (via discrete optimization with genetic algorithms) (see Sect. 4.4 for details).

  4. Finally, local search refinement is applied to enhance the optimal solution found in the previous step (see Sect. 4.5 for details).

The four steps of the method are individually discussed in detail in the next sections. For a better description of the method, we consider an illustrative example of a colored fractal image. The image, called Barnsley’s fern (or just fern for short), was first proposed by Michael Barnsley in the 1980s. It is one of the most iconic fractal images and is often used to illustrate the ability of IFS to generate organic shapes of high complexity. Although it consists of only four contractive functions, the shape is challenging for geometric reconstruction due to several factors: the disparity of the areas covered by the contractive functions, positional rotations, and others. Therefore, it is a good benchmark to test the performance of the proposed method.

Fig. 2 Fractal colored image fern used in this paper in colored (left) and binary (right) versions

The fern image is displayed in Fig. 2 in two versions: full color on the left and the binary version on the right. The image has been generated with one million points following the rendering procedure explained in Sect. 3.2 and processed as a bitmap image of size \(600 \times 600\) pixels. The final image consists of 360,000 pixels, encoded as a square matrix of order 600. The components of the matrix are color vectors in the RGB color map for the colored image, and 0s and 1s for the binary image, corresponding to the pixels of the fractal and the pixels of the background, respectively. The number of active pixels (those in black in the binary version) is 80,240, giving a filling rate (i.e., the percentage of active pixels over the total size of the image) of 22.28%.

4.2 Color-based image clustering of the input image

The first step of our method is to perform color-based image clustering. The goal of this step is to identify the main regions (with respect to color) of the input image and their dominant colors. In this paper, we consider two classical methods for image clustering, for which several efficient implementations and computer libraries are available:

  1. Similar to the previous conference paper in [7], we apply the well-known k-means method, which is based on the minimization of within-cluster variances. In particular, k-means minimizes the sum of the squared distances of each data point to its cluster centroid. Thus, given a set of data vectors \(\textbf{x}_1,\textbf{x}_2,\dots ,\textbf{x}_n\), the algorithm builds k clusters \(\{\textbf{C}_1,\textbf{C}_2,\dots ,\textbf{C}_k\}\) as: \(\min \limits _{\mathbf {\Omega }} E(\bar{g}_i)=\min \limits _{\mathbf {\Omega }} \sum \nolimits _{i=1}^k \sum \nolimits _{\textbf{x}_j \in \textbf{C}_i} ||\textbf{x}_j-\bar{g}_i ||^2\), where \(\bar{g}_i\) denotes the centroid of cluster \(\textbf{C}_i\). The centroids are updated iteratively from the condition \({{{\partial E} \over {\partial \bar{g}_i}}=0}\), yielding: \(\bar{g}_i^{(t+1)}= {{{1} \over {|\textbf{C}_i^{(t)}|}} \sum \nolimits _{\textbf{x}_j \in \textbf{C}_i^{(t)}} \textbf{x}_j}\), where t indicates the iteration.

  2. In addition, we consider a modification of the classical Otsu algorithm for automatic binary image thresholding [37], also based on the minimization of the intra-cluster intensity variance. Since this paper deals with colored images, we use the extension of the Otsu algorithm to multi-level thresholding described in [38], which applies a recursive procedure accessing a pre-computed between-class variance through a look-up table.
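For illustration, the k-means iteration described above can be sketched in plain Python (production code would rely on one of the optimized library implementations mentioned earlier):

```python
# Illustrative sketch: Lloyd's iteration for k-means on RGB pixel vectors.
import random

def kmeans(pixels, k, iters=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(pixels, k)  # random initial centroids
    for _ in range(iters):
        # assignment step: each pixel goes to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in pixels:
            i = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))
            clusters[i].append(p)
        # update step: each centroid becomes the mean of its cluster
        centroids = [tuple(sum(ch) / len(c) for ch in zip(*c)) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters
```

Each pass performs the assignment step and the centroid update derived from the condition on the partial derivatives of E.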

Fig. 3 k-means clustering results of the image in Fig. 2(left) for different numbers of clusters (left-right, top-bottom) ranging from 2 to 9 (including background)

These methods are applied to perform image clustering for different values of the number of clusters. The resulting clusters are stored to be subsequently used for comparative purposes between the initial fractal image and the reconstructed one throughout the iterative optimization during the next steps of our method.

It is interesting to remark that both algorithms, k-means and the Otsu algorithm, are based on the same criterion, namely the minimization of the within-class variance. Their main difference is that the Otsu method is an exhaustive global search algorithm, while k-means searches for the optimal solution locally. Figure 3 shows the image clustering results of the k-means method on the input colored fractal image fern in Fig. 2 for different numbers of clusters, ranging from 2 to 9 (including background). The results for image clustering with Otsu multi-level thresholding are visually very similar, so they are omitted here to keep the paper at a reasonable length.

4.3 Dominant color determination for each cluster

Once the image clustering step is carried out, the color histogram of each cluster is computed in order to determine its dominant color. To this end, the pixels of each cluster are identified, and their intensity values, described as triplets in RGB code, are stored and used to compute the histogram of the cluster. Then, the highest peak value in the histogram is selected as the dominant color of the cluster. The output of this step is a collection of vectors \(\{{{\mathcal{R}}}_j,{{\mathcal{G}}}_j,{{\mathcal{B}}}_j\}_{j=1,\dots ,N}\), corresponding to the dominant colors (in RGB color code) of the N clusters. This step provides an initial ground truth for the region colors of the original image, used for comparison with the potential solutions of our method throughout the iterations.
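This step amounts to a mode computation over the RGB triplets of each cluster; a minimal Python sketch (illustrative only, with our own function name):

```python
# Illustrative sketch: the dominant color of a cluster as the highest peak of
# its color histogram, i.e., the most frequent RGB triplet.
from collections import Counter

def dominant_color(cluster_pixels):
    # cluster_pixels: iterable of (R, G, B) triplets belonging to one cluster
    return Counter(cluster_pixels).most_common(1)[0][0]
```

Applying this function to every cluster yields the collection of dominant-color vectors described above.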

4.4 IFS code optimization

This is the most critical step of the method. It applies a hybrid approach combining genetic algorithms and particle swarm optimization to determine the optimal number of contractive functions and the IFS code (through PSO) and the color map of each contractive function (through GA).

4.4.1 Genetic algorithms

Genetic algorithms (GAs) are metaheuristic search procedures based on the Darwinian principles of natural evolution and natural selection. They became popular thanks to the seminal work carried out in the 1970s by J. Holland [39]. Since then, they have been widely used in optimization problems, especially in combinatorial optimization [40].

The general structure of the genetic algorithm is shown in Table 1. Genetic algorithms are population-based methods that proceed iteratively by generations: at each generation g, a GA considers a population of individuals, where each particular individual, accounting for a potential solution, is represented using a genetic representation. In this work, we use a discrete representation where the individuals \(\textbf{D}_k\) are vectors of \(N_i\) components (as many as the number of contractive functions of the IFS) of the form: \(\textbf{D}_k=(\textbf{D}_k^1,\textbf{D}_k^2,\dots ,\textbf{D}_k^{N_i})\) where each \(\textbf{D}_k^j=(R_k^j,G_k^j,B_k^j)\), and the \(R_k^j,G_k^j,B_k^j\) components, accounting for the red, green, and blue color channels, take integer values within the range \(0-255\).

Generally, the initial population is selected randomly, but some knowledge about the specific problem can also be embedded into the initial population with the aim of improving the convergence speed. Since in our case no prior knowledge about the problem is assumed, the individuals of the genetic algorithm are initialized with random values taken from an integer uniform distribution on the range \(0-255\) for each color component.
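The initialization just described can be sketched as follows (an illustrative fragment; the function name is our own):

```python
# Illustrative sketch: random initialization of the GA population, where each
# individual holds one RGB triplet per contractive function (N_i of them).
import random

def init_population(pop_size, n_maps, seed=0):
    rng = random.Random(seed)
    return [[(rng.randint(0, 255), rng.randint(0, 255), rng.randint(0, 255))
             for _ in range(n_maps)]
            for _ in range(pop_size)]

population = init_population(pop_size=100, n_maps=4)  # N^GA = 100 individuals
```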

Table 1 General structure of the genetic algorithm (modified from [41])

A critical aspect in many GA applications is the size of this initial population, denoted henceforth as \(N^{GA}\). If it is too small, the algorithm may converge too quickly; if it is too large, the algorithm may waste valuable computational resources. The population size is often chosen to be constant, although GAs with a dynamic population size are also possible [42]. In this paper, we consider a fixed population of \(N^{GA}=100\) individuals.

Each potential solution must be evaluated using a fitness function; the result of this evaluation is typically understood as a measure of the adaptation of each individual, which is, in turn, a measure of the goodness of the solution for the given optimization problem. For the color optimization subproblem addressed with genetic algorithms in this work, we consider the normalized RGB color difference error, \({{\mathcal{C}}}\), given by:

$$\begin{aligned} {{\mathcal{C}}} ({{\mathcal{I}}}_o,{{\mathcal{I}}}_r)= {{1} \over {3\, N_l}} {\left[ \sum \limits _{i=1}^{N_l} {\bigg ( {{|{{\mathcal{R}}}_o^i-{{\mathcal{R}}}_r^i|+|{{\mathcal{G}}}_o^i-{{\mathcal{G}}}_r^i|+|\mathcal{B}_o^i-{{\mathcal{B}}}_r^i|} \over {255}} \bigg )} \right] } \end{aligned}$$
(7)

where \(({{\mathcal{R}}}_o^i,{{\mathcal{G}}}_o^i,{{\mathcal{B}}}_o^i)\) and \((\mathcal{R}_r^i,{{\mathcal{G}}}_r^i,{{\mathcal{B}}}_r^i)\) are the RGB color channels of the contractive functions of the original and the reconstructed images, respectively, and \(N_l\) indicates the number of contractive functions of the optimal solution found by the method.
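Equation (7) can be computed directly from the two color lists; a Python sketch (the function name and list-of-triplets input format are ours):

```python
def color_error(colors_o, colors_r):
    """Normalized RGB color difference error C of Eq. (7).
    `colors_o` and `colors_r` hold one (R, G, B) triplet per
    contractive function of the original and reconstructed images;
    the result lies in [0, 1]."""
    n_l = len(colors_o)
    total = sum((abs(ro - rr) + abs(go - gr) + abs(bo - br)) / 255.0
                for (ro, go, bo), (rr, gr, br) in zip(colors_o, colors_r))
    return total / (3 * n_l)
```

The normalization by \(3 N_l \cdot 255\) guarantees that identical color maps score 0 and maximally different ones score 1.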

After the initialization step, the evolution process in genetic algorithms works iteratively: at each iteration, new populations are obtained through a selection process based on the Darwinian concept of individual adaptation. This selection process is repeated over the generations, and the selected individuals form the new population. In this work, the selection operator is implemented as a biased roulette wheel with slots weighted proportionally to the individual fitness values. Then, the genetic operators (crossover and mutation) are applied to the new population [40, 43]. The individuals exhibiting the best adaptation measure increase their chance of reproducing and generating new individuals through crossover and mutation. These genetic operators are driven by some parameters, as listed in Table 2.
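A minimal sketch of the biased roulette wheel (a Python illustration; it assumes a maximization convention with non-negative fitness values, so an error measure such as Eq. (7) would first be converted to a score, e.g. 1 minus the error):

```python
import random

def roulette_select(population, fitness_values, rng=random):
    """Biased roulette wheel: each individual occupies a slot of the
    wheel proportional to its (non-negative) fitness value."""
    total = sum(fitness_values)
    spin = rng.uniform(0.0, total)
    cumulative = 0.0
    for individual, fit in zip(population, fitness_values):
        cumulative += fit
        if spin <= cumulative:
            return individual
    return population[-1]  # guard against floating-point round-off
```

Individuals with zero fitness get a zero-width slot and are (almost surely) never selected.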

Table 2 Parameters of the genetic algorithm and values used in this paper

The crossover operator generates two new individuals (offspring) by combining different parts of two selected individuals of the population. In this work, we consider a two-point crossover operator that, with probability \(p_c\), selects two random crossover points and swaps the segment between them in the two parent individuals to produce two new offspring. A good performance of the GA requires a high crossover probability, in order to promote the diversity of the population. This enhances the exploratory ability of the population, covering larger regions of the search space. We noticed that our crossover procedure can introduce strong perturbations to the individuals of the population even when they are close to convergence. To alleviate this effect, we store not only the new offspring but also their parents, so that the population is temporarily enlarged. Then, the best individuals (as many as the GA population size, \(N^{GA}\)) are selected based on their fitness, while the rest of the population is discarded and, hence, not preserved for the next generation. This elitist strategy attenuates the effect of the crossover operator, while still promoting population diversity and the ability to explore regions of the search space far from the current best solutions.
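The two-point crossover and the elitist truncation can be sketched as follows (a Python illustration with a list-based encoding; the function names are ours, and fitness is treated as an error to be minimized):

```python
import random

def two_point_crossover(parent1, parent2, rng=random):
    """Pick two random cut points and swap the segment between them,
    producing two offspring."""
    i, j = sorted(rng.sample(range(len(parent1) + 1), 2))
    child1 = parent1[:i] + parent2[i:j] + parent1[j:]
    child2 = parent2[:i] + parent1[i:j] + parent2[j:]
    return child1, child2

def elitist_truncation(parents, offspring, fitness, size):
    """Keep the `size` best individuals from the enlarged population
    of parents plus offspring (smaller fitness = better here)."""
    return sorted(parents + offspring, key=fitness)[:size]
```

Note that crossover only recombines existing genes: the multiset of component values in the two offspring equals that of the two parents.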

Finally, a mutation operator is applied. It generates, with a given probability \(p_m\), a new individual by a small change in a single individual. The mutation operator guarantees that the whole search space has a nonzero probability of being explored, further increasing the diversity of the population. A good GA performance requires a low mutation probability (often taken inversely proportional to the population size). In our work, the mutation operator picks a random component of the individual, \(x_j\), and modifies it as \(x_j \leftarrow x_j \pm \, m_s\), where \(m_s\) is called the mutation strength and the positive or negative sign is chosen randomly with equal probability.
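A sketch of this mutation operator (Python illustration; clipping the result to the valid RGB range 0–255 is our own assumption, as the paper does not state how out-of-range values are handled):

```python
import random

def mutate(individual, p_m, m_s, rng=random):
    """With probability p_m, shift one random component by +/- m_s
    (the mutation strength). Clipping to 0-255 is our assumption
    for the RGB encoding."""
    result = list(individual)
    if rng.random() < p_m:
        j = rng.randrange(len(result))
        sign = 1 if rng.random() < 0.5 else -1
        result[j] = min(255, max(0, result[j] + sign * m_s))
    return result
```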

This procedure is repeated through successive generations until a termination condition is reached. Common termination criteria are: (1) a solution is found whose fitness falls below a given threshold; (2) a fixed number of generations is reached; or (3) successive iterations no longer produce better results. Depending on the given problem, other criteria can also be set up. In our case, the termination criterion is that no further improvement is attained after 10 consecutive generations.

4.4.2 Particle swarm optimization

Particle Swarm Optimization (PSO) is a highly popular bio-inspired metaheuristic technique for real-valued optimization problems. The original PSO algorithm was first proposed by Kennedy and Eberhart in 1995 [44]. Many variants of this original procedure have since been proposed [45]. The interested reader is referred to [46] for further details. See also [47] for a nice computational approach to the PSO and its variants.

In the PSO algorithm, an initial population of potential solutions (called particles) is distributed (uniformly, in the general case) over the search space; the particles are then given an initial velocity and the capacity to communicate their best positions to each other and to adjust their own position and velocity based on these optimal positions. In this work, we consider a population size \(N^{PSO}\) of 100 particles. The particles \({{\mathcal{W}}}_i\), representing the IFS, follow a continuous representation given by vectors of \(N_i\) components (as many as the number of contractive functions of the IFS), of the form:

$$\begin{aligned} {{\mathcal{W}}}_i=(\phi _i^1,\phi _i^2,\dots ,\phi _i^{N_i}) \end{aligned}$$
(8)

where each \(\phi _i^j=(a_i^j,b_i^j,c_i^j,d_i^j,e_i^j,f_i^j)\) collects the IFS parameters of the corresponding contractive function \(\phi _i^j\). Without loss of generality, we can assume that the parameters \(a_i^j,b_i^j,c_i^j,d_i^j,e_i^j,f_i^j\) take real values on the interval (0, 1]. However, since the functions \(\phi _i^j\) must be contractive, the following constraints must hold:

$$\begin{aligned} \left\{ \begin{array}{l} \left( a_i^j\right) ^2 + \left( c_i^j\right) ^2<1 \\ \left( b_i^j\right) ^2 + \left( d_i^j\right) ^2<1 \\ \left( a_i^j\right) ^2 + \left( b_i^j\right) ^2 + \left( c_i^j\right) ^2+ \left( d_i^j\right) ^2 <1+\left( a_i^j\, d_i^j-b_i^j\, c_i^j \right) ^2 \\ \end{array} \right. \end{aligned}$$
(9)

These conditions are to be checked at every iteration step t.
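The three conditions of Eq. (9) involve only the linear part \((a, b, c, d)\) of each map and can be checked directly; a Python sketch (the function name is ours):

```python
def is_contractive(a, b, c, d):
    """Check the contractivity conditions of Eq. (9) for one affine
    map; the translation parameters e, f play no role here."""
    det = a * d - b * c
    return (a * a + c * c < 1.0
            and b * b + d * d < 1.0
            and a * a + b * b + c * c + d * d < 1.0 + det * det)
```

For instance, the uniform scaling by 1/2 (a = d = 0.5, b = c = 0) passes the test, while the identity map (a = d = 1, b = c = 0) fails it.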

Note that the length of each individual \({{\mathcal{W}}}_i\) is \(N_i\), which can differ from one IFS to another. In other words, unlike in our previous conference paper, we do not fix an initial value for the number of contractive functions of the IFS. Instead, \(N_i\) is a dynamic variable that changes over the iterations according to the best fitness value at each iteration. In this way, the method is able to compute automatically the optimal number of contractive functions for any input fractal image. In this work, \(N_i\) is initialized randomly between 2 and 10, although larger values are also supported in our approach without further modification.

As in genetic algorithms, the dynamics of the population unfolds along successive iterations. The evolution of particle i is determined by two memory factors: the memory of its own best position, and knowledge of the global best. For the first factor, we store the coordinates \(P_i^b\) of the best solution (according to a given fitness function f) the particle has achieved so far, along with the corresponding fitness value, \(f(P_i^b)\). For the second factor, we also store the position, \(P_g^b\), and the best fitness value, \(f(P_g^b)\), among all the particles in the population since the initial iteration. This information is used to modify the position and velocity of each particle i according to the following evolution equations:

$$\begin{aligned} V_i(t+1)=w\, V_i(t)+\gamma _1 R_1\, [P_g^b(t)-P_i(t)] + \gamma _2 R_2\, [P_i^b(t)-P_i(t)] \end{aligned}$$
(10)
$$\begin{aligned} P_i(t+1)=P_i(t)+V_i(t+1) \end{aligned}$$
(11)

where \(P_i(t)\) and \(V_i(t)\) are the position and the velocity of particle i at time t, respectively, w is the inertia weight, which defines how much the old velocity affects the new one, and the coefficients \(\gamma _1\) and \(\gamma _2\) are constants called learning factors, defining the influence of \(P_g^b\) and \(P_i^b\). To be more precise, \(\gamma _1\) is a weight accounting for the “social” or “global” component of the swarm, while \(\gamma _2\) represents the “cognitive” or “personal” component, accounting for the memory of the individual particle i over time. Two random numbers, \(R_1\) and \(R_2\), drawn from a uniform distribution on the unit interval [0, 1], are included to diversify the search.
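A component-wise sketch of the update of Eqs. (10)–(11) for a single particle (a Python illustration; drawing \(R_1, R_2\) once per component rather than once per particle is our own implementation choice):

```python
import random

def pso_step(pos, vel, p_best, g_best, w, gamma1, gamma2, rng=random):
    """One PSO update: new velocity from Eq. (10), then the position
    update of Eq. (11) using the new velocity."""
    new_vel = [w * v
               + gamma1 * rng.random() * (gb - x)
               + gamma2 * rng.random() * (pb - x)
               for x, v, pb, gb in zip(pos, vel, p_best, g_best)]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel
```

With w = 0 and both learning factors set to 0, the velocity vanishes and the particle stays put; with w = 1 and zero learning factors, the particle drifts with its current velocity.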

The PSO is used in this work to compute the optimal number of contractive functions and their IFS code. To this purpose, the fitness function computes the distance between the attractor of the reconstructed IFS and the input image. Since the input consists of 2D bitmap images given as a collection of pixels, the most natural choice would be the Hausdorff distance, given by Eq. (2). However, this metric is very computationally expensive. Even worse, it may become unreliable for this problem, as it might identify as similar images that are quite different in terms of their geometry. For these reasons, other similarity functions have been proposed in the literature [25, 30, 32]. In this paper, we consider two of them, described in the next paragraphs.

The first one is given by the Hamming distance, which was already used in our previous conference paper [7]. To compute this distance, we consider the binarized versions of the original and the reconstructed images, denoted as \({{\mathcal{I}}}_o^B\) and \({{\mathcal{I}}}_r^B\), respectively, with the same size \(p \times q\), and compute the Hamming similarity error, \({{\mathcal{S}}}_{{\mathcal{H}}}\), given by:

$$\begin{aligned} {{\mathcal{S}}}_{{\mathcal{H}}} ({{\mathcal{I}}}_o^B,{{\mathcal{I}}}_r^B)= {{1} \over {p \times q}} {\sum \limits _{x=1}^p \sum \limits _{y=1}^q \left| {{\mathcal{I}}}_o^B(x,y) - {{\mathcal{I}}}_r^B(x,y) \right| } \end{aligned}$$
(12)

with \({{\mathcal{I}}}_j^B(x,y)\) indicating the binary value of the pixel (x, y) of \({{\mathcal{I}}}_j^B\). Note that \({{\mathcal{S}}}_{{\mathcal{H}}} (\mathcal{I}_o^B,{{\mathcal{I}}}_r^B)\) computes the fraction of mismatched pixels between both images. Therefore, values of \({{\mathcal{S}}}_{{\mathcal{H}}} (\mathcal{I}_o^B,{{\mathcal{I}}}_r^B)\) close to 0 mean that the images are very similar, indicating a very good geometrical reconstruction, while values close to 1 mean that the images are very different, a clear indication of a poor reconstruction.
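Equation (12) can be sketched as follows for binary images stored as nested 0/1 lists (a Python illustration; the function name is ours):

```python
def hamming_error(img_o, img_r):
    """Hamming similarity error of Eq. (12): the fraction of
    mismatched pixels between two binary images of size p x q."""
    p, q = len(img_o), len(img_o[0])
    mismatches = sum(abs(img_o[x][y] - img_r[x][y])
                     for x in range(p) for y in range(q))
    return mismatches / (p * q)
```

Identical images score 0, fully complementary images score 1.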

The second fitness function is the intersection similarity function, \({{\mathcal{S}}}_{\#_p}\), given by:

$$\begin{aligned} {{\mathcal{S}}}_{\#_p} ({{\mathcal{I}}}_o^B,{{\mathcal{I}}}_r^B)= {{\#_p({{\mathcal{I}}}_o^B \cap {{\mathcal{I}}}_r^B)} \over {\#_p({{\mathcal{I}}}_o^B \cup {{\mathcal{I}}}_r^B)}} \end{aligned}$$
(13)

where \(\#_p(.)\) represents the number of active pixels of the image, that is, those displayed in black in the binarized version of the image; see, for instance, Fig. 2(right). In this case, values close to 1 indicate a very good matching, with the limit value 1 corresponding to a perfect match between the union and the intersection of both images. The opposite applies for values approaching 0.
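A sketch of Eq. (13) for the same nested 0/1 image representation (Python illustration; returning 1.0 when both images are empty is our own convention, since the paper does not discuss that degenerate case):

```python
def intersection_similarity(img_o, img_r):
    """Intersection similarity of Eq. (13): active (value 1) pixels
    in the intersection over active pixels in the union."""
    inter = union = 0
    for row_o, row_r in zip(img_o, img_r):
        for po, pr in zip(row_o, row_r):
            inter += po & pr
            union += po | pr
    return inter / union if union else 1.0  # empty-union convention (ours)
```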

Table 3 Parameters of the particle swarm optimization method and values used in this paper

The PSO also comes with some parameters, as shown in Table 3. The most critical ones are arguably the social and cognitive factors. However, in our trials we did not notice significant differences in the final results, only in the CPU times. We finally set both to 2.0, values that show a good behavior toward convergence according to some theoretical studies. Regarding the inertia weight, although it can be taken constant, some studies suggest that it is better to modify it dynamically over time [48]. Initially, a relatively high value is generally chosen, corresponding to a system with low viscosity, so that the swarm performs extensive exploration. The value is then gradually decreased to a small one, corresponding to a dissipative system where the swarm is better at homing in on local optima. This is also the strategy adopted in this paper: starting with a value \(w=0.9\), it is gradually decreased by 0.05 every 100 iterations.

This PSO algorithm is sketched in the pseudocode shown below. The procedure is repeated iteratively, and the solutions are evaluated at each iteration according to the similarity function given by Eq. (12) until a termination condition is reached. Once the method ends, the best particle of the swarm at the last iteration is taken as the best solution of the optimization problem.

figure a

4.4.3 Hybridization of GA and PSO

Despite their remarkable features, genetic algorithms and particle swarm optimization also have some important limitations. For instance, PSO suffers from premature convergence (particles can get stuck in local minima or confined within a poor region of the search space). In turn, genetic algorithms suffer from memory loss (whenever an individual is not selected, its information is lost). Both are also sensitive to parameter tuning. In this context, the underlying idea of hybridization is to take advantage of their individual strengths while simultaneously overcoming their main limitations. In fact, hybrid evolutionary systems based on GA and PSO have been extensively used, with a number of papers showing that they outperform their individual components on several optimization problems (a brief discussion of several hybrid approaches can be found in [50] and references therein).

In this paper, GA and PSO are used to deal with the subproblems of dominant color optimization and IFS parameter optimization, respectively. Note that the color problem is discrete in nature, leading to a discrete optimization problem, for which GA is a very suitable approach. By contrast, the IFS parameters are real-valued, so their optimization is better suited to a continuous optimizer such as PSO. There is another important reason to consider hybrid schemes in this paper. The reconstruction of colored fractal images with IFS is affected by the fact that the computation of the IFS parameters and the color of each contractive function are strongly intertwined in a highly nonlinear way. Modifying the IFS parameters changes the contractive functions, which in turn modifies the attractor of the image and hence its colors. In order to match the new colors of the image again, new computations have to be carried out, which modify the IFS parameters, and so on.

In order to solve this problem, we consider a hybrid approach already used in the past to solve a different problem in another field but with similar characteristics of nonlinear interplay between sets of free variables [50]. The method, called IMCH GAPSO (standing for Iterative Mutually Coupled Hybrid GA-PSO), is based on the idea that GA and PSO are mutually coupled, in the sense that the output of one system is used as the input of the other and vice versa. This coupling is then repeated iteratively until a termination criterion is attained. In particular, the GA is applied to optimize the color values of the contractive functions of the IFS, as explained in Sect. 4.4.1. Once a value for them is obtained, this input is injected into the PSO to determine the optimal IFS parameters with such colors, as described in Sect. 4.4.2. Once the PSO is executed, the fitness of the new particles along with the colors obtained from the GA is evaluated according to the global fitness function:

$$\begin{aligned} F({{\mathcal{I}}}_o,{{\mathcal{I}}}_r)= \alpha \, {{\mathcal{S}}}_{{\mathcal{H}}} (\mathcal{I}_o,{{\mathcal{I}}}_r) + (1-\alpha )\, [1-{{\mathcal{C}}} ({{\mathcal{I}}}_o,{{\mathcal{I}}}_r)] \end{aligned}$$
(14)

where \(\alpha\) is a scalar factor taking values on the interval (0, 1). Equation (14) is a convex combination of the fitness functions associated with the two subproblems of this paper. In this sense, it extends the global fitness function of our previous conference paper [7], where we took \(\alpha =0.5\). This time, we carried out several simulations for values of \(\alpha\) varying from 0.1 to 0.9 with step size 0.1. From them, we found empirically that \(\alpha =0.7\) provides the best results in terms of accuracy and CPU times, so this is the value selected in this paper. The corresponding solutions are then ranked according to their fitness values, and the process starts again with a new round of executions of the GA, then the PSO, and so on. This process is repeated for a given number of iterations, by which convergence has already been achieved. This number is set to 12,000 iterations in this paper.
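Given precomputed values of the Hamming error \({{\mathcal{S}}}_{{\mathcal{H}}}\) and the color error \({{\mathcal{C}}}\), the convex combination of Eq. (14) is a one-liner (Python sketch; the function name is ours):

```python
def global_fitness(s_h, c, alpha=0.7):
    """Global fitness of Eq. (14): a convex combination of the
    Hamming error s_h and the complemented color error (1 - c)."""
    return alpha * s_h + (1.0 - alpha) * (1.0 - c)
```

With the paper's value \(\alpha = 0.7\), a perfect geometric match (s_h = 0) combined with a perfect color match (c = 0) yields F = 0.3, driven entirely by the complemented color term.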

4.5 Solution refinement through local search

Once the best solution is obtained through the IMCH GAPSO method described in the previous section, it is combined with a local search procedure with the aim of improving its accuracy even further. In this paper, we consider two local search heuristics to intensify the search in the neighborhood of the global optimum. The first is the Luus–Jaakola local search procedure, a popular heuristic for nonlinear programming problems [51]. In this method, an initialization step is performed, where random uniform values are sampled for each component. These random values are added to the current values of the obtained solution to generate a new candidate solution, which replaces the current one if the fitness value improves. Otherwise, the sampling space is decreased multiplicatively by a factor, which can be assumed to be constant. This process is repeated iteratively. With each iteration, the neighborhood of the point shrinks, and the search is increasingly limited to a smaller area until a termination criterion is reached.
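The Luus–Jaakola loop can be sketched as follows (a Python illustration minimizing a generic objective f; the hyperparameter values are ours, not the paper's):

```python
import random

def luus_jaakola(f, x0, radius=1.0, shrink=0.95, iterations=200,
                 rng=random):
    """Luus-Jaakola sketch: sample a uniform perturbation of the
    current point, keep it on improvement, otherwise shrink the
    sampling radius multiplicatively."""
    x, fx = list(x0), f(x0)
    for _ in range(iterations):
        candidate = [xi + rng.uniform(-radius, radius) for xi in x]
        fc = f(candidate)
        if fc < fx:
            x, fx = candidate, fc
        else:
            radius *= shrink
    return x, fx
```

For example, applied to the sum-of-squares objective it steadily improves on the starting point as the sampling neighborhood contracts.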

The second local search procedure is MADS (Mesh Adaptive Direct Search). It is a direct search method originally proposed by Audet and Dennis in [52] for nonlinear optimization. A great advantage of this method is that it extends the family of search methods known as generalized pattern search by allowing local exploration in an asymptotically dense set of directions in the space of optimization variables. In other words, the local exploration of variables is no longer restricted to a finite number of directions, making MADS less restrictive for the exploration of the search space than classical pattern search methods. In this paper, we follow the implementation described in [53]. The reader is kindly referred there for further details.

The local search heuristic is run for 1,000 additional iterations during the local refinement step to further improve the quality of the solution. This scheme has proven to be very effective for the problem addressed in this paper, as will be discussed in the next section.

5 Computational experiments and results

This section presents the main results obtained for the IFS image reconstruction problem of the fern fractal image in Fig. 2. Firstly, the graphical results are discussed. Then, the numerical results are presented and analyzed.

5.1 Graphical results

Figures 4, 5, 6, 7 and 8 show the evolution of the best solution of the population over the 13,000 iterations of our method with step-size 250, starting from a random initial image. Each picture in those figures comprises two parts: on the left, the colored attractor of the best IFS is depicted, where the color of each contractive map is determined by the genetic algorithm and the geometry by the PSO; on the right, this colored attractor is combined with the target image (in black), superimposed on the attractor for an easy and prompt visualization of the difference between both images. The images are also part of two QuickTime videos, Video1.mov (length: 110 s; size: 10.5 MB) and Video2.mov (length: 158 s; size: 13.9 MB), submitted as accompanying material of this paper.

Fig. 4 (l-r, t-b) Evolution of the best solution for 0 to 2,250 iterations (step-size 250)

Fig. 5 (l-r, t-b) Evolution of the best solution for 2,500 to 4,750 iterations (step-size 250)

Fig. 6 (l-r, t-b) Evolution of the population best for 5,000 to 7,250 iterations (step-size 250)

Fig. 7 (l-r, t-b) Evolution of the population best for 7,500 to 9,750 iterations (step-size 250)

Fig. 8 (l-r, t-b) Evolution of the population best for 10,000 to 12,750 iterations (step-size 250)

The first image in Fig. 4(top-left) corresponds to one of the 100 random images generated for the initial population of the method. As clearly shown in this picture, the initial random image is extremely different from the input image. As shown by the sequence of images in Figs. 4, 5, 6, 7 and 8, the application of our method brings the best solution closer to the target image over the iterations, until a very good approximation of the input image is reached at the last iterations. This convergence process is not visually obvious at the earlier iterations, as the strong nonlinear interplay between the color and the geometry can yield solutions that might look worse than the previous ones from a visual point of view, either in terms of geometry or in terms of the intended color. On the other hand, we recall that the number of contractive maps of the IFS changes dynamically over the iterations. Since each contractive map is associated with one color, new colors can appear during the iterative process. During our simulations, we noticed that the number of contractive functions of the partial best solutions typically varies from 3 (the lowest value found in our simulations) to 10 (the largest one). For instance, the reader can see that a new blue color appears in the transition between the left and the right images of the second row of Fig. 4, indicating that a new contractive function has been added by the method. This new function disappears in the first row of Fig. 5, an indication that a contractive map has been removed from the IFS. New functions are automatically added or removed by the method according to the fitness values throughout the iterative process. Whenever a new contractive map is added, a new color is automatically assigned to it by the genetic algorithm. This explains the different variations of the color palette throughout Figs. 4, 5, 6, 7 and 8.

Fig. 9 (l-r, t-b) Intersection (left) and union (right) sets of the input and approximating fractal images of the bush example for 0 to 2,250 iterations (step-size 250)

Fig. 10 (l-r, t-b) Intersection (left) and union (right) sets of the input and approximating fractal images of the bush example for 2,500 to 4,750 iterations (step-size 250)

Fig. 11 (l-r, t-b) Intersection (left) and union (right) sets of the input and approximating fractal images of the bush example for 5,000 to 7,250 iterations (step-size 250)

Fig. 12 (l-r, t-b) Intersection (left) and union (right) sets of the input and the approximating fractal images of the bush example for 7,500 to 9,750 iterations (step-size 250)

Fig. 13 (l-r, t-b) Intersection (left) and union (right) sets of the input and the approximating fractal images of the bush example for 10,000 to 12,750 iterations (step-size 250)

Fig. 14 Summary of the graphical results: (top-left) reconstructed colored image (compare it with the original one in Fig. 2(left)); (top-right) superposition of the original (in black) and the reconstructed (in color) images for visual comparison; (bottom) intersection (left) and union (right) of the original and the reconstructed fractal images

In addition to our primary geometry-based fitness function, the Hamming similarity error, in this paper we also consider the intersection similarity function, given by Eq. (13). This fitness function requires computing the intersection and the union of the original and the reconstructed images. Figures 9, 10, 11, 12 and 13 (left-right, top-bottom) show the evolution of the intersection (on the left) and the union (on the right) sets of the input and the reconstructed fractal images of the fern example for the 13,000 iterations with step-size 250. Note the remarkable difference between the intersection and the union sets at early stages of the method. This difference visually decreases over the iterations, and the global shape of the target fractal image becomes more recognizable at later stages of the procedure. Two additional QuickTime videos, Video3.mov (length: 158 s; size: 8.7 MB) and Video4.mov (length: 158 s; size: 7.6 MB), showing, respectively, the evolution of the intersection and the union sets for our example, are also submitted as accompanying material of the paper.

Figure 14 summarizes the graphical results of the global best of our method for the fern example in this paper. The top row shows the reconstructed colored image (on the left). A simple visual comparison with the original image in Fig. 2(left) shows that the proposed method performs very well, since the final reconstructed image is visually very similar to the target one. The image in Fig. 14(top-right) shows the superposition of the original image (in black) and the reconstructed image (in full color behind) for easier visual comparison. The fact that the colors of the reconstructed image in the background can barely be seen (as they are hidden by the original image in black in the foreground) is a clear indication of the high accuracy of the geometric reconstruction. As further evidence, the bottom row of Fig. 14 shows the intersection (left) and union (right) of the original and the reconstructed fractal images. Note that both images look quite similar, albeit not identical. The ground truth image obviously lies in-between. These graphical results show that the method is able to capture successfully all major features of the input image, even though it has a complicated and irregular shape.

Fig. 15 Histogram of the original fractal image (left) and the reconstructed fractal image (right), with the three color channels displayed individually (top) and combined with transparent colors (bottom)

Figure 15 summarizes the graphical results regarding the reconstruction of color with our method. The figure shows the histogram of the original fractal image (left) and the reconstructed image (right), with the three color channels displayed individually (top) and combined with transparent colors for better visualization (bottom). Once again, the visual results show that the method performs very well: both histograms look very similar, although the matching is not perfect. The difference is more noticeable for the blue color channel, corresponding to the bottom rightmost areas of the fractal image, where the geometric matching is less accurate. Yet, in our opinion, the results are quite impressive. To the best of our knowledge, no method reported in the literature even attempts to address this problem, much less achieves results of such high accuracy.

Table 4 Numerical results of our method for the fern fractal example

5.2 Numerical results

The good graphical results described in the previous section are confirmed by the numerical results. Table 4 summarizes the main results obtained for the example in this paper. It shows the results obtained for the global best solution over the 13,000 iterations with step-size 1,000 (in rows). The following data are shown (in columns):

  • Number of iterations, iter;

  • Number of contractive maps, N, of the best solution at generation iter;

  • Number of mismatched pixels between the reconstructed and the original images;

  • Ratio of mismatches with respect to the active pixels of the image;

  • Number of pixels in the intersection of the reconstructed and the original images;

  • Number of pixels in the union of the reconstructed and the original images;

  • Ratio between the number of pixels in the intersection and the union;

  • Ratio between the number of pixels in the intersection and the number of active pixels of the image.

The most important conclusion from Table 4 is the excellent behavior of the method in terms of geometric accuracy and predictive capacity. Looking at the last row of the table, which displays the final results of the method, we can see that the method is able to recover the optimal number of contractive functions of the input image, which is \(N=4\), even though this datum is never used in our method. On the other hand, there are only 18,445 mismatched pixels between the original and the reconstructed images, a ratio of only 22.98% of the total pixels of the image. We also remark that the number of pixels in the intersection and in the union is very close to the actual number of pixels of the target image, which obviously lies between them. The disparity of these values is less than 10,000 pixels, a remarkable error of less than 13%. Finally, the percentage of pixels in the intersection is larger than 87%, a clear indication of an excellent matching. We are unaware of any paper reporting numerical results with this level of accuracy, even among methods focused exclusively on binary images without any color.

Table 4 provides other interesting information about the previous graphical results. The second column shows that the method is able to determine automatically the optimal number of contractive maps over the iterations. Note also that the value of N in the table changes from 4 to 7 over the generations. In addition, the values of N vary even further over the population, taking values from 3 to 10, but these values do not appear in the table, as they correspond to suboptimal solutions, while the table only shows the best solution for each iteration value. It is also interesting to remark that, due to the interplay between the geometry and the color, changing the number of contractive functions does not necessarily lead to an improvement of the geometric matching between the original and the reconstructed images. For instance, the numerical results are better for 2,000 iterations than for 3,000 or 4,000. However, this tells only part of the story, because the higher matching for 2,000 iterations is obtained by using contractive functions covering large areas, with a total of 122,376 pixels for the union and only 54,429 for the intersection. Obviously, if we approximate the original image with another one that covers its whole area in excess, we recover most of the pixels of the target image, but at the cost of adding many wrong pixels to the image. This is exactly what happens here. Such situations are allowed in order to prevent the method from getting stuck in local optima and to increase its exploratory ability, searching for more promising areas of the search space.

5.3 Computational issues

The computations in this work have been performed on a personal computer with a 5.2 GHz Intel Core i9-11900 processor, 32 GB of RAM, and an Nvidia Quadro P620 graphics card with 2 GB of GDDR5X memory. The source code has been implemented by the authors in the native programming language of the popular scientific program MATLAB (version 2021b), using the numerical libraries for fractals in [54, 55]. The clustering procedure has been run on Mathematica using source code adapted by the authors from public libraries for image processing and computer vision. The videos have been generated by the authors using the iLife suite by Apple. Regarding the time complexity of the method, taking into account our choices of genetic operators (roulette-wheel selection, two-point crossover, and one-point mutation), the complexity of the genetic algorithm is \(O(g \cdot N^{GA} \cdot N_i)\), with g the number of generations. For the PSO, using our constant values for the social and cognitive factors and a linear variation of the inertia weight, the complexity is \(O(T \cdot N^{GA} \cdot N_i)\), with T the number of iterations.
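For readers unfamiliar with the genetic operators named above, the following Python sketch gives generic textbook versions of roulette-wheel selection, two-point crossover, and one-point mutation. These are illustrative stand-ins, not the authors' MATLAB implementation; note that each call runs in time linear in the population size or chromosome length, consistent with the stated complexity.

```python
import random

def roulette_select(population, fitness):
    """Roulette-wheel selection: pick one individual with probability
    proportional to its (non-negative) fitness value."""
    total = sum(fitness)
    r = random.uniform(0, total)
    acc = 0.0
    for ind, f in zip(population, fitness):
        acc += f
        if acc >= r:
            return ind
    return population[-1]   # guard against floating-point round-off

def two_point_crossover(p1, p2):
    """Exchange the segment between two random cut points of two parents."""
    i, j = sorted(random.sample(range(1, len(p1)), 2))
    return p1[:i] + p2[i:j] + p1[j:], p2[:i] + p1[i:j] + p2[j:]

def one_point_mutation(ind, values):
    """Replace a single randomly chosen gene with a random allowed value."""
    k = random.randrange(len(ind))
    return ind[:k] + [random.choice(values)] + ind[k + 1:]
```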

6 Conclusions and future work

This paper is an extension of a previous conference paper [7], addressing the problem of fractal image reconstruction for colored images through IFS in a fully automatic way. Given a colored fractal image, the goal is to obtain the IFS parameters and the color map of each contractive function automatically, so that the input image can be accurately replicated not only in its geometry but also in its color, without any knowledge about the input fractal image beyond the input bitmap. This work presents a method to address this issue based on four steps: image clustering, color determination, IFS code optimization, and local search refinement. The optimization task is solved through a hybrid method combining a GA, which computes the color of the contractive functions, with PSO, which computes the optimal number of contractive functions and their IFS parameters. The GA and PSO algorithms are mutually coupled, so that the output of each method is injected as input to the other, in an iterative fashion until convergence is achieved.
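The mutual coupling between the two metaheuristics can be pictured as an alternating loop in which each optimizer refines its own variables while holding the other's fixed. The following toy sketch is entirely our illustration: the single-move "GA" and "PSO" steps, the synthetic fitness, and the fixed targets are placeholders that do not reflect the authors' actual operators or data; only the alternating structure is the point.

```python
import random

# Toy targets standing in for the (unknown) optimal color map and IFS parameters
TARGET_COLORS = [3, 1, 2]
TARGET_PARAMS = [0.5, -0.2, 0.8]

def fitness(colors, params):
    """Synthetic joint fitness: squared distance to the toy targets (lower is better)."""
    return (sum((c - t) ** 2 for c, t in zip(colors, TARGET_COLORS))
            + sum((p - t) ** 2 for p, t in zip(params, TARGET_PARAMS)))

def ga_step(colors, params):
    # Placeholder "GA": one random integer tweak, kept only if it improves
    cand = list(colors)
    k = random.randrange(len(cand))
    cand[k] += random.choice([-1, 1])
    return cand if fitness(cand, params) < fitness(colors, params) else colors

def pso_step(params, colors):
    # Placeholder "PSO": one small random move, kept only if it improves
    cand = [p + random.uniform(-0.1, 0.1) for p in params]
    return cand if fitness(colors, cand) < fitness(colors, params) else params

def hybrid_loop(colors, params, rounds=200):
    # The output of each optimizer is injected as input to the other
    for _ in range(rounds):
        colors = ga_step(colors, params)
        params = pso_step(params, colors)
    return colors, params

random.seed(0)
colors, params = hybrid_loop([0, 0, 0], [0.0, 0.0, 0.0])
```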

This new method has been applied to a popular yet challenging example of fractal image. The graphical and numerical results clearly show that the method performs very well, leading to excellent reconstruction results for both the geometry and the color. To the best of our knowledge, no other method has ever attempted to solve automatically the fractal image reconstruction problem with IFS for colored images, a clear evidence of the originality and contribution of this work. In this context, the present work opens the door for future developments in this field.

This work also has some limitations. The most noticeable is that only one example of fractal image is presented, raising a reasonable open question about the generality of the approach. We have performed some trials on other examples and found that the method also provides very good results, which suggests a positive answer to this question. Unfortunately, we were unable to finish all our simulations before the submission deadline of this special issue, and hence these other examples are not included here. Of course, it would be desirable to include other images in the benchmark, but obtaining the results for this single example has been intensive work, requiring several days of simulations and data analysis. This leads to what is, in our opinion, the most critical aspect of the method: the computational time. The method, as exposed above, provides a high degree of accuracy at the expense of very large computational times. A typical execution for the fern example takes tens of hours for a single run on a powerful personal computer. Given the stochastic nature of metaheuristic techniques, several executions must be carried out in order to suppress spurious behaviors and obtain reliable results, which means that, unless a computer farm is at your disposal (certainly not the authors' case), you must be ready for weeks of extensive computation. Clearly, computational speed remains a challenge for this method. However, our focus in this work is not on speed but on accuracy. Since no method has been reported so far in the literature to solve this problem, we wanted to confirm the feasibility of our approach to yield accurate results regardless of the computational effort, and we are very satisfied with the results. Given this pioneering nature, we have prioritized precision over any other factor of the problem. Nevertheless, this large CPU time must be acknowledged as a limiting factor of the method.

Having shown that the method performs well, our next focus is on its limitations and how they can be addressed. We are currently working on ideas and new procedures to alleviate the computational load and decrease the computational times. One possibility could be to identify more efficient clustering methods for the color determination subproblem. Another line of research might be to propose more efficient fitness functions for this problem. These ideas will be part of our plans for future work in the field.