1 Introduction

In recent years, a vast number of digital images have been produced by cell phones, surveillance cameras, and other personal digital devices. Artificial-intelligence-driven technologies increasingly leverage these images to automate numerous tasks [5, 47, 49]. Popular image-sharing apps such as Instagram and TikTok, in particular, handle the flow of millions of photos at once. As a result, image compression is a crucial issue with respect to storage space and network bandwidth usage.

The most widely used method for compressing digital images is JPEG (Joint Photographic Experts Group), which is based on the discrete cosine transform (DCT) [5]. There are several variants of JPEG, of which the JPEG File Interchange Format (JFIF) is one of the most prevalent. Unlike the baseline JPEG standard, JFIF prescribes a specific colour space. In other words, the first step in JFIF image compression is to convert the original colour space to the YCbCr colour space, where Y, Cb, and Cr stand for the luminance, blue-chrominance, and red-chrominance components, respectively. Each component is treated separately throughout the compression process. For simplicity, we only discuss the luminance component, Y; the procedure is the same for the other components. The Y component of the image is divided into \(8 \times 8\) blocks, each of which is transformed independently. The \(8 \times 8\) blocks are zero-shifted before applying the DCT by subtracting 128 from the element values. Each transformed block is then quantised, leading to information loss. After quantisation, each block can be entropy encoded [11] without any further loss of data. Depending on the quality factor value, different amounts of compression can be applied to an image.

A key component of JPEG image compression is the quantisation table (QT). The luminance quantisation table (LQT) and chrominance quantisation table (CQT) are the two quantisation tables used by the Annex K variant [14], the most significant variant of the JPEG implementation. These two tables quantise the DCT coefficient blocks of the luminance and chrominance components, respectively. Since each image requires its own table, obtaining the right values for both quantisation tables is a tough and challenging task; as a result, most implementations use a typical, fixed pair of tables.

Metaheuristic algorithms (MA), and in particular population-based metaheuristic algorithms (PBMH), such as genetic algorithms (GA) [57] and particle swarm optimisation (PSO) [48], show satisfactory performance in different applications such as neural network training [7, 32, 34, 44], image segmentation [33, 36, 40], and image quantisation [35, 37, 38]. They are also a reliable alternative for finding the optimal QTs. PBMHs are iterative, stochastic, and problem-independent algorithms that direct the search process toward an optimal point by using several operators. A globally optimal solution cannot be guaranteed by PBMHs, but they can offer a solution that is close to it [39].

One of the earliest attempts to use PBMHs for finding QTs [12] employed a GA such that the chromosome is an array of size 64 and the objective function is the mean square error between the original and the compressed image. In another study, [19] used GA to create a JPEG QT to compress iris images in iris identification systems. A knowledge-based GA was suggested by [6] to find an optimised quantisation table; to do this, the GA incorporates information about image properties and image compression. In another study [21], the quantisation table is designed using differential evolution (DE) [50], and it is demonstrated that DE can outperform the canonical GA. A knowledge-based DE is suggested by another study [20] to enhance DE performance. In one of the most recent works [56], the rate-distortion optimality principle is taken into account for finding the QT(s), which offers a number of optimal solutions to satisfy applications' need for multiple rates. Some other PBMHs used for finding QT(s) are the DE algorithm [55], PSO [1], the firework algorithm [53], and the firefly algorithm [54].

Rate-distortion (RD) is a fundamental concept in image compression that involves trading off the amount of compression (rate) against the resulting image quality (distortion), and it is controlled through the quantisation step. While a variety of works focus on RD-based image compression using conventional algorithms, there is not much work on PBMH-based image compression. Qijun et al. [56] proposed a novel crossover and mutation based on the rate-distortion principle for the NSGA-II algorithm by changing the quantisation step, while [22] employed two conflicting objective functions, namely compression rate and mean square error (MSE), for a multi-objective optimisation process based on NSGA-II. All the works mentioned generate a set of solutions as the Pareto front. As a result, they belong to the class of a posteriori methods, in which the user selects the preferred solution among the generated solutions after the optimisation process. In other words, a user cannot state their preferences before conducting the optimisation process.

Energy consumption is one of the most important criteria in an electronic device [10] such as a smartphone. However, it can be difficult to estimate how much energy is used by any given process on a typical image. The majority of the methods described in the literature can measure the power drawn from a battery or, at best, by a particular application [13]. To tackle this issue, [31] employed the energy profiler of Android Studio and the Plot Digitiser software to show that image quality and file size have a vital influence on the energy usage of an application. They therefore introduced the concept of "energy-aware JPEG image compression", in the sense of methods that can reduce energy consumption on a device such as a smartphone. To this end, they showed that file size and image quality are two proxies for energy consumption on Android smartphones. Developers frequently strive for both reduced file sizes and better image quality; however, these two goals conflict, since better image quality increases file size and energy usage. Hence, a compromise between image quality and file size is required. To tackle this issue, [31] proposed two general multi-objective approaches: scalarisation and Pareto-based. They employed five scalarisation algorithms, including GA, PSO, DE, evolution strategy (ES), and pattern search, while two Pareto-based methods, Non-Dominated Sorting Genetic Algorithm II (NSGA-II) and a reference-point-based NSGA-II (NSGA-III), are used for the embedding scheme.

The above-mentioned studies have focused on finding an optimal QT, but they suffer from a few fundamental problems that, to the best of our knowledge, no PBMH-based research has yet addressed: 1) ignoring the user's opinion in advance, 2) lack of comprehensive coverage, and 3) lack of sufficient knowledge on the user's side to determine the quality factor.

The first problem with the current studies is that they ignore the user's opinion in advance. Different PBMH algorithms try to find proper QT(s) based on one (or more) criterion, but the output image may not have the characteristics the user requires. Assume that an Android developer is building a small-size app. The optimisation algorithms for this app endeavour to produce a high-quality image, which results in a larger file size. The production of such an image may therefore be contrary to the preference of the user, who wants an image with a small file size. Although a few studies consider the user's opinion, these methods first perform the optimisation process (a posteriori optimisation) and then provide a set of solutions. The user must choose one of these solutions, while the generated solutions are sometimes sparse [31]. In addition, the scalarisation methods in the literature try to decrease the file size regardless of what size a developer needs. Therefore, developing user-specified file sizes for PBMH-based JPEG image compression is indispensable, so that a user can determine the file size for a specific image in advance, before starting the optimisation process (a priori optimisation). The second problem with population-based JPEG image compression is the lack of comprehensive coverage. Our experiments (Section 3.1) show that working on only the QT cannot provide all possible combinations of file size and quality. This issue becomes acute when the goal is a specific combination of file size and image quality that cannot be achieved by using QTs alone. The third problem stems from the user's lack of knowledge for choosing the quality factor. To the best of our knowledge, these three problems have not yet been studied in population-based JPEG image compression.

The main contributions and characteristics of this paper are as follows:

  • The first contribution incorporates a user-specified file size to include the user’s opinion in our proposed approach. To this end, we propose an objective function for metaheuristic algorithms that first tries to produce an output image that is as close to the user-specified file size as possible and, secondly, has the highest image quality simultaneously.

  • To provide comprehensive coverage, we propose a novel representation for PBMH-based image compression that adds only one component to enhance coverage. Our novel representation is simple but effective, since it gives comprehensive coverage of the search space and its implementation is straightforward.

  • The proposed algorithm can find the quality factor automatically without any reference image and prior knowledge about the colour distribution of an image.

  • Our proposed algorithm focuses on different aspects of JPEG image compression in a single shot, meaning that it includes user-specified file size, high-quality image, capability to find the right values for QTs, and the ability to find the quality factor automatically.

  • The approach is independent of the specific PBMH, since it is based on the representation and the objective function, not on a search strategy. As a result, it works with all PBMHs. Therefore, we perform a comprehensive evaluation of PBMH-based search strategies for our re-formulated JPEG image compression. For this purpose, we select 22 base, advanced, and metaphor-based algorithms. Some selected algorithms, such as DE and its variants, are state-of-the-art, while others are among the more recent algorithms that have already received significant attention (based on paper citations).

  • We also provide the computational complexity of our proposed approach.

The remainder of the paper is organised as follows. Section 2 briefly describes JPEG image compression. Section 3 defines our novel representation, objective function, and the search strategies for the re-formulated JPEG image compression, while Section 4 discusses the results. Finally, Section 5 concludes the paper.

2 The JPEG image compression

The essential elements of JPEG image compression are shown in Fig. 1. The encoder is in charge of transforming the original image into the JPEG-compressed representation, while the decoder is in charge of the opposite. A JPEG file can be encoded in a variety of ways, but JFIF (JPEG File Interchange Format) encoding is one of the most prevalent. The first step in JFIF is to change the colour representation from RGB to YCbCr, which consists of one luma component (Y), standing for luminance, and two chroma components (Cb and Cr), standing for the blue and red chrominance components. The other parts of JFIF are almost identical to standard JPEG. We go into further depth on the key elements below.

Fig. 1: The general structure of JPEG image compression

Blocks of \(8 \times 8\) pixels are initially created from the original image. After that, the block values are shifted from \([0,2^{p}-1]\) to \([-2^{p-1},2^{p-1}-1]\), where p is the number of bits per pixel (\(p=8\) for standard JPEG compression). Each \(8 \times 8\) block, arranged as a \(64 \times 1\) vector, is provided to the Discrete Cosine Transform (DCT) [4] component. The DCT decomposes the input signal into 64 DCT coefficients, or basis-signal amplitudes. The DCT can be mathematically written as

$$\begin{aligned} F(u,v)=\frac{1}{4}c_{u}c_{v} \left[ \sum _{x=0}^{7} \sum _{y=0}^{7} f(x,y) \cos \left( \frac{(2x+1)u\pi }{16} \right) \cos \left( \frac{(2y+1)v\pi }{16} \right) \right] \end{aligned}$$
(1)

where

$$\begin{aligned} c_{r}={\left\{ \begin{array}{ll} \frac{1}{\sqrt{2}} &{} r=0 \\ 1 &{} r>0 \end{array}\right. } \end{aligned}$$
(2)

The reversal of the DCT component used to rebuild the original image is called Inverse DCT (IDCT), and it is described as

$$\begin{aligned} f(x,y)=\frac{1}{4} \left[ \sum _{u=0}^{7} \sum _{v=0}^{7} c_{u}c_{v} F(u,v) \cos \left( \frac{(2x+1)u\pi }{16} \right) \cos \left( \frac{(2y+1)v\pi }{16} \right) \right] \end{aligned}$$
(3)

The original 64-point signal is exactly recovered in the absence of the quantisation step.
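To make the transform pair concrete, the following is a minimal NumPy sketch of Eqs. (1)-(3) applied to a single zero-shifted \(8 \times 8\) block. It is a direct, unoptimised implementation for illustration only (the function names are ours); real encoders use fast DCT algorithms.

```python
import numpy as np

def c(r: int) -> float:
    """Normalisation constant from Eq. (2)."""
    return 1.0 / np.sqrt(2.0) if r == 0 else 1.0

def dct_8x8(f: np.ndarray) -> np.ndarray:
    """Forward 8x8 DCT of a zero-shifted block, as in Eq. (1)."""
    F = np.zeros((8, 8))
    for u in range(8):
        for v in range(8):
            s = sum(f[x, y]
                    * np.cos((2 * x + 1) * u * np.pi / 16)
                    * np.cos((2 * y + 1) * v * np.pi / 16)
                    for x in range(8) for y in range(8))
            F[u, v] = 0.25 * c(u) * c(v) * s
    return F

def idct_8x8(F: np.ndarray) -> np.ndarray:
    """Inverse 8x8 DCT, as in Eq. (3)."""
    f = np.zeros((8, 8))
    for x in range(8):
        for y in range(8):
            f[x, y] = 0.25 * sum(c(u) * c(v) * F[u, v]
                                 * np.cos((2 * x + 1) * u * np.pi / 16)
                                 * np.cos((2 * y + 1) * v * np.pi / 16)
                                 for u in range(8) for v in range(8))
    return f

# Round-trip check on a random zero-shifted block (values in [-128, 127]):
# without quantisation, the block is recovered exactly (up to floating-point error).
block = np.random.randint(-128, 128, size=(8, 8)).astype(float)
assert np.allclose(idct_8x8(dct_8x8(block)), block, atol=1e-6)
```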

2.1 The quantisation and dequantisation components

The 64-element quantisation table, which is used in the quantisation stage, must be known beforehand. Each entry of the table, an integer in [1, 255] for the baseline standard, specifies the step size of the quantiser for its associated DCT coefficient. By discarding information that is not visually significant, quantisation seeks to accomplish compression while retaining picture quality.

The quantised coefficients are defined as

$$\begin{aligned} L(u,v)= round\left( \frac{F(u,v)}{Q(u,v)}\right) \end{aligned}$$
(4)

where Q(u, v) denotes the corresponding entry of the quantisation table, L(u, v) stands for the quantised DCT coefficients, F(u, v) stands for the DCT coefficients, and round(x) is the closest integer to x. It is important to note that the information loss increases with the value of Q(u, v).

A preliminary approximation of F(u, v) is recreated by the de-quantisation component of the decoder by reversing the quantisation process as

$$\begin{aligned} \bar{F}(u,v)=L(u,v) \times Q(u,v) \end{aligned}$$
(5)

Since the quantisation table is the source of the information loss, this step is central to JPEG compression: the table determines the balance between the quality of the reconstructed image and the efficacy of the compression.
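As a minimal illustration of Eqs. (4) and (5), the following NumPy sketch quantises and de-quantises a toy block of DCT coefficients; the flat tables and values are hypothetical, chosen only to show that larger table entries discard more information.

```python
import numpy as np

def quantise(F: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Eq. (4): element-wise division by the table and rounding."""
    return np.round(F / Q).astype(int)

def dequantise(L: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Eq. (5): approximate reconstruction of the DCT coefficients."""
    return L * Q

F = np.full((8, 8), 37.0)                               # toy DCT coefficients
Q_coarse, Q_fine = np.full((8, 8), 50), np.full((8, 8), 4)
err_coarse = np.abs(dequantise(quantise(F, Q_coarse), Q_coarse) - F)
err_fine = np.abs(dequantise(quantise(F, Q_fine), Q_fine) - F)
print(err_coarse[0, 0], err_fine[0, 0])                 # 13.0 vs 1.0: larger steps lose more
```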

A quality factor can also be incorporated into the JPEG implementation [22]. For a given quality factor F, the elements of the scaled QT are obtained as

$$\begin{aligned} \hat{Q}_{i,j}=\left[ \frac{S \, Q_{i,j}+50}{100}\right] \end{aligned}$$
(6)

where [x] denotes the rounding of x, \(Q_{i,j}\) is the base QT entry at location (i, j), \(\hat{Q}_{i,j}\) is the scaled entry, and S is defined as

$$\begin{aligned} S={\left\{ \begin{array}{ll} 200-2F &{} F\ge 50 \\ \frac{5000}{F} &{} \text{else} \end{array}\right. } \end{aligned}$$
(7)
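A minimal sketch of this scaling, assuming the rounding of Eqs. (6)-(7) plus an added clip to the baseline range [1, 255] as a safeguard; the flat base table is a hypothetical example.

```python
import numpy as np

def scale_qt(base_qt: np.ndarray, F: int) -> np.ndarray:
    """Scale a base quantisation table by quality factor F, following Eqs. (6)-(7)."""
    S = 200 - 2 * F if F >= 50 else 5000 / F
    scaled = np.round((S * base_qt + 50) / 100).astype(int)
    return np.clip(scaled, 1, 255)          # keep entries in the valid baseline range

# Hypothetical flat base table; a lower F gives larger steps and hence more loss.
base = np.full((8, 8), 16)
print(scale_qt(base, 90)[0, 0], scale_qt(base, 10)[0, 0])   # 4 vs 80
```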

2.2 Symbol coding

After quantisation, the \(8 \times 8\) block’s 63 AC coefficients are treated separately from the DC coefficient. The DC coefficient is encoded using the Differential Pulse Code Modulation (DPCM) as

$$\begin{aligned} DIFF_{i}=DC_{i}-DC_{i-1} \end{aligned}$$
(8)

where \(DC_{i}\) and \(DC_{i-1}\) are the DC coefficients for the current \(8 \times 8\) block and the preceding \(8 \times 8\) block, respectively.

The quantised 63 AC coefficients may be formatted for entropy coding using a zigzag scan [45]. The AC coefficients after the zigzag scan exhibit decreasing variances and increasing spatial frequencies.
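A minimal sketch of the zigzag ordering and the DPCM of Eq. (8); the names are ours, and the DC predictor of the first block is initialised to 0, as in the baseline standard.

```python
import numpy as np

# Zigzag visiting order for an 8x8 block: diagonals of constant u+v,
# traversed in alternating directions (the standard JPEG scan).
ZIGZAG = sorted(((u, v) for u in range(8) for v in range(8)),
                key=lambda p: (p[0] + p[1],
                               p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_scan(block: np.ndarray) -> np.ndarray:
    """Flatten a quantised 8x8 block into the 64-element zigzag order."""
    return np.array([block[u, v] for u, v in ZIGZAG])

def dpcm(dc_values):
    """Eq. (8): encode each DC coefficient as the difference to its predecessor."""
    prev = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - prev)
        prev = dc
    return diffs

print(ZIGZAG[:6])              # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(dpcm([120, 118, 121]))   # [120, -2, 3]
```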

2.3 Entropy coding

After the quantisation procedure, there are typically only a few nonzero and many zero-valued DCT coefficients left. Entropy coding aims to compress the quantised DCT coefficients further by exploiting their statistical characteristics. JPEG uses Huffman coding as its default method, which uses two DC and two AC Huffman tables for the luminance and chrominance DCT coefficients, respectively [45].

3 Re-formulated population-based JPEG image compression

This paper proposes a re-formulated population-based JPEG image compression. To this end, we propose, for the first time, a population-based strategy to find a JPEG image whose output file size is as close as possible to a user-specified file size while having the highest image quality. We also propose a novel representation so that the search space is covered better and the quality factor is found automatically. Generally speaking, a representation, an objective function, and a search strategy are the three primary considerations when using a PBMH to solve an optimisation problem. The representation defines the structure of each candidate solution, the objective function quantifies the quality of a candidate solution, and the search strategy aims to find promising solutions using several operators. This paper mainly proposes a novel representation and a novel objective function with the aim of finding an image whose file size is as close as possible to the user-specified file size while maintaining the highest image quality. In addition, we selected 22 algorithms, which not only allows us to choose the best strategy, but also provides a benchmark among different algorithms.

In this section, first, we conduct a behaviour analysis of the quality factor, and then explain the main components of the proposed strategy.

3.1 Behaviour analysis of the quality factor

In this section, we conduct a behaviour analysis of the quality factor (QF). For all experiments, we randomly generated 10000 QTs (and, where applicable, QFs), built a JPEG image for each, and then calculated the PSNR and file size of the output image.

In the first experiment, we randomly generated the QTs 10000 times using permutation; in other words, the values within each QT are unique, with no duplicates. For a typical image, the data distribution in terms of PSNR and file size can be observed in Fig. 2(a). Clearly, in this case the search space is split into several clusters, and the generated data does not cover the search space entirely.

In the next experiment, we added the QF, meaning that the QF was not a fixed number but was selected as a random integer. Figure 2(b) shows that adding the QF gives the search space more coverage.

In the third experiment, we selected the QTs without permutation, meaning that values within a QT can be duplicated. From Fig. 2(c), the result is impressive, indicating that the search space is comprehensively covered. Therefore, the two leading factors that are very effective in covering the search space are the QF and not using permutation. In our proposed algorithm, both are embedded in our representation strategy.
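For reference, the following NumPy sketch shows one plausible way of drawing the random samples used in these experiments; drawing 64 unique entries from [1, 255] for the "with permutation" case is our reading, and the step of encoding each sample and measuring PSNR and file size is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_qt(with_permutation: bool) -> np.ndarray:
    """Draw one 64-entry quantisation table with values in [1, 255]."""
    if with_permutation:
        # unique entries, as in the first experiment
        return rng.choice(np.arange(1, 256), size=64, replace=False)
    # duplicates allowed, as in the third experiment
    return rng.integers(1, 256, size=64)

def random_qf() -> int:
    """Random quality factor in [1, 99]."""
    return int(rng.integers(1, 100))

# 10000 (QT, QF) samples; each would then be encoded to obtain PSNR and file size.
samples = [(random_qt(with_permutation=False), random_qf()) for _ in range(10000)]
```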

Fig. 2: Data distribution in terms of PSNR and file size for (a) random QTs generated with permutation; (b) random QTs generated without permutation; (c) random QTs generated without permutation and including the quality factor

3.2 Solution representation

The conventional representation for population-based JPEG image compression is a vector as

$$\begin{aligned} x=[{QT_{1},QT_{2},...,QT_{m}}] \end{aligned}$$
(9)

whose length is the number of elements in the QT(s) (64 per table) and where \(QT_{i}\) indicates the i-th element. This representation is prevalent in the literature, and to the best of our knowledge, most research uses it. In this representation, only the QT(s) are encoded, and the goal of optimisation is to identify the best elements of the QT(s).

This paper first proposes a novel representation that encodes not only the QT(s) but also the quality factor. The representation is a vector of integers of dimension 129, given as

$$\begin{aligned} x=[LQT_{1,1},...,LQT_{8,8},...,CQT_{1,1},...CQT_{8,8}, QF] \end{aligned}$$
(10)

where \(LQT_{i,j}\) and \(CQT_{i,j}\) denote the elements at coordinates (i, j) of the LQT and CQT matrices, respectively. In other words, the first 64 entries are positive integers in the range \([0,2^{p}-1]\) (where p is the number of bits per pixel; in our case, \(p=8\)) reserved for the LQT, while the next 64 elements are set aside for the CQT. The last entry is an integer in [1, 99], which controls the quality factor. Our new representation is able to find the QF automatically since it is part of the candidate solution. Also, by adding the QF to the representation, the search space can be covered entirely.

It is worth noting that the conventional search space has a size of \(256^{128}\), whereas the new search space has a size of \(256^{128} \times 99\). In other words, even though we only added one variable to the representation, the search space has grown by a factor of 99.
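A minimal sketch of how a 129-dimensional candidate solution of Eq. (10) can be decoded into the two tables and the quality factor; the helper name and the random example candidate are illustrative.

```python
import numpy as np

def decode(x):
    """Split a 129-dimensional candidate solution into LQT, CQT, and QF (Eq. 10)."""
    x = np.asarray(x, dtype=int)
    lqt = x[:64].reshape(8, 8)                # luminance quantisation table
    cqt = x[64:128].reshape(8, 8)             # chrominance quantisation table
    qf = int(np.clip(x[128], 1, 99))          # quality factor, kept in its valid range
    return lqt, cqt, qf

# Hypothetical candidate drawn uniformly at random within the bounds.
candidate = np.concatenate([np.random.randint(1, 256, 128),
                            np.random.randint(1, 100, 1)])
lqt, cqt, qf = decode(candidate)
```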

3.3 Objective function

One of the main goals of this paper is to present a user-specified population-based JPEG image compression. To this end, the desired file size is specified by the user. In other words, there are two specific purposes that must be pursued in the objective function, as follows:

  1. the output file size of the image should be as close as possible to the file size specified by the user;

  2. the image quality should be maximised.

To achieve the first purpose, the following function can be defined, which is the distance between the output file size and the user-specified file size

$$\begin{aligned} obj = \frac{|FS_{US}-FS_{O}|}{FS_{US}} \end{aligned}$$
(11)

where \(FS_{US}\) is the user-specified file size and \(FS_{O}\) is the file size of the output image. For normalisation, the difference is divided by \(FS_{US}\).

The second purpose can be achieved by maximising a quality metric such as PSNR. Therefore, the final objective function can be defined as

$$\begin{aligned} obj = \frac{|FS_{US}-FS_{O}|}{FS_{US}}+\frac{\lambda }{PSNR} \end{aligned}$$
(12)

where \(\lambda \) is a parameter that aims to bring the two terms into an almost identical range.
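The following is a minimal, hedged sketch of this objective function in Python. It assumes Pillow's JPEG encoder with custom qtables purely as a stand-in codec, applies the quality-factor scaling of Section 2.1 to both tables before encoding, and uses an illustrative value for \(\lambda \); the helper names and constants are our assumptions, not the exact implementation.

```python
import io
import numpy as np
from PIL import Image

LAMBDA = 100.0   # assumed weight; lambda is tuned to balance the two terms

def psnr(a: np.ndarray, b: np.ndarray, peak: float = 255.0) -> float:
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def objective(x, original: Image.Image, fs_us: int) -> float:
    """Eq. (12): normalised file-size deviation plus lambda / PSNR."""
    x = np.asarray(x, dtype=float)
    lqt, cqt, qf = x[:64], x[64:128], int(np.clip(x[128], 1, 99))
    # Scale both tables by the quality factor as in Section 2.1 (Eqs. 6-7).
    s = 200 - 2 * qf if qf >= 50 else 5000 / qf
    tables = [np.clip(np.round((s * t + 50) / 100), 1, 255).astype(int).tolist()
              for t in (lqt, cqt)]
    buf = io.BytesIO()
    original.convert("RGB").save(buf, "JPEG", qtables=tables)   # custom tables
    fs_o = buf.getbuffer().nbytes                               # output size in bytes
    decoded = np.asarray(Image.open(io.BytesIO(buf.getvalue())).convert("RGB"))
    return (abs(fs_us - fs_o) / fs_us
            + LAMBDA / psnr(np.asarray(original.convert("RGB")), decoded))
```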

3.4 Search strategies

For the search strategies, we can use any type of PBMH. Obviously, we cannot analyse every PBMH technique published in the literature, given the large and diverse collection. Therefore, based on two criteria, we chose a number of algorithms for our investigation. Certain algorithms, like GA and DE, are state-of-the-art techniques that are widely used in evolutionary and swarm computing, while others, like the grey wolf optimiser (GWO) [29], are more recent algorithms that have nevertheless drawn substantial attention (based on paper citations).

Eventually, we selected 22 algorithms, classified into three categories: base algorithms, advanced algorithms, and metaphor-based algorithms. In the following, we briefly explain the algorithms and refer to the cited publications for more details.

3.4.1 Base algorithms

  • Genetic algorithm (GA) [57]: GA is the oldest population-based algorithm and has two main operators, crossover and mutation. Crossover combines the information of the parents, while mutation makes random changes to one or more elements of a candidate solution. Solutions are carried over from one iteration to the next based on the principle of "survival of the fittest". GA uses selection operators both for choosing the parents for crossover and mutation and for choosing the solutions that pass to the next generation.

  • Differential Evolution (DE) [50]: DE has three main operators, mutation, crossover, and selection. Mutation generates candidate solutions based on a scaled difference among candidate solutions and generates a mutant vector, DE/rand/1, as

    $$\begin{aligned} v_{i}=x_{r1} + F (x_{r2}-x_{r3}) , \end{aligned}$$
    (13)

    where F is a scaling factor and \(x_{r1}\), \(x_{r2}\), and \(x_{r3}\) are three different randomly selected candidate solutions from the current population. Crossover integrates the mutant vector with a target vector selected from the current population. Finally, a candidate solution is selected by a selection operator. A minimal sketch of DE/rand/1 with binomial crossover is given after this list.

  • Memetic Algorithm (MA) [30]: MA is a population-based search strategy that uses a population-based algorithm (here GA) in the combination with a local search. In the version we used, there is a probability for each agent, indicating whether a local search should be done or not.

  • Particle Swarm Optimisation (PSO) [48]: it is a swarm-based optimisation technique whose updating process is based on the best position of each candidate solution and a global best position. The velocity vector of a particle is updated as

    $$\begin{aligned} v_{t+1}= \omega v_{t}+c_{1} r_{1} (p_{t}-x_{t})+c_{2} r_{2} (g_{t}-x_{t}) , \end{aligned}$$
    (14)

    where t is the current iteration, \(r_{1}\) and \(r_{2}\) are random numbers from a uniform distribution in [0; 1], \(p_{t}\) is the personal best position, and \(g_{t}\) is the global best position.

  • Evolutionary strategy (ES) [58]: ES is a metaheuristic algorithm where each offspring is generated based on a Gaussian random number as

    $$\begin{aligned} x_{new}= x_{old}+N(0,\sigma ^{2}) , \end{aligned}$$
    (15)

    where \(N(0,\sigma ^{2})\) is a Gaussian random number with mean 0 and variance \(\sigma ^{2}\). Then, competition should be done for each individual and finally, the best individuals transfer to the next generation.

  • Artificial Bee Colony (ABC) [18]: it mimics the foraging behaviour of honey bees. There are three types of bees, employed bees, onlookers, and scouts. Each employed bee, i, generates a candidate solution as

    $$\begin{aligned} v_{i}=x_{i}+\varphi \times (x_{r1}-x_{r2}) , \end{aligned}$$
    (16)

where \(x_{r1}\) and \(x_{r2}\) are two random candidate solutions and \(\varphi \) is a random number from a uniform distribution in \([-1;1]\), and the better of \(v_{i}\) and \(x_{i}\) is kept. In the onlooker bee phase, a base candidate solution \(x_{i}\) is selected based on the quality of each candidate solution. If the quality of an employed bee or onlooker does not improve over a number of trials, it converts into a scout and generates a random candidate solution.
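As referenced in the DE description above, the following is a minimal sketch of the DE/rand/1 mutation of Eq. (13) combined with binomial crossover, applied to the 129-dimensional representation of Section 3.2; bound handling and rounding to integers are omitted, and the names are ours.

```python
import numpy as np

rng = np.random.default_rng(1)

def de_rand_1(pop: np.ndarray, i: int, F: float = 0.5, CR: float = 0.9) -> np.ndarray:
    """One DE/rand/1/bin trial vector for individual i (Eq. 13 plus binomial crossover)."""
    n, d = pop.shape
    r1, r2, r3 = rng.choice([j for j in range(n) if j != i], size=3, replace=False)
    mutant = pop[r1] + F * (pop[r2] - pop[r3])          # scaled difference (Eq. 13)
    cross = rng.random(d) < CR                          # binomial crossover mask
    cross[rng.integers(d)] = True                       # guarantee at least one mutant gene
    return np.where(cross, mutant, pop[i])

# Toy population of 20 candidate solutions of dimension 129 (the representation above).
population = rng.integers(1, 256, size=(20, 129)).astype(float)
trial = de_rand_1(population, i=0)
```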

3.4.2 Advanced Algorithms

  • Levy-based Evolutionary Strategy (LevyES): Levy flight is a specific kind of random walk that uses a Levy distribution to determine the step size. A random walk is a Markov chain whose next position depends only on the present position. A sequence generated by Levy flight involves many small steps and occasional large jumps. LevyES draws its random numbers from the Levy flight distribution rather than the uniform distribution, leading to more exploration and exploitation simultaneously.

  • Self-Adaptive DE (SADE) [43] is an improved variant of DE based on the idea of employing two mutation operators, DE/rand/1 and DE/current-to-best/1, simultaneously. DE/current-to-best/1 is defined as

    $$\begin{aligned} v_{i}=x_{i}+ F_{i} . (x_{best}-x_{i})+ F_{i} . (x_{r1}-x_{r2}), \end{aligned}$$
    (17)

    where \(x_{best}\) is the best candidate solution from the current population, \(x_{r1}\) and \(x_{r2}\) are two randomly-selected candidate solutions, and \(F_{i}\) is the scaling factor for i-th candidate solution.

  • DE with Self-Adaptation Populations (SAP-DE) [51]: SAP-DE tries to present a self-adaptive population size in addition to self-adaptive crossover and mutation rates. To this end, SAP-DE proposes two variants, called SAP-DE-ABS and SAP-DE-REL to define a population size \(\pi \). SAP-DE-ABS defines \(\pi \) as

    $$\begin{aligned} \pi = round (NP_{ini}+N(0,1)) \end{aligned}$$
    (18)

    while SAP-DE-REL initialises the population size parameter based on a uniform distribution between [-0.5,+0.5]. In each stage, the \(\pi \) parameter should be updated. While SAP-DE-REL takes into account the current population size plus a percentage increase or decrease in accordance with the population growth rate, SAP-DE-ABS considers the population size of subsequent generations as the average of the population size attribute from all individuals in the current population.

  • Adaptive DE with Optional External Archive (JADE) [59] is a state-of-the-art variant of DE, defined based on three new modifications. First, JADE employs an archive of historical data to select parents. Second, JADE introduces a novel mutation, DE/current-to-pbest, as

    $$\begin{aligned} v_{i}=x_{i}+ F_{i} . (x_{best}^{p}-x_{i})+ F_{i} . (x_{r1}-x_{r2}), \end{aligned}$$
    (19)

    where \(x_{i}\) is the parent candidate solution, \(x_{best}^{p}\) is a randomly selected candidate solution from the best 100p% candidate solutions in the current population, and \(x_{r1}\) and \(x_{r2}\) are two candidate solutions randomly selected from the union of the current population and the archive. The third modification is to select F and CR adaptively: JADE samples CR from a normal distribution, while a Cauchy distribution is used to select F values.

  • Chaos PSO (CPSO) [24]: CPSO benefits from an adaptive inertia weight factor (AIWF) and a chaotic local search (CLS). AIWF leads to set \(\omega \), in the original PSO, adaptively based on the objective function value as

    $$\begin{aligned} \omega ={\left\{ \begin{array}{ll} \omega _{min}+\frac{(\omega _{max}-\omega _{min})(f-f_{min})}{f_{avg}-f_{min}} &{} f \le f_{avg} \\ \omega _{max} &{} f > f_{avg} \end{array}\right. } , \end{aligned}$$
    (20)

    where \(\omega _{max}\) and \(\omega _{min}\) signify the maximum and minimum of \(\omega \), respectively; f is the current objective function value of a candidate solution, and \(f_{avg}\) and \(f_{min}\) are the average and minimum values over all candidate solutions, respectively. To enhance the effectiveness, the CLS operator acts as a local search around the best position as

    $$\begin{aligned} cx_{i}^{k+1}=4cx_{i}^{k}(1-cx_{i}^{k}) \end{aligned}$$
    (21)

    where \(cx_{i}\) shows the i-th chaotic variable and k is the iteration number. \(cx_{i}\) is distributed between 0 and 1, and the above equation exhibits chaotic behaviour when the initial \(cx_{i}^{0} \in (0,1)\) and \(cx_{i}^{0} \notin \{0.25,0.5,0.75\}\).

  • Comprehensive Learning PSO (CLPSO) [23]: it suggests a comprehensive learning (CL) strategy for particle learning to avoid premature convergence. All particles' pbest can be employed to adjust the velocity of each particle rather than just its own pbest. The updating scheme in CLPSO is defined as

    $$\begin{aligned} v_{t+1}^{i}= \omega v_{t}^{i}+c_{1} r (pbest_{fi(d)}^{d}-x_{t}^{i}), \end{aligned}$$
    (22)

    where fi(d) defines which particle's pbest particle i should follow in dimension d. The decision to learn from nearby particles is made using the comprehensive learning probability, PC. A random number with a uniform distribution is drawn for every dimension. The corresponding dimension will learn from the particle's own pbest if the generated random number is greater than PC(i); otherwise, it is updated based on nearby particles.

  • Self-organising Hierarchical PSO with Jumping Time-varying Acceleration Coefficients (HPSO) [46]: the main characteristics of HPSO are as follows:

    1. Mutation is defined for the PSO algorithm;

    2. A novel concept, called self-organising hierarchical particle swarm optimiser with TVAC, is introduced, which solely takes into account the "social" and "cognitive" components of the particle swarm strategy when estimating each particle's new velocity, and particles are re-initialised when they stagnate in the search space;

    3. A time-varying mutation step size is included in the PSO algorithm.

  • Phasor PSO (P-PSO) [16] sets the PSO control parameters based on a phase angle (\(\theta \)), inspired by phasor theory in mathematics. In each iteration, the velocity is updated as

    $$\begin{aligned} v_{i}^{iter}= |cos\theta _{i}^{iter}|^{2*sin\theta _{i}^{iter}} \times (Pbest_{i}^{iter}-x_{i}^{iter}) +|sin\theta _{i}^{iter}|^{2*cos\theta _{i}^{iter}} \times (Gbest_{i}^{iter}-x_{i}^{iter}) \end{aligned}$$
    (23)

3.4.3 Metaphor-based algorithms

  • Harmony Search (HS) [15]: it generates a new harmony (candidate solution) based on three rules: memory consideration, pitch adjustment, and random selection. Random selection explores the global search space, enhancing exploration, while memory consideration and pitch adjustment ensure that good local solutions are kept.

  • Grey Wolf Optimiser (GWO) [29]: it is inspired by the social structure and hunting techniques of grey wolves. Based on the top three candidate solutions from the present population, each candidate solution is updated as

    $$\begin{aligned} x_{i}(t+1)=(x_{1}+x_{2}+x_{3})/3 , \end{aligned}$$
    (24)

    with

    $$\begin{aligned} x_{1}=x_{\alpha }-r_{1} D_{\alpha } \text{, } x_{2}=x_{\beta }-r_{2} D_{\beta } \text{, } x_{3}=x_{\gamma }-r_{3} D_{\gamma } , \end{aligned}$$
    (25)

    and

    $$\begin{aligned} D_{\alpha }=|C_{1} x_{\alpha }-x_{i}(t)| \text{, } D_{\beta }=|C_{2} x_{\beta }-x_{i}(t)| \text{, } D_{\gamma }=|C_{3} x_{\gamma }-x_{i}(t)| , \end{aligned}$$
    (26)

    where \(x_{\alpha }\), \(x_{\beta }\), and \(x_{\gamma }\) are the best three candidate solutions, \(r_{1}\), \(r_{2}\), and \(r_{3}\) are random numbers, \(C_{1}\), \(C_{2}\), and \(C_{3}\) are random coefficient vectors, and \(D_{\alpha }\), \(D_{\beta }\), and \(D_{\gamma }\) are the distances defined in (26).

  • Ant Lion Optimiser (ALO) [25]: ALO is based on the hunting habits of ant lions. The random walk of ants, an essential component of the ALO, is mathematically modelled as

    $$\begin{aligned} X(t) = [0,\ \text {cumsum}(2r(t_{1})-1),\ \text {cumsum}(2r(t_{2})-1),\ \ldots ,\ \text {cumsum}(2r(t_{n})-1)] \end{aligned}$$
    (27)

    where cumsum is the cumulative sum, n signifies the maximum number of iterations, t determines the step of the random walk, and r(t) is a binary threshold function. The position of the ant lion, which represents the centre of the trap, is a critical factor in influencing the movement of ants. This behaviour is modelled as a function based on fitness value. Finally, the position of ants is updated based on their interaction with the closest ant lion, simulating the capture process in the trap:

    $$\begin{aligned} x_{i}(t+1) = \frac{x_{i}(t) + x_{\text {antlion}}(t)}{2}. \end{aligned}$$
    (28)

    where \(x_{\text {antlion}}(t)\) is the position of the ant lion.

  • Dragonfly Algorithm (DA) [26]: DA includes the following five factors: separation, alignment, cohesion, attraction, and distraction. The separation distance of the i-th dragonfly is calculated as

    $$\begin{aligned} S_{i} = -\sum _{j=1}^{n} (x_{j} - x_{i}) \end{aligned}$$
    (29)

    where \(x_{i}\) is the position of the i-th dragonfly, \(x_{j}\) is the position of the j-th neighbouring individual, and n is the number of neighbouring dragonflies. Alignment behaviour aligns the velocity of a dragonfly with those of neighbouring dragonflies, defined by

    $$\begin{aligned} A_{i} = \frac{\sum _{j=1}^{n} V_{j}}{n} \end{aligned}$$
    (30)

    where \(V_{j}\) is the velocity of the j-th neighbouring dragonfly. Cohesion behaviour steers the dragonflies towards the centre of mass of the neighbours as

    $$\begin{aligned} C_{i} = \frac{\sum _{j=1}^{n} x_{j}}{n}-x_{i} \end{aligned}$$
    (31)

    Another operator is called attraction, which is defined as

    $$\begin{aligned} F_{i} = x_{\text {food}} - x_{i} \end{aligned}$$
    (32)

    where \(x_{\text {food}}\) is the position of the food source. Distraction from the enemy is also defined as

    $$\begin{aligned} E_{i} = x_{i} - x_{\text {enemy}} \end{aligned}$$
    (33)

    where \(x_{\text {enemy}}\) is the position of the enemy. Finally, the velocity and position should be updated as

    $$\begin{aligned} V_{i}^{t+1} = W \cdot V_{i}^{t} + S_{i} + A_{i} + C_{i} + F_{i} + E_{i} \end{aligned}$$
    (34)
    $$\begin{aligned} x_{i}^{t+1} = x_{i}^{t} + V_{i}^{t+1} \end{aligned}$$
    (35)

    where W is the inertia weight.

  • Whale Optimisation Algorithm (WOA) [28]: the social behaviour of humpback whales is modelled by the WOA. Encircling prey is one of the main updating mechanisms in WOA defined as

    $$\begin{aligned} D = |C \cdot x_{best} - x_{t}| \end{aligned}$$
    (36)
    $$\begin{aligned} x(t+1) = x_{best} - A \cdot D \end{aligned}$$
    (37)

    where A and C are coefficient vectors. In addition, spiral updating follows a spiral path to simulate the bubble-net behaviour, defined as

    $$\begin{aligned} \textbf{x}(t+1) = D' \cdot e^{b \cdot l} \cdot \cos (2\pi l) + x_{best} \end{aligned}$$
    (38)

    where \( D' = |x_{best} - x(t)| \), \( b \) is a constant defining the shape of the spiral, and \( l \) is a random number in \([-1, 1]\).

  • Sine Cosine Algorithm (SCA) [27]: the behavior of the sine and cosine functions serves as the foundation for the SCA algorithm. Each candidate solution is updated as

    $$\begin{aligned} x_{i}(t+1)={\left\{ \begin{array}{ll} x_{i}(t)+r_{1} \sin (r_{2})+|r_{3} p_{i}(t)-x_{i}(t)| &{} \text{ if } r_{4}<0.5 \\ x_{i}(t)+r_{1} \cos (r_{2})+|r_{3} p_{i}(t)-x_{i}(t)| &{} \text{ if } r_{4} \ge 0.5 \end{array}\right. } , \end{aligned}$$
    (39)

    where \(p_{i}\) is the destination solution, \(r_{1}\) is a control parameter that governs the movement amplitude, \(r_{2}\) is a random number between 0 and \(2\pi \), \(r_{3}\) is a random weight for \(p_{i}(t)\), and \(r_{4}\) is a random switch between the sine and cosine updates.

  • Gradient-based Optimiser (GBO) [3]: GBO, as one of the most recent PBMHs, is inspired by the gradient-based Newton’s method. The GBO algorithm moves based on a gradient-specified direction for each candidate solution in the current population. The GBO algorithm benefits from two main operators, namely, gradient search rule (GSR) and local escaping operator (LEO). The GSR employs a direction movement (DM) process for updating vector locations as

    $$\begin{aligned} x_{i}(t + 1) = x_{i}(t) - r1 \times \left( \frac{1}{2\Delta x} \times x_{t} \right) + r2 \times \left( \frac{2}{x_{\text {worst}} - x_{\text {best}}} \times x_{\text {best}} - x_{t} \right) \end{aligned}$$
    (40)

    where \(r_{1}\) and \(r_{2}\) are two random numbers and \(\Delta x\) is the difference between the best solution and a randomly selected neighbouring position. The LEO aims to enhance the exploitation search and helps to avoid local optima.

  • Arithmetic Optimisation Algorithm (AOA) [2]: AOA tries to find the optimal solution based on several arithmetic operators such as division and multiplication. The AOA algorithm benefits from two operators based on subtraction and addition for the exploitation phase. At the same time, the division search strategy and multiplication search strategy are responsible for the exploration phase. The updating process in the exploration phase is as

    $$\begin{aligned} x_{i}(t + 1) = {\left\{ \begin{array}{ll} \frac{x_{best}}{\text {MOP} + \varepsilon } \times (\text {UP} \times ((\text {UB}_i - \text {LB}_i) \times \mu + \text {LB}_i)) &{} \text {if } r_2 < 0.5 \\ x_{best} \times \text {MOP} \times ((\text {UB}_i - \text {LB}_i) \times \mu + \text {LB}_i) &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
    (41)

    where \(x_{best}\) is the best solution obtained so far, MOP (the math optimiser probability) and UP are control coefficients of the algorithm, \(\varepsilon \) is a small constant, \(\mu \) is a control parameter, \(UB_{i}\) and \(LB_{i}\) are the upper and lower bounds of the i-th variable, and \(r_{2}\) is a random number; MOP varies with the current iteration \(C_{\text {Iter}}\), the maximum number of iterations \(M_{\text {Iter}}\), and the sensitivity parameter \(\alpha \). In addition, the exploitation phase is defined as

    $$\begin{aligned} x_{i}(t + 1) = {\left\{ \begin{array}{ll} x_{best}-MOP \times (\text {UP} \times ((\text {UB}_i - \text {LB}_i) \times \mu + \text {LB}_i)) &{} \text {if } r_2 < 0.5 \\ x_{best}+MOP \times (\text {UP} \times ((\text {UB}_i - \text {LB}_i) \times \mu + \text {LB}_i)) &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
    (42)

3.5 Computational complexity

This section provides an analysis of the computational complexity of our proposed algorithm. In general, the computational complexity of an approach based on a metaheuristic algorithm depends on various factors, such as population size (\(N_{pop}\)), objective function, the number of iterations (I), problem dimensions (d), and operators. Typically, the computational complexity of the operators in our problem is lower than the objective function. Therefore, the complexity of the operators can be disregarded. Here, we first discuss the computational complexity of JPEG image compression, as it constitutes a significant part of the objective function. Since we have applied our proposed method across a wide range of optimisation algorithms, we present specific instances to illustrate that the computational complexity in this problem primarily depends on the objective function.

JPEG image compression

The computational complexity of JPEG image compression primarily relies on the algorithm’s elements and the input image’s dimensions. Converting colour space exhibits a computational complexity of O(N), where N is the pixel count in the input image. DCT operations are conducted on \(8 \times 8\) pixel blocks, with a computational complexity of \(O(N^{2})\). The quantisation phase entails dividing DCT coefficients by quantisation values and rounding to the nearest integer, resulting in a computational complexity of O(N). Utilising Huffman coding for entropy coding entails a computational complexity of O(NlogN). Consequently, the overall computational complexity of JPEG image compression is \(O(N^{2})\).

GA-based approach

To calculate the overall computational complexity of the GA-based JPEG image compression objective function, we need to consider the computational complexities of both the genetic algorithm operations and the JPEG image compression operations and then take the maximum of the two. Our proposed method for JPEG image compression using a GA incorporates four primary operators: tournament selection, multi-point crossover, objective function evaluation, and mutation. Other components of the GA-based JPEG image compression process are relatively computationally inexpensive. The computational complexity of tournament selection is typically represented as \(O(m N_{pop})\), where m denotes the number of candidate solutions participating in each tournament. This is because, during tournament selection, m candidate solutions are randomly drawn from the population, and the best individual among these m candidates is chosen for reproduction. This selection process is iterated for each candidate solution in the new population.

The computational complexity of crossover relies on both the length of the solution representation and the number of crossover points. Selecting K crossover points can be accomplished in constant time because the number of crossover points is predetermined and does not vary with the size of the solution representation; hence, the complexity of selecting crossover points is O(1). Once the crossover points are determined, the crossover operation entails swapping the segments between these points. The number of operations needed depends on the number of segments, \(K+1\); thus, the crossover operation's complexity is \(O(K+1)\). Consequently, the overall complexity of multi-point crossover is the sum of the complexities of selecting crossover points and performing the crossover operation, amounting to \(O(K+1)\). Since the number of crossovers in the GA does not exceed \(N_{pop}\), the computational complexity of the crossover operator is \(O((K+1) N_{pop})\). On the other hand, the computational complexity of mutating a single element is constant, denoted as O(1). When we need to mutate L elements in the solution representation, the overall computational complexity of mutation becomes O(L). Therefore, if we perform mutation for each candidate solution, the complexity of mutation is \(O(L N_{pop})\).

Thus, the total complexity of a single iteration amounts to \(O(m N_{pop}+ (K+1) N_{pop}+ L N_{pop}+N_{pop}N^{2} )\). In practical scenarios, the number of pixels in a standard image typically surpasses the population size. For instance, population sizes commonly range between 5 and 200, whereas a small image measuring \(100 \times 100\) encompasses 10000 pixels. Additionally, m is generally a small value compared to the population size. Therefore, given the significant difference in the number of pixels compared to the population size, we can deduce that the computational complexity of GA-based JPEG image compression per iteration equates to \(O(N_{pop} N^{2})\), and for the entire algorithm, it becomes \(O(I N_{pop} N^{2})\).

DE-based approach

Our proposed DE-based JPEG image compression consists of four computational components: standard mutation, crossover, selection, and the objective function. A donor vector is created for each individual in the population by adding the weighted difference of two randomly chosen individuals to a third individual; this operation typically has a complexity of O(d). In addition, the crossover operator has a complexity of O(d). The selection operator has an O(1) complexity because it only compares the trial vector with the target vector to determine which one survives to the next generation.

Moreover, DE assesses \(N_{pop}\) candidate solutions within each iteration. Given that the computational complexity of the objective function surpasses O(d) by a considerable margin, the computational complexity of DE-based JPEG Image Compression can be expressed as \(O(I N_{pop} N^{2})\).

PSO-based approach

The leading operators of PSO are the updating processes and the objective function. Updating the velocity and position of each particle involves arithmetic operations and comparisons; the complexity of this operation is typically \(O(N_{pop} d)\), since the position and velocity must be updated for each dimension of each particle. Given that \(O(N^{2})\) greatly exceeds \(O(N_{pop} d)\) once more, the computational complexity of PSO-based JPEG image compression can be expressed as \(O(I N_{pop} N^{2})\).

4 Experimental results

In this section, an extensive set of experiments is offered to show the effectiveness of our proposed strategy. To achieve this, we used 6 of the images recommended in [42] for benchmarking image quantisation, including Snowman, Beach, Cathedrals Beach, Dessert, Headbands, and Landscape, in addition to 7 widely used benchmark images for image compression, including Airplane, Barbara, Lena, Mandrill, Peppers, Tiffany, and Sailboat. The benchmark images are displayed in Fig. 3.

Our proposed strategy is embedded in 22 PBMH-based search strategies. We categorised them into three main groups: base, advanced, and metaphor-based algorithms. Base algorithms are the most famous algorithms, while advanced algorithms are the state-of-the-art variants of the base algorithms. Also, we selected some metaphor-based algorithms according to newness and the number of citations. Some metaphor-based algorithms such as AOA have been presented in recent years, while others such as GWO have attracted a significant number of citations in recent years.

To provide a fair comparison, each algorithm is executed 30 times independently. We provide results for two user-specified file sizes, 10000 and 50000 bytes. For each algorithm, the population size and the number of function evaluations are set to 20 and 1000, respectively. Other parameters are set to their default values (Table 1 in the paper's appendix). All algorithms are implemented in Python with the Mealpy framework [52], one of the largest Python modules for cutting-edge metaheuristic algorithms.

Fig. 3: Benchmark images

Table 1 Parameter settings for all algorithms
Table 2 Mean objective functions for the base algorithms

4.1 Evaluation criteria

We used the mean objective function, closeness, and confidence factor (CF) as the main criteria for evaluation. Closeness refers to the distance between the size of the output image and the user-specified file size, defined as

$$\begin{aligned} C=|FS_{output} - FS_{US}| \end{aligned}$$
(43)

where \(FS_{output}\) is the file size of the output image and \(FS_{US}\) is the user-specified file size. A lower closeness indicates a higher ability of an algorithm to find an output image with a file size similar to the user-specified file size.

Also, we defined the CF measure as

$$\begin{aligned} CF =\frac{\sum _{ind=1}^{Nr} \xi _{ind}}{Nr} \end{aligned}$$
(44)

where Nr is the number of independent runs for a specific algorithm (here, 30), CF is the confidence factor, and \(\xi _{ind}\) is the contribution of run ind, defined as

$$\begin{aligned} \xi _{ind}={\left\{ \begin{array}{ll} 1 &{} \text {if } C<CS \\ 0 &{} \text {otherwise} \end{array}\right. } , \end{aligned}$$
(45)

where C is the closeness measure of run ind and CS is the confidence coefficient, which specifies an acceptable deviation of the output file size (here, CS is 10000). A good algorithm therefore has a large CF value, ideally 1.
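A minimal sketch of the closeness and CF measures of Eqs. (43)-(45), with hypothetical output file sizes used only for illustration.

```python
def closeness(fs_output: float, fs_us: float) -> float:
    """Eq. (43): absolute deviation from the user-specified file size."""
    return abs(fs_output - fs_us)

def confidence_factor(fs_outputs, fs_us: float, cs: float = 10000.0) -> float:
    """Eqs. (44)-(45): fraction of independent runs whose closeness is below CS."""
    hits = [1 if closeness(fs, fs_us) < cs else 0 for fs in fs_outputs]
    return sum(hits) / len(hits)

# Hypothetical file sizes (bytes) from 5 runs targeting 50000 bytes.
runs = [50040, 49900, 61000, 50210, 49985]
print(confidence_factor(runs, fs_us=50000))   # 0.8
```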

In addition, to further clarify the behaviour of the search strategies for the proposed approach, we provide three criteria: diversity, exploitation, and exploration.

4.2 Results for the base algorithms

The results for the base algorithms can be seen in Tables 2, 3 and 4 in the paper's appendix. In each table, we also provide the rank of each algorithm per image. From Table 2 in the paper's appendix, GA obtains the best (lowest) rank in 24 out of 26 cases and the second rank in only 2 cases, leading to the best average rank and, subsequently, the first overall rank. The second average rank goes to ABC, while the MA algorithm ranks third. On the other hand, DE and PSO give the worst results.

Table 3 Closeness measure for the base algorithms
Table 4 CF measure for the base algorithms

From Table 3 in the paper's appendix, we can observe that the results are promising. Some algorithms, such as GA and ES, are able to accurately find an image with the user-specified file size; for instance, the deviation of the output file size for the Barbara image is 23.43 and 40.03 bytes for \(FS_{US}=10000\) and 50000, respectively. Based on the closeness measure, ES is the best-performing algorithm, followed by GA and ABC.

The results of the CF measure in Table 4 in the paper's appendix are consistent with the earlier tables, with GA, ES, and ABC being the best algorithms. In particular, GA obtains a CF measure equal to 1 in all but four cases, indicating that in the majority of cases it can find the desired image size. Some algorithms, such as PSO, MA, and DE, cannot provide satisfactory results; for instance, the CF measure of the PSO algorithm is between 0.13 and 0.67 in all cases.

Table 5 The objective function results for the advanced algorithms
Table 6 Closeness measure for the advanced algorithms

4.3 Results for the advanced algorithms

Advanced algorithms include state-of-the-art variants of the base algorithms. To this end, we selected 8 algorithms as search strategies. The mean objective function value and its rank for all algorithms and images are provided in Table 5 in the paper's appendix. From the table, we can observe that HPSO performs best, followed by CLPSO (by a narrow margin) and JADE. The worst algorithms are SAP-DE, SADE, and LevyES.

HPSO again obtained the lowest closeness measure (based on Table 6 in the paper's appendix), while the second and third ranks belong to LevyES and CLPSO. It is worth mentioning that LevyES does not perform well in terms of the objective function, but it is able to provide better results in terms of the closeness measure. Again, SADE and SAP-DE yield the worst overall ranks.

Table 7 CF measure results for the advanced algorithms
Table 8 Objective function results for the metaphor-based algorithms

Despite the efficiency of the HPSO algorithm in terms of the objective function and closeness measures, HPSO has not been able to maintain its efficiency according to the CF measure (Table 7 in the paper's appendix) and ranks third, while the first and second ranks go to CPSO and CLPSO, respectively. Again, SADE and SAP-DE have the lowest rankings.

All in all, we can say that HPSO and CLPSO perform best among the advanced algorithms, since HPSO has two first-place and one third-place overall ranks, while CLPSO obtains two second-place ranks and one third-place overall rank.

Table 9 Closeness measure for the metaphor-based algorithms
Table 10 CF measure for the metaphor-based algorithms

4.4 Results for the metaphor-based algorithms

The experimental results for the metaphor-based algorithms are presented in Tables 8, 9 and 10 in the paper's appendix. From Table 8 in the paper's appendix, the WOA algorithm achieves the first overall rank, followed by GWO and HS. Interestingly, although the AOA algorithm is one of the most recently introduced PBMH algorithms, it performs the worst.

The objective function results are almost consistent with the closeness measure. From Table 9 in the paper's appendix, the HS algorithm attains the best rank, while the second rank goes to the WOA algorithm. Also, SCA, DA and AOA give the worst results. In some cases, none of the metaphor-based algorithms provides satisfactory results; for instance, for the Airplane image and \(FS_{US}=10000\), the closeness is between 184.33 and 1474.53. In other cases, the effect of the search strategy is more tangible; for instance, for the Headbands image with \(FS_{US}=10000\), the HS algorithm achieves a closeness of 5.83 (an impressive result), while SCA obtains 2898.27, demonstrating the impact of the search strategy.

Table 11 Overall ranking of the algorithms based on objective function, closeness, and accuracy

The CF results for the metaphor-based algorithms can be seen in Table 10 in the paper's appendix. It is clear that WOA achieves the first average rank, followed by HS and GWO. Again, the worst ranks belong to SCA, AOA, and DA. Looking at the table, some algorithms such as WOA yield satisfactory results, with a CF of more than 0.97 in all cases, while others such as SCA cannot achieve a CF higher than 0.5, indicating the high impact of the search strategy on the effectiveness of the proposed approach.

Finally, we can say that among the metaphor-based algorithms, WOA, GWO, and HS (one of the oldest algorithms) present the best performance, while DA, SCA, and AOA (one of the most recent algorithms) can be categorised as the worst algorithms.

4.5 Overall evaluation

This section aims to provide an overall comparison among all 22 search strategies. To this end, we provide an overall ranking in Table 11 based on three criteria, objective function, closeness, and confidence factor.

Based on the objective function, HPSO achieves the first rank, followed by CLPSO in second and WOA in third. Only GA achieves a satisfactory rank among the base algorithms, while SAP-DE fails to achieve an acceptable rank among the advanced algorithms. The worst algorithms are DA, SAP-DE, and SCA.

HPSO obtained the second rank based on the closeness measure, while the first rank belongs to the HS algorithm. It is worth mentioning that HS achieves the sixth rank based on the objective function. WOA again attains the third rank in terms of the closeness measure.

CPSO is ranked first in terms of the CF measure, although it does not rank as well on the other criteria. The second and third ranks go to GA and CLPSO.

Moving from ES to LevyES, we observe that LevyES achieves a better performance in terms of the objective function (from rank 14 to 11) and similar ranks for the other two measures. A comparison between DE and its variants (SADE, SAP-DE, and JADE) reveals that SADE and JADE improve on the results of the DE algorithm, while SAP-DE fails to do so. In particular, the JADE algorithm performs best among the DE variants. In addition, a comparison between PSO and its variants, including CPSO, CLPSO, HPSO, and PPSO, shows that all variants outperform the standard PSO algorithm; in particular, HPSO obtains excellent results.

Another point that can be drawn from the table is that some algorithms show considerable variation across the criteria. For instance, CPSO ranks 11th on closeness while ranking first on CF. This shows that the mean deviation from the target file size obtained by CPSO is higher than that of some algorithms, but in most runs the deviation stays below the threshold, so its CF is high. From the table, GA, CLPSO, HPSO, HS, and WOA outperform the others, and their results show less variation across the criteria.

For further analysis, we also report the results of the Wilcoxon signed-rank test at the 5% significance level, as a pairwise statistical test between all combinations of the algorithms, to show whether the algorithms are significantly different.

The test indicates whether there is a statistically significant difference between the performance of any two algorithms. The null hypothesis, \(H_{0}\), states that the two algorithms under evaluation behave the same, while the alternative hypothesis, \(H_{1}\), suggests that there is a noticeable difference. The significance level is the probability of rejecting \(H_{0}\) when it is true; if the computed p-value is less than the significance level, \(H_{0}\) is rejected. The results of the statistical tests are shown in Table 12.
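To illustrate how such a pairwise comparison could be carried out, the following minimal Python sketch (not the authors' code) applies `scipy.stats.wilcoxon` to paired per-image objective values; the algorithm names and the numbers are purely hypothetical placeholders.

```python
# Minimal sketch of a pairwise Wilcoxon signed-rank comparison (hypothetical data).
from scipy.stats import wilcoxon

def compare(scores_a, scores_b, alpha=0.05):
    """Return '+', '-', or '=' for algorithm A versus algorithm B.

    scores_a, scores_b: paired mean objective values (one entry per test image);
    lower is better.
    """
    stat, p_value = wilcoxon(scores_a, scores_b)
    if p_value >= alpha:            # H0 not rejected: statistically equivalent
        return "="
    # H0 rejected: decide the direction from which algorithm has lower objective values
    return "+" if sum(scores_a) < sum(scores_b) else "-"

# Hypothetical paired results on a handful of images
hpso = [0.12, 0.08, 0.15, 0.11, 0.09, 0.13, 0.10]
sca  = [0.35, 0.29, 0.41, 0.33, 0.30, 0.38, 0.36]
print(compare(hpso, sca))   # expected: '+'
```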

From the table, we can see that HPSO performs significantly better than the others: in 19 cases, HPSO outperforms its competitors significantly, while its results are statistically equivalent to those of PSO and ES. The next best-performing algorithms are CLPSO and WOA (18 wins, 2 ties, and 1 loss) and GWO (18 wins and 3 losses). On the other hand, several algorithms such as DA (21 losses), SAP-DE (1 win, 2 ties, and 18 losses), and SCA (2 wins, 1 tie, and 18 losses) do not perform well enough in comparison to the others.

Table 12 Results of the Wilcoxon signed-rank test based on mean objective function value. \(+\), −, and \(=\) denote that the algorithm in the corresponding row is statistically superior to, inferior to, or equivalent to the algorithm in the corresponding column

4.6 Further discussion

This section provides a further discussion of the behaviour of the search strategies, in particular with respect to computation time, population diversity, exploration, and exploitation. In the first experiment, we evaluate the algorithms in terms of computation time. It is worth mentioning that the representation and the objective function are the same for all algorithms; therefore, differences in computation time are attributable to the search strategy. The experiments were run on a desktop PC with Linux version 42.2, an i7-7700K CPU at 4.20 GHz, 64 GB RAM, and a 1 TB SSD. Figure 4 shows the average computation time for all algorithms on a representative image, Airplane. At first sight, we observe that the computation time for most algorithms lies between slightly less than 100 seconds and slightly more than 140 seconds, the exceptions being SAP-DE and ALO. The SAP-DE algorithm took the least time to run, but, based on the earlier results, it does not perform well enough on the other criteria. Another finding is that, for different file sizes and the same algorithm, the computation times are almost identical, meaning that the user-specified file size has only a small impact on the computation time.

Fig. 4 Computation time for different algorithms and the Airplane image

In the next experiment, we evaluated the population diversity. Population diversity serves as a gauge of how the solutions are distributed within the population. In this study, we define population diversity using the Euclidean distance metric as

$$\begin{aligned} D=\frac{1}{NP} \sum _{i=1}^{NP}\sqrt{\sum _{k=1}^{D}(x_{ik}-\bar{x_{k}})^{2}} \end{aligned}$$
(46)

where NP is the population size, D is the problem dimension, \(x_{ik}\) denotes the k-th dimension of the i-th individual, and \(\bar{x_{k}}\) is the population mean of the k-th dimension, defined as

$$\begin{aligned} \bar{x_{k}} = \frac{1}{NP} \sum _{i=1}^{NP} x_{ik} \end{aligned}$$
(47)

given \(\bar{x}=[\bar{x_{1}},\bar{x_{2}},...,\bar{x_{k}},...,\bar{x_{D}}]\).
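As an illustration, the following minimal Python/NumPy sketch (not the authors' implementation) computes the diversity measure of (46)–(47) for a population stored as an array; the random population in the example is purely hypothetical.

```python
import numpy as np

def population_diversity(pop):
    """Diversity of Eqs. (46)-(47): mean Euclidean distance of the
    individuals to the population mean vector.

    pop: array of shape (NP, D) -- NP candidate solutions of dimension D.
    """
    x_bar = pop.mean(axis=0)                           # Eq. (47): dimension-wise mean
    dists = np.sqrt(((pop - x_bar) ** 2).sum(axis=1))  # Euclidean distance per individual
    return dists.mean()                                # Eq. (46)

# Toy example with a hypothetical random population (NP=50, D=64)
rng = np.random.default_rng(0)
print(population_diversity(rng.uniform(0, 255, size=(50, 64))))
```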

To this end, we selected two representative algorithms: HPSO, as one of the best-performing algorithms, and SAP-DE, as one of the worst-performing. Figure 5 shows the population diversity during the optimisation process for the two representative algorithms on all images. Since the stopping criterion in this paper is defined as a number of function evaluations, and SAP-DE performs more objective function evaluations per iteration than the population size, the number of iterations for SAP-DE is lower than for HPSO. It can be seen that the diversity of HPSO gradually decreases, meaning that exploration is high in the early stages of the optimisation process, while over the iterations exploitation is strengthened and exploration is reduced. In all cases, SAP-DE fluctuates significantly; in some cases, such as Snowman, the trend is even upward, and in others, such as Cathedrals beach, a downward trend is followed by an upward one.

Fig. 5 Diversity measure

The performance of an optimisation algorithm is significantly influenced by its exploration and exploitation capabilities, and a competent optimisation technique should ideally balance these two competing goals [8, 9]. When exploitation predominates, the population quickly loses its diversity and the algorithm converges prematurely to a local optimum. On the other hand, if exploration dominates, the algorithm spends considerable time exploring unnecessary regions of the search space. Hussain et al. [17] introduced two criteria to measure exploration and exploitation. To this end, the centre of the population is first calculated based on the dimension-wise median as

$$\begin{aligned} Div_{j}=\frac{1}{NP} \sum _{i=1}^{NP} \left| median(x^{j})-x_{i}^{j}\right| \end{aligned}$$
(48)
$$\begin{aligned} Div=\frac{1}{D} \sum _{j=1}^{D} Div_{j} \end{aligned}$$
(49)

where \(median(x^{j})\) is the median of dimension j over the entire population, \(x_{i}^{j}\) is dimension j of the i-th candidate solution, NP is the population size, and D is the dimensionality of the problem.

The exploration and exploitation percentages of an algorithm, for each iteration, can then be calculated as

$$\begin{aligned} XPL=\frac{Div}{Div_{max}} \times 100 \end{aligned}$$
(50)

and

$$\begin{aligned} XPT=\frac{\left| Div-Div_{max}\right| }{Div_{max}} \times 100 \end{aligned}$$
(51)

where \(Div_{max}\) denotes the maximum diversity over all iterations, and XPL and XPT are the exploration and exploitation percentages for an iteration, respectively.
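A minimal Python/NumPy sketch of these measures, assuming the population is recorded at every iteration, could look as follows; the function names and the random populations are hypothetical and serve only to show how (48)–(51) can be evaluated.

```python
import numpy as np

def dimension_wise_diversity(pop):
    """Div of Eqs. (48)-(49): mean absolute deviation from the dimension-wise median."""
    med = np.median(pop, axis=0)            # median of each dimension over the population
    div_j = np.abs(med - pop).mean(axis=0)  # Eq. (48), one value per dimension
    return div_j.mean()                     # Eq. (49)

def exploration_exploitation(div_history):
    """XPL and XPT percentages of Eqs. (50)-(51) for every iteration."""
    div = np.asarray(div_history, dtype=float)
    div_max = div.max()
    xpl = 100.0 * div / div_max
    xpt = 100.0 * np.abs(div - div_max) / div_max
    return xpl, xpt

# Hypothetical use: record Div for each iteration, then convert to percentages
rng = np.random.default_rng(1)
div_history = [dimension_wise_diversity(rng.uniform(0, 255, size=(50, 64)))
               for _ in range(10)]
xpl, xpt = exploration_exploitation(div_history)
```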

Figure 6 shows the exploration measure for all images and the two representative algorithms. It clearly indicates that HPSO gradually decreases its exploration over the course of the iterations, while the SAP-DE algorithm exhibits drastic fluctuations. Since the exploitation measure is complementary, we do not include it in the paper. Therefore, we can conclude that SAP-DE cannot provide sufficient and consistent exploration and exploitation over time, and as a result its performance is degraded.

Fig. 6 Exploration measure

5 Conclusion

The JPEG standard is one of the most widely employed algorithms in image processing. The quantisation table (QT) influences image properties such as file size and image quality. Several studies suggest that population-based metaheuristic (PBMH) algorithms can be used to find the right values for the QT(s). However, our study shows that these algorithms suffer from three main problems: first, they do not take the user's opinion into account; second, the current works cannot adequately cover the entire search space; and third, the quality factor in PBMH-based JPEG image compression algorithms must be determined in advance. To tackle these problems, we re-formulated population-based JPEG image compression so that both the representation and the objective function are changed. By changing the objective function, we incorporated the user's opinion on image file size; in other words, the file size can be controlled in advance by the user. In addition, our new representation solves the problem of insufficient coverage of the search space, and a further benefit is that the quality factor can be selected automatically. Since both the objective function and the representation are independent of the PBMH, any type of PBMH can be employed to this end. As the fourth contribution, this paper benchmarks 22 PBMHs, both state-of-the-art and newly-introduced algorithms, on the new formulation of JPEG image compression.

Despite the effectiveness of the proposed strategy, this work can be extended in the future with the following points:

  • One of the assumptions in the objective function was that both objectives have the same weight, although they can differ. Defining a weight-based objective function is therefore a potential direction for future research.

  • This paper incorporates the file size as the user's opinion, while other user preferences could also be added to the proposed approach.

  • A multi-objective variant of the proposed approach is under investigation.