1 Introduction

The knapsack problem (KP) and its variants are among the most researched combinatorial optimization problems. Due to the NP-hardness of these problems, many heuristic and metaheuristic methods have been developed for solving them. In this paper, a general method is proposed that can potentially be applied to a wide range of them. The approach is illustrated on two variants of the knapsack problem with different properties. To be specific, the proposed approach is applied to the multidimensional knapsack problem (MKP) and the knapsack problem with forfeit sets (KPFS).

The MKP is a versatile problem with a wide range of practical applications such as cutting stock (Gilmore and Gomory 1966), loading (Shih 1979), and many others. However, solving it is computationally challenging due to its NP-hardness (Garey and Johnson 1979). Despite this, it has been extensively studied, and numerous solution approaches have been proposed. A comprehensive review of representative studies up to 2004 can be found in Fréville (2004), and more recent studies are discussed in Lai et al. (2018). Interesting variants of the MKP include the multiple multidimensional knapsack problem (Mancini et al. 2021), the multiple-choice multidimensional knapsack problem (Chen and Hao 2014), the robust multiple-choice multidimensional knapsack problem (Caserta and Voß 2019), and the multidemand multidimensional knapsack problem (Lai et al. 2019).

Another group of variations of the KP is based on the idea of incompatible items (Basnet 2018; Coniglio et al. 2021). One of the main representatives of this type of problem is the knapsack problem with conflict graphs (KPCG), which deals with incompatibilities between item pairs and is proven to be strongly NP-hard (Pferschy and Schauer 2009; Li et al. 2021). A closely related variant is the knapsack problem with forfeits (KPF), where item pairs come with associated penalty costs (Capobianco et al. 2022). In this paper, the analysis focuses on the newly introduced knapsack problem with forfeit sets (KPFS) which can be understood as a generalization of the KPF (D’Ambrosio et al. 2023). In it, forfeit costs are associated with subsets of items, and an allowance parameter determines how many items can be chosen from each set before incurring penalty costs.

1.1 Related work

Methods for solving the MKP can be exact or heuristic. Exact methods frequently rely on branch and bound, as in Shih (1979) and Vimont et al. (2008), or hybridize it with other strategies, as in Boussier et al. (2010) and Mansini and Speranza (2012). The best exact algorithms (e.g., Vimont et al. (2008), Boussier et al. (2010), and Mansini and Speranza (2012)) quickly produce optimal solutions for small benchmark instances but have a prohibitive computational cost for large ones. Besides exact solution methods, the literature contains several heuristic algorithms categorized as either single-solution-based local search or population-based optimization. Local search algorithms, such as tabu search (Dammeyer and Voß 1993; Glover and Kochenberger 1996; Hanafi and Freville 1998; Vasquez and Vimont 2005), simulated annealing (Drexl 1988), and kernel search (Angelelli et al. 2010), have been shown to be successful. Population-based methods include genetic and memetic algorithms (Chu and Beasley 1998; Rezoug et al. 2018), hybrid binary particle swarm optimization (Haddar et al. 2016; Mingo López et al. 2018), ant colony optimization (Al-Shihabi and Ólafsson 2010), path relinking (Arin and Rabadi 2016), and many others. The MKP has also attracted research with a broader artificial intelligence focus. For instance, in García et al. (2020), the authors propose an improved binarization framework that uses the K-means technique to enable the use of continuous metaheuristics for the MKP.

One of the best-performing metaheuristic algorithms for the MKP is the two-phase tabu-evolutionary algorithm (TPTEA) (Lai et al. 2018) which was also highly successful for the multidemand multidimensional knapsack problem (Lai et al. 2019). This approach combines two solution-based tabu search methods with an evolutionary framework that utilizes a hyperplane-constrained crossover operator to generate offspring solutions. Additionally, it employs a dynamic method to identify areas of interest for the search and a diversity-based population updating rule to ensure that a diverse and healthy population is maintained. Another effective method for the MKP, known as the diversity-preserving quantum particle swarm optimization (Lai et al. 2020), relies on quantum particle swarm optimization (Haddar et al. 2016) and exhibits a substantially lower computational cost compared to TPTEA. However, it achieves solutions of lower quality, as demonstrated by Lai et al. (2020). This technique combines a diversity-preserving strategy based on distance to manage the population, along with a solution improvement method that employs variable neighborhood descent for local optimization.

Since the KPFS is a newly introduced problem, only limited research has been done on the development of solution methods. To be specific, in the work of D’Ambrosio et al. (2023), a mixed integer programming model has been introduced. In the same paper, a fast solution method based on the carousel greedy algorithm is also presented. In addition, an advanced memetic algorithm (MA) is proposed which extends the genetic algorithm paradigm by including a local refinement mechanism.

1.2 Fixed set search

The fixed set search (FSS) is a population-based metaheuristic that adds a learning mechanism to the greedy randomized adaptive search procedure (GRASP) (Feo and Resende 1995). The GRASP involves generating solutions using a randomized greedy algorithm and applying a local search to each of them. The FSS has been successfully applied to solve several problems, including the traveling salesman problem (Jovanovic et al. 2019), the power dominating set problem (Jovanovic and Voss 2020), machine scheduling (Jovanovic and Voß 2021), the minimum weighted vertex cover problem (Jovanovic and Voß 2019), the covering location with interconnected facilities problem (Lozano-Osorio et al. 2023), the clique partitioning problem (Jovanovic et al. 2023b), as well as bi-objective optimization problems (Jovanovic et al. 2022). The FSS is inspired by the fact that high-quality solutions for a specific problem instance often have many of the same elements in common. Therefore, the FSS generates new solutions that contain these fixed elements, and the computational effort is focused on completing the partial solution. The idea of using frequently occurring elements in high-quality solutions is based on earlier notions of chunking (Voß and Gutenschwager 1998; Woodruff 1998), vocabulary building, and consistent chains (Sondergeld and Voß 1999) as they have been used in relation to tabu search. Closely related concepts have also been used in the matheuristic POPMUSIC paradigm (Taillard and Voß 2002).

Matheuristic approaches, such as kernel search (Angelelli et al. 2010; Maniezzo et al. 2021) and heuristic concentration (Rosing and ReVelle 1997), are based on the idea of generating numerous solutions to identify elements that commonly appear in high-quality solutions. These methods then fix these elements and solve the corresponding mathematical programming problem. While kernel search and heuristic concentration thus share similarities with the FSS, the FSS utilizes a mechanism that can generate a more diverse range of fixed sets or kernels, based on elements that frequently appear in different subsets of high-quality solutions, resulting in a more efficient global search.

1.3 Contributions

One of the drawbacks of the best-performing methods for the MKP and the KPFS is that they have a highly complex implementation. The main reason is that they hybridize different metaheuristics and use several distinct solution neighborhoods. In this work, the goal is to provide a simple-to-implement method to solve the MKP and the KPFS based on the FSS. To be specific, the idea is to combine the FSS with the use of integer programming (IP). Related methods combining heuristics/metaheuristics with IPs are frequently called matheuristics, for which a recent review can be found in Boschetti and Maniezzo (2022). It should be noted that a method of this type has recently been successfully applied to the closely related quadratic multiple knapsack problem (Galli et al. 2023).

In the development of the matheuristic FSS (MFSS) for the MKP and the KPFS, several new concepts are explored in relation to the basic FSS. Firstly, a novel way of defining the ground set of elements for 0-1 problems is introduced that aims to maximize the amount of information that the fixed set provides. Next, an effective mechanism is proposed for incorporating the learning mechanism in a matheuristic setting. In this way, the need for defining a randomized greedy algorithm and a local search, as in the original FSS, can be avoided, making the implementation of the method less complex. Another novel idea in the MFSS is using the method for generating fixed sets to diversify the generated solutions. The conducted computational experiments show that the MFSS is highly competitive with state-of-the-art methods for the MKP and outperforms the ones for the KPFS.

The paper is organized as follows. Section 2 is dedicated to the problem formulations of the MKP and the KPFS. Next, Sect. 3 provides details of applying the MFSS to the problems of interest. The following Sect. 4 is dedicated to the presentation of the conducted computational experiments. The paper is finalized in Sect. 5 with some concluding remarks.

2 Problem formulations

The knapsack problem is defined for a set of items V and a capacity c. Each item \(j \in V\) has a non-negative profit value \(p_j\) and a non-negative weight \(w_j\). The goal is to select a set of items \(S \subseteq V\) such that the sum of the profit values of the items in S is maximized while satisfying the constraint that the sum of their weights is less than or equal to the capacity c.

2.1 The multidimensional knapsack problem

The multidimensional knapsack problem (MKP) extends this concept to multiple constraints. Now, the capacity is an m-dimensional vector with values \(c_i\), for \(i = 1..m\). Each item \(j \in V\) has an m-dimensional weight vector with values \(w_{ij}\). The goal is to maximize the sum of profit values while ensuring that the sum of the weights in each dimension \(i = 1..m\) does not exceed the capacity \(c_i\).

Formally, the MKP can be specified using an IP model with a set of binary decision variables \(x_j\) defined for \(j \in V\) using the following objective function:

$$\begin{aligned} \text {Maximize} \quad \sum _{j \in V}p_jx_j \end{aligned}$$
(1)
$$\begin{aligned} \text {Subject to} \nonumber \\ \sum _{j \in V}w_{ij}x_j \le c_i \quad i = 1..m \end{aligned}$$
(2)
$$\begin{aligned} x_j \in \{0,1\} \quad j \in V \end{aligned}$$
(3)

Constraints (2) guarantee the satisfaction of the capacity constraints for each of the m dimensions, and variable definitions are provided in (3).
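As a small illustration of model (1)–(3), the following Python sketch checks the feasibility of a candidate 0-1 selection against all m capacity constraints and evaluates its objective value. The instance data are made up for the example and do not come from a benchmark.

```python
# Minimal sketch of the MKP model (1)-(3): profits p, one weight row per
# constraint dimension in w, capacities c, and a 0-1 selection vector x.
# All data are illustrative.

def mkp_objective(p, x):
    # Objective (1): sum of profits of the selected items.
    return sum(pj * xj for pj, xj in zip(p, x))

def mkp_feasible(w, c, x):
    # Constraints (2): in each dimension i, total weight must not exceed c_i.
    return all(sum(wi[j] * x[j] for j in range(len(x))) <= ci
               for wi, ci in zip(w, c))

p = [10, 7, 4, 9]                 # profits p_j
w = [[3, 2, 5, 4],                # weights w_{1j}
     [1, 6, 2, 3]]                # weights w_{2j}
c = [8, 7]                        # capacities c_i

x = [1, 0, 0, 1]                  # select items 0 and 3
print(mkp_feasible(w, c, x))      # True: 3+4 <= 8 and 1+3 <= 7
print(mkp_objective(p, x))        # 19
```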

2.2 The knapsack problem with forfeit sets

In this subsection, the formulation of the KPFS is given as proposed by D’Ambrosio et al. (2023). The KPFS uses the same set of items \(j \in V\), item profit values \(p_j\), item weights \(w_j\), and capacity c as in the KP. Also, a solution \(S \subseteq V\) must satisfy the constraint that the total weight of the items in the solution is less than or equal to the knapsack capacity c.

Additionally, there exists a collection of l forfeit sets, denoted as \(C = \{C_1, \dots , C_l \}\), where each set \(C_i\) is a subset of V. These sets satisfy the condition \(|C_i| \ge 2\) for \(i = 1,..,l\), where the notation \(|C_i|\) is used for the cardinality of the set. Each set \(C_i\) is associated with a non-negative cost \(d_i\) and an integer allowance \(h_i\), ensuring \(0 \le h_i \le |C_i|\). For a given solution S, \(n_i^S = |C_i \cap S|\) is introduced to represent the number of elements shared between \(C_i\) and S. If \(n_i^S > h_i\), a penalty equal to \((n_i^S - h_i) d_i\) must be paid. In such instances, we state that \(n_i^S - h_i\) violations are linked to \(C_i\) in solution S. Finally, an integer upper bound \(k \ge 0\) is imposed on the total number of violations allowed in a solution. Formally, a solution S is considered feasible if and only if:

$$\begin{aligned} \sum _{i \in \{1, \ldots , l\}: n_i^S > h_i} (n_i^S - h_i) \le k \end{aligned}$$
(4)

This condition ensures that the cumulative violations across all forfeit sets in S do not exceed the predefined limit k.
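Condition (4) can be evaluated directly from the solution and the forfeit sets, as in the following sketch; the instance data are illustrative only.

```python
# Sketch of the feasibility condition (4) for the KPFS: sum the violations
# max(0, n_i^S - h_i) over all forfeit sets and compare against the bound k.
# Data are illustrative.

def total_violations(solution, forfeit_sets, allowances):
    # n_i^S = |C_i ∩ S|; a set contributes max(0, n_i^S - h_i) violations.
    return sum(max(0, len(C & solution) - h)
               for C, h in zip(forfeit_sets, allowances))

S = {0, 1, 3}                       # selected items
C = [{0, 1, 2}, {1, 3}]             # forfeit sets C_1, C_2
h = [1, 0]                          # allowances h_1, h_2
k = 3                               # global violation bound

v = total_violations(S, C, h)       # (2-1) + (2-0) = 3
print(v, v <= k)                    # 3 True
```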

The integer programming formulation is as follows. For each \(j \in V\), a binary variable \(x_j\) is defined and equals 1 if j is chosen, and 0 otherwise. Next, for each \(C_i \in C\), the integer variable \(v_i\) is defined and represents the number of violations associated with the set. Using these decision variables, the IP for the KPFS can be fully specified using the following formulation.

$$\begin{aligned} \text {Maximize} \quad \sum _{j=1}^{n} p_j x_j - \sum _{i=1}^{l} d_i v_i \end{aligned}$$
(5)
$$\begin{aligned} \text {Subject to} \nonumber \\ \sum _{j=1}^{n} w_j x_j \le c \end{aligned}$$
(6)
$$\begin{aligned} \sum _{i=1}^{l} v_i \le k \end{aligned}$$
(7)
$$\begin{aligned} \sum _{j\in C_i}x_j - v_i \le h_i \quad i = 1..l \end{aligned}$$
(8)
$$\begin{aligned} x_j \in \{0,1\} \quad j \in V \end{aligned}$$
(9)
$$\begin{aligned} v_i \in \{0, \dots , |C_i| - h_i\} \quad i = 1..l \end{aligned}$$
(10)

The objective function, given in (5), maximizes profit, which is the sum of the profits of all selected items minus the associated costs. Constraint (6) ensures that the sum of the weights of items in the solution does not exceed the capacity. Similarly, Constraint (7) enforces the limit on the number of allowed violations. Constraints (8) represent the relationship among the selected items, the allowance value, and the resulting violations for each forfeit set. Finally, Constraints (9)–(10) define the domain of the decision variables. It should be noted that, as proven by D’Ambrosio et al. (2023), in the formulation of the KPFS, it is not necessary for variables \(v_i\) to be integer.
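For a feasible solution, the objective (5) equals the total profit minus the forfeit costs paid per violation, which the following sketch computes directly (with violations set to their minimal feasible values, as the model would); the data are illustrative.

```python
# Sketch of the KPFS objective (5): profit of the selected items minus the
# forfeit cost d_i paid for each violation of set C_i. Data are illustrative.

def kpfs_objective(solution, p, forfeit_sets, allowances, costs):
    profit = sum(p[j] for j in solution)
    # v_i takes its minimal value max(0, n_i^S - h_i) in an optimal solution.
    penalty = sum(d * max(0, len(C & solution) - h)
                  for C, h, d in zip(forfeit_sets, allowances, costs))
    return profit - penalty

p = [8, 5, 6, 7]
S = {0, 1, 3}
C = [{0, 1, 2}, {1, 3}]
h = [1, 0]
d = [2, 3]

print(kpfs_objective(S, p, C, h, d))   # (8+5+7) - (1*2 + 2*3) = 12
```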

3 A matheuristic based on the fixed set search

The FSS algorithm takes advantage of the fact that many high-quality solutions for a particular combinatorial optimization problem have common elements. The FSS incorporates some of these elements into newly generated solutions and focuses computational effort on finding optimal or near-optimal solutions in the corresponding subset of the solution space. This selected set of common elements is called the “fixed set.” The goal of the FSS is to “fill in the gaps” and complete the partial solution corresponding to the fixed set. In the FSS, this is exploited by adding a learning mechanism to the GRASP. In this section, we present the MFSS, which extends this concept to a matheuristic setting. The MFSS consists of several building blocks: representing the problem solution as a subset of a ground set of elements, defining methods for generating fixed sets, implementing the learning mechanism, and defining a method for completing a solution from a fixed set.

3.1 Fixed set

In this section, we describe the approach for generating fixed sets for the MKP and the KPFS. To use the MFSS algorithm, a solution must be represented as a subset of a ground set of elements. For the two problems of interest, a natural way to represent a solution is as a subset S of the set of all items V. In the adaptation of the FSS to a matheuristic approach, our goal is to use a representation of a solution that provides as much information as possible on all items \(i \in V\). Because of this, an alternative ground set is used. Let us note that each item \(i \in V\) can either be inside the knapsack or outside of it. Using this idea, the ground set can be defined as \(G = V \times \{\top , \bot \}\), where a pair \((i,\top )\) means that item i is selected to be inside the knapsack and \((i,\bot )\) the opposite. Now, a solution S is a subset of the ground set G that satisfies \(|S| = |V|\), or, in other words, has the same cardinality as the set of items. A graphical illustration of the two types of ground sets is shown in Fig. 1.

Fig. 1
Illustration of the different ground sets for the MKP and the KPFS. The items inside the gray shape represent the solution. In the case of the extended ground set, circles with solid lines indicate that an item is inside the knapsack, and circles with dashed lines the opposite.

The next step is defining a procedure for generating multiple fixed sets F with a controllable size (cardinality) |F|. Note that it should be possible to use such fixed sets to produce feasible solutions of equal or higher quality than the solutions already generated. We begin with some definitions: The notation \({\mathcal {S}}_n = \{S_1,.., S_n\}\) represents the set of the n best solutions generated in the previous steps of the algorithm. A base solution \(B \in {\mathcal {S}}_n\) is a randomly selected solution from the best n solutions. If the fixed set satisfies \(F \subset B\), it can be used to generate a feasible solution of at least the same quality as B. Moreover, F can contain an arbitrary number of elements of B. The idea is to create F such that it contains elements that occur frequently in a group of high-quality solutions. We define \({\mathcal {S}}_{kn}\) as the set of k randomly selected solutions out of the n best solutions \({\mathcal {S}}_n\). Let us define the function \(C(e,S)\), for a solution S and element e, as 1 if \(e \in S\) and 0 otherwise. Using \(C(e,S)\), we can count the number of times an element e occurs in \({\mathcal {S}}_{kn}\) with the function:

$$\begin{aligned} O(e, {\mathcal {S}}_{kn}) = \sum _{S \in {\mathcal {S}}_{kn}}C(e,S) \end{aligned}$$
(11)

Then, we define \(F \subset B\) as the set of Size elements \(e \in B\) with the largest values of \(O(e, {\mathcal {S}}_{kn})\). Furthermore, we define the function \(F = Fix(B,{\mathcal {S}}_{kn}, Size)\) as the fixed set with Size elements generated for a base solution B and a set of solutions \({\mathcal {S}}_{kn}\).

In the case of the original FSS, the diversification is done through the use of a randomized greedy algorithm with a pre-selected set of elements. In the case of the MFSS, this is partly done in the method for generating the fixed set. To be more precise, we utilize the way ties are resolved. Let us make the following observation. The last element e that should be added to the fixed set F has the value \(O(e, {\mathcal {S}}_{kn})=f\). In the general case, there are multiple elements that have the same value of this function. Let us assume that there is a total of \(l>Size\) elements \(e \in B\) with a value of the function \(O(e, {\mathcal {S}}_{kn})\) greater than or equal to f. Let us use the notation \(\hat{F}\) for the set of such elements. In such cases, the function \(Fix(B,{\mathcal {S}}_{kn}, Size)\) returns \(\hat{F}\) with \(l-Size\) random elements removed. Note that a removed element e does not necessarily have the lowest value of \(O(e, {\mathcal {S}}_{kn})\). A graphical illustration of the method for generating fixed sets is shown in Fig. 2.

Fig. 2
Illustration of the method for generating fixed sets. It is assumed that the size of the fixed set is \(Size=3\), the number of used best solutions is 6, and 4 test solutions are used. An element of a solution is represented using a circle with a solid or dashed line, indicating whether the item is inside or outside the knapsack. A solution is represented using the elements inside a gray rectangle.
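The fixed-set generation described above, including the randomized tie-breaking, can be sketched as follows. Solutions are assumed to be stored as frozensets of (item, in/out) pairs from the extended ground set; all data are illustrative.

```python
import random

# Sketch of F = Fix(B, S_kn, Size): count occurrences O(e, S_kn) of each
# element of the base solution B over the sampled solutions S_kn, collect
# all elements tied at or above the Size-th largest count, and discard the
# surplus at random. Data are illustrative.

def fix(B, S_kn, size):
    occ = {e: sum(e in S for S in S_kn) for e in B}   # O(e, S_kn)
    f = sorted(occ.values(), reverse=True)[size - 1]  # count of the Size-th element
    F_hat = [e for e in B if occ[e] >= f]             # all elements tied at >= f
    return set(random.sample(F_hat, size))            # remove |F_hat|-Size at random

B = frozenset([(0, True), (1, False), (2, True), (3, True)])
S_kn = [frozenset([(0, True), (2, True), (3, False)]),
        frozenset([(0, True), (1, False), (2, True)]),
        frozenset([(0, True), (2, True), (3, True)])]

F = fix(B, S_kn, 2)
print(F.issubset(B), len(F))   # True 2
```

Since \(F \subset B\), every generated fixed set can be completed to a feasible solution at least as good as the base solution B.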

3.2 Integer program use

The idea of the proposed matheuristic approach is to exploit the fact that, for many problems, an IP solver is highly efficient up to a specific instance size. Moreover, for such instance sizes, standard IP solvers generally find a near-optimal solution at a low computational cost, and the majority of the time is often spent on finding the optimal solution and proving its optimality.

In the proposed work, the idea is to use fixed sets to create an IP that can be efficiently solved using standard solvers. A simple way to decrease the computational cost of solving an IP is by fixing the values of some decision variables. This can naturally be done using fixed sets by adding the following set of constraints to the IP model.

$$\begin{aligned} x_i = 0 \quad i \in \{j \mid (j,\bot ) \in F \} \end{aligned}$$
(12)
$$\begin{aligned} x_i = 1 \quad i \in \{j \mid (j,\top ) \in F \} \end{aligned}$$
(13)

Equation (12) guarantees that for any element of the fixed set F of the form \((j, \bot )\), item j is not inside the knapsack. Analogously, for an element \((j, \top )\), j is in the knapsack. Note that for the MKP, this model is equivalent to solving the knapsack problem with all the items that do not appear in the fixed set with appropriately adapted capacity constraints.
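The effect of fixing variables via (12) and (13) on the MKP can be sketched as a reduction to a residual problem: items fixed to 1 consume capacity, items fixed to 0 disappear, and only the free items remain to be decided. The helper name and data below are illustrative.

```python
# Sketch of the residual MKP obtained after adding constraints (12)-(13):
# items fixed inside reduce the capacities, items fixed outside are dropped,
# and the solver only decides the remaining free items. Data are illustrative.

def residual_mkp(V, w, c, F):
    fixed_in = {j for (j, inside) in F if inside}
    fixed_out = {j for (j, inside) in F if not inside}
    free = [j for j in V if j not in fixed_in and j not in fixed_out]
    # Adapted capacities: subtract the weight of the items fixed inside.
    c_res = [ci - sum(wi[j] for j in fixed_in) for wi, ci in zip(w, c)]
    return free, c_res

V = [0, 1, 2, 3]
w = [[3, 2, 5, 4],
     [1, 6, 2, 3]]
c = [10, 9]
F = {(0, True), (2, False)}     # item 0 forced in, item 2 forced out

free, c_res = residual_mkp(V, w, c, F)
print(free, c_res)              # [1, 3] [7, 8]
```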

Let us define \(IPS(F,\tau )\) as a function for solving the IP defined by using the IP formulation of the original problem with the added constraints (12) and (13) with a maximal computational time of \(\tau\). Note that in the case of the MKP, the formulation of \(IPS(F,\tau )\) consists of (1)–(3) and constraints given in (12) and (13), while for the KPFS, the formulation consists of (5)–(10) and constraints given in (12) and (13).

In the practical use of IP solvers, there is often a significant advantage in providing an initial high-quality incumbent solution S for a “warm start.” This makes it possible to eliminate portions of the search space in the branch-and-cut algorithm and thus may result in smaller branch-and-cut trees. In the proposed method, this is exploited to enhance the performance of the IP solver. Let us define the function \({\textrm{IPS}}(F,\tau ,S)\) that extends the function IPS by providing an initial incumbent solution S. Relatedly, let us define a function \({\textrm{BestFit}}({\mathcal {S}}, F)\), for a set of solutions \({\mathcal {S}}\) and a fixed set F, that is equal to the highest-quality solution \(S \in {\mathcal {S}}\) that can be acquired by extending the fixed set F, as follows.

$$\begin{aligned} {\textrm{Fit}}({\mathcal {S}},F) = \{S \mid S \in {\mathcal {S}} \wedge F \subset S \} \end{aligned}$$
(14)
$$\begin{aligned} {\textrm{BestFit}}({\mathcal {S}}, F) = \arg \max _{S \in {\textrm{Fit}}({\mathcal {S}},F)} z(S) \end{aligned}$$
(15)

Equation (14) defines the function \(Fit({\mathcal {S}},F)\) that returns all the solutions \(S \in {\mathcal {S}}\) for which \(F \subset S\). Equation (15) returns the solution that has the highest value of the objective function. Here, the notation z(S) is used for the objective value corresponding to solution S.
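Equations (14) and (15) can be sketched compactly in Python; the objective z is passed in as a callable, and the stored solutions are frozensets of (item, in/out) pairs. All data are illustrative.

```python
# Sketch of BestFit(S, F) from (14)-(15): among the stored solutions that
# contain the fixed set F, return the one with the best objective value z.
# Data are illustrative.

def best_fit(solutions, F, z):
    fit = [S for S in solutions if F.issubset(S)]   # Fit(S, F), Eq. (14)
    return max(fit, key=z) if fit else None         # Eq. (15)

p = {0: 8, 1: 5, 2: 6, 3: 7}
z = lambda S: sum(p[j] for j, inside in S if inside)

pool = [frozenset([(0, True), (1, False), (2, True)]),
        frozenset([(0, True), (1, True), (2, False)]),
        frozenset([(0, False), (1, True), (2, True)])]
F = {(0, True)}

print(sorted(best_fit(pool, F, z)))   # [(0, True), (1, False), (2, True)]
```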

3.3 Learning mechanism

Let us give an outline of the MFSS. The idea is to start with a population \({\mathcal {P}} = \{S_1, \dots ,S_{|{\mathcal {P}}|}\}\) of randomly generated solutions. The next step consists of iteratively improving the solutions in \({\mathcal {P}}\) using the following procedure. A random base solution B is selected from the set of n best solutions, \({\mathcal {S}}_n\). In addition, the set \({\mathcal {S}}_{kn}\) is generated by selecting k solutions from \({\mathcal {S}}_n\). A fixed set F is generated using B and \({\mathcal {S}}_{kn}\), and a new solution \(S'\) is generated using the IP with the additional constraints related to the fixed set F, as described in the previous subsection. The solution \(S'\) is added to the population of solutions \({\mathcal {P}}\). Note that \({\mathcal {P}}\) is a set of solutions, so it does not contain duplicates. The procedure is repeated for the new population of solutions. In the following text, details of the implementation of the proposed method are provided.

Firstly, a random solution S can be generated using the following simple procedure. Start with an empty solution \(S = \emptyset\), and iteratively add items \(t \in V\), one by one in a random order, to S if the new solution does not violate any of the capacity constraints and increases the value of the objective function.
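This construction procedure can be sketched for the MKP as follows; since profits are non-negative, every added item increases the objective, so only the capacity check is needed. Data are illustrative.

```python
import random

# Sketch of the random solution construction described above, for the MKP:
# scan the items in a random order and keep each one that still fits all
# capacity constraints. Data are illustrative.

def random_solution(V, w, c):
    S, load = set(), [0] * len(c)
    for j in random.sample(V, len(V)):      # items in random order
        if all(load[i] + w[i][j] <= c[i] for i in range(len(c))):
            S.add(j)
            for i in range(len(c)):
                load[i] += w[i][j]
    return S

V = [0, 1, 2, 3]
w = [[3, 2, 5, 4],
     [1, 6, 2, 3]]
c = [8, 7]

S = random_solution(V, w, c)
print(all(sum(w[i][j] for j in S) <= c[i] for i in range(len(c))))  # True
```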

Several considerations come into play in the iterative process that implements the learning mechanism. Let us first address the computational effort, given that the computational cost of the IPS can potentially be high. One issue is that, for a low-quality fixed set, even with a prolonged execution time granted to the IPS, it is improbable that high-quality solutions will be attained. Conversely, for a high-quality fixed set, it is reasonable to extend the execution time of the IPS, as it explores a part of the solution space containing high-quality solutions.

It is evident that the quality of solutions in the population increases as more solutions are generated. Consequently, the quality of the fixed sets also improves over time. As a result, it is reasonable to allocate shorter computational times for the IPS in the initial iterations of the algorithm and extend them in the later iterations. The proposed method starts with an initial allowed computational time, \(\tau\), for the IPS, and increases it when the algorithm shows signs of stagnation. To be more precise, the algorithm is deemed stagnant if, in the last StagMax iterations, no new solution within the best n solutions is generated.

It should also be considered that when the allowed computational time \(\tau\) is above some threshold, the IPS is consistently solved to proven optimality. The two main reasons for this are that the initial solution for the IPS is of very high quality or that the problem is highly constrained. In such cases, when stagnation occurs, it is reasonable to decrease the size of the fixed set |F| used in further iterations, since an additional increase in the allowed computational time \(\tau\) would have no effect; in this way, a larger neighborhood of the base solution is explored.

Algorithm 1
Pseudocode for MFSS

One can obtain a better comprehension of this approach by observing the pseudocode presented in Alg. 1. The first step is setting the size of the fixed set Size that corresponds to the subproblem that can be efficiently solved using the IPS. It is equal to the total number of items |V| minus the maximal number of items \(Size_{\textrm{max}}\) for which the IPS is effective. Next, the procedure generates an initial population of \(N_{\textrm{pop}}\) solutions using the previously defined method for creating feasible solutions. Within the main loop, the first step involves randomly selecting an integer value k for the size of the set \({\mathcal {S}}_{kn}\) within the interval \([k_{\textrm{min}},k_{\textrm{max}}]\). To increase the diversity of generated fixed sets, the value of k changes in subsequent iterations. Next, the random base solution B and the set \({\mathcal {S}}_{kn}\) are chosen from the set of the n best solutions. Subsequently, a fixed set F is generated by applying the function \(Fix(B,{\mathcal {S}}_{kn},Size)\). The initial incumbent solution \(S_{\textrm{start}}\) is selected as the best-quality solution in the population \({\mathcal {P}}\) that fits the fixed set F. The next step involves using the function \(IPS(F,\tau ,S_{\textrm{start}})\), with a time limit \(\tau\) and the initial incumbent solution \(S_{\textrm{start}}\), to obtain a new solution S. Next, it is checked whether S is the new best solution and whether S will be added to the set \({\mathcal {S}}_n\). If the latter is not the case, the stagnation counter is increased by 1; otherwise, it is set to 0. After this step, the procedure verifies whether stagnation has occurred. If so, and proven optimal solutions have been acquired above a certain threshold since the last stagnation (\(OptCounter \ge \delta SolveCounter\), where \(\delta\) is a method parameter), the size of the fixed set is decreased based on the coefficient \(\gamma\). Otherwise, if stagnation has occurred, the allowed computational time \(\tau\) for the IPS is doubled. Finally, the new solution S is added to the population of solutions \({\mathcal {P}}\). This process is repeated until the time limit is reached.
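The overall loop described above can be condensed into the following Python sketch. It is a simplified stand-in, not the authors' implementation: the `ips` call is a stub for the IP solver, the stagnation test is an approximation of the rule in the text, and all parameter names and the toy demo data are illustrative.

```python
import random
import time

# Condensed sketch of the MFSS main loop under stated assumptions:
# ips(F, tau, warm) stands in for solving the restricted IP (with (12)-(13))
# and returns (solution, proved_optimal); fix(B, S_kn, size) generates a
# fixed set as a subset of B. Solutions are hashable frozensets.
def mfss(init_pop, ips, z, fix, n, k_range, size, tau, stag_max,
         gamma, delta, time_limit):
    pop = set(init_pop)
    stag = opt_count = solve_count = 0
    start_t = time.time()
    while time.time() - start_t < time_limit:
        best_n = sorted(pop, key=z, reverse=True)[:n]   # n best solutions
        B = random.choice(best_n)                       # random base solution
        k = random.randint(*k_range)
        S_kn = random.sample(best_n, min(k, len(best_n)))
        F = fix(B, S_kn, size)                          # fixed set from B and S_kn
        warm = max((S for S in pop if F <= S), key=z)   # BestFit warm start
        S, proved_opt = ips(F, tau, warm)               # solve the restricted IP
        solve_count += 1
        opt_count += int(proved_opt)
        # Simplified stagnation test: did S change the n best solutions?
        if S in pop or (len(best_n) == n and z(S) <= z(best_n[-1])):
            stag += 1
        else:
            stag = 0
        if stag >= stag_max:
            if opt_count >= delta * solve_count:
                size = max(1, int(gamma * size))        # shrink the fixed set
            else:
                tau *= 2                                # grant the IPS more time
            stag = opt_count = solve_count = 0
        pop.add(S)
    return max(pop, key=z)

# Toy demo with a stub IPS that simply returns the warm-start solution.
z = lambda S: sum(j for j, inside in S if inside)
pop0 = [frozenset([(0, True), (1, False)]), frozenset([(0, False), (1, True)])]
stub_fix = lambda B, S_kn, size: set(random.sample(sorted(B), min(size, len(B))))
stub_ips = lambda F, tau, warm: (warm, True)

best = mfss(pop0, stub_ips, z, stub_fix, n=2, k_range=(1, 2), size=1,
            tau=0.1, stag_max=3, gamma=0.5, delta=0.5, time_limit=0.2)
print(z(best))   # 1
```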

4 Results

In this section, the results of the conducted computational experiments are presented. The first objective is to evaluate the effectiveness of the proposed MFSS. This is done in comparison with state-of-the-art metaheuristic methods. The second goal is to assess the impact of the method parameters. The MFSS approach has been implemented in C#.NET using ILOG CPLEX through Concert Technology. The computational experiments have been performed on a personal computer running Windows 10 with an Intel(R) Xeon(R) Gold 6244 CPU @ 3.60 GHz and 128 GB of memory.

This section is organized as follows. The next two subsections focus on the evaluation of the performance of the MFSS for the MKP and the KPFS. The last subsection is dedicated to the evaluation of the effect of the parameters of the MFSS.

4.1 Evaluation of the MFSS for the MKP

In this subsection, the computational experiments used to evaluate the MFSS for the MKP are presented. In the following text, first, the properties of the used test instances are given, and later, a detailed comparison to state-of-the-art methods is provided.

4.1.1 Test instances

The computational experiments are carried out on the standard benchmark instances that are a part of the OR-Library for which a detailed description can be found in Chu and Beasley (1998). Only the results for the medium and large instances are presented since all the best-performing methods find optimal solutions for small instances. The instances can be categorized into two sets with distinct characteristics, described as follows:

  • Medium-sized instances: This set comprises 90 instances with \(|V| = 250\) items, divided into three subsets. Each subset contains 30 instances with the number of constraints m equal to 5, 10, or 30. The coefficients \(w_{ij}\) (\(i= 1..m\), \(j = 1..|V|\)) are randomly generated integers in the range [0, 1000]. The value of \(c_i\) is set to \(\beta \times \sum _{j=1..|V|}w_{ij}\) for \(i = 1..m\), where \(\beta\) is a parameter known as the tightness ratio. For each pair |V| and m, three groups of 10 instances are generated. Each group uses a single value of \(\beta\) equal to 0.25, 0.5, or 0.75. The profit coefficients \(p_j\) are correlated to the values \(w_{ij}\) as follows:

    $$\begin{aligned} p_j = \sum _{i=1}^{m}\frac{w_{ij}}{m} + 500 q_j \qquad j=1..|V| \end{aligned}$$
    (16)

    where \(q_j\) is a random real number drawn from the interval (0, 1) with a uniform distribution.

  • Large-sized instances: This set comprises 90 instances with \(|V| = 500\) items, divided into three subsets. Each subset contains 30 instances with the number of constraints m equal to 5, 10, or 30. The method for generating the coefficients \(c_i\), \(p_j\) and \(w_{ij}\) is the same as for medium-sized instances.

Optimal solutions for the instances with \(m = 5\) and \(m=10\) are provided in Vimont et al. (2008), Boussier et al. (2010), and Mansini and Speranza (2012). It is important to note that obtaining the optimal solutions reported in these references required significant computation time, with some instances taking up to 150 h for \(|V| = 500\) and \(m = 10\).

4.1.2 Comparison to other methods

Table 1 Comparison of the state-of-the-art methods to MFSS on instances with 250 items and five constraints

To evaluate the efficacy of the proposed algorithm, we select five state-of-the-art heuristic algorithms as the primary reference algorithms, which have been well documented in the existing literature. These reference algorithms include the genetic algorithm (GA) (Chu and Beasley 1998), which serves as a baseline reference. We also select other well-known algorithms such as the filter-and-fan heuristic (F&F) of Khemakhem et al. (2012), the hybrid quantum particle swarm optimization algorithm (QPSO) of Haddar et al. (2016), along with its variation incorporating diversity preservation (DQPSO) (Lai et al. 2020), and the two-phase tabu-evolutionary algorithm (TPTEA) (Lai et al. 2018). These algorithms are chosen based on their strong performance and status as some of the most effective heuristic algorithms for the MKP currently available in the literature.

The same set of MFSS parameters has been used for all the test instances in the comparison. These values have been found empirically. The size of the initial population \(N_{\textrm{Pop}}\) is 100. The maximal size of a subproblem, SizeMax, is 100. The initial maximal computational time for the IPS is 0.1 s. The number n of best solutions that are considered for selecting the base solution and \({\mathcal {S}}_{kn}\) is 100. The value of the parameter k for generating \(S_{kn}\) is a randomly selected integer from the interval [5, 9]. The algorithm is considered stagnant if in the last \(MaxStag=50\) iterations there have been no changes to the set of the n best solutions \(S_n\). The maximal allowed calculation time for the MFSS is 600 s. It should be noted that in the case of the MKP, the maximal size of the subproblem being solved, SizeMax, has never been changed. This is due to the fact that proven optimal solutions have rarely been found; consequently, the MFSS parameters \(\gamma\) and \(\delta\) have not been used.

Table 2 Comparison of the state-of-the-art methods to MFSS on instances with 250 items and 10 constraints
Table 3 Comparison of the state-of-the-art methods to MFSS on instances with 250 items and 30 constraints
Table 4 Comparison of the state-of-the-art methods to MFSS on instances with 500 items and five constraints
Table 5 Comparison of the state-of-the-art methods to MFSS on instances with 500 items and 10 constraints
Table 6 Comparison of the state-of-the-art methods to MFSS on instances with 500 items and 30 constraints

The computational experiments are performed on the instances described in the previous subsubsection. The setting of the experiments is the same as the one used in Lai et al. (2020); for each test instance, ten independent runs are performed. The evaluation is based on the difference between the optimal or best-known solution for an instance and the solution acquired by the methods used in the comparison. In the following text, this difference is called the error. Since ten independent runs are performed for each method on each instance, aggregated values over these runs are observed. To be specific, the minimal error over all the runs and the average error are used. The results of the comparison are shown in Tables 1, 2, 3, 4, 5, and 6 for instance groups with different numbers of items and constraints. Note that the values for the other methods have been taken from the corresponding papers. For GA and F&F, the average results over ten independent runs were not available in the published papers.
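The error aggregation used in the tables can be stated precisely with a small sketch. This is only a minimal illustration of the metric described above; the function names are ours.

```python
def aggregate(best_known, run_values):
    """Return (minimal error, average error) over a set of runs.

    The error of a run is the gap between the best-known objective
    value and the objective value found in that run.
    """
    errs = [best_known - v for v in run_values]
    return min(errs), sum(errs) / len(errs)

# Example: three runs on an instance with best-known value 1000.
best_err, avg_err = aggregate(1000, [1000, 998, 995])
# best_err == 0 (one run reached the best-known value),
# avg_err == (0 + 2 + 5) / 3
```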

The first thing that can be observed is that TPTEA, DQPSO, and MFSS have significantly better results than the other methods. For the instances with 250 items and 5 or 10 constraints (Tables 1 and 2) and with 500 items and 5 constraints (Table 4), out of 90 instances, TPTEA, DQPSO, and MFSS failed to find an optimal solution in only 2, 9, and 2 cases, respectively. For the last group of instances with known optimal solutions, with 500 items and 10 constraints (Table 5), TPTEA, DQPSO, and MFSS find 12, 8, and 9 optimal solutions out of 30 instances, respectively. In summary, the MFSS clearly outperforms DQPSO over all the metrics for all instance groups. Its performance is very similar to that of TPTEA. It shows more robust behavior than TPTEA with respect to the average solution quality over the 10 runs per instance, being better in five out of six test groups, although the difference is relatively small. When the quality of the best-found solutions is compared, TPTEA and MFSS behave very similarly. TPTEA manages to find better solutions for a few more instances than MFSS, while the MFSS has better average values of best-found solutions for a few problem sizes. For the hardest problem groups, without known optimal solutions (Tables 3 and 6), the MFSS and TPTEA are each once better and once worse than the other method when the average quality of the best-found solutions or the average solution quality over the 10 runs is compared. On the other hand, TPTEA managed to find 29 and 12 best-known solutions out of 30 test instances, compared to MFSS’s 28 and 7, for the hardest instances with 250 and 500 items, respectively.

Table 7 Average computation time needed to find the best solutions for different methods for the MKP

Another aspect of the performance of the MFSS to be analyzed is the computational cost, evaluated through the time needed to find the best solution. In Table 7, the average computational times and corresponding standard deviations are presented for TPTEA, DQPSO, and MFSS. The aggregated values are provided for each problem size; to be exact, the averages over all the runs for all the instances are presented. Since the computational experiments were performed on different hardware, the computational times for TPTEA and DQPSO are scaled by a factor of 0.5 based on the information provided by PassMark (2022). Several observations can be made from the results in Table 7. Firstly, the computational cost of the MFSS does not depend strongly on the number of constraints in the instance, increasing by around 50% when the number of constraints grows from 5 to 30. In the case of instances with 250 items, the MFSS has a roughly 50% higher computational cost than TPTEA and a significantly higher one than DQPSO. Interestingly, the MFSS scales much better than the other two methods, as illustrated by its computational cost being close to five times lower than that of TPTEA for the largest instances.

4.2 Evaluation of the MFSS for the KPFS

In this section, the computational experiments used to evaluate the MFSS for the KPFS are presented. This subsection first provides details of the test instances used, followed by a detailed comparison to the results from D’Ambrosio et al. (2023).

4.2.1 Instances for the KPFS

In this subsection, a short description of the test instances proposed in D’Ambrosio et al. (2023), that are used in the comparison, is provided. These instances vary in the number of items |V|, where |V| takes values of 300, 500, 700, 800, or 1000. They are classified into four different scenarios, each comprising instances of three different types. The scenarios are as follows:

  • Scenario 1: The number of forfeit sets is \(l = 5|V|\). The size of each forfeit set \(C_i\) is chosen randomly in the interval \([2, \frac{|V|}{50}]\) for \(i = 1, \ldots , l\). The allowance of each forfeit set \(C_i\), for \(i = 1, \ldots , l\), satisfies \(h_i = 1\).

  • Scenario 2: The number of forfeit sets is \(l = 3|V|\). The size of each forfeit set \(C_i\) is chosen randomly in the interval \([2, \frac{|V|}{20}]\) for \(i = 1, \ldots , l\). The allowance of each forfeit set \(C_i\), for \(i = 1, \ldots , l\), satisfies \(h_i = 1\).

  • Scenario 3: The number of forfeit sets is \(l = 5|V|\). The size of each forfeit set \(C_i\) is chosen randomly in the interval \([2, \frac{|V|}{50}]\) for \(i = 1, \ldots , l\). The allowance \(h_i\) for forfeit set \(C_i\) is selected randomly from the interval \([1, \frac{2}{3}|C_i|]\).

  • Scenario 4: The number of forfeit sets is \(l = 3|V|\). The size of each forfeit set \(C_i\) is chosen randomly in the interval \([2, \frac{|V|}{20}]\) for \(i = 1, \ldots , l\). The allowance \(h_i\) for forfeit set \(C_i\) is selected randomly from the interval \([1, \frac{2}{3}|C_i|]\).
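The four scenarios differ only in the number of forfeit sets, their maximal size, and the allowance rule, so they can be summarized in one short sketch. This is an illustrative reconstruction of the generation rules above, not the original generator; the function name and seeding are ours.

```python
import random

def generate_scenario(n_items, scenario, seed=0):
    """Sketch of the four KPFS forfeit-set scenarios described above.

    Returns a list of (forfeit_set, allowance) pairs.
    """
    rng = random.Random(seed)
    if scenario in (1, 3):  # many, relatively small forfeit sets
        l, max_size = 5 * n_items, max(2, n_items // 50)
    else:                   # scenarios 2 and 4: fewer, larger sets
        l, max_size = 3 * n_items, max(2, n_items // 20)
    sets = []
    for _ in range(l):
        size = rng.randint(2, max_size)
        C_i = rng.sample(range(n_items), size)
        # Scenarios 1-2: allowance fixed to 1;
        # scenarios 3-4: allowance random in [1, (2/3)|C_i|].
        h_i = 1 if scenario in (1, 2) else rng.randint(1, max(1, 2 * size // 3))
        sets.append((C_i, h_i))
    return sets

sets = generate_scenario(300, 1)  # Scenario 1 with |V| = 300
```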

Scenarios 1 and 3 involve a larger number of relatively smaller forfeit sets, whereas Scenarios 2 and 4 have a smaller number of larger forfeit sets. Thus, Scenarios 1 and 3 model more specific properties related to fewer items, while the opposite is true in the other two cases. Additionally, in Scenarios 1 and 2, a single item is allowed for each set without incurring costs, making it similar to classical conflicts.

In addition to the different scenarios, the test instances are divided into types based on the relationships between profits, weights, and forfeit costs, classified as follows:

  • Not Correlated (NC): Each item’s weight \(w_j \in W\) and item profit \(p_j \in P\) are chosen uniformly at random in the interval [1, 30]. Each forfeit cost \(d_i \in D\) is chosen at random in the interval [1, 20].

  • Correlated (C): Weights and costs are chosen randomly as in the NC instances. Each profit \(p_j \in P\) is set to \(w_j + 10\), where \(w_j\) is the weight of the corresponding item \(j \in V\).

  • Fully Correlated (FC): Weights are chosen randomly as in previous instance types. Profits are correlated to weights, as in the C instances. Each cost \(d_i\) is computed using the following formula:

    $$\begin{aligned} d_i = \left\lfloor \frac{\sum _{j \in \bar{C}_i^+} w_j}{|C_i|} \right\rfloor \end{aligned}$$
    (17)

    where \(\bar{C}_i^+\) represents the subset consisting of the \(h_i + 1\) items with the highest profits in \(C_i\).

The capacity c is calculated as \(w_{\text {min}} + \frac{w_{\text {max}}}{10} \times \frac{|V|}{10}\), where \(w_{\text {min}}\) and \(w_{\text {max}}\) represent the minimum and maximum bounds for \(w_j\) values (in our case, 1 and 30, respectively).
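The fully correlated cost of Eq. (17) and the capacity formula can be sketched directly. This is a minimal illustration under the definitions above; function names and the dictionary-based item representation are ours.

```python
def fc_forfeit_cost(weights, profits, C_i, h_i):
    """Eq. (17): d_i is the floor of the summed weight of the
    h_i + 1 highest-profit items in C_i, divided by |C_i|."""
    top = sorted(C_i, key=lambda j: profits[j], reverse=True)[:h_i + 1]
    # Integer floor division implements the floor in Eq. (17).
    return sum(weights[j] for j in top) // len(C_i)

def capacity(n_items, w_min=1, w_max=30):
    """c = w_min + (w_max / 10) * (|V| / 10)."""
    return w_min + (w_max / 10) * (n_items / 10)

# Example: a forfeit set {0, 1, 2} with allowance 1.
weights = {0: 10, 1: 20, 2: 30}
profits = {0: 1, 1: 5, 2: 3}
d = fc_forfeit_cost(weights, profits, [0, 1, 2], 1)
# the two highest-profit items are 1 and 2, so d = (20 + 30) // 3 = 16
```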

For Scenarios 1-2, the upper bound k is set as \(\frac{|V|}{15}\), \(\frac{|V|}{25}\), \(\frac{|V|}{35}\), \(\frac{|V|}{45}\), or \(\frac{|V|}{55}\), where |V| takes values of 300, 500, 700, 800, or 1000, respectively, rounded to the nearest integer. In Scenarios 3-4, k is uniformly \(\frac{|V|}{15}\), rounded to the nearest integer, for all |V| values.

4.2.2 Comparison to other methods

The same set of MFSS parameters has been used for all the test instances in the comparison for the KPFS. These values have been found empirically. The size of the initial population \(N_{\textrm{Pop}}\) is 100. The number n of best solutions that are considered for selecting the base solution and \({\mathcal {S}}_{kn}\) is 100. The value of the parameter k for generating \(S_{kn}\) is a randomly selected integer from the interval [5, 9]. The algorithm is considered stagnant if in the last \(MaxStag=20\) iterations there have been no changes to the set of the n best solutions \(S_n\). The maximal allowed calculation time for the MFSS is 600 s. As discussed in D’Ambrosio et al. (2023), the performance of the IP solver varied highly depending on the type of instances. Because of this, and with the goal of achieving the best results, different values for the initial size of the subproblem are used for the MFSS. To be specific, for instances corresponding to scenarios 3 and 4, on which the IP solver is most effective, the value \(SizeMax=200\) is used for the size of the initial subproblem being solved. In the case of scenarios 1 and 2, for not correlated and correlated instances, \(SizeMax=100\) is used. In the case of instances of the fully correlated type for scenarios 1 and 2, which are the hardest to solve for the IP solver, the value \(SizeMax=80\) is used. Finally, the value \(\delta = 0.9\) is used for the portion of acquired proven optimal solutions for which the size of the fixed set is decreased, and the value \(\gamma = 1.25\) is used for decreasing the size of the fixed set.

Table 8 Comparison of the state-of-the-art methods to MFSS on instances with various numbers of items, instance types, and scenarios for the KPFS
Table 9 Comparison of the state-of-the-art methods to MFSS on instances with various numbers of items, instance types, and scenarios for the KPFS

The experimental setup is the same as in D’Ambrosio et al. (2023), where a single run is performed on each of the ten instances in a group having a specific number of items \(\vert V \vert\), scenario, and type. For each group of instances, the average solution value over the ten instances is given. These values for the MFSS for scenarios 1 and 2 are shown in Table 8, while Table 9 provides the same information for scenarios 3 and 4. These tables also contain the results for the memetic algorithm (MA) approach taken from D’Ambrosio et al. (2023). In addition, the comparison includes the average solution values acquired using CPLEX with a time limit of 3 h, also taken from D’Ambrosio et al. (2023).

The first thing that can be observed from these results is that the MFSS significantly outperforms the MA in the case of scenarios 3 and 4: the average solution quality and the best-found solution quality are higher for 30 out of 30 instance groups. For scenarios 1 and 2, the MFSS performs better than the MA in 11 and 5 instance groups, respectively, while the opposite is true for 4 and 9 instance groups. The advantage of the MFSS compared to the MA is largest for the NC and C type instances, where it has a lower average solution value for only one and three instance groups, respectively. On the other hand, the MA manages to outperform the MFSS on FC instances for scenarios 1 and 2. As discussed by D’Ambrosio et al. (2023), CPLEX has the worst performance on these types of problems. This indicates that the performance of the MFSS is highly dependent on the effectiveness of the IPS.

The MA approach is computationally less expensive than the MFSS; its average computational times until stagnation occurred (no improvement to the best solutions found in 150 iterations) were between 1 and 120 s for the smallest to largest instances, respectively. In the case of the MFSS, the average time to finding the best solution is between 60 and 500 s for the smallest to largest instances, respectively. It is important to point out that it takes the MFSS a longer time to get trapped in locally optimal solutions than the MA. It is interesting to mention that the MA is most effective on instances of scenario 2, for which it has significantly lower computational times than for the other scenarios, between 1 and 11 s, as discussed by D’Ambrosio et al. (2023).

The results presented by D’Ambrosio et al. (2023) show that, except for scenarios 1 and 3 of type NC, CPLEX becomes highly ineffective for instances with more than 500 items. CPLEX managed to find proven optimal solutions for all instances in 13 instance groups; for ten of these groups, the MFSS also found all the optimal solutions.

4.3 Method parameters

In this subsection, an analysis of the effect of the parameters is provided. As in the case of the FSS (Jovanovic et al. 2019; Jovanovic and Voß 2019; Jovanovic and Voss 2020; Jovanovic and Voß 2021; Jovanovic et al. 2023b), the MFSS is highly robust in the sense that it has a good performance for a wide range of parameter values. Keep in mind that certain parameters of the MFSS are related to the execution time of the IPS and their optimal values are hardware-dependent. Our aim is to understand the impact of the MFSS parameters without emphasizing specific values.

The initial step of the algorithm is generating the initial population of solutions (see Sarhani et al. (2023) for some general exposition on initial populations in metaheuristics). In practice, it is most important to generate at least n solutions that are used in the learning mechanism; consequently, \(N_{\textrm{pop}}\) should be equal to or greater than n. Since a random method is used for generating solutions, which generally produces low-quality feasible solutions, there is no significant advantage in using a value of \(N_{\textrm{pop}}\) higher than n. On the other hand, if a more advanced method is used for generating initial solutions, e.g., one incorporating some local search, the value of \(N_{\textrm{pop}}\) should be higher, since there is a potential for acquiring more high-quality solutions. This matters because the use of the IPS is computationally expensive, and it is preferable to apply it only to high-quality solutions.

The parameters \(k_{\textrm{min}}\) and \(k_{\textrm{max}}\) are used to provide diversity in generating new solutions through the randomization of the size of the set \(S_{kn}\). A small value of k results in fixed sets with a higher level of randomness: if a smaller number of test solutions is considered, there is a higher number of elements \(e \in B\) with the same value of \(O(e,S_{kn})\); consequently, a larger number of them will be randomly removed from \(\hat{F}\). On the other hand, a higher value of k makes the generation of fixed sets more deterministic, since there are fewer such elements and the difference between Size and \(|\hat{F}|\) is lower.
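The role of \(O(e, S_{kn})\) can be made concrete with a small sketch. This is our simplified interpretation of the fixed-set generation discussed above, not the paper's exact procedure: it counts in how many test solutions each element of the base solution B appears and keeps the most frequent ones, with ties broken randomly.

```python
import random

def generate_fixed_set(base, test_solutions, size, rng=random.Random(0)):
    """Simplified fixed-set generation (our interpretation).

    O(e, S_kn) = number of test solutions containing element e; the
    `size` elements of the base solution with the highest counts are
    kept, ties broken randomly.
    """
    occ = {e: sum(e in s for s in test_solutions) for e in base}
    elems = list(base)
    rng.shuffle(elems)                       # random tie-breaking
    elems.sort(key=lambda e: occ[e], reverse=True)  # stable sort keeps shuffle among ties
    return set(elems[:size])

# With few test solutions, many elements share the same count, so the
# selection among them is effectively random, as described above.
F = generate_fixed_set([1, 2, 3, 4], [{1, 2}, {1, 3}, {1}], 2)
```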

The value of the initial computational time \(t_{\textrm{init}}\) of the IPS is highly hardware-dependent and is found empirically. In general, it should be the lowest value that makes it highly probable for the IPS to improve the solutions in the initial population. Relatedly, MaxStag is used to decide when to increase the computational time for the IPS and is effective for a very wide range of values. It has been observed that beyond a minimal value of MaxStag for which the solution space is effectively explored, higher values of MaxStag only increase the computational cost without providing additional benefits. In our initial tests, we have observed that \(MaxStag =50\) and \(MaxStag =20\) are effective values for the used hardware for the MKP and the KPFS, respectively.

The parameters \(\gamma\) and \(\delta\) related to the change in the size of the fixed set have the following effect. The parameter \(\delta\) specifies the portion of acquired proven optimal solutions that is considered to indicate stagnation related to the size of the fixed set. If this value is too high, the size of the fixed set will very rarely be decreased. Consequently, when the allowed computational time is increased, there will be no change in the size of the neighborhood being explored, so it is unlikely that the stagnation of the algorithm will be resolved. On the other hand, if the value of \(\delta\) is too low, the size of the fixed set will be decreased too frequently, and the resulting subproblems will quickly become too large for the IPS to solve effectively. We have empirically observed that for the KPFS, a value of \(\delta =0.9\) produces good results. The parameter \(\gamma\) is used to decrease the size of the fixed sets being used in the MFSS. The idea is to gradually decrease the size of the fixed set, avoiding prematurely generating subproblems that are too large for the IPS to solve effectively with the current quality of fixed sets and time limit. In the case of the KPFS, we have observed that the value \(\gamma =1.25\) produces good results, but we expect that the value of this method parameter is highly problem-dependent.
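Our reading of the \(\delta\)/\(\gamma\) mechanism can be sketched as follows. This is an illustrative interpretation, not the paper's exact update rule; the function and variable names are ours.

```python
def update_subproblem_size(size_max, n_proven_optimal, n_solved,
                           delta=0.9, gamma=1.25):
    """Illustrative delta/gamma update (our interpretation).

    If at least a portion delta of the recently solved subproblems had
    proven optimal solutions, the search is considered stagnant: the
    fixed set is shrunk, which grows the subproblem size by gamma.
    """
    if n_solved > 0 and n_proven_optimal / n_solved >= delta:
        return int(size_max * gamma)
    return size_max

# 9 of 10 subproblems solved to proven optimality -> grow the subproblem.
new_size = update_subproblem_size(200, 9, 10)  # 200 * 1.25 = 250
```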

Fig. 3
figure 3

Illustration of convergence speed for different values of SizeMax for the instance with 500 items, 30 constraints and \(ID=0\). The graphs represent the average distance to the best-known solution over ten runs at different time periods

The size of the subproblem being solved, \(Size_{\textrm{max}}\), has a high impact on the performance of the MFSS when both convergence speed and solution quality are considered. A graphical illustration of the effect of this parameter is shown in Fig. 3 for one instance of the MKP; similar behavior is observed for the KPFS. In it, the convergence of the average solution quality over ten independent runs of the MFSS is presented for different values of \(Size_{\textrm{max}}\). From this figure, it can be observed that a higher value of \(Size_{\textrm{max}}\) provides an increase in both convergence speed and solution quality. In practice, this means that its value should be the largest one for which the IPS can still be applied effectively within a short computational time. It is important to point out that similar behavior can be observed for instances with different numbers of items and constraints.

Fig. 4
figure 4

Illustration of the convergence speed for different values of n for the instance with 250 items, 10 constraints and \(ID=0\). The MFSS solves subproblems with \({\textrm{Size}}_{\textrm{max}}=25\) items. The graphs represent the average distance to an optimal solution over ten runs at different time periods

The last parameter that is analyzed is the number n of best solutions that are considered when generating the base solution B and the set of test solutions \(S_{kn}\). The main idea of the MFSS is to make it possible to solve significantly larger instances than is possible using the IPS directly, and the effect of the parameter n is observed in this context. Because of this, the effect of this parameter is evaluated for the case when the number of items (|V|) in the problem instance is much larger than the number of items \(Size_{\textrm{max}}\) for which the subproblem can be effectively solved. In Fig. 4, the average convergence speed over ten independent runs of the MFSS with \(Size_{\textrm{max}}=25\) is presented for a problem instance with 250 items and 10 constraints. It is important to point out that similar behavior can be observed for other instances of the MKP and for the KPFS. From Fig. 4, it can be observed that for a smaller value of the parameter n, the initial convergence speed is much higher, but the method can easily get trapped in a locally optimal solution. By increasing the value of n, the convergence speed is decreased, but the MFSS has a higher probability of escaping locally optimal solutions. In practice, a larger value of n results in a wider search of the solution space. Consequently, if a long execution time is available, it is possible to acquire higher quality solutions for larger values of n.

It is important to mention once more that the performance of the MFSS is highly dependent on the IPS used; to be specific, it depends on the size of problems that the IPS can effectively solve. It has been empirically observed that the MFSS is highly effective in solving problem instances 7-10 times larger than those the IPS can be effectively applied to. On the other hand, standard metaheuristics can frequently be applied to much larger problem instances than the IPS. In practice, this means that above a certain problem size, it is more suitable to use metaheuristics than the MFSS.

5 Conclusions

In this paper, two generalizations of the knapsack problem, the multidimensional knapsack problem (MKP) and the knapsack problem with forfeit sets (KPFS), have been solved using a population-based matheuristic. To be specific, the fixed set search (FSS) has been extended to a matheuristic setting. To achieve this, a new ground set of elements for the family of knapsack problems has been introduced that maximizes the amount of information the fixed set provides. In addition, the method for generating fixed sets has been adapted to increase the diversity of the generated solutions. The main advantage of the proposed method, compared to existing ones for the MKP and the KPFS, is its simplicity of implementation in the sense of the adaptability of the method to different problems. That is, to a large extent, due to the use of the FSS learning mechanism and to avoiding the need for defining solution neighborhoods. The computational experiments have shown that the MFSS is highly competitive with state-of-the-art methods for the problems of interest. The proposed approach is robust in the sense that it is effective for a wide range of method parameters.

It is important to point out that the MFSS does not exploit any specific properties of the MKP or the KPFS but only uses the respective IP model. This indicates the potential future application of the MFSS to other 0-1 problems, like the minimum vertex cover problem, the facility location problem, etc., without the need for large changes to the method. The practicality of the MFSS’s simple adaptation to complex problems is effectively illustrated through its use for optimizing the scheduling of electric buses in public transport (Jovanovic et al. 2023a).

On the other hand, due to the simplicity of the approach, it is possible to improve its performance by hybridizing it with other heuristic/metaheuristic approaches or by exploiting some specific properties of the MKP or the KPFS.