1 Introduction

Several variants of the classical 0/1 Knapsack Problem have been studied in the literature to model real-world scenarios in which some items represent contrasting choices that cannot be taken together, or in which penalties must be paid when specific items are chosen.

Among them, the Knapsack Problem with Conflicts is one of the most well-studied (see Pferschy and Schauer (2009); Hifi and Otmani (2012); Bettinelli et al. (2017); Pferschy and Schauer (2017); Ben Salem et al. (2018); Luiz et al. (2021); Coniglio et al. (2021); Li et al. (2021)). In this variant, a collection of item pairs is provided. Each item pair represents a conflict, meaning that including one of the two items in the solution excludes the other. The problem is also often known as the Knapsack Problem with Conflict Graph (KPCG), since any instance can be represented through an auxiliary undirected graph in which each node represents an item and each edge expresses a conflict between two items. In Pferschy and Schauer (2009) and Pferschy and Schauer (2017), it is shown that the problem is strongly NP-hard. The authors also provided approximation results for specific cases. Additionally, the latter study proposed the Knapsack Problem with Forcing Graph, in which it is mandatory to include at least one item from each pair. Several resolution approaches have been presented, including branch-and-bound (Bettinelli et al. 2017; Coniglio et al. 2021; Li et al. 2021), branch-and-cut (Ben Salem et al. 2018) and scatter search (Hifi and Otmani 2012) algorithms.

Variants of KPCG have been proposed as well. In the Quadratic Knapsack Problem with Conflicts, selecting two compatible items generates additional profit; heuristics have been presented in Shi et al. (2017); Dahmani and Hifi (2021). In the Multiple Knapsack Problem with Conflicts (Basnet 2018), conflicting items can be included if placed in different knapsacks. The author introduced and compared different heuristics.

Among problems considering penalty costs, we recall, for instance, the Fixed-Charge Knapsack Problem (Akinc 2006; Yamada and Takeoka 2009). In this problem, items are categorized into separate sets, and each set induces a penalty (corresponding to a setup cost) if one or more of its elements are included in the solution. Branch-and-bound and heuristic algorithms have been designed for its resolution. In the Penalized Knapsack Problem (Ceselli and Righini 2006; Della Croce et al. 2019), each item has an associated penalty in addition to its profit, and the maximum penalty of a chosen item is paid in the objective function. The authors presented exact approaches based on exhaustive search and dynamic programming.

The Knapsack Problem with Forfeits (KPF) (Cerulli et al. 2020; Capobianco et al. 2022) bridges the gap between conflicting items and penalized knapsack problems. As in KPCG, item pairs are provided. However, in this problem they represent so-called forfeit pairs. Each forfeit pair has an associated forfeit cost. If both items belonging to a forfeit pair are chosen, the associated forfeit cost must be paid. In a sense, forfeit pairs may be interpreted as soft conflicts. KPF was first introduced in Cerulli et al. (2020), where the authors proposed two heuristic approaches, namely a constructive greedy and a Carousel Greedy (Cerrone et al. 2017) algorithm. A genetic algorithm hybridized with Carousel Greedy was instead proposed in Capobianco et al. (2022).

An extension of KPF, the Knapsack Problem with Forfeit Sets (KPFS), was recently presented in D’Ambrosio et al. (2023). In this work, forfeit pairs are extended to forfeit sets of any cardinality. Furthermore, an allowance threshold is defined for each of these sets; this threshold represents how many items can be included from each set without paying forfeit costs. Finally, a bound on the maximum number of threshold violations is also considered. The authors proved that KPFS generalizes both KPF and KPCG, and identified a special case in which the problem is polynomially solvable. They developed a mathematical formulation and three heuristic algorithms. In particular, a memetic algorithm that embeds Carousel Greedy is shown to have the best overall performance. This algorithm extends the genetic approach proposed in Capobianco et al. (2022) for KPF to the more general KPFS, and adds a local search subroutine. In Jovanovic and Voß (2024), the authors propose a population-based matheuristic that combines the Fixed Set Search metaheuristic with integer programming; the approach is applied to both KPFS and the Multidimensional Knapsack Problem.

A first reason for interest in KPFS arises from its ability to generalize previously introduced problems, such as KPF and KPCG. Furthermore, with respect to real-world applications, both KPF and KPFS can be used to model scenarios where, as in KPCG, certain items represent contrasting goals. However, compared with the latter problem, penalties allow more flexibility in dealing with situations where some conflicts cannot be avoided. In D’Ambrosio et al. (2023), the authors also describe a business planning scenario as an application of KPFS. In this context, items represent tasks that may or may not be completed, while forfeit sets represent machines or other resources needed by the associated tasks. Forfeit costs may be associated with an increase in resource availability or compensation for overtime hours.

In this paper, we design a new metaheuristic for KPFS, based on the Biased Random-Key Genetic Algorithm (BRKGA) scheme (Gonçalves and Resende 2011). BRKGA has been successfully applied to optimization problems in several contexts, such as network design (Buriol et al. 2007), scheduling (Valente and Gonçalves 2008; Gonçalves et al. 2011; Homayouni et al. 2023) and routing (Carrabs 2021; De Freitas et al. 2023; Marques et al. 2023). We test the BRKGA algorithm on all instances introduced in D’Ambrosio et al. (2023), and show that it finds better solutions than the best algorithm presented in that work.

An essential and problem-specific component of any BRKGA algorithm that we had to develop is the decoder function. This function must be able to produce a feasible solution associated with any possible chromosome (vector of random keys). In addition, our approach uses some heuristic operators to locally improve such a solution. In particular, the cleaning operator aims at identifying items that are detrimental or of limited usefulness. After removing these items, we take advantage of the residual budget to improve the solution with a greedy completion operator. A chromosome adjustment operator is also applied to reflect the modifications introduced by the cleaning and greedy completion operators. Furthermore, we adopt a restart strategy to avoid getting trapped in local optima.

The paper is organized as follows. The problem is formally defined in Sect. 2, which also introduces the notation used throughout the paper. The proposed BRKGA algorithm is described in detail in Sect. 3. Computational results are reported and discussed in Sect. 4. Section 5 contains our conclusions and some discussion of future research directions.

2 Problem definition and notation

Let X denote the set of items, with \(|X|=n\), where each item is identified by a numerical index \(j \in \{1,\ldots ,n\}\). Each item j has two numerical attributes, that is, a profit \(p_j > 0\) and a weight \(w_j \ge 0\). Let profits and weights define the sets P and W, respectively. Additionally, let \(b\ge 0\) be the available budget for the knapsack, which limits the total weight of the selected items. That is, as in the classical 0/1 Knapsack Problem, a subset of items \(S \subseteq X\) representing a feasible solution must satisfy the following constraint:

$$\begin{aligned} \sum _{j \in S } w_j \le b \end{aligned}$$
(1)

Additionally, consider a collection \(\mathcal {C}=\{C^i\}_{i=1,...,l}\) of l forfeit sets, where each \(C^i\) is a subset of X containing at least two items. Each forfeit set \(C^i\) is associated with a positive forfeit cost \(d_i\) and an integer allowance \(h_i\), with \(0\le h_i \le |C^i|\). Including in a feasible solution a number of items belonging to \(C^i\) that exceeds \(h_i\) triggers the payment of forfeit costs. In more detail, consider again the solution S, and let \(n_i(S)=| C^i \cap S | \) be the number of items in \(C^i\) that are also contained in S. We say that there are \(v_i(S)=\max \{n_i(S)-h_i,0\}\) violations associated with \(C^i\) in S; if \(v_i(S) >0\), a cost equal to \(v_i(S) \cdot d_i\) must be paid. Let forfeit costs and allowances define the sets D and H, respectively.

Furthermore, let \(k\ge 0\) be an integer bound on the number of violations that a solution can have. In other words, a solution S is considered feasible if and only if

$$\begin{aligned} \sum _{C^i \in \mathcal {C}} v_i(S) \le k \end{aligned}$$
(2)

To summarize, in the KPFS problem we look for a subset of items \(S \subseteq X\) satisfying (1) and (2), and such that the objective function

$$\begin{aligned} z(S)=\sum _{j \in S} p_j - \sum _{C^i \in \mathcal {C}} v_i(S) \cdot d_i \end{aligned}$$
(3)

is maximized.

We give the data of an example instance in Fig. 1a. For each of the 6 items, we report the index \(j \in X\), the profit \(p_j \in P\) and the weight \(w_j \in W\), along with the budget b. Moreover, there are three forfeit sets \(C^1,C^2,C^3\). For each set \(C^i\), we report the included items, the forfeit cost \(d_i \in D\) and the allowance \(h_i \in H\), in addition to the violation bound k.

In Fig. 1b, we show a graphical representation of the feasible solution \(S=\{3,4,5,6\}\). Each segment associated with an item has a length proportional to its weight, while the last segment represents the residual budget \(b-(w_3+w_4+w_5+w_6)=16-12=4\). We note that \(v_1(S)=0\), \(v_2(S)=1\) and \(v_3(S)=1\). Indeed, \(n_1(S)=0\), \(n_2(S)=3\) and \(n_3(S)=2\), while \(h_2=2\) and \(h_3=1\). The solution is feasible since up to \(k=2\) violations are allowed. The objective function value is \(z(S)=p_3+p_4+p_5+p_6-(1\cdot d_2 + 1\cdot d_3)=19-4=15\).
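To make these definitions concrete, the following C++ fragment evaluates \(n_i(S)\), \(v_i(S)\), feasibility with respect to (1) and (2), and z(S) for an arbitrary instance. It is a minimal sketch with illustrative data structures, not the implementation used in this paper (see Sect. 3.4 for the latter).

```cpp
#include <vector>
#include <algorithm>

// Illustrative KPFS instance representation (names are not the paper's API).
struct Instance {
    std::vector<double> p, w;          // profits p_j and weights w_j
    double b;                          // budget
    std::vector<std::vector<int>> C;   // forfeit sets C^i (lists of item indices)
    std::vector<double> d;             // forfeit costs d_i
    std::vector<int> h;                // allowances h_i
    int k;                             // bound on the total number of violations
};

// Evaluate a solution S (bit vector over items): returns z(S) as in (3) and
// reports feasibility with respect to constraints (1) and (2).
double evaluate(const Instance& I, const std::vector<bool>& S, bool& feasible) {
    double weight = 0.0, profit = 0.0;
    for (std::size_t j = 0; j < S.size(); ++j)
        if (S[j]) { weight += I.w[j]; profit += I.p[j]; }

    int totalViolations = 0;
    double forfeits = 0.0;
    for (std::size_t i = 0; i < I.C.size(); ++i) {
        int n_i = 0;                              // n_i(S) = |C^i ∩ S|
        for (int j : I.C[i]) if (S[j]) ++n_i;
        int v_i = std::max(n_i - I.h[i], 0);      // v_i(S) = max{n_i(S) - h_i, 0}
        totalViolations += v_i;
        forfeits += v_i * I.d[i];                 // pay v_i(S) * d_i
    }
    feasible = (weight <= I.b) && (totalViolations <= I.k);
    return profit - forfeits;                     // objective (3)
}
```

Applied to the data of Fig. 1a and the solution of Fig. 1b, this computation would return \(z(S)=15\) with two violations, matching the values derived above.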

Fig. 1 a Example instance data. b Feasible solution S

Fig. 2 Population update in BRKGA

Finally, we introduce some notation that will be used later on in this work. Given \(S \subseteq X\), \(j \in X \setminus S\) and \(j' \in S\), let \(S+j\) denote the set \(S \cup \{j\}\), and \(S-j'\) represent \(S {\setminus } \{j'\}\). Furthermore, for any item \(j \in X\), let \(\mathcal {C}_j \subseteq \mathcal {C}\) correspond to the forfeit sets containing j, that is, \(\mathcal {C}_j = \{C^i \in \mathcal {C} |\) \(j \in C^i\}\).

3 Biased random-key genetic algorithm

In this section, we describe in detail our proposed BRKGA algorithm to solve KPFS. The BRKGA (Gonçalves and Resende 2011) is a metaheuristic technique belonging to the general class of Genetic Algorithms (GAs). As in classical GAs, a population of chromosomes is evolved over a series of iterations (also called generations), until some stopping criterion is met. Each chromosome encodes a feasible solution for the considered optimization problem. In BRKGAs, chromosomes are vectors of real numbers in the interval [0, 1], known as random keys. Chromosomes contain a fixed, predefined number of elements (genes); the specific random key assigned to a gene of a chromosome is also called an allele. The population size \(popsize >0\) also remains constant between generations.

In order to design a BRKGA algorithm, it is essential to define a deterministic, problem-specific function, called the decoder. The decoder must be able to assign a solution for the optimization problem at hand to any possible chromosome (that is, to any vector of random keys of the considered chromosome size). Once this solution is known, a fitness function can be computed to rank the chromosome within its population. As in many GAs, we use the problem objective function as the fitness function. That is, given a chromosome \(\Gamma \) corresponding to solution S, the fitness function value of \(\Gamma \) will be z(S); the higher this value, the fitter \(\Gamma \) is considered. As noted in Gonçalves and Resende (2011), the decoder function allows the algorithm to explore the discrete solution space indirectly, by exploring the space of random-key vectors, that is, the continuous unit hypercube.

Once all chromosomes in the population at a given generation g are ranked from most fit to least fit, the population of the next generation \(g+1\) is produced as follows. A predefined percentage of the most fit individuals, controlled by a parameter \(p_e\) between 0 and 1, is defined as the elite. That is, the elite chromosomes are the \(popsize \cdot p_e\) chromosomes at generation g with the highest fitness function values. All elite chromosomes are copied directly into the population of generation \(g+1\).

Another predefined portion of the new population, controlled by a second percentage parameter \(p_m\), is composed of mutants, that is, new chromosomes generated uniformly at random. Mutants have the aim of adding diversity to the population, in order to prevent premature convergence; their role is similar to the one of the mutation operator in other evolutionary algorithms.

Finally, the remaining \(popsize \cdot (1-p_e-p_m)\) elements are generated using a crossover operator. This operator first chooses two parents within the population at generation g; the first parent is chosen at random among the elite chromosomes, while the second is chosen among the non-elite ones. The allele assigned to each gene of the resulting child chromosome will be equal to the corresponding allele of its elite parent with probability \(\rho _e > 0.5\), or to the corresponding allele of its other parent otherwise. The described procedure for the creation of a new population from the current one is summarized in Fig. 2.
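As an illustration, the biased crossover just described fits in a few lines of C++. This is a minimal sketch; in practice we rely on the API of Toso and Resende (2015) discussed in Sect. 3.4.

```cpp
#include <vector>
#include <random>

// Biased crossover: each gene of the child inherits the elite parent's allele
// with probability rho_e > 0.5, and the non-elite parent's allele otherwise.
std::vector<double> crossover(const std::vector<double>& elite,
                              const std::vector<double>& nonElite,
                              double rho_e, std::mt19937& rng) {
    std::uniform_real_distribution<double> U(0.0, 1.0);
    std::vector<double> child(elite.size());
    for (std::size_t g = 0; g < elite.size(); ++g)
        child[g] = (U(rng) < rho_e) ? elite[g] : nonElite[g];
    return child;
}
```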

Through the inheritance of the elite chromosomes and the bias towards elite parents in the crossover, BRKGAs favor the fittest individuals and the transmission of their features, in accordance with the survival of the fittest principle which is at the basis of evolutionary algorithms.

In our algorithm, as is usually the case in BRKGAs, the first population is composed of randomly generated chromosomes; that is, each allele is chosen uniformly at random in [0, 1]. Moreover, to further reduce the risk of being trapped in local optima, as proposed in Gonçalves et al. (2014) we adopt a restart strategy: the population is replaced with an entirely new random one if a predefined number of generations passes without improving the best solution found (the incumbent).

The rest of the section is organized as follows. Section 3.1 describes the chromosome representation, some notation and the termination criteria that we took into account. An overview of the BRKGA algorithm is given in Sect. 3.2. The decoder function, being the main problem-specific component, is described in detail in Sect. 3.3.

3.1 Chromosome representation and termination criteria

In our BRKGA, each chromosome contains \(n=|X|\) genes, one for each item. The values (alleles) of the genes in a given chromosome \(\Gamma \) are used by the decoder to decide which items must be included in the solution S corresponding to \(\Gamma \). We will use the notation \(\Gamma [j]\) to refer to the allele of \(\Gamma \) corresponding to item j, with \(j \in \{1,\ldots ,n\}\).

We considered two termination criteria; the algorithm ends as soon as one of them is met. The first criterion is a maximum overall number of generations \(max\_gen >0\). The second criterion is associated with our restart strategy and two further positive integer parameters, \(max\_no\_imp\) and \(max\_restarts\). That is, if \(max\_no\_imp\) iterations pass without improvements to the incumbent solution, we perform a restart by resetting the population to random chromosomes, unless \(max\_restarts\) restarts have already occurred; in the latter case, the algorithm ends.
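A minimal, self-contained sketch of this control flow is given below; the evolution step is a random stand-in, while the real algorithm performs the population update described above (see also Algorithm 1). The parameter values are the ones chosen in Sect. 4.1.

```cpp
#include <random>

int main() {
    const int max_gen = 1500, max_no_imp = 50, max_restarts = 3;
    int g_no_imp = 0, num_restarts = 0;
    double incumbent = 0.0;
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> U(0.0, 1.0);

    auto evolve = [&]() { return U(rng); };  // stand-in: best fitness of a generation
    auto restart = [&]() { /* refill the population with random chromosomes */ };

    for (int gen = 0; gen < max_gen; ++gen) {             // first criterion
        double best = evolve();
        if (best > incumbent) { incumbent = best; g_no_imp = 0; }
        else if (++g_no_imp >= max_no_imp) {
            if (num_restarts == max_restarts) break;      // second criterion
            restart();                                    // population reset
            g_no_imp = 0;
            ++num_restarts;
        }
    }
    return 0;
}
```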

3.2 Algorithm overview

Algorithm 1 BRKGA for KPFS

A pseudocode outline of BRKGA is given in Algorithm 1. The algorithm has the following input parameters:

  • InstanceData: all data related to an instance of the problem (set of items X, collection of forfeit sets \(\mathcal {C}\), profits P, weights W, forfeit costs D and allowances H, budget b and violation bound k).

  • \(popsize>0\), integer: number of chromosomes composing each population.

  • \(0<p_e<1, 0<p_m<1\): fractions of the population composed of elite elements and mutants, respectively.

  • \(0.5<\rho _e<1\): probability of inheriting alleles from elite parents during crossover.

  • \(max\_gen >0\), integer: bound on the number of generations (iterations).

  • \(max\_no\_imp>0\), integer: bound on the number of iterations without improving the incumbent solution.

  • \(max\_restarts>0\), integer: bound on the number of restarts (population resets).

  • \(\tau \ge 0\), \(0<\omega \le 1\): parameters used by the decoder function (see Sect. 3.3).

The output of the algorithm is \(S^* \subseteq X\), that is, the incumbent solution. In line 1, the algorithm initializes the solution \(S^*\) to \(\emptyset \), and sets to 0 both a counter of the number of generations passed since the last improvement of \(S^*\) (\(g\_no\_imp\)) and a counter of the number of restarts (\(num\_restarts\)).

In line 2, we compute the elements of the HeuScores set, containing scores that define a heuristic sorting of all elements of X, from the least promising item for insertion to the most promising one (line 3). This sorting will be used by the decoder function to break ties regarding items to be included in or removed from the solution, as will be explained in detail in Sect. 3.3. The scores are given by the ratio between profits and weights; that is, for each item \(j \in X\), \(score_j\) is computed as follows:

$$\begin{aligned} score_j=\frac{p_j}{w_j} \end{aligned}$$
(4)

Given two items \(j, j' \in X\), j will be preferred to \(j'\) if \(score_j > score_{j'}\). Ties are broken arbitrarily.
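A short sketch of the score computation and the resulting ordering follows (illustrative C++; it assumes \(w_j > 0\) and leaves ties to the sort's arbitrary order).

```cpp
#include <vector>
#include <numeric>
#include <algorithm>

// Compute score_j = p_j / w_j for all items and return the item indices sorted
// from least to most promising, as used by the decoder to break ties.
std::vector<int> sortByScore(const std::vector<double>& p,
                             const std::vector<double>& w) {
    std::vector<int> order(p.size());
    std::iota(order.begin(), order.end(), 0);
    std::sort(order.begin(), order.end(), [&](int a, int b) {
        return p[a] / w[a] < p[b] / w[b];   // ascending: least promising first
    });
    return order;
}
```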

We then initialize the starting population POP with popsize randomly generated chromosomes (line 4). The decoder function is then used to compute the set \(\mathcal {S}\) containing the solutions associated with each chromosome in POP so that they can be ranked from most fit to least fit (line 5). Note that in our pseudocode, the function DecodeChromosomes corresponds to the independent execution of the decoder function on every chromosome \(\Gamma \) in POP.

The main loop (lines 6–29) is executed up to \(max\_gen\) times. As previously said, in each generation a new population \(POP'\) is built from POP, by first copying into it the \(popsize\cdot p_e\) elite chromosomes, then adding \(popsize\cdot p_m\) random mutants and finally using the crossover operator to generate the offspring corresponding to the remaining \(popsize \cdot (1-p_e-p_m)\) elements (lines 7–9).

We then substitute POP with \(POP'\), run the decoder on the new population and look for the best solution S (lines 10–12). If this solution improves on \(S^*\), we update the incumbent and reset the counter \(g\_no\_imp\) to 0; otherwise, we increment it by 1 (lines 13–18). As previously introduced, if \(g\_no\_imp\) has reached its limit \(max\_no\_imp\), we check whether the second counter \(num\_restarts\) has reached \(max\_restarts\) (lines 19–20); if it has not, we perform a restart operation by resetting POP with random chromosomes, resetting \(g\_no\_imp\) to 0 and incrementing \(num\_restarts\) by one. The decoder is then used again to rank the elements of this new population (lines 21–24). If \(num\_restarts\) is equal to \(max\_restarts\), instead, the maximum number of restarts has already been performed; therefore, the algorithm ends, returning \(S^*\) (lines 25–26).

Finally, if we exit the main loop, meaning that the population evolved for \(max\_gen\) generations, we return the incumbent solution \(S^*\) (line 30).

3.3 Chromosome decoding

The decoder procedure is composed of four steps, executed in sequence. In the first step, a feasible solution S is created by including some of the items, based on the alleles of the input chromosome \(\Gamma \), as described in Sect. 3.3.1. A cleaning operator is then used to remove possibly detrimental items from S (Sect. 3.3.2). In the third step, the procedure checks whether the solution can be improved by adding some items, using a greedy completion operator (Sect. 3.3.3). Finally, based on the obtained solution, the alleles of \(\Gamma \) are adjusted to better represent it (Sect. 3.3.4). A pseudocode outline of the decoder is given in Algorithm 2.

Algorithm 2 Decoder function

3.3.1 Solution initialization

As mentioned, a solution S corresponding to chromosome \(\Gamma \) is initialized with some items. A pseudocode of our initialization procedure is given in Algorithm 3.

The candidate items for insertion are the ones with alleles strictly lower than 0.5. To this end, alleles are first sorted in non-decreasing order. This sorting defines a permutation \(\Pi =\pi _1,\ldots ,\pi _h,\pi _{h+1},\ldots ,\pi _n\) of the items list \(1,\ldots ,n\) (line 1), where \(\pi _h\) is the last item in the permutation such that \(\Gamma [\pi _h] < 0.5\).

The solution S is initially empty (\(S = \emptyset \), line 2). Candidate items \(\pi _1,\ldots ,\pi _h\) are then considered for insertion in sequence, one by one, according to the permutation order (while loop, lines 3–9). For each item \(\pi _j\) (\(j=1,\ldots ,h\)), S is updated to be \(S+\pi _j\) if and only if the following three conditions are met (lines 4–7):

  (a) The weight of the already chosen items plus \(w_{\pi _j}\) does not exceed b (\(\sum _{y \in S+\pi _j} w_y \le b\));

  (b) The number of violations induced by the already chosen items plus \(\pi _j\) does not exceed k (\(\sum _{C^i \in \mathcal {C}} v_i(S+\pi _j) \le k\));

  (c) The profit \(p_{\pi _j}\) of the new item exceeds the sum of the forfeit costs induced by its addition to S (\(\sum _{C^i \in \mathcal {C}_{\pi _j}| v_i(S+\pi _j) > 0}d_i < p_{\pi _j}\)).

The first two conditions ensure that the solution remains feasible. The third condition ensures that each item \(\pi _j\) is actually favorable when it is chosen for insertion, that is, that its addition increases the fitness function value (\(z(S+\pi _j) > z(S)\)). Note that the left-hand side of condition (c) is exactly the increase in paid forfeit costs, since the addition of \(\pi _j\) induces exactly one additional violation in each forfeit set \(C^i \in \mathcal {C}_{\pi _j}\) for which \(v_i(S+\pi _j)\) is positive.

Finally, the obtained S set is returned (line 10).
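Before turning to an example, we give a compact C++ sketch of this step. For clarity, it re-checks conditions (a)–(c) by recomputing the relevant sums from scratch; the actual implementation uses the \(O(l_{max})\) incremental updates described in Sect. 3.4.

```cpp
#include <vector>
#include <numeric>
#include <algorithm>

// Sketch of the solution initialization operator (cf. Algorithm 3).
std::vector<bool> initialize(const std::vector<double>& Gamma,
                             const std::vector<double>& p,
                             const std::vector<double>& w, double b,
                             const std::vector<std::vector<int>>& C,
                             const std::vector<double>& d,
                             const std::vector<int>& h, int k) {
    const int n = static_cast<int>(Gamma.size());
    std::vector<int> pi(n);
    std::iota(pi.begin(), pi.end(), 0);
    std::sort(pi.begin(), pi.end(),          // permutation by non-decreasing allele
              [&](int a, int c) { return Gamma[a] < Gamma[c]; });

    std::vector<bool> S(n, false);
    double weight = 0.0;
    for (int j : pi) {
        if (Gamma[j] >= 0.5) break;          // only alleles < 0.5 are candidates
        if (weight + w[j] > b) continue;     // condition (a): budget
        int totViol = 0;
        double addedCost = 0.0;              // sum of d_i over C^i with j and v_i(S+j) > 0
        for (std::size_t i = 0; i < C.size(); ++i) {
            int n_i = 0; bool hasJ = false;
            for (int y : C[i]) {
                if (y == j) hasJ = true;
                if (S[y] || y == j) ++n_i;   // count items of C^i in S + j
            }
            int v_i = std::max(n_i - h[i], 0);
            totViol += v_i;
            if (hasJ && v_i > 0) addedCost += d[i];
        }
        if (totViol > k) continue;           // condition (b): violation bound
        if (addedCost >= p[j]) continue;     // condition (c): j must be favorable
        S[j] = true;                         // all conditions met: add j
        weight += w[j];
    }
    return S;
}
```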

Algorithm 3 Solution initialization operator

To better understand the operator, we describe an example based on the data of the small instance discussed in Sect. 2. In Fig. 3a we show a possible chromosome \(\Gamma \). According to the alleles, the considered permutation of items \(\Pi \) is 6,1,2,4,5,3, as shown in Fig. 3b. Starting from the empty solution \(S=\emptyset \), only items 6,1,2 and 4 are considered for insertion, since \(\Gamma [5]>0.5\). Item 6 is added to S first. Item 1 is also added; it induces a violation in \(C^3\), however up to \(k=2\) violations are allowed, and \(p_1=2\) exceeds the forfeit cost \(d_3=1\). Furthermore, \(w_6+w_1=7\), while \(b=16\). Item 2 is then discarded, according to condition (c). Indeed, its addition would induce a violation in \(C^1\), and \(p_2=2\) is exceeded by \(d_1=6\); therefore, \(z(\{1,2,6\})<z(\{1,6\})\). Finally, item 4 is added to S, since \(w_6+w_1+w_4=12\) and it does not induce any new violation. Therefore, the solution resulting from the application of the operator to the chromosome shown in Fig. 3a is \(S=\{1,4,6\}\). The chosen items are highlighted in bold in Fig. 3c. The fitness function value is \(z(S)=p_1+p_4+p_6-(1\cdot d_3) = 11-1 = 10\). Figure 3d summarizes the values computed to decide whether each item is added to S.

We now briefly discuss the computational complexity of the initialization step. In the following, let \(l_{max} \le l\) be the maximum number of forfeit sets in which the same item appears, that is, the maximum cardinality of a set \(\mathcal {C}_j\) over all \(j \in X\).

The sorting of items by allele value (line 1) is done in \(O(n\log n)\). For each item that is considered for insertion, it is necessary to iterate over its forfeit sets in \(O(l_{max})\) to check the number and cost of any newly induced violations, and therefore to check whether conditions (a)–(c) are met. When a new item is added, the updates of variables and data structures are done in constant or \(O(l_{max})\) time (see also Sect. 3.4). Therefore, the loop in lines 3–9 requires \(O(n\cdot l_{max})\) time. Overall, the solution initialization takes \(O(n\log n + n \cdot l_{max})\).

Fig. 3 Example execution of the Solution Initialization operator. Data refer to the example in Fig. 1. a Input chromosome. b Items ordered by allele values. c Output solution; the chosen items are marked in bold. d Summary of the main steps

An item added in earlier iterations of the initialization operator may turn out to be detrimental later on, due to the addition of new items. This is checked by the cleaning operator, described in the next subsection.

3.3.2 Solution cleaning

The solution cleaning operator aims at removing from the solution S detrimental items added by the initialization step. An item \(j \in S\) is detrimental for the solution if \(z(S-j) \ge z(S)\), that is, if \(\sum _{C^i \in \mathcal {C}_j| v_i(S)> 0}d_i \ge p_{j}\). However, we relax the definition of detrimental items to also include items \(j \in S\) such that \(\sum _{C^i \in \mathcal {C}_j| v_i(S)> 0}d_i \ge \omega \cdot p_{j}\). The idea is to also remove items with limited usefulness, in order to increase the residual budget and thus the range of feasible choices for the greedy completion operator executed in the third decoder step (see Sect. 3.3.3). The greedy completion operator may later add these items back if it does not identify better ones.

A pseudocode of the operator is given in Algorithm 4. Items are checked for cleaning one at a time. We note, however, that the order in which items are checked may influence which of them are actually removed. Suppose we have two items j and \(j'\), belonging to both the solution S and a forfeit set \(C^i\), with \(v_i(S)=1\). Removing either j or \(j'\) from S would leave no forfeit cost to be paid for set \(C^i\), increasing the likelihood that the remaining item is not detrimental. For this reason, we check items for removal in non-decreasing order of the heuristic score described in Sect. 3.2. For each item, following the SortedItems ordering, the procedure checks whether it belongs to the solution and is detrimental, and removes it if so (for loop, lines 1–5). Finally, the resulting solution S is returned (line 6).
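The removal test can be sketched as follows (illustrative C++; as in the previous sketch, quantities are recomputed naively instead of incrementally).

```cpp
#include <vector>

// Sketch of the cleaning operator (cf. Algorithm 4): scan items from least to
// most promising and drop j from S whenever the forfeit costs attributable to
// it reach omega * p_j (the relaxed detrimental test).
void clean(std::vector<bool>& S,
           const std::vector<int>& sortedItems,   // least promising first
           const std::vector<double>& p,
           const std::vector<std::vector<int>>& C,
           const std::vector<double>& d,
           const std::vector<int>& h, double omega) {
    for (int j : sortedItems) {
        if (!S[j]) continue;
        double cost = 0.0;                        // sum of d_i over C^i with j and v_i(S) > 0
        for (std::size_t i = 0; i < C.size(); ++i) {
            bool hasJ = false; int n_i = 0;
            for (int y : C[i]) { if (y == j) hasJ = true; if (S[y]) ++n_i; }
            if (hasJ && n_i > h[i]) cost += d[i]; // v_i(S) = n_i - h_i > 0
        }
        if (cost >= omega * p[j]) S[j] = false;   // detrimental: remove j
    }
}
```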

Algorithm 4 Solution cleaning operator

The for loop is iterated O(n) times. For each item, we iterate over its forfeit sets to check whether it must be removed and to update data structures if this is the case. Therefore, the solution cleaning operator runs in \(O(n\cdot l_{max})\).

3.3.3 Greedy completion

The greedy completion operator checks whether we can add favorable items to S after the initialization and cleaning operators. A pseudocode for the operator is given in Algorithm 5. The algorithm operates in multiple iterations. In each iteration (lines 1–14), the operator first checks whether at least one item \(j \in X \setminus S\) exists, such that \(S+j\) is a feasible solution and \(z(S+j) > z(S)\). That is, we check for item j the conditions (a)-(c) described in Sect. 3.3.1. The elements satisfying these requirements compose the set \(X_{iter}\) (lines 2–5).

If \(X_{iter} = \emptyset \), both the current iteration and the greedy completion operator end (lines 8–10). Otherwise, for each item \(j \in X_{iter}\), we evaluate the following ratio (line 11):

$$\begin{aligned} ratio_j =\frac{p'_j}{w_j + \tau fv_j} \end{aligned}$$
(5)

where:

  • \(p'_j\) is the net increase in the objective function value obtained by adding j to S, that is, \(p'_j = p_j - \sum _{C^i \in \mathcal {C}_{j}| v_i(S+j) > 0}d_i\); note that by construction \(p'_j\) will be strictly positive for each \(j \in X_{iter}\).

  • \(fv_j\) is the number of additional violations induced in S by j, that is, \(fv_j = |\{C^i \in \mathcal {C}_{j}| v_i(S+j) > 0\}|\).

We then choose the item \(j^* \in X_{iter}\) that maximizes this ratio (line 12). Note that this criterion corresponds to the one adopted by the greedy constructive heuristic proposed in D’Ambrosio et al. (2023). In the case of ties, we choose among the items with maximum ratio value the one that maximizes the heuristic score described in Sect. 3.2, with further ties broken arbitrarily.

The solution S is then updated to include \(j^*\), and the iteration ends (lines 13–14). As mentioned, the operator ends as soon as \(X_{iter} = \emptyset \) for the current iteration or if we reach the trivial stopping condition \(S=X\).
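The selection performed in each iteration can be sketched as follows (illustrative C++; the heuristic-score tie-breaking is omitted for brevity, and quantities are again recomputed naively). The completion loop repeatedly calls this function and inserts the returned item until it returns \(-1\).

```cpp
#include <vector>
#include <algorithm>

// One iteration of the greedy completion (cf. Algorithm 5): among the feasible,
// favorable items j not in S, return j* maximizing p'_j / (w_j + tau * fv_j),
// or -1 when X_iter is empty.
int pickBest(const std::vector<bool>& S,
             const std::vector<double>& p, const std::vector<double>& w,
             double b, double usedWeight,
             const std::vector<std::vector<int>>& C,
             const std::vector<double>& d, const std::vector<int>& h,
             int k, double tau) {
    int best = -1; double bestRatio = 0.0;
    for (int j = 0; j < static_cast<int>(S.size()); ++j) {
        if (S[j] || usedWeight + w[j] > b) continue;   // condition (a)
        int totViol = 0, fv = 0; double addedCost = 0.0;
        for (std::size_t i = 0; i < C.size(); ++i) {
            bool hasJ = false; int n_i = 0;
            for (int y : C[i]) { if (y == j) hasJ = true; if (S[y] || y == j) ++n_i; }
            int v_i = std::max(n_i - h[i], 0);
            totViol += v_i;
            if (hasJ && v_i > 0) { ++fv; addedCost += d[i]; }
        }
        if (totViol > k) continue;                     // condition (b)
        double pPrime = p[j] - addedCost;              // net objective increase p'_j
        if (pPrime <= 0) continue;                     // condition (c)
        double ratio = pPrime / (w[j] + tau * fv);     // ratio (5)
        if (ratio > bestRatio) { bestRatio = ratio; best = j; }
    }
    return best;
}
```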

Algorithm 5 Greedy completion operator

In Fig. 4, we show the application of the greedy completion operator to the solution \(S=\{1,4,6\}\) obtained in the example discussed in Fig. 3. The chromosome and the input solution (corresponding to the items highlighted in bold) are reported in Fig. 4a. We have that \(X {\setminus } S = \{2,3,5\}\). Item 2 does not belong to \(X_{iter}\), since it does not meet two of the three criteria (\(w_1+w_2+w_4+w_6=17\), \(b=16\); \(z(S+2)<z(S)\), as discussed in Sect. 3.3.1). \(S+3\) is a feasible solution and \(z(S+3) > z(S)\), since \(w_1+w_3+w_4+w_6=16\) and the item does not induce any violation; therefore, item 3 belongs to \(X_{iter}\), with \(p'_3=p_3=3\) and \(fv_3=0\). Item 5 also belongs to \(X_{iter}\). It induces a second violation in \(C^3\) (which is allowed, since \(k=2\)), and \(p_5=7\) exceeds \(d_3=1\). Furthermore, \(w_1+w_4+w_5+w_6=14\). We have \(p'_5=p_5-d_3=6\) and \(fv_5=1\). Therefore, either item 3 or item 5 is added to S. The chosen item depends on the value of parameter \(\tau \). Assuming for instance \(\tau =5\), \(ratio_3=\frac{3}{4}\) while \(ratio_5=\frac{6}{7}\); therefore, S is updated to \(\{1,4,5,6\}\).

In the second iteration of the greedy completion operator, it is easy to see that \(X_{iter} = \emptyset \), since S+2 and S+3 are infeasible. Therefore, the operator ends. The chosen items are highlighted in bold in Fig. 4b. The fitness function value is \(z(S)=p_1+p_4+p_5+p_6-(2\cdot d_3) = 18-2 = 16\). Figure 4c summarizes the steps discussed above.

Fig. 4 Example execution of the Greedy Completion operator. Data refer to the example in Fig. 1. The value of parameter \(\tau \) is 5. a Input solution; the chosen items are marked in bold. b Output solution; the chosen items are marked in bold. c Summary of the main steps

Regarding the computational complexity, the for loop (lines 3–8) iterates over the items in \(X \setminus S\), and for each item we iterate over its forfeit sets to check whether it belongs to \(X_{iter}\). During this phase we also compute the \(ratio_j\) values and keep track of the best item according to them. Adding the best found item \(j^*\) to S requires \(O(l_{max})\) time to update the related data structures. Overall, each iteration of the while loop runs in \(O(n\cdot l_{max})\). Since the loop iterates O(n) times, the greedy completion operator runs in \(O(n^2 \cdot l_{max})\).

3.3.4 Chromosome adjustment

Chromosome adjustment is a technique commonly used in BRKGAs that apply an improvement phase to the generated solution after its initialization (see, for instance, Gonçalves et al. (2011)). In our case, this improvement corresponds to the solution cleaning and greedy completion operators. The aim of this final operator is to modify the alleles to reflect the applied changes. In this way, the good features of the improved solutions are better represented by the chromosome encoding, and more likely to be passed on to future generations. Given the solution S obtained from chromosome \(\Gamma \) after executing the operators described in Sects. 3.3.1–3.3.3, the adjustment operator works as follows (a code sketch is given after the list):

  • For any item \(j \in X\) such that \(j \in S\) and \(\Gamma [j] > 0.5\), we set \(\Gamma [j]=1- \Gamma [j]\);

  • For any item \(j \in X\) such that \(j \notin S\) and \(\Gamma [j] < 0.5\), we set \(\Gamma [j]=1- \Gamma [j]\);

  • For any item \(j \in X\), in the unlikely case \(\Gamma [j] = 0.5\), if \(j \in S\) we set \(\Gamma [j] = 0.5 - \epsilon \) for a small value of \(\epsilon \).
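The following sketch illustrates the three rules; the value of \(\epsilon \) is an implementation detail not fixed by the description above.

```cpp
#include <vector>

// Sketch of the chromosome adjustment operator: mirror alleles across 0.5 so
// that decoding the adjusted chromosome directly reproduces the membership of
// each item in the improved solution S.
void adjust(std::vector<double>& Gamma, const std::vector<bool>& S,
            double epsilon = 1e-6) {
    for (std::size_t j = 0; j < Gamma.size(); ++j) {
        if (Gamma[j] == 0.5) {                         // unlikely boundary case
            if (S[j]) Gamma[j] = 0.5 - epsilon;
        } else if ((S[j] && Gamma[j] > 0.5) || (!S[j] && Gamma[j] < 0.5)) {
            Gamma[j] = 1.0 - Gamma[j];                 // flip to the correct side
        }
    }
}
```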

In the example in Fig. 4, the chromosome adjustment operator sets \(\Gamma [2]\) to \(1-0.27=0.73\) and \(\Gamma [5]\) to \(1-0.72=0.28\).

The chromosome adjustment operator only requires iterating over the genes once; all checks and updates are done in constant time. Therefore, it runs in O(n).

3.4 Computational complexity and implementation details

For our implementation, we used the C++ API for BRKGA algorithms discussed in Toso and Resende (2015). The API provides an implementation of the problem-independent components of the algorithm. The computation of the heuristic-score sorting of the items (see Algorithm 1, lines 2–3) is done in \(O(n\log n)\). In Toso and Resende (2015), it is reported that the computational complexity of evolving a population for one generation is \(\Theta (popsize \cdot n + popsize \cdot \log popsize)\), plus the decoding of \(popsize \cdot (1-p_e)\) chromosomes. As previously discussed, each decoding runs in \(O(n^2\cdot l_{max})\). The algorithm iterates for at most \(max\_gen\) generations.

In the following, we briefly describe the main variables and data structures used in our implementation. Each solution S is represented internally as a binary vector of length n. Additionally, an integer vector TakenPerSet of length l stores how many items belonging to each forfeit set \(C^i\) are currently in S. When an item j is considered for insertion (or deletion), by checking the values of TakenPerSet[i] for each \(C^i \in \mathcal {C}_j\) and comparing them with the related allowances \(h_i\), it is possible to compute the number of violations that would be added (or removed), and the related costs. Furthermore, scalar variables store the residual budget (that is, \(b-\sum _{j \in S}w_j\)), the current total number of violations and the current value of z(S).
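The following sketch illustrates the incremental check for an insertion; names mirror the description above, but the code is illustrative rather than the exact implementation.

```cpp
#include <vector>

// Given TakenPerSet and the indices of the forfeit sets containing item j
// (i.e., the set C_j), compute in O(|C_j|) the additional violations and
// forfeit costs that inserting j into S would cause.
void insertionDelta(const std::vector<int>& takenPerSet,
                    const std::vector<int>& setsOfJ,   // indices i with j in C^i
                    const std::vector<int>& h, const std::vector<double>& d,
                    int& newViolations, double& newCost) {
    newViolations = 0; newCost = 0.0;
    for (int i : setsOfJ) {
        if (takenPerSet[i] + 1 > h[i]) {   // v_i increases by exactly one
            ++newViolations;
            newCost += d[i];
        }
    }
}
```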

4 Computational results

In this section we experimentally compare our BRKGA algorithm with the memetic algorithm proposed in D’Ambrosio et al. (2023), on the dataset introduced in the same work. From now on, the two algorithms will be called BRKGA and MA, respectively.

The parameter setting for both algorithms is done using the irace package for automatic algorithm configuration (López-Ibáñez et al. (2016)), as will be discussed in Sect. 4.1.

Furthermore, since BRKGA is a more time-consuming algorithm than MA, in order to obtain a fairer comparison we also developed an MA variant with a different termination criterion, referred to as extended MA (MAext) from now on. MA ends when either a predefined number of iterations has been performed since the last improvement of the incumbent solution or a global maximum number of iterations has been reached. MAext, instead, is interrupted when the computational time consumed by BRKGA on the same instance is reached. As will be observed in the discussion of the results, and in line with our expectations, MAext generally performs better than MA, but in a few cases it finds slightly worse solutions. This usually happens for smaller instances, where the computational times of MA and BRKGA are closer, and is due to the fact that MA is a non-deterministic algorithm.

Finally, we also compare the approaches with results obtained by solving an ILP formulation also proposed in D’Ambrosio et al. (2023), using the IBM ILOG CPLEX 12.10 solver. This approach will be called CPLEX in this section.

We briefly summarize the main features of the test instances. Each instance has a number of items n equal to 300, 500, 700, 800 or 1000, belongs to one of four scenarios (numbered 1 to 4) and has a type that can be Not Correlated (NC), Correlated (C) or Fully Correlated (FC). Ten instances were randomly generated for each combination of n, scenario and type; the dataset therefore contains 600 instances.

Scenarios are defined as follows:

  • Scenario 1: \(l= 5 \times n\), the cardinality of each forfeit set \(C^i\) is chosen at random in \([2,\ldots , \frac{n}{50}]\), each \(h_i\) is equal to 1.

  • Scenario 2: \(l= 3 \times n\), the cardinality of each forfeit set \(C^i\) is chosen at random in \([2,\ldots , \frac{n}{20}]\), each \(h_i\) is equal to 1.

  • Scenario 3: \(l= 5 \times n\), the cardinality of each forfeit set \(C^i\) is chosen at random in \([2,\ldots , \frac{n}{50}]\), each \(h_i\) is chosen at random in the interval \([1,\ldots , \frac{2}{3}|C^i|]\).

  • Scenario 4: \(l= 3 \times n\), the cardinality of each forfeit set \(C^i\) is chosen at random in \([2,\ldots , \frac{n}{20}]\), each \(h_i\) is chosen at random in the interval \([1,\ldots , \frac{2}{3}|C^i|]\).

Types are defined as follows:

  • Type NC: Each item weight \(w_j\) is a random integer in \([1,\ldots , 30]\), each item profit \(p_j\) is a random integer in the same interval, each forfeit cost \(d_i\) is a random integer in \([1,\ldots , 20]\).

  • Type C: Weights and costs are generated randomly as for instances of Type NC. Each profit \(p_j\) is set equal to \(w_j+10\), where \(w_j\) is the weight referring to the same item.

  • Type FC: Weights and profits are generated as for C instances. Each forfeit cost \(d_i\), associated with forfeit set \(C^i\), is set equal to

    $$\begin{aligned} d_i = \left\lfloor \dfrac{\sum _{j \in \bar{C^{i}}^+} w_j}{ | C^i |}\right\rfloor \end{aligned}$$

    where \(\bar{C^{i}}^+\) is the subset of \(C^i\) containing the \(h_i + 1\) items with highest profits.

Scenarios and types model different features. In particular, Scenarios 1 and 3 have a higher number of smaller forfeit sets with respect to Scenarios 2 and 4. With respect to allowance thresholds, Scenarios 1 and 2 are stricter, since each allowance is equal to 1, while larger allowance thresholds are permitted in Scenarios 3 and 4. Regarding types, instances correlating profits and weights (such as the Type C ones) are known in the literature to be harder for KPCG than non-correlated ones (such as Type NC); see Bettinelli et al. (2017). With Type FC, the concept of correlation is extended to forfeit costs. Further details on the instances can be found in D’Ambrosio et al. (2023).

The remaining part of the section is organized as follows. Section 4.1 contains parameter values as well as some details related to implementation and test environment. Section 4.2 compares the solutions obtained by CPLEX with those of the three heuristics. MA, MAext and BRKGA are directly compared in Sect. 4.3. Finally, we present an analysis of the impact of the parameters and features introduced in our decoder function in Sect. 4.4.

Table 1 Parameter settings obtained through irace

4.1 Test environment and parameter values

BRKGA has been coded in C++. As previously mentioned, we used the API for BRKGA algorithms discussed in Toso and Resende (2015). Tests were run on a workstation with an Intel Xeon CPU E5-2650 v3 processor running at 2.3GHz and 128 GB of RAM. The approaches proposed in D’Ambrosio et al. (2023) used the same programming language and hardware. The Concert library of IBM ILOG CPLEX 12.10 was used for CPLEX.

With respect to parameter values for BRKGA, after a preliminary tuning phase we chose \(popsize=\frac{n}{2}+100\), \(max\_gen=1500\), \(max\_no\_imp=50\) and \(max\_restarts=3\), which allowed us to obtain reasonable computational times for all instances. The values of the remaining parameters, namely \(p_e\), \(p_m\), \(\rho _e\), \(\omega \) and \(\tau \), were optimized using the irace package (López-Ibáñez et al. (2016), https://cran.r-project.org/web/packages/irace/vignettes/irace-package.pdf). The considered alternative values and the final chosen ones are reported in Table 1. In accordance with the irace user guide indications on heterogeneous scenarios (see the same document, Sect. 10.5), we performed the parameter tuning separately for each of the 4 scenarios. To avoid parameter choices that overfit a subset of instances and cannot be applied to new ones, we did not further tune parameters per instance type. For each scenario, all instances with \(n=300\) were used in this phase.

Parameters \(p_e\), \(p_m\) and \(\rho _e\) are fundamental parameters of BRKGA algorithms, and therefore the candidate values were chosen within the ranges suggested in Gonçalves and Resende (2011). The candidate values for \(\omega \) and \(\tau \) are reasonable values chosen within their domain.

Similarly, we used irace to tune parameters \(\alpha \), \(\beta \), \(\gamma \), \(\delta \) and \(\tau \) of MA, and re-run all tests for this algorithm. The candidate parameter values and the final ones are also reported in Table 1. The \(\tau \) weighting parameter is used in a greedy subroutine, and has the same meaning that it has in BRKGA. Parameters \(\alpha \) and \(\beta \) are related to the Carousel Greedy component of the algorithm, while \(\gamma \) and \(\delta \) are probability values used by the randomized crossover component; further details can be found in D’Ambrosio et al. (2023).

4.2 Heuristics comparison with CPLEX

In D’Ambrosio et al. (2023), CPLEX was run with a time limit of 3 h per instance on 300 of the 600 instances. Indeed, while most of the Type NC instances with fewer forfeit sets (Scenarios 2 and 4) could be solved to proven optimality within this limit, this was not the case for the other instance groups. Therefore, in the other cases, only instances with \(n\le 500\) were considered.

In this section, we compare the results of MA, MAext and BRKGA with the ones obtained by CPLEX. Whenever the latter does not find a proven optimal solution within the time limit, the best solution found is used. This comparison is summarized in Tables 2 and 3, which contain results for Scenarios 1–2 and 3–4, respectively. Each value in the tables refers to the 10 instances corresponding to a given choice of n, scenario and type, and contains (unless otherwise specified) average values for them. For CPLEX, the values under the headings Sol. and Fails represent solution values and the number of instances (between 0 and 10) for which CPLEX did not certify optimality within 3 h. If \(Fails > 0\), Opt Gap (%) reports the average optimality gap computed over the unsolved instances. The optimality gap for a given instance a is computed as \(100 \times \frac{BB(CPLEX,a)- Sol(CPLEX,a)}{Sol(CPLEX,a)}\), where \(BB(CPLEX,a)\) and \(Sol(CPLEX,a)\) are the best bound and the best solution found, respectively. For MA, MAext and BRKGA, under CPLEX Gap (%) we report percentage gaps from the CPLEX solutions; for any algorithm Alg and instance a, each gap is computed as \(100 \times \frac{Sol(CPLEX,a) - Sol(Alg,a)}{Sol(Alg,a)}\), where \(Sol(Alg,a)\) is the value of the solution found by the algorithm. Furthermore, in the cases where \(Fails<10\), under Opt Found we report how many of the proven optimal solutions (between 0 and \(10-Fails\)) were found by the three algorithms.

Table 2 CPLEX - Heuristics comparison (Scenarios 1–2)

Let us first consider Scenarios 1–2 (Table 2). On Scenario 1, we can see that BRKGA finds lower gaps than MA in all cases, and lower gaps than MAext in all cases except one (Type FC, \(n=300\)). BRKGA finds 7 of the 9 known optima, while MA and MAext find 4 and 5 of them, respectively. We may note that in the five cases with no known optima, the gaps are usually negative, meaning that the heuristics outperform CPLEX; the only exceptions are two Type C cases for MA (gaps \(0.06\%\) and \(0.12\%\)) and Type C, \(n=300\) (gap \(0.04\%\)) for BRKGA.

On Scenario 2, all three algorithms perform very well. For Type NC, BRKGA, MAext and MA find 47, 46 and 39 out of the 49 known optima, respectively. For Types C and FC, both MAext and BRKGA find all 16 known optima, while MA finds 14 of them. Overall, MAext and BRKGA perform very similarly in this scenario, as will be discussed in greater detail in Sect. 4.3.

Table 3 CPLEX - Heuristics comparison (Scenarios 3–4)

Looking at Table 3, we can see a larger performance gap between BRKGA and the MA variants. For Scenario 3, BRKGA, MA and MAext find 20, 8 and 9 optima, respectively. CPLEX gaps for BRKGA are always within 0.33%, while they reach up to 0.83% for MA and 0.84% for MAext. BRKGA finds lower gaps in all cases except Type FC. In this case, for \(n=300\) both MA variants have smaller CPLEX gaps and find more optima (4 for MA, 6 for MAext, 2 for BRKGA), while for \(n=500\) MAext solutions have a slightly smaller gap (0.19% for MAext, 0.24% for BRKGA).

For Scenario 4, BRKGA finds 23 optima, while MA and MAext find 2 and 7 of them, respectively. CPLEX gaps for BRKGA are always within 0.84%, while they reach up to 1.98% for MA and 1.60% for MAext. BRKGA outperforms MA in terms of gap in all cases, and MAext in all cases except one, that is, Type FC, \(n=300\). In this case, MAext finds an additional optimal solution (3 for MAext, 2 for BRKGA) and has a slightly smaller gap (0.24% for MAext, 0.28% for BRKGA).

4.3 MA, MAext and BRKGA comparison

We now directly compare the performances of the three heuristic approaches on the whole set of 600 instances. Results for Scenarios 1–4 are contained in Tables 4, 5, 6 and 7, respectively. For each algorithm, the column with the heading Sol. has the same meaning as described previously. The columns with the heading Time (s.) contain computational times expressed in seconds. We do not report computational times for MAext, since they are the same as those of BRKGA. Additionally, for BRKGA, the columns with the headings MA Imp. (%) and MAext Imp. (%) report the percentage improvement of BRKGA over the MA and MAext solutions. For each instance a, this improvement is computed as \(100\times \frac{Sol(BRKGA,a)-Sol(MA,a)}{Sol(MA,a)}\) for MA, and analogously for MAext. Finally, for each scenario and type, the line corresponding to \(n=ALL\) contains average values computed on the corresponding subset of 50 instances.

Looking at Scenario 1 (Table 4), we can see that BRKGA performs better than MA in all cases. Improvements are above 1% in 5 cases, above 2% in 4 cases and above 3% once. Overall, improvements are equal to 1.72% for Type NC and 1.20% for Type C. For Type FC, where for \(n=500\) both approaches outperform CPLEX significantly, the BRKGA improvement is less impressive (0.33%); however, it reaches 0.69% for the case \(n=800\).

Regarding computational times, we can see that MA is more efficient, requiring around 30, 70 and 80 s for instances with \(n=1000\) belonging to Type NC, C and FC, respectively. The same instances require around 90, 110 and 120 s to be solved using BRKGA.

We now compare BRKGA to MAext. We can see that BRKGA performs better in 13 out of 15 cases. Improvements are above 1% in 5 cases and reach up to 1.82%, 1.73% and 0.84% for Type NC, C and FC instances, respectively. The two negative values correspond to Type FC, \(n=300\) and \(n=700\) (\(-\)0.32% and \(-\)0.55%, respectively). Overall, improvements are equal to 1.03% for Type NC, 0.98% for Type C and 0.16% for Type FC.

Moving to Scenario 2 (Table 5), we observe that BRKGA finds better solutions than MA in 14 out of 15 cases, the only opposite case being Type FC, \(n=1000\), with a value of \(-0.21\%\). The improvements are generally lower than those observed for Scenario 1, since they reach at most 1.90% (Type C, \(n=1000\)) and are below 1% in all other cases. Overall, improvements are equal to 0.30%, 0.63% and 0.09% for the three types of instances.

Owing to the combination of a low number of forfeit sets and strict allowances, solutions in Scenario 2 contain fewer items than those in the other three scenarios, and both algorithms converge faster. In particular, MA always runs within 10 s. While slower, BRKGA remains a fast algorithm, running within 65 s in the worst case.

Looking at the solutions obtained by MAext, we observe that letting MA run for a longer time allows it to reach the performance of BRKGA, and to exceed it in some cases. Indeed, the values in the related improvement column are equal to 0 in 7 out of 15 cases, positive in 2 cases and negative in the remaining 6. All the values in the column range between 0.07% and \(-\)0.31%. We recall that both approaches find almost all the optimal solutions provided by CPLEX for this scenario.

Table 4 MA, MAext and BRKGA comparison (Scenario 1)

We now consider Scenario 3 (Table 6). We can see that BRKGA improves MA in 14 out of 15 cases, with the only negative value being \(-\)0.16% (Type FC, \(n=300\)). The improvements are above 0.5% in 12 cases, up to 1.21% (Type NC, \(n=1000\)). Overall, the average improvements are equal to 0.81%, 0.81% and 0.49% for Type NC, C and FC, respectively.

Looking at computational times, we can see that the additional choices provided by the higher allowance thresholds impact BRKGA more than MA. Indeed, MA always runs within around 200 s. While slower, BRKGA retains acceptable computational times, always running within around 500 s. When computed over the 50 instances corresponding to a type, the average computational times for BRKGA never exceed 200 s.

With respect to MAext, BRKGA still obtains better solutions in 13 out of 15 cases, with the largest negative value being \(-\)0.20% (Type FC, \(n=300\)). The average improvements for Type NC, C and FC decrease to 0.68%, 0.43% and 0.15%, respectively.

In Scenario 4 (Table 7), BRKGA finds better solutions than MA in all 15 cases. The improvements are above 0.5% in 9 cases, up to 1.13% (Type NC, \(n=1000\)). The average improvements for Type NC, C and FC are equal to 0.79%, 0.58% and 0.36%, respectively. Again, we observe that MA always runs within 200 s, while BRKGA runs within 500 s, with average computational times below 200 s when computed over the 50 instances corresponding to a type.

BRKGA finds better solutions than MAext in 14 out of 15 cases, with the only negative value being \(-\)0.04% (Type FC, \(n=300\)). The average improvements for Type NC, C and FC are 0.55%, 0.27% and 0.17%, respectively.

Table 5 MA, MAext and BRKGA comparison (Scenario 2)
Table 6 MA, MAext and BRKGA comparison (Scenario 3)
Table 7 MA, MAext and BRKGA comparison (Scenario 4)
Fig. 5 BRKGA solution quality comparison versus MA and MAext

To further highlight the different performances of the three algorithms and the improvements brought by MAext with respect to MA, we consider Fig. 5. In the figure we show, for each scenario, how many times BRKGA found a better, equal or worse solution than MA or MAext. In Scenario 1, with respect to MA, MAext obtains more ties (29 out of 150 for MA, 40 for MAext). However, the two MA variants beat BRKGA roughly the same number of times (31 for MA, 30 for MAext). In Scenario 2, MA and BRKGA find the same solution in 120 out of 150 cases, while BRKGA finds a better solution in 26 of the remaining 30 cases. As already discussed, the performances of MAext and BRKGA are very similar in this scenario. Indeed, there are 141 ties, while MAext finds a slightly better solution in 7 of the remaining 9 cases. Finally, looking at Scenarios 3 and 4, we observe that BRKGA finds better solutions than both MA and MAext in most cases. For Scenario 3, it finds better solutions than MA in 130 cases, with 9 ties, and better solutions than MAext in 114 cases, with 13 ties. For Scenario 4, it finds better solutions than MA in 132 cases, with 10 ties, and better solutions than MAext in 110 cases, with 16 ties. A two-sided sign test tells us that we can reject the null hypothesis that BRKGA and MAext perform similarly on Scenarios 1, 3 and 4 (p-value \(< 0.0001\)).

Table 8 BRKGA and variants comparison (Scenario 3)

Finally, we make some considerations about the number of items and violations contained in the solutions returned by the different algorithms. These values are reported in the table in Appendix A. No clear trend emerges that correlates more or fewer items (or violations) with better solutions; the best trade-off may depend strongly on the features of each individual instance, which emphasizes the need for effective algorithms such as BRKGA to solve KPFS.

4.4 BRKGA decoder parameters

In this section we propose an analysis of the effects of the parameters and features that we introduced in our proposed decoder function. In particular, we compare our BRKGA algorithm with the following three variants:

  • No\(_{adjust}\): A version of BRKGA that does not apply the chromosome adjustment operator (see Sect. 3.3.4).

  • No\(_\omega \): A version of BRKGA that does not use the \(\omega \) parameter; that is, the cleaning operator only removes from a solution S items j that lead to an improvement in the fitness value (\(\sum _{C^i \in \mathcal {C}_j| v_i(S)> 0}d_i \ge p_{j}\); see Sect. 3.3.2).

  • No\(_\tau \): A version of BRKGA that does not use the \(\tau \) parameter; that is, in each iteration the greedy completion operator adds to a solution S the item j that maximizes the ratio \(\frac{p'_j}{w_j}\), so that the number of additional violations is not considered (see Sect. 3.3.3).

For this comparison, we focused on Scenario 3.

With No\(_{adjust}\), we aim to verify whether the chromosome adjustment operator actually helps to transmit the features of good solutions to future generations, ultimately leading to better solutions. With respect to No\(_\omega \) and No\(_\tau \), we observe that they correspond to BRKGA with \(\omega =1\) and \(\tau =0\), respectively. These values were also considered during the irace tuning phase. Therefore, in this analysis we aim to check which types of instances benefit most from the chosen values.

In Table 8 we report how much BRKGA improves in percentage the results of the considered variants. For any BRKGA variant Alg and instance a, the percentage improvement is computed as \(100 \times \frac{Sol(BRKGA,a) - Sol(Alg,a)}{Sol(Alg,a)}\). Each value reported is an average computed on the 10 values corresponding to a given choice of Type and n.

Looking at No\(_{adjust}\), we observe that the chromosome adjustment operator has a significant impact on solution quality. Indeed, BRKGA finds better solutions on average for all instance types and sizes. In particular, for Type C and FC instances, the average improvement is often around 1% for larger instances (up to \(1.19\%\) for Type C, \(n=1000\)). Overall, BRKGA finds better solutions than No\(_{adjust}\) for 113 out of 150 instances (56 out of 60 for \(n\ge 800\)), while the opposite happens for 8 instances.

Let us now consider No\(_\omega \). It can be seen that the relaxed definition of detrimental items leads to better solutions for correlated and fully correlated instances, for which BRKGA always finds better solutions on average, with improvements above 0.4% in 8 out of 10 cases, up to 0.99% for Type FC, \(n=700\). Overall, for Type C and FC instances, BRKGA finds better solutions for 89 out of 100 instances (always for \(n \ge 800\)). For the same instances, No\(_\omega \) finds better solutions 5 times.

For NC instances, the two approaches have similar performances. Indeed, BRKGA finds better solutions than No\(_\omega \) in 16 out of 50 cases, while the opposite happens 18 times. On average, No\(_\omega \) finds better solutions for \(n\ge 700\), while BRKGA performs better in the other two cases. However, values are smaller than the ones observed for Type C and FC, ranging from \(0.07\%\) to \(-0.27\%\).

Finally, let us consider No\(_\tau \). In contrast to No\(_\omega \), this version is significantly outperformed by BRKGA on Type NC instances. Indeed, BRKGA performs better on average in all cases with \(n\ge 500\), with improvements up to around 1.5% for \(n\ge 800\). BRKGA finds better solutions than No\(_\tau \) for 31 out of 50 instances (in 19 out of 20 cases for \(n \ge 800\)), while the opposite happens 6 times.

For correlated and fully correlated instances, the two approaches have similar performances. BRKGA finds better solutions for 43 out of 100 instances, while the opposite happens in 38 cases. The values range from \(0.14\%\) to \(-0.05\%\) for Type C instances, and from \(0.20\%\) to \(-0.27\%\) for Type FC instances; all negative values except one are above \(-0.10\%\).

5 Conclusions

In this work, we propose a Biased Random-Key Genetic Algorithm (BRKGA) for the Knapsack Problem with Forfeit Sets (KPFS), a recently introduced variant of the 0/1 Knapsack Problem that deals with partial incompatibilities among subsets of items (also called forfeit sets) by introducing penalties (forfeit costs). Furthermore, KPFS generalizes previously introduced problems, such as the Knapsack Problem with Forfeits and the Knapsack Problem with Conflict Graph. BRKGA algorithms are variants of random-key genetic algorithms that introduce a bias towards the fittest individuals during crossover, and have been applied successfully to many combinatorial optimization problems. Our approach has been shown experimentally to significantly outperform the previously introduced MA algorithm, finding many more proven optimal solutions and better solutions in almost all considered cases. Since BRKGA is more time-consuming than MA, we also considered an extension of MA (MAext) that uses the same computational time. BRKGA outperforms MAext in three out of four test scenarios, while the two algorithms have equivalent performances in the remaining one. Regarding future research, we aim to develop efficient exact approaches for the problem. Developing and testing BRKGA approaches for other optimization problems adapted to take the forfeit set concept into account could also be an interesting line of research.