On efficiency and the Jain’s fairness index in integer assignment problems

Given two sets of objects, the integer assignment problem consists of assigning objects of one set to objects in the other set. Traditionally, the goal of this problem is to find an assignment that minimizes or maximizes a measure of efficiency, such as maximization of utility or minimization of cost. Lately, the interest in incorporating a measure of fairness in addition to efficiency has gained importance. This paper studies how to incorporate these two criteria in an integer assignment, using the Jain’s index as a measure of fairness. The original formulation of the assignment problem with this index involves a non-concave function, which renders a non-linear non-convex problem, usually hard to solve. To this aim, we develop two reformulations, where one is based on a quadratic objective function and the other one is based on integer second-order cone programming. We explore the performance of these reformulations in instances of real-world data derived from an application of assigning personnel to projects, and also in instances of randomly generated data. In terms of solution quality, all formulations prove to be effective in finding solutions capturing both efficiency and fairness criteria, with some slight differences depending on the type of instance. In terms of solving time, however, the performances of the formulations differ considerably. In particular, the integer quadratic approach proves to be much faster in finding optimal solutions.


Introduction
The assignment problem consists of allocating objects in one set to objects in another set. In one set, tasks such as jobs, courses, positions, projects, or resources are available to be done or used by the agents, individuals, or users of the other set. Operation managers are often faced with resource assignment problems, where quantitative decision models are useful to find a solution (Bertsimas et al. 2012). In this paper, we focus on incorporating both efficiency and fairness into the assignments. Often, the decision maker who performs the assignment is interested in maximizing an overall metric of efficiency, while the users affected by the assignment are interested in maximizing their own individual benefits. On the one hand, maximizing the efficiency of the assignment might lead to some users ending up in better conditions than others. On the other hand, maximizing a fairness metric among the users might reduce the efficiency of the assignment. It is important to notice that these two goals are not necessarily in direct conflict. In fact, in this work, we show that the level of one objective may be maintained while improving the level of the other. However, their interaction has to be carefully managed to achieve a balance between the two. Therefore, studying assignments that incorporate both efficiency and fairness is an important endeavor.
This paper proposes a bi-objective approach based on integer programming, incorporating the concepts of efficiency and fairness in the assignment. Here, an efficient assignment means an assignment that maximizes the total benefits. To measure fairness, we use the Jain's index (Jain et al. 1984). This index is defined as a ratio, where the numerator is the square of the sum of the users' benefits, and the denominator is the number of users times the sum of the users' squared benefits. Hence, the Jain's index is a non-concave function, which increases the difficulty of solving the bi-objective problem. To overcome this issue, in this paper, we study two reformulations tailored to the integer unbalanced assignment problem. One of these is based on a convex quadratic objective function and the other one is based on integer second-order cone optimization (ISOCO). Using data from a real-world case and also experimental data from different scenarios, we explore the performance of these reformulations in terms of solution quality and solving time.
The bi-objective approach to incorporate efficiency and fairness in an integer unbalanced assignment problem presented in this paper was motivated by a real-world problem of assigning students to projects (Rezaeinia et al. 2021). This type of assignment problem, where the number of resources and the number of users are unequal, appears in several applications (see e.g. Rabbani et al. 2019; Majumdar and Bhunia 2012). In our application, each resource must be assigned to only one user, and the different resources are characterized by a number of attributes. We assume that each user requires a limited number of resources and the benefits of each resource are not the same for all the users. Each user is then interested in maximizing her own benefit. However, from a central decision-maker perspective, factors such as the limited number of available resources and the different valuations of the users on the different resources make it practically impossible to assign the resources so that all users achieve the maximum benefit.
The remainder of this paper is organized as follows. Section 2 reviews relevant literature. Section 3 develops mathematical modeling approaches for the bi-objective problem, while Sect. 4 presents some reformulations. Section 5 discusses the numerical results. Concluding remarks are provided in Sect. 6.

Literature review
The scientific literature has paid increasing attention to incorporating both efficiency and fairness in assignment problems. A wide range of articles have studied the problem in various fields, especially in network assignment, wireless communication systems, digital transmission, telecommunications, and some other allocation problems with continuous resources. As a fairness measure, these works have usually adopted the Jain's index. This index is a well-established measure of fairness and has some appealing properties, as emphasized in Guo et al. (2014), Sediq et al. (2013) and Lan et al. (2010). This index was first defined by Jain et al. (1984), and since then it has been used in thousands of papers. In the following, we review papers that have used this index in assignment problems where, in addition to efficiency, the solutions try to meet a fairness criterion. Schwarz et al. (2010) studied the trade-off between efficiency and fairness in a mobile communication system with multiple users. They addressed the problem using a non-linear integer program, where the objective is to maximize the total throughput of the users in the system. A linear approximation is proposed to simplify the non-linear problem and a max-min model is proposed to maximize the minimum throughput and guarantee a minimum fairness level. Then, the Jain's index is used to quantify the obtained fairness. Additionally, Schwarz et al. (2011) used the Jain's index directly in the primary problem and studied the trade-off between efficiency and fairness in a wireless network. They proposed a utility maximization problem based on an $\alpha$-fair utility function by considering the Jain's index as a constraint. Then, a second-order cone programming (SOCP) formulation adapted to the continuous multiuser scheduling problem was used to cast the problem from a non-convex to a convex optimization problem.
In a different problem, Sediq et al. (2012, 2013) studied the trade-off between the sum efficiency and Jain's fairness index in an assignment problem with continuous resources and multiple users. This work exploited some special conditions of the radio resource allocation problems to convert the non-convex problem to a convex optimization problem. Another paper that deals with the trade-off between efficiency and fairness is Guo et al. (2013). They consider the increase of efficiency and fairness in the assignment of resources to users in a wireless communication system. They addressed the trade-off between efficiency and fairness by formulating an $\alpha$-utility function in combination with the Jain's index, and using a lexicographic approach prioritizing efficiency first and then fairness. In the first model, they considered maximizing efficiency subject to a Jain's index constraint. In the second model, they maximized the Jain's index by considering a constraint on the efficiency of the system. There is no reformulation for the Jain's index in this study; hence, an algorithm to find the optimal trade-off was proposed. The same authors studied a different problem of similar nature in Guo et al. (2014). This problem considers the throughput maximization in a sub-channel and time slot allocation in downlink systems subject to a Jain's index constraint on both short-term and long-term fairness. First, a non-linear integer programming formulation was used to model the problem, and then the integer variables were relaxed into continuous ones and a SOCP formulation was used to cast the problem into an equivalent convex one.
Song et al. (2016) used the Jain's index to measure the fairness of the solution obtained. The paper analyzed the trade-off problem among spectral efficiency, energy efficiency, and fairness in cooperative digital transmission systems. An $\alpha$-fairness function was used in a multi-objective optimization model using the weighted sum method to represent the fairness rate. Then, an algorithm based on the Lagrangian dual decomposition method was used to obtain the solution set of the model. A heuristic resource allocation algorithm was also proposed to obtain a solution and maintain the trade-off between many users.
In another context, Kachroo et al. (2016) studied the trade-off between efficiency and fairness in radio resource allocation of maritime channels. Here, the problem consists of assigning the blocks of the radio resource to the users. To address it, a max-min integer linear programming model was formulated, where the objective function is to maximize the minimum total throughput from the resource, aimed at increasing fairness in the allocation system. Then, the results obtained were compared by using the Jain's fairness index. In another problem of continuous nature, Zhou et al. (2017) studied the load balancing problem in a cellular network, which aims at maximizing the system throughput and balancing the load of the network. They model the problem by an integer non-linear programming formulation with a non-convex objective function. A two-layer iterative algorithm is used to find a nearly optimal solution to the problem. Then, the Jain's index was used to measure the load balancing level in the network. Another paper using the Jain's index is Zabini et al. (2017), which studies the trade-off between throughput and fairness in a resource allocation problem in wireless communication systems. They address the trade-off problem by optimizing a model whose objective function is to maximize the average throughput subject to a specified value of fairness. Bui et al. (2019) studied the trade-off between throughput and fairness in a downlink non-orthogonal access network. They addressed the problem with an integer and non-convex optimization model. To obtain a practical implementation and overcome the problem's complexity, the integrality of the variables is relaxed to a continuous formulation. Then, an approximation method is proposed to solve the relaxed problem, attempting to arrive at a locally optimal solution.
Most of the papers above deal with efficiency and fairness in continuous resource assignment problems. Some of them consider discrete decisions, but this is managed by relaxing the integrality constraint and using rounding heuristics to find feasible solutions. The problem considered in this paper has an integer nature and rounding procedures are not a straightforward option. The work of Zabini et al. (2017) is one of the closest to ours, as they approach a similar assignment problem. However, their problem is continuous, in which each resource can be assigned to more than one user, while in our problem each resource should be assigned to only one user. Sediq et al. (2012) is also close to our article, in terms of exploiting the structure of the assignment problem and the Jain's fairness index to reformulate the problem. The continuous nature of the resources in their case allows the problem to be reduced to a single non-linear problem that can be cast into a quadratic convex problem. Their main result relies on the convexity of the feasible set. This result is based on a monotonic trade-off property that leads to a monotonic decrease of fairness for any increase in efficiency. That monotonic property allows them to find what they call the "Optimal Efficiency-Jain trade-off policy" and characterize the Pareto frontier.
In contrast, the nature of the resources in our case is discrete, thus the convexity argument is no longer valid and the monotonic property disappears. The solution approach, therefore, cannot be simplified to solving a single convex quadratic problem, and a different approach is needed to handle the interplay between efficiency and fairness. To deal with this challenge, we evaluate the prioritization of either efficiency or fairness and its implication for the quality of the solution using a lexicographic optimization approach (Lai et al. 2022). The prioritization of each criterion results in modeling challenges for which we consider different reformulations. We evaluate the performance and scalability of each model and how they affect the lexicographic approach. Additionally, due to the specific requirements for the assignment of resources to users in our application, we consider some side constraints that are not typically considered in telecommunications. The addition of those constraints illustrates how these approaches can be extended to applications beyond the specifics of telecommunication problems. Moreover, an advantage of the proposed approach is that it is agnostic to the addition of side constraints, that is, the lexicographic approach and the reformulations of the non-convexities will not be compromised by adding these constraints.
Overall, our work contributes to the literature by incorporating both efficiency and fairness in assignment problems of discrete nature. On the methodological side, our reformulations aim at overcoming the difficulty of incorporating a non-linear function to measure fairness in these problems. On the applied side, we illustrate our approaches in a real-world problem and also explore their computational performance in several experimental data instances.

Mathematical modelling
This section presents the different elements of the optimization formulations that will be used later to develop the main approaches.
To define parameters and decision variables in the mathematical formulations, we assume that we have $n$ users and $m$ resources. We use two sets: $U = \{u_1, \ldots, u_n\}$, which represents the set of users, and $R = \{r_1, \ldots, r_m\}$, which represents the set of resources. We also define $p_{ru} \in \mathbb{R}_+$ as the benefit obtained when a resource $r \in R$ is assigned to a user $u \in U$. Additionally, $U_u \in \mathbb{Z}_+$ and $L_u \in \mathbb{Z}_+$ denote the upper and lower bounds, respectively, on the number of resources that a user $u$ requires.
We use two sets of decision variables. First, the model has to decide which users are assigned resources. We use the binary variable $y_u \in \{0, 1\}$, which is one if user $u$ is selected and zero otherwise. This is required because, given the condition of the unbalanced assignment problems and the number of resources needed by a user, not all users might receive the resources they need. Second, the model has to assign resources to the users. For this purpose, we use the variable $x_{ru} \in \{0, 1\}$, which is equal to one if resource $r$ is allocated to user $u$, and zero otherwise.

Constraints
To model the conditions of the assignment problem under study, we define constraints (1)-(5). Constraints (1) ensure that every resource is assigned to exactly one user. Constraints (2) ensure that no resources will be assigned to users that were not selected. Constraints (3) and (4) impose the upper and lower bounds on the number of resources that are needed for each user. Constraints (5) enforce the binary nature of the variables.
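As a reading aid, the verbal description of constraints (1)-(5) can be written in algebraic form. The block below is one plausible reconstruction from that description and from the notation of this section, not necessarily the exact form used by the authors. In particular, whether the bounds in (3) and (4) are multiplied by $y_u$ is an assumption. Note that $U$ denotes the set of users, while $U_u$ denotes the upper bound for user $u$.

```latex
\begin{align}
\sum_{u \in U} x_{ru} &= 1        && \forall r \in R            \tag{1}\\
x_{ru}                &\le y_u    && \forall r \in R,\ u \in U  \tag{2}\\
\sum_{r \in R} x_{ru} &\le U_u\, y_u && \forall u \in U         \tag{3}\\
\sum_{r \in R} x_{ru} &\ge L_u\, y_u && \forall u \in U         \tag{4}\\
x_{ru},\ y_u          &\in \{0,1\} && \forall r \in R,\ u \in U \tag{5}
\end{align}
```

Under this reading, a user with $y_u = 0$ receives no resources, and a selected user receives between $L_u$ and $U_u$ resources.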

Efficiency and fairness functions
For the optimization models, we consider two objectives: efficiency and fairness. We may define the efficiency and fairness of the assignment problem as functions of the vector $x \in \{0, 1\}^{|R||U|}$ of decision variables. In (6), $B(x)$ measures the efficiency of an assignment $x$, which is the sum of the benefits obtained by the users. Note that Eq. (6) is a linear function of $x$.

$$B(x) = \sum_{u \in U} \sum_{r \in R} p_{ru} x_{ru} \qquad (6)$$

For the fairness we use the Jain's index $J(x)$ (Jain et al. 1984), which is formulated in (7) using the benefits of an assignment $x$.

$$J(x) = \frac{\left(\sum_{u \in U} \sum_{r \in R} p_{ru} x_{ru}\right)^2}{n \sum_{u \in U} \left(\sum_{r \in R} p_{ru} x_{ru}\right)^2} \qquad (7)$$

The Jain's index (7) is a non-linear continuous function with a bounded range in the closed interval $[1/n, 1]$. The Jain's index provides a fairness measure for an assignment and its two bounds represent two extreme situations. The lower bound $1/n$ corresponds to the least fair allocation. In that situation only one user benefits from the assignment and all the other users do not receive any resources or do not benefit from the assigned resources. The upper bound 1 corresponds to the fairest assignment, in which all users receive the same benefit. Note that the bounds of the fairness index may be achieved without necessarily optimizing the users' benefits. Hence, when optimizing an assignment it is important to consider efficiency and fairness together.
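To make the two measures concrete, the following minimal Python sketch evaluates the efficiency and the Jain's index of a given assignment. The function names, the encoding of an assignment as a list `assign[r] = u`, and the convention of returning 1 when every user has zero benefit are assumptions made for this illustration; they are not taken from the paper's AMPL implementation.

```python
def user_benefits(p, assign, n_users):
    """Benefit of each user; p[r][u] is the benefit of resource r for user u,
    and assign[r] is the (hypothetical) index of the user receiving resource r."""
    benefits = [0.0] * n_users
    for r, u in enumerate(assign):
        benefits[u] += p[r][u]
    return benefits


def efficiency(benefits):
    """Total benefit B(x): the sum of the users' benefits."""
    return sum(benefits)


def jain_index(benefits):
    """Jain's index: (sum of benefits)^2 / (n * sum of squared benefits)."""
    total_sq = sum(b * b for b in benefits)
    if total_sq == 0:
        # Assumption: all-zero benefits are treated as perfectly fair.
        return 1.0
    return sum(benefits) ** 2 / (len(benefits) * total_sq)


# Toy data (made up): three resources, two users.
p = [[3, 1], [2, 4], [1, 5]]
b = user_benefits(p, assign=[0, 1, 1], n_users=2)   # user 0 gets 3, user 1 gets 9
print(efficiency(b), jain_index(b))
```

The two extreme cases discussed above can be checked directly: equal benefits give an index of 1, and a single benefiting user among $n$ gives $1/n$.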

General formulation
This paper aims to study the incorporation of efficiency and fairness in an assignment problem. The main challenge when using the Jain's index is that full fairness would not necessarily lead to efficiency. Consider for example a two-user case. In this case, full fairness is obtained when both users receive the same level of utility (irrespective of whether it is the minimum or maximum possible, or something in between). To overcome that challenge, we propose a bi-objective approach where this risk of the Jain's index is balanced by maximizing the efficiency of the assignment for any given fairness level. Note that efficiency and fairness are not necessarily conflicting. However, depending on how unbalanced the problem is, the optimization process may demand more careful handling to incorporate efficiency and fairness.
The general formulation used for building our approach was proposed in Sediq et al. (2012, 2013) in the context of wireless communication assignments. Here, we define $C$ as the set of vectors $(x, y)$ satisfying the constraints (1)-(5). Then we obtain the first part of the formulation (8).
In (8), $X_p$ is the set of all assignment vectors $(x, y) \in C$ that maximize the Jain's index subject to a minimum efficiency level of $p$. Note that (8) works with a prescribed efficiency level. Hence, to optimize efficiency, we need to maximize the total efficiency $B(x)$ over the set $X_p$. In (9), $X_p^* \in X_p$ is one of the benefit vectors that maximize the total efficiency. Note that $X_p^*$ lies on the Pareto efficient frontier.
An important remark here is with respect to (8). In Sediq et al. (2012, 2013), the structure of the problem allows the Optimal Efficiency-Jain trade-off to be derived using a monotonic trade-off property. Basically, this property states that any increase in efficiency leads to a decrease in fairness. This implies that the inequality $B(x) \ge p$ is always satisfied with equality. Then, for any level of efficiency $p$ one just needs to solve the optimization problem in (8) and the solution obtained can be assigned to $X_p^*$ without having to solve (9).
Unfortunately, the result in Sediq et al. (2012, 2013) is not directly applicable to our problem, because we cannot rely on the set $C$ being convex. In general, the lack of convexity does not necessarily invalidate the monotonicity. However, it is easy to see that our problem setup does not satisfy the monotonic trade-off property. Consider for example an efficiency level of $p = 0$. One could build a solution with a fairness of 1 by simply building an assignment with utility zero for all the users. The monotonic trade-off property states that any increase in efficiency beyond zero will necessarily lead to a deterioration of the fairness level. The assignment of utility zero for all the users is often feasible in the assignment problems considered here. Additionally, it is possible that a solution where all users obtain the same reward, strictly greater than zero, is also feasible. In such a case, one would be able to improve efficiency without any sacrifice of the fairness level, thus violating the monotonic trade-off property. In consequence, the lack of monotonicity forces us to consider an approach where solving a single convex quadratic optimization problem is no longer an option.
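The counterexample sketched in the paragraph above can be verified by enumeration on a toy instance. The Python sketch below is only an illustration: the instance `P_TOY` is made up, the bounds on the number of resources per user and the selection variables are ignored for brevity, and an all-zero benefit vector is assumed to have fairness 1, as stated in the text.

```python
from itertools import product


def jain_index(benefits):
    """Jain's index; all-zero benefits are treated as perfectly fair (assumption)."""
    total_sq = sum(b * b for b in benefits)
    if total_sq == 0:
        return 1.0
    return sum(benefits) ** 2 / (len(benefits) * total_sq)


def efficiency_fairness_pairs(p):
    """Enumerate every assignment of resources to users and collect the
    distinct (efficiency, fairness) pairs. p[r][u] is the benefit of
    resource r for user u."""
    n_users = len(p[0])
    pairs = set()
    for assign in product(range(n_users), repeat=len(p)):
        benefits = [0] * n_users
        for r, u in enumerate(assign):
            benefits[u] += p[r][u]
        pairs.add((sum(benefits), jain_index(benefits)))
    return sorted(pairs)


# Hypothetical instance: each resource is only valuable to one of the two users.
P_TOY = [[2, 0], [0, 2]]
PAIRS = efficiency_fairness_pairs(P_TOY)
print(PAIRS)  # [(0, 1.0), (2, 0.5), (4, 1.0)]
```

Efficiency rises from 0 to 4 while the fairness index stays at 1, so fairness is not a monotonically decreasing function of efficiency on this instance.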
To cope with the challenge of lack of monotonicity, the remainder of this section will outline some variations of the general formulation presented above, which were used previously by Rezaeinia et al. (2021). The resulting models are based on a lexicographic method to prioritize one goal at a time and will set the basis for the reformulations that we consider later in this paper.

Lexicographic efficiency-fairness
The first lexicographic approach prioritizes efficiency over fairness. Efficiency is computed using the utility function defined in (6). We formulate the first step of this approach in (10), where the aim is to obtain the maximum efficiency level for an assignment subject to constraints (1)-(5). Let $B^*$ denote the optimal efficiency level obtained by solving the problem (10).
The second step of this approach is to optimize fairness while maintaining the efficiency level $B^*$. We formulate the optimization problem of this second step in (11), where we maximize the Jain's index.
Note that (11) is equivalent to solving the optimization problem considered in (8), where $p = B^*$. Hence, we are seeking an assignment that maximizes $J(x)$ and yields an efficiency equal to $B^*$. Two remarks are important here. First, note that setting $p = B^*$ makes it unnecessary to solve the optimization problem in (9). In other words, by construction, we already have the best level of efficiency that can be achieved with just the assignment constraints. Hence, a more constrained problem will not improve that efficiency level. Second, if the problem (10) is feasible, then we immediately have a solution that belongs to the feasible set of (11). In other words, we obtain a certificate of feasibility for both problems even though the second problem is more constrained. For later reference throughout the article, we formalize this second remark in Corollary 1 below.
Corollary 1 The feasibility of Problem (11) follows from the feasibility of Problem (10).

Lexicographic fairness-efficiency
The second lexicographic approach prioritizes fairness over efficiency. We formulate the first step of this approach in (12), where we optimize the fairness of the assignment subject to constraints (1)-(5). To optimize fairness, we use the Jain's index formulated in (7). Let $J^*$ be the optimal fairness level obtained by solving (12).
The second step of this approach is to optimize efficiency while maintaining at least a fairness level of $J^*$. We formulate the optimization problem of this second step in (13), where we maximize the efficiency utility function $B(x)$.
Note that (12) is equivalent to solving the optimization problem considered in (8), where $p = 0$. Hence, since we are considering only non-negative utilities, we are seeking an assignment that maximizes $J(x)$ without imposing any lower bound constraint on the efficiency. Then, problem (13) solves the problem considered in (9), imposing $J^*$ as the level of fairness required.
Note that for this approach, too, obtaining a feasible solution when solving problem (12) certifies the feasibility of problem (13). This is formalized in Corollary 2 below.

Corollary 2
The feasibility of Problem (13) follows from the feasibility of Problem (12).
Corollaries 1 and 2 have more general implications. Notice that the problems in (10) and in (12) have the same feasible set. Hence, the feasibility of the problem in (12) is certified by the feasibility of the problem in (10), as stated in Corollary 3 below.

Corollary 3
The feasibility of Problem (12) follows from the feasibility of Problem (10).
Even though the results in Corollaries 1, 2, and 3 follow easily from the formulations and their construction, they have an important practical implication. First, we remark that the problem in (10) is an Integer Linear Problem (ILP), for which one can exploit the efficiency of the current state-of-the-art MILP solvers. However, the problems in (11), (12), and (13) are non-linear non-convex problems, for which the optimization process tends to be more challenging. To ease the challenge posed by the nature of those problems, if problem (10) is feasible, the solution found for it may be used to initialize the optimization of the problems (11), (12), and (13).
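The two lexicographic schemes of this section can be illustrated by brute-force enumeration on a very small instance. The sketch below is not the authors' solution method, which relies on mathematical programming solvers; it only shows the order in which the two criteria are optimized. The instance data, the function names, and the rule that a user with no resources counts as unselected are assumptions for this example.

```python
from itertools import product


def jain_index(benefits):
    """Jain's index; all-zero benefits are treated as perfectly fair (assumption)."""
    total_sq = sum(b * b for b in benefits)
    if total_sq == 0:
        return 1.0
    return sum(benefits) ** 2 / (len(benefits) * total_sq)


def candidates(p, lower, upper):
    """All feasible (assignment, user benefits) pairs. Every resource goes to
    exactly one user; a user gets either no resource (unselected) or a number
    of resources within [lower[u], upper[u]]."""
    n_res, n_users = len(p), len(p[0])
    out = []
    for assign in product(range(n_users), repeat=n_res):
        counts = [assign.count(u) for u in range(n_users)]
        if all(c == 0 or lower[u] <= c <= upper[u] for u, c in enumerate(counts)):
            benefits = [0] * n_users
            for r, u in enumerate(assign):
                benefits[u] += p[r][u]
            out.append((assign, benefits))
    return out  # an empty list means the instance is infeasible


def lex_efficiency_fairness(p, lower, upper):
    """Step 1: maximize efficiency. Step 2: maximize fairness at that efficiency."""
    cands = candidates(p, lower, upper)
    b_star = max(sum(b) for _, b in cands)
    assign, benefits = max((c for c in cands if sum(c[1]) == b_star),
                           key=lambda c: jain_index(c[1]))
    return assign, b_star, jain_index(benefits)


def lex_fairness_efficiency(p, lower, upper, tol=1e-9):
    """Step 1: maximize fairness. Step 2: maximize efficiency at that fairness."""
    cands = candidates(p, lower, upper)
    j_star = max(jain_index(b) for _, b in cands)
    assign, benefits = max((c for c in cands if jain_index(c[1]) >= j_star - tol),
                           key=lambda c: sum(c[1]))
    return assign, sum(benefits), j_star


# Hypothetical instance with integer benefits: three resources, two users.
P = [[5, 1], [4, 3], [3, 3]]
LEF = lex_efficiency_fairness(P, lower=[1, 1], upper=[2, 2])
LFE = lex_fairness_efficiency(P, lower=[1, 1], upper=[2, 2])
print(LEF)  # ((0, 0, 1), 12, 0.8)
print(LFE)  # ((1, 0, 1), 8, 1.0)
```

On this instance, prioritizing efficiency yields a total benefit of 12 with a fairness index of 0.8, whereas prioritizing fairness yields an index of 1 with a total benefit of 8. Any assignment returned by `candidates` is feasible for both first-step problems, in line with Corollary 3.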

Reformulating the Jain's index
One of the challenges when using the Jain's index is that (7) is non-linear and non-concave. This may lead to more computational effort and time to solve the problems in (11), (12), and (13). Additionally, it does not scale, i.e., when the dimension of $X_p$ is large, it might become too difficult to solve the problem. This last point is illustrated with our computational experiments in Sect. 5. The non-concavity of the Jain's index affects the nature of the optimization problem and provides the motivation for looking at possible reformulations.

Reformulation for the lexicographic efficiency-fairness approach
Here we show how $J(x)$ can be reformulated in problem (11). First, from Corollary 1 we know that if the problem in (10) is feasible, its optimal solution is also feasible for (11). Moreover, since the optimal value $B^*$ is obtained by optimizing (10) without any additional side constraints, we know that it is not possible to obtain a higher value for $B(x)$ within the set defined by the constraints (1)-(5). Hence, the constraint $B(x) = B^*$ will always be satisfied for any feasible solution of (11). Therefore, we may write Problem (11) as shown in (14).
Note that in this formulation we are fixing the value of $B(x) = B^*$. Now, recall the definition of the Jain's index $J(x)$ in (7). As a result, we are fixing the numerator of $J(x)$. Hence, Problem (14) may be reformulated as (15). Since $B^*$ is obtained by (10), optimizing the problem in (15) is equivalent to optimizing Problem (16). The advantage of Problem (16) is that its continuous relaxation has a convex quadratic objective function, which may help with the computational effort required to solve it.
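The chain of equivalences described above can be written out as follows. This is a sketch of the argument under the assumption $B^* > 0$, using the shorthand $b_u(x)$ for the benefit of user $u$, which is introduced here for readability. It is not a transcription of the paper's equations (14)-(16).

```latex
\begin{align*}
b_u(x) &= \sum_{r \in R} p_{ru}\, x_{ru}, \qquad
J(x) = \frac{\bigl(\sum_{u \in U} b_u(x)\bigr)^2}{n \sum_{u \in U} b_u(x)^2}. \\[4pt]
\text{If } B(x) = B^* &\text{ is fixed, then }
J(x) = \frac{(B^*)^2}{n \sum_{u \in U} b_u(x)^2}, \\[4pt]
\text{so}\quad
\max_{\substack{(x,y) \in C \\ B(x) = B^*}} J(x)
&\quad\text{has the same optimal solutions as}\quad
\min_{\substack{(x,y) \in C \\ B(x) = B^*}} \ \sum_{u \in U} b_u(x)^2 .
\end{align*}
```

Each $b_u(x)$ is linear in $x$, so the objective of the minimization is a sum of squares of linear functions, which is a convex quadratic function once the integrality of $x$ is relaxed.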

Reformulation for the lexicographic fairness-efficiency approach
Here we focus on the reformulation opportunity for the second approach considered in this paper. First, the problem in (12) maximizes the Jain's index, which makes it a non-linear and non-convex problem. For that particular problem we have no reformulation to ease the challenge posed by $J(x)$. Hence, this will remain one of the more computationally demanding steps of our approaches. Now, using Corollary 2, we know that if we find a feasible solution $(x', y')$ to (12), then $(x', y')$ is feasible for the problem in (13). Note that $(x', y')$ may be the optimal solution, but it is not required to be. This is relevant because Problem (12) may not be solved to optimality within a certain time limit, while a good enough feasible solution is available. Let $J^*$ be the fairness value for that feasible solution $(x', y')$. Using $J^*$ we can reformulate (13) as an ISOCO, replacing the constraint $J(x) = J^*$ with $J(x) \ge J^*$. Note that when $J^*$ is the best value one can achieve for fairness, imposing $J^*$ as a lower bound leads to an equivalent formulation. This is an integer non-linear problem, whereby a linear objective function is optimized subject to some linear constraints and at least one quadratic cone constraint (Góez 2013).
To reformulate the problem in (13) we focus on the constraint involving the Jain's index. With that constraint we aim to ensure a level of fairness at least as good as $J^*$. Note that the use of $J(x)$ leads to a non-linear non-convex constraint. However, we can exploit the constant right-hand side $J^*$ to reformulate that constraint as the second-order cone (SOC) constraint (17), which lies in $\mathbb{R}^{n+1}$. Thus, problem (13) can be written as (18).
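One way to see why a fixed $J^*$ yields a second-order cone constraint is sketched below. The shorthand $b_u(x)$ for the benefit of user $u$ is introduced here for readability, and the derivation is a reconstruction of the standard argument under the assumption that not all benefits are zero. It is not a transcription of the paper's constraint (17).

```latex
\begin{align*}
J(x) \ge J^*
&\iff \Bigl(\sum_{u \in U} b_u(x)\Bigr)^2 \;\ge\; n\, J^* \sum_{u \in U} b_u(x)^2,
\qquad b_u(x) = \sum_{r \in R} p_{ru}\, x_{ru}, \\[4pt]
&\iff \sqrt{n\, J^*}\;\bigl\lVert \bigl(b_{u_1}(x), \ldots, b_{u_n}(x)\bigr) \bigr\rVert_2
\;\le\; \sum_{u \in U} b_u(x).
\end{align*}
```

The second equivalence takes square roots on both sides, which is valid because $p_{ru} \ge 0$ implies $\sum_{u \in U} b_u(x) \ge 0$. The result bounds the Euclidean norm of an $n$-dimensional vector that is linear in $x$ by a linear function of $x$, which is a second-order cone constraint in $\mathbb{R}^{n+1}$.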
The optimization problem in (18) is an ISOCO. Finally, we illustrate the challenge of the lack of the monotonic trade-off property in light of the formulation considered. Note that the room for trade-off of efficiency between the users in Problem (18) is related to $J^*$. Consider the case where the efficiency level of user 1 is fixed to an arbitrary level strictly greater than 0, and the efficiency level for user 2 varies. For this case, Fig. 1 illustrates the effect of different $J^*$ levels on the trade-off between the users' efficiency. Notice that the effect is symmetric if one reverses the exercise, fixing user 2 and varying the efficiency for user 1, resulting in a cone. Problem (18) assigns resources to users and maximizes the total benefits in the intersection of the feasible region and the cone given by the fairness constraint. In the figure, by fixing the efficiency for user 1 at 5 units of benefit, the largest cone corresponds to $J^* \ge 0.69$ for any value of efficiency that is strictly greater than 0. As $J^*$ increases, it approaches the value 1, which is the maximum possible level of fairness and is represented in the figure by the red dashed line.

Fig. 1 The effect of $J^*$ on the trade-off between the users' efficiency
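For the two-user case discussed above, the boundary of the fairness cone can be computed in closed form. Writing $t = b_2 / b_1$ for the ratio of the two users' benefits, the condition $J \ge J^*$ becomes $(2J^* - 1)t^2 - 2t + (2J^* - 1) \le 0$, whose roots give the admissible range of $t$. The Python sketch below implements this formula. The function name and interface are hypothetical, and the derivation is ours rather than the paper's.

```python
import math


def benefit_ratio_bounds(j_star):
    """Range of t = b2 / b1 (with b1 > 0) for which the two-user Jain's index
    (1 + t)^2 / (2 * (1 + t^2)) is at least j_star."""
    if not 0.5 <= j_star <= 1.0:
        raise ValueError("for two users the Jain's index lies in [0.5, 1]")
    a = 2.0 * j_star - 1.0
    if a == 0.0:
        # J* = 1/2 is the lower bound of the index: every ratio is admissible.
        return 0.0, math.inf
    root = math.sqrt(1.0 - a * a)
    return (1.0 - root) / a, (1.0 + root) / a


# Fix user 1 at 5 units of benefit, as in the discussion of Fig. 1.
b1 = 5.0
for j_star in (0.69, 0.9, 1.0):
    lo, hi = benefit_ratio_bounds(j_star)
    print(f"J* = {j_star}: b2 in [{b1 * lo:.3f}, {b1 * hi:.3f}]")
```

The two bounds are reciprocals of each other, which reflects the symmetry between the users noted in the text, and the interval collapses to $b_2 = b_1$ when $J^* = 1$.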

Computational results
We tested the approaches proposed in Sect. 4 in a personnel allocation problem, derived from a real-world application reported in Rezaeinia et al. (2021). This is an unbalanced integer assignment problem, where students must be assigned to projects in an educational programme. The projects are proposed by companies and presented to the students in a workshop, with the support of administrative staff and academic supervisors. After the workshop, the students are asked to fill out a preference survey and rank $K$ projects by considering their skills, background, and the project descriptions. The ranking is structured as follows: a project with rank $K$ is the most beneficial project, and a project with rank 1 has the lowest benefit. Consequently, the remaining $K - 2$ projects that the students are allowed to rank have benefits in the range $\{2, \ldots, K - 1\}$. In this problem, the projects defined by the companies are equivalent to what in the models is the set of users, and the students are equivalent to what in the models is the set of resources. Then, the students' benefits $p_{ru}$ are computed based on the preference ranking obtained from the survey. Hence, $p_{ru} = k$ means that the benefit of student $r$ from being assigned to project $u$ is equal to $k$, where $k$ is the rank given to the project by the student. Also, $p_{ru} = 0$ means that project $u$ is not ranked by student $r$. In addition, the companies may have specific requirements for the team of students they will get assigned, on attributes such as educational background, language skills, and gender. These data are gathered in collaboration with the administrative staff and are then used in an assignment model. For this purpose, we define $T = \{t_1, \ldots, t_l\}$ as a set of attributes, and $a_{rt}$ as a binary parameter equal to one if student $r$ possesses attribute $t$, and zero otherwise. Then, the side constraints (19) and (20) are used in the model.
In constraints (19) and (20), $L_{ut}$ and $U_{ut}$ specify lower and upper bounds on the number of students with attribute $t$ that are needed by project $u$. Depending on these constraints, the number of projects, the number of students and their preferences, and other aspects of the problem, it is in general not possible to assign all students to their top choices. In consequence, different solutions might render different levels of efficiency and fairness and, therefore, it is important for the administration to find a solution considering both criteria. More details about the problem can be found in Rezaeinia et al. (2021). In what follows, we use the data instances of that paper to study the performance of the reformulations we developed in Sect. 4.
All computational codes have been implemented in AMPL. To solve the mathematical programming models, we use CPLEX version 12.10, except for model (12), which we solve by using BARON version 18.12.26. The computational runs are set to a time limit of 2 h.

Real-world data instances
These data instances correspond to five consecutive years, spanning from 2017 to 2021. Table 1 provides an overview of the number of students, the number of projects, and the number of requested attributes for each of these years. Each of the instances from 2017 to 2021 was run using the reformulation for the lexicographic efficiency-fairness approach (RLEF) and the reformulation for the lexicographic fairness-efficiency approach (RLFE). We compare their results with those of the lexicographic efficiency-fairness (LEF) approach described in Sect. 3, which provides a basis for comparison for the solutions obtained by the two other approaches.
For the instances of 2017, 2018, and 2020, all the approaches led to the same optimal solution within a few seconds. For 2019 and 2021, the results exhibited some differences, as shown in Table 2. In these years, the students had to rank 5 out of 9 projects, which determined the different levels of benefits detailed in the table. The RLEF and LEF approaches led to the same optimal solution, with a great majority of the students assigned to their first-choice project. This translates into a high level of efficiency, reflected in a large total benefit. The RLFE approach, in contrast, tends to assign fewer students to their first-choice project and more to their second-choice project, which renders more fairness but at the expense of less efficiency. This is especially notable in the 2019 instance. As for the solving time, the RLFE approach took about 10 min to obtain an optimal solution in the two instances, while the RLEF and LEF approaches took only a few seconds. One thing to highlight from these results is that placing the emphasis on fairness in the first stage affects efficiency but does not significantly improve the fairness results. In fact, whether efficiency or fairness is prioritized, the results reach a fairness index slightly above 97%.
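To make the two reported metrics concrete, the sketch below computes the total benefit and Jain's index for a vector of benefits, using the standard definition J(b) = (Σ b_i)² / (n Σ b_i²). Applying the index to one benefit value per student is an assumption for illustration; the definition given earlier in the paper governs which benefits enter the index. The function names are hypothetical.

```python
from typing import Sequence


def total_benefit(b: Sequence[float]) -> float:
    """Efficiency measure: the sum of the individual benefits."""
    return sum(b)


def jain_index(b: Sequence[float]) -> float:
    """Jain's fairness index of a nonnegative benefit vector.

    Equals 1 when all benefits are equal and decreases towards 1/n
    as the benefits become more unequal.
    """
    n = len(b)
    sum_sq = sum(x * x for x in b)
    if n == 0 or sum_sq == 0:
        raise ValueError("Jain's index is undefined for an empty or all-zero vector")
    return sum(b) ** 2 / (n * sum_sq)


if __name__ == "__main__":
    for benefits in ([5, 5, 5, 5], [5, 5, 4, 4], [5, 5, 5, 0]):
        print(benefits, total_benefit(benefits), round(jain_index(benefits), 4))
```

In this toy data, moving two of four students from benefit 5 to benefit 4 lowers the total from 20 to 18 while the index stays close to 1, whereas leaving one student with benefit 0 drops the index to 0.75.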

Experimental data instances
The real instances that we encountered in our application are of relatively small scale. Hence, for the purpose of testing the reformulated approaches in larger-size problems and having a sense of the scalability of the proposed methods, we generated 160 experimental data instances of different sizes. We group these instances into four data sets, which differ in the number of students, projects, requested attributes, and the number of projects ranked by the students. Table 3 shows an overview of the generated data instances, according to these characteristics. Also, two different scenarios are constructed for each data set based on the structure of the students' preferences. For each scenario, we generate 20 instances (therefore, each of the four data sets consists of 40 instances). The scenarios are characterized as follows.
• Random preferences scenario. In a data instance based on the random scenario, students' preferences are split among the projects randomly, according to a discrete uniform distribution (that is, each project has the same probability of being chosen as the k-th most favorite by each student). This defines the benefit of assigning a student to each of the projects, where zero refers to non-beneficial preferences, and K refers to the top-beneficial preferences. There is no guarantee that all the students could be assigned to their most preferred choices in this scenario. If a feasible solution exists, some students might be assigned to their top-beneficial choices, some others to their second choices, and so on (some students might even be assigned to their non-ranked projects).
• Semi-homogeneous preferences scenario. In a semi-homogeneous data instance, the projects and the students are divided into groups of the same size. The first group of students rank the first group of projects, the second group of students rank the second group of projects, and so on. For example, in an instance of data set A, the first 30 students rank the first seven projects, the second 30 students rank projects 8 to 14, and so on. Note that in a semi-homogeneous scenario, the students are assigned to their top-beneficial projects when their preferences satisfy the requirements of the ranked project. In contrast, if the preferences of all students were fully homogeneous, that is, if the rankings of all students coincided, then it would be unavoidable to assign only some of them to their top-ranked projects and others to their non-ranked projects (thus, we do not elaborate experiments for this rather uninteresting case).
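The two scenarios can be sketched with a small generator such as the one below. This is a minimal illustration under stated assumptions, not the generator used for the reported instances: the function names are hypothetical, and in the semi-homogeneous case the choice and order of the K ranked projects within a group's block are drawn at random, which the text does not specify.

```python
import random
from typing import List


def random_preferences(n_students: int, n_projects: int, K: int, rng: random.Random) -> List[List[int]]:
    """Each student ranks K distinct projects drawn uniformly at random."""
    p = [[0] * n_projects for _ in range(n_students)]
    for r in range(n_students):
        ranked = rng.sample(range(n_projects), K)  # most preferred first
        for pos, u in enumerate(ranked):
            p[r][u] = K - pos  # benefit K for the top choice, 1 for the last
    return p


def semi_homogeneous_preferences(
    n_students: int, n_projects: int, K: int, n_groups: int, rng: random.Random
) -> List[List[int]]:
    """Students in group g only rank projects in the g-th block of projects."""
    if n_students % n_groups or n_projects % n_groups:
        raise ValueError("students and projects must split evenly into groups")
    s_size = n_students // n_groups
    p_size = n_projects // n_groups
    if K > p_size:
        raise ValueError("K cannot exceed the number of projects per group")
    p = [[0] * n_projects for _ in range(n_students)]
    for r in range(n_students):
        g = r // s_size
        block = range(g * p_size, (g + 1) * p_size)
        ranked = rng.sample(block, K)  # assumption: random order within the block
        for pos, u in enumerate(ranked):
            p[r][u] = K - pos
    return p


if __name__ == "__main__":
    rng = random.Random(0)
    # Sizes of data set A: 150 students, 35 projects, 5 ranked projects, 5 groups.
    p = semi_homogeneous_preferences(150, 35, 5, 5, rng)
    print([u for u, v in enumerate(p[0]) if v > 0])
```

With the sizes of data set A and five groups, the first 30 students only receive positive benefits from projects in the first block of seven, matching the example in the text.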
In the rest of this section, for each data set, we report the average of the results per scenario.
Table 4 summarizes the experimental results obtained by the RLEF, RLFE, and LEF approaches for data set A. Recall that this set includes 150 students, 35 projects, 20 requested attributes, and the students rank up to 5 projects.
Twenty different data instances were produced based on each scenario, and each data instance was run with the proposed approaches. Hence, each column summarizes the average results for one approach. The first six rows report the average number of students assigned at each benefit level. For example, in the solutions of the RLEF approach for the random scenario, on average 89 students were assigned to their top-beneficial preferences over the twenty data instances. The next two rows present the averages of the total benefit and fairness level of the solutions for the 20 instances of each scenario. The average number of projects selected by each approach is reported in the next row. Note that the number of selected projects depends on the distribution of students' preferences; thus, it may happen that some projects are not assigned because they were not among the main preferences of the students. In addition, we report the number of data instances that reached the optimal solution in each approach.
The most notable result is that the RLEF approach is able to find a high-quality assignment in a few seconds, while the RLFE approach reaches the time limit of 2 h with a feasible solution but without proven optimality. Also, the LEF approach took longer than the RLEF approach to reach an optimal solution. In both scenarios, the RLEF and RLFE approaches reach assignments with slight differences in the averages of total benefits and fairness levels. In particular, the RLFE approach renders a solution with a higher level of fairness than the RLEF and LEF approaches, while the total benefit obtained by RLFE is lower than in the two other approaches. The approaches differ significantly when looking at the instances solved to optimality. The RLEF and LEF approaches reach an optimal solution in all 20 instances of each scenario. In contrast, the RLFE approach did not reach proven optimality in either scenario. In addition, there are significant differences in the students' assignments in different scenarios. In the random scenario, a great majority of the students are assigned to one of their three most beneficial preferences, and none of the students are assigned to non-beneficial projects. In the semi-homogeneous scenario, the vast majority of the students are assigned to some of their preferred projects, while a few students are assigned to non-beneficial projects.
In data set B, there are 260 students and 56 projects. Also, there are 20 requested attributes, and the students rank 10 projects as their preferred projects. The numerical results of the experiments using data set B are summarized in Table 5.
The RLEF approach again solved the data instances very quickly, within a few seconds, and reached the optimal solution in all the instances of all scenarios. In contrast, the LEF approach did not reach the optimal solution in 5 of the 20 instances of the random scenario and failed in all instances of the semi-homogeneous scenario. Also, the RLFE approach was unable to reach optimality within the 2 h time limit in either scenario. Although the assignments obtained by the approaches differ only slightly in terms of average total benefits and fairness levels in the random and semi-homogeneous scenarios, the RLFE approach renders assignments with lower total benefits and higher fairness than the two other approaches. In the random scenario, there are no students assigned to their three bottom choices. In the semi-homogeneous scenario, a significant number of students were assigned to their three top choices, and a few of them were assigned to their last choice and even to non-beneficial projects.
Table 6 shows the results obtained for data set C. These instances consist of 360 students, 80 projects, and 25 required attributes. Also, we assume the students rank 15 projects, where a project with a rank of 15 is the top-beneficial preference.
The table shows that the RLEF approach obtained optimal solutions in all data instances, in less than a minute on average in the random scenario, while in the semi-homogeneous scenario it took on average 10 min to reach optimality. In contrast, the RLFE and LEF approaches reached the time limit of 2 h with a feasible solution, but without proven optimality. The assignments obtained by the RLFE approach render a higher level of fairness and less total benefit than the two other approaches. Note that none of the students were assigned to their last eight preferences in the random scenario.
In data set D, we consider 500 students with 110 projects and 25 requested attributes. The students rank 20 projects, where a rank of 20 is the most preferred one. The results obtained are reported in Table 7.
The most remarkable outcome is that the RLEF approach found an optimal solution to all instances of each scenario, and there are considerable differences between its solution time and that of the other approaches. The RLFE and LEF approaches reached the time limit of 2 h with a feasible solution but without proven optimality. In general, the approaches reached different assignments in each instance of the random scenario and also of the semi-homogeneous scenario. There are slight differences in the total benefits and fairness levels, as the RLFE approach obtained assignments with lower total benefits and higher fairness levels than the other approaches. Also, the vast majority of the students are assigned to one of their three most beneficial projects.

Discussion
The scalability of these approaches is an interesting question, since problems larger than the real instances considered here may arise. Notice that all the proposed approaches reach optimality quickly in the small-size instances, as illustrated in the runs with real-world data. However, we can see that the RLFE and LEF approaches do not scale well. In other words, the size of the problem affects their performance critically. For example, the RLFE approach reached the time limit of 2 h without an optimal solution in all sets, while the LEF approach failed to reach an optimal solution in data sets C and D. Notice that the RLEF approach also suffers when the problems grow, but its solution times remain under 10 min on average.
The results indicate that the users' preferences may significantly influence the difficulty of the problem and the performance of the approaches. The semi-homogeneous scenario is an example of this. More instances failed to reach an optimal solution in this scenario than in the random scenario. The students' preferences are not perfectly split among the projects. Hence, in the semi-homogeneous scenario, the students are assigned to their top-beneficial projects when their preferences satisfy the requirements of the ranked project. Also, depending on the conditions of the problem, some of the students may be assigned to their non-ranked projects. The effect of increasing the number of projects that students can rank is interesting. In particular, notice that the number of non-ranked projects assigned to students tends to decrease when they are allowed to rank more projects, a trend perceived across data sets A, B, C, and D.
The numerical results show that the RLEF approach is more efficient. The number of instances solved by the approaches in each scenario is shown in Fig. 2. In fact, the RLEF approach reached an optimal solution in all data instances of all the scenarios. Although the LEF approach failed to reach optimality in some data instances, its performance was slightly better than that of the RLFE approach, which left a considerably larger number of instances unsolved.
Also, there are significant differences in the solving time used by the approaches. Figure 3 shows the solving times of the approaches in the random scenario over the different data sets. The figure clearly shows that the solution time of RLEF is lower than that of the two other approaches, while the RLFE approach reaches the time limit of 2 h in all instances. In addition, there are considerable differences between the solution times of the RLEF and LEF approaches when the problem's size increases. This effect is somewhat more prominent in data sets C and D. Note that the trend of the solving times in the semi-homogeneous scenario is the same.
Furthermore, the results of the different data sets show that the RLFE approach obtained assignments with a higher level of fairness than the RLEF and LEF approaches. On the other hand, the RLEF approach allowed us to find more efficient solutions quickly, even when the size of the problem increased in the different scenarios.

Concluding remarks
This paper studied different formulations to address the simultaneous incorporation of efficiency and fairness in the unbalanced integer assignment problem, with Jain's index as the fairness measure. Since this index is defined by a non-concave function, incorporating it in the optimization results in problems that are challenging to solve. The difficulty is more pronounced when the problem is large and many side conditions are involved. To this end, in addition to a basic formulation, we studied two reformulations, one based on a convex quadratic objective function and the other one based on integer second-order cone programming. We analyzed the performance of these approaches in numerical experiments carried out using both real-world data and randomly generated data. The results showed that, although all approaches may lead to solutions that address both efficiency and fairness well, the quadratic formulation is much quicker than the other formulations.
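One way to see why a convex quadratic objective can stand in for Jain's index is sketched below. It assumes the standard definition of the index over n nonnegative benefits b_1, …, b_n, and it assumes that the total benefit has been fixed at a value B, as in an efficiency-first lexicographic scheme. The symbols b_i, n, and B are introduced here for illustration only, and the derivation is not necessarily the exact one used for the reformulations in Sect. 4.

```latex
% Standard definition of Jain's index for nonnegative benefits, not all zero:
J(b) = \frac{\left(\sum_{i=1}^{n} b_i\right)^{2}}{n \sum_{i=1}^{n} b_i^{2}},
\qquad \frac{1}{n} \le J(b) \le 1 .

% If the total benefit is fixed, \sum_{i=1}^{n} b_i = B > 0, then
J(b) = \frac{B^{2}}{n \sum_{i=1}^{n} b_i^{2}},

% so, over assignments with total benefit B,
\max_{b} \; J(b)
\quad \Longleftrightarrow \quad
\min_{b} \; \sum_{i=1}^{n} b_i^{2} .
```

Under these assumptions the fairness stage minimizes a sum of squares, which is a convex quadratic function of the benefits, even though J itself is a ratio that is not concave in b.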
Apart from the performance results, one observation is that, in the current setup, there are no significant compromises in either efficiency or fairness from prioritizing one or the other. This is remarkable in our view, as in principle there was no clear intuition about the effect of different prioritization. Our results signal that one could choose the quadratic formulation without any large sacrifice in the optimal level of fairness that is potentially achievable. This insight should be of interest to practitioners, in particular when a timely decision is needed.
To the best of our knowledge, this is the first time these types of reformulations of Jain's index have been studied in assignment problems with an integer nature. Further work could explore the performance of the proposed reformulations in optimization problems with different profit functions. One challenge that could appear is that the RLEF approach may lead to a bigger compromise on fairness. Hence, one would need to work on more scalable approaches to solve the models involved in the RLFE approach. Additionally, the structure of the preferences and the resulting profit function may lead to the undesired assignment of resources to non-ranked projects. Our experiments signal that increasing the number of projects that students are allowed to rank may help to address this issue. In practice, this might come at the expense of more time and effort from survey respondents to fill in the survey, which could be discouraging if the number of projects is too high. We would not expect this to be a barrier in our application, though, where the number of projects is still reasonable for the students to elicit a full ranking of preferences. In general, exploring better ways to obtain a fair and efficient assignment by avoiding non-ranked assignments may be an interesting research avenue. Another avenue for future research is to study the problem when the users can share non-continuous resources. Adapting the proposed approaches to study other integer problems requiring efficient and fair solutions also remains of interest. Finally, a future line of research is to develop a detailed analysis of the efficiency-fairness trade-off and to explore the extension of results from the continuous to the integer case.
Funding Open access funding provided by Norwegian School Of Economics.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
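\sum_{r \in R} a_{rt} x_{ru} \geq L_{ut} y_u \quad \forall u \in U, \forall t \in T \qquad (19)
\sum_{r \in R} a_{rt} x_{ru} \leq U_{ut} y_u \quad \forall u \in U, \forall t \in T \qquad (20)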

Fig. 2 Number of instances solved to optimality by each approach in each scenario

Table 1
Overview of the real-

Table 4
Results for data sets A using all the proposed approaches

Table 5
Results for data sets B using all the proposed approaches

Table 6
Results for data sets C using all the proposed approaches

Table 7
Results for data sets D using all the proposed approaches