On the Linear Integration of Attraction Choice Models in Business Optimization Problems

Decision problems from various fields (e.g., assortment optimization, product line selection, location planning) require probabilistic choice behavior to be incorporated endogenously, depending on the availability of the given choice alternatives. A widely used demand model in marketing and econometrics to represent such choices is the attraction choice model. The well-known multinomial logit model and, in the case of multiple latent customer segments, the finite-mixture logit model are special cases of this model. However, integrating such models into optimization problems results in nonlinear formulations. Thus, in recent years, several exact linearization approaches have been proposed. These approaches are based on different ideas and have appeared independently of each other in different fields of research. Thus, the question arises how these approaches differ and how they relate to each other. In this short communication, we settle this question by arguing that many of the proposed approaches, even though they might seem different at first glance, can be traced back to one of two underlying linearization ideas. Establishing a generic problem, we discuss the two ideas in a unified way by presenting two corresponding general model formulations that are shown to be equivalent. Based upon this, we are able to classify in detail the major publications that integrate some type of attraction choice model. In particular, for each formulation in the analyzed literature, we explain to what extent it is a special case of (one of) the presented generic formulations. This also makes clear under which context-specific conditions certain elements of the generic linearization can be omitted, potentially serving as a helpful guideline for future applications of such linearizations.


Introduction
In the business-to-consumer market, many decisions made by companies influence customers' choices and thus the resulting demand or market share. For instance, in assortment optimization, the decision about the assortment of products and their prices might directly affect the demand for a single product due to its dependence on the other products on offer and their prices. Likewise, in location planning, the decision about the position of a facility might influence the demand occurring in other facilities and vice versa (e.g., park and ride facilities). Therefore, if such decisions are supported by methods of Operations Research, the underlying mathematical formulations need to endogenously consider this choice behavior.
In the academic literature, one of the most widely used models to incorporate choice behavior is the attraction choice model, drawing on Bradley and Terry [4] and Luce [14]. In its basic form, the attraction choice model (ACM) explains demand indirectly by reference to the market share and states that the market share of a choice alternative is the ratio of the alternative's attraction to the overall attraction of all available alternatives (including the alternative to choose nothing). Thus, if an alternative is not available, the market share of this alternative is recaptured by the available alternatives (including the no-choice alternative) in proportion to their attractions. Note that the ACM can account for different observable customer segments by incorporating segment-specific characteristics, such as sociodemographic variables, into the attractions' specifications. However, if unobserved segments, i.e., latent classes of customers, are to be captured, it is common to model each of these segments by its own ACM weighted by the segment's share of the population, so that one ends up with an overall more complex model. As a special case of the ACM, the multinomial logit model [16] is one of today's most prominent choice models to represent probabilistic demand in econometrics and marketing. In the case of multiple, non-observable customer segments, the corresponding overall model is known as the finite-mixture logit or latent-class model. Detailed discussions of customer segments are provided in Train [24] and Müller and Haase [19].
Over the last decade, the number of contributions on business optimization problems that integrate such choice behavior following the ACM has increased tremendously. However, due to the ACM's properties, its straightforward consideration in mathematical optimization leads to nonlinear formulations. Therefore, in order to be able to apply standard software for mixed-integer linear programming (MILP), quite a number of coexisting publications from different research communities and fields are dedicated to the exact linearization of the resulting nonlinear formulation and sometimes claim this linearization as one of their key contributions. Examples include publications from the field of revenue management and assortment optimization in the operations community [7,18], from product line selection, which originates more from the marketing community and is indeed technically very similar to assortment optimization [20,21], as well as from location planning [1,9].
In this paper, we contribute to the literature by providing a unifying analysis of linear reformulations proposed in major publications of different research fields and by clarifying their relationship to each other. Based on a generic problem formulation that covers the majority of the investigated problems of the different fields of research (Sect. 2), first, we describe two linearization ideas to which the proposed approaches can be traced back and present the appropriate mathematical formulations in the generic context. Second, we show that the resulting formulations can straightforwardly be transformed into each other, thereby also confirming that they indeed model the same problem (Sect. 3). Third, based upon the generic formulations, we are able to systematically discuss the specific linear formulations proposed in major works of the academic literature. In particular, for each formulation, we explain to what extent it is a special case of (one of) the presented generic formulations. This also makes clear under which context-specific conditions certain elements of the generic linearization can be omitted, potentially serving as a helpful guideline for future applications of such linearizations (Sect. 4). Finally, some concluding remarks are given (Sect. 5).

Generic Problem Definition
Let $J = \{1, \dots, m\}$ be a set of different alternatives that can be made available to customers. Further, let $N$ be the set of customer segments. Then, following the ACM, the choice probability of customer segment $n \in N$ for alternative $j \in S_0 = S \cup \{0\}$ when subset $S \subseteq J$ is made available, with $j = 0$ representing the no-choice alternative (always available), is given by
$$P_{nj}(S) = \frac{A_{nj}}{A_{n0} + \sum_{i \in S} A_{ni}} \tag{1}$$
with $A_{nj} \ge 0$ ($A_{n0} > 0$) being a segment-specific measure of attraction preassigned to alternative $j \in S$. In the special case that demand follows the multinomial logit model, in line with random utility theory, $A_{nj} = e^{v_{nj}}$, with $v_{nj}$ being the deterministic part of the utility of customer segment $n \in N$ for alternative $j$. Note that the no-choice alternative may also include other alternatives that are available to customers but lie outside the decision-making scope.
Since the choice probability of each alternative $j \in S_0$ is equal to its attraction $A_{nj}$ relative to the attraction of all available alternatives, for each customer segment $n \in N$, the choice probabilities sum up to one:
$$\sum_{j \in S} P_{nj}(S) + P_{n0}(S) = 1. \tag{2}$$
The problem is now to decide on the offer set $S$ of available alternatives subject to a predefined problem-specific objective (e.g., profit maximization) under the assumption that customer segments are not necessarily observable. For this purpose, we define the binary decision variables $x_j \in \{0, 1\}$ with $j \in J$ that equal 1 if alternative $j$ should be made available and 0 otherwise. The corresponding offer set is $S(x) := \{j \in J \mid x_j = 1\}$. A generic formulation of the resulting objective function, incorporating the demand of all customer segments, is given by
$$\max_{x} \sum_{n \in N} \lambda_n \sum_{j \in S(x)} \pi_{nj} P_{nj}(S(x)) = \max_{x} \sum_{n \in N} \lambda_n \frac{\sum_{j \in J} \pi_{nj} A_{nj} x_j}{A_{n0} + \sum_{j \in J} A_{nj} x_j} \tag{3}$$
where $\lambda_n$ is the segment's share of the population and $\pi_{nj}$ is a context-specific constant associated with each alternative $j \in J$ and each segment $n \in N$.
The objective in (3) aims at maximizing the share-weighted sum of the constants $\pi_{nj}$, each multiplied by the corresponding choice probability, by deciding on the available alternatives. Depending on the context, this might, for instance, be the expected overall profit or market share. The resulting problem in (3) is a binary, nonlinear fractional program containing a sum of ratios. Importantly, note that if $|N| = 1$, i.e., if only one segment exists, the problem becomes much easier to handle (see also Sect. 4).
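To make the generic problem concrete, the following minimal sketch evaluates the choice probabilities and the objective for every offer set of a tiny instance and picks the best one by enumeration. All numbers and the names `A0`, `A`, `LAM`, `PI`, `choice_probs`, and `objective` are hypothetical and chosen only for illustration; the objective is read here as the share-weighted sum of constants times choice probabilities.

```python
from itertools import product

# Hypothetical toy instance (not from the paper): 2 segments, 3 alternatives.
A0 = [1.0, 2.0]                              # no-choice attractions A_n0 > 0
A = [[1.0, 2.0, 0.5], [0.5, 1.0, 3.0]]       # attractions A_nj >= 0
LAM = [0.6, 0.4]                             # assumed segment shares (sum to 1)
PI = [[10.0, 8.0, 12.0], [10.0, 8.0, 12.0]]  # assumed constants, e.g. unit profits


def choice_probs(n, x):
    """Return (P_n0, [P_n1, ..., P_nm]) for availability vector x, following (1)."""
    denom = A0[n] + sum(a * xj for a, xj in zip(A[n], x))
    return A0[n] / denom, [a * xj / denom for a, xj in zip(A[n], x)]


def objective(x):
    """Share-weighted sum over segments of the constants times choice probabilities."""
    total = 0.0
    for n in range(len(A0)):
        _, probs = choice_probs(n, x)
        total += LAM[n] * sum(c * p for c, p in zip(PI[n], probs))
    return total


# Enumerate all 2^m offer sets; only viable for very small m.
best_x = max(product([0, 1], repeat=len(A[0])), key=objective)
best_value = objective(best_x)
print(best_x, round(best_value, 4))
```

Enumeration grows exponentially in the number of alternatives, which is one reason why exact linear reformulations that standard MILP solvers can handle are of interest.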

Method-Based Linearization
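The first linearization idea consists of applying global formal methods developed to linearize nonlinear terms in fractional formulations (referred to as ML, "method-based linearization"). For example, the linearizations presented by Schön [20,21] as well as by Miranda-Bront et al. [18] can be seen to be in line with this idea.
Applying such techniques [12,25], the linearization can be accomplished in two steps. Regarding the generic formulation (3), in the first step, we substitute
$$\frac{1}{A_{n0} + \sum_{j \in J} A_{nj} x_j}$$
by non-negative decision variables $y_n\ \forall n \in N$. This substitution draws on the idea of Charnes and Cooper [5], who first proposed it in a similar way for continuous fractional functions and one segment. The variable $y_n$ lies in the interval
$$\left[\frac{1}{A_{n0} + \sum_{j \in J} A_{nj}},\ \frac{1}{A_{n0}}\right].$$
The lower bound of $y_n$ is reached when all alternatives are available, i.e., $x_j = 1\ \forall j \in J$. The upper bound is reached when none of the alternatives is offered, i.e., $x_j = 0\ \forall j \in J$. The resulting nonlinear program is given by
$$\max \sum_{n \in N} \lambda_n \sum_{j \in J} \pi_{nj} A_{nj} x_j y_n \tag{4}$$
subject to
$$A_{n0} y_n + \sum_{j \in J} A_{nj} x_j y_n = 1 \quad \forall n \in N \tag{5}$$
with $x_j \in \{0, 1\}\ \forall j \in J$ and $y_n \ge 0\ \forall n \in N$, where $\lambda_n$ denotes the segment shares and $\pi_{nj}$ the context-specific constants from (3). Constraints (5) ensure the correct substitution by $y_n$ as described above. Note that this substitution is generally valid for the ACM since $A_{nj} \ge 0\ \forall j \in J, \forall n \in N$ and $A_{n0} > 0\ \forall n \in N$, and thus, the variables $y_n$ are always positive.
In the second step, we eliminate the resulting bilinear terms $x_j y_n$ [25]. For this purpose, we define new decision variables $z_{nj} := x_j y_n\ \forall j \in J, \forall n \in N$. To guarantee $z_{nj} = x_j y_n$ depending on the value of the variables $x_j$, the logical conditions (I) $x_j = 0 \Rightarrow z_{nj} = 0$ and (II) $x_j = 1 \Rightarrow z_{nj} = y_n$ must be imposed by a number of linear constraints. The resulting mixed-integer linear program, equivalent to problem (3), is given by
$$\max \sum_{n \in N} \lambda_n \sum_{j \in J} \pi_{nj} A_{nj} z_{nj} \tag{6}$$
subject to
$$A_{n0} y_n + \sum_{j \in J} A_{nj} z_{nj} = 1 \quad \forall n \in N \tag{7}$$
$$z_{nj} \ge 0 \quad \forall n \in N, \forall j \in J \tag{8}$$
$$z_{nj} \le y_n \quad \forall n \in N, \forall j \in J \tag{9}$$
$$z_{nj} \le K^{(10)}_{nj} x_j \quad \forall n \in N, \forall j \in J \tag{10}$$
$$z_{nj} \ge y_n + K^{(11)}_{nj} (x_j - 1) \quad \forall n \in N, \forall j \in J \tag{11}$$
with $x_j \in \{0, 1\}\ \forall j \in J$, $y_n \ge 0\ \forall n \in N$, and $K^{(10)}_{nj}$ as well as $K^{(11)}_{nj}\ \forall n \in N, \forall j \in J$ being sufficiently large numbers. Constraints (8) and (10) impose implication (I), whereas implication (II) is represented by constraints (9) and (11). For tight definitions of the parameters $K^{(10)}_{nj}$ and $K^{(11)}_{nj}$, see Appendix 1.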
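The role of the four linking constraints can be checked numerically. The sketch below assumes they read $z_{nj} \ge 0$ (8), $z_{nj} \le y_n$ (9), $z_{nj} \le K^{(10)}_{nj} x_j$ (10), and $z_{nj} \ge y_n + K^{(11)}_{nj}(x_j - 1)$ (11), with the bounds $K^{(10)}_{nj} = 1/(A_{n0} + A_{nj})$ and $K^{(11)}_{nj} = 1/A_{n0}$ from Appendix 1. For every binary vector it confirms that the feasible interval for $z_{nj}$ collapses to the single point $x_j y_n$ and that the normalization constraint (7) holds. The data and function names are hypothetical.

```python
from itertools import product

# Hypothetical toy data (not from the paper): 2 segments, 3 alternatives.
A0 = [1.0, 2.0]                         # A_n0 > 0
A = [[1.0, 2.0, 0.5], [0.5, 1.0, 3.0]]  # A_nj >= 0


def z_interval(n, j, xj, yn):
    """Range of z_nj allowed by (8)-(11), using the bounds from Appendix 1."""
    k10 = 1.0 / (A0[n] + A[n][j])
    k11 = 1.0 / A0[n]
    lower = max(0.0, yn + k11 * (xj - 1))  # constraints (8) and (11)
    upper = min(yn, k10 * xj)              # constraints (9) and (10)
    return lower, upper


def check_ml(x, tol=1e-12):
    """True if, for offer vector x, z_nj is pinned to x_j * y_n and (7) holds."""
    for n in range(len(A0)):
        yn = 1.0 / (A0[n] + sum(a * xj for a, xj in zip(A[n], x)))
        lhs = A0[n] * yn
        for j, xj in enumerate(x):
            lower, upper = z_interval(n, j, xj, yn)
            target = xj * yn
            if abs(lower - target) > tol or abs(upper - target) > tol:
                return False
            lhs += A[n][j] * target
        if abs(lhs - 1.0) > tol:  # normalization constraint (7)
            return False
    return True


all_ok = all(check_ml(x) for x in product([0, 1], repeat=len(A[0])))
print(all_ok)
```

This is only a consistency check for fixed $x$ on a toy instance, not a solver; in the MILP itself, $y_n$ and $z_{nj}$ are decision variables and the same constraints force them to these values.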

Property-Based Linearization
The second linearization idea is motivated by specific properties of the ACM (referred to as PL, "property-based linearization"). This approach is followed, for instance, by Davis et al. [7], Haase [9], and Aros-Vera et al. [1]. More precisely, the fundamental property of demand models whose structure follows (1), such as the multinomial logit model, is the so-called independence of irrelevant alternatives (IIA) property. This property states that the ratio of two available alternatives' choice probabilities is constant and thus independent of the availability of other and hence irrelevant alternatives. From definition (1) of the choice probabilities in the ACM, it follows that this constant ratio is equal to
$$\frac{P_{nj}(S)}{P_{ni}(S)} = \frac{A_{nj}}{A_{ni}} \quad \forall n \in N,\ i, j \in S_0. \tag{12}$$
Note that demand models not following (1), such as the nested logit model or the probit model, do not exhibit the IIA property.
In the mathematical program, it is necessary to ensure the IIA property and hence the ratios in (12). Therefore, we further exploit the fact that
$$\frac{P_{nj}(S)}{P_{ni}(S)} = \frac{P_{nj}(S) / P_{n0}(S)}{P_{ni}(S) / P_{n0}(S)}.$$
This means that every ratio of two alternatives can be expressed by two ratios comprising the no-choice alternative. Since the no-choice alternative is always available, we can ensure the IIA property in the mathematical program by merely imposing
$$\frac{P_{nj}(S)}{P_{n0}(S)} = \frac{A_{nj}}{A_{n0}} \quad \forall n \in N, \forall j \in S. \tag{13}$$
For the PL, we define non-negative decision variables $p_{nj}\ \forall n \in N, \forall j \in J \cup \{0\}$ which represent the choice probabilities of alternatives $j \in J \cup \{0\}$ for customers belonging to segment $n \in N$ depending on the offered alternatives. The model formulation building on the IIA property is given by
$$\max \sum_{n \in N} \lambda_n \sum_{j \in J} \pi_{nj} p_{nj} \tag{14}$$
subject to
$$p_{n0} + \sum_{j \in J} p_{nj} = 1 \quad \forall n \in N \tag{15}$$
$$p_{nj} \ge 0 \quad \forall n \in N, \forall j \in J \cup \{0\} \tag{16}$$
$$p_{nj} \le \frac{A_{nj}}{A_{n0}} p_{n0} \quad \forall n \in N, \forall j \in J \tag{17}$$
$$p_{nj} \le M^{(18)}_{nj} x_j \quad \forall n \in N, \forall j \in J \tag{18}$$
$$p_{nj} \ge \frac{A_{nj}}{A_{n0}} \left( p_{n0} + M^{(19)}_{nj} (x_j - 1) \right) \quad \forall n \in N, \forall j \in J \tag{19}$$
with $x_j \in \{0, 1\}\ \forall j \in J$, $\lambda_n$ and $\pi_{nj}$ as in (3), and $M^{(18)}_{nj}$ as well as $M^{(19)}_{nj}\ \forall n \in N, \forall j \in J$ being sufficiently large numbers. First of all, constraints (15) reflect the ACM's property stated in (2). For the IIA property as stated in (13) to hold, the two logical conditions (III) $x_j = 0 \Rightarrow p_{nj} = 0$ and (IV) $x_j = 1 \Rightarrow p_{nj} = \frac{A_{nj}}{A_{n0}} p_{n0}$ must be ensured. While constraints (16) and (18) impose the first implication (III), the second implication (IV) is modeled by constraints (17) and (19). For tight definitions of the parameters $M^{(18)}_{nj}$ and $M^{(19)}_{nj}$, see Appendix 1.
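As a small illustration of the property-based idea, the sketch below builds the probabilities for a fixed offer vector purely from the ratio to the no-choice alternative and the requirement that probabilities sum to one, and then checks that they coincide with definition (1) and satisfy constraints (15)-(19). The form of (19) used here, $p_{nj} \ge \frac{A_{nj}}{A_{n0}}(p_{n0} + M^{(19)}_{nj}(x_j - 1))$, is an assumption that is consistent with the condition $M^{(19)}_{nj} \ge p_{n0}$ stated in Appendix 1. Data and function names are hypothetical.

```python
from itertools import product

# Hypothetical toy data (not from the paper): 2 segments, 3 alternatives.
A0 = [1.0, 2.0]                         # A_n0 > 0
A = [[1.0, 2.0, 0.5], [0.5, 1.0, 3.0]]  # A_nj >= 0


def pl_probs(n, x):
    """Probabilities implied by p_nj = x_j * (A_nj / A_n0) * p_n0 and (15)."""
    ratios = [A[n][j] / A0[n] * x[j] for j in range(len(x))]
    p0 = 1.0 / (1.0 + sum(ratios))
    return p0, [r * p0 for r in ratios]


def satisfies_pl(n, x, tol=1e-12):
    """Check (15)-(19) with M18 = A_nj / (A_n0 + A_nj) and M19 = 1."""
    p0, p = pl_probs(n, x)
    ok = abs(p0 + sum(p) - 1.0) < tol                            # (15)
    for j, xj in enumerate(x):
        ratio = A[n][j] / A0[n]
        m18 = A[n][j] / (A0[n] + A[n][j])
        m19 = 1.0
        ok = ok and p[j] >= -tol                                 # (16)
        ok = ok and p[j] <= ratio * p0 + tol                     # (17)
        ok = ok and p[j] <= m18 * xj + tol                       # (18)
        ok = ok and p[j] >= ratio * (p0 + m19 * (xj - 1)) - tol  # (19), assumed form
    return ok


def matches_acm(n, x, tol=1e-12):
    """Compare with the ratio-of-attractions definition (1)."""
    denom = A0[n] + sum(a * xj for a, xj in zip(A[n], x))
    p0, p = pl_probs(n, x)
    same_p0 = abs(p0 - A0[n] / denom) < tol
    return same_p0 and all(
        abs(p[j] - A[n][j] * x[j] / denom) < tol for j in range(len(x))
    )


pl_ok = all(
    satisfies_pl(n, x) and matches_acm(n, x)
    for n in range(len(A0))
    for x in product([0, 1], repeat=len(A[0]))
)
print(pl_ok)
```

The check illustrates why anchoring every ratio to the always-available no-choice alternative is enough: once $p_{nj}/p_{n0}$ is fixed for each offered alternative, all pairwise ratios follow.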
The general ML (6)-(11) and the general PL (14)-(19) presented in Sects. 3.1 and 3.2 are equivalent mixed-integer linear formulations of problem (3), as they can straightforwardly be transformed into each other by variable substitution. The proof is given in Appendix 2.
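The substitution behind this equivalence can be sketched as follows. The derivation assumes $A_{nj} > 0$ for all $j \in J$ (so that division by $A_{nj}$ is allowed), uses $y_n = p_{n0}/A_{n0}$ and $z_{nj} = p_{nj}/A_{nj}$, and takes the constraint forms and bounds as read here from Sects. 3.1, 3.2 and Appendix 1; it is an illustrative outline rather than the proof of Appendix 2 itself.

```latex
\begin{align*}
\text{(7)}\quad  & A_{n0} y_n + \sum_{j \in J} A_{nj} z_{nj} = 1
  &&\Longleftrightarrow\quad p_{n0} + \sum_{j \in J} p_{nj} = 1
  && \text{(15)} \\
\text{(8)}\quad  & z_{nj} \ge 0
  &&\Longleftrightarrow\quad p_{nj} \ge 0
  && \text{(16)} \\
\text{(9)}\quad  & z_{nj} \le y_n
  &&\Longleftrightarrow\quad p_{nj} \le \frac{A_{nj}}{A_{n0}}\, p_{n0}
  && \text{(17)} \\
\text{(10)}\quad & z_{nj} \le K^{(10)}_{nj} x_j
  &&\Longleftrightarrow\quad p_{nj} \le A_{nj} K^{(10)}_{nj} x_j
     = \frac{A_{nj}}{A_{n0} + A_{nj}}\, x_j
  && \text{(18)} \\
\text{(11)}\quad & z_{nj} \ge y_n + K^{(11)}_{nj} (x_j - 1)
  &&\Longleftrightarrow\quad p_{nj} \ge \frac{A_{nj}}{A_{n0}}\, p_{n0}
     + A_{nj} K^{(11)}_{nj} (x_j - 1)
  && \text{(19)}
\end{align*}
```

With $K^{(10)}_{nj} = 1/(A_{n0} + A_{nj})$ and $K^{(11)}_{nj} = 1/A_{n0}$, the right-hand sides reproduce $M^{(18)}_{nj} = A_{nj}/(A_{n0} + A_{nj})$ and $M^{(19)}_{nj} = 1$, and the objective terms $\pi_{nj} A_{nj} z_{nj}$ in (6) become $\pi_{nj} p_{nj}$ in (14).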

Classification of Specific Linearization Approaches
Based upon the two generic approaches presented in Sect. 3, we are now able to systematically discuss and compare major publications' linearization approaches. We argue that most of the resulting programs can be traced back to either the presented ML or PL. Further, we show that problem-specific characteristics and the considered setting lead to special and simplified cases of ML or PL regarding the linearization part, i.e., the required constraints. Table 1 presents the comparison. Column 1 states the research field to which the reference in column 2 is dedicated. Column 3 states if the work referenced in column 2 considers a market divided into different customer segments or not. The last block of columns classifies whether the referenced work's proposed linearization is based on the methodological (Sect. 3.1) or property-driven approach (Sect. 3.2), and thus, if they can be directly traced back to either ML or PL. Further, it is shown which of the constraints of the general formulations are applied as a result of the specific setting considered.
In Schön's [20,21] optimization approaches for the product line selection problem, an ML is used with additional constraints reflecting pricing decisions. The setting in Schön [20] allows each segment to be considered separately with regard to the linearization; each segment-specific objective function is quasi-convex and quasi-concave, and the model has a unimodular (price) constraint matrix. Thus, without explicitly requiring integrality, an optimal binary solution can be obtained [6]. Hence, Schön [20] can drop the integrality requirement on the decision variables $x_j$ and thus does not require constraints such as (10) and (11). The applied linearization, as the author herself states, therefore resembles the classical Charnes-Cooper transformation for continuous variables [5]. In Schön [21], pricing is made continuous rather than based on a discrete set of prices as in Schön [20]. This would normally result in a non-concave objective function, which is circumvented by defining the continuous probability as the central decision variable. Hence, constraints (10) and (11) are not necessary here either. In Bechler et al. [2], the product line selection problem is extended by the empirically proven effect that customers tend to choose compromise alternatives. This results in a non-unimodular formulation such that integrality constraints cannot be dropped, and thus, their proposed formulation comprises the full set of ML's constraints as given in (7)-(11).
Table 1 Systematic comparison of context-specific linearizations of (3) proposed in the literature
*Even though multiple customer segments are formally included in Schön's [20,21] [...]
In the context of revenue management and assortment optimization, Talluri and van Ryzin [23] study problem (3) for the multinomial logit model and only one customer segment. They confirm an earlier result from fractional programming [22], stating that in this particular case without any further constraints, an optimal assortment can easily be obtained by greedily adding products to the offer set in order of decreasing revenues, such that a model-based approach is not necessary at all. Miranda-Bront et al. [18] consider [...] (7)-(11). In their setting, each customer segment is characterized by one consideration set (i.e., the set of products this segment considers choosing from). They assume that the different consideration sets do not need to be disjoint but can overlap to some extent. As one of their contributions, Davis et al. [7] present a PL for a setting with one customer segment and several alternative additional types of side constraints, such as price constraints. Similar to Schön [20,21], these constraints' coefficients form a unimodular constraint matrix, which allows constraints (18) and (19) to be neglected. Thus, even though developed independently, from a technical point of view, the linearization proposed by Davis et al. [7] resembles the classical Charnes-Cooper transformation. Méndez-Díaz et al. [17] present an ML which is a problem-related extension of Miranda-Bront et al. [18].
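The single-segment result attributed above to Talluri and van Ryzin [23] can be illustrated with a short sketch. It evaluates the nested offer sets obtained by adding products in order of decreasing revenue and compares the best of them with full enumeration on a toy instance. The attractions, revenues, and function names are hypothetical, and the sketch is one straightforward reading of the greedy idea, not the authors' own procedure.

```python
from itertools import combinations

# Hypothetical single-segment instance (not from the paper).
A0 = 1.0                   # no-choice attraction
A = [1.0, 2.0, 0.5, 1.5]   # attractions of the products
R = [10.0, 8.0, 12.0, 4.0]  # assumed per-unit revenues


def expected_revenue(offer_set):
    """Expected revenue of an offer set under the single-segment model (1)."""
    denom = A0 + sum(A[j] for j in offer_set)
    return sum(R[j] * A[j] for j in offer_set) / denom


def revenue_ordered_best():
    """Best among the nested sets built by adding products by decreasing revenue."""
    order = sorted(range(len(R)), key=lambda j: R[j], reverse=True)
    best_set, best_val = (), 0.0
    for k in range(1, len(order) + 1):
        candidate = tuple(order[:k])
        value = expected_revenue(candidate)
        if value > best_val:
            best_set, best_val = candidate, value
    return best_set, best_val


def brute_force_best():
    """Optimal value by enumerating all subsets; for comparison only."""
    subsets = (
        s for k in range(len(R) + 1) for s in combinations(range(len(R)), k)
    )
    return max(expected_revenue(s) for s in subsets)


greedy_set, greedy_val = revenue_ordered_best()
print(sorted(greedy_set), round(greedy_val, 4), round(brute_force_best(), 4))
```

Only $m$ candidate sets are evaluated instead of $2^m$, which is why no model-based approach is needed in this unconstrained single-segment case; with several segments or side constraints this shortcut is no longer available in general.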
In the area of location planning, the objective is mostly the optimization of the market share without the consideration of cost, but under consideration of different customer segments. In this context, customer segments are denoted as demand nodes. Benati and Hansen [3] propose an ML but, in contrast to the linear formulations already mentioned, completely substitute the objective function (3). In this case, constraints (9) and (10) can be omitted, since the variables substituting the resulting bilinear terms enter the objective function with a negative sign and thus are minimized. Hence, only the lower bounds represented by (8) and (11) need to be ensured. Haase [9] proposes a PL. However, in constraints (19), he explicitly formulates the IIA property drawing on (12) for every possible pair of alternatives. This automatically includes constraints (17) but results in many redundant constraints (for details see Sect. 3.2). Zhang et al. [26] propose an ML by substituting the single probabilities for the different alternatives (in contrast to Benati and Hansen [3], who substitute the sum of all probabilities). For the linearization of the resulting nonlinear terms, $|N| \cdot |J|^2$ instead of $|N| \cdot |J|$ variables and $|N| \cdot |J|^2$ of each of the constraints (7)-(11) are necessary. In line with Haase [9], Aros-Vera et al. [1] propose a PL considering all possible pairs of alternatives to formulate the IIA property. In contrast to Haase [9], constraints (16) are omitted since the objective of market share maximization automatically favors the largest values for the choice probabilities. Haase and Müller's [10] reformulation of Haase [9] omits the redundant constraints and formulates the IIA property as given by (17). Due to the objective of market share maximization, constraints (19) are not necessary and constraints (15) can be formulated as inequalities.
Haase and Müller [10] consider $M^{(18)}_{nj}$ as defined in Appendix 1, which represents the tightest upper bound for the choice probabilities in the PL in general. However, in the special case of facility location planning, a predefined and fixed number of $r$ facilities is required to be open, which is considered in the MILP formulation as an additional constraint. Based on this, a stronger formulation of constraints (18) can be derived. The resulting tighter bound for $M^{(18)}_{nj}$ is presented by Freire et al. [8] in the context of Haase and Müller's [10] linear formulation (see Appendix 1 for its definition).
Note that in the context of location planning, other linearizations have recently been discussed by Ljubić and Moreno [13] and Mai and Lodi [15]. Ljubić and Moreno's [13] approach relies on the outer approximation of the continuous relaxation of the objective function and on its submodularity property. Mai and Lodi's [15] approach creates a set of piecewise linear functions that outer-approximate separated parts of the objective function. The corresponding models arise in the specific context of the branch-and-cut or cutting-plane solution procedures the authors develop and are therefore omitted from Table 1.

Discussion
In this paper, we argue that major publications' linearizations of attraction choice behavior in business optimization problems can be traced back to one of two different but equivalent MILP formulations, each relying on a specific linearization idea. By a systematic analysis, we revealed that differences between the publications' linearizations and the presented ones result from problem-specific characteristics depending on the field of application. Thus, our analysis can serve as a helpful guideline for future applications of such linearizations.
Note that, basically, both linearization schemes rely on the same number of (binary and nonnegative real-valued) variables and constraints. Further, given that their equivalence can be shown by variable substitution, there are no specific indications that one is generally more suitable than the other one. Besides the equivalence, it can be seen from the substitution that the defined bounds in Appendix 1 lead to the same tightness of constraints in both formulations. Hence, no solution time differences can be expected in general. However, with regard to the future development of context-specific linearization approaches on the basis of these generic models, it is important to keep considering both variants. In particular, one could be more intuitive than the other with regard to the required model adjustments, potentially leading to differences in efficiency of the resulting specific linearizations.
Further, we want to emphasize that the two presented MILP formulations are of special interest in the case of only one customer segment, since then, the formulations can be solved very efficiently and utilized for a broad range of applications [7]. In the case of several latent segments, even though the resulting problems are NP-hard, standard MILP solvers have been reported to work quite fast in many cases, or at least, the formulations can serve as helpful starting points for the derivation of promising heuristic solution procedures [18]. Additionally, as discussed in this paper, problem-specific circumstances can further reduce the linearization effort needed.

Appendix 1. Tight bounds for ML and PL
For the ML's constraints (10), we know by definition of $y_n$ that it does not exceed the value $\frac{1}{A_{n0} + A_{nj}}$ in case $x_j = 1$. Since $z_{nj} \le y_n\ \forall n \in N, \forall j \in J$ (constraints (9)), we can define the tightest upper bound for $z_{nj}$ in constraints (10) by $K^{(10)}_{nj} := \frac{1}{A_{n0} + A_{nj}}\ \forall n \in N, \forall j \in J$, which would be reached in case only alternative $j$ is offered. In line with Wu [25], we define the parameters $K^{(11)}_{nj}$ based on our knowledge about the upper bounds of the variables $y_n$ by $K^{(11)}_{nj} := \frac{1}{A_{n0}}\ \forall n \in N, \forall j \in J$. This imposes the tightest lower bound of zero for $z_{nj}$ in constraints (11) in case $x_j = 0$, which is reached if no alternative $j \in J$ is offered.
For PL, we use the properties of the underlying choice model to define the parameters $M^{(18)}_{nj}$ and $M^{(19)}_{nj}$. For constraints (18), we utilize the fact that the maximum value of $p_{nj}$ for $j \in J$ given $x_j = 1$ is, by definition (1), $\frac{A_{nj}}{A_{n0} + A_{nj}}$ (only alternative $j \in J$ is offered) with $A_{n0} > 0$ and $A_{nj} \ge 0$. Thus, we can define $M^{(18)}_{nj} := \frac{A_{nj}}{A_{n0} + A_{nj}}\ \forall n \in N, \forall j \in J$ as the tightest upper bound in constraints (18) if $x_j = 1$. For constraints (19), the right-hand side must be smaller than or equal to zero in case $x_j = 0$. Thus, we need to ensure that $M^{(19)}_{nj} \ge p_{n0}$. Since the maximum value of $p_{n0}$ is reached if $x_j = 0\ \forall j \in J$ (i.e., $p_{nj} = 0\ \forall j \in J$), and is by definition (1) $p_{n0} = \frac{A_{n0}}{A_{n0}} = 1$, we can define $M^{(19)}_{nj} := 1\ \forall n \in N, \forall j \in J$.
[...]
$$\frac{p_{nj}}{A_{nj}} \ge \frac{p_{n0}}{A_{n0}} + K^{(11)}_{nj} (x_j - 1) \iff p_{nj} \ge \frac{A_{nj}}{A_{n0}} p_{n0} + A_{nj} K^{(11)}_{nj} (x_j - 1).$$
Since $K^{(11)}_{nj}$ is defined as $\frac{1}{A_{n0}}$ and $M^{(19)}_{nj}$ as 1, (19) is equivalent to (11). ■

Funding
Open Access funding enabled and organized by Projekt DEAL.

Conflict of Interest
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.