The Big-M method with the numerical infinite M

Linear programming is a very well known and deeply applied field of optimization theory. One of its most famous and used algorithms is the so called Simplex algorithm, independently proposed by Kantorovič and Dantzig, between the end of the 30s and the end of the 40s. Even if extremely powerful, the Simplex algorithm suffers of one initialization issue: its starting point must be a feasible basic solution of the problem to solve. To overcome it, two approaches may be used: the two-phases method and the Big-M method, both presenting positive and negative aspects. In this work we aim to propose a non-Archimedean and non-parametric variant of the Big-M method, able to overcome the drawbacks of its classical counterpart (mainly, the difficulty in setting the right value for the constant M). We realized such extension by means of the novel computational methodology proposed by Sergeyev, known as Grossone Methodology. We have validated the new algorithm by testing it on three linear programming problems.


Introduction
Linear Programming (LP) is a branch of optimization theory which studies the minimization (or maximization) of a linear objective function, subject to linear equality and/or linear inequality constraints. LP has found a lot of successful applications both in theoretical and real world contexts, especially in countless engineering applications. Most of the algorithms developed so far to solve LP problems fall in one of the two following categories: basis exchange methods [19] and interior point methods [27].
Researchers argued a lot about the pros and the cons of these approaches, ending up asserting the equal dignity of both. The best choice depends on the characteristics of the problem at hand [16].
Concerning this work, we decided to focus on the class of basis exchange methods and, in particular, on its probably most famous member: the Simplex algorithm, firstly proposed in 1939 by Kantorovič [17] and independently rediscovered by Dantzig during the forties [8][9][10].
This algorithm needs a feasible basic solution to start from, an input which is not trivial to generate for complex real-world problems. The main ways to overcome this difficulty are: the two-phases approach [10] and the Big-M method [2,8,26].
The former splits the optimization in two-phases, and in each it runs the Simplex algorithm on a phase-specific problem. In the first one, the problem is a simple task for which a feasible starting point is known, and whose optimum is a valid basic solution for the original problem. The second one is the original problem itself. In spite of the guarantee of convergence to the optimum (if it exists), such approach wastes a lot of time, having to run the Simplex algorithm twice rather than once.
The other possibility is diametrically opposed to the previous one, indeed, it requires just one run of the Simplex algorithm, though on a problem different from the original one. The latter approach is known as Big-M method, since it requires a big parameter M: it consists of applying the Simplex algorithm to a modified and parametrized version of the original problem for which a feasible initial basic solution is known. In particular, the big constant M (along with some auxiliary variables) is added to the original problem; a good tuning of such constant also guarantees that the original problem and the modified one share the same optimum. Thus, the drawback of the Big-M method is that it adds a new parameter, which also needs to be properly set: a too small value does not guarantee the convergence to the same optimum of the original problem, while a too big value may generate loss of precision and numerical instabilities.
The actual goal of this work is to overcome the Big-M method issues by means of a non-Archimedean modification of the original method, that is converting the LP problem at hand into a new one where the cost function involves infinite and infinitesimal quantities too. In order to realize such non-Archimedean extension, we exploited the Grossone Methodology proposed by Y.D. Sergeyev (see [22] and references therein), a powerful framework which lets one numerically manipulate infinite and infinitesimal quantities. Even more interestingly, algorithms involving the Grossone Methodology can run on a hardware computing engine, called Infinity Computer (see patent [20]), which is a completely new kind of supercomputer. Once opportunely modified, the new problem can be solved by a recent extension of the Simplex algorithm due to Cococcioni et al. [4], namely the gross-Simplex algorithm (in brief, G-Simplex). The latter is an improved Simplex algorithm able to solve non-Archimedean LP problems too. Together, the G-Simplex algorithm and the non-Archimedean modified problem give birth to the parameter-less procedure presented in this work, called by us the Infinitely-Big-M method (abbreviated I-Big-M), since it relies on the use of an infinitely big constant, which is the same for all the problems and thus does not have to be specified by the user.
The remaining of the letter is structured as follows: Sect. 2 provides an introduction to LP and Big-M method, Sect. 3 contains an overview of the Grossone Methodology and a description of the Gross-Simplex algorithm, Sect. 4 presents our non-Archimedean extension of the Big-M method, i.e., the Infinitely-Big-M method. In Sect. 5 we present three experiments we conducted exploiting the I-Big-M method to solve LP problems, while Sect. 6 is devoted to the conclusions.

The linear programming problems and the Big-M method
A possible representation of an LP problem, which may be a little complex at first sight, is shown in Eq. (1): where z ∈ R n z + is the unknowns vector, d ∈ R n z is the cost vector, i ∈ R m i ×n and b i ∈ R m i + are the feasibility constraints, i = le, eq, ge. Here and throughout the whole work, bold symbols represent vectors and matrices, while non-bold ones refer to scalars.
As said, the Simplex algorithm requires a feasible basic solution to start from which may not be easy to find by hand. The Big-M method [2,8,26] tackles the problem transforming the original problem by adding some auxiliary variables which guarantee the existence of an initial feasible basic solution. However, not being part of the original problem, all of them must be equal to zero in the optimal solution of the modified problem. In order to guide the optimizer towards this annihilation, the Big-M method adds the auxiliary variables to the cost function and weights them with a big penalty M. The form of the modified problem is shown in Eq. (2): where a ∈ R n a is the vector of auxiliary variables to be penalized (n a = m eq + m ge ), M is the penalty weight, 1 is a vector of n a ones, s and p are m le and m ge slack variables, respectively. Such new form can be easily turned into the standard one (3), by applying the equalities shown in Eq. (4): The always feasible initial basic solution for the modified problem is {s, e, r}.
However, the Big-M method has a couple of relevant drawbacks. Actually, it turns a parameter-less problem into a parametric one, adding the scalar factor M. In addition, such parameter needs to be a-priory carefully set, in order to guarantee the correctness of the optimization process. Indeed, a too low value could lead to a non-zeroing of the auxiliary variables (and, thus, to no feasible solutions, even if they exist), a too high value could lead to a loss of precision and to numerical instabilities. Such setting issues, along with the need of re-tuning M for each different problem to solve, are crucial and struggling aspects from a practical point of view, sometimes limiting the applicability of the method in real contexts at all. In Sect. 4, we will present our parameter-free extension of the Big-M method which also preserves all its positive properties. With this goal in mind, let us move to the next section which introduces the methodological platform upon which to build such extension.

The Grossone methodology and the G-simplex algorithm
The Grossone Methodology (GM), which has been proved to stay apart from non-Standard Analysis [23], is a novel way for dealing with infinite, finite and infinitesimal numbers at once and in a numerical way [22]. Originally proposed in 2003, such framework already counts a lot of practical applications, for example it has been applied to non-linear optimization [11,18], global optimization [25], ordinary differential equations [1,15,21,24], control theory [3,12], and game theory [13,14], to cite a few. Also linear optimization positively enjoyed the advent of GM, as shown in [4][5][6][7]. In particular, in [4,6] the authors proposed an extension of the Simplex algorithm able to address Lexicographic Multi-Objective LP problems, both in absence and in presence of integer constraints, using a simplex-like approach. Such extension has been called gross-Simplex algorithm (briefly, G-Simplex), since it adopts GM to make the optimizer aware of the lexicographic priority between the objectives.
Actually, GM relates to the infinite unit Grossone, which is represented by the numeral ①. By means of it, a new numeral system with infinite base can be carried out, and each numberc within such numeral system can be represented as follows: In Eq. (5) the numberc is called gross-scalar (in short, G-scalar), m, k ∈ N, each p i is called gross-power (G-power) and can be of the same form ofc, while c p i is called gross-digit (G-digit) and is a finite (positive or negative) real number, i = m, . . . , −k.
As mentioned, the G-Simplex is an algorithm which extends the Simplex procedure, being also able to solve the new class of LP problems known as non-Archimedean LP (NA-LP) problems. The latter have the canonical form described by Eq. (6): where this time the cost vectorc is a vector of non-Archimedean quantities, that is infinite and infinitesimal values, rather than just real numbers as in Eq. (3). In this work, we modeledc by means of a gross-vector (G-vector), i.e., an n-dimensional vector whose entries are G-scalars. In [4], it has been proved that a NA-LP problem is equivalent to the optimization of multiple linear functions among which a lexicographic priority relation exists, i.e., the problem expressed in Eq. (7) (l is the number of functions to optimize at the same time).

The Infinitely-Big-M method
The main issues related to the Big-M method concern the weight M. As said, it needs an a-priori careful settings because a too low value may not force the Simplex algorithm to nil the auxiliary variables, and a too big value may bring to loss of precision and numerical instabilities. Moreover, the weight needs to be opportunely tuned for each different LP problem, which means the task to solve is no more a parameter-less but a parametric one. Notwithstanding, the role of the weight M is just to prioritize the zeroing of the auxiliary variables over the optimization of d T z, regardless its actual value. Indeed, it has just to put as much emphasis as possible on their minimization (reads to prioritize them), since a solution with at least one nonzero auxiliary variable is actually a non-feasible output for the original problem. Such priority lets us reformulate the modified problem into a non-Archimedean version.
The idea is to set the weight M equal to ①, in line with what De Cosmis and De Leone did in Nonlinear Programming [7]. Indeed, M represents a very big number which aims to increase the importance of the auxiliary variables zeroing over the minimization of d T z. The canonical form of such modified problem is shown in Eq. (8).
s.t. le z + Is = b le eq z + Ie = b eq ge z + Ir − Ip = b ge z, s, e, p, r ≥ 0 (8) Imposing the equalities in Eq. (4) (except for the vector c) and the ones in Eq. (9), the problem gets a form close to Eq. (7). Scaling down its cost function by a factor ①, it ens up fitting it perfectly. This means that such a problem can be solved by means of the G-Simplex algorithm. Indeed, as stated in Sect. 3, there always exists a bijection between a problem in the form of Eq. (7) and one in the form of Eq. (6). With regard to the Big-M method we aim to present in this letter, such bijection corresponds to Eq. (10).
Together, the rewriting of the modified problem into its non-Archimedean form and the use of the G-Simplex algorithm to solve it, implement the extended version of the Big-M method we sought to propose in this work. The latter is a procedure which leverages on an infinitely big constant to properly guide the optimization, which is the reason why we called it Infinitely-Big-M (in brief, I-Big-M). Notice that the I-Big-M method is a parameter-less approach, indeed, it does not require anymore the tuning of M, which is now always set to ①, regardless the problem to solve. Moreover, the issue of finding a feasible initial basic solution for the G-Simplex algorithm does not exist, since the modified problem and its non-Archimedean version are two different ways for representing the very same problem. What changes is only how to guide the optimization process, that is the loss function, while the polytope describing the feasible region stays the same. This means that a feasible initial basic solution for the Simplex algorithm in the modified problem is also a polytope vertex for the G-Simplex. Thus, Big-M and a I-Big-M methods share both their starting and optimal points (to be precise, the latter is true only if the weight M is correctly set in the Big-M method). Finally, it is worth to highlight how the I-Big-M method is able to tackle lexicographic multi-objective LP problems straightforwardly and without any modification to the method itself. Indeed, on the one hand the problem modification remains the same, this time with more than just one secondary vector. On the other hand, the G-Simplex algorithm is able to solve NA-LP problems regardless the number of objectives to deal with. Thus, the I-Big-M method, which is the combination of the problem modification and the G-Simplex algorithm, does not care whether the original problem to solve is lexicographic multi-objective or just single-objective. Numerical evidences of such property will be shown in Sect. 5.2.

A numerical illustration
In this section we show three practical cases of study where we can appreciate the effectiveness of the I-Big-M algorithm.

First case of study
The first analysed problem is a slight variant of Example 1 provided in [4]. Our variant takes into account only the first objective of that benchmark, which is actually a lexicographic multi-objective problem.
In order to involve an equality constraint, we have also introduced the additional variable z 3 , which plays a very marginal role in the economy of the problem. The analytical form of the current case of study is shown below, while its graphical description is reported in Fig. 1 (after being projected onto the plane z 3 = 10).
For completeness, below we also report the matrices i and the vectors b i by means of which we can represent the current LP problem in the form of Eq. (1), i = le, eq, ge : The problem's optimum is degenerate: any point within the red segment of Fig. 1 is optimal. We have indicated as α = [0, 70, 10] and β = [30, 50, 10] the two vertices of the optimal segment. In Table 1, we reported the iterations that the G-Simplex algorithm performed once run on our Infinity Computer simulator implemented in Matlab. As it can be seen from the table, the G-Simplex algorithm immediately gets rid of the auxiliary variables. Indeed, in the very first iteration the auxiliary variable r exits from the base, followed by variable e during the second iteration (in this case vectors r and e are actually scalars). This is confirmed by the cost function, which assumes a finite value in exactly two steps. Thus, the infinite weight ① does its own duty, that is prioritize the zeroing of the auxiliary variables during the solution of  the problem, which manifests in the auxiliary variables exiting from the optimal base before all the other unknowns. Finally, the optimum output by the routine was z * = α.

Second case of study
The aim of this second case of study is to numerically verify that the I-Big-M method is able to tackle the lexicographic multi-objective optimization transparently with respect to the user and the problem modification procedure. To do it, we run the I-Big-M routine on the problem presented in Eq. (12) and graphically described in Fig. 2. The new problem is the same as before with the addition of two new linear cost functions. The three overall cost vectors are reported in Eq. (13), and they can be lexicographically ordered by priority as follows: d d 2 d 3 . We avoided to report the problem splittings in matrices i and vectors b i (i = le, eq, ge), since the splitting is exactly the same already reported in Eq. (11): Some aspects of the problem are worth to be pointed out. Actually, the optimum of the new problem is now the point β, as higlighted by Fig. 2. Indeed, while for the first objective (d) the problem solution is degenerate (both α and β are acceptable), the function d 2 lets us discriminate between them. Thus, while before the choice between them was delegated to the G-Simplex algorithm actual implementation, now it is guided by the second objective function and it falls upon β. The third and least optimality direction, i.e., d 3 plays no role, since the optimum is just a singleton this time.
Finally, in Table 2 we reported the iterations performed by the G-Simplex algorithm when solving the lexicographic multi-objective LP problem in Eq. (12). Notice that the first four iterations of the algorithm are exactly the same reported in Table 1. The reason behind this phenomenon is twofold. Firstly, the G-Simplex solves both the problems (this and the previous one) from the very same initial basic solution {s, e, a}, chosen by the I-Big-M method. Secondly, the optimization is guided by the objective d until the degenerate solution is found. Together, this two aspects make reasonable to expect that the G-Simplex algorithm will follow the same steps in both the problems until it finds the degenerate solution. At that point, the first objective is not able anymore to discriminate among the solutions, and therefore the second objective turns to be effective, leading the algorithm towards the unique lexicographic solution of the problem β.
Before concluding, in Eq. (14) we report the non-Archimedean objective function of the modified NA-LP problem generated by the I-Big-M method. Such function lets us write such problem in the form of Eq. (6):

Third case of study
We now illustrate a third case of study where the traditional Big-M method struggles, because setting the right value for M is not trivial. We have considered the problem neos-1425699, which can be downloaded from: https://miplib.zib.de/instance_ details_neos-1425699.html The problem is included within the MIPLIB2017 collection of Mixed Integer LP benchmarks, and it contains 63 inequality constraints and 26 equality constraints. Of course, we have removed the integrality constraint from all the 105 variables, and we have also removed their upper bounds. As it can be seen from Table 3, the known optimal value of the cost function is 3.14867 · 10 9 (found using Matlab linprog function). Using values for M below 10 8 the standard Big-M algorithm is not able to find the right optimum, as proved by a different value of the cost function. Also the number of iterations performed by the Big-M method is interesting: it passes from 26 to 89 (see the third column of Table 3). Needless to say, on the same problem the I-Big-M method is able to find the optimum at the first attempt, i.e., without any trialand-error approach. Thus, we think that the I-Big-M method can be a powerful method in those setting which require the solution of LP problems as a sub-problem, like in the case of Mixed Integer Linear Programming when approached using a Branch-and-Bound/Branch-and-Cut strategy, both for the single objective and the lexicographic multiple-objective cases [6].

Conclusions
In this letter we have introduced a new algorithm, called I-Big-M method, which is a re-visitation of the Big-M method for the Infinity Computer. The I-Big-M method is able to solve a LP problem using only one run of the G-Simplex algorithm, in the very same manner of the classical Big-M method, which uses a single run of the classical Simplex algorithm. Even more interesting, the I-Big-M method does not require the constant M, a parameter difficult to set, typically chosen by an errorprone trial-and-error approach. Hence, the I-Big-M offers a parameter-less algorithm to solve LP problems, generating a single sequence of visited vertices. Moreover, its formulation also allows one to solve lexicographic multi-objective LP problems in a natural way, again without any assistance by the user and solving the problem using just a single run of the G-Simplex algorithm (i.e., by generating overall a single sequence of visited vertices: a pretty remarkable result). All these properties have been numerically validated by means of three experiments, one of which involving a realworld challenging benchmark. Finally, the I-Big-M method is expected to be more numerically stable than the original Big-M method, since it does not mix numbers with very different order of magnitudes. The verification of the latter conjecture deserves a deeper investigation and it is left to a future study. material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.