Application of the surrogate gradient method for a multi-item single-machine dynamic lot size scheduling problem

This study treats a multi-item single-machine dynamic lot size scheduling problem with sequence-independent setup cost and setup time. This problem involves heterogeneous decision features, such as lot sizing and lot sequencing. Traditionally, the problem has been treated by determining one feature while putting artificial constraints on the other. A Lagrangian decomposition and coordination method has been proposed that aims at simultaneous optimization of these decision features; however, smooth convergence to a feasible near-optimal solution has been a problem. In this paper, we therefore propose a model that improves the constraint equations of the existing model and show that it satisfies the Karush–Kuhn–Tucker (KKT) conditions when a feasible solution is obtained. In addition, by applying the surrogate gradient method, which has not previously been applied to this problem, we show through a real example of a printed circuit board mounting process that smoother convergence than before can be achieved.


Introduction
This paper treats a multi-item single-machine dynamic lot size scheduling problem with sequence-independent setup cost and setup time. This problem has heterogeneous decision features such as lot sequencing and lot sizing. The lot sequencing decision has traditionally been dealt with in discrete mathematics, represented by permutations, while the lot sizing decision has been dealt with in continuous mathematics, represented by differentiating a function and setting the derivative to zero.
The main reason for this is that the existing categories of mathematics are divided into discrete mathematics and continuous mathematics. Therefore, in order to treat one feature explicitly, one is forced to put artificial constraints on the other, and this has become an obstacle that makes it difficult to optimize both features simultaneously.
To resolve this problem, a solution method called the Lagrangian decomposition and coordination method, which is based on the Lagrangian relaxation method, has already been proposed to optimize heterogeneous decision features simultaneously from the viewpoint of total optimization [1]. Although Muramatsu et al. [2] have already extended this model to a multi-process multi-machine (multi-stage) model, in this paper we treat only the single-machine (single-process) model in order to investigate the basic convergence properties of this model.
In this solution method, the planning horizon is first divided into very small time slots, and the transition of each item's inventory and the lapse of setup time are treated explicitly to formulate the problem as a multi-dimensional dynamic optimization problem. Then, exploiting the separability of the problem and the existence of prohibition constraints on machine interference, the problem is decomposed by item. Each subproblem is solved by dynamic programming, and the subproblems are coordinated through Lagrange multipliers; these two steps alternate in order to resolve the machine interference and derive a near-optimal solution. The update rule for the Lagrange multipliers is the well-known subgradient method.
However, in this model, unless the operation rate of the generated schedule is 100 percent, the machine will be idle in some of the time slots in the planning horizon. Therefore, the value of the perturbation of the machine interference prohibition constraint in an idle time slot will be −1.
As a result, in many time slots, the perturbation goes inside the boundary of the constraint, and the Lagrangian value, which is the lower bound, deviates significantly from the objective function value. In addition, depending on the data, the solution often oscillates and requires a very large number of iterations before convergence, which has been a problem.
On the other hand, an interleaved subgradient method for convex and continuous problems was proposed for updating the Lagrange multipliers [3]. This method was later extended into the surrogate gradient method [4].
The ordinary subgradient method updates the Lagrange multipliers after solving all the subproblems, whereas the surrogate gradient method updates the Lagrange multipliers after solving each subproblem, which is known to provide smooth convergence to the solution. The convergence of this method has been proved, and the interleaved subgradient method can be considered a special case of the surrogate gradient method. Although this update method has already been applied to Integer Programming (IP) and Mixed IP problems, including some types of scheduling problems, there is no known example of its application to the dynamic lot size scheduling problem.
In this study, we propose a model with a new inequality constraint as a means of preventing the lower bound from departing from the objective function value. In the proposed model, the value of the perturbation of the machine interference prohibition constraint is always zero because machine idleness is treated explicitly as a decision variable. Therefore, it is expected that the perturbation will always lie on the boundary of the constraint and the lower bound will not be lowered. Furthermore, we apply the surrogate gradient method to the dynamic lot size scheduling model and verify its effectiveness with a numerical example. The numerical illustration is an actual example of the mounting process of a printed circuit board for a laser printer for personal computers.

Literature review
In this section, we briefly summarize related work with reference to review papers in related fields. Well-known review papers on lot size scheduling are those of Drexl and Kimms [5] and Copil et al. [6]. The former mainly covers studies before 2000, while the latter also covers more recent studies. The features of the model in this paper are the multi-item dynamic lot size scheduling problem and its decomposition, aiming at the simultaneous optimization of heterogeneous decision features using Lagrangian relaxation (decomposition). Therefore, we review the conventional studies from these viewpoints. Specifically, we take up only the most relevant problem, the discrete lot size scheduling problem (DLSP), and partly the proportional lot size scheduling problem (PLSP), and give an overview of them.
There are many studies on DLSP. Fleischmann [7] first formulated DLSP as a single-process problem, proposed a method for alternating lot size and lot sequence, and played a pioneering role in research on this type of problem. This work was extended to the problem with sequence-dependent setup cost. Leachman et al. [8] proposed a dynamic programming method for a single-machine DLSP with dynamic demand, extended to consider material supply constraints. van Hoesel and Kolen [9] proposed formulating the problem as an integer program, relaxing it, reformulating it as a shortest path problem, and solving it using dynamic programming. Gicquel and Minoux [10] considered a single-machine DLSP formulation with flow conservation of the setup state. They proposed multi-product valid inequalities, for which the separation problem is solved using exact and heuristic algorithms.
In addition, the Proportional Lotsizing and Scheduling Problem (PLSP) is a scheduling problem that, like DLSP, is formulated by discretizing the planning horizon into very small time slots. The work of Kimms [11,12] is a known study on PLSP that deals with different decision features together. However, it determines the lot size and lot sequence repeatedly, step by step, and cannot be said to take the viewpoint of simultaneous optimization into consideration.
The main solution approaches proposed for these models are linear programming relaxation [13], Lagrangian relaxation (LR) [14], dynamic programming [15], and the Lagrangian Decomposition and Coordination (LDC) method [2]. The proposed model is based on the LDC method. In many optimization problems, including production scheduling, the usual subgradient method has been used to update the Lagrange multipliers in the LR and LDC methods, but convergence has been a problem, especially for realistic problems. Kaskavelis and Caramanis proposed the interleaved subgradient method [3], but the convergence problem still remained. Zhao et al. developed the surrogate gradient method, proved the convergence of the algorithm, and showed that the interleaved subgradient method is a special case of the surrogate gradient method [4].
Subsequently, the surrogate gradient method has been applied to many scheduling problems, especially the job shop scheduling problem [16]; however, there is no known example of its application to the dynamic lot size scheduling model, which optimizes lot sizing and lot sequencing simultaneously.

Problem description
We consider a production system that comprises a single machine, which can process multiple items. Every production run incurs a setup cost and a setup time, which depend on the item to be processed (this paper does not treat sequence-dependent setups). Processing time also depends on the item and is proportional to the quantity. Storing each item incurs an inventory holding cost, which is proportional to the inventory level and the holding interval. The shipment requirement, along with its due date, is given for each item at the corresponding time slot. Inventory shortage is not allowed for any item over the planning horizon.
In this study, we seek a feasible schedule that minimizes the sum of the setup costs and inventory holding costs over the planning horizon without allowing item shortages or delays. This problem is a dynamic lot size scheduling problem in the sense that the lot sizing and lot sequencing decisions are not time-invariant. In this paper, we treat this dynamic optimization problem as a multi-item single-process dynamic lot size scheduling problem with sequence-independent setup cost and setup time.

Notations
We present the notations that are used in the formulation.
$p_i$: production quantity per unit time slot for item $i \in I$.
$r_{it}$: quantity of shipment requirement for item $i \in I$ at time slot $t \in T$.
$h_i$: inventory holding cost per unit time slot for item $i \in I$.
$x_{it}$: inventory at the end of time slot $t \in T$ for item $i \in I$.
$x_i^{\min}$: lower limit of inventory for item $i \in I$.
$x_i^{\max}$: upper limit of inventory for item $i \in I$.

Constraint for inventory
The inventory of each item must satisfy the following constraint,
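As an illustration of the inventory limits, the following minimal Python sketch simulates the inventory of one item and checks it against a lower and an upper limit. The transition rule used here (inventory grows by the production quantity in every slot where a 0/1 production indicator is set and shrinks by the shipment requirement), the function names `simulate_inventory` and `within_bounds`, and the sample data are assumptions made for illustration; setup time is ignored, and the rule is not taken from the formulation in this paper.

```python
from typing import List, Sequence


def simulate_inventory(x0: float, p: float,
                       delta: Sequence[int], r: Sequence[float]) -> List[float]:
    """Return the end-of-slot inventory levels of one item.

    Assumed transition (illustrative only): x_t = x_{t-1} + p * delta_t - r_t,
    where delta_t is 1 if the item is produced in slot t and 0 otherwise.
    """
    x = x0
    levels = []
    for produced, required in zip(delta, r):
        x = x + p * produced - required
        levels.append(x)
    return levels


def within_bounds(levels: Sequence[float], x_min: float, x_max: float) -> bool:
    """Check the lower and upper inventory limits in every time slot."""
    return all(x_min <= x <= x_max for x in levels)


# Hypothetical data: initial inventory 5, 4 units produced per active slot.
levels = simulate_inventory(x0=5, p=4, delta=[1, 0, 1, 0], r=[2, 3, 1, 4])
print(levels)                        # inventory at the end of each slot
print(within_bounds(levels, 0, 10))  # no shortage, no overflow
```

With a lower limit of zero, the bound check also expresses the requirement that no shortage occurs in any time slot.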

Inventory transition equation
We

Prohibition constraint of machine interference
Only one item can be set up or processed at a time. Hence, the following inequality must be satisfied, where $\delta_{it} = 1$ if item $i$ is processed at time slot $t$, and $\delta_{it} = 0$ otherwise.

Constraint for regulating the sum total of the decision variables
To obtain a feasible solution, the total sum of the decision variables over all the items, including a dummy item, and over all the time slots should be equal to the planning horizon n. Therefore, the following inequality must be satisfied.
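The two interaction constraints described above can be checked on a candidate schedule with a short Python sketch. It assumes that the machine interference constraint has the form "number of items occupying a slot, minus one, is at most zero" and that the sum-total constraint requires the total of all decision variables to be at least n; the direction of the second inequality, the function names, and the sample schedules are assumptions for illustration, with row 0 playing the role of the dummy item.

```python
from typing import List


def interference_perturbation(delta: List[List[int]]) -> List[int]:
    """Per-slot perturbation g_t = (sum of delta[i][t] over items i) - 1."""
    n_slots = len(delta[0])
    return [sum(row[t] for row in delta) - 1 for t in range(n_slots)]


def is_feasible(delta: List[List[int]]) -> bool:
    """Assumed feasibility test: no slot holds more than one item, and the
    decision variables of all items (dummy included) sum to at least n."""
    n_slots = len(delta[0])
    no_interference = all(g <= 0 for g in interference_perturbation(delta))
    total_ok = sum(sum(row) for row in delta) >= n_slots
    return no_interference and total_ok


# Row 0 is the dummy (idle) item; rows 1 and 2 are real items.
schedule = [[0, 0, 1, 0],
            [1, 1, 0, 0],
            [0, 0, 0, 1]]
clash = [[0, 0, 0, 0],
         [1, 1, 0, 0],
         [1, 0, 1, 1]]

print(interference_perturbation(schedule), is_feasible(schedule))
print(interference_perturbation(clash), is_feasible(clash))
```

Under these assumptions, the two inequalities together force every slot to be occupied by exactly one item, real or dummy, so the perturbation of a feasible schedule is zero in every slot.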

Total cost
By adding all the setup costs and the inventory holding costs, the total cost $f(\delta, x, s)$ can be expressed as follows, where $\delta$, $x$ and $s$ are the vectors whose elements are $\delta_i$, $x_i$ and $s_i$, respectively, and $\delta_i$, $x_i$ and $s_i$ are the vectors whose elements are $\delta_{it}$, $x_{it}$ and $s_{it}$, respectively.
Also, $f(\delta, x, s)$ is the objective function of the problem.

Problem
The problem is to obtain a feasible schedule that minimizes the total cost over the planning horizon. Therefore, it can be expressed as follows,

Lagrangian
To apply the LDC method, we define the Lagrangian L( , x, s; u, ) as follows, where u is a vector whose elements are $u_t$. Accordingly, the problem can be transformed into the following minimization problem.
Lagrangian relaxation problem

Problem decomposition
The separability of the problem enables us to decompose the Lagrangian into item-based subproblems. Therefore, we can express Eq. (7) as follows. By observing the first term on the right-hand side of Eq. (10), we see that each item can be treated independently of the other items.
The feasibility of a solution is determined by whether the machine interference is dissolved and whether the total sum of the decision variables is regulated; both conditions are treated explicitly in Eqs. (4) and (5). They can be evaluated once the problem is solved.
Consequently, by denoting L i ( , x, s; u, ) as follows, we can formulate the following sub-optimization problem.
Sub-optimization problem
∀i ∈ I,
Solving Eq. (12) makes it possible to calculate the Lagrangian from Eq. (10). This implies that, from Eq. (11), the Lagrangian can be denoted as follows.
Accordingly, the issues associated with the calculation are the solution of the subproblems and the satisfaction of the interaction constraints (4) and (5). We can reformulate each subproblem so that it can be solved using dynamic programming. However, we omit this reformulation here because the same formulation was used in Muramatsu [1], except that we included a dummy item and added the last term in Eq. (11).
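The decomposition described in this section can be sketched as a short derivation. The sketch below is a reconstruction under stated assumptions rather than a reproduction of the numbered equations of this paper: the symbol $\delta$ for the decision variables, the symbol $\mu$ for the multiplier of the sum-total constraint, the item-wise cost $f_i$, and the sign convention of both constraints are all assumed.

```latex
\begin{align*}
L(\delta, x, s; u, \mu)
  &= f(\delta, x, s)
     + \sum_{t \in T} u_t \Bigl( \sum_{i \in I} \delta_{it} - 1 \Bigr)
     + \mu \Bigl( n - \sum_{i \in I} \sum_{t \in T} \delta_{it} \Bigr) \\
  &= \sum_{i \in I} \Bigl[ f_i(\delta_i, x_i, s_i)
     + \sum_{t \in T} (u_t - \mu)\, \delta_{it} \Bigr]
     - \sum_{t \in T} u_t + \mu n .
\end{align*}
```

Assuming the total cost is a sum of item-wise costs $f_i$, the bracketed term depends only on the variables of item $i$, so it can be minimized item by item for fixed multipliers, while the remaining terms are constant with respect to the decision variables.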

Characteristics of the proposed formulation and solution
As discussed previously, an operation rate of 100 percent is scarcely observed in practice. Accordingly, in this problem, the machine is idle at some time slots in the planning horizon. In the formulation presented in [1], a binary decision variable vector is assigned to every real item. Idle time therefore occurs at a time slot when the sum of the decision variables over all the items at that time slot is zero.
In this formulation, suppose that machine interference occurs at some time slots in the solution, so that the concerned Lagrange multipliers are updated to positive values. If the machine interference is subsequently dissolved and the perturbation due to machine interference takes a value of −1, the product of the concerned Lagrange multiplier and the perturbation will be negative. Thus, the complementary slackness part of the Karush-Kuhn-Tucker (KKT) conditions, which are necessary for an optimal solution in nonlinear programming, is not guaranteed to hold [17,18].
In our previous study [19], we introduced a dummy item to express the idle time and added a constraint regulating the sum total of the decision variables. Although that study changed the inequality constraints into equality constraints, in this study we change the equality constraints back to inequality constraints. Therefore, it is guaranteed that the KKT conditions (14), (15) hold if an optimal solution can be obtained.
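The effect of the dummy item on the product of multiplier and perturbation can be illustrated numerically. The following Python sketch assumes a perturbation of the form "number of items occupying a slot, minus one"; the function name `slackness_products`, the multiplier values, and the schedules are hypothetical.

```python
from typing import List


def slackness_products(delta: List[List[int]], u: List[float]) -> List[float]:
    """Return u_t * g_t per slot, with g_t = (items occupying slot t) - 1."""
    n_slots = len(u)
    g = [sum(row[t] for row in delta) - 1 for t in range(n_slots)]
    return [u[t] * g[t] for t in range(n_slots)]


u = [0.5, 0.25, 0.75, 0.125]               # hypothetical positive multipliers
real_items = [[1, 1, 0, 0],                # slot 2 is idle for the real items
              [0, 0, 0, 1]]
with_dummy = [[0, 0, 1, 0]] + real_items   # dummy item fills the idle slot

without_dummy_products = slackness_products(real_items, u)
with_dummy_products = slackness_products(with_dummy, u)
print(without_dummy_products)  # negative entry at the idle slot
print(with_dummy_products)     # every product is zero
```

Without the dummy item, the idle slot contributes a negative product whenever its multiplier is positive, which is the situation in which complementary slackness fails; with the dummy item, the perturbation is zero in every slot regardless of the multiplier values.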
We apply the surrogate gradient method for updating the Lagrange multipliers. The difference between the two methods is that the subgradient method solves all the subproblems before each update of the Lagrange multipliers, whereas the surrogate gradient method updates the multipliers after solving each subproblem.

Surrogate gradient method
Based on the quantity of constraint violation, we can update the Lagrange multipliers to dissolve the violation. Usually, the subgradient method is used to solve the Lagrangian relaxed problem numerically. In this study, we employed the surrogate gradient method, which updates the Lagrange multipliers after solving each item-based subproblem. The updating rule can be expressed as follows, where each of the two multipliers has its own step length and the iterations are counted by an iteration index. The step lengths can be updated as follows, where each step length has a scaling parameter between 0 and 2, $L_{UB}$ denotes the upper bound, and $L_{SD}$ denotes the surrogate dual cost function.
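The update scheme described above can be sketched in Python on a toy instance. This is a minimal sketch under several assumptions, not the implementation used in this study: the item subproblem is replaced by a hypothetical stand-in (`solve_subproblem` picks the cheapest slots instead of running the dynamic program), only the multipliers of the machine interference constraint are updated, the step length is taken as `rho * (upper_bound - l_sd) / ||g||^2`, and all names and data are illustrative.

```python
from typing import List


def solve_subproblem(cost: List[float], u: List[float], need: int) -> List[int]:
    """Toy stand-in for the item-level dynamic program (an assumption):
    occupy the `need` slots with the smallest adjusted cost cost[t] + u[t]."""
    order = sorted(range(len(u)), key=lambda t: cost[t] + u[t])
    chosen = set(order[:need])
    return [1 if t in chosen else 0 for t in range(len(u))]


def surrogate_gradient(costs, needs, upper_bound, rho=1.0, max_iter=100):
    """Update the multipliers after re-solving ONE item subproblem at a time."""
    n_items, n_slots = len(costs), len(costs[0])
    u = [0.0] * n_slots
    # Initial solutions of all items at u = 0.
    delta = [solve_subproblem(costs[i], u, needs[i]) for i in range(n_items)]
    for k in range(max_iter):
        i = k % n_items                      # item re-solved in this iteration
        delta[i] = solve_subproblem(costs[i], u, needs[i])
        # Surrogate subgradient: per-slot violation of "one item at a time".
        g = [sum(delta[j][t] for j in range(n_items)) - 1 for t in range(n_slots)]
        norm2 = sum(v * v for v in g)
        if norm2 == 0:                       # every slot holds exactly one item
            break
        primal = sum(costs[j][t] * delta[j][t]
                     for j in range(n_items) for t in range(n_slots))
        l_sd = primal + sum(u[t] * g[t] for t in range(n_slots))  # surrogate dual
        # Assumes upper_bound exceeds l_sd, so the step length is positive.
        step = rho * (upper_bound - l_sd) / norm2
        u = [max(0.0, u[t] + step * g[t]) for t in range(n_slots)]
    return u, delta


# Hypothetical instance: 2 items, 4 slots, each item needs 2 slots.
costs = [[1, 2, 3, 4], [1, 2, 4, 3]]
u, delta = surrogate_gradient(costs, needs=[2, 2], upper_bound=10.0)
print(delta)  # slots occupied by each item
print(u)      # final multipliers
```

Replacing the single re-solved item by a loop over all items before each multiplier update would turn the sketch into the ordinary subgradient scheme, which is the only structural difference between the two update orders.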

Data
In this section, we present a numerical example based on the proposed model and updating method. The related data are presented in Tables 1 and 2. This example is a real case of the mounting process of printed circuit boards (PCBs) for a laser beam printer for PCs. It should be noted that all the data relating to the dummy item (i = 0) are zero except for its initial state.
Table 1 Setup cost, setup time, production quantity, inventory holding cost, and the initial inventory

Results
We obtained a feasible schedule during the 449th iteration. Figure 1 shows the transition of the objective function, the surrogate dual function, and the lower bound value. Both the objective function and the Lagrangian at the point of convergence agree with a value of 856.33009.
These results indicate that the surrogate gradient method converges faster than the subgradient method. Therefore, we consider the surrogate gradient method to be effective in achieving fast and smooth convergence.
However, because the proposed method does not include an algorithm for obtaining a feasible solution at each iteration, a feasible solution cannot be obtained if the calculation is interrupted. Therefore, it is necessary to construct a heuristic algorithm for deriving a feasible solution.

Conclusions
This study treated a multi-item single-machine dynamic lot size scheduling problem with sequence-independent setup cost and setup time. First, we confirmed that in the model proposed by Muramatsu [1], there is no guarantee that the KKT condition holds when the optimal solution is obtained.
Furthermore, to compensate for the shortcomings of the previous work, we proposed a new model in which the constraint prohibiting machine interference was converted from an equality constraint to an inequality constraint, and a constraint regulating the summation was added. We also showed that the KKT condition is satisfied in the proposed model when the optimal solution is obtained.
Then, the surrogate gradient method, which is known to achieve smooth convergence in updating Lagrange multipliers, was applied to the dynamic lot size scheduling model for the first time, and the validity of the proposed method was verified numerically using a real-world example.

Data availability
All data generated or analyzed during this study are included in this published article.

Code availability
Not available in public repository.

Conflict of interest
The authors declare no conflicts of interest associated with this manuscript.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.