Robust optimization of uncertain multistage inventory systems with inexact data in decision rules
Abstract
In production-inventory problems customer demand is often subject to uncertainty. Therefore, it is challenging to design production plans that satisfy both demand and a set of constraints on, e.g., production capacity and required inventory levels. Adjustable robust optimization (ARO) is a technique to solve these dynamic (multistage) production-inventory problems. In ARO, the decision in each stage is a function of the data on the realizations of the uncertain demand gathered from the previous periods. These data, however, are often inaccurate; there is much evidence in the information management literature that data quality in inventory systems is often poor. Reliance on data “as is” may then lead to poor performance of “data-driven” methods such as ARO. In this paper, we remedy this weakness of ARO by introducing a model that treats past data itself as an uncertain model parameter. We show that computational tractability of the robust counterparts associated with this extension of ARO is still maintained. The benefits of the new model are demonstrated by a numerical test case of a well-studied production-inventory problem. Our approach is also applicable to other ARO models outside the realm of production-inventory planning.
Keywords
Adjustable robust optimization, Production-inventory problems, Decision rules, Inexact data, Poor data quality
1 Introduction
With the rise of Big Data, most of the currently available (theoretical or practical) methods for controlling a multistage production-inventory system use a “data-driven” approach. At each period t, data from the future is treated as uncertain, while data from the past is considered known (certain). The affinely adjustable robust counterpart (AARC) method (Ben-Tal et al. 2004), which is the focus of this paper, needs exact past demands to derive a decision, by inserting them in a linear decision rule. In reality, however, there is strong evidence (see below) that even past data is far from exact. For example, in inventory/production systems what is usually reported as a surrogate for the demand is sales, which ignores lost sales due to excess demand.
In general, even when it seems that the full data on the uncertain demand is available at some stage, one cannot rely blindly on this information. Arguably, many developments in information technology have enabled firms to collect real-time data. However, despite these enormous developments in our Big Data era, poor data quality is still a big issue. DeHoratius and Raman (2008) report the results of an empirical study in which 65 % of the inventory records were inaccurate, and “the value of the inventory reflected by these inaccurate records amounted to 28 % of the total value of the expected on-hand inventory”. Redman (1998) estimates that 1–5 % of data fields are erroneous, which led to a cost increase of 8–12 % of revenue in some carefully studied cases, and consumed 40–60 % of the expenditure in service organizations. Haug et al. (2011) summarize the literature that deals with the large impact of poor data quality: “Less than 50 % of companies claim to be very confident in the quality of their data”, “75 % of organizations have identified costs stemming from dirty data”. See also Soffer (2010) for a general exploration of data inaccuracy in business processes. One paper that develops a method to handle inaccurate inventory records is by Kök and Shang (2007). Their approach assumes that the distribution of the errors (describing the inaccuracy) is known and that inspections can be made at certain costs to exactly observe these errors.
In this paper we extend the AARC method to a method named adjustable robust counterpart with decision rules based on inexact data (ARCID), which incorporates past data uncertainty while keeping the resulting (deterministic) robust counterpart tractable. This is our main contribution, and it is achieved using results and techniques from the current robust optimization arsenal.
We illustrate the benefits of the ARCID model by revisiting the inventory problem that was used in the first paper on ARO (Ben-Tal et al. 2004). Numerical results for this production-inventory problem show that if one neglects the inexact nature of the revealed data, then the resulting solution might violate the constraints in many scenarios. For our numerical example, violations occurred for up to \(80\,\%\) of the simulated demand trajectories. The ARCID model is able to avoid this severe infeasibility and produce more reliable solutions.
Although the focus of this paper is on production-inventory problems, there are various other areas where our ARCID model could be used to solve uncertain multistage problems. For example, ARO techniques were used in facility location planning (Baron et al. 2011), flexible commitment models (Ben-Tal et al. 2005), portfolio optimization (Calafiore 2008, 2009; Rocha and Kuhn 2012), capacity expansion planning (Ordóñez and Zhao 2007) and management of power systems (Guigues and Sagastizábal 2012; Ng and Sy 2014), among others. A more elaborate list of examples up to 2011 can be found in the survey by Bertsimas et al. (2011a). We emphasize that our proposed ARCID framework remains applicable for multistage problems outside the realm of production-inventory planning.
The remainder of this paper is organized as follows. In Sect. 2 we describe the adjustable robust models used in the literature. Section 3 then introduces the new ARCID models with inexact revealed data in the decision rules and derives tractable representations of the resulting optimization problems. Section 4 presents our production-inventory model and the corresponding ARCID model. The numerical results are given and analyzed in Sect. 5. Conclusions are presented in Sect. 6. Throughout this paper we use bold lowercase and uppercase letters for vectors and matrices, respectively, while scalars are printed in regular font.
2 Adjustable robust models
Table 1 Examples of uncertainty sets and their support functions
| Uncertainty set | \(\mathcal {Z}\) | \(\delta ^*(\varvec{\nu } \mid \mathcal {Z})\) |
| --- | --- | --- |
| Box | \(\{\varvec{\zeta }: \Vert \varvec{\zeta }\Vert _\infty \le \alpha \}\) | \(\alpha \Vert \varvec{\nu }\Vert _1\) |
| Ball | \(\{\varvec{\zeta }: \Vert \varvec{\zeta }\Vert _2 \le \alpha \}\) | \(\alpha \Vert \varvec{\nu }\Vert _2\) |
| Polyhedral | \(\{\varvec{\zeta }: \mathbf b - \mathbf B \varvec{\zeta }\ge 0\}\) | \({\left\{ \begin{array}{ll} \mathbf b ^\top \mathbf z &{}\text { if }\, \mathbf B ^\top \mathbf z = \varvec{\nu },\ \mathbf z \ge 0\\ \infty &{}\text { otherwise}\end{array}\right. }\) |
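The box and ball entries can be checked numerically. The following is a minimal sketch, assuming the support function is defined as \(\delta ^*(\varvec{\nu } \mid \mathcal {Z}) = \sup _{\varvec{\zeta }\in \mathcal {Z}} \varvec{\nu }^\top \varvec{\zeta }\); the function names, the direction vector `nu` and the radius `alpha` are hypothetical and chosen only for illustration. It compares each closed-form value with the inner product at a point of the set that attains the supremum.

```python
import math


def support_box(nu, alpha):
    """Support function of {zeta : ||zeta||_inf <= alpha}, i.e. alpha * ||nu||_1."""
    return alpha * sum(abs(v) for v in nu)


def support_ball(nu, alpha):
    """Support function of {zeta : ||zeta||_2 <= alpha}, i.e. alpha * ||nu||_2."""
    return alpha * math.sqrt(sum(v * v for v in nu))


def box_maximizer(nu, alpha):
    """A point of the box attaining the supremum: zeta_i = alpha * sign(nu_i)."""
    return [alpha if v >= 0 else -alpha for v in nu]


def ball_maximizer(nu, alpha):
    """A point of the ball attaining the supremum: zeta = alpha * nu / ||nu||_2."""
    norm = math.sqrt(sum(v * v for v in nu))
    if norm == 0.0:
        return [0.0] * len(nu)
    return [alpha * v / norm for v in nu]


def inner(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


nu = [1.5, -2.0, 0.5]   # hypothetical direction vector
alpha = 0.2             # hypothetical radius of the uncertainty set

# Closed form versus the value attained at the maximizing point.
print(support_box(nu, alpha), inner(nu, box_maximizer(nu, alpha)))
print(support_ball(nu, alpha), inner(nu, ball_maximizer(nu, alpha)))
```

In each printed pair the two numbers should agree up to rounding, which is what makes these sets convenient: the worst case over the set collapses to a norm of the coefficient vector.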
3 The new adjustable robust model based on inexact data
This section introduces our model that extends the ARC model to the case where revealed data is inexact. We stress that the models described here are more general and not limited to production-inventory problems. They could be used for any ARO problem within operations management where the revealed data is inexact.
Theorem 1
Proof
The two assumptions on the uncertainty set (closedness and nonempty relative interior of \(\mathcal {U}\)) used in Theorem 1 are satisfied for all closed sets \(\mathcal {Z}\) and \({\widehat{\mathcal {Z}}}\) with nonempty relative interior and 0 being an element of the relative interior of \({\widehat{\mathcal {Z}}}\). A few common choices for uncertainty sets that satisfy these conditions are given in Table 1. Below we give two examples of constraints with different choices for the estimation uncertainty. In the first example (Box-Box) we have both box uncertainty for the parameter \(\varvec{\zeta }\) and a box for the estimation error (independent estimation errors). In the second example (Box-Ball) the estimation errors reside in a ball.
Example 1
Example 2
Theorem 1 can also be used to argue that the new ARCID model bridges the gap between models that do not use information at all in the second stage (RC) and those that rely on fully accurate revealed information in the decision rules (ARC). Namely, if the estimation uncertainty is large (i.e., \({\widehat{\mathcal {Z}}}\) is large), then there is no value in the revealed inexact data. In that case the optimal value of the non-adjustable version is equal to the optimal value of (ARCID). More formally, consider the situation where there exists a realization \({\bar{\varvec{\zeta }}} \in \mathcal {Z}\) such that \(\mathcal {Z}\subset {\bar{ \varvec{\zeta }}} + {\widehat{\mathcal {Z}}}\). Then, if (P:ARCID) is feasible, it follows directly that there must also exist a decision rule with \(\mathbf V = 0\), i.e., a non-adjustable decision. For Examples 1 and 2, the ARCID model is equivalent to the non-adjustable model when \(\rho \ge \theta \) for the first example (Box-Box) and when \(\rho \ge \sqrt{L}\theta \) for the second example (Box-Ball). In case there is no estimation error (\({\widehat{\mathcal {Z}}} = \{0\}\)), the ARC and the ARCID are equivalent in the sense that they have the same feasible region and the same optimal objective value.
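The two thresholds can be verified by brute force in low dimension. The sketch below assumes that \(\mathcal {Z}\) is the box \(\{\varvec{\zeta }: \Vert \varvec{\zeta }\Vert _\infty \le \theta \}\) in dimension L, that \({\widehat{\mathcal {Z}}}\) is a box or ball of radius \(\rho \) centred at 0, and that \({\bar{\varvec{\zeta }}} = 0\); these set definitions and all function names are assumptions made for illustration. Because both sets are convex, the box lies inside the estimation set exactly when all of its vertices do.

```python
import math
from itertools import product

TOL = 1e-12  # guards against rounding when a vertex lies on the boundary


def box_vertices(theta, dim):
    """All 2**dim vertices of the box {zeta : ||zeta||_inf <= theta}."""
    return [list(v) for v in product((-theta, theta), repeat=dim)]


def box_covered_by_box(theta, rho, dim):
    """True if the theta-box lies inside the rho-box (both centred at 0)."""
    return all(max(abs(x) for x in v) <= rho + TOL
               for v in box_vertices(theta, dim))


def box_covered_by_ball(theta, rho, dim):
    """True if the theta-box lies inside the rho-ball (both centred at 0)."""
    return all(math.sqrt(sum(x * x for x in v)) <= rho + TOL
               for v in box_vertices(theta, dim))


L = 4          # hypothetical number of uncertain parameters
theta = 0.2    # radius of the primitive uncertainty box

print(box_covered_by_box(theta, rho=0.2, dim=L))                   # rho = theta
print(box_covered_by_ball(theta, rho=math.sqrt(L) * theta, dim=L))  # rho = sqrt(L) * theta
print(box_covered_by_ball(theta, rho=0.39, dim=L))                  # slightly too small
```

The vertex enumeration grows as \(2^L\), so this is only a sanity check for small L; the closed-form thresholds are what one would use in practice.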
Theorem 2
In Theorem 1 we only consider constraints that are linear. This theorem can be readily extended to the case where the constraint is convex (but not necessarily linear) in the here-and-now variables \(\mathbf x \). To do so, we can use Fenchel duality as has been done for non-adjustable robust models in Ben-Tal et al. (2015).
The construction of the standard uncertainty set and the estimation uncertainty set can be done in different ways. Our model based on inexact revealed data has additional uncertainty in the estimates, described by the uncertainty set \({\widehat{\mathcal {Z}}}\), or by \({\widehat{\mathcal {Z}}}_{1},\ldots ,{\widehat{\mathcal {Z}}}_{T}\) in the multi-period case. We therefore have to construct another uncertainty set that captures all estimation errors against which we want to be protected in our future planning periods. For constructing the estimation uncertainty set we can use the same techniques as for the static case (see e.g. Bertsimas et al. 2013). We can for instance use historical data on the errors, \(\varvec{\zeta } - {\widehat{\varvec{\zeta }}}^{t}\), obtained from previous planning horizons. If there is insufficient historical data, one can still define uncertainty sets with realistic a priori reasoning. In retail stores, and especially with the growing share of online retail, customers often return a product if it does not meet their requirements. Sales figures then give an indication of the total demand, but it is known that in each period between, for example, 5 and \(10\,\%\) of all products are returned. The bandwidth of this percentage can then be used to construct the estimation uncertainty around the demand estimate obtained via sales figures. Another situation of estimation uncertainty arises when the demand estimate is obtained via accumulation of (correlated) demand from different stores. If we know that different stores need different amounts of time to come up with accurate data (e.g., sales reports), then there is still some uncertainty on the total demand if, for example, only 9 out of 10 stores have reported their sales. In both of these situations more information will be revealed in later periods and estimates are likely to become more accurate over time.
An example of this type of uncertainty set, where estimates become more accurate over time, is used in the production-inventory problem in the next section.
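The retail-returns reasoning can be turned into a small calculation. This is a sketch under a stated assumption that is not spelled out in the text: net demand is taken to be sales minus returned units, so a return rate between 5 and \(10\,\%\) places demand between 90 and \(95\,\%\) of reported sales. The function names and the sales figure are hypothetical.

```python
def demand_interval_from_sales(sales, min_return_rate, max_return_rate):
    """Interval for net demand when a share of the sold units is later returned."""
    if not 0.0 <= min_return_rate <= max_return_rate <= 1.0:
        raise ValueError("return rates must satisfy 0 <= min <= max <= 1")
    low = sales * (1.0 - max_return_rate)   # most returns, lowest net demand
    high = sales * (1.0 - min_return_rate)  # fewest returns, highest net demand
    return low, high


def estimate_and_radius(low, high):
    """Midpoint estimate and the half-width that bounds the estimation error."""
    return (low + high) / 2.0, (high - low) / 2.0


# Hypothetical period with 1000 units sold and a 5-10 % return rate.
low, high = demand_interval_from_sales(1000.0, 0.05, 0.10)
estimate, radius = estimate_and_radius(low, high)
print(low, high, estimate, radius, radius / estimate)
```

The ratio `radius / estimate` plays the role of a relative estimation error for that period; with these numbers it is a little under \(3\,\%\).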
4 Production-inventory problem
In this section we apply the ARCID approach to the production-inventory problem that was introduced in Ben-Tal et al. (2004), the seminal paper on adjustable robust optimization.
4.1 The nominal model
We consider a single-product inventory system, which comprises a warehouse and I factories. A planning horizon of T periods is used. In the model we use the following parameters and variables, using the same notation as in Ben-Tal et al. (2004):
4.2 Parameters
\(d_t\): Demand for the product in period t;
\(P_i(t)\): Production capacity of factory i in period t;
\(c_i(t)\): Costs of producing one product unit at factory i in period t;
\(V_{\text {min}}\): Minimal allowed level of inventory at the warehouse;
\(V_{\text {max}}\): Storage capacity of the warehouse;
\(Q_i\): Cumulative production capacity of the ith factory throughout the planning horizon.
4.3 Variables
\(p_i(t)\): The amount of the product to be produced in factory i in period t;
\(v(t)\): Inventory level at the beginning of period t (\(v(1)\) is given).
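To make the roles of these parameters and variables concrete, here is a minimal sketch that checks a given production plan against a given demand trajectory. It assumes the standard inventory balance \(v(t+1) = v(t) + \sum _i p_i(t) - d_t\) together with the bounds \(0 \le p_i(t) \le P_i(t)\), \(\sum _t p_i(t) \le Q_i\) and \(V_{\text {min}} \le v(t) \le V_{\text {max}}\) for \(t = 2,\ldots ,T+1\), as in the model of Ben-Tal et al. (2004). The function names and the toy data (two factories, three periods) are hypothetical.

```python
def inventory_path(v1, production, demand):
    """Inventory levels v(1), ..., v(T+1) from the balance equation.

    production[i][t] is the output of factory i in period t (0-based),
    demand[t] is the demand in period t.
    """
    levels = [v1]
    for t, d_t in enumerate(demand):
        produced = sum(p_i[t] for p_i in production)
        levels.append(levels[-1] + produced - d_t)
    return levels


def plan_is_feasible(v1, production, demand, capacity, cumulative_capacity,
                     v_min, v_max):
    """Check per-period capacity, cumulative capacity and inventory bounds."""
    for p_i, cap_i, q_i in zip(production, capacity, cumulative_capacity):
        if any(p < 0 or p > c for p, c in zip(p_i, cap_i)):
            return False          # violates 0 <= p_i(t) <= P_i(t)
        if sum(p_i) > q_i:
            return False          # violates sum_t p_i(t) <= Q_i
    levels = inventory_path(v1, production, demand)
    return all(v_min <= v <= v_max for v in levels[1:])


# Hypothetical toy instance.
production = [[300, 300, 300], [200, 300, 200]]
capacity = [[400, 400, 400], [300, 300, 300]]
cumulative_capacity = [1000, 800]

print(inventory_path(500, production, [450, 650, 500]))
print(plan_is_feasible(500, production, [450, 650, 500],
                       capacity, cumulative_capacity, 500, 2000))
print(plan_is_feasible(500, production, [450, 700, 500],
                       capacity, cumulative_capacity, 500, 2000))
```

The same plan is feasible for the first demand trajectory and infeasible for the second, which is the basic reason a fixed plan has to be protected against a whole set of demand realizations.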
4.4 The affinely adjustable robust model based on inexact data

\(I_t = \{1,\ldots ,t\}\), the information basis where demand from the past and the present is known exactly; for the future no extra information is known;

\(I_t = \{1,\ldots ,t-1\}\), the information basis where all demand from the past is known exactly; there is no information about the present;

\(I_t = \{1,\ldots ,t-4\}\), the information about the past is received with a four-day delay. For the other periods in the past (\(t-3\), \(t-2\) and \(t-1\)) there is no extra information at all.
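The information bases above, and their use in a decision rule, can be sketched as follows. The sketch assumes a linear decision rule of the form \(p_i(t) = \pi ^0_{i,t} + \sum _{r \in I_t} \pi ^r_{i,t} d_r\), as in Ben-Tal et al. (2004); the function names, coefficients and demand figures are hypothetical. It also shows what happens when the rule is fed an inexact estimate instead of the true demand.

```python
def information_basis(t, delay):
    """Periods whose demand is available in period t: {1, ..., t - delay}."""
    return list(range(1, t - delay + 1))


def affine_rule(intercept, weights, observed):
    """Evaluate intercept + sum over r of weights[r] * observed[r]."""
    return intercept + sum(weights[r] * observed[r] for r in weights)


print(information_basis(6, delay=0))   # past and present known
print(information_basis(6, delay=1))   # past known, present unknown
print(information_basis(6, delay=4))   # four-period delay
print(information_basis(3, delay=4))   # nothing revealed yet

# Hypothetical rule for period t = 3 with information basis {1, 2}.
weights = {1: 0.5, 2: 0.25}
true_demand = {1: 1000.0, 2: 1200.0}
estimates = {1: 1000.0, 2: 1140.0}     # period 2 is under-reported by 5 %

planned_exact = affine_rule(100.0, weights, true_demand)
planned_inexact = affine_rule(100.0, weights, estimates)
print(planned_exact, planned_inexact, planned_exact - planned_inexact)
```

The gap between the two production quantities equals the weighted estimation error, \(\sum _r \pi ^r (d_r - {\widehat{d}}_r)\). A rule that was designed for exact data and is then evaluated on estimates carries this unplanned deviation into the inventory balance.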
4.5 Data set from Ben-Tal et al. (2004)
The initial inventory level v(1) was not stated in Ben-Tal et al. (2004), but this value is equal to the lower bound of the inventory level at the warehouse, namely 500. Note that the initial inventory level could also be chosen uncertain if the initial state is unknown. For new products, where no past demand has occurred, it is realistic to assume no uncertainty on the stock, as the inventory level is set by the manager. Here we also assume that the initial inventory level is known, as in Ben-Tal et al. (2004).
5 Numerical results
Ben-Tal et al. (2004) conduct two series of experiments based on the data given in Sect. 4.5. In the first series of experiments they modify the parameter \(\theta \) to analyze the influence of demand uncertainty on the total production cost. In the second series of experiments they change the information basis \(I_t\), the (exact) information that is used in the decision rule. Note that Ben-Tal et al. (2004) deal with the case where in period t all demand from the periods in the information set \(I_t\) is known exactly. For instance, if the information set is equal to \(I_t = \{1,\ldots ,t-1\}\), then in period t we can base our production decision rule on the exact values of the demand realizations in periods \(1,\ldots ,t-1\), and use no information on the demand in periods after \(t-1\). We extend these experiments to include inexact data in some periods to show the benefits of the ARCID model over the ARC model.
Just as in Ben-Tal et al. (2004), we test the management policies by simulating 100 demand trajectories, \(d = (d_1,\ldots ,d_T)\). For every simulation the demand trajectory is randomly generated with \(d_t\) uniformly distributed in \([(1-\theta )d_t^*,(1+\theta )d_t^*]\), where \(20\,\%\) (\(\theta = 0.2\)) is the chosen uncertainty level. The uncertainty level of the demand is set to \(20\,\%\) in all experiments, as this seems to be the most restrictive level of uncertainty and is the same level that has been used by Ben-Tal et al. (2004). For higher uncertainty levels like \(30\,\%\), even the model without uncertainty (P:Nominal) is no longer feasible for the maximal demand pattern with \(d_t = (1+\theta )d_t^*\), because of the bounds on production imposed by \(P_i(t)\) and \(Q_i\). In line with the experiments performed by Ben-Tal et al. (2004), we compute the average costs for our solutions by assuming a uniform distribution for the estimated demand. Ben-Tal et al. (2004) used 100 simulated demand trajectories to approximate the mean costs. However, since the costs are linear in the estimated demand parameter, the mean can be found by substituting the expected (nominal) demand in the objective function. All solutions are obtained with the commercial solver Gurobi (Gurobi Optimization 2015), with the models formulated in YALMIP (Löfberg 2004) in MATLAB.
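The sampling scheme and the linearity argument can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the seasonal nominal demand pattern, the horizon of 24 periods, the cost coefficients, the seed and the function names are hypothetical and are not taken from the experiments. The point is only that, for a cost that is affine in the demand, averaging the costs over trajectories gives the same number as evaluating the cost at the average trajectory.

```python
import math
import random


def nominal_demand(horizon):
    """Hypothetical seasonal nominal demand d_t^*, t = 1, ..., horizon."""
    return [1000.0 * (1.0 + 0.5 * math.sin(math.pi * (t - 1) / 12.0))
            for t in range(1, horizon + 1)]


def simulate_trajectories(nominal, theta, count, seed=0):
    """Draw d_t uniformly from [(1 - theta) d_t^*, (1 + theta) d_t^*]."""
    rng = random.Random(seed)
    return [[rng.uniform((1.0 - theta) * d, (1.0 + theta) * d) for d in nominal]
            for _ in range(count)]


def affine_cost(constant, coefficients, demand):
    """A cost that is affine in the demand trajectory."""
    return constant + sum(c * d for c, d in zip(coefficients, demand))


nominal = nominal_demand(24)
trajectories = simulate_trajectories(nominal, theta=0.2, count=100)
coefficients = [0.5 + 0.01 * t for t in range(24)]   # hypothetical cost weights

mean_of_costs = sum(affine_cost(10.0, coefficients, d)
                    for d in trajectories) / len(trajectories)
mean_trajectory = [sum(column) / len(trajectories)
                   for column in zip(*trajectories)]
cost_of_mean = affine_cost(10.0, coefficients, mean_trajectory)

print(mean_of_costs, cost_of_mean)
print(affine_cost(10.0, coefficients, nominal))   # cost at the expected demand
```

The first two printed values coincide up to rounding. As the number of trajectories grows, the sample mean trajectory approaches the nominal demand, so both approach the third value, which needs no simulation at all.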
5.1 Experiments with decision rules using inexact data on demand
Similar to Ben-Tal et al. (2004), we saved the demand trajectories to compute the so-called costs of the ideal setting, the utopian world where the entire demand trajectory is known beforehand. The ideal setting is used to benchmark the performance of the ARCID solution. In the ideal setting one sets the policy only for one sample demand realization, so the solution does not have to be feasible for all possible demand trajectories. Hence, the costs in the ideal setting are a lower bound on the costs for the ARCID solutions. For the ideal setting the worst case is the demand trajectory with the highest demand: \(d_t = (1+\theta )d_t^*\) for all t. The worst case costs in the ideal setting are easily computed and turn out to be 44,199. The mean costs in the ideal case are approximated by averaging the ideal costs for the 100 simulated demand trajectories and equal 33,729.
In our model, the demand from the past periods is not known exactly, but we assume that inexact estimates are available for some past and present periods. Several cases are investigated, for instance those where the delay for receiving the exact demand information is even more than 2 periods, i.e., the exact demand is known after 3, 4 or more periods. These cases are infeasible in the ARC model, see Ben-Tal et al. (2004).
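Table 2 The influence of the estimation errors on the mean costs and worst case costs (WC) in the ARCID model. The demand estimation errors \(\rho _{r,t}\) are given in %
| Case | \(\rho _{1,t},\ldots ,\rho _{t-9,t}\) | \(\rho _{t-8,t}\) | \(\rho _{t-7,t}\) | \(\rho _{t-6,t}\) | \(\rho _{t-5,t}\) | \(\rho _{t-4,t}\) | \(\rho _{t-3,t}\) | \(\rho _{t-2,t}\) | \(\rho _{t-1,t}\) | \(\rho _{t,t}\) | Mean costs | WC costs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 10 | 35,167 | 44,268 |
| 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20 | 35,077 | 44,273 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 20 | – | 35,740 | 44,582 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | – | – | 35,740 | 44,582 |
| 5 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 5 | 10 | – | 36,882 | 44,883 |
| 6 | 0 | 0 | 0 | 5 | 5 | 5 | 10 | 10 | 10 | – | 36,867 | 45,326 |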

For Cases 1 and 2 we assume that all demand from the past is known exactly. For the present period we have a good estimate of the demand that gives extra information compared to the information known at the start of the planning period (\(t=0\)).

Cases 3–6 assume no additional knowledge about the present. Furthermore, the exact demand from previous periods is received with a certain delay, but estimates of the demand are already available before this information is received.

Case 4 is equivalent to the uncertainty set from Ben-Tal et al. (2004) with exact revealed information and the information sets being \(\{1,\ldots ,t-2\}\).
At first sight, the mean costs in Table 2 show a strange pattern across the different cases. For instance, Case 5 produces higher mean costs than Case 6, although its estimation errors are much smaller. This phenomenon can be explained as follows. In the two-step approach, we first search for a solution with minimal worst case costs \(F^*\), and then we search among all solutions with worst case costs \(F^*\) for the solution that minimizes the costs for the nominal demand trajectory. Hence, the information in Case 5 is used to decrease the worst case costs, possibly at the cost of the average behavior.
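The two-step selection can be illustrated on a finite set of candidates. This is only a discrete analogue under stated assumptions: in the model the second step is an optimization over all solutions with worst case costs \(F^*\), whereas here the candidates, their labels and their cost pairs are hypothetical numbers chosen for illustration.

```python
def two_step_choice(candidates, tolerance=1e-9):
    """Pick minimal worst-case cost first, then minimal nominal cost.

    candidates maps a label to a (worst_case_cost, nominal_cost) pair.
    """
    # Step 1: the best achievable worst-case cost F*.
    best_worst_case = min(wc for wc, _ in candidates.values())
    # Step 2: among candidates attaining F*, minimize the nominal cost.
    tied = {name: costs for name, costs in candidates.items()
            if costs[0] <= best_worst_case + tolerance}
    return min(tied, key=lambda name: tied[name][1])


# Hypothetical candidate solutions: (worst-case cost, nominal cost).
candidates = {
    "A": (44883.0, 36900.0),
    "B": (44883.0, 36882.0),
    "C": (45000.0, 36000.0),
}
print(two_step_choice(candidates))
```

Candidate C has the lowest nominal cost but is never considered, because it does not attain the minimal worst case costs. This ordering of priorities is why a case with better information can end up with a lower worst case and yet a higher mean.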
5.2 Comparison with affinely adjustable robust model based on exact data
Table 3 Worst case costs of the AARC model and the ARCID model for each case
| Case | AARC worst case costs | ARCID worst case costs |
| --- | --- | --- |
| 1 | 44,273 | 44,268 |
| 2 | 44,273 | 44,273 |
| 3 | 44,582 | 44,582 |
| 4 | 44,582 | 44,582 |
| 5 | Infeasible | 44,883 |
| 6 | Infeasible | 45,326 |
Case 4 only deals with exact estimates. The ARCID and the AARC are equivalent in that case because there is no estimation uncertainty. In other situations, namely Cases 5 and 6, the ARCID uses the extra inexact data to produce feasible solutions whereas the AARC is infeasible.
Table 4 Percentage of simulated demand trajectories that violate the minimum required inventory level (\(V_{\text {min}}\)) and maximum allowed inventory level (\(V_{\text {max}}\)) when neglecting estimation errors
| Case | Trajectories violating \(V_{\text {min}}\) (%) | Trajectories violating \(V_{\text {max}}\) (%) |
| --- | --- | --- |
| 1 | 64 | 55 |
| 2 | 80 | 38 |
| 3 | 42 | 38 |
| 4 | 0 | 0 |
| 5 | 27 | 15 |
| 6 | 26 | 15 |
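Violation percentages of this kind can be computed from simulated inventory paths with a few lines of code. The sketch below assumes that a trajectory counts as violating a bound if the inventory level crosses it in at least one period; the function name and the four inventory paths are hypothetical.

```python
def violation_percentages(inventory_paths, v_min, v_max):
    """Share (in %) of paths that breach v_min and v_max at least once."""
    if not inventory_paths:
        raise ValueError("at least one inventory path is required")
    count = len(inventory_paths)
    below = sum(1 for path in inventory_paths if min(path) < v_min)
    above = sum(1 for path in inventory_paths if max(path) > v_max)
    return 100.0 * below / count, 100.0 * above / count


# Hypothetical simulated inventory levels for four demand trajectories.
paths = [
    [500, 520, 510],     # stays within the bounds
    [500, 480, 530],     # drops below v_min
    [500, 2100, 900],    # exceeds v_max
    [500, 450, 2050],    # violates both bounds
]
print(violation_percentages(paths, v_min=500, v_max=2000))
```

Because a single trajectory can violate both bounds in different periods, the two percentages need not sum to at most 100, as in Case 1 of the table.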
6 Conclusions
In this study we consider uncertain multistage inventory systems where the observed data on demand obtained in each period is inexact. We extend the adjustable robust counterpart (ARC) method for production-inventory problems to the ARCID model, in which the decision rules are based on inexact revealed data. Our numerical results demonstrate that ARCID outperforms ARC, which can only rely on exact revealed demand data. Two cases that are infeasible for the ARC model are feasible for the ARCID model. It is evident that neglecting the inexact nature of the revealed data may have severe consequences. For example, the inventory level dropped below the allowed minimum in up to \(80\,\%\) of the simulated demand trajectories.
The use of the ARCID method is thus well justified, in particular since the resulting optimization problems that need to be solved maintain a tractability status comparable to that of the ARC method. Furthermore, there exist several software packages, such as YALMIP (Löfberg 2012), ROME (Goh and Sim 2011) and AIMMS (AIMMS 4.19 2016), that can reformulate adjustable robust optimization problems and can be readily extended to the ARCID model. Finally, we emphasize that the ARCID model set up in this paper can also be applied to other ARC models where revealed data in each stage is inexact, in various areas of operations management such as facility location planning, flexible commitment models, capacity expansion planning, portfolio optimization and management of power systems.
References
AIMMS 4.19 (2016) AIMMS B.V., Haarlem, The Netherlands, software available at http://www.aimms.com/
Baron O, Milner J, Naseraldin H (2011) Facility location: a robust optimization approach. Prod Oper Manag 20(5):772–785
Ben-Tal A, Goryashko A, Guslitzer E, Nemirovski A (2004) Adjustable robust solutions of uncertain linear programs. Math Program 99(2):351–376
Ben-Tal A, Golany B, Nemirovski A, Vial JPh (2005) Retailer-supplier flexible commitments contracts: a robust optimization approach. Manuf Serv Oper Manag 7(3):248–271
Ben-Tal A, El Ghaoui L, Nemirovski A (2009) Robust optimization. Princeton series in applied mathematics. Princeton University Press, Princeton
Ben-Tal A, den Hertog D, Vial JPh (2015) Deriving robust counterparts of nonlinear uncertain inequalities. Math Program 149(1–2):265–299
Bertsimas D, Goyal V (2012) On the power and limitations of affine policies in two-stage adaptive optimization. Math Program 134(2):491–531
Bertsimas D, Brown DB, Caramanis C (2011a) Theory and applications of robust optimization. SIAM Rev 53(3):464–501
Bertsimas D, Iancu DA, Parrilo PA (2011b) A hierarchy of near-optimal policies for multistage adaptive optimization. Autom Control IEEE Trans 56(12):2809–2824
Bertsimas D, Gupta V, Kallus N (2013) Data-driven robust optimization. arXiv:1401.0212v1
Calafiore GC (2008) Multi-period portfolio optimization with linear control policies. Automatica 44(10):2463–2473
Calafiore GC (2009) An affine control method for optimal dynamic asset allocation with transaction costs. SIAM J Control Optim 48(4):2254–2274
DeHoratius N, Raman A (2008) Inventory record inaccuracy: an empirical analysis. Manage Sci 54(4):627–641
Goh J, Sim M (2011) Robust optimization made easy with rome. Oper Res 59(4):973–985
Guigues V, Sagastizábal C (2012) The value of rolling-horizon policies for risk-averse hydrothermal planning. Eur J Oper Res 217(1):129–140
Gurobi Optimization Inc (2015) Gurobi optimizer reference manual. http://www.gurobi.com
Guslitzer E (2002) Uncertainty-immunized solutions in linear programming. M.Sc. thesis, Technion-Israel Institute of Technology
Haug A, Zachariassen F, van Liempd D (2011) The costs of poor data quality. J Ind Eng Manag 4(2):168–193
Iancu DA, Trichakis N (2013) Pareto efficiency in robust optimization. Manage Sci 60(1):130–147
Iancu DA, Sharma M, Sviridenko M (2013) Supermodularity and affine policies in dynamic robust optimization. Oper Res 61(4):941–956
Kök AG, Shang KH (2007) Inspection and replenishment policies for systems with inventory record inaccuracy. Manuf Serv Oper Manag 9(2):185–205
Löfberg J (2004) Yalmip: a toolbox for modeling and optimization in MATLAB. In: Proceedings of the CACSD Conference. Taipei, Taiwan
Löfberg J (2012) Automatic robust convex programming. Optim Methods Softw 27(1):115–129
Ng TS, Sy C (2014) An affine adjustable robust model for generation and transmission network planning. Int J Electr Power Energy Syst 60:141–152
Ordóñez F, Zhao J (2007) Robust capacity expansion of network flows. Networks 50(2):136–145
Redman TC (1998) The impact of poor data quality on the typical enterprise. Commun ACM 41(2):79–82
Rocha P, Kuhn D (2012) Multistage stochastic portfolio optimisation in deregulated electricity markets using linear decision rules. Eur J Oper Res 216(2):397–408
Rockafellar RT (1970) Convex Analysis. Princeton University Press, Princeton
de Ruiter FJCT, Brekelmans RCM, den Hertog D (2016) The impact of the existence of multiple adjustable robust solutions. Mathematical Programming, pp 1–15. doi: 10.1007/s10107-016-0978-6 (advance online publication)
Soffer P (2010) Mirror, mirror on the wall, can I count on you at all? Exploring data inaccuracy in business processes. In: Enterprise, Business-Process and Information Systems Modeling. Springer, New York, pp 14–25
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.