1 Introduction and literature review

Real-life operation and optimization of production systems almost always have to deal with several uncertainties. Scheduling problems arising in the industry are no exception. Many papers in the literature discuss various approaches to tackle these uncertainties in both scheduling and other fields of process optimization.

The goal of this paper is twofold:

  • First, a generalized way of looking at the decision-making process is presented. Based on this view, a systematic approach can be derived to enumerate the possible problem classes with regard to the sets of inputs given and decisions to be made.

  • This approach is demonstrated on an illustrative scheduling problem class that aims to maximize profit. Some of the generated problem classes feature uncertainties, and possible techniques to address them are discussed for selected classes.

The paper is structured as follows: the rest of this section provides a literature review of batch process scheduling (Sect. 1.1), followed by a brief review of addressing uncertainties in the production industry (Sect. 1.2). The end of this section, Sect. 1.3, discusses advances at the intersection of the two fields above: scheduling under uncertainty.

Section 2 provides the definition of an illustrative scheduling problem class that will be used later for demonstration. The proposed abstraction for decision-making processes is presented in Sect. 3. Section 3.2 details the problem class enumeration procedure based on this abstraction, using the problem class from Sect. 2 as an illustrative example.

Selected problem classes (referred to as cases) are investigated in detail in Sect. 4. These cases are selected so as to highlight deterministic, reactive, and preventive techniques in scheduling. At the end of the paper, the conclusions are drawn.

1.1 Batch process scheduling

Scheduling is a frequently used decision-making process in many different kinds of manufacturing and service industries, where the allocation of resources to tasks over a given time period is determined with the goal of optimizing one or more objectives. Resources can take many different forms; tasks may be operations, activities, stages, or executions of different kinds of processes. Objectives can also take several shapes and forms, such as minimizing the completion time of a set of tasks. In industrial environments, the processes take place within a given framework, but due to various environmental effects and other factors, there is always the possibility of a certain degree of uncertainty.

Multi-purpose batch production plants are characterized by flexibility and variety in production, which can be exploited to meet customer demands and, through the variety of production paths, to reduce production costs. Due to their diversity, they can be categorized in several ways (Méndez et al. 2006). The problem definition consists of three types of information: the recipe of each product, the assignment of tasks to equipment, and the quantities to be produced of each product. The recipe contains the minimum information needed to manufacture the product, i.e., a unique description of the specific manufacturing requirements. Depending on user requirements, the following four levels are used in batch processes:

  • General recipe identifies the raw materials, their quantity and the required processing tasks.

  • Site recipe is a general recipe supplemented with site-specific information.

  • Master recipe contains information about the required equipment, the raw materials with the required quantities, and the production process.

  • Control recipe contains additional information about the production units. This recipe is derived from the master recipe.

Of the four types, the master recipe is applied during batch process scheduling, and it can be represented by a graph where nodes denote the production tasks and arcs denote the relationships between them (Sanmartí et al. 2002). Another important parameter of batch scheduling problems is the intermediate storage method. The intermediate storage options and their implementation depend on the situation, the production policy, and the product itself. If the products can be stored without restriction, the Unlimited Intermediate Storage (UIS) policy applies, whereas if storage is not an option or not possible, it is called the No Intermediate Storage (NIS) policy. If the intermediate storage capacity is finite, the policy is Finite Intermediate Storage (FIS) or Common Intermediate Storage (CIS). The strictest storage policy is Zero Wait (ZW), which does not allow intermediate storage at all, not even inside the processing units. These storage rules may also appear in combination, called the Mixed Intermediate Storage (MIS) policy (Hegyháti and Friedler 2010).

1.2 Uncertainty in process optimization

As the complexity of a process increases, so does the likelihood of uncertainty, which requires special attention. There are several techniques for dealing with uncertainty, but before applying any of them, the uncertainty must first be identified, which provides a preliminary determination and understanding of its hypothetical cause. Uncertainties can be classified into separate categories according to several aspects. The source of uncertainty can be a dynamic event, which falls into the following categories (Suresh and Chaudhuri 1993):

  • Workpiece related events: random job arrival, non-deterministic processing time

  • Machine related events: machine breakdowns, loading limits

  • Operation related events: operation delays, unstable production

  • Other events: operator absence, defective materials

Another view is that uncertainty is not simply a lack of knowledge. On the one hand, uncertainty can arise from inadequate information, which can be of the following types: inexactness, unreliability, and border with ignorance. On the other hand, there are situations where information is abundant, yet uncertainty remains (Van Asselt and Rotmans 2002). Gathering additional information can either decrease or further increase uncertainty: new knowledge on complex processes can put already established understanding into a new perspective, revealing additional uncertainties and showing that our knowledge is limited or that the process is more complex than previously thought (Sluijs 1997). Uncertainties resulting from such knowledge deficiencies can appear in a model in several ways. Several general locations can be identified where uncertainty manifests in the outcome (Walker et al. 2003):

  • In the problem-framing phase of modeling, the context is defined, which is particularly important in decision-support practice (Dunn 2018).

  • The model uncertainty can be divided into two parts: the model structure uncertainty and the model technical uncertainty, the latter being the uncertainty arising from the computer implementation of the model (Sluijs 1997).

  • It is sometimes worthwhile to classify the inputs of the model into controllable and uncontrollable categories, depending on whether it is possible to influence the values of the input variables.

  • The method used to calibrate the model parameters is associated with the uncertainty of the parameters (Harremoës and Madsen 1999).

  • The output of the model may include uncertainty, referred to as model outcome uncertainty, which appears to the decision maker as the accumulated uncertainty (Bankes 1993).

Additional uncertainty factors must also be taken into account, but the main ones can be summarized based on the aforementioned aspects (Bakon et al. 2022). The identification of uncertainty is crucial for constructing an appropriate model, as ignoring it can lead to misleading or poor results, the consequences of which are only revealed after the decisions have been made.

1.3 Uncertainty in batch process scheduling

To deal with uncertainty, it is necessary to distinguish between static and dynamic scheduling. In the case of a static schedule, all data are known in advance and the schedule is fixed. Uncertainty can arise from several factors, as mentioned earlier; addressing it requires dynamic scheduling, where the schedule remains modifiable. The main approaches in the literature are the following:

  • Proactive approach Uncertainty is addressed in advance through modeling solutions or by optimizing performance under different scenarios (Liu et al. 2007; O’Donovan et al. 1999).

  • Reactive scheduling A preliminary schedule is required as a starting point, which is modified or re-optimized when the uncertain parameters are realized or an unexpected event occurs (Mihoubi et al. 2021).

  • Stochastic scheduling If information about the uncertainty is available, stochastic variables with a probabilistic description can be used when implementing a deterministic model (Ma et al. 2021).

  • Fuzzy programming When historical data are not available or the probability distribution is unknown, fuzzy programming can handle the uncertainty: instead of stochastic variables, a fuzzy representation is used (Balasubramanian and Grossmann 2003; Prade 1979).

  • Robust scheduling During robust scheduling, the impact of disturbances on performance is minimized by incorporating the uncertainties into the model describing the scheduling task. This makes the schedule less sensitive to various disturbances. Two fundamental aspects of the optimization are solution robustness and quality robustness (Herroelen and Leus 2005).

In the case of a production system, the rework process is important if several aspects support the need to achieve better results. For semi-finished and finished products, when the objective is to minimize inventory cost, determining the optimal batch size is of paramount importance (Sarker et al. 2008). If inventory is depleted at certain intervals in the production stage, an approach providing a solution for equal- and unequal-sized batch deliveries (Glock 2010) can result in lower total costs. In the case of a two-stage production system, supplementing the solution with a variable that can properly measure the production rate may be a good way to reduce excess inventory (Glock 2011). There are several other solutions in the literature for the batch-sizing problem, such as a holistic scheduling algorithm (Bicheno et al. 2001), or determining the optimal batch sizes for a single-stage system with defective items (Jamal et al. 2004) and for multi-stage manufacturing systems (Sarker et al. 2008). Investigations of Just-In-Time implementations, where batch-sizing decisions impact different demand patterns, achieved adequate results from both operational and financial points of view (McKenzie and Jayanthi 2007). Similarly, if the Just-In-Time scheduling problem is approached from a general due-date perspective, a solution can be achieved by treating it as an integrated batch-sizing problem (Hazır and Kedad-Sidhoum 2014). Distributionally robust optimization based models can provide sufficient flexibility for random demands to adjust resource actions (batch sizes) in multi-stage decision making (Shang and You 2018). Adjusting batch sizes to dynamic events entails frequent adjustments of the job sequence, which can be avoided with a buffering mechanism (Qin et al. 2018). In the chemical, petrochemical, and pharmaceutical industries, risk-mitigation approaches can explicitly take parameter uncertainty into account within a mathematical framework (Verderame et al. 2010). Demand uncertainty can be addressed with stochastic models, which allow correlated uncertain product demands with the objective of profit maximization (Petkov and Maranas 1997). In the short-term scheduling of multiproduct and multipurpose batch plants, the schedule obtained from a multiperiod formulation is considerably more robust than the one determined based on the vertices of the uncertainty range considered as scenarios (Vin and Ierapetritou 2001). For due-date changes, adopting a discrete-time simulation method for minimizing responsiveness and tardiness is a viable solution for precast concrete structures (Kim et al. 2020). Regarding prefabricated construction productivity and the on-time delivery of precast components, a real case study can be conducted to test the validity of a two-level rescheduling model and achieve significant cost savings (Wang and Hu 2018). The applicability of two-stage adaptive robust optimization approaches is demonstrated on two case studies by introducing uncertain parameters (processing time, order demand) into a single-stage deterministic model and reformulating it into a two-stage batch scheduling model (Shi and You 2016). Polynomial-time algorithms can be proposed for single-machine scheduling problems with serial and parallel batches under uncertain processing times using worst-case scenario analysis (Wu et al. 2023).
For a single machine with availability constraints, where the machine suffers from unexpected breakdowns, a genetic algorithm integrating run-based preventive maintenance into the production scheduling model can optimize system robustness and stability, as supported by experimental results (Lu et al. 2015). Demand uncertainty may be represented by scenario trees in a multi-stage stochastic programming model with lot sizing (a batch size for each product) and overall system costs (production cost, setup cost, inventory cost, backlog cost) (Hu and Hu 2018). The results of a case study with a two-stage stochastic programming model for a manufacturing company producing braking equipment in the automotive industry showed that the production quantities are more sensitive to the uncertainty than the production sequence (Hu and Hu 2016). A combination of mixed-integer linear programming and an agent-based reactive scheduling method can be used to minimize the total cost of inventory holding and production setup when processing times are uncertain (Chu et al. 2015). Rescheduling is traditionally viewed as the approach when uncertainty is present, but event-triggered rescheduling has some shortcomings (constantly incoming new information), which can be addressed by approaching rescheduling as an online problem (Gupta et al. 2016).

2 Illustrative scheduling problem class

The problem class selected for illustration is the mid-term scheduling of a production plant with an economic objective. The goal is to find a production schedule that fits into the predefined time horizon and maximizes the expected profit that derives from the following components:

  • Income by selling the produced products

  • Penalties if market demand is not met or is exceeded

Several products can be produced via precedential recipes using the available units. For the sake of simplicity, processing is the only step with a time requirement; cleaning, transportation, etc. are considered instantaneous.

The core sets and parameters of the problem can be summarized as:

  • U set of available processing units

  • P set of possible products

  • \(b_p\) maximal batch size for \(p\in P\) \(\left[ t \right]\)

  • \(d_p\) market demand for \(p\in P\) \(\left[ t \right]\)

  • \(s_p\) selling price for \(p\in P\) \(\left[ cu/t \right]\)

  • \(o_p\) overproduction penalty for \(p\in P\) \(\left[ cu/t \right]\)

  • \(u_p\) underproduction penalty for \(p\in P\) \(\left[ cu/t \right]\)

  • \(I_p\) set of tasks to produce \(p\in P\), and let \(I=\cup _{p\in P} I_p\)

  • \(I^-_i\) \(\subset I\) set of predecessor tasks for \(i \in I\)

  • \(U_i\) \(\subseteq U\) set of units capable to carry out \(i\in I\)

  • \(pt_{i,u}\) \(\ge 0\) processing time of \(i\in I\) in \(u\in U_i\) \(\left[ h \right]\)

  • \(h\) \(\ge 0\) time horizon \(\left[ h \right]\)

To help understand the notation, an instance of this problem class is presented. Figure 1 shows the recipe of this instance, which contains 3 products (\(P=\{A,B,C\}\)). Product A has a sequential production recipe with 3 production steps (\(I_A=\{i1,i2,i3\}\)), product B also has a sequential production recipe with 2 production steps (\(I_B=\{i4,i5\}\)), and the recipe of product C has 4 tasks (\(I_C=\{i6,i7,i8,i9\}\)) and is not sequential. The recipe contains 4 processing units (\(U=\{u1,u2,u3,u4\}\)), which can be used for production. Tasks i1 and i6 can be performed by two processing units (u1, u2), where the processing time of task i1 is 4 h and the processing time of i6 is 8 or 9 h using u1 or u2, respectively. Task i6 generates 2 intermediates, and task i9 has 2 prerequisite tasks (i7, i8).

Figure 1 does not contain all the information of the system, such as the cost parameters. The parameters of the products and tasks are summarized in Tables 1 and 2, respectively, and the time horizon (h) is 55 h.
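To make the notation concrete, the structural data of this instance can be written down as plain data structures. The following Python sketch is illustrative only: it records the values stated above, while the remaining parameters (demands, prices, penalties, and the other processing times) come from Tables 1 and 2 and are omitted here; the predecessor sets reflect our reading of Fig. 1.

```python
# Partial encoding of the example instance of Fig. 1; only the values
# stated in the text are filled in.
U = {"u1", "u2", "u3", "u4"}                 # processing units
P = {"A", "B", "C"}                          # products
I_p = {"A": ["i1", "i2", "i3"],              # sequential recipe
       "B": ["i4", "i5"],                    # sequential recipe
       "C": ["i6", "i7", "i8", "i9"]}        # non-sequential recipe
pred = {"i2": {"i1"}, "i3": {"i2"},          # I^-_i, read from Fig. 1
        "i5": {"i4"},
        "i7": {"i6"}, "i8": {"i6"},          # i6 yields two intermediates
        "i9": {"i7", "i8"}}
U_i = {"i1": {"u1", "u2"}, "i6": {"u1", "u2"}}  # other rows: see Table 2
pt = {("i1", "u1"): 4, ("i1", "u2"): 4,      # pt_{i,u} in hours
      ("i6", "u1"): 8, ("i6", "u2"): 9}
h = 55                                       # time horizon [h]
```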

Fig. 1 Example instance of the illustrative problem class

Table 1 Parameters of products for the example problem instance shown in Fig. 1. The units of measure are \(\left[ t \right]\), \(\left[ t \right]\), \(\left[ cu/t \right]\), \(\left[ cu/t \right]\) and \(\left[ cu/t \right]\) for parameters \(b_p\), \(d_p\), \(s_p\), \(o_p\) and \(u_p\), respectively
Table 2 Parameters of tasks for the example problem instance shown in Fig. 1

3 Proposed abstraction for decision processes

3.1 General concepts

From an abstract point of view, the difference between deterministic and stochastic problems comes down to when the values of the input parameters are revealed and when the decisions have to be made.

In the deterministic case, the values of all of the input parameters are known in advance, i.e., their values are revealed before any decision is made. In contrast, stochastic problems have some input data whose values are revealed later than others, and some decisions have to be made between these two points in time. The data that reveal themselves after some decisions have been made are often called uncertain.

If all of the decisions have to be made before the uncertain data reveal themselves, the approach is non-reactive. Robust and preventive approaches belong to this category, and the differentiation often lies in the selected objective function. However, if some decisions can be made or altered after an uncertain input is revealed, the approach is reactive.

Note that unexpected events, such as unit shutdowns, are within the scope of the description above. Such events could be considered new information; however, they can also be viewed as predefined parameters that reveal their values later. For example, possible shutdowns can be modeled as boolean parameters for each task, which turn out to be either 0 or 1 at the time of their execution.

More often than not, even if uncertain values are revealed later, some information is still available on their possible values in advance. This information can be based on predictive simulation, historical data, assessments from experts, etc. Its level of detail can vary a lot and can influence the approach to be used. In this paper, we focus on the following options:

  • Estimated value - Only a single value is forecast, which may be the mean or the median.

  • Interval - An interval is given for the possible values of the parameter, but no information is available about the probability distribution within that interval.

  • Probability distribution - A continuous or discrete probability distribution function is given for the uncertain parameters.

Naturally, not all of the above are applicable in all cases; e.g., an interval prediction for a binary variable is trivial. However, the level of detail of this information highly influences the possible objectives and approaches that can be developed. For example, without an interval, it is theoretically impossible to develop a robust approach with the worst case as an objective or bound. Similarly, expected values can only be incorporated into the approach if some distributions of the contributing values are available. From a practical point of view, the difficulty of assessing these data varies greatly by parameter, and it may be unreasonably expensive to do so.

3.2 Enumeration procedure on the illustrative problem class

From the theoretical point of view, if there are n inputs and m decisions, there are \((n+m)!\) different orders of them, each of which could be considered a different problem. From the planners’ perspective, however, some of these orders can be considered the same. A decision can only rely on data that have already been revealed; therefore, if two decisions (or two inputs) are neighbors in an ordering, swapping them does not alter the information available for any decision. This can reduce the number of cases by orders of magnitude.

Moreover, in practice, input parameters are often revealed in larger groups, and some decisions also have to be made simultaneously. For illustration, let us split both the decisions and the inputs of the considered illustrative problem class into two such groups each:

  • \(\mathcal{D}^S\): the set of decisions related to scheduling, such as choosing the number of batches to be produced, and the assignment, sequencing, and timing of each task.

  • \(\mathcal{D}^B\): the set of decisions determining the size of each batch.

  • \(\mathcal{I}^P\): all the process data including recipes, processing times, task-unit compatibilities, etc. (U, P, \(b_p\), \(I_p\), \(I^-_i\), \(U_i\), \(pt_{i,u}\), h).

  • \(\mathcal{I}^M\): market related parameters describing demand, sell price, and costs for over- and underproduction (\(s_p\), \(o_p\), \(u_p\), \(d_p\)).

The idea behind this separation is that while \(\mathcal{I}^M\) may change on a daily basis, \(\mathcal{I}^P\) is more stable. Similarly, a facility often prepares the schedule (\(\mathcal{D}^S\)) in advance for a week or so, and may only make small adjustments on a daily basis, which in this case is considered to be the sizing of the batches (\(\mathcal{D}^B\)).

Having these four groups, \(4!=24\) different orders can be considered, as illustrated in Fig. 2. Each order represents a different problem class, and these will be referred to as cases in the rest of the paper. For a more expressive visualization, decision items have an orange background color, while information items have a green one. The connected cases can be transformed into each other by swapping two neighboring items. The connections obtained by swapping non-neighboring items are omitted from the figure for clarity, as they would mean an additional 36 edges.

Fig. 2 Different cases for the illustrative example

As mentioned before, some of these cases can be considered identical, as the decision-items have the same information available for them. These groups of cases are indicated by gray areas in Fig. 3. For example, the following four orders all describe the deterministic case: \(\mathcal{I}^P \mathcal{I}^M \mathcal{D}^S \mathcal{D}^B\), \(\mathcal{I}^M \mathcal{I}^P \mathcal{D}^S \mathcal{D}^B\), \(\mathcal{I}^P \mathcal{I}^M \mathcal{D}^B \mathcal{D}^S\), \(\mathcal{I}^M \mathcal{I}^P \mathcal{D}^B \mathcal{D}^S\), which can be simplified just as \(\mathcal{I}\mathcal{D}\).
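This equivalence is easy to verify computationally. The following Python sketch (illustrative only; the group names are shorthand for \(\mathcal{I}^P\), \(\mathcal{I}^M\), \(\mathcal{D}^S\), \(\mathcal{D}^B\)) canonicalizes each of the 24 orders by recording, for every decision, the set of inputs revealed before it:

```python
from itertools import permutations

ITEMS = ("IP", "IM", "DS", "DB")   # the four groups from Sect. 3.2
INFO = {"IP", "IM"}                # the information-type items

def signature(order):
    # For each decision, record which inputs are revealed before it;
    # orders with identical signatures describe the same problem class.
    return frozenset(
        (item, frozenset(x for x in order[:k] if x in INFO))
        for k, item in enumerate(order) if item not in INFO)

cases = {signature(order) for order in permutations(ITEMS)}
print(len(cases))  # prints 14: the distinct cases among the 4! = 24 orders
```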

Fig. 3 Equivalent cases for the illustrative example

Removing these duplications, 14 cases remain, which are summarized in Table 3. Not all of the cases have equal practical significance:

  • Scheduling before any process data are available is meaningless, which eliminates cases 5, 7–11, 13.

  • In cases 5–7, 10, and 13, market parameters are known earlier than process parameters. While not completely impossible, these cases will be omitted from discussion as market environment data are usually much more volatile than process data.

  • The remaining ones are cases 12 and 14, where the batch-sizing decisions have to be made a priori. A valid scenario described by these cases is when the sizes of the units have to be decided in the design phase of a plant, and the scheduling decisions have to be made much later without the option of scaling the processes down. From the scheduling point of view, these are the cases where the batch sizes are fixed.

Table 3 Considered cases for the illustrative problem class from Sect. 2

Note that in some cases, the decision maker is in a better position than in others. Clearly, the deterministic case (\(\mathcal{I}\mathcal{D}\)) is the most advantageous among all of them, as all of the information is available before any decision has to be made. This relation can be defined formally and forms a partial ordering, with the deterministic case being the maximal element and case 8 the minimal one. In the current context, it is more expressive to talk about stronger and weaker cases instead of larger and smaller ones. Informally, a case is stronger than another if the decision maker always has at least as much information available for every decision as in the other case. (Formally, case X is stronger than case Y if, for each \(\mathcal{D}\)-type item, the set of \(\mathcal{I}\)-type items preceding it in case Y is a subset of those preceding it in case X.) Obviously, identical cases are equally strong and form equivalence classes.
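The strongness relation can be implemented directly on the signatures from the enumeration sketch above; the subset test below is a transcription of the formal definition, not a method from the original formulation:

```python
def at_least_as_strong(sig_x, sig_y):
    # Case X is at least as strong as case Y if, for every decision, the
    # inputs revealed before it in Y form a subset of those in X.
    x, y = dict(sig_x), dict(sig_y)
    return all(y[d] <= x[d] for d in y)   # frozenset subset test
```

Applying this test to all pairs of the 14 signatures should reproduce the partial ordering drawn in Figs. 4 and 5.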

In Fig. 4, this ordering is represented by the direction of the arcs. Transitive arcs are omitted for clarity.

Fig. 4 Strongness-directed network of merged cases for the illustrative example

Another layout of the same network, shown in Fig. 5, better indicates the strongness hierarchy; the cases selected for Sect. 4 are also marked.

Fig. 5 Strongness hierarchy of merged cases for the illustrative example

4 Evaluation of selected cases

The four selected cases are discussed in detail in this section. The main aspects of the investigation are the following:

  • Complexity - How many variables or constraints a proposed model may require compared to the baseline.

  • Reusability - In a practical setting, how much of the effort needs to be repeated if the problem is solved on a daily basis with changed data.

  • Generalizability - How easily the approach can be adapted, and how solvable it remains, if the profit function is generalized.

4.1 \(\mathcal{I}\mathcal{D}\) deterministic case

This is the strongest case that can serve as a baseline. All information is known in advance, thus the model is deterministic.

The problem can easily be formulated as a mixed-integer linear programming (MILP) problem. Depending on the type of model used (precedence-based, discrete-time, continuous-time, etc.), the number of continuous and binary variables, as well as the number of constraints, may vary considerably.

The profit function according to the base problem defined in Sect. 2 is:

$$\begin{aligned} \sum _{p \in P} \left( s_p \cdot \min (q_p,d_p) - u_p\cdot \max (0,d_p-q_p) - o_p\cdot \max (0,q_p-d_p) \right) \end{aligned}$$

where \(q_p\) indicates the quantity produced of product \(p\in P\). This is a piecewise linear function for each product with two pieces: above and below \(d_p\). For better understanding, Fig. 6 shows the profit function of a single product, and Fig. 7 shows the profit functions of three products.

Fig. 6 The profit function of a product

Fig. 7 The profit function of three products

This function can be modeled in different ways: a binary variable may be introduced for each product to select the segment, or two non-negative continuous variables can be introduced for the under- and overproduction, and the profit is reformulated as:

$$\begin{aligned} \sum _{p \in P} \left( s_p \cdot d_p - (u_p+ s_p) \cdot q^u_p - o_p \cdot q^o_p \right) \end{aligned}$$

where \(q^o_p\) and \(q^u_p\) are non-negative continuous variables that model \(\max (0,q_p-d_p)\) and \(\max (0,d_p-q_p)\), respectively. This also means that \(q_p\) must be replaced by \(d_p+q^o_p-q^u_p\) everywhere else in the model.
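As a quick numerical check (a minimal sketch, not part of the model itself), the two profit expressions can be compared for scalar single-product data:

```python
def profit_direct(q, d, s, u, o):
    # s*min(q,d) - u*max(0,d-q) - o*max(0,q-d), as in Sect. 4.1
    return s * min(q, d) - u * max(0.0, d - q) - o * max(0.0, q - d)

def profit_reformulated(q, d, s, u, o):
    qu = max(0.0, d - q)   # q^u_p: underproduction
    qo = max(0.0, q - d)   # q^o_p: overproduction
    return s * d - (u + s) * qu - o * qo

# Both pieces of the piecewise linear function agree, e.g.:
assert profit_direct(8, 10, 5, 2, 1) == profit_reformulated(8, 10, 5, 2, 1)
assert profit_direct(12, 10, 5, 2, 1) == profit_reformulated(12, 10, 5, 2, 1)
```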

It is also important to note that the evaluation of the profit (and thus \(o_p\), \(u_p\) and \(s_p\)) only appears in the objective function and has no influence on the other parts of the model. This has two beneficial consequences.

If the \(\mathcal{I}^M\) parameters change for the next day, the previously acquired solution will still be feasible if \(q^o_p\) and \(q^u_p\) are shifted with the change in demand. This means that with this simple preprocessing step, a feasible solution for the MILP model is available, providing an initial lower bound. Since this solution is primal feasible, the bound can be improved by running the second phase of the simplex algorithm, allowing only the mass-related continuous variables to change while keeping the discrete scheduling variables and the continuous timing variables fixed.
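A sketch of this preprocessing step is given below; the helper is hypothetical and only shifts the two slack variables to the newly revealed demands, while all scheduling decisions stay fixed:

```python
def shift_slacks(q, d_new):
    # q[p]: produced quantities from the previous optimal solution;
    # d_new[p]: the demands revealed for the next day.
    qu = {p: max(0.0, d_new[p] - q[p]) for p in q}  # underproduction q^u_p
    qo = {p: max(0.0, q[p] - d_new[p]) for p in q}  # overproduction  q^o_p
    return qu, qo
```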

The other benefit is that if a more complex profit evaluation is needed to model the real-life situation properly, it will still only appear in the objective function. Thus, for example, if this function is quadratic in nature, the problem can still be formulated as a Mixed-Integer Quadratic Programming (MIQP) problem (Bonami et al. 2018; Mencarelli and D’Ambrosio 2019). Modern solvers such as CPLEX or Gurobi can solve MIQP problems with some solver-specific restrictions. In the case of CPLEX, for example, the objective function may contain quadratic terms if it is convex, or if the quadratic terms are products of binary variables only (in which case the objective function is not necessarily convex). Moreover, the constraints must be inequalities, and a constraint containing a quadratic term must either be representable as a second-order cone program (SOCP) or involve only multiplications of binary variables in its quadratic term. In the case of multiobjective problems, formulating the problem as an MILP or MINLP, Pareto-optimal solutions can be obtained with the CPLEX or BARON solvers (Capón-García et al. 2011).

4.2 \(\mathcal{I}^P \mathcal{D}\mathcal{I}^M\) market-sensitive scheduling and batch-sizing

In this case, the scheduling and sizing decisions have to be made before the market data reveal themselves. Unlike in the deterministic case, it becomes important what kind of information is available about the uncertain values.

4.2.1 Expected demand

The simplest option is when a single value is estimated, and the goal of the optimization is to maximize the profit for that scenario. From the mathematical point of view, this situation is equivalent to the deterministic case.

4.2.2 Demand interval

A slightly more detailed option is when an interval \([lb_p,ub_p]\) is available for each uncertain parameter. Then, for each production plan, a worst-case and a best-case scenario can be evaluated. If an estimated value within the interval is also available, an “expected case” can be evaluated as well. These values give rise to several optimization objectives, for example:

  • Maximizing the expected or best case while enforcing a lower bound on the worst case

  • Maximizing worst-case

Maximizing the worst-case scenario is simple, as it occurs at one of the endpoints of the given interval, whether \(q_p\) is inside it or not. Thus, for each product, one can add the profit expression from the deterministic case once with \(d_p=lb_p\) and once with \(d_p=ub_p\). Then, for each product, a new profit variable is introduced as the minimum of these two cases. These continuous variables are then summed in the maximization objective.
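A minimal single-product sketch of this evaluation, reusing profit_direct from the sketch in Sect. 4.1:

```python
def worst_case_profit(q, lb, ub, s, u, o):
    # The profit is concave piecewise linear in the demand, so the worst
    # case over d in [lb, ub] is attained at one of the endpoints.
    return min(profit_direct(q, lb, s, u, o),
               profit_direct(q, ub, s, u, o))
```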

Implementing a lower bound on the worst case can be done in a similar fashion; the only difference is that the aforementioned sum appears in a constraint with this lower bound. Maximizing the expected case is the same as in the deterministic case. Maximizing the best case is similar, with the difference that if \(q_p\) is within the interval, then \(d_p\) can be equal to it. If \(q_p < lb_p\), the profit should be calculated as if \(d_p=lb_p\), and the situation is similar if \(q_p>ub_p\). This can be done easily if \(q^o_p\) models \(\max (0,q_p-ub_p)\) (instead of \(\max (0,q_p-d_p)\)) and \(q^u_p\) does the same for \(\max (0,lb_p-q_p)\). Note that this modeling trick no longer allows \(q_p\) to be substituted by \(d_p+q^o_p-q^u_p\) in the other parts of the model. One should add a new bounded variable \(q'_p \in [0,ub_p-lb_p]\), and use \(lb_p+q^o_p-q^u_p+q'_p\) instead.
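The best case corresponds to clamping the produced quantity into the demand interval; again a single-product sketch reusing profit_direct from Sect. 4.1:

```python
def best_case_profit(q, lb, ub, s, u, o):
    # Inside the interval, the demand may equal the production; outside
    # it, the nearest endpoint (lb or ub) is the most favorable demand.
    return profit_direct(q, min(max(q, lb), ub), s, u, o)
```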

4.2.3 Discrete demand distribution

In some cases, the probability distribution of the uncertain values can be estimated with a (finite) discrete distribution function. For the sake of simplicity, let us assume that there is a finite set of scenarios, S, and that the expected values of all of the uncertain parameters are given for each scenario. In a more complex setting, some sets of uncertain parameters would have their own independent scenario sets; however, independent scenario sets \(S_1, S_2, \dots\) simply translate to \(|S_1|\cdot |S_2|\cdot \dots\) global scenarios. For each \(s\in S\), there is an assigned probability \(P(s)\) such that \(\sum _{s\in S}P(s)=1\).

Evaluating the profit requires separate \(q^o_{p,s}\) and \(q^u_{p,s}\) variables for each scenario \(s\in S\), modeling \(\max (0,q_p-d_{p,s})\) and \(\max (0,d_{p,s}-q_p)\), respectively, where \(d_{p,s}\) is the expected demand in scenario s. The objective function is simply the probability-weighted sum of the original objective over the scenarios:

$$\begin{aligned} \sum _{s\in S} P(s) \cdot \sum _{p \in P} \left( s_{p,s} \cdot d_{p,s} - (u_{p,s}+ s_{p,s}) \cdot q^u_{p,s} - o_{p,s} \cdot q^o_{p,s} \right) \end{aligned}$$

where \(s_{p,s}\), \(u_{p,s}\), and \(o_{p,s}\) are the price and the under- and overproduction penalties for product \(p\in P\) in scenario \(s\in S\). Figure 8 shows the aggregate profit function of a product in the case of three scenarios with probabilities 0.4, 0.5 and 0.1.
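For a single product, the aggregate (expected) profit of Fig. 8 is simply the probability-weighted sum of the per-scenario profit functions; a sketch reusing profit_direct from Sect. 4.1:

```python
def expected_profit(q, scenarios):
    # scenarios: iterable of (P_s, d_ps, s_ps, u_ps, o_ps) tuples for one
    # product, with the probabilities P_s summing to 1.
    return sum(P_s * profit_direct(q, d, s, u, o)
               for P_s, d, s, u, o in scenarios)
```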

Fig. 8 The aggregate profit function of three scenarios of a product (product A)

Thus, the model requires \(2\cdot (|S|-1)\) additional continuous variables, and the constraints for setting them. Note, however, that these variables are strongly bound together, as for all \(s \in S\), \(d_{p,s}+q^o_{p,s}-q^u_{p,s}\) must represent the same value, \(q_p\), which is independent of s.

Incorporating continuous distribution functions into an MILP model is impossible except for trivial cases. Regardless of the level of detail of the information on the market data, if it changes, re-running the model is as time-consuming as in the deterministic case. Similarly, considering a more complex cost function is possible within limits similar to those of the deterministic case.

Note that the ideas above are general, not specific to the illustrative example. If there are uncertain parameters and variables that depend on them, the scenario-based approach can be used in the same way, even if those variables are binary decision variables.

A scenario-based approach is applied with an uncertain number of jobs, using MILP and a branch-and-price algorithm to minimize the expected costs (Wullink et al. 2004). Similarly, uncertain time requirements of elective operations are considered in theater scheduling decisions with the objective of maximizing the expected profit (Freeman et al. 2016). A data-driven uncertainty set with a probability distribution for the market demand is used to maximize the net present value in each time period in Ning and You (2017).

4.3 \(\mathcal{I}^P \mathcal{D}^S \mathcal{I}^M \mathcal{D}^B\) market-reactive batch-sizing

This case represents a middle point between the previously discussed cases, as clearly visible in Fig. 5. Compared to the deterministic case in Sect. 4.1, it is weaker, as the scheduling decisions have to be made before the market information is available. On the other hand, it is a stronger case than the market-sensitive one from Sect. 4.2, as the decision maker has the ability to react to the market information with the decisions about batch sizing.

If the possible outcomes of \(\mathcal{I}^M\) are finite, a two-stage model can easily be formulated, for example, with a finite set of scenarios, S, as discussed in the previous section. In such a situation, for each variable x that either represents a decision from \(\mathcal{D}^B\) or a value calculated from those decisions, a set of variables \(x_s\), \(s \in S\), has to be introduced. Obviously, \(q^u_p\) and \(q^o_p\) are such derived variables, as are any variables they derive from, possibly including the independent sizing variables of the batches. Thus, instead of \(q^o_p\) and \(q^u_p\), several (\(2\cdot |S|\), to be precise) variables are introduced, denoted by \(q^o_{p,s}\) and \(q^u_{p,s}\), respectively, for \(s \in S\). This is similar to the expected-profit maximization discussed in the previous section. However, a key difference is that these variables are more independent, as \(d_{p,s}+q^o_{p,s}-q^u_{p,s}\) (representing \(q_{p,s}\)) can be different for each scenario.

Any constraint that involves any of these variables has to be added multiple times, once for each \(s \in S\), as well. For example, if objective variables \(w_s\) are introduced, the following constraints may be added:

$$\begin{aligned} w_s = \sum _{p \in P} \left( s_{p,s} \cdot d_{p,s} - (u_{p,s}+ s_{p,s}) \cdot q^u_{p,s} - o_{p,s} \cdot q^o_{p,s} \right) \qquad \forall s\in S \end{aligned}$$

Having these variables, the expected profit can simply be expressed as \(\sum _{s \in S}P(s)\cdot w_s\). Imposing a lower bound on the worst-case situation can be done by setting this lower bound for all \(w_s\) variables. If the worst case is to be maximized, a new variable w may be introduced with \(w \le w_s\) for all \(s\in S\), and then this variable is maximized.
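To make the structure of the second stage tangible, the following single-product sketch formulates the scenario-indexed variables and the expected-profit objective with the open-source PuLP modeling library; the data values are hypothetical, and the first-stage scheduling decisions are represented only by a fixed batch count n:

```python
import pulp

# Hypothetical data: one product, n batches fixed by the schedule,
# maximal batch size b, and three demand scenarios.
n, b = 3, 10.0
scenarios = [  # (probability, demand, price, under-penalty, over-penalty)
    (0.4, 18.0, 5.0, 2.0, 1.0),
    (0.5, 25.0, 5.0, 2.0, 1.0),
    (0.1, 32.0, 5.0, 2.0, 1.0),
]

m = pulp.LpProblem("market_reactive_batch_sizing", pulp.LpMaximize)
terms = []
for k, (P_s, d, s, u, o) in enumerate(scenarios):
    q = pulp.LpVariable(f"q_{k}", lowBound=0, upBound=n * b)  # produced mass
    qu = pulp.LpVariable(f"qu_{k}", lowBound=0)  # underproduction q^u_s
    qo = pulp.LpVariable(f"qo_{k}", lowBound=0)  # overproduction  q^o_s
    m += q == d + qo - qu                        # link slacks to demand
    terms.append(P_s * (s * d - (u + s) * qu - o * qo))
m += pulp.lpSum(terms)                           # expected profit objective
m.solve()
print(pulp.value(m.objective))
```

At the optimum, each scenario independently sets its production to \(\min (d_s, n\cdot b)\), which illustrates the extra freedom of the reactive case compared to the single shared \(q_p\) of the market-sensitive model.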

Note that this approach can significantly increase the size of the model and, thus, its computational requirements. Let us say that \(\texttt{v}^B\) variables represent or are affected by decisions from \(\mathcal{D}^B\), and \(\bar{\texttt{v}}^{B}\) variables are not. Similarly, let \(\texttt{c}^B\) denote the number of constraints that involve any of those \(\texttt{v}^B\) variables, and \(\bar{\texttt{c}}^{B}\) the number of those that rely only on the unaffected variables.

In the market-sensitive case from the previous section, the model would have \(\texttt{v}^B+\bar{\texttt{v}}^{B}\) variables and \(\texttt{c}^B+\bar{\texttt{c}}^{B}\) constraints. In the scenario-based market-reactive case, the model has \(|S|\cdot \texttt{v}^B+\bar{\texttt{v}}^{B}\) variables and \(|S|\cdot \texttt{c}^B+\bar{\texttt{c}}^{B}\) constraints. Clearly, increasing the number of scenarios increases the complexity of the model as well. This presents a practical trade-off between keeping the model solvable and having a more detailed scenario distribution that models the uncertain parameters better.

Naturally, the number and type of those \(\texttt{v}^B\) variables and \(\texttt{c}^B\) constraints are significant. In this illustrative example, only a small portion of the variables is affected, and all of them are continuous. Similarly, only a few equality constraints are affected by those variables. Most of the variables and constraints deal with scheduling decisions, such as timing, allocation, precedence, etc., which are unaffected by \(\mathcal{D}^B\). Thus, this is a much more fortunate situation than \(\mathcal{I}^P \mathcal{D}^B \mathcal{I}^M \mathcal{D}^S\) would have been. There, the set of variables affected by scheduling decisions is a much larger subset of all of the variables, i.e., \(\texttt{v}^S\) is much larger than \(\texttt{v}^B\), and the same can be said about \(\texttt{c}^S\) in relation to \(\texttt{c}^B\). Moreover, plenty of these variables are binary; thus, not only is \(|S|\cdot \texttt{v}^S+\bar{\texttt{v}}^{S}\) significantly larger than \(|S|\cdot \texttt{v}^B+\bar{\texttt{v}}^{B}\), the same holds for the number of affected discrete variables as well. Obviously, solving the model in that case is a much more CPU-intensive job than in this one. In addition, sequencing constraints that utilize big-M relaxations are also affected by \(\mathcal{D}^S\) in the \(\mathcal{I}^P \mathcal{D}^B \mathcal{I}^M \mathcal{D}^S\) case. Thus, the two-stage model with \(|S|\cdot \texttt{c}^S+\bar{\texttt{c}}^{S}\) constraints would not only have more constraints, but would probably also provide worse bounds due to the increased number of big-M constraints.

Note that the idea above can be generalized to three-stage, four-stage, ..., n-stage cases as well, which do not appear in our illustrative example. Let us consider a \(\mathcal{I}^0\mathcal{D}^1\mathcal{I}^1\dots \mathcal{D}^n\mathcal{I}^n\) case, where \(\texttt{v}_k\) variables are affected by \(\mathcal{D}^k\) but unaffected by \(\mathcal{D}^{k+1},\dots ,\mathcal{D}^n\), and the outcome of each \(\mathcal{I}^l\) is approximated by \(\texttt{s}_l\) scenarios. For the sake of simplicity, it is assumed that these scenario sets are independent of each other. In this case, there is only one version of the \(\texttt{v}_1\) variables affected only by \(\mathcal{D}^1\), and \(\texttt{s}_1\) different versions of all of the variables affected only by \(\mathcal{D}^1\) and \(\mathcal{D}^2\), as in the case above. For the variables affected by \(\mathcal{D}^1\), \(\mathcal{D}^2\) and \(\mathcal{D}^3\), \(\texttt{s}_1 \cdot \texttt{s}_2\) different versions must be added, and so on. Finally, the \(\texttt{v}_n\) variables affected by \(\mathcal{D}^n\) have a total number of \(\texttt{s}_1 \cdot \texttt{s}_2 \cdot \dots \cdot \texttt{s}_{n-1}\) versions. Altogether, the total number of variables is:

$$\begin{aligned} \texttt{v}_1 + \texttt{s}_1 \cdot \left( \texttt{v}_2 + \texttt{s}_2 \cdot \left( \texttt{v}_3 + \texttt{s}_3 \cdot \left( \dots \texttt{v}_n \right) \dots \right) \right) = \sum _{k=1}^n \left( \texttt{v}_k \cdot \prod _{l=1}^{k-1} \texttt{s}_l \right) \end{aligned}$$

As an illustration, Table 4 shows the number of variables for several n values, considering \(\texttt{v}_k=1\) and a uniform number of scenarios for each stage, i.e., \(\texttt{s}_k\) is independent of k and simply denoted by \(\texttt{s}\). In this simplified case, the above formula reduces to \(\sum _{k=0}^{n-1} \texttt{s}^k\). Also, if the case were deterministic, the original number of variables would be n, which is the same as setting \(\texttt{s}=1\).
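The count can be reproduced with a few lines of Python (an illustrative transcription of the formula above):

```python
def total_variables(v, s):
    # v[k-1]: number of variables first affected by D^k (k = 1..n);
    # s[l-1]: number of scenarios for I^l; implements the sum above.
    total, mult = 0, 1
    for k, vk in enumerate(v):
        total += vk * mult
        if k < len(s):
            mult *= s[k]
    return total

# Uniform setting of Table 4: total_variables([1]*n, [s]*(n-1))
# equals sum(s**k for k in range(n)).
```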

Table 4 Number of variables in the n-stage model with \(\texttt{v}_k=1\) and uniform \(\texttt{s}\) values

Re-running these models can be computationally expensive; however, in this particular example, if the number of feasible solutions of the part of the model with the \(\bar{\texttt{v}}^{B}\) variables and \(\bar{\texttt{c}}^{B}\) constraints is manageable, a two-phase solution approach can be implemented. First, the feasible solutions of this part of the model are generated and saved. This is a time-consuming process, but it has to be done only when \(\mathcal{I}^P\) changes, which is rare. Then, each time the predictions on \(\mathcal{I}^M\) change, only the remaining part of the model needs to be solved for each of the saved schedules. Obviously, this is only reasonable if

  • The deterministic part is heavy, with many discrete variables and big-M constraints

  • This part of the model has a relatively small number of feasible solutions

  • The frequently changing part of the model is small and quick to solve

The advantage of this approach, besides the reduced CPU times, is that the models of the two parts are handled separately, which makes it possible to apply different solution approaches. For example, the first part can be solved by an MILP or SAT approach, while the profit optimization part can rely on something else. This allows the model to address more complex profit functions or continuous probability distributions for \(\mathcal{I}^M\). For example, if only the demand changes, the newsvendor model (Edgeworth 1888) can be applied to address a continuous distribution of \(d_p\) (Hegyhati et al. 2010). An additional benefit is that the saved schedules can be evaluated independently, which makes it easy to deploy the calculations on a cluster and carry out the computations in parallel. By doing so, more complex profit functions or a higher number of scenarios can be considered.
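A sketch of the resulting evaluation loop is given below; evaluate is a hypothetical callback that solves the small, market-dependent part for one cached schedule, and the independence of the evaluations is what allows the trivial parallelization:

```python
from concurrent.futures import ProcessPoolExecutor

def best_schedule(schedules, market_data, evaluate):
    # schedules: the cached feasible solutions of the I^P-dependent part;
    # evaluate(schedule, market_data): achievable profit of one schedule.
    with ProcessPoolExecutor() as pool:
        profits = list(pool.map(evaluate, schedules,
                                [market_data] * len(schedules)))
    best = max(range(len(schedules)), key=profits.__getitem__)
    return schedules[best], profits[best]
```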

4.4 \(\mathcal{I}^P \mathcal{D}^B \mathcal{I}^M \mathcal{D}^S\) market-reactive scheduling

At first glance, this case may seem unrealistic; however, it actually represents an integrated planning and scheduling problem where the equipment units must always be operated at 100% capacity, and their sizes have to be decided in the facility-design phase. In this case, sizing the equipment units also means sizing the batches, which is done once before building the factory, and thus before the daily market data are revealed. Consequently, batch-sizing decisions cannot react to the daily demand; scheduling decisions, however, can.

Similarly to the market-sensitive batch-sizing case, a two-stage programming model can easily be derived if the distribution of market data can be described in the form of discrete scenarios. However, as mentioned previously, the number of variables and constraints affected by \(\mathcal{D}^S\) is much larger than that of \(\mathcal{D}^B\). As a result, this model would become highly inefficient even with a small number of scenarios.

A more reasonable approach could be to generate all of the feasible schedules before making the decisions on batch sizing. Although this process would certainly take a considerable amount of time, that is most probably not a significant issue in the planning phase of a facility. Generating all of the feasible schedules via MILP models can be cumbersome, but other approaches, such as S-graph algorithms or constraint programming models, can do this in a straightforward way.

The end result of this procedure is the list of batch-number configurations (P-tuples of integers) that are feasible within the time horizon. Let C denote this set, and \(n_{p,c}\) the number of batches produced from \(p\in P\) in configuration \(c\in C\). \(\mathcal{D}^S\) is then reduced to a single selection from C, which can be modeled with |C| binary variables whose sum is 1.

In the two-stage model, this results in \(|C|\cdot |S|\) binary variables, which may be a huge number; however, the overall model is very simple:

$$\begin{aligned} \begin{array}{rcll} \sum \limits _{c\in C} x_{c,s} &=& 1 & \qquad \forall s \in S \\ d_{p,s} + q^o_{p,s} - q^u_{p,s} &=& \sum \limits _{c\in C} n_{p,c} \cdot b_p \cdot x_{c,s} & \qquad \forall p\in P,\; s\in S \end{array} \end{aligned}$$

where \(b_p\) is now a first-stage continuous variable, and the binary variable \(x_{c,s}\) denotes whether batch configuration c is selected in scenario s. As a result, the latter equality contains a bilinear term, which can be linearized. This preventive model provides the solution with the highest expected profit.

There are two more practical considerations to discuss for this case. When the planning phase is over, and thus the \(b_p\) values are fixed, the problem reduces to a deterministic profit-maximization scheduling problem, as all the remaining decisions can be made after all the input data are revealed. Thus, if the market data change for the same factory, the \(b_p\) values can be considered input parameters at that point. If the C set is still kept, finding the optimal profit reduces to finding the maximal element among the |C| profit options. The calculation of each profit option is a very simple formula, which can be evaluated in parallel if necessary.
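Once the \(b_p\) values are fixed, this evaluation is a straightforward maximization over the cached configurations; a minimal sketch:

```python
def best_configuration(configs, b, d, s, u, o):
    # configs: cached set C, each a dict mapping product p to n_pc;
    # with b[p] fixed, the produced quantity is q_p = n_pc * b[p].
    def profit(cfg):
        return sum(s[p] * min(cfg[p] * b[p], d[p])
                   - u[p] * max(0.0, d[p] - cfg[p] * b[p])
                   - o[p] * max(0.0, cfg[p] * b[p] - d[p])
                   for p in cfg)
    return max(configs, key=profit)
```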

Another practically interesting situation is when a similar plant (with the same set of units and products) is to be designed with different market estimates (scenarios and related parameters). In that case, the |C| configurations would still be valid and could be reused with the model above.

If other meta-information is also saved during the generation of the configurations, such as the minimal makespan of each configuration, the generation of the C set can be avoided and replaced by simple filtering if the new plant differs only by having a shorter time horizon or a subset of the products. Even if the time horizon gets a bit longer, most of the computation can be skipped by using the old C set as a cache.

5 Concluding remarks

Uncertainties are almost guaranteed in any application in the field of optimization. The contributions of this paper to the topic are twofold. In the first part, a generic characterization approach is presented for stochastic optimization problems. This method identifies a problem class by an alternating sequence of sets of information reveals and decisions to be made. The enumeration of such problem classes was illustrated by a stochastic scheduling example, where both the problem data and the set of decisions were split into two parts. Two relations among these classes were identified: an equivalence relation for sequences that define the same problem, and a partial ordering in which a "stronger" problem class is unequivocally more beneficial for the decision maker, with the deterministic class being the maximal element.

For the illustrative example, 14 such distinctive classes were derived from the 24 possible sequences and referred to as cases. The second part of the paper focused on the possible modeling and optimization techniques for four selected cases, where some or all of the decisions have to be made before the market data are revealed. These options were discussed in detail and investigated from the aspects of complexity, generalizability, and reusability. While some of the statements hold for other optimization problems, regardless of the nature of their information and decision sets, others exploit problem-specific features to propose more efficient approaches. This part of the discussion focuses not only on theoretical complexity, but also on the possible implementation and practical application of the mentioned approaches, such as building a cache of feasible schedules a priori. This second part leaves room for follow-up papers to conduct empirical tests on case studies and to compare the results based on quality, efficiency, and applicability.