1 Introduction

Hardly any product is as socially relevant as electrical energy. Many everyday objects work only with electricity, and electrification is steadily increasing. In the course of Industry 4.0, industrial companies rely on automated processes using robots, driverless transport systems or Auto ID technologies. In 2017, German industrial companies consumed 248.6 TWh of electrical energy, and overall, the industrial sector is responsible for almost half of the total national electricity consumption (Ziesing 2018). The resulting CO2 emissions amount to about one-fifth of total emissions (Dai et al. 2013).

The great importance of energy not only leads to a strong social interest in its efficient and sustainable use; companies are also increasingly pressured to reduce their energy costs in the face of global competition. Furthermore, they can benefit from an environmentally oriented image. Consequently, energy costs are now being taken into account in many approaches to production planning and control and thus also in operative planning in the form of energy efficient scheduling (EES). Together with approaches to reduce emissions or waste and to preserve resources, a completely new branch of green scheduling research has thus developed. A general overview of different approaches to considering energy consumption in scheduling is given by Biel and Glock (2016) or Gahm et al. (2016).

In this article we look at an extended flow shop problem, the hybrid flow shop (HFS) problem. In a flow shop problem, all jobs are processed in a multi-stage production in the same machine sequence, whereby only one machine is available at each stage. In contrast, in an HFS problem several machines are available at at least one stage. This makes it possible, for example, to overcome stage-related bottlenecks. The HFS problem can thus be seen as a generalization of a parallel machine problem and a flow shop problem, and it can be found in many industrial processes such as the electronics, paper, textile, pharmaceutical, and sheet metal industries (Yu et al. 2018).

Among the EES approaches, variable production speeds are probably one of the most promising methods to significantly reduce energy consumption (Mecrow and Jack 2008). This article deals intensively with this topic. High savings potential exists, for example, in pumps with high energy consumption in injection moulding plants or in the water supply of paper mills. If a pump or fan works at 50% of the maximum volume flow, only 25% of the maximum pressure must be generated and the required power drops to 12.5%. This is based on the affinity laws, which state that the power consumption of a pump is proportional to the cube of the speed. Thus, even small changes in the flow rate can result in major energy savings (Lönnberg 2007). Naturally, the savings depend on the respective technical device. Even with industrial motors, enormous energy savings can be achieved through speed reductions. Besides lowering energy consumption, energy costs can also be reduced through, inter alia, the exploitation of time-dependent energy prices, which are also considered in this work.
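
The affinity laws behind this example can be sketched in a few lines of Python; the function names and values are purely illustrative: relative pressure scales with the square and relative power with the cube of the relative flow rate.

```python
def pressure_fraction(flow_fraction):
    """Affinity law: pressure scales with the square of the relative flow."""
    return flow_fraction ** 2

def power_fraction(flow_fraction):
    """Affinity law: power scales with the cube of the relative flow."""
    return flow_fraction ** 3

# At 50% of the maximum volume flow only 25% of the maximum pressure
# is needed and the required power drops to 12.5%:
print(pressure_fraction(0.5), power_fraction(0.5))  # 0.25 0.125
```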

In addition to energy consumption and costs, other objectives are usually pursued. In recent years, several multi-criteria problems have been published in EES. Besides energy, utilization-oriented objectives such as makespan are commonly considered. In cases of strongly networked supply chains with just-in-time requirements, however, time criteria such as punctual delivery are playing an increasingly important role. Delayed production can lead to high contractual penalties and a loss of confidence. Surprisingly, tardiness is rarely taken into account in EES. For that reason, this paper examines total tardiness and time-dependent energy costs as two separate objective functions using the ideas of multi-criteria optimization. To the best of our knowledge, this setting has not been addressed in HFS scheduling so far.

The article is structured as follows. Section 2 describes the current state of research. Subsequently, the problem is defined in Sect. 3 and possible mathematical formulations are discussed. Approaches to multi-criteria decision-making are also explained there. A computational study follows in Sect. 4. Finally, a short conclusion is given.

2 Related literature

In the following, the current state of research in EES is described, including the observation that tardiness has hardly been taken into account so far. Due to the limited scope of the article, we limit ourselves to machine and production scheduling. However, it should be mentioned that the problem is basically similar to the multi-mode resource-constrained project scheduling problem (MMRCPSP). Mathematical problem formulations from this area can be found, for example, in Naber and Kolisch (2014), Besikci et al. (2015) or Wauters et al. (2016). To the best of our knowledge, no MMRCPSP model can be directly applied to the problem considered here. The concept of multiple modes refers to the possibility of executing activities in different execution modes, which allows the required time and resource consumption (e.g. energy) to be varied. In addition to execution modes and variable speeds, the term different production rates is sometimes used. Since the term “variable speeds” is often applied in the field of machine scheduling and particularly in EES, we use it primarily in this article.

2.1 Energy efficient scheduling

There are various options to reduce energy costs in scheduling, whereby the possibilities also depend on the respective production processes. Basically, either energy consumption is reduced directly through intelligent planning (Sect. 2.1.2) or pricing mechanisms are exploited to reduce expenses while energy consumption stays at the same level (Sect. 2.1.1). An overview is shown in Fig. 1. The concepts (a)–(f) are explained in detail in the following. The framed approaches in Fig. 1 will be taken into account in the considered problem. In addition to the approaches listed here, there are a few very specific contributions, which are not taken into account. For example, Nolde and Morari (2010) as well as Hait and Artigues (2011b) examine load tracking scheduling in a steel plant. Energy provider and consumer may agree upon a target load curve; the company has to pay for deviations from this curve (so-called tracking errors). Modos et al. (2017) also examine this problem.

Furthermore, this paper concentrates on electrical energy. The EES literature also contains contributions that deal with heat, cold, water or emissions. A good overview of all research trends in EES is given by Gahm et al. (2016). They also propose a classification framework. In this respect, this contribution can be classified by \({ eCS},\; { PS} | { TOU} | { FLX}\) indicating:

  • Energy coverage: external conversion system (eCS), production system (PS),

  • Energy supply: price driven demand response by time-of-use prices (TOU),

  • Energy demand: flexible (FLX) processing energy demand.

Fig. 1: Overview of different EES approaches

2.1.1 Utilisation of market mechanisms with constant consumption

While private consumers are bound to fixed tariffs, companies with an annual consumption of 10 MWh or more can negotiate special contracts with the respective energy supplier. Most of these bilateral contracts are not publicly available. Nevertheless, it is known that companies usually have to pay an electricity charge per kWh consumed and a so-called demand charge for the respective maximum peak power within the billing period (Bego et al. 2014).
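
As a rough sketch of this tariff structure, the following Python snippet computes a hypothetical bill; all numbers (consumption, peak, prices, expressed in cents to stay exact) are assumed for illustration only.

```python
def electricity_bill(consumption_kwh, peak_kw, energy_price_ct, demand_charge_ct):
    """Industrial tariff sketch (cf. Bego et al. 2014): an energy charge
    per kWh consumed plus a demand charge on the maximum peak power
    within the billing period.  Prices in cents to avoid rounding."""
    return consumption_kwh * energy_price_ct + peak_kw * demand_charge_ct

# Hypothetical period: 50 MWh consumed, 400 kW peak, 10 ct/kWh, and a
# demand charge 300 times the energy price (cf. Nghiem et al. 2011):
print(electricity_bill(50_000, 400, 10, 300 * 10))  # 1700000 cents, i.e. 17,000 EUR
```

The example makes the lever visible: although only the peak is charged at the high rate, the demand charge accounts for the bulk of the bill, which is why peak reduction (approach (a) below) can pay off.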


(a) Peak power reduction

To reduce costs due to peak power consumption, peaks must be reduced or entirely avoided. While the peak power fee normally covers a time period of at least several weeks, scheduling is primarily dedicated to shorter production periods. Therefore, the direct integration of the demand charge into a scheduling problem is not straightforward. However, since the demand charge per kW is 200–400 times higher than the electricity price per kWh (Nghiem et al. 2011), reducing the energy peak can be fairly lucrative. The peak power is often considered as a constraint (e.g. Bruzzone et al. 2012; Xu et al. 2014 or Schulz 2018) and set to a historical value which must not be exceeded. A parametric optimization is also possible by varying the respective upper bound. A direct minimization within the objective function is done, for example, by Nagasawa et al. (2015), who present a simulation approach for a flow shop problem with random processing times. Strong fluctuations in energy consumption do not only lead to high peak power costs; constant energy consumption is also important for internal power generators or converters. Rager et al. (2014) publish an approach which aims to minimize the sum of the squared deviations of each load from the expected average energy consumption in order to improve the performance of an energy conversion system.


(b) Time-of-use prices

Electricity charges may depend on the time the energy is used, which rewards postponing consumption to times of lower prices. These shifts are in contrast to the levelling induced by the demand charge. One of the first contributions dealing with time-of-use tariffs (TOU) in production scheduling was published by Nilsson and Söderström (1993). Castro et al. (2009) present a continuous-time scheduling formulation for an EES model considering two different TOU energy prices. As a result, the energy costs can be reduced by around 20% by moving consumption from on-peak to off-peak times. Che et al. (2017b) consider an unrelated parallel machine problem and suggest a constructive heuristic to minimize energy costs and makespan simultaneously.


(c) Real-time prices

While TOU tariffs usually have two (on-/off-peak) or three (on-/mid-/off-peak) different time-dependent price levels, Ding et al. (2016) analyse the influence of frequently fluctuating TOU prices in an unrelated parallel machine scheduling problem. Such problems with hourly fluctuating energy prices are often referred to as real-time pricing (RTP). Energy-intensive companies with an annual consumption of more than 100 MWh may purchase electricity directly from the stock exchange. This means they pay the RTP for quantities that are not hedged by long-term derivatives. These prices fluctuate at the European Energy Exchange (EEX) every 15 min, which increases the computational effort for optimization enormously. Mitra et al. (2012) describe mixed-integer problem (MIP) formulations for the production planning of a week with hourly fluctuating energy prices. In Küster et al. (2013), a complex production process with RTP is visualised as a bipartite graph. The authors then present a simulation and optimization approach. They explain that in times of lower electricity prices more renewable energies are fed into the grid, and thus not only are costs saved but the environment can also be protected.

2.1.2 Reduce energy consumption through intelligent planning

Reducing energy costs by making use of market mechanisms does not influence the energy consumption and can even have a negative impact on CO2 emissions and other environmental factors. Zhang et al. (2014) discuss the correlation between the utilization of TOU tariffs and CO2 emissions. They show that in times of low electricity prices, emissions are on average higher than in on-peak periods. Interestingly, this contradicts the statement of Küster et al. (2013). Overall, both points of view are understandable, and it depends largely on the energy market under consideration whether low prices are accompanied by lower emissions. For example, electricity prices are usually low at night. Since only few renewable energy sources can be used at night, this electricity is often generated by conventional power plants such as coal-fired plants. The extent to which renewable energies are freely traded on the market or are subsidised must also be taken into account. In any case, from an environmental point of view, it may be beneficial if scheduling also reduces energy consumption and considers environmental impacts as well. For this purpose, three different approaches are particularly considered in the literature, as can be seen in Fig. 1.


(d) Increase utilization of more efficient machines

In heterogeneous production environments with parallel machines which have different energy consumptions and processing times for the same task, consumption can be reduced through higher utilization of more efficient machines. Ji et al. (2013) consider a uniform parallel machine problem in which machines with higher resource consumption also work faster. The authors present an MIP as well as a particle swarm heuristic to optimize resource consumption for a given maximum makespan, whereby the term resource is not limited to energy. A particle swarm optimization is also used by Nilakantan et al. (2015) to minimize makespan and energy consumption in a robotic assembly line system with differently efficient robots. Schulz et al. (2019) propose an iterated local search algorithm to optimize three different objectives (makespan, energy costs, peak power) in a heterogeneous hybrid flow shop problem. In this work, however, we focus on identical parallel machines.


(e) Making use of different machine states

Various EES contributions take different machine states into account. The basic idea is to optimize idle and standby times as well as to make intelligent shut-down decisions. Liu et al. (2014) analyse a job shop problem with three different processing levels (idle, runtime, cutting), whereby the energy consumption is a deterministic value and can only be reduced by minimizing the non-processing time. Also in the HFS problem considered by Dai et al. (2013), the basic idea is to minimize the idle energy consumption. If it is possible to shut down machines, the amount of possible energy savings increases. This concept is examined, for example, by Mashaei and Lennartson (2013). They minimize energy consumption in a flow shop problem for a given cycle time by weighing between switching off and idling in times of no production. Thereby, switching on and off leads to higher energy consumption than short idle times, but is advantageous during longer periods of standstill. Li et al. (2018) assume that setup energy is required after idle times, and the objective is to minimize makespan and total energy consumption in an HFS. In Wu and Sun (2018), on/off decisions are not only used to reduce energy consumption in a flexible job shop problem, but the total number of machine on/off switches is minimized as a third objective besides makespan and total energy consumption.


(f) Variable production speed

Most EES articles consider discrete constant energy consumption depending on the machine state, which is only an approximation of actual energy requirements. Therefore, data from energy audits, which provide average values, are often used (e.g. Abdelaziz et al. 2011). Real production processes are subject to various factors and uncertainties (e.g. Le and Pang 2013). Only a few authors consider realistic models of energy consumption. For example, Yan et al. (2016) look at different cutting machines within an HFS and propose a multi-level optimization approach to minimize makespan and energy consumption while taking into account different cutting speeds. Thus, in addition to random influences such as machine ageing, environmental effects or material parameters, energy consumption can also be directly influenced by a variable production speed. That idea is used in different EES approaches to reduce energy costs, energy consumption or emissions and is also considered in this work.

For example, Fang and Lin (2013) propose an MIP formulation for flow shop problems. A further approach with processing-time-dependent energy consumption is published by Zanoni et al. (2014). They consider a two-machine problem with three storages to minimize the total costs of production, storage and energy. Liu et al. (2012) examine the interdependencies between process efficiency and energy consumption of an electroplating unit in a hoist scheduling problem. A stochastic problem with a non-linear energy cost function depending on a variable production quantity is formulated by Tang et al. (2012). In Hait and Artigues (2011a), the task duration depends on the power supplied to a furnace. A model and a constructive heuristic for a bi-objective two-stage flow shop with three different speed levels are described in Mansouri et al. (2016). Lei et al. (2017) examine a flexible job shop scheduling problem with variable discrete production speeds. To minimize total energy consumption and balance the workload, they propose a shuffled frog-leaping algorithm. In a later work, Lei et al. (2018) present a novel teaching-learning algorithm to minimize total energy consumption and total tardiness in a hybrid flow shop scheduling problem. That problem is basically similar to the one we are investigating here. However, no TOU prices are considered, so the decision problem is limited to semi-active schedules. In the following, a few more EES articles are described which consider tardiness as an objective.

2.2 Multi-criteria EES with total tardiness

EES approaches often consider further objectives besides energy demand or costs. However, this is usually limited to makespan. In our opinion, it is desirable to include both energy costs and tardiness in one approach, especially since this has hardly been done so far. To the best of our knowledge, this is the first time that both objectives are considered in an HFS. Besides HFS, there are a few publications that investigate total tardiness in multi-criteria EES.

Artigues et al. (2013) minimize energy and power overrun costs in a parallel machine problem and use maximum tardiness as a decision criterion for the scheduling in the first step of a two-phase solution approach. Liu et al. (2016) also consider a parallel machine problem. They propose a branch and bound algorithm to minimize the resource consumption with maximum tardiness as a constraint. In Liu et al. (2017), a fuzzy flow shop problem is described with tardiness costs and energy costs summarized in one objective function. A similar cost function is used by Le and Pang (2013), whereby uncertainties in energy consumption are taken into account. Tardiness is considered in the four publications mentioned, but does not appear as an independent objective in the objective function.

A genetic algorithm for the multi-objective job shop problem with minimization of energy consumption and total weighted tardiness is introduced by Zhang and Chiong (2016). The same objectives are considered by Wang et al. (2016) in a batch scheduling problem under energy consumption uncertainties. In Che et al. (2017a) energy consumption and maximum tardiness are minimized in a single machine problem with power down options. The last three approaches mentioned describe multi-criteria approaches with energy consumption and tardiness as objective functions. However, variable energy prices or discrete speeds are not taken into account.

3 Problem description

In this section we define the considered problem in detail. To analyse the interdependencies between total tardiness and energy costs, a multi-objective MIP formulation is developed. An overview of the notation with indices, parameters and variables is given in Table 1.

Table 1 Used notation for the MIP formulation in Sect. 3

3.1 Assumptions

We consider an HFS problem where n jobs go through m stages (with \(m\ge 2\)) following the same processing sequence (flow shop). Each stage k consists of identical parallel machines i \((i\in \{1,\ldots , \mu _k\})\). Each machine at a stage has the same technical characteristics, but its processing speed can be varied. A typical machine layout can be seen in Fig. 2.

Fig. 2: General machine layout in an HFS problem

For the practical implementation of variable motor speeds, electronic voltage converters are connected upstream of an electric motor. As a result, speed, torque, and the resulting power can be varied arbitrarily (Abdelaziz et al. 2011). Such technologies are already used in various areas such as building management (see e.g. Saidur 2009). Thus, an infinite number of speeds is theoretically possible. However, this cannot be formulated in an MIP, as the solution space would become infinitely large. Consequently, a finite and discrete set of speed levels \({\mathcal V}= \lbrace 0,\ldots ,o \rbrace\) is available for each machine in order to control the processing speed. The speed is determined individually for each job and remains constant during the processing of a job.

Since we assume discrete integer production times, reducing the speed by one step leads to one additional processing time unit. Thus, the variable l can simultaneously represent the additional processing time and the reduction in speed. To avoid ambiguity, l is referred to as the level of speed reduction in the following. A higher l increases the processing time but decreases energy consumption. The operation of task k of job j needs at least a baseline processing time \(p_{jk}\), which corresponds to maximum speed as well as maximum energy demand. To reduce the maximum speed by one step, l is set to 1, which increases the processing time by one unit. The resulting actual processing time is called \(P_{jk}\) and equals \(p_{jk} + l\), i.e. \(p_{jk} + 1\) with \(l=1\).

Through speed variations the energy consumption can be influenced. The energy cost \(EC_{jk}\) of job j at stage k is determined by the time-dependent energy prices \(ec^t\) and the speed-dependent consumption \(EP_{jk}\). Thereby, energy consumption \(EP_{jk}\) does not depend linearly on the speed reduction, but the interdependencies are much more complex. Often cubic relationships are assumed on the basis of the affinity laws which are already mentioned in the introduction. In our model the maximum energy consumption at highest speed is \(ed_{jk}\). Based on that value and the actual processing time \(P_{jk}\), the following relationship is assumed:

$$\begin{aligned} EP_{jk} = \left[ 1+0.6\cdot \left( \frac{P_{jk}}{p_{jk}} - 1 \right) ^2 - 1.4 \cdot \left( \frac{P_{jk}}{p_{jk}} - 1 \right) \right] \cdot ed_{jk} \cdot \frac{p_{jk}}{P_{jk}}\quad \forall \; j, k. \end{aligned}$$
(1)

This equation is based on the work of De Almeida et al. (2005) and represents an electric motor for a pump with a variable speed drive. A possible practical application could be an injection moulding machine. The minimum energy consumption is achieved at a ratio of \(P_{jk}\) to \(p_{jk}\) of 2.09, which means that a processing speed reduction of up to 52.15% may reduce total energy consumption. This value is in line with practical experience. Because of excessive wear and an inefficient working area, electric motors lose efficiency if driven below 50% of rated load (see e.g. Saidur 2009). For that reason, speed reductions are only permitted up to 50% in our test instances. Since Eq. (1) is quadratic, it is helpful to linearise the expression in the course of MIP modelling.
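
Equation (1) is easy to explore numerically. The following Python sketch (with assumed parameter values) evaluates it: at maximum speed (\(P_{jk}=p_{jk}\)) the full demand \(ed_{jk}\) is drawn, while a single level of speed reduction already lowers the per-period demand considerably.

```python
def power_demand(P, p, ed):
    """Speed-dependent per-period energy demand according to Eq. (1):
    P  actual processing time, p  baseline processing time,
    ed demand at maximum speed (De Almeida et al. 2005)."""
    r = P / p
    return (1 + 0.6 * (r - 1) ** 2 - 1.4 * (r - 1)) * ed * p / P

print(power_demand(4, 4, 10.0))            # 10.0 (full speed)
print(round(power_demand(5, 4, 10.0), 6))  # 5.5  (one level slower)
```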

Energy costs can be reduced on the one hand by reducing the production speed and on the other hand by shifting work to off-peak time periods with lower energy prices. If only energy costs were minimized, jobs would be processed at very low speeds and mainly times of low electricity prices would be used for production. This could increase cycle times enormously. We assume that each job j has a due date \(D_j\). If the completion time exceeds the due date, the difference is called tardiness \(T_j\) (see Eq. (11)). In order to obtain a time-efficient schedule despite the energy cost optimization, we minimize the total tardiness of all jobs as a second objective. On the basis of these assumptions, we consider the following two objectives (2) and (3).

$$\begin{aligned} {\textit{Minimize:}}\; &{\text {I. Total electricity costs:}}\; {\text {TEC}}= \sum _{k=1}^{m} \sum _{j=1}^{n} EC_{jk} \end{aligned}$$
(2)
$$\begin{aligned}&{\text {II. Total tardiness:}} \; TT= \sum _{j=1}^{n} T_j. \end{aligned}$$
(3)

Besides the mentioned properties, the following general assumptions are made for the HFS problem:

  • All jobs and machines are available at time zero (no release dates).

  • Each machine can process at most one job at a time.

  • Each job can be processed by at most one machine at a time.

  • Once a task has been started, no interruption is allowed.

  • There are infinite buffers between stages.

  • Set-up effort and transportation times are neglected.

3.2 Time-indexed model formulation

The described problem can be formulated as an MIP. Since we consider time-dependent energy costs, the model is set up using time-indexed variables. The planning horizon is subdivided into \(\tau\) time periods \(t (t\in \{1,\ldots , \tau \})\). We introduce two binary decision variables. The binary \(x_{jk}^{t}\) is equal to one if job j is processed in t at stage k. Similarly, the binary \(z_{jk}^{t}\) is one if job j starts at stage k in period t. Based on these two decision variables, the basic constraints can be formulated as follows.

$$\begin{aligned} \sum _{j=1}^{n} x_{jk}^{t} \le \mu _k\quad \forall \ k, t \end{aligned}$$
(4)
$$\begin{aligned} \sum _{t=1}^{\tau } x_{jk}^{t}= P_{jk}\quad \forall \ j, k \end{aligned}$$
(5)
$$\begin{aligned} \sum _{t=1}^{\tau } z_{jk}^t= 1\quad \forall \ j, k \end{aligned}$$
(6)
$$\begin{aligned} x_{jk}^{1}= z_{jk}^{1}\quad \forall \ j, k \end{aligned}$$
(7)
$$\begin{aligned} x_{jk}^{t} - x_{jk}^{t-1}\le z_{jk}^{t}\quad \forall \ j, k, t>1 \end{aligned}$$
(8)
$$\begin{aligned} s_{jk}= \sum _{t=1}^{\tau } \left( z_{jk}^t \cdot t\right)\quad \forall \ j, k \end{aligned}$$
(9)
$$\begin{aligned} c_{jk}= s_{jk} + P_{jk} - 1\quad \forall \ j, k \end{aligned}$$
(10)
$$\begin{aligned} T_{j}= \max \left\{ 0, c_{jm} - D_j\right\}\quad \forall \ j \end{aligned}$$
(11)
$$\begin{aligned} s_{jk} - c_{jk-1}\ge 1\quad \forall \ j, k>1. \end{aligned}$$
(12)

Constraint (4) ensures that the number of parallel processes at each stage complies with the number of parallel machines. Thus, no machine can be assigned multiple jobs at the same time. By introducing (5), each job is scheduled for the entire processing time needed. Furthermore, with (6) a job can start only once. Expressions (7) and (8) connect both binary variables. Constraints (6)–(8) operate together to ensure that tasks cannot be interrupted.

In addition, we introduce two further dependent decision variables by Eqs. (9) and (10). The integer \(s_{jk}\) is the start time period of task k of job j and the integer \(c_{jk}\) represents the completion time of job j at stage k. Both variables depend on \(z_{jk}^{t}\) as well as on the chosen speed and do not necessarily have to be introduced. However, they simplify the formulation of the model and accelerate the optimization with a solver. On the basis of \(c_{jk}\), it is then possible to calculate the tardiness \(T_{j}\) of job j with (11). Equation (12) ensures that a job can be processed at a stage only if the previous step is completed.
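
To illustrate how constraints (4)–(8) interact, the following Python sketch checks them for a toy single-stage instance; the data are hypothetical and 0-based lists stand in for the 1-based periods of the model.

```python
def satisfies_4_to_8(x, z, P, mu):
    """Check constraints (4)-(8) of the time-indexed model for one stage.
    x[j][t] = 1 if job j is processed in period t,
    z[j][t] = 1 if job j starts in period t."""
    tau = len(x[0])
    ok = all(sum(x[j][t] for j in range(len(x))) <= mu
             for t in range(tau))                                # (4) capacity
    for xj, zj, Pj in zip(x, z, P):
        ok = ok and sum(xj) == Pj                                # (5) full duration
        ok = ok and sum(zj) == 1                                 # (6) single start
        ok = ok and xj[0] == zj[0]                               # (7) link in t=1
        ok = ok and all(xj[t] - xj[t - 1] <= zj[t]
                        for t in range(1, tau))                  # (8) link for t>1
    return ok

z = [[0, 1, 0, 0, 0, 0]]                        # the single job starts in period 2
print(satisfies_4_to_8([[0, 1, 1, 1, 0, 0]], z, P=[3], mu=1))  # True
print(satisfies_4_to_8([[0, 1, 0, 1, 1, 0]], z, P=[3], mu=1))  # False: interrupted
```

The second call shows how (8) enforces non-preemption: any restart of processing without a corresponding start variable violates the constraint.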

Due to the quadratic function (1) and the dependence of the energy consumption on the processing speed, neither direct proportionality nor additivity is given. For linearisation, we make two modifications. Firstly, function (1) is evaluated for all possible levels of speed reduction \(l \in \lbrace 0,\ldots ,o \rbrace\) and the result is saved as parameter \(es_{jkl}\). Secondly, the interdependencies of the decision variables are resolved by introducing a further binary auxiliary variable \(g_{jkl}\). The binary \(g_{jkl}\) is equal to 1 if job j is processed at stage k at level of speed reduction l. The following constraints have to be added.

$$\begin{aligned} \sum _{l=0}^{o} g_{jkl}= 1\quad \forall \ j, k \end{aligned}$$
(13)
$$\begin{aligned} P_{jk}= p_{jk} + \sum _{l=0}^{o} \left( g_{jkl} \cdot l\right)\quad \forall \ j, k \end{aligned}$$
(14)
$$\begin{aligned} e_{jk}^{t}\ge \left[ \sum _{l=0}^{o} \left( g_{jkl} \cdot es_{jkl} \right) \right] - ed_{jk} \cdot \left( 1-x_{jk}^{t}\right)\quad \forall \ j, k, t \end{aligned}$$
(15)
$$\begin{aligned} e_{jk}^{t}\ge 0\quad \forall \ j, k, t \end{aligned}$$
(16)
$$\begin{aligned} EC_{jk}= \sum _{t=1}^{\tau } \left( e_{jk}^{t} \cdot ec^{t} \right)\quad \forall \ j, k. \end{aligned}$$
(17)

Constraint (13) assigns exactly one level of speed reduction l to each job j at each stage k. Then \(P_{jk}\) can be calculated by (14). The respective consumption \(e_{jk}^{t}\) is calculated in Eq. (15) depending on the additional processing time. Since \(ed_{jk}\) is always greater than or equal to the actual energy consumption, it works similarly to a Big M formulation and there is only a positive value if production actually takes place. Furthermore, (16) ensures non-negative energy consumption when no production takes place (\(x_{jk}^{t}=0\)). Finally, (17) sums up the energy costs for each job at each stage.
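
The pre-computation of the parameters \(es_{jkl}\) can be sketched as follows (baseline time and demand are assumed values); evaluating Eq. (1) once per level keeps the MIP itself linear.

```python
def precompute_es(p, ed, o):
    """Evaluate Eq. (1) for every level of speed reduction l = 0..o and
    return the per-period demands es_l used in constraint (15)."""
    es = []
    for l in range(o + 1):
        r = (p + l) / p                 # ratio P/p at level l
        es.append((1 + 0.6 * (r - 1) ** 2 - 1.4 * (r - 1)) * ed * p / (p + l))
    return es

# One task with baseline time 4 and full-speed demand 10 (assumed):
print([round(v, 4) for v in precompute_es(4, 10.0, 2)])  # [10.0, 5.5, 3.0]
```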

3.3 Model improvements

The formulation described in Sect. 3.2 is complete and can be solved by a solver. However, it can still be improved. The number of variables \(g_{jkl}\) is very high, and the speed is selected by one single binary without any connection between individual speed levels. If special ordered sets are used instead, the branch and bound procedure can be significantly accelerated (for more information see e.g. Beale and Forrest 1976). In doing so, Eqs. (13)–(15) are replaced by (18)–(20).

$$\begin{aligned} g^*_{jkl-1}\ge g^*_{jkl}\quad \forall \ j, k, l: l>1 \end{aligned}$$
(18)
$$\begin{aligned} P_{jk}= p_{jk} + \sum _{l=1}^{o} g^*_{jkl}\quad \forall \ j, k \end{aligned}$$
(19)
$$\begin{aligned} e_{jk}^{t}\ge ed_{jk} \cdot \left[ x_{jk}^{t} - \sum _{l=1}^{o} \left( g^*_{jkl} \cdot es^*_{jkl} \right) \right]\quad \forall \ j, k, t. \end{aligned}$$
(20)

The basic idea is that the processing time is stepwise increased by \(g^*_{jkl}\). If the processing time is increased to a certain level \(\Phi\), all previous levels must also be activated. In the model this means that not only \(g^*_{jk\Phi } = 1\) but all \(g^*_{jkl} = 1\) with \(l\le \Phi\). This relationship is established by constraint (18). Then, the additional processing time can be calculated as the sum of all \(g^*_{jkl}\) in (19). Finally, the time-dependent energy consumption must be calculated by Eq. (20). Here, \(es^*_{jkl}\) is the relative energy saving for job j at stage k if the level of speed reduction is changed from \(l-1\) to l.
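
The incremental parameters \(es^*_{jkl}\) can be derived from the absolute demands \(es_{jkl}\) of Sect. 3.2. The following sketch (with assumed demand values) shows the conversion and verifies that activating the first levels reproduces the absolute demand via the term in constraint (20).

```python
def relative_savings(es):
    """es*_l: share of the full-speed demand es[0] (= ed) that is saved
    when moving from speed-reduction level l-1 to level l."""
    ed = es[0]
    return [(es[l - 1] - es[l]) / ed for l in range(1, len(es))]

es = [10.0, 5.5, 3.0]              # per-level demands from Eq. (1), assumed
es_star = relative_savings(es)
print(es_star)                     # [0.45, 0.25]
# Activating both levels reproduces the absolute demand of level 2,
# mirroring ed * (1 - sum of activated es*) in constraint (20):
print(round(es[0] * (1 - sum(es_star)), 6))  # 3.0
```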

With the reformulation described above, the computation time can be reduced by up to 50% for certain test instances. Moreover, some other approaches have been tested to speed up the solution finding. Various additional constraints were integrated, which unfortunately did not lead to any significant improvement. In addition, optimization could probably be accelerated if starting solutions are generated. However, finding initial solutions does not seem to be a problem here. For example, an initial solution was created with the earliest due date (EDD) rule, whereby the jobs are scheduled based on their due dates (the job with the smallest \(D_j\) is scheduled first, and so on). Interestingly, using initial solutions even increased the computation time significantly.
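
The EDD rule is a plain sort by due date; a minimal sketch with hypothetical jobs:

```python
# Hypothetical jobs as (job id, due date D_j) pairs:
jobs = [(1, 17), (2, 9), (3, 25), (4, 12)]

# EDD: schedule the job with the smallest due date first.
edd_sequence = [j for j, _ in sorted(jobs, key=lambda job: job[1])]
print(edd_sequence)  # [2, 4, 1, 3]
```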

Since the described model cannot be directly improved any further, other formulations are investigated. The calculation of the energy costs is quite complex due to the fluctuating energy prices and the time dependency. Furthermore, it has to be repeated frequently, which requires a lot of computing time. One alternative is to calculate all possible energy costs depending on job, stage, speed and starting time in advance. Combining this idea with a sequence-dependent formulation leads to another MIP, which is described in the following.

3.4 Sequence-dependent model formulation

The new model is based on the parameter \(tec_{jkl}^{t}\), which defines the energy costs for a job j at stage k if processing starts in time period t at level of speed reduction l. This parameter is calculated in a pre-process. Since this determination takes much less than a second for the considered problem sizes, the calculation step is not analysed separately in the following. Rather, we focus on the resulting novel model formulation. Just like \(tec_{jkl}^{t}\), the processing time \(P_{jkl}\) depends on l. Thus, the speed, or equivalently the extension of the processing time, is represented by an additional index in the following. The modified decision variables are shown in Table 2. All further notation is as in Table 1.

Table 2 New decision variables for the modified model
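The pre-process that fills \(tec_{jkl}^{t}\) can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the authors' code: `E[j][k][l]` is taken as the energy demand per period at level l, consumption is assumed to be spread evenly over the processing time, and infeasibly late starts are marked with infinity:

```python
import math

def precompute_tec(P, E, price, tau):
    """tec[(j, k, l, t)]: energy cost if job j starts at stage k in period t
    at speed-reduction level l; math.inf marks starts that cannot finish
    within the observation period tau."""
    tec = {}
    for j, stages in enumerate(P):
        for k, levels in enumerate(stages):
            for l, p in enumerate(levels):        # p = P[j][k][l], in periods
                for t in range(1, tau + 1):
                    if t + p - 1 > tau:           # job would finish after tau
                        tec[(j, k, l, t)] = math.inf
                    else:
                        tec[(j, k, l, t)] = E[j][k][l] * sum(
                            price[t2] for t2 in range(t, t + p))
    return tec
```

The table is dense but small for the instance sizes considered, which is why its construction takes well under a second.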

While the objective functions remain unchanged, most constraints are modified, as can be seen in Eqs. (21)–(30).

$$\begin{aligned} x_{jj'k} + x_{j'jk}\ge y_{jki} + y_{j'ki} - 1\quad \forall \ i, j, k, j' \ne j \end{aligned}$$
(21)
$$\begin{aligned} \sum _{i=1}^{\mu _k} y_{jki}= 1\quad \forall \ j, k \end{aligned}$$
(22)
$$\begin{aligned} \sum _{l=0}^{o}\sum _{t=1}^{\tau } z_{jkl}^{t}= 1\quad \forall \ j, k \end{aligned}$$
(23)
$$\begin{aligned} s_{jk}=\sum _{l=0}^{o}\sum _{t=1}^{\tau } z_{jkl}^t \cdot t\quad \forall \ j, k, \end{aligned}$$
(24)
$$\begin{aligned} c_{jk}= s_{jk} + \sum _{l=0}^{o}\sum _{t=1}^{\tau } z_{jkl}^t \cdot P_{jkl} - 1\quad \forall \ j, k \end{aligned}$$
(25)
$$\begin{aligned} s_{jk} - c_{jk-1}\ge 1\quad \forall \ j, k>1 \end{aligned}$$
(26)
$$\begin{aligned} c_{j'k}\le s_{jk} + (1-x_{jj'k}) \cdot \tau - 1\quad \forall \ j, k, j' \ne j \end{aligned}$$
(27)
$$\begin{aligned} c_{jm}\le \tau\quad \forall \ j \end{aligned}$$
(28)
$$\begin{aligned} T_{j}= \max \left\{ 0, c_{jm} - D_j\right\} \quad \forall \ j \end{aligned}$$
(29)
$$\begin{aligned} EC_{jk}= \sum _{l=0}^{o}\sum _{t=1}^{\tau } z_{jkl}^{t} \cdot tec_{jkl}^{t}\quad \forall \ j, k. \end{aligned}$$
(30)

Equation (21) specifies that each job j must be either the predecessor or the successor of job \(j'\) if both jobs are processed on the same machine. With (22), each job is allocated to a machine. Constraint (23) guarantees that each job starts at each stage exactly once with a certain speed. In (24) and (25), the start and completion time periods are defined. On that basis, (26) ensures that a job can only start once the previous stage is finished. Similarly, a job can start on a machine only if all jobs scheduled earlier are finished; for this purpose, constraint (27) prevents overlapping of jobs allocated to the same machine. Of course, the completion time of each job must not exceed \(\tau\), which leads to (28); in the time-indexed model, Eq. (5) analogously ensures that the complete processing time lies within the observation period. Similar to the first model, the tardiness of each job is calculated in (29). Finally, based on the data of \(tec_{jkl}^{t}\), (30) calculates the energy costs for each job at each stage.

The performance of this approach in comparison with the model presented first will be evaluated in a computational study in Sect. 4. Beforehand, the problem of multi-criteria decision making will be discussed briefly.

3.5 From lexicographic to eps-constraint method

Due to the two different objective functions (2) and (3), the described problem cannot be solved directly by a solver. Branch-and-cut algorithms are designed to optimize a single objective function. If there are several objectives, one possibility is to combine them into a single function. For example, one could try to monetize the values and thus obtain a global cost function. In the problem considered here, unfortunately, a weighted-sum approach cannot be used to combine both criteria in one function. Delays in particular are difficult to monetize: contractual penalties often apply, but losses of trust and goodwill must also be taken into account.

Another possible approach is lexicographic optimization. Here, the objectives are put into a certain order, and the criteria are then optimized and fixed one after the other according to their importance. In the problem under consideration, for example, the minimum total tardiness could first be determined and then the corresponding optimal energy costs calculated. Conversely, TT could be minimized for the minimum TEC determined at the beginning. Both possible lexicographic solutions are shown as black dots in Fig. 3.

Fig. 3
figure 3

Visualization of lexicographic solution and pareto front with two objective functions

In general, all pareto optimal solutions come into question for the decision-maker. These solutions are also referred to as non-dominated solutions (NDSs). A solution is dominated if another solution is at least as good in all objectives and better in at least one. Conversely, a solution is an NDS if, compared with every other solution, it is better in at least one criterion. The eps-constraint method is a suitable approach for calculating all pareto optimal solutions (the pareto front), or at least some NDSs, by means of mathematical modelling. First, all lexicographic solutions are identified. The range between these solutions is then analysed depending on the variant of the eps-constraint method. What all variants have in common is that only one objective is optimized, while all other objectives are limited by constraints.
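The dominance relation translates directly into code; a small sketch for two minimized objectives (TT, TEC):

```python
def dominates(a, b):
    """a dominates b if a is at least as good in every objective and
    strictly better in at least one (both objectives minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    """Filter a list of (TT, TEC) points down to the non-dominated set."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o != s)]
```

Applied to the two lexicographic points (36, 4360) and (103, 1351.73) together with a dominated point such as (50, 5000), only the first two survive the filter.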

In this article the equidistant eps-constraint method is pursued. Therefore, we calculate both lexicographic solutions. Then, TT is defined as a constraint at fixed intervals and the model is optimized for TEC as a single objective. This approach has two main weaknesses.

On the one hand, not every solution found is pareto optimal. This problem could be circumvented by the augmented eps-constraint method (see e.g. Mavrotas 2009 or Wang et al. 2018), in which the second objective is also included in the single objective function with a very small weight. However, since almost every increase in TT leads to a reduction in energy costs and thus to an NDS, the additional effort adds hardly any value here.

On the other hand, the resulting pareto front is probably unevenly distributed. Here, a dynamic approach like the bi-section eps-constraint method (Chircop and Zammit-Mangion 2013) could help, which always searches in the area where the Euclidean distance between two NDSs is greatest. However, since we limit the computing time of the models in the following, an optimal solution is not always found and the approaches can lead to different results. This, in turn, would lead to searches in different areas, which would make the evaluation of computing performance extremely difficult. Therefore, we apply the equidistant eps-constraint method in the following.
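The equidistant procedure can be summarized in a few lines. In this sketch, `solve_tec_for_tt_bound` is a placeholder for the MIP solved with TEC as the single objective under the constraint TT ≤ bound; it is assumed to return a (TT, TEC) pair, or None if the time limit expires without an improved solution:

```python
def equidistant_eps_constraint(solve_tec_for_tt_bound, tt_min, tt_max, delta):
    """Sweep the TT bound from the lexicographic minimum to the maximum in
    fixed steps of size delta and collect the resulting candidate NDSs."""
    front = []
    bound = tt_min
    while bound <= tt_max:
        sol = solve_tec_for_tt_bound(bound)
        # keep a solution only if it improves TEC over the previous one;
        # otherwise no new NDS is created for this bound
        if sol is not None and (not front or sol[1] < front[-1][1]):
            front.append(sol)
        bound += delta
    return front
```

In the computational study the previous solution also serves as a warm start for the next bound, which the placeholder solver would handle internally.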

4 Numerical case study

In the following, a numerical example illustrates the scheduling problem under consideration. Subsequently, a computational study is conducted to evaluate the performance of the two proposed models. All problems are solved by CPLEX 12.6 using an Intel Xeon 3.3 GHz CPU with 768 GB memory. Even simple forms of the HFS problem are considered to be NP-hard (Dai et al. 2013). Consequently, only small instances can be solved for the described problem. The biggest problem instances considered here consist of ten jobs and four manufacturing stages. Real production processes often have much larger sizes. Nevertheless, the following example is constructed as realistically as possible.

4.1 Test data

Initially, we consider six jobs which have to be processed at two stages whereby each stage consists of two parallel machines. The values for energy consumption and processing times are generated randomly from a uniform distribution U as follows:

  • Processing time (h): \(U\lbrace 1;10\rbrace\),

  • Energy demand (\(10^5 W\)): \(U\lbrace 1;10\rbrace\).

The exact values for the following example can be seen in Table 3.

Table 3 Numerical example

The processing speed of each job can be reduced up to five times, which results in \(l \in \lbrace 0,1,2,3,4,5\rbrace\). Furthermore, all jobs have a due date \(D_j\), which is also shown in Table 3. The due dates should be defined in such a way that not all jobs are automatically delayed, but it should also not be trivially possible to solve the problem without any tardiness, as otherwise the scope for decision-making is restricted. Thus, their calculation is an essential factor influencing the complexity of the problem. Based on the work of Choi et al. (2005), we use the following formula:

$$\begin{aligned} D_j = \max \left( 0, U\left[ \left\lfloor P\left( 1-T-\frac{R}{2}\right) \right\rceil , \left\lfloor P\left( 1-T+\frac{R}{2}\right) \right\rceil \right] \right) , \end{aligned}$$
(31)

where P denotes the makespan lower bound, T is a tardiness factor which influences the general amount of delays, and R, the due date range, defines the scatter of \(D_j\). The symbol \(\lfloor \rceil\) indicates rounding to the nearest integer. For the first example we set T to 0.4, which leads to a fairly high average delay. R is set to 0.7 but will be varied in the following.
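Equation (31) can be implemented in a few lines. This sketch uses Python's built-in `round` as an approximation of the nearest-integer operator (it rounds halves to even, which may differ from the paper's convention in borderline cases):

```python
import random

def due_date(P, T, R, rng=random):
    """Draw a due date according to Eq. (31) with makespan lower bound P,
    tardiness factor T and due date range R."""
    lo = round(P * (1 - T - R / 2))   # lower bound of the uniform interval
    hi = round(P * (1 - T + R / 2))   # upper bound of the uniform interval
    return max(0, rng.randint(lo, hi))
```

With P = 20, T = 0.4 and R = 0.7, for example, every draw lies between 5 and 19.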

The estimation of the makespan is important not only to choose appropriate due dates, but also to limit the observation period reasonably. If \(\tau\) is set too high, production would only take place at very low speeds during times of low energy costs, which would lead to an enormous number of delays and does not represent a realistic alternative in practice. In detail, \(\tau\) is determined in Eq. (32): we calculate the average processing time at one stage k and add the maximum times of the previous and subsequent stages for one job. The optimal makespan definitely cannot exceed this value. Since the solution with optimal TT does not necessarily have the optimal makespan, and to allow enough margin for speed adjustments, the value is increased by a factor \(\alpha\). For the considered instances, \(\alpha\) is set to 0.1.

$$\begin{aligned} \tau = (1+\alpha ) \cdot \underset{\forall k}{\min } \left[ \underset{\forall j}{\max } \sum _{k^*=1}^{k-1} P_{jk^*} + \sum _{j=1}^{n} \frac{P_{jk}}{\mu _k} + \underset{\forall j}{\max } \sum _{k^*=k+1}^{m} P_{jk^*} \right] . \end{aligned}$$
(32)

Analogously to formula (32), a lower bound for the makespan can be determined. It corresponds to the average production time at a stage, extended by the minimum processing times at the other stages. Thus, P for Eq. (31) is calculated by:

$$\begin{aligned} P=\underset{\forall k}{\max } \left[ \underset{\forall j}{\min } \sum _{k^*=1}^{k-1} P_{jk^*} + \sum _{j=1}^{n} \frac{P_{jk}}{\mu _k} + \underset{\forall j}{\min } \sum _{k^*=k+1}^{m} P_{jk^*} \right] . \end{aligned}$$
(33)
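Formulas (32) and (33) differ only in whether the neighbouring stages contribute their maximum or their minimum processing times. A sketch with full-speed processing times `P[j][k]` and machine counts `mu[k]` (illustrative data structures, not the authors' code):

```python
def horizon_and_lower_bound(P, mu, alpha=0.1):
    """Return (tau, lb): the observation period of Eq. (32) and the
    makespan lower bound of Eq. (33)."""
    n, m = len(P), len(P[0])

    def stage_term(k, agg):
        # agg = max reproduces Eq. (32), agg = min reproduces Eq. (33)
        before = agg(sum(P[j][ks] for ks in range(k)) for j in range(n))
        avg = sum(P[j][k] for j in range(n)) / mu[k]   # average load at stage k
        after = agg(sum(P[j][ks] for ks in range(k + 1, m)) for j in range(n))
        return before + avg + after

    tau = (1 + alpha) * min(stage_term(k, max) for k in range(m))
    lb = max(stage_term(k, min) for k in range(m))
    return tau, lb
```

The min over stages in (32) picks the tightest upper estimate, while the max in (33) picks the strongest lower bound.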

Finally, the TOU energy prices need to be defined. The average price per MWh for medium-sized industrial companies without major privileges in Germany is about 160 € (Fraunhofer 2015). In the following, this price is used as the mid-peak price. For on- and off-peak, deviations of ±50% are assumed. Similar prices in USD are used, for example, by Ding et al. (2016). The exact prices depending on the corresponding time are shown in Table 4.

Table 4 TOU Prices for the numerical example
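The tariff structure can be captured by a small lookup. In this sketch the period sets are parameters, because the exact on-/off-peak time windows of Table 4 are not reproduced here:

```python
MID_PEAK = 160.0  # Euro/MWh, average German industrial price (Fraunhofer 2015)

def tou_price(period, on_peak, off_peak):
    """TOU price with ±50% deviations around the mid-peak price."""
    if period in on_peak:
        return 1.5 * MID_PEAK   # 240 Euro/MWh
    if period in off_peak:
        return 0.5 * MID_PEAK   # 80 Euro/MWh
    return MID_PEAK
```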

4.2 Evaluation of the example

To determine the optimal pareto front, we first have to compute the lexicographic solutions. The corresponding schedules are shown in Fig. 4. For the considered problem, the minimum TT is 36 h with energy costs of 4360 €. On the other side, the minimum TEC is 1351.73 € with a TT of 103 h. The two solutions differ significantly not only with regard to the two objectives but also in terms of computational effort. While the minimum TT solution can be found in less than 6 s by the time-indexed model, it takes 27.2 min to minimize TEC and to determine the corresponding minimum TT. The more TT increases, the larger the number of possible speed reductions and time shifts exploiting TOU prices becomes. For minimum TT, all processes are simply performed at maximum speed, which also reduces the CPU time.

Fig. 4
figure 4

Optimal lexicographic solution for minimum TT (left) and minimum TEC (right)

Logically, the energy consumption also differs significantly. The total energy demand (TED) at minimum TT, without any speed reductions, lies at 28.4 MWh, while at minimum TEC only 10.1 MWh are needed. Similarly, the time periods in which the energy is used differ, as can be seen in Fig. 5, which shows both the load curves and the energy prices. While the minimum TT solution barely exploits the TOU prices, the minimum TEC solution mainly schedules phases of high energy consumption in times of low energy prices (off-peak).

Fig. 5
figure 5

Energy Price in Euro/MWh and load curves for lexicographic solutions

The two lexicographic solutions are only a small part of all pareto optimal solutions. By calculating the minimum TEC for all TT values between 36 and 103 h, we can determine the optimal pareto front and the corresponding energy demand shown in Fig. 6. Altogether, 63 NDSs exist for the considered example with \(\tau =35\). The total energy costs can be reduced especially in the beginning, when TT is still very low: the first additional delays allow the highest energy cost savings to be exploited through speed reductions or use of TOU prices. Afterwards, the curve gradually flattens.

Fig. 6
figure 6

Optimal pareto front for the numerical example and resulting energy demand

Additionally, it can be seen that the courses of TEC and TED are quite similar. Nevertheless, a decreasing energy demand does not necessarily lead to lower energy costs. Sometimes the energy demand even increases while the total costs are reduced by shifting consumption to times of lower TOU prices.

4.3 Performance analysis of different formulations

The computation of all pareto optimal solutions shown in Fig. 6 takes 8.48 h with the time-indexed formulation while the sequence-dependent formulation needs only 0.49 h. Thus, the calculation of all energy cost scenarios within a pre-process seems advantageous. In the following, differences in performance of both procedures shall be examined in detail and the relationship between the different objectives will be further discussed. Therefore, we generate test instances using the functions presented in Sect. 4.1. The problem sizes are varied in the following according to Table 5. Altogether, 36 different problems are considered.

Table 5 Overview of test instances

The performance of different models can be analysed in terms of model size complexity and computational complexity (cf. e.g. Meng et al. 2019). Regarding the model size Table 6 shows the number of binary variables (NBV), number of continuous variables (NBC) and number of constraints (NC). The resulting model size for the test instances considered can be found in Table 7. The specified values for \(\tau\) depend on the respective processing times and are calculated with Eq. (32).

Table 6 Model size in terms of number of variables and constraints

Due to the variable \(e_{jk}^{t}\) in the time-indexed model, significantly more continuous variables are introduced. On the other hand, in the sequence-dependent formulation, the adjustment of the processing intensity is directly taken into account by \(z_{jkl}^{t}\), which increases the number of binary variables enormously. With regard to the required constraints, considerably fewer equations are required for the second formulation.

Table 7 Model size of the considered instances

How well the models are solvable cannot be inferred directly from the problem size, but it can be assessed by analysing the computational complexity. Unfortunately, it takes a lot of computational effort to calculate all pareto optimal solutions. Therefore, the computation time is limited to 10 min for each run. For that reason, computation times will not differ as significantly as for the example above; however, the quality of the results should deviate. Furthermore, we do not calculate all solutions but only a certain number. Usually, in the eps-constraint method, the number of calculated solutions is set in the beginning; depending on the problem size, the quality of the estimated pareto front can then vary greatly. As already mentioned in Sect. 3.5, we bypass that problem by using a predefined distance between two solutions: in detail, we always increase TT by \(\Delta =5\,\)h. Thereby, the previous solution is always used as the initial solution for the next iteration. Thus, the pareto front can be appropriately estimated for each instance.

Since each problem instance does not have a single optimal solution but a set of pareto optimal solutions, we first have to define a performance criterion to compare both approaches. Various concepts have been established in multi-objective optimization. Probably the most frequently used criterion is the number of NDSs (see Table 8). Theoretically, the number of NDSs can be derived from the difference between the two TT values of the lexicographic solutions divided by 5 (since TT is increased by 5 h in each iteration). However, the solver does not always succeed in reducing the energy costs within 10 min, which is why no new NDS is created in some cases. This results in different numbers of NDSs between the two approaches. These deviations are rather small, and thus this criterion is not very meaningful.

Table 8 Detailed results of the computational study

A much better criterion is the hypervolume (HV) (see e.g. Beume et al. 2007), which is visualized in Fig. 3. The idea is to calculate the relative amount of space between the theoretical optimum and the anti-optimal point that is covered by the found NDSs. Thereby, the theoretical optimum is the combination of the best objective values in the lexicographic solutions; vice versa, the highest values for TT and TEC in the lexicographic solutions form the anti-optimal point. Therefore, HV has a value between 0 and 1, whereby higher values stand for a better solution quality, since a larger area of the potential solution space is covered. In particular, a value above 0.5 means that the pareto front is convex between the lexicographic solutions. The higher the HV value, the more convexly curved the pareto front is, and thus the closer one gets to the theoretically optimal solution.
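For two minimized objectives, the HV computation reduces to a sweep over the NDSs sorted by the first objective. A sketch with the utopia (theoretical optimum) and anti-optimal points as defined above:

```python
def hypervolume_2d(front, utopia, anti):
    """Relative share of the rectangle spanned by the utopia and anti-optimal
    points that is dominated by the front (both objectives minimized)."""
    total = (anti[0] - utopia[0]) * (anti[1] - utopia[1])
    covered = 0.0
    prev_second = anti[1]
    for first, second in sorted(front):   # ascending TT implies descending TEC
        # each NDS adds a strip reaching from its TT to the anti-optimal TT
        covered += (anti[0] - first) * (prev_second - second)
        prev_second = second
    return covered / total
```

For instance, a front consisting of (0, 1) and (1, 0) between utopia (0, 0) and anti-optimal point (2, 2) covers 75% of the rectangle, i.e. HV = 0.75.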

All HV values listed in Table 8 are additionally shown in Fig. 7. There is no instance where the time-indexed model finds a better pareto front than the sequence-dependent formulation. With an average of 71.6%, the second approach covers on average around 5% more of the solution space. Since 7 of the first 18 instances can be solved to optimality with the adjustments made, both approaches lead to the same results in these cases. It is noticeable that this occurs mainly when two production stages with three parallel machines are considered. As is usual within HFS problems, the computational effort increases with the number of stages and decreases with the number of parallel machines. This can also be identified, to a limited extent, in the computing times in the results.

Fig. 7
figure 7

Hypervolume results for the two models

A closer look at the computing times in Table 8 shows that the sequence-dependent model also has advantages here. However, the values must be viewed with caution: for larger problems, each optimization run is aborted after 10 min, which is why the times differ less there. It can also be seen that the CPU times are roughly proportional to the number of NDSs. The time-indexed model leads to high computing times especially for the proof of optimality. As a result, the sequence-dependent method has significant advantages in computing time. For example, for instances with 6 jobs, the average time of 0.39 h is 80% lower than the 1.95 h of the first model. Interestingly, the due date range R has no general influence on the performance indicators used or on the computation time. Concerning the fluctuation of the results, the sequence-dependent procedure seems to work more robustly: the standard deviation of the HV values of the time-indexed model is 7.66%, while that of the sequence-dependent procedure is only 4.88%. Overall, the sequence-dependent approach appears much more suitable for the problem under consideration.

4.4 Evaluation of savings potentials

Finally, the possibilities of reducing energy costs will be explicitly analyzed. In particular, the influence of speed changes and TOU tariffs on the reduction of energy costs will be examined. Therefore, the model is solved once with constant (maximum) speed and once with fixed energy costs (160 €/MWh). The sequence-dependent model formulation is used exclusively for this purpose. Unfortunately, not all 36 pareto fronts can be shown here and only the first 18 instances can be solved to optimality in reasonable computing time. For illustration, we will concentrate in the following on instance 2 from Sect. 4.2 and three further randomly selected instances (4, 12, 17). The results can be seen in Fig. 8.

Fig. 8
figure 8

Pareto front for constant speed and constant energy prices

All points shown in Fig. 8 are pareto optimal solutions. The filled circles are the results when the original model is used. For the case of producing at maximum speed (unfilled circles), the cost savings are significantly lower; in addition, the number of NDSs is reduced. If fixed prices are used instead of the TOU tariffs (triangles), the TEC for minimal TT is slightly increased. In comparison, adjusting the production speed leads to significant savings; the relative cost reduction is even higher for fixed electricity prices than for TOU tariffs. Overall, greater savings can be achieved by varying the speed. This suggests that reducing electricity consumption is preferable to purchasing electricity at more favourable conditions, which is also more sustainable.

Since speed changes seem to have the greater influence, it is interesting to examine the extent of the energy savings when the load curve of the electric motor has a different course. For this purpose, we consider two less advantageous consumers. In addition to the original function, Fig. 9a shows two curves with 25% and 50% less savings potential, respectively. These differences can be identified almost to the same extent in the resulting pareto fronts in Fig. 9. For the four instances considered, similar tendencies through speed control can be observed: on average, 22.9% less energy reduction for the 25% case and 42.7% less for the 50% case. Considering these consistent developments, it can be concluded that speed adjustments have a significant influence on the energy costs.

Fig. 9
figure 9

Influence of less energy savings due to speed reductions

Finally, it shall be discussed which of the various NDSs is a good solution from a decision-maker's point of view. Classical approaches of multi-criteria decision making, such as minimizing the distance to the utopia point, may not do justice to the importance of TT. As mentioned in Sect. 4.2, greater energy cost savings can already be achieved by small increases in TT. The average relative reductions of TEC and the associated ranges are listed in Table 9. For this purpose, the delay for all 36 instances is increased by up to 10 h, whereby each individual increase (by 1 h) is taken into account. The computing time is limited to 10 min.

If the decision maker is willing to increase TT by one hour, energy costs can already be reduced on average by 3.8%. For the second hour, further savings of 3% are possible (see row “stepwise”). This potential falls significantly with further delays. At a certain point, an additional delay is no longer economically justifiable. It can be observed that the possibilities of cost reduction fluctuate strongly. The average coefficient of variation is 41.03%. Nevertheless, it can also be seen from minimum and maximum that the potential for cost reduction decreases with growing TT.

Table 9 Average savings potential for small TT increases for the 36 instances

5 Conclusion

The present article combines the ideas of energy efficiency and delivery reliability in production scheduling. Two multi-objective MIP formulations are given for the HFS scheduling problem considering variable production speeds to reduce energy consumption at the expense of longer processing times. Energy costs can be reduced not only by variable speeds but also by taking advantage of fluctuating TOU energy prices. To solve the problem, the eps-constraint method is used. As far as we know, this is the first time that energy costs and tardiness are considered as objectives in a hybrid flow shop problem with the described properties.

A numerical case study shows that energy costs can be reduced enormously by just a few delays in delivery. In this case, energy cost savings reached 3.8% on average after postponing delivery by one hour. It can be stated that energy costs can be reduced by shifting loads to times of lower energy prices without reducing consumption. However, speed control seems to have a much stronger impact on energy costs than load shifts due to TOU prices. This suggests that reducing electricity consumption is preferable to purchasing it at more favourable conditions. At the same time, speed changes are more ecological, as they also reduce energy consumption.

Based on the second (sequence-dependent) model formulation, it can be shown that computation time can be reduced enormously by calculating all energy cost scenarios in a pre-process. Regardless of the objectives or the model used, very good or even optimal solutions are found quickly by the solver for the considered problems. However, the proof of optimality takes a lot of time. Due to the high complexity of the problem, the specification of good lower bounds seems to be very difficult. Nevertheless, it might be possible to improve the solution finding by determining better lower bounds in future work.

Since the problem is NP-hard, the models are only able to solve small problem instances. It is therefore appealing for future work to develop heuristic solution methods in order to solve larger problems in reasonable computing time. Nevertheless, the models can provide practical insights. On the one hand, they can deliver reference solutions for the development of heuristics, and existing heuristic solutions may be improved further by solvers. Furthermore, larger problems may be divided into subproblems. For example, bottleneck machines may be prioritized and their sequences optimized with the model. Moreover, decomposition methods can be used: for example, batches consisting of similar jobs can be formed. Once the problem size is reduced, exact solution methods can be applied; each batch of jobs is then considered individually and broken up for detailed planning. Thus, the models described are not only relevant from a research point of view, but can also be useful in practice to combine low energy costs and punctual delivery.