1 Introduction

In modern market conditions, manufacturers need to deliver high-quality products quickly. This requires rapid reaction to various events in the production process and has led to the development of several methodologies for quality control and improvement. Most of them, such as statistical process control (SPC), design of experiments (DoE), acceptance sampling procedures, six-sigma and lean tools, concentrate on single-stage processes [2]. In contrast, the Zero-Defect Manufacturing (ZDM) strategy concerns operations in the entire organization. Its goal is to improve process efficiency and product quality while eliminating defects and minimizing process errors.

While ZDM is attractive as a strategy, theoretical and practical works do not consider its relationship with the work scheduling problem. Most works treat the problem of assigning jobs to machines as a simple queuing system, e.g. FIFO [20]. This, however, ignores the inherent uncertainty of real systems. Industry needs a coherent framework that merges automatic process data acquisition, data mining and scheduling under uncertainty. There are conceptual proposals for such solutions [8, 32], but no concrete results.

In the field of scheduling, most approaches for real manufacturing systems with unexpected events are “machine-oriented”, by which we mean that the algorithms mostly consider the possibility of machine failures [1, 29, 35]. “Product-oriented”, ZDM-aligned approaches that combine disparate disturbances are uncommon. An example of a “product-oriented” disturbance is a quality defect in a manufactured item.

Typically, works on scheduling under uncertainty consider predictive schedules built with probabilistic uncertainty models. Their goal is to determine a schedule with good average performance [7]. Such a methodology cannot be used in ZDM-oriented Decision Support Systems (DSS). Performance analysis of predictive-reactive rescheduling in a dynamic environment is still an open problem [5].

We address these issues by proposing an approach that treats defects and their removal as an inherent part of the process. Our main contribution is a formal approach to solving manufacturing scheduling problems with defect removal, which extends the ZDM frameworks. In particular, we propose the use of an Algebraic-Logical Meta-Model (ALMM). We consider a class of non-deterministic discrete manufacturing problems in which unexpected events occur during process execution and affect job results. This general approach is illustrated with examples of scheduling problems in which product faults are detected during quality control and then repaired. The proposed method is universal and can be applied to manufacturing scheduling problems, especially those with defects. However, because it requires building a model, it is dedicated to NP-hard problems. It takes into account dynamic changes in parameters and problem data, and it may also take into account random events and disturbances in the process. Here, we consider quality control in which the moment of a negative control result is unknown; therefore, the moment at which the manufacturing process changes is also unknown. The approach is particularly applicable to problems with unexpected events, where the decisions that can be taken depend on the state of the system or on external events. In the switching method with quality deficiencies, the considered problems are formally described using the ALMM methodology. By constructing the solution sequentially during the execution of a multi-step decision process (MDP), the ALMM formalism makes it possible to detect the occurrence of an event. In the proposed approach, event detection is modeled by a quality control station in the system state. The model switching method makes it possible to describe changes related to the consequences of non-deterministic events, i.e. the need to appropriately repair the detected quality deficiency. In defect problems, it is not known when a defect will occur or what type of defect it will be, and the proposed solution makes it possible to model such situations.

The paper is organized as follows. In Section 2, we describe ZDM and its frameworks. Section 3 presents the algebraic-logical meta-model (ALMM) and how it can be used in predictive-reactive scheduling. In Section 4, we propose using ALMM switching as a ZDM scheduling strategy. In Section 5, we discuss applications of this method to a flow-shop system with defects, along with examples. The paper ends with concluding remarks.

Remark 1 (Notation)

In this paper, (a,b) denotes a tuple (pair), not an interval. We also use the symbol \(\tau_{idle}\) to indicate that the time to completion is infinite or, in other words, that the machine is not working.

2 Zero-defect manufacturing

ZDM is a comprehensive strategy that encompasses both short- and long-term perspectives. The short-term perspective covers developing and implementing a real-time process control system that eliminates the production of faulty components caused by variances in materials, components, and process properties. The long-term perspective is concerned with minimizing all failures by continuously optimizing the production process and the manufacturing system. The basic idea of Zero-Defect Manufacturing was introduced in the early 1960s in connection with the US Army Pershing Missile System. The concept has been implemented only partially because of numerous technological limitations. According to [33], “currently, with the evolution of Industry 4.0, ZDM concept is easier to be implemented due to the availability of the required amount of data for techniques such as machine learning to work properly but still a lot of effort is needed for better integration and coordination of the capabilities of each technology”. Thus, ZDM is now a common manufacturing practice to reduce and minimize the number of defects and errors in a process and to do things right the first time [39].

Nowadays, it is possible to implement the theoretical assumptions of the ZDM strategy because many solutions exist that are based on advanced technology for automatic data acquisition from machines and devices, and on data mining technology focused on automatic defect detection. Therefore, automatic fault prediction and prevention are widely possible [4, 6, 17, 27, 36, 38, 39]. Other relevant works include [18, 19, 37].

The ZDM strategy is attractive for companies for the following reasons:

  • reducing the costs of the company’s resources related to the treatment of defective products,

  • reducing the number of defective machines and tools, employee inefficiency, and scrap production,

  • increasing safety and customer satisfaction, making production eco-friendly.

Zero-Defect Manufacturing can be implemented using two different approaches: product-oriented ZDM and process-oriented ZDM. The difference is that product-oriented ZDM studies the defects on the actual parts and tries to find a solution, whereas process-oriented ZDM studies the defects of the manufacturing equipment and, based on that, evaluates whether the manufactured products are good or not. The latter lies within the predictive maintenance concept [33].

The concept of zero-defect can be practically utilized in any manufacturing environment to improve quality and reduce cost. Therefore, the whole system should contain the following main phases:

  • Data acquisition—automatic capture, cleaning and formatting of relevant data using intelligent sensors system;

  • Signal processing—automatic signal processing, filtering and feature extraction;

  • Diagnosis—data mining and knowledge discovery for diagnosis;

  • Prognostic assessment—data mining and knowledge discovery for prognostic;

  • Maintenance scheduling—optimization tool for dynamic scheduling under uncertainty.

Only a few papers propose a holistic approach that contains all of these phases; the last phase is usually omitted. Table 1 presents the approaches to ZDM strategies and their contribution to ZDM and maintenance scheduling.

Table 1 Survey of the ZDM frameworks, illustrating that scheduling is analyzed only in a general form

All of these papers assume that every scheduling problem can be modeled and solved using well-known meta-heuristic methods. No specific model and no adaptation of these meta-heuristic methods to problems under uncertainty have been proposed. Furthermore, none of them mentions that the scheduling task should take into account various kinds of interference affecting both machines and products. This is still rarely addressed in scheduling because of the difficulty of problems under uncertainty. Within the rescheduling approach, algorithms have been proposed for scheduling problems with machine failures, new jobs arriving during simulation, and stochastic job processing times [3, 29, 30, 35]. Much less often, however, one finds approaches that combine several of these disturbances with others, such as cancellation of an order, a change in the priority of job execution, the inability to perform jobs because materials are unavailable, or an unsuccessful technological process and the need to repair an incorrectly performed job.

3 Reactive-predictive scheduling with Algebraic-Logical Meta-Model (ALMM)

If we want to consider scheduling problems that satisfy the ZDM strategy, we need to consider scheduling problems under uncertainty. Such a schedule should be predictive-reactive. Predictive-reactive scheduling consists of two stages: the planning stage and the execution stage. Predictive scheduling corresponds to the planning stage; it is off-line scheduling which takes into account the uncertainty and flexibility of the executed process. Reactive scheduling refers to the execution stage; it is an on-line phase in which the schedule is created or modified in production when an event occurs. In this section, we present the ALMM methodology and the formal mechanism resulting from it for generating predictive-reactive schedules. This allows early detection of product defects, adaptation to changes in operating conditions and optimisation of manufacturing processes.

3.1 Predictive scheduling

The considered approach is based on a multistage decision-making process, where the decision is made jointly and takes into account the current situation (the current process state). It is a process trajectory-based simulation in which the solution is constructed in stages. Thus, existing constraints or dynamic input data can be considered at each stage (in a given state of the problem). Moreover, the possible decisions are also state-dependent. Using ALMM allows us to reconstruct the decision-making process as well as to monitor and track decisions during the manufacturing process.

The ALMM of a discrete deterministic process was defined in [9, 10]. According to this methodology, an optimization problem is modelled as a multistage decision process together with an optimization criterion Q. The process is defined as a sextuple whose elements are: a set of decisions U, a set of generalized states S = X × T (where X is a set of proper states and T is a subset of non-negative real numbers representing time instants), an initial generalized state \(s_{0} = (x_{0}, t_{0})\), a transition function f(s,u), a set of not admissible generalized states \(S_{N} \subset S\), and a set of goal generalized states \(S_{G} \subset S\).

The transition function f is defined by means of two functions: \(f_{x}: U \times X \times T \to X\) determines the next state and \(f_{t}: U \times X \times T \to T\) determines the next time instant. As a result of a decision u taken in some proper state x at a moment t, the state of the process changes to \(x^{\prime }=f_{x}(u, x, t)\), which is observed at the moment \(t^{\prime }=f_{t}(u, x, t)=t+ {{\varDelta }} t\). All limitations concerning the control decisions in a given state s can be conveniently defined by means of the so-called sets of possible decisions \(U_{p}(s)\).

The optimization task is to find an admissible decision sequence \(\widetilde {u}\) that optimizes criterion Q. The consecutive process states are generated as follows. The process starts with an initial state \(s_{0}\). Each next state depends on the previous state and the decision made in that state. The decision is chosen from the set of decisions that can be made in the given state. Generation of the state sequence terminates if the new state is a goal state, a non-admissible state, or a state with an empty set of possible decisions. The sequence of consecutive process states from the given initial state to a final state (goal or non-admissible) forms a process trajectory.
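The state-generation procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `generate_trajectory`, the decision rule `choose`, and the single-machine toy problem (processing times `P`, bound `DEADLINE`) are assumptions introduced here.

```python
from typing import Callable, Hashable, Iterable, List, Tuple

State = Tuple[Hashable, float]  # generalized state s = (x, t)


def generate_trajectory(
    s0: State,
    possible_decisions: Callable[[State], Iterable],  # U_p(s)
    transition: Callable[[object, State], State],     # f(u, s)
    is_goal: Callable[[State], bool],                 # s in S_G
    is_non_admissible: Callable[[State], bool],       # s in S_N
    choose: Callable[[State, list], object],          # assumed decision rule
) -> Tuple[List[State], List[object]]:
    """Generate one trajectory from s0 until a final state is reached."""
    trajectory, decisions = [s0], []
    s = s0
    while not is_goal(s) and not is_non_admissible(s):
        candidates = list(possible_decisions(s))
        if not candidates:  # empty set of possible decisions
            break
        u = choose(s, candidates)
        s = transition(u, s)
        decisions.append(u)
        trajectory.append(s)
    return trajectory, decisions


# Hypothetical toy problem: one machine, proper state x = set of waiting jobs.
P = {"a": 3.0, "b": 1.0, "c": 2.0}  # assumed processing times
DEADLINE = 10.0                     # assumed bound defining S_N

trajectory, decisions = generate_trajectory(
    s0=(frozenset(P), 0.0),
    possible_decisions=lambda s: sorted(s[0]),
    transition=lambda u, s: (s[0] - {u}, s[1] + P[u]),
    is_goal=lambda s: not s[0],
    is_non_admissible=lambda s: s[1] > DEADLINE,
    choose=lambda s, candidates: min(candidates, key=lambda j: P[j]),
)
```

In this toy run the shortest job is always chosen first, so the trajectory ends in a goal state with no waiting jobs. Replacing `choose` is where an optimization method would plug in.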

The advantages of ALMM are that the values of particular coordinates of a state and/or decision may be names of elements (symbols) as well as objects such as a finite set or a sequence (thus they do not have to be numerical), and that the set of possible decisions \(U_{p}\), the set of not admissible generalized states \(S_{N}\) and the set of goal generalized states \(S_{G}\) are formally defined by means of logical formulas. This makes it possible to represent all kinds of information regarding the problem (including various temporal relationships and restrictions of the process) in a convenient way [10]. It should be noted that the set of possible states depends on previously made decisions, and that at any stage the decision is made based on current data and system limitations. During a simulation, no decision can be made that would lead the system to a not admissible state. The heuristic optimization algorithm based on the ALMM is designed to maximize the criterion function and to determine the next state outside the set of not admissible states. Moreover, the considered model makes a formal analysis of admissibility possible.

The considered approach allows the creation of a predictive schedule. The set of possible decisions should be designed so as to take into account additional information (not yet possible in a given state, but predicted and possible in the near future). It is not mandatory to assign a new job to a machine or resource as soon as possible (as soon as the machine finishes an operation and is ready to start the next one). It is possible to design a decision that assigns no new job to the machine if that is not preferred. Preferences can be predicted. This differs from the typical queuing scheduling system. Moreover, as shown in [22], instead of using only information about currently available jobs or machines, ALMM can include additional information about their expected availability in the future. A learning-based method can be used to determine the schedule [21]. A characteristic element of this method is gathering information during the process. All process states are stored together with their characteristic parameters. Each solution is analyzed, and this analysis is used to obtain knowledge about the process and its control. The gathered information also covers events related to the occurrence of quality defects and machine failures, or other elements of the production process that caused the defect. In the switching method, we gather information when a disturbance occurs (product defect, machine failure, etc.), and the switching states are saved with the corresponding parameters. Then, we analyze this data using selected data analysis methods. In such a case, time can be calculated based on the given rules and restrictions of the problem or predicted based on historical data.

3.2 Reactive scheduling

Reactive scheduling can be built using the ALMM switching method. This method switches the model when some event occurs and creates a partial re-schedule. The switching method is a formal approach to modelling discrete multi-stage decision processes in which unexpected events occur. It was considered in [15, 16, 26].

Here, we focus on processes in which the unexpected events include detection of a defect in the product. Practical examples of such events are: irregular application of paint on a painted element, bad balance of springs in a mattress, or back filling of hollow headings. This type of defect should be detected and repaired to achieve the postulates of ZDM.

The ALMM switching method consists of the following elements:

  • analysis of types of disturbances,

  • determination of the set of switching states \(\mathcal {S}_{switch}\),

  • division of the problem into subproblems,

  • algebraic-logical models (ALM) creation for all subproblems,

  • switching type specification,

  • determination of the switching rules \(\mathcal {R}_{switch}\),

  • definition of the switching function.

Analysis of the problem

This step deals with determining some characteristic problem features. In particular, limitations or requirements reflected in the problem are detected. The preferences of certain types of states or decisions are determined. Moreover, both kinds of disturbances and the moment of occurrence are identified.

Determination of the set of switching states \(\mathcal {S}_{switch}\)

The set of switching states includes three types of states:

  • all states when a new disturbance occurs,

  • the states when a previously detected disturbance is removed, for all types of disturbance,

  • the states when previously detected disturbance is removed and a new disturbance occurs in the same process state.

In the discussed problem with product quality control, the set of switching states includes the states when 1) at least one machine with quality control finishes its job and quality control detects some fault and 2) any additional machine has just finished repairing a job with the defect.
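The two conditions above can be written as a simple predicate on the system state. The sketch below is only an illustration under assumed data structures: the `MachineStatus` record, its field names and the machine identifiers are hypothetical and do not come from the ALMM formalism. An idle machine is represented by an infinite time to completion, in the spirit of the \(\tau_{idle}\) convention of Remark 1.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class MachineStatus:
    """Hypothetical per-machine part of the proper state x."""
    job: Optional[str]          # job on the machine, None if idle
    time_left: float            # time to completion; 0.0 = just finished
    defect_found: bool = False  # meaningful only for quality-control machines


def just_finished(status: MachineStatus) -> bool:
    return status.job is not None and status.time_left == 0.0


def is_switching_state(
    machines: Dict[str, MachineStatus],
    qc_machines: Iterable[str],
    repair_machines: Iterable[str],
) -> bool:
    # Condition 1: a machine with quality control finished and found a fault.
    defect_detected = any(
        just_finished(machines[m]) and machines[m].defect_found
        for m in qc_machines
    )
    # Condition 2: a repair machine has just finished repairing a job.
    repair_done = any(just_finished(machines[m]) for m in repair_machines)
    return defect_detected or repair_done


IDLE = MachineStatus(job=None, time_left=float("inf"))

state_defect = {"M_q": MachineStatus("j2", 0.0, defect_found=True), "M_r": IDLE}
state_ok = {"M_q": MachineStatus("j1", 0.0), "M_r": IDLE}
state_repaired = {"M_q": IDLE, "M_r": MachineStatus("j2", 0.0)}
```

With these example states, `is_switching_state(state_defect, ["M_q"], ["M_r"])` and `is_switching_state(state_repaired, ["M_q"], ["M_r"])` hold, while `state_ok` is an ordinary state in which the current model keeps running.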

Division of the problem into subproblems

Adequate division of the initial non-deterministic problem into subproblems is crucial and is related to the types of defects. For manufacturing problems with quality defects, the initial problem includes all types of disturbances and all repair machines dedicated to these disturbances. The division of the problem into subproblems reflects the possible consequences of the event. In this problem, it is necessary to introduce the quality repair process. Subproblems are constructed as follows:

  • The first problem is the classic deterministic scheduling problem in which the occurrence of quality defects is not considered, but only the possibility of detecting them, i.e. a quality control station is considered (within the \(M_{q}\) machine or several machines with quality control \( {\mathscr{M}}_{\mathcal {Q}}\)). The problem defines how a given set of jobs without disturbances is processed for given processing times on a given set of machines.

  • Further scheduling problems in a given system consider the repair of specific types of quality defects and a set or sequence of repair machines dedicated to these types of deficiencies.

As a result of the division into subproblems, in a given state of the system only the proper subproblem is considered, which includes only the required repair machines.

Algebraic-logical models (ALM) creation for all subproblems

All subproblems are modeled as algebraic-logical models (ALMs). The structure of the ALMs depends on the type of data changes and may be the same or different. The advantage is that we specify models of problems that are simpler than the original initial problem. In the simplest case, we can use the same model with different parameters. Furthermore, for some problems with various types of data changes, we can specify several models of simpler problems (subproblems of the initial problem).

Switching type specification

The switching type depends on the number of disturbance types. For one type of disturbance, the following types of switching are distinguished:

  • the system switches from the process without the disturbance to process with disturbance,

  • the system switches from the process with disturbance to the same process with modified parameters and data, when a new disturbance is detected while the earlier disturbance has not been removed yet or only a part of it has just been removed,

  • the system switches from the process with disturbance to the process without disturbance when all disturbances have just been removed.

In problems with more than one type of disturbance, the system can switch from the process with one type of disturbance to the process with another type of defect. In such a case, all possibilities of switching should be analysed, i.e. switching from a process with one kind of disturbance to the process with another type of disturbance or to the process without any disturbances.

Determination of the switching rules \(\mathcal {R}_{switch}\)

Switching rules determine how, as a result of an event, the current state of the production process (represented by \(s_{k}\) of the current model \(ALM_{now}\)) goes to the next state of the process (represented as the initial state \(s_{0}\), usually in another model \(ALM_{next}\)). The new state determined in this way includes changes that are a consequence of the event. The set of these rules determines the switching function (Fig. 1).

Fig. 1 The idea of generating a single process trajectory with switching between models. Switching occurs as a result of an event in the current state of the production process (represented by \(s_{k}\) of the current model \(ALM_{now}\)) and goes to the next state of the process (represented as the initial state \(s_{0}\), usually in another model \(ALM_{next}\))

Definition of the switching function \(f_{switch}\)

Determining the rules depends on the types of switching. It consists of determining algebraic formulas that recalculate the values of parameters when switching within the same ALM, or that calculate a new state and new values of parameters when switching between different ALMs. The switching rules usually do not have to be set for all combinations of switching between models. Switching between models that are not related should be excluded.

Let us consider the following notation. Let \( \mathcal {S}_{switch} \) denote the set of switching states from all the models under consideration. Let \(\mathcal {D}\) denote the space of data sets. Let \(\mathcal {A}{\mathscr{L}}{\mathscr{M}}\) denote the set of all considered models for the initial problem. Let \(f_{switch}\) denote the switching function, which is defined as follows:

Definition 1

The switching function is a function which, for the model \(ALM \in \mathcal {A}{\mathscr{L}}{\mathscr{M}}\) in the current switching state \(s \in \mathcal {S}_{switch}\) and the data of the subproblem, determines the appropriate model and the corresponding data of the subproblem:

$$ f_{switch}: \mathcal{A}\mathcal{L}\mathcal{M} \times \mathcal{S}_{switch} \times \mathcal{D} \rightarrow \mathcal{A}\mathcal{L}\mathcal{M} \times \mathcal{D}. $$
(1)

The switching function is defined by means of two functions:

$$ f_{switch}=({f_{switch}}_{|\mathcal{A}\mathcal{L}\mathcal{M}},{f_{switch}}_{|\mathcal{D}}), $$
(2)

where: \({f_{switch}}_{|\mathcal {A}{\mathscr{L}}{\mathscr{M}}}: \mathcal {A}{\mathscr{L}}{\mathscr{M}} \times \mathcal {S}_{switch} \times \mathcal {D} \to \mathcal {A}{\mathscr{L}}{\mathscr{M}}\) determines the next model,

\({f_{switch}}_{|\mathcal {D}}: \mathcal {A}{\mathscr{L}}{\mathscr{M}} \times \mathcal {S}_{switch} \times \mathcal {D} \to \mathcal {D}\) determines the next set of data for subproblem.

The switching function determines the next system state based on the previous state and the detected disturbances, without a decision being taken in this state, whereas the transition function in the model determines the next system state based on the previous state and the decision.
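Definition 1 and Eq. (2) can be illustrated by composing the switching function from its two component functions. The sketch below is a minimal illustration under stated assumptions: models are represented only by their names, and the names `ALM_base`, `ALM_repair`, `M_repair` and the keys of the data dictionary are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

Data = Dict[str, Any]            # element of the data space D
SwitchState = Tuple[Any, float]  # switching state s = (x, t)
Model = str                      # a model is identified by its name here


def make_switching_function(
    next_model: Callable[[Model, SwitchState, Data], Model],  # f_switch | ALM
    next_data: Callable[[Model, SwitchState, Data], Data],    # f_switch | D
) -> Callable[[Model, SwitchState, Data], Tuple[Model, Data]]:
    """Compose f_switch from its two components, as in Eq. (2)."""
    def f_switch(model: Model, s: SwitchState, data: Data) -> Tuple[Model, Data]:
        return next_model(model, s, data), next_data(model, s, data)
    return f_switch


# Hypothetical components: switch to the repair model while defects are pending.
def next_model(model: Model, s: SwitchState, data: Data) -> Model:
    x, _t = s
    return "ALM_repair" if x["defective"] else "ALM_base"


def next_data(model: Model, s: SwitchState, data: Data) -> Data:
    x, _t = s
    new_data = dict(data)  # the current data set is left untouched
    new_data["repair_jobs"] = sorted(x["defective"])
    extra = ["M_repair"] if x["defective"] else []
    new_data["machines"] = list(data["base_machines"]) + extra
    return new_data


f_switch = make_switching_function(next_model, next_data)
data_now = {"base_machines": ["M1", "M_q"]}
model_next, data_next = f_switch("ALM_base", ({"defective": {"j2"}}, 5.0), data_now)
```

Note that no decision u appears among the arguments of `f_switch`, which mirrors the distinction between the switching function and the transition function made above.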

The switching method is specified by the 5-tuple

$$ ({\mathcal{D}}, \mathcal{A}\mathcal{L}\mathcal{M}, \mathcal{R}_{switch}, \mathcal{S}_{switch}, f_{switch}), $$

where:

  • \( \mathcal {D} \) - set of data and parameters of the problem, which comprises the job set \( \mathcal {J}\), the machine set \( {\mathscr{M}} \), the resources set \( \mathcal {R} \) and their parameters,

  • \( \mathcal {A}{\mathscr{L}}{\mathscr{M}} \) - set of models of the considered subproblems,

  • \( \mathcal {R}_{switch} \) - set of switching rules between models,

  • \( \mathcal {S}_{switch} = (X_{switch}, T_{switch}) \) - set of states in which switching occurs,

  • \(f_{switch}\) - switching function.

Therefore, the switching algorithm for the production problems with quality deficiencies can be represented as follows:

  1. Determining all components of the switching method, i.e.:

    • input data \( \mathcal {D}_{A} \) for the first subproblem: defining the job set \( \mathcal {J}_{A} \), the machine set \( {\mathscr{M}}_{A} \), the resources set \( \mathcal {R}_{A} \) and their parameters,

    • algebraic-logical models of the subproblems and determination of \(ALM_{A}\), the model of the first subproblem,

    • the switching rules \(\mathcal {R}_{switch}\),

    • the switching states set \(\mathcal {S}_{switch}\),

    • the switching function \(f_{switch}\).

  2. Determining the current data \( \mathcal {D}_{now} \) and the current model \(ALM_{now}\).

  3. Determining the initial state \((s_{now})_{0}\) of the current model.

  4. Determining the next state of the system \( s^{\prime }\) using the transition function of the current model, \( s^{\prime } = f (u, s) \), and the decision taken.

  5. Checking whether the state belongs to the set of switching states, \( s \in \mathcal {S}_{switch} \). If not, determine the next state according to Step 4. If yes, go to Step 6.

  6. Selecting the switching rule from \(\mathcal {R}_{switch}\).

  7. Based on the switching function, determining the new data \( \mathcal {D}_{next} \), the model \(ALM_{next}\) and the initial state \((s_{next})_{0}\) of the new model. Then go to Step 4.

The algorithm flowchart is presented in Fig. 2.

Fig. 2 The general algorithm of the switching method for scheduling problems of production with quality deficiencies. In contrast to the classic algorithm for generating trajectories using ALMM, this algorithm checks not only whether the calculated state belongs to the goal or to the not admissible states, but also whether it belongs to the switching states. It also specifies the conditions for changing the ALMM model
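The loop of the switching algorithm (Steps 2 to 7) can be sketched as follows. This is a simplified sketch under several assumptions, not the authors' implementation: the `ALM` container, the helper `switch` (which bundles the switching function with a rule that carries the state over unchanged as the initial state of the next model), and the two-job toy problem with one faulty job are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Tuple

State = Tuple[Any, float]  # generalized state s = (x, t)


@dataclass
class ALM:
    """Hypothetical minimal container for one algebraic-logical model."""
    name: str
    decisions: Callable[[State, dict], list]         # U_p(s)
    transition: Callable[[Any, State, dict], State]  # f(u, s)
    is_goal: Callable[[State], bool]                 # s in S_G
    is_non_admissible: Callable[[State], bool] = lambda s: False  # s in S_N


def run_with_switching(model, data, s0, choose, in_s_switch, switch, max_steps=1000):
    s, log = s0, [(model.name, s0)]                            # Steps 2-3
    for _ in range(max_steps):
        if model.is_goal(s) or model.is_non_admissible(s):
            break
        candidates = model.decisions(s, data)
        if not candidates:
            break
        s = model.transition(choose(s, candidates), s, data)   # Step 4
        if in_s_switch(model, s):                              # Step 5
            model, data, s = switch(model, s, data)            # Steps 6-7
        log.append((model.name, s))
    return log


# Toy problem: proper state x = (jobs to process, jobs waiting for repair).
def base_transition(u, s, data):
    (todo, repair), t = s
    todo = tuple(j for j in todo if j != u)
    if u in data["faulty"]:  # simulated quality-control outcome
        repair = repair + (u,)
    return ((todo, repair), t + data["p"][u])


def repair_transition(u, s, data):
    (todo, repair), t = s
    return ((todo, tuple(j for j in repair if j != u)), t + data["repair_time"])


def goal(s):
    return s[0] == ((), ())


BASE = ALM("ALM_base", lambda s, d: list(s[0][0]), base_transition, goal)
REPAIR = ALM("ALM_repair", lambda s, d: list(s[0][1]), repair_transition, goal)


def in_s_switch(model, s):
    defect_pending = bool(s[0][1])
    return defect_pending if model is BASE else not defect_pending


def switch(model, s, data):
    return (REPAIR if model is BASE else BASE), data, s


DATA = {"p": {"j1": 2.0, "j2": 3.0}, "faulty": {"j2"}, "repair_time": 1.0}
log = run_with_switching(BASE, DATA, ((("j1", "j2"), ()), 0.0),
                         lambda s, c: c[0], in_s_switch, switch)
```

In the toy run the defect of `j2` is detected when its processing ends, the process switches to the repair model, and it switches back once the repair is finished.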

3.3 Optimization algorithms

Trajectory generation is a process controlled by various general methods and/or specific optimization algorithms or algorithms for searching admissible solutions. The general methods include graph search algorithms and methods such as Branch & Bound. A special type of method comprises meta-heuristics and algorithms based on the ALMM methodology. The following methods using the ALMM of a multistage decision process have been proposed in earlier works:

  • a method that uses a specially designed local optimization task and the idea of a semi-metric: the special local optimization criterion q(u,x,t) consists of three parts, namely components related to the value of the global index of quality for the generated trajectory, components related to additional limitations or requirements, and components responsible for the preference of certain types of decisions resulting from problem analysis; the local optimization task may also use the notion of a semi-metric in the state space [11];

  • learning-based methods: these use the above specially designed local optimization task and the idea of a semi-metric, and additionally gather information about the quality of the solution during the search process. Learning is realized in such a way that the gathered information is used to change the coefficients of the local optimization task components [13, 23];

  • a method based on the learning process connected with pruning non-perspective solutions: an extended learning-based method in which the gathered knowledge can also be used to change the form of the local optimization task. Some components can be removed, or new components related to new limitations can be added, because new additional limitations may arise in the current state. Additionally, the collected information is used to prune non-perspective trajectories. It should be noted that every pruned partial trajectory may be analyzed, because pruned trajectories are a source of information that can be used to modify the coefficients of the local criterion [21];

  • substitution tasks method: a solution is generated using a sequence of dynamically created local optimization tasks, so-called substitution tasks. Such a task is a certain substitution multistage process with a substitution criterion. Substitution tasks are created to facilitate decision making in a given state by substituting the global optimization task with a simpler local task. A new or modified substitution task is defined based on information gained during an automatic analysis of the new process state. To construct the substitution task, the concept of so-called intermediate goals is used, which determine a certain set of final states of the substitution process [12].

ALMM-based approaches make it possible to solve non-deterministic discrete optimization problems. To optimize a scheduling problem with defects, a combination of these meta-heuristics and the switching method should be used. The search algorithm determines the solution of the deterministic problem on the basis of discrete process simulation. The ALMM switching method is used when events occur during the process. It makes it possible to present a problem using simple models and a switching function, which specifies the rules for using these models. Paper [25] presents the combination of a hybrid algorithm with a special local optimization task and the switching of algebraic-logical models. In such a case, defining the local criterion consists in determining its particular components based on the identified limitations, requirements and preferences. The definition of the component related to the estimation of the quality index for the final trajectory section is also important (it may have different forms and precision). Moreover, a modification of the local optimization task can be defined. It should be noted that the form of the local criterion can be designed in different ways. The first way is to determine the components based on the properties of the problem without disturbance and then use the criterion in this form for the process with the disturbance. The second way is to design an additional component or components associated with the considered disturbance. When several different types of disturbances have been identified, suitable components may be used for each of them. In each state, the set of decisions that can be taken, \(U_{p}(s)\), is generated, and the best decision with respect to the local criterion is selected.
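The selection of the best decision with respect to a local criterion can be sketched as a weighted sum of components evaluated for every candidate in \(U_{p}(s)\). The sketch below is only illustrative: the weighted-sum form, the convention that a lower value is better, the three example components and the data `P` and `DUE` are assumptions introduced here, not the criterion used in [25].

```python
from typing import Any, Callable, Sequence, Tuple

State = Tuple[Any, float]  # generalized state s = (x, t)
Component = Callable[[str, State], float]


def local_criterion(u: str, s: State, weights: Sequence[float],
                    components: Sequence[Component]) -> float:
    """Assumed form: weighted sum of components (lower is better here)."""
    return sum(w * c(u, s) for w, c in zip(weights, components))


def best_decision(s: State, candidates: Sequence[str], weights: Sequence[float],
                  components: Sequence[Component]) -> str:
    return min(candidates, key=lambda u: local_criterion(u, s, weights, components))


# Hypothetical data: processing times and due dates of the candidate jobs.
P = {"a": 4.0, "b": 2.0, "repair_c": 3.0}
DUE = {"a": 5.0, "b": 10.0, "repair_c": 10.0}

components = (
    lambda u, s: s[1] + P[u],                         # estimate of the quality index
    lambda u, s: max(0.0, s[1] + P[u] - DUE[u]),      # limitation: lateness penalty
    lambda u, s: -1.0 if u.startswith("repair_") else 0.0,  # preference for repairs
)

s = (None, 0.0)
candidates = ["a", "b", "repair_c"]
without_disturbance = best_decision(s, candidates, (1.0, 2.0, 0.0), components)
with_disturbance = best_decision(s, candidates, (1.0, 2.0, 5.0), components)
```

Setting the third weight to zero corresponds to using the criterion designed for the problem without disturbance, while a positive weight activates the additional component associated with the disturbance. A learning-based method would adjust the `weights` between runs.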

4 Proposed ZDM scheduling strategy

A scheduling tool for achieving Zero-Defect Manufacturing (ZDM) has been proposed in [32]. This is a new approach. The authors presented the conceptual framework of the proposed tool together with the information flow among the scheduling tool, the Decision Support System (DSS), and the shop-floor (Fig. 3). The DSS utilizes real-time data from the shop-floor in order to predict or detect a defect. Then, the outcome of the DSS is fed into the scheduling tool. The scheduling tool updates the schedule according to the suggestions produced by the DSS. However, this tool is still a concept, without details of the proposed solution. Neither methods nor algorithms for specific solutions are given. Therefore, we propose a ZDM scheduling strategy based on the ALMM approach. It not only makes it possible to define the problem formally, but also comes with specific solutions that we have already developed. In particular, the product-oriented algorithms were tested [25].

Fig. 3 Scheduling tool for achieving ZDM proposed in [32]. The framework consists of the integrated modules of the Decision Support System and the Scheduling Tool

The proposed approach includes both decision support system and scheduling tool.

According to the specification of the Decision Support System (DSS) given in [32], the tool will consist of three individual components:

  • the defect detection module,

  • the defect prediction module,

  • the decision making algorithms for suggesting a solution based on user-defined rules.

In our proposition, the DSS tool uses the switching method and consists of:

  • ALMM modeller, which allows one to define the models of the problem in the ALMM methodology. This includes creating an algebraic-logical model of the process, setting the optimization criterion and defining additional knowledge about the problem (e.g. as auxiliary procedures or inference rules). The model can be built in different ways, e.g. by using pre-defined components or by defining it from scratch. The modeller was presented in [14, 34].

  • ALMM switching modeller, which allows one to define all elements of the switching method. This includes analysing the types of disturbances, determining the set of switching states, the number of subproblems and their models, specifying the types of switching, and determining the switching rules.

  • Database that collects previously developed ALMM models and ALMM switching models together with the solutions obtained using the algorithms stored in the algorithm module.

  • Predictive results interpreter module, which identifies the solution of the given problem (the best one or an admissible one) on the basis of the data stored in the database and predictive analyses of defects. Predictive analyses can be based on parameters collected from the switching method and on implemented predictive models such as time-series analysis, regression, the Bayesian approach and clustering.

The scheduling tool works in real time. The output of the DSS is fed into the dynamic scheduling tool along with the shop-floor status and specification. The new schedule is produced according to a specific set of user-defined criteria.

In our proposition, the scheduling tool contains a special algorithm module, which provides a collection of already implemented methods and algorithms described in Section 3.3. It also allows one to design new algorithms, especially meta-heuristics based on the ALMM methodology and the ALMM switching method (Fig. 4). The ALMM Solver, which consists of the ALMM modeller and the algorithm module, was partially implemented and presented in [24, 34].

Fig. 4

ALMM scheduling tool for achieving ZDM with additional modules: ALMM models modeller, ALMM switching method modeller, Database with previously used models and solutions, Predictive results interpreter module and Algorithm module

Thus, an implementation based on ALMM methods in the ZDM framework can bring benefits such as:

  • the ALMM models define multi-stage decision problems, so they can be used to define decision-making rules dynamically,

  • the switching method can be used to detect defects and to initiate rescheduling in the scheduling tool,

  • the defect prediction function can be designed on the basis of data from the switching method and a time-series prediction algorithm.

5 Application of proposed approach in flow-shop system with different types of product defect

Let us consider a flow-shop manufacturing problem with quality control for defect detection, removal of manufacturing quality defects on additional repairing machines, and job re-treatment on part or all of the technological route. The problem is NP-hard [15]. We consider this problem to illustrate how ALMM switching methods solve scheduling problems while supporting the ZDM strategy.

There is more than one machine with quality control in the technological route (\( 1 <| {\mathscr{M}}_{\mathcal {Q}} | \leq | {\mathscr{M}} | \)) and more than one additional machine for repairing defective components (\( | {\mathscr{M}}_{\mathcal {D}} |> 1 \) and \( {\mathscr{M}}_{\mathcal {D}} = \{M_{m + 1}, \ldots , M_{md} \} \)). The choice of the repair machine depends on the degree or type of the defect and results from quality control: each quality control yields the information to which specific machine the job should be transferred.

We use the following notation to describe the problem under consideration. There is a set of dedicated machines \({\mathscr{M}}=\{M_{1},M_{2},\ldots ,M_{m}\}\) in the production route. The set of jobs \({\mathcal {J}=\{J_{1},J_{2},\ldots ,J_{n}\}}\) must be processed by these machines. One job is equivalent to a batch of a few elements. The number of elements in the batch may differ between jobs. Therefore, we assume that the jobs can be divided into smaller jobs. Each job j consists of m operations. The processing time of operation i of job j is denoted by \(p_{i,j}\). The processing time on a machine with quality control includes the quality control time. All jobs follow the same route from the first machine to the last one \((M_{1},M_{2},\ldots ,M_{m})\). There is a store in front of each machine, where a job must wait when the machine is busy. There is also a store at the end of the production line for finished jobs. The store before the first machine is the initial store. The final store contains only jobs completed correctly (with adequate quality). Let \( \mathcal {W} = \{W_{0}, W_{1}, \ldots , W_{m} \} \) denote the set of stores, where \(W_{0}\) is the final store and \(W_{i}\) is the store before machine \(M_{i}\). We assume unlimited storage capacity. Let \( {\mathscr{M}}_{\mathcal {D}} = \{M_{m + 1}, \ldots , M_{md} \} \), such that \( {\mathscr{M}}_{\mathcal {D}} \cap {\mathscr{M}} = \emptyset \), denote the set of additional machines used to repair faulty elements. There is also a store in front of each repair machine. Let us denote the set of these stores by \( \mathcal {W}_{\mathcal {D}} = \{W_{m + 1}, \ldots , W_{md} \} \). After being repaired on the repair machine, the job returns for reprocessing to a specifically selected machine in the process line.
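
As an illustration of this notation, the following Python sketch stores a problem instance and checks the structural assumptions stated above. The class name `FlowShopInstance`, its field names, and the indexing of the repair machines as \(m+1,\ldots ,m+md\) are assumptions made only for this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class FlowShopInstance:
    m: int                            # number of route machines M_1..M_m
    n: int                            # number of jobs J_1..J_n
    p: Dict[Tuple[int, int], float]   # p[(i, j)]: time of operation i of job j
    qc_machines: Tuple[int, ...]      # route machines with quality control
    md: int                           # number of repair machines (assumed indexing)

    def repair_machines(self) -> Tuple[int, ...]:
        """Indices of the repair machines, assumed to be m+1..m+md."""
        return tuple(range(self.m + 1, self.m + self.md + 1))

    def stores(self) -> Tuple[int, ...]:
        """Store indices: 0 is the final store W_0, i is the store before M_i."""
        return tuple(range(0, self.m + 1))

    def validate(self) -> None:
        """Check 1 < |M_Q| <= |M|, |M_D| > 1 and completeness of p."""
        route = set(range(1, self.m + 1))
        qc = set(self.qc_machines)
        if not 1 < len(qc) <= self.m:
            raise ValueError("need 1 < |M_Q| <= |M|")
        if not qc <= route:
            raise ValueError("quality control machines must lie on the route")
        if self.md <= 1:
            raise ValueError("need |M_D| > 1")
        for i in route:
            for j in range(1, self.n + 1):
                if (i, j) not in self.p:
                    raise ValueError(f"missing processing time p[{i},{j}]")
```

Because the repair machines are indexed from m + 1 upwards, the condition that the repair machines and the route machines are disjoint holds by construction in this representation.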

Figure 5 shows an example production line for the problem defined in this way.

Fig. 5

Example of a flow-shop technological route with stores, quality control and repairing machines. The production line includes two machines on which quality control is carried out and three additional machines for repairing defects. The first quality control machine detects one type of defect, which is repaired on one additional repairing machine. The second quality control machine detects two different types of defect, which are fixed on two different additional repairing machines. The system thus provides for the detection of three different types of product defect and three different ways of eliminating them

5.1 Models for subproblems

The first subproblem is the flow-shop problem with more than one quality control machine but without any repair machine. The number of machines with quality control determines the number of types of quality defects: each machine with quality control detects one type of quality defect.

Let \(U_A\), \(S_A\), \((s_0)_A\), \(f_A\), \((S_N)_A\) and \((S_G)_A\) denote, respectively, the set of decisions, the set of proper states, the initial state, the transition function, the set of non-admissible states and the set of goal states in the algebraic-logical model \(ALM_A\) of the flow-shop system with time limits. The set of jobs is denoted by \(\mathcal {J}_{A}\) and the set of machines by \({\mathscr{M}}\).

State of system

The generalized state of the process \(s_A = (x_A, t_A)\) at any instant \(t_A\) can be described by the states of the machines in the technological route, the state of the initial store and the states of the work-in-progress stores. The state intentionally does not include the final store with completed jobs, because this information is redundant (the set of completed jobs is calculated below).

Significant elements of \(ALM_A\) for the switching method are the machines with quality control \( M_{q_{\gamma }} \), where γ > 1, and the additional distinction of several return machines.

Therefore, the generalized state of the process \(s_A = (x_A, t)\) at a given moment t is described by the states of the machines in the route, the states of the machines with quality control, the state of the initial store and the states of the work-in-progress stores.

Thus, the proper state of the process is defined as the tuple:

$$ \begin{aligned} {x_{A}}={}& (\underbrace{{{x_{A}^{1}}},{\ldots} ,{{x_{A}^{m}}}}_{\text{warehouses state}},\\ {}&\underbrace{{x_{A}^{m+1}}, \ldots, {x_{A}^{m+q_{1}}}, \ldots, {x_{A}^{m+i}},\ldots,{x_{A}^{m+q_{d}}},{\ldots} ,{x_{A}^{2m}}}_{\underset{\text{with states of quality control machines}}{\text{machines states}}}) \end{aligned} $$
(3)

where: \({{x_{A}^{1}}}\) is the state of the initial store, \({{x_{A}^{i}}}\) is the state of the i th work-in-progress store (between the (i − 1) th and the i th machine) for \({i=2,\ldots ,|{\mathscr{M}}|}\), \({x_{A}^{|{\mathscr{M}}|+i}}\) is the state of the i th machine for \({i=1,\ldots ,|{\mathscr{M}}|}\), and \({x_{A}^{m+q_{\gamma }}}\) is the state of the machine with quality control \(M_{q_{\gamma }}\), where \(\gamma \in \{1,\ldots ,d\}\).

The coordinate \({{x_{A}^{1}}}\) is the set of jobs to be processed by the first machine, i.e. jobs whose processing has not started yet. Each coordinate \({{x_{A}^{i}}}\), where \({i=2,\ldots ,{|{\mathscr{M}}|}}\), is the set of jobs that have been finished on the (i − 1) th machine, are available in the current state for processing by the i th machine, and have not yet been assigned to it.

The structure of the machine state is as follows: \({{x_{A}^{i}}=(\beta _{A},\tau _{A})}\), where \({\beta _{A} \in \{0,1,\ldots ,|\mathcal {J}_{A}|\}}\) is the number of the job currently being processed on the machine (0 denotes that no job is assigned to the machine) and \(\tau _{A} \in [0,\tau _{idle})\) is the time remaining to complete the current job (\(\tau _{idle}\) denotes that the machine is not working).

The structure of the state of a machine with quality control is as follows:

$$ x^{m+q}=(\beta,\tau,QC_{1}, \ldots, QC_{md}, \widehat{\beta}) $$

where the particular coordinates are as follows: \(\beta \in \{0,1,\ldots ,n\}\) is the number of the job currently being processed on the machine (0 denotes that no job is assigned to the machine), \(\tau > 0\) is the time remaining to complete the current job (\(\tau _{idle}\) denotes that the machine is not working), \(QC_{\gamma } \in [0,100]\) is the percentage of defects in the lot, where \(\gamma = 1,\ldots ,md\) and md is the number of additional repairing machines, \(|{\mathscr{M}}_{\mathcal {D}}|=md\) (0 denotes a positive result of quality control), and \(\widehat {\beta } \in \{0,1,\ldots ,n\}\) is the number of the job in which defects were detected.

We say that the i th machine is not working (is free) in state s when its state is \(x_{A}^{|{\mathscr{M}}|+i}(s)=(0,\tau _{idle})\). The i th machine is working (is busy) in state s when \({x}_{A}^{|{\mathscr{M}}|+i}(s)=(\beta _{A} \neq 0, \tau _{A}>0 )\). There are no states in which the machine is working but no job is allocated to it (\(\beta = 0\) and \(\tau > 0\)), nor states in which the machine is not working but has a job assigned (\(\beta > 0\) and \(\tau = 0\)). Such combinations of coordinate values do not occur in the real manufacturing process.

Let \({\beta _{A}^{i}}\) and \({\tau _{A}^{i}}\) denote the respective values for the i th machine, and let \({x_{A}^{i}}(s)\) denote the value of the i th coordinate in a particular state \(s_A\).

In the initial state (\((t_0)_A = 0\)), no job is being processed and no job has been assigned to any machine. Thus, the initial system state is as follows:

$$ {(x_{0})_{A}=((x_{0})_{A}^{1}, {\ldots} ,(x_{0})_{A}^{|\mathcal{M}|},(x_{0})_{A}^{|\mathcal{M}|+1},{\ldots} ,(x_{0})_{A}^{2|\mathcal{M}|})} $$

where:

$$ \begin{aligned} (x_{0})_{A}^{1} ={}& \mathcal{J}_{A}\\ (x_{0})_{A}^{i} ={}& \emptyset\ \text{for}\ {i=2,\ldots, |\mathcal{M}|}\\ (x_{0})_{A}^{i} ={}& (0,\tau_{idle})\ \text{for}\ {i={|\mathcal{M}|+1}, \ldots, 2|\mathcal{M}|}. \end{aligned} $$
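
The initial state can be written down directly in code. The sketch below is only an illustration: the tuple layout (m store coordinates followed by m machine coordinates), the function name `initial_state` and the encoding of \(\tau _{idle}\) as infinity are assumptions of this sketch.

```python
import math
from typing import Iterable

# Assumption: tau_idle is encoded as infinity.
TAU_IDLE = math.inf


def initial_state(jobs: Iterable[int], m: int):
    """Return ((x_0)_A, (t_0)_A) for a route with m machines.

    Coordinates 0..m-1 are the stores, coordinates m..2m-1 are the machines.
    """
    if m < 1:
        raise ValueError("at least one machine is required")
    # All jobs wait in the initial store; the other stores are empty.
    stores = (frozenset(jobs),) + tuple(frozenset() for _ in range(m - 1))
    # Every machine is free: no job assigned and remaining time tau_idle.
    machines = tuple((0, TAU_IDLE) for _ in range(m))
    return stores + machines, 0.0
```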

We define the set of completed jobs \((\mathcal {J}_{F})_{A}\) in state s as the difference between the set of all jobs and the union of the jobs in the stores and the jobs being processed by the machines: \((\mathcal {J}_{F})_{A}(s)=\mathcal {J}_{A}\setminus \bigcup _{i=1}^{|{\mathscr{M}}|} \left ({x_{A}^{i}}(s) \cup \{\beta _{A}^{i+|{\mathscr{M}}|}(s)\} \right )\).

The set of non-admissible generalized states \((S_N)_A\) includes the states in which some job has not been finished although its due date d(j) has passed:

$$ {(S_{N})_{A}=\{s_{A}: \exists_{j\in J} \ j\notin ({\mathcal{J}}_{F})_{A}(s_{A}) \ \wedge \ d(j)<t_{A}\}}. $$

A state \(s_A\) is a goal state if all jobs have been completed and no job deadline has been exceeded. The set of goal generalized states \((S_G)_A\) is as follows:

$$ {(S_{G})_{A}=\{s_{A}: \ ({\mathcal{J}}_{F})_{A}(s_{A})= \mathcal{J}_{A} \ \wedge \ \forall_{j\in J_{A}} \ d(j)\geq t_{A}\}}. $$
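
The three definitions above translate into simple predicates. In the following sketch, the state layout (m store sets followed by m machine pairs \((\beta ,\tau )\)), the function names and the dictionary `due` holding the due dates d(j) are assumptions made for illustration.

```python
from typing import Dict, Set


def completed_jobs(x, jobs: Set[int], m: int) -> Set[int]:
    """(J_F)_A(s): all jobs minus those waiting in stores or being processed."""
    in_system = set()
    for i in range(m):
        in_system |= set(x[i])        # jobs waiting in the i-th store
        beta, _tau = x[m + i]         # job on the i-th machine (0 means none)
        if beta != 0:
            in_system.add(beta)
    return set(jobs) - in_system


def is_non_admissible(x, t: float, jobs: Set[int], m: int,
                      due: Dict[int, float]) -> bool:
    """True if some unfinished job has already passed its due date."""
    done = completed_jobs(x, jobs, m)
    return any(j not in done and due[j] < t for j in jobs)


def is_goal(x, t: float, jobs: Set[int], m: int,
            due: Dict[int, float]) -> bool:
    """True if all jobs are completed and no due date is exceeded at time t."""
    return completed_jobs(x, jobs, m) == set(jobs) and all(
        due[j] >= t for j in jobs
    )
```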

Decisions

In the problem, we assume that a decision consists of assigning jobs to the individual machines at the same time. Each coordinate \({{u_{A}^{i}} \in {U_{A}^{i}}}\) represents a separate decision and refers to the i th machine (\({i=1,\ldots ,|{\mathscr{M}}|}\)).

The following assumptions are made regarding the decision:

  • the decision to allocate the next job to a machine can be taken if and only if the machine has finished the previously assigned job and the job has been processed by the previous machine in the technological route,

  • a decision, once taken, cannot be changed.

A busy machine (carrying out a decision taken earlier) can only continue processing the previously assigned job. A machine that is free in the current state can be assigned a job that has already been processed by the previous machine in the technological route. If no job is assigned to a free machine, the machine remains free.

Thus, the decision is a vector \({u_{A}=({u_{A}^{1}},{\ldots } , u_{A}^{|{\mathscr{M}}|})}\), and the value of each coordinate \({u_{A}^{i}} \in \mathcal {J}_{A}\cup \{0\}\) is as follows:

  • if the j th job is being processed by the i th machine, then the only possible decision is to continue processing, \({u^{i}_{A}}=0\),

  • if the i th machine is free and there is no job that could be assigned, i.e. \({x^{i}_{A}}(s)=\emptyset \), then the only possible decision is to assign no job, \({u^{i}_{A}}=0\),

  • if the i th machine is free and the i th store is not empty (i.e. any j th job from the store can be assigned to the machine), then the possible decisions are to assign a job, \({u^{i}_{A}}=j\), or to assign no job (the machine remains inactive), \({u^{i}_{A}}=0\).

The decision \((u_p)_A\) must belong to the set of possible decisions \((U_p)_A\) in a given state \(s_A\). The possible decisions depend on the state of the particular machine. The possible decisions \(({u_{p}^{i}})_{A}(s)\) for the i th machine in state s = (x,t) are defined as follows for all \(i=1,2,\ldots ,|{\mathscr{M}}|\):

$$ ({u}_{p}^{i})_{A}(s) = \left\{ \begin{array}{cl} 0 & \text{for } {x}_{A}^{i+|\mathcal{M}|}(s) \neq (0, \tau_{idle})\\ 0 & \text{for } {x}_{A}^{i+|\mathcal{M}|}(s) = (0, \tau_{idle}) \text{ if } {x_{A}^{i}}(s)=\emptyset\\ 0 \vee j & \text{for } {x}_{A}^{i+|\mathcal{M}|}(s) = (0, \tau_{idle}) \text{ if } j \in {x_{A}^{i}}(s) \end{array} \right. $$
(4)

Note that one job cannot be assigned to two different machines. Each set \(({U_{p}^{i}})_{A}\) is equal to the set of jobs in the work-in-progress store plus the element 0, i.e. \({({U_{p}^{i}} )_{A}= {x^{i}_{A}}(s) \cup \{0\}}\). When the i th machine is free and the i th store is not empty, it is still possible to decide \({u^{i}_{A}}=0\). This means that no job is assigned to the machine and the machine waits, which lets it wait for the arrival of a more promising job. The profitability of such a decision can be calculated by a dedicated heuristic algorithm or predicted.

The complete definition of the set of possible decisions is as follows: \( (U_{p})_{A}(s)= ({U_{p}^{1}})_{A}(s)\times {\ldots } \times (U_{p}^{|{\mathscr{M}}|})_{A}(s)\setminus H_{A}(s)\), where \(H_{A}(s)\) is the set containing the decision that assigns no job, \(u_{A} = (0,\ldots ,0)\), when all machines in the given state are free and not all jobs are completed: \(H_{A}(s)= \{u_{A}:(\forall _{i\in {\mathscr{M}}}: {u_{A}^{i}}(s)=0 \wedge x_{A}^{i+|{\mathscr{M}}|}(s)=(0, \tau _{idle})) \wedge {(\mathcal {J}_{F})_{A}} \neq \mathcal {J}_{A} \}\). Consecutively taken decisions form a decision sequence

$$ \tilde{u_{A}}=((u_{1})_{A},(u_{2})_{A},\ldots,(u_{c})_{A}), $$

where c is the number of decisions taken. This sequence uniquely determines the trajectory of the process.
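
The construction of the set of possible decisions can be sketched in Python as follows. The state layout (m store sets followed by m machine pairs), the encoding of \(\tau _{idle}\) as infinity and the function name `possible_decisions` are assumptions of this sketch. It also uses the observation that, when all machines are free, not all jobs are completed exactly when some store is non-empty.

```python
import math
from itertools import product

TAU_IDLE = math.inf  # assumption: tau_idle encoded as infinity


def possible_decisions(x, m: int):
    """Enumerate (U_p)_A(s) as the product of per-machine options minus H_A(s)."""
    per_machine = []
    for i in range(m):
        store, machine = x[i], x[m + i]
        if machine != (0, TAU_IDLE):
            per_machine.append((0,))                        # busy: continue only
        else:
            per_machine.append((0,) + tuple(sorted(store)))  # free: wait or assign
    all_free = all(x[m + i] == (0, TAU_IDLE) for i in range(m))
    jobs_waiting = any(x[i] for i in range(m))
    decisions = []
    for u in product(*per_machine):
        # u = (0, ..., 0) with all machines free and unfinished jobs is in H_A(s).
        if all_free and jobs_waiting and not any(u):
            continue
        decisions.append(u)
    return decisions
```

Since the stores are disjoint sets, no decision produced in this way assigns one job to two machines.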

Transition function \(f_A\)

The transition function is defined by means of two functions \(f_A = ((f_x)_A,(f_t)_A)\), where \((f_x)_A\) determines the next proper state and \((f_t)_A\) determines the next time instant. First, it is necessary to determine the moment when the subsequent state occurs, \({t_{A}^{\prime }=t_{A}+{{\varDelta }} t_{A}}\), where \({{\varDelta }} t_{A}\) is the smallest of the times needed by the machines to complete the jobs they are processing.

Once the moment \({t_{A}^{\prime }}\) is known, the proper state of the process at that time can be determined (by \((f_x)_A\)). The coordinate \({{x^{1}_{A}}}\) (the state of the initial store) is reduced by the job assigned to the first machine: \({({x_{A}^{1}}(s))^{\prime }= {x_{A}^{1}}(s) \setminus \{j: {u_{A}^{1}}(s)=j\}}\). The coordinate \({x_{A}^{i}}\) for \({i = 2,{\ldots } ,|{\mathscr{M}}|}\) (the state of the store between the (i − 1) th and the i th machine) is increased by the job completed on the (i − 1) th machine and reduced by the job assigned to the i th machine in the technological route:

$$ ({x_{A}^{i}}(s))^{\prime}=\left( {x_{A}^{i}}(s) \cup \{j:x_{A}^{i-1+|\mathcal{M}|}(s) =(j,{{\varDelta}} t_{A})\} \right) \setminus \{j: {u_{A}^{i}}(s)=j\}. $$
(5)

The values of the coordinates related to the machine state

$$ {(x_{A}^{i+|\mathcal{M}|})^{\prime}=(\beta^{\prime}_{A}, \tau^{\prime}_{A})} $$

for \({i = 1,{\ldots } ,|{\mathscr{M}}|}\), depends on the taken decision and is as follows:

  1. if the i th machine is free in the given state and the decision is \({u_{A}^{i}}(s)=j\) (assigning the j th job to this machine):

    $$ \begin{array}{l} \beta_{A}^{\prime} = \left \{\begin{array}{cl} 0 &\ \text{for }\tau_{A}^{i+|\mathcal{M}|}={{\varDelta}} t_{A} \\ j &\ \text{for }\tau_{A}^{i+|\mathcal{M}|}>{{\varDelta}} t_{A} \end{array}\right.\\ \tau_{A}^{\prime} = \left\{\begin{array}{cl} \tau_{idle} & \text{for } \tau_{A}^{i+|\mathcal{M}|}={{\varDelta}} t_{A} \\ p^{ij}-{{\varDelta}} t_{A} &\ \text{for } \tau_{A}^{i+|\mathcal{M}|}>{{\varDelta}} t_{A} \end{array}\right. \end{array} $$
    (6)
  2. if the i th machine is free in the given state and the decision is \({u}_{A}^{i}(s)=0\) (no job is assigned to this machine): \(\beta _{A}^{\prime } = 0\), \(\tau _{A}^{\prime } = \tau _{idle}\),

  3. if the i th machine is busy in the given state and the decision is \({{u_{A}^{i}}(s)=0}\) (continuation of the previously assigned job j):

    $$ \begin{array}{l} {\beta}_{A}^{\prime} = \left \{\begin{array}{cl} 0 &\ \text{for }{\tau}_{A}^{i+|\mathcal{M}|}={{\varDelta}} t_{A} \\ j &\ \text{for }{\tau}_{A}^{i+|\mathcal{M}|}>{{\varDelta}} t_{A} \end{array} \right. \\ {\tau}_{A}^{\prime} = \left \{\begin{array}{cl} \tau_{idle} &\ \text{for }{\tau}_{A}^{i+|\mathcal{M}|}={{\varDelta}} t_{A} \\ \tau_{A}^{i+|\mathcal{M}|}-{{\varDelta}} t_{A}& \ \text{for }{\tau}_{A}^{i+|\mathcal{M}|}>{{\varDelta}} t_{A} \end{array} \right. \end{array} $$
    (7)
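
The transition for the stores and the ordinary machines can be sketched as one Python function. This is an illustration under stated assumptions, not the authors' implementation: the decision is applied first (a newly assigned job j on machine i gets the remaining time \(p^{ij}\)), \({{\varDelta }} t\) is then taken as the smallest remaining time among the busy machines, a job finishing on the last machine leaves the state (the final store is not part of it), \(\tau _{idle}\) is encoded as infinity, and `p` is a dictionary keyed by the 1-based pair (machine, job).

```python
import math

TAU_IDLE = math.inf  # assumption: tau_idle encoded as infinity


def transition(x, t, u, p, m):
    """Return (x', t') after applying decision u in state (x, t)."""
    stores = [set(x[i]) for i in range(m)]
    machines = [x[m + i] for i in range(m)]

    # Apply the decision: move each assigned job from its store to its machine.
    for i, j in enumerate(u):
        if j == 0:
            continue
        if machines[i] != (0, TAU_IDLE) or j not in stores[i]:
            raise ValueError(f"decision u[{i}]={j} is not possible in this state")
        stores[i].discard(j)
        machines[i] = (j, p[(i + 1, j)])

    # Delta t: smallest remaining time among the busy machines.
    busy = [tau for beta, tau in machines if beta != 0]
    if not busy:
        raise ValueError("no machine is working, the next instant is undefined")
    dt = min(busy)

    # Advance time: finished jobs move to the next store, others keep running.
    for i, (beta, tau) in enumerate(machines):
        if beta == 0:
            continue
        if tau == dt:
            machines[i] = (0, TAU_IDLE)
            if i + 1 < m:
                stores[i + 1].add(beta)
        else:
            machines[i] = (beta, tau - dt)

    new_x = tuple(frozenset(s) for s in stores) + tuple(machines)
    return new_x, t + dt
```

Repeatedly calling `transition` with a sequence of decisions produces the trajectory determined by that decision sequence.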

The coordinate \( x^{q + m} = (\beta , \tau , QC, \widehat {\beta }) \), representing the state of the machine with quality control, is also carried over to the new state \( (x^{q + m})^{\prime }= (\beta ^{\prime }, \tau ^{\prime }, QC^{\prime }, \widehat {\beta }^{\prime }) \). The new state depends on the decision made and on the quality control result. The quality control value becomes available at the time when the job is completed. Let QC(j) and \( \widehat {\beta }(j)\) denote the quality control result and the number of the job with detected defects, respectively, for the job \(J_j\). The transition function for the state of the machine with quality control is then as follows:

  1. if the machine is free and the quality control result is positive (\(x^{q+m}=(0, \tau _{idle}, 0, \widehat {\beta })\)) in a given state and the decision is \(u^{q}(s) = j\) (assigning the j th job to this machine), then:

    i. for \(\tau ^{q+m} = {{\varDelta }} t\), in the next state the machine with quality control will be free and the new quality control result will be read:

      $$ (x^{q+m})^{\prime}=(0, \tau_{idle}, QC(j), \widehat{\beta}(j)), $$
      (8)
    ii. otherwise, when \(\tau ^{q+m} > {{\varDelta }} t\), in the next state the machine is still processing job j, the time to complete the job is \(p^{qj}-{{\varDelta }} t\) and the quality control result is the same as in the previous state:

      $$ (x^{q+m})^{\prime}=(j, p^{qj}-{{\varDelta}} t, 0 , \widehat{\beta}). $$
      (9)
  2. if the q th machine is free and the quality control result is positive, \( x^{q + m} = (0, \tau _{idle}, 0, \widehat {\beta }) \), in the given state and the decision is \(u^{q}(s) = 0\) (no job is assigned to this machine), then: \( (x^{q+m})^{\prime }= x^{q+m}\),

  3. if the q th machine is busy, \( x^{q + m} = (j, \tau , QC, \widehat {\beta }) \), in a given state, so that the only possible decision is \(u^{q}(s) = 0\) (continuing the previously assigned job j), then:

    i. for \(\tau ^{q+m} = {{\varDelta }} t\), as in case 1, in the next state the machine with quality control is free and a new quality control result is read, i.e. the next state is given by Eq. 8,

    ii. otherwise, when \(\tau ^{q+m} > {{\varDelta }} t\), in the next state the machine continues processing job j, the time to complete the job is \(\tau -{{\varDelta }} t\) and the quality control result is the same as in the previous state:

      $$ (x^{q+m})^{\prime}=(j, \tau -{{\varDelta}} t, QC , \widehat{\beta}). $$
      (10)
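
The three cases for the machine with quality control can be collected in one function. The sketch below assumes a single quality control value in the state, as in the four-element tuple used above; the function name `qc_transition`, the dictionary `p_q` of processing times on this machine and the callable `qc_result`, which returns the pair \((QC(j), \widehat {\beta }(j))\), are hypothetical names introduced for illustration.

```python
import math

TAU_IDLE = math.inf  # assumption: tau_idle encoded as infinity


def qc_transition(xq, u, dt, p_q, qc_result):
    """Next state of a quality control machine for decision u and step dt."""
    beta, tau, qc, beta_hat = xq
    free = (beta, tau) == (0, TAU_IDLE)
    if free and qc != 0:
        # Negative control result: handled by the switching function instead.
        raise ValueError("negative quality control, use the switching function")
    if free and u == 0:
        return xq                                   # case 2: nothing changes
    if free:
        job, remaining = u, p_q[u]                  # case 1: job u is assigned
    elif u == 0:
        job, remaining = beta, tau                  # case 3: job beta continues
    else:
        raise ValueError("a busy machine cannot accept a new job")
    if remaining < dt:
        raise ValueError("dt exceeds the remaining processing time")
    if remaining == dt:
        new_qc, new_beta_hat = qc_result(job)       # Eq. 8: result is read
        return (0, TAU_IDLE, new_qc, new_beta_hat)
    return (job, remaining - dt, qc, beta_hat)      # Eq. 9 and Eq. 10
```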

It should be noted that if a machine with quality control is free and the result of the control is negative, the next state is not calculated using the transition function but using the switching function. We then switch models from \(ALM_A\) to \(ALM_B\).

The second subproblem is represented by \(ALM_B\). It includes not only the states of the machines in the technological route and the states of the machines with quality control, but also the states of the additional repairing machines. Let us consider the special case in which quality deficiencies of all types occur. Then all the additional repairing machines \(M_{m+1},\ldots ,M_{md}\) from outside the technological route are included in the second subproblem. The model of this problem is a modification of the model \(ALM_A\) obtained by adding elements related to the repair of defective jobs, i.e. the repair machines. The proper state \(x_B\) in such a system is as follows:

$$ \begin{aligned} {x_{B}}=&(\underbrace{{{x_{B}^{1}}},{\ldots} ,{{x_{B}^{m}}}}_{\text{store states}},\\ &\underbrace{{x_{B}^{m+1}}, \ldots, {x_{B}^{m+q_{1}}}, \ldots,{x_{B}^{m+q_{d}}},{\ldots} ,{x_{B}^{2m}}}_{\underset{\text{with quality control machines}}{\text{machine states}}},\\ &\underbrace{{x_{B}^{2m+1}}, {x_{B}^{2m+2}}}_{\underset{\text{and repairing machine type 1}}{\text{store state}}}, \ldots,\\ &\underbrace{{x_{B}^{2m+2md-1}}, {x_{B}^{2m+2md}}}_{\underset{\textrm{and repairing machine type md}}{\textrm{store state}}}) \end{aligned} $$
(11)

where:

\({{x}_{B}^{m+q_{\gamma }}}\) is the state of the machine with quality control \(M_{q_{\gamma }}\) for the γ type of defects, where \(\gamma \in \{1,\ldots ,d\}\), \({x_{B}^{2m+2\gamma -1}}\) is the state of the additional store \(W_{m+\gamma }\) for jobs with the γ type of defects, where \(\gamma = 1,\ldots ,md\), and \({x_{B}^{2m+2\gamma }}\) is the state of the additional repairing machine \(M_{m+\gamma }\) for jobs with the γ type of defects, where \(\gamma = 1,\ldots ,md\).

5.2 The switching function

The switching function between the models covers three types of switching. It is defined as follows:

  • switching from \(ALM_A\) to \(ALM_B\) when at least one quality defect of any type is detected (a negative quality control result on at least one machine with quality control) and there is no job with previously detected quality deficiencies,

  • switching from \(ALM_B\) to a model with the same structure but changed parameters, i.e. \( ALM_{B}^{\prime }\), in cases where:

    • at least one machine with quality control has detected another job with a quality defect,

    • a previously detected job has been repaired while other previously detected jobs are still waiting for repair,

  • switching from \(ALM_B\) to the \(ALM_A\) structure with changed parameters, i.e. \( ALM_{A}^{\prime }\), when the last defective job has just been repaired by a repair machine.
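
The three types of switching can be summarized by a small decision function. This is a minimal sketch that is meant to be called only in a switching state; the string labels for the models, the function name `next_model` and the argument `defects_pending` (the number of defective jobs still waiting for or undergoing repair after the event) are assumptions introduced for illustration.

```python
def next_model(current: str, defect_detected: bool, defects_pending: int) -> str:
    """Return the label of the model to use after a switching event.

    current is "A", "A'", "B" or "B'"; a prime marks changed parameters.
    """
    structure = current.rstrip("'")
    if structure not in ("A", "B"):
        raise ValueError(f"unknown model label: {current!r}")
    if structure == "A":
        # First type: a defect appears while no defective job is in the system.
        return "B" if defect_detected else current
    if defect_detected or defects_pending > 0:
        # Second type: same structure, changed parameters.
        return "B'"
    # Third type: the last defective job has just been repaired.
    return "A'"
```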

5.3 The switching states

The set of switching states is defined as the set of states in which (a) any machine with quality control \(M_{q_{k}}\) has just finished a job and the quality control result is negative (a defect has just been detected), or (b) any repairing machine \(M_{\gamma }\), where γ = m + 1,…,m + md, has just finished a job and the repaired job has to be returned to a machine in the technological route:

$$ \begin{array}{@{}rcl@{}} \mathcal{S}_{switch}&=& \{ (x,t): \\ &&\exists_{k\in\{1,\ldots, d\}} \ x^{m+q_{k}}= (0,\tau_{idle}, QC \neq 0, \widehat{\beta} \neq 0 )\\ &&\quad \vee\\ && \exists_{\gamma\in\{1,\ldots,md\}} \ x^{2m+2\gamma}=(0,\tau_{idle}) \} \end{array} $$
(12)
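
A test for switching states can be sketched as follows. To capture the wording "has just finished" for the repairing machines, this sketch compares their states before and after the last transition, which is an assumption that goes beyond the formula; the function name and the argument layout are likewise illustrative.

```python
import math

TAU_IDLE = math.inf  # assumption: tau_idle encoded as infinity


def is_switching_state(qc_states, repair_before, repair_now) -> bool:
    """Check conditions (a) and (b) of the switching-state definition.

    qc_states: tuples (beta, tau, QC, beta_hat) of the quality control machines.
    repair_before, repair_now: pairs (beta, tau) of the repairing machines
    before and after the last transition.
    """
    # (a) a free quality control machine holds a negative control result
    defect_detected = any(
        (beta, tau) == (0, TAU_IDLE) and qc != 0 and beta_hat != 0
        for beta, tau, qc, beta_hat in qc_states
    )
    # (b) a repairing machine was busy and has become free
    repair_finished = any(
        before[0] != 0 and now == (0, TAU_IDLE)
        for before, now in zip(repair_before, repair_now)
    )
    return defect_detected or repair_finished
```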

5.4 The switching rules

Switching rules \( \mathcal {R}_{switch} \) determine how the model parameters are calculated during switching. Additionally, if a machine with quality control has just finished processing a job with a negative quality control result, the place to which this job returns in the technological route has to be indicated, taking into account the appropriate repairing machine and the type of quality defect. Therefore, when the dedicated machine \( M_{q_{\gamma }} \) detects a new defect of type γ, the rule determines the parameter changes for the repair machine dedicated to this type of defect. The appearance of a quality deficiency of type γ does not change the state of the repairing machines dedicated to other types of quality deficiencies. Completion of the repair of a defect of type γ by the repair machine \(M_{m+\gamma }\) requires recalculating the state of the store \(W_{r_{\gamma }}\) before the machine \(M_{r_{\gamma }}\). The states of the return stores dedicated to other defect types do not change.

5.5 Gathering information about disturbances

To support the ZDM strategy with the switching approach, we gather information about issues in the processes: the switching events, the state (time) of switching, the switching frequency, the type of quality defects and the quantity of defects. These parameters are useful for predicting the time and the quantity of defects. We can also predict whether repairing a defect is profitable or whether it is more profitable to use new raw material. For this purpose we can use time-series analysis, the Bayesian approach, the Markov chain model and regression models. Moreover, switching rules are useful for gathering information about the types of quality defects. Thus, we can analyze causes of defects such as defective raw materials, machine failures or wrong machine settings. These data indicate how to improve the process so that no quality defects occur in subsequent orders.
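
As a simple illustration of how such information could be collected, the following sketch logs switching events and derives the switching frequency and per-type defect statistics. The record fields, the function name `summarize` and the choice of statistics are assumptions; the predictive models mentioned above would take such summaries, or the raw log, as input.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SwitchEvent:
    time: float          # time instant of the switching state
    defect_type: int     # type gamma of the detected quality defect
    defect_share: float  # percentage of defects in the lot (QC value)


def summarize(events: Iterable[SwitchEvent], horizon: float) -> dict:
    """Return the switching frequency and per-type defect statistics."""
    events = list(events)
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    counts = Counter(e.defect_type for e in events)
    totals = defaultdict(float)
    for e in events:
        totals[e.defect_type] += e.defect_share
    mean_share = {g: totals[g] / counts[g] for g in counts}
    return {
        "frequency": len(events) / horizon,   # switches per unit of time
        "counts": dict(counts),               # number of defects per type
        "mean_share": mean_share,             # average defect percentage per type
    }
```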

6 Discussion

In the paper, we consider an approach that combines several process disturbances which are “product-oriented” and aligned with the ZDM concept. In particular, we consider the detection of quality defects in manufactured parts as “product-oriented” disturbances. However, it is debatable whether and when a damaged product should be repaired. When should additional parts be produced instead? Many factors may affect this decision, such as the cost of repair or the availability of additional raw materials. Undoubtedly, one such factor is the repair time, especially when contracts specify required completion dates.

Let us consider the special case of a flow-shop system with two machines M1 and M2 in the technological route, including one machine with quality control (the second machine in the technological route), and one additional repairing machine Md1. When a fault is detected, the damaged parts are placed in the store Wd1 and then repaired on the additional machine Md1. After the repair, the job is placed in store W2 to be reprocessed on the second machine. Figure 6 presents an example of this production system. In the switching method, two models have to be developed: \(ALM_A\) for the two machines in the technological route, and \(ALM_B\) with the additional repair machine. Switching occurs when the quality control machine detects a quality deficiency, or when the repair machine has just finished repairing a deficient job and there are no more deficient jobs in the system.

Fig. 6

Example of a flow-shop technological route with stores, quality control, one type of product defect and one repairing machine

6.1 Example 1

Let us consider three jobs J1, J2 and J3 in the above flow-shop system. The processing times on the machines are given in Table 2:

Table 2 First problem example, with two machines M1, M2, one additional repairing machine Md1 and three jobs J1, J2 and J3 with their respective processing times

Figure 7 presents the optimal schedule for this problem without deficient jobs. The total processing time is equal to 29. According to the approach described in Section 5, the problem is modeled using \(ALM_A\) with two machines M1 and M2 in the technological route and three stores.

Fig. 7

Example of an optimal schedule in the flow-shop system without disturbances for two machines and three jobs

Let us consider the case in which the quality control machine gives a negative quality result for the first job J1, i.e. quality deficiencies are detected only in the first job. In this case, quality control takes place after the job is completed by the second machine M2, and the repair requires additional treatment on the repairing machine Md1 and re-treatment on the second machine M2. Because the first job J1 has to be repaired, the schedule changes. At time 7, the second machine M2 has finished the first job J1 with a negative quality control result. For this reason, the additional repairing machine Md1 has to be considered in the system and the first job J1 must be assigned to it for processing. At this point, a switch between models occurs and the problem is modeled using \(ALM_B\), which includes the additional repairing machine Md1. At this time, the first machine M1 is processing the second job J2 and 5 units of time are left to complete it (this is shown in a lighter color in Fig. 8). At time 12, the repairing machine Md1 has finished processing the first job J1 and no new quality defects have been detected. Hence, the system does not need to consider the repairing machine Md1 any longer. At this point, there is a switch between the models and the problem is modeled again using \(ALM_A\), without the additional repairing machine Md1. The repaired first job J1 has to be assigned for re-processing on the second machine M2. At time 12, the store W2 contains both J1 and J2, and the optimization algorithm chooses the order in which they are processed on the second machine. Figure 8 shows the situation in which job J1 is executed first on the second machine, followed by job J2. In this example, we do not consider any further quality defects. At time 29, all jobs are finished with adequate quality.

It should be noted that, despite the need to repair the first job J1, the total processing time \(C_{max}\) does not change compared with the case without quality deficiencies.

Fig. 8

Example of a schedule of the flow-shop system with defect detection after processing on the second machine, defect repair on one additional machine, and subsequent re-processing on the second machine. The schedule shows defect detection for the first job, which is then repaired to eliminate the defects and re-processed on the second machine. The switching function enables jobs with defects to be processed differently from what the technological route alone would dictate

Fig. 9

Example of an optimal schedule in the flow-shop system without disturbances for two machines and three jobs with parameters from Table 2

6.2 Example 2

Let us consider the same production system with a different set of processing times (Table 3).

Table 3 Second problem example, with two machines M1 and M2, three jobs J1, J2 and J3, and their respective processing times. The processing times of jobs J1 and J2 on the second machine are significantly longer

Let us consider again the situation in which quality control takes place after machine M2 and gives a negative result for J1. This job then has to be repaired on an additional machine and re-processed on machine M2. An example of this schedule is presented in Fig. 8.

At time 10, the second machine M2 finishes the first job J1 with a negative quality control result. For this reason, an additional repairing machine Md1 has to be considered in the system, and the first job J1 has to be assigned to it for processing. At this point, a switch between models occurs and the problem is modeled using the ALMB, which includes the additional repairing machine Md1. At this time, the first machine M1 is processing the second job J2, and 2 units of time remain to complete it (shown in a lighter color in Fig. 8). At time 20, the repairing machine Md1 finishes processing the first job J1 and no new quality defects are detected. Hence, the system no longer needs to consider the repairing machine Md1, and the repaired job J1 has to be re-processed on the second machine M2. At this point, there is a switch between the models and the problem is modeled using the ALMA, without the additional repairing machine Md1. In this example, we do not consider any further quality defects. At time 31, all jobs are finished with adequate quality (Fig. 9).

In this case, the total processing time Cmax is longer than 29. The schedule would be shorter if we dropped the first job and picked up new material to produce it again from the start. The algorithm should therefore evaluate both options, the repair of defects and the re-production of the job, and provide the better solution according to the optimization criterion.

It is also possible that, after re-processing on the second machine M2, quality control gives a negative result for J1 again, possibly several times. For such cases, the proposed framework stores all previously calculated solutions and models in the database, so that the predictive results interpreter module can pass this information to the Decision Support System. In this case it is not profitable to repair the defects, and a better solution is to take new material (if available) from the warehouse and perform the job again from the beginning (Fig. 10).
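
The choice between repairing a job and producing it again can be sketched as a simple decision rule. This is an illustration under stated assumptions, not the decision logic of the proposed framework: the function `choose_recovery`, the threshold `max_failed_repairs` and the restart makespan of 30 used below are all hypothetical (the text only states that restarting would be shorter than the repair schedule with Cmax = 31).

```python
def choose_recovery(cmax_repair, cmax_restart, failed_repairs=0,
                    material_available=True, max_failed_repairs=1):
    """Return "repair" or "restart" for a job that failed quality control.

    cmax_repair / cmax_restart: makespans of the two candidate schedules.
    failed_repairs: how many earlier repairs of this job failed inspection
    (in the framework this history would come from the stored solutions).
    max_failed_repairs: assumed threshold, not taken from the paper.
    """
    if not material_available:
        # Without new raw material, repairing is the only option.
        return "repair"
    if failed_repairs > max_failed_repairs:
        # Repeated negative results: repairing is no longer profitable.
        return "restart"
    # Otherwise follow the optimization criterion (here: makespan).
    return "repair" if cmax_repair <= cmax_restart else "restart"


print(choose_recovery(cmax_repair=31, cmax_restart=30))  # restart
print(choose_recovery(cmax_repair=31, cmax_restart=30,
                      material_available=False))         # repair
```

Here the makespan stands in for the optimization criterion; any other criterion could be compared in the same way.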

Fig. 10

Example of a schedule of the flow-shop system with defect detection after processing on the second machine, defect repair on one additional machine, and re-processing on the second machine. The schedule shows defect detection for the first job, which is then repaired to eliminate the defects and re-processed on the second machine. The switching function enables jobs with defects to be processed differently from what the technological route would prescribe

6.3 An example of a real manufacturing process with defects

The powder coating manufacturing process is an example of a manufacturing problem in which one quality control station and two different operations outside the process route for repairing defective parts are considered.

Powder coating involves the application of powder paint to metal parts. An electrostatic or electrokinetic spraying technique is used. After the paint has been applied (sprayed), the elements are baked in a special oven. As a result of baking, the powder paint melts and polymerizes. The resulting paint coating is resistant to corrosion, chemicals, high temperatures and mechanical damage, and has very high aesthetic and decorative qualities.

The elements are painted in batches. All batch elements are placed in a special basket (with a specified capacity) and in this basket they go through all stages of the technological process. The powder coating technological process is divided into the following stages:

  • surface preparation for painting,

  • spraying of powder paint,

  • heat treatment,

  • packaging.

At the stage of surface preparation, metal elements are subjected to chemical degreasing, etching or sandblasting. This stage is very important because it determines the quality of the surface of the finished product. We consider chemical surface preparation carried out by the immersion method. It is performed on two workstations. The first is a station with special chemical baths. At the second station the elements are dried.

The spraying stage is carried out in a special booth equipped with, among other things, devices for transporting the parts and electrifying the paint powder, as well as paint spraying guns. The electrified paint particles, carried in the air stream, are directed at the grounded object and settle on its surface. The spraying efficiency depends on the type of booth and the number of guns used. Moreover, depending on the type of booth, the powder that settles on the booth walls is either recovered for re-use or disposed of.

Next, the painted elements are subjected to thermal treatment in a chamber oven at a temperature of about 200 °C. The time of one cycle depends on the type of painted elements and is 15–45 min.

After the heat treatment, quality control takes place, during which the painted elements are checked. Depending on the result of the inspection, different steps are carried out. If the quality is satisfactory, the result of the inspection is positive, and the elements are directed to the packing station and prepared for release to the customer. If the quality is not satisfactory, the type of deficiency is determined. If the coating is too thick or too thin, or if there are impurities in the coating, but only to a minor degree, then the defective elements undergo the matting operation and return to the spraying station, after which the technological process is continued. If the defect is significant, the defective component must be completely repainted and the entire technological process must be repeated, starting with the chemical bath.
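
The routing after inspection described above can be written down as a small lookup. This is a minimal sketch that only models what the text states; the stage names, the labels "ok", "minor" and "major", and the function `next_steps` are illustrative assumptions, and any additional operation needed before complete repainting is not modeled.

```python
# Stages of the technological route as listed in the text
# (surface preparation is split into its two workstations).
ROUTE = ["chemical bath", "drying", "spraying",
         "heat treatment", "quality control", "packaging"]


def next_steps(result):
    """Return the remaining operations for a basket after quality control.

    result: "ok", "minor" (coating slightly too thick/thin or slightly
    contaminated) or "major" (significant defect).
    """
    if result == "ok":
        return ["packaging"]
    if result == "minor":
        # Matting lies outside the route; the basket then re-enters
        # the route at the spraying station.
        return ["matting"] + ROUTE[ROUTE.index("spraying"):]
    if result == "major":
        # The entire technological process is repeated.
        return list(ROUTE)
    raise ValueError(f"unknown inspection result: {result!r}")


print(next_steps("minor"))
```

In both defect cases the basket passes through quality control again, so a second negative result leads to another call of the same rule.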

6.4 More complex problems

The ALMM switching method can be considered for more complex manufacturing systems with quality control and repair. Quality control can take place at various stages of the production process. In addition to quality control at the end of the process, i.e. inspection of the finished product, post-operation inspections are also possible at:

  • one specific machine (like Fig. 6),

  • many specific machines (e.g. Fig. 5),

  • all machines.

Quality control can vary in scope and measures how precisely a job (manufactured part) has been executed. The measured parameters are determined for a specific production process. Further activities (operations) of the technological route depend on the execution of the production process and on the quality control result. After a positive quality control (the quality requirements are fulfilled), the controlled job (element) is passed on for processing by the next machine in the technological route. After a negative quality control (the quality requirements have not been met), the following situations may occur:

  • the job is completely defective: it should be scrapped, and replacement material should be taken from the raw materials store,

  • the job is partially deficient and is not divided:

    • improvement on the designated machine or machines in the process line, returning to the control station and continuing the process route,

    • reprocessing (re-entering the production line) from the first machine or from the designated machine,

    • repairing of an element on an additional machine or machines from outside the technological route and re-processing from the first machine or from the indicated machine in the technological route.

  • the job is partially deficient and is divided (e.g. it is a batch). In this case, we divide the job into separate parts and perform the appropriate operations on them:

    • the correctly executed part of the job waits while the defective part is sent for re-processing (with variants as described above), after which processing of the whole batch continues,

    • the correctly processed part of the job continues processing on the next machine (it does not wait) while the defective part is sent for re-processing (with variants as described above).
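
The outcomes listed above for an undivided job can be illustrated by a function that derives the remaining sequence of operations from the technological route. This is only a sketch under stated assumptions: the class `Outcome`, its field names and the function `remaining_route` are hypothetical, the quality control station is assumed to be an element of the route, and the divided-job (batch) variants are not modeled.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Outcome:
    kind: str                       # "pass", "scrap", "rework", "reenter", "external"
    machines: Tuple[str, ...] = ()  # in-line rework or external repair machines
    restart_at: int = 0             # route index to re-enter from


def remaining_route(route: List[str], control_idx: int,
                    outcome: Outcome) -> List[str]:
    """Operations still to be performed after the control at control_idx."""
    after_control = route[control_idx + 1:]
    if outcome.kind == "pass":
        return after_control
    if outcome.kind == "scrap":
        # New raw material goes through the whole route.
        return list(route)
    if outcome.kind == "rework":
        # Improve on designated machines, return to control, then continue.
        return list(outcome.machines) + [route[control_idx]] + after_control
    if outcome.kind in ("reenter", "external"):
        # Optional repair outside the route, then re-enter at restart_at.
        return list(outcome.machines) + route[outcome.restart_at:]
    raise ValueError(f"unknown outcome kind: {outcome.kind!r}")


route = ["M1", "M2", "QC", "M3"]
# Repair on the external machine Md1, then re-processing from M2.
print(remaining_route(route, 2, Outcome("external", ("Md1",), restart_at=1)))
```

Keeping the outcome as data rather than as branching inside the scheduler makes it straightforward to add further repair variants without changing the route itself.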

The repair of defective components can have different degrees of complexity. A single quality control station may detect one type of quality deficiency or several types of quality deficiencies:

  • in the case of detecting one type of deficiency, the result of a negative quality control clearly indicates how to repair the defective components,

  • when various types of defects or degrees of damage are detected at one station, the method of repair is not fixed in advance: it depends on the type of defect detected.

In the most complex production processes, there are several quality control stations on the production line, and some of them identify different types of quality defects, each of which requires a different way of repair.

Therefore, the proposed ZDM approach can be widely applied, from simple single-machine systems to parallel, flow or hybrid machine systems. In such situations, computing the transition functions or switching rules becomes a computational problem in its own right. By defining the subproblem models described in Section 3.2, model decomposition and computational parallelization become naturally possible. In the case of such complex problems, it is also extremely important to select appropriate solution methods that take the characteristic features of the problem into account. The proposed heuristic algorithms based on ALMM provide such a possibility.

7 Conclusions

In this paper, we have considered the problem of meeting Zero Defect Manufacturing requirements through formal methods. We have proposed an adaptation of the Algebraic-Logical Meta Model to an uncertain production environment through appropriate model switching. The advantage of our approach comes from filling the gap in typical ZDM strategies by handling “product-oriented” disturbances. An additional benefit is the possibility of integrating meta-heuristic algorithms through predictive-reactive scheduling.

The proposed method includes strategies such as detection, repair, prediction and prevention for defect-free production. The use of the proposed approach will prevent defective elements from leaving the production line, where otherwise they would be detected only at the stage of completing the order or, even worse, returned by the customer. Each defective product or component will be repaired, and the total production time including repair time will be optimized. This is especially important for production companies where defective products or components cannot be thrown away, e.g. for ecological reasons or because the raw materials used for production are too expensive or difficult to obtain. Acquiring new elements in such a situation requires the involvement of the procurement team, which increases time and costs. In addition, it may result in failure to complete orders by the specified date, so that the company has to pay a contractual penalty. As a consequence, the company may lose customers.

The main drawback of our approach is that, in its current form, it is not supported by any computational framework. While the methodology is formally complete and allows a large variety of problems to be represented, there is no tool for efficient model creation or optimization. This will be a topic of further research. Moreover, the investigation of large-scale extensions is still far from complete. In our future work, we want to investigate how switching ALMMs scale in terms of computational complexity and to extend the prediction and prevention parts of the method.