1 Introduction

A flow shop system (FSS) is a manufacturing system that usually includes multiple production lines to share the order demand for a single product (Brammer et al., 2022; Dios et al., 2018; Dolgui et al., 2021). Several workstations implement different processes in a production line, and each workstation comprises several identical machines (Yu et al., 2018). Because each machine either operates and provides a capacity or fails with some probability, the production capacity of each workstation has multiple states that follow a binomial probability distribution. Hence, the FSS can be modeled as a multistate flow network (Lin & Chang, 2012; Lin & Chen, 2022; Nguyen, 2022), namely, the multistate flow shop network (MFSN). In some industries, such as the printed wire-board and automobile manufacturing industries, defective work‐in‐process (WIP)/products may be produced at specific workstations because of yield rates; they are then repaired, starting from designated workstations, as far as possible to reduce the waste caused by scrapping (Hadjinicola, 2010; Sarker et al., 2008). Accordingly, Lin and Chang (2015) proposed the production reliability, which is defined as the probability that d units of order demand can be manufactured by an MFSN with repair actions.

The design of an FSS involves the manufacturing system configuration. The concept of manufacturing system configuration is multifaceted and encompasses various aspects, such as the physical arrangement of machines, machine selection, and task assignment (Saxena & Jain, 2012). In particular, machine configuration is a significant consideration in manufacturing system design, and traditionally, it involves decisions regarding appropriate machine types/suppliers and configuration of the number of machines (Chan et al., 2005; Chehade et al., 2012). Bukchin and Tzur (2000) developed an exact branch-and-bound algorithm and heuristic procedure to address the problem of machine selection and task assignment in a flexible assembly line to minimize the total machine cost. Bukchin and Rubinovitz (2003) extended their study to address the machine configurations in the FSSs, including the parallel workstation configurations. They proposed an integer linear programming model and branch-and-bound algorithm to minimize the number of workstations and total cost. Li et al. (2011) introduced the hierarchical compositional properties of components assembled in repetitive patterns during the automotive Li-ion battery pack manufacturing. They developed a recursive algorithm to generate processing sequence planning and machine configurations, minimize machine investment costs, and potentially increase the system throughput. Hossain and Sarker (2016) considered a multistage manufacturing system with an inspection station at the end of the production line to make decisions regarding defective products. They formulated a fractional mixed-integer nonlinear programming problem to minimize the unit cost of production by determining the optimal number and locations of offline rework stations. Oesterle et al. (2019) addressed the machine configuration problem by considering product design alternatives and assembly-line balancing. 
They proposed a detailed mathematical cost model to quantify the complex and interconnected consequences of product design, manufacturing technology, and process choices in a single cost metric. Several metaheuristic algorithms, including evolutionary algorithms, ant colony optimization, and artificial bee colonies, were compared for solving this problem. Niroomand (2021) discussed the machine configuration problem while considering assembly line balancing to minimize the costs of station setup and machine purchase. That study determines the optimal machine configuration decision by combining an artificial electric field algorithm with simulated annealing. Overall, studies related to machine configurations assume that the machine states are deterministic. In addition, no study has simultaneously considered the decisions regarding machine types/suppliers and the number of configured machines.

Because machine states are stochastic (Lin & Chang, 2012; Lin & Chen, 2022; Nguyen, 2022), designing a reliable FSS considering the machine configuration is a critical challenge. Thus, it is necessary to maximize the production reliability of the FSS. Yeh et al. (2023) utilized a genetic algorithm to solve a reliability-oriented machine configuration problem for a manufacturing system. However, assigning machines to workstations incurs high purchasing costs (Bajestani et al., 2009). Thus, the tradeoff between reliability maximization and cost minimization must be considered. Although Lin et al. (2019) discussed an issue related to the reliability and cost of a bi-objective machine configuration problem, they assumed that the manufacturing system had no repair actions. Accordingly, our study addresses the machine configuration problem of production reliability and total purchase cost optimization for an FSS with repair actions.

This problem involves the evaluation of the production reliability of an MFSN and the determination of the optimal machine configuration. When searching for a machine configuration, it is necessary to evaluate production reliability. Lin et al. (1995) stated that the multistate flow network reliability evaluation problem is NP-hard. The production reliability evaluation problem is a typical multistate flow network reliability evaluation problem; therefore, it is NP-hard (Lin & Chang, 2015). Zhang and Bard (2006) and Jahromi and Tavakkoli-Moghaddam (2012) indicated that the machine configuration optimization problem for manufacturing systems is NP-hard. Therefore, the problem addressed here is also NP-hard. Lin et al. (2017) and Yeh et al. (2017) evaluated production reliability through the analysis of minimal-consumed capacity patterns for demand d (d-MCCPs), which are derived from the raw material/WIP/product flow into each workstation by integrating the path decomposition method of Lin and Chang (2015). Subsequently, the recursive sum of disjoint products (RSDP) calculates the union probability of the d-MCCPs to obtain the production reliability. The path decomposition method separates a production line into normal and repair paths to determine the flow processed by each workstation. If z repair actions exist, the path decomposition method generates \(2^{z}\) paths, including one normal path and \(2^{z} - 1\) repair paths, for a production line (Lin & Chang, 2015). Computing the flow traveling through each workstation therefore becomes increasingly time-consuming as the number of workstations and repair actions in each production line grows. Instead, by referring to Bowling et al. (2004) and Pillai and Chandrasekharan (2008), we adopt the Absorptive Markov Chain (AMC) to determine the flow transition status between a pair of workstations and thereby obtain the input flow of each workstation in the MFSN without enumerating all the normal and repair paths.

However, because the addressed machine configuration problem is NP-hard, it must be solved in a reasonable time using metaheuristic algorithms, such as the genetic algorithms, particle swarm optimization, and simulated annealing (Juan et al., 2015). Our problem is also bi-objective. For different types of bi-objective optimization problems, the Nondominated Sorting Genetic Algorithm II (NSGA-II) has usually presented better efficiency than that of the several well-known multi-objective metaheuristic algorithms, such as the multi-objective evolutionary algorithm based on the decomposition and improved strength Pareto evolutionary algorithm (SPEA2) (Lin et al., 2019; Liu et al., 2020; Silva et al., 2022; Yeh, 2020). Ma et al. (2023) highlighted the advantages of the NSGA-II: (1) it maintains the spread of the solutions, and (2) converges to the exact nondominated solutions well. Although Deb and Jain (2014) extended the NSGA-II to propose NSGA-III, it is more suitable for handling multi-objective (having four or more objectives) problems. Thus, the NSGA-II is used to search for the Pareto solutions to the addressed bi-objective problem. However, the output from the NSGA-II is a nondominated/Pareto set, and system administrators always need to select an alternative from the set (Lin & Yeh, 2012). Our study uses the technique for order preference by similarity to an ideal solution (TOPSIS) (Hwang et al., 1993) to determine a compromise alternative. TOPSIS is widely used in multi-criteria decision-making because of its well-founded logical structure, simultaneous consideration of ideal and non-ideal solutions, and ease of calculation. Unlike other methods, TOPSIS provides a comprehensive view of alternatives by assigning individual values to each, thereby enabling a better understanding of the differences between the alternatives and varying criteria (Durak et al., 2022).

To address the problem of FSS with repair actions, this study proposes a novel method that combines the AMC, RSDP, NSGA-II, and TOPSIS. The subsequent sections present the network model and optimization formulation (Sect. 2), production reliability evaluation based on AMC and RSDP (Sect. 3), NSGA-II procedure (Sect. 4), and TOPSIS approach for selecting a compromise alternative (Sect. 5). To demonstrate the effectiveness of our hybrid method, we apply it to a practical case of solar cell manufacturing (Sect. 6) and compare the efficiencies of NSGA-II and SPEA2. Finally, we summarize our conclusions in Sect. 7.

2 Bi-objective machine configuration optimization modeling for FSS

The FSS can be described as a network (N, A) with w production lines, L1, L2, …, Lw, that produce the same product and m workstations in each production line, in which N = {nj,i| j = 1, 2, …, w, i = 1, 2, …, m} represents the set of w × m workstations/nodes and A represents the set of arcs/transport devices. The workstation nj,i signifies the process i in production line Lj. Let Ij and Oj denote the raw material input and product output of Lj for j = 1, 2, …, w, respectively. In each production line Lj, there are z repair workstations, \(n_{{j,\beta_{1} }}\), \(n_{{j,\beta_{2} }}\), …, \(n_{{j,\beta_{z} }}\), with βe \(\in\) {1, 2, …, m}. If defective product flow is detected at the repair workstation \(n_{{j,\beta_{e} }}\), it is repaired from a pointed workstation \(n_{{j,\alpha_{e} }}\) to the workstation \(n_{{j,\beta_{e} }}\) for e = 1, 2, …, z, where \(n_{{j,\alpha_{e} }}\) is the initial node of the repair action associated with \(n_{{j,\beta_{e} }}\), with αe \(\in\) {1, 2, …, βe}. Figure 1 illustrates a general FSS network with two production lines that process the raw material inputs I1 and I2 and generate the product outputs O1 and O2. Each production line has two black nodes denoting the repair workstations.

Fig. 1 A general FSS network with two repair workstations

For each process i, qi device suppliers supply the machines. Each machine has four features: purchase cost cu,i, machine reliability ru,i, production capacity hu,i, and yield rate pu,i for u = 1, 2, …, qi and i = 1, 2, …, m. Let X = (x1,1, x1,2, …, x1,m, x2,1, x2,2, …, x2,m, …, xw,1, xw,2, …, xw,m) be a supplier selection with xj,i \(\in\) {1, 2, …, qi} signifying the index of the selected supplier for the workstation nj,i. Then, let Y = (y1,1, y1,2, …, y1,m, y2,1, y2,2, …, y2,m, …, yw,1, yw,2, …, yw,m) denote a machine amount pattern with yj,i \(\in\) {1, 2, …, \(M_{i}^{{x_{j,i} }}\)} representing the number of machines purchased from the supplier xj,i, where \(M_{i}^{{x_{j,i} }}\) expresses the quantity of machines that the supplier xj,i can provide for process i. The pair (X, Y) is referred to as a machine configuration. We denote C(X, Y) as the total purchase cost of machine configuration (X, Y) and formulate it as:

$$ C\left( {{\varvec{X}},{\varvec{Y}}} \right) = \sum\nolimits_{j = 1}^{w} {\sum\nolimits_{i = 1}^{m} {c_{{x_{j,i} ,i}} y_{j,i} } } , $$
(1)

where \(c_{{x_{j,i} ,i}} y_{j,i}\) indicates the purchase cost of nj,i.

Under the machine configuration (X, Y), each workstation with several identical machines has multiple capacity states. Therefore, the FSS network (N, A) is viewed as an MFSN. Let S = (s1,1, s1,2, …, s1,m, s2,1, s2,2, …, s2,m, …, sw,1, sw,2, …, sw,m) be the current capacity vector of (N, A), where sj,i is the production capacity of workstation nj,i and may be 0, \(h_{{x_{j,i} ,i}}\), 2 \(h_{{x_{j,i} ,i}}\), …, or \(y_{j,i} h_{{x_{j,i} ,i}}\). The following equation defines the probability of production capacity sj,i.

$$ \Pr \left( {s_{j,i} = \tau h_{{x_{j,i} ,i}} } \right) = \left( {\begin{array}{*{20}c} {y_{j,i} } \\ \tau \\ \end{array} } \right)\left( {r_{{x_{j,i} ,i}} } \right)^{\tau } \left( {1 - r_{{x_{j,i} ,i}} } \right)^{{y_{j,i} - \tau }} , $$
(2)

where τ \(\in\) {0, 1, 2, …, yj,i} indicates the number of normally operating machines in nj,i. Let \({\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}}\) denote the set of all capacity vectors of (N, A) that can successfully manufacture the order demand d under machine configuration (X, Y). The production reliability of configuration (X, Y) is defined as:

$$ {\varvec{R}}(d,({\varvec{X}},{\varvec{Y}})) = \sum\nolimits_{{\varvec{S} \in {\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}} }} {\Pr (\varvec{S})} , $$
(3)

where Pr(S) = Pr(s1,1) × Pr(s1,2) × … × Pr(sw,m), and Pr(sj,i) is calculated using Eq. (2).
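As a small illustration of Eqs. (2) and (3), the following sketch computes the capacity distribution of one workstation and the probability of a capacity vector. The function names and the numbers (two machines, reliability 0.9, capacity 50 per machine) are illustrative assumptions and are not taken from the case study.

```python
from math import comb, prod

def capacity_distribution(y, r, h):
    """Eq. (2): map each capacity level tau*h to its probability for a
    workstation with y identical machines of reliability r."""
    return {tau * h: comb(y, tau) * r**tau * (1 - r)**(y - tau)
            for tau in range(y + 1)}

def state_probability(state, dists):
    """Pr(S) as the product of the workstation probabilities
    (workstation states are assumed to be independent)."""
    return prod(dist[s] for s, dist in zip(state, dists))

# Hypothetical workstation: 2 machines, reliability 0.9, capacity 50 each.
dist = capacity_distribution(y=2, r=0.9, h=50)
print(dist)                                      # capacities 0, 50, 100
print(state_probability((100, 50), [dist, dist]))
```

Summing `state_probability` over every capacity vector that can meet the demand would give Eq. (3) directly, which is only practical for very small networks.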

According to the network model described earlier, the following mathematical model is built to describe the bi-objective problem:

$$ Maximize\;{\varvec{R}}(d,({\varvec{X}},{\varvec{Y}})) = \sum\nolimits_{{\varvec{S} \in {\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}} }} {\Pr (\varvec{S})} , $$
(4)
$$ Minimize\;C({\varvec{X}},{\varvec{Y}}) = \sum\nolimits_{j = 1}^{w} {\sum\nolimits_{i = 1}^{m} {c_{{x_{j,i} ,i}} y_{j,i} } } $$
(5)
$$ \begin{gathered} Subject\;to \hfill \\ \sum\nolimits_{{j:x_{j,i} = u}} {y_{j,i} } \le M_{i}^{u} \quad {\text{for}}\;u = {1},{2}, \ldots ,q_{i} \;{\text{and}}\;i = {1},{2}, \ldots ,m, \hfill \\ \end{gathered} $$
(6)
$$ x_{j,i} \in \left\{ {{1},{2}, \ldots ,q_{i} } \right\}\;{\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m,\;{\text{and}} $$
(7)
$$ y_{j,i} \in \left\{ {{1},{2}, \ldots ,M_{i}^{{x_{j,i} }} } \right\}\;{\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. $$
(8)

Equations (4) and (5) are the objective functions of production reliability and purchase cost. Constraint (6) expresses that the number of machines supplied by each supplier cannot exceed the available quantity. Constraints (7) and (8) limit the domains of decision variables xj,i and yj,i, respectively.
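The cost objective and the supply constraint can be checked with a few lines of code. The sketch below assumes a particular data layout (0-based nested lists `X[j][i]`, `Y[j][i]`, `cost[i][u]`, and `M[i][u]`) and uses made-up numbers; none of these names or values come from the source.

```python
from collections import defaultdict

def total_cost(X, Y, cost):
    """Eq. (5): sum of c[x_{j,i}, i] * y_{j,i} over lines j and processes i."""
    return sum(cost[i][X[j][i]] * Y[j][i]
               for j in range(len(X)) for i in range(len(X[j])))

def satisfies_supply(X, Y, M):
    """Constraint (6): the machines bought from supplier u for process i,
    summed over all production lines, must not exceed M[i][u]."""
    used = defaultdict(int)
    for j in range(len(X)):
        for i in range(len(X[j])):
            used[(i, X[j][i])] += Y[j][i]
    return all(n <= M[i][u] for (i, u), n in used.items())

# Hypothetical data: w = 2 lines, m = 2 processes, 2 suppliers per process.
cost = [[10, 12], [20, 18]]   # cost[i][u]: unit purchase cost
M = [[4, 3], [5, 5]]          # M[i][u]: machines available from supplier u
X = [[0, 1], [0, 0]]          # X[j][i]: selected supplier (0-based)
Y = [[2, 3], [2, 4]]          # Y[j][i]: number of machines
print(total_cost(X, Y, cost), satisfies_supply(X, Y, M))
```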

Assumptions

  (1) Each transport device is perfectly reliable.

  (2) The defective product flow from the repair workstation \(n_{{j,\beta_{e} }}\) must be repaired from the pointed workstation \(n_{{j,\alpha_{e} }}\) and can only be repaired once. This implies that such defective flow is repaired until a usable state is reached. If the defective flow after reworking is still defective, it is non-reparable and is scrapped (Lin & Chang, 2013).

  (3) The machines configured to workstation nj,i must be from the same supplier to ensure consistency in product quality.

  (4) The states of the different workstations under machine configuration (X, Y) are statistically independent.

  (5) Owing to the distributed nature of production lines, products in one production line cannot be transferred to other production lines.

Figure 2 illustrates the framework of the proposed hybrid method. The NSGA-II searches for the optimal machine configuration, as described in Sect. 4. The AMC and RSDP calculate the production reliability for each machine configuration generated by the NSGA-II, which is discussed in Sect. 3.

Fig. 2 The framework of the hybrid method

3 Production reliability evaluation using AMC and RSDP

Calculating the production reliability R(d, (X, Y)) = \(\sum\nolimits_{{\varvec{S} \in {\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}} }} {\Pr (\varvec{S})}\) directly may be time-consuming or result in an out-of-memory error because it requires enumerating all S \(\in\) \({\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}}\). Lin and Chang (2015) and Yeh et al. (2017) hence recommended searching for all d-MCCPs and then expressing the production reliability as the union probability of all d-MCCPs, where each d-MCCP is a minimal vector in \({\varvec{S}}_{{(\varvec{X},\varvec{Y}),d}}\). This section proposes an AMC-based approach integrated with the RSDP to compute the production reliability for a machine configuration (X, Y). The AMC-based approach enumerates all d-MCCPs, and the RSDP computes their union probability.

3.1 AMC-based approach to determine d-MCCPs

Let \(\Gamma_{j,e}\) = {\(n_{{j,\alpha_{e} }}\), \(n_{{j,\alpha_{e} + 1}}\), …, \(n_{{j,\beta_{e} - 1}}\), \(n_{{j,\beta_{e} }}\)} be a set of sequential workstations for the repair process from \(n_{{j,\alpha_{e} }}\) to \(n_{{j,\beta_{e} }}\). An AMC transition matrix of production line Lj, denoted by Bj, comprises four elements: the transient-to-transient status matrix Uj, transient-to-absorbing status matrix Vj, zero matrix 0, and identity matrix I, which is represented as:

$$ {\varvec{B}}_{j} = \left[ {\begin{array}{*{20}c} {{\varvec{U}}_{j} } & {{\varvec{V}}_{j} } \\ {\mathbf{0}} & {\mathbf{I}} \\ \end{array} } \right],\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
(9)

The AMC transition matrix has m + \(\sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\) + 2 statuses for each production line Lj, j = 1, 2, …, w: m transient statuses for the normal process, \(\sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\) transient statuses for the repair processes, and 2 absorbing statuses (the flow is scrapped, or the product is successfully produced). The transient-to-transient status matrix Uj is \(m + \sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\)-by-\(m + \sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\), and each element of Uj represents the transition probability of an ordered pair of transient statuses (workstations). The transient-to-absorbing status matrix Vj is \(m + \sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\)-by-2, and each element of Vj represents the transition probability from a transient status to an absorbing status. All transition probabilities in Uj and Vj are assigned according to the yield rates of machine configuration (X, Y). The zero matrix 0 is 2-by-\(m + \sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\), and I is a 2-by-2 identity matrix.

The expected value matrix of production line Lj is denoted by Tj and is expressed as:

$$ {\mathbf{T}}_{j} = \left[ {{\mathbf{I}} - {\mathbf{U}}_{j} } \right]^{{ - }{1}} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
(10)

The element in the first row and ηth column of Tj, denoted by \(t_{1,\eta }^{j}\), is the expected amount of flow arriving at the ηth transient status when one unit of product flow enters the first transient status. Subsequently, the absorption probability matrix, denoted by Ej, is expressed as:

$$ {\mathbf{E}}_{j} = {\mathbf{T}}_{j} {\mathbf{V}}_{j} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
(11)

Elements \(E_{1,1}^{j}\) and \(E_{1,2}^{j}\) in the first row of Ej signify the probability of scrap flow and successful production of a finished product, respectively.
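To make Eqs. (9)–(11) concrete, the sketch below builds Uj and Vj for a hypothetical line with m = 3 workstations and one repair action from n2 to n3, so that there are 3 + 2 = 5 transient statuses. The way the yield rates are placed in the matrices is an assumption made for illustration (a good unit moves on with probability equal to the yield rate; a defect found at n3 on the normal path is sent back to n2 once; any other defect is scrapped); the source only states that the probabilities are assigned according to the yield rates. Exact fractions and a small Gauss-Jordan routine are used so that the example needs only the standard library.

```python
from fractions import Fraction as Fr

def inverse(a):
    """Gauss-Jordan inverse of a small square matrix of exact fractions."""
    n = len(a)
    aug = [list(row) + [Fr(int(r == c)) for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# Assumed yield rates of processes 1..3 (illustrative numbers).
p1, p2, p3 = Fr(9, 10), Fr(4, 5), Fr(3, 4)

# Transient statuses: 0, 1, 2 = n1, n2, n3 on the normal path;
#                     3, 4    = n2, n3 on the repair path.
n = 5
U = [[Fr(0)] * n for _ in range(n)]
U[0][1] = p1
U[1][2] = p2
U[2][3] = 1 - p3          # defect detected at n3 -> reworked from n2
U[3][4] = p2

# Absorbing statuses: column 0 = scrapped, column 1 = finished product.
V = [[Fr(0), Fr(0)] for _ in range(n)]
V[0][0] = 1 - p1
V[1][0] = 1 - p2
V[2][1] = p3
V[3][0] = 1 - p2
V[4][0], V[4][1] = 1 - p3, p3   # a reworked unit that fails again is scrapped

I_minus_U = [[Fr(int(r == c)) - U[r][c] for c in range(n)] for r in range(n)]
T = inverse(I_minus_U)    # Eq. (10)
E = matmul(T, V)          # Eq. (11)
print("first row of T:", [str(t) for t in T[0]])
print("E_11 =", E[0][0], " E_12 =", E[0][1])
```

Under these assumptions the two entries of the first row of E sum to one, which is a quick sanity check on the matrices.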

Let Î = (I1, I2, …, Iw) be an input raw material vector and F = (f1,1, f1,2, …, f1,m, f2,1, f2,2, …, fw,m) be a flow vector, where fj,i is the product flow processed by the workstation nj,i. Because there are w production lines to share the demand d, the following constraint must be satisfied:

$$ d = \sum\nolimits_{j = 1}^{w} {d_{j} } , $$
(12)

where dj is the amount shared by the production line Lj. Let d = (d1, d2, …, dw) be the demand vector satisfying Constraint (12). It is necessary to determine the required input vector Î and flow vector F to obtain all d-MCCPs based on the yield rates. Considering probability \(E_{1,2}^{j}\), the necessary raw material input Ij of production line Lj to manufacture dj is computed as:

$$ I_{j} = d_{j} /E_{1,2}^{j} \quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
(13)

Assume that the statuses of production line Lj are indexed 1, 2, …, \(m + \sum\nolimits_{e = 1}^{z} {\left| {\Gamma_{j,e} } \right|}\) + 2. Let Φj,i be the set of the statuses related to workstation nj,i. Each product flow fj,i can be determined as:

$$ f_{j,i} = I_{j} \sum\nolimits_{{\eta \in{\varvec{\varPhi}}_{j,i} }} {t_{1,\eta }^{j} } ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. $$
(14)

Under the machine configuration (X, Y), flow vector F generated by Eq. (14) should fulfill the following maximal production capacity constraint:

$$ f_{j,i} \le y_{j,i} h_{{x_{j,i} ,i}} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. $$
(15)

Any F that does not satisfy Constraint (15) implies that the FSS network (N, A) cannot fulfill the corresponding demand vector d = (d1, d2, …, dw). For convenience, let F = {F | F fulfills Constraint (15)} denote the set of feasible flow vectors. Any production capacity vector S transformed from a feasible flow vector F \(\in\) F via Eq. (16) is regarded as a d-MCCP (Lin & Chang, 2015).

$$ \begin{aligned} s_{j,i} & = \omega h_{{x_{j,i} ,i}} \;{\text{if}}\;(\omega - 1)h_{{x_{j,i} ,i}} < f_{j,i} \le \omega h_{{x_{j,i} ,i}} \;{\text{with}}\;\omega \in \{ 0,1,2, \ldots ,y_{j,i} \} \\ & \quad {\text{for}}\;j = 1,2, \ldots ,w\;{\text{and}}\;i = 1,2, \ldots ,m. \\ \end{aligned} $$
(16)
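Equations (13)–(16) amount to a short calculation once the first row of Tj and \(E_{1,2}^{j}\) are known. The sketch below walks through it for one hypothetical production line with three workstations and one repair action; the numbers in `t_row`, `e12`, `phi`, `h`, and `y`, as well as the function names, are illustrative assumptions.

```python
from fractions import Fraction as Fr
from math import ceil

def required_input(d_j, e12):
    """Eq. (13): raw material input needed to manufacture d_j units."""
    return d_j / e12

def workstation_flows(input_j, t_row, phi):
    """Eq. (14): f_{j,i} = I_j * sum of t_{1,eta} over eta in Phi_{j,i}."""
    return [input_j * sum(t_row[eta] for eta in statuses) for statuses in phi]

def to_capacity_vector(flows, h, y):
    """Constraint (15) and Eq. (16): round each flow up to a multiple of the
    machine capacity; return None if a workstation cannot carry its flow."""
    if any(f > y_i * h_i for f, h_i, y_i in zip(flows, h, y)):
        return None
    return [ceil(f / h_i) * h_i for f, h_i in zip(flows, h)]

# Statuses 0-2: normal path; statuses 3-4: repeat visits to workstations 2, 3.
t_row = [Fr(1), Fr(9, 10), Fr(18, 25), Fr(9, 50), Fr(18, 125)]
e12 = Fr(81, 125)
phi = [[0], [1, 3], [2, 4]]            # Phi_{j,i} with 0-based status indices
I_j = required_input(81, e12)
flows = workstation_flows(I_j, t_row, phi)
print(I_j, flows)
print(to_capacity_vector(flows, h=[50, 40, 30], y=[3, 4, 4]))
```

Exact fractions are used here so that the rounding in Eq. (16) is not affected by floating-point error when a flow lies exactly on a capacity level.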

The following solution procedure based on the AMC methodology is used to determine all the d-MCCPs.

AMC-based approach

  • Input: (1) Order demand d.

    (2) FSS network (N, A): (i) production lines L1, L2, …, Lw, (ii) repair workstations \(n_{{j,\beta_{1} }}\), \(n_{{j,\beta_{2} }}\), …, \(n_{{j,\beta_{z} }}\), and (iii) initial workstations \(n_{{j,\alpha_{1} }}\), \(n_{{j,\alpha_{2} }}\), …, \(n_{{j,\alpha_{z} }}\).

    (3) Machine configuration (X, Y) with each configured machine’s production capacity \(h_{{x_{j,i} ,i}}\) and yield rate \(p_{{x_{j,i} ,i}}\).

  Step 1. Use Eq. (17) to compute the probability of each workstation state under the machine configuration (X, Y).

    $$ \begin{aligned} & \Pr (s_{j,i} = \tau h_{{x_{j,i} ,i}} ) = \left( {\begin{array}{*{20}c} {y_{j,i} } \\ \tau \\ \end{array} } \right)(r_{{x_{j,i} ,i}} )^{\tau } (1 - r_{{x_{j,i} ,i}} )^{{y_{j,i} - \tau }} \quad {\text{for}}\;\tau = 0,{1},{2}, \ldots ,y_{j,i} , \\ & \quad j = {1},{2}, \ldots ,w,\;{\text{and}}\;i = {1},{2}, \ldots ,m. \\ \end{aligned} $$
    (17)

  Step 2. Calculate the probability of each status for all the production lines.

    Step 2.1. Build the AMC transition matrix Bj for each production line.

      $$ {\varvec{B}}_{j} = \left[ {\begin{array}{*{20}c} {{\varvec{U}}_{j} } & {{\varvec{V}}_{j} } \\ {\mathbf{0}} & {\mathbf{I}} \\ \end{array} } \right],\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
      (18)

    Step 2.2. Calculate the expected value matrix Tj for each production line.

      $$ {\mathbf{T}}_{j} = \left[ {{\mathbf{I}} - {\mathbf{U}}_{j} } \right]^{{ - }{1}} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
      (19)

    Step 2.3. Calculate the absorption probability matrix Ej for each production line.

      $$ {\mathbf{E}}_{j} = {\mathbf{T}}_{j} {\mathbf{V}}_{j} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$

      The element \(E_{1,2}^{j}\) in Ej represents the probability of successfully producing a finished product using production line Lj.

  Step 3. Find all demand patterns d = (d1, d2, …, dw) that satisfy the following constraint.

    $$ d = \sum\nolimits_{j = 1}^{w} {d_{j} } . $$
    (20)

  Step 4. Compute the required raw material input vector Î and product flow vector F for each feasible demand pattern d obtained in Step 3 using the following equations:

    $$ I_{j} = d_{j} /E_{1,2}^{j} \quad {\text{for}}\;j = {1},{2}, \ldots ,w. $$
    (21)
    $$ f_{j,i} = I_{j} \sum\nolimits_{{\eta \in{\varvec{\varPhi}}_{j,i} }} {t_{1,\eta }^{j} } ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. $$
    (22)

  Step 5. Retain the feasible flow vectors satisfying the following constraint.

    $$ f_{j,i} \le y_{j,i} h_{{x_{j,i} ,i}} ,\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. $$
    (23)

  Step 6. Use Eq. (24) to transform each F \(\in\) F into a d-MCCP.

    $$ \begin{aligned} s_{j,i} & = \omega h_{{x_{j,i} ,i}} \quad {\text{if}}\;(\omega - 1)h_{{x_{j,i} ,i}} < f_{j,i} \le \omega h_{{x_{j,i} ,i}} \;{\text{with}}\;\omega \in \left\{ {0,{1},{2}, \ldots ,y_{j,i} } \right\} \\ & \quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m. \\ \end{aligned} $$
    (24)

  • Output: All d-MCCPs.
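Step 3 of the procedure above enumerates the ways of sharing the demand among the w production lines. A minimal sketch is given below, assuming that each dj is a nonnegative integer; the source does not state the granularity of the demand shares, and the generator name is hypothetical.

```python
def demand_patterns(d, w):
    """Yield every vector (d_1, ..., d_w) of nonnegative integers with sum d."""
    if w == 1:
        yield (d,)
        return
    for first in range(d + 1):
        for rest in demand_patterns(d - first, w - 1):
            yield (first,) + rest

patterns = list(demand_patterns(4, 2))
print(patterns)
```

Under this assumption the number of patterns is the binomial coefficient C(d + w − 1, w − 1), so the enumeration grows quickly with the number of production lines.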

3.2 RSDP for production reliability computation

Let Ω = {S1, S2, …, Sκ} be the set of all d-MCCPs for configuration (X, Y). The production reliability is reformulated using Eq. (25).

$$ {\varvec{R}}(d,({\varvec{X}},{\varvec{Y}})) = \Pr \left\{ {\bigcup\nolimits_{i = 1}^{\kappa } {\left\{ {\varvec{S}{|}\varvec{S} \ge \varvec{S}_{i} } \right\}} } \right\}. $$
(25)

Such a union probability can be computed using the inclusion–exclusion principle (Hudson & Kapur, 1985; Lin et al., 1995), disjoint-event method (Hudson & Kapur, 1985; Yarlagadda & Hershey, 1991), state-space decomposition (Alexopoulos, 1995; Aven, 1985), and RSDP (Bai et al., 2015; Zuo et al., 2007). The RSDP is proposed based on the sum of disjoint products and has been validated as being more efficient than the other techniques for larger networks (Zuo et al., 2007). Bai et al. (2015) later proposed ordering heuristics to improve the efficiency. Hence, our study applies the improved RSDP to calculate the union probability by inputting the d-MCCPs generated by the AMC-based approach, machine reliability \(r_{{x_{j,i} ,i}}\), and production capacity \(h_{{x_{j,i} ,i}}\). Details of the RSDP procedure can be found in Bai et al. (2015).
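The recursion behind a sum of disjoint products can be sketched in a few lines. The code below is a simplified illustration of the idea only: it writes the union as Pr(A1) plus, for each later event Ai, Pr(Ai) minus the probability that Ai overlaps any earlier event, and it evaluates that overlap recursively using componentwise maxima. It omits the ordering heuristics of Bai et al. (2015), and it represents each d-MCCP by the number of working machines required per workstation rather than by capacity; both are assumptions made to keep the example short.

```python
from math import comb

def tail_prob(y, r, k):
    """Pr(at least k of y machines operate), binomial with reliability r."""
    return sum(comb(y, t) * r**t * (1 - r)**(y - t) for t in range(k, y + 1))

def prob_at_least(vec, ys, rs):
    """Pr(S >= vec) for independent workstations."""
    p = 1.0
    for k, y, r in zip(vec, ys, rs):
        p *= tail_prob(y, r, k)
    return p

def rsdp(vectors, ys, rs):
    """Pr(union over i of {S >= vectors[i]}), computed recursively."""
    if not vectors:
        return 0.0
    total = prob_at_least(vectors[0], ys, rs)
    for i in range(1, len(vectors)):
        # {S >= v_j} and {S >= v_i} intersect in {S >= max(v_j, v_i)}.
        joined = [tuple(max(a, b) for a, b in zip(vectors[j], vectors[i]))
                  for j in range(i)]
        total += prob_at_least(vectors[i], ys, rs) - rsdp(joined, ys, rs)
    return total

# Hypothetical example: two workstations with 2 machines each.
ys, rs = (2, 2), (0.9, 0.8)
mccps = [(1, 2), (2, 1)]      # required working machines per workstation
print(rsdp(mccps, ys, rs))
```

For two vectors the recursion reduces to the familiar inclusion–exclusion formula, which makes the small example easy to verify by hand.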

4 NSGA-II to search for Pareto solutions

In the case of multiple conflicting objectives, a single solution that is optimal for all objectives may not exist. Therefore, a tradeoff solution is required. The NSGA-II was proposed by Deb et al. (2002) and is appropriate for bi-objective optimization (Chambari et al., 2021; Lin et al., 2019). The NSGA-II uses a nondominated sorting method and the crowding distance to rank the chromosomes in the population, and then adopts the evolution process to obtain a Pareto set.

The following procedure illustrates the implementation of the NSGA-II to solve the problem, followed by subsections explaining its modules.

NSGA-II

  Step 1. Generate an initial population, in which the representation of machine configuration (X, Y) is extended to express a chromosome (see Sect. 4.1).

  Step 2. For each chromosome (X, Y), evaluate R(d, (X, Y)) using the AMC-based approach and RSDP (see Sects. 3.1 and 3.2, respectively), and compute C(X, Y) = \(\sum\nolimits_{j = 1}^{w} {\sum\nolimits_{i = 1}^{m} {c_{{x_{j,i} ,i}} y_{j,i} } }\).

  Step 3. Rank the chromosomes in the population.

    Step 3.1. Use the nondominated sorting method to rank the chromosomes in the population (see Sect. 4.2).

    Step 3.2. Calculate the crowding distance of each chromosome to represent its density (see Sect. 4.3).

  Step 4. Implement the evolution process (see Sect. 4.4).

    Step 4.1. Parent selection.

    Step 4.2. Parent crossover.

    Step 4.3. Offspring mutation.

    Step 4.4. Repeat Steps 4.1–4.3 to generate enough offspring.

  Step 5. Update the population (see Sect. 4.5).

  Step 6. Go to Step 4 if the termination condition is not met; otherwise, output the Pareto set.

4.1 Population initiation

Generating a chromosome produces a pair consisting of X and Y. According to the population size (Psize), the following equations are applied Psize times to initialize the population:

$$ x_{j,i} = {\text{Rand}}\left( {{1},q_{i} } \right)\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m,{\text{and}} $$
(26)
$$ y_{j,i} = \left\{ {\begin{array}{*{20}c} {Rand(1,M_{i}^{{x_{j,i} }} )} & {{\text{if}}\;j = 1} \\ {Rand\left( {1,M_{i}^{{x_{j,i} }} - \sum\nolimits_{a = 1}^{j - 1} {y_{a,i} } } \right)} & {{\text{if}}\;j > 1} \\ \end{array} } \right.\quad {\text{for}}\;j = {1},{2}, \ldots ,w\;{\text{and}}\;i = {1},{2}, \ldots ,m, $$
(27)

where Rand(·) is a random function that selects a random integer in the given interval. Equation (26) randomly selects a supplier for each workstation. Given the supplier selected by Eq. (26), Eq. (27) then determines the number of configured machines, limited by the remaining quantity of the selected supplier.
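The initialization can be sketched as follows. This is an interpretation rather than a transcription of Eqs. (26) and (27): the code tracks the remaining stock of each supplier separately, as the text describes, and it restricts the random supplier choice to suppliers that still have stock, which is an added assumption so that every workstation receives at least one machine. The function names and the stock table `M` are hypothetical.

```python
import random

def random_configuration(w, m, M, rng=random):
    """Draw one chromosome (X, Y). M[i][u] is the number of machines that
    supplier u can provide for process i (0-based indices)."""
    X = [[None] * m for _ in range(w)]
    Y = [[None] * m for _ in range(w)]
    for i in range(m):
        remaining = list(M[i])                 # stock left per supplier
        for j in range(w):
            # Assumption: only suppliers with stock left are candidates.
            candidates = [u for u, left in enumerate(remaining) if left >= 1]
            if not candidates:
                raise ValueError(f"no machines left for process {i}")
            u = rng.choice(candidates)
            X[j][i] = u
            Y[j][i] = rng.randint(1, remaining[u])
            remaining[u] -= Y[j][i]
    return X, Y

def initial_population(p_size, w, m, M, rng=random):
    """Repeat the random draw p_size times."""
    return [random_configuration(w, m, M, rng) for _ in range(p_size)]

rng = random.Random(0)                         # fixed seed for repeatability
M = [[4, 3], [5, 5]]                           # hypothetical stock, 2 processes
population = initial_population(4, w=2, m=2, M=M, rng=rng)
print(population[0])
```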

4.2 Nondominated sorting

The nondominated sorting method ranks all the chromosomes in a population based on non-domination. Let Λ = {1, 2, …, Psize} be a set of chromosome indices. For convenience, R(λ) and C(λ) represent the production reliability and total purchase cost of chromosome λ, respectively. The nondominated sorting method is described by the following pseudocode:

Figure a Nondominated sorting method

The four if–then rules within the second for-loop ensure that the chromosome with R(λ) = 0 has a worse rank than that of the feasible ones, and all the infeasible chromosomes are ranked according to the total purchase cost. After comparing all the chromosomes, the chromosomes dominated by fewer chromosomes are assigned better ranks. The output of the procedure is the rank of all the chromosomes.
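Because the pseudocode figure is not reproduced in this text, the sketch below only follows the verbal description: chromosomes are compared pairwise, a chromosome with zero reliability is treated as worse than any feasible one, infeasible chromosomes are compared by cost, and a chromosome dominated by fewer others receives a better rank. The exact form of the four rules and the mapping from domination counts to ranks are assumptions. Note that this count-based ranking differs from the front-by-front sorting in the original NSGA-II of Deb et al. (2002).

```python
def dominates(a, b):
    """a and b are (reliability, cost) pairs. Assumed comparison rules:
    a feasible chromosome (R > 0) beats an infeasible one (R = 0);
    two infeasible chromosomes are compared by cost only;
    two feasible chromosomes use ordinary Pareto dominance."""
    (ra, ca), (rb, cb) = a, b
    if ra > 0 and rb == 0:
        return True
    if ra == 0 and rb > 0:
        return False
    if ra == 0 and rb == 0:
        return ca < cb
    return ra >= rb and ca <= cb and (ra > rb or ca < cb)

def nondominated_ranks(objectives):
    """Rank 1 is best; chromosomes dominated by fewer others rank better."""
    counts = [sum(dominates(other, me) for other in objectives)
              for me in objectives]
    levels = {c: k + 1 for k, c in enumerate(sorted(set(counts)))}
    return [levels[c] for c in counts]

# Hypothetical (reliability, cost) values of five chromosomes.
objs = [(0.9, 10), (0.8, 12), (0.95, 15), (0.0, 5), (0.0, 7)]
print(nondominated_ranks(objs))
```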

4.3 Density evaluation

The crowding distance measures the density of the chromosomes of the same rank. Identically ranked chromosomes are sorted according to the bth objective for b = 1 and 2. Subsequently, the crowding distance of the λth chromosome, denoted by CDλ, is evaluated as:

$$ CD_{\lambda } = \sum\nolimits_{b = 1}^{2} {\Delta_{\lambda ,b} } , $$
(28)

where Δλ,b is the distance of the λth chromosome with respect to the bth objective, and is calculated as:

$$ \Delta_{\lambda ,b} = \left\{ {\begin{array}{*{20}c} {\frac{{K_{\lambda + 1,b} - K_{\lambda - 1,b} }}{{K_{b}^{\max } - K_{b}^{\min } }}} & {{\text{if}}\;K_{\lambda ,b} \ne K_{b}^{\max } \;{\text{and}}\;K_{\lambda ,b} \ne K_{b}^{\min } ;} \\ \infty & {{\text{otherwise}}{.}} \\ \end{array} } \right. $$
(29)

where Kλ,b is the bth objective value of the λth chromosome, and Kλ+1,b and Kλ-1,b represent the neighbors’ bth objective values of the λth chromosome. The maximal and minimal values of the bth objective in the chromosomes are denoted by \(K_{b}^{\max }\) and \(K_{b}^{\min }\), respectively.
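A short sketch of Eqs. (28) and (29) follows. It assumes that the maximal and minimal objective values are taken over the chromosomes of the same rank, which is the usual NSGA-II convention but is not stated explicitly here; the function name and the sample values are illustrative.

```python
import math

def crowding_distances(front):
    """front: list of objective tuples for chromosomes of one rank.
    Returns the crowding distance of each chromosome, Eqs. (28)-(29)."""
    n = len(front)
    cd = [0.0] * n
    for b in range(len(front[0])):
        order = sorted(range(n), key=lambda k: front[k][b])
        k_min, k_max = front[order[0]][b], front[order[-1]][b]
        for pos, k in enumerate(order):
            value = front[k][b]
            if value == k_max or value == k_min:
                cd[k] = math.inf               # boundary chromosome
            else:
                gap = front[order[pos + 1]][b] - front[order[pos - 1]][b]
                cd[k] += gap / (k_max - k_min)
    return cd

# Hypothetical (reliability, cost) values of three same-rank chromosomes.
front = [(0.9, 10), (0.8, 8), (0.95, 15)]
print(crowding_distances(front))
```

A chromosome that is extreme in either objective keeps an infinite distance, so boundary solutions are always preferred when ranks tie.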

4.4 Evolution process

For selection, the NSGA-II randomly selects two chromosomes, λ and θ. Subsequently, the following crowded comparison rule is employed to select the better chromosome according to the rank and crowding distance attributes:

Crowded comparison rule: Chromosome λ performs better than Chromosome θ if (Rankλ = Rankθ and CDλ > CDθ) or (Rankλ < Rankθ).

Rankλ indicates the rank of the λth chromosome in the population. This selection process maintains the species diversity.

The selection operator is repeated to select two chromosomes as parents. Based on the crossover probability Cprob, parents may generate offspring with a uniform crossover, which is a commonly used operator with good convergence (Lim et al., 2017). Bortolini et al. (2022) demonstrated that a uniform crossover can perform well in a manufacturing reconfiguration problem. Figure 3 shows an example of implementing a uniform crossover. A w × m binary mask is generated first, and then the genes of the parents’ X and Y are exchanged at the positions where the mask value is 1. If Constraint (6) is not fulfilled, the offspring must be repaired to be feasible. For instance, workstations n1,4 and n2,4 choose the third supplier in offspring B, and the available quantity of the third supplier in the fourth process is five (i.e., \(M_{4}^{3}\) = 5). Because y1,4 + y2,4 = 6 > 5, y2,4 must be changed to an arbitrary integer in the interval [1, 5 − y1,4].

Fig. 3 Crossover representation
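The uniform crossover and the subsequent repair can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes that X and Y are stored as w × m nested lists (production lines by processes), that each mask bit is drawn with equal probability, and that `quantity[s][j]` (a hypothetical layout) holds the number of machines that supplier s can provide for process j.

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    """Swap the genes of X and Y wherever a random w-by-m binary mask equals 1."""
    (xa, ya), (xb, yb) = parent_a, parent_b
    w, m = len(xa), len(xa[0])
    mask = [[rng.randint(0, 1) for _ in range(m)] for _ in range(w)]
    child_a = ([row[:] for row in xa], [row[:] for row in ya])
    child_b = ([row[:] for row in xb], [row[:] for row in yb])
    for i in range(w):
        for j in range(m):
            if mask[i][j] == 1:
                child_a[0][i][j], child_b[0][i][j] = xb[i][j], xa[i][j]
                child_a[1][i][j], child_b[1][i][j] = yb[i][j], ya[i][j]
    return child_a, child_b

def repair(child, quantity, rng=random):
    """Redraw machine counts that exceed a supplier's remaining quantity."""
    x, y = child
    w, m = len(x), len(x[0])
    for j in range(m):
        used = {}  # machines already taken from each supplier for process j
        for i in range(w):
            s = x[i][j]
            left = quantity[s][j] - used.get(s, 0)
            if y[i][j] > left:
                if left < 1:
                    # Assumption: this case is handled elsewhere, e.g. by
                    # reselecting the supplier; the sketch only reports it.
                    raise ValueError("no feasible machine count remains")
                y[i][j] = rng.randint(1, left)
            used[s] = used.get(s, 0) + y[i][j]
    return child
```

For the example in the text (two lines sharing the third supplier, with a total of six requested machines and five available), `repair` leaves the count of the first line unchanged and redraws the count of the second line from the interval [1, 5 − y1,4].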

The mutation operator may further mutate the offspring based on the mutation probability Mprob. Figure 4 shows an example of this mutation. It also generates a w × m binary mask. If a gene in the mask equals 1, the corresponding gene in X is mutated by randomly reselecting another supplier. The corresponding position’s gene in Y is then randomly assigned a value that satisfies the available quantity of the supplier. The evolution process continues until a total of Psize new chromosomes are produced. Step 2 is then used to compute the production reliability and purchase cost of the new chromosomes.

Fig. 4 Mutation representation
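A minimal sketch of the mutation operator is given below. The mask is passed in as an argument because the way its bits are drawn is not detailed here; `suppliers[j]` (the supplier identifiers available for process j) and `quantity[s][j]` (the available quantity of supplier s for process j) are hypothetical data layouts introduced only for illustration.

```python
import random

def mutate(child, mask, suppliers, quantity, rng=random):
    """Reselect the supplier and machine count wherever the mask equals 1."""
    x, y = child
    for i, row in enumerate(mask):
        for j, bit in enumerate(row):
            if bit != 1:
                continue
            # Candidate suppliers for process j other than the current one.
            others = [s for s in suppliers[j] if s != x[i][j]]
            if not others:
                continue  # only one supplier exists, so nothing to reselect
            x[i][j] = rng.choice(others)
            # Draw a machine count within the new supplier's available quantity.
            y[i][j] = rng.randint(1, quantity[x[i][j]][j])
    return child
```

The sketch checks only the quantity of the newly selected supplier at the mutated position; a feasibility repair such as the one applied after the crossover may still be required when several production lines share one supplier.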

4.5 Population update

After the new chromosomes are generated, they are combined with the current population, which results in a pool of 2 × Psize chromosomes. Before the population is updated to consist of Psize chromosomes, Step 3 is adopted to evaluate the rank and crowded distance of the 2 × Psize chromosomes. Chromosomes are then selected from the pool in ascending order of rank. If including the last selected rank makes the number of selected chromosomes exceed Psize, the chromosomes of that worst rank with the lowest crowded distances are eliminated, such that the number of selected chromosomes, which comprise the next population, is exactly Psize.
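The population update can be sketched as a single sort, assuming the rank and crowded distance of every chromosome in the pool are already available from Step 3. Sorting by ascending rank and, within a rank, by descending crowded distance and then keeping the first Psize entries is equivalent to filling the population rank by rank and truncating the last admitted rank. The function name `update_population` is hypothetical.

```python
def update_population(pool, rank, cd, p_size):
    """Keep the p_size best chromosomes of the combined pool.

    pool: list of chromosomes (parents plus offspring).
    rank, cd: rank and crowded distance of each chromosome in the pool.
    """
    # Lower rank first; within the same rank, larger crowded distance first.
    order = sorted(range(len(pool)), key=lambda i: (rank[i], -cd[i]))
    return [pool[i] for i in order[:p_size]]
```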

5 TOPSIS to determine the best compromise alternative

TOPSIS is a popular method for determining a compromise from a set of alternatives based on multiple criteria (Tzeng & Huang, 2011). It defines the positive and negative ideal alternatives. The positive ideal alternative is the combination of the highest production reliability and lowest purchase cost. Conversely, the negative ideal alternative is the combination of the lowest production reliability and highest purchase cost. The best compromise alternative has the shortest geometric distance from the positive ideal alternative and the longest distance from the negative ideal alternative. That is, TOPSIS can rank nondominated solutions based on their geometric distances. The following procedure uses TOPSIS to settle on a compromise alternative.

TOPSIS

  • Step 1. Construct a decision matrix G = [[g1,1, g1,2], [g2,1, g2,2], …, [gπ,1, gπ,2]], where π is the number of Pareto solutions and gλ,1 and gλ,2 represent the production reliability and purchase cost of the λth Pareto solution, respectively.

  • Step 2. Use the following equation to obtain a standardized decision matrix Ĝ = [[ĝ1,1, ĝ1,2], [ĝ2,1, ĝ2,2], …, [ĝπ,1, ĝπ,2]].

    $$ \hat{g}_{\lambda ,b} = \frac{{g_{\lambda ,b} }}{{\sqrt {\sum\nolimits_{i = 1}^{\pi } {g_{i,b}^{2} } } }}\quad {\text{for}}\;\lambda = {1},{2}, \ldots ,\pi \;{\text{and}}\;b = {1},{2}. $$
    (30)
  • Step 3. Use the following equation to create a weighted decision matrix \(\overline{\user2{G}}\) = [[\(\overline{g}_{1,1}\), \(\overline{g}_{1,2}\)], [\(\overline{g}_{2,1}\), \(\overline{g}_{2,2}\)], …, [\(\overline{g}_{\pi ,1}\), \(\overline{g}_{\pi ,2}\)]].

    $$ \overline{g}_{\lambda ,b} = \hat{g}_{\lambda ,b} \times \hat{w}_{b} \quad {\text{for}}\;\lambda = {1},{2}, \ldots ,\pi \;{\text{and}}\;b = {1},{2}, $$
    (31)

    where ŵb is the weight of the bth objective. These weights can be determined either subjectively or objectively. The analytic hierarchy process and information entropy are popular subjective and objective weighting methods, respectively (Chen et al., 2020a; Yu et al., 2020).

  • Step 4. Determine the positive ideal alternative (\(g_{1}^{ + }\), \(g_{2}^{ + }\)) and negative ideal alternative (\(g_{1}^{ - }\), \(g_{2}^{ - }\)) as:

    $$ \left( {g_{1}^{ + } ,g_{2}^{ + } } \right) = \left( {{\text{max}}\left( {\overline{g}_{{{1},{1}}} ,\overline{g}_{{{2},{1}}} , \ldots ,\overline{g}_{{\pi ,{1}}} } \right),{\text{min}}\left( {\overline{g}_{{{1},{2}}} ,\overline{g}_{{{2},{2}}} , \ldots ,\overline{g}_{{\pi ,{2}}} } \right)} \right),\;{\text{and}} $$
    (32)
    $$ \left( {g_{1}^{ - } ,g_{2}^{ - } } \right) = \left( {{\text{min}}\left( {\overline{g}_{{{1},{1}}} ,\overline{g}_{{{2},{1}}} , \ldots ,\overline{g}_{{\pi ,{1}}} } \right),{\text{max}}\left( {\overline{g}_{{{1},{2}}} ,\overline{g}_{{{2},{2}}} , \ldots ,\overline{g}_{{\pi ,{2}}} } \right)} \right). $$
    (33)
  • Step 5. Calculate the geometric distance of each alternative to the positive ideal alternative (\(g_{1}^{ + }\), \(g_{2}^{ + }\)) using Eq. (34).

    $$ GD_{\lambda }^{ + } = \sqrt {\sum\nolimits_{b = 1}^{2} {\left( {\overline{g}_{\lambda ,b} - g_{b}^{ + } } \right)^{2} } } \quad {\text{for}}\;\lambda = {1},{2}, \ldots ,\pi . $$
    (34)
  • Step 6. Calculate the geometric distance of each alternative to the negative ideal alternative (\(g_{1}^{ - }\), \(g_{2}^{ - }\)) using Eq. (35).

    $$ GD_{\lambda }^{ - } = \sqrt {\sum\nolimits_{b = 1}^{2} {\left( {\overline{g}_{\lambda ,b} - g_{b}^{ - } } \right)^{2} } } \quad {\text{for}}\;\lambda = {1},{2}, \ldots ,\pi . $$
    (35)
  • Step 7. Calculate the relative proximity Ωλ of each alternative using Eq. (36).

    $$ \Omega_{\lambda } = \frac{{GD_{\lambda }^{ - } }}{{GD_{\lambda }^{ + } + GD_{\lambda }^{ - } }}\quad {\text{for}}\;\lambda = {1},{2}, \ldots ,\pi . $$
    (36)
  • Step 8. Choose the alternative with maximum relative proximity as the compromise alternative.
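Steps 1 to 8 can be sketched in a few lines of Python. The snippet is a minimal illustration that assumes the standard TOPSIS relative proximity GD⁻/(GD⁺ + GD⁻), treats production reliability as the criterion to maximize and purchase cost as the criterion to minimize, and uses a hypothetical function name `topsis`; the numbers in the usage example are made up and are not taken from the experiments.

```python
import math

def topsis(G, weights):
    """Return (index of the compromise alternative, relative proximities).

    G: list of (production reliability, purchase cost) pairs.
    weights: (w1, w2) for reliability and cost.
    """
    # Step 2: vector-normalize each criterion column.
    norms = [math.sqrt(sum(row[b] ** 2 for row in G)) for b in range(2)]
    # Step 3: weighted decision matrix.
    W = [[row[b] / norms[b] * weights[b] for b in range(2)] for row in G]
    # Step 4: positive and negative ideal alternatives.
    pos = (max(r[0] for r in W), min(r[1] for r in W))
    neg = (min(r[0] for r in W), max(r[1] for r in W))
    prox = []
    for r in W:
        d_pos = math.dist(r, pos)  # Step 5
        d_neg = math.dist(r, neg)  # Step 6
        total = d_pos + d_neg
        prox.append(d_neg / total if total > 0 else 0.0)  # Step 7
    # Step 8: the alternative with maximum relative proximity.
    best = max(range(len(G)), key=prox.__getitem__)
    return best, prox

# Illustrative data only.
alternatives = [(0.99, 5000.0), (0.95, 3000.0), (0.90, 2900.0)]
best, prox = topsis(alternatives, (0.5, 0.5))
```

Because the relative proximity lies between 0 and 1, an alternative coinciding with the positive ideal alternative attains the value 1.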

6 Numerical experiments of solar cell manufacturing

Solar cell manufacturing typically follows a flow shop model with multiple parallel machines at each workstation performing the same process (Chen et al., 2020b). The production line for solar cell manufacturing involves eight steps: texturing, diffusion, phosphorus glass etching, anti-reflective coating, screen printing, fast firing, edge isolation, and testing. In addition, the production line includes three repair workstations for the second, third, and fourth processes. If defective WIP flows are produced at these workstations, they are returned to the first process for further processing (Raval & Reddy, 2019; Song & Lin, 2018). Figure 5 illustrates the FSS network for solar cell manufacturing, which comprises two production lines. Table 1 presents the technical specifications of the machines used in the process, which are sourced from the solar cell device suppliers. These parameters are based on historical usage experience, or can be obtained by requesting product information from the suppliers.

Fig. 5 The network model of the solar cell manufacturing system

Table 1 The machine data of the solar cell device suppliers

In this section, the applicability and efficiency of integrating the NSGA-II, AMC, RSDP, and TOPSIS are demonstrated using a solar cell manufacturing case. All algorithms are programmed in Python and executed on a Windows 11 computer with an Intel Core i7-9700 CPU (3.00 GHz) and 16 GB of RAM.

6.1 Pareto set determination and measurement

This study considers three demand scenarios: d = 4,000, d = 5,000, and d = 6,000. For each scenario, we perform ten trials using the NSGA-II and generate a reference set that includes all the nondominated solutions in the 10 Pareto sets. Based on parameter recommendations from bi-objective component assignment studies (Lin & Yeh, 2012; Lin et al., 2019; Yeh et al., 2023), we set the NSGA-II parameters to (Psize, Cprob, Mprob) = (100, 0.6, 0.025). During each trial, we run the NSGA-II for 1500 s.
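The construction of the reference set from the ten Pareto sets can be sketched as a nondominated filter over their union. The snippet assumes that each solution is represented by its objective pair (production reliability, purchase cost), with reliability maximized and cost minimized; the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b in both objectives and better in one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def reference_set(pareto_sets):
    """Merge several Pareto sets and keep only the nondominated solutions."""
    pool = {tuple(s) for ps in pareto_sets for s in ps}  # drop duplicates
    return sorted(s for s in pool if not any(dominates(o, s) for o in pool))
```

A solution found in one trial is thus discarded from the reference set as soon as any other trial provides a solution that dominates it.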

To evaluate the quality of the obtained Pareto set, we employ three different metrics: the number of solutions in the reference set (NSR), ratio of nondominated individuals (RNI), and generational distance (GD) (Yeh, 2019; Yen & He, 2013). Let Pset be a nondominated set and Rset be a reference set, where all εi \(\in\) Pset and γj \(\in\) Rset are the normalized elements. These three metrics are defined as:

$$ {\text{NSR}} = \left| {{\text{Pareto}}\;{\text{set}} \cap {\text{reference}}\;{\text{set}}} \right|, $$
(37)
$$ {\text{RNI}} = \frac{{\left| {{\text{Pareto}}\;{\text{set}} \cap {\text{reference}}\;{\text{set}}} \right|}}{{\left| {{\text{Pareto}}\;{\text{set}}} \right|}},{\text{and}} $$
(38)
$$ {\text{GD}} = \sum\nolimits_{i = 1}^{{{\varvec{P}}_{{{\varvec{size}}}} }} {\sum\nolimits_{j = 1}^{{{\varvec{R}}_{{{\varvec{size}}}} }} {\left( {\varepsilon_{i} - \gamma_{j} } \right)^{2} } } . $$
(39)

A smaller GD, higher NSR, and higher RNI indicate a superior Pareto set. If all solutions in the nondominated set are members of the reference set, GD is equal to 0 and RNI is equal to 1.
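The two set-based metrics in Eqs. (37) and (38) can be computed directly with set operations. The sketch below covers only NSR and RNI and assumes that solutions are hashable objective tuples compared for exact equality; the function names are hypothetical.

```python
def nsr(pareto_set, reference_set):
    """Number of solutions of the Pareto set that belong to the reference set."""
    return len(set(pareto_set) & set(reference_set))

def rni(pareto_set, reference_set):
    """Share of the Pareto set that belongs to the reference set."""
    unique = set(pareto_set)
    if not unique:
        return 0.0
    return len(unique & set(reference_set)) / len(unique)
```

For instance, a Pareto set with two solutions, one of which appears in the reference set, has NSR = 1 and RNI = 0.5.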

Table 2 lists the results of each demand scenario in terms of the three metrics. For each scenario, the best trial, whose Pareto set has a lower GD and higher RNI and NSR, is marked in bold. In particular, these best trials have markedly better GD and RNI than the average GD and RNI. Therefore, they can be used for further decision-making.

Table 2 The experimental results from the NSGA-II

6.2 Compromise alternative selection

TOPSIS can determine a compromise alternative from a set of solutions. Because the compromise alternative involves two criteria, namely production reliability and total purchase cost, it is necessary to set weights for both criteria before executing TOPSIS. Suppose a system administrator subjectively considers three weight settings: (ŵ1, ŵ2) = (0.8, 0.2) (reliability preference), (ŵ1, ŵ2) = (0.2, 0.8) (cost preference), and (ŵ1, ŵ2) = (0.5, 0.5) (no preference). Using the reference set for each demand scenario described in Sect. 6.1, we apply TOPSIS to determine the compromise alternative for each weight setting. The compromise alternatives and corresponding machine configurations are listed in Table 3. Figures 6, 7 and 8 show the scatter plots of all the points in the three reference sets and highlight the compromise alternatives for all the weight settings. This information can support the system administrators' decision-making. For example, if the system administrator expects the solar cell FSS to have a production reliability of more than 0.98 and an approximate budget of 4,000 (unit: ten thousand NTD), the machine configuration (X, Y) = ((1, 3, 1, 1, 1, 2, 2, 2, 1, 3, 2, 1, 2, 1, 1, 1), (1, 1, 1, 1, 1, 1, 3, 1, 4, 3, 2, 2, 3, 2, 3, 5)) can be considered for the FSS design. In addition, as shown in Fig. 6, the compromise alternatives lean toward lower costs under all three weight settings because the alternatives all have high production reliability with little difference among them. In such cases, system administrators can make decisions by focusing on the purchase cost criterion.

Table 3 The compromise alternatives determined by TOPSIS

Fig. 6 The compromise alternative selection for d = 4000

Fig. 7 The compromise alternative selection for d = 5000

Fig. 8 The compromise alternative selection for d = 6000

6.3 Efficiency validation for NSGA-II

This subsection focuses on validating the efficiency of the NSGA-II. SPEA2 is an efficient multi-objective genetic algorithm (GA) for optimization problems (Zitzler et al., 2001). Like the NSGA-II, the SPEA2 is a population-based multi-objective GA, and it can therefore readily adopt our crossover and mutation operators (Biswas & Pal, 2021). In addition, the SPEA2 has a domination-based framework similar to that of the NSGA-II, and thus can maintain a good spread of solutions and converge well to the exact Pareto optimal front (Ma et al., 2023). Thus, it is considered comparable to the NSGA-II. Under the same parameter setting (Psize, Cprob, Mprob) = (100, 0.6, 0.025) and terminal condition of a 1500-s runtime, the SPEA2 is also run for ten trials for each of the three scenarios given in Sect. 6.1. For each scenario, the Pareto set of the best NSGA-II trial is combined with that of the best SPEA2 trial to form a reference set. The GD, RNI, and NSR of both methods are then calculated according to the reference set and Eqs. (37)–(39). Table 4 lists the reference set and three metrics for both methods. The SPEA2 can determine more nondominated solutions, such that it has a better NSR. However, the NSGA-II performs better in terms of both the GD and RNI. Under the same terminal conditions, the nondominated solutions from the NSGA-II usually belong to the reference set. Therefore, the Pareto set from the NSGA-II is more informative in this case.

Table 4 Comparisons of the NSGA-II and the SPEA2

7 Conclusions

This study proposes a hybrid method that integrates the NSGA-II, AMC, RSDP, and TOPSIS to determine the reliability-cost-oriented machine configuration for an FSS with repair actions. The problem addressed involves the evaluation of production reliability. The AMC-based approach uses a transition matrix to calculate the required raw material input and product flow traveling through each workstation according to the order demand d without enumerating all the processing paths, as in the path decomposition method proposed by Lin and Chang (2015). Subsequently, the d-MCCPs are determined in terms of these flows. The RSDP then calculates the union probability of the d-MCCPs to acquire the production reliability.

The proposed method adopts the NSGA-II to identify the Pareto solutions that balance the production reliability and total purchase cost. To achieve this, we integrate the AMC and RSDP into the NSGA-II for production reliability calculation. TOPSIS is then employed to select a compromise alternative based on the given weights. In Sect. 6, we demonstrate the applicability of integrating the NSGA-II and TOPSIS. The information derived from the hybrid method could support decision-making regarding FSS design. Additionally, our experimental results indicate that the NSGA-II is computationally more efficient than the SPEA2 under the terminal condition of a 1500-s runtime. Overall, this study makes the following contributions:

  (1) The tradeoff between the production reliability and purchase cost in the machine configuration of an FSS with repair actions is considered.

  (2) A mathematical model of the addressed bi-objective problem is built, in which the FSS with repair actions is modeled as a typical multistate flow network.

  (3) An alternate way is proposed to evaluate production reliability by combining the AMC with RSDP.

  (4) A hybrid method that integrates the AMC, RSDP, NSGA-II, and TOPSIS to solve this problem is built.

  (5) The applicability of our hybrid method using the solar cell FSS is illustrated.

  (6) The superiority of NSGA-II over SPEA2 is demonstrated by comparing their efficiency.

In certain FSSs, the product flow can undergo multiple repairs in different production lines. However, the proposed AMC-based approach is only applicable when the product flow can be repaired at the same workstation. Future research could expand on this problem by considering scenarios where the product flow can undergo multiple repairs at other production lines and then explore the impact of this situation on the production reliability.