1 Introduction

Process integration is a prerequisite for successful supply chain digitalization (Pourhejazy, 2022). System-wide optimization tools should be employed for the planning and control of decentralized production activities. Distributed scheduling problems address this practical need by planning production operations across different supply chain facilities simultaneously. This integrated view of production planning emphasizes coordination between different units to meet global demand while ensuring optimal overall performance.

The Distributed Two-Stage Assembly Flowshop Scheduling Problem (DTSAFSP; Xiong & Xing, 2014) and the Distributed Assembly Permutation Flowshop Scheduling Problem (DAPFSP; Hatami et al., 2013) are the mainstream variants of distributed production scheduling under the flowshop setting, i.e., when all jobs follow the same processing route and the shop floors are designed considering the flow of jobs. The former variant, DTSAFSP, models a distributed manufacturing system with production and assembly operations at every plant. The latter, DAPFSP, represents a more practical, supply chain-like setting in which an assembly plant is dedicated to the assembly of components arriving from different manufacturing facilities. Given the widespread use of DAPFSP in modern manufacturing, new extensions and solution algorithms are being developed to accommodate this practical scheduling problem and extend its industrial reach.

Among the existing studies, Hatami et al. (2013) developed three heuristics for solving DAPFSPs, of which the variable neighborhood descent method yielded the best outcomes. Lin and Zhang (2016) introduced a hybrid biogeography-based optimization algorithm; S.-Y. Wang and Wang (2016) developed an estimation of distribution algorithm-based memetic algorithm; Lin et al. (2017) developed a backtracking search hyper-heuristic; and Sang et al. (2019) put forward an invasive weed optimization algorithm. More recently, Ferone et al. (2020) solved the DAPFSP using a biased-randomized iterated local search. Solution algorithms based on NEH (Nawaz, Enscore, and Ham; Nawaz et al., 1983) with new coding structures and job sorting rules have also been developed for solving DAPFSP variants (Ying et al., 2020). For a detailed review of the published works, we refer interested readers to the literature review in a recent study by Pourhejazy et al. (2022).

The literature on DAPFSP assumes that the components used in the assembly of the final product have no subcomponents. In practice, however, the components are often complex and require processing. Taking the automotive and heavy equipment industries as examples, main components, such as the engine, are made of many subcomponents that are manufactured and assembled by the component suppliers, the Original Equipment Manufacturers (OEMs). After procurement, the main components are assembled into the final product at the product manufacturer’s plants. This practical supply chain situation can be modeled by the Distributed Three-Stage Assembly Permutation Flowshop Scheduling Problem (DTrSAPFSP), which is a combination of DTSAFSP and DAPFSP.

As a new variant of distributed scheduling problems, DTrSAPFSP can be used for the integrated scheduling of the production of sub-components, the assembly of the sub-components into the main parts/components, and the assembly of the components into the final product. This integration makes the scheduling model highly intractable and calls for tailored solution methods that advance distributed manufacturing and supply chain digitalization. Very few published and pre-published papers relate to DTrSAPFSP. J. Wang, Lei, and Cai (2022) and J. Wang, Lei, and Li (2022) extended distributed three-stage assembly scheduling, the variant most relevant to the DTrSAPFSP, by considering maintenance and setup time constraints. In a preprint, Hao et al. (2022) developed three Social Spider Optimization (SSO) metaheuristics for solving the DTrSAPFSP. They showed that their solution methods outperform the earlier algorithms developed for solving DAPFSPs; this is the only study on solving the DTrSAPFSP. To address this gap, the present paper develops a novel constructive heuristic algorithm, hereafter called the N-list-enhanced Constructive Heuristic (NCH). The NCH method is compared with state-of-the-art algorithms to evaluate its strength. Statistical tests are then used to check the significance of the difference between NCH and the existing methods. The developed algorithm is suited to parallel computing and is expected to serve as a strong benchmark in the field.

The rest of this paper begins with a review of the relevant literature in Sect. 2. A detailed explanation of the computational elements of the developed solution method follows in Sect. 3. Numerical experiments and statistical analysis of the results are then presented in Sect. 4 to report the performance of the new algorithm. Section 5 concludes this research by summarizing the major findings and providing insights for further developments in the topic of three-stage production scheduling.

2 Relevant literature

Multistage production processes have been extensively studied in the forms of flexible flowshop (Luo et al., 2019) and two-stage assembly flowshop (Lee et al., 1993). Distributed flowshop scheduling has seen considerable development in recent years considering both modeling and solution algorithms.

2.1 Distributed assembly permutation flowshop and distributed two-stage assembly scheduling problems

Hatami et al. (2013) introduced the Distributed Assembly Permutation Flowshop Scheduling Problem (DAPFSP). This work inspired other scheduling extensions; in one of the seminal works, Yang and Xu (2020) suggested that more than one machine should be considered in the assembly stage (i.e., flexible assembly). The DAPFSP with flexible assembly assumes that the assembly stage is executed in a different plant and that no subassemblies are required for producing the components. Xiong and Xing (2014) put forward the Distributed Two-Stage Assembly Scheduling Problem (DTSASP), in which the production and assembly operations of a product are performed in the same factory. DTSASP schedules production operations at the OEMs only, so the manufacturing of the final product must be scheduled separately. To obtain an integrated production plan covering parts and components at the OEMs and final products at the mother company, DTSASP should be merged with flexible assembly in a separate factory. For this reason, Hao et al. (2022) proposed the DTrSAPFSP and presented three SSO-based metaheuristics to solve it.

2.2 Distributed three-stage assembly permutation flowshop scheduling problems

DTrSAPFSP assumes that there are several OEMs, each of which is responsible for producing parts and assembling them into a certain component. The components from the OEMs then arrive at the main company’s plant, where they are assembled and processed into final products.

The solution algorithms developed for solving DTSAFSPs, the variant most relevant to DTrSAPFSP, are relatively limited. Among the seminal works, Xiong et al. (2014) developed a Variable Neighborhood Search-based approach for minimizing the total completion time in DTSAFSP with setup times. Zhang and Xing (2018) developed an SSO algorithm for minimizing the total completion time in DTSAFSP. Deng et al. (2016) introduced the Competitive Memetic Algorithm (CMA) for minimizing the makespan in DTSAFSPs, which outperformed the earlier algorithms. Most recently, Pourhejazy et al. (2022) developed the Meta-Lamarckian-based Iterated Greedy algorithm for optimizing DTSAFSP with mixed setups, which outperformed the earlier best-performing algorithms in minimizing the makespan.

Among the published studies, Zheng and Wang (2021) developed a new variant of the Bat Optimization Algorithm for solving the three-stage distributed assembly permutation flowshop scheduling problem. J. Wang, Lei, and Li (2022) developed a Q-learning-based Artificial Bee Colony (ABC) algorithm for solving the distributed three-stage assembly scheduling problem with factory eligibility and setup times. J. Wang, Lei, and Cai (2022) developed an adaptive ABC for solving the distributed three-stage assembly scheduling problem with maintenance. These metaheuristics used basic constructive algorithms, such as NEH (Nawaz et al., 1983), in their initialization stage. Despite its merits, NEH cannot obtain good initial solutions for complex problems and needs to be adjusted for different scheduling problems. Given the stochastic nature of the above metaheuristics, the quality of the final solution depends heavily on that of the initial solution, especially when dealing with highly intractable problems. In the next section, a new constructive heuristic is developed to contribute to the advances in distributed and multi-stage scheduling.

3 Optimization method

3.1 Problem definition

We investigate a distributed manufacturing system where geographically dispersed factories produce the parts/sub-components (Stage I), assemble the components (Stage II), and send the components to a main factory for the assembly of the final product (Stage III); this process is illustrated in Fig. 1. The problem assumes that all jobs follow the same route and that each job can be processed at only one machine/assembly stage at a time. The processing times are deterministic and independent of the job/product sequence. Once a production/assembly job is assigned to a factory, it cannot be re-assigned. The model is denoted by \((DF_{m} \to 1) \to 1 || C_{max}\), with the first part indicating that the distributed system has \(m\) parallel machines in the production stage of every factory; one assembly machine completes the sub-component, and one assembly stage forms the final product. The objective is to find the production schedule with the (near-)minimum maximum completion time (makespan; \(C_{max}\)).
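To make the objective concrete, the following minimal sketch evaluates \(C_{max}\) for a given schedule. It rests on assumptions that are not stated in this section: the production stage of each factory is treated as a permutation flowshop (as the problem name suggests), Stage II assembles components in the order in which they are sequenced in the factory, and Stage III assembles products in a given order on a single machine. All function and parameter names are hypothetical.

```python
from typing import Dict, Sequence


def three_stage_makespan(
    factories: Sequence[Sequence[int]],       # component order in each factory
    parts_of: Dict[int, Sequence[int]],       # component -> ordered part ids
    proc: Dict[int, Sequence[float]],         # part -> time on each machine
    asm2: Dict[int, float],                   # component -> Stage II assembly time
    components_of: Dict[int, Sequence[int]],  # product -> component ids
    asm3: Dict[int, float],                   # product -> Stage III assembly time
    product_order: Sequence[int],             # assumed Stage III sequence
    n_machines: int,
) -> float:
    """Makespan of a three-stage schedule under the assumptions listed above."""
    comp_done: Dict[int, float] = {}
    for comp_seq in factories:
        machine_free = [0.0] * n_machines  # Stage I machine release times
        asm_free = 0.0                     # Stage II assembly machine release time
        for c in comp_seq:
            for j in parts_of[c]:
                prev = 0.0
                for k in range(n_machines):
                    # A part starts on machine k when both the machine and
                    # the part (from machine k-1) are available.
                    prev = max(machine_free[k], prev) + proc[j][k]
                    machine_free[k] = prev
            # The component is assembled once all of its parts are finished.
            asm_free = max(asm_free, machine_free[-1]) + asm2[c]
            comp_done[c] = asm_free
    t = 0.0
    for p in product_order:
        ready = max(comp_done[c] for c in components_of[p])
        t = max(t, ready) + asm3[p]
    return t


# Hypothetical toy data (not taken from the paper).
toy = dict(
    factories=[[1], [2, 3]],
    parts_of={1: [1, 2], 2: [3], 3: [4]},
    proc={1: [2, 3], 2: [1, 2], 3: [4, 1], 4: [1, 1]},
    asm2={1: 4, 2: 2, 3: 3},
    components_of={1: [1, 2], 2: [3]},
    asm3={1: 5, 2: 2},
    n_machines=2,
)
print(three_stage_makespan(product_order=[2, 1], **toy))
```

An evaluation function of this kind is what a constructive heuristic calls repeatedly when comparing alternative insertions.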

Fig. 1 Visual illustration of the three-stage scheduling problem

3.2 Solution algorithm

This study introduces the NCH algorithm as an alternative to the NEH-based algorithms with various initial sorting and tie-breaking rules. Inspired by the N-list technique (Puka et al., 2021), the NCH algorithm adjusts every step of the algorithm considering the problem characteristics. The N-list technique uses a list of N jobs that are candidates for establishing the job sequence. At each stage of the algorithm, each job candidate in the current N-list is individually inserted into all possible positions in the partial sequence, and the one with the best performance index is assigned. The procedure continues until all jobs are assigned and a complete solution is obtained. Employing the N-list technique enables the NEH-based algorithm to run the search procedure in parallel computing environments, which is complicated, if not impossible, to accomplish with traditional methods.
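The N-list insertion idea described above can be sketched as follows. This is an illustrative sketch rather than the NCH implementation: the two-machine flowshop makespan used as the performance index, the function names, and the rule of keeping the first-found alternative in case of ties are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence


def flowshop_makespan(seq: Sequence[int], proc: Dict[int, Sequence[int]]) -> int:
    """Makespan of a permutation flowshop for the job order `seq`."""
    if not seq:
        return 0
    free = [0] * len(proc[seq[0]])
    for j in seq:
        prev = 0
        for k, t in enumerate(proc[j]):
            prev = max(free[k], prev) + t
            free[k] = prev
    return free[-1]


def n_list_insertion(
    ordered_jobs: Sequence[int],
    cost: Callable[[List[int]], float],
    n_list_size: int = 2,
) -> List[int]:
    """Build a sequence by testing the first N unassigned jobs in every position."""
    remaining = list(ordered_jobs)
    if not remaining:
        return []
    partial = [remaining.pop(0)]
    while remaining:
        best = None
        for job in remaining[:n_list_size]:        # the current N-list
            for pos in range(len(partial) + 1):    # every insertion position
                trial = partial[:pos] + [job] + partial[pos:]
                value = cost(trial)
                if best is None or value < best[0]:
                    best = (value, job, trial)
        _, chosen, partial = best
        remaining.remove(chosen)
    return partial


# Hypothetical processing times on two machines (not taken from the paper).
proc = {1: [3, 2], 2: [1, 4], 3: [2, 2]}
order = sorted(proc, key=lambda j: -sum(proc[j]))  # non-increasing total time
sequence = n_list_insertion(order, lambda s: flowshop_makespan(s, proc))
print(sequence, flowshop_makespan(sequence, proc))
```

Because the evaluations of the candidate insertions within one iteration are independent of each other, they are natural units of work for parallel execution.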

Fig. 2 shows the pseudocode of the developed NCH algorithm. The computational procedure consists of four steps: (1) sorting the initial sequence of parts within each component; (2) sorting the initial sequence of components within each product; (3) sorting the initial sequence of products; and (4) sorting the sequence of parts within each factory and the assembly sequence of the products. These procedures are explained in two phases, one at the component level and the other at the product level.

Fig. 2 Pseudocode of the N-list-enhanced Constructive Heuristic

We now elaborate on the computational steps.

3.2.1 Phase I. Component-level sequencing

figure a

An illustrative example is now provided to clarify the above steps. Assume a small example with ten parts, two machines, four components, two products, and two factories. In the first assembly phase, the first and second components each contain three parts, while the third and fourth components each contain two parts. In the second assembly phase, each product is made from two components. Table 1 summarizes the processing times in the illustrative example.

Table 1 Configuration of the illustrative example

In Step 1, the total processing time of each part is calculated as follows: \(SUM_{1} = 5,\) \(SUM_{2} = 6,\) \(SUM_{3} = 6,\) \(SUM_{4} = 6,\) \(SUM_{5} = 5,\) \(SUM_{6} = 4,\) \(SUM_{7} = 5,\) \(SUM_{8} = 3,\) \(SUM_{9} = 5,\) \(SUM_{10} = 7.\) In Step 2, the initial order of the parts of each component, based on the total processing time, is: \(\pi_{1}^{PL} = \{ j_{2} \, j_{3} \, j_{1} \} ,\) \(\pi_{2}^{PL} = \{ j_{4} \, j_{5} \, j_{6} \} ,\) \(\pi_{3}^{PL} = \{ j_{7} \, j_{8} \} ,\) and \(\pi_{4}^{PL} = \{ j_{10} \, j_{9} \} .\) Taking component \(c_{1}\) as an example, after setting \(\pi_{1}^{part} = \{ j_{2} \}\) and \(\pi_{1}^{RPL} = \{ j_{3} \, j_{1} \}\), Step 3 extracts the first two parts of \(\pi_{1}^{RPL}\), i.e., \(j_{3}\) and \(j_{1}\), and inserts each of them into all possible positions of \(\pi_{1}^{part}\), which yields the two-part sequences \(\{ j_{3} \, j_{2} \} ,\) \(\{ j_{2} \, j_{3} \} ,\) \(\{ j_{1} \, j_{2} \} ,\) and \(\{ j_{2} \, j_{1} \} ,\) as shown in Fig. 3a–d. At this stage, the sequences with the minimum completion time are \(\{ j_{1} \, j_{2} \}\) and \(\{ j_{2} \, j_{1} \}\), of which \(\{ j_{1} \, j_{2} \}\) is randomly selected and set as \(\pi_{1}^{part} .\) Then, the part \(j_{3}\) is inserted into all possible positions of \(\pi_{1}^{part}\), as shown in Fig. 3e–g, among which \(\{ j_{3} \, j_{1} \, j_{2} \}\) is the best part sequence of component \(c_{1}\). The above procedure is repeated for the remaining components to obtain the best part sequences at the component level, as shown in Fig. 4.
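Steps 1 and 2 of the example can be reproduced with a short sketch. The \(SUM\) values and the part-to-component membership are read off the text above; the non-increasing sorting rule and the index-based tie-breaking (a stable sort) are inferred from the resulting lists and should be treated as assumptions. The names are hypothetical.

```python
from typing import Dict, List

# Total processing time of each part (SUM_j), as listed in the example.
total_time: Dict[int, int] = {
    1: 5, 2: 6, 3: 6, 4: 6, 5: 5, 6: 4, 7: 5, 8: 3, 9: 5, 10: 7,
}

# Parts belonging to each component, as implied by the lists in the example.
component_parts: Dict[int, List[int]] = {
    1: [1, 2, 3],
    2: [4, 5, 6],
    3: [7, 8],
    4: [9, 10],
}


def initial_part_lists(
    component_parts: Dict[int, List[int]], total_time: Dict[int, int]
) -> Dict[int, List[int]]:
    """Order the parts of each component by non-increasing total processing time.

    Python's sort is stable, so parts with equal totals keep their index order
    (an assumed tie-breaking rule).
    """
    return {
        c: sorted(parts, key=lambda j: -total_time[j])
        for c, parts in component_parts.items()
    }


part_lists = initial_part_lists(component_parts, total_time)
print(part_lists)
```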

Fig. 3 An illustrative example for ordering parts of a component

Fig. 4 Best part sequences at the component level

3.2.2 Phase II. Product-level sequencing

Based on the best part sequences from Phase I, the following steps determine the order of parts at the product level.

figure b

The procedure of Phase II is illustrated by applying it to the previous example (Table 1). In Step 1, given the completion time of each component, as shown in Fig. 4, the resulting component lists of the products \(p_{1}\) and \(p_{2}\) are \(\pi_{1}^{CL} = \{ c_{2} ,\; c_{1} \}\) and \(\pi_{2}^{CL} = \{ c_{3} ,\; c_{4} \}\), respectively. Given that \(CT(\pi_{1}) = 13 + 16 = 29\) and \(CT(\pi_{2}) = 11 + 15 = 26\), the resulting product list after applying Step 2 is \(\pi^{PdL} = \{ p_{2} ,\; p_{1} \}\). In Step 3, the first product in \(\pi^{PdL}\), i.e., \(p_{2}\), is extracted and its associated components \(c_{3}\) and \(c_{4}\) are sequentially inserted into factory 1 and factory 2, respectively; the result is shown in Fig. 5.
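Steps 1 and 2 of Phase II can be sketched in a few lines. The assignment of the completion times 13, 16, 11, and 15 to the individual components is an assumption inferred from the order of the component lists and the sums quoted above, and the non-decreasing sorting rule is likewise inferred from the example; the names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed completion time of each component (inferred from the sums in the text).
completion: Dict[int, int] = {1: 16, 2: 13, 3: 11, 4: 15}

# Components required by each product, as implied by the example.
product_components: Dict[int, List[int]] = {1: [1, 2], 2: [3, 4]}


def product_level_lists(
    product_components: Dict[int, List[int]], completion: Dict[int, int]
) -> Tuple[Dict[int, List[int]], Dict[int, int], List[int]]:
    """Sort components within products and products by summed completion time."""
    component_lists = {
        p: sorted(cs, key=lambda c: completion[c])
        for p, cs in product_components.items()
    }
    totals = {
        p: sum(completion[c] for c in cs) for p, cs in product_components.items()
    }
    product_list = sorted(product_components, key=lambda p: totals[p])
    return component_lists, totals, product_list


component_lists, totals, product_list = product_level_lists(
    product_components, completion
)
print(component_lists, totals, product_list)
```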

Fig. 5 The output of Step 3 in the illustrative example

Steps 4–5 begin with inserting the first component of the first unassigned product, i.e., \(c_{2}\) of product 1, into the last position of every factory to find the alternative with the smallest completion time (round 1). As shown in Fig. 6, the sequence in Fig. 6a is preferred; hence, factory 1 is selected for assigning \(c_{2}\). The next component from \(\pi_{1}^{CL}\), i.e., \(c_{1}\) of product 1, is then extracted and inserted into the last position of every factory to find the best alternative (round 2). As shown in Fig. 7, the sequence in Fig. 7b has the smaller completion time; hence, factory 2 is selected for the permanent insertion. With the last component of product 1 assigned, the final order of parts at the product level is obtained; the result is shown in Fig. 7b.
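The factory selection in Steps 4–5 follows a simple pattern: append the component to the end of every factory in turn, evaluate each alternative, and keep the best one. The sketch below illustrates this pattern only. The evaluation function is a deliberately crude stand-in that adds up assumed component durations per factory; it is not the completion-time calculation behind Figs. 6 and 7, and all names and durations are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def append_to_best_factory(
    factories: List[List[int]],
    component: int,
    evaluate: Callable[[List[List[int]]], float],
) -> Tuple[int, float]:
    """Append `component` to the factory whose trial schedule evaluates best."""
    best_index, best_value = -1, float("inf")
    for f in range(len(factories)):
        trial = [list(seq) for seq in factories]  # copy, then try factory f
        trial[f].append(component)
        value = evaluate(trial)
        if value < best_value:                    # ties keep the first factory
            best_index, best_value = f, value
    factories[best_index].append(component)       # permanent insertion
    return best_index, best_value


# Assumed component durations, used only by the stand-in evaluation below.
duration: Dict[int, int] = {1: 16, 2: 13, 3: 11, 4: 15}


def additive_makespan(factories: List[List[int]]) -> float:
    """Stand-in objective: the largest summed duration over all factories."""
    return max(sum(duration[c] for c in seq) for seq in factories)


factories = [[3], [4]]  # state after Step 3 of the example
round1 = append_to_best_factory(factories, 2, additive_makespan)
round2 = append_to_best_factory(factories, 1, additive_makespan)
print(round1, round2, factories)
```

Under this simplified objective the factory indices chosen (0 and then 1, i.e., factory 1 and then factory 2) coincide with the assignments described in the text; the objective values themselves should not be read as the completion times of the figures.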

Fig. 6 Round 1 of applying Steps 4–5 on the illustrative example

Fig. 7 Round 2 of applying Steps 4–5 on the illustrative example

With more components involved in the instance, the chances of perceiving the advantages of the developed method are expected to be greater. In the next section, the performance of NCH in various operational situations is evaluated and compared with the state-of-the-art.

4 Numerical experiments

The performance of the NCH algorithm is now evaluated by comparing it with the best-performing constructive heuristic in the literature on DTSAPFSP and DAPFSP. For this purpose, an adjusted version of the NEH+-based constructive heuristic, hereafter called ANEH+, is considered as a baseline. In addition, three variants of the most recent state-of-the-art metaheuristic, which were also developed to solve DAPFSP, are included as benchmarks to strengthen the numerical analysis.

Hao et al. (2022) improved the SSO algorithm to solve the DTrSAPFSP. They then employed three local search methods to develop the Social Spider Optimization hybridized with Local Search Strategies (HSSO). Finally, they introduced the ‘Restart’ and ‘Self-adaptive Selection Probability’ mechanisms to better regulate the local search and restart strategies, resulting in HSSO with Restart Procedures (HSSOR) and HSSOR with Self-adaptive Selection Probability (HSSOPR). Their experiments showed that these algorithms outperform the state-of-the-art in solving distributed assembly flowshops, i.e., the Competitive Memetic Algorithm (CMA; Deng et al., 2016) and the Estimation of Distribution Algorithm (EDA; S.-Y. Wang & Wang, 2016).

All compared algorithms are coded and compiled using the C++ programming language on a personal computer with an Intel® Core™ i5-10210U CPU (1.60 GHz) and 8 GB RAM. The same testbed configurations considered in the earlier study for testing the base algorithms are used for the numerical experiments. On this basis, the instances can be grouped by 100, 200, and 500 parts; 4, 6, and 8 factories for producing the components; 5, 10, and 20 machines for the production stage in these factories; 30, 40, and 50 components; and, finally, 10, 15, and 20 products. The identifier 100_5_4_30_10_1 represents the first of ten instances characterized by 100 parts, 5 machines, 4 factories, 30 components, and 10 products. Considering these configurations, and 10 distinct instances under each configuration, a total of 810 instances are considered for conducting the experiments. The processing time parameters are generated as follows. The production time of parts (subcomponents at the first stage) is generated randomly using the uniform distribution \(U[1,\;99]\); the assembly times at the second and third stages are also generated randomly, and separately, considering \(U[1 \times n,\;99 \times n]\).
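The instance naming scheme and the generation of processing times can be illustrated with the sketch below. It assumes that the uniform distributions are discrete (integer-valued) and leaves the multiplier \(n\) of the assembly-time distribution as a caller-supplied argument, since its definition is not given in this section. The class and function names are hypothetical and do not come from the original testbed.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstanceId:
    parts: int
    machines: int
    factories: int
    components: int
    products: int
    index: int


def parse_instance_id(text: str) -> InstanceId:
    """Parse an identifier such as '100_5_4_30_10_1' into its six fields."""
    fields = [int(x) for x in text.split("_")]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {text!r}")
    return InstanceId(*fields)


def generate_part_times(inst: InstanceId, rng: random.Random) -> List[List[int]]:
    """Stage I times: one U[1, 99] draw per part and machine (integers assumed)."""
    return [
        [rng.randint(1, 99) for _ in range(inst.machines)]
        for _ in range(inst.parts)
    ]


def generate_assembly_time(n: int, rng: random.Random) -> int:
    """Stage II/III time drawn from U[1*n, 99*n]; `n` is supplied by the caller."""
    return rng.randint(1 * n, 99 * n)


inst = parse_instance_id("100_5_4_30_10_1")
rng = random.Random(0)  # fixed seed so the sketch is repeatable
part_times = generate_part_times(inst, rng)
print(inst, len(part_times), len(part_times[0]))
```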

The maximum computation time is considered as the stopping criterion for the metaheuristic algorithms, as suggested by Hao et al. (2022); \(20 \times \chi \times M\), \(40 \times \chi \times M\), and \(60 \times \chi \times M\) milliseconds are applied for the largest instance under the small-, medium-, and large-scale problems, respectively. The heuristics developed in the present study stop as soon as a complete solution, i.e., the production schedule for all products, is obtained.

Each of the compared constructive heuristics is executed once for each test instance, whereas each metaheuristic is run 20 times per test instance and the best solution over these runs is retained for further analysis. Given the best-found solution (BFS; see the Appendix), the Average Relative Percentage Deviation (ARPD) measure is considered to compare the quality of solutions between algorithms. The relative percentage deviation is calculated using \(RPD = \frac{C_{max}(X) - C_{max}(X_{best})}{C_{max}(X_{best})} \times 100\%\), where \(X_{best}\) and \(X\) represent the best solution and the solution under consideration, respectively; smaller RPD values represent better outcomes. The computational results are summarized in Table 2. As shown in Table 2, the proposed NCH algorithm outperforms the best-performing constructive heuristics and state-of-the-art metaheuristics across the different operational categories.
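The RPD formula translates directly into code. In the sketch below, the ARPD is taken to be the arithmetic mean of the RPD values over a set of instances, which is an assumption about how the averaging is done; the function names and the sample makespans are hypothetical and are not results from Table 2.

```python
from typing import Sequence


def rpd(c_max: float, c_max_best: float) -> float:
    """Relative percentage deviation of a makespan from the best-found makespan."""
    return 100.0 * (c_max - c_max_best) / c_max_best


def arpd(c_max_values: Sequence[float], c_max_best_values: Sequence[float]) -> float:
    """Mean RPD over instances (assumed averaging scheme)."""
    if len(c_max_values) != len(c_max_best_values) or not c_max_values:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = [rpd(c, b) for c, b in zip(c_max_values, c_max_best_values)]
    return sum(deviations) / len(deviations)


# Hypothetical makespans for two instances.
print(rpd(110, 100), arpd([110, 105], [100, 100]))
```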

Table 2 Computational results considering different operational categories (best in bold)

Next, different numbers of parts (subcomponents), components, products, machines, and factories are considered to analyze the impact of these operational parameters on the performance of the compared algorithms. The analytical results are visualized in Fig. 8. The first notable observation is that the advantage of NCH grows with an increase in the number of components, products, machines, and factories. However, an increase in the number of parts (subcomponents) narrows the performance gap between the algorithms. This may be due to the random ordering of items at the part level. Having many components and products in the instances enabled the NCH algorithm to generate better solutions even when the number of parts increased. That is, with more components involved in scheduling, the average number of parts per component is smaller; hence, the impact of the part order on the quality of the final solution becomes smaller.

Fig. 8 Category-based analysis of the results

Statistical tests are now performed to verify the significance of the difference between the quality of the results obtained by NCH and those obtained by the benchmark algorithms. For this purpose, 0.05 is considered as the p value’s threshold to check whether the differences are statistically significant. The analytical results of the analysis of variance (ANOVA) and the t test are summarized in Tables 3 and 4, respectively.

Table 3 The ANOVA test results for comparing NCH with the benchmarks
Table 4 The t test results for comparing NCH with the benchmarks

As the ANOVA results in Table 3 show, the difference among the sets of BFSs obtained by the different algorithms can be regarded as statistically significant because the F-statistic is greater than the critical value at the 95% confidence level. According to Table 4, since all the p values are less than 0.05, the null hypothesis, which states that NCH and each of the benchmark algorithms are equally effective, can be rejected. In other words, it can be concluded that the performance of NCH is significantly better than that of each benchmark algorithm. Considering that ANEH+ is a constructive heuristic, its weak performance compared to the three metaheuristics may not be surprising. However, we found the effectiveness of NCH, as a constructive heuristic, in outperforming the state-of-the-art metaheuristics quite remarkable.
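For readers who wish to see how the F-statistic of a one-way ANOVA is obtained, a minimal pure-Python sketch is given below. It only computes the statistic and its degrees of freedom; the critical value still has to be taken from an F-distribution table or a statistics package. The one-way design, the function name, and the sample groups are assumptions made for illustration and are unrelated to the values in Table 3.

```python
from typing import List, Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    means: List[float] = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside the groups.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical groups, e.g., RPD values of three algorithms on three instances.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)
```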

As a final step of the numerical analysis, the algorithms’ computational time (CPU time in seconds) is compared considering different problem sizes, i.e., workload and number of machines. The results in Table 5 show that the efficiency of the NCH algorithm is meaningfully better than that of the benchmarks. A computationally efficient constructive heuristic for solving DTrSAPFSP will not only contribute to the development of its literature but also facilitate the industrial reach of this new scheduling extension.

Table 5 The computational time of the benchmark algorithms (best in bold)

5 Conclusions

Optimization of distributed three-stage production operations was explored in this research article. Under this production setting, the parts (subcomponents) are manufactured and assembled by OEMs to form the product components. The produced components from multiple OEMs then arrive at the main manufacturer for the assembly and preparation of the final products. Many supply chains, like those in the automotive, heavy equipment, and home appliance industries, operate under similar conditions. Coordinated production scheduling benefits the supply chain through cost reduction and improved responsiveness. A new constructive heuristic algorithm is put forward for solving DTrSAPFSP, which forms the basis for further developments in the field of distributed production planning and control. The developed method is particularly beneficial for running the search procedure in parallel computing environments.

Extensive numerical analyses were conducted to compare the performance of NCH with the state-of-the-art; the NEH+ constructive heuristic and three state-of-the-art metaheuristics, i.e., HSSO, HSSOR, and HSSORP, were adapted for solving DTrSAPFSP. The major findings are summarized as follows. First, the performance of the proposed constructive heuristic is significantly better than that of the constructive heuristic widely used in the distributed flowshop scheduling literature. Second, the CPU time of the NCH algorithm was shown to be meaningfully lower than that of the metaheuristic benchmarks. Third, the analysis of the results considering RPD shows that NCH performs better than the metaheuristics from an overall perspective, which is quite remarkable. Considering instances with various operational characteristics, we observed that NCH yields the best solution in the majority of test instances, except for instances characterized by many parts and only a small number of components and products. The statistical tests confirmed the significance of the superior performance. Overall, the N-list technique appeared to be very effective and should be considered in other optimization contexts.

Future studies may extend our research in one of the following directions. First, the mathematical model of DTrSAPFSP can be extended to allow for a more realistic representation of the real-world situation. For example, including the transportation cost of components between facilities and considering the operational efficiencies of the component producers may result in more reliable outcomes. Second, the scheduling approach can be improved to work with the dynamic arrival of new orders, considering different order priorities, emergency changes of those priorities, partial acceptance/rejection of the orders, and the assignment of components to new OEMs. Third, considering the significant improvement compared to the best-performing constructive heuristic in the literature of distributed flowshop scheduling, NCH should be incorporated as a constructive heuristic in metaheuristic algorithms for more effectively solving the problems. As a fourth direction for future research, formulating mathematical models for DTrSAPFSP and the possible new extensions, as well as developing effective metaheuristics are worthwhile topics to pursue. Finally, we feel that the results obtained by NCH can be further improved using machine learning-based approaches; a direction that should be considered for future development of the DTrSAPFSP literature.