Production line balancing by P-graphs

The assembly industry plays a key role in Central and Eastern Europe. Large companies and their subcontractors manufacture automotive and electronic products from components, employing a significant workforce. Due to the growing labor shortage, it is critical that production lines be optimally loaded, i.e., that the tasks be evenly distributed among the workstations according to their cycle times. In this article, a novel formulation of the problem by process graphs, or P-graphs, is presented. It leads to an easy-to-follow visual definition of the potential task-to-employee allocations and offers two options: generating a mathematical programming model algorithmically, to be solved by general-purpose solvers, or obtaining the optimal and alternative N-best allocations by P-graph software. In addition to the theoretical presentation, the article shows the results achieved by applying the proposed methodology in a real-world environment in a computer assembly plant. The P-graph approach provides visual modeling in a graphical editor and helps to understand the relations of decision variables while generating the corresponding mathematical model, which can be generalized for a class of problems and rebuilt according to actual data. As a result, rigorous mathematical optimization-based decision support can be built on graphical models that are easily understandable by end users as well.


Introduction
In Central and Eastern Europe, assembly plants play a major role in industry. In automotive and electronic assembly factories, products are assembled from ready-made parts on production lines, mainly by human resources. Typically, various steps must be performed one after the other by trained workers.

Although such work does not require special qualifications, the factories face significant labor shortages. For the fulfillment of orders, it is important that the plant gets the most out of its resources and thus optimizes the task-worker allocation.
The literature calls such optimization line balancing. It is highly beneficial where mass production occurs and the products are manufactured for a longer period of time on the same production line (Sivasankaran and Shahabudeen 2014).
Although the conveyor belt was first used by meat factories in Chicago, Ford made it widely known in 1913. Even in these early times, attempts were made to schedule production (Wilson 2014). In 1955, Salveson identified the "balance delay", or wasted time, as an objective (Salveson 1955). Optimization was aimed at both the local and the global level, and it was extended not only to the type of work but also to the physical placement of machines and workforce. Over the years, as factories developed, an increasing number of optimization solutions have been introduced, depending on the placement of the production lines and the variation of the manufactured products (Kumar and Mahto 2013).
Among them, the simplest is the single line model, which produces one product on a single line. Further variants are the mixed assembly line (Thomopoulos 1967), where more than one model of the same general product is intermixed on one assembly line; the multiproduct line, where several products are made on one line (Roberts and Villa 1970); and the two-sided assembly line, where there are workstations on both sides of the conveyor belt (Bartholdi 1993). Taking sales into account, solutions were created in which overproduction is penalized by extra cost, and stochastic and heuristic solutions exist as well (Fazlollahtabar et al. 2011).
In recent years, the health of workers at the production line, whether physical (ergonomic) or mental, has become increasingly important, and it is already being considered when subtasks are distributed among workers (Otto and Battaïa 2017). Fortunately, recycling is becoming more and more widespread, so many publications provide a solution not only for assembly but for disassembly as well (Özceylan et al. 2018). Moreover, in recent years different methods have been proposed for finding the optimum, especially for complex classes of the line-balancing problem. Such methods include swarm optimization (Delice et al. 2017) and the gravitational search algorithm (Ren et al. 2017).
Despite the extensive efforts to provide computer aid for solving the line-balancing problem, each of the proposed computational methods results in a single assignment considered to be the optimal one, and none of the well-known techniques is capable of providing the N-best alternative optimal or suboptimal solutions, even for the simplest single-line problem. To meet the demand for a computational tool generating alternative solutions for engineering process design problems, the process graph or P-graph framework was introduced in the early 1990s (Friedler et al. 1992a). A P-graph is an unambiguous representation of the potential structural design options (Friedler et al. 1992b). It is a bipartite graph, where the M-type nodes represent material qualities appropriate to serve as inputs to or outputs from the operating units, while the O-type nodes identify the operating units transforming one or more material qualities into another or others. What is crucial for automated structure generation is that if multiple operating units are adequate to produce a material quality of interest, they are immediately considered as alternatives to each other. Consequently, all the feasible process structures can be algorithmically enumerated (Friedler et al. 1995) or the best ones found effectively (Friedler et al. 1996).
The scope of process synthesis by P-graphs was extended to supply chains (Fan et al. 2009), occasionally relying on uncertain resources (Sule et al. 2011). The structure generation algorithms have been extended to increase process reliability by incorporating redundancy (Bertok et al. 2012b), to handle multiperiod operations (Heckl et al. 2015) and long term capacity planning (Tan and Aviso 2016), and to take multiple objectives into consideration (Vance et al. 2012).
The P-graph was found to be suitable for the composition of workflows (Tick et al. 2006) and business processes (Tick et al. 2013) as well. The so-called superstructure approach, graphically revealing all the candidate routes leading to the process targets, has been established for process scheduling according to both discrete-time (Garcia-Ojeda et al. 2012) and precedence-based formulations (Barany et al. 2011). Finally, a general methodology for batch process scheduling has been introduced (Frits and Bertok 2014) by time constrained P-graphs (Kalauz et al. 2012).
In this article, a single line problem with rigorous constraints on the order of tasks and a case study serve to illustrate a novel approach for production line balancing by P-graphs, resulting in multiple alternative best solutions. The P-graph model is constructed first following the general guidelines of batch process scheduling, then simplified in line with the distinctive features of the industrial problem of interest. Finally, the development of a decision support system at a manufacturer, its evaluation, and the feedback received are presented.

Problem statement
In the case study of interest, multiple different products are manufactured on a single production line from numerous parts by human resources in an assembly plant in Hungary. In spite of the broad variations of products, the set of parts plausible to be incorporated is finite. For optimization, it is necessary to know what components the product to be manufactured consists of, i.e., what assembly steps have to be performed to complete the product, and how long these steps would take. The duration of the steps was measured for each product by the factory and made available in the form of a time matrix; see Table 1 for example.
As the number of employees available is uncertain and is revealed only at the beginning of a shift, task distribution has to be revised at the beginning of each shift to achieve maximal efficiency. Therefore, there was a demand for an optimization algorithm and software to help task distribution, i.e., line balancing. See the input-output schema of the software in Fig. 1.
The goal was to build software that is aware of the type of product to be manufactured and the number of workers present, and that is capable of determining the optimal task distribution, where the cycle time is as low as possible. It is important to note that, due to the physical locations of components inside the computer case, their insertion has to follow a fixed predefined order. Moreover, if a worker is assigned to a specific workstation at the production line, he or she remains there until the end of the shift. Consequently, each worker can only perform a series of subsequent tasks on a product.

Methodology: problem formulation and solution by P-graphs
The rigorous structural representation of process structures proposed by Friedler and Fan is the Process Graph, or P-graph for short, which graphically represents operations or activities by rectangles and their results and preconditions by solid circles as a bipartite graph; see Fig. 2. Any directed arc in the graph leads either to an operation from one of its preconditions or from an operation to one of its results. It is important to note that the primary aim of forming P-graphs is not visualization but the execution of effective graph algorithms as part of computer-aided process design.
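The bipartite structure can be sketched in a few lines of Python. This is only an illustrative data structure, not the representation used inside the P-graph software; the class name `Operation` and the helper functions are assumptions, while the node names are borrowed from the two-task example discussed later in the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """An O-type node: consumes its preconditions and yields its results."""
    name: str
    preconditions: frozenset
    results: frozenset


def arcs(operations):
    """List the directed arcs: precondition -> operation and operation -> result."""
    edges = []
    for op in operations:
        edges += [(m, op.name) for m in sorted(op.preconditions)]
        edges += [(op.name, m) for m in sorted(op.results)]
    return edges


def producers(operations, material):
    """Operations yielding `material`; more than one means they are alternatives."""
    return sorted(op.name for op in operations if material in op.results)


# Hypothetical fragment: either employee may complete task P1.
ops = [
    Operation("P1byR1", frozenset({"R1beforeP1"}), frozenset({"P1done", "R1afterP1"})),
    Operation("P1byR2", frozenset({"R2beforeP1"}), frozenset({"P1done", "R2afterP1"})),
]

print(producers(ops, "P1done"))
```

Because both operations list `P1done` among their results, they show up as alternatives to each other, which is the property the structure generation algorithms exploit.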

P-graph algorithms and software
The P-graph framework involves three major algorithms. The one to be applied first on a P-graph is algorithm MSG, which generates the maximal structure, i.e., the superstructure involving each plausible operation from those defined in the problem that is potentially capable of contributing to at least one structurally feasible process network according to a set of axioms (Friedler et al. 1992b). The second one is algorithm SSG, which can generate each combinatorially feasible process structure leading from the resources to the final targets through a network of operations and their preconditions and results. Finally, the third one is algorithm ABB (Friedler et al. 1993), which returns not only a single optimal structure but the N-best process structures satisfying additional constraints besides the structural ones, e.g., limitations on the availability of resources, volumes of operations, or conservation laws (Bertok et al. 2012a). Having multiple solutions is crucial to make decision-makers more aware of the diversity among different process alternatives and the consequences of selecting one of them.
Decision support systems aiding large scale process design or optimization can be constructed by setting up model prototypes in the software P-graph Studio (Barany et al. 2018), exporting the initial model, and finally updating and re-solving it according to actual data; see, for example, the RegiOpt conceptual planner identifying possible energy network solutions for regions (Kettl et al. 2011). As illustrated in Fig. 3, the model export and revision can be realized either in the language of process-network synthesis (PNS) or in the language of mathematical programming (MILP), but daily update in a decision support system typically requires implementations of data export and plan import interfaces from and to the enterprise resource planning (ERP) system of the customer. The customer can decide whether the optimization is to be performed by the P-graph solver, shipped together with P-graph Studio free of charge and resulting in the N-best process structures effectively, or by mathematical programming software, which yields a single optimal solution but is more scalable and comes with customer support for an annual fee. Note that the mathematical programming model can also be generated from the initial P-graph model (Bertok et al. 2012a) and exported by the software P-graph Studio as depicted in Fig. 3.
First the P-graph model and then the mathematical programming model are introduced.

Formulating the line balancing problem by P-graphs
Before developing the optimization model for periodical update and solution in daily decision support, model prototypes are first created in the software P-graph Studio, as introduced in the previous section. As an initial step, a general scheduling model is constructed and then tailored to the special features of the line balancing problem of interest.
Let us assume that P1 and P2 are two consecutive tasks of a manufacturing process at an assembly line, and R1 and R2 are two employees. Figure 4 shows the general layout proposed by Frits and Bertok (2014) for a two-step sequence of tasks to be realized by two potential resources R1 and R2, each of which is suitable to complete any of the two tasks P1 and P2. At the top of the P-graph in Fig. 4, R1 and R2 are depicted as initial resources. Any of resources R1 and R2 can initially be assigned to any of tasks P1 and P2 by operations R1toP1, R1toP2, R2toP1, and R2toP2, respectively. R1beforeP1 and R2beforeP2 are consequences of task-resource assignments expressing that a resource is ready to start a task. If R1 or R2 performs task P1 through operation P1byR1 or P1byR2, then P1done appears as a consequence, which is a precondition to any operation executing task P2, since P1 and P2 are ordered elements of a two-step task sequence, i.e., their execution order is fixed. Meanwhile, the resource is released as well and is ready to continue with another task; see outcomes R1afterP1 and R2afterP1. The changeover assigning resource R1 to task P2 after completing task P1 is represented by operation R1P1P2. Similarly, R1P2P1, R2P1P2, and R2P2P1 represent changeovers of the resources from one task to another. The blue part is related to the assignments of resource R1, while the green part is related to the assignments of resource R2.
In the line balancing problem of interest in the computer assembly plant, manufacturing tasks are completed one after the other, and any employee can work on a single product at a time. Since no parallel steps are performed on a single product, no synchronization of parallel tasks is needed. Consequently, the calculation of exact time windows for each operation is not required; the total time of completing the successive jobs assigned to an employee can be computed as the sum of the individual completion times of the jobs.
In the P-graph model in Fig. 5, all the synchronization operations, including initial assignments and changeovers, are removed, but for each resource working times are incorporated and utilized as preconditions to any task performed by the resource. As a result, the process described by the P-graph consists of only two resources, four operations, an intermediate target, and a final target.
In Fig. 6, the elements of Fig. 5 are rearranged without changing their relations to make the structure easier to understand. It can be read as follows: task P1 is completed by either resource R1 or R2 first, and task P2 is finished by either resource R1 or R2 afterward, achieving the final target of the process. The working times of both resources R1 and R2 are counted in parallel.
During the manufacturing of a single product, each worker is assigned to a workstation at the production line. Therefore, workers can pass products to the next workstation in the forward direction of the line only. Consequently, if resource R1 is assigned to workstation #1 and R2 to workstation #2, then R1 can pass a semi-finished product to R2 but not vice versa. Process structures including operations P1byR1 and P2byR1; or P1byR2 and P2byR2; or P1byR1, pass, and P2byR2 are feasible and can be generated automatically by algorithms SSG or ABB from the P-graph in Fig. 7. Note that the structure including operations P1byR2 and P2byR1 but not P1byR1 cannot be feasible, since precondition P1done1 to operation P2byR1 is not satisfied.
Fig. 5 Superstructure focusing on the execution of two consecutive tasks by two potential resources in any order
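The forward-only rule can be checked by brute force for the two-task, two-worker case. The sketch below is not algorithm SSG; it merely enumerates every task-to-worker combination and keeps those in which the workstation index never decreases. The function name `is_forward_only` is an assumption made for this illustration.

```python
from itertools import product


def is_forward_only(assignment):
    """True if workstation indices never decrease along the task sequence."""
    return all(a <= b for a, b in zip(assignment, assignment[1:]))


workers = (1, 2)        # R1 at workstation #1, R2 at workstation #2
tasks = ("P1", "P2")    # fixed execution order

for assignment in product(workers, repeat=len(tasks)):
    label = ", ".join(f"{t}byR{w}" for t, w in zip(tasks, assignment))
    print(label, "feasible" if is_forward_only(assignment) else "infeasible")
```

Of the four combinations, only the one with P1byR2 followed by P2byR1 is rejected, in line with the three feasible structures listed in the text.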
Finally, to clarify the aim of line balancing, the maximal working time of any resource is introduced as an input to the process structure, whose utilization is to be minimized during the optimization; see Fig. 8. Through operation Max, the maximum working time is defined as a precondition to the availability of each working time for any resource performing any assembly task. The model prototype in Fig. 8 is to be extended for multiple assembly tasks and more employees, and it serves as the basis for the decision support system detailed in the forthcoming paragraphs.
The P-graph representation of the work distribution for assembling Product1, given in Table 1, with 3 workers is shown in Fig. 9. Tasks in the same column are assigned to the same worker, while rows contain subtasks to be completed by at least one worker. The workers are listed from left to right, while the subtasks appear from top to bottom in their required order.

Optimization by P-graph algorithms
Algorithm ABB yields a cycle time of 7 s for the process synthesis problem introduced in the preceding section, where the first worker completes steps 1 and 2, the second worker step 4, and finally, the third worker completes the last two tasks; see Fig. 10.
Fig. 8 Rearranged superstructure for minimizing the maximal time needed for any of two resources while performing two consecutive tasks in fixed order
Fig. 9 P-graph representation of the potential steps of the assembly process: maximal structure
Ideally, every worker spends $\left(\sum_{j=1}^{m} s_j\right)/n$ time on assembling, but in most cases this is not feasible because the subtask times are discrete and vary widely. For Product2, distribution is easy because every step takes 2 s: for two people the assembly times are 12-12 s, for three people 8-8-8 s, and for four people 6-6-6-6 s in the optimal case. Clearly, in the ideal case the cycle time is inversely proportional to the number of workers.
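The ideal share per worker can be reproduced with a one-line computation. In the sketch below, the assumption that Product2 consists of 12 steps of 2 s each is inferred from the 12-12 s split quoted for two workers; the function name `ideal_cycle_time` is hypothetical.

```python
def ideal_cycle_time(subtask_times, n_workers):
    """Total assembly time divided evenly among the workers."""
    return sum(subtask_times) / n_workers


# Assumption: Product2 has 12 steps of 2 s each (24 s in total),
# inferred from the 12-12 s split quoted for two workers.
product2 = [2] * 12

for n in (2, 3, 4):
    print(n, "workers:", ideal_cycle_time(product2, n), "s")
```

For uniform step times the ideal value is attainable whenever the number of steps is divisible by the number of workers, which is why Product2 balances perfectly for two, three, and four workers.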
It is worth paying attention to the alternative solutions in Table 2. The second best assignment for Product1 with 3 workers is as good as the first one; in it, the second worker does the second subtask, as shown in Fig. 11. From the third to the sixth best assignments, the cycle time is 8 s, and the tasks are allocated to practically two of the three workers. Consequently, the optimal cycle time for Product1 with two workers is 8 s. For those who are interested in reproducing the results, the P-graph Studio file is available for download at http://www.p-graph.org/wp-content/uploads/2019/07/Example1Product1ThreeWorkers.zip.
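The idea of ranking alternative assignments can be illustrated by exhaustive enumeration. The sketch below is a brute-force stand-in, not algorithm ABB, and the subtask times are hypothetical because Table 1 is not reproduced here. It assumes that workers may stay idle and that the first subtask goes to the first worker.

```python
from itertools import combinations_with_replacement


def n_best_assignments(times, n_workers, n_best):
    """Rank every forward-only assignment by its cycle time (maximal workload)."""
    ranked = []
    # Non-decreasing worker indices keep each worker's tasks consecutive.
    for workers in combinations_with_replacement(range(n_workers), len(times)):
        if workers[0] != 0:  # the first subtask goes to the first worker
            continue
        loads = [0] * n_workers
        for w, t in zip(workers, times):
            loads[w] += t
        ranked.append((max(loads), workers))
    ranked.sort()
    return ranked[:n_best]


# Hypothetical subtask times in seconds, NOT the values of Table 1.
hypothetical_times = [4, 3, 4, 2, 5]

for cycle, workers in n_best_assignments(hypothetical_times, n_workers=3, n_best=4):
    print(cycle, workers)
```

Each result pairs a cycle time with a tuple giving the 0-based worker index of every subtask. With these made-up times, several different assignments share the best cycle time, which mirrors the observation that equally good alternatives can exist.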
By increasing the number of workers further, it turns out that 4 workers can complete the process within a cycle time of 5 s, and the cycle time cannot be reduced any further by adding more workers. This limit comes from the time of the longest subtask, i.e., $\mathit{cycleTime} \ge \max_j s_j$. Long subtask times also cause the longer cycle times of Product3 compared to Product4, even though the long time required to perform the 5th step does not limit the cycle time directly.
In daily decision support, graphical modeling is not a must to utilize the power of the P-graph framework as described in Sect. 3.1; the P-graph model can be saved in text files with the .pns extension, updated, and re-solved by the P-graph solver. Alternatively, algorithmic generation of the mathematical programming model is offered, with the number of workers and the list of tasks to be executed as input. Such model generation is presented in Sect. 3.4.
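The two limits discussed in this section, the even share of the total time and the longest subtask, can be combined into a single lower bound. As a sketch in the notation used above, with $s_j$ the time of subtask $j$, $m$ the number of subtasks, and $n$ the number of workers:

```latex
\mathit{cycleTime} \;\ge\; \max\left( \max_{1 \le j \le m} s_j,\; \frac{1}{n} \sum_{j=1}^{m} s_j \right)
```

Once the first term dominates, adding workers no longer helps, which is the situation reported for Product1 at 5 s with four workers.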

MILP formulation of the line balancing problem defined by P-graphs
The line balance can be achieved by solving a linear programming problem where the objective is given by a linear function and the constraints also appear as linear inequalities. As the basics of linear programming go back to the 1940s (Dantzig 1948), many open source software implementations are now available (Orchard-Hays 1990). However, the input structures of these solvers are often difficult to understand and program. Therefore, the language Zimpl was selected to translate the mathematical model of a problem into a linear or mixed-integer mathematical program expressed in .lp or .mps file format to be read and solved by an LP solver (Koch 2001).
For $n$ workers and $m$ tasks, the cycle time $\mathit{cycleTime}$, the working times $\mathit{workingTimeOfWorker}_i$, and the entries $\mathit{workerPerformTask}_{i,j}$ and $\mathit{passWorkerAfterTask}_{i,j}$ are defined as variables for each worker $i \in \{1, 2, \dots, n\}$ and task $j \in \{1, 2, \dots, m\}$ in the mathematical programming model. The cycle time is a nonnegative number:

$$\mathit{cycleTime} \ge 0 \tag{1}$$

as are the working times of the workers:

$$\forall i \in \{1, 2, \dots, n\}: \mathit{workingTimeOfWorker}_i \ge 0 \tag{2}$$

workerPerformTask is a binary matrix expressing the assignment of workers to tasks by values of 1:

$$\forall i \in \{1, 2, \dots, n\},\ \forall j \in \{1, 2, \dots, m\}: \mathit{workerPerformTask}_{i,j} \in \{0, 1\} \tag{3}$$

while passWorkerAfterTask is another binary matrix telling us whether worker $i$ passes the semi-finished product to worker $i + 1$ after finishing task $j$ instead of further processing it:

$$\forall i \in \{1, 2, \dots, n\},\ \forall j \in \{1, 2, \dots, m\}: \mathit{passWorkerAfterTask}_{i,j} \in \{0, 1\} \tag{4}$$

The objective function is to minimize the cycle time:

$$\min\ \mathit{cycleTime} \tag{5}$$

The following constraints are incorporated. Each subtask must be completed by exactly one worker:

$$\forall j \in \{1, 2, \dots, m\}: \sum_{i=1}^{n} \mathit{workerPerformTask}_{i,j} = 1 \tag{6}$$

The cycle time cannot be shorter than the working time of any worker:

$$\forall i \in \{1, 2, \dots, n\}: \mathit{cycleTime} \ge \mathit{workingTimeOfWorker}_i \tag{7}$$

For each worker, the working time is the sum of the subtask times performed by the worker:

$$\forall i \in \{1, 2, \dots, n\}: \mathit{workingTimeOfWorker}_i = \sum_{j=1}^{m} \mathit{subtaskTime}_j \cdot \mathit{workerPerformTask}_{i,j} \tag{8}$$

where the constants $\mathit{subtaskTime}_j$ are considered to be known from the problem definition. The first subtask is assumed to be done by the first worker:

$$\mathit{workerPerformTask}_{1,1} = 1 \tag{9}$$

If a task is performed by a worker or the product is just received from the previous worker, the question is whether the next task is to be completed by the same worker or the semi-finished product is to be passed further to the next worker:

$$\forall i \in \{2, 3, \dots, n\},\ \forall j \in \{2, 3, \dots, m\}: \mathit{workerPerformTask}_{i,j-1} + \mathit{passWorkerAfterTask}_{i-1,j-1} = \mathit{workerPerformTask}_{i,j} + \mathit{passWorkerAfterTask}_{i,j-1} \tag{10}$$

Since the first worker cannot receive a semi-finished product from a previous one, it is a special case:

$$\forall j \in \{2, 3, \dots, m\}: \mathit{workerPerformTask}_{1,j-1} = \mathit{workerPerformTask}_{1,j} + \mathit{passWorkerAfterTask}_{1,j-1} \tag{11}$$

Equations 9-11 guarantee that no task can be omitted and that workers can perform consecutive steps only. Note that the mathematical programming model presented herein is identical to the one exported by the software P-graph Studio on the basis of the P-graph model presented in Sect. 3.2.

Solution of the MILP formulation
After the mathematical model written in the ZIMPL language in a .zpl file is translated, the result is a file with the .lp extension, which can be handled by LP solvers. One such solver is COIN-OR CBC, an open-source mixed integer programming solver written in C++ (Forrest and Lougee-Heimer 2005). Of course, it is also an option to omit ZIMPL and formulate the model directly in an .lp file, but considering the composition of the .lp format, this would be much more complicated. For example, the MILP model for determining the optimal assignment for the production of Product4 contains 50 lines in the .zpl file and 350 lines in .lp format.
When the .lp files are solved by COIN-OR CBC, the resulting optimal task-worker assignment and cycle time are reported in a text file. The results are consistent with those reported by the P-graph software, but only a single optimal assignment can be obtained by CBC; see Fig. 2.
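A solver result can be cross-checked against the structural constraints of the model without any solver library. The sketch below builds the two binary matrices from a task-to-worker list and tests three conditions: every subtask has exactly one worker, the first subtask belongs to the first worker, and each worker either performs the next task or passes the product on. The balance condition for workers other than the first is an assumption patterned on the first-worker equation; all function names and the subtask times are hypothetical, and indices are 0-based.

```python
def build_matrices(workers_of_tasks, n_workers):
    """Derive the two binary matrices from a task -> worker list (0-based)."""
    m = len(workers_of_tasks)
    perform = [[int(workers_of_tasks[j] == i) for j in range(m)]
               for i in range(n_workers)]
    # pass_after[i][j] = 1 if worker i hands the product to worker i+1 after task j
    pass_after = [[0] * m for _ in range(n_workers)]
    for j in range(m - 1):
        for i in range(workers_of_tasks[j], workers_of_tasks[j + 1]):
            pass_after[i][j] = 1
    return perform, pass_after


def satisfies_constraints(perform, pass_after):
    """Check the assignment and product-flow constraints described in the text."""
    n, m = len(perform), len(perform[0])
    # Each subtask is completed by exactly one worker.
    if any(sum(perform[i][j] for i in range(n)) != 1 for j in range(m)):
        return False
    # The first subtask is done by the first worker.
    if perform[0][0] != 1:
        return False
    for j in range(1, m):
        for i in range(n):
            # The first worker cannot receive a product from a previous one.
            received = pass_after[i - 1][j - 1] if i > 0 else 0
            holds = perform[i][j - 1] + received
            leaves = perform[i][j] + pass_after[i][j - 1]
            if holds != leaves:  # assumed balance for workers beyond the first
                return False
    return True


def cycle_time(perform, times):
    """Maximal working time over all workers."""
    return max(sum(t * x for t, x in zip(times, row)) for row in perform)


# Hypothetical subtask times in seconds, NOT the measured plant data.
times = [4, 3, 4, 2, 5]
perform, pass_after = build_matrices([0, 0, 1, 2, 2], n_workers=3)
print(satisfies_constraints(perform, pass_after), cycle_time(perform, times))
```

An assignment that sends the product backward, such as the task-to-worker list `[0, 1, 0]`, fails the balance check because the first worker would have to perform a task without holding the product.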

Software implementation and feedback
Since production managers are typically experts in neither process synthesis by P-graphs nor mathematical programming, it is necessary to create software for generating and solving either the P-graph or the mathematical programming models described in Sects. 3.2 and 3.4. The software developed for the daily balancing of a serial production line was tailored to the practice of an assembly plant and is shown in Fig. 1. In this decision support software, both P-graph model generation connected to the P-graph solver and MILP model generation providing input to a MILP solver have been implemented; the user can switch between the two alternative optimization methods.
First, the user selects the product to be manufactured and the number of workers considered to be available. Subtask times are stored in a table, which can be specified through the menu. This table is freely expandable. The software loads the table at startup and offers only those products for selection which are included in it. After the generation and solution of the optimization model, the software displays the optimal assignments visually in detail in an easy to read table form for the decision-makers.
The software was developed in consultation with a Hungarian assembly company, so the model was also tested on real data. The company indicated that, although the names of the products to be manufactured are identical, some assembly steps can be included or excluded on demand depending on the order; thus, the final version of the interface involves options for tailoring the steps relevant to the order of interest. The input received from the company contained 21 different products and measurement data of at least 48 and at most 128 assembly steps.
When using the software, the user has to select the product type and the number of workers first. After that, the list of subtasks can be tailored based on a reference list on the right-hand side of the interface. Clicking on the "Solve problem" button, the software starts both model generation and solution, then retrieves the result and displays it in an Excel file on two worksheets.
The company gave positive feedback on the software. The result speaks for itself: the company has achieved an average efficiency improvement of 20-25% on all types of products by introducing the optimization software in daily practice; see

Future work
The model and software could be extended in the future to identify bottlenecks and to take into account the possibility of parallelizing long-lasting subtasks. This could be realized by setting up workstations in parallel, e.g., on both sides of the production line, where multiple workers could share semi-finished products waiting for the same operation. Such an extension could lead to an even better line balance by cutting the length of the longest tasks and eliminating the bottlenecks identified.

Conclusion
The assembly industry has a considerable presence in Central and Eastern Europe. However, due to growing demand and labor shortages, it is necessary to optimize production in as many places as possible. This article presents an exact modeling and optimization approach based on the P-graph framework for model development and prototyping, as well as the implementation of the resultant formulation by a mathematical programming solver as part of decision support software in a real-life environment. The resultant software has been installed and introduced into daily practice in a Hungarian assembly plant. According to real-life measurements, a 20-25% increase in the yields of the shifts has been achieved in comparison with the previous best practice of line assignment.