Optimizing storage assignment, order picking, and their interaction in mezzanine warehouses

In warehouses, order picking is known to be the most labor-intensive and costly task in which the employees account for a large part of the warehouse performance. Hence, many approaches exist, that optimize the order picking process based on diverse economic criteria. However, most of these approaches focus on a single economic objective at once and disregard ergonomic criteria in their optimization. Further, the influence of the placement of the items to be picked is underestimated and accordingly, too little attention is paid to the interdependence of these two problems. In this work, we aim at optimizing the storage assignment and the order picking problem within mezzanine warehouse with regards to their reciprocal influence. We propose a customized version of the Non-dominated Sorting Genetic Algorithm II (NSGA-II) for optimizing the storage assignment problem as well as an Ant Colony Optimization (ACO) algorithm for optimizing the order picking problem. Both algorithms incorporate multiple economic and ergonomic constraints simultaneously. Furthermore, the algorithms incorporate knowledge about the interdependence between both problems, aiming to improve the overall warehouse performance. Our evaluation results show that our proposed algorithms return better storage assignments and order pick routes compared to commonly used techniques for the following quality indicators for comparing Pareto fronts: Coverage, Generational Distance, Euclidian Distance, Pareto Front Size, and Inverted Generational Distance. Additionally, the evaluation regarding the interaction of both algorithms shows a better performance when combining both proposed algorithms.

however miss to integrate the interrelation between them and view each problem separately [8]. However, as identified by [9], warehouse problems are strongly coupled. Thus, optimizing each warehouse problem individually may yield suboptimal solutions, harming the overall warehouse performance. Since the employees spend most time traveling in such a mezzanine warehouse [13], it is not surprising that most approaches focus on optimizing the travel distance. Additionally, ergonomic constraints are rarely considered, even though mezzanine warehouses represent labor-intensive working environments.
In this paper, we propose an integrated approach for combined storage assignment and order picking that simultaneously optimizes multiple economic and ergonomic constraints in mezzanine warehouses. Expert interviews have shown, that in practice the following set of economic criteria is important and, hence, supported by our approach: products should be spread equally among each floor, fast-moving products should be easily accessible, correlated products should be stored in proximity of each other, and the storage space should be used as efficiently as possible. Further, we integrate ergonomic constraints such as storing heavy products and fast-moving products at grip height or reducing the requirement to switch a mezzanine floor. In an evaluation using three simulated mezzanine warehouses of different sizes, we analyze the quality of the solutions returned by our algorithms compared to commonly used techniques. Finally, we assess the quality improvement when combining both of our algorithms compared to an isolated application. Hence, the contribution of this paper is threefold: 1. Design of storage allocation and order picking algorithms that incorporate the interdependence of both tasks. 2. Integration of diverse economic and ergonomic constraints. 3. Evaluation of the approach in a use case based on real-world data provided by our cooperation company.
The remainder of this paper is structured as follows. Section 2 presents related work and delineates our paper from existing approaches. Section 3 presents the meta-model and floor layout of considered mezzanine warehouses. Afterwards, Section 4 provides an overview of the goal and a 3-phase algorithm of our storage assignment approach, while Section 5 presents the details of the proposed Genetic Algorithm for storage assignment. Then, Section 6 shows our order picking approach based on an adapted Ant Colony Optimization (ACO) algorithm. Section 7 presents our evaluation methodology and discusses the results and threats to validity. Finally, Section 8 concludes the paper and summarizes future work.

Related work
In the literature, diverse storage assignment policies exist such as the dedicated and the random storage policy [2], the closest open location storage policy [13], rank-based storage policies [24]. Further, class-based, golden zone, and family grouping storage policies are introduced in the literature [13,23]. Additionally, diverse approaches apply optimization techniques. Sooksaksu et al. [28] propose a Particle Swarm Optimization algorithm for warehouses that deploy the class-based storage policy. Kovács [14] presents a mixed integer programming model for optimizing the storage assignment problem for class-based assigned warehouses. Kofler et al. [11] apply local search algorithms for reorganizing the products in the warehouse to keep it operating efficiently. Li et al. [19] propose a multi-objective genetic algorithm for optimizing the storage assignment problem in automated storage/retrieval warehouses.
Similarly, heuristic policies exist for the order picking problem such as the S-Shape, Return, Mid-Point, Largest Gap, and Combined heuristic [22,26,29]. Besides, [25] presents an optimal algorithm using dynamic programming to find the shortest pick route in a single-block warehouse. Additionally, [5] propose a mathematical model in combination with construction heuristics and apply Tabu Search to construct order picking routes. Ene andÖztürk [7] present an integer programming model for optimizing the order picking problem. Xing et al. [32] propose an Max-Min Ant System (MMAS) algorithm for optimizing machine travel paths in automated storage/retrieval warehouses. Chen et al. [4] propose an ACO algorithm that detects congestion situations that arise when multiple order pickers traverse the same pick aisle simultaneously. Lee et al. [16] presents a comparison on different multi-objective evolutionary algorithms for order picking and storage assignment. Other approaches apply the information of orders and order picking to optimize the storage assignment [21,30]. However, such approaches are not flexible enough in our setting which is not order-driven and, hence, dynamic interactions between storage assignment and order picking are required to be considered.
Finally, related work also assess the interaction of storage assignment and order picking approaches. Petersen and Schmenner [24] and [8] provide an overview of wellperforming combinations of storage assignment strategies and routing heuristics. Manzini et al. [20] analyze different parameters that affect the travel time in single-block warehouses that deploy the class-based storage policy. Shqair et al. [27] study the effects of different parameters on the travel distance in multi-block warehouses.
Our work delineates from these existing approaches in diverse aspects. First of all, our work applies optimization techniques and does not rely on a policy on how to select fitting storage racks or shortest pick routes. Second, regarding existing optimization approaches, our work integrates multiple objectives at once considering economic as well as ergonomic constraints at once while most of the other approaches focus on a single economic goal. Similarly, the authors of [3] also integrate ergonomic considerations, but do not focus on optimizing the storage assignment but rather the order picking only. Finally, in contrast to existing work that address the influence of storage assignment and order picking tasks, we designed algorithms that optimize the targets of both tasks. Hence, they optimize storage assignment and order picking with regards to the interdependence of both algorithms, while other works only provide well-performing combinations of algorithms or perform parameter tuning. Similarly, [16] also integrates both activities, but do not focus on a multiobjective approach,especially neglecting the ergonomic constraints. Also the work of [17] integrates both activities, however, they apply a digital twin based approach rather than considering multi-objective optimization algorithms.

Meta-Model of considered mezzanine warehouses and overview on the processes
The storage assignment and order picking algorithm require information on the warehouse layout, the product assortment, the products' storage locations, and the current state of the warehouse. A list of all used variables in this section can be found in the Appendix in Table 11. Figure 1 illustrates our proposed meta-model. The blue box describes the floor layout defining the arrangement of racks within one floor of the mezzanine warehouse (FloorLayout). Each floor consists of the classes, P/D-Point, WidePickAisle, and Rack. A p/d-point is the pickup and delivery point where personal needs to collect items to be stored in the warehouse or deliver items of a customer order that were collected. Regard the class WidePickAisle, two types of pick aisles exist: wide and narrow pick aisles. In wide pick aisles, pickers can take along their pick cart to cross the aisle while it needs to be parked at the aisle entry for narrow pick aisles. A floor can be illustrated as a two-dimensional map as depicted in Fig. 2: The racks with their unique identifiers r 3 and r 4 are assigned the floor coordinates x = 1 and y = 2 since their access points are both located at (1|2). The vertical aisles located at x = 0 and x = 4, as well as the horizontal cross aisles at y = 0, y = 4, and y = 7, form the periphery of the floor. Periphery aisles usually contain the p/d-points (e.g. at (2|0)). A wide pick aisle is depicted at x-coordinate two and two narrow pick aisles are shown at x-coordinates one and three, where the picker needs to park his pick cart. Real-world mezzanine warehouses may apply different layouts on each floor; however, we assume that each floor in the mezzanine warehouse applies the same layout.
Since diagonal movements are not possible in this layout, the Manhattan distance function is applied to calculate the distance between two locations p and q (with n being the number of dimensions of the coordinates for p and q): The classes inside the yellow box (Compartment and RackConfiguration) define the configuration of a rack, referring to its size, the number of shelf  The Compartment class includes an identifier and a three-dimensional vector specifying the compartment's dimensions. The shelf level and the shelf level position defines the compartment's location within the rack. The class Product defines the products using five properties: product number, size, weight, rank, and order frequency. The rank (≥ 1) allows identifying fast and slow-moving products by the frequency at which the product appears in recent customer orders. The product of rank 1 represents the most frequently ordered product. The order frequency describes the frequency to which a product is usually ordered using a gaussian distribution. Both properties are derived from recent customer orders and represent redundant information which prevents the algorithms from recalculating this information each time they need it. Further, these properties are later used in the storage assignment optimization to find better racks regarding their frequency and usual ordered amount. The class ProductAssignment specifies the quantity of which a product is assigned to a specific compartment. The classes Order and OrderLine of the orange package define the structure of a customer order consisting of a unique order number and multiple order lines. An order line specifies the quantity to which a product is ordered. The class AssociationRule defines association rules derived by the Apriori algorithm [10]. The confidence ranges from 0 to 1 and expresses the strength of the correlation between the left-sided and the right-sided set of products. These rules are used in the storage assignment algorithm later on to store correlated products close to each other which may increase the order picking performance. Figure 3 presents the two processes for the storage assignment as well as order picking. Storage assignment relies on the NSGA-II algorithm for optimal distribution of the items in the mezzanine warehouse (see Fig. 3a). First, the algorithm distributes the incoming product across the mezzanine floors with the goal to reduce the need for changing floors during order picking. Second, on each floor, the algorithm searches for appropriate racks to store the incoming product based on the following four objectives: frequently requested quantity, spread items, distance to pick-up/delivery points, and correlated products. The NSGA-II algorithm shall optimize the four objectives. Third, the NSGA-II algorithm selects compartments for storing the incoming product based on two criteria.
(1) Fast-moving products should be assigned to compartments at grip height. (2) Heavy products should be assigned to lower-level compartments. Order picking also follows a three-step approach (see Fig. 3b). First, the algorithm constructs a graph representation based on the information of the pick list as well as the mezzanine floor layout. Second, the ACO algorithm applies artificial ants that explore the graph. Third, the quality of the computed pick route is assessed based on the travel distance (economic goal) and the weight violations of the items (ergonomic goal). In the following, we discuss those different algorithms.

Storage assignment
The overall goal of the storage assignment algorithm is to select a set of compartments for storing an incoming 18608 V. Lesch et al.

Fig. 3
Schematic overview on the storage assignment (above) and order picking (below) processes product by considering multiple economic and ergonomic constraints simultaneously.

Constraints and assumptions
In expert interviews, we identified multiple hard constraints that should be covered in our approaches. These hard constraints specify whether a storage allocation is considered feasible and a feasible solution never violates any of these constraints: Each incoming item must be assigned to a compartment (H C 1 ). The selected compartment must either be empty or partially occupied by items of the same product (H C 2 ). Each item has to fit in the remaining free space if its compartment (H C 3 ). Furthermore, we define multiple soft constraints that measure the extent to which a storage allocation fulfills economic criteria: The products should be evenly spread on each floor (SC 1 ). Fast-moving products should be assigned close to a p/d-point (SC 2 ). The mean ordered quantity of a product should be locally available (SC 3 ). Correlated products should be stored close to each other (SC 4 ). The storage space should be used as efficiently as possible (SC 5 ). Finally, we define two ergonomic soft constraints: Heavy products should be stored at grip height (SC 6 ). Fast-moving products should be assigned to compartments at grip height (SC 7 ). Further, we state the following assumptions for our approach: The state of the warehouse does not change while the storage assignment algorithm is running. Thus, the products are not repositioned nor removed, and the racks' configurations do not change. Further, the algorithm allocates only one product at a time. The storage racks may apply different rack configurations and products may only be assigned to fitting compartments. A compartment is allowed to store multiple items of the same product but may not store two different products at the same time. Finally, product ranks and association rules are derived from recent customer orders.

3-Phase storage assignment algorithm
Our storage assignment algorithm consists of three phases that intend to reduce the complexity of the optimization problem: (i) assignment of products to floors, (ii) assignment to racks w.r.t. economic criteria, and (iii) assignment to compartments w.r.t. ergonomic criteria.
In the first phase, the incoming product quantity is split among the mezzanine floors (SC 1 ) so that each floor provides the same quantity of the product. This way, we try to reduce the required floor changes during a pick route to a minimum. Thus, we first determine the total quantity of the incoming product that is already available in each floor, calculate the ideal quantity for each floor after storage assignment, and assign the missing quantity to each floor. Remaining items, due to rounded results, are allocated to a random floor.
The second phase addresses the economic soft constraints SC 2 to SC 5 and aims to reduce travel distances during order picking. This phase assigns the incoming products to racks on a specific floor. Since this phase requires optimizing a set of constraints, we apply a multi-objective optimization algorithm that is described in Section 5.
The third phase aims to satisfy the ergonomic soft constraints SC 6 and SC 7 . We classify a product p into three weight classes: light (up to 3 kg), medium (between 3 kg and 7 kg), and heavy (over 7 kg). We set the grip height to be between 0.75 m to 1.25 m and refer to compartments below/above the grip height as low/high zone compartments. Additionally, we distinguish fast-moving, moderately-moving, and slow-moving products by their relative rank. The relative rank of a product p calculates as rank p /|P |, where rank p denotes the rank of product p, and |P | the size of the product assortment. In the first step, the incoming items are assigned to the rack's compartments that already provide items of the same product. In the second step, the remaining incoming items are assigned to the rack's unoccupied compartments based on predefined penalty values. The penalty values range from zero to three and the more a compartment comp is unsuited for storing the product p, the more penalty points are given (see Tables 1 and 2).

Genetic algorithm for storage assignment
This section presents our custom version of the NSGA-II algorithm that was proposed by [6]. The algorithm receives the current state of a floor and assigns the incoming items to a set of racks on this floor. Note that the NSGA-II is executed for each floor individually. A list of all used variables in this section can be found in the Appendix in Table 12.

Chromosome encoding
We propose the chromosome encoding depicted in Fig. 4.
The figure illustrates an example allocation task where ten items of product p 1 must be assigned to the racks on f loor 1 . The black numbers indicate the existing items of product p 1 , while the red numbers indicate the incoming items of product p 1 . The right side shows the chromosome that encodes the storage allocation depicted on the left side by specifying the racks selected for storing each incoming item. Since ten items of product p 1 are assigned, the chromosome's length equals 10.

Objective functions
A set of objective functions guide the NSGA-II algorithm to find good storage allocations. We propose four domain  specific objective functions for our maximization problem: (i) spread score, (ii) distance score, (iii) quantity score, (iv) correlation score.

Spread score
This score addresses constraint SC 1 and aims to equally spread the incoming quality of product p across the entire floor. Hence, we divide the floor f j into multiple areas A of equal size. To calculate the spread score, we use the total (totalQ) and ideal quantity (idealQ) of a product in an area of a floor. The totalQ is the sum of the existing and incoming items in an area, while the idealQ is calculated by dividing the sum of the existing and incoming quantity of product p on the floor by the number of defined areas. The final spread score for chromosome C is calculated as the sum of differences between the total and the ideal quantity for all areas (see (2)).

Distance score
This score addresses constraint SC 2 and aims to allocate slow-moving products to racks further away from the p/dpoints. Hence, the distance score quantifies the extent to which the walking distances (dist r i ) of the selected racks match the ideal distance (idealDist p,f j ). For calculating the idealDist, we perform the following steps: First, we determine the relative rank of the incoming product p by dividing the rank of the product (rank p ) by the size of the product assortment P : relRank p = rank p /|P |. Then, the relative rank is mapped to a rack index: rackI dx p,f j = relRank p · |R f j | with R f j being the list of racks of floor f j sorted by the racks' walking distances to their closest p/dpoint. The rack in R f j at index rackI dx represents the bestsuited rack for storing product p with regard to constraint SC 2 . Finally, the idealDist computes as: The overall distance score calculates as the sum over all racks in chromosome (C) of differences between the walking distances of the racks selected for storing product p and the idealDist (see (3)).
Further, we provide an example of this calculation in Fig. 5.

Quantity score
This score assesses SC 3 and ensures that the mean ordered quantity of a product is locally available. Therefore, the target quantity defines the quantity to which the product p should be locally available based on a set of recent customer orders: tq p = μ p + 2σ p with μ p as expected value for the orders and σ p for the standard deviation. Further, we define four masks and a modifier for each mask to measure the density to which the tq p is locally available: M 1 equals the size of a rack (maskMod = 1), Using these masks, we calculate a quantity factor for each sub aisle (sa) of a floor and each mask (M k ). Therefore, we select the quantity (q) of products inside a mask divided by the target quantity: qF actor p,f j ,sa l = max(q p,f j ,sa l (M k )/tq p ). This results in a value of 1 if the target quantity is met and a value of 0 if no products can be found within this mask. This quantity factor is then multiplied by the maskMod to calculate the mask score: maskScore p,f j ,sa l (M k ) = maskMod M k · qF actor p,f j ,sa l (M k ). The highest possible mask score is 1, indicating that the target quantity is available in a single rack of the sub aisle. Based on these mask scores, the maximum value is selected to assign a score to each sub aisle: subAisleScore p,f j ,sa l = max 4 k=1 maskScore p,f j ,sa l (M k ).
The final quantity score computes as the sum of all subAisleScores (with |SA| as the number of sub aisles): subAisleScore p,f j ,sa l (4) Figure 6 illustrates the idea of using masks of different sizes to measure the density to which the target quantity tq p of product p is locally available. The left side shows the storage locations of existing and incoming items of product p in a specific sub aisle sa. The center of the figure depicts the four masks M k that iterate over the racks of the sub aisle. During this process, the masks count the existing and incoming quantities of product p that can be found in the covered regions. The right side shows the regions where the masks find the largest quantity of product p in the sub aisle sa.

Correlation score
This score relates to SC 4 and describes the extent to which the incoming product is stored close to its correlated products. Association rules describe correlations between products and can be derived from recent customer orders.
We consider association rules of the form rule = {p} conf −−→ {cp}, where p denotes the incoming product, cp the correlated product, and conf a confidence value. We first calculate the number of possible clusters of target quantities of the incoming product: qClusters p,f j = totalQ p,f j /tq p . We use this value to define the ideal quantity to which the correlated product should be available in the vicinity of the incoming product: In the next step, we determine the quantity of cp that already is available in the vicinity of p. For this task, the previously introduced masks M k are used and are placed directly on top of the racks containing cp. Again, the qF actor is calculated to capture the extent to which the target quantity of p is available in the region covered by M k placed on top of rack r: qF actor p,r (M k ) = q p,r (M k )/tq p . Then, we calculate the fraction to which the items of cp stored in r are considered to be in the vicinity of p: corrQ rule,r (M k ) = exQ cp,r · qF actor p,r (M k ) · maskMod M k . exQ cp,r refers to the existing quantity of the correlated product cp in rack r. Afterward, we select the corrQ with the highest value representing the mask with the largest amount of p in the vicinity of cp: corrQ rule,r = max 4 k=1 corrQ rule,r (M k ). The sum of all corrQ rule,r over all racks on this floor denotes the quantity of the cp on this floor that is considered as being in the vicinity of p: corrQ rule,f j = rack∈R f j ,cp corrQ rule,r . Now, we calculated the quantity of the correlated product that is in the vicinity of the incoming product and the difference of this value to the ideal quantity. Based on this difference, the correlation score is calculated as (with A p as the set of associations rules) ( Fig. 7):

Genetic operators
The NSGA-II is a genetic algorithm and requires the definition of selection, crossover, and mutation operators.

Selection
We apply a binary tournament selection operator where two random parent individuals compete against each other [6]. The individual with the higher Pareto rank is declared the winner and is allowed to participate in the crossover procedure. In case both parents are of equal Pareto rank, the individual with the larger crowding distance, i.e. the higher diversity, wins the tournament.

Crossover
Since all chromosomes created during a run of the NSGA-II algorithm are of equal length, we use the traditional singlepoint crossover operator. It selects a random crossover point on both parents' chromosomes, splits them, and recombines them cross-wise to obtain two new children.

Mutation
We define eight mutation operators that incorporate domainspecific knowledge to guide the search process: (1) The FillRack mutator selects a random rack and fills it with incoming items from the same sub aisle. (2) The MoveRack mutator selects a random rack containing at least one incoming item and moves them to a different rack within the same sub aisle. (3) The FillSubAisle mutator selects a random sub aisle and fills it with incoming items from other sub aisles until it provides the product's target quantity. (4) The ClearSubAisle mutator selects a random sub aisle and moves any incoming items to a different sub aisle. (5) The RedistributeExceedingQuantities mutator redistributes incoming items of racks that provide more items than the target quantity to racks that require only a few items to provide the target quantity. (6) The ShiftRacks mutator shifts all incoming items towards a randomly selected direction: left, right, up, or down. (7) The SwapSubAisles mutator first groups the sub aisles into pairs and swaps incoming items randomly within each pair. (8) The SwapRacks mutator is similar to (7) but swaps items within pairs of racks instead of sub aisles.

NSGA-II algorithm
The overall procedure of our NSGA-II algorithm is summarized in Algorithm 1.
The algorithm receives the product to be stored and its quantity as well as the list fittingRacks. Further, the parentPopSize defines the size of the parent population, the mutation probability is given by mutProb, the number of generations to be used when calculating the standard deviation of the maximum crowding distance std(L) is called L, the threshold for the standard deviation of the crowding distance is δ lim , and the maximum number of generations is called maxGen. In the end, the algorithm returns a paretoFront of the best storage assignments.
In the first step, the algorithm initializes the population by randomly creating the required amount of chromosomes. Therefore, the algorithm selects fitting racks for the product randomly which might produce invalid solutions due to exceeded rack spaces. Each invalid chromosome is then repaired by moving the amount of exceeding products to another available rack. Then, the generation counter gen and the history of observed maximum crowding distances are initialized. Then, the while loop starts and iterates using the two following stopping criterions: (i) the number of maximum generations (maxGen) is executed, or (ii) the standard deviation of observed crowding distances (std(L)) falls below the given threshold (δ lim ). Inside the while loop, the generations counter is incremented, and a complete new children population in the size of the parent population is bred using the proposed selection, crossover and muta- tion operators (createChildrenPopulation). This set is added to a combined population of existing parent individuals and select the best individuals to fill the new parent population (createNextParentPopulation()). Afterwards, a Pareto front is calculated from this parent population (calculateParetoFront()) and the maximum crowding distance of this front is calculated. This value is added to the history of maximum crowding distances. In case, the while loop stops, the current Pareto front is returned.
Since the NSGA-II algorithm returns a Pareto front, a user is usually required to identify the most valuable tradeoff solution. However, we automate this step by applying the following procedure. For each of the four objective functions (of i ), we select the solution (s j ) of the Pareto front with the highest value (of i (s j )) for this function. We then use these values as a 4-dimensional reference point . Based on the Euclidean distance, the solution that is closest to the reference point is automatically selected as the most valuable trade-off solution.

Order picking
This section introduces our order picking approach that is based on ACO. The overall goal of this algorithm is to construct a pick route for a given customer order. Since the travel distance is an essential economic goal, the pick route should be as short as possible. Additionally, the pick route should also be ergonomically favorable. The need for changing floors should be minimal to reduce the order picker's physical stress. Further, the product picking sequence is relevant as if light products are picked first, the order picker might need to rearrange the already picked products so that light products are placed on top of heavy products. Hence, the order picking algorithm aims to construct a short pick route that collects heavy products first and changes floors as little as possible to address economic and ergonomic criteria. A list of all used variables in this section can be found in the Appendix, in Table 13.
The main idea of this approach is to represent a mezzanine warehouse as a graph and let ants search for satisfactory order picking sequences. We make the following assumptions to better deal with the complexity of the order picking problem: (i) The state of the mezzanine warehouse does not change while the algorithm is running, that is no repositioning or removal of products is performed. (ii) The start and ending p/d points of a pick route may differ. (iii) Narrow sub aisles may only be traversed to the sub aisles' midpoint, as the picker always must go back to the cart in the wide pick aisle. (iv) The order pickers visit only one rack each time they enter a sub aisle. (v) Picking carts withstand infinite loads and can carry an unlimited amount of items.

Constraints
For the order picking algorithm, we define a set of hard and soft constraints. The hard constraints assess the feasibility of a solution, while the soft constraints measure the extent to which the solution fulfills economic and ergonomic goals. We define the following hard constraints: The pick route must start and end at a p/d point (H C 1 ). The pick route must collect the requested quantities of the products specified in the pick list (H C 2 ). After entering a narrow sub aisle, the route must always return to the sub aisle's entrance (H C 3 ). Further, we define one economic soft constraint: The travel distance should be minimal (SC 1 ); And two ergonomic soft constraints: The need for changing floors should be minimal (SC 2 ). Heavy products should be picked first, followed by lighter products (SC 3 ).

Graph representation
We propose the following procedure for transferring a mezzanine warehouse into a graph representation. Figure 8 illustrates the procedure of dividing the warehouse into multiple zones called market zones.
Each market zone is represented by a market, and thus, a node in the graph. The figure depicts the state of f loor 1 that consists of three cross aisles, three wide (pick) aisles, and two p/d-points. We define six market zones obtained by dividing the floor along the wide (pick) aisles into multiple vertical lanes. In the depicted example, lane 1 refers to the area from aisle 0 to pickAisle 3 , and lane 2 refers to the area from pickAisle 3 to aisle 6 . A crossLane (c,l) refers to the part of the cross aisle c that lies within the lane l. For each cross lane, we define a market zone that comprises the storage racks that can be visited from the respective cross lane up to their midpoints. For example, the red market zone includes the racks that can be visited if the order picker is located at the crossLane (1,1) . The market zones are limited to the midpoints of the corresponding sub aisles, which prevents the ants from constructing pick routes that entirely traverse the pick aisles. A market is referred to as market (f,c,l) , where f denotes the floor, c the cross aisle, and l the lane. For each market, we define three attributes: (1) the market's coordinates, (2) the market's closest p/dpoint, and (3) the market's supply that specifies which products are available at which quantity. After defining all markets, they are connected via edges to create a complete directed graph. The edges' weights represent the Manhattan distances between the markets. If the warehouse consists of a second f loor 2 , the markets on f loor 1 are also connected to the markets on f loor 2 and vice versa, with an extra f loorP enalty added to the edges' weights.

Pick route construction
An ant colony explores the graph to construct a set of pick routes, i.e., a sequence of markets that provide the products, for a given pick list. A pick route consists of two layers: (i) representing markets, and (ii) rack sequences. Figure 9 depicts an example pick route created by a single ant of the colony.
The market sequence (layer one) of a pick route is computed by an ant that is placed on a market within the graph. Guided by the pheromone trails, the ant visits neighboring markets until it collected the requested product quantities specified in the pick list. The ant manages a purchasing list that specifies the missing items. The pick route is complete after the ant's purchasing list is empty. Further, each ant must decide whether it enters the market zone from the left or from the right side which depends on the position of the previously visited market. The left/right entrance is located at the position where the cross lane has its lowest/highest x coordinate value. The decision from which side the ant enters the market zone depends on the position of the previously visited market. While constructing pick routes, the ant applies a heuristic function to identify the markets within its vicinity that seem attractive to visit next.
The second layer represents the rack sequence, i.e., the racks the ant visited in each market. When calculating the rack sequence, the ants use the following priority rules: (1) Racks that provide heavy products should be visited first. (2) Racks located closer to the sub aisle's entrance should be visited second. (3) Racks that provide the largest quantities should be visited third.

Heuristic function
To identify the most promising paths and assess the attractiveness of a market, the ants apply a heuristic function. The attractiveness of a market is based on two factors: (i) the closeness of the market to the ant's current location, and (ii) the availability of required items. Thus, we define the heuristic function as follows: where η k m,n is the heuristic value that the ant k currently located at market m associates with the edge (m, n) leading to market n. U k is the set of markets the ant has not visited yet and d m,n > 0 refers to the Manhattan distance between the markets. I k n ∈ [0; 1] denotes the percentage to which the required items of ant k are available at market n. The higher the heuristic value, the more attractive is the market for the ant.

Objective functions
After retrieving possible pick routes from the algorithm, we use two objective functions to asses the quality of the route.

Travel distance
This objective function calculates the travel distance of a pick route and measures the extent to which the soft constraint SC 1 and SC 2 are satisfied. We define a pick route P to be P = (M, R) where M = (m 1 , ..., m k ) refers to the market sequence and R = (r 1 , ..., r l ) refers to the rack sequence. We then define the objective function as follows: where we sum up the distance from the start p/d-point to the first market, the sum of the distances within each entered sub aisle (d sub r i ), the sum of the distances within the cross lanes (d cross m i ), the distances between the visited markets (d market (m i ,m i+1 ) ), and the distance from the last visited market to its closest p/d-point (d pd m k ).

Weight violation
The second objective function measures the extent to which a pick route satisfies the soft constraint SC 3 and counts the number of weight violations in the product picking sequence. A weight violation occurs if a heavy product is collected after a much lighter product. In this case, the order picker must rearrange the lighter products already placed on the picking cart to prevent damage. The user-specified threshold allowedWeightDifference defines the acceptable weight difference between the heavier and the lighter products. Using this threshold, we count the number of weight violations in a product picking sequence.

ACO algorithm procedure
This section proposes our proposed ACO algorithm and shows the pseudo-code in Algorithm 2. First of all, the algorithm constructs the graph and initializes the pheromones. The pheromones are initialized with their maximum possible value determined by τ max . Additionally, a minimum pheromone can be specified by using the value τ min in the parametrization of the algorithm. Then, a while loop starts and uses the concept of cataclysms [4] and a maximum number of iterations as stopping criterion: The parameter maxCataclysms specifies the maximum number of cataclysms that may occur. The parameter maxconsIterWoImpr defines the time window in which the ACO algorithm must improve the current Pareto front to prevent the cataclysm operator from being applied. The parameter maxIter defines the maximum allowed number of iterations regardless of happened cataclysms.
Inside the loop the number of current iterations is incremented and pick routes are constructed. The general idea is to place one ant on each market of the graph from which the ant starts to create a pick route. The next market is selected based on the pheromone values and the heuristic function. We propose two different versions of the ACO to combine these values as explained later. For each found pick route, the reverse pick route is calculated by reversing the market sequence, toggling the sides from which the ant entered the markets, and recalculating the rack sequence. We store the pick routes the ants construct in each iteration in the variable pickRoutes. In the next step, the Pareto-optimal pick routes of this iteration are selected by calculating the objective function and the Pareto rank of all routes. Afterward, the iteration-best (pickRoutes ib ) and the global-best pick routes (nextPickRoutes gb ) are merged into a single set and the Pareto-optimal pick routes in this set represent the next set of global-best pick routes. The iteration-best pick routes and the global-best pick routes are used to perform the pheromone update, which is explained later. In the further course of the iteration, the ACO algorithm checks whether the cataclysm operator must be applied and compares the global best pick routes of the last and the current iteration. If the ACO algorithm succeeded in improving the Pareto front, the set pickRoutes gb is updated, and the counter variable consIterWoImpr is reset to 0. However, if no improvement was made, this counter variable is incremented. If multiple consecutive iterations fail to achieve an improvement, the search is considered stuck, and the cataclysm operator is applied. In case the cataclysm is applied, the global-best pick routes pickRoutes gb are included in the set pickRoutes cataclysm , the pheromones on the edges representing the pick routes in pickRoutes gb are reset to the lowest possible value, and the set pickRoutes gb is emptied. Then, the number of cataclysms is incremented and the counter variable consIterWoImpr is reset to 0. After the main loop terminates, the global-best pick routes pickRoutes gb of the last iteration are included in the set pickRoutes cataclysm and the algorithm returns the Pareto-optimal pick routes in this set.

ACO 3 variant
In the following, we introduce two variants of our algorithm that show a distinct pheromone handling. Both variants are inspired by [1] that propose four different variants to handle multi-objective problems with an ACO. We select the two best performing variants (ACO 3 and ACO 4 ) and integrate them in our approach to compare which variant produces the best results in our problem domain. This section introduces the ACO 3 variant that applies one ant colony using a single pheromone matrix τ 1 for optimizing both objectives simultaneously. In each construction step, the probability of selecting an edge calculates as: where prob k m,n denotes the probability of ant k located at market m to select the edge (m, n) leading to market n. τ 1 m,n refers to the pheromone value of edge (m, n). η k m,n denotes the heuristic value (see Formula (6)) that the ant associates with the edge (m, n). The parameters α and β control the importance of the pheromone values and heuristic values. Lastly, U k represents the set of markets that ant k has not visited yet.
When performing the pheromone update, the ACO 3 variant rewards in 90% of the time the iteration-best pick routes and in 10% of the time, the global-best pick routes (found since the last cataclysm) to update the pheromone where ρ refers to the evaporation factor and Δτ 1 m,n is the amount of pheromone that is added to the edge (m, n). P F refers to the Pareto front containing the solutions to be rewarded.

ACO 4 variant
The ACO 4 variant also applies one ant colony but a pheromone matrix τ 1 for optimizing the first objective function, and another pheromone matrix τ 2 for optimizing the second objective function. When deciding which edge to explore next, an ant randomly chooses a pheromone matrix. In each construction step, the probability of selecting an edge calculates as [1]: where τ r m,n refers to the pheromone value of edge (m, n) w.r.t. pheromone matrix τ i . At the end of an iteration, the ACO 4 variant updates the pheromone matrix τ i by rewarding the iteration-best pick route P R i ib that minimizes the objective function of i [1]: , if (m, n) belongs to the pick route P R i ib 0, otherwise (13) where ρ again refers to the evaporation factor and Δτ i m,n is the pheromone added to the edge (m, n) in pheromone matrix τ i . P R i gb refers to the global-best pick route that minimizes the ith objective function of all pick routes constructed since the last cataclysm occurred.

Evaluation
This section presents the evaluation of our approaches. It defines the used warehouse models for applying our algorithms, presents performance indicators, summarizes alternative policies to which we compare our algorithms, and provides the parameter settings of our algorithms. Afterwards, we first evaluate our storage assignment and order picking algorithms individually before we evaluate the interaction of both algorithms.

Mezzanine warehouse models
The NSGA-II and the ACO algorithm are evaluated in three artificial mezzanine warehouses of different sizes that are defined in cooperation with our cooperation company to build real-world test cases. The warehouses are shown in Fig. 10: W H small (yellow), W H medium (orange), and W H large (red). For the small, medium, and large warehouses, we define the size of the product assortment to be 500, 1000, and 1500, respectively. Since each product requires a weight, we define three normal distributions and a probability to determine the weight using this distribution: 25% to use N (2, 1.0 2 ), 50% to use N (5, 2.0 2 ), and 25% to use N (8, 1.0 2 ). Using these distributions and probabilities, we aim at a representative set of product weights where most of the products have a medium weight and some products have low and some have heavy weights. The products might also have correlations to up to three other products: With a probability of 30%, 40%, 20%, and 10% a product has no, one, two, or three correlated products, respectively, with a randomly generated correlation confidence between 10% and 90%. For evaluating the order picking algorithm, we fill the storage up to 50% of the available storage space and randomly generate 100 customer orders based on the product assortment and given correlations between products. Each customer order comprises 20 items to pick that are selected as follows: We split the product assortment into four equally sized groups based on the product rank. With a probability of 40%, 30%, 20%, and 10% an order contains an item of the highest, second highest, third highest, and lowest rank class, respectively, which ensures that high-ranked products appear more often in customer orders.

Performance indicators for assessing Pareto fronts
Since we assess a multi-objective optimization problem, the algorithms compute a Pareto front. We use the following quality indicators for Pareto fronts introduced by [31]. The Coverage (Cov) quality indicator quantifies the extent to which a computed Pareto front covers the reference Pareto front. The quality indicators Generational Distance (GD) and Euclidean Distance (ED) measure the distance from a computed Pareto front to a reference Pareto front or a reference solution. The quality indicators Pareto Front Size (PFS) and Generated Spread (GS) measure the diversity of the solutions that exist in a computed Pareto front. Finally, the quality indicator Inverted Generational Distance (IGD) combines convergence and diversity aspects. Since most of these quality indicators require the calculation of a reference Pareto front, we use the Pareto fronts returned by all algorithms as basis. From these Pareto fronts, we select the non-dominated solutions of the union of all computed Pareto fronts and use this front as reference Pareto front.

Alternative policies
We use the following alternative policies for the storage assignment problem. The random storage assignment policy allocates the incoming items to random racks on the floors in clusters of target quantity size [2]. In the closest open location storage assignment policy, the warehouse employees select the storage locations for storing an incoming product, which are usually the racks closest to the p/d-points [13]. The rank-based storage assignment policy assigns fast-moving products close to the p/d-points, while slow-moving products are assigned to racks further away [24].
For the order picking problem, we apply a modified S-Shape heuristic for comparison that constructs s-shaped pick routes based on the graph representation [22]. This heuristic uses all markets as starting point iteratively as well as the reversed versions of each route to generate a Pareto front of possible solutions.

Algorithm parameter settings
Based on a preliminary parameter study, we parameterize our NSGA-II algorithm as follows: We set the mutation probability to 0.95 for all warehouse sizes so that the mutation operators are applied very frequently. Additionally, we set the crossover probability to 1.00 so that the crossover operator is applied for all generations. Further, we define parameters dependent on the warehouse size (small/medium/large): The parent population size is set to (50/60/70), and the maximum number of generations to (200/250/300). These values increase with the size of the warehouse since the number of possible solutions increases with the warehouse size and we provide the algorithm more exploration possibilities (population size) and more time (number of generations) for optimizing the solutions.
Further, we set the parameters for our ACO algorithm as follows: In line with the literature, we set the pheromone factor α to 1.0 and the heuristic factor β to 2.0. We set the evaporation factor ρ to 0.02 causing the pheromones to evaporate rather slowly which enables the algorithm to achieve a higher degree of exploration especially in the early stages. The min/max values for the pheromone matrices (τ min/max ) are set to 1 and 25, respectively, add a floor change penalty of 50, and set the allowed weight difference to 3 kg. As stopping criterion, we set the maximum number of cataclysms to 3 and hence, the algorithm terminates after it became stuck for the third time.
Using the results of a preliminary parameter study, we set the maximum consecutive iterations without improvements to 20, and the maximum iterations to 250 since these values yield the best results w.r.t the scores.

Evaluation of the NSGA-II algorithm for storage assignment tasks
We evaluate our NSGA-II algorithm against the random, closest open location, and rank-based storage assignment policies. We apply all approaches on the three warehouse sizes (Setting 1.a, 1.b, 1.c) and on five randomly generated storage assignment tasks, i.e., we select a random product from the product assortment and set the quantity to be assigned to the quantity already existing in the warehouse. We repeat the execution of the NSGA-II algorithm ten times to reduce random effects and present mean and standard deviation values. All generated solutions of all algorithms are then used to calculate the reference Pareto front required for the quality indicators. Table 3 summarizes  the mean values and Table 4 shows the standard deviation values for this evaluation. The coverage (Cov) results for the small warehouse show that the NSGA-II Pareto front covers about 90% of the reference Pareto front while the other approaches cover only around 9% and 1% Since, the NSGA-II shows the lowest GD and ED values, this Pareto front is located closest to the reference front. The NSGA-II algorithm finds around 48 solutions per problem instance with a maximum possible value of 50 solutions for the small warehouse size. The other policies only construct 22 to 28 Pareto-optimal solutions while their maximum possible value is set to 500. Further, the NSGA-II achieves the lowest GS and IGD values which indicates, that the solutions converge well towards the reference Pareto front and offer diverse solutions.
In the medium warehouse, the results show similar behavior. The  Similarly, the large warehouse shows comparable results. The Pareto front P F nsga 2 (c l ) covers about 99% of the reference Pareto front P F ref , while P F rank covers only 1%. Thus, almost all solutions found by the rank-based policy are dominated by the solutions found by the NSGA-II algorithm. The mean GD value of P F nsga 2 (c l ) equals 0, indicating that the entire Pareto front P F nsga 2 (c l ) is part of P F ref in almost all cases. Other than that, the observations made in the previous Setting 1.b also occur in this setting. Thus, the standard deviations of the ED metric further increase, the NSGA-II algorithm finds the most solutions per problem instance, the alternative In summary, the results show that the random and the closest open location policy struggle to cover even a single solution in the reference Pareto front. Additionally, the NSGA-II algorithm outperforms the alternative policies in smaller warehouses. Further, the NSGA-II finds on average the most solutions per problem instance and the solutions are less equally distributed.
In addition to the quality evaluation, we also measure the mean execution time of the approaches for solving 50 problem instances in each warehouse size 1 . The alternative policies achieve low execution times of about 0.17/0.30/0.50 seconds for small/medium/large which is due to their comparably simple operation. In the warehouse small/medium/large, the NSGA-II algorithm achieves execution times of about 2/6/15 seconds, which is due to the increase population and iteration count for larger warehouses. The execution times of the NSGA-II algorithm may be considered acceptable, as the algorithm requires only a few seconds to find storage allocations that are remarkably better than the ones found by the alternative policies.

Evaluation of the ACO algorithm for order picking tasks
We evaluate both versions of our ACO algorithm against the modified S-Shape heuristic. We apply all approaches on the three warehouse sizes (Settings 2.a, 2.b, 2.c) using five customer orders randomly selected from the set of generated customer orders as explained earlier and repeat the execution of the ACO algorithms ten times to reduce random effects and present mean and standard deviation values. Then, we use all generated solutions the algorithms to calculate the reference Pareto front required for the quality indicators. Table 5 summarizes the mean values  and Table 6 shows the standard deviation values for this evaluation. For the small warehouse, the Pareto fronts of the ACO 3 and ACO 4 variants cover 74% of the reference Pareto front while the S-Shape heuristic fails to cover even a single solution. Both ACO algorithms achieve nearly the same GD and ED values and the close to zero GD values show that many solutions are part of the reference front. The ACO algorithms find around ten solutions per problem instance, while the S-Shape only finds three solutions per problem instance. The S-Shape achieves the lowest, hence, the best GS values, but it is not meaningful to compare these values to the ACO ones as it contains only three solutions that are considerably worse than solutions of the ACO 1 We run our experiments on a MacBook Pro using macOS Sierra 10.12.6, a 2.2GHz Intel Core i7 CPU and 16GB DDR3 RAM. algorithms. The ACO algorithms achieve low IGD values, indicating that both Pareto fronts converge well towards the reference front and provide diverse solutions.
The metrics of the medium warehouse show similar behavior as in the previous setting. Like in the previous setting, the Pareto front P F sShape fails to cover even a single solution in the reference Pareto front P F ref . The Pareto front P F aco 3 covers approximately 69% of P F ref , while P F aco 4 covers only 33%. Thus, the ACO 3 variant tends to find better pick routes than the ACO 4 variant. The Pareto front P F aco 3 achieves the lowest GD and ED values among all computed Pareto fronts. Thus, P F aco 3 converges best towards P F ref , which is not surprising, as P F aco 3 covers most of the solutions in P F ref . The GD and ED values of P F aco 4 are slightly larger than the ones of P F aco 3 , indicating that P F aco 4 does not converge as well as P F aco 3 towards P F ref . Compared to the previous setting, the GD and ED values of P F sShape increased, which may be due to the larger search space. Concerning the PFS metric, the ACO 3 variant finds about 12 solutions per problem instance, followed closely by the ACO 4 variant that finds around 11 solutions per problem instance, while the S-Shape heuristic only finds about 4 solutions per problem instance. Regarding the GS metric, the solutions in P F aco 4 are slightly better distributed than the solutions in P F aco 3 . With respect to the IGD metric, P F aco 3 achieves the lowest IGD values, indicating that P F aco 3 converges well towards P F ref and offers a high diversity of solutions.
In the large warehouse, the S-Shape heuristic is again unable to cover a solution in P F ref . The Pareto front P F aco 3 covers 84% of the reference Pareto front P F ref , while P F aco 4 covers only 16%. Thus, most of the solutions found by the ACO 4 variant are dominated by the solutions found by the ACO 3 variant. Accordingly, the ACO 4 variant has problems to compete with the ACO 3 variant in larger warehouses. Compared to the previous setting, the GD and ED values of P F aco 4 further increased, indicating that the distances between the solutions in P F aco 4 and the solutions in P F ref became larger. The Pareto front P F aco 3 converges best towards P F ref , as it achieves the lowest GD and ED values. Regarding the PFS metric, both ACO variants find about 10 solutions per problem instance. The GS metric indicates that the solutions in P F aco 4 are marginally better distributed than the solutions in P F aco 3 . Finally, the Pareto front P F aco 3 achieves the lowest, and thus, best IGD values, signalizing that P F aco 3 converges best towards P F ref and offers diverse solutions.
In summary, the ACO algorithms outperform the S-Shape heuristic in all warehouse sizes, and ACO 3 and ACO 4 show similar performance in smaller warehouses. With increasing warehouse size, the solutions found by the ACO 3 variant dominate more and more solutions of the ACO 4 variant. Hence, the ACO 3 variant starts to find better In addition to the quality evaluation, we also measure the mean execution time of the approaches for solving 50 problem instances in each warehouse size. The S-Shape heuristics takes around 0.15 seconds to compute routes. The ACO 3 and the ACO 4 variant achieve fairly the same execution times in all warehouse sizes of around 1/3/6 seconds for W H small /W H medium /W H large . As the warehouse size increases, the graph consists of more markets causing more ants to be deployed in each iteration. Still, we consider the ACO execution times acceptable, as they require only a few seconds to find noticeably better pick routes.

Evaluation of computational costs
Besides evaluating the performance of the proposed algorithms in terms of solution quality, we also measured execution times and evaluated the computational costs. The experiments were conducted on a MacBook Pro with a 2.2GHz Intel Core i7 processor, 16GB of RAM, and macOS Sierra 10.12.6. Each time measurement was repeated 50 times. Tables 7 and 8 report the mean and standard deviation for the initialization and execution phases in seconds. Table 7 shows the initialization times for the storage assignment and order picking algorithms. The initialization time increases with the warehouse size as more data must be requested from the database. However, the measured initialization times must be analyzed with caution, as the initialization time depends more on the size of the database tables than on the warehouse's size. Initialization times for the order picking algorithms are much higher due to the graph construction, which makes the initialization time proportional to the number of markets. This also indicates that tuning the SQL queries would only partly enhance the initialization time of the order picking but could have a more significant impact on the order picking. The graph construction could be parallelized further to increase the initialization time for order picking, as no data dependencies exist between the markets. Table 8 shows the execution times for the proposed algorithms as well as the execution time of the baselines. For storage assignment, alternative policies achieve low  execution times of up to 0.5 seconds, which is not surprising, as they require only a few calculation steps to assign the incoming items to racks on the floor. The NSGA-II algorithm needs more time to compute the storage allocations and requires up to 15 seconds for the large warehouse. However, the execution times of the NSGA-II algorithm may be considered acceptable, as the algorithm requires only a few seconds to find storage allocations that are remarkably better than the ones found by the alternative policies.
For order picking, the alternative S-Shape heuristic requires 0.15 regardless of warehouse size. The AC0 3 and the AC0 4 achieve fairly the same execution times across the different warehouse sizes, with approximately six seconds for the largest. The execution time increases with the warehouse size since the graph consists of more markets, causing more ants to be deployed in each iteration. Considering the noticeably better routes compared to the S-Shape heuristic, the execution time of a few seconds to find a picking route may be considered acceptable.

Evaluation of the interaction between NSGA-II and ACO algorithm
In the previous section, we have shown the applicability of our algorithms for storage assignment and order picking in dedicated analyses. The results indicate that both algorithms outperform state-of-the-art solutions for those tasks. In this section, we evaluate the interaction between our proposed algorithms by assessing them in three settings: Section 7.8.1 determines whether the ACO 3 performs better on the NSGA-II planned warehouse compared to the random warehouse; Section 7.8.2 performs a similar assessment for the ACO 4 ; Section 7.8.3 evaluates whether the ACO 3 or the ACO 4 perform better on the NSGA-II planned warehouse. Tables 9 and 10 summarize the results.

Comparison of NSGA-II and random planned warehouses for ACO 3
In this setting, we apply the ACO 3 algorithm on all warehouse sizes (Setting 3.a, 3.b, 3.c) twice: once for the warehouse that used the NSGA-II algorithm for storage assignment and once for the randomly assigned warehouse. Again, we select five random items from the product assortment and set the amount to assign to the already existing amount inside the warehouse. In the small warehouse, the Pareto front of the NSGA-II planned warehouse covers the entire reference front while the random planned warehouse does not cover a single solution in the reference front, hence, the GD and IGD values of the NSGA-II planned warehouse are 0 and the ED values are minimal. The high GD and ED values of the random planned warehouse indicate that its Pareto front does not converge well towards the reference front. Thus, the solutions found in the random warehouse are considerably worse than the solutions found in the NSGA-II warehouse. In the medium warehouse, the Pareto fronts of random and NSGA-II planned warehouses achieve fairly the same quality indicator values as in the previous setting. However, the GD and ED metric indicate  that the results for the random warehouse unexpectedly converge better towards the reference front than in Setting 3.a. This could be due to the limited amount of executed problem instances and needs to be further assessed with a higher number of problem instances. Nevertheless, the Pareto front of the random warehouse is still far from converging towards reference front. In the large warehouse, the same observations can be made as in the previous settings, underlining that the ACO 3 variant finds better pick routes in the NSGA-II warehouse than in the random warehouse. In summary, the evaluation results show that the NSGA-II algorithm and the ACO 3 variant interact well together and the ACO 3 variant profits from the NSGA-II algorithm that ensures our four economic constraints.

Comparison of NSGA-II and random planned warehouses for ACO 4
This setting repeats the Settings 3.a to 3.c for the ACO 4 algorithm. We now discuss the results for the Settings 4.a to 4.c. In the small warehouse, the Pareto front of the NSGA-II warehouse covers the entire reference Pareto front and the random warehouse fails to cover a single solution. Thus, all solutions found in the random warehouse are dominated by the solutions of the NSGA-II warehouse. The GD and ED values for the random warehouse are higher than the ones of the NSGA-II warehouse which shows that the Pareto front of the random warehouse is further away from the reference front. Hence, the solutions of the random warehouse are noticeably worse than the solutions found in the NSGA-II warehouse. In the medium warehouse, the Pareto front of the NSGA-II warehouse does not always cover the entire reference front, while the random warehouse covers at least one solution in the reference front in 7 of 50 repetitions. Thus, the ACO 4 variant occasionally finds a few solutions in the random warehouse that are comparable with the solutions found in the NSGA-II warehouse. Despite these few outliers, the results show a similar behavior as in the previous setting. Similar to the previous settings, the evaluation in the large warehouse show comparable results. The NSGA-II warehouse covers all solutions in the reference front in 49 of 50 repetitions and the random warehouse manages to cover at least one solution in the reference front. In summary, the results show that the ACO 4 variant also finds better pick routes if the warehouse applies the NSGA-II storage strategy and both algorithms interact well with each other.

Threats to validity
We identified the following threats to validity of our evaluation. First, the NSGA-II and ACO algorithms are evaluated in three mezzanine warehouses of different sizes. However, real-world mezzanine warehouses may consist of more floors, blocks, pick aisles, and racks than specified in the warehouses used for evaluation. Nevertheless, we are convinced that our defined warehouses form a representative set for mezzanine warehouses and can easily be extended for further evaluation runs. Second, since the algorithms are evaluated in warehouses that apply either the random or the NSGA-II storage strategy, the evaluation results may not be transferable to warehouses that apply different storage strategies. Even though the product assortment, the product correlations, the customer orders, and the storage allocations are randomly generated reflecting specific characteristics of real-world mezzanine warehouses, the proposed algorithms are easily transferable to real application data.
Third, we decided to compare the ACO algorithm only to one order picking policy. This decision was made in awareness of the limited expressiveness of our results but was necessary as the majority of policies in the literature violate assumptions made in this work. Finally, we only evaluate our NSGA-II and ACO algorithms against heuristic policies. Hence, they also should be evaluated against other optimization methods like other evolutionary optimization algorithms or graph-based optimization techniques. However, we decided to do this evaluations as future work. Additionally, we performed less than 30 runs for each setting due to the complexity and runtime of the runs. Obviously, 30 runs and more might result in more stable and reliable results. However, the report the standard deviations. As those are pretty low, we assume that the results are already stable.

Conclusion
Due to the complexity of the storage assignment and the order picking problem, efficient optimization algorithms are required to find satisfactory solutions within reasonable times. This paper proposes an NSGA-II algorithm for optimizing the storage assignment problem, and an ACO algorithm for optimizing the order picking problem in mezzanine warehouses. The algorithms incorporate knowledge about the interdependency between both problems to improve the overall warehouse performance. Besides optimizing economic constraints, the algorithms also optimize ergonomic criteria, as mezzanine warehouses represent labor-intensive working environments in which the employees account for a large part of the warehouse performance. We evaluate the NSGA-II algorithm against three storage assignment policies frequently applied in practice: the random, the closest open location, and the rank-based policy. The evaluation results show that the NSGA-II algorithm outperforms the alternatives already in smaller warehouses and the larger the warehouse, the better the NSGA-II algorithm prevails against the alternative policies. We evaluate the ACO algorithm against the S-Shape heuristic that is frequently applied in practice. Our evaluation results show that the ACO outperforms the S-Shape heuristic in all tested warehouse sizes. Finally, we evaluate the interaction between the NSGA-II and the ACO algorithm. The evaluation results show that both ACO variants find better pick routes if the warehouse assigns its products by applying the NSGA-II algorithm instead of the random storage strategy, thus, the NSGA-II and the ACO algorithm interact well with each other.
In the future, we plan to integrate additional features to further increase the applicability of the storage assignment. First, we want to allow state changes of the mezzanine warehouses while the storage assignment is running which would make some of the solutions in the Pareto front infeasible.
Further, we plan to parallelize the storage assignment algorithm so that the sequential assignment of products is replaced and the execution times will decrease. Regarding the order picking algorithm, we also aim at parallelizing the execution to reduce the required calculation times. Finally, we want to research on integrating forecasts (e.g., based on our previous work [33]) of future assignment and order picking tasks to proactively replace goods within the storage that will be ordered in the near future. Aditionally, a promising option can be to study further algorithms for optimization and different parameter setting. On the one hand, one option is to analyze additional optimization techniques. For example, some multi-objetive evolutionay algorithms with great performance have been reported recently, such as, Multi-strategy co-evolutionary differential evolution for mixed-variable optimization, firefly algorithms with courtship learning, or multi-objective artificial bee colony algorithms. However, those algorithms were applied in different domains. Still, it would be interesting as future work to analyze them in our target domain. On the other hand, we showed In previous work that choosing the best optimization algorithm is situation-aware [18] and, hence, requires to be modeled as a self-adaptive system [15]. Implementing such an approach could also help to identify the best fitting algorithms for a new warehouse design. Rank of product p |P | Size of the product assortment Total quantity of a product in an area idealQ

Appendix: Variable Summary
Ideal quantity of a product in an area C Chromosome representation of an assignment solution dist r i Walking distance to rack i idealDist p,f j Ideal walking distance for product p relRank p Relative rank of product p rackI dx p,f j Rack index of product p on floor f j R f j List of rack on floor f j tq p Target quantity for product p Quantity of products inside a mask sa l Sub aisle with identified l qF actor Factor of actual existing quantity of a product compared to the target quantity maskScore Score for a mask indicating how close the actual quantity of a product is to the target quantity subAisleScore Score for each sub aisle based on the maskScore quantityScore Score for a floor based on the subAisleScore cp Correlated product conf Confidence value of a correlation qClusters Possible clusters of target quantities idealCorrQ Ideal Quantity to which the correlated product should be available in the vicinity of the incoming product corrQ Fraction to which the items of cp stored in r are considered to be in vicinity of p exQ cp,r Existing quantity of the correlated product cp in rack r L Number of generations to be used for calculating the standard deviation of the maximum crowding distance δ lim Threshold for the standard deviation of the crowding distance  Iteration-best pick route that minimizes objective function of i P R i gb Global-best pick route that minimizes objective function of i Funding Open Access funding enabled and organized by Projekt DEAL. No funding was received to assist with the preparation of this manuscript.

Conflict of Interests The authors have no relevant financial or nonfinancial interests to disclose
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons. org/licenses/by/4.0/.