A polynomial algorithm for the earthwork allocation problem with borrow and waste site selection

Abstract In road construction projects, earthwork is planned together with horizontal and vertical alignments. This study focuses on earthwork operations that basically include cutting the hills and filling the holes on the road path. The candidate borrow and waste sites can also be used to obtain or heap earth when the available cut and fill amounts are not balanced or operating these sites reduces the total earthwork cost. Total earthwork cost contains the transportation cost and the overall cost related to opening the candidate sites. The problem is to determine which borrow and waste sites to operate, and the earth flows between cut, fill, waste, and borrow sites such that the total cost is minimized. It is shown that the problem is a generalization of the well-known lot-sizing problem. A fixed charge network flow problem formulation is presented, and a polynomial time dynamic programming algorithm is developed for solving the problem.


Introduction
The earthwork allocation problem with borrow and waste site selection (EAPS) addresses a practical location-allocation decision problem whose workflow and capacity management characteristics are quite different from those in the related logistics literature.
A road construction project mainly requires horizontal and vertical alignments and earthwork planning. Earthwork basically includes cutting the hills (cut/supply sites) and filling the holes (fill/demand sites) on the road path. Exact amounts of earth must be removed from the cut sites and filled into the fill sites. If the distances between cut and fill sites are long or the amounts of available earth supply and demand are not balanced, candidate sites are opened and operated in order to match demand with supply and/or decrease the earth transportation costs. Figure 1 shows an illustration of possible sites for a road construction project. Note that candidate sites are called waste (borrow) sites in the oversupply (undersupply) case.
Opening a waste or borrow site incurs a fixed cost and a variable cost. Fixed cost includes all expenditures for preparations necessary to make the site usable and construction of slip roads necessary to reach the site. Variable cost includes loading/unloading expenditures and transportation costs on the slip road.
We formulate the EAPS on a line network $L = (V, E)$, where $V$ is the node set $\{1, 2, 3, \ldots, n\}$ and $E$ is the edge set $\{(i, i+1) \mid i = 1, 2, 3, \ldots, n-1\}$. Nodes in $V$ represent cut, fill, waste, and borrow sites. Note that a waste (or borrow) site is represented as a node in $V$ at the connection point of its slip road to the main road in construction. Although there may be limitations on the capacities of borrow and waste sites in practical cases, this study assumes that they are uncapacitated. Edge $(i, i+1) \in E$ represents the road segment between nodes $i \in V$ and $(i+1) \in V$. Let $C \subseteq V$ be the set of cut nodes, $F \subseteq V$ the set of fill nodes, $D \subseteq V$ the set of candidate waste nodes, and $Q \subseteq V$ the set of candidate borrow nodes. Let $c_i$ be the amount of material that must be sent from cut node $i$ and $f_i$ be the amount of material that must be sent to fill node $i$. Let $q_i$ be the fixed cost and $p_i$ be the variable cost for node $i \in D \cup Q$. Let $r_{ij}$ be the unit transportation cost between $i \in V$ and $j \in V$. Although different dipper dredgers and trucks can be used for loading/unloading operations and short-haul/long-haul transportation activities, which may require differentiating the related cost items, this study assumes that the variable transportation cost is a linear function of the distance between nodes $i$ and $j$. The problem on $L$ is to determine the set of waste nodes $D' \subseteq D$, the set of borrow nodes $Q' \subseteq Q$, and the amount of material flows between node pairs so that (a) the total material sent from $Q'$ and $C$ to $i \in F$ is equal to $f_i$, (b) the total material sent from $i \in C$ to $D'$ and $F$ is equal to $c_i$, and (c) the total of fixed and variable costs is minimized.
Considering the diverse nature of cut and fill operations, stocking function of waste sites, and supplying functions of borrow sites, we assume that C, F, D, and Q are disjoint sets. All cost, distance, and supply/demand parameters are assumed to be nonnegative for simplicity.
The EAPS can be considered as a mixture of two oppositely structured location problems on a line. If D = C = Ø, i.e., there are no waste and cut nodes, it includes only fill nodes with known requirements on a line and candidate uncapacitated borrow nodes with fixed and variable costs. This is exactly the same problem as the uncapacitated facility location problem (UFLP) on a line in terms of matching demand with supply, where fill nodes represent customer locations with known demands and borrow nodes represent uncapacitated candidate facility locations. The problem is to determine the number and locations of facilities and allocations of customers to facilities such that the total cost is minimized. Now, consider the opposite case in which Q = F = Ø, i.e., there are no borrow and fill nodes. In this case, the EAPS contains only cut nodes and candidate waste nodes. The problem is almost the same as the previous one except that the flows are from customers to facilities.
The UFLP on a line network is equivalent to the uncapacitated lot-sizing problem (ULSP) with backlogging (ULSPB), which is polynomially solvable (Pochet and Wolsey, 1988). Although each of the above special cases of the EAPS is equivalent to the ULSPB, these two separate problems come together and merge on the same line in the EAPS setting so that one can move material from cut and borrow nodes to fill and waste nodes. However, even if we consider cut and borrow nodes as candidate facility locations, we actually do not make an opening decision about a cut node, but determine the material flows from these nodes. On the other hand, even if we consider fill and waste nodes as customer locations, we actually decide to open a waste node and determine the material flow to these nodes.
The earthwork allocation problem without borrow and waste site selections is a polynomially solvable linear programming problem. In the EAPS, however, there are fixed costs related to borrow and waste sites so that a site can operate only if its fixed cost is paid. We present the fixed charge network flow problem formulation of the EAPS in which borrow and waste site selection decisions are represented by integer variables. It is shown that the EAPS is a generalization of the ULSP. To the best of our knowledge, this version of the ULSP has not been studied before in the literature. Using the properties of the optimal solution to the EAPS, we develop a polynomial time dynamic programming algorithm and show that the EAPS is polynomially solvable.
There are various studies related to the EAPS in the road construction, location, lot sizing, distribution, and transportation literature. However, to the best of our knowledge, the relation between the EAPS and the known optimization problems is only partially examined. We review and present these relations in the second section. The third section gives the fixed charge network flow problem formulation of the EAPS. Optimal solution properties of the problem are discussed in the fourth section. Using the properties in the fourth section, the fifth section presents a polynomial time dynamic programming algorithm. The last section concludes the paper.

Literature review
There are several studies on the EAPS and its variants in the construction literature. According to Jayawardane and Harris (1990), one of the common methods used in the literature is the mass-haul diagram, and the earliest linear models in the literature were developed by Mayer and Stark (1981) and Nandgaonkar (1981). Easa (1987) and Easa (1988) consider stepwise and linear unit cost functions, respectively. Jayawardane and Harris (1990) consider equipment fleet combinations that affect the project duration and costs. Moreb (1996) considers the vertical alignment and the earthwork simultaneously. Hare et al (2011) consider removal of physical blocks such as rivers and trees so that removals and earthworks can be performed simultaneously at different sites on the road being constructed. Lima et al (2013) consider the earthwork and paving issues for the cases in which cut and borrow sites have different material properties, fill nodes demand a mix of different materials, and there are material mixing plants. Hare et al (2014) consider the vertical alignment and earthwork allocation problems simultaneously in a dynamic setting with features such as side slopes and physical blocks. They propose two mathematical models (i.e., transportation and network flow type models) in order to compute the cost and show that the flow type model is computationally faster. Burdett and Kozan (2014) and Burdett et al (2015) consider three-dimensional blocks in the earthwork allocation problem instead of considering sections. Son et al (2005) consider the earthwork allocation problem on a plane, rather than on a line. The earthwork area is not limited by roads. They consider the earthwork area in two dimensions (width and length) with smaller rectangular cells. The earth transportation among these cells is determined in order to minimize the total transportation distance. Moreb and Bafail (1994) consider a similar earthwork allocation problem with additional decisions about land leveling.
There are a few studies on the facility location problem on a line in the location literature, some of which are formulated as the p-median and fixed charge facility location problems. There are two main properties that make a location problem on a line easier and lead to efficient algorithms. The first one is the eligibility of non-fractional allocations of demands to facilities, and the second one is to set identical capacities at all facilities. Uncapacitated cases guarantee the validity of the first property. Love (1976) proposes a dynamic programming algorithm to solve the p-median problem. Brimberg and Revelle (1998) consider the UFLP and the p-median problem and show that the linear relaxations of their mixed integer models give the integer optimal solution. Hsu et al (1997) propose an $O(pn^2)$ algorithm for solving the p-facility location problem with $n$ candidate location sites, fixed location costs, and unimodal service cost structures. The location and sizing problems of facilities have also been solved by dynamic programming for the case in which facilities may reach any capacity level at the expense of a fixed cost, which is a continuous nondecreasing function of the capacity. As a result of this capacity-cost relation, the first property is guaranteed. The effect of capacity constraints on the continuous location-allocation problem with at most p facilities has been investigated as well, where the objective is to minimize the total fixed and transportation costs. A dynamic programming algorithm is proposed for the case in which the unit transportation cost is an increasing convex function of the distance and the second property is valid, and the problem is shown to be NP-hard under more general cost structures. Eben-Chaime et al (2002) consider the fixed charge capacitated location-allocation problem and use heuristic solution methods to solve the problem. Mirchandani et al (1996) consider the capacitated facility location problem where facilities must be located within a given neighborhood of customers. Fixed and service costs depend upon their locations on the line. They develop dynamic programming algorithms for (i) locating minimum cost facilities to serve all customers and (ii) maximizing the profit by locating at most p facilities that may not serve all customers. None of these studies consider the supply-demand flows similar to the EAPS.
As we discussed above, the lot-sizing problem called ULSPB is polynomially solvable (Zangwill, 1969; Pochet and Wolsey, 1988, 2006). Zangwill (1969) is the first to formulate the ULSPB as a network flow problem and proposes an $O(n^3)$ dynamic programming algorithm, where $n$ is the number of periods. Pochet and Wolsey (1988) provide a shortest path formulation with $O(n)$ nodes, requiring $O(n^2)$ operations for computing arc weights, which makes the ULSPB solvable in $O(n^2)$ time. The lot-sizing literature also contains several extensions of the problem including capacities, multiple products, multiple levels, and start-ups (e.g., Agra and Constantino, 1999; Cheng et al, 2001; Miller et al, 2003; Denizel and Süral, 2005). A few extensions of the core problem include product returns from the customers and disposals of excess inventory. The generic version of this extension is a special case of the EAPS (see Beltran and Krass, 2002). All other studies add features to the problem such as remanufacturing of returned products, production capacities, or multiple products. Note that when backlogging is included in the lot-sizing problem with returns and disposals, the resulting problem is equivalent to the EAPS. To the authors' best knowledge, there is no such study in this literature.
There are several studies on the transportation problem with fixed costs of flows between supply and demand sites (see Adlakha and Kowalski, 1999). However, to the best of our knowledge, there is no study on the transportation problem that incorporates supply and demand location decisions in addition to the initially given sites.

The fixed charge network flow problem formulation
The EAPS can be represented as the fixed charge network flow problem (FNFP) on a network $G$ […] Fixed and variable charges of using arcs $(0, i)$ and $(i, 0)$ in $G$ are equal to $q_i$ and $p_i$, respectively, where node $i$ corresponds to a candidate borrow node for the former case and a candidate waste node for the latter case. Variable charges of using both arcs $(i, i+1)$ and $(i+1, i)$ for $i = 1, 2, \ldots, n-1$ are equal to $r_{i,i+1}$. The amount of supply (demand) […] For node 0, the supply amount is equal to max{0, […]
Decision variables are defined as follows:
• $u_i$ is the amount of forward material flow from node $i$ to $(i+1)$ for $i = 1, \ldots, (n-1)$. Similarly, $v_i$ is the amount of backward material flow from node $(i+1)$ to $i$ for $i = 1, \ldots, (n-1)$.
• $\gamma_i$ is the amount of material obtained from borrow node $i$ for $i \in Q$. Similarly, $\beta_i$ is the amount of material heaped to waste node $i$ for $i \in D$.
• $y_i$ is equal to 1 if the candidate waste (borrow) node $i$ is opened, and 0 otherwise.
The directions of arcs show the directions of corresponding flows. For example, $u_4$ represents the amount of forward material moved from node 4 to node 5, while $v_3$ represents the amount of backward material moved from node 4 to node 3. Note that the arc with demand $f_1$ (supply $c_3$) is a downward (upward) arc.

FNFP formulation
The objective function (1) minimizes the total cost. Constraints (2)-(5) are material flow balance equations and guarantee that the exact amount of material is removed from (filled into) the cut (fill) nodes, and that the required amount of material is obtained from (heaped to) the borrow (waste) nodes, if necessary. Constraints (6) and (7) set the corresponding binary variable equal to 1 if a shipment occurs from (to) a borrow (waste) node. Constraints (8)-(12) are set and integrality constraints. Note that variables $u_0$, $v_0$, $u_n$, and $v_n$ are fixed at zero, and the balance equation is not written for node 0 because it would be satisfied as a result of constraints (2)-(5).
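The numbered model (1)-(12) itself is not reproduced in this text. The block below is one formulation that is consistent with the verbal description of the objective and constraints, offered as a sketch rather than as the original statement. The symbols $\gamma_i$ and $\beta_i$ (amounts obtained from borrow node $i$ and heaped to waste node $i$), the big-M constant $M$, and the assignment of equation numbers are assumptions.

```latex
\begin{align}
\min\ & \sum_{i \in Q} (q_i y_i + p_i \gamma_i)
      + \sum_{i \in D} (q_i y_i + p_i \beta_i)
      + \sum_{i=1}^{n-1} r_{i,i+1}\,(u_i + v_i) \tag{1}\\
\text{s.t.}\ 
 & u_{i-1} + v_i + c_i = u_i + v_{i-1}, && i \in C \tag{2}\\
 & u_{i-1} + v_i = u_i + v_{i-1} + f_i, && i \in F \tag{3}\\
 & u_{i-1} + v_i + \gamma_i = u_i + v_{i-1}, && i \in Q \tag{4}\\
 & u_{i-1} + v_i = u_i + v_{i-1} + \beta_i, && i \in D \tag{5}\\
 & \gamma_i \le M y_i, && i \in Q \tag{6}\\
 & \beta_i \le M y_i, && i \in D \tag{7}\\
 & u_i \ge 0, && i = 1, \ldots, n-1 \tag{8}\\
 & v_i \ge 0, && i = 1, \ldots, n-1 \tag{9}\\
 & \gamma_i \ge 0, && i \in Q \tag{10}\\
 & \beta_i \ge 0, && i \in D \tag{11}\\
 & y_i \in \{0, 1\}, && i \in D \cup Q \tag{12}
\end{align}
```

In each balance equation the left-hand side collects what enters node $i$ (forward flow $u_{i-1}$ from the left, backward flow $v_i$ from the right, plus any supply) and the right-hand side collects what leaves it, with $u_0 = v_0 = u_n = v_n = 0$.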

Optimal solution properties of the EAPS
It is known that when the values of y variables in FNFP formulation are given, the FNFP reduces to the minimum cost network flow problem, whose optimal solution satisfies the acyclic graph property. Hence, an extreme solution of the FNFP has a tree structure as shown in Figure 3.
Example 1 in Figure 3a displays a possible extreme solution. In this example, borrow node 2 meets the demand at fill node 1, the demand at fill node 6 is satisfied from the supply at cut nodes 3 and 8, and the demand at fill node 9 is satisfied from the supply at cut node 8. The remaining supply at node 8 is sent to waste node 7. Note that no material flows from borrow site 5 to waste site 4, but a supply (cut) quantity from node 3 is sent to node 6 through nodes 4 and 5 for satisfying a part of demand (fill) at node 6.
Example 2 given in Figure 3b illustrates another extreme solution in which waste node 4 and borrow node 2 are open. Node 2 meets the demand at fill node 1, while the demands at fill nodes 6 and 9 are satisfied from the supply at node 8, and the supply at node 3 is sent to node 4. In this example, waste node 7 and borrow node 5 are not used. A supply (cut) from node 8 is sent to node 6 via node 7 for satisfying the demand (fill) at node 6.
Using the tree structure of extreme solutions of the FNFP, we specify the structure of optimal solutions to the EAPS in Observation 1.
Observation 1 There exists an optimal solution to the EAPS where (1) $u_i v_i = 0$ for all $1 \le i < n$ and (2) […]
Definition If an extreme solution of the EAPS satisfies […]
Note that nodes 1 and 2 in Figure 3a form the first segment S[1,2], while nodes from 3 to 9 constitute the second segment S[3,9]. […]
Hüseyin Güden and Haldun Süral-A polynomial algorithm for the earthwork allocation problem
[…] $e = 1, \ldots, k$ and $a_1 = 1$, $b_k = n$, $b_e = a_{e+1} - 1$ for $e = 1, \ldots, (k-1)$. These segments correspond to "regeneration intervals" of the ULSPB (see Pochet and Wolsey, 2006) and generalize them since the ULSPB is a special case of the EAPS. An extreme solution for a nine-period ULSPB instance is illustrated in Figure 4, where S[1,4] and S[5,9] form the first and second regeneration intervals.
Observation 2 […] S[a,b] of the EAPS satisfies the following properties.
1. One of the following three cases is valid: […]
Below we match the five properties for the EAPS in Observation 2 with the corresponding properties of a regeneration interval R[a,b] of the ULSPB in order to highlight the similarities and differences between the two problems.
1. There is exactly one production period.
2. All demands are satisfied from the production period.
3. Demand in a period is entirely satisfied by only inventory, production, or backlogging.
4. Inventory and backlogging flows are separated on the subparts of the regeneration interval. Backlogging flows occur only on the left of the production period, and inventory flows occur only on the right of the production period.
5. There is no flow from the left of the production period to its right or vice versa.
Let $D_{ab}$ and $Q_{ab}$ be the sets of candidate waste and borrow nodes in S[a,b], respectively. We know that one of the following three cases occurs according to Observation 2.1.

A dynamic programming algorithm for the EAPS
The corresponding total cost is $TC^m_{ab} = q_m + p_m \beta_m + \sum_{j=a}^{b-1} r_{j,j+1}(u_j + v_j)$. Note that locating a waste node in S[a,b] can be decided by computing $TC^m_{ab}$ values for all $m \in D_{ab}$ and selecting the minimum cost location among them. If $D_{ab} = \emptyset$, then S[a,b] is not a valid segment of a basic feasible solution. We therefore redefine the total cost term as
$$TC_{ab} = \begin{cases} \min_{m \in D_{ab}} TC^m_{ab} & \text{if } D_{ab} \neq \emptyset, \\ M & \text{otherwise,} \end{cases}$$
where $M$ is a very large positive number.
Case 3 If $c_{ab} < f_{ab}$, then there is only one open borrow node (say at node $m$). Although the computations for this case are similar to Case 2, we present all necessary steps for the sake of completeness. The amount of material obtained from borrow node $m$ is equal to the difference between the fill and cut amounts in S[a,b]. So, we have $\gamma_m = f_{ab} - c_{ab}$. The total cost is $TC^m_{ab} = q_m + p_m \gamma_m + \sum_{j=a}^{b-1} r_{j,j+1}(u_j + v_j)$. Locating or not locating a borrow node in this segment can be decided as defined in Case 2. First, compute $TC^m_{ab}$ values for all $m \in Q_{ab}$ and choose the location node with the minimal cost value. Since $Q_{ab} = \emptyset$ makes the segment S[a,b] invalid as discussed for Case 2, we have again
$$TC_{ab} = \begin{cases} \min_{m \in Q_{ab}} TC^m_{ab} & \text{if } Q_{ab} \neq \emptyset, \\ M & \text{otherwise.} \end{cases}$$
[…] Figure 2, where $f_1 = 10$, $f_6 = f_9 = 5$, $c_3 = 12$, $c_8 = 8$, $q_2 = q_4 = q_5 = q_7 = 100$, $p_2 = p_4 = p_5 = p_7 = 10$, and $r_{i,i+1} = 2$ for $i = 1, \ldots, 8$. Below we illustrate the necessary computations for the associated flows and costs of three candidate segments. Let us consider S[1,9] in Table 1, where $c_{1,9} = f_{1,9} = 20$. It implies that Case 1 is satisfied and there will be no open waste and borrow nodes. The corresponding total cost $TC_{1,9}$ equals $2(10 + 10 + 2 + 2 + 2 + 3 + 3 + 5) = 74$. Now, we consider S[1,8] in Table 2, where $c_{1,8} = 20 > f_{1,8} = 15$, which corresponds to Case 2 that opens a waste node. Let us assume that the waste node is opened at node 4, i.e., $m = 4$ and $\beta_4 = 5$. The resulting total cost $TC^4_{1,8}$ is obtained as $100 + (10)(5) + 2(10 + 10 + 2 + 3 + 3 +$ […]
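The segment computations for the example can be checked with a short script. The sketch below is an illustration under assumptions, not the authors' implementation: the function name `segment_cost`, the signed `supply` encoding (+$c_i$ at cut nodes, $-f_i$ at fill nodes, 0 at candidate sites), and the dictionary layout are all hypothetical. It uses the fact that, on a line with at most one open site, the net flow on edge $(j, j+1)$ is the running balance of the nodes $a, \ldots, j$.

```python
def segment_cost(supply, r, a, b, site=None, q=None, p=None):
    """Cost of segment S[a,b] with at most one open borrow/waste site.

    supply[i]: +c_i at cut nodes, -f_i at fill nodes, 0 at candidate sites.
    r[j]:      unit transportation cost on edge (j, j+1).
    site:      open candidate node m in [a, b], or None for a balanced segment.
    """
    imbalance = sum(supply[i] for i in range(a, b + 1))  # c_ab - f_ab
    if imbalance != 0 and site is None:
        raise ValueError("an unbalanced segment needs one open site")
    if site is not None and not a <= site <= b:
        raise ValueError("the open site must lie inside the segment")

    cost = 0
    running = 0  # net material crossing edge (j, j+1), left to right
    for j in range(a, b):
        running += supply[j]
        if j == site:
            # A waste site absorbs the surplus; a borrow site covers the deficit.
            running -= imbalance
        # running > 0 is forward flow u_j, running < 0 is backward flow v_j.
        cost += r[j] * abs(running)

    if site is not None and imbalance != 0:
        cost += q[site] + p[site] * abs(imbalance)  # fixed + variable site cost
    return cost


# Example data as given in the text (nodes 1..9).
supply = {1: -10, 2: 0, 3: 12, 4: 0, 5: 0, 6: -5, 7: 0, 8: 8, 9: -5}
r = {j: 2 for j in range(1, 9)}
q = {m: 100 for m in (2, 4, 5, 7)}
p = {m: 10 for m in (2, 4, 5, 7)}

print(segment_cost(supply, r, 1, 9))                   # 74
print(segment_cost(supply, r, 1, 8, site=4, q=q, p=p))  # 238
```

For S[1,9] the script reproduces the value 74 stated above. For S[1,8] with the waste node at 4, the running balances give edge flows 10, 10, 2, 3, 3, 8, 8, so this sketch evaluates the cost to 100 + 50 + 2(44) = 238.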
Note that the computational complexity of finding the $u$ and $v$ values (forward and backward flow amounts) for a fixed pair $a$ and $b$ and any $m$ is $O(b-a)$. Similarly, the complexity of computing the total cost for a fixed pair $a$ and $b$ and any $m$ is $O(b-a)$. Since these computations must be repeated for all $m \in D_{ab}$ and $m \in Q_{ab}$ if Cases 2 and 3 are satisfied, respectively, the complexity of computing $TC_{ab}$ is $O((b-a)\max\{|D_{ab}|, |Q_{ab}|\})$. Let $G_b$ be the optimal objective function value of the subproblem including only the first $b$ nodes. Let $G_0 = 0$. Compute $G_b = \min_{1 \le a \le b}\{G_{a-1} + TC_{ab}\}$ for $b = 1, \ldots, n$ sequentially. $G_n$ gives the value of an optimal solution. The optimal solution can be constructed by backtracking.
Since the complexity of computing $TC_{ab}$ is $O((b-a)\max\{|D_{ab}|, |Q_{ab}|\})$ and these computations are done for all pairs of $a$ and $b$ ($1 \le a \le b \le n$) in the above forward DP algorithm, the total computational complexity is $O(n^3 \max\{|D|, |Q|\})$, or simply $O(n^4)$.
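A minimal sketch of the forward DP follows, assuming the recursion $G_b = \min_{1 \le a \le b}\{G_{a-1} + TC_{ab}\}$ with $G_0 = 0$ and the three segment cases (balanced, surplus served by one waste node, deficit served by one borrow node). The function name `solve_eaps`, the signed `supply` encoding, and the use of infinity in place of the large constant $M$ are assumptions made for illustration.

```python
def solve_eaps(n, supply, r, q, p, waste, borrow):
    """Forward DP over segments S[a,b]; returns (G_n, list of (a, b, open_site))."""
    INF = float("inf")  # plays the role of the large constant M

    def best_segment(a, b):
        imbalance = sum(supply[i] for i in range(a, b + 1))  # c_ab - f_ab
        if imbalance == 0:
            sites = [None]                                    # Case 1: no site
        elif imbalance > 0:
            sites = [m for m in sorted(waste) if a <= m <= b]   # Case 2
        else:
            sites = [m for m in sorted(borrow) if a <= m <= b]  # Case 3
        best_cost, best_site = INF, None
        for m in sites:
            cost = 0 if m is None else q[m] + p[m] * abs(imbalance)
            running = 0
            for j in range(a, b):        # O(b - a) flow and cost computation
                running += supply[j]
                if j == m:
                    running -= imbalance
                cost += r[j] * abs(running)
            if cost < best_cost:
                best_cost, best_site = cost, m
        return best_cost, best_site

    G = [0] + [INF] * n
    choice = [None] * (n + 1)
    for b in range(1, n + 1):
        for a in range(1, b + 1):
            cost, m = best_segment(a, b)
            if G[a - 1] + cost < G[b]:
                G[b], choice[b] = G[a - 1] + cost, (a, m)

    if G[n] == INF:
        return INF, []                   # no feasible segmentation
    segments, b = [], n
    while b > 0:                         # backtracking
        a, m = choice[b]
        segments.append((a, b, m))
        b = a - 1
    return G[n], segments[::-1]


supply = {1: -10, 2: 0, 3: 12, 4: 0, 5: 0, 6: -5, 7: 0, 8: 8, 9: -5}
r = {j: 2 for j in range(1, 9)}
q = {m: 100 for m in (2, 4, 5, 7)}
p = {m: 10 for m in (2, 4, 5, 7)}

print(solve_eaps(9, supply, r, q, p, waste={4, 7}, borrow={2, 5}))
# (74, [(1, 9, None)])
```

On the example data, with borrow candidates at nodes 2 and 5 and waste candidates at nodes 4 and 7 as in the Figure 3 discussion, every segmentation that opens a site pays at least the fixed cost of 100, so the single balanced segment S[1,9] with cost 74 is returned. The two nested loops over $(a, b)$ and the inner loops over $m$ and $j$ mirror the $O(n^3 \max\{|D|, |Q|\})$ bound.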
Given that the EAPS is a generalization of the ULSPB, we now examine whether the complexity of the above algorithm can simply be reduced from $O(n^4)$ to $O(n^3)$ time by using the properties of the regeneration intervals in the ULSPB, although they are not valid for the segments in the EAPS (see Observation 2).
Consider nodes $(a-1), a, \ldots, b, (b+1)$. Suppose that node $(b+1)$ (node $(a-1)$) is added to S[a,b] as the last (first) node instead of $b$ ($a$) for an EAPS instance. We assume that the material flow amounts and the associated total cost for the segment S[a,b] are given, and we explore whether the material flows and the associated total cost of S[a,(b + 1)] can be computed with a constant number of operations using the knowledge about S[a,b].
Note that S[a,b] and the new segment S[a,(b + 1)] may fit different cases in Observation 2.1, which makes the information at hand useless for the necessary computations because the flows, open node types, and their locations change. Similarly, if both S[a,b] and S[a,(b + 1)] fit the second (or third) case in Observation 2.1, the number of operations needed to compute the flows and cost in the new segment would be affected largely by the size (i.e., the number of nodes) of the part between the waste or borrow node and node (b + 1). The reason is that when node (b + 1) is added to S[a,b], the flows on this part may increase, decrease, or even change their directions (i.e., some forward flows can turn into backward flows or vice versa). It follows that all flows and the total cost on this part need to be computed from scratch. Now assume that the original segment S[a,b] fits the first case in Observation 2.1. If (b + 1) is a cut or fill node, S[a,(b + 1)] cannot fit the first case because the total cut and fill amounts cannot be equal to each other after adding (b + 1). If (b + 1) is a candidate waste or borrow node, then node (b + 1) would not be opened and no material flow would occur between [a,b] and (b + 1) since the total cut and fill amounts on [a,b] are equal to each other. We thus conclude that the number of operations needed to compute the flows and cost in a new segment depends on the number of nodes, and it is not trivial to perform a constant number of operations using the original information at hand. This remark is also valid for the case of adding node (a - 1) to S[a,b] as the first node to obtain the new segment S[(a - 1),b].
Now, consider a ULSPB instance with $n$ periods. Let $p_k$ and $d_k$ be the unit production cost and demand in period $k$, respectively. Consider a regeneration interval R[a,b]. Let $i$ be the production period in R[a,b] and $TC_{ab}$ be the total cost associated with R[a,b]. For a new regeneration interval R[a,(b + 1)], the material flow on R[a,b] would not be affected by adding (b + 1) because all periods in R[a,b] remain as they are. The periods before the production period satisfy their demands by backlogging, the production period satisfies its demand from itself by production, and the periods after the production period satisfy their demands by inventory. After adding period (b + 1) to the regeneration interval, it satisfies its entire demand from the production period by inventory. Hence, the total cost in the new situation can be computed in a constant number of operations, which is independent of the size of R[a,b], by just adding the term $d_{b+1}(p_i + h_{i,b+1})$, where $h_{i,b+1}$ is the unit inventory holding cost from production period $i$ to period (b + 1). The case of adding the period (a - 1) to R[a,b] as the first period is very similar to the case of adding period (b + 1) as the last period. In this case, the additional cost term is $d_{a-1}(p_i + s_{a-1,i})$, where $s_{a-1,i}$ is the unit backlogging cost from period (a - 1) to production period $i$. Hence, the total cost in the new regeneration interval can also be computed in a constant number of operations, which is independent of the size of R[a,b]. Pochet and Wolsey (1988) use the property of the ULSPB related to Observation 2.5 and consider regeneration interval R[a,b] in three parts in their $O(n^2)$ algorithm for the ULSPB: [a,i), [i], and (i,b], where $i$ is the production period. They show that computing the associated costs for [a,i), [i], and (i,b] requires $O(n^2)$ operations for all (a,i), $O(n)$ operations for all $i$, and $O(n^2)$ operations for all (i,b), respectively. Thus, the ULSPB is converted to a shortest path problem instance solvable in $O(n^2)$ time.
However, according to Observation 2.5, a material flow occurs from one part to another part in the segment of the EAPS so that the flow affects the amount of demand on the opposite part, which would be satisfied from the open waste or borrow node, and its total cost. Therefore, such a division of the segments into the parts is not immediately applicable for the segments of the EAPS.
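The constant-time update available for ULSPB regeneration intervals can be illustrated as follows. This is a sketch under assumptions: the data are hypothetical, `hold[t]` and `back[t]` are assumed per-period unit holding and backlogging costs whose sums give $h_{i,k}$ and $s_{k,i}$, and the fixed setup cost of the production period is left out because it does not change when the interval grows.

```python
from itertools import accumulate

# Hypothetical 5-period instance; index 0 is padding so periods run 1..5.
d    = [0, 4, 0, 7, 3, 5]   # demand d_k
p    = [0, 2, 3, 1, 4, 2]   # unit production cost p_k
hold = [0, 1, 1, 2, 1, 1]   # hold[t]: carry one unit from period t to t+1
back = [0, 3, 2, 2, 3, 3]   # back[t]: delay one unit of period-t demand to t+1

# Prefix sums make h_{i,k} and s_{k,i} constant-time lookups.
H = list(accumulate(hold))  # H[t] = hold[1] + ... + hold[t]
B = list(accumulate(back))  # B[t] = back[1] + ... + back[t]

def h(i, k):
    """Unit holding cost from production period i to period k (i <= k)."""
    return H[k - 1] - H[i - 1]

def s(k, i):
    """Unit backlogging cost from period k to production period i (k <= i)."""
    return B[i - 1] - B[k - 1]

def interval_cost(a, b, i):
    """Variable cost of R[a,b] with production period i, computed from scratch."""
    return sum(d[k] * (p[i] + (h(i, k) if k >= i else s(k, i)))
               for k in range(a, b + 1))

def extend_right(tc, b, i):
    """Cost of R[a,b+1] from the cost of R[a,b] in O(1)."""
    return tc + d[b + 1] * (p[i] + h(i, b + 1))

def extend_left(tc, a, i):
    """Cost of R[a-1,b] from the cost of R[a,b] in O(1)."""
    return tc + d[a - 1] * (p[i] + s(a - 1, i))

tc = interval_cost(2, 4, 3)
print(extend_right(tc, 4, 3) == interval_cost(2, 5, 3))  # True
print(extend_left(tc, 2, 3) == interval_cost(1, 4, 3))   # True
```

The update works because adding a period leaves every existing flow in the interval unchanged. An analogous shortcut is not available for an EAPS segment, where adding a node can change the imbalance and therefore the flows on many edges.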

Conclusion
The earthwork allocation problem with borrow and waste site selection appears in road construction projects and is studied in different forms in the construction literature. We show that the EAPS is an extension of the well-known uncapacitated lot-sizing problem with backlogging and can be solved in $O(n^4)$ time. Extensions of the solution algorithm to the EAPS with capacitated sites and/or nonlinear transportation costs can be considered in future studies. In addition, the current state of the art for road design is to consider vertical alignment and earthwork simultaneously. Future research can explore whether extending the EAPS to include vertical alignment optimization can still be achieved in polynomial time.