1 Introduction

Dry bulk terminals (DBTs) are port terminals that handle and store dry bulk cargo, which is transported unpackaged (without containers) in large quantities. Dry bulk cargo is categorized into major bulks and minor bulks. Major dry bulk cargo includes coal, iron ore and grain. Minor dry bulk cargo comprises steel, plastic pellets, wood chips, sugar, cement and minerals. DBTs are categorized by the flow direction of the bulk material as either import or export terminals. This categorization, together with the type of the bulk material, determines the equipment to be used in handling the materials.

This study is motivated by the world’s largest coal exporting bulk terminal, based in the Newcastle region of Australia. In this DBT, the focus is on enabling the outgoing flow of coal, and hence, the stocks are built to serve the arriving vessels, with each vessel having its own set of stocks. These stocks are assembled from coal railed from the mines. Since the distance between the mine and the terminal can be as large as 400–500 km, as a general policy, a vessel is not berthed until all of its stock has been assembled in the terminal. Once a vessel has berthed, it is loaded with coal in a continuous process via a set of reclaimers (which reclaim the material from the yard) and ship loaders (equipment used for loading the vessel). Every berth has a dedicated ship loader, but a reclaimer can be shared between different berths. However, once a reclaimer is assigned to a particular berth, it must finish loading the vessel at that berth. Since loading a vessel requires two machines simultaneously, there are always at least as many reclaimers as ship loaders. The selection of a reclaimer, for which there are not too many choices, also depends on the area where coal for the vessel has been stored in the terminal. The problem of selecting reclaimers and a more detailed description of the operations in Newcastle can be found in Singh et al. (2011, 2015).

The continuous process of loading the bulk material onto the vessels and the fixed specialized equipment used are the differences between a DBT and a container terminal (CT). Another significant difference between a DBT and a CT is the effect of tides on the terminal. Since vessels carrying bulk material generally have a much larger draught than container vessels, a loaded vessel departing the DBT will be restricted by the tides. More specifically, the vessel cannot depart during a low tide, and hence, its berth assignment should be planned such that its departure coincides with a high tide window.

UNCTAD (2015) reports a steady increase in dry bulk cargo, which signifies the importance of better management of DBTs. One important aspect of better management in a port terminal is the efficient allocation of incoming vessels to the berths. Hence, the focus of this paper is the berth allocation problem (BAP), in particular when the berth is considered as a continuous resource in an export DBT with tidal constraints (BAP_DBT).

The BAP_DBT studied in this paper can be described as follows: The quay is considered as a continuous resource with a limited length, L. There is a set of incoming vessels, V, with different arrival times and different lengths. Each vessel needs to be handled at the terminal for a certain period of time. The main decision in BAP_DBT is to allocate these incoming vessels to the quay. One distinctive constraint in dry bulk terminals is the existence of tides. The vessels may berth and can be handled during a low tide. However once fully loaded, they sit lower in the water and may depart only during a high tide. There are no precedence constraints on the order in which vessels are processed; however, each vessel has an arrival time (release time) that limits how early it can be processed. The objective is to minimize the sum of vessel completion times, or equivalently to minimize the total flow time over all vessels. This equivalency can easily be noticed since the flow time of a vessel is defined as the difference between its completion time and its arrival time. By this definition, the total flow time over all vessels is equal to the sum of vessel completion times minus the sum of vessel arrival times. Since the last part, that is the sum of vessel arrival times, is a constant, minimizing the total flow time will correspond to minimizing the sum of vessel completion times. We remark that if the vessels are differentiated, then a weight should be assigned to each vessel indicating its priority, which will not affect this equivalency.
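
The equivalence can be written out in one line. As an illustration, write \(a_j\) for the arrival time and \(C_j\) for the completion time of vessel j; the symbols \(F_j\) (flow time) and \(w_j\) (priority weight) are introduced here only for this sketch:

$$\begin{aligned} \sum _{j\in V} F_j = \sum _{j\in V} (C_j - a_j) = \sum _{j\in V} C_j - \sum _{j\in V} a_j, \qquad \sum _{j\in V} w_j F_j = \sum _{j\in V} w_j C_j - \sum _{j\in V} w_j a_j. \end{aligned}$$

In both cases the subtracted term depends only on the input data, so the two objectives differ by a constant and share the same optimal solutions.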

We note that if the quay is considered as a discrete resource, then BAP, with all vessels arriving at time zero and with no tidal restrictions, can be modeled as a scheduling problem in parallel machine environment to minimize the total completion time. The discretized version of the continuous BAP can also be viewed as a scheduling problem, where each vessel (job) can be handled (processed) on multiple berths (machines) at a time (this problem is known as multiprocessor task scheduling in the literature) (Guan et al. 2002; Li et al. 1998). Since we do not consider any discretization with respect to the quay for our problem, which would lead to an approximate model of the real-life problem, these scheduling models and hence the solution methodologies proposed for them are not applicable to our problem.

Figure 1 shows an example of a berth allocation with 20 vessels. The horizontal axis represents time in hours, while the vertical axis is for the length along the quay. Vertical bars are used to represent the low tide periods during which vessels may not leave the terminal. The boxes represent the location and timing associated with the processing of each vessel in the allocation.

Fig. 1 Example of a berth allocation. Hatched boxes represent vessels, with the horizontal position giving the timing and the vertical position the location at the berth

The remainder of the paper is organized as follows: In Sect. 2, we present the literature related to our work, establishing the deficiencies of this literature regarding berth allocation problem in dry bulk terminals. In Sect. 3, we introduce two new mixed integer linear programming (MILP) models for this BAP_DBT. The novelty of our models is twofold: (1) They can handle realistic constraints of BAP without any approximations, such as having a continuous quay and dynamic arrivals of the vessels. (2) They incorporate the tidal restrictions, which bring in unavailability periods for the departure of the vessels. We also introduce several properties of an optimal solution together with valid inequalities to increase the efficiency of the models in Sects. 3.2 and 3.3, respectively. After describing our computational experiments to test the performance of these MILP models in Sect. 4, we present the results of these computational experiments and discuss them in Sect. 5. We conclude the paper in Sect. 6.

2 Literature review

As explained in the previous section, although there are some similarities between BAP in DBTs and BAP in CTs, the impact of tidal constraints is much higher in DBTs. Hence, it is necessary to consider the berth allocation problem in dry bulk terminals separately. In this section, we present the previous research on the continuous berth allocation problem only. We note that the studies on discrete BAP in CTs without tidal constraints were excluded since the problem structure is different. The studies on integrating BAP with crane assignment in CTs were also not included since the cranes are not relevant to DBTs, and the assignment of handling equipment is relatively straightforward in DBTs as discussed in Sect. 1.

BAP in container terminals when berths are considered as a continuous resource has been studied by several researchers with different objective functions and under different assumptions. Lim (1998) solved continuous BAP via a heuristic algorithm under the restrictive assumption of constant handling times. Nishimura et al. (2001) proposed a nonlinear integer programming (NLIP) model for BAP with a continuous quay. Due to the difficulty of the problem, they solved it with a genetic algorithm by dividing the continuous quay into segments. Kim and Moon (2003) developed an MILP model for continuous BAP with a cost minimization objective. The cost function comprises delay and handling costs. Since the MILP model can solve instances with only up to seven vessels for a 3-day planning horizon, they suggested a heuristic algorithm based on simulated annealing for large-sized instances (up to 40 vessels). Guan and Cheung (2004) considered continuous BAP to minimize both the handling and the waiting times of the vessels. They differentiated the vessels in the objective function based on their importance and proposed two MILP models. One of the MILP models is similar to that of Kim and Moon (2003), and the other MILP is used to obtain a lower bound. They also proposed a heuristic algorithm to solve large-sized instances. Imai et al. (2005) were the first to consider continuous BAP under the assumption that the handling time of a vessel may change depending on where it moors. They proposed a heuristic to solve the problem by minimizing the total completion times of the vessels. Cordeau et al. (2005) considered both discrete and continuous BAP to minimize the total completion times of the vessels. While they presented mathematical formulations for the discrete case, they proposed a heuristic to solve the continuous BAP. Most recently, Lee et al. (2010) proposed a greedy randomized adaptive search procedure to solve continuous BAP with the objective of minimizing the total weighted flow time and tested its effectiveness with large-sized instances.

In the literature, there are only a few studies that address the berth allocation problem in dry bulk terminals. Barros et al. (2011) dealt with BAP_DBT with tidal and stock-level constraints. Stock-level constraints are important for some dry bulk materials such as minerals, for which there is a continuous process of consumption or production. The authors developed an integer linear programming (ILP) model and a simulated annealing algorithm for the problem. They considered the handling time of a vessel as a multiple of tidal periods, which makes their problem easier. In this way, it is possible to define the problem as a transportation problem, as stated by the authors. Furthermore, they considered a discrete berth setting, which is also easier than a continuous berth setting. Their computational results indicate that the ILP model can solve instances with 10–30 vessels; however, the CPU time can be as large as 8 h. We note that the stock-level constraints of Barros et al. (2011) are not relevant to our problem since, for safety reasons, a vessel is not berthed until all its material has been railed and stored in the terminal, as explained in Sect. 1.

Umang et al. (2013) studied continuous BAP in a dry bulk terminal, which holds different dry bulk cargo, by discretizing the quay. In this setting, a vessel can occupy more than one section. Their aim is to minimize the total service times of the vessels. They developed an MILP model and a model based on the generalized set partitioning problem (GSPP). Due to the difficulty of the problem, they then proposed a heuristic algorithm, namely squeaky wheel optimization, for solving the BAP. Their computational results show that the MILP model can solve only instances with 10 vessels within a 2-h time limit. On the other hand, the model based on GSPP can solve all instances. However, it should be noted that the size of this model grows very fast as the instance size gets larger.

Abdekhodaee and Wirth (2013) considered the tidal constraint within the context of a single machine (single berth) scheduling problem with a minimum makespan objective. They showed that even this simplified problem is NP-complete. Xu et al. (2012) studied discrete BAP in CTs with one tidal period. As a consequence of assuming discrete berths and considering just one tidal period, they could model the problem as a parallel machine scheduling problem. They then analyzed the computational complexity of the problem and concluded that the problem is computationally intractable. Subsequently, they presented heuristic algorithms to solve the problem. The authors stated that their model produces better results compared to the one without considering tidal effect. Du et al. (2015) considered continuous BAP in CTs to analyze the effects of tides. The authors first provided an MIP model without the tidal constraints and then accommodated tidal constraints by applying two different arrival policies. These models assume that both the arrival and the departure of the vessels are constrained by the tides, and the vessels, which finish their loading, will wait to depart until a high tide period. With extensive computational experiments, they addressed several managerial questions. Two aspects differentiate our work from that of Du et al. (2015): (a) the arrival of the vessels is not constrained by the tides since we focus on an export DBT, and (b) the assignment of berths is done so that the vessels do not wait at the berth for departure.

Apart from similarities to scheduling problems that were discussed in Sect. 1, we can observe that the BAP_DBT has similarities with strip packing problems, if we represent each vessel as a rectangle. In this representation, the length of the quay is the width and the time dimension is the length of the strip. The length of each vessel will be the width of each rectangle to be packed, with the duration that a vessel spends in the quay determining the length of the rectangle. This problem can be viewed as a strip packing problem with fixed orientation to minimize the waste, where each object to be placed on the strip will represent a vessel. However, again the tidal constraints impose a special structure in the BAP_DBT, so that the mathematical models and the solution procedures developed for strip packing cannot be used directly for the BAP_DBT (Arahori et al. 2012; Castro and Oliveira 2011).

3 Mathematical models for BAP_DBT

In this section, we present two new MILP mathematical models for the continuous berth allocation problem in dry bulk terminals with tidal constraints. In these models, we consider the berthing area in two dimensions: the vertical axis denoting the length of the continuous quay and the horizontal axis denoting the time periods within the planning horizon of the quay. Then, each vessel is considered as a rectangle, where the length of the vessel will be on the vertical axis and the handling time of the vessel in the port will be on the horizontal axis. In this representation, the bottom-left corner of a rectangle will denote the start time of handling the respective vessel on the horizontal axis and also the placement of the vessel at the quay on the vertical axis. Hence, the bottom-left corner plus the handling time of a vessel will indicate the departure time of a vessel on the horizontal axis, and bottom-left corner plus the length of the vessel will indicate the end of its placement on the vertical axis.
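
The rectangle view above can be made concrete with a small feasibility check. The following is a minimal sketch, not part of the models in this paper: the class `Placement` and both function names are hypothetical, and rectangles that merely touch along an edge are assumed to be compatible.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Placement:
    """One vessel drawn as a rectangle in the time/quay plane (hypothetical helper)."""
    start: float      # start of handling (horizontal coordinate of bottom-left corner)
    duration: float   # handling time (horizontal extent)
    position: float   # berthing position (vertical coordinate of bottom-left corner)
    length: float     # vessel length (vertical extent)


def overlaps(a: Placement, b: Placement) -> bool:
    """True if the two rectangles share interior points in both time and space."""
    in_time = a.start < b.start + b.duration and b.start < a.start + a.duration
    in_space = a.position < b.position + b.length and b.position < a.position + a.length
    return in_time and in_space


def is_packing_feasible(placements, quay_length) -> bool:
    """Check that every vessel lies within the quay and no two rectangles overlap."""
    for r in placements:
        if r.position < 0 or r.position + r.length > quay_length:
            return False
    return not any(overlaps(a, b) for a, b in combinations(placements, 2))
```

This check covers only the packing aspect; arrival times and tidal restrictions are deliberately left out.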

The first MILP model (\(\mathcal {S}\)), given in Sect. 3.1, is based on sequence variables, whereas the second MILP model (\(\mathcal {TI}\)), given in Sect. 3.4, is based on time-indexed variables.

For both models, the following notation is used. The quay is considered as a continuous resource with a limited length, L. Vessels, indexed by \(j\in V=\{1,2,\ldots ,|V|\}\), have arrival times \(a_j\) at the terminal and may have to wait before service. Each vessel j has to be handled, at any part of the berth, for a given handling time \(p_{j}\) without interruption and occupies a length \(L_j\) of the berth, \(j\in V\). There are also a number of high tide periods, which must be used for the departure of the vessels. A high tide period is denoted by \([B_i, E_i]\), where \(B_i\) denotes the beginning and \(E_i\) the end of the high tide period, with \(i\in H\), where H is the set of high tides in a given planning horizon T. We note that a vessel arrives empty at the port and can be handled during a low tide; however, it requires a high tide period to be able to depart from the port.
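
To illustrate how the high tide periods interact with \(a_j\) and \(p_j\), the sketch below computes the earliest completion time of a single vessel when all other vessels are ignored. It assumes, as in the models that follow, that the vessel leaves as soon as handling ends, so the start is delayed rather than the departure. The function name and the tide data in the usage note are hypothetical.

```python
def earliest_departure(arrival, handling, high_tides):
    """Earliest (start, completion) pair for one vessel, ignoring all other vessels.

    high_tides is a list of (B_i, E_i) pairs. Handling is uninterrupted, so
    completion = start + handling, and completion must lie inside a high tide.
    Returns None if no high tide window in the list can be reached.
    """
    earliest = arrival + handling            # cannot complete before a_j + p_j
    for begin, end in sorted(high_tides):
        if end >= earliest:                  # this window is still reachable
            completion = max(begin, earliest)
            return completion - handling, completion
    return None
```

For example, with hypothetical high tides `[(0, 4), (12, 16), (24, 28)]`, a vessel arriving at time 1 with a handling time of 5 cannot use the first window and is best started at time 7 so that it completes at time 12.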

It should be noted that even if we dropped the tidal constraint and had \(L_j=L\ \forall \ j\in V\), the underlying problem would be a scheduling problem in parallel machine environment to minimize the total completion time with release times, as stated in Sect. 1, which is an NP-hard problem. Thus, the problem studied is also difficult to solve. Moreover, as noted in Abdekhodaee and Wirth (2013), the concept of forbidden zones created by tidal constraints is sufficient to make a scheduling problem NP-hard. Hence, dealing with the tidal constraints as well as the packing within the available berth space makes the problem studied computationally intractable, and handling these two aspects is the main focus of the formulations below.

3.1 Formulation of \(\mathcal {S}\)

In our first model, \(\mathcal {S}\), three sets of binary variables are considered:

  1. (i)

    Variables \(x_{jk}\) are associated with the decision of sequencing the vessels over the time horizon:

    $$\begin{aligned} x_{jk} = \left\{ \begin{array}{l@{\quad }l} 1 &{}\text{ if } \text{ vessel }~j~\text{ completes } \text{ before } \text{ vessel }\\ &{}\quad k~\text{ starts } \text{ to } \text{ be } \text{ processed } \\ 0 &{}\text{ otherwise } \end{array} \right. \end{aligned}$$
  2. (ii)

    Variables \(z_{ji}\) are associated with the decision of using the high tide period i or not:

    $$\begin{aligned} z_{ji} = \left\{ \begin{array}{ll} 1 \quad &{}\text{ if } \text{ vessel }~j~\text{ completes } \text{ in } \text{ high } \text{ tide } \text{ period }~i \\ 0 \quad &{}\text{ otherwise. } \end{array} \right. \end{aligned}$$
  3. (iii)

    Variables \(I_{jk}\) are associated with the decision of sequencing the vessels over the quay side. In other words, a value of 1 for this decision variable will indicate that vessel j and vessel k are overlapping in time, and they are berthed at different locations of the quay during that time period:

    $$\begin{aligned} I_{jk} = \left\{ \begin{array}{l@{\quad }l} 1 &{}\text{ if } \text{ vessel }~j~\text{ is } \text{ handled } \text{ at } \text{ the } \text{ right } \text{ side }\\ &{}\quad \text{ of } \text{ vessel }~k \\ 0 &{}\text{ otherwise. } \end{array} \right. \end{aligned}$$

Moreover, three sets of continuous variables are considered:

  1. (iv)

    \(S_{j}\) : time at which vessel j begins to be handled.

  2. (v)

    \(C_{j}\): time at which the handling of vessel j is completed.

  3. (vi)

    \(y_{j}\): position along the quay to which vessel j is assigned.

We now present our model \(\mathcal {S}\).

figure a

The objective function (1) expresses minimization of total completion time of vessels. Constraint set (2) expresses the position of vessel j with respect to vessel k, which can be one of four cases: (i) either to the right or to the left along the quay and (ii) either before or after along the time horizon.

Constraint set (3) ensures that the completion time of a vessel is the sum of its start time and its handling time. Constraint set (4) imposes that the completion time of a vessel should be greater than or equal to the start time of the high tide period in which the vessel completes. Constraint set (5) similarly expresses an upper bound on the start time of a vessel. Constraint set (6) determines the sequence of two vessels along the time horizon. Here, \(M_k=E_H - a_k\) is a suitably large constant to ensure that \(S_k\) is unconstrained if vessel j is not completed before vessel k. Constraint set (7) indicates that each vessel should be handled in at most one high tide period. Constraint set (8) ensures that two vessels will not overlap during their berthing. Constraint set (9) states that the start time of each vessel should be greater than or equal to its arrival time.
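
The choice of \(M_k\) can be checked with a short calculation. The sketch below assumes that constraint (6) has the usual big-M form \(S_k \ge C_j - M_k(1-x_{jk})\), which matches the form of (46) in Sect. 3.4:

$$\begin{aligned} x_{jk}=1:&\quad S_k \ge C_j, \\ x_{jk}=0:&\quad S_k \ge C_j - (E_H - a_k) = a_k - (E_H - C_j). \end{aligned}$$

Since every vessel must complete no later than the end of the last high tide, \(C_j \le E_H\), and the right-hand side in the second case is at most \(a_k\). The constraint is then implied by (9) and places no additional restriction on \(S_k\).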

This relative position formulation has some similarities to the berth scheduling models without tidal constraints presented in Guan and Cheung (2004) and Kim and Moon (2003). However, the basic formulation is not very tight and hence does not perform well computationally. In the next section, we discuss some ways to improve the formulation.

3.2 Properties of the problem and the optimal solution

In this section, we present several properties of the problem and the optimal solution, which allow us to preprocess the mathematical formulation or add constraints to the model.

  1. 1.

    A vessel, \(j \in V\), cannot be completed before its arrival time plus its required handling time (\(a_j+p_j\)), and therefore, all the variables \(z_{ji}\) for high tide periods that end before this time must be zero:

    $$\begin{aligned} \sum _{i \in H: E_i < a_j+p_j} z_{ji} = 0&\forall j \in V \end{aligned}$$
    (15)
  2. 2.

    For any two vessels, if the combined length of the vessels is larger than the length of the quay, then these two vessels will never overlap in time. This can be added by fixing the appropriate \(I_{jk}\) variables to zero using the following constraint:

    $$\begin{aligned} \sum _{k \in V: L_j+L_k > L} I_{jk} = 0&\forall j \in V \end{aligned}$$
    (16)
  3. 3.

    Consider any subset \(V_3\subseteq V\) such that \(|V_3|=3\) and \(L < \sum _{n\in V_3} L_n\); then the three vessels in \(V_3\) cannot all overlap in time. In other words, at least one of the vessels in \(V_3\) must be completed before another vessel in \(V_3\) starts. Since there are potentially \(O(|V|^3)\) such constraints, we choose to include only “minimal” sets of this kind by considering subsets where \( \sum _{n\in V_3} L_n \le L+\min _{n\in V_3} L_n\). A similar set of constraints can be written for \(V_4\subseteq V\) such that \(|V_4|=4\). The constraints can be included in our model as follows:

    figure b
  4. 4.

    The solutions of the problem exhibit significant symmetry. Often there are multiple vessels that have the same handling time and the same length (or nearly the same length) that can be easily swapped in the optimal solution without affecting the objective function value or feasibility of the solution. In order to make the model more tractable, we would like to eliminate as much of this symmetry as possible.

    In the case where vessels have identical length and handling time, we can simply break the ties. Let \(\prec \) be an ordering of the vessels such that \(j\prec k\Rightarrow a_j\le a_k\), then:

    $$ \begin{aligned}&C_j\le C_k \quad \mathrm {and}\quad x_{kj}=0\quad \forall j\prec k\in V: \nonumber \\&\quad p_j=p_k\ \& \ L_j=L_k \end{aligned}$$
    (19)

    When the vessels are not of identical length, then it becomes more difficult to enforce an ordering as the ordering may depend on the fit with other vessels that are berthed at the same time. In some of the datasets, we observed large vessels of diverse lengths but equal handling times that were simply sequenced in an arbitrary order. We can eliminate this symmetry with the following constraints:

    $$ \begin{aligned}&x_{kj}\le \sum _{v\in V\setminus \{j,k\}} (I_{vj}+I_{jv}) \quad \forall \ j\prec k \in V:\nonumber \\&\quad p_j=p_k\ \& \ L_j < L_k \end{aligned}$$
    (20)

    That is, we process the earlier vessel first unless there are other vessels that are being processed at the same time as the smaller vessel (so that a swap may not be feasible). An analogous constraint applies for the case where \(L_k < L_j\). To minimize the number of slack constraints, we only apply these in the case where \(\min \{L_j,L_k\}\ge L/2\).
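
Properties 2 and 3 amount to a preprocessing pass over the vessel lengths. The following is a minimal sketch of that pass, assuming the lengths are held in a dictionary keyed by vessel index; the function names and the data in the usage note are hypothetical.

```python
from itertools import combinations


def conflicting_pairs(lengths, quay_length):
    """Pairs (j, k) with L_j + L_k > L, which can never be berthed simultaneously."""
    return [(j, k) for j, k in combinations(sorted(lengths), 2)
            if lengths[j] + lengths[k] > quay_length]


def minimal_conflicting_triples(lengths, quay_length):
    """Triples whose total length exceeds L but by no more than the shortest member."""
    triples = []
    for trio in combinations(sorted(lengths), 3):
        total = sum(lengths[v] for v in trio)
        shortest = min(lengths[v] for v in trio)
        # "minimal" filter: L < sum of lengths <= L + shortest length
        if quay_length < total <= quay_length + shortest:
            triples.append(trio)
    return triples
```

With hypothetical lengths `{1: 300, 2: 250, 3: 200, 4: 100}` and a quay of length 500, the only conflicting pair is `(1, 2)` and the minimal triples are `(1, 3, 4)` and `(2, 3, 4)`. A triple that fails the filter has its two longest vessels already exceeding L together, so it is covered by the pairwise condition in (16).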

3.3 Tightening of the \(\mathcal {S}\) Model

By taking some characteristics of the problem into account, we propose several cuts to be employed within the MILP model given in Sect. 3.1. The following ways of tightening the formulation are considered, most of which involve constraints that are redundant for the integer program but expected to lead to a tighter MILP formulation:

  1. 1.

    If a vessel k is completed before the arrival time of another vessel j, then we want to enforce the appropriate ordering in the x variables:

    figure c

    In other words, for each pair of vessels j and k, we can determine the earliest time at which vessel k could complete after vessel j (\(a_j+p_j+p_k\)). If vessel k is assigned to a high tide period that ends before this time, then vessel k cannot be after vessel j (21). Similarly, (22) states that if vessel k completes in a high tide window before the arrival time of vessel j, then vessel k must be before vessel j chronologically.

  2. 2.

    We can tighten the constraints (4), (6) and (8), respectively, as follows:

    figure d

    Each of these constraint variants is simply reducing the size of the “Big M” used in the basic formulation leading to potentially tighter bounds. For example in (23), we just use the earliest completion time of vessel j to get a tighter lower bound on \(C_j\) than if we simply used the start of the high tide window i (\(B_i\)). In (24), the start time of vessel k is bounded below by the completion time of vessel j, provided that vessel j completes before vessel k (\(x_{jk}=1\)) as per (6).

  3. 3.

    Initial investigation of the formulation indicated that the z variables, which define the time interval in which a vessel completes, play an important role for the solver. An additional set of constraints can be developed by refining the time when loading completes. We divide each high tide period into D smaller intervals. Let \({{\bar{z}}}_{jts}\) be a binary variable that is one if vessel j completes during the \(s\)th interval of high tide t; then we get the following additional constraints:

    figure e

    Here, \(B_{ts}\) and \(E_{ts}\) represent the start and the end of interval s within high tide period t. When all the data are integer, so that completion times must be integer, these can be non-adjacent so that \(B_{t,s+1}=E_{ts}+1\). These constraints are analogous to (4), (5) and (7) for the z variables. However, this approach is effective even if we choose a smaller set of coarser intervals. The new variables allow us to carry out a partial discretization of the time domain, though the final solution is still continuous in both time and space (length along the berth). For the computational experiments in this paper, we used \(D=4\).

    On their own, the above \({{\bar{z}}}\) variable definitions are not very useful (though they can encourage the solver to branch on completion times rather than x or I variables). However, we can write some additional tightening constraints in terms of these variables as described below.

  4. 4.

    At each \(B_{ts}\), the total length of vessels berthed can be at most L:

    $$\begin{aligned}&\sum _{j\in V}\ \sum _{\tau \in H: B_\tau - B_{ts}< p_j}\ \sum _{\sigma \in D: E_{\tau \sigma }-B_{ts} < p_j} L_j\, {{\bar{z}}}_{j\tau \sigma } \le L\qquad \forall \ t\in H,\ s\in D \end{aligned}$$
    (29)

    That is we sum all \({{\bar{z}}}\) variables that correspond to completion times which would overlap with \(B_{ts}\). Similar ideas are used under the name “capacity bounds” in resource constrained scheduling problems (e.g., Haouari et al. 2012). As a small performance tuning step, we can eliminate the sum over \(\sigma \in D\) by using \(z_{j\tau }\) where any completion time in that high tide period would result in an overlap with \(B_{ts}\). This gives an equivalent formulation, but reducing the number of nonzeros tends to have a beneficial effect on the speed at which the LP relaxations can be solved.

  5. 5.

    A related set of constraints can be defined to simply limit the “area” (length of vessel times its duration) during a high tide window.

    $$\begin{aligned}&\sum _{t\in H} {\mathop {\mathop {\sum }\limits _{s\in D:}}\limits _{B_{ts} \ge B_i}} {\mathop {\mathop {\sum }\limits _{j\in V:}}\limits _{E_{ts}-p_j < E_i}} L_j\,(\min \{E_i,B_{ts}\}\nonumber \\&\quad -\max \{B_i,E_{ts}-p_j\})\,{{\bar{z}}}_{jts} \le L\,(E_i-B_i)\quad \forall \ i\in H\nonumber \\ \end{aligned}$$
    (30)

    This simply computes the minimum duration for which each vessel j has to overlap the window \([B_i,E_i]\) based on the “worst” completion time of a vessel in \([B_{ts},E_{ts}]\) (if vessel j finishes in that window). The total duration times length of the vessel cannot exceed the available area \(L\times (E_i-B_i)\).

  6. 6.

    We include knapsack cover cut style constraints based on (29). These ensure that we have no more than n vessels that have length greater than \(L/(n+1)\) occupying a point in time. This can be defined for any integer n, though in practice we find that it only makes sense for a few small values of n (the implementation tested in this paper uses \(n\in \{1,2,3,5,10\}\)).

    $$\begin{aligned}&\sum _{j\in V:L_j>\frac{L}{n+1}}\ \sum _{\tau \in H:B_\tau -B_{ts}<p_j}\ \sum _{\sigma \in D: E_{\tau \sigma }-B_{ts} < p_j} {{\bar{z}}}_{j\tau \sigma } \le n \qquad \forall t\in H, s\in D \end{aligned}$$
    (31)

    A related constraint counts the number of “large” vessels occupying a high tide window. Here “large” means occupying at least half the high tide window in duration and length greater than \(L/(n+1)\) as before.

    $$\begin{aligned}&\sum _{j\in V:L_j>\frac{L}{n+1}}\ \sum _{\tau \ge t}\ \sum _{\sigma \in D: p_j-(E_{\tau \sigma }-E_t) > \frac{1}{2}(E_t-B_t)} {{\bar{z}}}_{j\tau \sigma } \le 2n \qquad \forall t\in H \end{aligned}$$
    (32)
  7. 7.

    We can tighten the bound on start times slightly by considering which vessels are scheduled to be loaded before the current vessel:

    figure f

    The first set of constraints is analogous to the area constraints (30) but considers the area between the arrival of the first vessel and the start of loading of vessel j. To understand the correctness of the second set of constraints consider the case \(n=2\). We are then summing the duration of all vessels k that are completed before vessel j and have a length greater than L / 3. The length limit means that no more than two of them can be berthed simultaneously. Hence, the total time elapsed since the start of the first vessel has to be at least as much as half the total handling time of these vessels. As for equations (31), we only include these constraints for small values of n.

  8. 8.

    There is often some slack or arbitrary ordering along the berths. We can break some of these by enforcing

    $$\begin{aligned}&y_j \le \sum _{k\in V} L_k\,I_{kj} \qquad \forall \ j\in V. \end{aligned}$$
    (35)

    The following constraints are based on the fact that we can have at most two different vessels of same/longer duration that are berthed at any point below (to the right) of vessel j. The constraints are logically correct but do not appear to help very much.

    $$ \begin{aligned} y_j\ge & {} \sum _{k\in V:p_k\ge p_j} \frac{1}{2}\,L_k\,I_{kj} \nonumber \\&+ {\mathop {\mathop {\sum }\limits _{k\in V: p_k< p_j}}\limits _{ \& p_k\ge p_j/2}} \frac{1}{3} \,L_k\,I_{kj} \qquad \forall \ j\in V. \end{aligned}$$
    (36)

We will refer to the mathematical model with these valid inequalities (15)–(36) as \(\mathcal {S}\_VI\) and compare its performance with the original model, that is \(\mathcal {S}\), in our computational experiments in Sect. 5.
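
The partial discretization behind the \({{\bar{z}}}\) variables and the capacity idea behind (29) can be illustrated on a concrete schedule. The sketch below is an assumption-laden illustration rather than the implementation used in the experiments: it assumes integer time data, splits a window as evenly as possible, and evaluates the berthed length directly from known completion times instead of from \({{\bar{z}}}\) variables. Both function names are hypothetical.

```python
def split_window(begin, end, parts):
    """Split the integer window [begin, end] into at most `parts` intervals.

    Intervals are non-adjacent in the sense used in the text:
    the next interval starts at the previous end plus one.
    """
    points = end - begin + 1
    parts = min(parts, points)               # never create empty intervals
    base, extra = divmod(points, parts)
    intervals, b = [], begin
    for s in range(parts):
        size = base + (1 if s < extra else 0)
        intervals.append((b, b + size - 1))  # (B_ts, E_ts)
        b += size
    return intervals


def berthed_length_at(time, schedule):
    """Total length berthed at `time`; schedule holds (completion, handling, length)."""
    return sum(length for completion, handling, length in schedule
               if completion - handling <= time < completion)
```

For a high tide window `[10, 17]` and `D = 4`, `split_window` returns `[(10, 11), (12, 13), (14, 15), (16, 17)]`. Checking `berthed_length_at(B_ts, schedule) <= L` at each interval start is the schedule-level counterpart of (29).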

3.4 Formulation of \(\mathcal {TI}\)

In this section, we present a time-index based formulation of the problem, which builds on the \(\mathcal {S}\) model given in Sect. 3.1. We define a new binary variable, \(q_{jt}\), which is 1 if vessel j is completed by time t and 0 otherwise. Unlike the previous formulation, this makes the assumption commonly found in the scheduling literature, that all time data are integer. In our formulation, time is in integer number of hours for a horizon of 1–2 weeks. Along with this new variable, we also use some of the variables defined in the previous formulation. For convenience, we also define a set HT, which is the collection of all time points within a high tide window, i.e., \(HT=\{t: B_i\le t \le E_i \text{ for } \text{ some } i \le |H| \}\). The complete formulation is presented below.

$$\begin{aligned} \text{ Min } \sum _{j} C_{j} \end{aligned}$$
(37)

subject to

$$\begin{aligned} x_{jk} + x_{kj} + I_{jk} + I_{kj} = 1&\quad \forall j, k \end{aligned}$$
(38)
$$\begin{aligned} S_{j} \ge a_j&\quad \forall j \end{aligned}$$
(39)
$$\begin{aligned} C_{j} = S_{j} + p_{j}&\quad \forall j \end{aligned}$$
(40)
$$\begin{aligned} C_{j} = \sum _{t=1}^{E_H-1} t(q_{j(t+1)}-q_{jt})&\quad \forall j \end{aligned}$$
(41)
$$\begin{aligned} q_{j(t+1)} \ge q_{jt}&\quad \forall j, 1\le t \le E_H-1 \end{aligned}$$
(42)
$$\begin{aligned} q_{j(t-1)} = q_{jt}&\quad \forall j, t \notin HT \end{aligned}$$
(43)
$$\begin{aligned} q_{jE_H} =1&\quad \forall j \end{aligned}$$
(44)
$$\begin{aligned} \sum _jL_{j}(q_{j(t+p_j)}-q_{jt}) \le L&\quad \forall t \in HT \end{aligned}$$
(45)
$$\begin{aligned} S_{k} \ge C_{j} - M (1- x_{jk})&\quad \forall j, k \end{aligned}$$
(46)
$$\begin{aligned} y_{k} \ge L_{j} I_{jk} + y_{j} - L (1- I_{jk})&\quad \forall j,k \end{aligned}$$
(47)
$$\begin{aligned} x_{jk} \in \{0, 1\}&\quad \forall j,k \end{aligned}$$
(48)
$$\begin{aligned} q_{jt} \in \{0, 1\}&\quad \forall j, 1\le t \le E_H \end{aligned}$$
(49)
$$\begin{aligned} I_{jk} \in \{0, 1\}&\quad \forall j, k \end{aligned}$$
(50)
$$\begin{aligned} S_{j} \ge 0&\quad \forall j \end{aligned}$$
(51)
$$\begin{aligned} C_{j} \ge 0&\quad \forall j \end{aligned}$$
(52)
$$\begin{aligned} y_{j} \ge 0&\quad \forall j. \end{aligned}$$
(53)

The objective function (37) minimizes the total completion time of the vessels. Constraint set (38) fixes the relative position of vessels j and k: exactly one of four cases holds, namely one vessel precedes the other along the time horizon (\(x_{jk}\) or \(x_{kj}\)) or one vessel is berthed to one side of the other along the quay (\(I_{jk}\) or \(I_{kj}\)).

Constraint set (39) states that the start time of each vessel must be greater than or equal to its arrival time. Constraint set (40) defines the completion time of a vessel as the sum of its start time and its handling time. Constraint set (41) links the completion time of a vessel to the time-indexed variables, so that it equals the time at which the vessel is actually completed. Constraint set (42) ensures the “cumulative” definition of the variable, i.e., once a vessel is completed, it stays completed. Constraint set (43) dictates that no vessel is completed outside a high tide window. Constraint set (44) ensures that every vessel is completed. Constraint set (45) states that the total berthing capacity is not violated. Note that since a vessel can only be completed within a high tide window, it is only necessary to add this constraint at time points within a high tide window. Constraint set (46) determines the sequence of two vessels along the time horizon. Constraint set (47) ensures that two vessels do not overlap along the quay while they are berthed.
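As an illustration of how constraints (38)–(40), (43), (46) and (47) fit together, the following minimal Python sketch checks a candidate schedule given directly by start times and berth positions. The names `Vessel`, `high_tide_points` and `check_schedule` are hypothetical and not part of the implementation used in this paper, and the sketch additionally assumes that each vessel must lie within the quay, i.e., \(0 \le y_j\) and \(y_j + L_j \le L\). The data in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class Vessel:
    arrival: int    # a_j
    handling: int   # p_j
    length: float   # L_j


def high_tide_points(windows):
    """Build HT: all integer time points t with B_i <= t <= E_i for some i."""
    return {t for b, e in windows for t in range(b, e + 1)}


def check_schedule(vessels, start, pos, quay_len, ht):
    """Return a list of human-readable violations (empty if none are found)."""
    problems = []
    comp = [s + v.handling for v, s in zip(vessels, start)]  # (40)
    for j, v in enumerate(vessels):
        if start[j] < v.arrival:                             # (39)
            problems.append(f"vessel {j} starts before arrival")
        if comp[j] not in ht:                                # (43)
            problems.append(f"vessel {j} completes outside high tide")
        # Assumption of this sketch: the vessel has to fit on the quay.
        if pos[j] < 0 or pos[j] + v.length > quay_len:
            problems.append(f"vessel {j} exceeds the quay")
    for j in range(len(vessels)):
        for k in range(j + 1, len(vessels)):
            # (46): one vessel finishes before the other starts.
            apart_in_time = comp[j] <= start[k] or comp[k] <= start[j]
            # (47): one vessel lies entirely beside the other.
            apart_in_space = (pos[j] + vessels[j].length <= pos[k]
                              or pos[k] + vessels[k].length <= pos[j])
            if not (apart_in_time or apart_in_space):        # (38)
                problems.append(f"vessels {j} and {k} overlap")
    return problems


ht = high_tide_points([(4, 5), (8, 9)])
fleet = [Vessel(0, 4, 2.0), Vessel(0, 4, 2.0)]
print(check_schedule(fleet, start=[0, 4], pos=[0.0, 0.0], quay_len=3.0, ht=ht))
print(check_schedule(fleet, start=[0, 0], pos=[0.0, 1.0], quay_len=3.0, ht=ht))
```

In this toy example, the first schedule berths the two vessels one after the other and passes all checks, whereas the second berths them simultaneously on a quay that is too short for both and is reported as an overlap.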

To assess the performance of this time-index-based formulation, we first added the tightening constraints from Sect. 3.3 to it. In our preliminary computational experiments, the running time of this model was longer than that of \(\mathcal {S}\_VI\), so we enhanced its performance using a two-phase approach:

  1. The above formulation without the x, I and y variables (and the corresponding constraints) provides a valid lower bound on the problem (call it the LB formulation). We first solve the LB formulation to optimality.

  2. In the second phase, the solution of the first phase, which simply assigns vessels to time windows without resolving positions along the berth or an exact ordering of vessels, is passed as a starting solution to CPLEX. In this way, we provide both a lower bound and a guide in the search for a feasible solution. In many cases, it is possible to construct a complete berth allocation with the time-indexed formulation that matches the assignment of vessels to high tide windows found in the first phase. As a result, in approximately 70% of cases no branch-and-bound nodes were required (though this is less common for larger instances).

The objective value from phase 1 is passed as a bound on the objective value of the phase 2 problem, and the solution from phase 1 is given as a starting solution (note that we do not fix the variables but use CPLEX’s “starting solution” feature to warm start phase 2). In some cases, the starting solution is infeasible for the phase 2 problem, in which case CPLEX simply ignores it. As our experiments show, however, more often than not CPLEX is able to find a good feasible solution thanks to this warm start. We note that phase 1 yields a bound because it only solves for the “time slot” of each vessel via the time-indexed variables and does not assign any order or position along the berth.
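The reason phase 1 yields a bound can be sketched in schedule terms: it keeps only the timing conditions and an aggregate length check in the spirit of (45), so every schedule that is feasible for the full model also passes it. The Python fragment below is an illustration under stated assumptions, not the LB formulation itself: the function names are hypothetical, vessels are plain tuples \((a_j, p_j, L_j)\) with made-up values, and the berth load is checked at every integer time point rather than only at points in HT.

```python
def berth_load(t, vessels, start):
    """Total length of the vessels in the berth at time t (S_j <= t < S_j + p_j)."""
    return sum(length for (_, p, length), s in zip(vessels, start)
               if s <= t < s + p)


def phase1_feasible(vessels, start, quay_len, ht):
    """Check arrival, tide and aggregate-length conditions only.

    No berth positions are assigned, so this is a necessary condition for a
    complete berth allocation, not a sufficient one.
    """
    for (a, p, _), s in zip(vessels, start):
        if s < a or s + p not in ht:
            return False
    horizon = max(s + p for (_, p, _), s in zip(vessels, start))
    return all(berth_load(t, vessels, start) <= quay_len
               for t in range(horizon))


# Each vessel is a tuple (a_j, p_j, L_j); the values are made up.
vessels = [(0, 4, 2.0), (0, 4, 2.0)]
ht = {4, 8}
print(phase1_feasible(vessels, [0, 4], 3.0, ht))  # sequential berthing
print(phase1_feasible(vessels, [0, 0], 3.0, ht))  # simultaneous berthing
```

Because the check ignores positions, the minimum total completion time over schedules that pass it cannot exceed the optimum of the full model.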

This version of the model had a comparable performance with \(\mathcal {S}\_VI\), and we refer to the two-phase approach based on this formulation as \(\mathcal {TI}\) in the rest of the paper.

4 Computational experiments

To test the performance of the mathematical models (with and without tightening constraints), we performed an extensive computational experiment. The data were generated based on the real data of a dry bulk terminal in Newcastle, Australia, and the details are as follows:

  1. The tidal times \(B_i\) and \(E_i\) for Newcastle port were obtained from the public Web site http://www.bom.gov.au/australia/tides/

  2. Regarding the length of the vessels, we considered three cases:

    (a) In the first set of instances, every vessel has the same length of 1, which represents the case where all vessels are identical in length.

    (b) In the second set of instances, the vessels have arbitrary lengths: the length \(L_j\) was generated from a uniform distribution on (0, 2].

    (c) In the third set of instances, we reflect the real situation in the terminal that motivated this study, where there are three types of vessels with respect to their lengths:

      (i) 10% of the vessels have a length \(L_j\) generated from a uniform distribution on (0, 0.85),

      (ii) 30% of the vessels have a length \(L_j\) generated from a uniform distribution on [0.85, 1.36), and

      (iii) 60% of the vessels have a length \(L_j\) generated from a uniform distribution on [1.36, 2).

  3. The handling times, \(p_j\), were generated based on three cases:

    (a) The handling times are similar across vessels and were generated from a uniform distribution on [16, 20].

    (b) The handling times vary more widely across vessels and were generated from a uniform distribution on [5, 20].

    (c) For the vessels of case 2(c) above, the handling times depend on the vessel type:

      (i) \(p_j = 7\) for case 2(c)(i),

      (ii) \(p_j = 12\) for case 2(c)(ii),

      (iii) \(p_j = 15\) for case 2(c)(iii).

  4. We considered two cases for the arrival times, \(a_j\):

    (a) The arrival times are arbitrary and were generated from a uniform distribution on (0, 150).

    (b) The arrival times are at 12:00 noon every day, so the above arrival data were rounded to 12:00 noon of their respective day.

  5. We considered four cases for the number of arriving vessels per week, \(|V|= 16, 18, 20, 32\).

  6. Regarding the planning horizon, T, we considered two cases:

    (a) \(T = 1\) week (hence 168 hours), when \(|V|= 16, 18, 20\).

    (b) \(T = 2\) weeks (hence 336 hours), when \(|V|= 32\).

    To allow some additional time in case ships are delayed, our data sets include 17 to 31 high tide periods (rather than the approximately 14 and 27 that would occur in 1 and 2 weeks, respectively).

  7. The length of the quay is taken as \(L=3\).
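The generation scheme in items 2–4 can be sketched in Python as follows. This is not the generator used to produce the published data sets; it is a minimal illustration under several assumptions that the description leaves open: handling and arrival times are drawn as integers (in line with the integer time data assumed by \(\mathcal {TI}\)), the 10/30/60% split of case 2(c) is applied as a per-vessel probability, time 0 is taken to be midnight when rounding to noon, and the open or closed interval endpoints are not enforced exactly. All names are hypothetical.

```python
import random

# Vessel types of case 2(c): (probability, min length, max length, p_j).
CLASSES = [(0.10, 0.0, 0.85, 7), (0.30, 0.85, 1.36, 12), (0.60, 1.36, 2.0, 15)]


def generate_instance(n, length_case, handling_case, arrival_case, seed=0):
    """Draw n vessels as dicts with keys 'a' (a_j), 'p' (p_j) and 'L' (L_j)."""
    rng = random.Random(seed)
    weights = [c[0] for c in CLASSES]
    vessels = []
    for _ in range(n):
        p_fixed = None
        if length_case == "a":
            length = 1.0                                  # identical vessels
        elif length_case == "b":
            length = rng.uniform(0.0, 2.0)                # arbitrary lengths
        else:
            _, lo, hi, p_fixed = rng.choices(CLASSES, weights=weights)[0]
            length = rng.uniform(lo, hi)

        if handling_case == "a":
            p = rng.randint(16, 20)                       # similar times
        elif handling_case == "b":
            p = rng.randint(5, 20)                        # widely varying
        else:
            if p_fixed is None:
                raise ValueError("handling case (c) requires length case (c)")
            p = p_fixed

        a = rng.randint(1, 149)                           # integers in (0, 150)
        if arrival_case == "b":
            a = 24 * (a // 24) + 12                       # noon of the same day
        vessels.append({"a": a, "p": p, "L": length})
    return vessels


instance = generate_instance(16, "c", "c", "b", seed=42)
print(instance[0])
```

A fixed `seed` makes the draw reproducible, which is convenient when the same instance has to be solved by several models.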

Table 1 Performance of three mathematical models (averages over 10 instances, except for the final gap which is the maximum over all instances)
Fig. 2 Cumulative number of instances that can be solved within a fixed time limit. The horizontal axis indicates the elapsed time limit in seconds; the vertical axis indicates the number of instances that could be solved to optimality within that time

The above structure results in 40 different scenarios considering a full factorial design. For each scenario, we generated 10 random instances. Hence, we have 400 instances in total. The data sets can be accessed at the following Web sites: http://home.ku.edu.tr/~coguz/Research/Dataset_BAP_DBT.zip or http://users.monash.edu/~andrease/Downloads.
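The count of 40 scenarios can be reproduced under one reading of the design: length cases (a) and (b) are crossed with handling cases (a) and (b), while length case (c) is paired only with handling case (c), since its handling times are tied to the vessel type. This pairing is inferred from items 2 and 3 and the stated total rather than spelled out in the text, and the scenario labels below are hypothetical.

```python
from itertools import product

# (|V|, T in weeks): the horizon is tied to the number of vessels (item 6).
fleet_horizon = [(16, 1), (18, 1), (20, 1), (32, 2)]
# Length/handling pairs: (a, b) x (a, b), plus the coupled case (c, c).
length_handling = list(product("ab", "ab")) + [("c", "c")]
arrival_cases = ["a", "b"]

scenarios = [
    f"{n}_{weeks}_{length}_{arrival}_{handling}"
    for (n, weeks), (length, handling), arrival
    in product(fleet_horizon, length_handling, arrival_cases)
]
print(len(scenarios))        # number of scenarios
print(len(scenarios) * 10)   # number of instances, 10 per scenario
```

Under this reading there are 4 × 5 × 2 = 40 scenarios, 10 of which involve 32 vessels and a two-week horizon.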

Table 2 Effect of adding different improvements to the model listed in Sect. 3.3.
Table 3 Average and maximum root gap over all instances solved
Fig. 3 Cumulative number of instances that could be solved to within the indicated optimality gap in 1 h

5 Results and discussion

In this section, we present the results of our computational experiments to test the performance of our original MILP model \(\mathcal {S}\), the model with the valid inequalities \(\mathcal {S}\_VI\), and the time-index-based MILP model \(\mathcal {TI}\).

The mathematical models were solved by CPLEX 12.5, limited to 8 parallel threads and an elapsed-time limit of 1 h, on a computer with two Intel Xeon X5660 CPUs running at 2.79 GHz with 64 GB of RAM. For the \(\mathcal {S}\) formulation, we modified the default CPLEX parameters to turn off cover and MIR cuts, to use pseudo-reduced-cost variable selection, and to attempt solution polishing after 15 min. Solution polishing helps CPLEX to find better solutions for challenging problems where the lower bounds are often reasonably tight but finding a good feasible solution is difficult.

In Table 1, the columns report four performance attributes for each mathematical model. The columns “Secs” indicate the elapsed time of a model in seconds, while the columns “Root (s)” indicate the time spent by a model to find a solution at the root of the search tree. The columns “Nodes” give the number of nodes processed by the time the search terminated. Finally, the columns “Gap” present the optimality gap between the final solution and the lower bound found at termination. The rows of Table 1 represent the 40 scenarios we considered. Each scenario label reads as follows: \(|V|\)_\(T\)_\(L_j\)_\(a_j\)_\(p_j\). For each scenario, we present the results of each mathematical model for the four attributes. Each entry for the first three attributes is the average over the 10 instances of the scenario, whereas the fourth attribute, the optimality gap, is given as the maximum over the 10 instances. Finally, the row labeled “Overall” displays the overall average of each column for the first three attributes of each model and the overall maximum optimality gap for the fourth.

Since the time limit was reached for some of the larger instances, it is instructive to consider the “easy” and “hard” instances separately. For the 273 instances that all models solved within the time limit, the average elapsed time is 90.28 s using the basic \(\mathcal {S}\) formulation, just 3.55 s using the improved \(\mathcal {S}\_VI\) formulation and 8.97 s using the two-phase approach \(\mathcal {TI}\). That is, both of our new approaches achieve an improvement of at least an order of magnitude in the computational time. This behavior is further illustrated in Fig. 2, which shows that the new approaches solve significantly more instances for any given time limit up to 1 h.
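The size of the improvement follows directly from the three averages quoted above; the short Python check below simply divides them (the dictionary keys are ad hoc labels).

```python
# Average elapsed times (s) over the 273 instances solved by all models.
avg_secs = {"S": 90.28, "S_VI": 3.55, "TI": 8.97}

speedup = {model: avg_secs["S"] / secs
           for model, secs in avg_secs.items() if model != "S"}
for model, factor in speedup.items():
    print(f"{model}: {factor:.1f}x faster than S on average")
```

The ratios are roughly 25 for \(\mathcal {S}\_VI\) and just over 10 for \(\mathcal {TI}\).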

The overall results indicate that among the three mathematical models, \(\mathcal {S}\_VI\), that is, the model based on the sequence variables with valid inequalities, is the most efficient one for the small- to medium-sized instances. This demonstrates the power of the valid inequalities developed for the model: we achieve substantial reductions both in the running times and in the optimality gaps.

To better understand the contribution made by individual tightening constraints introduced in Sect. 3.3, we have tested the model with 16- and 18-vessel per week data sets by adding the constraints with the increasing number of improvements to the formulation. In Table 2, the column headings correspond to the item numbers in Sect. 3.3, with zero being the original formulation \(\mathcal {S}\) that has not been tightened in any way. The results show how the run time, the average root node gap, and the maximum final gap change as we add these tightening constraints. Clearly, the largest impact is made by item 1 (variable fixing) and item 4 (constraints on vessel length). However, the other constraints are also useful and as the results given in Table 1 show, the combination of these tightening constraints proved most effective particularly as the problem size increases.

We further note that \(\mathcal {TI}\) becomes more effective on larger instances. When we look at the last 10 scenarios in Table 1, which include a two-week planning horizon and 32 arriving vessels, we observe that \(\mathcal {S}\_VI\) cannot solve most of the instances within the given 1-h time limit, whereas \(\mathcal {TI}\) achieves optimal solutions in most of the instances. Hence, we conclude that both of the models have their strengths and can be used in different planning decisions.

In Table 3, we present the average and maximum gap at the root node for each of the models. To be more specific, under column \(\mathcal {TI}\) we provide the root gap after solving Phase 1 and the root value of Phase 2 (which has the lower bound from Phase 1). It is notable that the first phase of the \(\mathcal {TI}\) method produces very good lower bounds on average, with the gap quite often being zero. For this method, the maximum gap over all instances is higher than for the \(\mathcal {S}\_VI\) method, and in future research we aim to close this gap. The level of the gap after 1 h is shown graphically in Fig. 3, indicating that significantly more problems are solved within any given optimality tolerance (gap) using the new approaches than using the original formulation. For instances that could not be solved to optimality using the basic formulation \(\mathcal {S}\), the average final gap is 17.91, 1.47 and 0.56% for the three models \(\mathcal {S}\), \(\mathcal {S}\_VI\) and \(\mathcal {TI}\), respectively. Again, this represents an order of magnitude improvement in the performance.

6 Conclusions

We considered the berth allocation problem in dry bulk terminals, where the quay is considered as a continuous resource and tidal restrictions introduce unavailability periods for the departure of the vessels. The first contribution of our paper is the development of two new MILP models including all realistic constraints without any approximations. The second contribution is to provide several valid inequalities for both of the MILP models, which improve both the solution quality obtained and the running time of the models. The third contribution is to produce a realistic data set for the BAP_DBT, which can be used as benchmark data for further studies. These benchmark data include randomly generated values based on the characteristics of real-world ports.

Our results indicate that both models benefit from the valid inequalities, as we see a significant improvement with respect to both the solution quality and the run time of the models. Our results from the computational experiments also demonstrate that both of the MILP models strengthened with the valid inequalities perform very well for this NP-hard problem, with performance improvements of an order of magnitude. Out of 400 instances, 273 instances were solved to optimality by both of the models, with an average elapsed time of 3.55 s for the sequence-based MILP (\(\mathcal {S}\_VI\)) and 8.97 s for the time-index-based MILP (\(\mathcal {TI}\)), compared to over 90 s for the original approach. Hence, we can conclude that the MILP models proposed for the BAP_DBT are both effective and efficient even for large-sized instances and can easily be used in companies where such decisions need to be made frequently. Moreover, we can deduce from Table 3 that the time-index-based MILP (\(\mathcal {TI}\)) is better suited to larger instances in particular.