1 Introduction

Port throughput forecasting plays an important role in port capacity planning and management. This is due to the long technical life of indivisibles and the irreversible nature of port infrastructure investments (Taneja et al. 2010). Once the infrastructure is in place, the characteristics of the port are determined for a long period (Van Dorsser et al. 2012). Furthermore, port planning processes may take 5–15 years, from the initiation of the masterplan to its final approval (Notteboom 2006). Moreover, port projects require capital and fixed investments having long payback periods. This necessitates the financial viability of investments based on projections of port throughput and commodity flows (De Langen et al. 2012).

A capacity shortage affects port performance and consequently the competitive position of the port due to congestion and increases in waiting time (Jarrett 2015). On the other hand, structural overcapacity signifies a failure in port planning, but excess port capacity is often created and offered to port users to satisfy their potential growth (Haralambides 2017). Eskafi et al. (2019) pointed out that demand is temporally and spatially affected by salient stakeholders during the projected lifetime of a port. Furthermore, demand levels are volatile over time (Novaes et al. 2012), and the assumption of system stability leads to uncertain and inaccurate forecasts (Flyvbjerg et al. 2003). In testimony of volatile circumstances, the current outbreak of COVID-19 has created uncertainty in cargo flows, signaling increasing challenges in decision-making in port development projects (Notteboom and Haralambides 2020).

Forecasting models provide insights into the development of port demand. Soft computing models have received increasing attention as they capture linear and nonlinear causal relations between input data and port throughput (Munim and Schramm 2020). For instance, port throughput forecasting models based on back-propagation (BP) neural network algorithms (Ping and Fei 2013) have been presented in the literature. However, a lack of input data restricts the performance of these models, increases uncertainty, and reduces the reliability of the forecast results (Parola et al. 2020).

The multiplicity of disciplines with uncertain or missing information (quantitative and qualitative) and data in engineering and management systems entails various uncertainties associated with model outputs (Yang and Xu 2002). Rasouli and Timmermans (2014) stressed the existence of uncertainty associated with input data and forecast models. Liu and Duru (2020) emphasized that to increase the reliability of forecasts, the epistemic uncertainty of the forecast should be taken into account. Epistemic uncertainties are divided into model uncertainties (due to the choice of variables, assumptions, and processes) and parameter uncertainties (related to the quantity and quality of the data used) (Kowsari et al. 2019).

However, as far as epistemic uncertainty in port throughput forecasts is concerned, and as overviewed in Sect. 2, forecasting models generally suffer from the following: (1) limited handling of uncertainties in the models, (2) subjective selection of explanatory variables, and (3) insufficient or sparse input data to properly build a forecasting model. These limitations hamper the reliability and performance of a port throughput forecasting model.

Therefore, this paper presents a rigorous Bayesian model that accounts for epistemic uncertainties in a port throughput forecast. The model meaningfully increases the reliability of forecast results and facilitates informed decision-making in port capacity planning and management.

To select our influencing macroeconomic variables, a variable selection method based on mutual information is applied. The method estimates the level of linear and nonlinear correlations between variables. It also determines the statistical dependency of the variables by quantifying the amount of information held in a variable through another variable (Soofi et al. 2010).

The uncertainty of parameters is accounted for in the Bayesian method by treating the regression coefficients as random variables and considering their distributions conditional on the data (Kowsari et al. 2020). One of the advantages of the Bayesian method in port throughput forecasting, moreover, is that it can be used with a sparse or relatively small number of input observations, providing acceptable results. Taneja (2013, p. 199) states that demand forecasts in port planning should take into account a certain degree of uncertainty, providing interval forecasts rather than point estimates. The model presented in this paper not only gives a point forecast which has the highest probability, but also offers a range of port throughput forecasts with confidence intervals. Consequently, the outcome of the model provides useful information to decision-makers and port planners, enabling them to better meet changing and uncertain future demand. Another strength of our model lies in the fact that it has an adaptive learning capability to be updated over time based on new information. Hence, it can provide a continuously or regularly updated port throughput forecast.

Our methodology is applied to forecast the annual throughput of the multipurpose Port of Isafjordur in Iceland. The approach presented here can be tailored to other ports to forecast their throughput.

The remainder of the paper is structured as follows: Sect. 2 outlines the literature review by discussing different port throughput forecasting methods, Sect. 3 addresses the mutual information and Bayesian method, Sect. 4 describes the study area and the data used, Sect. 5 presents and discusses the results, and Sect. 6 concludes with further remarks.

2 Different port throughput forecasting methods

Given the importance of port throughput forecasting, this section provides a literature overview of the state-of-the-art port throughput forecasting research, while also pointing out the present knowledge gap.

Different time series models have been used earlier to forecast port throughput. The moving average is a simple time series model that uses past internal patterns of data to forecast future values (Rojas et al. 2015). However, Van Dorsser et al. (2012) criticized the model as it assumes a static environment without insights from external influencing factors, which is an inappropriate simplification for (long-term) port throughput forecasts.

Hui et al. (2004) used regression models to forecast container throughput. The authors reiterated the need for stationarity in regressor variables. In a port throughput forecasting context, they also pointed out that, if a nonstationary time series follows a random walk, the capability of the model to include the effects of a temporary macroeconomic shock is limited (Gosasang et al. 2011) and/or the shock is not dissipated with the time series (Van Dorsser et al. 2012).

A vector error correction model without a theoretical basis has been criticized as a purely mathematical model (Bonham et al. 2009). The vector error correction model and its alternative, the error correction model (Hui et al. 2004), are suitable for multivariate forecasting models where macroeconomic variables are characterized by stationary time series (Munim and Schramm 2020) and have a true relation in the long term with port throughput (Van Dorsser et al. 2012). Jarrett (2015) pointed out that, to forecast port throughput, a time series decomposition model can be used if observed data show a seasonal pattern, and the seasonal component has a multiplicative or additive trend. This model is mainly suitable for intermediate or long-range port throughput forecasts.

Due to the limitations of time series models, recent studies have used soft computing models including artificial neural networks (Gosasang et al. 2011), transfer forecasting models (Xiao et al. 2014), fuzzy logic, genetic algorithms (Chen and Chen 2010), artificial bee colony (Gökkuş et al. 2017), and ant colony algorithms (Nie and Zhao 2019). These models are used to simulate complex processes where a mathematical description is not performable due to random behavior and nonlinear characteristics of the process (Peng and Chu 2009). These models postulate the relation between port throughput and one or more independent variables.

Gosasang et al. (2018) pointed out that an artificial neural network is suitable for nonstationary data. This method provides better forecasting results than traditional methods (Gosasang et al. 2011) as the artificial neural network effectively captures complex (linear and nonlinear) relations between macroeconomic variables and port throughput (Ping and Fei 2013). However, artificial neural network models require a substantial amount of input data during the training and learning process; otherwise, they are not able to generate accurate and reliable results (Ping and Fei 2013). The models are also prone to overfitting when fed a wide variety of variables, due to their black-box nature and complexity (Gosasang et al. 2018).

Qualitative methods mainly rely on expert judgment (De Langen et al. 2012). These methods apply different techniques including rating scale, analog, Delphi, leading indicator, diffusion, performance evaluation review technique, survey, interviews, direct observation, and written documents (Jain 2005; Kesh and Raja 2005; Patton 2001). Qualitative models are used when data are unavailable, scarce, and ambiguous. However, the results of these models are based on the opinion, knowledge, and experience of experts, and thus are subjective and prone to (cognitive) biases (Patton 2001).

Chen et al. (2016) pointed out that, due to the diversity of many influencing factors, a single model is often insufficient and may result in inaccurate forecasts. Hybrid (or joint) models, made up of two or more models to synthesize their information (Chen et al. 2016), take advantage of each model for more stable results (Van Dorsser et al. 2012; Tian et al. 2010) and improved forecast precision (Huang et al. 2015; Li et al. 2008). Hybrid models are useful when it is uncertain which single model provides the most accurate forecast (Armstrong 2001). However, Chen et al. (2016) warned that, in hybrid models, individual models should be carefully selected as each model has its own influence and thus increases the uncertainty of the result. On the other hand, using several models may increase the redundancy, complexity, and computation load of hybrid models.

Moreover, despite the advances made in forecasting methods, the correct interpretation of results and their effective communication to stakeholders present challenges to port authorities with regard to choosing and applying the right forecasting methods (Parola et al. 2020). Parola et al. (2020) stressed that the time horizon can further influence the selection of a forecasting method and that, in strategic planning, port authorities should deal with uncertainties including opportunities and vulnerabilities. In this vein, Eskafi et al. (2021) presented a framework to deal with uncertainties in the port planning process, aimed at seizing opportunities and managing vulnerabilities in different time horizons of a port plan. They pointed out that the time horizon can affect the level of uncertainty and, consequently, the forecasting methodology.

Port throughput forecasting models have always contained epistemic uncertainty. Its sources include incomplete knowledge of model components; complex and only partly known causal relations with a large number of macroeconomic variables, for which data are often limited; the chosen modeling technique; the applied modeling assumptions; and the necessary simplifications. To increase the reliability of the forecast results, this inevitable epistemic uncertainty should be taken into consideration.

Eskafi et al. (2020b) presented the advantages of mutual information in the selection of influencing macroeconomic variables as input for port throughput forecasting models. They stated that the application of mutual information increases the reliability of the models. The mutual information method identifies the important variables that should be used in Bayesian models, and thus it improves the accuracy of model results (Yang et al. 2018) as it accounts for model uncertainties. The Bayesian method has been used in the literature in different fields including ship emissions (Liu and Duru 2020), shipping accidents (Zhang and Thai 2016), resilience of inland waterways ports (Hosseini and Barker 2016), deep-water port infrastructure resilience (Hossain et al. 2019), and classification of port variables (Serrano et al. 2018). However, the application of a Bayesian method to forecast port throughput is scant in the scientific literature.

3 Methods

3.1 Mutual information

Economic development is an important driver of maritime trade, and there is an interrelation between port throughput and macroeconomic variables (e.g., Parola et al. 2020). We use mutual information to identify key macroeconomic variables that influence port throughput, and thus reduce the need to subjectively select macroeconomic input variables. Application of mutual information reduces uncertainty in port throughput forecasts, as it effectively identifies the influencing macroeconomic variables on port throughput (Eskafi et al. 2020b). In other words, mutual information can be used as an approach to recognize insignificant variables that should be excluded from a model (Yang et al. 2018).

Mutual information is an important concept in information theory and a widely used measure to define the dependency of variables, especially in nonlinear systems. It is rooted in the concept of entropy (Shannon 1948) and Kullback–Leibler divergence (Kullback and Leibler 1951) and is suitable for assessing uncertainties and the information content between variables. The mutual information method measures the linear and nonlinear correlation between random variables and illustrates the distributions of the information measures in terms of interdependency between variables. It takes a zero value iff the two random variables (e.g., macroeconomic variables and port throughput in this study) are statistically independent. However, when the two variables are similar their mutual information is maximized.

For a pair of random variables (\(X, Y\)) with marginal probability distributions of \({\mu }_{x}(x)\) and \({\mu }_{y}(y)\), mutual information uses the Kullback–Leibler measure to determine the distance between the joint probability distribution, \(\mu \left(x, y\right)\), and the distribution associated with the case of complete independence [i.e., \({\mu }_{x}(x){ \mu }_{y}(y)\)] and according to Kraskov et al. (2004) is expressed as

$$I\left(X,Y\right)=\iint \mu \left(x,y\right)\mathrm{log}\frac{\mu \left(x, y\right)}{{\mu }_{x}(x){\mu }_{y}(y)}\mathrm{d}x\mathrm{d}y.$$

Entropy quantifies how informative a random variable (\(X\)) with possible outcomes (\(x\)), each with probability \(p(x)\), could be:

$$H\left(X\right)=-\int_{x\in X}p(x){\mathrm{log}}_{2}p(x)\mathrm{d}x,$$

where the base-2 logarithm corresponds to the unit of information measured in “bits” (Shannon 1948). Thus, mutual information can be obtained as

$$I\left(X, Y\right)=H\left(X\right)+H\left(Y\right)-H\left(X,Y\right)=H\left(X\right)-H\left(X|Y\right)=H\left(Y\right)-H\left(Y|X\right),$$

where \(H\left(X\right)\) and \(H\left(Y\right)\) are the entropy of random variables \(X\) and \(Y\), respectively; \(H\left(X, Y\right)\) is their joint entropy; and \(H\left(X|Y\right)\) and \(H\left(Y|X\right)\) are their conditional entropy and can be calculated as

$$H\left(X|Y\right)=-\iint \mu \left(x, y\right)\mathrm{log}\mu \left(x|y\right)\mathrm{d}x\mathrm{d}y,$$

where \(\mu \left(x,y\right)\) is the joint probability distribution. The conditional entropy \(H\left(X|Y\right)\) is the amount of uncertainty left in \(X\) when \(Y\) is known. Thus, from these equations, \(I\left(X, Y\right)\) can be interpreted as the reduction in the uncertainty of the random variable \(X\) gained from knowledge of another random variable \(Y\) (Maes et al. 1997).
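As a minimal illustration of the entropy identity above, the following sketch estimates mutual information from samples by binning each variable and plugging the empirical frequencies into the entropy formula. It is written in Python rather than the MATLAB used in this paper. The function names, the equal-width binning with four bins, and the synthetic data are assumptions made only for illustration; this is a simple plug-in estimate, not the estimator of Kraskov et al. (2004).

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Plug-in estimate of I(X, Y) = H(X) + H(Y) - H(X, Y), in bits."""
    bx, by = discretize(x, bins), discretize(y, bins)
    return entropy(bx) + entropy(by) - entropy(list(zip(bx, by)))


# Synthetic (hypothetical) series, not data from this study.
random.seed(0)
driver = [random.gauss(0.0, 1.0) for _ in range(2000)]
dependent = [d * d + random.gauss(0.0, 0.1) for d in driver]  # nonlinear link
unrelated = [random.gauss(0.0, 1.0) for _ in range(2000)]

mi_dependent = mutual_information(driver, dependent)
mi_unrelated = mutual_information(driver, unrelated)
mi_self = mutual_information(driver, driver)  # equals H(X) of the binned driver
```

In this sketch the series with a purely nonlinear link to the driver scores higher than the unrelated series, which is the property that makes mutual information usable for screening candidate input variables. The estimate depends on the number of bins, so the values are only comparable across variables when the same binning is used.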

3.2 Bayesian method

The Bayesian statistical method is an effective approach that allows the combination of knowledge about parameters in a synthesis of prior knowledge with the available data. In the Bayesian method, a posterior probability density is proportional to the likelihood function on the data, multiplied by the prior probability density. In classical approaches, instead, such as maximum likelihood, the inference is based on the likelihood of the coefficients, conditional on the data alone (Congdon 2014). To utilize the Bayesian method, the prediction models can be linearized by a simple expression of the form


where the dependent variable (\({y}_{i}\)) is the annual port throughput; the independent variables (\({x}_{i}\)) are the macroeconomic variables; and the coefficients \({C}_{0}\)–\({C}_{6}\) can be estimated by Bayesian regression. In other words, the relationship between a dependent variable (\({y}_{i}\)) and the explanatory variables (\({x}_{i}\)) can be obtained by a linear regression model. Let \(y=({y}_{1},\dots ,{y}_{n})\) be a vector of historical data, with \(n\) the number of available observations. The matrix of explanatory variables (\(X\)) can be expressed as

$$X=\left[\begin{array}{cccc}{x}_{11}& {x}_{12}& \dots & {x}_{1k}\\ \vdots & \vdots & \ddots & \vdots \\ {x}_{n1}& {x}_{n2}& \dots & {x}_{nk}\end{array}\right].$$

Assuming a conditional normal distribution of the dependent variable (\({y}_{i}\)), given the explanatory variables (\(X\)), the mean of the normal distribution has a linear function as:

$$E\left({y}_{i}|\theta , X\right)={\theta }_{1}{x}_{i1}+\dots +{\theta }_{k}{x}_{ik},$$

where \(\theta =({\theta }_{1},\dots ,{\theta }_{k})\) is a vector of unknown parameters. In other words, the vector of the dependent variable follows a normal distribution, \(y\sim N\left(X\theta ,{\sigma }^{2}I\right),\) with a mean of \(X\theta\) and variance of \({\sigma }^{2}I\), where \(I\) is the \(n\times n\) identity matrix.

In Bayesian statistics, the posterior distribution describes updated information about the unknown parameter (\(\theta\)) and can be obtained by multiplying a prior distribution by a likelihood function as follows:

$$p\left(\theta |y\right)\propto p(\theta )p\left(y|\theta \right),$$

where \(p(\theta )\) is the prior distribution and \(p\left(y|\theta \right)\) is the likelihood function; i.e., a probability distribution that expresses the information contained in the historical data.

In this paper, the logarithm of the port throughput is assumed to follow a normal distribution, so that (Ding and Teo 2010):

$$p\left(y|{\sigma }^{2}, \theta , X\right)=\prod_{i=1}^{n}\frac{1}{\sigma \sqrt{2\pi }}\mathrm{exp}\left(-\frac{{({y}_{i}-{(X\theta )}_{i})}^{2}}{2{\sigma }^{2}}\right),$$

where \(n\) is the number of available historical observations, \(y\) is the vector of the logarithm of the port throughput data, \({(X\theta )}_{i}\) is the i-th element of the vector \(X\theta\) representing the mean value of the prediction model, and \(\sigma\) is the standard deviation. On the other hand, we assume a noninformative prior for the unknown parameters, i.e., \(p\left(\theta ,{\sigma }^{2}|X\right)\propto {\sigma }^{-2}\). Thus, the joint posterior distribution of \(\theta\) and \({\sigma }^{2}\) is given by

$$p\left(\theta ,{\sigma }^{2}|y, X\right)\propto p\left(\theta ,{\sigma }^{2}|X\right)p\left(y|{\sigma }^{2}, \theta , X\right)\propto {\sigma }^{-2}\prod_{i=1}^{n}N\left({y}_{i}|{(X\theta )}_{i}, {\sigma }^{2}\right).$$

The posterior distribution of the unknown parameters θ is obtained by using Eq. 10. Bayesian posterior inference is then used to simulate port throughput from the posterior distributions of the parameters and the values of the macroeconomic variables.
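To make the posterior inference concrete, the sketch below draws samples from the joint posterior of a normal linear regression with a single explanatory variable and an intercept. It assumes the standard noninformative prior \(p(\theta ,{\sigma }^{2})\propto {\sigma }^{-2}\), under which \({\sigma }^{2}\) given the data follows a scaled inverse chi-square distribution and \(\theta\) given \({\sigma }^{2}\) is normal around the least-squares estimate. The code is Python rather than the MATLAB used in this paper, it is reduced to one regressor instead of six, and the function name and the synthetic data are hypothetical.

```python
import math
import random
import statistics


def sample_posterior(x, y, draws=4000, seed=1):
    """Draw (intercept, slope, sigma2) from the posterior of a simple normal
    linear regression under the prior p(theta, sigma^2) proportional to 1/sigma^2."""
    rng = random.Random(seed)
    n, k = len(y), 2  # k = number of regression coefficients
    x_bar, y_bar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept_hat = y_bar - slope_hat * x_bar
    rss = sum((yi - intercept_hat - slope_hat * xi) ** 2 for xi, yi in zip(x, y))
    samples = []
    for _ in range(draws):
        # sigma^2 | y ~ scaled inverse chi-square(n - k, rss / (n - k)),
        # i.e. rss divided by a chi-square(n - k) draw.
        sigma2 = rss / rng.gammavariate((n - k) / 2.0, 2.0)
        # theta | sigma^2, y ~ N(theta_hat, sigma^2 (X'X)^-1). With x centred,
        # the level at x_bar and the slope are independent normals.
        slope = rng.gauss(slope_hat, math.sqrt(sigma2 / sxx))
        level = rng.gauss(y_bar, math.sqrt(sigma2 / n))
        samples.append((level - slope * x_bar, slope, sigma2))
    return samples


# Synthetic (hypothetical) data: 30 observations around y = 1.0 + 0.5 x.
data_rng = random.Random(0)
x = [i / 10 for i in range(30)]
y = [1.0 + 0.5 * xi + data_rng.gauss(0.0, 0.05) for xi in x]

samples = sample_posterior(x, y)
intercept_mean = statistics.fmean([s[0] for s in samples])
slope_mean = statistics.fmean([s[1] for s in samples])
slope_sd = statistics.stdev([s[1] for s in samples])
```

Each draw is one plausible set of coefficients, so the spread of the draws (for example `slope_sd`) summarizes parameter uncertainty directly, and with few observations the spread widens rather than being hidden behind a single point estimate.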

The Bayesian model can take into account the statistical uncertainty associated with the limited number of input observations. The regression coefficients of the macroeconomic variables are considered as random variables, and their associated uncertainties are quantified by the posterior distribution. This makes the Bayesian method preferable to classical regression because more information can be extracted from the probability distribution of each parameter. The capability of accounting for causal and uncertain relations of macroeconomic variables with port throughput makes the Bayesian model a useful tool for port throughput forecasting.

In this paper, the MATLAB programming language is used to code the equations: (1) to calculate the mutual information between macroeconomic variables and port throughput, to identify the macroeconomic variables that influence port throughput, and (2) to develop a Bayesian model to forecast port throughput based on the selected macroeconomic variables.

4 Study area and data used

The multipurpose Port of Isafjordur is a hub port in northwest Iceland, in the so-called Westfjords (Fig. 1). The port has a competitive advantage, due to its infrastructure and services, among the other ports in the region. Coastal shipping and road transportation are the only two transport modes that connect the port to its hinterland, which is the whole country. Industrial fisheries, aquaculture, and further fish processing (i.e., packing, freezing, and storage) are the main businesses of the region. These activities are increasing in the region, which increases the volume of cargo and container handling in the port. A reliable port throughput forecast supports the port authority in decision-making for capacity planning and management to position the port for sustained growth. The multipurpose Port of Isafjordur is the third busiest port of call for cruise ships in Iceland (Isafjordur Port Authority 2019).

Fig. 1 The multipurpose Port of Isafjordur. The location of the study area is shown on the map of Iceland at the top left

The main functions of the Port of Isafjordur include:

  • Transfer and storage of containerized and noncontainerized cargo.

  • Industrial value-added activities related to fisheries and aquaculture.

  • Recreational activities, such as rendering services to expedition vessels, cruise ships, and small private and sailing boats.

In this study, two types of port throughput data are collected: containerized throughput in twenty-foot equivalent units (TEU) and noncontainerized throughput in tonnes. The latter includes fuel oil, marine products, and industrial materials. Table 1 presents all cargoes that are handled in the port in question. Small cargoes (in terms of quantity) are considered as other general cargo. There is no information about the nature of the cargo inside containers. Port throughput related to recreational activities has not been considered in our study.

Table 1 List of cargoes handled at the Port of Isafjordur

The annual containerized throughput data of the port are collected for the years 1990–2019. The available data for noncontainerized throughput are garnered between 1990 and 2016. Noncontainerized data for 2017–2019 were limited and unusable for building the model. Thus, the noncontainerized throughput is forecast for 2017–2025.

To build our model, six macroeconomic variables, available at Statistics Iceland (2019), have been used. They include national gross domestic product (GDP), average yearly consumer price index (CPI), world GDP, the volume of national export trade, the volume of national import trade, and the national population. These variables were also used in previous studies (Gökkuş et al. 2017; Gosasang et al. 2018). Of course, if more macroeconomic variables are available, they could naturally be used in mutual information analysis to discover those that influence port throughput the most. In other words, the application of mutual information discovers variables that should be used and/or excluded as inputs in building a forecasting model. Historical and forecast values of these variables refer to 1990–2019 and 2020–2025, respectively (Statistics Iceland 2019).

The influence of factors that cannot be quantified from observation of the past (e.g., growth in the port’s captive market) or cannot be accurately predicted (e.g., innovation or breakthrough technology in cargo handling) is excluded in this study. Transshipment flows are not covered in this study either. However, the presented methodology can also be applied to forecast (non)containerized port throughput with a (high) share of transshipment flow. This is because the changes in transshipment flow are also influenced by the development of macroeconomic variables (e.g., Parola et al. 2020).

5 Results and discussion

To increase the reliability of our model, the associated epistemic uncertainties are taken into consideration. To account for model uncertainty, mutual information is used to objectively select the input variables for the model. Figure 2 shows the results of the mutual information values between port throughput and macroeconomic variables.

Fig. 2 Mutual information values between port throughput (right: containerized, left: noncontainerized) and macroeconomic variables. The acronyms are the national GDP (NGDP), the average yearly CPI (ACPI), the world GDP (WGDP), the volume of national export trade (VNET), the volume of national import trade (VNIT), and the national population (NPOP)

The results indicate that port throughput is correlated with the six macroeconomic variables of this study. In comparison with noncontainerized throughput, containerized throughput has a relatively higher correlation with macroeconomic variables. This is because the majority of cargo flows in the port is containerized, and containerized cargo is the main form of transportation from/to the Port of Isafjordur.

Since the port throughput is influenced by the six macroeconomic variables, these variables are used as independent variables (input) in the port throughput forecasting model. The mean and standard deviation of the model parameters, along with the total standard deviation of the model with respect to the port throughput are shown in Table 2. The values are derived from the corresponding variable’s posterior distribution that results from the model.

Table 2 Parameter estimates and 95% marginal posterior intervals of the model parameters for (non)containerized throughput

Figure 3 shows the posterior distributions of the model parameters. For the sake of space, only containerized throughput is depicted. However, almost the same behavior can be seen for noncontainerized throughput. The well-defined normal-shaped posterior distribution of the regression coefficients indicates an appropriate assumption of the prior distribution.

Fig. 3 The posterior histograms of the regression coefficients. The solid lines indicate the normal distribution fitted on the posterior values for containerized throughput

Figure 3 showcases one of the advantages of the Bayesian statistical method, as it determines the posterior distribution of macroeconomic variables, vis-à-vis classical approaches which only return point estimates.

Figure 4 shows the residuals as a function of data that represent the model goodness of fit. Another qualitative assessment of normality is demonstrated by the histograms of the residuals. The residuals of the model follow the Gaussian distribution and are generally assumed from the outset to be normally distributed with zero mean and a standard deviation of σ. This assumption is depicted by a normal probability plot in Fig. 4.

Fig. 4 Right: the histogram of residuals along with a fitted normal distribution. The mean and standard deviation of the residuals are also shown. Left: residuals (circles) of the prediction model using the mean model parameter estimates for containerized throughput (top row) and noncontainerized throughput (bottom row)

As can be seen in Fig. 4, both in containerized and noncontainerized throughput, the residuals are distributed around zero. Also, the residuals of the model are normally distributed with zero mean and small standard deviation (i.e., 0.049 σ for containerized and 0.098 σ for noncontainerized throughput), indicating that there are neither significant residual outliers nor systematic trends in the overall distribution of residuals. As demonstrated, the results show the model’s goodness of fit with (limited) input data. Table 3 gives the result of the port throughput forecasts, based on the available forecast macroeconomic variables (i.e., X1 to X6) and their distribution over the years.

Table 3 The prediction of the port throughput (logarithms base 10) is based on the explanatory variables for throughput

Figure 5 shows the development of the historical and the forecast port throughput expressed by the gray shaded area for different confidence intervals of the forecast.

Fig. 5 Historical and forecast containerized (left) and noncontainerized (right) port throughput (PT) developments, and confidence interval (CI). The forecast port throughput is surrounded with the red box in the inserted graph including the historical data

The confidence limits bound the future port throughput forecasts while accounting for the epistemic uncertainties, including model uncertainties and parameter uncertainties. Thus, the uncertainty bounds can be further used for decision-making in port planning and management. For instance, the national GDP and the world GDP have been affected by the COVID-19 pandemic. In this context, although updated macroeconomic variables, including national and world GDP forecasts, were not used in this study, port throughput can be expected to fall within the lower uncertainty bounds.
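As an illustration of how such bounds can be read from simulated forecasts, the sketch below takes a set of predictive draws on the base-10 logarithmic scale, computes percentile limits, and converts them back to the original units. Because the back-transformation is monotonic, percentiles on the logarithmic scale map directly to percentiles of throughput. The code is Python, the helper names are hypothetical, and the draws are synthetic placeholders rather than output of the model in this paper.

```python
import random


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def forecast_interval(log10_draws, level=95.0):
    """Return (lower, median, upper) in original units for a central interval."""
    ordered = sorted(log10_draws)
    tail = (100.0 - level) / 2.0
    return tuple(10 ** percentile(ordered, q) for q in (tail, 50.0, 100.0 - tail))


# Synthetic (hypothetical) predictive draws of log10 throughput for one year.
rng = random.Random(42)
draws = [rng.gauss(2.5, 0.05) for _ in range(5000)]

low95, median, high95 = forecast_interval(draws, 95.0)
low99, _, high99 = forecast_interval(draws, 99.0)
```

The 99% limits enclose the 95% limits, and the interval is asymmetric around the median in original units because a symmetric spread on the logarithmic scale becomes a multiplicative spread after back-transformation.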

As shown in Fig. 5, containerized throughput shows a growing trend since 1990. However, during the world economic downturn of 2008–2009, a reduced pace of growth is observed until 2012. Noncontainerized throughput generally shows a decreasing trend from 1990 to 2012. In 2013, noncontainerized throughput recovered, and containerized throughput significantly increased. One of the reasons for this substantial increase is the rapid growth in aquaculture, especially the salmon industry in the region. The fast-growing aquaculture stimulates the business environment and drives the growth of relevant activities including marine production, processing, and packing, as well as industrial equipment manufacturing. In this respect, an additional shipping company started calling the port from 2013 to satisfy the increasing demand.

As depicted in Fig. 5, the forecast containerized throughput follows an increasing trend. The growth rate is somewhat lower between 2022 and 2025. Containerized throughput in the period from 2020 to 2025 shows a total increase of about 26% in TEU. This is an increase to 324 TEU (324/100 = 3.24 times the containerized throughput in TEU of the indexed year 2005). The outer bound (shaded area indicating the 99% confidence interval) surpasses a maximum value of 480 TEU, and its minimum value is almost 215 TEU. Higher market uncertainty requires higher flexibility in port infrastructure, operation, and services (Taneja et al. 2010; Wang et al. 2019). Thus, this range of port throughput forecasts with confidence intervals provides useful information to decision-makers and port planners to develop flexibility and create a buffer in port capacity planning to satisfy changing and uncertain future demand (Notteboom and Haralambides 2020).

Exports of marine and aquaculture products (i.e., farmed and wild, frozen and fresh, processed and unprocessed), for which there is a continuous need, as well as imports of industrial and consumer goods, are increasingly handled in containers in the multipurpose Port of Isafjordur. Also, there is an increasing need for reliable and quick export of marine catch and products, which are considered time-sensitive cargo and are carried in reefer containers (Eskafi et al. 2020a). Imports of fish feed in containers have increased. The increase in containerized throughput is supported by the causal relation with the growth of Iceland’s macroeconomic variables. In response to this increase, larger vessels are being utilized, enjoying economies of scale, which have impacted the containerized throughput of the port. This growth in containerized throughput is also aligned with the increase in scale and concentration in the world container markets (Haralambides 2019). Containerization is an important transportation system in the rapid growth of international trade. As a preferred form of transport of both exports and imports, containerization is one of the reasons for the container growth in the present study (Gharehgozli et al. 2019).

As depicted in Fig. 5, noncontainerized throughput follows the historical data trend and continuously decreases until 2025. Noncontainerized throughput declined to 40 tonnes in 2019 (40/100 = 0.4 times the tonnes of noncontainerized throughput of the indexed year 2005). Afterwards, a gradual decline at a lower rate is observed until 2025. Noncontainerized throughput is forecast to decrease by 82% from 2017 through 2025, to 19 tonnes. The outer bound (shaded area indicating the 99% confidence interval) reaches a maximum value of about 45 tonnes and a minimum value of about 8 tonnes. The decline in noncontainerized throughput may gradually stabilize in the long run. The slowdown of the decline from 2019 to 2025 may be due to the growth of Iceland’s macroeconomic variables until 2025, which results in economic growth and consequently in an increase in maritime trade (De Langen et al. 2012). At the same time, ongoing containerization is driving noncontainerized throughput down, as formerly noncontainerized cargoes are increasingly transported in containers (Haralambides 2019).
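The indexed figures quoted for both throughput types can be checked with a few lines of arithmetic. The sketch below assumes, as the text states, that the base year 2005 has an index value of 100. The helper names are hypothetical, and the implied starting values are back-of-envelope figures derived from the rounded percentages, not values reported in the paper.

```python
BASE_INDEX = 100.0  # index value of the base year 2005, as stated in the text

def multiple_of_base(index_value, base=BASE_INDEX):
    """Express an indexed value as a multiple of the base-year throughput."""
    return index_value / base

def implied_start(end_value, pct_change):
    """Back out the starting index from an end value and a total percentage change."""
    return end_value / (1.0 + pct_change / 100.0)

container_2025 = 324.0      # indexed TEU in 2025
noncontainer_2019 = 40.0    # indexed tonnes in 2019
noncontainer_2025 = 19.0    # indexed tonnes in 2025

print(multiple_of_base(container_2025))     # 3.24 times the 2005 level
print(multiple_of_base(noncontainer_2019))  # 0.4 times the 2005 level

# Approximate starting points implied by the rounded percentage changes.
print(round(implied_start(container_2025, 26.0), 1))      # 2020, containerized
print(round(implied_start(noncontainer_2025, -82.0), 1))  # 2017, noncontainerized
```

Because the percentage changes are quoted as rounded values, the implied starting indices should be read as approximate consistency checks only.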

This decreasing and stabilizing range of noncontainerized throughput helps the port authority to determine the ultimate capacities and facilities required to satisfy future demand. Furthermore, the port authority can consider phasing new development according to the changing demand in the volatile market environment. The results of this short-term forecast facilitate the port’s operational decisions (i.e., port capacity utilization, cargo handling, and facilities development plan), resource allocation (Gökkuş et al. 2017; Rashed et al. 2017), port logistics, and the capacity of terminal and hinterland connections (Brooks et al. 2014).

In comparison with existing forecasting methods, the presented method has several advantages: (1) it quantifies the relationship of macroeconomic variables with port throughput and then identifies the influencing macroeconomic variables as input to the model, which meaningfully increases the accuracy of the model (Yang et al. 2018) and the reliability of the forecast results (Eskafi et al. 2020b), and thus accounts for model uncertainties; (2) it uses a probabilistic approach to quantify the associated parameter uncertainty of the influencing macroeconomic variables by providing their posterior distributions; (3) the Bayesian model can be updated when more data become available (Zhang et al. 2013); and (4) it can deal with uncertain information characterized by scarcity and limitation of data (Kowsari et al. 2019). The method was applied to the Port of Isafjordur in Iceland as one of many ports that could have served as a case.
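Advantage (3), the ability to update the model as new data arrive, can be illustrated with a generic conjugate example. The sketch below assumes a Gaussian linear regression with known noise variance and a Gaussian prior on the weights; this is a textbook stand-in chosen for brevity, not the model specification of this paper, and the data, the variable names, and the function `update_posterior` are hypothetical.

```python
import numpy as np

def update_posterior(mean, cov, X, y, noise_var):
    """One conjugate update of a Gaussian prior on regression weights."""
    prior_prec = np.linalg.inv(cov)
    post_prec = prior_prec + X.T @ X / noise_var
    post_cov = np.linalg.inv(post_prec)
    post_mean = post_cov @ (prior_prec @ mean + X.T @ y / noise_var)
    return post_mean, post_cov

# Synthetic data: intercept plus one hypothetical macroeconomic variable.
rng = np.random.default_rng(1)
n, noise_var = 15, 0.25
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=np.sqrt(noise_var), size=n)

prior_mean, prior_cov = np.zeros(2), 10.0 * np.eye(2)

# Fit on all observations at once.
batch_mean, batch_cov = update_posterior(prior_mean, prior_cov, X, y, noise_var)

# Fit on the first 10 observations, then update with the 5 newer ones.
mid_mean, mid_cov = update_posterior(prior_mean, prior_cov, X[:10], y[:10], noise_var)
seq_mean, seq_cov = update_posterior(mid_mean, mid_cov, X[10:], y[10:], noise_var)

print(np.allclose(batch_mean, seq_mean), np.allclose(batch_cov, seq_cov))
```

In this conjugate setting, the posterior from the earlier data serves as the prior for the newer data, and the sequential result matches the batch fit while the posterior variances shrink as observations accumulate.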

6 Conclusions

Port throughput forecasts provide valuable and fundamental input to capacity planning and management, and thereby shape the direction of port development. In addition to uncertain demand and a volatile market environment, epistemic uncertainty associated with parameter and model uncertainties imposes challenges in decision-making. In the context of uncertainty, decision-makers should not rely on a single-point forecast but should assess a range of port throughput forecasts.

This paper presented a port throughput forecasting model using the Bayesian statistical method. Our model was developed to forecast the annual containerized and noncontainerized throughputs of the multipurpose Port of Isafjordur from 2020 to 2025. The mutual information approach was used to determine the influence of macroeconomic variables on port throughput and thus to objectively select the input variables of the forecasting model, which reduced model uncertainties. The Bayesian method accounted for the uncertainty associated with the macroeconomic variables, which were treated as random variables following a given probability distribution. The model also accounted for parameter uncertainties and delivered reliable results with relatively sparse input data. Furthermore, the model offered a range of port throughput forecasts that allows decision-makers and port planners to develop flexibility in capacity planning to satisfy the changing and uncertain needs of port users.
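The idea of screening input variables by mutual information can be sketched as follows. The estimator below is a simple histogram (plug-in) estimate, chosen here as an assumption for illustration; the estimator used in this paper is not restated in this section, and the synthetic series and the function name `mutual_information` are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of mutual information, in nats."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()              # joint probabilities per cell
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    mask = p_xy > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# Synthetic series for illustration only (assumption, not the paper's data).
rng = np.random.default_rng(2)
throughput = rng.normal(size=2000)
related = throughput + 0.3 * rng.normal(size=2000)  # variable tied to throughput
unrelated = rng.normal(size=2000)                   # independent variable

scores = {
    "related": mutual_information(related, throughput),
    "unrelated": mutual_information(unrelated, throughput),
}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f} nats")
```

Variables with higher scores would be retained as model inputs. With short annual series, a histogram estimate of this kind is biased, so the ranking is more informative than the absolute values.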

Our results show growth in containerized throughput up to 2025. Containerized throughput increases by about 26% during the period 2020–2025 and reaches 324 TEU in 2025 (324/100 = 3.24 times the TEU containerized throughput of the indexed year 2005). In that same year, however, noncontainerized throughput is forecast to fall to about 19 tonnes, a decrease of about 82% over the period 2017–2025. The decline in noncontainerized throughput slows down after 2019. The increase in containerized throughput and the decline and stabilization in noncontainerized throughput help the port authority to consider the required port capacities and facilities and to be proactive in planning to satisfy the future demands of stakeholders.

The theoretical contribution of this paper lies in the presentation of a robust port throughput forecasting model that is based on the influencing macroeconomic variables and accounts for epistemic uncertainties, including model uncertainties (choice of variables, assumptions, and processes) and parameter uncertainties (quantity and quality of data used). The managerial contribution lies in drawing up a reliable port throughput forecasting framework that can support port authorities in rationalizing their investment decisions based on future demand, and thus in maintaining the competitive edge of their ports and the growth of their market share. The use of various data sources and inconsistencies in data collection may have affected the results of this case study. Although the paper developed a short-term forecast due to a lack of forecasts of the independent variables, the model can also be applied to long-term forecasts, which are useful for assessing future infrastructure investment decisions. The application of the Bayesian statistical method to long-term forecasting is recommended for future research.