1 Introduction

It is well known that the tourism industry produces a perishable product that depends strongly on micro- and macroeconomic factors, natural disasters, cultural events, social behaviors, marketing policies, etc. At the same time, tourism expenditure has become a valuable source of economic activity and employment.

In this work, a study of tourist occupancy in the area of Western Greece is presented. The Western Greece region consists of three prefectures (Achaia, Ilia, Aitoloakarnania) that differ in the type of visiting tourists, the available resources and infrastructure, and the level of development and employment (Panagopoulos & Panagopoulos, 2005). However, despite the heterogeneous geographic morphology and economic activity, the region as a whole retains the typical characteristics of a tourist destination, namely its susceptibility to various exogenous factors and its considerable contribution to the local and national economy.

The most notable factors shaping the tourist industry in the area are tourism development and the financial crisis. Over the last decades, mass tourism in the region has grown rapidly and a significant number of residents have turned to tourism-related occupations (Panagopoulos & Panagopoulos, 2005). In addition, during the last few years Greece, including the area of Western Greece, has been experiencing a severe financial crisis, and the tourism industry is one of the few remaining productive sectors (both private and public) that still show significant activity.

All of the aforementioned facts are crucial for specialists and consultants on economic affairs who make and adopt tourism policies and need a clear view of the near and distant future. It is of great importance to be able to forecast future tourism demand accurately, in order to maximize the benefits of selling the tourist product and to minimize the losses from a predictable disaster or misadventure. However, the special nature of the tourist product does not allow forecasters to easily make reliable and efficient projections.

In this direction, Panagopoulos and Panagopoulos (2005) proposed a forecasting model for predicting tourist occupancy in the Western Greece area using the Box-Jenkins method (Box & Jenkins, 1976), based on monthly data from January 1990 to December 1999 and forecasting 2 years ahead.

Thus, the study of the Western Greece region remains a great challenge because of all the inherent dissimilarities, and any proposal for modeling the overall circulation of the tourist product remains a timely issue for both researchers and local authorities.

Considering that there is no clear evidence that a single forecasting model can always deliver trustworthy forecasts (Song & Li, 2008), different methods and techniques have been proposed, covering a wide range of countries and locations as well as different time intervals. The most widely used models (especially with monthly data) are univariate or time-series models (Gunter & Önder, 2015). The most widely used technique in this framework is the family of (Seasonal) Autoregressive (Integrated) Moving Average models (Box & Jenkins, 1976). Recently, some new, well-performing time-series models have been proposed, such as the Exponential Smoothing models (Hyndman, Koehler, Ord, & Snyder, 2008; Hyndman, Koehler, Snyder, & Grose, 2002) and a low cost inferential model (Psillakis, Panagopoulos, & Kanellopoulos, 2009). Multivariate or econometric models are also employed, such as Autoregressive Distributed Lag Models (Dritsakis & Athanasiadis, 2000; Ismail, Iverson, & Cai, 2000), Error Correction Models (Kulendran & Witt, 2003; Roselló, Font, & Roselló, 2004), Vector Autoregressive models (Shan & Wilson, 2001; Witt, Song, & Wanhill, 2004) and Time-Varying Parameter models (Li, Song, & Witt, 2006; Song & Witt, 2006). Some artificial intelligence methods have also been used (Chena & Wang, 2007; Claveria & Torra, 2014; Hernández-López & Cáceres-Hernández, 2007; Kon & Turner, 2005; Palmer, Montaño, & Sesé, 2006). An exhaustive review of time series forecasting can be found in Song and Li (2008).

The problem of predicting future values on the basis of collected historical data, i.e. predicting future samples of a time series by extracting knowledge from its past values, arises in many scientific, economic and engineering applications (Wan, 1993; Weigend & Gershenfeld, 1994; Weigend, Huberman, & Rumelhart, 1990). The most powerful approach to the problem of prediction is to find a law underlying the given dynamic process or phenomenon. If such a law can be discovered and described analytically, e.g. by a set of ordinary differential equations, then by solving them we can predict the future values, provided the initial conditions are completely specified. Unfortunately, the information about a dynamic process under investigation is often only partial and incomplete, so the prediction cannot be based on a known analytical model. In this case we must try a less powerful approach and attempt to discover some strong empirical regularity in the observations of the time series. The unknown dynamic process is described by the nonlinear multivariable function:

$$ y(k)=F\left[y\left(k-1\right),\;y\left(k-2\right), \dots y\left(k-n\right)\right], $$
(1)

where \( y(k),\;k=N,\;N-1, \dots n \) with \( n\ll N \) are given samples of the time series and \( F\left[\cdot \right] \) is an unknown nonlinear function. Such a function can be viewed as a multidimensional surface. This means that the present or a future value is assumed to be a nonlinear function of the \( n \) previous ones. In a more compact form, with \( x(k) \) denoting the vector of the \( n \) previous samples \( y\left(k-1\right),\;y\left(k-2\right), \dots y\left(k-n\right) \), the above equation can be rewritten as

$$ y(k)=F\left[x(k)\right]. $$
(2)
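As an illustration of Eqs. (1) and (2), the following minimal sketch arranges a series into pairs of lag vectors \( x(k) \) and targets \( y(k) \). The function name `lagged_design` and the use of NumPy are assumptions made for this example and are not part of the original study.

```python
import numpy as np

def lagged_design(y, n):
    """Build lag vectors x(k) = [y(k-1), ..., y(k-n)] and targets y(k).

    Hypothetical helper for illustration only.
    Returns X with shape (len(y) - n, n) and target with shape (len(y) - n,).
    """
    y = np.asarray(y, dtype=float)
    if not 0 < n < len(y):
        raise ValueError("n must satisfy 0 < n < len(y)")
    # Column j holds the series delayed by j + 1 samples.
    X = np.column_stack([y[n - j - 1 : len(y) - j - 1] for j in range(n)])
    target = y[n:]
    return X, target
```

Any regression method fitted on `X` and `target` then acts as an estimate of the unknown function \( F \).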

In this paper we use a new prediction technique based on Independent Component Analysis (ICA), which transforms the data into a set of statistically independent components (ICs). Prediction is then performed on each IC separately, using any known forecasting method suited to the nature of the data. In this paper, the prediction approach is based on ARIMA models (Box, Jenkins, & Reinsel, 1994). Finally, the ICs are transformed back to the data space and mixed again to form the predicted time series. For ICA to be applied to a single time series, the series must first be transformed to a higher dimensional space using Dynamical System Analysis and Dynamical Embedding (DE).

The structure of the paper is as follows: In the next section we present the methodology of the proposed prediction method. In Sect. 3 our experiments are presented while in the last section, some conclusions and remarks are drawn.

2 Methodology

2.1 Independent Component Analysis

The task of ICA is to estimate a set of independent components by transforming the input feature vector of the observations \( \boldsymbol{x} \) into a vector of independent components \( \boldsymbol{y} \), using all available higher order statistical information in the observations. The linear projection of the observations is given by:

$$ \boldsymbol{y}(t)={\boldsymbol{W}}_{ICA}\boldsymbol{x}(t) $$
(3)

where \( {\boldsymbol{W}}_{ICA} \) is the \( M\times M \) ICA separating matrix containing the transformation axes. To estimate this matrix in an unsupervised manner, we apply the Maximum Likelihood Estimation (MLE) criterion. The log-likelihood of the observations \( \boldsymbol{x} \) is given by:

$$ L=\log \left({p}_x\left(\boldsymbol{x};{\boldsymbol{W}}_{ICA}\right)\right)=\log \left(\left|{\boldsymbol{W}}_{ICA}\right|\right)+\log \left({p}_y\left(\boldsymbol{y}\right)\right). $$
(4)

The weights of the ICA network are estimated recursively using the stochastic gradient of \( L \) with respect to the matrix \( {\boldsymbol{W}}_{ICA} \):

$$ \frac{\partial L}{\partial {\boldsymbol{W}}_{ICA}}={\left[{\boldsymbol{W}}_{ICA}^{-1}\right]}^T-\varPhi \left(\boldsymbol{y}\right){\boldsymbol{x}}^T, $$
(5)

where

$$ \varPhi \left(\boldsymbol{y}\right)=-{\left[\frac{p_1^{\prime}\left({y}_1;{\boldsymbol{W}}_{ICA}\right)}{p_1\left({y}_1;{\boldsymbol{W}}_{ICA}\right)}\dots \frac{p_N^{\prime}\left({y}_N;{\boldsymbol{W}}_{ICA}\right)}{p_N\left({y}_N;{\boldsymbol{W}}_{ICA}\right)}\right]}^T $$
(6)

and \( {p}_i\left({y}_i;{\boldsymbol{W}}_{ICA}\right) \) is the probability density function of the \( {i}^{th} \) source signal. The marginal pdf of the sources for the examined time series was found experimentally to follow the hyperbolic cosine distribution, \( {p}_i\left({y}_i;{\boldsymbol{W}}_{ICA}\right)\propto 1/ \cosh \left({y}_i\right) \), so that \( \varPhi \left({y}_i\right)= \tanh \left({y}_i\right) \). From (5) and using the natural gradient approach, we derived the following weight adaptation rule for the \( {\boldsymbol{W}}_{ICA} \) matrix:

$$ \varDelta {\boldsymbol{W}}_{ICA}=\eta \frac{\partial L}{\partial {\boldsymbol{W}}_{ICA}}{\boldsymbol{W}}_{ICA}^T{\boldsymbol{W}}_{ICA}=\eta \left[\boldsymbol{I}-\varPhi \left(\boldsymbol{y}\right){\boldsymbol{y}}^T\right]{\boldsymbol{W}}_{ICA}, $$
(7)

where \( \eta \) is the learning rate.
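The adaptation rule of Eq. (7) can be sketched as follows, assuming \( \varPhi = \tanh \) as stated above and a positive step size (called `eta` here). The sketch replaces the per-sample stochastic update with an average over a batch of samples, which is a simplification made for this example; the function names and default values are hypothetical.

```python
import numpy as np

def natural_gradient_step(W, X, eta):
    """One batch update W <- W + eta * (I - E[tanh(y) y^T]) W.

    W: (M, M) current separating matrix.
    X: (M, T) zero-mean observations, one column per sample.
    """
    M, T = X.shape
    Y = W @ X                      # component estimates, Eq. (3)
    C = np.tanh(Y) @ Y.T / T       # batch average of Phi(y) y^T
    return W + eta * (np.eye(M) - C) @ W

def fit_ica(X, eta=0.05, n_iter=500):
    """Iterate the update from an identity start (illustrative defaults)."""
    W = np.eye(X.shape[0])
    for _ in range(n_iter):
        W = natural_gradient_step(W, X, eta)
    return W
```

The step size and the number of iterations usually need tuning for the data at hand; no convergence guarantee is implied by this sketch.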

2.2 ICA Based Prediction

The independent components estimated by (3) can be coded with fewer bits than the observed signals \( \boldsymbol{x}(t) \). They are thus more structured and regular. This motivates predicting the signals \( {x}_i(t) \) by first moving to the ICA subspace, performing the prediction there and then transforming back to the original time series, as suggested by Pawelzik, Muller, and Kohlmorgen (1996). The prediction can be done separately for each component, possibly with a different method depending on its time structure. Furthermore, a suitable nonlinear filtering can be applied to each independent component \( {s}_i(t) \) to reduce the effects of noise. In particular, the ICs that contain very low frequencies (such as trend or slow cyclical variations) are smoothed, whereas the components that contain high frequencies or sudden shocks are high-pass filtered. As a next step, the non-linearly transformed components are predicted separately, using any kind of linear or non-linear prediction technique (Koutras et al., 2001). In this work we have used ARIMA modelling, and the prediction is performed for a number of different steps into the future.
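The idea of predicting in the ICA subspace and mixing back can be sketched as below. To keep the example self-contained, a plain least-squares autoregressive model stands in for the ARIMA models used in this work; this substitution, the function names and the omission of the filtering step are all assumptions of the sketch.

```python
import numpy as np

def ar_forecast(series, order, steps):
    """Least-squares AR(order) with intercept, forecast recursively.

    Stand-in for an ARIMA model, used here for illustration only.
    """
    s = np.asarray(series, dtype=float)
    # Each row holds [s(k-1), ..., s(k-order)] for the target s(k).
    rows = [s[k - order:k][::-1] for k in range(order, len(s))]
    A = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(A, s[order:], rcond=None)
    history = list(s)
    for _ in range(steps):
        lags = np.array(history[-order:][::-1])
        history.append(coef[0] + coef[1:] @ lags)
    return np.array(history[len(s):])

def forecast_in_ica_space(X, W, order, steps):
    """Forecast each component of y = W x separately, then mix back.

    X: (M, T) observations, W: (M, M) separating matrix.
    Returns an (M, steps) array of forecasts in the observation space.
    """
    Y = W @ X                                    # Eq. (3)
    Y_future = np.vstack([ar_forecast(y, order, steps) for y in Y])
    A_mix = np.linalg.inv(W)                     # mixing matrix
    return A_mix @ Y_future
```

A different forecaster could be plugged in per component, which is the flexibility the paragraph above refers to.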

2.3 Dynamical System Analysis

In order to apply the ICA method to the observed tourism demand time series, we must first transform it into a higher dimensional space. To this end we have used the Dynamical System Analysis method and in particular the Dynamical Embedding (DE) technique. Given a sampled time series, through DE we attempt to uncover as much information as possible about the underlying generators based only on the measured data (Broomhead & King, 1986). This is based on the assumption that the measured signal is due to the non-linear interaction of just a few degrees of freedom with additive noise, which suggests the existence of an unobservable deterministic generator of the observed data. If the number of degrees of freedom of the underlying system is given by \( D \), then \( D \) can be used as a coarse measure of system complexity. Using Takens' theorem (Takens, 1981), we can reconstruct the unknown dynamical system that generated the measured time series by building a new state space from successive observations of the time series. A DE matrix is constructed from a set of delayed vectors taken from the observed data \( \boldsymbol{x}(t) \), where the state of the unobservable system at time \( t \) is given by \( \boldsymbol{X}(t) \):

$$ \boldsymbol{X}(t)=\left\{x\left(t-\tau \right),\;x\left(t-2\tau \right), \dots x\left(t-\left(m-1\right)\tau \right)\right\}, $$
(8)

where \( \tau \) is the time lag and \( m \) is the embedding dimension. The above delay vector describes observations of the underlying system states, assuming that the data \( \boldsymbol{x}(t) \) are generated by a finite dimensional non-linear system of the form:

$$ \boldsymbol{x}(t)=f\left\{\boldsymbol{X}\left(t-1\right),\;\boldsymbol{X}\left(t-2\right), \dots \boldsymbol{X}\left(t-D\right)\right\}+{e}_t $$
(9)

where \( {e}_t \) is i.i.d. with zero mean and unit variance. Takens (1981) showed that the Euclidean embedding dimension \( m \) must be at least as large as \( D \), but in practice it must satisfy

$$ m>2D+1 $$
(10)

When applied to real-world data, the delay vector size \( m \) actually used needs to be considerably larger than the theoretical Euclidean embedding dimension because of the dependences within the time series and the noise of the system. The parameter \( m \) must be chosen large enough to capture the necessary information content; if the time series is heavily correlated, more samples are needed to make up the required information content of the delay vector. Once the optimal delay vector has been estimated, an embedding matrix is constructed out of a number of consecutive delay vectors. The number of delay vectors \( N \) is determined by the length of the signal to be analysed, but in practice must be at least as large as \( m \). The form of the embedding matrix is:

$$ \boldsymbol{X}=\left[\begin{array}{cccc}x(t)& x\left(t+\tau \right)& \dots & x\left(t+N\tau \right)\\ {}x\left(t+\tau \right)& x\left(t+2\tau \right)& \dots & x\left(t+\left(N+1\right)\tau \right)\\ {}\vdots & \vdots & \ddots & \vdots \\ {}x\left(t+\left(m-1\right)\tau \right)& x\left(t+m\tau \right)& \dots & x\left(t+\left(m+N-1\right)\tau \right)\end{array}\right] $$
(11)
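A minimal sketch of how an embedding matrix with the layout of Eq. (11) could be built from a sampled series is given below. The function name `embedding_matrix` is hypothetical, and the number of columns is simply whatever the length of the series allows, which is an assumption of this example.

```python
import numpy as np

def embedding_matrix(x, m, tau=1):
    """Return the (m, n_cols) matrix whose entry (k, j) is x[(k + j) * tau].

    Rows and columns both advance by tau samples, as in Eq. (11).
    """
    xs = np.asarray(x, dtype=float)[::tau]   # keep every tau-th sample
    n_cols = len(xs) - m + 1
    if n_cols < 1:
        raise ValueError("series too short for the requested m and tau")
    # Row k is the subsampled series shifted forward by k positions.
    return np.stack([xs[k : k + n_cols] for k in range(m)])
```

For example, `embedding_matrix(range(6), m=3)` has rows `[0, 1, 2, 3]`, `[1, 2, 3, 4]` and `[2, 3, 4, 5]`, so each sample of the series reappears along an anti-diagonal.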

The practical minimum size of \( m \) can be chosen based on the lowest frequency of interest and the lag \( \tau \) can be set to 1,

$$ m\ge \frac{f_s}{f_l},\tau =1 $$
(12)

where \( {f}_s \) denotes the sampling frequency and \( {f}_l \) denotes the lowest frequency of interest in the acquired signal. For the signals described here, we derived values for \( m \) and \( \tau \) in this manner, and over the series of the tourism demand, the choice of \( m=65 \) and \( \tau =1 \) proved optimal. If the choice of the lag term \( \tau \), the delay vector size \( m \) and number of lag vectors \( N \) is adequate, then the embedding matrix in equation (11) is rich in information about the temporal structure of the measured data. The overall time series prediction scheme is presented in the following Fig. 1:

Fig. 1 The proposed ICA-ARIMA forecasting scheme

Once the prediction is performed in the ICA subspace for every component separately, we return to the data space again to reconstruct the predicted time series. Therefore, the ICs must be projected back to the measurement space such that:

$$ {\boldsymbol{Y}}^i={\boldsymbol{a}}_i{\boldsymbol{y}}_i^T, $$
(13)

where \( {\boldsymbol{y}}_i \) is the \( {i}^{th} \) IC (\( i=1,2,\dots p \)), \( {\boldsymbol{a}}_i \) the corresponding column of the mixing matrix \( \boldsymbol{A} \) (the inverse of the separating matrix \( {\boldsymbol{W}}_{ICA} \) estimated by Eq. 7) and \( {\boldsymbol{Y}}^i \) the resulting “embedding matrix”. From \( {\boldsymbol{Y}}^i \) it is now possible to extract the projected time series \( {\boldsymbol{y}}_i(t) \) by averaging over the rows of \( {\boldsymbol{Y}}^i \), which un-embeds the time series using:

$$ {\boldsymbol{y}}_i(t)=\frac{1}{m}{\sum}_{k=1}^m{\boldsymbol{Y}}_{k,\left(t+k-1\right)}^i $$
(14)

for \( t=1,2,3,\dots N \), where \( {\boldsymbol{Y}}_{k,\left(t+k-1\right)}^i \) refers to the element of \( {\boldsymbol{Y}}^i \) indexed by row \( k \) and column \( t+k-1 \).
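The back-projection of Eq. (13) and the un-embedding step can be sketched as follows. The sketch assumes the layout of Eq. (11) with \( \tau =1 \), in which all entries that correspond to the same time sample lie on one anti-diagonal, and it divides by the actual number of entries on each anti-diagonal rather than by a fixed \( m \). Both choices, like the function names, are assumptions of this example and may differ in indexing from Eq. (14).

```python
import numpy as np

def back_project(A, Y, i):
    """Rank-one embedding matrix a_i y_i^T for component i, as in Eq. (13).

    A: (m, p) mixing matrix, Y: (p, n_cols) independent components.
    """
    return np.outer(A[:, i], Y[i])

def unembed(E):
    """Average the anti-diagonals of an (m, n_cols) embedding matrix.

    Returns a series of length m + n_cols - 1.
    """
    m, n_cols = E.shape
    total = np.zeros(m + n_cols - 1)
    count = np.zeros(m + n_cols - 1)
    for k in range(m):
        # Row k covers time samples k, k + 1, ..., k + n_cols - 1.
        total[k : k + n_cols] += E[k]
        count[k : k + n_cols] += 1
    return total / count
```

Summing `back_project(A, Y, i)` over all components recovers `A @ Y`, so un-embedding that sum returns the mixed series.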

3 Experiments

3.1 Experimental Dataset

To evaluate the performance of the proposed ICA-ARIMA forecasting method, we used the occupancy of all tourist accommodations (except camping sites) in the Region of Western Greece, which includes data from the Prefectures of Aitoloakarnania, Achaia and Ilia, from January 2005 to December 2012. All data employed in this study were obtained from the official records of the Hellenic Statistical Authority. It should be noted that the Hellenic Statistical Authority has not released any similar data for the period from 2013 until now.

There are a total of 96 data points in the dataset, and the monthly occupancy series is plotted in Fig. 2. The plot exhibits a long-term downward trend as well as a strong 12-month seasonality, with the occupancy maxima occurring during the high summer tourist season (the maximum falls in August of every year).

Fig. 2 Monthly occupancies of all tourist accommodations (except camping sites) from 2005 (1) to 2012 (12)

In order to test the performance of the proposed method, the collected data is divided into two sets, training data and testing data. In order to further test the efficacy of the ICA-ARIMA prediction method, we have tested the prediction accuracy with a prediction step ranging from 3 to 48 months (4 years).

3.2 Performance Criteria

Following Tay and Cao (2001) and Thomason (1999a, 1999b), the prediction performance of our method is evaluated using the mean absolute percentage error (MAPE) and the root mean square error (RMSE). MAPE and RMSE measure the correctness of the prediction in terms of levels and the deviation between the actual and predicted values. The smaller the values, the closer the predicted values are to the actual values.
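For reference, the two criteria can be computed as in the sketch below, which uses their standard textbook definitions. Expressing MAPE as a fraction rather than multiplying by 100 is an assumption made here because it matches the magnitude of the error values reported in this section; the function names are illustrative.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction.

    Assumes that no actual value is zero.
    """
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((a - p) / a)))

def rmse(actual, predicted):
    """Root mean square error, in the units of the series."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((a - p) ** 2)))
```

With actual values `[100, 200]` and predictions `[110, 180]`, both relative errors are 0.1, so the MAPE is 0.1, while the RMSE is the square root of 250.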

3.3 Experimental Results

In order to estimate the dimensionality of the DE step, we performed extensive experiments using different values of \( N \) (which equals the number of estimated independent components), varying from 2 to 20. Forecasting was then performed for fixed prediction steps of 3, 6, and 12 months, and the MAPE and RMSE errors were calculated for the ICA-ARIMA and the classic ARIMA technique (ARIMA(2,1,2) model with seasonal AR(12) and MA(12) terms). The results show that the performance of the proposed method is best for \( N=5 \), remains constant for values of \( N \) up to 10, and deteriorates for values greater than 10. Therefore, \( N=5 \), which ensures both the best performance and the smallest computational complexity, was used throughout our experiments.

The performance of the proposed method was initially tested by using the first 84 data points (84 months, i.e. the 7 years from 2005 to 2011) for training, while the remaining 12 data points (the 12 months of 2012) were forecasted using both the proposed ICA-ARIMA and the ARIMA forecasting technique. The performance of both methods was compared using the prediction error measurements presented in Table 1. In Fig. 3, the forecasts for the last year (2012) are presented for the two aforementioned methods together with the actual monthly data. The proposed method works well and slightly better than the ARIMA technique, as both the MAPE and the RMSE indicate (MAPE: ICA-ARIMA 0.0829, ARIMA 0.0918; RMSE: ICA-ARIMA 0.0193, ARIMA 0.0235). Both algorithms were further used to predict the occupancy for the following year, 2013. The results are shown in Fig. 4. Tables 2 and 3 list the forecasted occupancy values for the year 2012 and for the year 2013, respectively, obtained with the proposed method and with the ARIMA prediction model.

Table 1 Performance indices and their calculations

Fig. 3 Occupancy prediction for the year 2012

Fig. 4 Occupancy prediction for the year 2013

Table 2 Predicted occupancy values for the year 2012

Table 3 Predicted occupancy values for the year 2013

The efficacy of the proposed ICA-ARIMA method was further investigated by calculating the forecasting prediction error using different cases of prediction steps varying from 3 to 48 months (4 years), and compared to the traditional ARIMA forecasting method that was applied on the original observed time series.

The two types of prediction error measurements (MAPE and RMSE) are presented in the following three figures with respect to the prediction step. The proposed ICA-ARIMA method outperforms the ARIMA forecasting technique, especially as the prediction step increases. Furthermore, unlike ARIMA, the performance of the proposed method remains nearly constant across prediction steps. The proposed ICA-ARIMA technique shows a mean MAPE prediction error of \( 0.0713\pm 0.0077 \), while the ARIMA technique shows a mean error of \( 0.2335\pm 0.3679 \). For the ICA-ARIMA technique the mean RMSE prediction error was found to be \( 0.0203\pm 0.0021 \), compared with \( 0.0655\pm 0.1059 \) for ARIMA.

In addition, the proposed method works better than the traditional ARIMA technique when used to estimate tourism demand in the high tourist season (June–September). The mean prediction error, estimated by taking into account only these predicted months, is likewise almost invariant to the prediction step and smaller than the average error of the ARIMA technique, as Fig. 7 shows (\( 0.0101\pm 0.0024 \) for the ICA-ARIMA method, compared to \( 0.0373\pm 0.0719 \) for the ARIMA model) (see also Figs. 5 and 6).

Fig. 5 The MAPE error for the ICA-ARIMA and the ARIMA forecast method

Fig. 6 The RMSE error for the ICA-ARIMA and the ARIMA forecast method

Fig. 7 The RMSE error for both prediction methods and the high tourist season (June–September)

4 Conclusion

In this paper we have proposed a new technique for forecasting tourism demand from observed time series. The proposed technique uses Dynamical Embedding to transform the observed time series into a higher dimensional space, projects the embedded data onto statistically independent components using Independent Component Analysis (ICA), and performs forecasting on every independent component separately using the ARIMA prediction technique. The forecasted components are then combined in order to move back into the time series space and obtain the forecasted time series. Experiments on the occupancy of all tourist accommodations (except camping sites) in the Region of Western Greece from January 2005 to December 2012 demonstrate the efficacy of the proposed method, which was tested over a wide range of prediction steps and compared to the classic ARIMA forecasting method. The prediction error was found to be significantly smaller and almost invariant over a wide range of prediction steps. Furthermore, the proposed method performs well when used to forecast tourist accommodation occupancy in the high tourist season (June–September).