1 Introduction

With the massive amount of data accessible on the internet, processing information and selecting the most suitable options from it has become an exhausting task. Therefore, recommender systems are used in many different areas to present the most relevant options to users according to their expectations. The travel industry is one of the areas that can benefit greatly from recommender systems, but travel recommendation is complex because several items have to be combined to create a reasonable itinerary. This problem of recommending a trip composed of multiple points of interest (POIs) is called the Tourist Trip Design Problem (TTDP) [1].

TTDP solutions focus on personalizing trips by applying different techniques to find POIs that are interesting to the user and to create feasible routes that maximize the satisfaction of the traveler while taking certain constraints into account. The route is therefore personalized only through the selection of attractions and, consequently, people with similar interests are recommended similar itineraries. However, a recommender system needs to take another factor into account: the travel style of the user. Two travelers might enjoy the same type of attractions, yet their travel styles still affect how they would put an itinerary together. In this work, we propose a solution to create multi-day itineraries that are personalized according to the user’s travel style using four factors: the level of interest of the user in different POI categories in terms of time spent visiting POIs, the pace of the traveler, the effort of visiting each POI and the diversity of POI categories expected from the itinerary. These factors are incorporated into the itinerary creation process by altering the visiting duration of the POIs as well as modifying the clustering and route creation algorithms.

The paper is structured as follows: In Sect. 2, we discuss related work on recommending interesting POIs and generating feasible itineraries. Section 3 explains the methodology for incorporating personalization options into the itinerary creation process. In Sect. 4, we present metrics to test the success of the proposed algorithms in an offline experiment. In Sect. 5, we describe the user study we conducted to evaluate the solution from a user’s perspective. Section 6 concludes the paper with a summary and directions for future work.

2 Related Work

Different approaches focus on solving the TTDP by recommending interesting POIs for travelers, creating feasible itineraries and taking travelers’ constraints into account. Du et al. [2] suggest a data mining approach from GPS data to find the most suitable POIs for the travelers using features of scenic spots and the order of visit. Hsueh et al. [3] propose a personalized itinerary recommender system with time constraints using data collected from location based social networks to match users that have similar interests. Aksenov et al. [4] propose using traveler’s dynamic needs and affects to recommend POIs by adding the users’ emotions to the itinerary planning process. Lim et al. [5] collect user interests and travel behaviour through processing the previous routes using geotagged photos to personalize visiting duration of POIs and create itineraries. Zancanaro et al. [6] use unsupervised learning models on the travel pattern data collected from museum visitors to categorize the travel behavior of users.

Bolzoni et al. [7] limit the number of POIs visited from a certain category throughout the itinerary to reduce sensory overload. Chang et al. [8] propose an itinerary planner that creates a trip from a given starting location to an end destination and finds POIs that match users’ interests using a greedy approach. Chen et al. [9] infer user interests to select interesting POIs and personalize the visiting duration of POIs by comparing the time the traveler spends at a certain POI with the duration of an average visit. Luan et al. [10] tackle the diversity of attractions visited within the itinerary with an ant-colony-optimization strategy where, for each itinerary, a trip score that denotes the satisfaction level and an entropy factor that denotes the diversity of the trip are calculated. Chen et al. [11] propose a multi-day itinerary planner that models the problem as a variant of the Team Orienteering Problem (TOP), where single-day itineraries are created using travelers’ interests and are then combined to create multi-day trips.

In our solution, we extend existing work by combining several options to create personalized multi-day itineraries and evaluate them both in an offline study as well as from a user’s perspective.

3 Method

We use a time-limit approach [12] for the initial route creation process. This approach divides route creation into two parts: (i) creating clusters of highly rated POIs for each day and (ii) creating routes with the highest \(profit/cost\) ratio. The personalization options are added to these existing steps, together with new steps that assign intensity coefficients to POI categories and calculate personalized visiting durations for POIs.

3.1 Calculating Coefficients and Scores Related to POIs

This section explains the coefficients that are used to interpret the interests and the travel style of the user in order to generate feasible routes.

Intensity Coefficients of POI Categories. For each POI category, an intensity score is determined so that the pace of the traveler can be matched with the POIs. We calculated the intensity coefficients by conducting a survey to collect the opinion of participants on the intensity of visiting a certain POI category in terms of physical or mental effort. A score of 1 means that a very low effort is needed to visit POIs in the given category and 5 means a very high effort.

In total, the questionnaire was filled out by 242 people (61.2% female, 38.4% male, 0.4% other). The participants were recruited by sharing the questionnaire on social media and among acquaintances. The age distribution of the participants was 18–24 (41.7%), 25–30 (29.7%), 31–40 (5.4%), 41–59 (21.1%) and 60+ (2.1%). Each coefficient is used to assign POI categories to low-, medium- and high-effort groups. For each age group, the POI categories with intensity coefficients that are one standard deviation above the mean intensity score of all categories are marked as high-effort, the ones that are one standard deviation below that mean are marked as low-effort, and the rest are marked as medium-effort. Each POI has the intensity coefficient of its respective category. Walking between the POIs is assigned the highest possible intensity coefficient of 5.
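The grouping rule can be illustrated with a minimal sketch. It assumes that one age group's survey results are available as a mapping from category name to mean intensity coefficient, that the population standard deviation is meant, and that the thresholds are strict; none of these details are fixed by the text, and the category names and values are purely illustrative.

```python
from statistics import mean, pstdev


def effort_groups(coeffs):
    """Label each POI category as 'low', 'medium' or 'high' effort.

    coeffs: dict mapping category name -> intensity coefficient (1 to 5)
            for a single age group (hypothetical input format).
    """
    values = list(coeffs.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    groups = {}
    for category, coeff in coeffs.items():
        if coeff > mu + sigma:      # assumption: strict comparison
            groups[category] = "high"
        elif coeff < mu - sigma:
            groups[category] = "low"
        else:
            groups[category] = "medium"
    return groups


# Illustrative coefficients only, not the survey results.
example = {"museum": 3.0, "park": 2.0, "hiking": 4.5, "cafe": 1.2, "shopping": 3.1}
print(effort_groups(example))
```

With these illustrative values only the hiking category exceeds the upper threshold and only the cafe category falls below the lower one.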

Interpreting the Travel Style of the Users. The travel style of the users is affected by many different factors such as the city they are visiting, their physical condition, interests and previous experiences. Therefore, the effects of different factors are taken into consideration and mapped to coefficients, which are then used to tailor the visiting duration of POIs as well as the selection and sequencing of POIs throughout the itinerary.

  • Interest Coefficient: The interest coefficient of a POI category is calculated from the POIs previously visited by the user to understand how much time they are willing to spend at POIs of that category. Even if two users have the same favorite POI category, their interest levels may vary. Therefore, the interest coefficient of a user for the POI category c is calculated as:

    $$ Interest_c = \frac{\sum _{p\in POI_c} dur_p / avg_p}{|POI_c|} $$

    where \(POI_c\) is the set of all POIs from category c that have a visiting duration added by the user, \(dur_p\) is the visiting duration added for the POI p and \(avg_p\) is the average duration suggested for POI p by the experts.

  • Visiting Coefficient: The visiting coefficient defines whether the user wants to spend longer at the selected POIs or to visit more places. A lower visiting coefficient indicates that the user prefers to spend longer at the given POIs, and a higher coefficient indicates that the user wants to visit more places and prefers to spend less time at each of them.

  • Effort Coefficient: The effort coefficient defines how much the user is willing to exert effort to visit POIs. The users with higher effort coefficients spend the recommended time at high-effort POIs whereas the users with lower effort coefficients spend less time at high-effort POIs and spend more time at low-effort POIs.

  • Diversity Coefficient: The diversity coefficient defines how much the user prefers to visit different types of attractions instead of only their highest-rated types. Users with a low diversity coefficient are recommended highly rated POIs, whereas users with a high diversity coefficient receive a more diverse selection of POIs.

In order to collect the preferences of the users and calculate these coefficients, travelers are asked a set of questions through a mobile interface.
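As an illustration of the interest coefficient defined above, the following minimal sketch averages the ratio of the user's visiting duration to the expert-suggested duration over the visited POIs of one category. The input format and the neutral fallback value of 1.0 for categories without any recorded visit are assumptions made for this example.

```python
def interest_coefficient(visits):
    """Interest coefficient of one POI category for one user.

    visits: list of (dur_p, avg_p) pairs, where dur_p is the duration the
            user entered for POI p and avg_p the expert-suggested average.
    """
    if not visits:
        return 1.0  # assumption: neutral value when no visit is recorded
    return sum(dur / avg for dur, avg in visits) / len(visits)


# A user who stayed 120 and 90 minutes at two museums with a suggested
# duration of 60 minutes each (illustrative numbers).
print(interest_coefficient([(120, 60), (90, 60)]))  # 1.75
```

A value above 1 indicates that the user tends to stay longer than the suggested duration in this category, and a value below 1 that they tend to stay shorter.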

Rating of POIs. We calculate the ratings of POIs similarly to the research by Wörndl et al. [13] by using the traveler interest for each category as well as the TripAdvisor scores of the POIs:

$$ score = category\; rating * TripAdvisor \; rating * \log_{2}(amount\; of \; votes + 1) $$

This ensures that places that have low ratings on TripAdvisor or places that have fewer reviews are given lower scores. The ratings calculated for each POI are used during the generation of the routes as the profit of visiting a POI is measured through its rating.
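A direct transcription of the scoring formula might look as follows. The function and parameter names are chosen for this example, and the rating scales are not specified here, so the numbers are illustrative only.

```python
import math


def poi_score(category_rating, tripadvisor_rating, num_votes):
    """Score of a POI: the user's category rating times the TripAdvisor
    rating, weighted by the base-2 logarithm of the number of votes plus one."""
    return category_rating * tripadvisor_rating * math.log2(num_votes + 1)


print(poi_score(4, 4.5, 1023))  # 180.0
print(poi_score(4, 4.5, 0))     # 0.0, a POI without votes gets no score
```

The logarithm dampens the influence of the vote count, so a POI needs many more reviews to gain each additional unit of weight.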

Calculating Visiting Duration for POIs. For each POI, an average visiting duration is recommended by experts. However, these values often do not match the personal expectations of the user. In order to assign personal visiting durations, each POI is first assigned a minimum and a maximum visiting duration, where the minimum is half and the maximum is twice the duration recommended by the experts. This ensures that travelers spend a feasible amount of time at the POIs and still get personalized recommendations. For a user u, the personalized visiting duration of a POI p is then calculated as:

$$ VisitingDuration_p = \frac{AverageDuration_p * Interest_p *Intensity_p}{Visiting_u} $$

where \(Interest_p\) is the interest coefficient of the POI and \(Visiting_u\) is the visiting coefficient of the user. \(Intensity_p\) is derived from the intensity score calculated in Sect. 3.1, which marks each POI category as low-, medium- or high-intensity. For each level of the effort coefficient, a value between 0.5 and 2 is assigned to each intensity group according to how closely that group matches the effort coefficient. The pre-calculated value for the intensity group of the POI’s category is then used as \(Intensity_p\).
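The calculation can be sketched as below. The sketch assumes that the raw result of the formula is clamped to the minimum and maximum durations described above (half and twice the expert recommendation); the exact way the bounds are enforced is not spelled out in the text, and the function name and inputs are hypothetical.

```python
def personalized_duration(avg_duration, interest, intensity, visiting):
    """Personalized visiting duration of a POI for one user.

    avg_duration: expert-recommended duration (e.g. in minutes)
    interest:     interest coefficient for the POI
    intensity:    pre-calculated value between 0.5 and 2 for the POI's
                  intensity group and the user's effort coefficient
    visiting:     visiting coefficient of the user (must be > 0)
    """
    raw = avg_duration * interest * intensity / visiting
    lower, upper = 0.5 * avg_duration, 2.0 * avg_duration
    # Assumption: results outside the feasible range are clamped to it.
    return min(max(raw, lower), upper)


print(personalized_duration(60, 1.5, 1.0, 1.0))  # 90.0
print(personalized_duration(60, 2.0, 2.0, 1.0))  # 120.0 (capped at twice the average)
print(personalized_duration(60, 0.5, 0.5, 2.0))  # 30.0 (raised to half the average)
```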

3.2 Clustering POIs into Multiple Days

To have an itinerary with routes that match the travel style of the user, the overall intensity of the POIs in the route must reflect the pace of the traveler. Therefore, the clusters must contain different types of POIs with different intensities while keeping nearby and highly rated POIs together. This way, the diversity criterion can also be met since each cluster has a variety of POIs from different categories. The basic time-limit approach aims to cluster the highest-rated POIs together and selects the clusters with the highest ratings. However, in certain cases, this approach might yield an uneven distribution of POIs from different categories, and the number of POIs in the clusters might be too low for certain POI categories. To solve this problem, each POI is pre-processed within its respective category rather than by its overall score across all categories, and the POIs of each category are marked as must-visit, can-visit or don’t-visit by comparing individual ratings to the average rating of the category, similarly to [12]. Then, agglomerative clustering with a time limit is applied to the POIs from each category separately to create clusters that only include POIs from that category. In the agglomerative clustering step, the visiting duration of each POI is taken as the minimum of its average and personalized durations.

After the agglomerative clustering is applied to each category and clusters are obtained, the resulting clusters are further merged using a modified version of agglomerative clustering. Instead of adding a time limit to the clusters, the two closest clusters that have POIs from different categories are merged. The resulting clusters have different types of POIs, and the top n clusters with the highest ratings are selected to run the routing algorithm. One drawback of this approach is that the clusters obtained can be too large to be visited in one day, so higher-rated POIs might be discarded, because each per-category sub-cluster uses the total visiting duration of the day as its time limit. To mitigate this issue, the user can specify a higher level of diversity expectation, in which case a strict-clustering approach sets the time limit of each cluster to totalDuration/numPOICategories.
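The per-category clustering step can be illustrated with a simplified sketch. It is not the implementation of [12]: it assumes planar coordinates with Euclidean distance between cluster centroids, and it simply keeps merging the two closest clusters as long as their combined visiting duration stays within the time limit. All names and data are hypothetical.

```python
import math
from itertools import combinations


def centroid(cluster):
    """Mean position of the POIs in a cluster."""
    xs = [p["x"] for p in cluster]
    ys = [p["y"] for p in cluster]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def total_duration(cluster):
    return sum(p["duration"] for p in cluster)


def time_limited_clustering(pois, time_limit):
    """Merge the two closest clusters whose combined visiting duration
    fits within time_limit; stop when no such pair is left."""
    clusters = [[p] for p in pois]
    while True:
        best = None  # (distance, i, j) of the closest feasible pair
        for i, j in combinations(range(len(clusters)), 2):
            if total_duration(clusters[i]) + total_duration(clusters[j]) > time_limit:
                continue
            d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        if best is None:
            return clusters
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i


# Illustrative POIs of a single category.
museums = [
    {"name": "A", "x": 0, "y": 0, "duration": 60},
    {"name": "B", "x": 0, "y": 1, "duration": 60},
    {"name": "C", "x": 10, "y": 10, "duration": 60},
    {"name": "D", "x": 10, "y": 11, "duration": 90},
]
# Strict variant: the day's total duration divided by the number of categories.
strict_limit = 480 / 4
for cluster in time_limited_clustering(museums, strict_limit):
    print([p["name"] for p in cluster])
```

In this example A and B are merged, while C and D stay separate because their combined duration of 150 exceeds the strict limit of 120.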

Fig. 1. Sample route with diversity options enabled

3.3 Creating a Sequence of POIs for Each Day

To add diversity within the route, the ordering of the POIs must ensure that different types of POIs are selected in succession. One way to implement this is to add a diversity coefficient as a contextual factor: a candidate POI receives a penalty factor of 0.5 if it is from the same category as the previous POI on the route, and a penalty factor of 0.75 if it is from the same category as the POI that was added before the last added POI.

For the route creation step, we use the greedy approximation algorithm proposed by Laporte et al. [14]. At each step, the POI with the highest profit/cost ratio is selected, where the profit of a POI is calculated as \(rating*penalty\), with the rating calculated as described in Sect. 3.1. In certain cases, POIs within a cluster might have longer visiting durations and therefore, a sufficient route cannot be created. Thus, the personalized duration of POIs is used until the time limit is reached during the route creation process. Then, for each POI, the algorithm switches to the average visiting duration if it is shorter than the personalized duration, so that more POIs can be added to the route. A sample route is shown in Fig. 1.
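The interplay of the diversity penalty and the greedy selection can be sketched as follows. This is a simplified illustration rather than the algorithm of [14]: it assumes that coordinates are scaled so that Euclidean distance equals walking time, that the cost of a POI is walking time plus visiting duration, and that the traveler must be able to return to the hotel within the time limit. The fallback to average visiting durations is omitted, and all names and data are hypothetical.

```python
import math


def diversity_penalty(category, route):
    """Factor applied to the rating of a candidate POI."""
    if route and route[-1]["category"] == category:
        return 0.5   # same category as the previous POI
    if len(route) >= 2 and route[-2]["category"] == category:
        return 0.75  # same category as the POI before the previous one
    return 1.0


def greedy_route(hotel, pois, time_limit):
    """Repeatedly append the feasible POI with the highest profit/cost ratio."""
    route, used, pos = [], 0.0, hotel
    remaining = list(pois)
    while remaining:
        best, best_ratio = None, 0.0
        for poi in remaining:
            cost = math.dist(pos, poi["xy"]) + poi["duration"]
            back = math.dist(poi["xy"], hotel)
            if used + cost + back > time_limit:
                continue  # would not make it back to the hotel in time
            profit = poi["rating"] * diversity_penalty(poi["category"], route)
            if profit / cost > best_ratio:
                best, best_ratio = poi, profit / cost
        if best is None:
            break
        used += math.dist(pos, best["xy"]) + best["duration"]
        pos = best["xy"]
        route.append(best)
        remaining.remove(best)
    return route


cluster = [
    {"name": "m1", "category": "museum", "xy": (0, 10), "duration": 60, "rating": 100},
    {"name": "m2", "category": "museum", "xy": (0, 20), "duration": 60, "rating": 90},
    {"name": "p1", "category": "park", "xy": (0, 15), "duration": 30, "rating": 40},
]
print([p["name"] for p in greedy_route((0, 0), cluster, 300)])  # ['m1', 'p1', 'm2']
```

Without the penalty, the second museum would be picked directly after the first one; with it, the lower-rated park is placed between the two museums.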

Fig. 2. Mobile application to present itineraries: (a) Overview of the routes (b) Route of each day on the map and (c) Detailed information about the selected POI

3.4 Generating a User Interface to Present Itineraries

We developed an Android application to present the itineraries to travelers. Users can add information about their age, gender, interests, travel preferences and the POIs they visited before with the time they spent there. The information is then used by the recommender system to generate multi-day itineraries as explained above.

Once the routes are generated, the mobile application shows when to visit each place, how long to stay there and the approximate walking duration between POIs. The interface for a sample three-day itinerary is shown in Fig. 2(a). The app also displays each route on a map so that travelers can follow it (Fig. 2(b)) and presents information about each POI (Fig. 2(c)).

4 Offline Evaluation

To understand whether the itineraries follow the constraints given by the user, we evaluated generated itineraries with different intensity and diversity levels in an offline study. To evaluate different intensity levels, we used four variants: the baseline time-limit approach, the low-intensity variant that promotes visiting low-effort POIs for longer times, the medium-intensity variant that balances visiting different types of POIs and the high-intensity variant that promotes visiting higher-effort POIs for longer times. To evaluate diversity levels, three variants are used: the baseline time-limit approach, the low-diversity variant that promotes diversity using modified time-limit clustering only and the high-diversity variant that enables diversity both in the clustering and in the route creation process.

4.1 Intensity Score of Routes

The intensity of a route r with n POIs, where the first POI is the hotel of the user, is calculated as:

$$\begin{aligned} Intensity_r = \frac{\sum _{i=0}^{n-1}walkIntensity_{i,i+1}+\sum _{i=1}^{n}Intensity_{i}+walkIntensity_{n,0}}{duration_r} \end{aligned}$$
(1)

where the intensity score of POI i is \(Intensity_i\) and

$$\begin{aligned} Intensity_i = intensityCoeff_i * visitingDuration_i \end{aligned}$$
(2)

where \(intensityCoeff_i\) is the coefficient calculated for the POI category of i in Sect. 3.1. \(walkIntensity_{i,j}\) is the intensity score of walking between POIs i and j and

$$\begin{aligned} walkIntensity_{i,j} = intensityCoeff_{walking} * duration_{i,j}\end{aligned}$$
(3)

where \(duration_{i,j}\) is the time to walk from i to j and \(duration_r\) is the total duration of the route.
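A compact way to read the three equations together is the sketch below. It assumes that the total duration of the route is the sum of all visiting and walking durations, and that the list of walking legs already contains every leg walked, including the return to the hotel; the input format is hypothetical.

```python
def route_intensity(visits, walks, walking_coeff=5.0):
    """Intensity score of one route.

    visits: list of (intensity_coeff, visiting_duration) per visited POI
    walks:  list of walking durations, one per leg, including the leg
            from the hotel and the leg back to it
    walking_coeff: intensity coefficient of walking (5 in the text above)
    """
    poi_part = sum(coeff * dur for coeff, dur in visits)
    walk_part = sum(walking_coeff * dur for dur in walks)
    # Assumption: the route duration is visiting time plus walking time.
    duration = sum(dur for _, dur in visits) + sum(walks)
    return (poi_part + walk_part) / duration


# Two POIs (coefficients 2 and 4) and three walking legs of 10 minutes each.
print(route_intensity([(2.0, 60), (4.0, 30)], [10, 10, 10]))  # 3.25
```

Because numerator and denominator are both weighted by time, the score is a duration-weighted average of the intensity coefficients and therefore stays within the 1 to 5 range of the coefficients.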

Fig. 3. Mean of intensity scores for the algorithm with different intensity levels (a) and coefficient of variation of intensity scores for the algorithm with different intensity levels (b)

For the different intensity variants, the mean intensity scores of the routes are shown in Fig. 3(a) and the coefficients of variation of the intensity scores are shown in Fig. 3(b). The time-limit approach yields results similar to the medium-intensity variant because the personalized visiting durations of POIs are similar to the recommended durations that are used by the time-limit algorithm. For all routes, the low-intensity variant has a lower intensity score than the other variants, whereas the high-intensity variant produces a higher mean intensity score than the others only in some cases. One possible reason why the high-intensity variant does not have significantly higher intensity scores than the others is that the dataset includes fewer POIs from high-intensity categories.

4.2 Diversity Score of Routes

The mean diversity score of each route is calculated by assigning a penalty to each POI, similarly to the penalty explained in Sect. 3.3. For a trip with n days, where each day d has \({k_d}\) POIs selected, the mean diversity score is:

$$ \overline{div} = \frac{\sum \limits _{d=1}^n \frac{\sum \limits _{p=1}^{k_d} {penalty_p}}{k_d}}{n} $$

For the different diversity variants, the mean diversity scores for 1 to 7-day trips are shown in Fig. 4(a) and the coefficients of variation are shown in Fig. 4(b). The diversity score is lowest for the baseline time-limit approach, since the POIs selected are mostly from the highest-rated category. As expected, the high-diversity variant has a higher diversity score than the other variants and a lower coefficient of variation, since it enables diversity within the route creation step and chooses each POI while checking the previously selected POIs.
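As a worked illustration of the mean diversity score, the sketch below assigns each POI the penalty value from Sect. 3.3 (0.5, 0.75, or 1.0 when neither condition applies) based on the categories of the two POIs preceding it on the same day, averages the values per day and then averages over the days. Treating a POI without a matching predecessor as 1.0 and requiring every day to contain at least one POI are assumptions of this example.

```python
def penalties_for_day(categories):
    """Penalty value of each POI on one day, given the visit order."""
    values = []
    for i, category in enumerate(categories):
        if i >= 1 and categories[i - 1] == category:
            values.append(0.5)   # same category as the previous POI
        elif i >= 2 and categories[i - 2] == category:
            values.append(0.75)  # same category as the POI before that
        else:
            values.append(1.0)   # assumption: no penalty otherwise
    return values


def mean_diversity(days):
    """Average over days of the mean penalty value per day.

    days: list of non-empty lists of POI categories in visit order.
    """
    day_means = [sum(penalties_for_day(day)) / len(day) for day in days]
    return sum(day_means) / len(day_means)


trip = [
    ["museum", "museum", "park", "museum"],  # penalties 1.0, 0.5, 1.0, 0.75
    ["park", "cafe"],                        # penalties 1.0, 1.0
]
print(mean_diversity(trip))  # 0.90625
```

A score of 1.0 means that no POI shares its category with either of its two predecessors, so higher values correspond to more diverse routes.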

Fig. 4. Mean diversity score for the algorithm with different diversity levels (a) and coefficient of variation of diversity score for the algorithm with different diversity levels (b)

5 User Study

In addition to the offline evaluation, we conducted a user study to evaluate the different options from a user’s perspective. To ensure that the routes generated for different participants match their travel style, we created three personas to reflect different preferences: adventurers, museum lovers and foodies. Each persona has different interest coefficients for the POI categories, different intensity and visiting coefficients as shown in Table 1. The participants of the user study were asked to select the persona closest to their travel style and given a detailed explanation to evaluate the routes according to the pre-defined travel styles.

Table 1. Coefficients to calculate visiting durations of POIs for each persona

To compare the different algorithms and options, we created five different types of routes from two different hotels as starting points, for a total of ten routes for each persona. The algorithm variants enabled the personalization options of user interest, personalized visiting durations and diversity to different degrees in order to measure the effect of these options on user satisfaction. The personalization options for each algorithm variant are shown in Table 2.

Table 2. Personalization options enabled in each variant for the user study

The participants were shown the same 10 routes that were created for their persona and, for each route, were asked to rate the following statements on a Likert scale ranging from strongly disagree (1) to strongly agree (5):

  1. The recommended points-of-interests match the selected travel style
  2. The number of points-of-interests on each day of the trip is good
  3. The points-of-interests visited throughout the itinerary are diverse
  4. The time spent visiting each point-of-interest is suitable for the selected travel style
  5. The overall intensity of the itinerary matches the selected travel style (The itinerary isn’t too light or too tiring)
  6. The recommended itinerary matches the selected travel style overall
  7. I’m satisfied with the overall recommendation

In total, 53 people (66% female, 34% male) participated in the user study. The participants were recruited by sharing the questionnaire on social media and among acquaintances. The age distribution of the participants was 18–24 (14.8%), 25–30 (33.3%), 31–40 (14.8%), 41–59 (29.7%) and 60+ (7.4%). The selected personas of the participants were adventurers (39.6%), museum lovers (28.3%) and foodies (32.1%). Table 3 shows the results for each of the seven questions.

Table 3. User study results (* \(p<0.1\), ** \(p<0.05\))

We performed a one-way ANOVA test to test significance. For questions 4 and 6, the differences between the algorithms were significant at the level of \(\alpha = 0.05\), and for questions 3 and 5, they were significant at the level of \(\alpha = 0.1\). For the questions that yielded significant results, we applied further pairwise comparison tests with a significance level of \(\alpha = 0.05\) to compare the algorithms. In terms of diversity, all algorithms performed significantly better than the TL approach. In terms of visiting durations of POIs, TLVD recommended significantly better visiting durations than the PBTL, TL and MTL variants, and MTLVD performed significantly better than the PBTL approach. In terms of matching the expected intensity level, MTL and MTLVD performed significantly better than the TL approach. In terms of matching the overall travel style of the user, TLVD performed significantly better than all other approaches except MTLVD.

6 Conclusion and Future Work

In this work, we investigated different factors that affect the travel style of travelers when generating personalized multi-day round trip itineraries. The results showed that personalizing the visiting duration of POIs provided better itineraries in terms of matching the pace of the traveler and also in providing more satisfying itineraries. The users found the personalized visiting durations of POIs more suitable than the suggested average durations. The approaches used for clustering POIs and route creation were also successful in increasing diversity in the itineraries. Overall, the proposed personalization options provided better and more enjoyable routes than their non-personalized counterparts.

For future work, each POI can be evaluated individually to calculate the intensity coefficient, since the category of a POI is not enough to reflect its intensity. To better capture the travel style, the user could select rest periods for each day or specify time for a lunch break. The pace of the user can also be interpreted better by understanding the physical condition of the user, for example by collecting average walking distances from smartphone sensors to tailor walking distances. In terms of route creation, the current approach follows the shortest paths between the POIs to generate the directions; in future work, routes can be altered to include scenic walking routes to create more enjoyable trips.