1 Introduction

Tourism is a major industry in the global economy and plays a significant role in our lives. To create enjoyable and fulfilling travel experiences, it is crucial for tourists to consider potential destination cities, local attractions, cultural experiences, and route planning. However, much of the imagery and information provided by travel enterprises and local governments may have been beautified to attract tourists from a marketing and publicity perspective [1], potentially leading to untrustworthy information. Therefore, information based on user-generated content is necessary to assist tourists in deciding on destinations and attractions to visit.

Moreover, personalized POI and route recommendations also face challenges when dealing with tourists who have little or no historical records (e.g., check-ins, tweets, images, etc.), which is known as cold-start user scenarios. To make personalized recommendations in cold-start user scenarios, extra user effort is required to learn user preferences [2]. One direct way to acquire information from a new user for personalized recommendation is to provide POIs for the user to rate or attributes to select [3]. However, this can pose a challenge for users who are unfamiliar with the destination city, as they tend to select only what is familiar to them or what they expect to enjoy, which leads to a lack of exploration in the destination city.

Another challenge that travel planning systems face is the overtourism problem: large numbers of visitors can cause overcrowded attractions and traffic congestion, which leads to diminished experiences for both tourists and local residents. Although there has been an increasing interest in congestion-aware route recommendations to handle this task, the application of these recommendations in travel planning systems remains limited.

In this paper, we propose U-KyotoTrip, a travel planning system designed to offer information and recommendations for spots and routes in Kyoto, Japan, one of the world’s most famous destinations. We developed an iOS application to demonstrate how U-KyotoTrip assists users in information acquisition and generates recommendations for spots and routes. To supplement user-experience-based information in the information search process, we utilize spot information presented in Kyoto Sightseeing Map 2.0 [4]. To provide personalized recommendations, especially for users new to the application, we propose user-friendly methods for acquiring information and utilize the POI recommendation approach proposed in [5]. New users can express their preferences by uploading their own photos or by rating several random images from the database. Their preferences are inferred from their responses, and the POI and route recommendations are generated based on these inferred preferences. Moreover, to provide personalized congestion-aware route recommendations, we employ route recommendation methods that have varying degrees of consideration for congestion and tourism diversification. Users can choose among these route recommendation methods based on their preferences.

The contributions of this work are summarized as follows.

  • We propose a travel planning system that integrates spot information based on user-generated content, POI recommendation, and route suggestion.

  • To address cold-start user scenarios, in Sect. 3.3, we introduce novel photo-based information acquisition methods, which are more user-friendly compared to existing approaches.

  • To address the issue of overtourism, in Sect. 3.5, we utilize a variety of congestion-aware methods for route recommendations. Users can select routes with varying levels of congestion consideration based on their preferences.

  • To evaluate the proposed U-KyotoTrip, we conducted an empirical user study. The results of the user study demonstrate that users are satisfied with U-KyotoTrip, the recommended spots and the approach that utilizes photos to present preferences.

2 Related Work

Numerous studies have focused on mining tourism spots based on user-generated content to discover attractions and provide information. Zhuang et al. [6] analyzed geotagged images and comments uploaded on social networking sites, discovering attractions through clustering. Katayama et al. [7] developed a system that identifies potential spots using location information from tourists’ historical records. Xu et al. [4] introduced Kyoto Sightseeing Map 2.0, a web-based application that employs comprehensive content analysis of information from user-generated content to assist travelers in their information search process.

There has been a growing interest in designing mobile applications for POI or route recommendations [8]. Vansteenwegen et al. [9] introduced a city tour planner featuring a Greedy Randomised Adaptive Search Procedure (GRASP) [10] as its core planning engine. Brilhante et al. [11] presented TripBuilder, a web-based application that leverages POI collections from Wikipedia and albums of photos from Flickr to provide POI recommendations. Gavalas et al. [12] proposed eCOMPASS, a mobile application that integrates public transport options into recommended trips. Wang et al. [13] proposed a personalized crowd-aware trip recommendation algorithm that considers the most crowded times at tourist spots. Khodadadian et al. [14] addressed the time-dependent orienteering problem with time windows and service time-dependent profits. Herzog et al. [15] developed TourRec, a context-aware tourist trip recommender system capable of suggesting routes for both individual and group users.

However, most existing methods focus solely on providing spot information, recommending POIs, or recommending routes. In contrast, the proposed U-KyotoTrip offers spot information and recommendations for both POI and route, presenting users with a comprehensive solution for their travel planning needs. With U-KyotoTrip, users can complete their travel planning all at once. Furthermore, many existing methods require new users to select attributes at the start, which can be relatively challenging for those unfamiliar with the destination city. In comparison, our approach offers more user-friendly methods to learn user preferences, enhancing the usability of the proposed application. Moreover, the route recommendation module in U-KyotoTrip integrates route recommendation methods that consider varying levels of congestion. These congestion-aware methods can assist users in avoiding overcrowded attractions based on their tolerance for congestion.

3 System Overview

3.1 System Architecture

U-KyotoTrip, as a travel planning system, assists users in discovering tourism information and provides recommendations for POIs and routes. The foundational concepts of U-KyotoTrip are drawn from [4, 5, 16, 17], and [18]. As demonstrated in Fig. 1, U-KyotoTrip comprises four modules: information display, preference capture, POI recommendation, and route recommendation. The iOS application of U-KyotoTrip is developed using Node.js.

Fig. 1. System architecture.

3.2 Information Display Module

For each spot, U-KyotoTrip provides comprehensive information through user-generated content and analysis results presented in Kyoto Sightseeing Map 2.0 [4]. This information includes the pictures of spots uploaded by locals, statistical graphs showcasing aesthetic quality and image counts for each POI, congestion information, and reviews contributed by local residents. This wealth of information enables users to make well-informed judgments and establish reasonable expectations of tourist spots, aiding them in deciding whether to visit a particular destination.

3.3 Preference Capture Module

For users who have previously utilized U-KyotoTrip, POI and route recommendations can be generated based on their historical records. For new users who are unfamiliar with U-KyotoTrip, one of the following three approaches is required to understand their preferences.

  • The new user can review spot information and mark certain spots as either “want to visit” or “do not want to visit”.

  • The new user is asked to rate several photos provided from the server. Based on the ratings of the photos, the user’s preferences are inferred.

  • Alternatively, the new user can upload photos to the server. Descriptive words for these photos are extracted using the scene detection method proposed by [19]. These extracted words are considered as keywords representing the user’s preferences, which are then used to infer their preferences.

Based on the input, the Pseudo-Rating Mechanism (PRM) proposed in [5] is used to infer the user’s preferences. We provide a detailed demonstration of the preference capture process for the approach of rating photos. As for the other two approaches, we can substitute the images with locations or extracted words.

Suppose \(n\) images from \(N \, (n \ll N)\) records of other users are randomly provided to a new user \(u'\) to capture this user’s preference, and \(u'\) rates the images. We denote the rating information by \(R = \{ (r_k,l_k,w_k, I_k)\}_{k=1}^n,\) where \(r_k, l_k, w_k\) represent the rating, location and keywords of the image \(I_k\), respectively. Under the assumption that every user’s ratings of all images \(\{r_k\}_{k=1}^N\) follow a known distribution \(F_{rating}\), the ratings can be inferred and used to predict the spots and behaviors that \(u'\) may prefer. Denote the set of users, spots, and keywords in the database as \(U\), \(L\), and \(W\), respectively. For an arbitrary location \(l_i \in L\) and keyword \(w_j \in W\), the visit probability is

$$\begin{aligned} p(l_i,w_j|R) = \sum _{u \in U} p(l_i,w_j|u)p(u|R). \end{aligned}$$

Applying Bayes’ theorem,

$$\begin{aligned} p(u|R) &\propto p(R|u)p(u) = \prod _{k=1}^{n}p(rating(l_k, w_k, I_k)=r_k|u)p(u). \end{aligned}$$

Under our assumption of rating distribution,

$$\begin{aligned} p(rating(l_k, w_k, I_k)=r_k|u) = p\Big (\frac{N +1- rank(l_k, w_k, I_k)}{N} = F_{rating}(r_k)\Big |u\Big ), \end{aligned}$$

where \(rank(l_k, w_k, I_k)\) is the rank of \((l_k, w_k, I_k)\) among the \(N\) images. We approximate the distribution of \(rank(l_k, w_k, I_k)\) using Monte Carlo sampling. The captured preferences are used for subsequent POI and route recommendations.
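The two probabilistic steps of this section, the posterior over existing users and the mixture that yields the visit probability, can be illustrated with a minimal Python sketch. It is not the PRM implementation of [5]: the per-rating likelihoods in `rating_lik` (which PRM would obtain from the rank-based expression and Monte Carlo sampling), the uniform `prior`, and the per-user distributions in `visit_prob` are hypothetical toy values chosen only to make the arithmetic visible.

```python
from math import prod

def posterior_over_users(rating_lik, prior):
    """p(u|R) is proportional to prod_k p(rating_k = r_k | u) * p(u)."""
    unnorm = {u: prod(liks) * prior[u] for u, liks in rating_lik.items()}
    total = sum(unnorm.values())
    return {u: v / total for u, v in unnorm.items()}

def visit_probability(visit_prob, posterior):
    """p(l, w | R) = sum over users u of p(l, w | u) * p(u | R)."""
    result = {}
    for u, dist in visit_prob.items():
        for key, p in dist.items():
            result[key] = result.get(key, 0.0) + p * posterior[u]
    return result

# Hypothetical toy inputs: two existing users, two rated images.
rating_lik = {"u1": [0.8, 0.5], "u2": [0.2, 0.5]}  # assumed p(r_k | u)
prior = {"u1": 0.5, "u2": 0.5}                     # assumed uniform p(u)
visit_prob = {                                     # assumed p(l, w | u)
    "u1": {("temple", "garden"): 0.7, ("market", "food"): 0.3},
    "u2": {("temple", "garden"): 0.1, ("market", "food"): 0.9},
}

posterior = posterior_over_users(rating_lik, prior)
scores = visit_probability(visit_prob, posterior)
```

In this toy case the new user's ratings agree more with `u1`, so the inferred visit probabilities lean toward the (location, keyword) pairs that `u1` favors.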

3.4 POI Recommendation Module

Based on the captured preferences, the User Experience Model (UEM) proposed in [5] is employed to recommend spots. The UEM considers not only “where to visit” but also “what to do” at attractions. The UEM used in U-KyotoTrip is trained using geo-tagged photos in YFCC100M Kyoto [20] as the data source. Descriptive words for the photos are extracted using the scene detection method proposed in [19]. Out of a total of 91 spots, the system recommends the top 10 spots that may interest the user.
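The final selection step, keeping the 10 highest-scoring spots out of 91, can be sketched as follows. The sketch assumes that the model yields one scalar score per spot; the spot identifiers and the synthetic scores are hypothetical placeholders, not UEM outputs.

```python
import heapq

def top_k_spots(scores, k=10):
    """Return the k spot identifiers with the highest scores, best first."""
    return heapq.nlargest(k, scores, key=scores.get)

# Hypothetical scores for 91 spots (distinct values, for illustration only).
scores = {f"spot_{i:02d}": ((i * 37) % 91) / 91 for i in range(91)}
recommended = top_k_spots(scores, k=10)
```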

Fig. 2. The interface of U-KyotoTrip.

3.5 Route Recommendation Module

If the user has decided and selected all the spots to visit, U-KyotoTrip could provide the shortest route based on the user’s choices using GRASP [10].
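To make the role of GRASP concrete, the following Python sketch orders a set of selected spots into a short route. It is a generic illustration under simplifying assumptions, not the planner used in U-KyotoTrip: distances are straight-line distances between hypothetical planar coordinates, the route is an open path from a fixed start, the local search is a plain 2-opt, and the names `construct`, `two_opt`, and `grasp` and the parameter `alpha` are introduced here for illustration.

```python
import math
import random

def route_length(route, coords):
    """Total straight-line length of an open path through the route."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))

def construct(start, spots, coords, alpha, rng):
    """Greedy randomized construction: choose the next spot at random from
    the restricted candidate list (RCL) of near-closest remaining spots."""
    route, remaining = [start], set(spots) - {start}
    while remaining:
        dists = {s: math.dist(coords[route[-1]], coords[s]) for s in remaining}
        lo, hi = min(dists.values()), max(dists.values())
        rcl = sorted(s for s, d in dists.items() if d <= lo + alpha * (hi - lo))
        nxt = rng.choice(rcl)
        route.append(nxt)
        remaining.remove(nxt)
    return route

def two_opt(route, coords):
    """Local search: reverse segments (start stays fixed) while that helps."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(route) - 1):
            for j in range(i + 1, len(route)):
                cand = route[:i] + route[i:j + 1][::-1] + route[j + 1:]
                if route_length(cand, coords) < route_length(route, coords) - 1e-12:
                    route, improved = cand, True
    return route

def grasp(start, spots, coords, iterations=50, alpha=0.3, seed=0):
    """Repeat construction + local search and keep the shortest route found."""
    rng = random.Random(seed)
    best = None
    for _ in range(iterations):
        cand = two_opt(construct(start, spots, coords, alpha, rng), coords)
        if best is None or route_length(cand, coords) < route_length(best, coords):
            best = cand
    return best

# Hypothetical coordinates: a start point and three selected spots.
coords = {"hotel": (0.0, 0.0), "spot_a": (0.0, 1.0),
          "spot_b": (1.0, 1.0), "spot_c": (1.0, 0.0)}
route = grasp("hotel", ["spot_a", "spot_b", "spot_c"], coords)
```

Setting `alpha` to 0 makes the construction purely greedy, while larger values admit more distant candidates and diversify the starting routes handed to the local search.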

If the user wants the recommended route to include spots that are not initially selected, the route recommendation module of U-KyotoTrip offers the option to provide routes using the following route recommendation methods. These methods have varying degrees of consideration for congestion and tourism diversification. These methods are trained using the dataset of Kyoto Sightseeing Map 2.0 [4] and trajectory data used in [18] and [17].

  • MARLRR: A multi-agent reinforcement learning approach proposed in [16] for solving the congestion-aware route recommendation task. This method considers several groups moving simultaneously and introduces a congestion penalty into the reward function to avoid overconcentration at spots.

  • RPMTD: A multi-agent reinforcement learning based route recommendation method proposed in [17] that considers both multiple users accessing simultaneously and the congestion at attractions. This method introduces a dual-congestion mechanism, in which both the local congestion at visited spots and the global distribution of tourists affect tourists’ satisfaction.

  • Non-Dual RPMTD: An alternative version of RPMTD that does not consider the global distribution of tourists in the dual-congestion mechanism.

  • Pointer-NN: A reinforcement learning approach to the Orienteering Problem with Time Windows proposed in [21], without considering congestion.

  • TRGCSC: A reinforcement learning approach proposed in [18], which is based on [21] and introduces two novel concepts, “dynamic stay duration” and “environmental tax metaphor.” The former concept estimates the necessary stay duration at a spot based on its congestion, and the latter concept assigns dynamic rewards depending on congestion at spots.
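The shared idea behind the congestion-aware methods listed above, lowering the reward of a spot as it becomes crowded, can be illustrated with a deliberately simple Python sketch. The linear overload penalty, the `penalty_weight` parameter, and all numbers below are assumptions made for illustration; they are not the reward functions defined in [16], [17], or [18].

```python
def congestion_penalized_reward(preference, visitors, capacity, penalty_weight=1.0):
    """Preference score minus a penalty that grows with relative overload.

    Hypothetical form: overload is the share of visitors above capacity."""
    overload = max(0, visitors - capacity) / capacity
    return preference - penalty_weight * overload

def pick_spot(preferences, visitors, capacity, penalty_weight=1.0):
    """Choose the spot with the highest congestion-penalized reward."""
    rewards = {
        s: congestion_penalized_reward(p, visitors[s], capacity[s], penalty_weight)
        for s, p in preferences.items()
    }
    return max(rewards, key=rewards.get)

# Hypothetical toy scenario: a crowded popular spot and a quiet alternative.
preferences = {"popular_temple": 0.9, "quiet_garden": 0.6}
visitors = {"popular_temple": 150, "quiet_garden": 20}
capacity = {"popular_temple": 100, "quiet_garden": 50}

congestion_aware_choice = pick_spot(preferences, visitors, capacity, penalty_weight=1.0)
congestion_blind_choice = pick_spot(preferences, visitors, capacity, penalty_weight=0.0)
```

With the penalty switched off, the popular spot wins on preference alone; with the penalty active, the quieter spot is chosen. This mirrors the contrast between a method that ignores congestion, such as Pointer-NN, and the congestion-aware alternatives.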

3.6 Interface and User Interaction

Figure 2 demonstrates the user interface of U-KyotoTrip. In Fig. 2(a), upon user login, the map displays a total of 91 spots in Kyoto. Clicking on a marker on the map leads the user to the corresponding spot’s information page. The upper part of this page, as shown in Fig. 2(b), provides photos of spots uploaded by locals, statistical graphs presenting aesthetic quality and image counts for each spot, and congestion details. The lower part of this page, as shown in Fig. 2(c), provides reviews from local residents. Within the spot information page, users can select from three options: “want to visit,” “do not want to visit,” or “cancel.”

The user can utilize the button located at the top left corner of the map display page to access the map view, perform POI recommendations, or route recommendations, as illustrated in Fig. 2(d). For new users who are using U-KyotoTrip for the first time, POI and route recommendations are not available until they complete the “ask-to-rate” step to capture their preferences. As depicted in Fig. 2(e), users can initiate a brief questionnaire by clicking the first button labeled “Rate photos,” which presents several photos for rating. Figure 2(f) illustrates the presentation of photos for rating. These ratings contribute to capturing users’ interests and preferences as described in Sect. 3.3. Alternatively, users can click the second button labeled “Upload photos” in Fig. 2(e) to express their preferences by uploading photos. Moreover, if the user has already marked certain spots as “want to visit” or “do not want to visit,” the third button enables them to directly obtain recommendations.

Subsequently, as illustrated in Fig. 2(g), the recommended spots are shown on the map. Similar to the map page in Fig. 2(a), users can click on a marker to access spot information and mark it as “want to visit” or “do not want to visit.” As shown in Fig. 2(h), the marker for the “want to visit” spot will turn green, while the marker for the “do not want to visit” spot will turn gray.

After the user has selected a few spots they wish to visit, they can click the “Recommend route” button. The system will return the shortest route using GRASP. Alternatively, the user can also utilize congestion-aware route recommendation methods from the menu page. This action takes them to the route recommendation page, depicted in Fig. 2(i), where they can input the start location, end location, start time, and time budget. The user can select a spot or an arbitrary coordinate as the start or end point, thereby providing flexibility to the route recommendation system. Figure 2(j) illustrates the recommended route.

Table 1. Demographic profile of respondents.

4 User Study

4.1 Participants and Setting

We conducted a user study in Kyoto, Japan to evaluate the proposed U-KyotoTrip. A screening questionnaire was distributed to 41 participants. The respondents’ demographic characteristics are listed in Table 1. We had participants experience the tour planning process as new users. Each participant was provided with 10 random photos and asked to rate them so that their preferences could be captured. POI recommendations and route suggestions were made based on the inferred preferences. Participants could explore spot information on the map, as described in Sect. 3.6. After reviewing the recommendations, participants were asked to respond to the following questions:

  • Q1. How many of the recommended spots did you like?

  • Q2. Did you find the number of recommended spots too many or too few?

  • Q3. Did you find the types of recommended spots too many or too few?

  • Q4. Did you find the popular spots too many or too few?

  • Q5. Did you find less-known spots too many or too few?

  • Q6. How satisfied did you feel overall with U-KyotoTrip?

  • Q7. Did the recommended spots meet your satisfaction?

  • Q8. Did you find the use of photo ratings to express preferences satisfactory?

  • Q9. How satisfied did you feel with the route recommendation methods?

Additionally, we asked respondents to provide written comments on the recommended spots, the use of photo ratings, and the overall feeling of U-KyotoTrip.

Fig. 3. Survey results.

4.2 Survey Results

Figure 3(a) illustrates the results for Q1, showing that 84% of the respondents liked 6 or more of the recommended spots, and 54% of the respondents liked 8 or more of the recommended spots. This suggests that the recommended spots are considered satisfactory for most respondents.

Figure 3(b) illustrates the results for Q2, Q3, Q4, and Q5. For the number of recommended spots, 68% of the respondents selected “Just right,” 20% chose “Too few,” and 12% opted for “Too many.” As for the types of recommended spots, 51% noted “Too few,” 44% selected “Just right,” and 5% thought “Too many.” Regarding popular spots and less-known spots, many respondents thought that there were “Too many” popular spots and “Too few” less-known spots.

Figure 3(c) displays the responses to Q6, Q7, and Q8. While a few respondents expressed dissatisfaction, the majority were “Satisfied” or “Very satisfied” with the application, the recommended spots, and rating photos.

Figure 3(d) illustrates the satisfaction with route recommendation methods. Pointer-NN is the most satisfactory method for respondents, with a total of 67% of the respondents selecting “Very satisfied” or “Satisfied.” This is followed by TRGCSC, with a total of 51% of the respondents selecting “Very satisfied” or “Satisfied.” Non-Dual RPMTD (40%), RPMTD (41%), and MARLRR (40%) achieved similar performance. More details concerning the evaluation of route recommendation methods can be found in [22].

Fig. 4. Clustering results.

4.3 Clustering Analysis

To explore further subjective satisfaction with the recommended spots, the approach of expressing preferences through photo ratings, and the overall feeling of U-KyotoTrip, we conducted a clustering analysis on the textual comments provided by respondents.

We collected comments totaling 1,489 words from 41 participants. We preprocessed the data by removing punctuation, converting to lowercase, reducing word forms, and filtering stop words using spaCy [23]. The Skip-Gram model in Word2Vec [24] was utilized to embed words into high-dimensional vectors, capturing their semantic meanings. We employed DBSCAN [25] to cluster similar comments for each question, and UMAP [26] to reduce the comment vectors to two dimensions for visualization purposes. The results are presented in Fig. 4. Furthermore, to gain insights into the features of each respondent cluster, we created word clouds to visualize the most common words in each cluster, as illustrated in Fig. 5, Fig. 6, and Fig. 7.
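The clustering step can be illustrated with a compact, dependency-free Python version of DBSCAN. This is a sketch under stated assumptions, not the analysis code of the study: a library implementation would normally be used, the two-dimensional `comment_vectors` are hypothetical stand-ins for the comment embeddings, and the values of `eps` and `min_samples` are chosen only to suit the toy data.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # noise reached from a core point: border
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels

# Hypothetical 2-D comment vectors: two dense groups and one outlier.
comment_vectors = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
                   (5.0, 5.0), (5.0, 5.1), (5.1, 5.0),
                   (10.0, 0.0)]
labels = dbscan(comment_vectors, eps=0.5, min_samples=2)
```

Because DBSCAN does not take the number of clusters as an input, the cluster count per question follows from the density of the comment vectors and the chosen parameters, and isolated comments are labeled as noise.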

Regarding the recommended spots, the comments are divided into 4 clusters. As illustrated in Fig. 5, all clusters exhibit a significant proportion of positive sentiment, suggesting that respondents are satisfied with the recommended spots. Meanwhile, Cluster 1, Cluster 3, and Cluster 4 express concerns about potential overcrowding at these recommended spots.

The comments regarding the use of photo ratings are divided into 4 clusters, and the word clouds are shown in Fig. 6. On one hand, numerous respondents convey a positive sentiment, indicating that the utilization of photo ratings is a simple but effective way to capture users’ preferences. On the other hand, a few respondents express that some provided photos may have been beautified for attractiveness, potentially leading to inaccurate preference capturing.

For the overall feeling of U-KyotoTrip, the comments are divided into 3 clusters. As demonstrated in Fig. 7, all the clusters agree that the application is easy to use and can assist users in travel planning. Respondents in Cluster 1 express satisfaction with U-KyotoTrip due to the high-quality recommendations it provides. Cluster 2 respondents emphasize that this application enables tourists to easily explore scenic spots, assisting users in making decisions about which attractions to visit. Cluster 3 respondents emphasize the terms “convenient,” “concise,” “saving time,” and “simple,” indicating their high satisfaction with the user-friendly nature of U-KyotoTrip.

Fig. 5. Word clouds of comment clusters on recommended spots.

Fig. 6. Word clouds of comment clusters on rating photos.

Fig. 7. Word clouds of comment clusters on U-KyotoTrip.

4.4 Discussion

As demonstrated in Sect. 4.2, most of the respondents are satisfied with the proposed U-KyotoTrip and the recommended spots. This indicates that the system effectively captures the preferences of most participants, resulting in a high level of satisfaction with the suggested spots. One limitation of the system is that many respondents found the variety of spot types to be insufficient. The POI recommendation module tends to recommend spots of similar types to users after capturing their preferences. Another limitation pertains to the number of less-known spots. The participants who favor less-known spots reported that too few such spots were provided. Therefore, future research on diversifying POI recommendations and including a larger share of less-known spots is necessary to establish more satisfactory travel planning systems.

Regarding the approach of presenting preferences through photo ratings, the majority of respondents expressed that their preferences were satisfactorily captured. The comment analysis indicates that most participants appreciate the visual experience and find it intuitive. However, some participants express concerns that the provided photos may have been enhanced for attractiveness, potentially leading to inaccurate preference capturing. Incorporating 360-degree views or videos could further enhance the user experience.

Concerning the route recommendation methods, respondents show a tendency to favor Pointer-NN, which does not consider congestion, over the congestion-aware route recommendation methods. This suggests that considering congestion does not always improve tourists’ satisfaction. Although local residents and governments may benefit from these methods, tourists might be dissatisfied with unusual trajectories, particularly when the recommended less-known spots are far from other spots.

5 Conclusion

In this paper, we proposed U-KyotoTrip, a user experience-oriented travel planning system designed to assist users in planning their trips. U-KyotoTrip integrates spot information, POI recommendation, and route recommendation, offering users a comprehensive solution for their travel planning needs. We evaluated U-KyotoTrip using an empirical user study. Most participants expressed that U-KyotoTrip is enjoyable to use for information search and recommendations, and that rating photos effectively captures participants’ preferences. Two major limitations are that the variety of recommended spot types tends to be too narrow, and that the system tends to prioritize popular spots over less-known ones. Therefore, future studies on diversifying POI recommendations are needed to establish more satisfactory systems. Furthermore, the comparison of route recommendation methods reveals that the consideration of congestion may not always improve tourists’ satisfaction. Further research should be undertaken to explore the balance between tourism diversification and satisfaction.