1 Introduction and Background of Research

Tourism is highly relevant to the national economy, representing 8.7% of Mexico’s GDP. Coupled with the widespread use of smartphones in our society, this makes the sector an area of opportunity for innovation. The extent of the national territory and the variety of its tourist attractions should also be considered [1]. The state of Querétaro was chosen for this study because it offers a range of tourist options: the Sierra Gorda Biosphere Reserve; the Art, Cheese and Wine Route; the six Magical Towns (Amealco, Bernal, Cadereyta, Jalpan, San Joaquín and Tequisquiapan) [2]; and visits to museums and other activities that visitors may enjoy.

According to Google Trends, as of May 2023 travel searches in Mexico were returning to pre-pandemic levels, with annual peaks every July. Interest in the state of Querétaro between July 2022 and March 2023 surpassed interest in destinations such as Cancun and Puerto Vallarta [3].

The tourism sector today faces new realities, supported by technology for decision-making when selecting a tourist destination. The choice is not determined exclusively by cost, distance, or the quality of the facilities, but also by the experience and opinion shared by other tourists in those places. The use of technological tools to promote tourism has been limited to the use of repositories or websites containing generic publications and information submitted by some institution or organization to “promote” tourism.

Nowadays, the selection of a tourist destination is closely tied to the use of smartphones and mobile applications. Visitors to tourist areas use these tools frequently, often focusing on the most visited or promoted places. Part of the success of the most visited sites rests on the opinions and recommendations that are read and heard on different social networks.

Social networks have become a tool for tourism promotion: “the means of dissemination involve both traditional media and virtual spaces, within which social networks and the various mechanisms of interaction with groups of people with the support of technology (blogs, wikis, etc.) stand out” [4]. Unfortunately, tourism promotion supported by social networks is limited to the publication of opinions and photographs, the location of a place on a map, and the evaluation of the place or service.

The main social networks cover different needs, ranging from language selection, the ability to post opinions, a personal profile account, and direct contact between users to adding photographs of experiences and taking part in discussion forums, all based on users’ personal experiences at tourist sites.

It should be noted that the information published on social networks is not focused on the experience lived in the place, and tourists are forced to search manually for information across multiple sites or social networks. Those interested must therefore filter, highlight, analyze, compare, and evaluate the information obtained from these sites. Opinions provided by other visitors must go through a similar process before a tourist can make a decision and form a personal assessment of the site.

The communication, contact, or promotion strategies used for a tourist site depend on the type of user. Likewise, the choice among the existing social network applications depends on the type of information one wants to find or share.

Among the most popular free applications is Twitter, which lets users follow information on different topics or people in real time and learn the preferences, desires, needs, and opinions of other users. It also allows rapid connectivity between users and stands out for its simplicity of use.

We have used McKinsey’s decision-making model as a reference to diagram the consumer’s decision-making journey, as it allows us to understand and truly measure the consumer’s behavior in their purchasing decision with a high flow of information on Twitter.

The model shows how customers begin to learn about a product or service they have been thinking about trying or buying, from search engines and social networks through to their interactions on social networks, where they express their feelings about the product or service before, during, and after the purchase. This information remains static across multiple platforms and social networks, and analyzing it to identify the criteria that are decisive for a purchase can be exhausting.

Therefore, we developed an application that analyzes the opinions or comments (perceptions) made by tourists on a social network (Twitter) and produces a concentrated classification, as positive, negative, or neutral, of tourists’ travel experiences at various tourist attractions, such as landscapes, endemic products, hotels, restaurants, archaeological sites, churches, ecotourism, tourist information, festivals, and events. Segmented hashtags are used to classify the information and to create a high-impact strategy that serves both tourists and entrepreneurs.

2 Methodology

In this study we used mixed research methods, which allowed us to integrate different techniques such as desk research, surveys, and interviews. Triangulating these research methods and data gives a holistic understanding of reality and increases the credibility of the qualitative analysis, which in turn informs the design of a web application that enables potential customers to make decisions based on customer experience. In this methodology, theory and data interact constantly, drawing on the behavior, feelings, and discourses of the people studied. We also analyzed secondary information to help contextualize the problems raised. The following subsections present the multiple case study and the secondary data studies.

2.1 Multiple Case Study

  a. Survey of tourists who use social networks to select products or services.

  b. Analysis of the interactions of tourists on social networks sharing their feelings before, during and after the purchase.

  c. Analysis of the functioning of mobile applications for the tourism sector in Querétaro.

2.2 Secondary Data Studies

  a. Analysis of feasible artificial intelligence algorithms for the project.

  b. Investigations of artificial intelligence applied to solutions in the tourism sector.

Figure 1 summarizes the research methodology used for this study; the boxes grouped inside the dashed ellipse are the elements being triangulated.

Fig. 1 Methodological triangulation of the research. The diagram shows a circle connected to five elements: surveys of tourists using social networks, analysis of the interaction of tourists on social networks, analysis of the functioning of mobile applications, analysis of viable AI algorithms for the project, and investigation of AI applied in solutions.

3 Discussion and Results

A twenty-question survey was developed to examine how tourists use social networks to express their feelings about their consumption of tourism products and services. Questions were of three types: rating scale, yes/no, and open-ended. The survey, designed on the Google Forms platform, was shared in Facebook groups for travelers during the first half of 2022. An interpretative and descriptive analysis was carried out on the data obtained from the 680 survey participants. A comparative table of the functionalities of the applications Querétaro Enamora and Querétaro (Tourist Guide) was also prepared, with the intention of assessing the viability of a proposal that introduces a technological novelty in tourism promotion and planning.

Regarding the social context and the purpose of this system: although Mexico has a very large tourist industry, the present work is limited to a single case study, the state of Querétaro. For this case study we have information on some of the tourist attractions. The system covers only the registration, the tracing of the route, and the possibility of registering a new perception on Twitter, the social network from which we obtain and capture previously posted perceptions within the same progressive web application. Specific sites are delimited by their coordinates and, where additional information is available, the address of the site or place is shown.

For the documentary analysis, different books and scientific articles on artificial intelligence tools and systems were reviewed to identify areas of application and limitations. Thinktur defines Artificial Intelligence as the ability to process and interpret information to carry out actions considered intelligent by replicating characteristically human behavior and reasoning [5]. For this study, we consider that the incorporation of artificial intelligence systems has modified marketing and distribution channels, creating new paradigms for offering experiences and customizing the offer, and that its diffusion has accelerated further since the pandemic.

From the user’s perspective, the following spheres of interaction can be differentiated: the personal sphere, delimited by the person’s own physical body; the immediate environment; and the surroundings, whether destination, nature, or city. These systems offer new opportunities to connect the physical world with the digital world, enabling more active interaction between tourists and tourism products, services, and destinations [6]. The services most used by these systems in tourism are geolocation technologies, virtual concierges, virtual tours and globalizers. Retail distributors also use them frequently to better understand and personalize the tourist’s experience throughout the travel cycle, through dynamization, promotion, sales, and loyalty actions [6]. These tools therefore support genuine personalization of the service and experience, allowing the offer to the user to be optimized and sales to increase globally, according to the capabilities and knowledge of each company [5].

For this tourist travel mobile application, we obtained the comments and opinions on Twitter that belong to a hashtag (#) of a locality or tourist site in the state of Querétaro. The hashtag identifies digital content associated with a particular topic, here a specific tourist site. The comments associated with the hashtag are processed with an Artificial Intelligence model that classifies each comment as positive, negative, or neutral, resulting in a concentrated classification of all the opinions about the site. Figure 2 shows the operation process of the application described above.

Fig. 2 The environment and use of the progressive web application, from installation on the cell phone to the feedback interaction on Twitter used to re-evaluate the tourist sites. The flow diagram runs from installation of the application on the mobile device through the main view, main screen, pop-up window of a site, trip started, and request to publish tweets, to the tweets made with their analysis in the installed application.
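The "concentrated classification" of a site's opinions can be illustrated with a minimal sketch. It assumes the per-comment labels have already been produced by the classifier; the function name `summarize_site` and the sample labels are hypothetical and do not come from the application itself.

```python
from collections import Counter

LABELS = ("positive", "negative", "neutral")

def summarize_site(labels):
    """Concentrate per-comment labels into counts and shares for one site."""
    counts = Counter(labels)
    total = sum(counts[label] for label in LABELS)
    summary = {}
    for label in LABELS:
        # Guard against a site with no classified comments yet.
        share = counts[label] / total if total else 0.0
        summary[label] = {"count": counts[label], "share": round(share, 2)}
    return summary

# Hypothetical labels for comments collected under one site hashtag.
bernal = ["positive", "positive", "neutral", "negative"]
print(summarize_site(bernal)["positive"])  # {'count': 2, 'share': 0.5}
```

Reporting shares alongside counts makes sites with very different numbers of comments comparable.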

To understand the implementation of the model, Fig. 3 shows how the “sentiment analyzer” was implemented. In this project it is a text classification algorithm: once a text is processed through the mathematical models used by the algorithm, numerical values are obtained that, correctly interpreted, indicate whether the text has a positive, negative, or neutral connotation or perception.

Fig. 3 General operation of the text parser/classifier algorithm, which interprets sentiments to identify whether the text is of a positive, negative or neutral nature. The flow chart runs from the sentence through keywords, data cleansing, the check "polarity is less than 0.2 but greater than 0", and the naive Bayes algorithm (with the training set and with the standard set), to a positive, neutral, or negative recommendation.
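The data cleansing step in the flow chart is not detailed in the text. One plausible form, shown here only as an assumption, removes URLs and user mentions, keeps the words of hashtags, and normalizes whitespace and case before the text reaches the classifier. The function `clean_tweet` and its rules are hypothetical.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")

def clean_tweet(text):
    """Strip URLs and @mentions, keep hashtag words, normalize whitespace and case."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return " ".join(text.split()).lower()

print(clean_tweet("Loved the view at #Bernal! @friend https://example.com/pic"))
# loved the view at bernal!
```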

This parser is implemented in Python with the TextBlob and NLTK libraries. Together, the two provide a sentiment analyzer based on the Naive Bayes classifier that has been trained on English-language movie reviews. The classifier rests on Bayes’ theorem, given in Eq. 1.

$$P\left( A \mid B \right) = \frac{P\left( B \mid A \right) P\left( A \right)}{P\left( B \right)}$$
(1)

The Naive Bayes classifier is based entirely on Bayes’ theorem of probability and is usually very efficient, even across diverse situations and statements. Its reasoning can be clarified with an example. When a text is analyzed, we have two actors:

  • Text 1

  • Text 2

And 2 premises:

  • Text 1 already has an assigned percept

  • Text 2 already has an assigned percept

Bayes’ theorem requires finding P(A|B), read as the probability of A given B. For practical purposes: knowing that text B already has a percept, and that A is a similar text whose percept is being estimated, what is the probability that A has the same percept as B? This is essentially the reasoning applied across many comments until a value is obtained. That value is called polarity. A second value, subjectivity, is reported alongside it; in TextBlob it ranges from 0 (objective) to 1 (subjective).
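The reasoning above can be made concrete with a from-scratch Naive Bayes text classifier. This is a teaching sketch, not the TextBlob or NLTK implementation used in the project: it assumes a bag-of-words model with add-one (Laplace) smoothing, and the training sentences are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count words per label; `examples` is a list of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label maximizing log P(label) + sum of log P(word | label)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # prior P(A)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training data, standing in for texts with an assigned percept.
examples = [
    ("beautiful town great food", "positive"),
    ("great views friendly people", "positive"),
    ("dirty hotel terrible service", "negative"),
    ("terrible traffic long wait", "negative"),
]
model = train(examples)
print(classify("great food", model))      # positive
print(classify("terrible hotel", model))  # negative
```

The denominator P(B) in Eq. 1 is the same for every label, so the sketch compares only the numerators, in log space to avoid numerical underflow.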

In our implementation, the polarity value lies between −1 and 1. As a baseline reference, a polarity below 0 means the analyzed text is negative, a polarity above 0 means it is positive, and a polarity of exactly 0 means the perception is neutral.

This, as mentioned, is an ideal situation. In practice, the characteristics of our language, including idioms, slang, and varied vocabulary, prevent the analyzer from obtaining a very accurate result; in fact, the accuracy of this algorithm is estimated at 50% because of the ranges obtained. Fortunately, the NLTK library made it possible to refine the results by adding a new dataset whose texts have a perception assigned from the outset. This dataset is an additional JSON file containing texts that have already been checked and that are not in the initial reference set used for the calculation.
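The layout of the additional JSON file is not given. A minimal sketch, assuming a list of objects with `text` and `label` keys, shows how such checked texts could be loaded into (text, label) pairs suitable for training a classifier. The key names, the label vocabulary, and the sample entries are assumptions made for illustration.

```python
import json

ALLOWED = {"positive", "negative", "neutral"}

def load_references(raw_json):
    """Parse checked texts into (text, label) pairs, skipping malformed entries."""
    pairs = []
    for entry in json.loads(raw_json):
        text = str(entry.get("text", "")).strip()
        label = entry.get("label")
        if text and label in ALLOWED:
            pairs.append((text, label))
    return pairs

# Hypothetical file contents; the third entry is dropped for having no text.
raw = """[
  {"text": "The rock at Bernal is impressive", "label": "positive"},
  {"text": "Too much traffic to get there", "label": "negative"},
  {"text": "", "label": "positive"}
]"""
references = load_references(raw)
print(len(references))  # 2
```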

Another factor to consider when deciding on the perception from the numerical values is the choice of suitable ranges for the decision. In this system, the use of polarity and subjectivity had to be adjusted, since a polarity above 0 does not always indicate a positive text. For this purpose, an offset of 0.25 (25%) was applied, resulting in the following ranges:

  • From −1 up to, but not including, 0, the analyzer decides that the text is negative.

  • Above 0 and up to 0.25, the parser performs an extra step: it consults the additional set of references, created to hold sentences or texts that have already been checked.

  • From 0.25 onwards, the parser decides outright that the text is positive.

  • Exactly 0 is the ideal case in which the interpretation is neutral.

When the value falls between 0 and 0.25, the parser checks whether the additional reference set contains texts that resemble the original text. If so, the parser calculates the probability with greater certainty, and the error rate is reduced to 1 in 4 comments.
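The ranges above can be sketched as a single decision function. Two points are assumptions, since the text does not settle them: a polarity of exactly 0.25 is treated as positive, and when no similar reference text is found in the ambiguous band the sketch falls back to the sign of the polarity. The name `decide` and the `reference_label` parameter are hypothetical.

```python
def decide(polarity, reference_label=None, offset=0.25):
    """Map a polarity in [-1, 1] to a label using the offset ranges.

    `reference_label` is the label suggested by the additional reference
    set, or None when no similar checked text was found.
    """
    if polarity < 0:
        return "negative"
    if polarity == 0:
        return "neutral"
    if polarity >= offset:  # assumption: the boundary value counts as positive
        return "positive"
    # 0 < polarity < offset: ambiguous band, defer to the checked references.
    if reference_label is not None:
        return reference_label
    return "positive"  # assumption: fall back to the sign of the polarity

print(decide(-0.4))                            # negative
print(decide(0.1, reference_label="negative")) # negative
print(decide(0.6))                             # positive
```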

This is the work of the sentiment analyzer: it produces the polarity and subjectivity values which, once interpreted, identify whether the analyzed text is positive, negative, or neutral in nature.

Interaction with the maps was also a milestone, giving the application a correct visualization and representation of places and ensuring that they are accurately located.

In addition, the Magical Towns represented in the application helped from the beginning to show whether a place was considered good or bad to visit. This was part of the objective of representing how well the registered sites matched the opinions expressed on Twitter, as analyzed with our implementation of the text analyzer or classifier algorithm.

4 Conclusions

In this work, the authors set out to investigate the feasibility of using Twitter and to assess an artificial intelligence model suitable for handling the volume of information and for classifying sentiment in the customer sales and post-sales process, thereby allowing validation of the data obtained for developing the functionality of the tool. In the case of the Twitter study, there are windows of opportunity to achieve a greater correlation between tourist sites and reviews, in order to know tourists’ opinions in real time and users’ preferences, and to evaluate and compare service providers in Querétaro.

Overall, this paper is a first step in the analysis of the artificial intelligence model and Twitter in tourism in the state of Querétaro. The methodology can be improved in the future by integrating new tools to access other social networks and by incorporating input from the actors in the system who are related to the tourism sector. In conclusion, artificial intelligence can drive innovation in the tourism sector, and the productive use of Twitter can give analysts low-cost access to a large volume of information, fast and real-time access to many users, and contextual information that takes the site of data extraction into account.