Keywords

1 Introduction

On the web, there is an overwhelming number of options when searching for restaurant´s reviews, leading to the need to filter, prioritize, and deliver relevant information efficiently to alleviate the problem of information overload [1]. To address this situation, the present work proposes the generation of a comprehensive review from a collection of reviews gathered from restaurant evaluation platforms like Yelp, Google Reviews, and TripAdvisor.

2 Theoretical Framework

2.1 Artificial Intelligence in Consumer Decision Journey

Artificial Intelligence (AI) has transformed the marketing landscape by enabling companies to deliver personalized experiences, make data-driven decisions, and improve customer engagement. The role of AI technology in marketing and customer decision-making is expected to shape the future of customer interactions and business strategies. AI can analyze customer feedback and social media interactions to identify sentiment and customer satisfaction levels.

Companies that use advanced technology can also collect information about consumer preferences through digital data analysis and consumption patterns promoted by social media. Big data and experiments with machine learning are bringing together consumers’ personal values to determine their behavior and preferences in the markets [2].

In the modern consumer decision journey, consumer outreach has become more crucial than traditional push-style marketing. Word-of-mouth, internet reviews, and consumer interactions are significant touchpoints during the active-evaluation phase.

With the rise in popularity of digital platforms and social media, consumers are relying more on online reviews and recommendations from other consumers to shape their perceptions and to make purchasing decisions. Marketers must engage actively with consumers, manage brand reputation, and leverage user-generated content to build trust, credibility, and loyalty in this new consumer-centric landscape [3].

2.2 Aspect Extraction Module

Natural Language Processing (NLP) refers to the branch of computer science, and more specifically to the branch of AI, which deals with giving computers the ability to understand text and speech in the same way that humans do [4].

We make use of NLP by training a machine learning model for the extraction of aspect-opinion-sentiment triplets from a restaurant review. This model is trained with a corpus of labelled reviews obtained by processing data from restaurant review sites and social media.

For aspect extraction, an Aspect-Opinion-Sentiment Triplet Extraction (ASTE) model was used, focusing on the span-level approach. ASTE generates triplets consisting of an aspect target, the corresponding opinion term, and its associated polarity sentiment. The span-level approach explicitly considers the interaction between complete spans of aspects and opinions when predicting their sentiment relationship. As a result, it can make predictions with the semantics of complete spans, ensuring better sentiment consistency [5].

For example, in Fig. 1, the spans highlighted in orange are aspect target terms, and the interval in blue is the opinion term. From the same figure, the aspects are “food,” “service,” and “decoration”; there are three triplets: (food, wasn’t great, negative), (service, really nice, positive), and (decoration, really nice, positive).

Fig. 1
A text box of an A S T E example. The text reads the food wasn't great, but the service and the decoration were really nice. An arrow from food is mapped to wasn't great and arrows from service and decoration are mapped to really nice. Food, wasn't great is negative. Service really nice is positive.

Aspect-Opinion-Sentiment Triplet Extraction example

When considering only word-by-word interactions, it is easy to mistakenly predict that “great” expresses a positive sentiment about “food.” For this reason, a segment-based model for ASTE (Span-ASTE) is implemented, which directly captures span-to-span interactions when predicting the sentiment relationship between an aspect and a pair of opinions.

Span-ASTE consists of three modules: sentence encoding, mention module, and triplet module. For the given example, the sentence is first introduced into the sentence encoding module to obtain token-level representations, from which interval-level representations are derived for each enumerated interval, such as “wasn’t great” “food”. Then, aspect category detection, Aspect Term Extraction (ATE), and Opinion Term Extraction (OTE) tasks are adopted to supervise the proposed dual-channel span reduction strategy, which obtains reduced aspect and opinion candidates, such as “food” and “not delicious,” respectively. Finally, each aspect candidate and opinion candidate are paired to determine the sentiment relationship between them [6].

For word embeddings, BETO was used, a language model based on the Transformer architecture specifically designed for natural language processing in Spanish [7]. BETO is an initiative to enable the use of pre-trained BERT models for natural language processing tasks in Spanish.

2.3 Review Generation

Using multiple reviews from different rating sites, the goal is to generate a single general review that integrates the positive and negative aspects of the products and services for each restaurant. To achieve this, it is necessary to group the significant criteria and from these groups identify the most relevant aspects.

The review generation module consists of two relevant implementations, the classification of the extracted triplets. For the first approach, the use of the unsupervised machine learning algorithm k-means was proposed to achieve a better organization of the extracted triplets. A k = 3 was chosen, where k represents the number of clusters the algorithm creates.

The algorithm identifies similar patterns or features among a set of elements and works by calculating the minimization of the sum of distances between each element and the proposed centroids. This process is done iteratively, updating the centroids by taking the position of the average of the objects belonging to that group as the new centroid [8].

We notice that, for all the triplets, usually one cluster corresponds to general aspects and opinions of the restaurant, for example, [restaurant, good], [place, clean], [restaurant, would return]. Another cluster corresponds to semi-general aspects related to the restaurant, such as [taste, spectacular], [quality, excellent], [service, impeccable]. Finally, the last cluster corresponds to specific dishes of the restaurant, for example, [scrambled eggs, delicious], [lemonade, tasty], [chicken, juicy].

Once having this classification, the probabilistic Latent Dirichlet Allocation (LDA) model was applied to each cluster concerning their aspects to find the three most relevant aspects in each group [9, 10]. Subsequently, the same method is applied again, this time to all the opinions for each found aspect to obtain the most relevant opinion for that aspect.

Once the most relevant aspect-opinion pairs have been obtained, they are sent to the GPT-4 language model through the API provided by OpenAI.

In this way, a review-formatted text is obtained from the n most relevant aspect-opinion pairs from the set of reviews and stored in a database. The process is automated, extracting reviews from established restaurant review sites (Yelp, Google Reviews, and TripAdvisor), and the generated reviews can be accessed through a web tool.

3 Methodological Considerations

This project was based on an exploratory methodology that focused on analyzing similar products and expanding on traditional restaurant evaluation sites. Machine learning, generative AI models, and NLP techniques were used to develop an innovative solution in restaurant review generation [6, 11].

Projects with a similar focus usually only filter or categorize reviews, but this project, on the other hand, is capable of synthesizing information from multiple reviews into a single, more detailed review that compiles the most important aspects. This is done by obtaining information from different restaurant evaluation sites such as Yelp, TripAdvisor, and Google Reviews. Reviews of these sites are attained to be later analysed automatically, and finally the aspects obtained are extracted and their sentiments are classified as positive or negative, constructing an aspect-opinion-sentiment triplet (e.g. [chicken, tasty, positive], [drinks, bad smell, negative]). In doing so, it facilitates the decision-making process for those consulting it, allowing them to obtain a result based on their own criteria. The complexity of the project lies in building the labelled training corpus of reviews, extracting restaurant aspects, assigning specific relevance to those aspects, and generating the review itself.

Research took place in January 2022, the review collection began in August 2022 and ended in June 2023, using data extracted from APIs provided by the rating platforms themselves.

Reviews in Spanish of restaurants in Mexico City were collected on the following rating sites: TripAdvisor 425 reviews, Google Reviews 1415 reviews, Yelp 7160 reviews; as well as 1000 reviews of various posts on Instagram.

For the elaboration of this model, the analysis of information from the paper.

“Centralization of Information to Understand the Consumer Within the Restaurant Sector” was used, as well as the use of a tool to label the information, since the system requires to be trained with data in a specific format.

4 Results and Discussion

A corpus with 10,000 manually labelled reviews was obtained to train the Span-ASTE triplet extraction model through the developed labelling tool. The performance achieved through training with this corpus and k-fold cross-validation is reflected in the metric values: precision of 0.7, recall of 0.63, and F1 score of 0.66.

The model extracts and leverages existing resources, allowing users to make purchasing decisions based on the experiences of other customers. The information is dynamic and constantly updated based on the opinions of all diners, unlike other tools like Google Maps or Yelp, which offer a general view of services without direct customer feedback. With the model, users save research time, as relevant aspects of the place are provided.

In terms of managerial implications, this solution can be beneficial for both companies and customers. For companies, analysing and generating global reviews enables them to better understand customer opinions and preferences, helping them improve their services and make more informed strategic decisions. Additionally, by providing more detailed and relevant reviews, companies can positively influence customer perception and increase satisfaction.

For customers, this solution saves them time and effort by providing a comprehensive review that succinctly summarizes the most important aspects of the restaurant. This helps them make informed and reliable decisions when choosing a restaurant or service, thus enhancing their overall experience.

A similar algorithm can be applied to any other tourist attraction or sector, such as accommodation. This allows the evaluation and perception of services based on customer experiences.

Below are screenshots of the developed system, Fig. 2 shows the selected Restaurant and the reference from the review sites where the reviews were obtained. The following figures (Figs. 3, 4, 5, 6 and 7) show relevant information about the restaurant such as the generated review, a graph with a summary of positive and negative reviews, relevant aspects found in the total reviews, etc.:

Fig. 2
A screenshot of the restaurant search screen includes the space to enter restaurants, name, I D google, I D yelp, and I D trip Advisor and the google map of Mexico City.

Restaurant search screen, where the user can select the restaurant for which he wants to generate a general review

Fig. 3
A screenshot of the restaurant screen presents the reviews on Pujol Restaurant of Mexico city.

Restaurant screen, where the generated review is shown (in Spanish). On the right side are the relevant opinions extracted from the reviews of rating sites that were used to generate the general review. As well as the number of reviews that were used for the generation, in this case 59

Fig. 4
A semicircular graph presents the number of positives as 224 and negative aspects as negative 61.

Restaurant screen, graph showing the total number of positives (green bar) versus Negative aspects (number in red) that were identified in the reviews

Fig. 5
A pie chart presents the aspects of a restaurant screen. The words restaurant, experience, service, comida, and so on are mentioned. The number of the aspect restaurant is the highest.

Restaurant screen, below the generated review, graphics are shown with the aspects extracted from the reviews. The image shows that aspect “Restaurant” was mentioned 36 times

Fig. 6
A donut chart with every aspect of the restaurant listed.

Restaurant screen, by clicking on any aspect of the graph, it is possible to see all the opinions that were expressed about it. The picture shows that of “Restaurant”, 3 people mentioned that “it’s worth it” (in Spanish, “vale la pena”)

Fig. 7
A screenshot of the restaurant screen presents the menu and the reviews.

Restaurant screen, menu showing all reviews that were analyzed for the generated review. In each review it is possible to see the aspects highlighted in orange and the opinions that were expressed about it in blue. Hovering the pointer over an aspect highlights its corresponding opinion

5 Conclusion

By including longer reviews in the training, there is a greater diversity of data, improving the extraction of aspect-opinion-sentiment triplets for longer reviews. However, it is necessary to increase the number of such reviews to further improve the metrics.

Contrary to our initial hypothesis that a larger corpus would improve the results, the performance decreased in all metrics, and none of the folds surpassed the threshold of 0.70 in the F1 score.