Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review

Gooljar, Veerajay; Issa, Tomayess; Hardin-Ramanan, Sarita; Abu-Salih, Bilal

doi:10.1186/s40537-024-00947-0

Survey
Open access
Published: 05 August 2024

Volume 11, article number 107, (2024)

Journal of Big Data

Veerajay Gooljar¹,
Tomayess Issa¹,
Sarita Hardin-Ramanan¹ &
…
Bilal Abu-Salih²

Abstract

The convergence of artificial intelligence (AI), big data (BD), and the Internet of Things (IoT) in Society 5.0 has given rise to Marketing 5.0, revolutionizing personalized customer experiences. In this study, a systematic literature review was conducted to examine the integration of predictive modelling and sentiment analysis within the Marketing 5.0 domain. Unlike previous research, this study addresses both aspects within a single context, emphasizing the need for a sentiment-based predictive approach to the buyer’s journey. This review explores how predictive and sentiment models enhance customer experience, inform business decisions, and optimize marketing processes. This study contributes to the literature by identifying areas of improvement in predictive modelling and emphasizing the role of a sentiment-based approach in Marketing 5.0. The sentiment-based model assists businesses in understanding customer preferences, offering personalized products, and enabling customers to receive relevant advertisements during their purchase journey. The paper’s structure covers the evolution of traditional marketing to digital marketing, AI’s role in digital marketing, predictive modelling in marketing, and the significance of analyzing customer sentiments in their reviews. The PRISMA-P methodology, research questions, and suggestions for future work and limitations provide a comprehensive overview of the scope and contributions of this review.

Introduction

The widespread use of artificial intelligence (AI), big data, and the Internet of Things (IoT), among other technologies emerging from Industry 4.0, has given rise to Society 5.0, a societal evolution in which the lines between the virtual and physical space are often blurred as technology is increasingly used to resolve economic and social problems [150]. The changing needs of Society 5.0, in terms of product purchase, gave rise to Marketing 5.0, which has revolutionized the way that products and services are advertised, offering personalized customer experiences [19]. Society 5.0 consumers are empowered by digital technologies that enable them to access a great deal of information about products and services through online reviews. This makes customers more knowledgeable and leads them to expect more from sellers than they did under traditional advertising approaches. According to Zhang et al. ([175], 2), the cost of acquiring new users is 5 to 10 times higher than that of retaining existing users, and a 5% increase in customer retention can increase profits by 25% to as much as 95%.

To retain their customers, businesses are therefore increasingly required to embrace Marketing 5.0 principles by collecting and analyzing customer data prior to sending personalized advertisements based on preferences or purchasing history [91]. Driven by AI and a data-centric approach, Marketing 5.0 practices involve the analysis of previous buying patterns (predictive analytics) and customer feelings (sentiment) at the time of purchase to monitor customer purchasing intentions for more targeted and successful promotion of products and services [156]. In this context, Marketing 5.0 is defined as a tech-enabled and customer-centric approach, whereby advanced technologies are used to gather insights and create personalized experiences for customers [91].

Predictive modelling (PM) makes use of different algorithms (such as long short-term memory (LSTM), Bidirectional Encoder Representations from Transformers (BERT), Support Vector Machine, or Naïve Bayes) to determine patterns within a dataset and forecast the probabilities of events taking place in the future [61]. As claimed by Taherkhani et al. [147], a prediction model is more sophisticated than current sales approaches, as it offers a better visualization of best-selling products on a dashboard. This results in better positioning within the competitive market by providing customers with products of choice that also meet their needs. Marketing 5.0 has been further enhanced with the integration of sentiment analysis (SA), which is used to determine feelings expressed through textual content or facial expressions [124]. Wang et al. [159] defined sentiment analysis as the identification and categorization of sentiments expressed in a text source, such as comments, product reviews, or news feeds on social media platforms such as Facebook, LinkedIn, and Instagram.

Regarding literature reviews within the digital marketing field, Al-Sai et al. [13] conducted a systematic review of the structure of big data and its uses in different types of data analysis (e.g., sentiment and predictive). Wang et al. [159] analyzed how natural language processing (NLP) can be used to improve the accuracy of sentiment-based models in different contexts. Moher et al. [105] analyzed the various steps involved in applying the preferred reporting items for systematic reviews and meta-analysis protocols (PRISMA-P) in a systematic review. Busalim and Hussin [28] reviewed different deep learning approaches that can be used to improve social commerce today. Guha, Dutta and Paul [53] analyzed different recommender systems available to guide online buyers throughout their purchase journey. For customer-oriented data, textual content, such as customer reviews, is examined mostly to learn about customers’ beliefs, attitudes, and sentiments [174, 176, 178]. These data, including sentiment and historical information, can be used for the analysis of product feedback, enabling the business to better manage its reputation by gathering online feedback through regular web extractions [121]. Shah et al. [134] claimed that reviews help businesses constantly improve their products/services, thereby strengthening customer loyalty. Moreover, PM can be used for better decision-making processes by forecasting the quantity of different categories of products that a business should have on hand [174, 176, 178]. In addition, customer data can be used for targeted marketing to send specific advertisements to a smaller but more specific group of potential customers who might be more interested in a specific product [101]. In their market analysis study, Mehmood et al. [99] claimed that over 54% of small businesses in the United States (US) used social media to acquire new customers, while 80% of the respondents claimed that they experienced an increase in their website traffic when advertisements were posted on social media. Therefore, to understand the different applications of predictive and sentiment analysis within the field of marketing, this study conducted a systematic literature review to analyze how previous studies have used predictive and sentiment models to improve customer experience, business decisions, and any other marketing process that constitutes the buyer’s journey.

Contribution of study

This paper highlights areas of improvement in the predictive modelling field and provides a better understanding of the integration of predictive and sentiment analysis within the context of Marketing 5.0. Furthermore, the sentiment-based model will shed greater light on customer preferences, thereby giving businesses insights on customers’ wants and needs by means of customer segmentation [103] and appropriate targeted marketing campaigns [129]. Moreover, such models can help businesses to strengthen their competitiveness by analyzing their competitors’ reviews to identify weaknesses that can be addressed for better business performance [59]. Also, customers could receive advertisements that may facilitate their decision making and purchasing journey [170].

Paper structure

This paper is structured as follows: Sect. "Digital marketing" explains the evolution of marketing from traditional to digital. Sect. "Artificial intelligence (AI) for improved digital marketing" explains the application of AI to enhance digital marketing, and Sect. "Data preprocessing" describes the data-cleaning steps required before modelling. Sect. "Predictive modelling" covers the application of predictive modelling within the marketing field, while Sect. "Sentiment analysis (SA)" explains the use of customer reviews as a means of understanding customers’ feelings (sentiments) during online purchases. The PRISMA-P methodology is explained in Sect. "Research methodology" and the various steps involved are covered in the sub-sections. The research questions are discussed in Sect. "Results and findings". Sect. "Conclusion" concludes the paper with an acknowledgement of this study’s limitations and suggestions for future research directions.

Digital marketing

This section explains the digitalization of marketing via the Internet and its impact on buyers’ purchase journeys by encouraging more online sales. The sub-sections explain the contribution of Industry 4.0 technologies (predictive and sentiment analysis) to modernizing marketing processes and to better understanding customer perception during a purchase.

Since its creation in 1983, the Internet has been extensively used worldwide. From 16 million in 1995, the number of Internet users reached 4536 million in 2021 ([45], 3). This growth was further boosted by the COVID-19 pandemic, during which online platforms flourished because they were an effective way to reach customers during lockdown periods [108]. Many businesses had to shift to online sales using websites or social media to promote and sell their products, leading to estimated worldwide retail e-commerce sales of more than 5.7 trillion U.S. dollars by the end of 2023 (Statista, “E-Commerce worldwide: statistics and facts”). As customers’ day-to-day interconnectivity increases, it is vital for businesses to understand online customer behavior in an increasingly competitive market [55]. Despite numerous platforms being available for online buying, Kakalejcik, Bucko, and Vejacka claim that “96% of website visitors do not purchase from an online platform during their first visit” (2019, 47–58) because of a lack of online guidance. The buyer’s journey, which comprises three main stages, is illustrated in Fig. 1.

In the awareness stage, customers decide to search for products or services online based on their needs or wants. This is followed by the decision-making phase (consideration phase), in which customers look for product and/or seller reviews prior to deciding whether to proceed with their choice or to find another solution. Finally, in the purchasing phase, customers confirm their final choice and finalize their purchases [75]. Not all businesses can track and guide customers at each stage of a buyer’s journey to ensure an enjoyable purchase experience and a high level of customer satisfaction [47]. For instance, businesses often lose customers throughout the journey because of insufficient online customer assistance and interaction channels at the decision-making stage, where customers need to be convinced before proceeding to the purchase of the product [71]. Therefore, the availability of automated responses through online services using artificial intelligence (AI) improves digital marketing [34]. For example, AI chatbots can be used to assist customers and provide them with positive reviews posted by previous customers. Furthermore, Rivas and Zhao [127] used ChatGPT-powered models to create advertisement content and AI-generated posts, claiming that this saved time so that the business could concentrate on other tasks.

Digital marketing is less time-consuming, as it uses online tools to advertise and persuade online buyers to purchase a specific product or service through business websites and social platforms such as Facebook, Instagram, and TikTok [74]. Mika and Winczewski [102] suggested that redirecting useful content to consumers based on their searches, social media views, or browser activity, and displaying related ads on their screens, enables faster and more effective communication with customers through online channels. Furthermore, using AI techniques to extract and analyze social data could provide businesses with better insights.

Artificial intelligence (AI) for improved digital marketing

The emergence of AI in the era of Industry 4.0, and its application to Marketing 5.0, drove the transition to digital marketing. AI is a game changer for digital marketers who integrate cutting-edge technologies into their marketing plans to boost product visibility online [16]. AI-driven tools provide marketers with knowledge obtained from customer data, such as purchasing history and product reviews, while providing customers with purchasing flexibility. However, this convenience can pose difficulties for buyers who do not have the opportunity to physically examine products before purchasing [68]. Therefore, buyers look for online product reviews and read comments from previous customers before deciding on a purchase. Thus, online customer reviews comprise critical data that are also actively sought and closely monitored by businesses [158].

These data are collated and analyzed to extract key information, such as best-selling products and common purchasing patterns, thereby assisting businesses in stock management [22]. As a result, buyers’ experiences are improved by providing them with the products of their choice based on historical data [141]. With evolving consumer behavior, enterprises are progressively embracing a ‘pull marketing strategy’. This approach encourages customers to proactively seek online experiences through channels such as social media and influencer marketing [174, 176, 178].

For instance, businesses use customer reviews to determine buyers’ emotions during online purchases. Sentiment polarities are represented by positive, negative, or neutral feelings towards a product or service, helping to better understand customer satisfaction [67]. AI also provides voice assistants to simplify the purchasing process [96]. However, customer reviews may also include sarcastic or hateful comments posted by users, which can negatively influence buyers [130]. Therefore, it is important to clean buyers’ data before monitoring customers’ purchasing intentions and forecasting their tentative next purchase using machine learning (ML) algorithms through predictive modelling [42]. The section that follows explains the preprocessing stage before discussing the AI techniques used to improve digital marketing, namely predictive modelling and sentiment analysis.

Data preprocessing

Before conducting PM on the dataset, a data-cleaning process is required to prepare and transform the data. For instance, outliers, missing rows of data, and inconsistencies must be identified and handled so that they do not distort the model [24]. Missing values can be replaced with estimates through imputation, whereby the missing values are derived or calculated from existing variables. For instance, Hassler et al. [58] used weight, height, and the BMI formula to replace missing BMI values in their dataset before analysis. Reducing the number of missing values or deriving new variables improves the model’s performance [148].
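
The imputation idea can be illustrated with a minimal sketch. It assumes weight in kilograms and height in metres and uses the standard formula BMI = weight / height²; the function name `impute_bmi` and the record fields are hypothetical and are not taken from Hassler et al. [58].

```python
from typing import Optional


def impute_bmi(weight_kg: float, height_m: float, bmi: Optional[float]) -> float:
    """Return the recorded BMI, or derive it from weight and height when missing."""
    if bmi is not None:
        return bmi
    # Standard BMI formula: weight (kg) divided by height (m) squared.
    return weight_kg / (height_m ** 2)


# Hypothetical records; the second one has a missing BMI value.
records = [
    {"weight_kg": 70.0, "height_m": 1.75, "bmi": 22.9},
    {"weight_kg": 80.0, "height_m": 2.0, "bmi": None},
]

filled = [impute_bmi(r["weight_kg"], r["height_m"], r["bmi"]) for r in records]
```

Recorded values are left untouched, so only the gaps are filled with derived estimates.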

Furthermore, normalization can also be used to ensure all the feature variables (columns of data) have a specific scale between 0 and 1 (min–max scaling), thus ensuring that all features contribute to the training process of a model. This process can be represented in general using the following formula:

$$X_{\text{normalized}}=\frac{X-\min(X)}{\max(X)-\min(X)}$$

X is the original feature vector, X_normalized is the final normalized value of the original vector, min (X) is the minimum value, and max (X) is the maximum value of the feature vector [142].
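
As a small illustration of min–max scaling, the sketch below applies the formula to a single feature column. The function name and the sample values are hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1] using min-max scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column, e.g. amount spent per order.
spend = [10.0, 20.0, 30.0, 50.0]
scaled = min_max_normalize(spend)
```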

In some situations, data must be transformed or converted to other data types that the model understands, or into more specific variables. For instance, “time since last purchase” can be obtained if the date of a customer’s purchase and a reference date are available. Furthermore, if the data are skewed, logarithmic (log (x)) and square root (sqrt (x)) transformations can be used because they compress larger values and expand smaller ones [44]. Once the data preparation phase is complete, specific variables (columns of data) must be selected through the feature-engineering process. Variables can be tested individually or against each other and visualized through Python libraries such as matplotlib and pandas [85]. The data preprocessing and feature-engineering phases ensure good data quality [58]. The next two sections will discuss two Marketing 5.0 AI strategies: predictive modelling and sentiment analysis.
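
The transformations mentioned above can be sketched as follows. All function names and sample values are hypothetical. The sketch uses `math.log1p`, that is log(1 + x), in place of a plain log(x) so that zero values remain defined; that substitution is an assumption of this example, not something prescribed by the cited work.

```python
import math
from datetime import date


def days_since_last_purchase(last_purchase: date, reference: date) -> int:
    """Derive a recency feature from a purchase date and a reference date."""
    return (reference - last_purchase).days


def log_transform(values):
    """Compress large values; log1p is used so that zero stays defined."""
    return [math.log1p(v) for v in values]


def sqrt_transform(values):
    """Milder compression than the logarithm; requires non-negative values."""
    return [math.sqrt(v) for v in values]


recency = days_since_last_purchase(date(2024, 1, 1), date(2024, 1, 31))

# Hypothetical right-skewed order values.
order_values = [0.0, 9.0, 99.0, 9999.0]
logged = log_transform(order_values)
rooted = sqrt_transform(order_values)
```

After either transformation, the gap between the largest and smallest values is much narrower than in the raw data, which is the compression effect described in the text.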

Predictive modelling

The process of creating mathematical or statistical models that indicate future outcomes based on past and present data is known as predictive modelling [72]. This area of data science involves pattern recognition, utilizing data to identify connections and predict future trends [57], assisting companies in taking appropriate actions. Several strategies have been adopted to influence customers’ next purchases. These include cross-selling (CS), which is used to increase the sales volume per customer while maintaining a good customer relationship. For instance, if customers purchase cereals in bulk, the business will advertise products that go well with cereals, such as milk or coffee, to encourage their purchase and reduce the tangible and intangible costs of a customer switching to a different seller [73]. Another common method used to predict customer purchases is the analysis of customer data, such as demographics and purchasing history. The demographic variables age, education, and marital status have an impact on customers’ choices of products and can be used to forecast future purchases [94]. Businesses have also started identifying loyal customers by tracking click-stream data, including clicks and impressions from different platforms [54]. Redirecting useful materials to consumers based on their searches, social media views, or browser activity and displaying ads on their screens enables faster and more effective communication with customers through online channels [111]. The next section explains the different stages involved when conducting predictive modelling.

Stages of predictive modelling

Predictive modelling (PM) consists of a three-staged process: data acquisition, model selection and model testing.

Data acquisition: Firstly, an appropriate experimental design is chosen to generate the experimental data. This stage ensures that the correct data are acquired for the study so as to control bias and eliminate inaccurate data [131]. Furthermore, this step outlines the data collection strategies and establishes whether data for analysis will be collected from online social platforms or Kaggle.
Model selection: A model is selected to represent the experimental data by comparing different models and determining which one is best suited to the dataset to be tested. This is important because different models have different strengths and weaknesses, leading to inconsistencies if the chosen model is not fully justified [3]. For instance, machine learning models could require less training time than deep learning models because of their less complex structures [26], but their accuracy might not be as high as expected.
Model testing: The model is tested on a dataset [72]. After the model is chosen, the data are fed to it to evaluate its performance using different metrics such as accuracy, precision, and F1 score. This helps to identify potential problems of overfitting and underfitting so that the model can be further optimized [135].
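
The evaluation metrics named in the model testing stage can be computed directly from predicted and actual labels. The sketch below is a minimal illustration for a binary task; the labels are made-up data and the function name `evaluate` is hypothetical. Recall is included because the F1 score is defined from precision and recall.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = customer purchased, 0 = customer did not purchase.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
```

Comparing these metrics on training data and on held-out data is one common way to spot overfitting (high training scores, low held-out scores) or underfitting (low scores on both).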

The chosen model can be represented by f_(w,b)(x) = wx + b, where w and b represent the parameters used in the prediction [51]. The cost function (mean squared error, MSE), which measures how far the predictions are from the actual targets, is defined as follows:

$$J\left(w, b\right)=\frac{1}{2n}{\sum }_{i=1}^{n}{\left({f}_{w,b} \left({x}^{\left(i\right)}\right)- {y}^{(i)}\right)}^{2}$$

Here, w represents the coefficient of the model, b is the bias term, f_(w,b)(x⁽ⁱ⁾) is the predicted value for input x⁽ⁱ⁾, y⁽ⁱ⁾ is the actual value corresponding to input x⁽ⁱ⁾, and n is the number of data points in the dataset [51, 52]. The lower the value of J(w,b), the closer the predictions are to the actual values, so J(w,b) must be minimized to obtain an accurate model.
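
To make the cost function concrete, the sketch below evaluates J(w, b) for the linear model and reduces it with batch gradient descent. Gradient descent is one common way of minimizing J and is an assumption of this example, since the text does not name an optimizer. The data, learning rate, and iteration count are hypothetical.

```python
def predict(x, w, b):
    """Linear model f_(w,b)(x) = w * x + b."""
    return w * x + b


def cost(xs, ys, w, b):
    """J(w, b) = 1/(2n) * sum of squared prediction errors."""
    n = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def gradient_step(xs, ys, w, b, lr):
    """One batch gradient-descent update of w and b."""
    n = len(xs)
    dw = sum((predict(x, w, b) - y) * x for x, y in zip(xs, ys)) / n
    db = sum((predict(x, w, b) - y) for x, y in zip(xs, ys)) / n
    return w - lr * dw, b - lr * db


# Made-up data generated from y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0
initial_cost = cost(xs, ys, w, b)
for _ in range(5000):
    w, b = gradient_step(xs, ys, w, b, lr=0.05)
final_cost = cost(xs, ys, w, b)
```

On this toy dataset the parameters move towards w = 2 and b = 1 and the cost falls towards zero, which is the behaviour the text describes as decreasing J(w, b).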

Various methods can be used to conduct predictive modelling. Nonetheless, the data preprocessing described above, combined with the three-stage predictive modelling process, proved to be more systematic when employing pretrained models that required minimal training time [65, 72].

For instance, several researchers [39, 72, 85, 93, 112, 165] have investigated predictive modelling algorithms and techniques. Wen et al. [164] used customer feedback and clickstream datasets from a shopping platform in China to develop a prediction model for customers’ purchasing intentions. They recorded a good predictive F1-score of 0.9031 with Random Forest for an optimal time window of 2 days. Their study helps businesses understand that purchase intention is influenced by fashion products and public reputation established through social media. Khanna and Maheshwari [80] developed a predictive model based on regression and statistical methods to forecast weld bead dimensions using welding data. Furthermore, their mathematical models were able to make predictions based on different variables after data transformation through mathematically derived formulas. This process had a 2% error margin because of the coefficients used during the conformity test, which could have been improved by the use of deep learning algorithms ([80], 4481–4483). Rahmani et al. [122] utilized Random Forest and AdaBoost modelling techniques to predict steer prices. Their test data had a confidence interval of 95% ([122], 15–16), indicating the effectiveness of the multivariate approach and the effective use of probabilistic modelling for price variability. This has helped businesses minimize the wastage of resources, thus making them more sustainable. However, to improve the reliability of the results, other external factors that affect pricing could have been added to the training dataset to determine new coefficients and relationships between the different variables. Thangeda et al. [152] collected data from Andhra Pradesh Telecom users and used a nonlinear adaptive approach to predict customer churn in the telecom industry. Different trigonometric and linear combinations were used to train the model, and they exhibited promising convergence performance. However, the dataset comprised information that was too generic and did not follow a structured feature-engineering approach, thus leading to biased results. Therefore, sentiment data (polarity, sentiment scores, feelings, word aspects) could be integrated with purchasing data to provide better insights through a sentiment-based predictive model. The next section explains sentiment analysis and its applicability in the current Marketing 5.0 era.

Sentiment analysis (SA)

Sentiment analysis (SA) is a discipline that studies consumer responses towards goods and services to assess how customers’ feelings are reflected in their purchasing attitudes and product evaluations [20]. Singh and Singh [139] described SA as a study of feelings based on textual information published as online evaluations on various social media platforms. Chan et al. [32] claimed that these evaluations could be used to analyze people’s attitudes, sentiments, and ideas regarding a certain event to understand unstructured data (raw data collected from social media). This process is referred to as the traditional method of conducting sentiment analysis through polarity extraction and contextualization of sentence structure [87]. SA helps to determine whether textual content is positive, negative, or neutral by analyzing feelings and opinions using deep learning (DL) or natural language processing (NLP) techniques [137]. DL is a subset of machine learning methods that can be used to identify patterns or perform complex tasks using data [173]. On the other hand, NLP refers to the ability of computer programs to understand text or spoken language in a similar way to humans through the constant training of AI models [85].

The basic SA model consists of feature extraction (transforming raw data into statistics), a training set (the dataset that will be used to train the model), and a classifier (the model used to analyze data), which indicates the polarity of the textual content [137]. Knowledge-based sentiment analytics (KBSA), an improved version of SA, is used to extract features such as emotions, linguistics, sarcasm, and lexicons from customer feedback. To manage the different feelings indicated in review comments and forecast consumer opinions of items, multi-sentiment analysis, which uses multiple models, is applied to customer review data collected from different sources such as blogs, Facebook, Twitter, or e-commerce websites. KBSA also includes tokenization, where long sentences are broken down into smaller phrases, to which scores are allocated to determine their polarity [97]. Breaking down sentences into single words provides a fine-grained view of the overall sentiment, which facilitates the evaluation of emotional tone [60]. For example, a sentence such as “The movie was not only captivating but also brilliantly directed” is tokenized in emotion analysis by breaking it up into individual words: “The”, “movie”, “was”, “not”, “only”, “captivating”, “but”, “also”, “brilliantly”, and “directed”. The emotion associated with each token is then evaluated, enabling the analysis to identify both positive and negative feelings found in the text.
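
The tokenize-then-score idea can be sketched in a few lines. The lexicon below is a made-up toy dictionary, not a published resource, and the function names are hypothetical; real systems rely on much larger lexicons or trained classifiers.

```python
import re

# Hypothetical toy lexicon mapping words to sentiment scores.
LEXICON = {
    "captivating": 1.0,
    "brilliantly": 1.0,
    "boring": -1.0,
    "poorly": -1.0,
}


def tokenize(sentence):
    """Split a sentence into word tokens, dropping punctuation."""
    return re.findall(r"[A-Za-z']+", sentence)


def score_tokens(tokens):
    """Attach a lexicon score to each token (0.0 when the word is unknown)."""
    return [(t, LEXICON.get(t.lower(), 0.0)) for t in tokens]


def polarity(tokens):
    """Classify the overall polarity from the sum of token scores."""
    total = sum(score for _, score in score_tokens(tokens))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


tokens = tokenize("The movie was not only captivating but also brilliantly directed")
label = polarity(tokens)
```

Summing per-token scores ignores word order, so constructions such as negation or sarcasm are not handled by this sketch; this limitation is one reason for the more context-aware approaches discussed in this section.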

Contrary to the conventional document and sentence sentiment analysis, ABSA (Aspect-Based Sentiment Analysis) looks at the viewpoint directly, which is the root of the sentence, thus making it more relevant to the context. ABSA can also include topic modelling (TM) abilities used to explore and find patterns automatically from sentences. TM conducts clustering (grouping of words), finds patterns, and determines the probabilistic distribution of topics. Figure 2 summarizes the types of sentiment analysis, from traditional to knowledge-based and aspect-based sentiment analysis.

Therefore, it is evident that sentiment analysis has shifted from a traditional approach to a more aspect-based approach in order to provide information beyond sentiment polarity and sentiment scores [115, 168]. For instance, customer trust and loyalty can be determined through aspect-based sentiment analysis, and these factors could help a business understand whether a customer has been successfully retained. Furthermore, numerous studies have focussed on ways to improve digital marketing strategies to increase sales, taking into consideration various purchasing determinants including competitiveness, pricing strategies, discounts and product ratings [8, 40, 53, 154]. However, they have not considered merging these purchasing factors with sentiment data to obtain a better forecast of customers’ future purchasing behaviors. Significant attention has been directed towards the utilization of artificial intelligence, deep learning and natural language processing techniques for the extraction and analysis of sentiment expressed in customer reviews [109, 117]. However, some studies [43, 89, 163] have constructed predictive models based solely on sentiment polarities, overlooking crucial factors obtained from reviews such as customer trust, loyalty and customer retention. Conversely, others [79, 81] have primarily focussed on trust, neglecting the exploration of other purchasing factors that could be derived from customer reviews. In the light of these findings, this study proposes a sentiment-based predictive model that uses sentiment factors (polarity, sentiment score, trust, loyalty and retention) merged with purchasing history to forecast the subsequent purchase intentions of customers. The next section explains the research methodology that has been used for the study.
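
One possible way to picture the merging step is to join per-customer sentiment factors with purchasing history on a customer identifier, producing a single feature record per customer. The sketch below is only an illustration under that assumption; the field names, customer IDs, and values are hypothetical, and the paper does not specify an implementation.

```python
def merge_features(purchases, sentiments):
    """Join purchase history with sentiment factors on customer id.

    Customers without sentiment data are skipped (an assumption of this
    sketch; imputing default values would be another option).
    """
    merged = {}
    for customer_id, purchase_row in purchases.items():
        sentiment_row = sentiments.get(customer_id)
        if sentiment_row is None:
            continue
        merged[customer_id] = {**purchase_row, **sentiment_row}
    return merged


# Hypothetical purchase-history features per customer.
purchases = {
    "c1": {"orders_last_90d": 4, "avg_order_value": 35.0},
    "c2": {"orders_last_90d": 1, "avg_order_value": 120.0},
}

# Hypothetical sentiment factors extracted from each customer's reviews.
sentiments = {
    "c1": {"polarity": "positive", "sentiment_score": 0.8,
           "trust": 0.7, "loyalty": 0.9, "retention": 1},
}

features = merge_features(purchases, sentiments)
```

Each merged record could then serve as one input row for a predictive model of purchase intention.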

Research methodology

A systematic literature review (SLR) approach was adopted to address the research questions of this study. SLRs provide clear identification, analysis, and display of data collated from previous research conducted in the chosen area of study [116]. This strategic evidence is useful to researchers in different fields, such as artificial intelligence (AI). The use of SLR sometimes lacks flexibility if the guidelines are not followed properly, because it requires more in-depth research and analysis of sentiment-based predictive modelling [76]. However, SLR is appropriate for this study, as it provides a more comprehensive and evidence-based structure because of the various previous studies that are well tabulated with proper identification of trends, patterns, and inconsistencies [25]. Furthermore, the factors that were considered during their proposed framework/model were also analyzed, enabling a better identification of research gaps.

Review protocol: PRISMA-P

Many review protocols are available for conducting a systematic literature review. Two of them are the meta-analysis reporting standards (MARS) and the preferred reporting items for systematic reviews and meta-analysis protocols (PRISMA-P). MARS could have been used for this study. However, it is more suitable for studies involving statistical methods and quantitative results in their reviews of research topics [76, 128]. On the other hand, PRISMA-P is a technique used to review academic journals and articles prior to formulating the research questions. With this technique, the inclusion and exclusion criteria are stipulated, and the data extraction and search approach is applied based on the findings of various authors [49]. It helps minimize research bias by carefully considering the research background, research questions, search strategy, selection of studies, quality assessment, and data extraction and synthesis [28]. Therefore, PRISMA-P has been chosen for this study, comprising identification, screening, eligibility, and inclusion phases.

In the identification phase, authors search for journals/articles in specific databases, such as ProQuest or Scopus, based on keywords or paper titles. Screening refers to a review of the papers gathered in the identification phase. Predefined quality assessment (QA) questions are used to filter journal papers and reduce the number to those most relevant to the study according to their titles and abstracts. In the eligibility phase, papers are further divided into categories using the database filter option to select papers from the fields of consumer behavior, purchasing factors, sentiment analysis, and predictive modelling. In the inclusion phase, a quality assessment plan is used to score the selected papers that are used to answer the research questions.

Following the steps in the PRISMA diagram (see Fig. 3), 150 journal papers were considered in this study, comprising 30 journal papers for the analysis section and 120 journal papers for all other sections. The 30 selected papers were suitable for this study because they covered sentiment and predictive modelling in a marketing context. Furthermore, they were specific to buyers’ journeys, which helped to further improve the findings of this study with relevant comparison data for the different purchasing factors and behaviors.

All papers used to answer the research questions were published between 2021 and 2024, thus covering the latest technological developments of the various models available. Additionally, many of these 30 papers provided sufficient information to address the research questions of this study. To filter the different studies, the following formula was used, and the number of papers at each phase is shown in Fig. 3:

X = papers in green boxes, indicating the initial number of papers considered.

Y = papers in red boxes, indicating those that were eliminated.

Z = papers still in the green box to be considered in the next phase.

Z = X – Y

The next section explains how these journal papers were sourced (Fig. 3).

Search strategy

In this phase of the systematic review, the ProQuest database was used to search for relevant papers. These were not limited to specific journals, such as the International Journal of Data Science and Analytics, Journal of Machine Learning Research, Journal of Sentiment Analysis and Opinion Mining. Instead, journals were explored based on the research questions and the main keywords such as “sentiment” and “predictive modelling” and search phrases such as “customer purchase prediction model” and “use of sentiment analysis for buyers’ reviews”.

However, identifying publications based on keywords and search phrases alone is not very efficient, as it depends only on word similarity using term weights and shared references (the existence of the same citations across multiple papers) [28]. Citation searching is a good approach for enhancing the search for papers. Backward citation searching, which involves looking at references cited by a particular work, was therefore also used [48]. All publications were then stored in EndNote to allow direct citations from search databases through direct export [30], facilitate the removal of duplicates, and organize papers in a systematic way under clearly labelled folders.

Inclusion and exclusion criteria for study selection (filtering process)

The inclusion and exclusion criteria were established to ensure that the selected studies were current and relevant. Abstracts were analyzed using EndNote, and a new subgroup (SG1) was created in the reference manager to retain only those papers that were relevant to the research questions [34]. All journal papers addressing customer prediction models and using customer data to predict purchases or analyze trends to forecast purchasing intentions were saved. Regarding sentiment analysis, studies providing the steps used to analyze customers’ textual reviews using different algorithms were considered. Journal papers exploring purchasing history data were also included in SG1 and systematically analyzed. Papers published before 2021 were excluded from this review because the field of predictive analytics has evolved rapidly, and recency is considered essential for an SLR.

Quality assessment

This study aimed to review journal papers covering sentiment and predictive analytics within the field of marketing, specifically buyers’ journeys. The limitations of the current models were determined and presented in a tabular format, along with previous studies that have addressed various gaps. Furthermore, customer purchasing behaviors obtained from different studies were analyzed to understand the areas that require more focus. A quality assessment (QA) stage was required to ensure that only relevant, good-quality papers were selected for the study. As advised by Jadhav, Gaikawad and Bapat [66], three levels of quality schema, categorized as high, medium, and low, were used.

This study focused on addressing the following research questions:

1. What are the predictive and sentiment analysis models used to improve Marketing 5.0?
  a. What predictive analytics models have been used to improve Marketing 5.0?
  b. What sentiment models have been used by previous studies for analyzing customer reviews?
2. What are the factors considered in online purchase predictive models?
3. What are the challenges and limitations of existing sentiment-based predictive models?
  a. What was missing from previous sentiment-predictive models?
  b. What are the challenges in terms of dataset transformation and model development?

Several journal papers were used to answer the RQs, and quality scores (QS) were applied to filter papers by assigning scores to them based on several quality assessment questions.

Quality scores were assigned to each SG1 paper based on the following criteria:

QA1: Does the paper address customer purchase predictive analysis?
QA2: Was sentiment analysis used to explore customer reviews?
QA3: Were algorithms or technologies used to predict customer purchase intention clearly explained?
QA4: Have the factors considered for customer purchase predictions been adequately discussed?

The scores for each paper were calculated based on the quality assessment questions using a scale of 0 to 2, where 0 meant that the paper did not fulfil the QA being evaluated (N), 1 meant that it partially addressed it (P), and 2 meant that it fully covered the QA being assessed (Y). To avoid assessing papers based only on linguistic terms, all Y, P, and N answers were converted into numerical scores for better interpretability [12].

The total for all four QAs was calculated using the following formula:

$\mathrm{F.S}(P_n) = \sum_{i=1}^{4} \mathrm{QA}_i$, where F.S stands for the final score, $P_n$ stands for the paper number, and $\mathrm{QA}_i$ is the score given for the i-th QA question.

Therefore, when $\mathrm{F.S}(P_n) \ge 6$, the paper has a high QA; when $4 \le \mathrm{F.S}(P_n) \le 5$, medium QA is inferred; and when $\mathrm{F.S}(P_n) < 4$, this implies low QA. Table 1 presents these explanations, which helps eliminate search bias and increases the validity of the literature review [13].
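The scoring scheme above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the example paper are assumptions, and the medium band is read as a final score of 4 or 5.

```python
# Illustrative sketch of the QA scoring scheme; all names are assumptions.
SCORE = {"Y": 2, "P": 1, "N": 0}  # fully, partially, not addressed

def final_score(answers):
    """Sum the numeric scores of the four QA answers (QA1 to QA4)."""
    if len(answers) != 4:
        raise ValueError("expected one answer per QA question (QA1 to QA4)")
    return sum(SCORE[a] for a in answers)

def qa_level(score):
    """Map a final score (0 to 8) to the three-level quality schema."""
    if score >= 6:
        return "high"
    if score >= 4:
        return "medium"
    return "low"

example = ["Y", "Y", "P", "N"]   # hypothetical paper, not taken from Table 2
total = final_score(example)     # 2 + 2 + 1 + 0 = 5
print(total, qa_level(total))    # prints: 5 medium
```

With four questions scored from 0 to 2, the final score ranges from 0 to 8, so every paper falls into exactly one of the three levels.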

Table 1 Quality Assessment (QA) Metrics

The QA questions were used to determine which papers were eligible for the study, and the papers were filtered accordingly through the eligibility phase shown in Fig. 3. After this process, 30 journal papers were selected; their individual scores and QA assessments are presented in Table 2. The high and medium QA papers were used to answer the research questions related to SA and PM, while the low QA papers were used to answer the research questions related to purchasing factors.

Table 2 Quality assessment of academic papers

Reporting review

After conducting the QA for all shortlisted papers, the results of the systematic review were reported [84], extending research on customer predictive analytics and answering the research questions of this study. This is discussed next.

Results and findings

This section uses the findings from previous studies to answer the research questions presented in the methodology section. This was achieved by examining research gaps within previous models, constraints in their study analysis, performance metrics, and the application contexts (such as marketing, sales, stock pricing, etc.). To broaden the understanding of purchasing patterns, this study focussed on different product types and services to analyze gaps rather than on only one specific product. Consequently, although product type is used as a predictive variable to track customer purchases, it is not the only element that can be used. There are multiple other factors to consider, such as customer feeling, the need to purchase, budget, and post-purchase emotion. Therefore, this study proposes a sentiment-based predictive model to address the existing gaps in the context of customer purchase behaviors. The reviews are presented in a tabular format to better understand how different models have been used in different scenarios to assess their impact on marketing outcomes. Studies have demonstrated that predictive analytics can accurately forecast outcomes based on various factors that contribute to customer purchasing decisions (satisfaction, rating, reviews, and loyalty). The models were able to identify the key factors that influence customer behavior and used these factors to make predictions. The results vary under different circumstances and are explained in the following subsections.

Predictive analytics approaches (RQ1-a)

As discussed in Sect. "Data preprocessing", predictive modelling (PM) is an AI sub-component that enhances Marketing 5.0 through its ability to forecast customer behaviors and preferences [19]. PM can also be used to analyze customers’ purchase intentions based on historical data. Multiple approaches have been used to develop predictive models in existing research. Figure 4 depicts the four approaches along with their associated models, which are explained below.

Classical machine learning

Classical machine learning (CML) is a traditional approach that uses algorithms such as support vector machines (SVM), logistic regression (LR), factorization machines (FM), decision trees (DT), and random forests (RF). These methods use manual feature engineering, where relevant features are filtered from the input data to be used in the model for proper analysis.

Ensemble learning

Ensemble learning (EL) involves combining different models to develop a more robust model that will, in most cases, outperform the individual models when evaluated. The main characteristics of EL are bagging and stacking concepts [104]. Bagging refers to the process of training multiple instances of the same algorithm on different subsets of training data. Stacking is a method of combining the predictions of multiple base models through multiple training sets [180]. Examples include extreme gradient boosting (XGBoost), gradient boosted decision trees (GBDT), and light GBM (LGBM).
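The bagging idea described above can be illustrated with a minimal sketch. It assumes a single hypothetical feature (a sentiment score between 0 and 1) and a binary purchase label; the threshold "stump" learner, the toy data, and all function names are illustrative assumptions and are not taken from the reviewed studies.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a one-feature threshold rule: predict 1 if x > t, else 0."""
    best_t, best_errors = None, None
    for t, _ in points:  # candidate thresholds are the observed x values
        errors = sum((1 if x > t else 0) != y for x, y in points)
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t

def bagging_fit(points, n_models=15, seed=0):
    """Train the same learner on several bootstrap samples of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]  # sample with replacement
        models.append(fit_stump(sample))
    return models

def bagging_predict(models, x):
    """Combine the individual stumps by majority vote."""
    votes = Counter(1 if x > t else 0 for t in models)
    return votes.most_common(1)[0][0]

# Hypothetical data: (sentiment score, purchased?) pairs.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]
models = bagging_fit(data)
print(bagging_predict(models, 0.05), bagging_predict(models, 0.95))  # prints: 0 1
```

An odd number of models is used so that the binary majority vote cannot tie. Stacking would differ in that the outputs of the base models are fed to a further model rather than simply counted.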

Deep learning

Deep learning (DL) is a sub-component of machine learning (ML) that involves neural networks with multiple layers capable of learning patterns from large datasets [130]. It is suitable for natural language processing, such as speech recognition and textual review analysis [24]. Examples are long short-term memory (LSTM), bidirectional encoder representations from transformers (BERT), and generative pre-trained transformer 3 (GPT-3). However, the problem of vanishing gradients can arise when DL algorithms are used. This occurs in networks with many layers: as the error gradient is propagated backwards from layer to layer, it can shrink repeatedly, so the weights of the earlier layers receive very small updates. Consequently, the weights of the different layers are not updated properly, thereby producing less efficient results [92]. A technique used to minimize this effect is batch normalization, which requires the scaling and centering of data [163].
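The centering and scaling step of batch normalization can be sketched as follows for a single feature over one mini-batch. This is a simplified illustration under stated assumptions: the activation values are hypothetical, and the scale (gamma) and shift (beta) parameters, which are learned during training in practice, are fixed here; the running statistics used at inference time are omitted.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Center and scale one feature across a mini-batch, then rescale and shift."""
    mean = sum(batch) / len(batch)
    var = sum((x - mean) ** 2 for x in batch) / len(batch)
    # eps avoids division by zero when all values in the batch are equal
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

activations = [2.0, 4.0, 6.0, 8.0]    # hypothetical layer outputs
normalized = batch_norm(activations)  # mean close to 0, variance close to 1
print(normalized)
```

Keeping the inputs of each layer in a comparable range in this way is one reason the technique helps gradients retain a usable magnitude in deeper networks.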

Fusion model

The fusion model is a combination of the outputs from different models to obtain the final prediction from a dataset. It can be used at different levels, including the feature level (combining extracted features), decision level (combining model predictions), and sensor level (integrating data from various datasets) [1].
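Decision-level fusion, the second of the levels listed above, can be sketched as a weighted average of the probabilities produced by several models. The model outputs, the optional weights, and the 0.5 decision threshold below are illustrative assumptions rather than values from the reviewed studies.

```python
def fuse_decisions(probabilities, weights=None):
    """Weighted average of per-model probabilities for the positive class."""
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weighting by default
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, probabilities)) / total

# Hypothetical purchase probabilities from three different models.
model_outputs = [0.80, 0.60, 0.70]
fused = fuse_decisions(model_outputs)  # simple average, about 0.70
label = int(fused >= 0.5)              # 1 = purchase predicted
print(round(fused, 2), label)          # prints: 0.7 1
```

Feature-level fusion would instead concatenate the extracted features before a single model is trained, so the choice of level determines where in the pipeline the models meet.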

Predictive modelling algorithms (RQ1-a)

The approaches adopted for predictive modelling can be different as discussed in Sect. "Predictive analytics approaches (RQ1-a)". However, data pre-processing and feature engineering phases are commonly used as explained in Sect. "Data preprocessing". Table 3 lists and explains the various predictive models used in previous PM studies.

Table 3 Algorithms/Models description and usage—prepared by the authors

Limitations from previous studies encountered with the different models are summarized under the weakness column. The number of papers that addressed each model is shown in Fig. 5.

Figure 5 shows the number of studies (x-axis) that use the various models listed (y-axis). As is evident, none of the papers reviewed in this study used GPT, whereas six of the 30 papers used bidirectional encoder representations from transformers (BERT) and support vector machine (SVM). Seven of the studies used long short-term memory (LSTM) and decision tree (DT), while five used linear regression. For the remaining studies, factorization machine (FM), Naive Bayes (NB), random forest (RF), light gradient-boosting machine (LightGBM), extreme gradient boosting (XGBoost), and gradient boosted decision tree (GBDT) were used. Many of these algorithms, which have different strengths, can be used for both sentiment and predictive models. Therefore, hybridizing two of these models to form a sentiment-based predictive model based on the nature of a dataset could offer a good solution to issues within the customer purchase field.

Application of sentiment models for customer reviews analysis (RQ1-b)

Some of the weaknesses listed in Table 3 can be mitigated by integrating sentiment models. For instance, BERT has a complex data architecture and requires an appropriate training phase. Therefore, training a dataset with sentiment and predictive models of BERT can help minimize complexity by better capturing the data relationships [181]. Furthermore, sentiment analysis involves data transformation, classification, training, and evaluation. This helps improve the categorization issues faced by many fusion and ensemble algorithms. The conversion of raw data into labelled data, followed by the training phase, results in a more scalable and flexible dataset that can be evaluated and visualized more easily [41]. Chen et al. [33] proposed a scalable DL model that uses an Apache dataset for analytics. They found that DL algorithms are more efficient when sentiment analysis is conducted in multiple phases of data preparation, feature extraction, and polarity determination. Apart from sentiment models, there are also several side algorithms, such as the Genetic Algorithm (GA) and Firefly Algorithm (FA), that can be used to further enhance the efficiency of SA. The next section explains the concepts of GA and FA.

Genetic and firefly algorithm

Genetic algorithms (GAs) are optimization algorithms modelled on natural selection. In a GA, new candidate solutions are produced by changing a collection of parameters or characteristics. GAs can be used in SA to improve the performance of sentiment classification models by tuning the data architecture to combine the variables. The firefly algorithm (FA) mimics the flashing behavior of fireflies, where brighter fireflies have a greater attraction power. Similarly, FA determines the groups of variables using the feature weight and parameter values. LSTM-DGWO (long short-term memory with differential grey wolf optimization) is used to improve the optimization phase of the SA. Datasets were divided into different levels for better training of the models [20]. Furthermore, SA involves a text classification phase; thus, it is important that models understand the context of the sentence being evaluated. Latent Dirichlet Allocation (LDA) is a probabilistic model that can be used for topic modelling to understand words within sentences being evaluated for their sentiment polarity [97, 137]. The next section explains how different deep learning models can be merged to produce better results.

Random multimodal deep learning

Random multimodal deep learning (RMDL) was used to combine diverse deep learning models to enhance SA performance. Local search with improved binary ant lion optimization (LSIBA-ENN) helps to optimize feature selection for classification tasks [20]. CNBL (convolutional neural network with binary layers) integrates binary layers into convolutional neural networks for efficient modelling [62]. SLCABG (semi-supervised learning with class-wise adversarial binary generative models) improves binary generative models for semi-supervised learning when training models using datasets.

When performing SA, ensemble learning algorithms, such as XGBoost, can be used. Therefore, extreme random forest with XGBoost (ERF-XBG) can strengthen the performance of SA with a better data classification [109]. The next section explains the findings and models of different sentiment-based studies.

Sentiment analysis models (RQ1-b)

Table 4 presents the studies, models used and the findings.

Table 4 Sentiment Analysis Models

Numerous studies have used machine learning and deep learning algorithms to analyze sentiments, demonstrating the efficacy of random forest, Naïve Bayes, and support vector machine classifiers. A BERT-integrated model using DGSO and LSTM was able to attain a remarkable 98% accuracy for sentiment categorization. The ensemble Bagging SVM and BERT/CNN showed good accuracies of 96.1% and 99.23% in restaurant and e-commerce sentiment analysis, respectively. Therefore, depending on the dataset, the fusion models performed well in terms of the evaluation metrics. Furthermore, most of the papers mentioned above used secondary data, that is, data extracted from social media endpoints or open sources, such as Yelp or Kaggle. Therefore, multiple other factors, such as feelings, emotions, reviews and ratings, and trust can and should be extracted from textual content since they play an important role in the buyer’s journey [119], and help to ensure a better customer experience. Therefore, the extraction of other variables by means of sentiment analysis can help to predict whether customers will be repurchasing from a specific business [175]. The next section discusses other purchasing factors that can be used to analyse buyers’ journeys.

Factors for online purchase predictive models (RQ2)

This section discusses the different factors and behaviors that influence customers during their online purchasing journey, such as customer service, product brand, customer ratings and reviews, and attraction to the product.

Purchasing factors

After reviewing the shortlisted papers, several factors were found to have been considered when determining customers’ purchasing patterns. These include customer behaviors and other business-related factors, as presented in Fig. 6 [27].

In the social media era, customer reviews have become a major factor in online purchase decisions. Businesses are increasingly focusing on tracking customer satisfaction through reviews to improve sales and track purchasing trends. Reviews can be used to determine customer preferences, attitudes, and loyalty, as well as to optimize marketing and sales strategies [9]. Customer satisfaction is another factor that drives online purchase decisions. Defined as the extent to which a business can resolve purchasing issues, high customer satisfaction can lead to better reviews, more online sales, and increased customer loyalty. The latter occurs when buyers repeatedly purchase from the same seller, regardless of competitors’ advertisements [146]. Loyalty can be determined by analyzing the purchasing patterns evident in customers’ purchase histories. Branding, pricing, and discounts should also be considered when analyzing patterns and retaining customers [110].

Branding is the process of creating a strong marketing perception that attracts customers and influences purchasing intentions [53]. Pricing is another important factor that customers use to evaluate their purchases and can vary depending on the brand. A favorable price can attract more customers and increase loyalty and retention probabilities. Additionally, sellers use discounts to attract customers by offering reduced prices for online sales [110]. Customers often take this opportunity to purchase more products from their preferred sellers, thus increasing the probability of a subsequent purchase because of greater customer satisfaction. The interface and web design of a commerce website also affect online customers. Visual design factors such as interface color, font type and size, and images have a positive impact on customers' trust, while ease of navigation improves customers’ journey by helping them find their preferred products more easily [123]. A good user interface design leads to a better display of products, which enhances customers' purchasing pleasure and experience during their purchase journey. Customers also highly value their personal data protection, and e-commerce websites with clear buyer data policies can easily attract online buyers and build buyer–seller trust.

Strong trust between customers and sellers can increase the loyalty factor, which can be integrated with sentiment and purchasing history to further enhance the predictive probability of customer repurchasing [40]. The number of papers addressing each of these factors is shown in Fig. 7.

Previous studies [143, 157, 174, 176, 178] have addressed customer satisfaction, attraction, customer reviews, pricing, and product ratings more often than factors such as customer retention, loyalty, and trust. Customer trust plays a fundamental role in cultivating loyalty and ensuring customer retention, as emphasized by studies. This trust can be established by aligning products and services with customer needs, ultimately influencing satisfaction levels and fostering long-term relationships. Loyalty manifests in customers’ repeated purchases and positive word-of-mouth, which significantly impact business success [98]. Despite its importance, the examination of customer loyalty remains limited, particularly in the context of mobile and online shopping, where customer switching costs are low [169, 174, 176, 178]. Therefore, the integration of sentiments yielded by customer reviews with purchasing history data enables the behavioral patterns to be identified, thereby facilitating the prediction of future purchase behaviors [113]. Additionally, customer retention efforts focus on understanding customer behavior and preferences so as to enhance satisfaction levels and predict future purchasing needs [50, 138, 140, 153], thus highlighting the need for a sentiment-based predictive model.

Moreover, a valuable asset of every online business is its faithful customer base with readily available funds. Therefore, businesses attempt to convert current customers into loyal ones to ensure repeated purchases from the same seller using different techniques, including electronic word-of-mouth (eWoM), influencer marketing, and the posting of online customer reviews and testimonials [132]. Therefore, the extraction of such factors (trust, loyalty and retention) through aspect-based sentiment analysis could be crucial to enhancing current marketing strategies, thus highlighting the importance of a sentiment-based predictive model.

Since influencer marketing involves the usage of social media, from which reviews can be obtained, this is explained in the next section.

Influencer marketing

Influencer marketing involves using well-known or less-known opinion leaders with sizable followings on social media to encourage favorable attitudes and actions from customers towards the brand, thus retaining them for the business. Any type of digital media-based positive or negative communication regarding a product or service is known as electronic word-of-mouth. This can drastically alter how consumers make judgments about what to buy by serving as the contemporary equivalent of old-fashioned word-of-mouth advertising. eWoM has the power to influence customer decisions to remain loyal to a certain business, thus ensuring a favorable evaluation that can boost revenue. Moreover, consumers can share their opinions about their experiences through websites, which can be extracted for sentiment analysis. A few studies have considered factors such as discounts and page interfaces. However, these were not selected for this study because the focus is on ensuring a smoother buyers’ journey by taking into account customer trust, loyalty and retention through a sentiment-based predictive model. Developing such a model nevertheless involves certain constraints and challenges, which are discussed in the next section.

Challenges and limitations of sentiment-based predictive models (RQ3)

This section discusses the challenges that can be faced during the development of a sentiment-based predictive model for the marketing field, specifically online customer purchase and data analysis. It also explains several limitations of previous models.

Limitations of previous sentiment-predictive models (RQ3-a)

A total of 150 papers were used for the current SLR, of which 30 were retained because they considered sentiment or predictive models to improve their current marketing service. Of the 30 papers, only nine addressed both sentiment and predictive modelling from a marketing perspective. Various models have been used to determine customer feelings about a product, such as SVM, BERT, K-Means, LSTM, and NB. However, ensemble learning was not used in these studies, and there was no hybridization of algorithms, which could have improved the performance metrics. Moreover, LSTM is known to have good predictive metrics when it comes to customer data [61].

Liu and Ying [93] applied SNOWNLP to analyze customer sentiment and used LSTM for their predictive model. However, there is room for improvement in the preprocessing phase, particularly in terms of enhancing the data structure and conducting more thorough data cleaning. Furthermore, RF could have been merged with LSTM to increase the efficiency by mitigating the weakness of each individual model and providing better data modelling [172]. For the sentiment analysis phase, algorithms such as BERT can be used to increase the reliability of research [145]. Ullah et al. [155] proposed the use of the BiLSTM and QLeBERT algorithms for sentiment and predictive modelling. However, their dataset did not have a resource pool of languages (lexicon), which would have improved the research. For example, the model was not trained to detect sarcastic contexts, and the tokenization phase could have been made more efficient by dividing the process into sub-processes. Breaking a sentence into two parts and further breaking it down to single words would improve the performance metrics because of more efficient classification and analysis of different groups of words [7].
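The staged tokenization suggested above, in which a sentence is first broken into parts and then into single words, can be sketched as follows. The clause delimiters (commas, semicolons, and the word "but") and the example review are illustrative assumptions; the cited studies do not prescribe a specific splitting rule.

```python
import re

def split_clauses(sentence):
    """Stage 1: break a sentence into clauses at commas, semicolons, or 'but'."""
    parts = re.split(r"\s*(?:[,;]|\bbut\b)\s*", sentence)
    return [p.strip() for p in parts if p.strip()]

def tokenize(sentence):
    """Stage 2: break each clause into lower-cased word tokens."""
    return [re.findall(r"[a-z']+", clause.lower())
            for clause in split_clauses(sentence)]

review = "The screen is bright, but the battery drains fast"  # hypothetical review
for clause_tokens in tokenize(review):
    print(clause_tokens)
```

Keeping the clause boundaries allows each group of words to be classified separately, so that a positive first clause and a negative second clause are not averaged into a single neutral score.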

As illustrated in Fig. 8, of the nine papers, only two used LSTM and BERT despite the limitations of these approaches. The two algorithms can be merged to form a sentiment-based predictive model. Furthermore, only two of the 30 papers (6.7%) used LSTM and BERT for the sentiment-based predictive model, indicating that this area requires more focus. This is supported by the facts presented in Tables 3 and 4, where the evaluation metrics for LSTM and BERT were among the highest when tested on different types of datasets. Additionally, when working with Kaggle secondary customer dataset for the proposed model, LSTM and BERT present viable options due to their capacity to discern connections among diverse data variables, consequently mitigating evaluation margin errors [35]. Moreover, during the development of these models, the inclusiveness of a variety of data variables (columns) can serve as a pivotal factor. Inadequate coverage, such as lacking emotional data, purchasing history, and product details [100] can lead to biased or less accurate results. To minimize such possibilities, it is imperative to ensure a proper balance among pre-existing biases and limitations (sarcasm, low-quality data, noise, uncleaned data, etc.), thereby ensuring an accurate analysis of customers’ purchasing intentions.

Model development challenges (RQ3-b)

As claimed by Kim et al. [82, 83], deep learning can be used to conduct a sentiment analysis of customer reviews. However, the main challenge is sarcasm detection based on the structure and context of the sentences. A key sign for sarcasm identification is the sentiment incongruity of words in a phrase; that is, the contrast between positive and negative concepts. Xiong et al. [166] and Tay et al. [151] proposed a solution to overcome the sarcasm issue by considering similarity and assigning greater weight ratings to highly comparable terms. However, this approach cannot efficiently identify inconsistent information [60]. Therefore, aspect-level sentiment analysis can be used in the proposed model to enhance the capabilities of LSTM to analyse text word by word for better interpretability.
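The sentiment incongruity signal mentioned above can be illustrated with a very small lexicon-based check. This is a rough sketch only: the two word lists are hypothetical, and the flag marks a review as a candidate for closer inspection rather than detecting sarcasm, since genuinely mixed reviews also contain both polarities.

```python
import re

# Hypothetical mini-lexicons; a real system would use a full sentiment lexicon.
POSITIVE = {"great", "love", "fantastic", "perfect"}
NEGATIVE = {"broke", "broken", "late", "useless"}

def has_sentiment_incongruity(review):
    """Return True when a review contains both positive and negative words."""
    words = set(re.findall(r"[a-z']+", review.lower()))
    return bool(words & POSITIVE) and bool(words & NEGATIVE)

print(has_sentiment_incongruity("Great, it broke after one day"))  # True
print(has_sentiment_incongruity("Great phone, love it"))           # False
```

A word-level check of this kind ignores context, which is why the approaches cited above weight related terms and why aspect-level analysis is proposed for the model.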

Dataset transformation challenges (RQ3-b)

Baroiu and Stefan [21] proposed the “MUStARD” multimodal dataset, which can detect sarcasm by incorporating data from a comedy series. However, this was not sufficient to detect sarcasm in customer reviews due to contextual differences. Furthermore, sentiments expressed in words can sometimes differ from what is actually felt [7]. Data quality can be another challenge if the datasets are not properly cleaned and divided into training and test sets [82, 83]. Therefore, the phases involving strict data preprocessing and feature engineering are important to ensure that a good quality dataset is available for SA and PM.

Overfitting and underfitting issues (RQ3-b)

When conducting PM, overfitting and underfitting have often been an issue in previous studies [135]. Overfitting occurs when a model is too complex relative to the available data and fits noise or idiosyncrasies of the training set, so that it performs well on the training data but poorly on unseen data. Conversely, underfitting occurs when a model is too simple and cannot capture the patterns required to obtain the proper evaluation metrics [131]. Therefore, the use of scaling and normalization concepts can help to reduce the influence of outliers and to decrease the error margin of the proposed model.

After answering all the research questions, the findings are illustrated in Fig. 9 and Fig. 10.

Significance of the study

This section has been divided into theoretical and practical significance.

Theoretical significance

In terms of theoretical significance, this study highlights areas requiring improvement in the predictive modelling field and provides a better understanding of the integration of predictive and sentiment analysis within the context of Marketing 5.0, by placing greater focus on specific processes in the buyers’ journey. Moreover, this study provides insights on the limitations and best usages of different models which can be applied in different contexts of sentiment and predictive analytics.

Practical significance

The proposed sentiment-based model will help businesses better understand customer purchases, thereby enabling them to provide customers with products chosen to meet their needs [42]. Furthermore, this novel model serves as an emotional driver by capturing customer attitudes and opinions and merging these with their purchasing history to help businesses understand customer purchase behaviors [115]. Nowadays, buyers tend to check product branding before purchasing [23]. Consequently, such a model can help marketers develop more specific targeted marketing strategies that address negative customer reviews and focus on positive ones [61], ultimately enhancing customer experiences [63]. On the other hand, customers can receive advertisements for products that help them during their journey. By receiving targeted advertisements aligned with their preferences, customers can make informed choices, fostering a more satisfying and efficient shopping experience [144].

Limitations and future work

This study discussed the various models (LSTM, BERT, XGBoost, SVM, NB) used to conduct sentiment analysis. However, it does not examine in detail the larger language models (LLM) such as ChatGPT or Gemini in its analysis section. Therefore, some suggestions for future work in the field of predictive and sentiment analysis include: first, exploring the use of deep learning and a large language model (LLM) for advanced customer-oriented datasets for sentiment analysis; second, developing models that are specifically tailored to different types of online buyers based on their online behavior; and lastly, developing robust models that can understand the relationship between customer sentiments and other e-commerce factors, which could lead to an improvement in current predictive models. Moreover, since this study focusses on the customer’s purchasing journey, future studies could analyse the applications of sentiment-based predictive models in other fields such as healthcare and education. For instance, facial expressions and textual communication can be used to determine the sentiments of mental patients to predict their medicine dosage [12]. Such models can be used to analyze the sentiments of students in class (frustrated, happy, confused, etc.), thus identifying students with consistent negative sentiments to determine whether they need extra support. In terms of comparative parameters, this current work was based on the findings of other studies. Therefore, in future studies, experiments could be conducted involving a specific product online and an analysis of textual reviews in order to forecast of the number of potential buyers. In terms of data extraction, social data and metadata from different sources can increase the dataset size for better training of models to improve customer interaction and sentiment analysis [5]. 
Additionally, with emerging technologies, such as GPT-4 models and Bard, a comparative study can be conducted to enrich the literature and improve the technologies used to implement such models. For such tasks, natural language inference (NLI), a natural language processing (NLP) task, can be useful because it helps models understand the nuances of human language and make logical inferences from text, and thus interpret customer reviews better to support appropriate product recommendations [6].

Conclusion

Based on the findings of previous studies, businesses need to transition to the latest technologies from Industry 4.0 (PM, SA, chatbots, etc.) to show a strong presence in the Marketing 5.0 era. For such a transition, data are key. To obtain a robust predictive model, data need to be thoroughly cleaned, and new insights such as customers’ emotions, feelings, and processes need to be identified and well integrated with historical data. Furthermore, it is crucial to extract data from the various phases of the buyer’s journey, such as the awareness, decision-making, and purchase stages, because this provides more variables for training and results in a more robust model.
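As a rough illustration of the cleaning and stage-wise extraction steps mentioned above, the following sketch cleans raw interaction text and counts a customer's events per journey stage. The stage labels, the helper names `clean_text` and `stage_counts`, and the sample events are assumptions made for this example and do not come from any of the reviewed studies.

```python
import re
from collections import Counter

# Stage labels assumed for illustration, following the stages named in the text.
STAGES = ("awareness", "decision", "purchase")


def clean_text(raw: str) -> str:
    """Strip HTML tags and URLs, lowercase, and collapse whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)
    no_urls = re.sub(r"https?://\S+", " ", no_tags)
    return re.sub(r"\s+", " ", no_urls).strip().lower()


def stage_counts(events):
    """Count events per journey stage for one customer, ignoring unknown stages."""
    counts = Counter(stage for stage, _ in events if stage in STAGES)
    return {f"n_{stage}": counts.get(stage, 0) for stage in STAGES}


# Illustrative events: (stage label, raw text).
events = [
    ("awareness", "<p>Saw the ad at https://example.com/ad</p>"),
    ("decision", "Compared  prices"),
    ("decision", "Read reviews"),
    ("checkout?", "unknown stage"),
]
cleaned = [clean_text(text) for _, text in events]
row = stage_counts(events)
```

The per-stage counts in `row` are one simple way to turn journey data into additional model variables; richer features (time spent per stage, sentiment per stage) would follow the same pattern.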

Sentiment-based predictive models for online buyers have the potential to revolutionize the way e-commerce businesses operate. By analyzing the sentiments of online buyers, businesses can better understand their customers’ needs, preferences, and difficulties. This information can be used to improve products and services, target marketing campaigns more effectively, and reduce customer churn. However, as this systematic review has shown, there is still room for improvement in the development and application of sentiment-based predictive models for online buyers by considering more online purchasing factors and customer behaviors. Furthermore, the lack of an appropriate and comprehensive dataset containing online purchasing history and sentiment affects the accuracy of these models. Additionally, many existing models are not sufficiently robust to handle the diversity of online reviews and the complex relationships between sentiment and other variables, such as product features and buyers’ characteristics.

In this study, PRISMA was used to conduct a systematic literature review because it provides better structure and clarity to the findings from different studies, thus helping to answer the research questions more efficiently. Journal articles were identified through keyword searches of the ProQuest database. They were then screened to select the papers eligible for the study. However, this study has several limitations, one of which is that predictive models were explored only in the marketing field. Hence, future studies could extend this research to a range of other domains.

Data availability

Not applicable.

Code availability

Not applicable.

References

Abdar M, Salari S, Qahremani S, Lam H-K, Karray F, Hussain S, Abbas Khosravi U, Acharya R, Makarenkov V, Nahavandi S. UncertaintyFuseNet: robust uncertainty-aware hierarchical feature fusion model with ensemble monte carlo dropout for COVID-19 detection. Inf Fus. 2023;90(February):364–81. https://doi.org/10.1016/j.inffus.2022.09.023.
Abidar L, Zaidouni D, Ikram ELA, Ennouaary A. Predicting customer segment changes to enhance customer retention: a case study for online retail using machine learning. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140799.
Abrego N, Ovaskainen O. Evaluating the predictive performance of presence-absence models: why can the same model appear excellent or poor? Ecol Evol. 2023. https://doi.org/10.1002/ece3.10784.
Abreu LR, Maciel ISF, Alves JS, Braga LC, Pontes HLJ. A decision tree model for the prediction of the stay time of ships in brazilian ports. Eng Appl Artif Intell. 2023;117(January):105634. https://doi.org/10.1016/j.engappai.2022.105634.
Abu-Salih B, Alotaibi S. Knowledge graph construction for social customer advocacy in online customer engagement. Technologies. 2023;11(5):123. https://doi.org/10.3390/technologies11050123.
Abu-Salih B, Alweshah M, Alazab M, Al-Okaily M, Alahmari M, Al-Habashneh M, Al-Sharaeh S. Natural language inference model for customer advocacy detection in online customer engagement. Mach Learn. 2023. https://doi.org/10.1007/s10994-023-06476-w.
Ahmed K, Nadeem MI, Zheng Z, Li D, Ullah I, Assam M, Ghadi YY, Mohamed HG. Breaking down linguistic complexities: a structured approach to aspect-based sentiment analysis. J King Saud Univ Comput Inf Sci. 2023;35(8):101651. https://doi.org/10.1016/j.jksuci.2023.101651.
Akter S, Ali S, Fekete-Farkas M, Fogarassy C, Lakner Z. Why organic food? Factors influence the organic food purchase intension in an emerging country (Study from Northern part of Bangladesh). Resources. 2023;12(1):5. https://doi.org/10.3390/resources12010005.
Al-Abbadi L, Bader D, Mohammad A, Al-Quran A, Aldaihani F, Al-Hawary S, Alathamneh F. The effect of online consumer reviews on purchasing intention through product mental image. Int J Data Netw Sci. 2022;6(4):1519–30. https://doi.org/10.5267/j.ijdns.2022.5.001.
Alghazzawi DM, Alquraishee AGA, Badri SK, Hasan SH. ERF-XGB: ensemble random forest-based XG boost for accurate prediction and classification of E-commerce product review. Sustainability. 2023;15(9):7076. https://doi.org/10.3390/su15097076.
Alharbi ZH. A sustainable price prediction model for airbnb listings using machine learning and sentiment analysis. Sustainability. 2023;15(17):13159. https://doi.org/10.3390/su151713159.
Ali Y, Khan HU, Khalid M. Engineering the advances of the artificial neural networks (ANNs) for the security requirements of internet of things: a systematic review. J Big Data. 2023;10(1):128. https://doi.org/10.1186/s40537-023-00805-5.
Al-Sai ZA, Husin MH, Syed-Mohamad SM, Abdullah R, Zitar RA, Abualigah L, Gandomi AH. Big data maturity assessment models: a systematic literature review. Big Data Cognit Comput. 2023;7(1):2. https://doi.org/10.3390/bdcc7010002.
Alsayat A. Customer decision-making analysis based on big social data using machine learning: a case study of hotels in Mecca. Neural Comput Appl. 2023;35(6):4701–22. https://doi.org/10.1007/s00521-022-07992-x.
AL-Sous N, Almajali D, Alsokkar A. Antecedents of social media influencers on customer purchase intention: empirical study in Jordan. Intl J Data Netw Sci. 2023;7(1):125–30.
Alzahrani RA, Aljabri M. AI-Based techniques for Ad click fraud detection and prevention: review and research directions. J Sens Actuator Netw. 2023;12(1):4. https://doi.org/10.3390/jsan12010004.
Anas AM, Abdou AH, Hassan TH, Alrefae WMM, Daradkeh FM, El-Amin M-M, Kegour ABA, Alboray HMM. Satisfaction on the driving seat: exploring the influence of social media marketing activities on followers’ purchase intention in the restaurant industry context. Sustainability. 2023;15(9):7207. https://doi.org/10.3390/su15097207.
Atallah SB, Banda NR, Banda A, Roeck NA. How large language models including generative pre-trained transformer (GPT) 3 and 4 will impact medicine and surgery. Tech Coloproctol. 2023;27(8):609–14. https://doi.org/10.1007/s10151-023-02837-8.
Bakator M, Vukoja M, Manestar D. Achieving competitiveness with marketing 5.0 in new business conditions. UTMS J Econ. 2023;14(1):63–73.
Barik K, Misra S, Ray AK, Bokolo A. LSTM-DGWO-based sentiment analysis framework for analyzing online customer reviews. Comput Intell Neurosci. 2023;2023(February):6348831. https://doi.org/10.1155/2023/6348831.
Baroiu AC, Stefan TM. Comparison of deep learning models for automatic detection of sarcasm context on the MUStARD dataset. Electronics. 2023;12(3):666. https://doi.org/10.3390/electronics12030666.
Bashir R, Mehboob I, Bhatti WK. Effects of online shopping trends on consumer-buying behaviour: an empirical study of Pakistan. J Manag Res. 2015;2(2):1–24. https://doi.org/10.29145/jmr/22/0202001.
Bełch P, Hajduk-Stelmachowicz M, Chudy-Laskowska K, Vozňáková I, Gavurová B. Factors determining the choice of pro-ecological products among generation Z. Sustainability. 2024;16(4):1560. https://doi.org/10.3390/su16041560.
Benavides-Astudillo E, Fuertes W, Sanchez-Gordon S, Nuñez-Agurto D, Rodríguez-Galán G. A phishing-attack-detection model using natural language processing and deep learning. Appl Sci. 2023;13(9):5275. https://doi.org/10.3390/app13095275.
Bintara R, Yadiati W, Zarkasyi MW, Tanzil ND. Management of green competitive advantage: a systematic literature review and research Agenda. Economies. 2023;11(2):66. https://doi.org/10.3390/economies11020066.
Boehringer AS, Sanaat A, Arabi H, Zaidi H. An active learning approach to train a deep learning algorithm for tumor segmentation from brain MR images. Insights Imagin. 2023;14(1):141. https://doi.org/10.1186/s13244-023-01487-6.
Trebicka B, Tartaraj A, Harizi A. Analyzing the relationship between pricing strategy and customer retention in hotels: a study in Albania. F1000Research. 2023. https://doi.org/10.12688/f1000research.132723.1.
Busalim AH, Hussin ARC. Understanding social commerce: a systematic literature review and directions for further research. Int J Inf Manag. 2016;36(6 Part A):1075–88. https://doi.org/10.1016/j.ijinfomgt.2016.06.005.
Bushara MA, Abdou AH, Hassan TH, Abu EE, Sobaih AS, Albohnayh M, Alshammari WG, Aldoreeb M, Elsaed AA, Elsaied MA. Power of social media marketing: how perceived value mediates the impact on restaurant followers’ purchase intention, willingness to pay a premium price, and E-WoM? Sustainability. 2023;15(6):5331. https://doi.org/10.3390/su15065331.
Butros A, Taylor S. ‘Managing information: evaluating and selecting citation management software, a look at endnote, refworks, mendeley and zotero’. 2011. https://www.researchgate.net/publication/268428881_Managing_information_evaluating_and_selecting_citation_management_software_a_look_at_EndNote_RefWorks_Mendeley_and_Zotero. Accessed 15 Sept 2023.
Candan SS, Bayram SS. Metaphors perception in personal sales concept: evaluation with logistic regression. Bus Manag Stud Int J. 2023;11(1):208–25. https://doi.org/10.15295/bmij.v11i1.2204.
Chan J-L, Bea KT, Leow SMH, Phoong SW, Cheng WK. State of the art: a review of sentiment analysis based on sequential transfer learning. Artif Intell Rev. 2023;56(1):749–80. https://doi.org/10.1007/s10462-022-10183-8.
Chen SS, Pai TW, Sun CY. 2023. ‘Applying the diamond model of intrusion analysis with generative pre-trained transformer 3’. In: 2023 International conference on consumer electronics—Taiwan (ICCE-Taiwan), 2023. pp.289–90. https://doi.org/10.1109/ICCE-Taiwan58799.2023.10226923.
Cheng X, Chaw JK, Goh KM, Ting TT, Sahrani S, Ahmad MN, Kadir RA, Ang MC. Systematic literature review on visual analytics of predictive maintenance in the manufacturing industry. Sensors. 2022;22(17):6321. https://doi.org/10.3390/s22176321.
Yang C, Fa-you A, Yu-Feng W, Yan SQ, Zhu CB, Zhang H. Impact of parameter tuning with genetic algorithm, particle swarm optimization, and bat algorithm on accuracy of the SVM Model in landslide susceptibility evaluation. Math Probl Eng. 2023. https://doi.org/10.1155/2023/1393142.
Cui J, Bai L, Li G, Lin Z, Zeng P. Semi-2DCAE: a semi-supervision 2D-CNN AutoEncoder model for feature representation and classification of encrypted traffic. PeerJ Comput Sci. 2023. https://doi.org/10.7717/peerj-cs.1635.
Ding Y, Lei X, Liao Bo, Fang-Xiang Wu. Biomarker identification via a factorization machine-based neural network with binary pairwise encoding. IEEE/ACM Trans Comput Biol Bioinf. 2023;20(3):2136–46. https://doi.org/10.1109/TCBB.2023.3235299.
Do T-N, Lenca P, Lallich S. Classifying many-class high-dimensional fingerprint datasets using random forest of oblique decision trees: [Doc 24]. Vietnam J Comput Sci. 2014;2(1):3–12. https://doi.org/10.1007/s40595-014-0024-7.
Dong W, Huang Y, Lehane B, Ma G. XGBoost algorithm-based prediction of concrete electrical resistivity for structural health monitoring. Autom Constr. 2020;114(June):103155. https://doi.org/10.1016/j.autcon.2020.103155.
Ebrahimi P, Khajeheian D, Soleimani M, Gholampour A, Fekete-Farkas M. User engagement in social network platforms: what key strategic factors determine online consumer purchase behaviour? Ekonomska Istrazivanja: Znanstveno-Strucni Casopis. 2023. https://doi.org/10.1080/1331677X.2022.2106264.
Edara DC, Vanukuri LP, Sistla V, Kolli VKK. Sentiment analysis and text categorization of cancer medical records with LSTM. J Ambient Intell Humaniz Comput. 2023;14(5):5309–25. https://doi.org/10.1007/s12652-019-01399-8.
Faiz T, Aldmour R, Ahmed G, Alshurideh M, Paramaiah C. Machine learning price prediction during and before COVID-19 and consumer buying behavior. In: Alshurideh M, Al Kurdi BH, Masadeh R, Alzoubi HM, Salloum S, editors. The effect of information technology on business and marketing intelligence systems. Studies in Computational Intelligence. Cham: Springer International Publishing; 2023. p. 1845–67. https://doi.org/10.1007/978-3-031-12382-5_101.
Fang Y, Wang W, Pengcheng Wu, Zhao Y. A sentiment-enhanced hybrid model for crude oil price forecasting. Expert Syst Appl. 2023;215(April):119329. https://doi.org/10.1016/j.eswa.2022.119329.
Farooq U, Ademola M, Shaalan A. Comparative analysis of machine learning models for predictive maintenance of ball bearing systems. Electronics. 2024;13(2):438. https://doi.org/10.3390/electronics13020438.
Faruk M, Rahman M, Hasan S. How digital marketing evolved over time: a bibliometric analysis on scopus database. Heliyon. 2021;7(12): e08603. https://doi.org/10.1016/j.heliyon.2021.e08603.
Feng Z, Mamun AA, Masukujjaman M, Yang Q. Modeling the significance of advertising values on online impulse buying behavior. Humanit Soc Sci Commun. 2023;10(1):728. https://doi.org/10.1057/s41599-023-02231-7.
Ferraz RM, Pereira C, da Veiga C, Pereira R, da Veiga T, Furquim SG, da Vieira Silva W. After-sales attributes in e-commerce: a systematic literature review and future research Agenda. J Theor Appl Electron Commer Res. 2023;18(1):475. https://doi.org/10.3390/jtaer18010025.
Frandsen TF, Eriksen MB. Supplementary strategies identified additional eligible studies in qualitative systematic reviews. J Clin Epidemiol. 2023;159(July):85–91. https://doi.org/10.1016/j.jclinepi.2023.04.017.
Frost AD, Hróbjartsson A, Nejstgaard CH. Adherence to the PRISMA-P 2015 reporting guideline was inadequate in systematic review protocols. J Clin Epidemiol. 2022;150(October):179–87. https://doi.org/10.1016/j.jclinepi.2022.07.002.
Gao S, Meng W. Cloud-based services and customer satisfaction in the small and medium-sized businesses (SMBs). Kybernetes. 2022;51(6):1991–2007. https://doi.org/10.1108/K-05-2021-0376.
James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning. Berlin: Springer; 2013. https://doi.org/10.1007/978-1-4614-7138-7.
Google. 2022. ‘Reducing loss: gradient descent | machine learning’. Google for developers. 2022. https://developers.google.com/machine-learning/crash-course/reducing-loss/gradient-descent. Accessed 15 Sept 2023.
Majumder MG, Gupta SD, Paul J. Perceived usefulness of online customer reviews: a review mining approach using machine learning & exploratory data analysis. J Bus Res. 2022;150(November):147–64. https://doi.org/10.1016/j.jbusres.2022.06.012.
Liu G, Nguyenm T, Zhao G, Zha W, Yang J, Cao J, Wu M, Zhao P. ‘Repeat Buyer Prediction for E-Commerce’. In: KDD ’16: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016.16:155–64. https://doi.org/10.1145/2939672.2939674.
Gujrati R, Gulati U, Uygun H. Digital transformation has changed consumer behvoiur from traditional market to digital market. Acad Market Stud J. 2023;27(S2):1–6.
Hajek P, Sahut J-M. Mining behavioural and sentiment-dependent linguistic patterns from restaurant reviews for fake review detection. Technol Forecast Soc Chang. 2022;177(April):121532. https://doi.org/10.1016/j.techfore.2022.121532.
Hamadani A, Ganai NA, Bashir J. Artificial neural networks for data mining in animal sciences. Bulle Natl Res Cent. 2023;47(1):68. https://doi.org/10.1186/s42269-023-01042-9.
Hassler AP, Menasalvas E, García-García FJ, Rodríguez-Mañas L, Holzinger A. Importance of medical data preprocessing in predictive modeling and risk factor discovery for the frailty syndrome. BMC Med Inf Decis Mak. 2019. https://doi.org/10.1186/s12911-019-0747-6.
Hayati N, Jaelani E. Analysis of digital marketing quality before and during the Covid-19 pandemic on frozen food consumers in West Java Region. Calit Acces La Success. 2024;25(198):149–59. https://doi.org/10.47750/QAS/25.198.16.
He Y, Chen M, He Y, Zhining Qu, He F, Feihong Yu, Liao J, Wang Z. Sarcasm detection base on adaptive incongruity extraction network and incongruity cross-attention. Appl Sci. 2023;13(4):2102. https://doi.org/10.3390/app13042102.
Hicham N, Nassera H, Karim S. A thorough analysis of e-commerce customer reviews in arabic language using deep learning techniques for successful marketing decisions. IAENG Int J Appl Math. 2023;53(4):1–8.
Hodgson EL, Souaiby M, Troldborg N, Porté-Agel F, Andersen SJ. Cross-code verification of non-neutral ABL and single wind turbine wake modelling in LES. J Phys: Conf Ser. 2023;2505(1):012009. https://doi.org/10.1088/1742-6596/2505/1/012009.
Shamim HM, Rahman MF, Uddin MK, Hossain MK. Customer sentiment analysis and prediction of halal restaurants using machine learning approaches. J Islam Market. 2023;14(7):1859–89. https://doi.org/10.1108/JIMA-04-2021-0125.
Hu J, Szymczak S. A review on longitudinal data analysis with random forest. Brief Bioinform. 2023;24(2):bbad002. https://doi.org/10.1093/bib/bbad002.
Igual C, Castillo A, Igual J. An interactive training model for myoelectric regression control based on human-machine cooperative performance. Computers. 2024;13(1):29. https://doi.org/10.3390/computers13010029.
Jadhav GG, Gaikwad SV, Bapat D. A systematic literature review: digital marketing and its impact on SMEs. J Ind Bus Res. 2023;15(1):76–91. https://doi.org/10.1108/JIBR-05-2022-0129.
Jain PK, Pamula R, Srivastava G. A systematic literature review on machine learning applications for consumer sentiment analysis using online reviews. Comput Sci Rev. 2021;41(August):100413. https://doi.org/10.1016/j.cosrev.2021.100413.
Jia Y, Feng H, Wang X, Alvarado M. “Customer reviews or vlogger reviews?” The impact of cross-platform ugc on the sales of experiential products on E-commerce platforms. J Theor Appl Electron Commer Res. 2023;18(3):1257. https://doi.org/10.3390/jtaer18030064.
Jiang H, Sabetzadeh F, Chan KY. Developing nonlinear customer preferences models for product design using opining mining and multiobjective PSO-based ANFIS approach. Comput Intell Neurosci CIN. 2023. https://doi.org/10.1155/2023/6880172.
Jlifi B, Abidi C, Duvallet C. Beyond the use of a novel ensemble based random forest-BERT model (Ens-RF-BERT) for the sentiment analysis of the hashtag COVID19 tweets. Soc Netw Anal Min. 2024;14(1):88. https://doi.org/10.1007/s13278-024-01240-x.
Kakalejčík L, Bucko J, Vejačka M. Differences in buyer journey between high- and low-value customers of e-commerce business. J Theor Appl Electron Commer Res. 2019;14(2):47–58. https://doi.org/10.4067/S0718-18762019000200105.
Kalita K, Burande D, Ghadai RK, Chakraborty S. Finite element modelling, predictive modelling and optimization of metal inert gas, tungsten inert gas and friction stir welding processes: a comprehensive review. Arch Comput Methods Eng. 2023;30(1):271–99. https://doi.org/10.1007/s11831-022-09797-6.
Kamakura WA. Cross-selling. Relationsh Market. 2008;6(3–4):41–58. https://doi.org/10.1300/J366v06n03_03.
Kapoor R, Kapoor K. The transition from traditional to digital marketing: a study of the evolution of e-marketing in the indian hotel industry. Worldw Hosp Tour Themes. 2021;13(2):199–213. https://doi.org/10.1108/WHATT-10-2020-0124.
Kelley L. The 3 primary stages of the buyer’s journey. ImageSource. 2014;16(12):14.
Kepes S, McDaniel MA, Brannick MT, Banks GC. Meta-analytic reviews in the organizational sciences: two meta-analytic schools on the way to MARS (the meta-analytic reporting standards). J Bus Psychol. 2013;28(2):123–43. https://doi.org/10.1007/s10869-013-9300-2.
Muzahid KM, Bashar I, Minhaj GM, Wasi AI, Hossain NUI. Resilient and sustainable supplier selection: an integration of SCOR 4.0 and machine learning approach. Sustain Resil Infrastruct. 2023;8(5):453–69. https://doi.org/10.1080/23789689.2023.2165782.
Khan MA, Vivek SM, Minhaj MA, Saifi SA, Hasan A. Impact of store design and atmosphere on shoppers’ purchase decisions: an empirical study with special reference to Delhi-NCR. Sustainability. 2023;15(1):95. https://doi.org/10.3390/su15010095.
Khan S, Rashid A, Rasheed R, Amirah NA. Designing a knowledge-based system (KBS) to study consumer purchase intention: the impact of digital influencers in Pakistan. Kybernetes. 2022;52(5):1720–44. https://doi.org/10.1108/K-06-2021-0497.
Khanna P, Maheshwari S. Development of mathematical models for prediction and control of weld bead dimensions in MIG welding of stainless steel 409M’. In: materials today: proceedings, 7th international conference of materials processing and characterization, March 17–19, 2017, 2018; 5 (2, Part 1): 4475–88. https://doi.org/10.1016/j.matpr.2017.12.017.
Khondakar MFK, Sarowar MH, Chowdhury MH, Majumder S, Hossain MA, Dewan MAA, Hossain QD. A systematic review on EEG-based neuromarketing: recent trends and analyzing techniques. Brain Inf. 2024;11(1):17. https://doi.org/10.1186/s40708-024-00229-8.
Kim HJ, Jayakumar Venkat S, Chang HW, Cho YH, Lee JY, Koo K. A two-step approach to overcoming data imbalance in the development of an electrocardiography data quality assessment algorithm: a real-world data challenge. Biomimetics. 2023;8(1):119. https://doi.org/10.3390/biomimetics8010119.
Kim J, Hui-Sang K, Sun-Yong C. Forecasting the S&P 500 Index using mathematical-based sentiment analysis and deep learning models: a FinBERT transformer model and LSTM. Axioms. 2023;12(9):835. https://doi.org/10.3390/axioms12090835.
Kitchenham B. Procedures for performing systematic reviews. Keele: Keele Univ; 2004.
Kjell O, Giorgi S, Andrew Schwartz H. The text-package: an R-package for analyzing and visualizing human language using natural language processing and transformers. Psychol Methods. 2023. https://doi.org/10.1037/met0000542.
Ko S-H, Hsieh M-C, Huang R-F. Human error analysis and modeling of medication-related adverse events in Taiwan using the human factors analysis and classification system and logistic regression. Healthcare. 2023;11(14):2063. https://doi.org/10.3390/healthcare11142063.
Kukkar A, Mohana R, Sharma A, Nayyar A, Shah MA. Improving sentiment analysis in social media by handling lengthened words. IEEE Access. 2023;11:9775–88. https://doi.org/10.1109/ACCESS.2023.3238366.
Kumar S, Singh P, Srivastava G, Singh S. Intelligent movie recommender framework based on content-based & collaborative filtering assisted with sentiment analysis. Int J Adv Res Comput Sci. 2023;14(3):108–13. https://doi.org/10.26483/ijarcs.v14i3.6979.
Kyaw KS, Tepsongkroh P, Thongkamkaew C, Sasha F. Business intelligent framework using sentiment analysis for smart digital marketing in the E-commerce era. Asia Soc Issues. 2023;16(3):e252965–e252965. https://doi.org/10.48048/asi.2023.252965.
Liang W, Luo S, Zhao G, Hao Wu. Predicting hard rock pillar stability using GBDT, XGBoost, and LightGBM algorithms. Mathematics. 2020;8(5):765. https://doi.org/10.3390/math8050765.
Lim CV, Yu-Peng Z, Omar M, Han-Woo P. Decoding the relationship of artificial intelligence, advertising, and generative models. Digital. 2024;4(1):244. https://doi.org/10.3390/digital4010013.
Liu D, Wang Y, Luo C, Ma J. An improved autoencoder for recommendation to alleviate the vanishing gradient problem. Knowl-Based Syst. 2023;263(March):110254. https://doi.org/10.1016/j.knosys.2023.110254.
Liu M, Ying Q. The role of online news sentiment in carbon price prediction of china’s carbon markets. Environ Sci Pollut Res. 2023;30(14):41379–87. https://doi.org/10.1007/s11356-023-25197-0.
Long Y, Huang L, Li Y, Quan W, Yoshida Y. Enlarged carbon footprint inequality considering household time use pattern. Environ Res Lett. 2024;19(4):044013. https://doi.org/10.1088/1748-9326/ad2d85.
Ma J, Dhiman P, Qi C, Bullock G, van Smeden M, Riley RD, Collins GS. Poor handling of continuous predictors in clinical prediction models using logistic regression: a systematic review. J Clin Epidemiol. 2023;161(September):140–51. https://doi.org/10.1016/j.jclinepi.2023.07.017.
Malodia S, Ferraris A, Sakashita M, Dhir A, Gavurova B. Can alexa serve customers better? AI-Driven voice assistant service interactions. J Serv Mark. 2022;37(1):25–39. https://doi.org/10.1108/JSM-12-2021-0488.
Manikandan B, Rama P, Chakaravarthi S. A new fuzzy lexicon expansion and sentiment aware recommendation system in E-commerce. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140629.
Marcos AM, de Figueiredo B, de Coelho AFM. Service quality, customer satisfaction and customer value: holistic determinants of loyalty and word-of-mouth in services. TQM J. 2021;34(5):957–78. https://doi.org/10.1108/TQM-10-2020-0236.
Mehmood S, Ahmad I, Khan F, Khan A. Sentiment analysis in social media for competitive environment using content analysis. Comput Mater Contin. 2022. https://doi.org/10.32604/cmc.2022.023785.
Memon ZA, Munawar N, Kamal M. App store mining for feature extraction: analyzing user reviews. Acta Sci Technol. 2024. https://doi.org/10.4025/actascitechnol.v46i1.62867.
Mgiba FM, Koopman A. The impact of motivation, attitude, quality, availability, and advertisement on the purchase intention for fashion clothing. Afr J Bus Econ Res. 2023;18(2):153–80. https://doi.org/10.31920/1750-4562/2023/v18n2a8.
Mika B, Winczewski D. The work-on-demand platform as a part of monopoly capital: the example of a global ride-hailing company. Polish Sociol Rev. 2024;225:31–48. https://doi.org/10.26412/psr225.02.
Mirfakhraei S, Abdolvand N, Rajaei S, Harandi. The RFMRv model for customer segmentation based on the referral value. Iran J Manag Stud. 2024;17(2):455–73. https://doi.org/10.22059/ijms.2023.329229.674722.
Mohammed A, Kora R. A comprehensive review on ensemble deep learning: opportunities and challenges. J King Saud Univ Comput Inf Sci. 2023;35(2):757–74. https://doi.org/10.1016/j.jksuci.2023.01.014.
Moher D, Shamseer L, Clarke M, Ghersi D, Liberati A, Petticrew M, Shekelle P, Lesley A, Stewart, and PRISMA-P Group. Preferred reporting items for systematic review and meta-analysis protocols (PRISMA-P) 2015 statement. Syst Rev. 2015;4(1):1. https://doi.org/10.1186/2046-4053-4-1.
Mushtaq K, Zou R, Waris A, Yang K, Wang Ji, Iqbal J, Jameel M. Multivariate wind power curve modeling using multivariate adaptive regression splines and regression trees. PLoS ONE. 2023;18(8): e0290316. https://doi.org/10.1371/journal.pone.0290316.
Mydyti H, Kadriu A, Bach MP. Using data mining to improve decision-making: case study of a recommendation system development. Organizacija. 2023;56(2):138–54. https://doi.org/10.2478/orga-2023-0010.
Nagam VM. Internet use, users, and cognition: on the cognitive relationships between internet-based technology and internet users. BMC Psychol. 2023;11:1–9. https://doi.org/10.1186/s40359-023-01041-5.
Natras R, Soja B, Schmidt M. Ensemble machine learning of random forest, AdaBoost and XGBoost for vertical total electron content forecasting. Remote Sens. 2022;14(15):3547. https://doi.org/10.3390/rs14153547.
Nguyen MS. The influence of social media marketing on brand loyalty and intention to use among young vietnamese consumers of digital banking. Innov Market. 2023;19(4):1–13. https://doi.org/10.21511/im.19(4).2023.01.
Chen N. Research on E-commerce database marketing based on machine learning algorithm. Comput Intell Neurosci CIN. 2022. https://doi.org/10.1155/2022/7973446.
O’Croinin C, Guerra AG, Doschak MR, Löbenberg R, Davies NM. Therapeutic potential and predictive pharmaceutical modeling of stilbenes in cannabis sativa. Pharmaceutics. 2023;15(7):1941. https://doi.org/10.3390/pharmaceutics15071941.
Oe H, Yamaoka Y, Ochiai H. Personal and emotional values embedded in thai-consumers’ perceptions: key factors for the sustainability of traditional confectionery businesses. Sustainability. 2023;15(2):1548. https://doi.org/10.3390/su15021548.
Ounacer S, Mhamdi D, Ardchir S, Daif A, Azzouazi M. Customer sentiment analysis in hotel reviews through natural language processing techniques. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140162.
Paulo R, Vong C, Pinheiro F, Mimoso J. A sentiment analysis of michelin-starred restaurants. Eur J Manag Bus Econ. 2023;32(3):276–95. https://doi.org/10.1108/EJMBE-11-2021-0295.
Petkovic J, Welch V, Tugwell P. PROTOCOL: do evidence summaries increase health policy-makers’ use of evidence from systematic reviews? A systematic review protocol. Campbell Syst Rev. 2017;13(1):1–18. https://doi.org/10.1002/CL2.178.
Ping Y, Buoye A, Vakil A. Enhanced review facilitation service for C2C support: machine learning approaches. J Serv Mark. 2023;37(5):620–35. https://doi.org/10.1108/JSM-01-2022-0005.
Pop R-A, Hlédik E, Dabija D-C. Predicting consumers’ purchase intention through fast fashion mobile apps: the mediating role of attitude and the moderating role of COVID-19. Technol Forecast Soc Chang. 2023;186(January):122111. https://doi.org/10.1016/j.techfore.2022.122111.
Prasad GB, Keerthi MV, ChandanaAnjali O, Revathi. Sentiment analysis of customer product reviews using machine learning. Turk J Comput Math Educ. 2023;14(3):178–88.
Punetha N, Jain G. Aspect and orientation-based sentiment analysis of customer feedback using mathematical optimization models. Knowl Inf Syst. 2023;65(6):2731–60. https://doi.org/10.1007/s10115-023-01848-z.
Rahman NA, Idrus SD, Adam NL. Classification of customer feedbacks using sentiment analysis towards mobile banking applications. IAES Int J Artif Intell. 2022;11(4):1579–87. https://doi.org/10.11591/ijai.v11.i4.pp1579-1587.
Rahmani E, Khatami M, Stephens E. Using probabilistic machine learning methods to improve beef cattle price modeling and promote beef production efficiency and sustainability in Canada. Sustainability. 2024;16(5):1789. https://doi.org/10.3390/su16051789.
Rajasa MC, Rahma F, Rachmadi RF, Pratomo BA, Purnomo MH. A review of imbalanced datasets and resampling techniques in network intrusion detection system. In: 2023 8th International conference on information technology and digital applications (ICITDA), 2023. pp. 1–6. https://doi.org/10.1109/ICITDA60835.2023.10427217.
Ramos AP, Tanes RLV, Esplanada DE. Sentiment analysis in service quality of Eugene’s Villa of Baler based on Airbnb reviews. Quantum J Soc Sci Humanit. 2022;3(6):153–67. https://doi.org/10.55197/qjssh.v3i6.201.
Rapa M, Ciano S, Orsini F, Tullo MG, Giannetti V, Mariani MB. Adoption of AI-based technologies in the food supplement industry: an Italian Start-Up case study. Systems. 2023;11(6):265. https://doi.org/10.3390/systems11060265.
Razali NA, Mat NA, Malizan NA, Hasbullah MW, Zainuddin NM, Ishak KK, Ramli S, Sukardi S. Political security threat prediction framework using hybrid lexicon-based approach and machine learning technique. IEEE Access. 2023;11:17151–64. https://doi.org/10.1109/ACCESS.2023.3246162.
Rivas P, Zhao L. Marketing with ChatGPT: navigating the ethical terrain of GPT-based chatbot technology. AI. 2023. https://doi.org/10.3390/ai4020019.
Rubio-Aparicio M, Sanchez-Meca J, Fulgencio M-M, Lopez-Lopez JA. MARS (meta-analysis reporting standards). Anales de Psicol. 2018;34:412–20. https://doi.org/10.6018/analesps.34.2.320131.
Sakalauskas V, Kriksciuniene D. Personalized advertising in E-commerce: using clickstream data to target high-value customers. Algorithms. 2024;17(1):27. https://doi.org/10.3390/a17010027.
Salim SS, Ghanshyam AN, Ashok DM, Mazahir DB, Thakare BS. Deep LSTM-RNN with word embedding for sarcasm detection on Twitter. In: 2020 International Conference for Emerging Technology (INCET). 2020. pp. 1–4. https://doi.org/10.1109/INCET49848.2020.9154162.
Santoni MM, Basaruddin T, Junus K. Convolutional neural network model based students’ engagement detection in imbalanced DAiSEE dataset. Int J Adv Comput Sci Appl. 2023. https://doi.org/10.14569/IJACSA.2023.0140371.
Sarioğlu Cİ. The effect of customer perceptions concerning online shopping, viral marketing and customer loyalty on purchasing behaviour. Int J Manag Econ Bus. 2023;19(2):348–70. https://doi.org/10.17130/ijmeb.1210803.
Sha Z, Cui Y, Xiao Y, Stathopoulos A, Contractor N, Fu Y, Chen W. A network-based discrete choice model for decision-based design. Design Sci. 2023. https://doi.org/10.1017/dsj.2023.4.
Shah A, Kothari K, Thakkar U, Khara S. User review classification and star rating prediction by sentimental analysis and machine learning classifiers. In: Tuba M, Akashe S, Joshi A, editors. Information and communication technology for sustainable development. Advances in Intelligent Systems and Computing. Singapore: Springer; 2020. p. 279–88. https://doi.org/10.1007/978-981-13-7166-0_27.
Shanmugavel AB, Ellappan V, Mahendran A, Subramanian M, Lakshmanan R, Mazzara M. A novel ensemble based reduced overfitting model with convolutional neural network for traffic sign recognition system. Electronics. 2023;12(4):926. https://doi.org/10.3390/electronics12040926.
Sherbaz A, Konak BMK, Pezeshkpour P, Di Ventura B, Rapp BE. Deterministic lateral displacement microfluidic chip for minicell purification. Micromachines. 2022;13(3):365. https://doi.org/10.3390/mi13030365.
Shini G, Srividhya V. Implicit aspect based sentiment analysis for restaurant review using LDA topic modeling and ensemble approach. Int J Adv Technol Eng Explor. 2023;10(102):554–68. https://doi.org/10.19101/IJATEE.2022.10100099.
Singh G, Slack NJ, Sharma S, Aiyub AS, Ferraris A. Antecedents and consequences of fast-food restaurant customers’ perception of price fairness. Br Food J. 2022;124(8):2591–609. https://doi.org/10.1108/BFJ-03-2021-0286.
Singh R, Singh R. Applications of sentiment analysis and machine learning techniques in disease outbreak prediction—a review. Mater Today Proc, Int Virtual Conf Sustain Mater. 2023;81(January):1006–11. https://doi.org/10.1016/j.matpr.2021.04.356.
Singh U, Saraswat A, Azad HK, Abhishek K, Shitharth S. Towards improving e-commerce customer review analysis for sentiment detection. Sci Rep. 2022;12(1):21983. https://doi.org/10.1038/s41598-022-26432-3.
Skinner D, Blake J. Modelling consumers’ choice of novel food. PLoS ONE. 2023;18(8):e0290169. https://doi.org/10.1371/journal.pone.0290169.
Skubleny D, Ghosh S, Spratlin J, Schiller DE, Rayat GR. Feature-specific quantile normalization and feature-specific mean-variance normalization deliver robust bi-directional classification and feature selection performance between microarray and RNAseq data. BMC Bioinform. 2024;25:1–14. https://doi.org/10.1186/s12859-024-05759-w.
Sudirjo F, Ratnawati R, Hadiyati R, Sutaguna INT, Yusuf M. The influence of online customer reviews and E-service quality on buying decisions in electronic commerce. J Manag Creat Bus. 2023;1(2):156–81. https://doi.org/10.30640/jmcbus.v1i2.941.
SunLuo HYE, Liu F, Lowe B. The advertisement puts me down, but I like it: examining an emerging type of audience-targeted negative advertisement. J Advert Res. 2023;63(2):160. https://doi.org/10.2501/JAR-2023-010.
Susnjak T. Applying BERT and ChatGPT for sentiment analysis of Lyme disease in scientific literature. arXiv. 2023. https://doi.org/10.48550/arXiv.2302.06474.
Suyanto A, Femi SR. Analysis of the effect of impulsive purchase and service quality on customer satisfaction and loyalty in beauty E-commerce. Calit Acces La Success. 2023;24(194):18–28. https://doi.org/10.47750/QAS/24.194.03.
Taherkhani L, Daneshvar A, Amoozad Khalili H, Sanaei MR. Analysis of the customer churn prediction project in the hotel industry based on text mining and the random forest algorithm. Adv Civil Eng. 2023. https://doi.org/10.1155/2023/6029121.
Alamin TM, Islam MM, Uddin MA, Hasan KF, Sharmin S, Alyami SA, Moni MA. Machine learning-based network intrusion detection for big and imbalanced data using oversampling, stacking feature embedding and feature extraction. J Big Data. 2024;11(1):33. https://doi.org/10.1186/s40537-024-00886-w.
Taralik K, Kozák T, Molnár Z. Channel preferences and attitudes of domestic buyers in purchase decision processes of high-value electronic devices. Entrep Bus Econ Rev. 2023;11(2):121–36. https://doi.org/10.15678/EBER.2023.110206.
Tavares MC, Azevedo G, Marques RP. The challenges and opportunities of era 5.0 for a more humanistic and sustainable society—a literature review. Societies. 2022;12(6):149. https://doi.org/10.3390/soc12060149.
Tay Y, Tuan LA, Hui SC, Su J. Reasoning with sarcasm by reading in-between. arXiv. 2018. https://doi.org/10.48550/arXiv.1805.02856.
Thangeda R, Kumar N, Majhi R. A neural network-based predictive decision model for customer retention in the telecommunication sector. Technol Forecast Soc Chang. 2024;202(May):123250. https://doi.org/10.1016/j.techfore.2024.123250.
Torkzadeh S, Zolfagharian M, Yazdanparast A, Gremler DD. From customer readiness to customer retention: the mediating role of customer psychological and behavioral engagement. Eur J Mark. 2022;56(7):1799–829. https://doi.org/10.1108/EJM-03-2021-0213.
Tuncer I, Unusan C, Cobanoglu C. Service quality, perceived value and customer satisfaction on behavioral intention in restaurants: an integrated structural model. J Qual Assur Hosp Tour. 2021;22(4):447–75. https://doi.org/10.1080/1528008X.2020.1802390.
Ullah A, Khan K, Khan A, Ullah S. Understanding quality of products from customers’ attitude using advanced machine learning methods. Computers. 2023;12(3):49. https://doi.org/10.3390/computers12030049.
Vásquez FGZ, Poveda DAM, Llerena WVL. Big data and its implication in marketing. Rev de Comun de La SEECI. 2023;56:302–19. https://doi.org/10.15198/seeci.2023.56.e832.
Veloso CM, Sousa BB. Drivers of customer behavioral intentions and the relationship with service quality in specific industry contexts. Int Rev Retail, Distrib Consum Res. 2022;32(1):43–58. https://doi.org/10.1080/09593969.2021.2007977.
Veseli-Kurtishi T, Ruci E. The impact of digital marketing on the development of tourism in Republic of Albania. Eurasian J Soc Sci. 2023;11(1):1–11. https://doi.org/10.15604/ejss.2023.11.01.001.
Wang Lu, Zhang Y, Chignell M, Shan B, Sheehan M, Razak F, Verma A. Boosting delirium identification accuracy with sentiment-based natural language processing: mixed methods study. JMIR Med Inform. 2022;10(12):e38161. https://doi.org/10.2196/38161.
Wang Q, Tingxuan Su, Lau RYK, Xie H. DeepEmotionNet: emotion mining for corporate performance analysis and prediction. Inf Process Manage. 2023;60(3):103151. https://doi.org/10.1016/j.ipm.2022.103151.
Wang S, Ma J. A novel GBDT-BiLSTM hybrid model on improving day-ahead photovoltaic prediction. Sci Rep (Nat Publ Gr). 2023;13(1):15113. https://doi.org/10.1038/s41598-023-42153-7.
Wang S, Li C, Kankan Z, Chen H. Context-aware recommendations with random partition factorization machines. Data Sci Eng. 2017;2(2):125–35. https://doi.org/10.1007/s41019-017-0035-3.
Wang Y, Shi Q, Chang TH. Why batch normalization damage federated learning on non-IID data? arXiv. 2023. https://doi.org/10.48550/arXiv.2301.02982.
Wen N, Liu G, Zhang J, Zhang R, Yating Fu, Han Xu. A fingerprints based molecular property prediction method using the BERT model. J Cheminf. 2022;14(1):71. https://doi.org/10.1186/s13321-022-00650-3.
Wen Z, Lin W, Liu H. Machine-learning-based approach for anonymous online customer purchase intentions using clickstream data. Systems. 2023;11(5):255. https://doi.org/10.3390/systems11050255.
Xiong T, Zhang P, Zhu H, Yang Y. Sarcasm detection with self-matching networks and low-rank bilinear pooling. 2019. pp. 2115–24. https://doi.org/10.1145/3308558.3313735.
Xu B, Tan Y, Sun W, Ma T, Liu H, Wang D. Study on the prediction of the uniaxial compressive strength of rock based on the SSA-XGBoost model. Sustainability. 2023;15(6):5201. https://doi.org/10.3390/su15065201.
Yang L, Zhang He, Shen H, Huang X, Zhou X, Rong G, Shao D. Quality assessment in systematic literature reviews: a software engineering perspective. Inf Softw Technol. 2021;130(February):106397. https://doi.org/10.1016/j.infsof.2020.106397.
Yang Z, Brattin R, Sexton R, Stalnaker JL. Social media usage and customer loyalty: predicting returning customers using artificial neural network. Int J Inf Bus Manag. 2022;14(3):18–28.
Yoon HJ, Huang Y, Yim M-C. Native advertising relevance effects and the moderating role of attitudes toward social networking sites. J Res Interact Mark. 2022;17(2):215–31. https://doi.org/10.1108/JRIM-07-2021-0185.
Yu W, Liang Y, Zhu X. Sentiment analysis of hotel online reviews using the BERT model and ERNIE model—data from China. PLoS ONE. 2023;18(3):e0275382. https://doi.org/10.1371/journal.pone.0275382.
Yue W, Li L. Sentiment analysis using a CNN-BiLSTM deep model based on attention classification. Int Inf Inst Inf. 2023;26(3):117–62. https://doi.org/10.47880/inf2603-02.
Zanoni M, Chiumeo R, Tenti L, Volta M. What else do the deep learning techniques tell us about voltage dips validity? Regional-level assessments with the new QuEEN system based on real network configurations. Energies. 2023;16(3):1189. https://doi.org/10.3390/en16031189.
Zhang C, Fan H, Zhang J, Yang Q, Tang L. Topic discovery and hotspot analysis of sentiment analysis of Chinese text using information-theoretic method. Entropy. 2023;25(6):935. https://doi.org/10.3390/e25060935.
Zhang M, Lu J, Ma N, Cheng TCE, Hua G. A feature engineering and ensemble learning based approach for repeated buyers prediction. Int J Comput Commun Control. 2022. https://doi.org/10.15837/ijccc.2022.6.4988.
Zhang PV, Kim S, Chakravarty A. Influence of pull marketing actions on marketing action effectiveness of multichannel firms: a meta-analysis. J Acad Mark Sci. 2023;51(2):310–33. https://doi.org/10.1007/s11747-022-00877-4.
Zhang R, Chen M. Predicting online shopping intention: the theory of planned behavior and live E-commerce. SHS Web Conf. 2023;155:02008. https://doi.org/10.1051/shsconf/202315502008.
Zhang R, Jun M, Palacios S. M-shopping service quality dimensions and their effects on customer trust and loyalty: an empirical study. Int J Qual Reliab Manag. 2023;40(1):169–91. https://doi.org/10.1108/IJQRM-11-2020-0374.
Zhang Z, Jung C. GBDT-MO: gradient boosted decision trees for multiple outputs. IEEE. 2019. https://doi.org/10.1109/TNNLS.2020.3009776.
Zolfaghari B, Mirsadeghi L, Bibak K, Kavousi K. Cancer prognosis and diagnosis methods based on ensemble learning. ACM Comput Surv. 2023;55(12):262:1-262:34. https://doi.org/10.1145/3580218.
Zou H, Wang Z. A semi-supervised short text sentiment classification method based on improved BERT model from unlabelled data. J Big Data. 2023;10(1):35. https://doi.org/10.1186/s40537-023-00710-x.

Funding

Not applicable.

Author information

Authors and Affiliations

Curtin University, Bentley, Australia
Veerajay Gooljar, Tomayess Issa & Sarita Hardin-Ramanan
The University of Jordan, Amman, Jordan
Bilal Abu-Salih

Authors

Veerajay Gooljar
Tomayess Issa
Sarita Hardin-Ramanan
Bilal Abu-Salih

Contributions

VG: methodology, conceptualization, implementation, and writing; TI, SHR, and BAS: methodology, writing and editing, and supervision. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Bilal Abu-Salih.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Gooljar, V., Issa, T., Hardin-Ramanan, S. et al. Sentiment-based predictive models for online purchases in the era of marketing 5.0: a systematic review. J Big Data 11, 107 (2024). https://doi.org/10.1186/s40537-024-00947-0

Received: 01 February 2024
Accepted: 08 June 2024
Published: 05 August 2024
DOI: https://doi.org/10.1186/s40537-024-00947-0
