Introduction

Electronic markets such as digital platforms have evolved into a central business sector with a major impact on the modern economy (Alt & Zimmermann, 2014). However, due to the lack of personal interaction on platforms, they have to incentivize trustworthiness (Bolton et al., 2013). Therefore, many platforms use consumer reviews, which provide additional information about sellers and previous transactions and thus contribute to building reputation and creating trust in sellers (Hesse & Teubner, 2020). On the one hand, online reviews lead to positive economic outcomes for sellers, as they increase prices (Ba & Pavlou, 2002) and transaction volume (Bolton et al., 2004). On the other hand, reviews are important for consumers, who increasingly rely on them to make transaction decisions (Bae & Lee, 2011). In particular, informative consumer reviews are crucial for selecting restaurants, hotels, and other services (Ruiz-Mafe et al., 2018), as they help to differentiate sellers and minimize uncertainty (Nazlan et al., 2018).

Reviews are consumer-generated online information that allows consumers to share information about previous transactions and make informed purchase decisions (Vallurupalli & Bose, 2020). They typically consist of two components: a quantitative one (e.g., stars) and a qualitative one (e.g., text) (Gutt et al., 2019). Quantitative components are ratings on a pre-defined scale, with both the scale and the rating symbol varying across platforms (Steur & Seiter, 2021). Qualitative components mainly consist of text and complement the quantitative component by allowing consumers to provide detailed information, as is the case with Amazon and Yelp. However, the two components are not necessarily aligned within a review. Inconsistency between the quantitative and qualitative components of reviews of products, services, or sellers has been identified on several platforms. For instance, 30% of Amazon’s product reviews were inconsistent (Mudambi et al., 2014), as were seller reviews on TripAdvisor (6%) (Fazzolari et al., 2017) and on the German physician rating platforms jameda and DocInsider (more than 3%) (Geierhos et al., 2015).

There are several reasons for inconsistent reviews. Users can make mistakes during submission or intentionally write inconsistent reviews. Errors can arise when users do not rate items carefully, either forgetting to rate certain parts of the transaction performance or mistakenly giving inaccurate ratings. Moreover, inconsistent reviews can be intentional when reviewers want to provide a more differentiated view. Compared to the quantitative component, the qualitative component can convey more detailed information, which is also reflected in the frequent alternation of positive and negative text sequences within reviews (Ruiz-Mafe et al., 2018). Such inconsistent reviews are particularly relevant in consumers’ decision-making, as consumers use reviews for information search and seller evaluation (Mudambi & Schuff, 2010).

On digital platforms, the decision-making process typically consists of five steps (Darley et al., 2010). Beginning with problem recognition, consumers become aware of discrepancies between the actual and the desired state (Del Hawkins & Mothersbaugh, 2010), such as the need to find a suitable restaurant. Subsequently, consumers search through a large set of relevant sellers based on their search terms and constraints to identify a subset of the most promising restaurants (Xiao & Benbasat, 2007). In the next step, consumers process information and evaluate the subset of restaurants, seeking further information about the restaurants related to their search criteria. To do so, consumers use the information provided by the restaurants and the individual consumer reviews. This step ends with consumers making their restaurant decision (Darley et al., 2010). Consumers then complete the transaction by purchasing the selected alternative (Del Hawkins & Mothersbaugh, 2010). Finally, the decision process ends with the purchase outcomes (Darley et al., 2010): in our case, consumers experience the quality of the chosen restaurant.

Consumer reviews typically facilitate both information search and seller evaluation (Bae & Lee, 2011). Inconsistent reviews send conflicting information and signal different levels of quality to consumers, which could have several consequences, such as increased cognitive processing costs, suboptimal purchase decisions, and lower overall utility of the platform (Mudambi et al., 2014). As these reviews can be confusing for consumers (Geierhos et al., 2015), inconsistency in reviews could negatively affect review helpfulness (Aghakhani et al., 2020). Hence, the conflicting information could affect the effort involved in the information processing and the evaluation of a subset of restaurants within the decision-making process. This information processing is usually measured by the duration of consumers’ transaction decisions and the extent of consumers’ search (Xiao & Benbasat, 2007). A longer decision duration could hamper the well-known positive effects of review systems on sellers’ prices and transaction volume on digital platforms (e.g., Ba & Pavlou, 2002; Bajari & Hortaçsu, 2003; Bolton et al., 2004, 2013; Melnik & Alm, 2002; Resnick et al., 2006). Thus, consumers could either use other platforms to make further transaction decisions or leave without completing the decision-making process.

Prior research on consumer reviews has shown that the two components are frequently not aligned. However, only limited attention has been given to studying the effects of inconsistent reviews on the consumer decision-making process (Aghakhani et al., 2020). Tsang and Prendergast (2009) analyzed the effects of inconsistent product critiques on consumer decision-making. In particular, they found that inconsistency in movie critiques hampers their trustworthiness. Aghakhani et al. (2020) showed that consistent reviews positively affect review helpfulness as a proxy for review quality.

Mudambi et al. (2014) presumed that inconsistent reviews increase consumers’ cognitive processing costs, resulting in consumers taking more time to make transaction decisions, making suboptimal purchase decisions, and lowering the overall utility of the platform. In the case of inconsistent product critiques, the interplay of both components negatively affects trustworthiness (Tsang & Prendergast, 2009). Moreover, it is not yet fully understood which review component determines the decision. Thus, we propose the following two research questions:

  (1) How do inconsistent reviews affect the duration of transaction decisions?

  (2) Which review component – quantitative or qualitative – determines the transaction decision in the case of inconsistent reviews?

Drawing on dual-process theory and media richness theory, we conducted two experiments. We used a 2 × 2 within-subjects design for the first experiment, in which 442 participants chose one of four restaurants after being exposed to either consistent or inconsistent reviews. In our second experiment, we applied a similar design, in which 233 participants decided between two restaurant options: one with positive quantitative and negative qualitative reviews vs. one with negative quantitative and positive qualitative reviews.

We find that inconsistent restaurant reviews do not necessarily result in a longer duration of consumers’ transaction decisions. However, our experimental results show that among inconsistent restaurant reviews, those with positive texts led to faster transaction decisions. Moreover, in the case of inconsistent restaurant reviews, positive qualitative review components determine consumers’ decision-making.

Our study makes two contributions to the literature on online reviews. First, our work advances the literature on inconsistent online reviews by showing the effect of inconsistent reviews on the duration of user decision-making. Second, our research extends the literature on the relative importance of review components, as the polarity of review texts is crucial for the duration of the transaction decision and the decision itself.

The remainder of the paper is organized as follows. In Sect. 2, we provide a brief overview of the related literature, and we develop our hypotheses in Sect. 3. Section 4 covers the experimental studies, including the method and the results. In Sect. 5, we discuss our findings and their theoretical and managerial implications. We summarize the work and highlight further research directions in Sect. 6.

Literature on inconsistent reviews

Inconsistent reviews have been addressed in several studies. While most studies have focused on the occurrence of inconsistent reviews, the analysis of their effects on consumer decision-making is lacking.

Fu et al. (2013) analyzed individual inconsistencies on the Google Play Store and found 1% of the reviews to be inconsistent. Mudambi et al. (2014) discussed individual inconsistencies using different Amazon products (e.g., books and cameras) and showed that 30% of the reviews were inconsistent. In addition, Mudambi et al. (2014) showed that individual inconsistencies were more common for high star ratings. Moreover, their results showed that inconsistencies were more common for experience goods (41%) than for search goods (16%). Search goods (e.g., toasters) have attributes that can be objectively compared, while experience goods (e.g., books) have attributes that are subjective to the user (Nelson, 1970). However, the study by Mudambi et al. (2014) was based on a small data set of only 1,734 reviews on 23 products on Amazon.

Fazzolari et al. (2017) focused on individual inconsistent reviews on TripAdvisor using sentiment analysis. Their findings showed that 6% of 164,300 hotel reviews on TripAdvisor were inconsistent. Specifically, they found an asymmetrical occurrence of inconsistencies. While 5% of the reviews with positive quantitative components were inconsistent, 12% of negative ones were inconsistent.

Geierhos et al. (2015) developed a method to identify inconsistencies for reviews with multiple quantitative criteria. Focusing on the two German physician rating platforms, jameda and DocInsider, they found varying inconsistencies within the categories, from 3% for “time” (time taken for the treatment) to 12% for “responsiveness” (accessibility and waiting time) (Geierhos et al., 2015).

Shan et al. (2018) focused on the occurrence of individual inconsistencies within authentic and fake reviews. They found that for authentic reviews, the quantitative components and the sentiment values of the qualitative components are more strongly positively correlated than for fake reviews.

In recent literature, inconsistent reviews have been acknowledged, but their effects on consumers and their transaction decisions remain unclear (Mudambi et al., 2014). Aghakhani et al. (2020) found that consistent reviews positively affect review helpfulness. While review helpfulness is a measure of review quality (Aghakhani et al., 2020; Mudambi & Schuff, 2010) and thus provides first insights into the effects of inconsistent reviews, knowledge of their effects on consumer decision-making is still lacking.

Tsang and Prendergast (2009) were the first to investigate how the interplay between quantitative and qualitative components affects consumers’ decision-making processes. Using the example of movie critiques, they particularly emphasized the trustworthiness of consistent and inconsistent critiques and the related effects on purchase intention. Their results showed that the qualitative component has a much stronger influence on purchase intention and trustworthiness. Additionally, they found that positive reviews produced a higher purchase intention than inconsistent or negative reviews. Moreover, inconsistent critiques did not produce a higher purchase intention than negative critiques (Tsang & Prendergast, 2009). However, Tsang and Prendergast (2009) focused on purchase intention, which does not correlate perfectly with actual purchases (Morwitz, 1997). Moreover, their study included only a single product critique (consisting of a quantitative and a qualitative review component) and analyzed its interestingness, trustworthiness, and purchase intention.

Table 1 provides an overview of the related literature on inconsistent reviews and presents our contribution against this background.

Table 1 Literature on inconsistent reviews

Prior research focused on the occurrence of individual inconsistencies. However, prior studies lack an examination of whether inconsistent reviews affect actual consumer decision-making. We aim to close this research gap by analyzing two effects of inconsistent restaurant reviews on consumer decision-making. First, we examine whether inconsistent reviews affect the decision duration. Second, we analyze the relative importance of quantitative and qualitative review components for consumer decision-making.

Theoretical background and hypotheses development

Effects of inconsistent reviews on the duration of transaction decisions

When evaluating a subset of restaurants with inconsistent reviews that send conflicting signals, consumers have to process and evaluate this conflicting information. Thus, the additional cognitive processing could increase the time needed for consumer decision-making (Mudambi et al., 2014). Inconsistent reviews could therefore increase search costs within the decision-making process and negatively affect conversion rates.

According to dual-process theory, information search is based on heuristic processing (System 1) and systematic processing (System 2) (Chaiken, 1980). In particular, System 1 generates impressions, feelings, and inclinations, which System 2 can then check and either accept, modify, or override (Kahneman, 2013; Kahneman & Frederick, 2002). The processes of System 1 run automatically, quickly, and in parallel, in an associative form without deliberate control (Kahneman, 2013). Conversely, System 2 involves deliberate and conscious abstract and hypothetical thinking; it is slow and sequential, occupies the capacity of the central working memory system (Evans, 2003), and is therefore used rather sparingly (Kahneman, 2013). The control of System 1’s impressions by System 2 is thus relatively lenient (Kahneman & Frederick, 2002). Based on the least-effort and sufficiency principles of System 2 (Baek et al., 2012), we argue that in cases of consistent reviews, consumers’ transaction decisions are made relatively quickly based on the impressions of System 1, which System 2 merely accepts.

Conversely, both systems are fundamental for consumers’ decision-making in cases of conflicting information within textual reviews (Ruiz-Mafe et al., 2018). Specifically, Ruiz-Mafe et al. (2018) analyzed conflicting qualitative review components. They found that in cases where positive text sequences follow negative sequences, consumers evaluate the argument quality, and System 2 modifies the emotions and impressions of System 1. Moreover, Aghakhani et al. (2020) used dual-process theory to explain the effects of inconsistent reviews on review helpfulness. Thus, we assume that in cases of inconsistent reviews, System 1 continues to provide impressions to System 2. If consumers on digital platforms detect inconsistent reviews via System 1, System 2 refuses to simply accept System 1’s impressions (Kahneman, 2013; Kahneman & Frederick, 2002). Instead, we assume that System 2 uses working memory to check, modify, or override these impressions.

When confronted with inconsistent reviews, consumers will primarily use System 2, since inconsistent reviews require logical thinking and reasoning. Therefore, we assume that decisions based on inconsistent reviews (assuming that consumers recognize these inconsistencies) are slower than decisions based on consistent reviews because of the greater involvement of the slower System 2 (Kahneman, 2013). Specifically, we hypothesize that inconsistencies are processed in the third step of the consumer decision-making process, in which consumers evaluate alternatives. In this stage, consumers process the information on the restaurant options, such as the individual reviews, and evaluate these options according to their pre-defined decision criteria (Del Hawkins & Mothersbaugh, 2010). Thus, following dual-process theory, inconsistent reviews should affect information processing and evaluation such that the duration of the decision-making process increases. Hence, we propose:

H1: Inconsistent reviews result in a longer time required for consumers’ transaction decisions.

The relative importance of quantitative and qualitative review components

Confronted with inconsistent reviews, consumers have to emphasize either the quantitative or the qualitative component of such reviews when assessing important seller characteristics during the decision-making process. Following prior research on online reviews (e.g., Xu et al., 2015; Zinko et al., 2020), we use media richness theory (Daft & Lengel, 1984), which initially described communication behavior and media choice within organizations (Daft & Lengel, 1986), to better understand the consumer decision-making process. According to media richness theory, the ability of a medium to convey and promote mutual understanding depends on its information richness. In the case of high message ambiguity, a high-richness medium is appropriate (Daft et al., 1987). In contrast, low richness suffices in repetitive, routine communication settings (Trevino et al., 1987).

Consumer reviews including quantitative and qualitative components (Gutt et al., 2019) help users of digital platforms to receive information on previous transactions. The different review components have different levels of information richness (Daft & Lengel, 1986). In the case of inconsistent reviews, quantitative and qualitative review components send conflicting quality signals. Hence, inconsistent reviews are associated with higher ambiguity compared to consistent reviews. Therefore, consumers noticing inconsistencies have to decide which component is more trustworthy.

Due to the unique nature of inconsistent reviews, processing them on digital platforms can generally be considered a non-routine task. For information with high uncertainty in a non-routine setting, a medium of greater richness is more appropriate (Trevino et al., 1987). However, the media (i.e., written and numeric) through which reviews are shared on digital platforms are fixed and cannot be changed to a medium of greater richness (e.g., face-to-face) (Daft & Lengel, 1984). Therefore, given the different quality signals, consumers must assess the trustworthiness of both components based on their individual information richness.

The two review components differ in their information richness. The quantitative review component (“numeric language”) has a lower level of information richness than the qualitative component (“natural language”) (Daft & Lengel, 1984). The depth of the qualitative component further enhances its relevance, as review depth is known to drive review helpfulness (Mudambi & Schuff, 2010).

Prior research has shown that higher information richness is associated with higher trustworthiness (Lu et al., 2014) and that the information quality of online reviews affects purchase intention (Zinko et al., 2020). Thus, given that recipients of reviews benefit from using a richer medium in situations of ambiguity, inconsistent reviews should result in more weight being placed on the qualitative review components. The findings of Tsang and Prendergast (2009) support this argumentation. In situations of inconsistent product critiques, they showed that textual reviews have a higher significance in terms of purchase intention and trustworthiness of reviews. Hence, we assume that the relative importance of the qualitative component is higher within the decision-making process, and we propose:

H2: In the case of inconsistent reviews, users’ transaction decisions are predominantly based on the qualitative component.

Experimental studies

Experiment 1: Duration of the transaction decision

Basic setup and treatments

We conducted a controlled experiment in the context of restaurant reviews to test our hypothesis on whether inconsistent reviews affect the duration of transaction decisions. Similar to Schneider et al. (2021), we developed a fictitious restaurant-visit scenario in which participants had to choose a restaurant option based on previous consumer reviews. In experiment 1, we used a 2 × 2 within-subjects design (Table 2). In each treatment, the participants chose one of four restaurants, for which they were shown reviews with quantitative and qualitative components. Two treatments included consistent reviews: both the quantitative and qualitative components were positive in treatment 1 (pos-pos) and negative in treatment 4 (neg-neg). In the other two treatments, the reviews were inconsistent: treatment 2 consisted of positive quantitative and negative qualitative components (pos-neg), and treatment 3 included negative quantitative and positive qualitative components (neg-pos).

Table 2 Treatments of experiment 1

Within the four treatments, all participants received five reviews for each of the four restaurants. Following Fazzolari et al. (2017), we focus only on strongly inconsistent reviews. Hence, the quantitative component included only negative ratings (one or two out of five stars) and positive ratings (four or five out of five stars). As in Ruiz-Mafe et al. (2018), the qualitative component consisted of real textual reviews about Italian restaurants posted on Yelp. We selected only reviews that did not contain any people’s names, geographical regions, or restaurant names, to avoid biases due to familiarity (Ruiz-Mafe et al., 2018). Moreover, we ensured that none of the reviews included information about the reviewer.

Similar to prior studies (e.g., Geierhos et al., 2015; Mudambi et al., 2014), a sentiment analysis helped classify the qualitative review components regarding their polarity. In particular, we used restaurant reviews from the publicly available Yelp data set (Yelp, 2020). We used TextBlob for our sentiment analysis, as it is a widely used library (e.g., Kühl et al., 2020; Mousavi et al., 2020) that offers high accuracy. TextBlob provides a polarity score for each review text, ranging from -1 for extremely negative sentiment to 1 for extremely positive sentiment. We selected only texts with extreme sentiment values (polarity of -1 or 1) to ensure that participants noticed the inconsistencies, to avoid distortion due to varying sentiment values, and to ensure a uniform design across the treatments. To guard against misclassification by the sentiment analysis (false positives or false negatives), five independent raters received the qualitative components in random order and classified their sentiment. All rater classifications were consistent with the corresponding polarities of the sentiment analysis.
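The selection of extreme-polarity texts can be sketched as a simple filter over precomputed polarity scores; in our setting, each score would come from TextBlob’s `sentiment.polarity`. The function name and the toy review texts below are illustrative assumptions, not part of the original study materials.

```python
def select_extreme(texts, polarities, threshold=1.0):
    """Keep only review texts whose sentiment polarity is exactly
    -threshold or +threshold (here: -1 or 1), mirroring the filtering
    step described above. `polarities` holds TextBlob-style scores
    in [-1.0, 1.0], one per text."""
    return [
        (text, score)
        for text, score in zip(texts, polarities)
        if abs(score) >= threshold
    ]

# Illustrative (hypothetical) review texts and scores:
texts = ["Wonderful pasta, perfect service!", "Cold food, rude staff.", "It was okay."]
scores = [1.0, -1.0, 0.2]
print(select_extreme(texts, scores))  # keeps only the two extreme reviews
```

In practice, the scores would be computed as `TextBlob(text).sentiment.polarity` for each review text before applying this filter.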

Procedure

We conducted our experiment in July 2020 using Amazon Mechanical Turk (MTurk). We chose MTurk for three reasons. First, as a digital platform, MTurk has digitally competent users with characteristics similar to those of users of other digital platforms (Vazquez, 2021). Second, recent studies have shown that MTurk experiments yield data quality and results similar to those of traditional methods such as laboratory studies (Horton et al., 2011). Third, MTurk has been used in other studies on online reviews (e.g., Garnefeld et al., 2020; Zinko et al., 2020). The experiment was implemented using the software oTree (Chen et al., 2016).

Participants were freelancers on MTurk who earned $1.50 for participating in the experiment. A total of 1,405 participants started the experiment; however, 308 did not finish it. We conducted attention checks (Abbey & Meloy, 2017), one after the instructions (see Appendix 20) and one after all decision rounds (see Appendix 26), to verify that users carefully read the reviews provided for each restaurant. These included questions about the business category, the scale level of the quantitative component, and the type of cuisine used within the experiment. We also asked the participants how often inconsistent reviews occurred within the experiment to check whether they recognized the inconsistent reviews. A total of 635 participants failed this test. Additionally, we excluded another 20 participants who deviated in their decision duration from the mean by more than three standard deviations. The final sample consisted of 442 participants (Table 3).

Table 3 Characteristics of the participants of experiment 1

Each participant took part in all four treatments. However, the individual treatments were ordered randomly to avoid order effects (e.g., learning and framing) (Charness et al., 2012). Initially, the participants were instructed to imagine a situation in which they would go out for dinner to a restaurant with friends, and their attentiveness was checked for the first time (see Appendix 20). Prior to the first selection of a restaurant, all participants could practice the selection in two practice rounds.

After the training phase, the participants were explicitly asked to begin the selection phase, in which they chose one of four restaurants (see Appendix 21, Appendix 22, Appendix 23, Appendix 24). For each restaurant, the participants were shown five different reviews consisting of qualitative and quantitative components. Additionally, an overall rating was displayed, representing the average of all quantitative reviews. We showed this overall rating because most platforms display one, which supports the external validity of our design. Within each treatment, we measured the duration from the display of the restaurant reviews to the confirmation of the decision.

Finally, the participants answered the second attention check (see Appendix 26) and responded to a questionnaire including several controls. Since previous studies (e.g., Hennig-Thurau & Walsh, 2003; von Helversen et al., 2018) found differences in decision-making in relation to age, gender, highest completed level of education, and risk preferences, we used these variables as controls (see Appendix 28). For these controls, we used established constructs (Hennig-Thurau & Walsh, 2003; Ludwig et al., 2017; von Helversen et al., 2018). We also asked the participants about their frequency of using reviews in general, using reviews to choose a restaurant, and visiting restaurants. Additionally, we elicited information from the participants about their choice of individual restaurants.

Results of the duration of transaction decisions

Table 4 shows the average decision-making time required by the participants within each treatment. In the case of inconsistent pos-neg reviews, the participants needed significantly more time for their decision compared to all other treatments.

Table 4 Decision duration

Table 5 shows the results of t-tests between the different treatments.

Table 5 Results of two-tailed t-tests

The two-tailed t-tests for two dependent samples showed that the participants decided more quickly with pos-pos reviews than with neg-neg reviews. The decisions with pos-pos reviews were also faster than with pos-neg reviews. However, there was no significant difference in decision duration between pos-pos and neg-pos reviews. With neg-neg reviews, decisions were faster than with pos-neg reviews. In contrast, decisions took longer with neg-neg reviews than with neg-pos reviews. Finally, decisions took significantly longer with pos-neg reviews than with neg-pos reviews. Thus, we did not find support for H1.
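A paired (dependent-samples) t statistic of the kind underlying these comparisons can be sketched as follows; the toy durations are illustrative assumptions, not the experimental data.

```python
import math
from statistics import mean, stdev

def paired_t(durations_a, durations_b):
    """Two-tailed paired t-test statistic for two dependent samples,
    e.g., per-participant decision durations in two treatments.
    Returns (t, degrees_of_freedom); the p-value is then obtained from
    the t distribution with n - 1 degrees of freedom."""
    diffs = [a - b for a, b in zip(durations_a, durations_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical per-participant durations (seconds) in two treatments:
t, df = paired_t([10.0, 12.0, 11.0, 13.0], [9.0, 10.0, 10.0, 11.0])
```

The paired form is appropriate here because, in a within-subjects design, each participant contributes one duration per treatment, so the samples are dependent.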

Experiment 2: The relative importance of quantitative and qualitative components

Basic setup and treatment

In our second experiment, we used a design similar to the first one. However, the second experiment focused on the relative importance of the review components. Therefore, the experiment consisted of only one treatment (Table 6). The participants were asked to decide between two restaurants to examine which component dominated the transaction decisions in the case of inconsistent reviews (see Appendix 25). For one of the two restaurants, reviews included positive quantitative (four or five out of five stars) and negative qualitative components (polarity of -1). In contrast, the reviews of the other restaurant consisted of negative quantitative (one or two out of five stars) and positive qualitative components (polarity of 1).

Table 6 Options within experiment 2

Procedure

We conducted our second experiment in July 2020. The participants who completed the experiment received $0.70. A total of 713 participants took part in the experiment, but 190 participants stopped the experiment prematurely. As in the first experiment, we identified random clickers using attention checks (see Appendix 20 and Appendix 27). A total of 290 participants failed at least one of the attention checks and were thus excluded from the analysis. The final data set for experiment 2 consisted of 233 participants (Table 7).

Table 7 Characteristics of the participants of experiment 2

After participants were shown the instructions, we conducted a first attention check to filter random clickers and bots (see Appendix 20). Then, the participants practiced within two training rounds (as in experiment 1). Following the training rounds, the participants started the selection round, in which they chose between a restaurant with pos-neg reviews and a restaurant with neg-pos reviews. The order of the two restaurants was random. Further, both restaurants were displayed in the same way as in experiment 1.

Finally, the participants’ attention was checked in another attention check (see Appendix 27), similar to the first experiment. This attention check helped to ensure that the participants noticed the inconsistencies within the reviews. The participants concluded by answering the final questionnaire using the same controls as in the first experiment (see Appendix 29). Since the content within the qualitative review component can affect information processing and decision-making, we controlled for argument quality. In line with prior research, we measured argument quality by perceived informativeness and perceived persuasiveness (Zhang et al., 2014).

Results on the relative importance of quantitative and qualitative components

We conducted a one-tailed binomial test (p = 0.5) to analyze whether the participants predominantly based their decisions on the qualitative component. The test result was significant (α = 0.05). Thus, H2 was supported.
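The test can be sketched as an exact upper-tailed binomial probability under the null hypothesis of random choice; the counts in the usage line are hypothetical, not the experimental figures.

```python
from math import comb

def binom_test_upper(successes, n, p=0.5):
    """Exact one-tailed (upper) binomial test: the probability of
    observing at least `successes` choices of the option with the
    positive qualitative component out of n decisions, under the null
    hypothesis that participants choose at random (p = 0.5)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(successes, n + 1))

# Hypothetical example: 8 of 10 participants choose the qualitative option.
p_value = binom_test_upper(8, 10)  # 56/1024 ≈ 0.0547, significant at α = 0.05 (one-tailed)
```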

We conducted a logit regression (Table 8) to gain more insight into the decision-making process. In particular, low levels of perceived informativeness and perceived persuasiveness were associated with a lower probability that users made their decisions based on the qualitative component; however, these effects were not significant. Moreover, the education and risk controls showed significant effects. Specifically, participants with higher education levels were more likely to choose the restaurant with positive qualitative components. Likewise, the higher the willingness to take risks, the lower the probability that decisions were based on the qualitative component. The other control variables did not show any significant effects.

Table 8 Logit regression results
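A minimal sketch of such a logit model, fit by gradient ascent on the log-likelihood, is shown below. The dataset, variable names, and coefficients are illustrative assumptions: x stands in for a control such as willingness to take risks, and y = 1 indicates a decision based on the qualitative component. This is not the study's data or estimation code.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(xs, ys, lr=0.1, epochs=5000):
    """Fit a one-feature logit model by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0  # intercept and slope
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # score contribution per observation
            g0 += err
            g1 += err * x
        b0 += lr * g0 / len(xs)
        b1 += lr * g1 / len(xs)
    return b0, b1

# Toy data mirroring the reported direction of the risk control:
# higher risk willingness (x) -> less likely to rely on the text (y = 0).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, 1, 1, 0, 1, 0, 0]
b0, b1 = fit_logit(xs, ys)
print(b1 < 0)  # the fitted slope recovers the negative association
```

In practice, such models are estimated with a statistics package (e.g., a maximum-likelihood logit routine) that also reports standard errors and significance levels, which a hand-rolled gradient ascent does not.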

Summary of the findings

Our analysis of experiment 1 showed that only pos-neg reviews resulted in slower transaction decisions compared to consistent reviews. Since decisions did not take longer for neg-pos reviews than for consistent reviews, H1 cannot be supported. The results of experiment 2 showed that participants predominantly chose the restaurant with the positive qualitative component, supporting H2. The results for H1 and H2 are summarized in Table 9.

Table 9 Results of the hypotheses

Discussion

Theoretical implications

We contribute to the literature on inconsistent online reviews in two ways. First, our results show that inconsistent restaurant reviews do not generally prolong transaction decisions. Whereas Ruiz-Mafe et al. (2018) showed that System 2 intervenes when positive sequences follow negative sequences, we did not find that neg-pos reviews led to longer transaction decisions. This result could indicate that the effects of inconsistent reviews on the duration of consumer decisions are not as decisive as previously expected. However, since reviews and their effects on consumer decision-making are highly complex (Xiao & Benbasat, 2007), and since inconsistent reviews negatively affect review helpfulness (Aghakhani et al., 2020), inconsistencies could still hamper the effectiveness of reviews.

We found no significant difference in decision time between neg-pos and pos-pos reviews. In contrast, consumers took longer to decide based on neg-neg or pos-neg reviews, indicating that a positive polarity of the qualitative component is crucial for the decision duration, whereas the quantitative component has no effect on it. One possible explanation for the text's importance is the context of restaurant reviews: the perceived quality of restaurants is highly subjective, which is why many users might focus on the qualitative component to get a more detailed picture of the reviewed restaurant. The importance of the qualitative review component might also explain why consumers did not need significantly more time to make a transaction decision based on neg-pos reviews than on pos-pos reviews. These findings are in line with previous literature showing that sufficient positive information within a text is required for processing information and evaluating a subset of restaurants within the decision-making process (Tsang & Prendergast, 2009). Thus, the polarity of the review text is crucial for the decision duration, which might explain why we did not find support for H1. For reviews with positive qualitative components, participants made decisions faster regardless of the quantitative component.

Second, we showed that in the case of inconsistent restaurant reviews, consumers decided based on the qualitative rather than the quantitative component. These results indicate that in the case of inconsistent restaurant reviews, consumers are motivated to exert extra effort to minimize uncertainties due to the intangible nature of restaurant service (Nazlan et al., 2018). These findings are in line with our expectations based on media richness theory and the findings of Tsang and Prendergast (2009) in the context of product critiques. Whereas Tsang and Prendergast (2009) focused on both components’ importance using purchase intention as a proxy, we extended the literature by focusing on actual decisions.

Managerial implications

Based on our findings on the importance of the qualitative component, we propose three managerial implications that highlight the qualitative review component. First, we recommend that platforms focus on textual reviews to simplify the review system and facilitate consumers' decision-making. For instance, rating platforms could replace the quantitative review component with a polarity score drawn from sentiment analysis. The polarity scores of individual reviews could then be aggregated into an overall rating. Thus, the aggregated polarity score could strengthen the textual component when consumers screen restaurants during the decision-making process.
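This proposed design can be sketched as follows. The sentiment lexicon and the mapping from mean polarity to a rating scale are illustrative assumptions; a production system would use a trained sentiment model rather than a word list.

```python
# Toy sentiment lexicon (assumption for illustration only).
LEXICON = {"delicious": 1, "friendly": 1, "great": 1,
           "slow": -1, "cold": -1, "rude": -1}

def polarity(review_text):
    """Mean lexicon score of matched words, in [-1, 1]; 0.0 if no word matches."""
    hits = [LEXICON[w] for w in review_text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def aggregate_rating(reviews):
    """Aggregate per-review polarities and map [-1, 1] to a 1-to-5 scale."""
    mean_pol = sum(polarity(r) for r in reviews) / len(reviews)
    return round(3 + 2 * mean_pol, 1)

reviews = ["Delicious food and friendly staff",
           "Great menu but slow service"]
print(aggregate_rating(reviews))  # an overall rating derived from text alone
```

The aggregated score plays the role of today's star average but is derived entirely from the textual component, in line with the recommendation above.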

Second, we suggest highlighting keywords that enable consumers to filter individual reviews. To this end, platforms could use text mining to identify relevant topics within the qualitative review components. These topics could facilitate the evaluation of the restaurant subset within consumers' decision-making process. Amazon, for instance, highlights keywords within its product reviews, but keywords are currently not a common feature of review systems on restaurant platforms such as Yelp or TripAdvisor. In addition, platforms could offer further sorting and filtering options to promote the text as an essential review component, so that users can easily find reviews that fit their search requests. As a result, new peer evaluation methods for the qualitative component, such as “helpfulness” votes that signal a review's perceived value for decision-making, could become highly important in the design of review systems.
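As a minimal stand-in for the text-mining step, keywords can be extracted with a simple TF-IDF weighting over the review corpus; the example reviews are assumptions, and real platforms would likely use more sophisticated topic models.

```python
import math
from collections import Counter

def tfidf_keywords(reviews, top_k=3):
    """Rank words by summed TF-IDF across reviews; common filler words score 0."""
    docs = [r.lower().split() for r in reviews]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    scores = Counter()
    for d in docs:
        tf = Counter(d)
        for w, c in tf.items():
            scores[w] += (c / len(d)) * math.log(n / df[w])
    return [w for w, _ in scores.most_common(top_k)]

# Illustrative corpus: words unique to one review surface as keywords,
# while words shared by all reviews get an IDF of zero.
reviews = ["amazing pasta and cozy ambiance",
           "slow service and cold pasta",
           "amazing wine list"]
print(tfidf_keywords(reviews, top_k=2))
```

These keywords could then be offered as filters, so that a consumer interested in, say, service quality can jump directly to the relevant reviews.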

Third, we recommend that platforms split the qualitative review component into multiple parts. Platforms could introduce multiple criteria, such as service quality, food quality, or ambiance, on each of which users submit a qualitative review. Another approach could be to split the qualitative component into positive and negative parts. Such an adjusted design could offer a more detailed and differentiated view of the restaurant visit. Moreover, it could help to signal that the two components of restaurant reviews are not necessarily consistent.

Conclusion

Consumer reviews have become increasingly popular in electronic markets, especially on digital platforms. As inconsistent reviews frequently occur on digital platforms, this paper was motivated by the lack of knowledge about the effects of such reviews on consumer decision-making. In the first experiment, we show how inconsistent reviews affect the duration of transaction decisions. In the second experiment, we show that decisions are predominantly based on the qualitative review component. However, this paper is subject to several limitations.

First, we assume that inconsistent reviews are related to a switch from System 1 to System 2. However, with our online experiment, we could not directly measure the switch between these systems in consumer decision-making. Second, our analyses focused on strongly inconsistent reviews. Although we could not generally show that inconsistent reviews prolong transaction decisions, future studies might examine different degrees of inconsistency, with smaller differences between the quantitative and qualitative review components, to assess their effects on consumer decision-making. Moreover, within our experiments, all reviews of a restaurant were either consistent or inconsistent. However, as consumer reviews typically comprise both consistent and inconsistent reviews, a mix of both types could be examined in further studies. Third, our experiments focused on decisions among a few restaurants with a limited number of reviews. While this choice is a good approximation of reality, it does not fully represent the complex structure of consumer decision-making. Moreover, the relative importance of the textual review component could result from the uniqueness of restaurant reviews, as restaurant quality is subjective in the eyes of the consumer. Therefore, the relative importance could differ for other products or services. Future studies could consider different price levels and types of consumption (e.g., accommodations, books, cameras, or cars). Finally, we conducted our experiments on Amazon MTurk without incentivizing decision quality, as participants received a flat payment.