Introduction

In this paper, we investigate the interplay of online consumer ratings and online consumer reviews in mobile app downloads. An online rating is an assessment of a product's overall quality on a numerical scale, whereas an online review is a text comment on a product's attributes and quality. These mechanisms have become essential in electronic markets (Huang et al. 2015), particularly in product markets characterized by many competing alternatives available to consumers, e.g., books, restaurants, hotels, movies, and mobile apps. Ratings and reviews help consumers inform their choices and provide producers with useful information for forecasting sales, developing products, and designing marketing promotions (Li et al. 2019).

App downloads are an important performance variable for app developers and platform providers. This follows from how app revenues are generated (see Roma and Ragaglia 2016 for a review). Consumers sometimes pay for downloads directly, and downloads also enable in-app purchase revenues. Furthermore, app advertising revenues are positively related to the size of the app user base, which is a function of downloads (Lee et al. 2021).

Some studies have reported that ratings and reviews strongly impact consumers' product choices (Burgers et al. 2016; Finkelstein et al. 2017; Gokgoz et al. 2021; Kashyap et al. 2022), which is consistent with consumers' tendency to rely on peer information over commercial information for their product choices (Sher and Lee 2009). Overall, however, empirical findings on online ratings' and reviews' impact on consumer behavior are inconsistent (Liang et al. 2015; Gottschalk and Mafael 2017; Li et al. 2019; Picoto et al. 2019; Li et al. 2020; Sadiq et al. 2021). Such studies have typically investigated either ratings or reviews, so there is a need to better understand the interplay of ratings and reviews in consumer choice, as called for in recent studies (Chen and Xu 2017; Li et al. 2019; Kaur and Singh 2021; Xia et al. 2021). This study contributes to filling this gap in the literature. Based on cue theoretical premises, we provide rationales for consumers' combined use of ratings and reviews in app store settings. Drawing on cue-consistency theory (Miyazaki et al. 2005), we argue that rating cues (e.g., average rating score) and review cues (e.g., review polarity) may corroborate one another, thereby reinforcing each cue's individual credibility. As contended in the literature, when average rating scores are too similar across competing product alternatives, ratings alone cannot determine product choice (Hazarika et al. 2021). Consumers may instead combine rating and review information for their app downloading decisions, especially since potential consumers reportedly read and use reviews for their decision-making (Li et al. 2019; Lutz et al. 2022; Kashyap et al. 2022). Drawing on cue diagnosticity theory (Feldman and Lynch 1988), we further argue that rating and review cues may complement each other, in combination providing more diagnostic (reliable) information for consumers' app choices. Such complementary cue value is consistent with arguments that online reviews give potential consumers deeper insights into how specific attributes of an app appeal to current users, which aggregated ratings (average rating score, dispersion of ratings, and volume of ratings) alone cannot reveal (Liang et al. 2015).

In this study, we explore rating and review variables’ interaction effects on downloads of gaming and productivity apps in the Apple App Store. The rating variables we study are average rating score, volume of ratings, and dispersion of ratings. Corresponding review variables are polarity, subjectivity, and review length. Polarity represents a quantitative measure of the valence of a text review, whereas subjectivity is a quantitative measure of how objective (fact-based) versus subjective (emotional) a text review is. These variables are investigated due to their argued importance in the literature (Salehan and Kim 2016; Li et al. 2019; Filieri et al. 2019).

Studies on mobile apps have divided them into hedonic and utilitarian consumption value segments based on app store category (e.g., entertainment, games, productivity, health and fitness). Different empirical strategies have been employed to arrive at this dual classification. These include neutral expert interrater coding (Tafesse 2021) and surveys of consumers using measurement instruments to identify the main type of consumption value perceived for products across categories (Kim et al. 2014; Tang 2016; Yang and Lin 2019). Yet other studies have used logical reasoning to divide app categories into the two value segments (Arora et al. 2017). Regardless of procedure, such studies classify gaming apps into the hedonic value segment, as they are mainly used for the fun and enjoyment they bring. By the same token, productivity apps have been classified into the utilitarian value segment because they are mainly used for the efficiency and effectiveness gains they bring in solving tasks, e.g., spreadsheet problems, making presentations, or writing reports. Using such classifications, research has reported that ratings and reviews impact app consumption behavior differently for the two value segments (Liu et al. 2014; Roma and Ragaglia 2016; Tafesse 2021). For other product domains as well, ratings' and reviews' impact on product decision-making has been reported to depend on whether consumption is of hedonic or utilitarian value (Ren and Nickerson 2019; Akdim et al. 2022). Following the merit of these classifications reported in previous studies, we assume that gaming apps are overall more hedonic consumption value-oriented, whereas productivity apps are overall more utilitarian consumption value-oriented; hence the rationale for studying these two app categories. While acknowledging that games can sometimes be consumed for utilitarian purposes, and productivity apps for more hedonic reasons (see Akdim et al. 2022 for such observations of social mobile apps), we investigate how ratings and reviews in combination impact downloads of gaming apps and productivity apps.

Literature review

The research into online ratings and reviews is part of a broader stream of literature on electronic word-of-mouth (eWOM). eWOM is "any positive or negative statement made by potential, actual, or former customers about a product or company, which is made available to a multitude of people and institutions via the Internet" (Hennig-Thurau et al. 2004). Arguments in the eWOM literature favor the use of online consumer ratings and reviews for app downloading decisions. First, compared with other forms of online reviews, such as critic reviews or other third-party reviews, consumer comments are often considered more trustworthy (Liang et al. 2015). Second, online consumer app ratings and reviews can be posted anonymously, and consumers are more comfortable sharing both positive and negative comments when anonymous (Deng et al. 2021). Third, the less commercial actors can control review content, e.g., by deleting unfavorable reviews or posting fake reviews, the more likely potential consumers are to use reviews for their decision-making (DeAndrea et al. 2018). App developers have little such control since platform providers supply the rating and review mechanisms. In fact, platform providers remove apps from the store and expel developers if they manipulate ratings or reviews (Apple.com, March 2022). Finally, eWOM, compared with traditional WOM, can be more easily evaluated across space and time, e.g., repeatedly or at a pace suitable to the reader (Sun et al. 2006). Potential app consumers may for such reasons prefer eWOM over traditional WOM for their app decision-making. This study thus contributes to a deeper understanding of how eWOM impacts product performance under conditions argued to spur its use.

Online ratings’ impact on product performance

The literature on online ratings has mainly considered how three variables (average rating score, volume of ratings, and dispersion of ratings) impact product performance. Arguments in the literature state why these variables ought to matter. Average rating score informs a potential consumer about other consumers' perception of a product's value. Moe and Trusov (2011) therefore argue that products of higher quality are more likely to receive higher ratings than products of low quality, which impacts consumer decision-making. Volume of ratings represents a product's number of ratings. Arguments in the literature contend that higher volume is associated with more discussions about a product, leading to increased awareness of it among potential consumers (Lu et al. 2020). Volume of ratings is also argued to indicate the trustworthiness of the general opinion about a product, that is, whether consumers have reached a consensus on its general evaluation (Burgers et al. 2016). On this reasoning, a positive relationship between volume of ratings and consumer decision-making has been argued. Dispersion of ratings is a measure of the spread in ratings, e.g., the variance or standard deviation of ratings. Consumers generally seek to avoid risk, implying that dispersion of ratings should negatively impact potential consumers' reliance on ratings for their decision-making (Chu et al. 2014).

Empirical findings on how the three online rating variables impact product performance are inconsistent across products such as books, movies, hotels, and apps (Baugher et al. 2016; Finkelstein et al. 2017; Li et al. 2019; Lu et al. 2020; Tafesse 2021; Gokgoz et al. 2021; Chen et al. 2022). Appendix 1 provides a review of such studies, revealing that different performance metrics (downloads, top-list survival, and sales rank) have been investigated. Moreover, it indicates that rating variables' impact on app performance is contingent on contextual variables, e.g., app category and country profile. We contribute in three main ways to these prior online ratings studies. First, we analyze how dispersion of ratings impacts app performance. To our knowledge, prior work has dealt with this issue for other products (Chu et al. 2014; de Langhe et al. 2015; Zheng et al. 2021), but not for apps. It is not obvious how this variable impacts app performance. On the one hand, consumers generally seek to avoid using less reliable information for their decision-making (Chu et al. 2014). On the other hand, apps are typically low priced, which is why this risk may be ignored (Burgers et al. 2016). In a similar fashion, there are arguments and counterarguments for the role played by average ratings in electronic markets. A higher average rating score signals higher product quality with a positive influence on consumer decision-making (Moe and Trusov 2011), but average ratings may be too similar across competing app alternatives and thereby fail to inform consumer choice (Li 2018).

Second, we analyze how the three online rating variables impact app downloads for gaming apps (hedonic consumption value-oriented) versus productivity apps (utilitarian consumption value-oriented). Apart from Liu et al.’s (2014) analysis into freemium apps, there is a gap in the literature on how ratings impact the performance of hedonic versus utilitarian consumption value-oriented apps. Ratings’ impact on consumer decision-making for other products has been found to depend on such consumption value (Chu et al. 2014; Li et al. 2019; Tafesse 2021). Whether this generalizes to apps is not obvious since apps are typically low priced and can be uninstalled with little effort and regret (Burgers et al. 2016). Third, as Appendix 1 shows, most prior works on apps have analyzed top-listed apps for shorter time intervals. In this study, we track apps on a daily basis over a period of almost two years from their launch in the Apple App Store. This way we contribute to a deeper understanding of how ratings play a role in app performance.

Online reviews’ impact on product performance

Consumer text reviews are argued to help consumers find products that match their needs (Liang et al. 2015). Consequently, investigations into how different text review variables influence product performance, mainly product sales, have been conducted. Results of such studies are mixed across the products investigated (Gopinath et al. 2014; Liang et al. 2015; Li et al. 2019; Guo and Shasha 2016; Đurović and Kniepkamp 2022). However, the majority of prior research into online reviews has focused on determining review helpfulness, which represents the subjective value of a review to the reader (Huang et al. 2015; Kashyap et al. 2022). Such studies have reported polarity (valence), subjectivity, and length of reviews as determinants of review helpfulness.

Considering the app market's historically high and continued expected growth (Borasi and Baul 2019), there is surprisingly sparse research into how online reviews impact app performance. Liang et al. (2015), studying weekly panel data of top-500 listed apps, report a positive impact of their review valence measure on both free and paid app sales rank. Oh et al. (2015) reported that the number of question posts generated by potential consumers about an app positively impacts its downloads. This study contributes to such studies by analyzing how polarity, subjectivity, and length of reviews impact app downloads. Li et al. (2019), in their review of the ratings and reviews literature, conclude that eWOM studies focus on numerical ratings but rarely address textual reviews, due to the complexity of text analysis. They moreover conclude that few studies that incorporate textual reviews use techniques such as sentiment analysis. Our study contributes to this stream of literature by utilizing such techniques.

The interplay of ratings and reviews in product performance

Research into the combinatory impact of ratings and reviews has been called for in recent studies (Li et al. 2019; Filieri et al. 2019; Kaur and Singh 2021; Shin et al. 2021). Previous research has suggested that numerical ratings and textual comments might work separately or in combination (Floh et al. 2013), but little research has dealt with how ratings and reviews interplay. Tsang and Prendergast (2009) found in their experimental study of movie reviews that, for evaluations containing both a text review and a rating, the former matters more for product purchase intention. However, the authors did not find that positive ratings accompanied by positive reviews produced significantly higher purchase intention compared with inconsistent evaluations. Hu et al. (2013), utilizing a large panel data set on 4000 Amazon books, reported no direct effect of ratings on book sales rank, but a positive moderating effect of ratings and review valence on such rank. Chong et al. (2016) reported a positive interaction effect of their sentiment polarity measure and volume of ratings in predicting sales of 12,000 Amazon electronics products based on a neural network approach. Al-Natour and Turetken (2020), based on Amazon and Yelp rating and review data across product domains, reported sentiment polarity to be a good substitute for star ratings, and at times a good complement to them. These findings are consistent with the cue theoretical premises put forth in the present study of mobile apps. Similarly, Zhu et al. (2020), in the context of hotel reviews, report consistency between review polarity and rating scores. By contrast, Li et al. (2019), utilizing 22-week panel data on consumer reviews of 312 PC products, reported that numerical ratings mediate the effect of textual reviews on product sales. Filieri et al. (2019) found that extreme ratings in combination with long, linguistically clear hotel reviews on TripAdvisor positively impact review helpfulness. Kaur and Singh (2021) reported a mixed impact of rating score combined with review volume on book sales.

Our research contributes to this prior work by analyzing a larger set of interaction effects between rating and review variables on product performance.

The interplay of ratings and reviews in mobile app downloads: cue consistency and diagnosticity rationales

Ratings and reviews are cues, i.e., information signals, used by consumers to infer product quality (Byun et al. 2021). Consumers may use rating and review cues in combination for two main reasons. One is that different cues, by corroborating each other, strengthen each other’s reliability from the consumer’s perspective. This is a main premise of cue-consistency theory (Miyazaki et al. 2005). The theory holds that observation of consistent signals increases information diagnosticity, which is the extent to which a cue helps the consumer assign a product to a specific quality category (such as high or low quality). The other main reason is that multiple cues may complement each other. This is consistent with cue diagnosticity theory, which holds that consumers continue to assess cues until a perceived reliable or diagnostic inference of product quality has been reached (Feldman and Lynch 1988; Reddy et al. 1994). Consumers do so to reduce uncertainty and risk around their product decisions (Kirmani and Rao 2000). Moreover, in line with these two theoretical premises, empirical studies have repeatedly revealed that consumers prefer relying on multiple cues over single cues for their product decisions (see Byun et al. 2021, for a review).

The corroborating and complementary rationales may apply to a multitude of rating and review variable interactions in line with the cue theoretical premises. First, both ratings and reviews are retrievable, as they are publicly accessible in app marketplaces. Second, they enable provision of corroborating and complementary product quality signaling value to consumers. Specifically, consistency in valence between ratings (average rating score) and reviews (polarity) may strengthen the consumer's trust in each cue. Similarly, average rating score combined with objective (rather than subjective or emotional) reviews may have this corroborating effect. Research indeed reveals that consumers put higher trust in objective (factual) online reviews (Darley and Smith 1993). The corroborating effect may furthermore apply to review length accompanied by average rating score, as longer reviews may provide deeper insight into why the average rating score for an app is high or low (Li et al. 2019).

Rating and review variables may also offer complementary signaling value to consumers. This stems partly from their different nature (numbers versus text) and the way they are displayed in app marketplaces. Aggregated rating cues such as average rating score, the number of ratings, and the distribution of ratings (on the, e.g., 5-star scale) provide overall consumer population information about an app’s quality. Text reviews, on the other hand, are displayed in a disaggregated fashion, so they may provide information about specific quality attributes or aspects of an app not revealed by the aggregated rating information. It follows that consumers may use average rating score as a cue for an app’s overall quality while simultaneously using individual text reviews to obtain cue information about how specific attributes of the app appear to fulfill the consumer’s quality expectations. Both cues may have to meet the consumer’s expectations for the app to be downloaded, hence a combinatory effect. Similarly, consumers may consult disaggregated text reviews to obtain additional insight into why ratings are dispersed or not. Thus, review polarity and dispersion of ratings may in this way offer complementary value to one another and may be used in combination to determine whether to download an app. Furthermore, whether dispersion of ratings is based on subjective or objective consumer evaluations can be inferred by consumers inspecting individual text reviews along with overall rating dispersion information (how ratings are distributed on the 5-point scale across raters). As such, the two may be used in combination for consumers’ downloading decisions. Moreover, related work has argued that text and numerical components of a product review would often interact within the consumer’s processing system (Li et al. 2019). Yet other research suggests that combinatory use is enhanced by ratings and reviews being displayed simultaneously in app marketplaces, making the interplay between them particularly valid to study (Chong et al. 2016).

Despite the two main cue theoretical premises favoring consumers' use of ratings and reviews in combination, characteristics of mobile apps may attenuate their combinatory use. First, apps typically have a low upfront price and can be uninstalled with low effort and regret (Burgers et al. 2016). Accordingly, consumers might not engage in in-depth exploration of ratings and reviews, but instead download apps and try them out. Moreover, text reviews might require too much effort to evaluate if they are not linguistically clear (Salehan and Kim 2016), which may attenuate their corroborating and complementary effect with ratings. For these reasons, an exploratory approach is adopted in this study, whereby interaction effects of rating cues (average rating score, volume of ratings, and dispersion) and review cues (polarity, subjectivity, and review length) on mobile app downloads are explored. The remainder of this study focuses on this issue, reporting on such combinatory effects for apps in the game and productivity categories in the Apple App Store.

Methodology

Data

To analyze how ratings and reviews impact mobile app downloads, we used US Apple App Store data ranging from January 1, 2015 to December 19, 2016. These data were acquired from a large, reputable global provider specializing in app market analytics (https://www.mobileaction.co/). This allowed faster data collection than gathering two years of data ourselves with scraping algorithms. US Apple App Store data were used because of the market's size and its common use in related work (Lee and Raghu 2014; Liang et al. 2015; Kübler et al. 2018; Gokgoz et al. 2021). Our dataset is restricted to apps tracked daily from their release in the App Store. We thereby contribute to prior work by capturing new apps and more granular app data over a longer period of time (see Appendix 1 for a comparison to related work). The final sample consisted of an unbalanced panel of 341 mobile apps, of which 295 were gaming apps and 46 were productivity apps. Apps from these two categories were selected for two related reasons. First, according to previous studies, games are more hedonic consumption value-oriented, while productivity apps are more utilitarian consumption value-oriented (see Tafesse 2021, for a review). Second, ratings and reviews are reported to impact consumer decision-making differently depending on such consumption value orientation (Roma and Ragaglia 2016; Ren and Nickerson 2019).

The raw data acquired included the following: count of daily downloads per app, app rating (from one to five stars) per reviewer and app, text review per reviewer and app, app release date, app type (gaming or productivity), and app download type (free or paid). The data also revealed the exact times when text reviews and app ratings were posted on the App Store. This enabled us to generate panel data.

Variables and measurement

In Table 1, measures of variables are summarized. Downloads, which constitutes the dependent variable in our econometric models, was measured as daily count per app. We generated three rating variables: average rating score (Av_Rating), volume of ratings (Vol_Rating), and dispersion of ratings (Disp_Rating). Cumulative measures for these variables were used to achieve consistency with how ratings are displayed to potential app adopters in the App Store. This is also consistent with how rating variables are measured in related work (Lee and Raghu 2014; Baugher et al. 2016; Finkelstein et al. 2017; Kübler et al. 2018; Li et al. 2019).
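To make the cumulative measurement concrete, the three rating variables can be derived from raw per-rating data along the following lines. This is a minimal pandas sketch under assumed column names (app_id, date, stars), rather than the data provider's actual schema or our exact pipeline:

```python
import pandas as pd

# One row per rating event; column names (app_id, date, stars) are illustrative.
ratings = pd.read_csv("ratings.csv", parse_dates=["date"])
ratings = ratings.sort_values(["app_id", "date"])

# Cumulative measures per app, mirroring what the App Store displays:
# running mean (Av_Rating), running count (Vol_Rating), and running
# standard deviation (Disp_Rating) of the 1-5 star ratings.
grp = ratings.groupby("app_id")["stars"]
ratings["Av_Rating"] = grp.expanding().mean().reset_index(level=0, drop=True)
ratings["Vol_Rating"] = grp.cumcount() + 1
ratings["Disp_Rating"] = grp.expanding().std().reset_index(level=0, drop=True)

# Collapse to an app-day panel by keeping each day's last cumulative value
# (days without new ratings would need forward-filling in a full pipeline).
ratings["day"] = ratings["date"].dt.normalize()
panel = (ratings.drop(columns=["date"])
         .groupby(["app_id", "day"]).last().reset_index())
```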

Table 1 Variables and measurement

In order to econometrically analyze online consumer reviews' impact on app downloads, we used sentiment analysis to generate statistical variables from text. This is a common method in online consumer review studies (Lopez et al. 2020). Two such variables were generated: polarity (Av_Polarity) and subjectivity (Av_Subjectivity). Polarity classifies words, phrases, or sentences from positive to negative (Liu 2010). In our study, polarity reflects to what extent a text review expresses a positive or negative view of an app's quality. Subjectivity expresses to what extent a text review is fact-based (objective) versus emotional (subjective) in character (Liu 2012). Specifically, for their downloading decisions, potential app adopters may rely to different extents on emotionally expressed views compared with more fact-based ones. To extract a polarity score and a subjectivity score from each review, we used a lexicon-based approach, i.e., a dictionary of words annotated with a word's or text phrase's opinion orientation and subjectivity. Specifically, we used pattern.en, a natural language processing toolkit that leverages WordNet to score sentiment according to the English adjectives used in the text (De Smedt and Daelemans 2012; www.pattern.en for details). WordNet is a large electronic lexical English database including more than 117,000 synsets, i.e., groups of words constituting cognitive synonyms (Fellbaum 1998; wordnet.princeton.edu for details). Polarity scores obtained using pattern.en are in the range −1 to +1, where a higher value denotes a more positive opinion and 0 reflects a neutral opinion. Subjectivity scores are in the range 0 to 1, where a higher score implies a more emotionally oriented expression. Review length (Rev_Length) was used as a third variable since longer reviews have been argued to be more helpful to readers (Huang et al. 2015; Li et al. 2019). Like the rating variables, the three review variables were measured at the daily level, as daily averages.
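As an illustration, the snippet below scores two invented example reviews with pattern.en; a minimal sketch assuming the pattern package is installed:

```python
# Lexicon-based sentiment scoring with pattern.en (De Smedt and Daelemans 2012).
# sentiment() returns a (polarity, subjectivity) tuple: polarity in [-1, +1],
# subjectivity in [0, 1], derived from annotated English adjectives.
from pattern.en import sentiment

reviews = [
    "Great app, the new spreadsheet functions work flawlessly.",       # invented example
    "I hate this update, it crashes constantly and feels sluggish.",   # invented example
]

for text in reviews:
    polarity, subjectivity = sentiment(text)
    print(f"polarity={polarity:+.2f} subjectivity={subjectivity:.2f} :: {text}")
```

Daily Av_Polarity and Av_Subjectivity then follow by averaging these per-review scores over the reviews posted for an app on a given day.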

To investigate the interplay of ratings and reviews in mobile app downloads, we generated nine interaction variables: for the three rating variables and the three review variables, we multiplied each rating variable with each review variable. Prior work has dealt with only a subset of such interaction effects (Tsang and Prendergast 2009; Hu et al. 2013; Li et al. 2019). More comprehensive analyses have been called for, given how such pieces of information are displayed together to readers in electronic markets and because rating and review information carry different signals of product quality (Chong et al. 2016; Chen and Xu 2017; Li et al. 2019). However, little is known about how different pieces of rating and review information in combination influence product performance; this is our rationale for exploring a larger set of interaction effects.
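Continuing the illustrative pandas sketch from above (and assuming the app-day panel already holds the rating and review variables), the nine product terms can be generated as follows:

```python
# Nine interaction variables: each rating variable times each review variable.
# Whether logged or raw levels enter the product terms is a specification
# detail not spelled out here; raw levels are used for illustration.
rating_vars = ["Av_Rating", "Vol_Rating", "Disp_Rating"]
review_vars = ["Av_Polarity", "Av_Subjectivity", "Rev_Length"]

for r in rating_vars:
    for v in review_vars:
        panel[f"{r}_x_{v}"] = panel[r] * panel[v]
```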

Finally, we included app age and gross ranking as independent variables in our econometric analyses. These were included due to their argued importance for app performance according to related work (Jung et al. 2012; Lee and Raghu 2014; Roma and Ragaglia 2016; Kübler et al. 2018). Gross_Rank is a measure of the overall popularity of an app relative to other apps, where a lower positive rank integer value implies higher relative popularity. The ranking is provided by the Apple App Store, which does not openly reveal its measurement procedure. App_Age is included following product-life cycle theory arguments that different types of consumers may rely on different sources of information for their decision-making. It refers to the number of days an app has existed in the App Store since its initial release.

All independent variables are lagged one day relative to the dependent variable, so that changes in ratings and reviews precede the downloads they are meant to explain, as a step toward causal identification. A one-day lag was used rather than additional days because related work has found that consumers rely only, or to a greater extent, on the most recent reviews (Li et al. 2019; Alzate et al. 2021). Inspection of our data set reveals that reviews and ratings change on a daily basis, which, along with the aforementioned literature arguments, motivates the one-day lag.
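In the same illustrative sketch (assuming Gross_Rank and App_Age have been merged into the panel), the one-day lag amounts to a within-app shift of all regressors:

```python
# Lag every regressor one day within each app, so day-t downloads are
# explained by day t-1 information. This row-wise shift assumes consecutive
# daily observations per app; calendar gaps would require reindexing by date.
panel = panel.sort_values(["app_id", "day"])
interactions = [f"{r}_x_{v}" for r in rating_vars for v in review_vars]
regressors = rating_vars + review_vars + interactions + ["Gross_Rank", "App_Age"]
lagged = [f"{c}_lag1" for c in regressors]
panel[lagged] = panel.groupby("app_id")[regressors].shift(1).to_numpy()
```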

Econometric models and analyses

Following Hausman test results, fixed-effects panel regression analyses were performed to analyze how ratings and reviews impact app downloads. Model 1 is the additive benchmark model, capturing the direct effects of the independent variables on mobile app downloads:

$$\mathrm{ln}({Downloads}_{i,t})={\alpha }_{i}+{\beta }_{1}{Av\_Rating}_{i,t-1}+{\beta }_{2}{Disp\_Rating}_{i,t-1}+{\beta }_{3}\mathrm{ln}({Vol\_Rating}_{i,t-1})+{\beta }_{4}{Av\_Polarity}_{i,t-1}+{\beta }_{5}{Av\_Subjectivity}_{i,t-1}+{\beta }_{6}\mathrm{ln}({Rev\_Length}_{i,t-1})+{\beta }_{7}{Gross\_Rank}_{i,t-1}+{\beta }_{8}{App\_Age}_{i,t-1}+{\varepsilon }_{i,t}$$
(1)

Models 2 to 10 are the interaction effect models, each including one interaction variable, which enables analysis of each product term's impact on downloads relative to the benchmark model. Hence, the interaction effect models take the form:

$$\mathrm{ln}({Downloads}_{i,t})={\alpha }_{i}+{\beta }_{1}{Av\_Rating}_{i,t-1}+{\beta }_{2}{Disp\_Rating}_{i,t-1}+{\beta }_{3}\mathrm{ln}({Vol\_Rating}_{i,t-1})+{\beta }_{4}{Av\_Polarity}_{i,t-1}+{\beta }_{5}{Av\_Subjectivity}_{i,t-1}+{\beta }_{6}\mathrm{ln}({Rev\_Length}_{i,t-1})+{\beta }_{7}{Gross\_Rank}_{i,t-1}+{\beta }_{8}{App\_Age}_{i,t-1}+{\beta }_{9}{Rating\_Variable}_{i,t-1}\times {Review\_Variable}_{i,t-1}+{\varepsilon }_{i,t}$$
(2)

As we used fixed-effects model analyses, \({App\_Type}_{i}\) would be omitted as a time-invariant category variable if included as an independent variable. Hence, we split the dataset to separately analyze the benchmark model and interaction effect models for gaming and productivity apps, respectively. For causality reasons, independent variables were lagged one day in all models, as shown in (1)–(10). Appendix 2 presents a correlation matrix for the explanatory variables. It reveals no correlation higher than (\(\pm\)) 0.8, implying no severe issue of multicollinearity (Mota and Moreira 2015). Appendix 3 presents descriptive statistics for the variables. Due to the large standard deviations, high skewness, and differences in scale reported for Downloads, Rev_Length, and Vol_Rating, we used the natural logarithm of these variables. This procedure was taken to normalize the data, in line with recommendations for valid econometric analysis (Li et al. 2020). The same procedure has commonly been applied to these variables in related work based on app store data (Lee and Raghu 2014; Oh et al. 2015; Gokgoz et al. 2021; Kaur and Singh 2021). Due to the presence of heteroskedasticity revealed by Breusch-Pagan tests, robust standard errors were used in our regression analyses, in line with recommendations (Angrist and Pischke 2008).
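For completeness, the benchmark specification can be estimated along the following lines. This is a sketch using the linearmodels package (an assumption; the estimation software is not named here), continuing the illustrative panel from the previous sketches and assuming Downloads has been merged in; log1p rather than log is used only to guard against zero counts:

```python
import numpy as np
from linearmodels.panel import PanelOLS

# linearmodels expects an entity-time MultiIndex (app, day).
df = panel.set_index(["app_id", "day"]).dropna()

# Natural-log transforms as in the paper; log1p guards against zeros
# (an implementation choice here, not stated in the source).
df["ln_Downloads"] = np.log1p(df["Downloads"])
df["ln_Vol_Rating_lag1"] = np.log1p(df["Vol_Rating_lag1"])
df["ln_Rev_Length_lag1"] = np.log1p(df["Rev_Length_lag1"])

exog = ["Av_Rating_lag1", "Disp_Rating_lag1", "ln_Vol_Rating_lag1",
        "Av_Polarity_lag1", "Av_Subjectivity_lag1", "ln_Rev_Length_lag1",
        "Gross_Rank_lag1", "App_Age_lag1"]

# Benchmark model (1): app fixed effects with heteroskedasticity-robust SEs.
fe = PanelOLS(df["ln_Downloads"], df[exog], entity_effects=True)
print(fe.fit(cov_type="robust"))

# Models (2)-(10): re-estimate with one lagged rating-review product term
# appended to exog, one model per interaction, per app category subsample.
```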

Results and discussion

Tables 2 and 3 in the two subsequent sections report the findings of the fixed-effect regression analyses of how rating and review variables in combination impact gaming and productivity app downloads. The findings are discussed next.

Table 2 Interplay of ratings and reviews in gaming app downloads
Table 3 Interplay of ratings and reviews in productivity app downloads

Ratings’ and reviews’ impact on gaming app downloads

For gaming apps, volume of ratings is the only piece of rating information found to have a direct effect on downloads, as shown in Table 2. Contrary to previous studies of apps (Wang et al. 2015; Burgers et al. 2016), we report a significant negative effect of volume of ratings on downloads. In general, the higher the volume of ratings, the more popular a product is perceived to be (see Khare et al. 2011, for a review of arguments). However, Khare et al. (2011) demonstrate that when consumers have a high need for uniqueness, a higher volume of ratings for a product decreases preference for it. Future work should investigate whether this need-for-uniqueness effect pertains to mobile apps.

We report that length of text reviews has a positive significant effect on gaming app downloads. This finding is consistent with literature arguing that longer text reviews are more helpful to potential consumers due to providing richer product quality cue information (Huang et al. 2015). Polarity, as a measure of a text review’s valence, was not found to impact gaming app downloads. This finding is consistent with consumers displaying heterogeneity in preferences for products consumed mainly for hedonic purposes (Liu et al. 2014; Tafesse 2021). In other words, if consumers have very different needs and desires for a product, valence measures, such as average rating score and polarity, could be insufficient to rely on separately for making downloading decisions.

Instead of relying on either ratings or reviews, our results indicate that consumers use a combination of both for their gaming app downloading decisions. The findings are thus in line with the cue consistency and diagnosticity rationales for consumers' combinatory use of ratings and reviews (Feldman and Lynch 1988; Miyazaki et al. 2005). Four significant interaction effects between rating and review variables are reported in Table 2. First, polarity enhances the positive impact of average ratings on downloads, which is consistent with findings for books reported by Hu et al. (2013). One interpretation of this interaction effect is that by reading positive reviews of quality aspects that pertain to their needs, consumers rely more on the high average rating score of a gaming app. Second, polarity is found to enhance the negative impact of volume of ratings on downloads, following a positive interaction effect. Whether this is due to a need for uniqueness, as demonstrated by Khare et al. (2011), needs further scrutiny. Third, a negative interaction effect on downloads is reported for length of text reviews and dispersion of ratings. This implies that longer text reviews, by providing richer product quality information and being more helpful than shorter ones, as argued by Huang et al. (2015), could reduce the quality uncertainty that spread in gaming app ratings creates. Since consumers generally seek to avoid risk, dispersion of ratings tends to be undesirable (Chu et al. 2014).

Fourth, in a similar fashion, length of text reviews is found to lessen the negative effect of volume of ratings on downloads. Due to the direct and indirect effects of length of text reviews on gaming app downloads, having consumers write extensive reviews seems important to gaming app developers.

Ratings’ and reviews’ impact on productivity app downloads

The findings for productivity app downloads are reported in Table 3. Again, volume of ratings is the only rating variable with a significant direct effect on downloads. In contrast to gaming apps, its effect is positive and significant for productivity apps. This finding is consistent with arguments in the literature that higher volume of word-of-mouth has more persuasive power in decision-making and indicates a more popular product (see Khare et al. 2011 for a review). It moreover corroborates previous empirical findings on how volume of ratings impacts app downloads (Oh et al. 2015; Wang et al. 2015). This suggests the importance of having users rate a productivity app in order to grow the developer's customer base.

Polarity is the only review variable reported to have a direct effect on productivity app downloads. For products providing mainly utilitarian value, consumers tend to be much more homogeneous in their preferences. Therefore, it has been argued that consumers can infer an app’s quality from its average rating score (Roma and Ragaglia 2016). However, our findings reveal that average rating score does not impact downloads of such apps. One potential explanation of this finding is that, to potential consumers, average rating scores for competing app alternatives are too similar, as stated previously in the literature (Li 2018). In our dataset, the average rating score for an app was 4.027 on a scale of 1 to 5 stars, which corresponds to what is reported in related work (Hyrynsalmi et al. 2015). Consumers may therefore instead turn to text reviews to obtain peers’ detailed opinions on an app’s quality. Moreover, average rating score is an aggregated measure of a product’s quality to a consumer. For multi-attribute productivity apps (e.g., statistics software, spreadsheet apps, word processing apps), potential consumers may be particularly interested in the quality of specific functions or tools of such apps. By representing richer cues than ratings, consumer text reviews may better reveal such information to potential app adopters. This could explain why polarity, but not average rating score, significantly impacts downloads of productivity apps.

In the case of productivity apps as well, consumers seem to use a combination of rating and review information rather than relying on one or the other. As reported in Table 3, an increase in polarity enhances the negative effect of dispersion of ratings on downloads. This suggests that consumers grow more skeptical toward mainly positive text reviews when they see that different consumers have rated an app very differently. This line of reasoning is consistent with findings by Huang et al. (2015) on how consumers use online text reviews for their decision-making. Finally, our results reveal that subjectivity in written text reviews enhances the negative impact of dispersion of ratings on productivity app downloads. This is consistent with arguments that for products of a utilitarian value nature, consumers are more oriented toward fulfilling professional responsibilities, which implies risk aversion (Das et al. 2018). Specifically, subjective text reviews accompanied by high dispersion of ratings are suggested to increase potential consumers' skepticism that the app meets the requirements for successful professional task completion. Overall, the findings reported in Tables 2 and 3 reveal different interaction effects of rating and review variables on downloads for gaming compared with productivity apps. The findings are therefore in line with arguments that consumers use rating and review information to different extents depending on whether app consumption is mainly of hedonic or utilitarian value orientation (Roma and Ragaglia 2016; Tafesse 2021).

Conclusions and implications

In this paper, we have explored the combinatory role of online ratings and online reviews in mobile app downloads. This was achieved by utilizing a daily panel data set of 295 gaming apps and 46 productivity apps, tracked for almost two years from their launch in the Apple App Store. We report that ratings and reviews have both direct effects and interaction effects on downloads. At the same time, these effects differ for gaming versus productivity apps. We thereby provide further support to the literature arguing for classification of apps into hedonic and utilitarian consumption value segments, that is, apps consumed mainly for fun versus for professional purposes (Liu et al. 2014; Roma and Ragaglia 2016; Tafesse 2021). Mainly, this study has contributed to the sparse literature on how online ratings and reviews, separately and in combination, impact app consumer behavior. The findings have important implications for the user attraction and retention strategies of app developers and platform providers.

Limitations

The limitations of this study are important to acknowledge. First, our empirical analysis was based on a comparison of two app categories. These categories were selected because ratings and reviews have been found to have different effects depending on the type of consumption value a product mainly provides to consumers. Previous studies have thus segmented apps according to hedonic and utilitarian value based on their app store category and reported the merits of such classification. However, the app categories analyzed in this study may differ in other dimensions as well, such as the role played by network effects (Arora et al. 2017) and consumer segments targeted (Liu et al. 2017). The data we utilized did not allow us to control for such effects. Such extensions of this study are recommended for future work into apps. Second, our results are limited to one country (US) and the US Apple App Store market. To what extent our findings generalize to countries with other characteristics, e.g., other cultural dimensions (Hofstede 2001), and across different platforms for apps (Roma and Vasi 2019) needs further scrutiny. Third, our study was restricted to how peer influence impacts app downloads based on literature arguments favoring its use (Sher and Lee 2009). Our work thus needs to be extended by analyzing how peer information relative to commercial marketing information impacts app downloads. Future studies could consider how app tutorial videos and advertisements supplied by app providers impact downloads. Fourth, our findings rely on the use of one specific opinion mining technique. Although this technique has been validated repeatedly and used in related work (Fellbaum 1998; De Smedt and Daelemans 2012), testing the robustness of findings across lexicons and mining techniques is called for.

Directions for further research

This study explored the combinatory role of ratings and reviews in app downloads. Cue theoretical arguments on the one hand, and app characteristics discussed in the literature on the other, guided our exploratory study of how rating and review variables impact consumer decision-making. As our findings reveal that ratings and reviews have a combinatory impact, future experimental work should examine the corroborating and complementary cue value effects separately. Moreover, qualitative research into how consumers make mobile app downloading decisions is called for. An improved understanding could offer valuable insights to app developers and platform providers on how to describe apps in appropriate ways as well as how to appeal to users.

Identification of the conditions that make consumers rely on ratings, reviews, or a combination of both is important for improved knowledge of such mechanisms' effectiveness. Previous studies have indicated that ratings matter to different extents depending on country profile (Kübler et al. 2018) and app store market (Jung et al. 2012). This study has contributed to such work by demonstrating how app type matters for the combinatory role of ratings and reviews. Additional conditions are worth exploring in future work, such as how consumer type, for example opinion leadership versus opinion-seeking (see Flynn et al. 1996), and type of revenue model (see Roma and Ragaglia 2016) matter for the role played by ratings and reviews. Another condition to consider is how product updates impact ratings and reviews and, in turn, consumer decision-making. Until now, there has been little research into the role of updates in product appeal, such as for software (see Comino et al. 2018). Moreover, comparative studies into how ratings and reviews matter for different outcomes, such as creating product awareness, initiating use, and generating sales, represent another avenue for further research.

Previous research (see Schrum et al. 2020, for a review) has demonstrated that the design of a scale, such as a rating system, matters for consumers' response to it. The range of a rating scale and how alternatives on the scale are phrased or depicted may therefore influence consumer behavior. This effect has been studied for other products, albeit not for mobile apps with their specific characteristics. Optimizing the app store rating system could therefore potentially improve ratings and provide consumers with better insight into the expected experience before downloading a mobile app. Other research (Filieri et al. 2021) has shown that consumers are biased in their consumption decisions based on how previous consumers rated a product. Whether to treat ratings linearly on a scale is therefore worthy of further scrutiny. For instance, how consumers weight differences in average rating score across competing product alternatives represents one such issue.

Moreover, how long the memory process is for low, high, and moderate ratings could be worthy of further investigation (cf. Zhang et al. 2015). Such studies could offer complementary value to the present study, in which a one-day lag of rating and review variables was used to investigate their impact on mobile app downloads.

Managerial implications

Our findings have important implications for app developers and platform providers. First, this study reveals that consumers rely on both rating and text review information for their app downloading decisions. However, which pieces of this information matter is found to differ between gaming apps and productivity apps.

For gaming apps, developers should encourage current users to write extensive reviews, as these are found to positively impact new downloads of such apps. This can be done via in-app prompts asking users to rate the app and provide text reviews on its important attributes. Preferences for products used mainly for fun tend to vary from person to person (Akdim et al. 2022). This is consistent with our findings that both a high rating for a gaming app and a positive text review are necessary to attract new downloads. These implications should also be considered by developers of other apps that are mainly consumed for leisure or fun.

For productivity apps, which tend to have multiple attributes and functions such as statistics software or word processing tools, positive reviews are important for stimulating downloads. This information represents important cues about the quality of specific functions of interest to potential consumers, e.g., the quality of a specific data analysis tool in statistics software. Moreover, having consumers rate productivity apps is important for attracting downloads. Developers of productivity apps should therefore create means for users to rate their apps. It is argued that doing so creates discussions about the app, which stimulates downloads (Mitchell and Khazanchi 2008). These implications should be considered by developers of apps consumed mainly for professional purposes.

Finally, for app platform providers, it is important that rating mechanisms fulfill the function of providing users with valuable cue information on competing app alternatives. In our study sample, the average rating score for competing app alternatives was 4.027 on a scale of 1 to 5. Similar findings have been reported in related work (Hyrynsalmi et al. 2015). In effect, ratings may not be sufficiently different across competing app alternatives to guide consumer choice. App stores should therefore consider having consumers provide an overall rating of an app as well as of its various quality attributes. Displaying this specific information in app stores could remedy the problem of ratings being too similar to inform consumer decision-making. As reading an extensive number of text reviews may require significant effort by potential consumers, particularly for lower-priced products such as apps (Burgers et al. 2016), readers need help sorting large volumes of reviews. One extension opportunity for app platform providers is to have readers click on an icon next to a text review if they find it helpful. This information could help readers sort text reviews based on users’ helpfulness ratings, which could make review information more useful to potential app consumers. Finally, the implications for developers of gaming and productivity apps should be considered by app platform providers as well. This follows from the revenue-sharing agreements between app developers and app platform providers (Roma and Ragaglia 2016).