
Studying the dialogue between users and developers of free apps in the Google Play Store

Published in: Empirical Software Engineering

Abstract

The popularity of mobile apps has continued to grow over the past few years. Mobile app stores, such as the Google Play Store and Apple’s App Store, provide a unique user feedback mechanism to app developers through the possibility of posting app reviews. In the Google Play Store (and soon in the Apple App Store), developers are able to respond to such user feedback. Over the past years, mobile app reviews have been studied extensively by researchers. However, much of the prior work (including our own) incorrectly assumes that reviews are static in nature and that users never update their reviews. In a recent study, we started analyzing the dynamic nature of the review-response mechanism. Our previous study showed that responding to a review often has a positive effect on the rating that the user gives to an app. In this paper, we revisit our prior finding in more depth by studying 4.5 million reviews with 126,686 responses for 2,328 top free-to-download apps in the Google Play Store. One of the major findings of our paper is that the assumption that reviews are static is incorrect. In particular, we find that developers and users in some cases use the response mechanism as a rudimentary user support tool, in which dialogues emerge between users and developers through updated reviews and responses. Even though the messages are often simple, we find instances of as many as ten user-developer back-and-forth messages that occur via the response mechanism. Using a mixed-effect model, we identify that the likelihood of a developer responding to a review increases as the review rating gets lower or as the review content gets longer.
In addition, we identify four patterns of developers: 1) developers who primarily respond only to negative reviews, 2) developers who primarily respond to negative reviews or to reviews based on their content, 3) developers who primarily respond to reviews that are posted shortly after the latest release of their app, and 4) developers who primarily respond to reviews that are posted long after the latest release of their app. We perform a qualitative analysis of developer responses to understand what drives developers to respond to a review. We manually analyzed a statistically representative random sample of 347 reviews with responses for the top ten apps with the highest number of developer responses. We identify seven drivers that make a developer respond to a review, the most important of which are thanking the user for using the app and asking the user for more details about the reported issue. Our findings show that it can be worthwhile for app owners to respond to reviews, as responding may lead to an increase in the given rating. In addition, our findings show that studying the dialogue between users and developers can provide valuable insights that can lead to improvements in the app store and user support process.
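The relationship described above (a lower rating or a longer review making a developer response more likely) can be illustrated with a small sketch. Note that this is not the paper's actual analysis: the study fits a mixed-effect model (with, e.g., the app as a grouping factor), whereas the sketch below fits a plain logistic regression on synthetic data with hypothetical effect sizes, purely to show how the direction of the two effects can be read off the fitted coefficients.

```python
# Simplified illustration (not the paper's model): plain logistic regression
# on synthetic review data. Effect sizes below are invented for the sketch.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
rating = rng.integers(1, 6, n)      # star rating, 1-5
length = rng.integers(10, 500, n)   # review length in characters

# Synthetic ground truth: lower ratings and longer reviews attract responses.
logit = 1.5 - 0.8 * rating + 0.004 * length
prob = 1.0 / (1.0 + np.exp(-logit))
responded = rng.random(n) < prob    # did the developer respond?

X = np.column_stack([rating, length])
model = LogisticRegression(max_iter=1000).fit(X, responded)
coef_rating, coef_length = model.coef_[0]

# The fitted signs mirror the paper's finding: negative for rating
# (lower rating -> more likely response), positive for length.
print(coef_rating < 0, coef_length > 0)
```

A full replication would instead use a mixed-effects logistic model so that per-app baseline response rates are modeled as random effects rather than pooled.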


Notes

  1. Throughout this paper, we use ‘developer’ to indicate the person(s) or company responsible for making an app.

  2. Note that this number is different from the numbers in Table 4, since in Table 4 we count the number of reviews/ratings that changed at least once during the dialogue.

  3. We experimented with several thresholds, as explained in Section 9, which resulted in very similar models.

  4. https://clutch.co/


Author information


Corresponding author

Correspondence to Safwat Hassan.

Additional information

Communicated by: Andreas Zeller


About this article


Cite this article

Hassan, S., Tantithamthavorn, C., Bezemer, CP. et al. Studying the dialogue between users and developers of free apps in the Google Play Store. Empir Software Eng 23, 1275–1312 (2018). https://doi.org/10.1007/s10664-017-9538-9

