Abstract
App reviews often contain end-users’ requests, issues or suggestions that can support app maintenance and evolution. Researchers have therefore evaluated several approaches for identifying and classifying such reviews. However, these approaches are driven by manually derived taxonomies. This is a limitation given the burden of human involvement, the large volume of app reviews and the dependency on available domain knowledge to perform classification. In this pilot study, we develop and evaluate a novel approach towards the automatic generation of a dynamic taxonomy that groups related app reviews. Our approach uses natural language processing, feature engineering and word sense disambiguation to automatically generate the taxonomy. We validated the proposed approach with app reviews extracted from the popular My Tracks app, where outcomes revealed a 72% match with a manual taxonomy generated from domain knowledge provided by humans. Our approach shows promise for rapidly supporting software maintenance and evolution.
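To make the idea of grouping related reviews concrete, the following is a minimal sketch and not the authors' pipeline. It swaps in plain keyword overlap (Jaccard similarity) for the natural language processing, feature engineering and word sense disambiguation steps named in the abstract. The stop-word list, the threshold value, the function names and the sample reviews are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "it", "my", "i",
              "when", "please", "for", "of", "on"}


def keywords(review):
    """Lowercase a review and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z]+", review.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_reviews(reviews, threshold=0.2):
    """Greedily assign each review to the most similar existing group,
    or open a new group when no group reaches the threshold."""
    groups = []  # each group: {"keywords": set, "reviews": list}
    for review in reviews:
        kw = keywords(review)
        best, best_score = None, threshold
        for group in groups:
            score = jaccard(kw, group["keywords"])
            if score >= best_score:
                best, best_score = group, score
        if best is None:
            groups.append({"keywords": set(kw), "reviews": [review]})
        else:
            best["reviews"].append(review)
            best["keywords"] |= kw  # the group vocabulary grows over time
    return groups


def label(group):
    """Name a group after its most frequent keyword (ties broken alphabetically)."""
    counts = Counter(t for r in group["reviews"] for t in keywords(r))
    return min(counts, key=lambda t: (-counts[t], t))


# Hypothetical sample reviews, not taken from the My Tracks data set.
sample_reviews = [
    "GPS signal drops when recording",
    "GPS signal lost while recording a track",
    "Please add export to CSV",
    "Export to CSV would be great",
]

if __name__ == "__main__":
    for g in group_reviews(sample_reviews):
        print(label(g), g["reviews"])
```

Because groups are created on demand and their vocabulary expands as reviews arrive, the resulting category labels are not fixed in advance, which is the sense in which such a taxonomy is dynamic. A realistic system would replace the keyword overlap with semantic similarity so that differently worded reviews about the same feature still land in the same group.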
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Malgaonkar, S., Licorish, S.A., Savarimuthu, B.T.R. (2020). Towards Automated Taxonomy Generation for Grouping App Reviews: A Preliminary Empirical Study. In: Shepperd, M., Brito e Abreu, F., Rodrigues da Silva, A., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2020. Communications in Computer and Information Science, vol 1266. Springer, Cham. https://doi.org/10.1007/978-3-030-58793-2_10
DOI: https://doi.org/10.1007/978-3-030-58793-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58792-5
Online ISBN: 978-3-030-58793-2
eBook Packages: Computer Science, Computer Science (R0)