Towards Automated Taxonomy Generation for Grouping App Reviews: A Preliminary Empirical Study

Malgaonkar, Saurabh; Licorish, Sherlock A.; Savarimuthu, Bastin Tony Roy

doi:10.1007/978-3-030-58793-2_10

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1266)

Included in the following conference series:

International Conference on the Quality of Information and Communications Technology

Abstract

App reviews often reflect end-users’ requests, issues or suggestions for supporting app maintenance and evolution. Hence, researchers have evaluated several classification approaches for identifying and classifying such app reviews. However, these classification approaches are driven by manually derived taxonomies. This is a limitation given the burden of human involvement, numerous app reviews and dependency on the availability of domain knowledge to perform classification. In this pilot study, we develop and evaluate a novel approach towards the automatic generation of a dynamic taxonomy that groups related app reviews. Our approach uses natural language processing, feature engineering and word sense disambiguation to automatically generate the taxonomy. We validated the proposed approach with app reviews extracted from the popular My Tracks app, where outcomes revealed a 72% match with a manual taxonomy generated from domain knowledge provided by humans. Our approach shows promise for rapidly supporting software maintenance and evolution.
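
The idea of grouping related reviews automatically and then comparing the result with a manual taxonomy can be illustrated with a toy sketch. This is not the authors' pipeline: it replaces natural language processing, feature engineering and word sense disambiguation with a simple shared-keyword heuristic, and the stop-word list, sample reviews, manual labels and the function names group_reviews and match_rate are all hypothetical. The abstract does not state how the 72% match was computed, so the per-review label agreement used here is only one plausible reading.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list, just large enough for the toy reviews below.
STOP_WORDS = {"the", "a", "an", "is", "it", "to", "and", "my", "i",
              "please", "after", "on", "for", "not", "does"}


def tokens(review):
    """Lowercase a review and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", review.lower())
            if w not in STOP_WORDS]


def group_reviews(reviews):
    """Label each review with the most widely shared term it contains.

    Assumption: a keyword heuristic standing in for the paper's approach.
    """
    # Count in how many reviews each term appears.
    freq = Counter(t for r in reviews for t in set(tokens(r)))
    groups = defaultdict(list)
    for r in reviews:
        terms = set(tokens(r))
        if not terms:
            groups["(ungrouped)"].append(r)
            continue
        # Highest frequency wins; ties are broken alphabetically.
        label = min(terms, key=lambda t: (-freq[t], t))
        groups[label].append(r)
    return dict(groups)


def match_rate(auto_labels, manual_labels):
    """Fraction of reviews whose automatic label equals the manual label."""
    matches = sum(a == m for a, m in zip(auto_labels, manual_labels))
    return matches / len(manual_labels)


# Made-up reviews and manual labels, for illustration only.
reviews = [
    "GPS signal drops after the update",
    "GPS is not accurate on my phone",
    "Please add export to map",
    "Map does not load",
]
manual = ["gps", "gps", "export", "map"]

groups = group_reviews(reviews)
auto = [label for r in reviews
        for label, members in groups.items() if r in members]
rate = match_rate(auto, manual)
print(groups)
print(rate)
```

A keyword heuristic like this cannot tell apart different senses of the same word, which is the gap that word sense disambiguation is meant to address in the proposed approach.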

Author information

Authors and Affiliations

Department of Information Science, University of Otago, Dunedin, New Zealand
Saurabh Malgaonkar, Sherlock A. Licorish & Bastin Tony Roy Savarimuthu

Corresponding author

Correspondence to Saurabh Malgaonkar.

Editor information

Editors and Affiliations

Brunel University, London, UK
Martin Shepperd
Lisbon University Institute, Lisbon, Portugal
Fernando Brito e Abreu
University of Lisbon, Lisbon, Portugal
Alberto Rodrigues da Silva
University of Castilla-La Mancha, Talavera de la Reina, Spain
Ricardo Pérez-Castillo

About this paper

Cite this paper

Malgaonkar, S., Licorish, S.A., Savarimuthu, B.T.R. (2020). Towards Automated Taxonomy Generation for Grouping App Reviews: A Preliminary Empirical Study. In: Shepperd, M., Brito e Abreu, F., Rodrigues da Silva, A., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2020. Communications in Computer and Information Science, vol 1266. Springer, Cham. https://doi.org/10.1007/978-3-030-58793-2_10

DOI: https://doi.org/10.1007/978-3-030-58793-2_10
Published: 31 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58792-5
Online ISBN: 978-3-030-58793-2
eBook Packages: Computer Science, Computer Science (R0)