Abstract
Automatic text summarization can be used in recommendation systems to present useful text drawn from the available comments and texts. A human summarizer reads the whole text and builds a background understanding of it, whereas a computer must proceed differently. Several methods for automatic text summarization have been proposed to date, ranging from abstractive methods, which generate new sentences from the important points in the text, to extractive methods, which select the main original sentences from the text. In this study, we present an extractive method for text summarization. First, the sentences are processed, and the similarities between them are calculated with a proposed similarity measure. The sentences are then clustered based on these similarities, and finally a certain number of sentences are selected from each cluster. The Gaussian Mixture Model (GMM) algorithm is used to cluster the sentences. The proposed method is tested on a dataset collected from Tripadvisor (https://www.tripadvisor.com/) customer reviews, and the results show that GMM yields a more informative summary with more varied sentences than K-means.
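The pipeline outlined in the abstract (sentence similarities, GMM clustering, then picking sentences from each cluster) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the proposed similarity measure, so cosine similarity over bag-of-words counts is used here as a stand-in. The diagonal-covariance GMM, the farthest-point initialization, the centrality-based sentence selection, and all function names (`tokenize`, `similarity_matrix`, `fit_gmm`, `summarize`) are likewise assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase word tokens; a deliberately simple stand-in for real preprocessing."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(sentences):
    """Pairwise similarities (assumed measure: cosine, not the paper's own)."""
    bags = [Counter(tokenize(s)) for s in sentences]
    return [[cosine(a, b) for b in bags] for a in bags]


def sqdist(x, y):
    return sum((p - q) ** 2 for p, q in zip(x, y))


def fit_gmm(X, k, iters=50, floor=1e-3):
    """Fit a diagonal-covariance GMM with EM; return per-point responsibilities."""
    n, d = len(X), len(X[0])
    # Deterministic farthest-point seeding of the component means (an assumption).
    means = [list(X[0])]
    while len(means) < k:
        far = max(X, key=lambda x: min(sqdist(x, m) for m in means))
        means.append(list(far))
    variances = [[1.0] * d for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component, computed in log space.
        resp = []
        for x in X:
            logp = [
                math.log(weights[j]) - 0.5 * sum(
                    math.log(2 * math.pi * variances[j][t])
                    + (x[t] - means[j][t]) ** 2 / variances[j][t]
                    for t in range(d)
                )
                for j in range(k)
            ]
            top = max(logp)
            probs = [math.exp(v - top) for v in logp]
            total = sum(probs)
            resp.append([p / total for p in probs])
        # M-step: re-estimate weights, means, and (floored) variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = max(nj / n, 1e-12)
            for t in range(d):
                means[j][t] = sum(r[j] * x[t] for r, x in zip(resp, X)) / nj
                variances[j][t] = floor + sum(
                    r[j] * (x[t] - means[j][t]) ** 2 for r, x in zip(resp, X)
                ) / nj
    return resp


def summarize(sentences, k=2, per_cluster=1):
    """Cluster sentences by their similarity profiles and keep the most central ones."""
    X = similarity_matrix(sentences)  # each row is one sentence's feature vector
    resp = fit_gmm(X, k)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    chosen = []
    for j in range(k):
        members = [i for i, label in enumerate(labels) if label == j]
        # Rank members by total similarity to their own cluster (assumed criterion).
        members.sort(key=lambda i: -sum(X[i][m] for m in members))
        chosen.extend(members[:per_cluster])
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    reviews = [
        "The room was clean and the bed was comfortable.",
        "Clean room, comfortable bed, quiet floor.",
        "Breakfast was cold and the coffee was weak.",
        "The breakfast buffet was cold and disappointing.",
    ]
    for line in summarize(reviews, k=2, per_cluster=1):
        print(line)
```

Representing each sentence by its row of the similarity matrix keeps the sketch dependency-free; in practice a library implementation such as scikit-learn's `GaussianMixture`, applied to richer sentence features, would likely be preferable.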
Ethics declarations
Conflict of interest
None of the authors has any conflict of interest.
About this article
Cite this article
Marzijarani, S.B., Sajedi, H. Opinion mining with reviews summarization based on clustering. Int. j. inf. tecnol. 12, 1299–1310 (2020). https://doi.org/10.1007/s41870-020-00511-y
DOI: https://doi.org/10.1007/s41870-020-00511-y