CUTE 2017, CSA 2017: Advances in Computer Science and Ubiquitous Computing pp 1358-1363
Classification of Web Content by Category Generation in Social Life Logging
Abstract
Web content is consumed anytime and anywhere through mobile devices, and consumption behavior is affected by the emotional content of what is consumed. Web content is categorized by an article's topic, and its emotion is determined by the article's nuance. This study aims to determine both the category and the emotion of web content. The Automatic Content Categorization System (ACCS) was developed to crawl text from web pages and to split that text into morphemes using natural language processing (NLP). The web content is then classified by category and by emotion using document similarity. The main contribution of this study is a set of fixed categories and 28 emotions for classifying web content, in support of analyzing how web content is consumed.
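The classification step described above can be illustrated with a minimal sketch. The abstract does not state which similarity measure ACCS uses, so cosine similarity over term-frequency vectors is an assumption here. The `tokenize` function is a whitespace stand-in for the morpheme analysis step, and the reference documents, label names, and the `classify` helper are all hypothetical, chosen only to make the example run.

```python
import math
from collections import Counter


def tokenize(text):
    # Stand-in for morpheme analysis (assumption): lowercase whitespace split.
    return text.lower().split()


def cosine_similarity(a, b):
    # a and b are Counter term-frequency vectors.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(text, references):
    # Return the label whose reference document is most similar to the text,
    # together with the similarity score for every label.
    vec = Counter(tokenize(text))
    scores = {
        label: cosine_similarity(vec, Counter(tokenize(ref)))
        for label, ref in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical reference documents; the paper's categories and its
# 28 emotions are not listed in the abstract.
CATEGORY_REFERENCES = {
    "sports": "team match goal player league score",
    "finance": "stock market bank interest rate investor",
}
EMOTION_REFERENCES = {
    "happy": "win joy celebrate delight cheer",
    "sad": "loss grief mourn regret tears",
}

article = "The team celebrate a late goal and the player shared the joy"
category, category_scores = classify(article, CATEGORY_REFERENCES)
emotion, emotion_scores = classify(article, EMOTION_REFERENCES)
print(category, emotion)
```

The same `classify` call is used twice, once against category references and once against emotion references, which mirrors the abstract's description of assigning both a category and an emotion to each piece of content.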
Keywords
Life logging, Web content, Category, Emotion, Crawling, Natural language processing, Document similarity
Notes
Acknowledgments
This work was supported by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2015-0-00312, The development of technology for social life logging based on analyzing social emotion and intelligence of convergence contents).