Abstract
This chapter introduces a representation of a textual event as a mixture of semantic stereotypes and factual information. We also present a method for distinguishing semantic prototypes that are specific to a given event from generic elements that might provide cause and result information. In addition, the chapter discusses the results of unsupervised topic extraction experiments performed on documents from a large-scale corpus with an additional temporal structure. The experiments compare the nature of the information provided by Latent Dirichlet Allocation based on Log-Entropy weights and Vector Space modelling. The impact of different corpus time windows on this information is also discussed. Finally, we try to answer whether unsupervised topic modelling can reflect deeper semantic information, such as elements describing a given event or its causes and results, and distinguish it from factual data.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Korzycki, M., Korczyński, W. (2015). Does Topic Modelling Reflect Semantic Prototypes?. In: Zgrzywa, A., Choroś, K., Siemiński, A. (eds) New Research in Multimedia and Internet Systems. Advances in Intelligent Systems and Computing, vol 314. Springer, Cham. https://doi.org/10.1007/978-3-319-10383-9_11
DOI: https://doi.org/10.1007/978-3-319-10383-9_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10382-2
Online ISBN: 978-3-319-10383-9
eBook Packages: Engineering, Engineering (R0)