Does Topic Modelling Reflect Semantic Prototypes?

  • Conference paper
New Research in Multimedia and Internet Systems

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 314)

Abstract

The chapter introduces a representation of a textual event as a mixture of semantic stereotypes and factual information. We also present a method for distinguishing semantic prototypes that are specific to a given event from generic elements that might provide cause and result information. Moreover, the chapter discusses the results of unsupervised topic extraction experiments performed on documents from a large-scale corpus with an additional temporal structure. These experiments compared the nature of the information provided by Latent Dirichlet Allocation based on Log-Entropy weights and Vector Space modelling. The impact of different corpus time windows on this information is also discussed. Finally, we try to answer whether unsupervised topic modelling can reflect deeper semantic information, such as elements describing a given event or its causes and results, and discern it from factual data.
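
The abstract names Log-Entropy weights without stating a formula, and the chapter body is not available in this preview. As a rough illustration only, the sketch below uses one common formulation of log-entropy term weighting; it is an assumption, not necessarily the variant the authors used. The function name `log_entropy_weights` and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def log_entropy_weights(docs):
    """Return one {term: weight} dict per document.

    Assumed formulation (not taken from the chapter):
      local weight  = log(tf + 1)
      global weight = 1 + sum_j p_ij * log(p_ij) / log(N),  p_ij = tf_ij / gf_i
    where N is the number of documents and gf_i is the corpus-wide
    frequency of term i.
    """
    n_docs = len(docs)
    if n_docs < 2:
        raise ValueError("log-entropy weighting needs at least two documents")

    term_counts = [Counter(doc) for doc in docs]

    # Corpus-wide frequency of each term.
    global_freq = Counter()
    for counts in term_counts:
        global_freq.update(counts)

    # Accumulate sum_j p_ij * log(p_ij) for each term (always <= 0).
    entropy_sum = defaultdict(float)
    for counts in term_counts:
        for term, tf in counts.items():
            p = tf / global_freq[term]
            entropy_sum[term] += p * math.log(p)

    log_n = math.log(n_docs)
    global_weight = {t: 1.0 + s / log_n for t, s in entropy_sum.items()}

    return [
        {term: math.log(tf + 1) * global_weight[term] for term, tf in counts.items()}
        for counts in term_counts
    ]


# Toy corpus, invented for illustration.
docs = [["storm", "flood"], ["storm", "election"]]
weights = log_entropy_weights(docs)
```

Under this formulation a term spread evenly over all documents ("storm" here) receives a weight of zero, while a term confined to a single document keeps its full local weight, which is why such weights tend to suppress generic vocabulary before topic extraction.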

Author information

Corresponding author

Correspondence to Michał Korzycki.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Korzycki, M., Korczyński, W. (2015). Does Topic Modelling Reflect Semantic Prototypes? In: Zgrzywa, A., Choroś, K., Siemiński, A. (eds) New Research in Multimedia and Internet Systems. Advances in Intelligent Systems and Computing, vol 314. Springer, Cham. https://doi.org/10.1007/978-3-319-10383-9_11

  • DOI: https://doi.org/10.1007/978-3-319-10383-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10382-2

  • Online ISBN: 978-3-319-10383-9

  • eBook Packages: Engineering, Engineering (R0)
