Combining Word Semantics within Complex Hilbert Space for Information Retrieval

  • Conference paper
Quantum Interaction (QI 2013)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8369)

Abstract

Complex numbers are a fundamental aspect of the mathematical formalism of quantum physics. Quantum-like models developed outside physics have often overlooked their role; in particular, previous models in Information Retrieval (IR) ignored complex numbers. We argue that to advance the use of quantum models of IR, one has to lift the constraint of real-valued representations of the information space and package more information within the representation by means of complex numbers. As a first attempt, we propose a complex-valued representation for IR that explicitly uses complex-valued Hilbert spaces, so that terms, documents and queries are represented as complex-valued vectors. The proposal integrates distributional semantic evidence within the real component of a term vector, whereas ontological information is encoded in the imaginary component. Our proposal has the merit of lifting the role of complex numbers from a computational byproduct of the model to the very mathematical texture that unifies different levels of semantic information. An empirical instantiation of our proposal is tested in the TREC Medical Record task of retrieving cohorts for clinical studies.
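
To make the representation described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each term has two real-valued vectors of equal dimensionality, one derived from distributional evidence and one from ontological evidence, and that similarity is measured by the magnitude of the normalized Hermitian inner product. The function names (`combine`, `inner`, `similarity`) and the toy numbers are hypothetical; the abstract does not specify how either vector is built or how documents are scored.

```python
import math


def combine(distributional, ontological):
    """Pack two real vectors into one complex vector.

    Real part: distributional evidence. Imaginary part: ontological evidence.
    Assumption: both vectors live in the same number of dimensions.
    """
    if len(distributional) != len(ontological):
        raise ValueError("both vectors must have the same dimensionality")
    return [complex(re, im) for re, im in zip(distributional, ontological)]


def inner(u, v):
    """Hermitian inner product <u|v>, conjugating the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def similarity(u, v):
    """Magnitude of the normalized inner product (assumed scoring function)."""
    norm_u = math.sqrt(inner(u, u).real)
    norm_v = math.sqrt(inner(v, v).real)
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return abs(inner(u, v)) / (norm_u * norm_v)


# Toy, made-up vectors: both terms share the same distributional evidence
# but differ in their ontological evidence.
term_a = combine([1.0, 0.0], [1.0, 0.0])
term_d = combine([1.0, 0.0], [0.0, 1.0])

print(similarity(term_a, term_a))  # a term compared with itself
print(similarity(term_a, term_d))  # same real parts, different imaginary parts
```

In this toy setting the two terms would be indistinguishable if only their real parts were compared; the differing imaginary parts lower their similarity, which illustrates how a single complex vector can carry two levels of semantic information at once.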

Notes

  1. Specifically, concepts from the SNOMED-CT medical ontology [25].

References

  1. van Rijsbergen, C.J.: The Geometry of Information Retrieval. Cambridge University Press, New York (2004)

  2. Song, D., Lalmas, M., van Rijsbergen, K., Frommholz, I., Piwowarski, B., Wang, J., Zhang, P., Zuccon, G., Bruza, P.D., Arafat, S., et al.: How quantum theory is developing the field of information retrieval. In: Proceedings of QI, Arlington, VA, USA, pp. 105–108, November 2010

  3. Zuccon, G., Azzopardi, L.: Using the quantum probability ranking principle to rank interdependent documents. In: Gurrin, C., He, Y., Kazai, G., Kruschwitz, U., Little, S., Roelleke, T., Rüger, S., van Rijsbergen, K. (eds.) ECIR 2010. LNCS, vol. 5993, pp. 357–369. Springer, Heidelberg (2010)

  4. Zuccon, G., Piwowarski, B., Azzopardi, L.: On the use of complex numbers in quantum models for information retrieval. In: Amati, G., Crestani, F. (eds.) ICTIR 2011. LNCS, vol. 6931, pp. 346–350. Springer, Heidelberg (2011)

  5. Deerwester, S., Dumais, S., Furnas, G., Landauer, T., Harshman, R.: Indexing by latent semantic analysis. JASIST 41(6), 391–407 (1990)

  6. Lund, K., Burgess, C.: Producing high-dimensional semantic spaces from lexical co-occurrence. Behav. Res. Meth. Instrum. Comput. 28, 203–208 (1996)

  7. Kanerva, P., Kristofersson, J., Holst, A.: Random indexing of text samples for latent semantic analysis. In: Proceedings of CogSci., vol. 1036, Philadelphia, PA, USA (2000)

  8. Karlgren, J., Sahlgren, M.: From words to understanding. In: Uesaka, Y., Kanerva, P., Asoh, H. (eds.) Foundations of Real-World Intelligence, pp. 294–308. CSLI Publications, Stanford (2001)

  9. Sahlgren, M.: The Word-Space Model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces. Ph.D. thesis, Institutionen för lingvistik. Department of Linguistics, Stockholm University (2006)

  10. Symonds, M., Bruza, P., Sitbon, L., Turner, I.: Modelling word meaning using efficient tensor representations. In: Proceedings of PacLic., pp. 313–322, November 2011

  11. Fellbaum, C.: WordNet: An Electronic Lexical Database. MIT Press, Cambridge (1998)

  12. Koopman, B., Bruza, P., Sitbon, L., Lawley, M.: Towards semantic search and inference in electronic medical records: an approach using concept-based information retrieval. AMJ 9, 482–488 (2012)

  13. Wittgenstein, L.: Philosophical Investigations. Blackwell Publishing, Oxford (1967)

  14. Harris, Z.: Distributional structure. In: Harris, Z. (ed.) Papers in Structural and Transformational Linguistics. Formal Linguistics, pp. 775–794. Humanities Press, New York (1970)

  15. Firth, J.R.: Papers in Linguistics 1934–1951. Oxford University Press, London (1957)

  16. Bloomfield, L.: Language. Holt, Rinehart and Winston, New York (1933)

  17. Morris, C.W.: Signs, Language and Behavior. Prentice Hall, New York (1946)

  18. von Uexküll, J.: The theory of meaning. Semiotica 42(1), 25–82 (1982)

  19. Peirce, C.: Logic as semiotic: the theory of signs. In: Peirce, C., Buchler, J. (eds.) Philosophical Writings of Peirce, pp. 98–119. Dover Publications, New York (1955)

  20. Frege, G.: Sense and reference. Philos. Rev. 57(3), 209–230 (1948)

  21. Sahlgren, M.: An introduction to random indexing. In: Proceedings of TKE, Copenhagen, Denmark (2005)

  22. Widdows, D., Ferraro, K.: Semantic vectors: a scalable open source package and online technology management application. In: Proceedings LREC, Marrakech, Morocco, May 2008

  23. Koopman, B., Zuccon, G., Bruza, P., Sitbon, L., Lawley, M.: Graph-based concept weighting for medical information retrieval. In: Proceedings of ADCS, Dunedin, New Zealand, pp. 80–87, December 2012

  24. Zuccon, G., Koopman, B., Nguyen, A., Vickers, D., Butt, L.: Exploiting medical hierarchies for concept-based information retrieval. In: Proceedings of ADCS, Dunedin, New Zealand, pp. 111–114, December 2012

  25. Spackman, K.: SNOMED Clinical Terms Basics. International Health Terminology Standards Development Organisation Technical report, August 2008

  26. Aronson, A.R., Lang, F.M.: An overview of MetaMap: historical perspective and recent advances. JAMIA 17(3), 229–236 (2010)

  27. Wu, S.T., Liu, H., Li, D., Tao, C., Musen, M.A., Chute, C.G., Shah, N.H.: Unified medical language system term occurrences in clinical notes: a large-scale corpus analysis. JAMIA 19(e1), e149–e156 (2012)

  28. Wittek, P., Tan, C.L.: Compactly supported basis functions as support vector kernels for classification. IEEE Trans. Pattern Anal. Mach. Intell. 33(10), 2039–2050 (2011)

  29. Widdows, D., Cohen, T.: Real, complex, and binary semantic vectors. In: Busemeyer, J.R., Dubois, F., Lambert-Mogiliansky, A., Melucci, M. (eds.) QI 2012. LNCS, vol. 7620, pp. 24–35. Springer, Heidelberg (2012)

  30. Voorhees, E., Tong, R.: Overview of the TREC Medical Records Track. In: Proceedings of TREC, Gaithersburg, MD, USA, November 2011

  31. Wu, S., Masanz, J., Ravikumar, K., Liu, H.: Three questions about clinical information retrieval. In: Proceedings of TREC, Gaithersburg, MD, USA, November 2012

  32. Aerts, D., Czachor, M.: Quantum aspects of semantic analysis and symbolic artificial intelligence. J. Phys. A: Math. Gen. 37, L123–L132 (2004)

  33. Bruza, P., Kitto, K., Ramm, B., Sitbon, L., Song, D., Blomberg, S.: Quantum-like non-separability of concept combinations, emergent associates and abduction. Logic J. IGPL 20(2), 445–457 (2012)

Author information

Corresponding author

Correspondence to Peter Wittek.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wittek, P., Koopman, B., Zuccon, G., Darányi, S. (2014). Combining Word Semantics within Complex Hilbert Space for Information Retrieval. In: Atmanspacher, H., Haven, E., Kitto, K., Raine, D. (eds) Quantum Interaction. QI 2013. Lecture Notes in Computer Science, vol 8369. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-54943-4_14

  • DOI: https://doi.org/10.1007/978-3-642-54943-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-54942-7

  • Online ISBN: 978-3-642-54943-4

  • eBook Packages: Computer Science (R0)
