Words Context Analysis for Improvement of Information Retrieval

  • Julian Szymański
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7653)


In this article we present an approach to improving information retrieval from large text collections using word context vectors. The vectors were created by analyzing the English Wikipedia with the Hyperspace Analogue to Language (HAL) model of word similarity. For test phrases we evaluate retrieval with direct user queries as well as retrieval with the context vectors of these queries. The results indicate that the proposed method cannot replace retrieval based on direct user queries, but it can be used to refine the search results.
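To make the notion of a word context vector more concrete, the following is a minimal sketch of Hyperspace Analogue to Language style co-occurrence counting in the spirit of Lund and Burgess. It is an illustration only, not the article's implementation: the function names `hal_matrix` and `context_vector`, the window size, the whitespace tokenization, and the toy sentence are all assumptions, and the abstract does not state the settings used for Wikipedia.

```python
from collections import defaultdict


def hal_matrix(tokens, window=5):
    """Accumulate HAL-style weighted co-occurrence counts.

    counts[w][c] sums the weights of context word c occurring up to
    `window` positions before w. Closer words get larger weights
    (window - distance + 1). The window size is an assumed parameter.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(tokens):
        for dist in range(1, window + 1):
            j = i - dist
            if j < 0:
                break  # ran off the start of the text
            counts[word][tokens[j]] += window - dist + 1
    return counts


def context_vector(counts, word, vocab):
    """Concatenate the weights of preceding and following context words."""
    preceding = [counts[word][c] for c in vocab]  # words seen before `word`
    following = [counts[c][word] for c in vocab]  # words seen after `word`
    return preceding + following


# Toy usage on a hypothetical sentence (not data from the article).
tokens = "the horse raced past the barn".split()
vocab = sorted(set(tokens))
counts = hal_matrix(tokens, window=2)
vector = context_vector(counts, "the", vocab)
```

In a retrieval setting, the strongest components of such a vector could supply additional terms for a query, which is one plausible reading of "retrieval with context vectors" in the abstract.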


text indexing, information retrieval, semantic indexes





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Julian Szymański, Department of Computer Systems Architecture, Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, Poland
