Optimising Search Engines Using Evolutionally Adapted Language Models in Typed Dependency Parses

  • Conference paper
Swarm and Evolutionary Computation (EC 2012, SIDE 2012)

Abstract

This paper presents an approach to automatically optimising the retrieval quality of search engines using a language model paradigm. The use of natural language processing (NLP) in information retrieval (IR) has already been investigated. However, most approaches focused on learning retrieval functions from existing examples and pre-set feature lists. Others used surface statistics in the form of n-grams, or efficient use of parse trees; both perform poorly when the language is open to change. Intuitively, an IR system should rank relevant documents highest, with less relevant ones following below. To that end, semantics and ontologies, the use of grammatical information, and document structure analysis have been researched. Evolutionary enrichment of a language model for typed dependency analysis, acquired from documents and queries, can adapt the system to the texts it encounters. Furthermore, results from controlled experiments verify that it is possible to outperform existing approaches in terms of retrieval quality.
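
To make the language model paradigm concrete, the following is a minimal sketch of query-likelihood ranking, in which each document is scored by the smoothed probability that its language model generates the query. It is an illustration only and not the method of the paper: the function names `score` and `rank`, the Jelinek-Mercer smoothing weight `lam`, and the toy dependency triples are all assumptions, and the evolutionary adaptation step is not shown. The units may be plain words or pre-parsed typed dependency triples of the form (relation, governor, dependent).

```python
import math
from collections import Counter


def score(query_units, doc_units, collection_counts, collection_size, lam=0.5):
    """Log-likelihood of the query under a smoothed document language model.

    Jelinek-Mercer smoothing mixes the document model with the collection
    model; `lam` is the (assumed) weight given to the collection model.
    """
    doc_counts = Counter(doc_units)
    doc_len = len(doc_units)
    total = 0.0
    for unit in query_units:
        p_doc = doc_counts[unit] / doc_len if doc_len else 0.0
        p_coll = collection_counts[unit] / collection_size if collection_size else 0.0
        p = (1 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The unit never occurs in the collection, so the query is impossible.
            return float("-inf")
        total += math.log(p)
    return total


def rank(query_units, docs, lam=0.5):
    """Return (doc_id, score) pairs, most likely document first."""
    collection_counts = Counter(u for units in docs.values() for u in units)
    collection_size = sum(collection_counts.values())
    scored = [
        (doc_id, score(query_units, units, collection_counts, collection_size, lam))
        for doc_id, units in docs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: typed dependency triples assumed to come from a parser.
docs = {
    "d1": [("nsubj", "ranks", "engine"), ("dobj", "ranks", "documents")],
    "d2": [("nsubj", "builds", "parser"), ("dobj", "builds", "trees")],
}
query = [("dobj", "ranks", "documents")]
ranking = rank(query, docs)
```

Because the scoring function only counts hashable units, the same sketch works for words, n-grams, or dependency triples; which unit suits a changing language best is the question the paper addresses.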

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Karwinski, M. (2012). Optimising Search Engines Using Evolutionally Adapted Language Models in Typed Dependency Parses. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Swarm and Evolutionary Computation. EC 2012, SIDE 2012. Lecture Notes in Computer Science, vol 7269. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29353-5_30

  • DOI: https://doi.org/10.1007/978-3-642-29353-5_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-29352-8

  • Online ISBN: 978-3-642-29353-5

  • eBook Packages: Computer Science (R0)
