Abstract
This paper presents an approach to automatically optimising the retrieval quality of search engines using the language model paradigm. Information retrieval (IR) and natural language processing (NLP) have already been investigated. However, most approaches focused on learning retrieval functions from existing examples and pre-set feature lists. Others relied on surface statistics in the form of n-grams or on efficient use of parse trees; both perform poorly when the language is open to change. Intuitively, an IR system should rank relevant documents high, with less relevant ones following below. To that end, semantics and ontologies, the use of grammatical information, and document structure analysis have been researched. An evolutionary enrichment of the language model for typed dependency analysis, acquired from documents and queries, can adapt the system to the texts it encounters. Furthermore, results from controlled experiments verify that it is possible to outperform existing approaches in terms of retrieval quality.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karwinski, M. (2012). Optimising Search Engines Using Evolutionally Adapted Language Models in Typed Dependency Parses. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Swarm and Evolutionary Computation. EC SIDE 2012. Lecture Notes in Computer Science, vol 7269. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29353-5_30
DOI: https://doi.org/10.1007/978-3-642-29353-5_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29352-8
Online ISBN: 978-3-642-29353-5
eBook Packages: Computer Science, Computer Science (R0)