IR and AI: Traditions of Representation and Anti-representation in Information Processing

  • Conference paper
Advances in Information Retrieval (ECIR 2004)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2997)

Abstract

The paper discusses the traditional, and ongoing, question as to whether natural language processing (NLP) techniques, or indeed any representational techniques at all, aid in the retrieval of information, as that task is traditionally understood. The discussion is partly a response to Karen Sparck Jones’ (1999) claim that artificial intelligence, and by implication NLP, should learn from the methodology of Information Retrieval (IR), rather than vice versa, as the first sentence above implies. The issue has been made more interesting and complicated by the shift of interest from classic IR experiments with very long queries to Internet search queries, which typically consist of two highly ambiguous terms. This simple fact has changed the assumptions of the debate. Moreover, the return to statistical and empirical methods within NLP has made it less clear what an NLP technique, or even a “representational” method, is. The paper also notes the growth of “language models” within IR and the use, in recent years, of the term “translation” to describe a range of activities, including IR, a development that is rather the opposite of what Sparck Jones was calling for.

References

  • Agichtein, E., Grishman, R., Borthwick, A., Sterling, J.: Description of the named entity system as used in MUC-7. In: Proceedings of the MUC-7 Conference, NYU (1998)

  • Azzam, S., Humphreys, K., Gaizauskas, R.: Using coreference chains for text summarization. In: Proc. ACL Workshop on Coreference and its Applications, Maryland (1999)

  • Bely, N., Borillo, A., Virbel, J., Siot-Decauville, N.: Procedures d’analyse semantique appliquees a la documentation scientifique. Gauthier-Villars, Paris (1970)

  • Berger, A., Lafferty, J.: Information retrieval as statistical translation. In: SIGIR 1999 (1999)

  • Berger, A., et al.: Bridging the lexical chasm: statistical approaches to question answering. In: SIGIR 2000 (2000)

  • Bikel, D., Miller, S., Schwartz, R., Weischedel, R.: Nymble: a High-Performance Learning Name-finder. In: Proceedings of the Fifth conference on Applied Natural Language Processing (1997)

  • Brill, E.: Some Advances in Transformation-Based Part of Speech Tagging. In: Proceedings of the Twelfth National Conference on AI (AAAI 1994), Seattle, Washington (1994)

  • Brown, P.F., Cocke, J.: A Statistical Approach to Machine Translation, IBM Research Division, T.J. Watson Research Center, RC 14773 (1989)

  • Carberry, S., Samuel, K., Vijay-Shanker, K.: Dialogue act tagging with transformation-based learning. In: Proceedings of the COLING-ACL 1998 Conference, Montreal, Canada, vol. 2, pp. 1150–1156 (1998)

  • Cardie, C.: Empirical methods in information extraction. AI Magazine 18(4) (1997); Special Issue on Empirical Natural Language Processing

  • Chiaramella, Y., Nie, J.: A retrieval model based on an extended modal logic and its application to the RIME experimental approach. In: Proceedings of the 13th ACM International Conference on Research and Development in Information Retrieval (SIGIR), pp. 25–43 (1990)

  • Collier, R.: Automatic Template Creation for Information Extraction. PhD thesis, University of Sheffield Computer Science Dept., UK (1998)

  • Cowie, J., Guthrie, L., Jin, W., Ogden, W., Pustejovsky, J., Wang, R., Wakao, T., Waterman, S., Wilks, Y.: CRL/Brandeis: The Diderot System. In: Proceedings of Tipster Text Program (Phase I), Morgan Kaufmann, San Francisco (1993)

  • Cunningham, H.: JAPE – a Java Annotation Patterns Engine. Technical Report, Department of Computer Science, University of Sheffield (1999)

  • Daelemans, W., Zavrel, J., van der Sloot, K., van den Bosch, A.: TiMBL: Tilburg memory based learner version 1.0. Technical report, ILK Technical Report 98-03 (1998)

  • Gaizauskas, R., Wilks, Y.: Information Extraction: beyond document retrieval. Journal of Documentation (1997)

  • Gardin, J.: Syntol. Rutgers Graduate School of Library Science, New Brunswick (1965)

  • Gollins, T., Sanderson, M.: Improving Cross Language Information Retrieval with triangulated translation. In: SIGIR 2001 (2001)

  • Granger, R.: FOULUP: a program that figures out meanings of words from context. In: Proc. Fifth Joint Internat. Conf. on AI (1977)

  • Green, B., Wolf, A., Chomsky, C., Laughery, K.: BASEBALL, an automatic question answerer. In: Proc. Western Joint Computer Conference 19, pp. 219–224 (1961)

  • Grefenstette, G., Hearst, M.A.: Method for Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results. In: Weir (ed.) Statistically-based natural language programming techniques, Proc. AAAI Workshop, AAAI Press, Menlo Park (1992)

  • Grishman, R.: Information extraction: Techniques and challenges. In: Pazienza, M.T. (ed.) SCIE 1997. LNCS (LNAI), vol. 1299, Springer, Heidelberg (1997)

  • Grishman, R., Sterling, J.: Generalizing automatically generated patterns. In: Proceedings of COLING 1992 (1992)

  • Gross, M.: On the equivalence of models of language used in the fields of mechanical translation and information retrieval. Information Storage and Retrieval 2(1) (1964)

  • Hobbs, J.R.: The generic information extraction system. In: Proceedings of the Fifth Message Understanding Conference (MUC-5), pp. 87–91. Morgan Kaufmann, San Francisco (1993)

  • Hutchins, W.J.: Linguistic processes in the indexing and retrieval of documents. Linguistics 61 (1970)

  • Jeffrey, K.: What’s next in databases? ERCIM News 39 (1999), www.ercim.org

  • Krovetz, R.: More than one sense per discourse. NEC Princeton NJ Labs., Research Memorandum (1998)

  • Lehnert, W.: A Conceptual Theory of Question Answering. In: Proc. Fifth IJCAI, pp. 158–164. Kaufmann, Cambridge (1977)

  • Lehnert, W., Cardie, C., Fisher, D., McCarthy, J., Riloff, E.: University of Massachusetts: Description of the CIRCUS system as used for MUC-4. In: Proceedings of the Fourth Message Understanding Conference MUC-4, pp. 282–288. Morgan Kaufmann, San Francisco (1992)

  • Lenat, D., Prakash, M., Shepherd, M.: CYC: Using common sense knowledge to overcome brittleness and knowledge acquisition bottlenecks. The AI Magazine 6(4) (1986)

  • Luhn, H.P.: A statistical approach to mechanized encoding and searching of literary information. IBM Journal of Research and Development 1, 309–317 (1957)

  • Mauldin, M.: Retrieval performance in FERRET: a conceptual information retrieval system. In: SIGIR 1991 (1991)

  • Miller, G.A. (ed.): WordNet: An on-line Lexical Database. International Journal of Lexicography 3(4) (1990)

  • Morgan, R., Garigliano, R., Callaghan, P., Poria, S., Smith, M., Urbanowicz, A., Collingham, R., Costantino, M., Cooper, C.: Description of the LOLITA System as used for MUC-6. In: Proceedings of the Sixth Message Understanding Conference (MUC-6), pp. 71–86. Morgan Kaufmann, San Francisco (1995)

  • Muggleton, S.: Recent advances in inductive logic programming. In: Proc. 7th Ann. ACM Workshop on Comput. Learning Theory, pp. 3–11. ACM Press, New York (1994)

  • Muggleton, S., Cussens, J., Page, D., Srinivasan, A.: Using inductive logic programming for natural language processing. In: van Someren, M., Widmer, G. (eds.) ECML 1997. LNCS, vol. 1224, pp. 25–34. Springer, Heidelberg (1997); Workshop Notes on Empirical Learning of Natural Language Tasks

  • Pearl, J.: Bayesian Networks: A Model of Self-Activated Memory for Evidential Reasoning. In: Proceedings of the Cognitive Science Society (CSS-7) (1985)

  • Pietrosanti, E., Graziadio, B.: Extracting Information for Business Needs. Unicom Seminar on Information Extraction, London (March 1997)

  • Riloff, E., Lehnert, W.: Automated dictionary construction for information extraction from text. In: Proceedings of Ninth IEEE Conference on Artificial Intelligence for Applications, pp. 93–99 (1993)

  • Riloff, E., Shoen, J.: Automatically acquiring conceptual patterns without an annotated corpus. In: Proceedings of the Third Workshop on Very Large Corpora (1995)

  • Roche, E., Schabes, Y.: Deterministic Part-of-Speech Tagging with Finite-State Transducers. Computational Linguistics 21(2), 227–254 (1995)

  • Salton, G.: A new comparison between conventional indexing (MEDLARS) and automatic text processing (SMART). Journal of the American Society of Information Science 23(2) (1972)

  • Schvaneveldt, R. (ed.): Pathfinder Networks: Theory and Applications. Ablex, Norwood (1990)

  • Smeaton, A., van Rijsbergen, C.: Experiments in incorporating syntactic processing of user queries into a document retrieval strategy. In: Proc. 11th. ACM SIGIR (1988)

  • Sparck Jones, K.: Synonymy and Semantic Classification. Edinburgh University Press, Edinburgh (1966/1986)

  • Sparck Jones, K.: What is the role of NLP in text retrieval. In: Strzalkowski (ed.) Natural Language Information Retrieval, Kluwer, New York (1999a)

  • Sparck Jones, K.: Information Retrieval and Artificial Intelligence. Artificial Intelligence Journal 114 (1999b)

  • Stevenson, M., Wilks, Y.: Combining Weak Knowledge Sources for Sense Disambiguation. In: Proceedings of the International Joint Conference for Artificial Intelligence (IJCAI 1999) (1999)

  • Strzalkowski, T., Vauthey, B.: Natural Language Processing in Automated Information Retrieval, PROTEUS Project Memorandum. Department of Computer Science, New York University (1991)

  • Vilain, M.: Validation of terminological inference in an information extraction task. In: Proceedings of the 1993 ARPA Human Language Workshop (1993)

  • Wilks, Y.: Text Searching with Templates. Cambridge Language Research Unit Memo, ML.156 (1964)

  • Wilks, Y.: The application of CLRU’s method of semantic analysis to information retrieval. Cambridge Language Research Unit Memo, ML.173 (1965)

  • Wilks, Y.: Frames, semantics and novelty. In: Metzing (ed.) Frame Conceptions and Text Understanding, de Gruyter, Berlin (1979)

  • Wilks, Y.: Senses and Texts. In: Ide, N. (ed.) Special issue of Computers and the Humanities (1998)

  • Wilks, Y., Catizone, R.: Making information extraction more adaptive. In: Pazienza, M.-T. (ed.) Proc. Information Extraction Workshop, Frascati (1999)

  • Winograd, T.: Understanding Natural language (1971)

  • Winograd, T., Flores, F.: Understanding Computers and Cognition: A New Foundation for Design. Ablex, Norwood (1986)

  • Yarowsky, D.: Word-sense disambiguation using statistical models of Roget’s categories trained on large corpora. In: Proc. COLING 1992, Nantes, France (1992)

  • Yarowsky, D.: Unsupervised word-sense disambiguation rivalling supervised methods. In: Proc. ACL 1995 (1995)

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wilks, Y. (2004). IR and AI: Traditions of Representation and Anti-representation in Information Processing. In: McDonald, S., Tait, J. (eds) Advances in Information Retrieval. ECIR 2004. Lecture Notes in Computer Science, vol 2997. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24752-4_2

  • DOI: https://doi.org/10.1007/978-3-540-24752-4_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21382-6

  • Online ISBN: 978-3-540-24752-4

  • eBook Packages: Springer Book Archive
