A Document Browsing Tool: Using Lexical Classes to Convey Information

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3501)

Abstract

This research project is a contribution to the global field of information discovery in digital documents. We aim to provide the user with a tool for flexible access to the contents of digital documents: a text browsing facility inspired by traditional “back-of-the-book” style indexes. It gives at a glance the main topics discussed in the document, and presents certain kinds of relationships between these topics. These are captured automatically by exploiting certain lexical classes. Previous research on this and similar topics is reviewed, followed by the main characteristics of a research prototype, which relies on modeling of professionally produced indexes. Experimental results are presented, as well as remaining hurdles and potential applications.

This research is funded by a grant from the Natural Sciences and Engineering Research Council of Canada.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Da Sylva, L., Doll, F. (2005). A Document Browsing Tool: Using Lexical Classes to Convey Information. In: Kégl, B., Lapalme, G. (eds) Advances in Artificial Intelligence. Canadian AI 2005. Lecture Notes in Computer Science, vol 3501. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11424918_33

  • DOI: https://doi.org/10.1007/11424918_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25864-3

  • Online ISBN: 978-3-540-31952-8

  • eBook Packages: Computer Science, Computer Science (R0)
