Adjacency Matrix Based Full-Text Indexing Models

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2118)

Abstract

This paper proposes two new character-based full-text indexing models: the adjacency matrix based inverted file and the adjacency matrix based PAT array. Formally, the former is a reorganization of the traditional inverted file, and the latter is a decomposition of the traditional PAT array. Both organize text-indexing information in the form of an adjacency matrix. Query algorithms for the new models are developed, and their performance is compared with that of the traditional models. The new models improve query-processing efficiency considerably, at the cost of extra storage overhead that is much smaller than the original text database, and are therefore suitable for large-scale text databases, especially Chinese text databases.
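
The abstract does not spell out the data layout, so the following is only an illustrative sketch of one way a character-level inverted file could be arranged as an adjacency matrix. It assumes that each matrix cell is keyed by a pair of adjacent characters and holds the positions where that pair occurs, and that a query is answered by intersecting the cells of its consecutive pairs. The names `build_index` and `search`, and the pair-intersection strategy itself, are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_index(text):
    """Build a sparse 'adjacency matrix': cell (a, b) lists every position i
    where character a at position i is immediately followed by character b.
    (Hypothetical layout, assumed for illustration.)"""
    matrix = defaultdict(list)
    for i in range(len(text) - 1):
        matrix[(text[i], text[i + 1])].append(i)
    return matrix


def search(matrix, text, query):
    """Return the sorted start positions of query in text."""
    if not query:
        return []
    if len(query) == 1:
        # A single character has no adjacent pair, so fall back to a scan.
        return [i for i, ch in enumerate(text) if ch == query]
    # Candidate starts are the positions of the first character pair.
    candidates = set(matrix.get((query[0], query[1]), ()))
    for k in range(1, len(query) - 1):
        cell = matrix.get((query[k], query[k + 1]), ())
        # Pair k must occur at offset k from a surviving start position.
        candidates &= {p - k for p in cell}
        if not candidates:
            break
    return sorted(candidates)


if __name__ == "__main__":
    sample = "中文文本数据库文本"
    index = build_index(sample)
    print(search(index, sample, "文本"))
```

Because every lookup touches only the cells for character pairs that actually appear in the query, such a layout is most attractive for large alphabets such as Chinese, where each cell's position list tends to be short. Whether the paper's models work this way cannot be confirmed from the abstract alone.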

This work was supported by the China Postdoctoral Science Foundation and the National 863 Hi-Tech Foundation (No. 863-306-ZT04-02-2).

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhou, S., Guan, J., Hu, Y., Hu, J., Zhou, A. (2001). Adjacency Matrix Based Full-Text Indexing Models. In: Wang, X.S., Yu, G., Lu, H. (eds) Advances in Web-Age Information Management. WAIM 2001. Lecture Notes in Computer Science, vol 2118. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47714-4_6

  • DOI: https://doi.org/10.1007/3-540-47714-4_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-42298-3

  • Online ISBN: 978-3-540-47714-3

  • eBook Packages: Springer Book Archive
