
BitCube: A Three-Dimensional Bitmap Indexing for XML Documents

Journal of Intelligent Information Systems

Abstract

XML is a new standard for exchanging and representing information on the Internet. Documents can be represented hierarchically by XML elements. In this paper, we propose that an XML document collection be represented and indexed using a bitmap indexing technique. We define similarity and popularity operations suited to bitmap indexes. We also define statistical measurements in the BitCube: center and radius. Based on these measurements, we describe a new bitmap-index-based technique for clustering XML documents. The clustering techniques are motivated by the fact that the bitmap indexes are expected to be very sparse.
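As a rough illustration of the two-dimensional case, the sketch below stores one bit row per document and one bit column per element path. The abstract does not give the formal definitions, so everything here is an assumption made for illustration: distance is taken as the Hamming distance between rows, similarity as one minus the normalized distance, popularity as the number of documents with a given bit set, center as the majority-vote bit vector, and radius as the largest distance from the center. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_bitmap(docs: Dict[str, List[str]]) -> Tuple[List[str], Dict[str, int]]:
    """Return (paths, rows); bit i of rows[doc] is set when doc contains paths[i]."""
    paths = sorted({p for ps in docs.values() for p in ps})
    pos = {p: i for i, p in enumerate(paths)}
    rows: Dict[str, int] = {}
    for name, ps in docs.items():
        bits = 0
        for p in ps:
            bits |= 1 << pos[p]
        rows[name] = bits
    return paths, rows


def distance(a: int, b: int) -> int:
    """Assumed definition: Hamming distance, i.e. popcount of XOR."""
    return bin(a ^ b).count("1")


def similarity(a: int, b: int, width: int) -> float:
    """Assumed definition: 1 minus the distance normalized by the row width."""
    return 1.0 - distance(a, b) / width


def popularity(rows: Dict[str, int], bit: int) -> int:
    """Assumed definition: number of documents with the given column bit set."""
    return sum((r >> bit) & 1 for r in rows.values())


def center(rows: Dict[str, int], width: int) -> int:
    """Assumed definition: a bit is set when more than half the documents set it."""
    n = len(rows)
    c = 0
    for i in range(width):
        if 2 * popularity(rows, i) > n:
            c |= 1 << i
    return c


def radius(rows: Dict[str, int], c: int) -> int:
    """Assumed definition: largest distance from the center to any document."""
    return max(distance(r, c) for r in rows.values())


# Hypothetical three-document collection keyed by element path.
docs = {
    "d1": ["/a/b", "/a/c"],
    "d2": ["/a/b", "/a/c", "/a/d"],
    "d3": ["/a/b"],
}
paths, rows = build_bitmap(docs)
c = center(rows, len(paths))
```

Because each row is a plain integer, distance and similarity reduce to a single XOR and a bit count, which is the kind of cheap primitive that makes bitmap indexes attractive for clustering sparse collections.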

Furthermore, a 2-dimensional bitmap index is extended to a 3-dimensional bitmap index, called the BitCube. Sophisticated querying of XML document collections can be performed using primitive operations such as slice, project, and dice. Experiments show that the BitCube can be created efficiently and the primitive operations can be performed more efficiently with the BitCube than with other alternatives.
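The three primitive operations can be pictured with a small sketch. The abstract names slice, project, and dice but does not define them, so the version below borrows the usual OLAP meanings as an assumption: slice fixes one axis to a single value, dice restricts axes to subsets of values, and project collapses one axis by OR-ing along it. The choice of axes (document, element path, word), the `SparseBitCube` class, and the storage of only the set bits as triples are likewise hypothetical; the paper's own structure may differ.

```python
from typing import Iterable, Optional, Set, Tuple

Cell = Tuple[str, str, str]  # assumed axes: (document, element path, word)


class SparseBitCube:
    """Hypothetical sparse 3-D bitmap: only the cells whose bit is 1 are stored."""

    def __init__(self, cells: Iterable[Cell] = ()) -> None:
        self.cells: Set[Cell] = set(cells)

    def slice(self, axis: int, value: str) -> Set[Tuple[str, ...]]:
        """Fix one axis to a value and return the remaining 2-D plane of set bits."""
        return {
            tuple(v for i, v in enumerate(cell) if i != axis)
            for cell in self.cells
            if cell[axis] == value
        }

    def dice(
        self,
        docs: Optional[Set[str]] = None,
        paths: Optional[Set[str]] = None,
        words: Optional[Set[str]] = None,
    ) -> "SparseBitCube":
        """Keep only cells whose coordinates fall in the given subsets (None = no limit)."""
        wanted = (docs, paths, words)
        return SparseBitCube(
            cell
            for cell in self.cells
            if all(w is None or cell[i] in w for i, w in enumerate(wanted))
        )

    def project(self, axis: int) -> Set[Tuple[str, ...]]:
        """Collapse one axis: a 2-D bit is set if any bit along that axis was set."""
        return {
            tuple(v for i, v in enumerate(cell) if i != axis)
            for cell in self.cells
        }


# Hypothetical data: which word occurs under which element path in which document.
cube = SparseBitCube([
    ("d1", "/book/title", "xml"),
    ("d1", "/book/author", "yoon"),
    ("d2", "/book/title", "index"),
    ("d2", "/book/title", "xml"),
])

xml_plane = cube.slice(2, "xml")          # documents and paths containing "xml"
doc_word = cube.project(1)                # drop the path axis
d2_xml = cube.dice(docs={"d2"}, words={"xml"})
```

Storing only the set bits is one way to exploit the sparseness the abstract mentions; a dense or compressed bit-array layout would expose the same three operations.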




Cite this article

Yoon, J.P., Raghavan, V., Chakilam, V. et al. BitCube: A Three-Dimensional Bitmap Indexing for XML Documents. Journal of Intelligent Information Systems 17, 241–254 (2001). https://doi.org/10.1023/A:1012861931139


  • DOI: https://doi.org/10.1023/A:1012861931139
